# Computer Vision Assignments: Sessions 1 & 2

This notebook contains tasks and assignments based on Sessions 1 and 2. You are required to implement the functions and complete the exercises as described. Use OpenCV and other necessary libraries like NumPy and Matplotlib.

**Instructions:**
- Complete each task in the provided code cells.
- Test your implementations with sample images (e.g., download test images [here](https://sipi.usc.edu/database/database.php?volume=misc) or [here](https://www.hlevkin.com/hlevkin/06testimages.htm) or use your own test images).
- Include comments in your code for clarity.
- Display results using cv2.imshow() or Matplotlib where appropriate.
- Submit the completed notebook, along with any output images or explanations, to [our Google Drive folder for the CV sessions](https://drive.google.com/drive/folders/1IjVhJmAXxNQTGT-ybJ-yc5smYtR5v8CO?usp=sharing). **Upload your files in a new folder under your name.**

## Session 1: Basic Image Operations (Reading, Resizing, Cropping, Rotating)

### Task 1: Read and Display an Image
Read an image from a file and display it in both BGR and grayscale formats. Handle errors if the image cannot be read.

In [3]:
import cv2 as cv
import numpy as np
import sys
import matplotlib.pyplot as plt
%matplotlib inline

# Your code here
path = "dogBGR.jpg"  # Replace with your image path

# Read in BGR (cv.imread returns None instead of raising when the read fails)
img = cv.imread(path)
if img is None:
    sys.exit(f"Could not read image: {path}")

# Convert the BGR image to grayscale
gray_img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# Display both using cv.imshow() or plt.imshow()
cv.imshow("dog", img)
cv.waitKey(0)
cv.imshow("gray dog", gray_img)
cv.waitKey(0)
cv.destroyAllWindows()

### Task 2: Resize Image with Aspect Ratio Preservation
Implement resizing while preserving aspect ratio. Downscale to 60% and upscale to 200%. Compare shapes and display originals vs resized.

In [11]:
# Your code here
# Load image (reuses img from Task 1)

# Downscale to 60%
scale_factor = 60
h, w, _ = img.shape
new_h = int(scale_factor * h / 100)
new_w = int(scale_factor * w / 100)
dim = (new_w, new_h)  # cv.resize expects dsize as (width, height)
down_img = cv.resize(img, dim, interpolation=cv.INTER_AREA)

# Upscale to 200%
scale_factor = 200
new_h = int(scale_factor * h / 100)
new_w = int(scale_factor * w / 100)
dim = (new_w, new_h)
# INTER_AREA is intended for shrinking; INTER_CUBIC is a common choice for enlarging
up_img = cv.resize(img, dim, interpolation=cv.INTER_CUBIC)

# Compare shapes (rows, columns, channels)
print("original:", img.shape, "down:", down_img.shape, "up:", up_img.shape)

# Display all three
cv.imshow("dog", img)
cv.waitKey(0)
cv.imshow("small dog", down_img)
cv.waitKey(0)
cv.imshow("big dog", up_img)
cv.waitKey(0)
cv.destroyAllWindows()


### Task 3: Resize Without Preserving Aspect Ratio
Resize only width to 100 pixels, only height to 200 pixels, and both to (200, 200). Display and discuss distortions.

In [12]:
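# Your code here
h, w, _ = img.shape

# cv.resize expects dsize as (width, height)
width_only = cv.resize(img, (100, h), interpolation=cv.INTER_AREA)   # width -> 100, height unchanged
height_only = cv.resize(img, (w, 200), interpolation=cv.INTER_AREA)  # height -> 200, width unchanged
both = cv.resize(img, (200, 200), interpolation=cv.INTER_AREA)       # both -> 200 x 200

# Each result is squeezed or stretched along the axis that no longer matches the original ratio
for title, resized in [("width 100", width_only), ("height 200", height_only), ("200 x 200", both)]:
    cv.imshow(title, resized)
    cv.waitKey(0)
cv.destroyAllWindows()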

### Task 4: Resize Using Scale Factors (fx, fy)
Scale up by 1.2 in both directions and down by 0.6. Use different interpolations (INTER_LINEAR, INTER_NEAREST) and compare quality.

In [13]:
# Your code here
scale_up_x = 1.2
scale_up_y = 1.2
scale_down = 0.6
scaled_down = cv.resize(img, None, fx= scale_down, fy= scale_down, interpolation= cv.INTER_LINEAR)
scaled_up = cv.resize(img, None, fx= scale_up_x, fy= scale_up_y, interpolation= cv.INTER_LINEAR)

cv.imshow("dog", img)
cv.waitKey(0)
cv.imshow("small dog", scaled_down)
cv.waitKey(0)
cv.imshow("big dog", scaled_up)
cv.waitKey(0)

# Experiment with interpolations
scaled_down_nearest = cv.resize(img, None, fx= scale_down, fy= scale_down, interpolation= cv.INTER_NEAREST)
scaled_up_nearest = cv.resize(img, None, fx= scale_up_x, fy= scale_up_y, interpolation= cv.INTER_NEAREST)
cv.imshow("small dog nearest", scaled_down_nearest)
cv.waitKey(0)
cv.imshow("big dog nearest", scaled_up_nearest)
cv.waitKey(0)
cv.destroyAllWindows()

### Task 5: Cropping an Image
Crop a region (e.g., [20:200, 50:200]) from the image. Display original and cropped.

In [14]:
# Your code here
cropped = img[20:200 , 50:200]
cv.imshow("og dog", img)
cv.waitKey(0)
cv.imshow("earless dog", cropped)
cv.waitKey(0)
cv.destroyAllWindows()

### Task 6: Advanced Cropping - Patch Image into Blocks
Divide the image into 4 equal blocks (2x2 grid) by cropping. Display each block separately and then stitch them back using NumPy concatenation to verify.

In [15]:
# Your code here
# Calculate midpoints for height and width
h, w, _ = img.shape
h_mid = h // 2
w_mid = w // 2

# Crop into top-left, top-right, bottom-left, bottom-right
top_left = img[:h_mid, :w_mid]
top_right = img[:h_mid, w_mid:]
bottom_left = img[h_mid:, :w_mid]
bottom_right = img[h_mid:, w_mid:]
# Display each
titles = ["Top Left", "Top Right", "Bottom Left", "Bottom Right"]
blocks = [top_left, top_right, bottom_left, bottom_right]

for title, block in zip(titles, blocks):
    cv.imshow(title, block)
    cv.waitKey(0)
cv.destroyAllWindows()

# Stitch back (np.concatenate on axis 1 / axis 0 is equivalent to np.hstack / np.vstack)
top = np.concatenate((top_left, top_right), axis=1)
bottom = np.concatenate((bottom_left, bottom_right), axis=1)
stitched = np.concatenate((top, bottom), axis=0)
print("stitched matches original:", np.array_equal(stitched, img))
cv.imshow("back together", stitched)
cv.waitKey(0)
cv.destroyAllWindows()
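
The verification step can also be done numerically rather than by eye. This is a minimal NumPy-only sketch on a random array that stands in for an image. The odd height and width are chosen on purpose to show that slicing at the integer midpoint still reassembles exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random stand-in for a BGR image, with odd height and width on purpose
image = rng.integers(0, 256, size=(101, 75, 3), dtype=np.uint8)

h_mid, w_mid = image.shape[0] // 2, image.shape[1] // 2
top = np.hstack((image[:h_mid, :w_mid], image[:h_mid, w_mid:]))
bottom = np.hstack((image[h_mid:, :w_mid], image[h_mid:, w_mid:]))
stitched = np.vstack((top, bottom))

print(np.array_equal(stitched, image))  # True
```

With odd dimensions the four blocks are not exactly equal in size, but they still tile the image without gaps or overlap.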

### Task 7: Rotating an Image
Rotate the image by 45°, 90°, and 180° using getRotationMatrix2D and warpAffine. Display all rotations.

In [16]:
# Your code here
# Calculate center
h, w, _ = img.shape
center = (w / 2, h / 2)  # getRotationMatrix2D expects (x, y), i.e. (column, row)
# For each angle: get matrix, warp, display
# dsize stays (w, h), so corners that rotate outside the canvas are cropped
angle45matrix = cv.getRotationMatrix2D(center=center, angle=45, scale=1)
angle45image = cv.warpAffine(src=img, M=angle45matrix, dsize=(w, h))
cv.imshow("og dog", img)
cv.waitKey(0)
cv.imshow("slightly rotated dog", angle45image)
cv.waitKey(0)

angle90matrix = cv.getRotationMatrix2D(center=center, angle=90, scale=1)
angle90image = cv.warpAffine(src=img, M=angle90matrix, dsize=(w, h))
cv.imshow("mid rotated dog", angle90image)
cv.waitKey(0)

angle180matrix = cv.getRotationMatrix2D(center=center, angle=180, scale=1)
angle180image = cv.warpAffine(src=img, M=angle180matrix, dsize=(w, h))
cv.imshow("heavily rotated dog", angle180image)
cv.waitKey(0)
cv.destroyAllWindows()






### Task 8: Rotate with Scaling
Rotate by 45° and scale by 0.5 in **one** operation. Compare with separate resize and rotate.

In [17]:
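# Your code here
h, w = img.shape[:2]
center = (w / 2, h / 2)  # (x, y) order
# One operation: rotate by 45° and scale by 0.5 with a single affine matrix
rot_scale_matrix = cv.getRotationMatrix2D(center=center, angle=45, scale=0.5)
one_step = cv.warpAffine(src=img, M=rot_scale_matrix, dsize=(w, h))
# Two operations: resize to 50% first, then rotate about the smaller image's center
small = cv.resize(img, None, fx=0.5, fy=0.5, interpolation=cv.INTER_AREA)
sh, sw = small.shape[:2]
rot_matrix = cv.getRotationMatrix2D(center=(sw / 2, sh / 2), angle=45, scale=1)
two_step = cv.warpAffine(src=small, M=rot_matrix, dsize=(sw, sh))
cv.imshow("rotate + scale in one step", one_step)
cv.waitKey(0)
cv.imshow("resize, then rotate", two_step)
cv.waitKey(0)
cv.destroyAllWindows()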

## Session 2: Image Acquisition, Formats, Color Spaces, Enhancement, and Filtering

### Task 9: Read Image in Different Color Spaces
Read an image in BGR, convert to RGB (for Matplotlib), HSV, LAB and Grayscale. Display all.

In [19]:
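# Your code here
# Use cv.cvtColor()
RGB = cv.cvtColor(img, cv.COLOR_BGR2RGB)
HSV = cv.cvtColor(img, cv.COLOR_BGR2HSV)
LAB = cv.cvtColor(img, cv.COLOR_BGR2LAB)
# cv.imshow() interprets every 3-channel array as BGR, so the RGB, HSV, and LAB windows show false colors
titles = ["BGR", "Grayscale", "RGB", "HSV", "LAB"]
images = [img, gray_img, RGB, HSV, LAB]
for title, pic in zip(titles, images):
    cv.imshow(title, pic)
    cv.waitKey(0)

cv.destroyAllWindows()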

### Task 10: Image Sharpening
Apply cv2.blur() with a 5x5 kernel, then use cv2.filter2D() with sharpening kernels of varying strengths (e.g., [[0, -1, 0], [-1, 5, -1], [0, -1, 0]] and [[0, -2, 0], [-2, 9, -2], [0, -2, 0]]).
Compare between original and sharpened image after blurring.

In [20]:
# Your code here
# Use cv2.blur
blurred = cv.blur(img, (5,5))
# Define sharpen kernel, use cv.filter2D()
kernel1 = np.array([[0, -1, 0],
                    [-1,  5, -1],
                    [0, -1, 0]])

kernel2 = np.array([[0, -2, 0],
                    [-2,  9, -2],
                    [0, -2, 0]])

sharpened1 = cv.filter2D(blurred, -1, kernel1)
sharpened2 = cv.filter2D(blurred, -1, kernel2)

cv.imshow("og", img)
cv.waitKey(0)
cv.imshow("blurred", blurred)
cv.waitKey(0)
cv.imshow("lightly sharpened", sharpened1)
cv.waitKey(0)
cv.imshow("strongly sharpened", sharpened2)
cv.waitKey(0)
cv.destroyAllWindows()
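
Both sharpening kernels from the task description have weights that sum to 1, which is why flat regions keep their brightness while differences between a pixel and its neighbors are amplified. The short NumPy check below illustrates this on a single flat 3 x 3 patch; the patch value of 100 is arbitrary.

```python
import numpy as np

kernel1 = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]])
kernel2 = np.array([[0, -2, 0],
                    [-2, 9, -2],
                    [0, -2, 0]])

for name, kernel in [("kernel1", kernel1), ("kernel2", kernel2)]:
    print(name, "sum =", kernel.sum(), "center =", kernel[1, 1])

# Filter response at the center of a flat patch: the value is unchanged
flat = np.full((3, 3), 100)
print((flat * kernel1).sum(), (flat * kernel2).sum())  # 100 100
```

The second kernel has larger negative neighbor weights, so it exaggerates edges (and any remaining noise) more strongly than the first.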

### Task 11: Add Salt and Pepper Noise to Image
Implement a function to add salt and pepper noise to an image. Control noise density (e.g., 0.05).

In [4]:
# Your code here
from skimage.util import random_noise
def add_salt_pepper_noise(image, density=0.05):
    # random_noise replaces a fraction `density` of the pixels with salt or pepper
    # and returns a float image in [0, 1]
    noisy = random_noise(image, mode="s&p", amount=density)
    # Scale back to the 0-255 uint8 range that OpenCV expects
    noisy = np.array(255 * noisy, dtype="uint8")
    return noisy
# Apply to an image and display
noisy = add_salt_pepper_noise(img, 0.05)
cv.imshow("noisy", noisy)
cv.waitKey(0)
cv.destroyAllWindows()
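
The same effect can be obtained without scikit-image by setting random pixels to 0 or 255 directly. This is a minimal NumPy sketch rather than a reimplementation of `random_noise`: the function name `add_salt_pepper_numpy`, the `seed` parameter, and the even split between salt and pepper are assumptions made for illustration.

```python
import numpy as np

def add_salt_pepper_numpy(image, density=0.05, seed=None):
    """Set roughly `density` of the pixels to 0 (pepper) or 255 (salt), half each."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    # One random draw per pixel location, shared across channels
    draw = rng.random(image.shape[:2])
    noisy[draw < density / 2] = 0          # pepper
    noisy[draw > 1 - density / 2] = 255    # salt
    return noisy

# Flat gray test image, so every changed pixel is visible in the count
clean = np.full((200, 200), 128, dtype=np.uint8)
noisy_demo = add_salt_pepper_numpy(clean, density=0.05, seed=0)
print("fraction changed:", np.mean(noisy_demo != clean))
```

The printed fraction should land close to the requested density, and the input array is left untouched because the function works on a copy.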

### Task 12: Remove Salt and Pepper Noise Using Median Filter
Apply cv.medianBlur() to a noisy image. Experiment with kernel sizes (3,5,7) and compare results.

In [22]:
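# Your code here
# Kernel sizes must be odd; larger kernels remove more noise but blur more detail
for ksize in (3, 5, 7):
    medianFilter = cv.medianBlur(noisy, ksize)
    cv.imshow(f"median {ksize}x{ksize}", medianFilter)
    cv.waitKey(0)
cv.destroyAllWindows()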


### Task 13: Implement Adaptive Median Filter
Write a custom function for adaptive median filtering. It should dynamically increase window size until noise is removed or max size is reached. Apply to a noisy image and compare with standard median.

In [7]:
# Adaptive Median Filter

def adaptive_median_filter(image, max_size):
    """Adaptive median filter for a 2D grayscale image; max_size should be odd."""
    h, w = image.shape
    pad = max_size // 2
    padded = cv.copyMakeBorder(image, pad, pad, pad, pad, cv.BORDER_REFLECT)
    out = np.zeros_like(image)

    for i in range(h):
        for j in range(w):
            ci, cj = i + pad, j + pad  # position of pixel (i, j) in the padded image
            Zxy = padded[ci, cj]
            S = 3
            while True:
                k = S // 2
                # S x S window centered on the current pixel
                window = padded[ci-k:ci+k+1, cj-k:cj+k+1]
                Zmin, Zmax = window.min(), window.max()
                Zmed = np.median(window)

                if Zmin < Zmed < Zmax:
                    # The median is not an impulse: keep the pixel unless it is an extreme itself
                    out[i, j] = Zxy if Zmin < Zxy < Zmax else Zmed
                    break
                S += 2
                if S > max_size:
                    # Largest window reached: fall back to the last median
                    out[i, j] = Zmed
                    break
    return out


img = cv.imread("dogBGR.jpg", cv.IMREAD_GRAYSCALE)

noisy = random_noise(img, mode="s&p", amount=0.05)
noisy = (255 * noisy).astype("uint8")

median_std = cv.medianBlur(noisy, 3)
# max_size must exceed 3 for the window to grow at all
median_adapt = adaptive_median_filter(noisy, max_size=7)

cv.imshow("Original", img)
cv.waitKey(0)

cv.imshow("Noisy", noisy)
cv.waitKey(0)

cv.imshow("Median (3x3)", median_std)
cv.waitKey(0)

cv.imshow("Adaptive Median", median_adapt)
cv.waitKey(0)

cv.destroyAllWindows()


### Task 14: Implement Bilateral Filter Function
Write a Python function to perform bilateral filtering on an image. Use Gaussian weights for both spatial and intensity. Parameters: diameter, sigma_color, sigma_space. Compare with cv.bilateralFilter().
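
For reference, the usual form of the filter described above can be written as follows. The symbols are chosen here for illustration: $\Omega$ is the `diameter` x `diameter` window around pixel $p$, $\sigma_s$ corresponds to `sigma_space`, and $\sigma_r$ corresponds to `sigma_color`.

$$
I_f(p) = \frac{1}{W_p} \sum_{q \in \Omega} I(q)\,
\exp\!\left(-\frac{\lVert p - q \rVert^2}{2\sigma_s^2}\right)
\exp\!\left(-\frac{\bigl(I(q) - I(p)\bigr)^2}{2\sigma_r^2}\right),
\qquad
W_p = \sum_{q \in \Omega}
\exp\!\left(-\frac{\lVert p - q \rVert^2}{2\sigma_s^2}\right)
\exp\!\left(-\frac{\bigl(I(q) - I(p)\bigr)^2}{2\sigma_r^2}\right)
$$

The first exponential depends only on distance and can be precomputed once; the second depends on the intensity difference and must be recomputed for every pixel, which is what keeps edges from being averaged away.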

In [None]:
import cv2 as cv
import numpy as np

def custom_bilateral_filter(image, diameter, sigma_color, sigma_space):
    """Bilateral filter for a 2D grayscale image; diameter should be odd."""
    h, w = image.shape
    pad = diameter // 2
    # Work in float so intensity differences do not wrap around in uint8 arithmetic
    padded = cv.copyMakeBorder(image, pad, pad, pad, pad, cv.BORDER_REFLECT).astype(np.float64)

    # Precompute spatial Gaussian weights
    y, x = np.mgrid[-pad:pad+1, -pad:pad+1]
    spatial_weights = np.exp(-(x**2 + y**2) / (2 * sigma_space**2))

    output = np.zeros(image.shape, dtype=np.float64)

    for i in range(h):
        for j in range(w):
            region = padded[i:i+diameter, j:j+diameter]
            # Difference between each neighbor and the center pixel
            intensity_diff = region - padded[i + pad, j + pad]
            range_weights = np.exp(-(intensity_diff**2) / (2 * sigma_color**2))

            # Total weight = spatial * range, normalized to sum to 1
            weights = spatial_weights * range_weights
            weights /= np.sum(weights)

            output[i, j] = np.sum(region * weights)

    return np.clip(np.rint(output), 0, 255).astype(np.uint8)


img = cv.imread("dogBGR.jpg", cv.IMREAD_GRAYSCALE)

# Use the same parameters for both filters so the comparison is fair
custom = custom_bilateral_filter(img, diameter=9, sigma_color=75, sigma_space=75)
opencv = cv.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)

cv.imshow("Original", img)
cv.waitKey(0)
cv.imshow("Custom Bilateral", custom)
cv.waitKey(0)
cv.imshow("OpenCV Bilateral", opencv)
cv.waitKey(0)
cv.destroyAllWindows()


### [BONUS] Task 15: Comprehensive Camera Task 
Combine: Live camera feed -> grayscale -> add noise -> remove with median -> sharpen. Display all stages in separate windows.

In [None]:
# To read video from camera example:
import sys
import cv2 as cv
import numpy as np
from skimage.util import random_noise

camera_id = 0
delay = 1

windows = ['og', 'gray', 'noisy', 'median filter', 'sharpened']

cap = cv.VideoCapture(camera_id)

if not cap.isOpened():
    sys.exit("Could not open camera")

sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5,-1],
                           [0, -1, 0]])

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    noisy = random_noise(gray, mode="s&p", amount=0.05)
    noisy = (255 * noisy).astype(np.uint8)
    medianFilter = cv.medianBlur(noisy, 5)
    sharpened = cv.filter2D(medianFilter, -1, sharpen_kernel)

    results = [frame, gray, noisy, medianFilter, sharpened]

    # Show every stage in its own window
    for win_name, stage in zip(windows, results):
        cv.imshow(win_name, stage)

    # Poll the keyboard once per frame; press 'q' to quit
    if cv.waitKey(delay) & 0xFF == ord('q'):
        break

cap.release()
cv.destroyAllWindows()

### [BONUS] Task 16: Comprehensive Video Task
Similar to Task 15 but for a video file. Save the final processed video.

In [None]:
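import sys
import cv2 as cv
import numpy as np
from skimage.util import random_noise

delay = 1

windows = ['og', 'gray', 'noisy', 'median filter', 'sharpened']

cap = cv.VideoCapture(r'C:\Users\rsl_f\OneDrive\Desktop\shows\severance\Severance S02E07.mkv')
if not cap.isOpened():
    sys.exit("Could not open video file")

# Writer for the final (sharpened) stage; output name and codec are placeholders
fps = cap.get(cv.CAP_PROP_FPS) or 30.0  # fall back to 30 if the file reports 0
width = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
fourcc = cv.VideoWriter_fourcc(*"mp4v")
writer = cv.VideoWriter("processed_output.mp4", fourcc, fps, (width, height))

sharpen_kernel = np.array([[0, -1, 0],
                           [-1, 5,-1],
                           [0, -1, 0]])

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    noisy = random_noise(gray, mode="s&p", amount=0.05)
    noisy = (255 * noisy).astype(np.uint8)
    medianFilter = cv.medianBlur(noisy, 5)
    sharpened = cv.filter2D(medianFilter, -1, sharpen_kernel)

    # The writer was opened for color frames, so expand grayscale to 3 channels
    writer.write(cv.cvtColor(sharpened, cv.COLOR_GRAY2BGR))

    results = [frame, gray, noisy, medianFilter, sharpened]

    # Show every stage in its own window
    for win_name, stage in zip(windows, results):
        cv.imshow(win_name, stage)

    # Poll the keyboard once per frame; press 'q' to quit
    if cv.waitKey(delay) & 0xFF == ord('q'):
        break

cap.release()
writer.release()
cv.destroyAllWindows()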

### Task 17: Performance Comparison
Time the execution of standard median vs adaptive median on a large noisy image. Discuss when adaptive median filter is better.

In [8]:
import time
# Your code here
# Use time.time() to measure

img = cv.imread("dogBGR.jpg", cv.IMREAD_GRAYSCALE)  # adaptive_median_filter expects a 2D image
noisy = random_noise(img, mode="s&p", amount=0.05)
noisy = (255 * noisy).astype(np.uint8)

start = time.time()
median_std = cv.medianBlur(noisy, 5)
end = time.time()
print("standard median filter time:", end - start, "seconds")

start = time.time()
median_adapt = adaptive_median_filter(noisy, max_size=3)
end = time.time()
print("adaptive median filter time", end - start, "seconds")


standard median filter time: 0.003565549850463867 seconds
adaptive median filter time 29.75269603729248 seconds


The adaptive median filter is the better choice for images with high-density impulse noise. It preserves edge information, so it suits cases that need denoising while keeping fine detail. Its limitation is run time: the standard median filter is a built-in OpenCV function with a fixed window size, whereas the adaptive median filter here is a pure-Python loop that checks variable window sizes for every pixel, so it takes far longer (about 29.75 s versus 0.0036 s in the run above).
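
Single measurements like the ones above can vary from run to run. One common way to reduce that noise is to repeat the call and keep the best time. The sketch below uses `time.perf_counter()` instead of the `time.time()` suggested in the task, because it is intended for measuring short durations. The helper name `time_call` and the stand-in workload are assumptions for illustration.

```python
import time

def time_call(fn, *args, repeats=3, **kwargs):
    """Run fn(*args, **kwargs) `repeats` times; return (best_seconds, last_result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best, result

# Stand-in workload; in the notebook this could be cv.medianBlur or adaptive_median_filter
seconds, total = time_call(sum, range(100_000))
print(f"best of 3: {seconds:.6f} s, result = {total}")
```

In the notebook, a call such as `time_call(cv.medianBlur, noisy, 5)` would time the standard filter in the same way.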