# OVO Tutorial #1: Multi-Scale Image Analysis and Edge Detection

In this tutorial, we'll explore:
1. Basic image operations and homography estimation
2. Multi-scale image analysis using image pyramids
3. Edge detection using the Canny algorithm

## Setup and Imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2

# Configure matplotlib for notebook display
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['figure.dpi'] = 100

# Part 1: Image Basics and Homography
## Exercise 1.1: Loading and Basic Image Properties

Explore the image's basic properties:
- Display the image shape
- Print the data type
- Show the value range (min and max)
- Display the image using matplotlib

In [None]:
# Load the image as grayscale by passing cv2.IMREAD_GRAYSCALE as the second argument to imread
gray = cv2.imread('graycat.jpg', cv2.IMREAD_GRAYSCALE)

In [None]:
# Write your solution here
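
One possible way to inspect these properties is sketched below. To stay runnable without the image file, it uses a synthetic array named `gray_demo` (a stand-in introduced here, not part of the tutorial); in the notebook you would apply the same calls to `gray`.

```python
import numpy as np

# Synthetic stand-in for the loaded image (assumption: in the notebook, use `gray`).
gray_demo = np.arange(256, dtype=np.uint8).reshape(16, 16)

# cv2.imread returns None instead of raising when a file cannot be read,
# so a check such as `assert gray is not None` is worth adding after loading.

print("shape:", gray_demo.shape)   # (rows, cols) for a grayscale image
print("dtype:", gray_demo.dtype)   # uint8 for a standard 8-bit image
print("range:", gray_demo.min(), "to", gray_demo.max())

# Display step, left as a comment so the sketch needs only NumPy:
# plt.imshow(gray, cmap='gray', vmin=0, vmax=255); plt.axis('off'); plt.show()
```

Passing `vmin=0, vmax=255` keeps matplotlib from stretching the contrast of images that do not span the full range.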

## Exercise 1.2: Point Transformations

Implement two functions:
1. `adjust_brightness(image, beta)`: Adjusts image brightness by adding beta
2. `adjust_contrast(image, alpha)`: Adjusts image contrast by multiplying by alpha

Remember to:
- Convert to float for calculations
- Convert back to uint8

In [None]:
def adjust_brightness(image, beta):
    """Adjust brightness by adding a constant

    Args:
        image: uint8 image array
        beta: brightness adjustment (-255 to 255)
    """
    # Your code here
    pass

def adjust_contrast(image, alpha):
    """Adjust contrast by multiplication

    Args:
        image: uint8 image array
        alpha: contrast adjustment (0 to 3)
    """
    # Your code here
    pass

# Test your implementation: beta in {50, -50} and alpha in {0.5, 1.5}
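
A minimal sketch of both point transformations follows. It assumes that out-of-range results should be clipped to [0, 255] before converting back to `uint8`; without clipping, the cast would wrap around. The `demo` array is a made-up test input.

```python
import numpy as np

def adjust_brightness(image, beta):
    """Add beta to every pixel, clipping to the valid uint8 range."""
    out = image.astype(np.float32) + beta
    return np.clip(out, 0, 255).astype(np.uint8)

def adjust_contrast(image, alpha):
    """Multiply every pixel by alpha, clipping to the valid uint8 range."""
    out = image.astype(np.float32) * alpha
    return np.clip(out, 0, 255).astype(np.uint8)

# Made-up test input covering dark, mid, and bright values
demo = np.array([[0, 100, 200, 255]], dtype=np.uint8)
for beta in (50, -50):
    print("beta", beta, adjust_brightness(demo, beta))
for alpha in (0.5, 1.5):
    print("alpha", alpha, adjust_contrast(demo, alpha))
```

Note that the final cast truncates fractional values; use `np.rint` before the cast if rounding is preferred.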

## Exercise 1.3: Working with Image Patches

Implement the `extract_patch` function that extracts a rectangular region from an image:
- Take center coordinates and patch size as input
- Return the extracted patch

In [None]:
def extract_patch(image, center, size):
    """Extract a patch from the image

    Args:
        image: Input image
        center: (x, y) coordinates of patch center
        size: (width, height) of patch
    """
    # Your code here
    pass

# Test your implementation with 100x100 patches centered at the image center and at the center of each quadrant (5 patches total)
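
The sketch below shows one way to extract a patch. It assumes that a patch reaching beyond the image border is simply cropped to the valid region (the tutorial does not specify border behaviour), and it uses a synthetic image so it runs without any files.

```python
import numpy as np

def extract_patch(image, center, size):
    """Return the (width, height) patch around center=(x, y), cropped at the borders."""
    cx, cy = int(center[0]), int(center[1])
    w, h = size
    x0, y0 = cx - w // 2, cy - h // 2
    # Clamp the window to the image; rows index y, columns index x
    x0c, y0c = max(x0, 0), max(y0, 0)
    x1c = min(x0 + w, image.shape[1])
    y1c = min(y0 + h, image.shape[0])
    return image[y0c:y1c, x0c:x1c]

# Synthetic 400x600 image (rows x cols); each pixel stores its flat index
image = np.arange(400 * 600).reshape(400, 600)
H, W = image.shape
centers = [(W // 2, H // 2),
           (W // 4, H // 4), (3 * W // 4, H // 4),
           (W // 4, 3 * H // 4), (3 * W // 4, 3 * H // 4)]
patches = [extract_patch(image, c, (100, 100)) for c in centers]
print([p.shape for p in patches])
```

The slice returns a view into the original array; call `.copy()` on the result if you intend to modify the patch.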

## Exercise 1.4: Frequency Analysis

For each extracted patch:
1. Compute the 2D FFT
2. Visualize the log magnitude spectrum

Questions to consider:
- How does the frequency content differ between smooth and detailed regions?
- What patterns do you see in the magnitude spectra?

In [None]:
# Write your solution here
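
As a rough illustration, the log magnitude spectrum can be computed as below. The helper name `log_magnitude_spectrum` and the two synthetic patches are assumptions made for this sketch; in the notebook you would pass the patches from Exercise 1.3 and display each result with `plt.imshow(..., cmap='gray')`.

```python
import numpy as np

def log_magnitude_spectrum(patch):
    """2D FFT of a patch, shifted so the zero frequency sits at the center."""
    spectrum = np.fft.fftshift(np.fft.fft2(patch.astype(np.float64)))
    # log1p compresses the dynamic range and is safe where the magnitude is zero
    return np.log1p(np.abs(spectrum))

# Two synthetic 64x64 patches: a flat region and vertical stripes (8 cycles across)
yy, xx = np.mgrid[0:64, 0:64]
smooth = np.full((64, 64), 128.0)
stripes = 128.0 + 100.0 * np.sin(2 * np.pi * 8 * xx / 64)

spec_smooth = log_magnitude_spectrum(smooth)
spec_stripes = log_magnitude_spectrum(stripes)

# The flat patch has all of its energy at the center (zero frequency)
print(np.unravel_index(spec_smooth.argmax(), spec_smooth.shape))
# The stripes add a symmetric pair of peaks 8 bins left and right of the center
print(spec_stripes[32, 24], spec_stripes[32, 40])
```

Smooth regions concentrate energy near the center of the shifted spectrum, while oriented detail shows up as peaks along the axis perpendicular to the stripes.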

## Exercise 1.5: Homography Estimation

1. Understand how the DLT algorithm works
2. Implement the `estimate_homography_dlt` function
3. Test with the provided point correspondences

In [None]:
def estimate_homography_dlt(pts1, pts2):
    """Estimate homography matrix using DLT

    Args:
        pts1, pts2: 4x2 arrays of corresponding points
    Returns:
        H: 3x3 homography matrix mapping pts1 to pts2
    """
    # Your code here

    # Construct the equations matrix A

    # Solve using SVD

    # Return normalized homography
    pass
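
The following is a minimal sketch of the basic DLT, without the coordinate normalization that is often recommended for numerical stability with large pixel coordinates. Each correspondence (x, y) -> (u, v) contributes two rows to A, and the solution is the right singular vector of A with the smallest singular value. The helper `apply_homography` and the matrix `H_true` are hypothetical, introduced only to check the result.

```python
import numpy as np

def estimate_homography_dlt(pts1, pts2):
    """Estimate H (3x3) such that pts2 ~ H @ pts1 in homogeneous coordinates."""
    pts1 = np.asarray(pts1, dtype=np.float64)
    pts2 = np.asarray(pts2, dtype=np.float64)
    rows = []
    for (x, y), (u, v) in zip(pts1, pts2):
        # Two linear constraints on the 9 entries of H per correspondence
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The null vector of A is the last row of Vt
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    # Normalize so that H[2, 2] == 1 (assumes H[2, 2] is not close to zero)
    return H / H[2, 2]

def apply_homography(H, pts):
    """Hypothetical helper: map Nx2 points through H and dehomogenize."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

# Synthetic check: recover a known homography from 4 exact correspondences
H_true = np.array([[1.2, 0.1, 30.0], [-0.05, 0.9, -20.0], [1e-4, 2e-4, 1.0]])
src = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
dst = apply_homography(H_true, src)
H_est = estimate_homography_dlt(src, dst)
print(np.abs(H_est - H_true).max())
```

With exactly four correspondences the fit is exact, so any error in a clicked point transfers directly into H; using more points turns the same SVD into a least-squares estimate.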

The following code visualizes your homography. You do not need to modify it; just make sure the previous function is implemented correctly.

In [None]:
def verify_homography(img1, img2, H, padding_percent=10):
    """Verify homography by displaying a visual comparison of two images and their alignment

    Args:
        img1 (numpy.ndarray): First input image
        img2 (numpy.ndarray): Second input image
        H (numpy.ndarray): 3x3 homography matrix mapping img1 to img2
        padding_percent (int): Amount of padding to add around output image in percent

    Returns:
        numpy.ndarray: Blended result showing alignment of warped img1 with img2
    """

    # Ensure images are grayscale
    if len(img1.shape) == 3:
        img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    if len(img2.shape) == 3:
        img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

    h2, w2 = img2.shape[:2]

    # Calculate padding
    pad_x = int(w2 * padding_percent / 100)
    pad_y = int(h2 * padding_percent / 100)

    # Create output size with padding
    out_w = w2 + 2*pad_x
    out_h = h2 + 2*pad_y

    # Adjust homography for the padding offset
    T = np.array([
        [1, 0, pad_x],
        [0, 1, pad_y],
        [0, 0, 1]
    ])
    H_adj = T @ H

    # Warp and create padded images
    img1_warped = cv2.warpPerspective(img1, H_adj, (out_w, out_h))
    img2_padded = np.zeros((out_h, out_w), dtype=np.uint8)
    img2_padded[pad_y:pad_y+h2, pad_x:pad_x+w2] = img2

    # Visualization (the scatter plots use the global pts1 and pts2 defined after this function)
    plt.figure(figsize=(15, 5))

    plt.subplot(131)
    plt.imshow(img1, cmap='gray')
    plt.scatter(pts1[:, 0], pts1[:, 1], c='r', s=100)
    plt.title('Image 1 with points')
    plt.axis('off')

    plt.subplot(132)
    plt.imshow(img2, cmap='gray')
    plt.scatter(pts2[:, 0], pts2[:, 1], c='r', s=100)
    plt.title('Image 2 with points')
    plt.axis('off')

    plt.subplot(133)
    blend = cv2.addWeighted(img1_warped, 0.5, img2_padded, 0.5, 0)
    plt.imshow(blend, cmap='gray')
    plt.title('Blended Result')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

    return blend

# Correspondence points are provided
pts1 = np.array([[2023.2, 2350.1], [1408.0, 1536.8],
                 [1770.8, 2176.3], [1754.1, 2074.8]])
pts2 = np.array([[2982.5, 2398.1], [2354.1, 1538.9],
                 [2710.7, 2194.5], [2694.3, 2090.2]])

Load the images `grayforest1.jpg` and `grayforest2.jpg` and compute the homography between them using the given correspondence points.

In [None]:
# Load the images
# Your code here

# Compute the homography using estimate_homography_dlt
# Your code here

# Visualize results by calling the verify_homography function
# Your code here

# Part 2: Multi-Scale Image Analysis
## Exercise 2.1: Convolution and Gaussian Kernels

First, implement a basic 2D convolution function:
1. Zero padding
2. Stride = 1 (simple convolution)
3. Test with simple kernels (e.g., blur, sharpen, edge detection)

In [None]:
def zero_pad(image, pad):
    """Add zero padding around the border of an image

    Args:
        image (numpy.ndarray): Input image to pad
        pad (int): Number of pixels of padding to add on all sides

    Returns:
        numpy.ndarray: Zero-padded image with original image in center
    """
    # Your code here
    pass

def conv_2d(image, kernel):
    """Perform 2D convolution of an image with a kernel

    Args:
        image (numpy.ndarray): Input image to convolve
        kernel (numpy.ndarray): 2D convolution kernel

    Returns:
        numpy.ndarray: Result of convolving image with kernel

    Notes:
        Uses zero padding and stride=1
    """
    # Your code here
    pass
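
A possible implementation is sketched below. It assumes a 2D grayscale image and a square kernel of odd size, and it performs true convolution (the kernel is flipped), which differs from cross-correlation only for asymmetric kernels. The output has the same shape as the input.

```python
import numpy as np

def zero_pad(image, pad):
    """Surround a 2D image with `pad` zeros on every side."""
    return np.pad(image, pad, mode='constant', constant_values=0)

def conv_2d(image, kernel):
    """Same-size 2D convolution with zero padding and stride 1."""
    k = kernel.shape[0]
    h, w = image.shape
    padded = zero_pad(image.astype(np.float64), k // 2)
    flipped = kernel[::-1, ::-1]  # flipping turns correlation into convolution
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel element
    for i in range(k):
        for j in range(k):
            out += flipped[i, j] * padded[i:i + h, j:j + w]
    return out

# Simple kernels to try
box_blur = np.ones((3, 3)) / 9.0
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float64)
edge = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=np.float64)

img = np.ones((5, 5))
blurred = conv_2d(img, box_blur)
print(blurred[2, 2], blurred[0, 0])  # the corner is darker because of the zero padding
```

Looping over the kernel elements rather than over pixels keeps the Python loop short (k*k iterations) while NumPy handles the per-pixel work.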

Now work with Gaussian kernels:

We provide a function to create Gaussian kernels. Your tasks:
1. Experiment with different kernel sizes and sigma values
2. Visualize and analyze how parameters affect the kernel shape
3. Consider the implications for image smoothing

In [None]:
def create_gaussian_kernel(size=5, sigma=1.0):
    """Create a 2D Gaussian kernel for filtering

    Args:
        size (int): Size of the kernel (must be odd)
        sigma (float): Standard deviation of the Gaussian distribution

    Returns:
        numpy.ndarray: Normalized 2D Gaussian kernel of shape (size, size)

    Raises:
        ValueError: If size is not odd
    """
    if size % 2 == 0:
        raise ValueError("Kernel size must be odd")

    x = np.linspace(-(size//2), size//2, size)
    y = x[:, np.newaxis]
    kernel = np.exp(-(x*x + y*y)/(2*sigma*sigma))
    return kernel / kernel.sum()

In [None]:
# Your code here
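
One way to start the experiment is to print a few summary numbers per kernel before plotting. The sketch repeats the tutorial's `create_gaussian_kernel` so that it runs on its own; the chosen (size, sigma) pairs are arbitrary examples.

```python
import numpy as np

def create_gaussian_kernel(size=5, sigma=1.0):
    """Same construction as the tutorial's function, repeated for self-containment."""
    if size % 2 == 0:
        raise ValueError("Kernel size must be odd")
    x = np.linspace(-(size // 2), size // 2, size)
    y = x[:, np.newaxis]
    kernel = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
    return kernel / kernel.sum()

for size, sigma in [(5, 0.5), (5, 1.0), (5, 3.0), (15, 3.0)]:
    k = create_gaussian_kernel(size, sigma)
    center = k[size // 2, size // 2]
    # A corner/center ratio far from 0 means the Gaussian is cut off by the window
    corner_ratio = k[0, 0] / center
    print(f"size={size:2d} sigma={sigma:.1f} "
          f"center={center:.4f} corner/center={corner_ratio:.4f}")
```

When sigma is large relative to the window (for example size 5 with sigma 3), the kernel is heavily truncated and behaves almost like a box filter; a common rule of thumb is to choose a size of roughly 6 * sigma + 1.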

## Exercise 2.2: Building Gaussian Pyramids

In [None]:
def downsample(image):
    """Downsample image by factor of 2 by keeping every second row and column (no averaging; blur first to avoid aliasing)

    Args:
        image (numpy.ndarray): Input image to downsample

    Returns:
        numpy.ndarray: Downsampled image at half the input resolution
    """
    return image[::2, ::2]

def upsample(image, original_shape):
    """Upsample image by factor of 2 using nearest-neighbour replication

    Args:
        image (numpy.ndarray): Input image
        original_shape (tuple): Shape to crop the upsampled result to

    Returns:
        numpy.ndarray: Upsampled image
    """
    upsampled = np.zeros((image.shape[0]*2, image.shape[1]*2))
    upsampled[::2, ::2] = image
    # Copy each pixel into its right, lower, and diagonal neighbours (no interpolation)
    upsampled[1::2, ::2] = upsampled[:-1:2, ::2]
    upsampled[::2, 1::2] = upsampled[::2, :-1:2]
    upsampled[1::2, 1::2] = upsampled[:-1:2, :-1:2]
    return upsampled[:original_shape[0], :original_shape[1]]

Implement the `build_gaussian_pyramid` function.

In [None]:
def build_gaussian_pyramid(image, levels=4, kernel_size=5, sigma=1.0):
    """Build Gaussian pyramid from input image

    Args:
        image (numpy.ndarray): Input image
        levels (int): Number of pyramid levels
        kernel_size (int): Size of Gaussian kernel
        sigma (float): Standard deviation of Gaussian

    Returns:
        list: Gaussian pyramid levels
    """
    # Your code here
    pass
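
A compact sketch of the pyramid construction follows. It assumes that level 0 is the original image, so the returned list has `levels` entries, and that every level is at least as large as the kernel. The helper `gaussian_blur` is hypothetical: it applies the Gaussian separably with zero padding, which gives the same result as convolving with the normalized 2D kernel.

```python
import numpy as np

def gaussian_blur(image, kernel_size=5, sigma=1.0):
    """Hypothetical helper: separable Gaussian blur with zero padding."""
    ax = np.arange(kernel_size) - kernel_size // 2
    k = np.exp(-ax**2 / (2 * sigma**2))
    k /= k.sum()
    # Filter along rows, then along columns; mode='same' keeps the length
    rows = np.apply_along_axis(np.convolve, 1, image, k, mode='same')
    return np.apply_along_axis(np.convolve, 0, rows, k, mode='same')

def build_gaussian_pyramid(image, levels=4, kernel_size=5, sigma=1.0):
    """Level 0 is the input; each further level is blurred, then subsampled by 2."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        blurred = gaussian_blur(pyramid[-1], kernel_size, sigma)
        pyramid.append(blurred[::2, ::2])
    return pyramid

# Synthetic test image so the sketch runs without any files
rng = np.random.default_rng(0)
demo_image = rng.uniform(0, 255, size=(64, 64))
demo_pyramid = build_gaussian_pyramid(demo_image)
print([level.shape for level in demo_pyramid])
# → [(64, 64), (32, 32), (16, 16), (8, 8)]
```

Blurring before subsampling is what separates a Gaussian pyramid from plain decimation: it removes the high frequencies that would otherwise alias into the smaller image.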

In [None]:
# Load a test image into the variable `image1`
# Your code here

# Build and visualize Gaussian pyramid for first image
gauss_pyramid = build_gaussian_pyramid(image1)

# Visualize Gaussian pyramid
plt.figure(figsize=(15, 3))
for i, level in enumerate(gauss_pyramid):
    plt.subplot(1, len(gauss_pyramid), i+1)
    plt.imshow(level, cmap='gray')
    plt.title(f'Level {i}')
    plt.axis('off')
plt.suptitle('Gaussian Pyramid of Image 1')
plt.tight_layout()
plt.show()

## Exercise 2.3: Laplacian Pyramid

Implement functions for:
1. Building the Laplacian pyramid
2. Reconstructing the original image from the Laplacian pyramid

Consider:
- How does the Laplacian pyramid represent image details?
- What information is captured at each level?

In [None]:
def build_laplacian_pyramid(gaussian_pyramid):
    """Build Laplacian pyramid from Gaussian pyramid

    Args:
        gaussian_pyramid (list): Gaussian pyramid levels

    Returns:
        list: Laplacian pyramid levels
    """
    # Your code here
    pass

def reconstruct_from_laplacian(laplacian_pyramid):
    """Reconstruct image from Laplacian pyramid

    Args:
        laplacian_pyramid (list): Laplacian pyramid levels

    Returns:
        numpy.ndarray: Reconstructed image
    """
    # Your code here
    pass
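
The two functions could look like the sketch below. Each Laplacian level stores what is lost when going one level down the Gaussian pyramid, and the coarsest Gaussian level is kept as the last entry so that reconstruction can start from it. The helper `upsample_nn` is a hypothetical nearest-neighbour stand-in for the tutorial's `upsample`, and the demo pyramid is built by plain subsampling only to keep the block short.

```python
import numpy as np

def upsample_nn(image, shape):
    """Hypothetical helper: duplicate every pixel 2x2, then crop to `shape`."""
    up = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def build_laplacian_pyramid(gaussian_pyramid):
    """Detail levels G_i - up(G_{i+1}), followed by the coarsest Gaussian level."""
    laplacian = []
    for fine, coarse in zip(gaussian_pyramid[:-1], gaussian_pyramid[1:]):
        laplacian.append(fine - upsample_nn(coarse, fine.shape))
    laplacian.append(gaussian_pyramid[-1])
    return laplacian

def reconstruct_from_laplacian(laplacian_pyramid):
    """Start at the coarsest level and add the stored details back in."""
    image = laplacian_pyramid[-1]
    for detail in reversed(laplacian_pyramid[:-1]):
        image = upsample_nn(image, detail.shape) + detail
    return image

# Stand-in pyramid (assumption: in the notebook, use build_gaussian_pyramid instead)
rng = np.random.default_rng(0)
original = rng.uniform(0, 255, size=(65, 47))
stand_in_pyramid = [original]
for _ in range(3):
    stand_in_pyramid.append(stand_in_pyramid[-1][::2, ::2])

lap = build_laplacian_pyramid(stand_in_pyramid)
error = np.abs(reconstruct_from_laplacian(lap) - original)
print("max reconstruction error:", error.max())
```

Reconstruction is exact up to floating-point rounding as long as the same upsampling function is used in both directions, regardless of how good that upsampling is.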

Plot the pixel-wise error of the reconstruction using the Laplacian pyramid.

In [None]:
# Your code here

## Exercise 2.4: Multi-Scale Image Blending

Implement the pyramid blending algorithm:
1. Build pyramids for both images
2. Build pyramid for the mask
3. Blend pyramids
4. Reconstruct final result

In [None]:
def pyramid_blend(image1, image2, mask, levels=4):
    """Blend two images using pyramid blending

    Args:
        image1, image2 (numpy.ndarray): Images to blend
        mask (numpy.ndarray): Blending mask
        levels (int): Number of pyramid levels

    Returns:
        numpy.ndarray: Blended image
    """
    # Your code here
    pass
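
The four steps above can be sketched as follows. All underscore-prefixed helpers are hypothetical and kept deliberately small; the sketch assumes grayscale images of equal shape, a mask where 1 (or 255) selects `image1`, and levels that stay larger than the 5-tap kernel.

```python
import numpy as np

def _blur(img, size=5, sigma=1.0):
    """Hypothetical helper: separable Gaussian blur with zero padding."""
    ax = np.arange(size) - size // 2
    k = np.exp(-ax**2 / (2 * sigma**2))
    k /= k.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, k, mode='same')
    return np.apply_along_axis(np.convolve, 0, rows, k, mode='same')

def _gauss_pyr(img, levels):
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyr.append(_blur(pyr[-1])[::2, ::2])
    return pyr

def _up(img, shape):
    """Nearest-neighbour upsampling by 2, cropped to `shape`."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]

def _lap_pyr(gp):
    return [f - _up(c, f.shape) for f, c in zip(gp[:-1], gp[1:])] + [gp[-1]]

def pyramid_blend(image1, image2, mask, levels=4):
    """Blend Laplacian levels of both images using a Gaussian pyramid of the mask."""
    mask = mask.astype(np.float64)
    if mask.max() > 1:          # accept 0..255 masks as well as 0..1 masks
        mask = mask / 255.0
    lp1 = _lap_pyr(_gauss_pyr(image1, levels))
    lp2 = _lap_pyr(_gauss_pyr(image2, levels))
    gm = _gauss_pyr(mask, levels)
    blended = [m * a + (1 - m) * b for a, b, m in zip(lp1, lp2, gm)]
    # Collapse the blended pyramid from coarse to fine
    out = blended[-1]
    for lap in reversed(blended[:-1]):
        out = _up(out, lap.shape) + lap
    return np.clip(out, 0, 255)

# Synthetic demo: dark image on the left, bright image on the right
img_a = np.full((64, 64), 50.0)
img_b = np.full((64, 64), 200.0)
demo_mask = np.zeros((64, 64))
demo_mask[:, :32] = 1
result = pyramid_blend(img_a, img_b, demo_mask)
print(result.shape, result[32, 0].round(1), result[32, 63].round(1))
```

Because the mask is smoothed more strongly at coarser levels, low frequencies are mixed over a wide band while fine detail switches over sharply, which is what hides the seam.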

Demonstrate the pyramid blend on the grayscale images `racoon.jpg` and `crowd.jpg` using the mask provided in `blend_mask.jpg`.

In [None]:
# Your code here

How does the number of levels in the pyramid affect the blending?

# Part 3: Edge Detection
## Exercise 3.1: Gradient Computation

Implement gradient computation using Sobel operators:
1. Define Sobel kernels
2. Compute x and y gradients
3. Calculate magnitude and direction

In [None]:
def compute_gradients(image):
    """Compute gradients using Sobel operators

    Args:
        image (numpy.ndarray): Input image

    Returns:
        tuple: (gradient magnitude, gradient direction in radians)
    """
    # Your code here
    pass
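
A minimal sketch of the three steps is given below. It assumes cross-correlation (no kernel flip), so the x-gradient is positive where intensity increases to the right and the y-gradient is positive where intensity increases downward. The helper `correlate3x3` is hypothetical and uses zero padding, which produces spurious gradients along the image border.

```python
import numpy as np

def correlate3x3(image, kernel):
    """Hypothetical helper: 3x3 cross-correlation with zero padding."""
    h, w = image.shape
    padded = np.pad(image.astype(np.float64), 1, mode='constant')
    out = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def compute_gradients(image):
    """Return (magnitude, direction in radians) from Sobel responses."""
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    sobel_y = sobel_x.T
    gx = correlate3x3(image, sobel_x)
    gy = correlate3x3(image, sobel_y)
    magnitude = np.hypot(gx, gy)
    direction = np.arctan2(gy, gx)  # range (-pi, pi], y axis pointing down
    return magnitude, direction

# Horizontal ramp: intensity grows by 1 per column
ramp = np.tile(np.arange(8, dtype=np.float64), (8, 1))
mag, ang = compute_gradients(ramp)
print(mag[4, 4], ang[4, 4])  # → 8.0 0.0
```

The Sobel response to a unit ramp is 8 rather than 1 because the kernel weights are not normalized; divide by 8 if you need gradients in intensity units per pixel.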

## Exercise 3.2: Non-Maximum Suppression

Implement non-maximum suppression:
1. Convert the gradient direction from radians to degrees and quantize it to one of four orientations (0, 45, 90, 135 degrees)
2. Compare magnitude with neighbors along gradient direction
3. Suppress non-maximum pixels

In [None]:
def non_maximum_suppression(magnitude, direction):
    """Apply non-maximum suppression to gradient magnitude

    Args:
        magnitude (numpy.ndarray): Gradient magnitude
        direction (numpy.ndarray): Gradient direction in radians

    Returns:
        numpy.ndarray: Suppressed gradient magnitude
    """
    # Your code here
    pass
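
One straightforward (and unoptimized) version is sketched below. It assumes the direction was computed as `arctan2(gy, gx)` with the y axis pointing down the rows, quantizes it into four sectors, and leaves the one-pixel border at zero. Ties are kept, so a plateau of equal values survives.

```python
import numpy as np

def non_maximum_suppression(magnitude, direction):
    """Keep a pixel only if it is not smaller than both neighbours along the gradient."""
    h, w = magnitude.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Opposite directions share the same neighbours, so fold angles into [0, 180)
    angle = np.rad2deg(direction) % 180
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:      # gradient roughly horizontal
                n1, n2 = magnitude[i, j - 1], magnitude[i, j + 1]
            elif a < 67.5:                  # roughly 45 degrees (down-right)
                n1, n2 = magnitude[i - 1, j - 1], magnitude[i + 1, j + 1]
            elif a < 112.5:                 # gradient roughly vertical
                n1, n2 = magnitude[i - 1, j], magnitude[i + 1, j]
            else:                           # roughly 135 degrees (down-left)
                n1, n2 = magnitude[i - 1, j + 1], magnitude[i + 1, j - 1]
            if magnitude[i, j] >= n1 and magnitude[i, j] >= n2:
                out[i, j] = magnitude[i, j]
    return out

# A 3-pixel-wide vertical ridge whose gradient points horizontally
ridge = np.zeros((7, 9))
ridge[:, 3:6] = [1.0, 3.0, 1.0]
thin = non_maximum_suppression(ridge, np.zeros_like(ridge))
print(thin[3])  # only the ridge's center column keeps its value
```

The comparison runs along the gradient, that is, across the edge, which is why a thick ridge collapses to a single-pixel line.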

## Exercise 3.3: Double Thresholding and Edge Tracking

Implement:
1. Double thresholding to identify strong/weak edges
2. Edge tracking by hysteresis

In [None]:
def double_threshold(image, low_ratio=0.05, high_ratio=0.15):
    """Apply double thresholding to classify edges

    Args:
        image (numpy.ndarray): Input image
        low_ratio (float): Low threshold ratio
        high_ratio (float): High threshold ratio

    Returns:
        tuple: (strong edges, weak edges)
    """
    # Your code here
    pass

def edge_tracking(strong_edges, weak_edges):
    """Track edges using hysteresis.

    Args:
        strong_edges (numpy.ndarray): Binary image of strong edges
        weak_edges (numpy.ndarray): Binary image of weak edges

    Returns:
        numpy.ndarray: Final binary edge image
    """
    # Your code here
    pass
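
A possible sketch of both steps follows. It assumes that both ratios are fractions of the maximum value in the input (some formulations instead define the low threshold relative to the high one) and that weak pixels are promoted when they touch an accepted edge pixel in their 8-neighbourhood, repeated until nothing changes.

```python
import numpy as np

def double_threshold(image, low_ratio=0.05, high_ratio=0.15):
    """Split pixels into strong (>= high) and weak (between low and high) masks."""
    high = image.max() * high_ratio
    low = image.max() * low_ratio
    strong = image >= high
    weak = (image >= low) & ~strong
    return strong, weak

def edge_tracking(strong_edges, weak_edges):
    """Hysteresis: grow strong edges into 8-connected weak pixels until stable."""
    edges = strong_edges.astype(bool).copy()
    weak = weak_edges.astype(bool)
    h, w = edges.shape
    while True:
        padded = np.pad(edges, 1, mode='constant')
        # True wherever the pixel or any of its 8 neighbours is already an edge
        near_edge = np.zeros((h, w), dtype=bool)
        for di in range(3):
            for dj in range(3):
                near_edge |= padded[di:di + h, dj:dj + w]
        grown = edges | (weak & near_edge)
        if np.array_equal(grown, edges):
            return edges
        edges = grown

# One strong pixel, a chain of weak pixels attached to it, and an isolated weak pixel
demo = np.zeros((3, 7))
demo[1] = [0.0, 0.1, 0.1, 1.0, 0.1, 0.0, 0.1]
strong, weak = double_threshold(demo)
final = edge_tracking(strong, weak)
print(final[1].astype(int))  # → [0 1 1 1 1 0 0]
```

The weak pixel at the far right is dropped because it never connects to a strong pixel, which is how hysteresis removes isolated noise responses while keeping faint continuations of real edges.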

## Exercise 3.4: Complete Canny Edge Detector

Combine all components into a complete Canny edge detector:
1. Gaussian smoothing
2. Gradient computation
3. Non-maximum suppression
4. Double thresholding
5. Edge tracking

In [None]:
def canny_edge_detector(image, kernel_size=5, sigma=1.0,
                       low_ratio=0.05, high_ratio=0.15):
    """Complete Canny edge detection implementation

    Args:
        image (numpy.ndarray): Input image
        kernel_size (int): Size of Gaussian kernel
        sigma (float): Standard deviation of Gaussian
        low_ratio (float): Low threshold ratio
        high_ratio (float): High threshold ratio

    Returns:
        dict: Dictionary containing final edges and intermediate results
    """
    # Your code here
    # Return all intermediate results for visualization
    return {
        'smoothed': smoothed,
        'magnitude': magnitude,
        'direction': direction,
        'suppressed': suppressed,
        'strong_edges': strong_edges,
        'weak_edges': weak_edges,
        'final_edges': final_edges
    }

## Discussion Questions

1. How does changing sigma in the Gaussian smoothing affect the detected edges?
- How does it impact noise sensitivity?
- What happens to fine details as sigma increases?

2. Why do we need both gradient magnitude and direction?
- What information does each component provide?
- How do they work together in edge detection?

3. Analyze the role of non-maximum suppression:
- What problem does it solve?
- How would edges look without it?