# HOG-Based Face Detection Functions Documentation


This document explains each function used in the HOG-based face detection example code. The code demonstrates:
- Computing image gradients using different differential filters (Sobel, Prewitt, Scharr).
- Extracting Histogram of Oriented Gradients (HOG) features.
- Visualizing HOG features overlaid on the original image.
- Performing sliding window matching using Normalized Cross-Correlation (NCC) on HOG features.
- Applying Non-Maximum Suppression (NMS) to filter duplicate detections.
- Visualizing detected face regions with bounding boxes and NCC scores.

Below are the detailed explanations for each function.

#### Import Libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from scipy.signal import convolve2d

#### Differential Filter Implementation
**Description:**  
Computes the gradients of the input grayscale image using a differential filter. The function supports three methods: **Sobel**, **Prewitt**, and **Scharr**. Use `scipy.signal.convolve2d` to perform the convolution.
- **Parameters:**
  - `image`: 2D numpy array (grayscale image).
  - `method`: String indicating the differential filter method ('sobel', 'prewitt', or 'scharr').
- **Returns:**
  - `grad_x`: The gradient of the image in the x-direction.
  - `grad_y`: The gradient of the image in the y-direction.
- **Usage:**  
  This function is the first step in HOG feature extraction. The gradient magnitudes and orientations are then derived from `grad_x` and `grad_y`.


##### Sobel Operator

**Description:**  
The Sobel operator emphasizes intensity changes by combining smoothing and differentiation. It is widely used in edge detection and feature extraction.

**Kernels:**

- **Kernel for x-direction:**

  $$
  \begin{pmatrix}
  -1 & 0 & 1 \\
  -2 & 0 & 2 \\
  -1 & 0 & 1
  \end{pmatrix}
  $$

- **Kernel for y-direction:**

  $$
  \begin{pmatrix}
  -1 & -2 & -1 \\
   0 &  0 &  0 \\
   1 &  2 &  1
  \end{pmatrix}
  $$

##### Prewitt Operator

**Description:**  
The Prewitt operator approximates the gradient with uniform smoothing weights, without the extra center weighting used in the Sobel operator. Its kernels are simpler, but it is generally somewhat less robust to noise.

**Kernels:**

- **Kernel for x-direction:**

  $$
  \begin{pmatrix}
  -1 & 0 & 1 \\
  -1 & 0 & 1 \\
  -1 & 0 & 1
  \end{pmatrix}
  $$

- **Kernel for y-direction:**

  $$
  \begin{pmatrix}
  -1 & -1 & -1 \\
   0 &  0 &  0 \\
   1 &  1 &  1
  \end{pmatrix}
  $$

##### Scharr Operator

**Description:**  
The Scharr operator provides improved rotational symmetry and more accurate gradient estimation compared to Sobel. It weights the central row or column more heavily (10 versus 3, compared with 2 versus 1 for Sobel).

**Kernels:**

- **Kernel for x-direction:**

  $$
  \begin{pmatrix}
   -3 &  0 &  3 \\
  -10 &  0 & 10 \\
   -3 &  0 &  3
  \end{pmatrix}
  $$

- **Kernel for y-direction:**

  $$
  \begin{pmatrix}
   -3 & -10 & -3 \\
    0 &   0 &  0 \\
    3 &  10 &  3
  \end{pmatrix}
  $$

In [None]:
def compute_gradients(image, method='sobel'):
    """
    Compute image gradients using a differential filter.
    
    Parameters:
    - image: 2D numpy array (grayscale)
    - method: 'sobel', 'prewitt', or 'scharr'
    
    Returns:
    - grad_x: gradient in the x direction
    - grad_y: gradient in the y direction
    """
    if method.lower() == 'sobel':
        #############
        # CODE HERE #
        #############
    elif method.lower() == 'prewitt':
        #############
        # CODE HERE #
        #############
    elif method.lower() == 'scharr':
        #############
        # CODE HERE #
        #############
    else:
        raise ValueError("Unsupported filter method. Choose among 'sobel', 'prewitt', or 'scharr'.")
    
    # Convolve the image with the kernels (using symmetric padding)
    #############
    # CODE HERE #
    #############
    return grad_x, grad_y
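
One possible way to fill in the skeleton above is sketched below. It is a minimal sketch rather than the reference solution: the name `compute_gradients_sketch`, the `KERNELS` lookup table, and the choice of `mode='same'` are assumptions made for illustration. Note that `convolve2d` performs a true convolution, which flips the kernel, so the sketch pre-flips each kernel to make the response positive where intensity increases to the right or downward. For HOG with unsigned orientations in [0, 180) this sign convention does not change the result, as long as both directions are treated the same way.

```python
import numpy as np
from scipy.signal import convolve2d

# Kernel pairs (x-direction, y-direction) exactly as tabulated in this section.
# KERNELS is a hypothetical helper table, not part of the original skeleton.
KERNELS = {
    'sobel': (np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float),
              np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)),
    'prewitt': (np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=float),
                np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=float)),
    'scharr': (np.array([[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]], dtype=float),
               np.array([[-3, -10, -3], [0, 0, 0], [3, 10, 3]], dtype=float)),
}

def compute_gradients_sketch(image, method='sobel'):
    """Illustrative gradient computation (hypothetical name)."""
    key = method.lower()
    if key not in KERNELS:
        raise ValueError("Unsupported filter method. Choose among "
                         "'sobel', 'prewitt', or 'scharr'.")
    kx, ky = KERNELS[key]
    image = np.asarray(image, dtype=float)
    # convolve2d flips the kernel, so flip it first (assumption: we want
    # correlation with the tabulated kernels). boundary='symm' gives the
    # symmetric padding mentioned in the skeleton; mode='same' keeps the
    # output the same shape as the input.
    grad_x = convolve2d(image, kx[::-1, ::-1], mode='same', boundary='symm')
    grad_y = convolve2d(image, ky[::-1, ::-1], mode='same', boundary='symm')
    return grad_x, grad_y
```

For a horizontal ramp whose intensity rises by 1 per pixel, the interior Sobel response in x under this convention is 4 × 2 = 8, and the y response is zero everywhere.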

#### HOG Feature Extraction and Visualization

**Description:**  
Extracts HOG (Histogram of Oriented Gradients) features from a grayscale image by:
- Dividing the image into cells of size `cell_size`.
- Computing an orientation histogram with `bin_size` bins (covering 0–180 degrees) for each cell.
- Normalizing each block (a group of neighboring cells whose size is set by `block_size`) with the L2 norm.  
- **Parameters:**
  - `image`: 2D numpy array (grayscale).
  - `cell_size`: Size of each cell in pixels.
  - `bin_size`: Number of bins for the orientation histogram.
  - `block_size`: Number of cells per block for normalization.
  - `filter_method`: Differential filter method to compute gradients.
- **Returns:**
  - `feature_vector`: A flattened, normalized HOG feature vector for use in matching.
  - `cell_hist`: The orientation histogram for each cell, used for HOG visualization.
- **Usage:**  
  The HOG feature vector is used for sliding window matching (template matching) and the cell histograms are used for visualizing the HOG representation.


In [None]:
def hog_feature(image, cell_size, bin_size, block_size, filter_method):
    """
    Extract HOG features from an image.
    
    Parameters:
    - image: 2D numpy array (grayscale)
    - cell_size: size of each cell in pixels
    - bin_size: number of bins in the orientation histogram (covering 0-180 degrees)
    - block_size: number of cells per block for normalization
    - filter_method: differential filter method ('sobel', 'prewitt', or 'scharr')
    
    Returns:
    - feature_vector: normalized HOG feature vector (flattened for sliding window matching)
    - cell_hist: histogram for each cell (for HOG visualization)
    """
    # Compute gradients
    gx, gy = compute_gradients(image, method=filter_method)

    #############
    # CODE HERE #
    #############
    magnitude = 
    # Compute angle in degrees and adjust to [0, 180)
    angle = 

    h, w = image.shape
    n_cells_y = 
    n_cells_x = 
    cell_hist = 

    bin_width = 

    # Calculate the histogram for each cell
    #############
    # CODE HERE #
    #############

    # Block normalization using L2 norm
    #############
    # CODE HERE #
    #############
    return feature_vector, cell_hist
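
The three stages of the skeleton above (magnitude and angle, per-cell histograms, block normalization) can be sketched as follows. This is an illustrative sketch under several assumptions, none of which the original specifies: the function takes precomputed gradients and is named `hog_from_gradients` (hypothetical), each pixel votes into a single bin with no interpolation between neighboring bins, blocks are `block_size` × `block_size` cells and overlap with a stride of one cell, and a small `eps` guards against division by zero.

```python
import numpy as np

def hog_from_gradients(gx, gy, cell_size=8, bin_size=9, block_size=2, eps=1e-6):
    """Illustrative HOG computation from gradients (hypothetical name)."""
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0

    h, w = magnitude.shape
    n_cells_y, n_cells_x = h // cell_size, w // cell_size
    bin_width = 180.0 / bin_size
    cell_hist = np.zeros((n_cells_y, n_cells_x, bin_size))

    # Magnitude-weighted orientation histogram for each cell
    for cy in range(n_cells_y):
        for cx in range(n_cells_x):
            ys, xs = cy * cell_size, cx * cell_size
            mag = magnitude[ys:ys + cell_size, xs:xs + cell_size].ravel()
            ang = angle[ys:ys + cell_size, xs:xs + cell_size].ravel()
            # Hard assignment to one bin (assumption); clamp guards rounding
            bins = np.minimum((ang // bin_width).astype(int), bin_size - 1)
            cell_hist[cy, cx] = np.bincount(bins, weights=mag, minlength=bin_size)

    # L2-normalize overlapping blocks of block_size x block_size cells
    blocks = []
    for by in range(n_cells_y - block_size + 1):
        for bx in range(n_cells_x - block_size + 1):
            block = cell_hist[by:by + block_size, bx:bx + block_size].ravel()
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))
    feature_vector = np.concatenate(blocks) if blocks else np.zeros(0)
    return feature_vector, cell_hist
```

With a 16 × 16 image, `cell_size=8`, and `block_size=2`, this yields a 2 × 2 grid of cells and a single block, so the feature vector has 2 × 2 × 9 = 36 entries.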

**Description:**  
Visualizes the HOG features by overlaying red lines representing gradient orientations and magnitudes on the original image.  
- **Parameters:**
  - `original_image`: The original image (can be grayscale or RGB) used as the background.
  - `cell_hist`: The cell histograms obtained from the `hog_feature` function.
  - `cell_size`: Size of each cell in pixels.
  - `bin_size`: Number of histogram bins.
  - `output_file`: File name where the visualization will be saved.
- **Usage:**  
  This function helps to visually verify the HOG feature extraction by showing how the gradients are distributed across the image.


In [None]:
def visualize_hog(original_image, cell_hist, cell_size=8, bin_size=9, output_file='hog_visualization.png'):
    """
    Visualize HOG features overlaid on the original image.
    
    Parameters:
    - original_image: the original image (2D or 3D numpy array)
    - cell_hist: cell histograms from hog_feature function
    - cell_size: size of each cell in pixels
    - bin_size: number of histogram bins
    - output_file: file name to save the visualization
    
    The function overlays red lines representing HOG features on top of the original image.
    """
    #############
    # CODE HERE #
    #############
    print(f"HOG visualization saved to {output_file}")
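
The geometry behind such a visualization can be sketched without any plotting code. The helper below is a hypothetical illustration (`hog_line_segments` is not part of the original skeleton): for each cell and each non-empty bin it returns one line segment centered on the cell, oriented along the center angle of the bin, with a length proportional to the histogram value relative to the global maximum. Drawing along the gradient orientation is an assumption here; some HOG visualizations rotate the lines by 90 degrees so that they follow the edge instead.

```python
import numpy as np

def hog_line_segments(cell_hist, cell_size=8, bin_size=9):
    """Return (x0, y0, x1, y1) segments for drawing HOG (hypothetical helper)."""
    n_cells_y, n_cells_x, _ = cell_hist.shape
    max_val = cell_hist.max()
    if max_val <= 0:
        return []  # nothing to draw for an all-zero histogram
    bin_width = 180.0 / bin_size
    segments = []
    for cy in range(n_cells_y):
        for cx in range(n_cells_x):
            # Cell center in pixel coordinates
            center_x = cx * cell_size + cell_size / 2.0
            center_y = cy * cell_size + cell_size / 2.0
            for b in range(bin_size):
                strength = cell_hist[cy, cx, b] / max_val
                if strength == 0:
                    continue
                # Center angle of bin b, in radians
                theta = np.radians((b + 0.5) * bin_width)
                dx = 0.5 * cell_size * strength * np.cos(theta)
                dy = 0.5 * cell_size * strength * np.sin(theta)
                segments.append((center_x - dx, center_y - dy,
                                 center_x + dx, center_y + dy))
    return segments
```

Each segment can then be drawn over the image, for example with `ax.imshow(original_image, cmap='gray')` followed by `ax.plot([x0, x1], [y0, y1], color='red')` for every tuple, before saving the figure with `plt.savefig(output_file)`.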

#### Sliding Window NCC Matching
**Description:**  
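Converts an input image to grayscale using the standard luma weights (0.299, 0.587, 0.114) for the R, G, and B channels. A 2D input is returned unchanged.  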
- **Parameters:**
  - `image`: A numpy array that can be either a 3D (RGB) or 2D (grayscale) image.
- **Returns:**
  - A 2D grayscale numpy array.
- **Usage:**  
  Ensures that images are in the correct grayscale format before processing with HOG feature extraction functions.

In [5]:
def convert_to_gray(image):
    """
    Convert an image to grayscale.
    
    Parameters:
    - image: numpy array; can be 2D (grayscale) or 3D (RGB)
    
    Returns:
    - 2D grayscale numpy array
    """
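    if image.ndim == 3:
        # Weighted sum of the R, G, B channels (any alpha channel is ignored)
        return np.dot(image[...,:3], [0.299, 0.587, 0.114])
    else:
        # Already 2D: return unchanged
        return image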

**Description:**  
Performs sliding window matching by computing the normalized cross-correlation (NCC) between the HOG features of a template image and each window in the target image.  
- **Parameters:**
  - `target_image`: Grayscale target image where face detection is performed.
  - `template_image`: Grayscale template image (face region) used for matching.
  - `cell_size`, `bin_size`, `block_size`, `filter_method`: Parameters for the HOG feature extraction.
  - `ncc_threshold`: Threshold for NCC score; windows with scores above this threshold are considered valid detections.
  - `step_size`: Pixel step size for moving the sliding window.
- **Returns:**
  - A list of candidate bounding boxes, each represented as `[x1, y1, x2, y2, ncc_score]`.
- **Usage:**  
  This function is used to find regions in the target image that match the template face based on HOG feature similarity.


In [None]:
def sliding_window_ncc(target_image, template_image, cell_size=8, bin_size=9, block_size=2, 
                       filter_method='sobel', ncc_threshold=0.5, step_size=4):
    """
    Perform sliding window matching using NCC on HOG features.
    
    Parameters:
    - target_image: target image (grayscale numpy array)
    - template_image: template image for face detection (grayscale numpy array)
    - cell_size, bin_size, block_size, filter_method: parameters for hog_feature
    - ncc_threshold: threshold for NCC score to consider a detection valid
    - step_size: pixel step for sliding window movement
    
    Returns:
    - List of candidate bounding boxes [x1, y1, x2, y2, ncc_score]
    """
    template_gray = convert_to_gray(template_image)
    template_feature, _ = hog_feature(template_gray, cell_size, bin_size, block_size, filter_method)
    template_feature = template_feature / (np.linalg.norm(template_feature) + 1e-6)
    
    t_h, t_w = template_gray.shape
    H, W = target_image.shape
    detections = []
    
    # Slide the window over the target image
    #############
    # CODE HERE #
    #############
    return detections
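
The sliding loop left open above might look like the sketch below. To keep the example self-contained, it takes a generic `feature_fn` callable in place of `hog_feature`; that parameter and the names `ncc_score` and `sliding_window_sketch` are hypothetical. Following the skeleton, which L2-normalizes the template feature without subtracting its mean, the score here is the dot product of two unit-length feature vectors. A zero-mean variant of NCC is also common, but that would be a different choice from the one the skeleton makes.

```python
import numpy as np

def ncc_score(a, b, eps=1e-6):
    """Dot product of L2-normalized vectors (hypothetical helper)."""
    a = a / (np.linalg.norm(a) + eps)
    b = b / (np.linalg.norm(b) + eps)
    return float(np.dot(a, b))

def sliding_window_sketch(target, template, feature_fn,
                          ncc_threshold=0.5, step_size=4):
    """Illustrative sliding-window matcher (hypothetical name).

    feature_fn maps a 2D window to a 1D feature vector; in the notebook
    it would wrap hog_feature and return only the feature vector.
    """
    t_h, t_w = template.shape
    H, W = target.shape
    template_feature = feature_fn(template)
    detections = []
    # Visit every window position that fits entirely inside the target
    for y in range(0, H - t_h + 1, step_size):
        for x in range(0, W - t_w + 1, step_size):
            window = target[y:y + t_h, x:x + t_w]
            score = ncc_score(feature_fn(window), template_feature)
            # Keep windows whose score is above the threshold
            if score > ncc_threshold:
                detections.append([x, y, x + t_w, y + t_h, score])
    return detections
```

In the notebook, `feature_fn` could be `lambda img: hog_feature(img, cell_size, bin_size, block_size, filter_method)[0]`, so that every window is described with the same HOG parameters as the template.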

#### Non-Maximum Suppression
**Description:**  
Applies Non-Maximum Suppression (NMS) to remove overlapping detections. For overlapping boxes with an Intersection over Union (IoU) above the threshold, only the one with the highest NCC score is retained.  
- **Parameters:**
  - `detections`: List of candidate bounding boxes `[x1, y1, x2, y2, score]`.
  - `iou_threshold`: IoU threshold for suppression (default is 0.5).
- **Returns:**
  - A filtered list of bounding boxes after removing duplicates.
- **Usage:**  
  Helps in reducing duplicate detections, ensuring only the most confident detection for each face remains.


In [None]:
def non_max_suppression(detections, iou_threshold=0.5):
    """
    Apply Non-Maximum Suppression to remove overlapping bounding boxes.
    
    Parameters:
    - detections: list of [x1, y1, x2, y2, score]
    - iou_threshold: IoU threshold for suppression (default 0.5)
    
    Returns:
    - List of filtered bounding boxes
    """
    #############
    # CODE HERE #
    #############
    
    return filtered_bounding_boxes
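
A greedy version of the suppression described above can be sketched in plain Python. The helper names `iou` and `nms_sketch` are hypothetical, and the greedy highest-score-first strategy is one common way to realize NMS rather than necessarily the intended solution. A box is discarded when its IoU with an already kept box is above the threshold.

```python
def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2, ...] boxes."""
    ax1, ay1, ax2, ay2 = box_a[:4]
    bx1, by1, bx2, by2 = box_b[:4]
    # Width and height of the overlap rectangle (zero if disjoint)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def nms_sketch(detections, iou_threshold=0.5):
    """Greedy NMS over [x1, y1, x2, y2, score] boxes (hypothetical name)."""
    # Process boxes from highest to lowest score
    remaining = sorted(detections, key=lambda d: d[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop every remaining box that overlaps the kept box too much
        remaining = [d for d in remaining if iou(best, d) <= iou_threshold]
    return kept
```

As a worked case, two 10 × 10 boxes offset by one pixel in each direction overlap in a 9 × 9 region, so their IoU is 81 / (100 + 100 − 81) ≈ 0.68. With the default threshold of 0.5 only the higher-scoring box survives.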

#### Visualization of Detections on Original Image
**Description:**  
Visualizes detected faces by overlaying red bounding boxes and NCC scores on the original image. The NCC score is formatted to two decimal places and displayed on the image.  
- **Parameters:**
  - `original_image`: The original image (grayscale or RGB) where detections are visualized.
  - `detections`: List of bounding boxes `[x1, y1, x2, y2, score]`.
  - `output_file`: File name where the final detection visualization is saved.
- **Usage:**  
  Used to present the final detection results for evaluation or reporting.


In [8]:
def visualize_detections(original_image, detections, output_file='detections.png'):
    """
    Visualize detections by overlaying bounding boxes and NCC scores on the original image.
    
    Parameters:
    - original_image: the original image (numpy array; grayscale or RGB)
    - detections: list of bounding boxes [x1, y1, x2, y2, score]
    - output_file: file name to save the visualization
    
    The function draws red bounding boxes and overlays the NCC score (formatted to two decimals).
    """
    fig, ax = plt.subplots(1)
    if original_image.ndim == 2:
        ax.imshow(original_image, cmap='gray')
    else:
        ax.imshow(original_image)
    for det in detections:
        x1, y1, x2, y2, score = det
        width = x2 - x1
        height = y2 - y1
        # Draw bounding box
        rect = plt.Rectangle((x1, y1), width, height, edgecolor='red', facecolor='none', linewidth=2)
        ax.add_patch(rect)
        # Overlay NCC score text (formatted to two decimal places)
        ax.text(x1, y1, f"{score:.2f}", color='yellow', fontsize=12, backgroundcolor='black')
    plt.axis('off')
    plt.savefig(output_file, bbox_inches='tight', pad_inches=0)
    plt.close()
    print(f"Detection result saved to {output_file}")

#### Main Function

**Description:**  
The main function ties together the entire process:
1. Loads the template and target images and converts them to grayscale.
2. Extracts and visualizes HOG features on the target image.
3. Performs sliding window matching using NCC on HOG features.
4. Applies Non-Maximum Suppression to filter duplicate detections.
5. Visualizes the final detected faces on the original target image with bounding boxes and NCC scores.

**Notes for Students:**  
- **STUDENT:** Modify the file paths for `template.png` and `target.png` as needed for your environment.
- Adjust HOG parameters (e.g., `cell_size`, `bin_size`, `block_size`, `filter_method`) if required.


In [None]:
def main(template_path, 
         target_path,
         cell_size,
         bin_size,
         block_size,
         filter_method,
         ncc_threshold,
         step_size,
         iou_threshold):
    
    # Load images using PIL and convert them to grayscale
    template_img = np.array(Image.open(template_path).convert('L'))
    target_img = np.array(Image.open(target_path).convert('L'))
    
    # Compute HOG features for the target image and get cell histograms
    _, cell_hist = hog_feature(target_img, cell_size, bin_size, block_size, filter_method)
    
    # Visualize HOG features overlaid on the original target image
    visualize_hog(target_img, cell_hist, cell_size, bin_size, output_file='hog_visualization.png')
    
    # Extract HOG features for the template and perform sliding window NCC matching
    detections = sliding_window_ncc(target_img, 
                                    template_img, 
                                    cell_size, 
                                    bin_size, 
                                    block_size, 
                                    filter_method, 
                                    ncc_threshold, 
                                    step_size)
    print("Detections before NMS:", detections)
    
    # Apply Non-Maximum Suppression with an IoU threshold
    detections_nms = non_max_suppression(detections, iou_threshold)
    print("Detections after NMS:", detections_nms)
    
    # Visualize detections on the original target image
    visualize_detections(target_img, detections_nms, output_file='detections.png')
    

#### Run

In [None]:
img_template_path = 'template.png'
img_target_path = 'target.png'

main(template_path=img_template_path, 
     target_path=img_target_path,
     cell_size=8,
     bin_size=9,
     block_size=2,
     filter_method='sobel',
     ncc_threshold=0.5,
     step_size=4,
     iou_threshold=0.2)