<img align='right' style='max-width: 200px; height: auto' src='hsg_logo.png'>

# 7,861: Computer Vision

## Lab 04 - Image Features
### Wednesday 15/11/2023

*Michael Mommert, Joëlle Hanna* - University of St. Gallen, Fall Term 2023


After today's lab, you should be able to:

- Implement some basic object descriptor techniques
- Understand which features are robust to transformations


As always, don't hesitate to ask questions during the lab, post them in our CANVAS (StudyNet) forum (https://learning.unisg.ch), or send us an email (using the course email).

# Submission:

Kindly submit your notebook by midnight on Tuesday, November 22, 2023 via e-mail to `joelle.hanna@unisg.ch`. 

IMPORTANT: Please save your file using the following format: lab04_firstname_lastname.ipynb.


# Content

1. [Introduction to Template Matching](#template_matching)
2. [Correlation](#correlation)
3. [Histogram of Oriented Gradients (HOG)](#hog)
4. [Scale-Invariant Feature Transform (SIFT)](#sift)


<a id='template_matching'></a>
# 1. Introduction to Template Matching (1 point)

In this assignment, we will address the problem of detection and recognition using template matching. 

The idea behind template matching is to compare a 'template' of the object you are trying to detect with the image. For example, to detect a football, you first create a base template of the object and then 'search' for it in the image.

As you may have guessed, we need to define some measure of similarity between the image and the template. We will go through different image descriptors, as well as different metrics to measure similarities. 

In [51]:
import skimage.io as io
import numpy as np
import matplotlib.pyplot as plt
from skimage.transform import rotate


* Exercise 1: Load and display the _oscar_selfie.jpg_ image. Our goal is to detect Ellen DeGeneres's face in this selfie. Crop her face and display it too. This will be your template.

In [1]:
# Your Code HERE:

<a id='correlation'></a>
# 2. Correlation (3 points)

In this exercise, we will use correlation as the similarity measure: the largest correlation values indicate where the template and the current image patch match.

Hint: Apply the correlation function to images with zero mean, i.e., with a dynamic range that is symmetric around zero (e.g., [-128, 127] or [-0.5, 0.5]).
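To see why the zero-mean step matters, here is a minimal 1-D sketch. The signal and template values are made up for illustration and are not part of the lab data. Without mean subtraction, the correlation peak lands on the bright plateau; after subtracting the means, it lands on the true pattern.

```python
import numpy as np

# Toy 1-D "image": a bright plateau (indices 2..5) and the pattern we look for (indices 9..11).
signal = np.array([0, 0, 5, 5, 5, 5, 0, 0, 0, 1, 5, 1, 0, 0, 0], dtype=float)
template = np.array([1, 5, 1], dtype=float)

# Raw correlation: any bright region scores high, whatever its shape.
raw_scores = np.correlate(signal, template, mode='same')
raw_peak = int(np.argmax(raw_scores))

# Zero-mean correlation: constant brightness cancels out, only the pattern counts.
signal_zm = signal - signal.mean()
template_zm = template - template.mean()
zm_scores = np.correlate(signal_zm, template_zm, mode='same')
zm_peak = int(np.argmax(zm_scores))

print(raw_peak, zm_peak)  # 3 10  (plateau vs. centre of the true pattern)
```

With `mode='same'`, the output index refers to the centre of the template, which is why the correct answer here is index 10 and not 9.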

* Exercise 2.1: Implement a sliding window mechanism. Then compute the correlation between the image window and the template, using `np.correlate()` (which operates on 1-D arrays, so flatten the window and the template first). Make sure you specify `mode='same'`. The position of the template in the image corresponds to the position of the maximum of the correlation. Play with the step size of the sliding window mechanism: it is a tradeoff between detection accuracy and compute time!
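The sliding window idea can be sketched on synthetic data as follows. This is one possible approach under stated assumptions, not the reference solution: the helper names `sliding_windows`, `zero_lag_correlation` and `best_match` are hypothetical, the toy image is random noise, and the score is taken as the zero-lag term of `np.correlate`, which for two equally long arrays and `mode='same'` sits at index `N // 2`.

```python
import numpy as np

def sliding_windows(image, window_shape, step=1):
    """Yield (row, col, window) for each top-left position, moving `step` pixels at a time."""
    win_h, win_w = window_shape
    for row in range(0, image.shape[0] - win_h + 1, step):
        for col in range(0, image.shape[1] - win_w + 1, step):
            yield row, col, image[row:row + win_h, col:col + win_w]

def zero_lag_correlation(window, template):
    """Correlate two equally sized patches and return the zero-lag term (their dot product)."""
    a, b = window.ravel(), template.ravel()
    return np.correlate(a, b, mode='same')[a.size // 2]

def best_match(image, template, step=1):
    """Return the (row, col) of the window with the highest correlation score."""
    scores = [(zero_lag_correlation(window, template), row, col)
              for row, col, window in sliding_windows(image, template.shape, step)]
    _, row, col = max(scores)
    return row, col

rng = np.random.default_rng(0)
toy_image = rng.normal(size=(40, 60))            # zero-mean toy "image"
toy_template = toy_image[12:20, 30:40].copy()    # cut out at row 12, column 30

n_windows_step1 = sum(1 for _ in sliding_windows(toy_image, toy_template.shape, step=1))
n_windows_step4 = sum(1 for _ in sliding_windows(toy_image, toy_template.shape, step=4))
print(n_windows_step1, n_windows_step4)          # 1683 117
print(best_match(toy_image, toy_template, step=2))
```

A larger step visits far fewer windows, but it can only hit positions that are multiples of the step. Note that this sketch returns (row, column), whereas `plt.Rectangle` expects its corner as (x, y), i.e., (column, row).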

In [62]:
# Your Code HERE:

# Zero nominal average
template_norm = 
image_norm = 


template_height, template_width, _ = template.shape


def template_matching_correlation(image, template):
    # Your Code HERE:
    
    return imax, jmax # position of the template in the image.

In [2]:
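def display_image_with_roi(image, i_template, j_template):
    # Draw a red rectangle of the template's size on top of the image.
    # Note: plt.Rectangle takes its anchor corner as (x, y), i.e. (column, row),
    # and template_width / template_height are read from the notebook's global scope.
    plt.figure()
    plt.imshow(image)

    plt.gca().add_patch(plt.Rectangle(
        (i_template, j_template),
        template_width,
        template_height,
        edgecolor='r',
        linewidth=2,
        fill=False
    ))

    plt.show()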

In [63]:
i_template, j_template = template_matching_correlation(image_norm, template_norm)

In [3]:
display_image_with_roi(image, i_template, j_template)

* Exercise 2.2: Try the same with a flipped image (rotated by 180 degrees), keeping the same template. 

In [67]:
image_rotated = 

In [68]:
# Zero nominal average
image_rotated_norm = 

# Your Code HERE: 
# Use previous function to get i_template_rotated, j_template_rotated

In [4]:
display_image_with_roi(image_rotated, i_template_rotated, j_template_rotated)

* Exercise 2.3: Comment on the robustness of this technique, with respect to image translation.

[Your Answer HERE]

<a id='hog'></a>
# 3. Histogram of Oriented Gradients (HOG) (3 points)

Histogram of Oriented Gradients (HOG) is a widely employed descriptor in the fields of computer vision and machine learning, primarily for the purpose of object detection. This technique transforms the pixel information within an image into a vector representation, highlighting image characteristics, all the while mitigating the influence of complicating factors such as varying illumination conditions. 

In practical applications, HOG is commonly employed in tandem with a Linear Support Vector Machine (SVM) to facilitate object detection tasks. HOG quickly emerged as one of the go-to descriptors in image classification due to its ability to effectively characterize the local appearance and shape of objects by examining the distribution of local intensity gradients.

Later in the course, you will learn the steps involved in combining HOG and a Linear SVM to create an object classifier. For now, it is essential to grasp that HOG primarily serves as a descriptor for object detection, and these descriptors can later be input into a machine learning classifier for further analysis.
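As a quick illustration of what the descriptor looks like in practice, the sketch below calls `hog()` on a random toy image (an assumption for illustration, not the selfie). The parameter values are example choices. It also checks that a uniformly dimmed copy of the image yields practically the same feature vector, thanks to the block normalization.

```python
import numpy as np
from skimage.feature import hog

rng = np.random.default_rng(0)
toy_image = rng.random((64, 64))       # grayscale toy image with values in [0, 1)

# visualize=True additionally returns an image of the gradient histograms.
features, hog_visual = hog(
    toy_image,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    visualize=True,
)

# 64 / 8 = 8 cells per side -> 7 x 7 overlapping blocks, each with 2 x 2 cells x 9 bins.
print(features.shape)      # (1764,)
print(hog_visual.shape)    # (64, 64)

# A dimmed copy (all intensities halved) gives almost identical features.
features_dimmed = hog(
    0.5 * toy_image,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
)
max_difference = np.abs(features - features_dimmed).max()
print(max_difference < 1e-6)
```

The feature vector is what a classifier would consume, while the visualization image is mainly useful for inspecting which gradient directions dominate in each cell.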

In [30]:
from skimage.feature import hog

* Exercise 3.1: Compute the HOG descriptors for the selfie image, and the template, using the `hog()` function from `skimage.feature`. Display them next to each other.

In [None]:
# Your Code HERE:

* Exercise 3.2: Using a sliding window mechanism, compute the Euclidean distance between the image window and the template, using `np.linalg.norm()`. The position of the template in the image corresponds to the position of the minimum distance. Again, play with the step size of the sliding window mechanism: it is a tradeoff between detection accuracy and compute time!
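The distance-based search can be sketched on toy data like this. The function name `sliding_min_distance` and the random arrays are assumptions for illustration, so treat it as one possible shape of the search rather than the expected answer. It also shows the step size tradeoff: a step that skips the true position can no longer reach a distance of zero.

```python
import numpy as np

def sliding_min_distance(image, template, step=1):
    """Return (distance, row, col) of the window closest to the template."""
    win_h, win_w = template.shape[:2]
    best = (np.inf, 0, 0)
    for row in range(0, image.shape[0] - win_h + 1, step):
        for col in range(0, image.shape[1] - win_w + 1, step):
            window = image[row:row + win_h, col:col + win_w]
            # With default arguments, np.linalg.norm returns the Euclidean
            # norm of the flattened difference.
            distance = np.linalg.norm(window - template)
            if distance < best[0]:
                best = (distance, row, col)
    return best

rng = np.random.default_rng(1)
toy_image = rng.random((30, 45))
toy_template = toy_image[6:14, 9:21].copy()      # cut out at row 6, column 9

exact = sliding_min_distance(toy_image, toy_template, step=3)    # visits (6, 9)
coarse = sliding_min_distance(toy_image, toy_template, step=4)   # skips row 6

print(exact)          # distance 0.0 at row 6, column 9
print(coarse[0] > 0)  # True: the true position is never visited
```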

In [77]:
def template_matching_hog_distance(image, template):
    # Your Code HERE:

    return imin, jmin # position of the template in the image.

In [80]:
i_template_hog, j_template_hog = template_matching_hog_distance(hog_image, hog_template)

In [5]:
display_image_with_roi(image, i_template_hog, j_template_hog)

* Exercise 3.3: Try the same with a flipped image, keeping the same template. 

In [83]:
# Your Code HERE:
# Use previous function to get i_template_hog_rotated, j_template_hog_rotated

In [6]:
display_image_with_roi(image_rotated, i_template_hog_rotated, j_template_hog_rotated)

* Exercise 3.4: Comment on the robustness of this technique, with respect to image translation.

[Your Answer HERE]

<a id='sift'></a>
# 4. Scale-Invariant Feature Transform (SIFT) (3 points)

A good feature descriptor should be robust to various geometric transformations such as *translation*, *rotation* and *scaling*. 

In 2004, **D. Lowe** (University of British Columbia) introduced a new feature descriptor, the Scale-Invariant Feature Transform (*SIFT*), in his paper "Distinctive Image Features from Scale-Invariant Keypoints". The method extracts keypoints and computes their descriptors ([ref](https://www.robots.ox.ac.uk/~vgg/research/affine/det_eval_files/lowe_ijcv2004.pdf)).

To achieve such invariance, *SIFT* involves four steps:
- Detect extrema in scale space
- Localize keypoints
- Assign an orientation to each keypoint
- Build the keypoint descriptors

In [85]:
from skimage.feature import SIFT, match_descriptors
from skimage.color import rgb2gray

* Exercise 4.1: Compute the keypoints and descriptors of the image and the template, using the `SIFT` class from `skimage.feature`. 

HINT: Start by converting images to grayscale.

In [86]:
def get_keypoints_descriptors(image, template):
    # Your Code HERE:

    return keypointsImage, descriptorsImage, keypointsTemplate, descriptorsTemplate


In [87]:
keypointsImage, descriptorsImage, keypointsTemplate, descriptorsTemplate = get_keypoints_descriptors(image, template)

Luckily, there's a function already implemented in `skimage` called `match_descriptors()`. For each descriptor in the first set, this function finds the closest descriptor in the second set.

* Exercise 4.2: Find the matching features (descriptors) in the image and the template using `match_descriptors()`. 
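To make the output format of `match_descriptors()` concrete, the sketch below uses made-up descriptor arrays instead of real SIFT descriptors: the second set is a shuffled, slightly noisy copy of the first. Each row of the result is a pair of indices (index into the first set, index into the second set).

```python
import numpy as np
from skimage.feature import match_descriptors

rng = np.random.default_rng(0)
descriptors_a = rng.random((5, 128))             # toy descriptors, not real SIFT output

# Second set: the same descriptors in a different order, plus a little noise.
order = np.array([3, 0, 4, 1, 2])
descriptors_b = descriptors_a[order] + 0.01 * rng.standard_normal((5, 128))

# cross_check=True keeps only pairs that are each other's nearest neighbour.
toy_matches = match_descriptors(descriptors_a, descriptors_b, cross_check=True)

print(toy_matches)
# [[0 1]
#  [1 3]
#  [2 4]
#  [3 0]
#  [4 2]]
```

The two columns can then be used to index the corresponding keypoint arrays, for example `keypoints_a[toy_matches[:, 0]]` and `keypoints_b[toy_matches[:, 1]]`, where `keypoints_a` and `keypoints_b` are placeholder names.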

In [88]:
# Your Code HERE:

matches = 

In [7]:
# The image and the template differ in size, so the two subplots keep independent axes.
fig, axes = plt.subplots(1, 2, figsize=(20, 10))

# Keypoints are stored as (row, col): column 1 is x and column 0 is y.
axes[0].imshow(image, cmap='Greys_r')
axes[0].scatter(keypointsImage[matches[:, 0], 1], keypointsImage[matches[:, 0], 0], marker='.', color='red')
axes[0].axis('off')

axes[1].imshow(template, cmap='Greys_r')
axes[1].scatter(keypointsTemplate[matches[:, 1], 1], keypointsTemplate[matches[:, 1], 0], marker='.', color='blue')
axes[1].axis('off')

* Exercise 4.3: Try the same with a flipped image, keeping the same template. Display the image/templates with their matching descriptors.

In [97]:
# Your Code HERE:

* Exercise 4.4: Comment on the robustness of this technique, with respect to image translation.

[Your Answer HERE]