# CVPR 2024, Lab class #2

## Feature detection
In the following we first detect keypoints using basic single-scale detectors (Harris detector, Hessian detector, LoG detector). Then we detect blobs in scale-space using normalized derivatives.

### Libraries
Besides [Numpy](https://numpy.org/) and the Pyplot interface to [Matplotlib](https://matplotlib.org/), we will need [OpenCV](https://opencv.org/).

In [None]:
from matplotlib import pyplot as plt

import os

import numpy as np

import cv2  # OpenCV
print(cv2.__version__)

4.8.0


### Load and display images
Let the images be located in the `images/` subfolder.

In [None]:
from google.colab import drive
drive.mount('/content/drive/', force_remount=True)
GDrivePath = '/content/drive/MyDrive/'

Mounted at /content/drive/


In [None]:
%ls
%cd /content/drive/MyDrive/
%ls 'Colab Notebooks/CVPR2023/LabLecture2-FeatureDetection/images'

[0m[01;34mdrive[0m/  [01;34msample_data[0m/
/content/drive/MyDrive
[0m[01;36m'Colab Notebooks/CVPR2023/LabLecture2-FeatureDetection/images'[0m@


In [None]:
folderpath = GDrivePath + 'Colab Notebooks/CVPR2023/LabLecture2-FeatureDetection/images/'
filepath = folderpath+ 'rubik.jpg'
image = cv2.imread(filepath)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

Convert the image to grayscale and double precision:

In [None]:
img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
plt.imshow(img_gray,cmap='gray')
print(img_gray.dtype)
img_gray=img_gray.astype(np.float64)
print(img_gray.dtype)

Resize the image. Pay attention to the `interpolation` method.

In [None]:
# Get the original image dimensions
height, width = img_gray.shape[:2]

# Calculate the new dimensions for 25% scaling
new_width = int(width * .25)
new_height = int(height * .25)

# Resize the image to the new dimensions.
# cv2.INTER_AREA is recommended when shrinking images (it has a sort of antialiasing effect).
img_gray = cv2.resize(img_gray, (new_width, new_height), interpolation=cv2.INTER_AREA)


In [None]:
plt.imshow(img_gray,cmap='gray')

## Harris corner detector

Recall that the Harris corner detector is based on the weighted second moment matrix:

$$
\hat{C} =\left\lbrack \begin{array}{cc}
\sum I_x^2 w(d_x ,d_y ) & \sum I_x I_y w(d_x ,d_y )\\
\sum I_x I_y w(d_x ,d_y ) & \sum I_y^2 w(d_x ,d_y )
\end{array}\right\rbrack
$$

where $I_x$ and $I_y$ are the directional derivatives along $x$ and $y$, respectively, and $w(d_x, d_y)$ is a weighting window (typically a Gaussian) centered on the pixel.

We now approximate the directional derivatives using Sobel filters.

We pass `ddepth=cv2.CV_64F` so that the output is stored as `float64`, which keeps the sign of the derivatives and avoids clipping.

## Hessian detector

The Hessian detector is based on the Hessian matrix:

$$H(x,y)=\left\lbrack \begin{array}{cc}
I_{xx}  & I_{xy} \\
I_{xy}  & I_{yy}
\end{array}\right\rbrack $$

whose elements are the directional second derivatives, i.e., the derivatives of the first derivatives `Ix` and `Iy` that we computed previously.

In [None]:
Ixx, Ixy = compute_gradient(Ix)
Iyx, Iyy = compute_gradient(Iy)

In [None]:

fig, ax = plt.subplots(1, 3, figsize=(20, 7))
ax[0].imshow(Ixx, cmap="gray")
ax[0].set_title("$I_{xx}$")

ax[1].imshow(Ixy, cmap="gray")
ax[1].set_title("$I_{xy}$")

ax[2].imshow(Iyy, cmap="gray")
ax[2].set_title("$I_{yy}$")

Now we find the local maxima above a threshold.

In [None]:
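# A pixel is a strict local maximum if it exceeds the maximum of its 8 neighbours (dilation with the centre excluded)
findLocalMax = lambda x: x > cv2.dilate(x, np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=np.uint8))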

## Blob detection using Laplacian of Gaussian (LoG) filter

We now apply two LoG filters to detect blobs of different sizes.

OpenCV doesn't provide a function to directly apply the LoG filter. Instead, we can build a LoG filter:
$$\text{LoG}_\sigma(x,y) = -\frac{1}{\pi\sigma^4}\left(1-\frac{x^2 + y^2}{2\sigma^2}\right)\exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$

and apply the filter via `cv2.filter2D`.

In [None]:
# The following function implements the Laplacian of Gaussian (disregarding the factor 1/pi)
def LoG(siz, sigma):
    hsiz = (siz - 1) / 2
    std2 = sigma**2
    x, y = np.meshgrid(np.linspace(-hsiz, hsiz, int(siz)), np.linspace(-hsiz, hsiz, int(siz)), indexing='xy')

    h1 = (x*x + y*y - 2*std2) / (2*std2**3) * np.exp(-(x*x + y*y) / (2*std2))
    log_filter = h1 - np.sum(h1) / (siz*siz)  # subtract the mean to get zero response in flat regions

    return log_filter



## Blob detection in scale-space using normalized LoG filter

We now employ the normalized LoG across scales:

$$\nabla^2_\text{norm}L(x,y,t) = t\nabla^2 G(x, y, \sqrt{t}) * I(x, y)$$

and search for local extrema across space and scale (recall that $t=\sigma^2$).

We define the vector containing the scales.

To get the local maxima (and minima) using the same dilation trick as before, we need a 3D dilation, which unfortunately is not implemented in OpenCV. We resort to SciPy instead.

In [None]:
import scipy.ndimage as sp
# Define a 3D kernel (structuring element)
kernel_size = (3, 3, 3)
kernel = np.ones(kernel_size, dtype=np.uint8)
kernel[1,1,1]=0

thr=150

# Perform 3D grey-scale dilation (maximum over the 26 neighbours in space and scale)
dilated_volume = sp.grey_dilation(norm_LoG_vec, footprint=kernel)
# Local maxima: larger than all neighbours and above the threshold in magnitude
local_maxima = (norm_LoG_vec > dilated_volume) & (np.abs(norm_LoG_vec) > thr)

# Local minima: apply the same test to the negated volume
dilated_volume_ = sp.grey_dilation(-norm_LoG_vec, footprint=kernel)
local_minima = ((-norm_LoG_vec) > dilated_volume_) & (np.abs(norm_LoG_vec) > thr)

extrema = local_maxima | local_minima
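To turn a boolean volume like `extrema` into a list of blobs, one option is the sketch below. It assumes the axis order `(row, column, scale)`, a hypothetical scale vector `sigmas`, and the relation $r=\sqrt{2}\,\sigma$ between the detected scale and the blob radius; the two marked extrema are made up for the example.

```python
import numpy as np

# Hypothetical scale vector and a toy extrema volume with two detections
sigmas = np.array([2.0, 4.0, 8.0])
extrema = np.zeros((50, 60, len(sigmas)), dtype=bool)
extrema[10, 20, 0] = True   # small blob at row 10, col 20
extrema[30, 45, 2] = True   # large blob at row 30, col 45

# Indices of the detections along each axis
rows, cols, scale_idx = np.nonzero(extrema)

# Blob radius from the scale at which the extremum was found
radii = np.sqrt(2) * sigmas[scale_idx]

# One row per blob: x (column), y (row), radius
blobs = np.column_stack((cols, rows, radii))
print(blobs)
```

Each row of `blobs` can then be drawn as a circle over the image, for instance with `plt.Circle((x, y), r, fill=False)` added to the current axes.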

## SIFT detector and descriptor

We now apply SIFT to detect keypoints and extract their descriptors.



# Homework

The following are suggested exercises.


1. Experiment with the basic detectors using different values for $\sigma$ on the same image and observe how the detected keypoints change.

2. Check the rotational covariance of the basic detectors, by applying them to an image and its rotated version.

3. Experiment with photometric transformations (for instance, reduce the intensity range of the image) and check the invariance of the results when using the basic detectors.

4. Reproduce the results of blob detection in scale-space by employing a difference-of-Gaussians (DoG) instead of a LoG filter.

5. Prove that a circular blob of radius $r$ results in a maximum in scale-space at a scale $\sigma =\frac{r}{\sqrt{2}}$.

6. Find (or capture) a pair of images of the same object from slightly different points of view, detect SIFT features in both images, manually find a matched pair, and compare the corresponding descriptors.
