# 4.3 Detector & Descriptor Benchmarking

Now that we have met some of the most interesting keypoint detectors and descriptors, it is time to test them and compare their results in terms of number of detections, robustness, invariance and performance. In the context of our photo-stitching application, not all keypoint detectors and descriptors perform equally well.

Thus, in this notebook, you are asked to evaluate the following methods:

- Harris + NCC
- Harris + ORB (descriptor)
- ORB 
- SIFT
- SURF

on images that undergo changes in:

- lighting conditions
- rotation
- scale
- point of view

For each situation, you will be given a pair of images in which you have to detect, describe and match keypoints with each of the methods above. Then compute the following statistics and plot them in a bar chart:

- average number of keypoints detected in the images
- number of found matches
- time spent per keypoint at detection (including description)*
- time spent per match during matching*

*Use `time.process_time()` from the [`time`](https://docs.python.org/3/library/time.html) module to measure time.
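As a minimal sketch of how the per-keypoint and per-match timings could be taken, the helpers below wrap any callable with `time.process_time()` and divide the elapsed time by the number of items produced. The names `timed` and `per_item` are hypothetical (they are not part of the notebook), and the workload is a stand-in rather than a real detector. Note that `process_time()` reports CPU time for the whole process, so it excludes sleeping and may differ from wall-clock time.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, CPU seconds elapsed)."""
    start = time.process_time()
    result = fn(*args, **kwargs)
    elapsed = time.process_time() - start
    return result, elapsed

def per_item(elapsed, n_items):
    """Average time per item; returns 0.0 when nothing was produced."""
    return elapsed / n_items if n_items > 0 else 0.0

# Stand-in workload (hypothetical): replace it with a detection or matching call
items, elapsed = timed(lambda n: [i * i for i in range(n)], 1000)
t_per_item = per_item(elapsed, len(items))
```

Guarding against a zero count avoids a division error when a method finds no keypoints or no matches in a difficult image pair.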

In [2]:
# preamble
import numpy as np
import cv2
import matplotlib.pyplot as plt
import matplotlib
import time
matplotlib.rcParams['figure.figsize'] = (20.0, 20.0)
images_path = './images/'

# for sift
# import sys
# sys.path.append("..")
# from utils.third_party import pysift #https://github.com/rmislam/PythonSIFT

### Prepare output

In [3]:
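# Create output matrices: one row per test (lighting, rotation, scale, point of view)
# and one column per method (Harris+NCC, Harris+ORB, ORB, SIFT, SURF)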
stats_kps = np.zeros((4,5))
stats_mat = np.zeros((4,5))
stats_tdet = np.zeros((4,5))
stats_tmat = np.zeros((4,5))

### Preliminary functions (brought from 4.1 and adapted)

In [4]:
# Define a function to detect Harris
def detectHarris(image,w_size,sobel_size,k):
    """ Detect Harris features, perform non-max-suppression and return them
    
        Args:
                image: Input image
                w_size: window size for blurring
                sobel_size: window size for Sobel
                k: Harris 'k' parameter
        
        Returns:
                kps: list of 'cv2.KeyPoint' with the found keypoints
    """
    # Write your code here!
    # [...]
    # return kps
    
# ... and another one to match them
def matchHarris(image_l,image_r,kps_l,kps_r):
    """ Match Harris features using NCC, and return a list of 'cv2.DMatch'
    
        Args:
                image_l: Input left image
                image_r: Input right image
                kps_l: List of keypoints for the left image
                kps_r: List of keypoints for the right image
        
        Returns:
                matches: list of 'cv2.DMatch' with the found matches
    """
    # Write your code here!
    # [...]
    # return matches

# This method has been provided to you in a previous notebook
from scipy import signal
def nonmaxsuppts(cim, radius, thresh):
    """ Binarize and apply non-maximum suppression.
    
        Args:
            cim: the harris 'R' image
            radius: the aperture size of local maxima window
            thresh: the threshold value for binarization
                    
        Returns: 
            r, c: two numpy vectors being the row (r) and the column (c) of each keypoint
    """   
    
    rows, cols = np.shape(cim)
    sze = 2 * radius + 1
    mx = signal.order_filter(cim, np.ones([sze, sze]), sze ** 2 - 1)
    bordermask = np.zeros([rows, cols])
    bordermask[radius:(rows - radius), radius:(cols - radius)] = 1
    cim = np.array(cim)
    r, c = np.where((cim == mx) & (cim > thresh) & (bordermask == 1))
    return r, c
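For reference, the NCC score used to compare two patches can be sketched as below. This is one common zero-mean formulation, not necessarily the exact one used in notebook 4.1; the function name `ncc` and the convention of returning 0.0 for flat patches are assumptions made for this example.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom < eps:
        # A flat patch has no texture to correlate; 0.0 is a convention chosen here
        return 0.0
    return float((a * b).sum() / denom)

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(11, 11)).astype(np.uint8)

# Same patch under a gain and offset change in intensity
brighter = patch.astype(np.float64) * 0.5 + 40.0
score_same = ncc(patch, brighter)

# Inverted patch and constant patch
score_inverted = ncc(patch, 255 - patch)
score_flat = ncc(patch, np.full((11, 11), 7))
```

Because the mean is subtracted and the result is normalized, the score does not change under a gain and offset change in brightness. The patches are compared pixel by pixel, however, so rotation or scaling of the patch content is not compensated.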

### Create a process function for each method

In [5]:
def process_Harris_NCC(image_l, image_l_gray, image_r, image_r_gray):
    """ Process all the Harris+NCC part for a pair of input images (in RGB and gray versions).
    
        Args:
                image_l: Input left image (RGB)
                image_l_gray: Input left image (grayscale)
                image_r: Input right image (RGB)
                image_r_gray: Input right image (grayscale)
        
        Returns:
                num_kps: average number of keypoints found in each image
                num_matches: number of matches found
                tdet: detection time per keypoint
                tmat: matching time per match
    """
    # Write your code here!
    # [...]
    # return num_kps, num_matches, tdet, tmat

In [6]:
def process_Harris_ORB(image_l, image_l_gray, image_r, image_r_gray):
    """ Process all the Harris+ORB part for a pair of input images (in RGB and gray versions).
    
        Args:
                image_l: Input left image (RGB)
                image_l_gray: Input left image (grayscale)
                image_r: Input right image (RGB)
                image_r_gray: Input right image (grayscale)
        
        Returns:
                num_kps: average number of keypoints found in each image
                num_matches: number of matches found
                tdet: detection time per keypoint
                tmat: matching time per match
    """
    # Write your code here!
    # [...]
    # return num_kps, num_matches, tdet, tmat

In [7]:
def process_ORB(image_l, image_l_gray, image_r, image_r_gray):
    """ Process all the ORB part for a pair of input images (in RGB and gray versions).
    
        Args:
                image_l: Input left image (RGB)
                image_l_gray: Input left image (grayscale)
                image_r: Input right image (RGB)
                image_r_gray: Input right image (grayscale)
        
        Returns:
                num_kps: average number of keypoints found in each image
                num_matches: number of matches found
                tdet: detection time per keypoint
                tmat: matching time per match
    """
    # Write your code here!
    # [...]
    # return num_kps, num_matches, tdet, tmat

In [8]:
def process_SIFT(image_l, image_l_gray, image_r, image_r_gray):
    """ Process all the SIFT part for a pair of input images (in RGB and gray versions).
    
        Args:
                image_l: Input left image (RGB)
                image_l_gray: Input left image (grayscale)
                image_r: Input right image (RGB)
                image_r_gray: Input right image (grayscale)
        
        Returns:
                num_kps: average number of keypoints found in each image
                num_matches: number of matches found
                tdet: detection time per keypoint
                tmat: matching time per match
    """
    # Write your code here!
    # [...]
    # return num_kps, num_matches, tdet, tmat

In [9]:
def process_SURF(image_l, image_l_gray, image_r, image_r_gray):
    """ Process all the SURF part for a pair of input images (in RGB and gray versions).
    
        Args:
                image_l: Input left image (RGB)
                image_l_gray: Input left image (grayscale)
                image_r: Input right image (RGB)
                image_r_gray: Input right image (grayscale)
        
        Returns:
                num_kps: average number of keypoints found in each image
                num_matches: number of matches found
                tdet: detection time per keypoint
                tmat: matching time per match
    """
    # Write your code here!
    # [...]
    # return num_kps, num_matches, tdet, tmat

## Exercise 1: Changes in lighting conditions
Use `bright1.png` and `bright2.png` images.

<img src="./images/bright1.png" width="300" align="left"/><img src="./images/bright2.png" width="300" align="right"/>

### Read images and convert them to gray (will be used for all the methods)

In [10]:
# Write your code here!
## Read images and convert them to gray
image_l = cv2.imread(images_path + 'bright1.png')
image_r = cv2.imread(images_path + 'bright2.png')
image_l = cv2.cvtColor(image_l,cv2.COLOR_BGR2RGB)
image_r = cv2.cvtColor(image_r,cv2.COLOR_BGR2RGB)
image_l_gray = cv2.cvtColor(image_l,cv2.COLOR_RGB2GRAY)
image_r_gray = cv2.cvtColor(image_r,cv2.COLOR_RGB2GRAY)

### Make tests

In [1]:
# HARRIS + NCC
# stats_kps[0,0],stats_mat[0,0],stats_tdet[0,0],stats_tmat[0,0] = process_Harris_NCC(image_l, image_l_gray, image_r, image_r_gray)

In [2]:
# HARRIS + ORB
# stats_kps[0,1],stats_mat[0,1],stats_tdet[0,1],stats_tmat[0,1] = process_Harris_ORB(image_l, image_l_gray, image_r, image_r_gray)

In [3]:
# ORB
# stats_kps[0,2],stats_mat[0,2],stats_tdet[0,2],stats_tmat[0,2] = process_ORB(image_l, image_l_gray, image_r, image_r_gray)

In [4]:
# SIFT
# stats_kps[0,3],stats_mat[0,3],stats_tdet[0,3],stats_tmat[0,3] = process_SIFT(image_l, image_l_gray, image_r, image_r_gray)

In [5]:
# SURF
# stats_kps[0,4],stats_mat[0,4],stats_tdet[0,4],stats_tmat[0,4] = process_SURF(image_l, image_l_gray, image_r, image_r_gray)

## Exercise 2: Changes in rotation
Use `rotate1.png` and `rotate2.png` images.

<img src="./images/rotate1.png" width="300" align="left"/><img src="./images/rotate2.png" width="300" align="right"/>

### Read images and convert them to gray (will be used for all the methods)

In [16]:
# Write your code here!

### Make tests

In [9]:
# HARRIS + NCC

In [10]:
# HARRIS + ORB

In [8]:
# ORB

In [7]:
# SIFT

In [6]:
# SURF

## Exercise 3: Changes in scale
Use `scale1.png` and `scale2.png` images.

<img src="./images/scale1.png" width="300" align="left"/><img src="./images/scale2.png" width="300" align="right"/>

### Read images and convert them to gray (will be used for all the methods)

In [11]:
# Write your code here!

### Make tests

In [13]:
# HARRIS + NCC

In [14]:
# HARRIS + ORB

In [15]:
# ORB

In [16]:
# SIFT

In [17]:
# SURF

## Exercise 4: Changes in point of view
Use `pov1.png` and `pov2.png` images.

<img src="./images/pov1.png" width="300" align="left"/><img src="./images/pov2.png" width="300" align="right"/>

### Read images and convert them to gray (will be used for all the methods)

In [18]:
# Write your code here!

### Make tests

In [19]:
# HARRIS + NCC

In [20]:
# HARRIS + ORB

In [21]:
# ORB

In [22]:
# SIFT

In [23]:
# SURF

### Final Graphs

Finally, create a grid of bar plots with the results obtained in each test: number of keypoints (row 1), number of matches (row 2) and timing information (rows 3 and 4), with one column per method. With the five methods evaluated here, this gives a 4x5 grid.

In [24]:
# Write your code here
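As a starting point, the sketch below lays out such a grid with `plt.subplots`, assuming one row per statistic, one column per method and one bar per test inside each subplot (a 4x5 grid for the five methods in this notebook). The function name `plot_stats` and the label strings are hypothetical, and the arrays are zero-filled placeholders with the same shape as `stats_kps`, `stats_mat`, `stats_tdet` and `stats_tmat`, not real results.

```python
import numpy as np
import matplotlib.pyplot as plt

methods = ['Harris+NCC', 'Harris+ORB', 'ORB', 'SIFT', 'SURF']
tests = ['lighting', 'rotation', 'scale', 'point of view']

def plot_stats(stats, titles):
    """Plot each (tests x methods) array in stats as one row of bar charts."""
    fig, axes = plt.subplots(len(stats), len(methods),
                             figsize=(20, 12), squeeze=False)
    for row, (data, title) in enumerate(zip(stats, titles)):
        for col, method in enumerate(methods):
            ax = axes[row, col]
            # One bar per test for this method and statistic
            ax.bar(tests, data[:, col])
            ax.set_title(f'{method}: {title}')
            ax.tick_params(axis='x', labelrotation=45)
    fig.tight_layout()
    return fig, axes

# Placeholder data only: swap in the four stats_* arrays filled by the exercises
placeholder = [np.zeros((4, 5)) for _ in range(4)]
fig, axes = plot_stats(placeholder,
                       ['keypoints', 'matches', 's / keypoint', 's / match'])
```

Grouping the bars by test inside each subplot makes it easy to see how a single method degrades from one kind of change to another; transposing the layout would instead highlight the differences between methods for a given change.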

### CONCLUSION

- Are the evaluated methods invariant to these changes?
- Which one would you use if you had to work with each kind of image?
- Which one would you use if you needed a real-time system?
- If any method is NOT invariant to a certain change, can you think of a way to make it more robust against that change?

**<span style="color:blue">(Answer these questions here!)</span>**