# Lab08 - Interest Point Detection and Description
### CDS6334 Visual Information Processing


This lab explores the creation of local invariant features in two steps: detecting interest points (or keypoints), and describing the local features around the extracted interest points. Many of the functions needed for these tasks are readily available in OpenCV, while others are available from the unofficial OpenCV contrib modules, which can be easily installed.

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

## Interest point detection

### Harris corner detector

The Harris corner detector is a classic interest point detector that aims to find corners in an image. Corners are very good candidates for interest points because they are distinctive (interesting and unique to the image) and repeatable (easily found again despite transformations). <br>
The measure of cornerness is given by the equation:

$$R = det(M) - k(trace(M))^2$$

where
* $M = \sum \limits_{x,y}w(x,y)\begin{bmatrix}
I_xI_x & I_xI_y \\
I_xI_y & I_yI_y
\end{bmatrix}$
* $det(M) = \lambda_{1} \lambda_{2}$
* $trace(M) = \lambda_{1} + \lambda_{2}$
* $\lambda_{1}$ and $\lambda_{2}$ are the eigenvalues of $M$
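
To get a feel for the cornerness measure, here is a minimal sketch that evaluates $R$ for three hand-made $2 \times 2$ matrices $M$. The matrices and the helper name `harris_response` are made up for illustration; they are not computed from a real image. When both eigenvalues are small (flat region), $R$ is close to zero; when one eigenvalue dominates (edge), $R$ becomes negative; when both are large (corner), $R$ is large and positive.

```python
import numpy as np

def harris_response(M, k=0.04):
    """Cornerness R = det(M) - k * trace(M)^2 for a 2x2 matrix M."""
    return np.linalg.det(M) - k * np.trace(M) ** 2

# Hypothetical matrices M (values chosen only for illustration)
flat = np.array([[0.01, 0.0], [0.0, 0.01]])    # both eigenvalues small
edge = np.array([[10.0, 0.0], [0.0, 0.01]])    # one large, one small eigenvalue
corner = np.array([[10.0, 0.0], [0.0, 10.0]])  # both eigenvalues large

responses = {name: harris_response(M)
             for name, M in [("flat", flat), ("edge", edge), ("corner", corner)]}
for name, R in responses.items():
    print(f"{name:>6}: R = {R:.4f}")
```

This is why thresholding $R$ at a positive value keeps corners while rejecting both edges and flat regions.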

The function `cv2.cornerHarris(img, blockSize, ksize, k)` has the following arguments:

- *img* - Input image; it should be grayscale and of type float32.
- *blockSize* - Size of the neighbourhood considered for corner detection.
- *ksize* - Aperture parameter of the Sobel derivative used (basically the mask size).
- *k* - Harris detector free parameter in the equation above.

It returns an output with the same shape as the input image (`img`) where the values represent the *cornerness* of the pixels.

In [None]:
img = cv2.imread('house.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plt.imshow(gray, cmap='gray'), plt.xticks([]), plt.yticks([])
plt.show()

gray = np.float32(gray)
dst = cv2.cornerHarris(gray,3,3,0.04)

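# optional: dilate the response map to make the corners more obvious
dst = cv2.dilate(dst,None)

# Threshold for an optimal value, it may vary depending on the image.
img[dst>0.01*dst.max()]=[0,0,255]    # red in OpenCV's BGR channel order

# matplotlib expects RGB, so convert from BGR before displaying
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()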

**Q1**: Study the given re-implementation of the Harris corner detector (7 lines only!). Modify it to return only the top *N* corners, as ranked by the cornerness measure. This way, we keep only the *N* strongest corners in the image (indirectly, this also removes the need to threshold the detected corners!).
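
As a hint, one possible way to rank the entries of a 2-D array in NumPy is sketched below on a toy array. The helper name `top_n_locations` and the values are made up for illustration; this is only one approach and not the expected answer.

```python
import numpy as np

def top_n_locations(response, n):
    """Return (row, col) coordinates of the n largest values, strongest first."""
    # argsort over the flattened map, reversed so the largest values come first
    flat_idx = np.argsort(response, axis=None)[::-1][:n]
    rows, cols = np.unravel_index(flat_idx, response.shape)
    return np.column_stack((rows, cols))

# Toy 3x4 "cornerness" map (made-up values)
toy = np.array([[0.1, 0.9, 0.2, 0.0],
                [0.3, 0.4, 0.8, 0.1],
                [0.7, 0.0, 0.2, 0.5]])
top3 = top_n_locations(toy, 3)
print(top3)
```

On a real cornerness map, neighbouring pixels around the same corner tend to score similarly high, so the top *N* pixels may cluster around just a few corners.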

In [None]:
img = cv2.imread('house.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Write the function of your modified Harris corner detector here
def modifiedHarris(gray, blockSize, ksz, k):
    Ix = cv2.Sobel(gray,cv2.CV_64F,1,0,ksize=ksz)
    Iy = cv2.Sobel(gray,cv2.CV_64F,0,1,ksize=ksz)
    Ix2 = cv2.GaussianBlur(Ix*Ix,(blockSize,blockSize),0)
    Iy2 = cv2.GaussianBlur(Iy*Iy,(blockSize,blockSize),0)
    Ixy = cv2.GaussianBlur(Ix*Iy,(blockSize,blockSize),0)
    cornerness = (Ix2*Iy2 - Ixy*Ixy) - k*((Ix2 + Iy2)**2)    # det(M) - k*trace(M)^2
    ###Add code here to rank and return only N strongest corners
    
    ###
    return cornerness

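c = modifiedHarris(gray, 3, 3, 0.04)   # same parameters as OpenCV's function
img[c>0.01*c.max()]=[0,0,255]          # red in OpenCV's BGR channel order

# matplotlib expects RGB, so convert from BGR before displaying
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()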

### Refining corners

Sometimes, you may need to locate corners more precisely. OpenCV provides the function `cv2.cornerSubPix()`, which refines detected corners to sub-pixel accuracy (going beyond integer pixel coordinates to find the "true" corners).

First, find the Harris corners. A single corner usually produces a cluster of pixels, so we take the centroid of each cluster and pass these centroids to the function for refinement. The Harris corner centroids are marked in red and the refined corners in green. We also have to define the criteria for stopping the iteration: it stops after a specified number of iterations or once a certain accuracy is achieved, whichever occurs first. Finally, we need to define the size of the neighbourhood that is searched for corners.
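
To make the idea of a centroid concrete, here is a small sketch on a toy binary mask (made-up data, NumPy only). It computes the centroid of one blob of "corner" pixels by averaging their coordinates.

```python
import numpy as np

# Toy binary mask: a single 2x3 blob of "corner" pixels (made-up data)
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 1:4] = True

rows, cols = np.nonzero(mask)
# Report the centroid in (x, y) order, i.e. (column, row)
centroid_xy = (cols.mean(), rows.mean())
print(centroid_xy)
```

Note that the centroid falls between pixel centres here, which is why the result is stored as floating-point values. `cv2.connectedComponentsWithStats()` reports its centroids in the same (x, y) order, and its first row corresponds to the background label.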

In [None]:
img = cv2.imread('house.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# find Harris corners
gray = np.float32(gray)
dst = cv2.cornerHarris(gray,2,3,0.04)

# this dilation step is useful to "grow" the corners so that they overlap with other nearby corners.
# then, we can refine the corner with better consideration
ori_dst = dst
dst = cv2.dilate(dst,None)

# threshold
ori_nCorners = np.sum(ori_dst>0.01*ori_dst.max())
ret, dst = cv2.threshold(dst,0.01*dst.max(),255,0)
dst = np.uint8(dst)

# find centroids
ret, labels, stats, centroids = cv2.connectedComponentsWithStats(dst)

# define the criteria to stop and refine the corners
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.001)
corners = cv2.cornerSubPix(gray,np.float32(centroids),(5,5),(-1,-1),criteria)

# draw the original corners
img2 = cv2.imread('house.png')    # read again instead of making copies of the array 'img'
img2[dst>0.01*dst.max()]=[0,0,255]    # shown unconverted by matplotlib, so this appears blue

# draw the refined corners
res = np.hstack((centroids,corners))
res = np.intp(res)

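# draw bigger (3x3) markers at the corners for better clarity
# 'img' is displayed below without BGR-to-RGB conversion, so these triplets act as RGB colours
for i in range(3):
    for j in range(3):
        img[res[:,1]+i-1,res[:,0]+j-1]=[255,0,0]      # Harris corner centroids (red)
        img[res[:,3]+i-1,res[:,2]+j-1] = [0,255,0]    # refined corners (green)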
                
        
plt.figure(figsize=(10, 5))
plt.subplot(121), plt.imshow(img2), plt.title('Original Harris corners (blue)')
plt.subplot(122), plt.imshow(img), plt.title('Refined Harris corners (green)')
plt.show()
print('Number of corner pixels before refinement: %d'%(ori_nCorners))
print('Number of corners after refinement: %d'%(res.shape[0]))

Since this refinement step reduces the number of redundant corners (those too close to a particular true corner), we can revisit the earlier task of taking the top *N* corners, which will further reduce the refined corners to a much smaller number.

## SIFT

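Let's turn our attention to the SIFT detector and descriptor. Unfortunately, from OpenCV 3.0 onwards, SIFT and a few other feature methods (such as SURF) were moved out of the main package because they are considered "non-free" patented algorithms. They live in the unofficial [OpenCV contrib modules](https://pypi.python.org/pypi/opencv-contrib-python), which can be installed through pip. (The SIFT patent expired in 2020, and OpenCV 4.4.0 and later also expose SIFT in the main package as `cv2.SIFT_create()`; the code in this lab uses the contrib interface.)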

**`pip install opencv-contrib-python`**

Then, you are good to go.

In [None]:
img = cv2.imread('notredame1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
gray= cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

# this creates an xfeatures2d_SIFT object, doesn't do anything yet
sift = cv2.xfeatures2d.SIFT_create()
print(type(sift))

Documentation is difficult to find and may be confusing to beginners! The majority of feature extractors derive from the [`Feature2D`](https://docs.opencv.org/3.4.1/d0/d13/classcv_1_1Feature2D.html) class; those that come from the OpenCV contrib modules, including SIFT, are found under the [`xfeatures2d`](https://docs.opencv.org/3.4.1/d5/d3c/classcv_1_1xfeatures2d_1_1SIFT.html) module. The function that detects SIFT interest points and computes their descriptors is [`detectAndCompute()`](https://docs.opencv.org/master/d0/d13/classcv_1_1Feature2D.html#a8be0d1c20b08eb867184b8d74c15a677).

In [None]:
# detects and computes SIFT keypoints and descriptors
(kps, descs) = sift.detectAndCompute(gray, None)

print(len(kps))        # number of SIFT keypoints
print(descs.shape)     # shape of descriptor matrix. N keypoints x D dimensions

This tells us that altogether 2,337 SIFT keypoints were detected, yielding the same number of descriptors, each 128-dimensional (basically 128 values describing the feature).

To set other parameters related to SIFT, you have to pass them to the `SIFT_create()` function when creating the object.
- The *contrastThreshold* is used to filter out weak features in low-contrast regions. The larger the threshold, the fewer features are produced by the detector.
- The *edgeThreshold* is used to filter out edge-like features. The larger the threshold, the fewer features are filtered out (more features are retained).
- There are 3 more parameters: *nfeatures*, *nOctaveLayers* and *sigma* that can be used to alter how SIFT works.

In [None]:
sift = cv2.xfeatures2d.SIFT_create(contrastThreshold=0.14, edgeThreshold=5)
(kps, descs) = sift.detectAndCompute(gray, None)
print(len(kps))
print(descs.shape)

Now we have far fewer keypoints/descriptors, due to the higher *contrastThreshold* set.

In [None]:
print(kps)

What you see are *KeyPoint* objects. OpenCV is big on descriptors; it even has a dedicated object type just for handling interest points! The documentation for `KeyPoint` is [here](https://docs.opencv.org/master/d2/d29/classcv_1_1KeyPoint.html).

VLFeat, a popular feature library made famous in Matlab (we don't use it here), has a descriptor plotting function that shows the SIFT patches at different scales and orientations, which looks very cool:
![plotdescriptor](http://www.vlfeat.org/demo/sift_basic_3.jpg)

For us to visualize these keypoints, we can opt to draw a circle centered upon each keypoint with its radius indicating the size of the keypoint.

In [None]:
plt.figure(figsize=(8,6))
plt.imshow(img)
fig = plt.gcf()
ax = fig.gca()

for kp in kps:
    # kp.size is the keypoint diameter, so halve it to get the circle radius
    circle1 = plt.Circle(kp.pt, kp.size/2, color='r', fill=False)
    ax.add_artist(circle1)
plt.show()

Another way to visualize the SIFT keypoints is to use the `cv2.drawKeypoints()` function. It takes the KeyPoint objects directly, since they already store the location, size (diameter) and orientation of each keypoint.

In [None]:
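# flags=4 is cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS: circles show keypoint size and orientation
imm = cv2.drawKeypoints(img, kps, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)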
plt.figure(figsize=(10,8))
plt.imshow(imm)
plt.show()

**Q2**: Extract the top 200 features from SIFT and display them. You may have to relax the contrast threshold (bring it down) to generate more keypoints. Compare them with the previous ones. What do you observe?

In [None]:
#Enter code here


Extract SIFT features again, this time for the second Notre Dame image as well. For consistency, make sure the parameters are the same as those used to extract the SIFT features from the first Notre Dame image.

In [None]:
img1 = cv2.imread('notredame1.jpg')
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
gray1= cv2.cvtColor(img1, cv2.COLOR_RGB2GRAY)
sift1 = cv2.xfeatures2d.SIFT_create()
(kps1, des1) = sift1.detectAndCompute(gray1, None)
print(des1.shape)
 
img2 = cv2.imread('notredame2.jpg')
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
gray2= cv2.cvtColor(img2, cv2.COLOR_RGB2GRAY)
sift2 = cv2.xfeatures2d.SIFT_create()
(kps2, des2) = sift2.detectAndCompute(gray2, None)
print(des2.shape)    

# draw rich keypoints (size and orientation) on each image
imm1 = cv2.drawKeypoints(img1, kps1, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
imm2 = cv2.drawKeypoints(img2, kps2, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

plt.imshow(imm1), plt.title('notre dame 1')
plt.xticks([]), plt.yticks([])
plt.show()
plt.imshow(imm2), plt.title('notre dame 2')
plt.xticks([]), plt.yticks([])
plt.show()

The two Notre Dame images have different numbers of descriptors. That's alright; this is normal when using SIFT. Because SIFT depends on the image content, different images generally yield different numbers of descriptors, even with the same parameter settings. More complex image content normally generates more interest points, and hence more descriptors as well.

### Matching SIFT descriptors between images

OpenCV's Brute-Force matcher can help us match descriptors between images. How it works is really simple: it takes the descriptor of one feature in the first set and compares it against every feature in the second set using some distance measure, and the closest one is returned.

First, create a `BFMatcher` object. The first parameter specifies the distance measure used for matching. The L2-norm (Euclidean distance) is good for descriptors like SIFT and SURF, while binary string based descriptors like ORB, BRIEF and BRISK should use the Hamming distance, indicated by `cv2.NORM_HAMMING`. The second parameter enables cross-checking between two descriptors that match each other, i.e. only those pairs are kept whereby the $i$-th descriptor in set A has the $j$-th descriptor in set B as its best match and vice-versa (both ways). This is a reliable alternative to the "ratio test" method proposed in David Lowe's original SIFT paper.
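
The brute-force search and the cross-check can be sketched in plain NumPy as follows. This is an illustration of the idea on toy 2-D "descriptors", not OpenCV's actual implementation; the function name `brute_force_cross_check` and the data are made up.

```python
import numpy as np

def brute_force_cross_check(des_a, des_b):
    """Return (i, j, distance) for mutual nearest neighbours under the L2 norm."""
    # Pairwise Euclidean distances, shape (len(des_a), len(des_b))
    dists = np.linalg.norm(des_a[:, None, :] - des_b[None, :, :], axis=2)
    best_b_for_a = dists.argmin(axis=1)   # closest descriptor in B for each one in A
    best_a_for_b = dists.argmin(axis=0)   # closest descriptor in A for each one in B
    # Keep a pair only if each descriptor is the other's best match
    return [(i, int(j), float(dists[i, j]))
            for i, j in enumerate(best_b_for_a)
            if best_a_for_b[j] == i]

# Toy 2-D "descriptors" (real SIFT descriptors have 128 dimensions)
des_a = np.array([[0.0, 0.0], [5.0, 5.0], [0.5, 0.0]])
des_b = np.array([[0.1, 0.0], [5.0, 4.0]])
pairs = brute_force_cross_check(des_a, des_b)
print(pairs)
```

Here the third descriptor of set A also has the first descriptor of set B as its nearest neighbour, but that descriptor is closer to the first descriptor of set A, so the cross-check discards the third one.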

In [None]:
# Create BFMatcher object
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)

# Match descriptors.
matches = bf.match(des1,des2)

# Let's see what kind of information is in this list....
print(matches)

Let's examine the first match...

In [None]:
num = 0
print(matches[num].distance,"between descriptor",matches[num].queryIdx,"and",matches[num].trainIdx)

`matches` is a list of `DMatch` objects. Read up more [here](http://docs.opencv.org/trunk/d4/de0/classcv_1_1DMatch.html) about `DMatch`.

`DMatch` is a class for keypoint descriptor matches, and it contains a handful of values: query descriptor index, train descriptor index, train image index, and the distance between the descriptors.

Next, sort the matched descriptors in order of their distance. Use the built-in `sorted` function to sort `matches` by the distance attribute of the `DMatch` objects. A [lambda function](http://www.secnetix.de/olli/Python/lambda_functions.hawk) (an unnamed function that can be created at run-time) is useful here to extract the distance attribute from each object for comparison. Then, use `cv2.drawMatches()` to draw the first K matches between the two images.
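
If lambda functions are new to you, the short sketch below shows the same sorting pattern on a hypothetical stand-in for `DMatch` (a `namedtuple` called `Match`, with made-up values), so it runs without any images.

```python
from collections import namedtuple

# Hypothetical stand-in for cv2.DMatch, carrying only the fields used here
Match = namedtuple("Match", ["queryIdx", "trainIdx", "distance"])

toy_matches = [Match(0, 7, 210.5), Match(1, 3, 95.0), Match(2, 9, 140.2)]

# The lambda extracts the sort key (the distance) from each object
by_distance = sorted(toy_matches, key=lambda m: m.distance)
print([m.queryIdx for m in by_distance])
```

The match with the smallest distance comes first, so slicing the sorted list with `[:K]` gives the K best matches.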

In [None]:
# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)

# Draw first K matches.
# Passing None as the output image lets OpenCV allocate the side-by-side composite itself.
K = 20
img_matches = cv2.drawMatches(img1,kps1,img2,kps2,matches[:K],None,flags=4) # flags=4 draws rich keypoints; try flags=2 to hide unmatched keypoints

plt.imshow(img_matches)
plt.title('SIFT matching')
plt.xticks([]), plt.yticks([])
plt.show()

And, our matching is done! Observe carefully to check if the matched keypoint descriptors correspond to the correct "patch" on both images.

## Additional Exercises


**Q1**: A simplistic way of comparing images using SIFT descriptors is to sum up the distances between the top K matched descriptors. In other words, the distances between the top K pairs of matched descriptors are summed up to obtain a single value of dissimilarity. The smaller this value, the more similar the two matched images are.

Here's a new image, featuring our very own KLCC:
<img src="klcc.jpg" width="150">
Write some code in the function `findClosestPair` that can be used to compare pairs of images. Compare all 3 images (the two Notre Dame images and the KLCC image) against each other using matched SIFT descriptors to find out which two images are the closest to each other.
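
The dissimilarity measure itself is simple arithmetic, as the sketch below shows on made-up distance values. The helper name `top_k_dissimilarity` and the numbers are hypothetical; with real data, the distances would come from the `distance` attribute of the matched `DMatch` objects.

```python
def top_k_dissimilarity(distances, k):
    """Sum of the k smallest match distances (smaller means more similar)."""
    return sum(sorted(distances)[:k])

# Made-up match distances for two hypothetical image pairs
pair_ab = [80.0, 95.0, 120.0, 300.0]
pair_ac = [150.0, 210.0, 260.0, 310.0]

score_ab = top_k_dissimilarity(pair_ab, k=3)   # 80 + 95 + 120
score_ac = top_k_dissimilarity(pair_ac, k=3)   # 150 + 210 + 260
print(score_ab, score_ac)
```

In this toy case, the first pair has the lower score and would be judged the closer pair. Keep in mind that if a pair yields fewer than K matches, the sum covers fewer terms, which makes the scores harder to compare fairly.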

In [None]:
# determine for yourself what input parameters to use.
def findClosestPair(): 
    #Enter code here
    pass


**Q2**: Show the top K patches of two images that were matched. It will be interesting to see how *similar* or *different* the matched patches are, and why they are important interest points.
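
As a starting point, cropping a square patch around a keypoint is a matter of array slicing. The sketch below uses a toy array in place of a real image, and the helper name `extract_patch` is made up. For a real match `m`, the centre and size could be taken from `kps1[m.queryIdx].pt` and `kps1[m.queryIdx].size` (and likewise `kps2[m.trainIdx]` for the second image).

```python
import numpy as np

def extract_patch(image, x, y, size):
    """Crop a square patch of roughly `size` pixels centred on (x, y), clipped to the image."""
    half = max(int(round(size / 2)), 1)
    cx, cy = int(round(x)), int(round(y))
    # Rows correspond to y and columns to x; clip the window at the image borders
    top, bottom = max(cy - half, 0), min(cy + half + 1, image.shape[0])
    left, right = max(cx - half, 0), min(cx + half + 1, image.shape[1])
    return image[top:bottom, left:right]

# Toy 20x30 grayscale "image" (row index = y, column index = x)
toy_img = np.arange(20 * 30).reshape(20, 30)
patch = extract_patch(toy_img, x=10.2, y=5.7, size=6)
edge_patch = extract_patch(toy_img, x=0, y=0, size=6)   # clipped at the border
print(patch.shape, edge_patch.shape)
```

Patches near the image border come out smaller than the others, so you may want to resize them before displaying them side by side.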

In [None]:
#Enter code here
