In this lab, we will investigate two highly useful features for image analysis: the color histogram and SIFT. Both features are widely used in computer vision and image analysis problems such as face recognition, similar image search, image classification, and object detection.

## Notes on OpenCV

Due to certain legal restrictions, current OpenCV packages no longer ship with SIFT built in. Depending on the OS you are using, the instructions for building and installing OpenCV with Python bindings and SIFT support may differ. We will be using OpenCV 3.1.0 for this lab. Older versions should work as well, but you might have to change some lines of code. Depending on your comfort level, you may choose either pathway.

If you choose to build OpenCV from source, this will take a while. After the build process completes, a `cv2.so` file will be produced in the lib folder. Copy this file to your notebook directory so that you can `import cv2`. **DO NOT SUBMIT THIS FILE.**

Make sure to at least get OpenCV and SIFT working before you leave the lab today.
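
Because the SIFT entry point has moved between OpenCV releases, a small helper can hide the difference. The sketch below is only an illustration and is not part of the lab files: `make_sift` is a hypothetical name, and it takes the imported module as an argument so the lookup order is explicit. It assumes `cv2.SIFT()` for 2.4.x, `cv2.xfeatures2d.SIFT_create()` for 3.x builds with the contrib modules, and `cv2.SIFT_create()` for 4.4 and later.

```python
def make_sift(cv2_module):
    """Return a SIFT detector from whichever entry point this OpenCV build exposes.

    make_sift is a hypothetical helper, not part of OpenCV or the lab files.
    """
    # OpenCV 4.4 and later expose SIFT in the main module.
    if hasattr(cv2_module, 'SIFT_create'):
        return cv2_module.SIFT_create()
    # OpenCV 3.x built with the opencv_contrib modules (the setup this lab targets).
    contrib = getattr(cv2_module, 'xfeatures2d', None)
    if contrib is not None and hasattr(contrib, 'SIFT_create'):
        return contrib.SIFT_create()
    # OpenCV 2.4.x.
    if hasattr(cv2_module, 'SIFT'):
        return cv2_module.SIFT()
    raise RuntimeError('This OpenCV build does not provide SIFT')
```

In the notebook this would be called as `sift = make_sift(cv2)` after `import cv2`. If it raises `RuntimeError`, the installed build has no SIFT and needs to be rebuilt or replaced.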

Documentation for OpenCV can be found [here](http://docs.opencv.org).

### *A note about submission*:

For this lab, submit this notebook with all required code contained therein. If explanations or observations are asked for, create a text box in the notebook in the section of the Exercise and put your comments there.

Since Exercise 2 requires the submission of a folder of the similar images found by your code, submit this notebook and that folder together in a zip file. Name the file `yourNetID_lab12.zip`.

In [1]:
%matplotlib inline

# Color Histogram

For image analysis, much of the information in an image can be conveyed by its histogram alone, although other, more sophisticated features, such as SIFT, are often needed in big data problems. Here, we'll investigate how useful the histogram can be. For this section, you will need to download the Caltech-101 dataset (http://www.vision.caltech.edu/Image_Datasets/Caltech101/). This dataset contains 101 different object categories, each with between 40 and 800 images of roughly 300 x 200 pixels. (This dataset pales in comparison to other modern, "big data" image datasets, but it's the best that your computer can likely handle easily.)

In [2]:
import cv2
import matplotlib.pyplot as plt
from scipy import misc
import numpy as np


First, we'll look at three test-case images.

In [None]:
beach = misc.imread('beach.png')
desert1 = misc.imread('desert1.jpg')
desert2 = misc.imread('desert2.jpg')

plt.imshow(beach)
plt.figure()
plt.imshow(desert1)
plt.figure()
plt.imshow(desert2)

We provide the following two functions to generate and compare histograms. They are built upon OpenCV's implementation.

In [None]:
# Compares two histograms using the histogram intersection

def compHist(h1, h2):
    # h1, h2 - input histograms to compare, which should be of dimension (3,N) for color images and (1,N) for
    #              grayscale images. If the two histograms are not the same dimensions, the output is a large
    #              value to enforce their dissimilarity
    if h1.shape[0] != h2.shape[0]:
        return 100
    else:
        # One intersection distance per channel, so (1,N) grayscale histograms work as well as (3,N) color ones
        x = [1 - cv2.compareHist(h1[c,:].astype('float32'), h2[c,:].astype('float32'), cv2.HISTCMP_INTERSECT)
             for c in range(h1.shape[0])]
        # Combine the per-channel distances with a Euclidean norm
        comp = np.sqrt(np.sum(np.square(x)))
        return comp

In [None]:
# Generates a histogram from an input image

def generateHist(img, n_bins):
    # img - the input image
    # n_bins - the number of bins of the output histogram
    # returns: h - the histogram
    k = len(img.shape)
    if k == 2:
        h = np.zeros((1,n_bins),dtype='float32')
        M,N = img.shape
        h[0,:][:, np.newaxis] = cv2.calcHist([img], [0], None, [n_bins], [0,256])
        h = h/(M*N)
    else:
        h = np.zeros((3,n_bins),dtype='float32')
        M,N,D = img.shape
        h[0,:][:, np.newaxis] = cv2.calcHist([img], [0], None, [n_bins], [0,256])
        h[1,:][:, np.newaxis] = cv2.calcHist([img], [1], None, [n_bins], [0,256])
        h[2,:][:, np.newaxis] = cv2.calcHist([img], [2], None, [n_bins], [0,256])
        h = h/(M*N)
    return h

Now, we'll use the above functions to generate histograms for the three test cases.

In [None]:
Q = 100

hist_desert1 = generateHist(desert1, Q)
hist_desert2 = generateHist(desert2, Q)
hist_beach = generateHist(beach, Q)

In [None]:
f, (ax1, ax2, ax3) = plt.subplots(3, sharex=True, sharey=True)
f.set_size_inches((8,8))
ax1.plot(hist_beach[0,:])
ax1.plot(hist_beach[1,:])
ax1.plot(hist_beach[2,:])
ax1.set_title('Histograms of Beach and Desert Scenes')

ax2.plot(hist_desert1[0,:])
ax2.plot(hist_desert1[1,:])
ax2.plot(hist_desert1[2,:])

ax3.plot(hist_desert2[0,:])
ax3.plot(hist_desert2[1,:])
ax3.plot(hist_desert2[2,:])
ax3.set_xlabel('Color Intensity Bins')

Using histogram intersection as the comparison metric, we measure how similar the histograms are. Which two images are the most similar? (Remember that this metric is a distance, so smaller values mean more similar.)

In [None]:
print('Beach vs. Desert1:\t' + str(compHist(hist_beach, hist_desert1))) # beach vs. desert1
print('Desert2 vs. Desert1:\t' + str(compHist(hist_desert2, hist_desert1))) # desert2 vs. desert1
print('Beach vs. Desert2:\t' + str(compHist(hist_beach, hist_desert2))) # beach vs. desert2
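
To see what the intersection distance measures, it may help to write it out without OpenCV. The sketch below is a NumPy stand-in under the assumption that each histogram row is normalized to sum to 1; `intersection_distance` is a hypothetical name used only for illustration, and the two toy histograms are made up.

```python
import numpy as np

def intersection_distance(h1, h2):
    """NumPy stand-in for an intersection-based distance (illustration only).

    h1, h2: arrays of shape (channels, n_bins) whose rows each sum to 1.
    """
    h1 = np.asarray(h1, dtype=np.float64)
    h2 = np.asarray(h2, dtype=np.float64)
    # Histogram intersection per channel: the mass the two histograms share.
    overlap = np.minimum(h1, h2).sum(axis=1)
    # 1 - overlap is 0 for identical rows and 1 for rows with no shared bins.
    per_channel = 1.0 - overlap
    # Combine the channels with a Euclidean norm.
    return np.sqrt(np.sum(per_channel ** 2))

# Toy 3-channel histograms with 3 bins each (made-up values).
a = np.array([[0.5, 0.5, 0.0]] * 3)
b = np.array([[0.0, 0.5, 0.5]] * 3)
print(intersection_distance(a, a))  # identical histograms
print(intersection_distance(a, b))  # half of each channel overlaps
```

With normalized rows, identical histograms are at distance 0, and three-channel histograms that share no bins at all are at distance sqrt(3).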

### Exercise 1:

Write your own implementation of the histogram. It should take in an image and a parameter called `n_bins`, which is the number of bins for the histogram. (In an 8-bit image, pixel values range from 0 to 255, so there are 256 possible values, but the histogram can have fewer than 256 bins. If, for instance, n_bins = 100, then each bin of the histogram would cover 2 or 3 different pixel intensity values.) For grayscale images, the output will be an array of dimension (1, n_bins). For color images, compute a 1D histogram of each channel with n_bins bins and stack the three histograms into a 2D matrix of dimension (3, n_bins). The histogram should also be normalized by the total number of pixels, so that each row sums to 1.
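
The claim about how intensities share bins can be checked with a few lines of arithmetic. This is a sketch that assumes uniform bins over the range [0, 256), so intensity `v` falls in bin `floor(v * n_bins / 256)`; it only counts intensities per bin and is not a histogram implementation.

```python
import numpy as np

n_bins = 100
values = np.arange(256)  # every possible 8-bit intensity
# Uniform binning over [0, 256): intensity v falls in bin floor(v * n_bins / 256).
bin_index = (values * n_bins) // 256
# Number of distinct intensities that share each bin.
per_bin = np.bincount(bin_index, minlength=n_bins)
print('min %d, max %d, total %d' % (per_bin.min(), per_bin.max(), per_bin.sum()))
```

Every bin ends up covering either 2 or 3 intensities, and together the bins cover all 256 values.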

In [None]:
def histogram(image, n_bins):
    if len(image.shape) != 3:       # grayscale image

    else:                           # color image
    
    return hist

# Similar Image Search by Histogram Matching

Using the histogram as a feature for our image, let's see how well it works when searching for similar images in a large dataset. You will need to import the lab5.py file, as done below, which provides a function to load the Caltech-101 dataset. You should have downloaded the dataset from [here](http://www.vision.caltech.edu/Image_Datasets/Caltech101/) and unzipped it into the same directory as your notebook. It should be in a folder called '101_ObjectCategories'.

In [None]:
import lab5

dataset_dir = '101_ObjectCategories/'

n_images, classes, image_names = lab5.load_image_dataset(dataset_dir)

The following code generates histograms for all of the images in the dataset.

In [None]:
hists = []

n_bins = 100

for name in image_names:
    temp_image = misc.imread(name)
    temp_hist = generateHist(temp_image, n_bins)
    hists.append(temp_hist)

Here are the test query images that we will use to search for similar images. Notice the diversity in their histograms.

In [None]:
# Compare a test histogram to the database
test_ind = [450, 2423, 5211, 8134]
n_bins = 100

for i in test_ind:
    test = misc.imread(image_names[i])
    plt.figure()
    plt.imshow(test)

    test_hist = generateHist(test, n_bins)

    plt.figure()
    plt.plot(test_hist[0,:], color='red')
    plt.plot(test_hist[1,:], color='green')
    plt.plot(test_hist[2,:], color='blue')
    plt.title('Histogram for Image ' + str(i))

### Exercise 2:

The code below iterates through the above test query images and searches for their 10 most similar images. Fill in the missing elements to generate the histograms and compare them using your own histogram implementation from Exercise 1. The number of bins for the histogram should be n_bins = 100.

The images will be stored in a directory called "closest" in the same location as your notebook. They will have suffixes denoting which of the four images they match and their order in terms of similarity. (Note that the first image should be the query image itself, since it is still in the dataset and is more similar to itself than any other image.) Submit this folder with your notebook for the lab.

Do the results look reasonable? Explain why or why not.
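
The ranking step relies on `np.argsort`, which returns indices ordered from the smallest to the largest value. The toy example below uses made-up distances for six images, with index 3 standing in for the query itself, to show why the query comes first.

```python
import numpy as np

# Hypothetical distances from one query to six dataset images.
# Index 3 is the query itself, so its distance is 0.
D = np.array([0.42, 0.17, 0.88, 0.0, 0.23, 0.51])
K = 3

idx = np.argsort(D)  # indices ordered from smallest to largest distance
closest = idx[:K]    # the K nearest images, query first
print(closest)
```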

In [None]:
import os

closest_dir = 'closest/'
if not os.path.exists(closest_dir):
    os.makedirs(closest_dir)

n_bins = 100
K = 10
    
for ind in test_ind:
    print('Finding closest ' + str(K) + ' images to ' + image_names[ind])
    test = misc.imread(image_names[ind])
    
    test_hist = [] #edit this line to make things right

    D = np.zeros(n_images)
    for i in range(n_images):
        D[i] =  0 #edit this line to make things right 
    idx = np.argsort(D)
    
    closest = []
    for i in range(K):
        closest.append(image_names[idx[i]])
        img = misc.imread(closest[i])
        misc.imsave(closest_dir + 'closest_' + str(ind) + '_' + str(i) + '.jpg', img)
        print('\tSaving closest image ' + str(i))
    

# Scale Invariant Feature Transform (SIFT)

### Plotting the Keypoints

We will use OpenCV's built-in function to perform SIFT. In this part, we will find and match the keypoints of two images. One application is recognizing traffic signs with a camera. You can watch a demo [here](http://bit.ly/1l51Tra). Let's first load the image and visualize the keypoints.

Documentation can be found [here](http://docs.opencv.org/3.1.0/da/df5/tutorial_py_sift_intro.html#gsc.tab=0).

In [None]:
cv2.__version__

In [None]:
import cv2
import numpy as np

boxImage1 = cv2.imread('box.png')
grayImage1= cv2.cvtColor(boxImage1,cv2.COLOR_BGR2GRAY)

#sift = cv2.SIFT() # use this for OpenCV 2.4.x
sift = cv2.xfeatures2d.SIFT_create() # use this for OpenCV 3.x (requires the opencv_contrib modules)
kp = sift.detect(grayImage1,None)

img = boxImage1.copy()
cv2.drawKeypoints(grayImage1,kp, img)
plt.imshow(img)

We repeat this for the second image which we would like to match.

In [None]:
boxImage2 = cv2.imread('box_in_scene.png')
grayImage2 = cv2.cvtColor(boxImage2,cv2.COLOR_BGR2GRAY)

sift = cv2.xfeatures2d.SIFT_create() 
kp = sift.detect(grayImage2,None)

img=boxImage2.copy()
cv2.drawKeypoints(grayImage2,kp, img)
plt.imshow(img)

### Matching the Keypoints

To search for matching keypoints, we will use the Brute Force Matcher, which tries to evaluate every possible match.
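
What "brute force" means here can be sketched in plain NumPy: for each descriptor in the first image, compute the distance to every descriptor in the second image and keep the two nearest. The function below is a hypothetical stand-in (not the OpenCV implementation) that assumes Euclidean distance and applies the same ratio test used later in `filter_matches`. The toy descriptors are 2-D for readability; SIFT descriptors have 128 dimensions.

```python
import numpy as np

def brute_force_ratio_matches(des1, des2, ratio=0.75):
    """Return (query_idx, train_idx) pairs that pass the ratio test.

    Hypothetical helper for illustration; assumes des2 has at least two rows.
    """
    des1 = np.asarray(des1, dtype=np.float32)
    des2 = np.asarray(des2, dtype=np.float32)
    matches = []
    for i, d in enumerate(des1):
        # Euclidean distance from this descriptor to every descriptor in des2.
        dist = np.sqrt(((des2 - d) ** 2).sum(axis=1))
        best, second = np.argsort(dist)[:2]
        # Keep the match only if the best candidate is clearly closer
        # than the runner-up.
        if dist[best] < ratio * dist[second]:
            matches.append((i, int(best)))
    return matches

# Toy descriptors (made-up values).
query = [[0, 0], [10, 10], [5, 5]]
train = [[0, 1], [9, 10], [4, 5], [6, 5]]
print(brute_force_ratio_matches(query, train))
```

The third query descriptor is equally close to two candidates, so the ratio test rejects it as ambiguous. The cost grows with the product of the two descriptor counts, since every pair is compared.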

In [None]:
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(grayImage1,None)
kp2, des2 = sift.detectAndCompute(grayImage2,None)
bfmatcher = cv2.BFMatcher()

Here, you are provided with some functions to help you compute matches and display them. Read through them and make sure you understand what's going on.

In [1]:
def match_and_draw(kp1, kp2, des1, des2, grayImage1, grayImage2):
    raw_matches = bfmatcher.knnMatch(np.asarray(des1, np.float32), np.asarray(des2, np.float32), k = 2) # two nearest neighbors per descriptor, needed for the ratio test
    p1, p2, kp_pairs = filter_matches(kp1, kp2, raw_matches)
    if len(p1) >= 4:
        H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)
        #print '%d / %d  inliers/matched' % (np.sum(status), len(status))
    else:
        H, status = None, None
        #print '%d matches found, not enough for homography estimation' % len(p1)
    vis = explore_match(grayImage1, grayImage2, kp_pairs, status, H)
    return vis

def filter_matches(kp1, kp2, matches, ratio = 0.75):
    mkp1, mkp2 = [], []
    for m in matches:
        if len(m) == 2 and m[0].distance < m[1].distance * ratio:
            m = m[0]
            mkp1.append( kp1[m.queryIdx] )
            mkp2.append( kp2[m.trainIdx] )
    p1 = np.float32([kp.pt for kp in mkp1])
    p2 = np.float32([kp.pt for kp in mkp2])
    kp_pairs = list(zip(mkp1, mkp2)) # a list, so that len(kp_pairs) works in explore_match
    return p1, p2, kp_pairs

def explore_match(img1, img2, kp_pairs, status = None, H = None):
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]
    vis = np.zeros((max(h1, h2), w1+w2), np.uint8)
    vis[:h1, :w1] = img1
    vis[:h2, w1:w1+w2] = img2
    vis = cv2.cvtColor(vis, cv2.COLOR_GRAY2BGR)

    if H is not None:
        corners = np.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]])
        corners = np.int32( cv2.perspectiveTransform(corners.reshape(1, -1, 2), H).reshape(-1, 2) + (w1, 0) )
        cv2.polylines(vis, [corners], True, (255, 255, 255))

    if status is None:
        status = np.ones(len(kp_pairs), np.bool_)
    p1 = np.int32([kpp[0].pt for kpp in kp_pairs])
    p2 = np.int32([kpp[1].pt for kpp in kp_pairs]) + (w1, 0)

    green = (0, 255, 0)
    red = (0, 0, 255)
    white = (255, 255, 255)
    kp_color = (51, 103, 236)
    for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
        if inlier:
            col = green
            cv2.circle(vis, (x1, y1), 2, col, -1)
            cv2.circle(vis, (x2, y2), 2, col, -1)
        else:
            col = red
            r = 2
            thickness = 3
            cv2.line(vis, (x1-r, y1-r), (x1+r, y1+r), col, thickness)
            cv2.line(vis, (x1-r, y1+r), (x1+r, y1-r), col, thickness)
            cv2.line(vis, (x2-r, y2-r), (x2+r, y2+r), col, thickness)
            cv2.line(vis, (x2-r, y2+r), (x2+r, y2-r), col, thickness)
    vis0 = vis.copy()
    for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
        if inlier:
            cv2.line(vis, (x1, y1), (x2, y2), green)
    return vis

The match_and_draw function takes care of finding the matching keypoints and drawing them.

In [None]:
pairedImage = match_and_draw(kp1, kp2, des1, des2, grayImage1, grayImage2)
plt.imshow(pairedImage)
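
The outline drawn in `explore_match` comes from pushing the four corners of the first image through the homography `H`. The sketch below shows that mapping in NumPy under the usual homogeneous-coordinate convention; `apply_homography` is a hypothetical helper, and the matrix is a made-up example rather than one estimated from the box images.

```python
import numpy as np

def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) points through a 3x3 homography H.

    Hypothetical helper for illustration only.
    """
    points = np.asarray(points, dtype=np.float64)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1).
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous.dot(np.asarray(H, dtype=np.float64).T)
    # Divide by the third coordinate to return to ordinary 2-D points.
    return mapped[:, :2] / mapped[:, 2:3]

# Made-up homography: scale by 2, then shift by (10, 5).
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 1.0]])
corners = np.array([[0, 0], [4, 0], [4, 3], [0, 3]])
print(apply_homography(H, corners))
```

A homography estimated from real matches generally has a non-trivial last row, which is why the division by the third coordinate is needed.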

### Exercise 3: 

1. What is the information reduction factor when the image is represented by these SIFT features? Write the Python code to calculate the information reduction factor.
2. In the matching code above, BFMatcher was used to find matching pairs using the KNN algorithm. Suggest an improvement that might be able to decrease the run time of this step.
3. Extra Credit: Reimplement the matching function to use your improvement suggested above. You may use SKLearn or OpenCV's built-ins for this.
4. Choose any 3 images each from 2 different classes in the Caltech 101 dataset. Compute SIFT features and try to match them. Some images will match well while others will not. Briefly explain why this is so. Support your answer with some code.
5. Extra Credit: RANSAC was used to filter out false matches. Explain what RANSAC does in this step.

In [None]:
#Code to compute and match SIFT features from Caltech 101 here