# Introduction

For this project my pipelines are as follows:
1. SIFT keypoints and descriptors -> vector quantization -> SVM
2. SIFT keypoints and descriptors -> LBP -> SVM
3. Dense keypoints -> LBP -> Random Forest
4. SIFT keypoints and descriptors -> LBP w/ spatial pyramid -> SVM

Each of these pipelines is discussed separately below. The rest of the discussion covers:
1. Similarity of features that are sorted into the same k means cluster
2. The heatmap of the histograms for each category
3. Results for detection of birds and butterflies and the associated PR curve

**Note:** All code used to generate the shown figures will be added at the end of the notebook, with comments specifying which file the code was copied from. This is due to problems with the ```multiprocessing``` library not working correctly when imported into the notebook. 

**Note 2:** Occasionally, running a cell that uses the ```multiprocessing``` library would consume all of my memory and crash my computer. These programs can take ~5 GB of RAM when they run, and the problem may come from not fully stopping a previously executed cell before starting a new one (e.g. if a cell is displaying a graph, matplotlib is still running, and by extension so is the program). For that reason, I recommend running the code from a terminal rather than in the notebook cells.
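
One way to keep worker processes from lingering between runs is to create the pool inside a `with` block, so the workers are shut down as soon as the results are collected. The sketch below is an assumption about how the scripts could be structured, not the code used for the figures; `extract` and `extract_all` are hypothetical names, and `extract` is only a stand-in for the real per-image feature extraction.

```python
from multiprocessing import Pool
import os


def extract(index):
    # Hypothetical stand-in for per-image feature extraction.
    return index * index


def extract_all(indices, processes=None):
    # The context manager terminates the worker processes on exit,
    # so no pool is left running after the results are collected.
    with Pool(processes or os.cpu_count()) as pool:
        return pool.map(extract, indices)
```

In a script, `extract_all` would normally be called from inside an `if __name__ == "__main__":` block so that worker processes do not re-run the top-level code.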

## SIFT keypoints and descriptors -> Vector Quantization -> SVM

![](write_up_images/sift_confusion.png)

This particular run of the pipeline used the top 1000 SIFT features from each image of the training set (~380k features in total) and clustered them with a k=400 k-means algorithm.
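
The quantization step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the cluster centres have already been learned; the `quantize` and `bag_of_words` helpers are hypothetical, and the toy sizes (8 centres, 50 descriptors) stand in for k=400 and ~1000 descriptors per image.

```python
import numpy as np


def quantize(descriptors, codebook):
    # Squared Euclidean distance from every descriptor to every centre.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Index of the nearest centre for each descriptor.
    return d2.argmin(axis=1)


def bag_of_words(descriptors, codebook):
    words = quantize(descriptors, codebook)
    # minlength keeps the histogram length equal to k for every image,
    # even when some clusters receive no descriptors.
    return np.bincount(words, minlength=len(codebook))


rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 128))      # toy stand-in for the k=400 centres
descriptors = rng.normal(size=(50, 128))  # toy stand-in for SIFT descriptors
hist = bag_of_words(descriptors, codebook)
```

Each image then contributes one fixed-length histogram, which is what the classifier is trained on.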

The accuracy of ~78% was one of the best results I was able to produce, although occasionally a run would break 80%. Prior to this, I was using k=200 and getting values in the 71-74% range, whereas k=400 averages more like 75-78%. However, I saw a dramatic drop in accuracy above k=500.

I also found that the number of SIFT features had a significant impact. Using all of the features gave me around 66% accuracy, and using only 500 gave me ~70% fairly consistently. I found that 1000 to 1200 features marked the sweet spot where my accuracy plateaued. Because the accuracy varied from run to run, it was not possible to narrow down the range any further.

Overall this was my best performing pipeline, and I used the values of k and the number of features that worked well for it as a baseline for the rest of my pipelines. I also did not normalize any of my histograms because I found that normalization destroyed the accuracy of the pipelines. Finally, I resized all of the images to a height of 500 px, preserving the aspect ratio. I didn't experiment with this parameter very much, but I found this first-order implementation to be helpful.
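
The resize arithmetic is simple enough to sketch on its own. The `target_size` helper below is hypothetical, not taken from my scripts; it only computes the new dimensions, which could then be handed to a resize call such as `cv2.resize` (which expects the size as `(width, height)`).

```python
def target_size(height, width, target_height=500):
    # Scale factor that maps the original height onto the target height.
    scale = target_height / height
    # Round the width so the aspect ratio is preserved as closely as possible.
    return target_height, max(1, round(width * scale))


new_h, new_w = target_size(375, 500)
```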

## SIFT keypoints and descriptors -> LBP -> SVM

![](write_up_images/sift_lbp_confusion.png)

As will become obvious over the next three pipelines, anything that uses LBP is terrible. Using the same 1000 SIFT features as before, this pipeline consistently gives ~20% accuracy.

I have two theories on why LBP performs so badly.
1. Based on the impact that the number of bins had on vector quantization's accuracy, I suspect that the default 8-bit, 256-bin version of LBP isn't granular enough to pick out the major discriminating features.
2. I think LBP may be looking at too small a window to get a good idea of what the gradient around a feature point is really doing. By sampling pixels that are further away, I suspect the information gathered would be more useful.

The nice thing is that testing out both of these methods can easily be done by changing a single aspect of how LBP is coded up, but unfortunately I did not have time to implement those changes to see if they have any merit.
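
As a rough illustration of the second theory, here is a minimal sketch of an LBP with an adjustable sampling distance. It is not the `img_op.LBP` implementation: it assumes eight neighbours taken at integer offsets on a square ring (no circular interpolation), and the `lbp` function and its `radius` parameter are hypothetical.

```python
import numpy as np


def lbp(gray, radius=1):
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    r = radius
    # Only pixels with a full neighbourhood are coded, so the output
    # is smaller than the input by 2 * radius in each dimension.
    center = gray[r:h - r, r:w - r]
    # Eight neighbours on a square ring at distance `radius`, clockwise.
    offsets = [(-r, -r), (-r, 0), (-r, r), (0, r),
               (r, r), (r, 0), (r, -r), (0, -r)]
    codes = np.zeros(center.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[r + dy:h - r + dy, r + dx:w - r + dx]
        # Set this bit where the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes
```

Calling `lbp(gray, radius=3)` would compare each pixel with neighbours three pixels away while still producing 8-bit codes, so the histogram size stays at 256 bins.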


## Dense keypoints -> LBP -> Random Forest

![](write_up_images/dense_confusion.png)

As with the previous pipeline, the accuracy is terrible. But it is not as terrible as it could be. 

Originally I tried this pipeline with SVM instead of RF and consistently got ~15% accuracy. This indicates that the dense keypoints are inferior to the SIFT keypoints, but even with that deficiency, RF is able to compensate. I find that very surprising, since a test run of the first pipeline with RF gave a dismal 54% accuracy. Weird.

For the run above, I sampled every 5th pixel to get the dense representation. Using every 3rd pixel produced essentially equivalent results, but every 10th pixel saw a significant drop down to 13%. It seems like every bit of information counts when using a dense sampling technique.
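
To make the effect of the step size concrete, here is a small sketch of dense sampling on a toy array. The `dense_histogram` helper is hypothetical and the random array only stands in for a real LBP map. The number of samples falls with the square of the step, so a step of 10 keeps a quarter of the samples that a step of 5 does.

```python
import numpy as np


def dense_histogram(lbp_image, step=5, bins=256):
    # Keep every `step`-th pixel along both axes.
    samples = lbp_image[::step, ::step].ravel()
    # A fixed range gives every image the same bin edges.
    hist, _ = np.histogram(samples, bins=bins, range=(0, bins))
    return hist


rng = np.random.default_rng(0)
lbp_image = rng.integers(0, 256, size=(500, 400))  # toy stand-in for an LBP map
hist = dense_histogram(lbp_image, step=5)
```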

## SIFT keypoints and descriptors -> LBP w/ spatial pyramid -> SVM

![](write_up_images/sift_lbp_spacial.png)

The last pipeline used a spatial pyramid that broke the image up into four quadrants and concatenated the LBP histograms of the quadrants. Apparently the spatial data added a significant amount of missing information to the LBP, which I interpret as further evidence for my theory that the default LBP implementation doesn't gather enough information around each point to be useful.

I would be curious to know how the results would change if the image were divided into more cells.
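
A finer grid would be a small change. The sketch below generalizes the four-quadrant split to a `grid` x `grid` layout; it is an untested idea rather than something I ran on the data set, and the `grid_histogram` helper and its parameters are hypothetical.

```python
import numpy as np


def grid_histogram(lbp_image, kpy, kpx, grid=2, bins=256):
    h, w = lbp_image.shape
    # Map each keypoint to a row and column index of the grid.
    rows = np.minimum(kpy * grid // h, grid - 1)
    cols = np.minimum(kpx * grid // w, grid - 1)
    values = lbp_image[kpy, kpx]
    parts = []
    for r in range(grid):
        for c in range(grid):
            in_cell = (rows == r) & (cols == c)
            # One histogram per cell, all with identical bin edges.
            hist, _ = np.histogram(values[in_cell], bins=bins, range=(0, bins))
            parts.append(hist)
    # Concatenate the cells in row-major order into one feature vector.
    return np.concatenate(parts)
```

With `grid=2` this yields the same 4 x 256 layout as the quadrant version, and `grid=4` would give a 16 x 256 vector, at the cost of fewer keypoints per cell.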

## Similarity of features that are sorted into the same k means cluster

![](write_up_images/similar_features.png)

To generate these images, I chose four clusters from a k=400 k means that was generated from ~380k features and looked at a 64x64 area around features that were found to be in those clusters. 

From the images it is hard to tell what the delineating feature for each group is. At first glance it looks like the selected features are not being fully sorted in some of the clusters. But if you ignore the brightness of the image and instead look at the gradients, it becomes more feasible to find similarities between the images.

Even with that consideration, some of the images (like the one in the lower left of the fourth group) do not seem to share any discernible features with their cluster mates.

It could be that I'm looking at bad clusters for this, but I don't think that there is enough data just from the vector quantization to make truly distinct clusters. Adding in some other features like spatial or color information may help with this.

## Heatmap of the histograms for each category

![](write_up_images/heatmap.png)

I believe that there is something wrong with the averaged graph on the right. It is pretty easy to pick a bright line in the right graph and not be able to find a corresponding collection of bright dots in the left graph. Unfortunately I'm not sure what the problem is.
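
One way to make the two panels directly comparable would be to build the averaged map with exactly the same shape as the per-image map, replacing each column with the mean of its category. This is a sketch of that idea, not a diagnosis of the figure above; the `averaged_heatmap` helper is hypothetical and assumes `X` has one column per image and `y` gives the category of each column.

```python
import numpy as np


def averaged_heatmap(X, y):
    # X has one column per image; y holds the category of each column.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    averaged = np.empty_like(X)
    for category in np.unique(y):
        mask = y == category
        # Broadcast the category mean into every column of that category.
        averaged[:, mask] = X[:, mask].mean(axis=1, keepdims=True)
    return averaged
```

Because the columns stay in their original positions, a bright row in the averaged map can be checked against the same columns of the per-image map.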

## Results for detection of birds and butterflies and the associated PR curve

![](write_up_images/detection.png)

As expected, the detection rate is significantly higher than the classification rates. The PR curve more or less matches what the ```sklearn``` library produces.
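
For reference, a PR curve can be computed from the classifier's confidence scores in a few lines. This is a minimal sketch and not the `img_op.precision_recall` code: the function below is hypothetical, it assumes the positive class is labelled 1 (or `True`), and it does not group tied scores into a single threshold, which may be one source of small differences from a library implementation.

```python
import numpy as np


def precision_recall(labels, scores):
    labels = np.asarray(labels, dtype=bool)
    # Sort the samples from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float))
    hits = labels[order]
    tp = np.cumsum(hits)    # true positives at each cut-off
    fp = np.cumsum(~hits)   # false positives at each cut-off
    precision = tp / (tp + fp)
    recall = tp / hits.sum()
    return precision, recall
```

Plotting `recall` on the x axis against `precision` on the y axis gives the curve.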

In [None]:
# SIFT keypoints and descriptors -> vector quantization -> SVM
# code taken from sift.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
import numpy as np
from multiprocessing import Pool
import os
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import confusion_matrix
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
import image_operations as img_op

ds = DataSet()
data_train = ds.butterfly_train
data_test = ds.butterfly_test

mbk = MiniBatchKMeans(n_clusters=400)


def vector_quantization_train(features):
    sift_features = np.empty((0, 128))
    for f, label in features:
        mbk.partial_fit(f)
    return vector_quantization(features)


def vector_quantization(features):
    X = []
    y = []
    count = 0
    for f, label in features:
        count += len(f)
        vq = mbk.predict(f)
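        # A fixed range keeps the bin edges identical for every image and
        # avoids drawing a matplotlib histogram just to count cluster ids.
        vals, _ = np.histogram(vq, bins=400, range=(0, 400))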
        X.append(vals)
        y.append(label)
    # plt.show()
    plt.close()
    return X, y, count


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    sift = cv2.xfeatures2d.SIFT_create(1000)
    kp, des = sift.detectAndCompute(img, None)
    return [des, label]


pool = Pool(os.cpu_count())
print("Training")
features = pool.map(get_features, [(i, data_train)
                                   for i in range(len(data_train))])

X, y, count = vector_quantization_train(features)
print("Number of features: ", count)
clf = LinearSVC()
# clf = RandomForestClassifier()
clf.fit(X, y)

print("Testing")
features = pool.map(get_features, [(i, data_test)
                                   for i in range(len(data_test))])
X, y, count = vector_quantization(features)
print("Number of features: ", count)
predictions = clf.predict(X)
# Fraction of test images whose predicted label matches the true label.
accuracy = np.count_nonzero(predictions == np.asarray(y)) / predictions.shape[0]
print("Accuracy: ", accuracy)
cm = confusion_matrix(y, predictions)
plt.figure()
img_op.plot_confusion_matrix(
    cm, classes=ds.categories, title='SIFT w/ VQ Confusion Matrix  \nAccuracy={}'.format(accuracy))
plt.show()

In [None]:
# SIFT keypoints and descriptors -> LBP -> SVM
# code in sift_lbp.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
import image_operations as img_op
import numpy as np
from multiprocessing import Pool
import os
from sklearn.svm import LinearSVC
import random
from sklearn.metrics import confusion_matrix

ds = DataSet()
data_train = ds.butterfly_train
data_test = ds.butterfly_test

lbp_features = np.empty((0))


def make_image_histogram(kps, lbp):
    kpx = np.asarray([np.around(kp.pt[0]) for kp in kps], dtype=np.int32)
    kpy = np.asarray([np.around(kp.pt[1]) for kp in kps], dtype=np.int32)
    vector = lbp[kpy, kpx]
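    # A fixed range keeps the bin edges identical for every image and
    # avoids leaving matplotlib artists behind in each worker process.
    vals, _ = np.histogram(vector, bins=256, range=(0, 256))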
    return vals


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    lbp = img_op.LBP(gray)
    sift = cv2.xfeatures2d.SIFT_create(1000)
    kp, des = sift.detectAndCompute(gray, None)
    return [make_image_histogram(kp, lbp), label]


def clf_format_data(features):
    X = []
    y = []
    for f, label in features:
        X.append(f)
        y.append(label)
    return X, y


print("Training")
# get_features([0, data_train])
pool = Pool(os.cpu_count())
features = pool.map(get_features, [(i, data_train)
                                   for i in range(len(data_train))])
X, y = clf_format_data(features)
clf = LinearSVC()
clf.fit(X, y)

print("Testing")
# Reuse the training pool instead of starting a second set of worker processes.
features = pool.map(get_features, [(i, data_test)
                                   for i in range(len(data_test))])
X, y = clf_format_data(features)
predictions = clf.predict(X)
# Fraction of test images whose predicted label matches the true label.
accuracy = np.count_nonzero(predictions == np.asarray(y)) / predictions.shape[0]
print("Accuracy: ", accuracy)
cm = confusion_matrix(y, predictions)
plt.figure()
img_op.plot_confusion_matrix(
    cm, classes=ds.categories, title='SIFT w/ LBP Confusion Matrix  \nAccuracy={}'.format(accuracy))
plt.show()

In [None]:
# Dense keypoints -> LBP -> Random Forest
# code in dense.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
from sklearn.metrics import confusion_matrix
import image_operations as img_op
import numpy as np
from multiprocessing import Pool
import os
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier

import random

ds = DataSet()
data_train = ds.butterfly_train
data_test = ds.butterfly_test

lbp_features = np.empty((0))


def make_image_histogram(kps, lbp):
    kpx = np.asarray([np.around(kp.pt[0]) for kp in kps], dtype=np.int32)
    kpy = np.asarray([np.around(kp.pt[1]) for kp in kps], dtype=np.int32)
    vector = lbp[kpy, kpx]
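    # A fixed range keeps the bin edges identical for every image and
    # avoids leaving matplotlib artists behind in each worker process.
    vals, _ = np.histogram(vector, bins=256, range=(0, 256))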
    return vals


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    lbp = img_op.LBP(gray)
    # sampling done by lbp[::x, ::x] where x is number of pixels to skip between samplings
    # A fixed range keeps the bin edges identical for every image.
    vals, _ = np.histogram(
        lbp[::5, ::5].flatten(), bins=256, range=(0, 256))
    return [vals, label]


def clf_format_data(features):
    X = []
    y = []
    for f, label in features:
        X.append(f)
        y.append(label)
    return X, y


print("Training")
# get_features([0, data_train])
pool = Pool(os.cpu_count())
features = pool.map(get_features, [(i, data_train)
                                   for i in range(len(data_train))])
X, y = clf_format_data(features)
# clf = LinearSVC()
clf = RandomForestClassifier()
clf.fit(X, y)

print("Testing")
# Reuse the training pool instead of starting a second set of worker processes.
features = pool.map(get_features, [(i, data_test)
                                   for i in range(len(data_test))])
X, y = clf_format_data(features)
predictions = clf.predict(X)
# Fraction of test images whose predicted label matches the true label.
accuracy = np.count_nonzero(predictions == np.asarray(y)) / predictions.shape[0]
print("Accuracy: ", accuracy)
cm = confusion_matrix(y, predictions)
plt.figure()
img_op.plot_confusion_matrix(
    cm, classes=ds.categories, title='Dense Sampling w/ LBP Confusion Matrix \nAccuracy={}'.format(accuracy))
plt.show()

In [None]:
# SIFT keypoints and descriptors -> LBP w/ spacial pyramid -> SVM
# code from sift_lbp_spacial.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
import image_operations as img_op
import numpy as np
from multiprocessing import Pool
import os
from sklearn.svm import LinearSVC
import random
from sklearn.metrics import confusion_matrix

ds = DataSet()
data_train = ds.butterfly_train
data_test = ds.butterfly_test

lbp_features = np.empty((0))


def spacial(lbp, kpy, kpx):
    # Split the image into four quadrants and concatenate one 256-bin LBP
    # histogram per quadrant.
    height, width = lbp.shape[0], lbp.shape[1]
    xbound = width // 2
    ybound = height // 2
    quadrants = [[0, xbound, 0, ybound],
                 [xbound, width, 0, ybound],
                 [0, xbound, ybound, height],
                 [xbound, width, ybound, height]]
    vector = np.empty((0))
    for xlower, xupper, ylower, yupper in quadrants:
        # Half-open bounds so a keypoint on a boundary lands in exactly one quadrant.
        mask = ((xlower <= kpx) & (kpx < xupper) &
                (ylower <= kpy) & (kpy < yupper))
        # A fixed range keeps the bin edges identical for every quadrant and image.
        vals, _ = np.histogram(
            lbp[kpy[mask], kpx[mask]], bins=256, range=(0, 256))
        vector = np.hstack((vector, vals))
    return vector


def make_image_histogram(kps, lbp):
    kpx = np.asarray([np.around(kp.pt[0]) for kp in kps], dtype=np.int32)
    kpy = np.asarray([np.around(kp.pt[1]) for kp in kps], dtype=np.int32)
    return spacial(lbp, kpy, kpx)


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    lbp = img_op.LBP(gray)
    sift = cv2.xfeatures2d.SIFT_create(500)
    kp, des = sift.detectAndCompute(gray, None)
    return [make_image_histogram(kp, lbp), label]


def clf_format_data(features):
    X = []
    y = []
    for f, label in features:
        X.append(f)
        y.append(label)
    return X, y


print("Training")
# get_features([0, data_train])
pool = Pool(os.cpu_count())
features = pool.map(get_features, [(i, data_train)
                                   for i in range(len(data_train))])
X, y = clf_format_data(features)
clf = LinearSVC()
clf.fit(X, y)

print("Testing")
# Reuse the training pool instead of starting a second set of worker processes.
features = pool.map(get_features, [(i, data_test)
                                   for i in range(len(data_test))])
X, y = clf_format_data(features)
predictions = clf.predict(X)
# Fraction of test images whose predicted label matches the true label.
accuracy = np.count_nonzero(predictions == np.asarray(y)) / predictions.shape[0]
print("Accuracy: ", accuracy)
cm = confusion_matrix(y, predictions)
plt.figure()
img_op.plot_confusion_matrix(
    cm, classes=ds.categories, \
    title='SIFT LBP & Spacial Pyramid pooling Confusion Matrix  \nAccuracy={}'.format(accuracy))
plt.show()

In [None]:
# Similarity of features that are sorted into the same k means cluster
# code in similar_features.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
import numpy as np
from multiprocessing import Pool
import os
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import confusion_matrix
from sklearn.svm import LinearSVC
import image_operations as img_op

ds = DataSet()
data_train = ds.butterfly_train
data_test = ds.butterfly_test

mbk = MiniBatchKMeans(n_clusters=400)


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    sift = cv2.xfeatures2d.SIFT_create(1000)
    kp, des = sift.detectAndCompute(img, None)
    return [kp, des]


print("Training")
features = []
for i in range(50):
    features.append(get_features([i, data_train]))

sift_features = np.empty((0))
for kp, des in features:
    mbk.partial_fit(des)

similar_patches = []
for i in np.random.randint(99, size=6):
    # Detect keypoints, convert to grayscale and assign clusters once per image.
    kps, dess = get_features([i, data_train])
    gray = cv2.cvtColor(ds.get_image(data_train[i][1]), cv2.COLOR_BGR2GRAY)
    clusters = mbk.predict(dess)
    patches = []
    # Collect this image's patches cluster by cluster for clusters 0-4.
    for j in range(5):
        for kp, cluster in zip(kps, clusters):
            if cluster == j:
                patches.append(img_op.getPatchFor(kp, gray))
    similar_patches.append(patches)
images = []
for patches in similar_patches:
    if len(patches) >= 6:
        top_row = np.hstack((patches[0], patches[1]))
        top_row = np.hstack((top_row, patches[2]))
        # top_row = np.hstack((top_row, patches[3]))
        bottom_row = np.hstack((patches[3], patches[4]))
        bottom_row = np.hstack((bottom_row, patches[5]))
        # bottom_row = np.hstack((bottom_row, patches[7]))
        images.append(np.vstack((top_row, bottom_row)))
print(len(images))
for i in range(1, 5):
    plt.subplot(4, 1, i)
    plt.imshow(cv2.cvtColor(images[i-1], cv2.COLOR_GRAY2RGB))
plt.show()

In [None]:
# The heatmap of the histograms for each category
# coded in k_dim_histogram.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
import numpy as np
from multiprocessing import Pool
import os
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import confusion_matrix
from sklearn.svm import LinearSVC
import image_operations as img_op

ds = DataSet()
data = ds.k_histogram

mbk = MiniBatchKMeans(n_clusters=200)


def vector_quantization_train(features):
    sift_features = np.empty((0, 128))
    for f, label in features:
        mbk.partial_fit(f)
    return vector_quantization(features)


def vector_quantization(features):
    X = []
    y = []
    count = 0
    for f, label in features:
        count += len(f)
        vq = mbk.predict(f)
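        # A fixed range keeps the bin edges identical for every image and
        # avoids drawing a matplotlib histogram just to count cluster ids.
        vals, _ = np.histogram(vq, bins=200, range=(0, 200))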
        X.append(vals)
        y.append(label)
    # plt.show()
    plt.close()
    return X, y, count


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    sift = cv2.xfeatures2d.SIFT_create(1000)
    kp, des = sift.detectAndCompute(img, None)
    return [des, label]


pool = Pool(os.cpu_count())
print("Training")
features = pool.map(get_features, [(i, data)
                                   for i in range(len(data))])

X, y, count = vector_quantization_train(features)
X = np.asarray(X).T
y = np.asarray(y)
plt.subplot(121)
plt.imshow(X, cmap='jet', interpolation='nearest')
plt.title("Heat Map of All Histograms")

# X_category = np.ones(X.shape)
# y = np.asarray(y)
# for i in range(1, 11):
#     mask = y == i
#     avg = np.mean(X[:, mask], axis=0)
#     X_category[:, mask] = X_category[:, mask] * avg[:, np.newaxis]
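# Repeat each category's mean histogram 21 times so that every category
# forms a band of the same width in the averaged heat map.
rows = []
for i in range(1, 11):
    mask = y == i
    avg = np.around(np.mean(X[:, mask], axis=1))
    # Every category, including the first, contributes 21 identical rows.
    rows.extend([avg] * 21)
X_cat = np.vstack(rows)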
plt.subplot(122)
plt.imshow(X_cat.T, cmap='jet', interpolation='nearest')
plt.title("Heat Map of Averaged Histograms for each Category")
plt.show()

In [None]:
# Results for detection of birds and butterflies and the associated PR curve
# code in detection.py
from DataSet import DataSet
import matplotlib.pyplot as plt
import cv2
import numpy as np
from multiprocessing import Pool
import os
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import confusion_matrix
from sklearn.svm import LinearSVC
import pickle
import image_operations as img_op

ds = DataSet()
data_train = ds.training_set
data_test = ds.test_set

mbk = MiniBatchKMeans(n_clusters=400)


def vector_quantization_train(features):
    sift_features = np.empty((0, 128))
    for f, label in features:
        mbk.partial_fit(f)
    return vector_quantization(features)


def vector_quantization(features):
    X = []
    y = []
    count = 0
    for f, label in features:
        count += len(f)
        vq = mbk.predict(f)
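        # A fixed range keeps the bin edges identical for every image and
        # avoids drawing a matplotlib histogram just to count cluster ids.
        vals, _ = np.histogram(vq, bins=400, range=(0, 400))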
        X.append(vals)
        y.append(label)
    # plt.show()
    plt.close()
    return X, y, count


def get_features(args):
    i = args[0]
    data = args[1]
    img = ds.get_image(data[i][1])
    label = data[i][0]
    sift = cv2.xfeatures2d.SIFT_create(500)
    kp, des = sift.detectAndCompute(img, None)
    return [des, label]


pool = Pool(os.cpu_count())
print("Training")
features = pool.map(get_features, [(i, data_train)
                                   for i in range(len(data_train))])

X, y, count = vector_quantization_train(features)
print("Number of features: ", count)
clf = LinearSVC()
clf.fit(X, y)

print("Testing")
features = pool.map(get_features, [(i, data_test)
                                   for i in range(len(data_test))])
X, y, count = vector_quantization(features)
print("Number of features: ", count)
predictions = clf.predict(X)
# Fraction of test images whose predicted label matches the true label.
accuracy = np.count_nonzero(predictions == np.asarray(y)) / predictions.shape[0]
print("Accuracy ", accuracy)
cm = confusion_matrix(y, predictions)
plt.figure()
plt.subplot(121)
img_op.plot_confusion_matrix(
    cm, classes=['butterflies', 'birds'], title='SIFT w/ VQ Confusion Matrix \nAccuracy={}'.format(accuracy))
plt.subplot(122)
confidences = clf.decision_function(X)
img_op.precision_recall(y, confidences)
# img_op.precision_recall(y, predictions)
plt.show()