## Group Members
1. Piyush Metkar - 47509180
2. Rishab Vaishya - 47505527
3. Dhaval Gogri - 47444609

# 1 BUSINESS UNDERSTANDING

We are always surrounded by objects, wherever we go, which can be categorized into cars, baseball-bats, chairs, lamps etc. Nowadays with taking photos on mobile becoming a common phenomenon, image processing to extract data from images  has become a field of interests among many companies such as Google, Facebook, Amazon etc for <b>tagging</b> objects in images.

Our data consists of 45 categories of objects with a total of more 5000 images.

We have taken a data set of images which consists of many categories of objects. By visualizing our data set we can extract the features of data from all the train images we have and can use those features to identify that object in  test images containing multiple objects. For example, if you are selling a product on Amazon and you upload wrong images for your product, like photos of chair instead of lamp, our algorithm with help to recognize this mistake. It would compare the data we have collected and the products that match as the same way on Amazon. This will help in improving user experience as any user looking for a lamp won't have to go through chairs instead.

Also it would he helpful in <b>tagging</b> objects in images. For example an image of a person with a car and an apple in hand. It would then recognize the person, car and the apple in the image.

For our algorithm to be successful, it has to clearly identify atleast the objects which are visually very different. Feature extraction techniques should be able to highlight the main components of images and distinguish them. It is ok to miss few items which are visually similar. For example, the canopy of an umbrella and the shade of lamps are visually similar and hence we expect to miss such items. Overall we are looking at <b>90%</b> hit rate.


# 2. DATA PREPARATION

We will resize all the images to 100x100 resolution for maintaining consistency. Also we will perform grey-scale conversion on all the images by removing all the RGB and alpha layer.

We have linearized the array into a 1-D array as per the requirement.



In [1]:
from sklearn import datasets as ds
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

low_memory = False

#Read in images into dataset object 

imageFolderPath=r'/Users/piyushmetkar/101_Object'
im_array = ds.load_files(imageFolderPath, description=None, categories=None, load_content=True, shuffle=True, encoding=None, decode_error='strict', random_state=0) 

#Here ds.loadfiles() reads the images as text files. So we use matplotlib.image to read the image file
#ds.loadfiles() gives us an object of the entire dataset and maps filenames, target, targetnames accordingly

print('Data:',len(im_array.data))
print('Target:',len(im_array.target))
print('FileNames:',len(im_array.filenames))
print('TargetNames:',len(im_array.target_names))

def rgb2gray(rgb):
    return np.dot(rgb[...,:3], [0.299, 0.587, 0.114])

im_array.data = np.zeros((5485, 10000))

for i in range(len(im_array.data)):
    try:
        img = mpimg.imread(im_array.filenames[i]) #Using imread() to read image files
        #img.resize(100,100,100) # resizing all images to 50, 50
        gray = rgb2gray(img)    # Converting all images to grayscale
        farr = gray.flatten()   # Linearize the images to 1-D image features
        im_array.data[i] = farr # replace in corresponding data
    except:
        pass

X = np.array(im_array.data)
y = np.array(im_array.target)
names = im_array.target_names
print('X Shape-', X.shape)
print('y Shape-', y.shape)

print(X[0][0])
n_samples, n_features = X.shape
h, w = 100,100
#n_classes = len(names)


Data: 294
Target: 294
FileNames: 294
TargetNames: 5
X Shape- (5485, 10000)
y Shape- (294,)
255.0


In [2]:
df = pd.DataFrame(im_array.data)
# add in the class targets and names
df['Target'] = im_array.target.astype(np.int)
print (df.info())
df.head()

ValueError: Length of values does not match length of index

# 3 DATA REDUCTION

## 3.1 Linear dimensionality reduction of the images using principal components analysis


In [None]:
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_pca = pca.fit(X).transform(X) # fit data and then transform it

# print the components
print ('pca:', pca.components_.shape)

def plot_explained_variance(pca):
    import plotly
    from plotly.graph_objs import Scatter, Marker, Layout, XAxis, YAxis, Bar, Line
    plotly.offline.init_notebook_mode() # run at the start of every notebook
    
    explained_var = pca.explained_variance_ratio_
    cum_var_exp = np.cumsum(explained_var)
    
    plotly.offline.iplot({
        "data": [Bar(y=explained_var, name='individual explained variance'),
                 Scatter(y=cum_var_exp, name='cumulative explained variance')
            ],
        "layout": Layout(xaxis=XAxis(title='Principal components'), yaxis=YAxis(title='Explained variance ratio'))
    })

pca = PCA(n_components=200)
X_pca = pca.fit(X)
plot_explained_variance(X_pca)

Thus here we need about <b><u>59</u></b> principal components to adequately represent 90% of the image data.

## 3.2  Non-linear dimensionality reduction

In [None]:
%%time
from sklearn.decomposition import KernelPCA

n_components = 10
print ("Extracting the top %d eigenobjects from %d objects, not calculating inverse transform" % (n_components, X.shape[0]))

kpca = KernelPCA(n_components=n_components, kernel='rbf', 
                fit_inverse_transform=False, gamma=8, # very sensitive to the gamma parameter,
                remove_zero_eig=True)  
kpca.fit(X.copy())

In [None]:
%%time
#  THIS  TAKES A LONG TIME TO RUN
from sklearn.decomposition import KernelPCA

n_components = 10
print ("Extracting the top %d eigenobjects from %d objects, ALSO getting inverse transform" % (n_components, X.shape[0]))

kpca = KernelPCA(n_components=n_components, kernel='rbf', 
                fit_inverse_transform=True, gamma=8, # very sensitive to the gamma parameter,
                remove_zero_eig=True)  
kpca.fit(X.copy())

In [None]:

# the above operation takes a long time to save the inverse transform parameters
# so let's save out the results to load in later!
import pickle

pickle.dump(kpca, open( '/Users/piyushmetkar/Desktop/KPCA/kpca.p', 'wb' ))

In [None]:
# Load the Kpca
import pickle
kpca = pickle.load(open( '/Users/piyushmetkar/Desktop/KPCA/kpca.p', 'rb' ))

In [None]:
# widgets example
from ipywidgets import widgets  # make this interactive!
def f(x):
    return x
widgets.interact(f, x=10)

In [None]:
import warnings
# warnings.simplefilter('ignore', DeprecationWarning)
# warnings.simplefilter("always",DeprecationWarning)



def plt_reconstruct(idx_to_reconstruct):
    idx_to_reconstruct = np.round(idx_to_reconstruct)
    
    reconstructed_image = pca.inverse_transform(pca.transform(X[idx_to_reconstruct].reshape(1, -1)))
    #reconstructed_image_rpca = rpca.inverse_transform(rpca.transform(X[idx_to_reconstruct].reshape(1, -1)))
    reconstructed_image_kpca = kpca.inverse_transform(kpca.transform(X[idx_to_reconstruct].reshape(1, -1)))
    
    
    plt.figure(figsize=(15,7))
    
    plt.subplot(1,3,1)
    imshow(X[idx_to_reconstruct].reshape((h, w)), cmap=plt.cm.gray)
    plt.title(names[y[idx_to_reconstruct]])
    plt.grid()
    
    plt.subplot(1,3,2)
    imshow(reconstructed_image.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Full PCA')
    plt.grid()
    
    plt.subplot(1,3,3)
    imshow(reconstructed_image_kpca.reshape((h, w)), cmap=plt.cm.gray)
    plt.title('Kernel PCA')
    plt.grid()
    plt.show()
    
widgets.interact(plt_reconstruct,idx_to_reconstruct=(0,n_samples-1,1),__manual=True)

## 3.3 
When we compare PCA with Kernel PCA, it is observed that we get higher principal components for KPCA when the number of images are less. However, as number of images increases, the KPCA becomes very smudgy. 

On an average standard PCA yeilds approx 40 more prinicpal components than KPCA when less images are fed in which they are visually similar. Thus PCA is better at dimensionality reduction.


## 3.4 Feature Extraction - Gabor

In [None]:
from matplotlib import pyplot as plt
import seaborn as sns
import copy
from ipywidgets import fixed
# put it together inside a nice widget
def closest_image(dmat,idx1):
    distances = copy.deepcopy(dmat[idx1,:]) # get all image diatances
    distances[idx1] = np.infty # dont pick the same image!
    idx2 = np.argmin(distances)
    
    distances[idx2] = np.infty
    idx3 = np.argmin(distances)
    
    plt.figure(figsize=(10,16))
    plt.subplot(1,3,1)
    plt.imshow(X[idx1].reshape((h,w)))
    plt.title("Original Image "+names[y[idx1]])
    plt.grid()

    plt.subplot(1,3,2)
    plt.imshow(X[idx2].reshape((h,w)))
    plt.title("Closest Image  "+names[y[idx2]])
    plt.grid()
    
    plt.subplot(1,3,3)
    plt.imshow(X[idx3].reshape((h,w)))
    plt.title("Next Closest Image "+names[y[idx3]])
    plt.grid()
    plt.show()
    
#widgets.interact(closest_image,idx1=(0,n_samples-1,1),dmat=fixed(dist_matrix),__manual=True)

In [None]:
from skimage.filters import gabor_kernel
from scipy import ndimage as ndi
from scipy import stats

# prepare filter bank kernels
kernels = []
for theta in range(4):
    theta = theta / 4. * np.pi
    for sigma in (1, 3):
        for frequency in (0.05, 0.25):
            kernel = np.real(gabor_kernel(frequency, theta=theta,
                                          sigma_x=sigma, sigma_y=sigma))
            kernels.append(kernel)

            
# compute the filter bank and take statistics of image
def compute_gabor(row, kernels, shape):
    feats = np.zeros((len(kernels), 4), dtype=np.double)
    for k, kernel in enumerate(kernels):
        filtered = ndi.convolve(row.reshape(shape), kernel, mode='wrap')
        _,_,feats[k,0],feats[k,1],feats[k,2],feats[k,3] = stats.describe(filtered.reshape(-1))
        # mean, var, skew, kurt
        
    return feats.reshape(-1)

idx_to_reconstruct = int(np.random.rand(1)*len(X))

gabr_feature = compute_gabor(X[idx_to_reconstruct], kernels, (h,w))
gabr_feature

In [None]:
# takes ~3 minutes to run entire dataset
%time gabor_stats = np.apply_along_axis(compute_gabor, 1, X, kernels, (h,w))
print(gabor_stats.shape)

In [None]:
from sklearn.metrics.pairwise import pairwise_distances
# find the pairwise distance between all the different image features
%time dist_matrix_gabor = pairwise_distances(gabor_stats)

In [None]:
plt.show()
widgets.interact(closest_image,idx1=(0,n_samples-1,1),dmat=fixed(dist_matrix_gabor),__manual=True)

## 3.5
As we can see above, Gabor works very well for our business case. In approx 90% of the cases, it identifies the closest neighbour from the same target class.
