Official notes: http://cs231n.github.io/classification/

# Image classification

### Data driven approach

It means that the progress in an activity is compelled by data, rather than by intuition or by personal experience


1. Collect a dataset of images and labels
2. Use machine learning to train an image classifier
3. Evaluate the classifier on a withheld set of test images

def train(images, labels):
    # build a model for images -> labels
    return model
    
def predict(model, test_images):
    # predict test labels using the model
    return test_labels

### First classifier: Nearest neighbour classifier

1. Remember all training images and their labels
2. Predict the label of the most similar training image

To compare images we can use the **L1 distance** or the **Manhattan distance** 

\begin{equation*}
 d_1(I_1, I_2) =  \sum_{p} |I_1^p - I_2^p| 
\end{equation*}

* Gets trained instantly(as it just saves the images and labels for testing)
* Expensive(linear in size of training data) to test

Test time performance is much more important in practice. CNNs are expensive to train but has cheap test evaluation while the nearest neighbour classifier flips this.

**Approximate nearest neighbour(ANN) - library**:
    finds approximate nearest neighbours quickly

Validation sets are used for tuning hyperparameters. We split the training data into training data and validation data, test the model on validation data to get the best parameters and then use those parameters to test the test data.
** Number of neighbours to consider for classification(k) and metric to use for comparison are the hyper-parameters here**

In [45]:
class NearestNeighbor(object):
  def __init__(self):
    pass

  def train(self, X, y):
    """ X is N x D where each row is an example. Y is 1-dimension of size N """
    # the nearest neighbor classifier simply remembers all the training data
    self.Xtr = X
    self.ytr = y

  def predict(self, X):
    """ X is N x D where each row is an example we wish to predict label for """
    num_test = X.shape[0]
    # lets make sure that the output type matches the input type
    Ypred = np.zeros(num_test, dtype = self.ytr.dtype)

    # loop over all test rows
    for i in range(num_test):
      # find the nearest training image to the i'th test image
      # using the L1 distance (sum of absolute value differences)
      distances = np.sum(np.abs(self.Xtr - X[i,:]), axis = 1)
      min_index = np.argmin(distances) # get the index with smallest distance
      Ypred[i] = self.ytr[min_index] # predict the label of the nearest example

    return Ypred

In [37]:
from sklearn.model_selection import train_test_split
import numpy as np
import os
from PIL import Image
import scipy
import matplotlib.pyplot as plt
from sklearn.neighbors import NearestNeighbors


def read_data(rdir,cNames,fExt,refSize):
    featData = None
    labelData = None
    for c,cname in enumerate(cNames):
        dirName = rdir + cname
        #print(dirName)
        for root, dirs, files in os.walk(dirName):
            #print(root)
            for file in files:
                if(file.endswith(fExt)):
                    # read Image
                    img = np.asarray(Image.open(os.path.join(root,file)))
                    if(len(img.shape)!=3):
                        continue
                    img = scipy.misc.imresize(img,(refSize))
                    # collapse into vector
                    feat = np.reshape(img,(1,np.prod(refSize)))
                    # append to dataset
                    if featData is None:
                        featData = feat
                        labelData = c
                    else:
                        featData = np.vstack((featData,feat))
                        labelData = np.hstack((labelData,c))
                    #print("{}".format(img.shape))
                    #imgplot = plt.imshow(img)
                    #plt.show()
    
    #print("feat shape: {}, label shape:{}".format(featData.shape,labelData.shape))
    return featData, labelData

In [46]:
opts = {'rdir': r"C:\Users\nabhu\Documents\ML\Mlabs\mlabs-2018-problem-set\data\ETHZShapeClasses-V1.2/",
        'classNames' : {'Mugs','Swans'},
        'fExt':'jpg',
        'refSize' : [10,10,3],
        'trainSplit' : 0.7,
                'inf' : 1e10,
        'seed':0}

np.random.seed(opts['seed'])


# read the data
feat,label = read_data(opts['rdir'],
                       opts['classNames'],
                       opts['fExt'],
                       opts['refSize'])
print(feat,label)

C:\Users\nabhu\Documents\ML\Mlabs\mlabs-2018-problem-set\data\ETHZShapeClasses-V1.2/Mugs


`imresize` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``skimage.transform.resize`` instead.


C:\Users\nabhu\Documents\ML\Mlabs\mlabs-2018-problem-set\data\ETHZShapeClasses-V1.2/Swans
[[ 27  23  23 ...  48 102 148]
 [ 50  30  20 ...  36  25  20]
 [175 111  88 ... 197 165 137]
 ...
 [254 254 253 ... 254 254 253]
 [ 62  86  51 ... 119 126 105]
 [121 132 123 ...  80  92  81]] [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1]


In [47]:
feat = feat.reshape(feat.shape[0], 10*10*3)
# train test split
# ref: https://www.youtube.com/watch?v=Bk2-5FoQJr0
ftrain,ftest,ltrain,ltest = train_test_split(feat,label,train_size=opts['trainSplit'])



In [54]:
nn = NearestNeighbor() # create a Nearest Neighbor classifier class
nn.train(ftrain, ltrain) # train the classifier on the training images and labels
y_pred = nn.predict(ftest) # predict labels on the test images
print("Accuracy : {}".format(np.mean(ltest==y_pred)))

Accuracy : 0.4583333333333333
