### Lecture 1: KNN and linear classifier

In [1]:
import numpy as np

##### 1) K-nearest-neighbors
* Hyperparameters: 
     - K (number of neighbors)
     - distance metric (e.g. L1 or L2)
* Description: 
     - Predicts the label of a sample by majority vote over the labels of its K closest neighbors in the training set, where closeness is defined by the chosen distance metric (L1, L2, ...).
     - Hyperparameters are typically chosen using a validation set or k-fold cross-validation on the training set.
* Drawbacks: 
     - Training is fast (O(1), since it only stores the data), but prediction is slow (O(N) per query, where N is the number of training samples); we usually want the opposite.
     - Distance metrics on raw pixels are not informative, so KNN performs badly on images.
     - Curse of dimensionality: the number of training examples needed to cover the space densely grows exponentially with the dimension of the problem.

In [3]:
class KNearestNeighbor:
    def __init__(self, K=1, loss_function='L1'):
        self.K = K
        # 'loss_function' names the distance metric: 'L1' or 'L2'
        self.distance = loss_function

    def train(self, X, y):
        # Training only memorizes the data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        y_pred = np.zeros(X.shape[0], dtype=self.ytr.dtype)
        for i in range(X.shape[0]):
            diff = self.Xtr - X[i, :]
            if self.distance == 'L2':
                # Squared L2 distance; skipping the square root does not change the ranking
                distances = np.sum(diff ** 2, axis=1)
            else:
                # L1 distance (also the fallback for unrecognized metric names)
                distances = np.sum(np.abs(diff), axis=1)
            # Indices of the K smallest distances; kth = K - 1 stays valid
            # even when K equals the number of training samples
            min_ind = np.argpartition(distances, self.K - 1)[:self.K]
            # Majority vote among the labels of the K nearest neighbors
            labels, counts = np.unique(self.ytr[min_ind], return_counts=True)
            y_pred[i] = labels[np.argmax(counts)]
        return y_pred

Example of use:

In [6]:
KNN = KNearestNeighbor()
X = np.array([[0,0,1], [0,1,0], [1,0,0]])
y = np.array(['r', 'g', 'b'])
X_new = [[0,0.5,1], [1,0,0]]
KNN.train(X,y)
print(KNN.predict(np.array(X_new)))

['r' 'b']
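The description above says that K is typically chosen on a validation set. Below is a minimal sketch of that idea, not a definitive recipe. The helper `knn_predict`, the two-blob toy data, the 80/20 split and the candidate values of K are all assumptions made for illustration.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Majority vote among the k nearest training points (L1 distance)."""
    preds = []
    for x in X_query:
        distances = np.sum(np.abs(X_train - x), axis=1)
        nearest = np.argsort(distances)[:k]
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Hypothetical toy data: two Gaussian blobs in 2-D, one per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(3.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Shuffle, then hold out the last 20% as a validation set.
order = rng.permutation(len(X))
X, y = X[order], y[order]
split = int(0.8 * len(X))
X_tr, y_tr, X_val, y_val = X[:split], y[:split], X[split:], y[split:]

# Evaluate each candidate K on the validation set only.
val_accuracy = {}
for k in (1, 3, 5, 7):
    val_accuracy[k] = float(np.mean(knn_predict(X_tr, y_tr, X_val, k) == y_val))

# Keep the K with the highest validation accuracy.
best_k = max(val_accuracy, key=val_accuracy.get)
print(val_accuracy, best_k)
```

The test set is deliberately absent here: it should only be used once, after K has been fixed.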


##### 2) Linear classifier
* Description: 
     - f(x, W) = Wx + b (score function for each class). Size: #classes x 1
     - x: sample. Size: N x 1 (need to flatten images).
     - W: Weights matrix (updated during training). Size: #classes x N
     - b: bias (updated during training). Size: #classes x 1
* Drawbacks: 
     - Decision boundaries are hyperplanes, so the classifier cannot correctly separate classes that are not linearly separable (for example, an XOR-like arrangement).
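The shapes listed in the description can be checked with a short sketch of the score function. The sizes (3 classes, 4 features), the random values and the helper name `scores` are illustrative assumptions. Taking the class with the highest score as the prediction is also an assumption, since the notes do not state the decision rule.

```python
import numpy as np

num_classes, N = 3, 4  # hypothetical sizes, chosen only for illustration

rng = np.random.default_rng(0)
W = rng.normal(size=(num_classes, N))  # weights: #classes x N
b = np.zeros((num_classes, 1))         # bias:    #classes x 1
x = rng.normal(size=(N, 1))            # sample:  N x 1 (a flattened image)

def scores(x, W, b):
    """Score function f(x, W) = Wx + b, one score per class."""
    return W @ x + b

s = scores(x, W, b)                  # shape: #classes x 1
predicted_class = int(np.argmax(s))  # assumed rule: highest score wins
print(s.shape, predicted_class)
```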