# Introduction

The objective of this session is to continue practicing with tensors, work with a real data-set, and get a feeling for how well or badly the k-nearest neighbor rule and PCA dimension reduction perform on MNIST and CIFAR10.

The questions should be answered by writing a source file and executing it by running the python command in a terminal, with the source file name as argument.

Both can be done from the main Jupyter window with:
• “New” → “Text file” to create the source code, or selecting the file and clicking “Edit” to edit an existing one.
• “New” → “Terminal” to start a shell from which you can run python.

Another option is to connect to the VM on port 2022 on the host with an SSH client such as PuTTY.

The source should start with

    import torch
    from torch import Tensor
    import dlc_practical_prologue as prologue

to use the functions provided in the prologue. You are of course free to do without it.
You can get information about the practical sessions and the provided helper functions on the course’s website:
https://fleuret.org/dlc/

In [1]:
import torch
from torch import Tensor
import dlc_practical_prologue as prologue

In [2]:
prologue.load_data()

* Using MNIST
** Reduce the data-set (use --full for the full thing)
** Use 1000 train and 1000 test samples


(tensor([[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]]),
 tensor([5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5, 3, 6, 1, 7, 2, 8, 6, 9, 4, 0, 9, 1,
         1, 2, 4, 3, 2, 7, 3, 8, 6, 9, 0, 5, 6, 0, 7, 6, 1, 8, 7, 9, 3, 9, 8, 5,
         9, 3, 3, 0, 7, 4, 9, 8, 0, 9, 4, 1, 4, 4, 6, 0, 4, 5, 6, 1, 0, 0, 1, 7,
         1, 6, 3, 0, 2, 1, 1, 7, 9, 0, 2, 6, 7, 8, 3, 9, 0, 4, 6, 7, 4, 6, 8, 0,
         7, 8, 3, 1, 5, 7, 1, 7, 1, 1, 6, 3, 0, 2, 9, 3, 1, 1, 0, 4, 9, 2, 0, 0,
         2, 0, 2, 7, 1, 8, 6, 4, 1, 6, 3, 4, 5, 9, 1, 3, 3, 8, 5, 4, 7, 7, 4, 2,
         8, 5, 8, 6, 7, 3, 4, 6, 1, 9, 9, 6, 0, 3, 7, 2, 8, 2, 9, 4, 4, 6, 4, 9,
         7, 0, 9, 2, 9, 5, 1, 5, 9, 1, 2, 3, 2, 3, 5, 9, 1, 7, 6, 2, 8, 2, 2, 5,
         0, 7, 4, 9, 7, 8, 3, 2, 1, 1, 8, 3, 6, 1, 0, 3, 1, 0, 0, 1, 7, 2, 7, 3,
         

# 1 Nearest neighbor

Write a function that gets a training set and a test sample and returns the label of the training point the closest to the latter.
More precisely, write:

    def nearest_classification(train_input, train_target, x):

where

    • train_input is a 2d float tensor of dimension n × d containing the training vectors,
    • train_target is a 1d long tensor of dimension n containing the training labels,
    • x is a 1d float tensor of dimension d containing the test vector,
and the returned value is the class of the train sample closest to x for the L2 norm.
Hint: The function should have no python loop, and may use in particular torch.mean, torch.view, torch.pow, torch.sum, and torch.sort or torch.min. My version is 164 characters long.

In [3]:
def nearest_classification(train_input, train_target, x):
    n = train_input.size(0)
    d = train_input.size(1)
    diff = train_input-x
    diff_2 = diff.pow(2)
    sum_tot = [torch.mean(diff_2[i,:]) for i in range(n)]
    argmin = 0
    for i in range(1, n):
        if sum_tot[i]<sum_tot[argmin]:
            argmin = i
    return train_target[argmin]

In [4]:
train_input, train_target, test_input, test_target = prologue.load_data(cifar=True)

* Using CIFAR
Files already downloaded and verified
Files already downloaded and verified
** Reduce the data-set (use --full for the full thing)
** Use 1000 train and 1000 test samples


In [5]:
nearest_classification(train_input,train_target, test_input[0])

tensor(4)

### Correction

In [6]:
# Bad coding habits, do not copy
def nearest_classification(train_input, train_target, x):
    dist = (train_input - x).pow(2).sum(1).view(-1)
    _, n = torch.min(dist, 0)
    # n is a 0-dim index tensor, so it is used directly rather than as n[0]
    return train_target[n]

In [7]:
nearest_classification(train_input,train_target, test_input[0])


tensor(4)
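
As a sanity check, the loop-free idea can be sketched on a tiny hand-made data-set where the answer is obvious. The function name `nearest_classification_argmin` and the toy tensors are illustrative assumptions, not part of the course material, and `torch.argmin` is used here in place of `torch.min`.

```python
import torch

def nearest_classification_argmin(train_input, train_target, x):
    # Squared L2 distance from x to every training row (x broadcasts over rows)
    dist = (train_input - x).pow(2).sum(1)
    # Index of the smallest distance, used to look up the label
    return train_target[torch.argmin(dist)]

# Toy data (assumption): three 2d points with labels 7, 8, 9
toy_input = torch.tensor([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
toy_target = torch.tensor([7, 8, 9])
toy_label = nearest_classification_argmin(toy_input, toy_target, torch.tensor([0.9, 1.2]))
print(toy_label)
```

The query point lies next to the second training point, so the returned label should be the one attached to that row.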

In [8]:
print(train_input.narrow(1, 0, 1))


tensor([[ 59.],
        [154.],
        [255.],
        [ 28.],
        [170.],
        [159.],
        [164.],
        [ 28.],
        [134.],
        [125.],
        [ 53.],
        [142.],
        [164.],
        [ 17.],
        [100.],
        [100.],
        [235.],
        [110.],
        [197.],
        [ 23.],
        [153.],
        [252.],
        [ 86.],
        [126.],
        [ 73.],
        [162.],
        [131.],
        [ 45.],
        [128.],
        [202.],
        [126.],
        [236.],
        [ 50.],
        [  7.],
        [172.],
        [251.],
        [169.],
        [ 95.],
        [110.],
        [ 98.],
        [101.],
        [145.],
        [127.],
        [ 99.],
        [139.],
        [ 54.],
        [ 94.],
        [ 77.],
        [191.],
        [255.],
        [ 16.],
        [213.],
        [ 63.],
        [157.],
        [ 45.],
        [156.],
        [141.],
        [ 66.],
        [ 97.],
        [252.],
        [201.],
        [114.],
        

# 2 Error estimation
Write a function

    def compute_nb_errors(train_input, train_target, test_input, test_target,
                          mean = None, proj = None):
where

    • train_input is a 2d float tensor of dimension n × d containing the train vectors,
    • train_target is a 1d long tensor of dimension n containing the train labels,
    • test_input is a 2d float tensor of dimension m × d containing the test vectors,
    • test_target is a 1d long tensor of dimension m containing the test labels,
    • mean is either None or a 1d float tensor of dimension d,
    • proj is either None or a 2d float tensor of dimension c × d,
that subtracts mean (if it is not None) from the vectors of both train_input and test_input, applies the operator proj (if it is not None) to both, and returns the number of classification errors using the 1-nearest-neighbor rule on the resulting data.

Hint: Use in particular torch.mm . My version is 487 characters long, and it has a loop (the horror!)

In [9]:
def compute_nb_errors(train_input, train_target, test_input, test_target, mean=None, proj=None):
    n_input = train_input.size(0)
    n_test = test_input.size(0)
    d = train_input.size(1)
    mean_train = torch.Tensor(1,d).fill_(1.0)
    mean_test = torch.Tensor(1,d).fill_(1.0)
    # subtract mean
    for i in range (d):
        mean_train[:,i]= train_input.narrow(1,i,1).mean()
        mean_test[:,i] = test_input.narrow(1,i,1).mean()
    diff_train = train_input-mean_train
    diff_test = test_input-mean_test
    if proj != None:
        proj_train = proj(diff_train)
        proj_test = proj(diff_test)
    nb_error = 0
    for i in range(n_test):
        label_predict = nearest_classification(train_input, train_target, test_input[i])
        label_true = test_target[i]
        if label_predict != label_true:
            nb_error +=1
    return f'The nb of error is {nb_error}, the prctg of failure is {nb_error/n_test}'
        
    
        
        
        

In [10]:
compute_nb_errors(train_input, train_target, test_input, test_target)


'The nb of error is 746, the prctg of failure is 0.746'

### Correction

In [11]:
def compute_nb_errors(train_input, train_target,
                      test_input, test_target,
                      mean = None, proj = None):

    if mean is not None:
        train_input = train_input - mean
        test_input = test_input - mean

    if proj is not None:
        train_input = train_input.mm(proj.t())
        test_input = test_input.mm(proj.t())

    nb_errors = 0

    # With loop, but I prefer clearer code when counting errors
    for n in range(0, test_input.size(0)):
        if test_target[n] != nearest_classification(train_input, train_target, test_input[n]):
            nb_errors = nb_errors + 1

    return nb_errors



In [12]:
compute_nb_errors(train_input, train_target, test_input, test_target)


746
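
If the loop over test samples is a concern, the error count could also be computed in one shot from the full matrix of test-to-train distances. The following is only a sketch under that assumption: the name `compute_nb_errors_vectorized` and the toy tensors are hypothetical, and `torch.cdist` builds an m × n matrix, which may be too large in memory for the full data-sets.

```python
import torch

def compute_nb_errors_vectorized(train_input, train_target, test_input, test_target,
                                 mean=None, proj=None):
    if mean is not None:
        train_input = train_input - mean
        test_input = test_input - mean
    if proj is not None:
        train_input = train_input.mm(proj.t())
        test_input = test_input.mm(proj.t())
    # (m, n) matrix of L2 distances between every test row and every train row
    dist = torch.cdist(test_input, train_input)
    # Label of the closest training sample for each test sample
    predicted = train_target[dist.argmin(dim=1)]
    return (predicted != test_target).sum().item()

# Toy data (assumption): two well-separated training points
toy_train = torch.tensor([[0.0, 0.0], [10.0, 10.0]])
toy_train_target = torch.tensor([0, 1])
toy_test = torch.tensor([[1.0, 0.0], [9.0, 9.0], [0.0, 1.0]])
toy_test_target = torch.tensor([0, 1, 1])  # last label is deliberately wrong
toy_errors = compute_nb_errors_vectorized(toy_train, toy_train_target,
                                          toy_test, toy_test_target)
print(toy_errors)
```

Only the deliberately mislabeled third test sample should be counted as an error on this toy data.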

### Correction note

I misread how the projection and the mean were meant to be used. Since subtracting the mean is only a translation, it caused no problem here.
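
The point about the translation can be checked numerically: shifting the train and test vectors by the same vector leaves all pairwise differences, and therefore the nearest neighbors, unchanged up to floating-point rounding. A minimal sketch, where the random toy data, the `shift` vector, and the helper `nearest_indices` are all illustrative assumptions:

```python
import torch

torch.manual_seed(0)
toy_train = torch.randn(50, 4)
toy_test = torch.randn(10, 4)
shift = torch.randn(4)  # plays the role of the mean vector

def nearest_indices(train, test):
    # Squared L2 distances via broadcasting: (m, 1, d) - (1, n, d) -> (m, n)
    dist = (test.unsqueeze(1) - train.unsqueeze(0)).pow(2).sum(2)
    return dist.argmin(dim=1)

before = nearest_indices(toy_train, toy_test)
after = nearest_indices(toy_train - shift, toy_test - shift)
print(torch.equal(before, after))
```

This only holds for the mean subtraction. Applying a projection does change the distances, so the two arguments cannot both be ignored.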

# 3 PCA

Write a function

    def PCA(x):

where x is a 2d float tensor of dimension n × d , which returns a pair composed of the 1d mean vector of dimension d and the PCA basis, ranked in decreasing order of the eigen-values, as a 2d tensor of dimension d × d.

### Hint: 
The function should have no python loop, and use in particular torch.eig and torch.sort. My version is 275 characters long.

In [13]:
def PCA(x):
    mean = x.mean(0)
    x = x-mean
    n = x.size(0)
    d = x.size(1)
    x_1 = x.t()
    X = x_1.mm(x)
    X_val, X_eig = X.eig(eigenvectors=True)
    
    return X_eig

In [None]:
PCA(test_input)

### Correction

In [None]:
def PCA(x):
    mean = x.mean(0)
    b = x - mean
    Sigma = b.t().mm(b)
    eigen_values, eigen_vectors = Sigma.eig(True)
    right_order = eigen_values[:,0].abs().sort(0, True)[1]
    eigen_vectors = eigen_vectors.t()[right_order]
    return mean, eigen_vectors
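
In recent PyTorch releases `torch.eig` is deprecated in favor of the `torch.linalg` functions and may no longer be available. Since the matrix being decomposed is symmetric, one possible equivalent uses `torch.linalg.eigh`. This is a sketch rather than the course's solution: the name `pca_eigh` and the toy data are assumptions made for illustration.

```python
import torch

def pca_eigh(x):
    mean = x.mean(0)
    b = x - mean
    sigma = b.t().mm(b)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigen_values, eigen_vectors = torch.linalg.eigh(sigma)
    # Re-rank in decreasing order of the eigenvalues
    order = eigen_values.argsort(descending=True)
    # Eigenvectors are stored as columns; transpose so each row is a basis vector
    return mean, eigen_vectors.t()[order]

torch.manual_seed(0)
# Toy data (assumption): variance is concentrated along the first axis
toy_x = torch.randn(200, 3) * torch.tensor([10.0, 1.0, 0.1])
toy_mean, toy_basis = pca_eigh(toy_x)
print(toy_basis[0])
```

On this toy data the first basis vector should be close to the first coordinate axis, up to sign, since that is where most of the variance lies.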

### Analysis

I forgot the sort. Read the documentation more carefully; the trick with the dim argument is very important.

# 4 Check that all this makes sense

Compare the performance of the 1-nearest neighbor rule on data projected either onto a 100d random subspace (i.e. using a basis generated with a normal distribution) or onto the PCA basis for different dimensions (e.g. 3, 10, 50, 100).

Compare also the performance between MNIST and CIFAR. Does all this make sense?

No submission of my own here, I understood R

### Correction

In [None]:
for c in [ False, True ]:

    train_input, train_target, test_input, test_target = prologue.load_data(cifar=c)

    nb_errors = compute_nb_errors(train_input, train_target, test_input, test_target)
    print('Baseline nb_errors {:d} error {:.02f}%'.format(nb_errors, 100 * nb_errors / test_input.size(0)))

    ##

    basis = train_input.new(100, train_input.size(1)).normal_()

    nb_errors = compute_nb_errors(train_input, train_target, test_input, test_target, None, basis)
    print('Random {:d}d nb_errors {:d} error {:.02f}%'.format(basis.size(0), nb_errors, 100 * nb_errors / test_input.size(0)))

    ##

    mean, basis = PCA(train_input)

    for d in [ 100, 50, 10, 3 ]:
        basis = basis.narrow(0, 0, d)
        nb_errors = compute_nb_errors(train_input, train_target, test_input, test_target, mean, basis)
        print('PCA {:d}d nb_errors {:d} error {:.02f}%'.format(d, nb_errors, 100 * nb_errors / test_input.size(0)))

