# Machine Learning 2 - Neural Networks

In this lab, we will use simple Neural Networks to classify the images from the simplified CIFAR-10 dataset. We will compare our results with those obtained with Decision Trees and Random Forests.

Lab objectives
----
* Classification with neural networks
* Influence of hidden layers and of the selected features on the classifier results

In [6]:
from lab_tools import CIFAR10, evaluate_classifier, get_hog_image
        
dataset = CIFAR10('./CIFAR10/')

Pre-loading training data
Pre-loading test data


We will use the *[Multi-Layer Perceptron](http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier)* implementation from scikit-learn, which has been available since version 0.18. You can check which version of scikit-learn is installed by running the following:

In [7]:
import sklearn
print(sklearn.__version__)

0.23.1


If you have version 0.17 or older, please update your scikit-learn installation, for instance by running *pip install scikit-learn==0.19.1* in a terminal or Anaconda prompt.
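The version requirement can also be checked in code rather than by eye. The snippet below is a minimal sketch: `version_at_least` is a hypothetical helper (not part of scikit-learn or `lab_tools`) that compares only the major and minor components of a dotted version string.

```python
import re


def version_at_least(version, minimum):
    """Return True if the (major, minor) part of `version` is >= `minimum`.

    `version_at_least` is a hypothetical helper written for this lab.
    """
    # Keep the first two numeric components, e.g. "0.23.1" -> [0, 23].
    parts = [int(p) for p in re.findall(r"\d+", version)[:2]]
    # Pad short strings such as "1" to two components.
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= tuple(minimum)


# "0.23.1" is the version printed above; MLPClassifier needs at least 0.18.
print(version_at_least("0.23.1", (0, 18)))
print(version_at_least("0.17.1", (0, 18)))
```

In the notebook, the same helper could be called as `version_at_least(sklearn.__version__, (0, 18))`.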

## Build a simple neural network

* Using the [MLPClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) from scikit-learn, create a neural network with a single hidden layer.
* Train this network on the CIFAR dataset.
* Using cross-validation, try to find the best possible parameters.
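As one possible way to approach the last step, scikit-learn's `GridSearchCV` can run the cross-validated parameter search automatically. The sketch below is an illustration under stated assumptions, not the lab's reference solution: it uses a small synthetic 3-class dataset from `make_classification` as a stand-in for the HOG features, and the grid values are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier

# Assumption: synthetic stand-in for dataset.train['hog'] / dataset.train['labels'].
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)

# Illustrative grid: regularization strength and size of the single hidden layer.
param_grid = {
    "alpha": [0.001, 0.1],
    "hidden_layer_sizes": [(20,), (50,)],
}

search = GridSearchCV(
    MLPClassifier(max_iter=300, random_state=0),
    param_grid,
    cv=StratifiedKFold(n_splits=3),
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

With the real data, `X` and `y` would be replaced by `dataset.train['hog']` and `dataset.train['labels']`; the search takes noticeably longer on the full training set.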

In [5]:
from sklearn.neural_network import MLPClassifier

# One hidden layer of 100 units (the scikit-learn default), stated explicitly.
clf = MLPClassifier(hidden_layer_sizes=(100,))

In [8]:
clf.fit(dataset.train['hog'], dataset.train['labels'])



MLPClassifier()

In [9]:
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

In [10]:
pred = clf.predict(dataset.train['hog'])
score = accuracy_score(dataset.train['labels'], pred)
print("Descriptive score", score)
cm = confusion_matrix(dataset.train['labels'], pred)
print(cm)

Descriptive score 0.8485333333333334
[[4220  540  240]
 [ 475 3970  555]
 [ 106  356 4538]]


In [11]:
pred = clf.predict(dataset.test['hog'])
score = accuracy_score(dataset.test['labels'], pred)
print("Predictive score", score)
cm = confusion_matrix(dataset.test['labels'], pred)
print(cm)

Predictive score 0.7946666666666666
[[815 138  47]
 [121 721 158]
 [ 30 122 848]]


In [12]:
# We implement the cross-validation ourselves
from sklearn.model_selection import StratifiedKFold

def cross_validation(alpha=1.0, max_iter=1000):
    kf = StratifiedKFold(5)

    scores = []

    for train,test in kf.split(dataset.train['hog'], dataset.train['labels']):
        train_x = dataset.train['hog'][train]
        train_y = dataset.train['labels'][train]

        test_x = dataset.train['hog'][test]
        test_y = dataset.train['labels'][test]
    
        # Single hidden layer: the default hidden_layer_sizes=(100,) is used.
        clf = MLPClassifier(alpha=alpha, max_iter=max_iter)
        
        clf.fit(train_x, train_y)
        
        pred = clf.predict(test_x)
        score = accuracy_score(test_y, pred)
        
        scores.append(score)
    return scores

In [13]:
alpha = [0.001, 0.01, 0.1, 1.0]
max_iter = [200, 400, 800, 1000]

In [15]:
import numpy as np

In [None]:
for a in alpha:
    print("Alpha ", a)
    for i in max_iter:
        print("Max iter ", i)
        scores = cross_validation(alpha=a, max_iter=i)
        mean = np.mean(scores)
        print("Mean accuracy ", mean)

Alpha  0.001
Max iter  200
Mean accuracy  0.7861333333333334
Max iter  400
Mean accuracy  0.7945333333333334
Max iter  800
Mean accuracy  0.7888666666666667
Max iter  1000
Mean accuracy  0.7906
Alpha  0.01
Max iter  200
Mean accuracy  0.7802
Max iter  400
Mean accuracy  0.7955333333333334
Max iter  800

## Add hidden layers to the network

Try to change the structure of the network by adding hidden layers. Using cross-validation, try to find the best architecture for your network.
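As a starting point, the architecture is controlled by the `hidden_layer_sizes` parameter of `MLPClassifier`: a tuple with one entry per hidden layer. The sketch below compares a few candidate architectures with `cross_val_score`. It is only an illustration under assumptions: the data is a synthetic 3-class stand-in generated with `make_classification`, and the layer sizes are arbitrary examples rather than recommended values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Assumption: synthetic stand-in for dataset.train['hog'] / dataset.train['labels'].
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)

# Candidate architectures: one, two and three hidden layers (illustrative sizes).
architectures = [(50,), (50, 25), (50, 25, 10)]

results = {}
for layers in architectures:
    clf = MLPClassifier(hidden_layer_sizes=layers, max_iter=300, random_state=0)
    # For classifiers, cv=3 uses stratified 3-fold cross-validation.
    scores = cross_val_score(clf, X, y, cv=3)
    results[layers] = np.mean(scores)
    print(layers, "mean accuracy:", results[layers])

# Refit the best-scoring architecture and inspect its layer count.
best = max(results, key=results.get)
fitted = MLPClassifier(hidden_layer_sizes=best, max_iter=300, random_state=0).fit(X, y)
# n_layers_ counts the input and output layers as well as the hidden ones.
print("Best architecture:", best, "n_layers_:", fitted.n_layers_)
```

Note that `n_layers_` is a read-only attribute set during `fit`; the number of hidden layers can only be changed through `hidden_layer_sizes`.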

In [None]:

## -- Your code here -- ##
