# Machine Learning 2 - Neural Networks

In this lab, we will use simple neural networks to classify the images from the simplified CIFAR-10 dataset. We will then compare our results with those obtained with decision trees and random forests.

Lab objectives
----
* Classification with neural networks
* Influence of hidden layers and of the selected features on the classifier results

In [1]:
from lab_tools import CIFAR10, evaluate_classifier, get_hog_image
        
dataset = CIFAR10('./CIFAR10/')

Pre-loading training data
Pre-loading test data


We will use the *[Multi-Layer Perceptron](http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier)* implementation from scikit-learn, which has been available since version 0.18. You can check which version of scikit-learn is installed by running the following cell:

In [2]:
import sklearn
print(sklearn.__version__)

1.4.1.post1


If you have version 0.17 or older, please update your scikit-learn installation (for instance, with the command *pip install --upgrade scikit-learn* in the terminal or Anaconda prompt).
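The version requirement can also be checked in code. The following is a minimal sketch, assuming a plain numeric `major.minor` prefix in the version string; `meets_minimum` is a hypothetical helper, not part of scikit-learn or `lab_tools`.

```python
def meets_minimum(version, minimum=(0, 18)):
    """Return True if a 'major.minor[...]' version string is at least `minimum`.

    Assumption: the first two dot-separated fields are plain integers.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum


# The string printed by the cell above, and an older release for contrast
print(meets_minimum("1.4.1.post1"))
print(meets_minimum("0.17.1"))
```

In the notebook itself, the argument would be `sklearn.__version__`.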

## Build a simple neural network

* Using the [MLPClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) from scikit-learn, create a neural network with a single hidden layer.
* Train this network on the CIFAR dataset.
* Using cross-validation, try to find the best possible parameters.

In [4]:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

# Hold out the last 10% of the training set as a validation set
split_val = 0.1
n_val = int(split_val * len(dataset.train["hog"]))
train_X = dataset.train["hog"][:-n_val]
train_Y = dataset.train["labels"][:-n_val]
val_X = dataset.train["hog"][-n_val:]
val_Y = dataset.train["labels"][-n_val:]

# Baseline: default MLPClassifier (a single hidden layer of 100 units)
model = MLPClassifier()
model.fit(train_X, train_Y)
pred_model = model.predict(val_X)
score_model = accuracy_score(val_Y, pred_model)
print(f"Predictive model: {score_model}")

Predictive model: 0.8
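The hold-out split above takes the last 10% of the training arrays without shuffling. If the data happened to be ordered by class (an assumption, not something the lab states), the validation set would not be representative. The sketch below uses synthetic stand-in arrays (`X_demo`, `y_demo` are hypothetical, not the CIFAR data) to contrast a tail split with a shuffled, stratified split from `train_test_split`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the features and labels
# (assumption: 3 classes, 100 samples each, sorted by class)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 8))
y_demo = np.repeat([0, 1, 2], 100)

# Taking the last 10% of a class-sorted array yields a single-class validation set
tail_classes = np.unique(y_demo[-30:])

# A shuffled, stratified split keeps the class proportions in both parts
X_tr, X_val, y_tr, y_val = train_test_split(
    X_demo, y_demo, test_size=0.1, stratify=y_demo, random_state=42
)

print("Classes in tail split:", tail_classes)
print("Validation class counts:", np.bincount(y_val))
```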




In [8]:
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import GridSearchCV

X, y = dataset.train["hog"], dataset.train["labels"]
skf = StratifiedKFold(n_splits=5, random_state=42, shuffle=True)

param_grid = {
    'activation': ["logistic", "tanh", "relu"],
    'solver': ["sgd", "adam"],
    'alpha': [0.001, 0.01, 0.1, 1],
    'learning_rate_init': [0.01, 0.1]
}

model = MLPClassifier()
grid_search = GridSearchCV(model, param_grid, cv=skf, scoring='accuracy', verbose=1)

grid_search.fit(X, y)

print("Best result model: ", grid_search.best_score_)
print("Best parameters model: ", grid_search.best_params_)

Fitting 5 folds for each of 48 candidates, totalling 240 fits




Best result model:  0.8078666666666667
Best parameters model:  {'activation': 'relu', 'alpha': 0.01, 'learning_rate_init': 0.01, 'solver': 'adam'}
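The "48 candidates, totalling 240 fits" line in the log follows directly from the size of the grid: every combination of the listed values is tried once per fold. A small sketch of that arithmetic, using only the standard library and the same grid as above:

```python
from itertools import product

param_grid = {
    "activation": ["logistic", "tanh", "relu"],
    "solver": ["sgd", "adam"],
    "alpha": [0.001, 0.01, 0.1, 1],
    "learning_rate_init": [0.01, 0.1],
}
n_splits = 5  # number of folds in the StratifiedKFold above

# One candidate per element of the Cartesian product of all value lists
candidates = list(product(*param_grid.values()))
n_fits = len(candidates) * n_splits

print(f"{len(candidates)} candidates, {n_fits} fits")
```

Adding one more value to any list multiplies the cost, which is worth keeping in mind before widening the grid.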


In [9]:
from sklearn.decomposition import PCA

pca = PCA(n_components=0.80)  # keep enough components to explain 80% of the variance

X, y = dataset.train["hog"], dataset.train["labels"]
X_train_pca = pca.fit_transform(X)
skf = StratifiedKFold(n_splits=5, random_state=42, shuffle=True)

param_grid = {
    'activation': ["logistic", "tanh", "relu"],
    'solver': ["sgd", "adam"],
    'alpha': [0.001, 0.01, 0.1, 1],
    'learning_rate_init': [0.01, 0.1]
}

model = MLPClassifier()
grid_search = GridSearchCV(model, param_grid, cv=skf, scoring='accuracy', verbose=1)

grid_search.fit(X_train_pca, y)

print("Best result model: ", grid_search.best_score_)
print("Best parameters model: ", grid_search.best_params_)

Fitting 5 folds for each of 48 candidates, totalling 240 fits




Best result model:  0.8074
Best parameters model:  {'activation': 'relu', 'alpha': 0.01, 'learning_rate_init': 0.01, 'solver': 'adam'}
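In the cell above, the PCA is fitted on the whole training set before cross-validation, so each validation fold has already influenced the projection. One possible alternative, sketched here on synthetic data rather than the lab's HOG features, is to wrap both steps in a `Pipeline` so that the PCA is refitted on the training part of every fold. The step names `pca` and `mlp`, the reduced grid, and the demo data are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# Synthetic 3-class data standing in for the HOG features (assumption)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
)

# PCA and classifier are fitted together, fold by fold
pipe = Pipeline([
    ("pca", PCA(n_components=0.80)),
    ("mlp", MLPClassifier(max_iter=300, random_state=0)),
])

# Pipeline parameters are addressed as <step name>__<parameter>
param_grid = {"mlp__alpha": [0.01, 0.1]}
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

search = GridSearchCV(pipe, param_grid, cv=skf, scoring="accuracy")
search.fit(X_demo, y_demo)

print("Best result model: ", search.best_score_)
print("Best parameters model: ", search.best_params_)
```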


In [14]:
# Evaluate the best hyper-parameters found by the grid search on the test set
model = MLPClassifier(activation="relu", solver="adam", alpha=0.01, learning_rate_init=0.01)
model.fit(dataset.train['hog'], dataset.train['labels'])
pred_model = model.predict(dataset.test["hog"])
score_model = accuracy_score(dataset.test["labels"], pred_model)
print(f"Predictive best parameters model nn (raw data): {score_model}")  # accuracy on the test data
cm_model = confusion_matrix(dataset.test["labels"], pred_model)  # rows: true labels, columns: predictions
print(cm_model)

Predictive best parameters model nn (raw data): 0.8183333333333334
[[826 127  47]
 [ 97 759 144]
 [ 29 101 870]]
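The confusion matrix carries more information than the single accuracy figure. As a sketch, the per-class recall and precision can be derived from the printed values with plain Python, using scikit-learn's convention that rows are true labels and columns are predictions; the class indices 0 to 2 are used here because the class names are not shown in the output.

```python
# Confusion matrix printed above (rows: true class, columns: predicted class)
cm = [
    [826, 127, 47],
    [97, 759, 144],
    [29, 101, 870],
]
n_classes = len(cm)

# Accuracy: correctly classified samples (diagonal) over all samples
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n_classes)) / total

# Recall: diagonal over row sum; precision: diagonal over column sum
recall = [cm[i][i] / sum(cm[i]) for i in range(n_classes)]
precision = [
    cm[i][i] / sum(cm[r][i] for r in range(n_classes)) for i in range(n_classes)
]

print(f"accuracy: {accuracy:.4f}")
for i in range(n_classes):
    print(f"class {i}: recall={recall[i]:.3f}, precision={precision[i]:.3f}")
```

The diagonal sums to 2455 out of 3000 samples, which matches the reported accuracy of 0.8183.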


In [None]:
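from sklearn.model_selection import StratifiedKFold

# Manual 5-fold stratified cross-validation over the training set
kf = StratifiedKFold(5)

fold_scores = []
for train, test in kf.split(dataset.train['hog'], dataset.train['labels']):
    train_x = dataset.train['hog'][train]
    train_y = dataset.train['labels'][train]

    test_x = dataset.train['hog'][test]
    test_y = dataset.train['labels'][test]

    # Fit a fresh classifier on this fold and score it on the held-out part
    clf = MLPClassifier()
    clf.fit(train_x, train_y)
    fold_scores.append(accuracy_score(test_y, clf.predict(test_x)))

print(f"Mean cross-validation accuracy: {sum(fold_scores) / len(fold_scores)}")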

## Add hidden layers to the network

Try to change the structure of the network by adding hidden layers. Using cross-validation, try to find the best architecture for your network.

In [13]:
# Search over architectures with two to five hidden layers
X, y = dataset.train["hog"], dataset.train["labels"]
skf = StratifiedKFold(n_splits=5, random_state=42, shuffle=True)

param_grid = {
    'hidden_layer_sizes': [
        (50, 50),
        (100, 50),
        (200, 100),
        (100, 50, 25),
        (200, 100, 50),
        (50, 100, 50),
        (300, 200, 100),
        (128, 64, 32),
        (50, 50, 50, 50),
        (30, 30, 30, 30, 30),
    ]
}

model = MLPClassifier(activation="relu", solver="adam", alpha=0.01, learning_rate_init=0.01)
grid_search = GridSearchCV(model, param_grid, cv=skf, scoring='accuracy', verbose=1)

grid_search.fit(X, y)

print("Best result model: ", grid_search.best_score_)
print("Best parameters model: ", grid_search.best_params_)

Fitting 5 folds for each of 10 candidates, totalling 50 fits
Best result model:  0.7883333333333333
Best parameters model:  {'hidden_layer_sizes': (200, 100)}
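When comparing architectures, it can help to know how many trainable parameters each one has. The sketch below counts weights and biases for a fully connected network. `count_parameters` is a hypothetical helper, and `N_FEATURES = 256` is a placeholder because the length of the HOG feature vector is not shown in this lab. The three output units correspond to the three classes visible in the confusion matrix.

```python
def count_parameters(n_features, hidden_layer_sizes, n_outputs):
    """Count weights and biases of a fully connected network."""
    sizes = [n_features, *hidden_layer_sizes, n_outputs]
    # Each layer has fan_in * fan_out weights plus fan_out biases
    return sum(
        fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes, sizes[1:])
    )


N_FEATURES = 256  # hypothetical: replace with dataset.train["hog"].shape[1]
N_CLASSES = 3

architectures = [
    (50, 50), (100, 50), (200, 100), (100, 50, 25), (200, 100, 50),
    (50, 100, 50), (300, 200, 100), (128, 64, 32), (50, 50, 50, 50),
    (30, 30, 30, 30, 30),
]

# List the candidate architectures from smallest to largest
for arch in sorted(
    architectures, key=lambda a: count_parameters(N_FEATURES, a, N_CLASSES)
):
    print(arch, count_parameters(N_FEATURES, arch, N_CLASSES))
```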
