**Applied Machine Learning - Homework 5 - Task1**


Amaury Sudrie (UNI: AS5961)
Maxime Tchibozo (UNI: MT3390)

Foreword: Some of the methods used in this notebook are computationally expensive and memory intensive. We ran this code in Google Colab notebooks, and we encourage you to do the same.

In [0]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, GridSearchCV

import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2 as L2_reg
from keras.wrappers.scikit_learn import KerasClassifier

Using TensorFlow backend.


## Task 1

Run a multilayer perceptron (feed forward neural network) with two hidden layers and rectified linear nonlinearities on the digits dataset from sklearn using the keras Sequential interface. Include code for selecting L2 regularization strength and number of hidden units using GridSearchCV and evaluation on an independent test-set.

First, we preprocess the data: we load the digits, split them into training and test sets, and one-hot encode the labels.

In [0]:
X,y = load_digits(n_class=10, return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(X,y)

y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
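
To make the label encoding concrete, here is a minimal NumPy sketch of what one-hot encoding does to integer class labels. The helper name `one_hot` and the three example labels are illustrative assumptions, not part of the notebook; the notebook itself relies on `keras.utils.to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (n_samples, num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Set the column matching each label to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical labels, chosen only for illustration.
example = one_hot([3, 0, 9], num_classes=10)
print(example.shape)           # (3, 10)
print(example.argmax(axis=1))  # [3 0 9]
```

Taking the `argmax` of each row recovers the original integer label, which is why the softmax output layer with 10 units pairs naturally with this encoding.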

Before tuning hyperparameters with GridSearchCV, we measure how long it takes to build, compile, and train a single model. This tells us how many values per parameter the grid search can afford. Building, compiling, and training one model for 20 epochs takes about 2.7 seconds.

In [0]:
import time
t = time.time()

model = Sequential()
model.add(Dense(256, input_shape=(64,), activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=['accuracy'])

model.fit(X_train, y_train, batch_size=32, epochs=20, verbose=0)
print("time to execute:", round(time.time()-t,3))

time to execute: 2.741
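
The timing above can be turned into a rough budget for the grid search. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes every configuration costs about as much as the timed model, and the number of cross-validation folds (`cv_folds`) is an assumption, since the GridSearchCV default depends on the installed scikit-learn version. The grid mirrors the one used later in this notebook.

```python
from itertools import product

seconds_per_fit = 2.7  # measured above for a single 20-epoch fit
cv_folds = 3           # assumption: depends on the scikit-learn version

# Three candidate values for each of the four tuned hyperparameters.
candidate_grid = {
    "hidden_size1": [64, 128, 256],
    "hidden_size2": [64, 128, 256],
    "alpha1": [0.001, 0.01, 0.1],
    "alpha2": [0.001, 0.01, 0.1],
}

# Every combination is fit once per fold, plus one final refit on all training data.
n_combinations = len(list(product(*candidate_grid.values())))
n_fits = n_combinations * cv_folds + 1
estimated_minutes = n_fits * seconds_per_fit / 60

print(n_combinations, n_fits, round(estimated_minutes, 1))
```

Under these assumptions, 81 combinations lead to 244 fits and roughly 11 minutes of training, which is why three values per parameter is a reasonable ceiling here.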


For simplicity, within each hidden layer we use the same alpha for the kernel regularizer and for the bias regularizer (alpha1 for the first hidden layer, alpha2 for the second).

In [0]:
def make_model(hidden_size1=128, hidden_size2=64, alpha1=0.1, alpha2=0.1):

  model = Sequential()
  model.add(Dense(hidden_size1, input_shape=(64,), activation='relu',
                  kernel_regularizer=L2_reg(alpha1),
                  bias_regularizer=L2_reg(alpha1)))
  model.add(Dense(hidden_size2, activation='relu', 
                  kernel_regularizer=L2_reg(alpha2),
                  bias_regularizer=L2_reg(alpha2)))
  model.add(Dense(10, activation='softmax'))

  model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=['accuracy'])
    
  return model
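
As a small illustration of what the regularizers in `make_model` add to the loss, the sketch below computes an L2 penalty in NumPy: alpha times the sum of squared entries, summed over the kernel and the bias of one layer. The helper `l2_penalty` and the randomly generated `kernel` and `bias` arrays are hypothetical stand-ins, not weights from the trained network.

```python
import numpy as np

def l2_penalty(weights, alpha):
    """L2 penalty: alpha times the sum of squared entries."""
    return alpha * np.sum(np.square(weights))

rng = np.random.default_rng(0)
kernel = rng.normal(scale=0.1, size=(64, 128))  # hypothetical first-layer kernel
bias = np.zeros(128)                            # hypothetical first-layer bias

# The penalty scales linearly with alpha, so larger alpha pushes weights toward zero harder.
for alpha in (0.001, 0.01, 0.1):
    total = l2_penalty(kernel, alpha) + l2_penalty(bias, alpha)
    print(alpha, round(float(total), 4))
```

Because the penalty grows with the number of weights, the same alpha weighs more heavily on wider layers, which is one reason to tune alpha jointly with the hidden sizes.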



Here we set up the grid search. We tune both the number of neurons in each hidden layer and the L2 regularization coefficients (alpha) of each hidden layer.

In [0]:
clf = KerasClassifier(make_model)
param_grid = {'epochs': [20],
              'verbose': [0],
              'hidden_size1': [64, 128, 256],
              'hidden_size2': [64, 128, 256],
              'alpha1': [0.001, 0.01, 0.1],
              'alpha2': [0.001, 0.01, 0.1]}

grid = GridSearchCV(clf, param_grid=param_grid)
grid.fit(X_train, y_train)
print("Best mean cross-validation score for MLP: {:.3f}".format(grid.best_score_))
print("Best parameters: {}".format(grid.best_params_))

Best mean cross-validation score for MLP: 0.979
Best parameters: {'alpha1': 0.001, 'alpha2': 0.001, 'epochs': 20, 'hidden_size1': 128, 'hidden_size2': 128, 'verbose': 0}


The best mean cross-validation accuracy on the training set is 97.9%, obtained with 128 neurons in both hidden layers and alpha = 0.001 for both regularizers.

In [0]:
model_final = make_model(hidden_size1=grid.best_params_['hidden_size1'],
                         hidden_size2=grid.best_params_['hidden_size2'],
                         alpha1=grid.best_params_['alpha1'],
                         alpha2=grid.best_params_['alpha2'])

model_final.fit(X_train, y_train, batch_size=32, epochs=grid.best_params_['epochs'], verbose=0)
score = model_final.evaluate(X_test, y_test, verbose=0)
print("Test Accuracy: {:.3f}".format(score[1]))

Test Accuracy: 0.984


With these parameters, the final model reaches 98.4% accuracy on the test set, slightly higher than the 97.9% mean cross-validation accuracy obtained on the training set.
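
To put the gap between the two scores in perspective, the sketch below estimates the sampling uncertainty of the test accuracy. It assumes the default 25% test fraction of `train_test_split` and treats each test prediction as an independent Bernoulli trial, which is a simplifying assumption rather than a rigorous analysis.

```python
import math

n_samples = 1797                      # size of sklearn's digits dataset
n_test = math.ceil(0.25 * n_samples)  # default test fraction of train_test_split
test_accuracy = 0.984                 # reported above
cv_accuracy = 0.979                   # reported above

# Binomial standard error of an accuracy estimated on n_test samples.
standard_error = math.sqrt(test_accuracy * (1 - test_accuracy) / n_test)
gap = test_accuracy - cv_accuracy

print("test samples:", n_test)
print("standard error: {:.4f}".format(standard_error))
print("gap: {:.4f}".format(gap))
```

Under these assumptions the 0.5-point gap is smaller than one standard error of the test accuracy, so the two scores are best read as consistent with each other rather than as evidence that the model performs better on unseen data.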