# Hyperparameter tuning

This notebook tunes the hyperparameters of an initial network to find a model that performs better on the training set.
The starting point is a four-layer feed-forward neural network with 120, 60, 25 and 10 neurons per layer.

## Random vs grid search
To reduce computation, a random search is preferred to an exhaustive grid search.
The best model found by the random search is later fine-tuned with a grid search to optimize it locally.
Each candidate model is evaluated with 5-fold cross-validation.

The search space contains 400 possible models (25 layer configurations × 4 learning rates × 4 momentum values); the random search tries only 100 of them.
The fit also uses an early stopper.
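As an illustration of what an early stopper does, here is a minimal pure-Python sketch of patience-based stopping on a metric that should increase, such as accuracy. The function name `early_stopping_epoch` and the accuracy values are made up for this example; the patience of 3 matches the value used when the final model is fit later in this notebook. It is a simplified picture of the idea, not the code of the callback used by the project.

```python
def early_stopping_epoch(history, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    `history` holds one monitored value per epoch (higher is better).
    Training stops once the value has failed to improve on the best
    value seen so far for `patience` consecutive epochs.
    """
    best = float("-inf")
    wait = 0  # epochs since the last improvement
    for epoch, value in enumerate(history, start=1):
        if value > best + min_delta:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical per-epoch accuracies, for illustration only.
accuracies = [0.50, 0.55, 0.60, 0.59, 0.60, 0.58]
stop_epoch = early_stopping_epoch(accuracies, patience=3)
print("stopped at epoch:", stop_epoch)
```

The best value is reached at epoch 3; epochs 4, 5 and 6 do not beat it, so the patience of 3 runs out at epoch 6.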

In [1]:
import sys
sys.path.append("..")
from src.model import NeuralNetwork

In [2]:
k = 2
neurons = [(120, 60 + i*k, 25 + j*k, 10) for i in (-2,-1,0,1,2) for j in (-2,-1,0,1,2)]

learning_rate = [0.001, 0.01, 0.1, 0.5]
momentum = [0.0, 0.01, 0.1, 1]
epochs = [100]
batch_size = [32]

param_dist = dict(neurons=neurons, 
                  learning_rate=learning_rate,
                  momentum=momentum, 
                  epochs=epochs, 
                  batch_size=batch_size)

results = NeuralNetwork.optimize_model(method="random", 
                                       param_grid=param_dist,
                                       dataset_path="../data/processed/extended/train_pca.csv", 
                                       iterations=100)

Fitting 5 folds for each of 100 candidates, totalling 500 fits
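
The counts quoted above (400 possible models, 100 sampled, 500 fits) can be checked with a short standalone sketch. It rebuilds the same value lists as the cell above and draws 100 distinct candidates with the standard library. Sampling without replacement and the fixed seed are assumptions of this example; how `optimize_model` samples internally is not shown in this notebook.

```python
import random
from itertools import product

k = 2
neurons = [(120, 60 + i * k, 25 + j * k, 10)
           for i in (-2, -1, 0, 1, 2) for j in (-2, -1, 0, 1, 2)]
learning_rate = [0.001, 0.01, 0.1, 0.5]
momentum = [0.0, 0.01, 0.1, 1]

# epochs and batch_size each have a single value, so they do not
# multiply the number of combinations.
all_candidates = list(product(neurons, learning_rate, momentum))
total = len(all_candidates)

# Draw 100 distinct candidates (assumed: sampling without replacement).
rng = random.Random(0)
sampled = rng.sample(all_candidates, 100)

n_folds = 5
n_fits = len(sampled) * n_folds

print("possible models:", total)
print("sampled models: ", len(sampled))
print("total fits:     ", n_fits)
```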


In [3]:
print("Best: %f using %s\n" % (results.best_score_, results.best_params_))
means = results.cv_results_['mean_test_score']
stds = results.cv_results_['std_test_score']
params = results.cv_results_['params']
for mean, stdev, param in sorted(zip(means, stds, params), key=lambda x : -x[0])[:5]:
    print("%f (%f) with: %r" % (mean, stdev, param))
print("...")

Best: 0.585677 using {'neurons': (120, 62, 27, 10), 'momentum': 0.1, 'learning_rate': 0.1, 'epochs': 100, 'batch_size': 32}

0.585677 (0.071518) with: {'neurons': (120, 62, 27, 10), 'momentum': 0.1, 'learning_rate': 0.1, 'epochs': 100, 'batch_size': 32}
0.583235 (0.072438) with: {'neurons': (120, 56, 23, 10), 'momentum': 0.0, 'learning_rate': 0.1, 'epochs': 100, 'batch_size': 32}
0.580792 (0.084351) with: {'neurons': (120, 64, 23, 10), 'momentum': 0.1, 'learning_rate': 0.01, 'epochs': 100, 'batch_size': 32}
0.579233 (0.085570) with: {'neurons': (120, 64, 25, 10), 'momentum': 0.1, 'learning_rate': 0.01, 'epochs': 100, 'batch_size': 32}
0.576124 (0.075047) with: {'neurons': (120, 58, 25, 10), 'momentum': 0.0, 'learning_rate': 0.1, 'epochs': 100, 'batch_size': 32}
...
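
Since the same ranking is printed again after the grid search, the logic can be factored into a small helper. The sketch below is one way to do it; `top_candidates` is a hypothetical name, and the `demo` dictionary contains made-up numbers that only mimic the three `cv_results_` keys used above.

```python
def top_candidates(cv_results, n=5):
    """Return the n best (mean, std, params) rows, highest mean first."""
    rows = zip(cv_results["mean_test_score"],
               cv_results["std_test_score"],
               cv_results["params"])
    return sorted(rows, key=lambda row: row[0], reverse=True)[:n]


# Made-up scores, for illustration only.
demo = {
    "mean_test_score": [0.55, 0.58, 0.57],
    "std_test_score": [0.07, 0.08, 0.06],
    "params": [{"momentum": 0.0}, {"momentum": 0.1}, {"momentum": 0.01}],
}

for mean, stdev, param in top_candidates(demo, n=2):
    print("%f (%f) with: %r" % (mean, stdev, param))
```

With a real search result, the call would be `top_candidates(results.cv_results_)`.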


## Fine tune random result with grid search
To optimize further, the best random-search result is fine-tuned with a grid search over its immediate neighbourhood.

In [4]:
results.best_params_

{'neurons': (120, 62, 27, 10),
 'momentum': 0.1,
 'learning_rate': 0.1,
 'epochs': 100,
 'batch_size': 32}

In [2]:
param_grid = dict(neurons=[(120, 62 + i, 27 + j, 10) for i in [-1,0,1] for j in [-1,0,1]], 
                  learning_rate=[0.08, 0.1, 0.12],
                  momentum=[0.08, 0.1, 0.12], 
                  epochs=[100], 
                  batch_size=[32])

grid_results = NeuralNetwork.optimize_model(method="grid", 
                                            param_grid=param_grid,
                                            dataset_path="../data/processed/extended/train_pca.csv")

print("Best: %f using %s\n" % (grid_results.best_score_, grid_results.best_params_))
means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']
for mean, stdev, param in sorted(zip(means, stds, params), key=lambda x : -x[0])[:5]:
    print("%f (%f) with: %r" % (mean, stdev, param))
print("...")

Fitting 5 folds for each of 81 candidates, totalling 405 fits
Best: 0.587674 using {'batch_size': 32, 'epochs': 100, 'learning_rate': 0.08, 'momentum': 0.12, 'neurons': (120, 62, 28, 10)}

0.587674 (0.076336) with: {'batch_size': 32, 'epochs': 100, 'learning_rate': 0.08, 'momentum': 0.12, 'neurons': (120, 62, 28, 10)}
0.586790 (0.084240) with: {'batch_size': 32, 'epochs': 100, 'learning_rate': 0.1, 'momentum': 0.1, 'neurons': (120, 62, 26, 10)}
0.583673 (0.086433) with: {'batch_size': 32, 'epochs': 100, 'learning_rate': 0.08, 'momentum': 0.08, 'neurons': (120, 61, 28, 10)}
0.583243 (0.084687) with: {'batch_size': 32, 'epochs': 100, 'learning_rate': 0.12, 'momentum': 0.08, 'neurons': (120, 63, 27, 10)}
0.582332 (0.074344) with: {'batch_size': 32, 'epochs': 100, 'learning_rate': 0.08, 'momentum': 0.12, 'neurons': (120, 62, 26, 10)}
...
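
The grid above hard-codes the neighbourhood of the best random-search result. As a sketch of how such a grid could be derived from `best_params_` instead, the hypothetical helper `local_grid` below varies the two hidden layers by ±1 neuron and scales the learning rate and momentum by ±20%, which reproduces the values used in this notebook. The helper name, its parameters and the rounding are assumptions of this example.

```python
from itertools import product


def local_grid(best, step=1, rel=0.2):
    """Build a small grid centred on a dict shaped like best_params_."""
    n_in, h1, h2, n_out = best["neurons"]
    neurons = [(n_in, h1 + i, h2 + j, n_out)
               for i, j in product((-step, 0, step), repeat=2)]

    def around(x):
        # Round to hide floating-point noise such as 0.08000000000000002.
        return [round(x * f, 10) for f in (1 - rel, 1.0, 1 + rel)]

    return dict(neurons=neurons,
                learning_rate=around(best["learning_rate"]),
                momentum=around(best["momentum"]),
                epochs=[best["epochs"]],
                batch_size=[best["batch_size"]])


best = {"neurons": (120, 62, 27, 10), "momentum": 0.1,
        "learning_rate": 0.1, "epochs": 100, "batch_size": 32}
grid = local_grid(best)

n_candidates = 1
for values in grid.values():
    n_candidates *= len(values)
print("candidates:", n_candidates, "fits with 5 folds:", n_candidates * 5)
```

The result is 9 layer configurations × 3 learning rates × 3 momentum values, matching the 81 candidates and 405 fits reported above.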


## Results

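Running a grid search around the random-search result improved the best mean cross-validation score only slightly, from 0.585677 to 0.587674. The gain is far smaller than the spread across folds (a standard deviation of about 0.076), so it is a positive result but not a conclusive one.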
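
To put the size of the improvement in context, the two best mean scores reported above can be compared against the fold-to-fold standard deviation of the best grid-search candidate. The numbers are copied from the outputs in this notebook; the variable names are only for this sketch.

```python
random_best = 0.585677    # best mean CV score from the random search
grid_best = 0.587674      # best mean CV score from the grid search
grid_best_std = 0.076336  # std across folds for the best grid candidate

gain = grid_best - random_best
gain_in_stds = gain / grid_best_std

print(f"absolute gain:   {gain:.6f}")
print(f"gain / fold std: {gain_in_stds:.3f}")
```

The gain is only a few hundredths of one standard deviation, so the ranking of the two models could easily change with a different fold split.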

Let's now evaluate this model on the test sets to see how it performs.

In [3]:
from keras.callbacks import EarlyStopping
from src.data import Dataset

p = grid_results.best_params_

# Rebuild the network with the best hyperparameters found by the grid search.
model = NeuralNetwork.create_model(neurons=p["neurons"],
                                   learning_rate=p["learning_rate"],
                                   momentum=p["momentum"])

# test_size=0 so that no rows are held out from the training file.
d = Dataset(dataset_path="../data/processed/extended/train_pca.csv",
            test_size=0)

# Stop once training accuracy has not improved for 3 consecutive epochs.
stopper = EarlyStopping(monitor='accuracy', patience=3, verbose=1)
model.fit(*d.get_splits(), epochs=p["epochs"], batch_size=p["batch_size"],
          verbose=0, callbacks=[stopper])

Epoch 00019: early stopping


<tensorflow.python.keras.callbacks.History at 0x7f7a8439c0a0>

In [4]:
from src.utils import show_accuracy_loss

accuracy, loss = show_accuracy_loss(model, scaling="pca", test_dataset_path="../data/processed/extended")


Accuracy:
	Mean: 0.6750125050544739 
	Standard deviation: 0.029288029064255778

Loss:
	Mean: 2.440521240234375 
	Standard deviation: 0.3727582139497537
