# Hyperparameter tuning

This notebook tunes the hyperparameters of the initial networks to find a better model on the training set.
The starting points are the two networks that performed best on the training set: one trained on the extended dataset and one trained on the PCA dataset.

## Random search
To reduce computation, a random search is preferred to an exhaustive grid search.
For each network, 100 candidate models are sampled and each is evaluated with 5-fold cross-validation.
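
To give a sense of the saving, the sketch below counts the combinations an exhaustive grid over this notebook's search space would contain and draws 100 of them at random. It relies only on the standard library rather than on the project's `NeuralNetwork.optimize_model`, so it illustrates the idea and is not necessarily how that method samples internally.

```python
import itertools
import random

# Search space mirroring the values used for the extended network in this notebook.
k = 2
neurons = [(180, 80 + i*k, 46 + j*k, 10) for i in (-2, -1, 0, 1, 2) for j in (-2, -1, 0, 1, 2)]
space = {
    "neurons": neurons,
    "learning_rate": [0.001, 0.01, 0.1, 0.5],
    "momentum": [0.0, 0.01, 0.1, 1],
    "epochs": [60, 80, 100],
    "batch_size": [32, 64],
}

# Every combination an exhaustive grid search would have to evaluate.
grid = [dict(zip(space, values)) for values in itertools.product(*space.values())]

# A random search evaluates only a fixed-size sample of that grid.
rng = random.Random(1)
sampled = rng.sample(grid, 100)

print("grid size:", len(grid))
print("sampled:", len(sampled))
print("fraction explored: %.1f%%" % (100 * len(sampled) / len(grid)))
```

The grid holds 25 x 4 x 4 x 3 x 2 = 2400 combinations, so with 5-fold cross-validation an exhaustive search would need 12000 fits instead of the 500 required by the random search.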

In [5]:
import sys
sys.path.append("..")
from src.model import NeuralNetwork
import numpy as np
np.random.seed(1)

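def print_results(results):
    """Print the best score and the three best candidates of a hyperparameter search."""
    print("Best: %f using %s\n" % (results.best_score_, results.best_params_))
    means = results.cv_results_['mean_test_score']
    stds = results.cv_results_['std_test_score']
    params = results.cv_results_['params']
    # Rank candidates by mean cross-validation score, highest first, and show the top 3.
    ranked = sorted(zip(means, stds, params), key=lambda x: x[0], reverse=True)
    for mean, stdev, param in ranked[:3]:
        print("%f (%f) with: %r" % (mean, stdev, param))
    print("...")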

In [2]:
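k = 2  # step between candidate layer sizes
# 25 architectures: the two middle layer sizes vary by up to +/- 2*k around (80, 46)
neurons = [(180, 80 + i*k, 46 + j*k, 10) for i in (-2,-1,0,1,2) for j in (-2,-1,0,1,2)]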
learning_rate = [0.001, 0.01, 0.1, 0.5]
momentum = [0.0, 0.01, 0.1, 1]
epochs = [60, 80, 100]
batch_size = [32, 64]

param_dist = dict(neurons=neurons, 
                  learning_rate=learning_rate,
                  momentum=momentum, 
                  epochs=epochs, 
                  batch_size=batch_size)

extended_results = NeuralNetwork.optimize_model(method="random", 
                                                param_grid=param_dist,
                                                dataset_path="../data/processed/extended/train_extended.csv", 
                                                iterations=100)

Fitting 5 folds for each of 100 candidates, totalling 500 fits


In [3]:
print_results(extended_results)

Best: 0.649682 using {'neurons': (180, 76, 50, 10), 'momentum': 0.01, 'learning_rate': 0.01, 'epochs': 80, 'batch_size': 64}

0.649682 (0.043150) with: {'neurons': (180, 76, 50, 10), 'momentum': 0.01, 'learning_rate': 0.01, 'epochs': 80, 'batch_size': 64}
0.649463 (0.039227) with: {'neurons': (180, 80, 46, 10), 'momentum': 0.0, 'learning_rate': 0.1, 'epochs': 100, 'batch_size': 32}
0.647470 (0.034759) with: {'neurons': (180, 84, 44, 10), 'momentum': 0.1, 'learning_rate': 0.1, 'epochs': 80, 'batch_size': 32}
...


In [4]:
# Same search space, with the two middle layer sizes centred on (45, 30) for the PCA network
param_dist["neurons"] = [(102, 45 + i*k, 30 + j*k, 10) for i in (-2,-1,0,1,2) for j in (-2,-1,0,1,2)]

pca_results = NeuralNetwork.optimize_model(method="random", 
                                           param_grid=param_dist,
                                           dataset_path="../data/processed/extended/train_pca.csv", 
                                           iterations=100)

Fitting 5 folds for each of 100 candidates, totalling 500 fits


In [5]:
print_results(pca_results)

Best: 0.627023 using {'neurons': (102, 45, 32, 10), 'momentum': 0.01, 'learning_rate': 0.1, 'epochs': 80, 'batch_size': 32}

0.627023 (0.031555) with: {'neurons': (102, 45, 32, 10), 'momentum': 0.01, 'learning_rate': 0.1, 'epochs': 80, 'batch_size': 32}
0.626788 (0.050642) with: {'neurons': (102, 49, 28, 10), 'momentum': 0.1, 'learning_rate': 0.1, 'epochs': 80, 'batch_size': 32}
0.626355 (0.031895) with: {'neurons': (102, 45, 28, 10), 'momentum': 0.01, 'learning_rate': 0.1, 'epochs': 100, 'batch_size': 32}
...


## Final model
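The network trained on the extended dataset reaches the higher cross-validation score (0.649682 against 0.627023 for the PCA network), so its best parameters define the final model: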

In [8]:


best_param = extended_results.best_params_
best_param

{'neurons': (180, 76, 50, 10),
 'momentum': 0.01,
 'learning_rate': 0.01,
 'epochs': 80,
 'batch_size': 64}

## Test results

The final model is evaluated on the test set to measure its performance.

In [9]:
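from keras.callbacks import EarlyStopping
from sklearn.utils import class_weight
from src.data import Dataset

p = best_param

# Rebuild the network with the best architecture and optimizer settings.
model = NeuralNetwork.create_model(neurons=p["neurons"],
                                   learning_rate=p["learning_rate"], 
                                   momentum=p["momentum"])

# Load the training data without holding out a test split (test_size=0).
d = Dataset(dataset_path="../data/processed/extended/train_extended.csv", 
            test_size=0)
x, y = d.get_splits()

# Stop when the training accuracy has not improved for 3 consecutive epochs.
stopper = EarlyStopping(monitor='accuracy', patience=3, verbose=1)
fit_params = dict(callbacks=[stopper])

# Weight each class inversely to its frequency to compensate for class imbalance.
# Keyword arguments are required by recent scikit-learn releases.
classes = np.unique(y)
class_weights = class_weight.compute_class_weight(class_weight='balanced', classes=classes, y=y)
weights_dict = dict(zip(classes, class_weights))

model.fit(x, y, class_weight=weights_dict, epochs=p["epochs"], batch_size=p["batch_size"], verbose=0, **fit_params)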

Epoch 00055: early stopping


<tensorflow.python.keras.callbacks.History at 0x7f0be824e490>
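
The `'balanced'` option used above weights each class by `n_samples / (n_classes * class_count)`, so rarer classes contribute more to the loss. The following is a minimal sketch of that formula in plain Python. The helper name `balanced_class_weights` and the label list are made up for illustration and are not part of the notebook's data or of scikit-learn.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * class_count)} for the given labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in sorted(counts.items())}

# Hypothetical, deliberately imbalanced labels: six samples of class 0, two of class 1.
y_example = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(y_example)
print(weights)
```

In this toy case the minority class receives a weight of 2.0 and the majority class a weight of about 0.67, so both classes carry the same total weight during training.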

In [10]:
from src.utils import show_accuracy_loss

accuracy, loss = show_accuracy_loss(model, scaling="extended", test_dataset_path="../data/processed/extended")


Accuracy:
	Mean: 0.6708542704582214 
	Standard deviation: 0.04780035914539101

Loss:
	Mean: 1.3936396598815919 
	Standard deviation: 0.3427208511953272
