# Hyper-parameter tuning via Grid Search Cross Validation

This notebook shows how to fine-tune the hyper-parameters of an augmented simulator. The `HyperParameterTuner` class, provided in the `lips.augmented_simulators` module, is designed for this purpose. It takes an augmented simulator object and provides a function called `tune`. This function takes a dataset and, for each hyper-parameter, a list of candidate values; together these lists form the search space of a grid search algorithm. The class can be used with any augmented simulator to fine-tune its set of parameters.

## Load the dataset for this experiment using the first benchmark

In [2]:
import os
from lips.neurips_benchmark import NeuripsBenchmark1
path_benchmark = os.path.join("reference_data")
neurips_benchmark1 = NeuripsBenchmark1(path_benchmark=path_benchmark,
                                       load_data_set=True)

In [3]:
from lips.augmented_simulators import FullyConnectedAS
# The lines below might be familiar to TensorFlow users. They tell TensorFlow not to reserve all
# the GPU video RAM for the model.
import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU') 
for el in physical_devices:
    tf.config.experimental.set_memory_growth(el, True)

## Tuning the parameters 

In [4]:
from lips.augmented_simulators import HyperParameterTuner

An augmented simulator must be instantiated first. Here, we opt for a fully connected network, which comes with all the required functions.

In [5]:
my_simulator = FullyConnectedAS(name="test_FullyConnectedAS")

The `HyperParameterTuner` class is instantiated by passing the fully connected model as a parameter.

In [6]:
tuner = HyperParameterTuner(my_simulator)

The main step is to run the tuning with the `tune` function of this class. We pass it a dataset and, for each hyper-parameter, a list of candidate values; these lists form the search space. The function performs grid search cross-validation, and the number of folds can be adjusted with the `n_folds` argument of `tune`. <span style="color:red">Attention: this grid search cross-validation takes a long time to run.</span>

In [None]:
grid_results = tuner.tune(neurips_benchmark1.train_dataset,
                          sizes_layer=[(150,150), (200,200,200), (300,300,300,300)], 
                          layer_act=["relu"], 
                          lr=[3e-4, 1e-2], 
                          batch_size=[64,128], 
                          epochs=[10, 50, 100], 
                          loss=["mse", "mae"], 
                          n_folds=5, 
                          verbose=1, 
                          n_jobs=1
                          )
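To get a feel for why this call is slow, the size of the search space can be counted before launching it. The sketch below is not part of `HyperParameterTuner`; it only enumerates the same value lists with `itertools.product`, assuming the grid search is exhaustive and trains one model per candidate and per fold (any final refit on the full dataset is not counted). The names `search_space`, `grid`, `n_candidates` and `n_fits` are introduced here for illustration.

```python
from itertools import product

# Search space copied from the `tune` call above.
search_space = {
    "sizes_layer": [(150, 150), (200, 200, 200), (300, 300, 300, 300)],
    "layer_act": ["relu"],
    "lr": [3e-4, 1e-2],
    "batch_size": [64, 128],
    "epochs": [10, 50, 100],
    "loss": ["mse", "mae"],
}
n_folds = 5

# Each grid point picks exactly one value per hyper-parameter.
names = list(search_space)
grid = [dict(zip(names, values)) for values in product(*search_space.values())]

n_candidates = len(grid)
# Assumption: one training run per candidate and per fold.
n_fits = n_candidates * n_folds

print(f"{n_candidates} candidates x {n_folds} folds = {n_fits} training runs")
```

Under these assumptions the grid holds 72 candidates, so 360 training runs, many of them lasting 50 or 100 epochs. Shortening any of the lists reduces the cost multiplicatively.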

Once the grid search has finished, the results can be accessed. Below, we print the best result followed by the score of every candidate.

In [14]:
print("Best: %f using %s" % (grid_results.best_score_, grid_results.best_params_))
means = grid_results.cv_results_['mean_test_score']
stds = grid_results.cv_results_['std_test_score']
params = grid_results.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: -0.000333 using {'batch_size': 128, 'epochs': 100, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.0003, 'sizes_layer': (300, 300, 300, 300)}
-0.003003 (0.000255) with: {'batch_size': 64, 'epochs': 10, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.0003, 'sizes_layer': (150, 150)}
-0.001696 (0.000134) with: {'batch_size': 64, 'epochs': 10, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.0003, 'sizes_layer': (200, 200, 200)}
-0.001294 (0.000151) with: {'batch_size': 64, 'epochs': 10, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.0003, 'sizes_layer': (300, 300, 300, 300)}
-0.007245 (0.002094) with: {'batch_size': 64, 'epochs': 10, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.01, 'sizes_layer': (150, 150)}
-0.059667 (0.024964) with: {'batch_size': 64, 'epochs': 10, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.01, 'sizes_layer': (200, 200, 200)}
-0.218516 (0.045694) with: {'batch_size': 64, 'epochs': 10, 'layer_act': 'relu', 'loss': 'mse', 'lr': 0.01, 'sizes_layer': (300, 300, 300, 300)}
-0.030331 

For the fully connected augmented simulator, the best model has the following set of parameters:
- batch_size = 128
- epochs = 100
- loss criterion = "mse"
- lr = 3e-4
- layer sizes = (300, 300, 300, 300), i.e. four layers with 300 neurons each
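With many candidates, the printed list is easier to read once it is sorted by score. The following is a minimal sketch that ranks candidates using only the standard library. It assumes that a higher (less negative) mean test score is better, which is consistent with the best score reported above. The `cv_results` dictionary is a hand-copied stand-in holding the first six rows of the output, with `params` trimmed to the two keys that vary among them; `rank_candidates` is a hypothetical helper, not a function of the library.

```python
# First six rows of the output above (all with batch_size=64, epochs=10,
# layer_act="relu", loss="mse"); params are trimmed to the keys that vary.
cv_results = {
    "mean_test_score": [-0.003003, -0.001696, -0.001294,
                        -0.007245, -0.059667, -0.218516],
    "std_test_score": [0.000255, 0.000134, 0.000151,
                       0.002094, 0.024964, 0.045694],
    "params": [
        {"lr": 0.0003, "sizes_layer": (150, 150)},
        {"lr": 0.0003, "sizes_layer": (200, 200, 200)},
        {"lr": 0.0003, "sizes_layer": (300, 300, 300, 300)},
        {"lr": 0.01, "sizes_layer": (150, 150)},
        {"lr": 0.01, "sizes_layer": (200, 200, 200)},
        {"lr": 0.01, "sizes_layer": (300, 300, 300, 300)},
    ],
}


def rank_candidates(results):
    """Return (mean, std, params) tuples, best candidate first.

    Assumption: a higher mean test score is better.
    """
    rows = zip(results["mean_test_score"],
               results["std_test_score"],
               results["params"])
    return sorted(rows, key=lambda row: row[0], reverse=True)


ranked = rank_candidates(cv_results)
for mean, std, params in ranked:
    print(f"{mean:.6f} ({std:.6f}) with: {params}")
```

On these six rows, the deepest network ranks first with `lr=0.0003` and last with `lr=0.01`, which suggests that the larger learning rate is unstable for the larger models after 10 epochs. The same helper could be applied to `grid_results.cv_results_`, provided it exposes the three keys used here.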