# Hyperparameter Optimization in PyTorch Lightning with Optuna

In notebook 4 of this exercise, you've learned how to develop and train models with PyTorch Lightning.

In this optional notebook, we'll show you how to perform hyperparameter optimization in PyTorch Lightning using the *Optuna* framework.



In [None]:
import shutil
#!pip install pytorch-lightning==0.7.6 > /dev/null
import pytorch_lightning as pl
import torch
from exercise_code.MyPytorchModel import MyPytorchModel
from exercise_code.Util import save_model, load_model, test_and_save

%load_ext autoreload
%autoreload 2

<img src=https://raw.githubusercontent.com/optuna/optuna/master/docs/image/optuna-logo.png></img>

Optuna is an automatic hyperparameter optimization framework, which works with several deep learning libraries, including PyTorch Lightning. Have a look at https://github.com/optuna/optuna!

Two important concepts of Optuna are the terms **Study** and **Trial**:
* **Study**: optimization based on an objective function (e.g.: maximize validation accuracy)
* **Trial**: a single execution of the objective function (i.e., train a model for a specific hyperparameter configuration)

The goal of a study is to find the optimal set of hyperparameter values through multiple trials (e.g., `n_trials=100`). Optuna is designed to automate and accelerate such optimization studies.
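
To make the two terms concrete, here is a minimal, library-free sketch. It does not use Optuna's API: the toy `objective`, the dictionary layout of a trial, and the helper `run_study` are assumptions made purely for illustration, and the sampling is plain random search.

```python
import math
import random


def objective(params):
    # Toy stand-in for "train a model and return its validation accuracy".
    # The score is highest at lr = 1e-3 and hidden_size = 200.
    lr_penalty = abs(math.log10(params["lr"]) + 3) / 4
    size_penalty = abs(params["hidden_size"] - 200) / 200
    return 1.0 - 0.5 * lr_penalty - 0.5 * size_penalty


def run_study(n_trials, seed=0):
    """A 'study': run several trials and keep track of the best one."""
    rng = random.Random(seed)
    trials = []
    for number in range(n_trials):
        # One 'trial': pick a configuration and evaluate the objective once.
        params = {
            "lr": 10 ** rng.uniform(-5, -1),
            "hidden_size": rng.randrange(100, 251, 10),
        }
        trials.append({"number": number, "params": params, "value": objective(params)})
    best = max(trials, key=lambda t: t["value"])  # direction: maximize
    return trials, best


trials, best = run_study(n_trials=20)
print("best trial:", best["number"], best["params"], round(best["value"], 4))
```

Optuna keeps the same structure but replaces the random sampling with smarter samplers and adds bookkeeping, pruning and storage.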

At a high level, hyperparameter tuning with Optuna works very similarly to the search algorithms we implemented in exercise 6. However, Optuna has a number of advantages:

* **Parallelization** of hyperparameter searches over multiple threads or processes without modifying code
* more **efficient search algorithms** for large search spaces and **pruning of unpromising trials** for faster results
* many additional features for automated search of optimal hyperparameters

### Install & Import Optuna

If you haven't done so yet, install Optuna via pip by uncommenting the line in the cell below: `!pip install optuna`.

If you want to install Optuna with conda or you have problems with the installation via pip, you can also use `$ conda install -c conda-forge optuna`.

In [2]:
#!pip install optuna
import optuna
from optuna.integration import PyTorchLightningPruningCallback

### Logger

The default logger in PyTorch Lightning automatically writes to event files to be consumed by TensorBoard. 

Here we just set up a simple callback that keeps the metrics from each validation run in memory, which we can then use to find the best model.

In [3]:
from pytorch_lightning import Callback

class MetricsCallback(Callback):
    """PyTorch Lightning metric callback."""
    def __init__(self):
        super().__init__()
        self.metrics = []

    def on_validation_end(self, trainer, pl_module):
        self.metrics.append(trainer.callback_metrics)

### Objective

As mentioned above, the goal of our hyperparameter tuning is to optimize an objective function by finding the best set of hyperparameters.

Optuna is a black-box optimizer, which means we need to provide this objective function (`objective(trial)`). It gets passed a `trial` object and returns a numerical value, which Optuna uses to evaluate the current hyperparameters and to decide where to sample in upcoming trials.

In our case, the metric we want to optimize is the validation accuracy of our models. To get the validation accuracy for a given hyperparameter configuration, we do the same thing as in the previous notebook:

* define a PL-trainer
* define hyperparameters by sampling them from the provided ranges, similar to our old random search implementation
* initialize a model with these hyperparameters
* train that model, and report its validation accuracy as our objective function value

As you can see in the code cell below, the implementation is straightforward:

PS: notice the different sampling modes, e.g. `trial.suggest_int`, `trial.suggest_loguniform` which are also similar to our previous implementation! Check out the documentation at https://optuna.readthedocs.io/en/latest/reference/trial.html to find out about further sampling modes, and **feel free to add additional hyperparameters!**
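
As a rough intuition for what these sampling modes do, the following library-free sketch mimics them with Python's `random` module. The helper names `sample_int`, `sample_loguniform` and `sample_categorical` are hypothetical and not part of Optuna; Optuna's own samplers are more sophisticated than these uniform draws.

```python
import math
import random


def sample_int(rng, low, high, step=1):
    """Uniform integer from {low, low + step, ...} that does not exceed high."""
    n_choices = (high - low) // step + 1
    return low + step * rng.randrange(n_choices)


def sample_loguniform(rng, low, high):
    """Float between low and high whose logarithm is uniformly distributed."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_categorical(rng, choices):
    """One of the given choices, each with equal probability."""
    return rng.choice(choices)


rng = random.Random(0)
sizes = [sample_int(rng, 100, 250, 10) for _ in range(1000)]
rates = [sample_loguniform(rng, 1e-5, 1e-1) for _ in range(10_000)]
activation = sample_categorical(rng, ("PReLU", "ReLU", "LeakyReLU"))

# In log space, 1e-3 is the midpoint of [1e-5, 1e-1], so roughly half of the
# sampled learning rates should fall below it.
share_small = sum(r < 1e-3 for r in rates) / len(rates)
print("share of learning rates below 1e-3:", share_small)
```

Sampling in log space is the reason log-uniform ranges suit learning rates: every order of magnitude gets about the same share of samples, instead of most samples landing near the upper bound.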

In [4]:
def objective(trial):
    # as explained above, we'll use this callback to collect the validation accuracies
    metrics_callback = MetricsCallback()  
    
    # create a trainer
    trainer = pl.Trainer(
        #train_percent_check=1.0,
        #val_percent_check=1.0,
        logger=False,                                                                  # deactivate PL logging
        max_epochs=20,                                                                 # epochs
        gpus=1 if torch.cuda.is_available() else None,                                 # #gpus
        callbacks=[metrics_callback],                                                  # save latest accuracy
        early_stop_callback=PyTorchLightningPruningCallback(trial, monitor="val_acc"), # early stopping
    )

    # here we sample the hyper params, similar as in our old random search
    trial_hparams = {"hidden_size": trial.suggest_int("hidden_size", 100, 250, 10),
                     "n_layers": trial.suggest_int("n_layers", 1, 6),
                     "dropout_rate": trial.suggest_loguniform("dropout_rate", 1e-4, 5e-1),
                     "activation": trial.suggest_categorical("activation", ('PReLU', 'ReLU', 'LeakyReLU')),
                     "lr": trial.suggest_loguniform("lr", 1e-5, 1e-1),
                     "lr_decay_rate": trial.suggest_loguniform("lr_decay_rate", 1e-1, 5e-1),
                     "batch_size": 250}
    
    # create model from these hyper params and train it
    model = MyPytorchModel(trial_hparams)
    model.prepare_data()
    trainer.fit(model)

    # save model
    save_model(model, '{}.p'.format(trial.number), "checkpoints")

    # return the validation accuracy of the latest validation run, as that's what we want to maximize with our hyperparameter search
    return metrics_callback.metrics[-1]["val_acc"]
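
Note that the objective above returns the accuracy of the *last* validation run, which is not necessarily the best one seen during training. The small sketch below illustrates the difference on a hypothetical metrics list with made-up numbers; it only mirrors the list-of-dicts layout collected by `MetricsCallback`.

```python
# Hypothetical per-epoch metrics, in the same list-of-dicts layout that
# MetricsCallback collects (the values are made up for illustration).
metrics = [{"val_acc": 0.41}, {"val_acc": 0.52}, {"val_acc": 0.50}]

# What the objective reports: the most recent validation accuracy.
last_acc = metrics[-1]["val_acc"]

# Alternative: the best validation accuracy over all epochs.
best_epoch, best_acc = max(
    enumerate(m["val_acc"] for m in metrics), key=lambda pair: pair[1]
)

print("last:", last_acc, "best:", best_acc, "at epoch", best_epoch)
```

Reporting the last value matches the model that is actually saved at the end of training, so it is a reasonable choice here.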

An important difference from our previous search implementation is that Optuna does not sample purely at random by default!

By default, it uses a **Tree-structured Parzen Estimator (TPE)** sampler, which is a kind of Bayesian optimization. It is typically more efficient than a pure random search because it takes the results of previous trials into account to make an informed guess about where the best hyperparameters can be found.
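
The core idea can be sketched in a few lines for a single continuous parameter. This is a heavily simplified illustration and not Optuna's implementation: the fixed bandwidth, the `gamma` split, the number of candidates and the function names `kde` and `tpe_suggest` are all assumptions. Past trials are split into a "good" and a "bad" group, a density is fitted to each, and the next value is the candidate that looks most typical of the good group relative to the bad one.

```python
import math
import random


def kde(x, points, bandwidth):
    """Average of Gaussian bumps centred on `points`, evaluated at x."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def tpe_suggest(history, low, high, rng, gamma=0.25, n_candidates=24):
    """Suggest the next x from (x, score) pairs; higher scores are better."""
    if len(history) < 2:
        return rng.uniform(low, high)  # too little data: sample at random
    ranked = sorted(history, key=lambda pair: pair[1], reverse=True)
    n_good = max(1, int(gamma * len(ranked)))
    good = [x for x, _ in ranked[:n_good]]
    bad = [x for x, _ in ranked[n_good:]]
    bandwidth = (high - low) / 10
    # Draw candidates from the density of the good group ...
    candidates = []
    for _ in range(n_candidates):
        c = rng.gauss(rng.choice(good), bandwidth)
        candidates.append(min(max(c, low), high))
    # ... and keep the one with the highest ratio good-density / bad-density.
    return max(
        candidates,
        key=lambda c: kde(c, good, bandwidth) / (kde(c, bad, bandwidth) + 1e-12),
    )


def score(x):
    return -(x - 2.0) ** 2  # toy objective with its maximum at x = 2


rng = random.Random(0)
history = [(x, score(x)) for x in (rng.uniform(-10, 10) for _ in range(10))]
for _ in range(30):
    x = tpe_suggest(history, -10, 10, rng)
    history.append((x, score(x)))

best_x, best_score = max(history, key=lambda pair: pair[1])
print("best x:", best_x, "score:", best_score)
```

The first ten points are random start-up trials; afterwards each suggestion depends on all results collected so far, which is what distinguishes this approach from random search.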

### Run the Search

After we've defined our objective function, we can start the search.

For that, we can also provide a **pruner**, which is a very useful concept supported by Optuna: Similar to *early stopping*, a pruner automatically stops unpromising trials in early stages of training (a.k.a. automated early-stopping) and therefore can significantly speed up the search.
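
As an illustration of how such a rule might look, here is a simplified, library-free sketch of a median-based stopping rule. It is only loosely modelled on the idea behind a median pruner and is not Optuna's code: the function names, the `n_startup_trials` default and the accuracy curves are made up for this example.

```python
from statistics import median


def should_prune(current_value, step, finished_curves, n_startup_trials=2):
    """True if `current_value` at `step` is below the median of earlier trials."""
    reference = [curve[step] for curve in finished_curves if len(curve) > step]
    if len(reference) < n_startup_trials:
        return False  # not enough earlier trials to compare against
    return current_value < median(reference)


def first_pruned_step(curve, finished_curves):
    """Step at which a trial with this accuracy curve would be stopped, or None."""
    for step, value in enumerate(curve):
        if should_prune(value, step, finished_curves):
            return step
    return None


# Hypothetical validation accuracies per epoch of three finished trials.
finished_curves = [
    [0.30, 0.40, 0.45, 0.48],
    [0.28, 0.39, 0.44, 0.50],
    [0.35, 0.42, 0.47, 0.51],
]
weak_trial = [0.20, 0.25, 0.27, 0.28]
strong_trial = [0.36, 0.44, 0.49, 0.52]

weak_stop = first_pruned_step(weak_trial, finished_curves)
strong_stop = first_pruned_step(strong_trial, finished_curves)
print("weak trial stopped at step:", weak_stop)
print("strong trial stopped at step:", strong_stop)
```

The weak trial is stopped as soon as it falls below the median of the earlier trials, while the strong trial runs to completion, so compute time is spent on promising configurations.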

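If you want to specify a pruner, have a look at the available options at https://optuna.readthedocs.io/en/latest/reference/pruners.html. Note that the `NopPruner` used in the cell below never stops a trial, so every trial trains for the full number of epochs; swap in one of the other pruners if you want unpromising trials to be stopped early.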

Then, we create our `study`, define the direction that we want to optimize (i.e., *maximize* the validation accuracy) and start the optimization.

In [5]:
pruner = optuna.pruners.NopPruner()
study = optuna.create_study(direction="maximize", pruner=pruner)
study.optimize(objective, n_trials=100, timeout=3600)

GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

   | Name     | Type        | Params
-------------------------------------
0  | model    | Sequential  | 897 K 
1  | model.0  | Linear      | 768 K 
2  | model.1  | Dropout     | 0     
3  | model.2  | PReLU       | 1     
4  | model.3  | BatchNorm1d | 500   
5  | model.4  | Linear      | 62 K  
6  | model.5  | Dropout     | 0     
7  | model.7  | BatchNorm1d | 500   
8  | model.8  | Linear      | 62 K  
9  | model.9  | Dropout     | 0     
10 | model.11 | BatchNorm1d | 500   
11 | model.12 | Linear      | 2 K   


[I 2020-06-11 23:23:07,563] Finished trial#0 with value: 0.5217 with parameters: {'hidden_size': 250, 'n_layers': 3, 'dropout_rate': 0.01079639422176214, 'activation': 'PReLU', 'lr': 0.00041741334235915697, 'lr_decay_rate': 0.38560137431060787}. Best is trial#0 with value: 0.5217.
GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

  | Name    | Type        | Params
------------------------------------
0 | model   | Sequential  | 617 K 
1 | model.0 | Linear      | 614 K 
2 | model.1 | Dropout     | 0     
3 | model.2 | PReLU       | 1     
4 | model.3 | BatchNorm1d | 400   
5 | model.4 | Linear      | 2 K   


[I 2020-06-11 23:32:04,020] Finished trial#1 with value: 0.5025 with parameters: {'hidden_size': 200, 'n_layers': 1, 'dropout_rate': 0.011037224163898778, 'activation': 'PReLU', 'lr': 0.0006621341653181218, 'lr_decay_rate': 0.1175631130444548}. Best is trial#0 with value: 0.5217.
GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

   | Name     | Type        | Params
-------------------------------------
0  | model    | Sequential  | 339 K 
1  | model.0  | Linear      | 307 K 
2  | model.1  | Dropout     | 0     
3  | model.2  | PReLU       | 1     
4  | model.3  | BatchNorm1d | 200   
5  | model.4  | Linear      | 10 K  
6  | model.5  | Dropout     | 0     
7  | model.7  | BatchNorm1d | 200   
8  | model.8  | Linear      | 10 K  
9  | model.9  | Dropout     | 0     
10 | model.11 | BatchNorm1d | 200   
11 | model.12 | Linear      | 10 K  
12 | model.13 | Dropout     | 0     
13 | model.15 | BatchNorm1d | 200   
14 | model

[I 2020-06-11 23:41:14,001] Finished trial#2 with value: 0.4707 with parameters: {'hidden_size': 100, 'n_layers': 4, 'dropout_rate': 0.0011053236129852317, 'activation': 'PReLU', 'lr': 0.04859927637500007, 'lr_decay_rate': 0.20612731455560215}. Best is trial#0 with value: 0.5217.
GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

   | Name     | Type        | Params
-------------------------------------
0  | model    | Sequential  | 870 K 
1  | model.0  | Linear      | 706 K 
2  | model.1  | Dropout     | 0     
3  | model.2  | LeakyReLU   | 0     
4  | model.3  | BatchNorm1d | 460   
5  | model.4  | Linear      | 53 K  
6  | model.5  | Dropout     | 0     
7  | model.7  | BatchNorm1d | 460   
8  | model.8  | Linear      | 53 K  
9  | model.9  | Dropout     | 0     
10 | model.11 | BatchNorm1d | 460   
11 | model.12 | Linear      | 53 K  
12 | model.13 | Dropout     | 0     
13 | model.15 | BatchNorm1d | 460   
14 | model

[I 2020-06-11 23:50:21,672] Finished trial#3 with value: 0.5144 with parameters: {'hidden_size': 230, 'n_layers': 4, 'dropout_rate': 0.005425180189635561, 'activation': 'LeakyReLU', 'lr': 0.0004066772987640371, 'lr_decay_rate': 0.3854136748909353}. Best is trial#0 with value: 0.5217.
GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

  | Name    | Type        | Params
------------------------------------
0 | model   | Sequential  | 524 K 
1 | model.0 | Linear      | 522 K 
2 | model.1 | Dropout     | 0     
3 | model.2 | LeakyReLU   | 0     
4 | model.3 | BatchNorm1d | 340   
5 | model.4 | Linear      | 1 K   


[I 2020-06-11 23:59:33,099] Finished trial#4 with value: 0.5026 with parameters: {'hidden_size': 170, 'n_layers': 1, 'dropout_rate': 0.0003407427566959894, 'activation': 'LeakyReLU', 'lr': 0.001158474829464422, 'lr_decay_rate': 0.42799962539841185}. Best is trial#0 with value: 0.5217.
GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

   | Name     | Type        | Params
-------------------------------------
0  | model    | Sequential  | 687 K 
1  | model.0  | Linear      | 553 K 
2  | model.1  | Dropout     | 0     
3  | model.2  | PReLU       | 1     
4  | model.3  | BatchNorm1d | 360   
5  | model.4  | Linear      | 32 K  
6  | model.5  | Dropout     | 0     
7  | model.7  | BatchNorm1d | 360   
8  | model.8  | Linear      | 32 K  
9  | model.9  | Dropout     | 0     
10 | model.11 | BatchNorm1d | 360   
11 | model.12 | Linear      | 32 K  
12 | model.13 | Dropout     | 0     
13 | model.15 | BatchNorm1d | 360   
14 | 

[I 2020-06-12 00:08:13,904] Finished trial#5 with value: 0.5238 with parameters: {'hidden_size': 180, 'n_layers': 5, 'dropout_rate': 0.0020598759491134554, 'activation': 'PReLU', 'lr': 0.0004895397848279103, 'lr_decay_rate': 0.3652484188519135}. Best is trial#5 with value: 0.5238.
GPU available: True, used: True
No environment variable for node rank defined. Set as 0.
CUDA_VISIBLE_DEVICES: [0]

   | Name     | Type        | Params
-------------------------------------
0  | model    | Sequential  | 769 K 
1  | model.0  | Linear      | 583 K 
2  | model.1  | Dropout     | 0     
3  | model.2  | PReLU       | 1     
4  | model.3  | BatchNorm1d | 380   
5  | model.4  | Linear      | 36 K  
6  | model.5  | Dropout     | 0     
7  | model.7  | BatchNorm1d | 380   
8  | model.8  | Linear      | 36 K  
9  | model.9  | Dropout     | 0     
10 | model.11 | BatchNorm1d | 380   
11 | model.12 | Linear      | 36 K  
12 | model.13 | Dropout     | 0     
13 | model.15 | BatchNorm1d | 380   
14 | mode

[I 2020-06-12 00:16:54,974] Finished trial#6 with value: 0.496 with parameters: {'hidden_size': 190, 'n_layers': 6, 'dropout_rate': 0.14292456873055015, 'activation': 'PReLU', 'lr': 0.001882364325920283, 'lr_decay_rate': 0.25245789551045583}. Best is trial#5 with value: 0.5238.


Notice that we save every model in a directory called `./checkpoints`.

### Check the Results

After we've finished the search, we can access the best trial via `study.best_trial`:

In [6]:
print("Number of finished trials: {}".format(len(study.trials)))

print("Best trial:")
best_trial = study.best_trial

print("  Value: {}".format(best_trial.value))

print("  Params: ")
for key, value in best_trial.params.items():
    print("    {}: {}".format(key, value))

Number of finished trials: 7
Best trial:
  Value: 0.5238
  Params: 
    hidden_size: 180
    n_layers: 5
    dropout_rate: 0.0020598759491134554
    activation: PReLU
    lr: 0.0004895397848279103
    lr_decay_rate: 0.3652484188519135


### Saving your best model


When you're done with your hyperparameter search and have achieved at least 50% validation accuracy, you can save your best model to the `./models` directory in order to submit it.

Before that, we will check again whether the number of parameters is below 5 million and the file size is below 20 MB.

Once your final model is saved, we'll finally report the test accuracy.

In [7]:
best_model = load_model('checkpoints/{}.p'.format(best_trial.number))
best_model.prepare_data()
test_and_save(best_model)

Validation-Accuracy: 51.27%
FYI: Your model has 0.687 params.
Great! Your model size is less than 20 MB and will be accepted :)
Your model has been saved and is ready to be submitted. NOW, let's check the test-accuracy.
Test-Accuracy: 0.5191%


### Remove checkpoints directory
Lastly, let's remove the checkpoints directory again to free up space.

In [8]:
shutil.rmtree("checkpoints")

### References

Here we provided just a quick overview of hyperparameter optimization using Optuna. Optuna has a lot more to offer, so definitely make sure to check out the following resources:

* Source Code: https://github.com/optuna/optuna
* Docs: https://optuna.readthedocs.io/en/stable/
* Website: https://optuna.org
* Paper: https://arxiv.org/pdf/1907.10902.pdf