<a href="https://colab.research.google.com/github/dnth/tsai/blob/master/tutorial_nbs/11_Optuna_HPO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Created by Dickson Neoh (dickson.neoh@gmail.com), with incredible help from Ignacio Oguiza (timeseriesAI@gmail.com)

## Purpose 😇

The purpose of this notebook is to show how you can take any model or dataset in TSAI and run a hyperparameter optimization job that searches for the hyperparameter combination yielding the best result on that dataset.

## Import libraries 📚

In [None]:
# ## NOTE: UNCOMMENT AND RUN THIS CELL IF YOU NEED TO INSTALL/ UPGRADE TSAI
# stable = False # True: stable version from pip, False: latest version from GitHub
# if stable: 
#     !pip install tsai -U >> /dev/null
# else:      
#     !pip install git+https://github.com/timeseriesAI/tsai.git -U >> /dev/null
# ## NOTE: REMEMBER TO RESTART (NOT RECONNECT/ RESET) THE KERNEL/ RUNTIME ONCE THE INSTALLATION IS FINISHED

In [3]:
from tsai.all import *
p = !python  -V
print(f'python         : {p[0].split(" ")[1]}')
print(f'tsai           : {tsai.__version__}')
print(f'fastai         : {fastai.__version__}')
print(f'fastcore       : {fastcore.__version__}')
print(f'torch          : {torch.__version__}')
print(f'#cpus          : {cpus}')
iscuda = torch.cuda.is_available()
print(f'device         : {device} ({torch.cuda.get_device_name(0)})' if iscuda else f'device         : {device}')

python         : 3.8.11
tsai           : 0.2.19
fastai         : 2.5.2
fastcore       : 1.3.26
torch          : 1.9.0+cu102
#cpus          : 4
device         : cpu


## Baseline 📉

Before embarking on any hyperparameter optimization tasks, it is important to get a baseline performance so that we can note the improvements after the optimization is done.
In this notebook we use the InceptionTimePlus model and train it on the NATOPS dataset, both conveniently available in TSAI with just a few lines of code.
Feel free to use any other models and datasets.

In [None]:
dsid = 'NATOPS' 
X, y, splits = get_UCR_data(dsid, return_split=False)

# Old API
tfms  = [None, [Categorize()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits, inplace=True)
dls   = TSDataLoaders.from_dsets(dsets.train, dsets.valid, bs=[64, 128], batch_tfms=[TSStandardize()], num_workers=0)
model = InceptionTimePlus(dls.vars, dls.c)
learn = Learner(dls, model, metrics=accuracy)
learn.fit_one_cycle(5, lr_max=1e-3)
learn.plot_metrics()

# New API
# learn = TSClassifier(X, y, splits=splits, bs=[64, 128], batch_tfms=[TSStandardize()], arch=InceptionTimePlus, metrics=accuracy)
# learn.fit_one_cycle(5, lr_max=1e-3)
# learn.plot_metrics()

Note the performance of the baseline model. It is about 50% accuracy on my local machine.

## Install Optuna 🕹️

[Optuna](https://optuna.readthedocs.io/en/stable/index.html) is an automatic hyperparameter optimization framework designed particularly for machine learning.

In [None]:
# !pip install optuna

## Define objective function 🎯

There are two components in the objective function that you need to define:
1. Search space - the hyperparameter values that you would like to search. In this example we search over combinations of the number of filters (`nf`), the network depth, and the dropout rate.
2. Objective value - the value that will be used to indicate the performance of the model. In this example we use the validation loss as the objective value.
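
To make the two components concrete, here is a minimal sketch that uses plain random search instead of Optuna's sampler. The search space mirrors the three hyperparameters tuned in this notebook, but `toy_objective` is a hypothetical stand-in for a validation loss, and `sample_params` and `random_search` are illustrative names rather than part of Optuna or TSAI.

```python
import random

# Search space mirroring the three hyperparameters tuned in this notebook.
SEARCH_SPACE = {
    "num_filters": [32, 64, 96, 128],  # categorical choices
    "depth": (4, 6),                   # inclusive integer range
    "dropout_rate": (0.0, 1.0),        # continuous range
}

def sample_params(rng):
    """Draw one hyperparameter combination from the search space."""
    return {
        "num_filters": rng.choice(SEARCH_SPACE["num_filters"]),
        "depth": rng.randint(*SEARCH_SPACE["depth"]),
        "dropout_rate": rng.uniform(*SEARCH_SPACE["dropout_rate"]),
    }

def toy_objective(params):
    """Hypothetical stand-in for a validation loss (lower is better)."""
    return (
        abs(params["num_filters"] - 64) / 64
        + abs(params["depth"] - 5)
        + (params["dropout_rate"] - 0.2) ** 2
    )

def random_search(n_trials, seed=0):
    """Evaluate n_trials random combinations and keep the lowest objective value."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        value = toy_objective(params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value

best_params, best_value = random_search(n_trials=50)
print(best_params, best_value)
```

Optuna plays the role of `random_search` here, but it chooses each new combination based on the results of earlier trials rather than drawing blindly.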

In [None]:
import optuna
from optuna.integration import FastAIPruningCallback # A callback to prune unpromising trials (ie early stopping) in Optuna. Optional.

def objective(trial:optuna.Trial):
    
    # Define search space here. More info here https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/002_configurations.html
    num_filters = trial.suggest_categorical('num_filters', [32, 64, 96, 128]) # search through all values in the provided list
    depth = trial.suggest_int("depth", 4, 6) # search through the integers 4, 5 and 6 (both bounds inclusive)
    dropout_rate = trial.suggest_float("dropout_rate", 0.0, 1.0) # search through floats between 0.0 and 1.0
    
    # Define the model with the search space value
    model = InceptionTimePlus(dls.vars, dls.c, nf=num_filters, fc_dropout=dropout_rate, depth=depth)
    learn = Learner(dls, model, metrics=accuracy, cbs=FastAIPruningCallback(trial))
    learn.fit_one_cycle(5, lr_max=1e-3)
       
    # Return the objective value
    # learn.recorder.values holds one row per epoch: [train_loss, valid_loss, *metrics].
    return learn.recorder.values[-1][1] # validation loss of the last epoch

## Start the study 🧑‍🎓

In Optuna, a hyperparameter search job is known as a Study. Each Study consists of many Trials, and the number of Trials determines how many hyperparameter combinations Optuna samples from the search space. Having configured the objective function above, we now let Optuna run the search (the study) for the combination of hyperparameters that yields the best objective value.
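
Not every trial has to run to completion: the `FastAIPruningCallback` in the objective function lets Optuna stop unpromising trials early. The sketch below is a simplified illustration of a median-style pruning rule for a minimized objective. It is not Optuna's implementation, and `should_prune`, `finished_histories` and the numbers are made up for the example.

```python
import statistics

def should_prune(current_value, step, finished_histories, n_startup_trials=5):
    """Simplified median rule for a minimized objective.

    finished_histories holds one list of per-epoch values per completed trial.
    """
    # Values that earlier trials reported at the same epoch.
    peers = [history[step] for history in finished_histories if len(history) > step]
    if len(peers) < n_startup_trials:
        return False  # too few completed trials to judge against
    # Prune when the running trial is worse than the median of its peers.
    return current_value > statistics.median(peers)

# Made-up validation losses from five completed trials (three epochs each).
histories = [
    [1.6, 1.2, 1.0],
    [1.5, 1.1, 0.9],
    [1.7, 1.3, 1.1],
    [1.4, 1.0, 0.8],
    [1.8, 1.4, 1.2],
]
print(should_prune(1.5, step=1, finished_histories=histories))
print(should_prune(1.0, step=1, finished_histories=histories))
```

Pruning trades a little accuracy in the comparison for a lot of saved compute, which matters once the number of trials grows.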

📝 Note: The objective function returns the validation loss, so the study must tell Optuna to minimize the objective value (`direction='minimize'`, as shown below). If you instead return the accuracy metric as the objective value, tell Optuna to maximize it (`direction='maximize'`).
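
The returned column and the direction have to agree. As a rough sketch, assuming a single `accuracy` metric so that each recorder row is `[train_loss, valid_loss, accuracy]`, the pairing could be written down like this. The numbers are made up and `OBJECTIVES` and `objective_value` are illustrative names only.

```python
# Made-up rows shaped like learn.recorder.values: [train_loss, valid_loss, accuracy] per epoch.
recorder_values = [
    [1.70, 1.65, 0.30],
    [1.20, 1.10, 0.55],
    [0.90, 0.95, 0.62],
]

# Which column the objective returns and which direction the study needs.
OBJECTIVES = {
    "valid_loss": {"column": 1, "direction": "minimize"},
    "accuracy": {"column": 2, "direction": "maximize"},
}

def objective_value(values, name):
    """Return the last-epoch value of the chosen objective and its matching direction."""
    spec = OBJECTIVES[name]
    return values[-1][spec["column"]], spec["direction"]

print(objective_value(recorder_values, "valid_loss"))
print(objective_value(recorder_values, "accuracy"))
```

Mixing the two up, for example returning accuracy while minimizing, would make the study search for the worst model.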

In [None]:
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)

In [None]:
print("Study statistics: ")
print("  Number of finished trials: ", len(study.trials))

print("Best trial:")
trial = study.best_trial

print("  Value: ", trial.value)

print("  Params: ")
for key, value in trial.params.items():
    print("    {}: {}".format(key, value))

For reference, the default hyperparameters of the baseline `InceptionTimePlus` [model](https://github.com/timeseriesAI/tsai/blob/main/tsai/models/InceptionTimePlus.py) are `nf=32`, `depth=6`, and `fc_dropout=0.0` (the `dropout_rate` searched above).

## Training the optimized model

Now that we have obtained the optimized hyperparameters from the Optuna study, we can train the InceptionTimePlus model again with the optimal hyperparameter values and note the improvement from the baseline model.

In [None]:
# Get the best nf, depth and dropout rate from the best trial object
trial = study.best_trial
nf = trial.params['num_filters']
depth = trial.params['depth']
dropout_rate = trial.params['dropout_rate']

In [None]:
model = InceptionTimePlus(dls.vars, dls.c, nf=nf, depth=depth, fc_dropout=dropout_rate)
learn = Learner(dls, model, metrics=accuracy) # no pruning callback: this is a regular training run, not an Optuna trial

# New API
# learn = TSClassifier(X, y, splits=splits, bs=[64, 128], 
#                      batch_tfms=[TSStandardize()], metrics=accuracy,  
#                      arch=InceptionTimePlus, arch_config={'nf':nf, 'depth':depth, 'fc_dropout':dropout_rate})

learn.fit_one_cycle(5, lr_max=1e-3)
learn.plot_metrics()

For comparison, our baseline model scored only around 50% accuracy after 5 epochs of training, whereas the hyperparameters from the Optuna study typically result in much higher accuracy. The numbers may vary due to the randomness in training, so seed your runs or train the model a few times to verify the results. Sometimes the study fails to find a combination that works better than the baseline; in that case you might want to increase the number of trials in the study.
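
One way to check whether an improvement is real is to repeat the run under several seeds and compare the mean and spread. The sketch below only shows the bookkeeping: `train_once` is a hypothetical stand-in that returns a fake accuracy, and in the notebook it would seed the run, rebuild the learner, call `fit_one_cycle` and return the final accuracy.

```python
import random
import statistics

def train_once(seed):
    """Hypothetical stand-in for one seeded training run; returns a fake final accuracy."""
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.05, 0.05)

def summarize_runs(run_fn, seeds):
    """Run run_fn once per seed and return the mean and sample standard deviation."""
    scores = [run_fn(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

mean_acc, std_acc = summarize_runs(train_once, seeds=[0, 1, 2, 3, 4])
print(f"accuracy: {mean_acc:.3f} +/- {std_acc:.3f}")
```

If the gap between the baseline and the tuned model is smaller than the spread across seeds, the study probably needs more trials before its result can be trusted.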

Happy learning! 