# Quick Visualization for Hyperparameter Optimization Analysis

Optuna provides various visualization features in :mod:`optuna.visualization` to analyze optimization results visually.

This tutorial walks you through this module by visualizing the optimization history of an Informer forecasting model tuned on the `SRL_NEG_00_04` dataset.

For more on visualizing multi-objective optimization (e.g., :func:`optuna.visualization.plot_pareto_front`),
refer to the `multi_objective` tutorial.

<div class="alert alert-info"><h4>Note</h4><p>By using [Optuna Dashboard](https://github.com/optuna/optuna-dashboard), you can also check the optimization history,
   hyperparameter importances, hyperparameter relationships, etc. in graphs and tables.
   Please make your study persistent using `RDB backend <rdb>` and execute the following commands to run Optuna Dashboard.

```console
$ pip install optuna-dashboard
$ optuna-dashboard sqlite:///example-study.db
```
   Please check out [the GitHub repository](https://github.com/optuna/optuna-dashboard) for more details.

   .. list-table::
      :header-rows: 1

      * - Manage Studies
        - Visualize with Interactive Graphs
      * - .. image:: https://user-images.githubusercontent.com/5564044/205545958-305f2354-c7cd-4687-be2f-9e46e7401838.gif
        - .. image:: https://user-images.githubusercontent.com/5564044/205545965-278cd7f4-da7d-4e2e-ac31-6d81b106cada.gif</p></div>


In [7]:
from sklearn.metrics import mean_absolute_error
from utils.postprocessing import ProcessedResult
import optuna

from utils.metrics import LinEx, LinLin, weighted_RMSE, RMSE

SEED = 42

# You can use Matplotlib instead of Plotly for visualization by simply replacing `optuna.visualization` with
# `optuna.visualization.matplotlib` in the following examples.
from optuna.visualization import plot_contour
from optuna.visualization import plot_edf
from optuna.visualization import plot_intermediate_values
from optuna.visualization import plot_optimization_history
from optuna.visualization import plot_parallel_coordinate
from optuna.visualization import plot_param_importances
from optuna.visualization import plot_slice

import argparse
import os
import torch
import yaml

import numpy as np
from datetime import datetime
from exp.exp_informer import Exp_Informer
from exp.args_parser import args_parsing

args = args_parsing()

now = datetime.now().strftime("%d-%m-%Y_%H-%M-%S")

Exp = Exp_Informer

Args in experiment:
Namespace(data='SRL_NEG_00_04', model='informer', loss='linlin', w_rmse_weight=5, linex_weight=0.05, linlin_weight=0.1, seq_len=4, label_len=3, pred_len=1, timestamp='10-07-2023_20-36-15', root_path='d:\\srl_informer\\data\\processed\\SRL', data_path='SRL_NEG_00_04.csv', features='S', cols=None, itr=2, train_epochs=6, scale='standard', target='capacity_price', freq='d', checkpoints='./checkpoints/', enc_in=1, dec_in=1, c_out=1, d_model=512, n_heads=8, e_layers=2, d_layers=1, s_layers=[3, 2, 1], d_ff=2048, factor=5, padding=0, distil=True, dropout=0.05, attn='prob', embed='timeF', activation='gelu', output_attention=False, do_predict=False, mix=True, num_workers=0, batch_size=32, patience=3, learning_rate=0.0001, des='test', lradj='type1', use_amp=False, inverse=False, use_gpu=False, gpu=0, use_multi_gpu=False, devices='0,1,2,3', tune_num_samples=200, detail_freq='d')



In [8]:
args = args_parsing()
args.tune_num_samples = 20

Args in experiment:
Namespace(data='SRL_NEG_00_04', model='informer', loss='linlin', w_rmse_weight=5, linex_weight=0.05, linlin_weight=0.1, seq_len=4, label_len=3, pred_len=1, timestamp='10-07-2023_20-36-16', root_path='d:\\srl_informer\\data\\processed\\SRL', data_path='SRL_NEG_00_04.csv', features='S', cols=None, itr=2, train_epochs=6, scale='standard', target='capacity_price', freq='d', checkpoints='./checkpoints/', enc_in=1, dec_in=1, c_out=1, d_model=512, n_heads=8, e_layers=2, d_layers=1, s_layers=[3, 2, 1], d_ff=2048, factor=5, padding=0, distil=True, dropout=0.05, attn='prob', embed='timeF', activation='gelu', output_attention=False, do_predict=False, mix=True, num_workers=0, batch_size=32, patience=3, learning_rate=0.0001, des='test', lradj='type1', use_amp=False, inverse=False, use_gpu=False, gpu=0, use_multi_gpu=False, devices='0,1,2,3', tune_num_samples=200, detail_freq='d')



Define the objective function.



In [9]:
def objective(trial):
    """Train one Informer configuration and return its (loss, revenue) pair."""
    # Release unused cached GPU memory before the trial starts.
    torch.cuda.empty_cache()

    # SEARCH SPACE
    # Tune only the weight that belongs to the selected loss function.
    match args.loss:
        case 'linex':
            args.linex_weight = trial.suggest_float('linex_weight', 0.01, 3, step=0.01)
        case 'w_rmse':
            args.w_rmse_weight = trial.suggest_float('w_rmse_weight', 1.0, 10.0, step=0.1)
        case 'linlin':
            args.linlin_weight = trial.suggest_float('linlin_weight', 0.05, 0.45, step=0.005)

    args.learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
    args.train_epochs = trial.suggest_int("train_epochs", 3, 6)
    args.seq_len = trial.suggest_int('seq_len', 49, 112, step=7)
    # The label length is a fraction of the sequence length, capped at 77.
    label_seq_len_ratio = trial.suggest_float('label_seq_len_ratio', 0.4, 0.8, step=0.025)
    args.label_len = min(int(label_seq_len_ratio * args.seq_len), 77)
    args.e_layers = trial.suggest_int('e_layers', 2, 7)
    args.d_layers = trial.suggest_int('d_layers', 1, 4)
    args.n_heads = trial.suggest_int('n_heads', 4, 32, step=4)
    args.d_model = trial.suggest_int('d_model', 1024, 2048, step=256)
    args.batch_size = trial.suggest_int('batch_size', 16, 64, step=8)

    # Build the experiment from the sampled arguments and run it.
    exp = Exp(args)

    # The third value returned by `tune()` is not used here.
    loss, revenue, _ = exp.tune()

    torch.cuda.empty_cache()

    return loss, revenue
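
The decoder's `label_len` is not sampled directly: the objective derives it from `seq_len` and `label_seq_len_ratio` and caps it at 77. The sketch below repeats that arithmetic outside Optuna so the reachable values can be inspected. The helper name `derive_label_len` and the constant `LABEL_LEN_CAP` are illustrative and not part of the project code.

```python
# Illustrative helper mirroring the arithmetic in `objective`; the names are hypothetical.
LABEL_LEN_CAP = 77  # same cap as in the objective above


def derive_label_len(seq_len: int, ratio: float, cap: int = LABEL_LEN_CAP) -> int:
    """Return int(ratio * seq_len), limited to at most `cap`."""
    return min(int(ratio * seq_len), cap)


# seq_len is sampled from 49 to 112 in steps of 7.
seq_lens = list(range(49, 113, 7))

# Show the label_len reached at the smallest and largest ratio of the search space.
for seq_len in seq_lens:
    low = derive_label_len(seq_len, 0.4)
    high = derive_label_len(seq_len, 0.8)
    print(f"seq_len={seq_len:3d}  label_len from {low} to {high}")
```

For example, a trial with `seq_len=84` and a ratio of 0.45 ends up with `label_len=37`, while the cap only takes effect for the longest sequences combined with high ratios.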

In [10]:
import logging
import sys

import optuna

# Add stream handler of stdout to show the messages
optuna.logging.get_logger("optuna").addHandler(logging.StreamHandler(sys.stdout))
study_name = "example-study"  # Unique identifier of the study.
storage_name = "sqlite:///{}.db".format(study_name)
study = optuna.create_study(study_name=study_name, storage=storage_name)

[I 2023-07-10 20:36:21,020] A new study created in RDB with name: example-study


A new study created in RDB with name: example-study


In [19]:
study = optuna.create_study(
    directions=['minimize', 'maximize'],
    sampler=optuna.samplers.TPESampler(seed=SEED),
    # pruner=optuna.pruners.MedianPruner(n_warmup_steps=10),
)
study.optimize(objective, n_trials=args.tune_num_samples, timeout=600)

[I 2023-07-10 13:57:27,257] A new study created in memory with name: no-name-65cb425d-0ae6-481c-aaff-ee057004ab4e


Use GPU: cuda:0
Updating learning rate to 0.06351221010640701
EarlyStopping counter: 1 out of 3
Updating learning rate to 0.031756105053203504
Debug
Debug
Debug
Debug

[I 2023-07-10 13:58:00,408] Trial 0 finished with values: [2.983625650405884, 0.0] and parameters: {'linlin_weight': 0.2, 'learning_rate': 0.06351221010640701, 'train_epochs': 2, 'seq_len': 84, 'label_seq_len_ratio': 0.45, 'e_layers': 2, 'd_layers': 1, 'n_heads': 28, 'd_model': 2816, 'batch_size': 48}.


Use GPU: cuda:0
Updating learning rate to 0.07579479953348005
Updating learning rate to 0.037897399766740024
Debug
Debug
Debug
Debug

[I 2023-07-10 13:58:28,637] Trial 1 finished with values: [8.623517990112305, 0.0] and parameters: {'linlin_weight': 0.055, 'learning_rate': 0.07579479953348005, 'train_epochs': 2, 'seq_len': 63, 'label_seq_len_ratio': 0.47500000000000003, 'e_layers': 3, 'd_layers': 2, 'n_heads': 20, 'd_model': 2304, 'batch_size': 32}.


Use GPU: cuda:0
Updating learning rate to 3.613894271216525e-05
Debug
Debug
Debug
Debug

[I 2023-07-10 13:59:02,708] Trial 2 finished with values: [0.1815057396888733, 39.27] and parameters: {'linlin_weight': 0.295, 'learning_rate': 3.613894271216525e-05, 'train_epochs': 1, 'seq_len': 70, 'label_seq_len_ratio': 0.5750000000000001, 'e_layers': 6, 'd_layers': 1, 'n_heads': 20, 'd_model': 2816, 'batch_size': 16}.


Use GPU: cuda:0

[W 2023-07-10 13:59:07,057] Trial 3 failed with parameters: {'linlin_weight': 0.295, 'learning_rate': 4.809461967501571e-05, 'train_epochs': 1, 'seq_len': 112, 'label_seq_len_ratio': 0.8, 'e_layers': 6, 'd_layers': 2, 'n_heads': 4, 'd_model': 3072, 'batch_size': 40} because of the following error: OutOfMemoryError('CUDA out of memory. Tried to allocate 5.74 GiB (GPU 0; 15.84 GiB total capacity; 5.68 GiB already allocated; 3.26 GiB free; 11.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF').
Traceback (most recent call last):
  File "c:\Users\CLE\AppData\Local\miniforge3\envs\thesis\lib\site-packages\optuna\study\_optimize.py", line 200, in _run_trial
    value_or_values = func(trial)
  File "C:\Users\local_CLE\Temp\ipykernel_10704\1932876297.py", line 50, in objective
    loss, revenue, _ = exp.tune()
  File "c:\codes\srl_




OutOfMemoryError: CUDA out of memory. Tried to allocate 5.74 GiB (GPU 0; 15.84 GiB total capacity; 5.68 GiB already allocated; 3.26 GiB free; 11.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
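
The `Updating learning rate to ...` lines in the training log above appear to follow a schedule that halves the sampled learning rate after every epoch (for trial 0, 0.0635... in the first epoch and 0.0317... in the second). The sketch below reproduces those numbers under that assumption. The function name `halved_learning_rate` is hypothetical and the actual scheduler lives in the project code, which is not shown here.

```python
def halved_learning_rate(base_lr: float, epoch: int) -> float:
    """Learning rate at a 1-based epoch, assuming the rate is halved after each epoch."""
    return base_lr * 0.5 ** (epoch - 1)


# Learning rate sampled for trial 0 in the log above.
base_lr = 0.06351221010640701

# Trial 0 ran for two epochs.
schedule = [halved_learning_rate(base_lr, epoch) for epoch in (1, 2)]
print(schedule)
```

If the assumption holds, a large sampled learning rate stays large for the whole run when `train_epochs` is small, which is worth keeping in mind when reading the plots below.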

## Plot functions
Visualize the optimization history. See :func:`~optuna.visualization.plot_optimization_history` for the details.



In [11]:
study.get_trials()

[FrozenTrial(number=0, state=TrialState.COMPLETE, values=[0.09716078639030457, 801.5], datetime_start=datetime.datetime(2023, 7, 10, 13, 46, 18, 102083), datetime_complete=datetime.datetime(2023, 7, 10, 13, 46, 28, 890631), params={'linlin_weight': 0.2, 'learning_rate': 0.06351221010640701, 'train_epochs': 2, 'seq_len': 84, 'label_seq_len_ratio': 0.45, 'e_layers': 2, 'd_layers': 1, 'n_heads': 28, 'd_model': 1280, 'batch_size': 24}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'linlin_weight': FloatDistribution(high=0.45, log=False, low=0.05, step=0.005), 'learning_rate': FloatDistribution(high=0.1, log=True, low=1e-05, step=None), 'train_epochs': IntDistribution(high=2, log=False, low=1, step=1), 'seq_len': IntDistribution(high=112, log=False, low=49, step=7), 'label_seq_len_ratio': FloatDistribution(high=0.8, log=False, low=0.4, step=0.025), 'e_layers': IntDistribution(high=7, log=False, low=2, step=1), 'd_layers': IntDistribution(high=4, log=False, low=1, st

In [12]:
optuna.visualization.plot_pareto_front(study, target_names=["loss", "revenue"])
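
To make the Pareto front concrete, the following sketch computes it by hand for the three trials that completed in the log above, with loss minimized and revenue maximized. It is a plain-Python illustration of the dominance rule, not Optuna's implementation, and the helper names are made up for this example.

```python
from typing import List, Tuple

# (loss, revenue) pairs of the three completed trials in the log above.
trial_values: List[Tuple[float, float]] = [
    (2.983625650405884, 0.0),     # trial 0
    (8.623517990112305, 0.0),     # trial 1
    (0.1815057396888733, 39.27),  # trial 2
]


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on at least one.

    Loss (index 0) is minimized, revenue (index 1) is maximized.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(values: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the values that no other value dominates."""
    return [v for v in values if not any(dominates(other, v) for other in values)]


front = pareto_front(trial_values)
print(front)
```

Here trial 2 has both the lowest loss and the highest revenue, so it dominates the other two and the front consists of that single point.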

In [10]:
plot_optimization_history(study, target=lambda t: t.values[1])
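
When a custom `target` is passed, the plot may show only the per-trial values without a best-so-far line, depending on the Optuna version. A best-so-far curve can be computed by hand as a running minimum or maximum. The sketch below does this for the three completed trials from the log above. The variable names are illustrative.

```python
from itertools import accumulate

# Objective values of trials 0-2 from the log above, in trial order.
losses = [2.983625650405884, 8.623517990112305, 0.1815057396888733]
revenues = [0.0, 0.0, 39.27]

# Loss is minimized, so its best-so-far curve is a running minimum.
best_loss = list(accumulate(losses, min))

# Revenue is maximized, so its best-so-far curve is a running maximum.
best_revenue = list(accumulate(revenues, max))

print(best_loss)
print(best_revenue)
```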

Visualize the learning curves of the trials. See :func:`~optuna.visualization.plot_intermediate_values` for the details.



In [17]:
plot_intermediate_values(study)

[W 2023-07-06 23:38:03,494] You need to set up the pruning feature to utilize `plot_intermediate_values()`
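
The warning appears because no trial reported intermediate values, which is also what a pruner such as the commented-out `MedianPruner` would need. As far as we know, Optuna does not support `trial.report` in multi-objective studies, so this plot would require a single-objective setup. The sketch below shows a simplified version of the median rule for a minimized metric. It ignores warm-up steps and startup trials, it is not Optuna's implementation, and the numbers are made up.

```python
from statistics import median
from typing import Sequence


def should_prune_median(current_value: float, previous_values_at_step: Sequence[float]) -> bool:
    """Simplified median rule for a minimized metric.

    Prune when the current intermediate value is worse (larger) than the
    median of the values that earlier trials reported at the same step.
    """
    if not previous_values_at_step:
        return False  # nothing to compare against yet
    return current_value > median(previous_values_at_step)


# Made-up validation losses of three earlier trials at the same epoch (median 0.42).
earlier = [0.42, 0.35, 0.51]

print(should_prune_median(0.60, earlier))  # worse than the median
print(should_prune_median(0.30, earlier))  # better than the median
```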


Visualize high-dimensional parameter relationships. See :func:`~optuna.visualization.plot_parallel_coordinate` for the details.



In [13]:
plot_parallel_coordinate(study, target=lambda t: t.values[1])

Select parameters to visualize.



In [None]:
plot_parallel_coordinate(study, params=["learning_rate", "seq_len"], target=lambda t: t.values[1])

Visualize hyperparameter relationships. See :func:`~optuna.visualization.plot_contour` for the details.



In [55]:
plot_contour(study, target=lambda t: t.values[1])

Select parameters to visualize.



In [None]:
plot_contour(study, params=["learning_rate", "seq_len"], target=lambda t: t.values[1])

Visualize individual hyperparameters as a slice plot. See :func:`~optuna.visualization.plot_slice` for the details.



In [None]:
plot_slice(study, target=lambda t: t.values[1])

Select parameters to visualize.



In [None]:
plot_slice(study, params=["learning_rate", "seq_len"], target=lambda t: t.values[1])

Visualize parameter importances. See :func:`~optuna.visualization.plot_param_importances` for the details.



In [None]:
plot_param_importances(study)

Use hyperparameter importance to learn which hyperparameters affect the trial duration.



In [None]:
optuna.visualization.plot_param_importances(
    study, target=lambda t: t.duration.total_seconds(), target_name="duration"
)
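
The `target` above reads each trial's `duration`, which is the difference between its completion and start timestamps. The sketch below reproduces that value with the standard library for trial 0, using the two timestamps printed by `study.get_trials()` earlier in this notebook.

```python
from datetime import datetime

# Start and completion timestamps of trial 0 as printed by `study.get_trials()`.
start = datetime(2023, 7, 10, 13, 46, 18, 102083)
complete = datetime(2023, 7, 10, 13, 46, 28, 890631)

# The difference is a timedelta; total_seconds() converts it to a float.
duration = complete - start
print(duration.total_seconds())
```

That trial therefore took a little under 11 seconds.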

Visualize the empirical distribution function. See :func:`~optuna.visualization.plot_edf` for the details.



In [None]:
plot_edf(study, target=lambda t: t.values[1])
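
An empirical distribution function gives, for each threshold, the fraction of trials whose objective value is at or below that threshold. The sketch below computes it in plain Python for the losses of the three completed trials from the log above. The function name and the thresholds are chosen for illustration only.

```python
from typing import List, Sequence


def empirical_distribution(values: Sequence[float], thresholds: Sequence[float]) -> List[float]:
    """Fraction of `values` that are less than or equal to each threshold."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in thresholds]


# Losses of trials 0-2 from the log above.
losses = [2.983625650405884, 8.623517990112305, 0.1815057396888733]

# Illustrative thresholds spanning the observed losses.
thresholds = [0.1, 1.0, 5.0, 10.0]

edf = empirical_distribution(losses, thresholds)
print(edf)
```

For a minimized objective, a curve that rises earlier means more trials reached low values, which is how EDF plots are typically used to compare search spaces or samplers.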