---
execute:
  cache: false
  eval: true
  echo: true
  warning: false
title: Hyperparameter Tuning with `spotpython` and `PyTorch` Lightning Using a CondNet Model
jupyter: python3
---

In [1]:
#| label: 608_user-user-imports
#| echo: false
import os
from math import inf
import warnings
warnings.filterwarnings("ignore")

* We use the `Diabetes` dataset to illustrate hyperparameter tuning of a `CondNet` model with the `spotpython` package.
* The CondNet model is a conditional neural network that can be used to model conditional distributions; see the [`NNCondNetRegressor` reference](https://sequential-parameter-optimization.github.io/spotPython/reference/spotpython/light/regression/nn_condnet_regressor/).


In [2]:
#| label: 608_cond_net_setup
from spotpython.data.diabetes import Diabetes
from spotpython.hyperdict.light_hyper_dict import LightHyperDict
from spotpython.fun.hyperlight import HyperLight
from spotpython.utils.init import (fun_control_init, surrogate_control_init, design_control_init)
from spotpython.utils.eda import gen_design_table
from spotpython.spot import spot
from spotpython.utils.file import get_experiment_filename
from math import inf
from spotpython.hyperparameters.values import set_hyperparameter

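PREFIX = "CondNet_01"

data_set = Diabetes()
# The 10 features are split into regular inputs and conditioning inputs.
input_dim = 10
output_dim = 1
cond_dim = 2

fun_control = fun_control_init(
    PREFIX=PREFIX,
    fun_evals=inf,  # no limit on the number of evaluations; max_time ends the run
    max_time=1,
    data_set=data_set,
    core_model_name="light.regression.NNCondNetRegressor",
    hyperdict=LightHyperDict,
    _L_in=input_dim - cond_dim,  # 8 regular input features
    _L_out=output_dim,
    _L_cond=cond_dim,  # 2 conditioning features
)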

fun = HyperLight().fun


# Restrict the search space. Integer bounds with the transform
# `transform_power_2_int` (see the table printed below) are exponents of 2.
set_hyperparameter(fun_control, "optimizer", ["Adadelta", "Adam", "Adamax"])
set_hyperparameter(fun_control, "l1", [3, 4])
set_hyperparameter(fun_control, "epochs", [3, 7])
set_hyperparameter(fun_control, "batch_size", [4, 5])
set_hyperparameter(fun_control, "dropout_prob", [0.0, 0.025])
set_hyperparameter(fun_control, "patience", [2, 3])
set_hyperparameter(fun_control, "lr_mult", [0.1, 20.0])

design_control = design_control_init(init_size=10)

print(gen_design_table(fun_control))

Seed set to 123


module_name: light
submodule_name: regression
model_name: NNCondNetRegressor
| name           | type   | default   |   lower |   upper | transform             |
|----------------|--------|-----------|---------|---------|-----------------------|
| l1             | int    | 3         |     3   |   4     | transform_power_2_int |
| epochs         | int    | 4         |     3   |   7     | transform_power_2_int |
| batch_size     | int    | 4         |     4   |   5     | transform_power_2_int |
| act_fn         | factor | ReLU      |     0   |   5     | None                  |
| optimizer      | factor | SGD       |     0   |   2     | None                  |
| dropout_prob   | float  | 0.01      |     0   |   0.025 | None                  |
| lr_mult        | float  | 1.0       |     0.1 |  20     | None                  |
| patience       | int    | 2         |     2   |   3     | transform_power_2_int |
| batch_norm     | factor | 0         |     0   |   1     | None                  |
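
The `lower` and `upper` columns for `l1`, `epochs`, `batch_size`, and `patience` are exponents, not final values. The following minimal sketch decodes them under the assumption that `transform_power_2_int` maps an integer `x` to `2**x`; the helper `power_2_int` is a hypothetical stand-in, not the `spotpython` implementation. The assumption agrees with the training log of the run, which reports batch sizes of 16 and 32 and `max_epochs` values between 8 and 128.

```python
# Sketch: decode integer exponents the way a power-of-two transform would.
# Assumption: transform_power_2_int maps x to 2**x (hypothetical helper below).
def power_2_int(exponent: int) -> int:
    return 2 ** int(exponent)

# Bounds as they appear in the design table above (exponents of 2).
bounds = {
    "l1": (3, 4),
    "epochs": (3, 7),
    "batch_size": (4, 5),
    "patience": (2, 3),
}

# Decoded bounds, i.e. the values the model is actually built with.
decoded = {
    name: (power_2_int(lower), power_2_int(upper))
    for name, (lower, upper) in bounds.items()
}

for name, (lower, upper) in decoded.items():
    print(f"{name}: {lower} to {upper}")
```

Searching over exponents keeps the integer ranges small while still covering values that differ by an order of magnitude, for example 8 to 128 epochs.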

In [3]:
#| label: 608_cond_net_run
spot_tuner = spot.Spot(fun=fun,fun_control=fun_control, design_control=design_control)
res = spot_tuner.run()

GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[32, 8], [32, 2]] | [32, 16] 
1 | layers     | Sequential       | 587    | train | [32, 16]           | [32, 1]  
-----------------------------------------------------------------------------------------
779       Trainable params
0         Non-trainable params
779       Total params
0.003     Total estimated model params size (MB)
26        Modules in train mode
0         Modules in eval mode


`Trainer.fit` stopped: `max_epochs=16` reached.


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 153    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
249       Trainable params
0         Non-trainable params
249       Total params
0.001     Total estimated model params size (MB)
18        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 24158.83203125, 'hp_metric': 24158.83203125}


`Trainer.fit` stopped: `max_epochs=32` reached.


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[32, 8], [32, 2]] | [32, 16] 
1 | layers     | Sequential       | 691    | train | [32, 16]           | [32, 1]  
-----------------------------------------------------------------------------------------
883       Trainable params
0         Non-trainable params
883       Total params
0.004     Total estimated model params size (MB)
36        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 23447.546875, 'hp_metric': 23447.546875}


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[32, 8], [32, 2]] | [32, 8]  
1 | layers     | Sequential       | 197    | train | [32, 8]            | [32, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 5128.1162109375, 'hp_metric': 5128.1162109375}


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[16, 8], [16, 2]] | [16, 16] 
1 | layers     | Sequential       | 691    | train | [16, 16]           | [16, 1]  
-----------------------------------------------------------------------------------------
883       Trainable params
0         Non-trainable params
883       Total params
0.004     Total estimated model params size (MB)
36        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 24131.58984375, 'hp_metric': 24131.58984375}


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[16, 8], [16, 2]] | [16, 16] 
1 | layers     | Sequential       | 587    | train | [16, 16]           | [16, 1]  
-----------------------------------------------------------------------------------------
779       Trainable params
0         Non-trainable params
779       Total params
0.003     Total estimated model params size (MB)
26        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 22622.26953125, 'hp_metric': 22622.26953125}


`Trainer.fit` stopped: `max_epochs=16` reached.


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[32, 8], [32, 2]] | [32, 8]  
1 | layers     | Sequential       | 153    | train | [32, 8]            | [32, 1]  
-----------------------------------------------------------------------------------------
249       Trainable params
0         Non-trainable params
249       Total params
0.001     Total estimated model params size (MB)
18        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 23945.794921875, 'hp_metric': 23945.794921875}


`Trainer.fit` stopped: `max_epochs=8` reached.


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 23592.5, 'hp_metric': 23592.5}


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 153    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
249       Trainable params
0         Non-trainable params
249       Total params
0.001     Total estimated model params size (MB)
18        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 3970.03271484375, 'hp_metric': 3970.03271484375}


`Trainer.fit` stopped: `max_epochs=32` reached.


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[32, 8], [32, 2]] | [32, 16] 
1 | layers     | Sequential       | 691    | train | [32, 16]           | [32, 1]  
-----------------------------------------------------------------------------------------
883       Trainable params
0         Non-trainable params
883       Total params
0.004     Total estimated model params size (MB)
36        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 22140.10546875, 'hp_metric': 22140.10546875}


train_model result: {'val_loss': 22977.068359375, 'hp_metric': 22977.068359375}


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[16, 8], [16, 2]] | [16, 16] 
1 | layers     | Sequential       | 691    | train | [16, 16]           | [16, 1]  
-----------------------------------------------------------------------------------------
883       Trainable params
0         Non-trainable params
883       Total params
0.004     Total estimated model params size (MB)
36        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 9490.4794921875, 'hp_metric': 9490.4794921875}
spotpython tuning: 3970.03271484375 [----------] 4.21% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[32, 8], [32, 2]] | [32, 8]  
1 | layers     | Sequential       | 197    | train | [32, 8]            | [32, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 4811.8203125, 'hp_metric': 4811.8203125}


spotpython tuning: 3970.03271484375 [#---------] 8.28% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 192    | train | [[32, 8], [32, 2]] | [32, 16] 
1 | layers     | Sequential       | 691    | train | [32, 16]           | [32, 1]  
-----------------------------------------------------------------------------------------
883       Trainable params
0         Non-trainable params
883       Total params
0.004     Total estimated model params size (MB)
36        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 16126.7421875, 'hp_metric': 16126.7421875}


spotpython tuning: 3970.03271484375 [#---------] 11.87% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 5215.19384765625, 'hp_metric': 5215.19384765625}


spotpython tuning: 3970.03271484375 [##--------] 17.33% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


`Trainer.fit` stopped: `max_epochs=128` reached.


train_model result: {'val_loss': 23775.759765625, 'hp_metric': 23775.759765625}


spotpython tuning: 3970.03271484375 [#####-----] 49.59% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 4698.92138671875, 'hp_metric': 4698.92138671875}
spotpython tuning: 3970.03271484375 [######----] 56.04% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 4691.24072265625, 'hp_metric': 4691.24072265625}
spotpython tuning: 3970.03271484375 [######----] 62.10% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


`Trainer.fit` stopped: `max_epochs=128` reached.


train_model result: {'val_loss': 20540.259765625, 'hp_metric': 20540.259765625}


spotpython tuning: 3970.03271484375 [##########] 95.35% 


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name       | Type             | Params | Mode  | In sizes           | Out sizes
-----------------------------------------------------------------------------------------
0 | cond_layer | ConditionalLayer | 96     | train | [[16, 8], [16, 2]] | [16, 8]  
1 | layers     | Sequential       | 197    | train | [16, 8]            | [16, 1]  
-----------------------------------------------------------------------------------------
293       Trainable params
0         Non-trainable params
293       Total params
0.001     Total estimated model params size (MB)
24        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 5512.07861328125, 'hp_metric': 5512.07861328125}


spotpython tuning: 3970.03271484375 [##########] 100.00% Done...
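
The model summaries in the log above report 192 parameters for a `ConditionalLayer` that maps inputs of width 8 and conditions of width 2 to 16 units, and 96 parameters when the output width is 8. One reading that reproduces these counts, which is an assumption and not taken from the `spotpython` source, is that the layer holds two linear maps with bias terms, one for the regular inputs and one for the conditioning inputs. The sketch below only checks this arithmetic; the function names are hypothetical.

```python
# Sketch: parameter arithmetic for a conditional layer.
# Assumption: the layer consists of two linear maps with bias,
# one applied to the inputs and one applied to the conditions.
def linear_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Number of weights (and biases) of a dense layer n_in -> n_out."""
    return n_in * n_out + (n_out if bias else 0)

def cond_layer_params(n_in: int, n_cond: int, n_out: int) -> int:
    """Parameters of the input map plus parameters of the condition map."""
    return linear_params(n_in, n_out) + linear_params(n_cond, n_out)

# Widths taken from the "In sizes" and "Out sizes" columns of the log.
wide = cond_layer_params(n_in=8, n_cond=2, n_out=16)
narrow = cond_layer_params(n_in=8, n_cond=2, n_out=8)

print(wide, narrow)
```

Both results match the `Params` column of the log, so the two-map reading is at least consistent with the reported sizes. How the two outputs are combined cannot be inferred from the counts.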



## Looking at the Results

### Tuning Progress

After the hyperparameter tuning run is finished, its progress can be visualized with `spotpython`'s `plot_progress` method. The black points represent the performance values (score or metric) of the hyperparameter configurations from the initial design, whereas the red points represent the hyperparameter configurations found by the surrogate-model-based optimization.


In [4]:
spot_tuner.plot_progress()

<Figure size 2700x1800 with 1 Axes>
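
The `spotpython tuning:` lines in the run log report the best value found so far, which stays at 3970.03271484375 once that configuration has been evaluated. As a minimal sketch, independent of `spotpython`, the same quantity can be computed as a running minimum. The list holds a few `val_loss` values copied from the log; their order here is illustrative.

```python
from itertools import accumulate

# A few val_loss values copied from the run log above (illustrative order).
val_losses = [
    24158.83203125,
    23447.546875,
    5128.1162109375,
    3970.03271484375,
    4811.8203125,
    5215.19384765625,
]

# Running minimum: the best (lowest) validation loss seen up to each evaluation.
best_so_far = list(accumulate(val_losses, min))

for loss, best in zip(val_losses, best_so_far):
    print(f"val_loss={loss:.2f}  best_so_far={best:.2f}")
```

A later evaluation only changes the reported best value if its loss falls below the current minimum, which is why the progress lines in the log repeat the same number.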

### Tuned Hyperparameters and Their Importance

Results can be printed in tabular form.


In [5]:
from spotpython.utils.eda import gen_design_table
print(gen_design_table(fun_control=fun_control, spot=spot_tuner))

| name           | type   | default   |   lower |   upper | tuned                 | transform             |   importance | stars   |
|----------------|--------|-----------|---------|---------|-----------------------|-----------------------|--------------|---------|
| l1             | int    | 3         |     3.0 |     4.0 | 3.0                   | transform_power_2_int |         0.05 |         |
| epochs         | int    | 4         |     3.0 |     7.0 | 7.0                   | transform_power_2_int |         0.00 |         |
| batch_size     | int    | 4         |     4.0 |     5.0 | 4.0                   | transform_power_2_int |         0.00 |         |
| act_fn         | factor | ReLU      |     0.0 |     5.0 | Swish                 | None                  |         0.00 |         |
| optimizer      | factor | SGD       |     0.0 |     2.0 | Adadelta              | None                  |         0.00 |         |
| dropout_prob   | float  | 0.01      |     0.0 |   0.025 | 0.0012790

A histogram can be used to visualize the most important hyperparameters.


In [6]:
spot_tuner.plot_importance(threshold=1.0)

<Figure size 1650x1050 with 1 Axes>

In [7]:
spot_tuner.plot_important_hyperparameter_contour(max_imp=3)

l1:  0.045859478899300796
epochs:  0.001
batch_size:  0.001
act_fn:  0.001
optimizer:  0.001
dropout_prob:  100.0
lr_mult:  7.424561760184911
patience:  24.14524491106813
batch_norm:  67.40962429242266
initialization:  0.001


<Figure size 3600x1800 with 3 Axes>

<Figure size 3600x1800 with 3 Axes>

<Figure size 3600x1800 with 3 Axes>
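
The importance values printed above can be filtered and ranked by hand. The sketch below assumes that `threshold` keeps only hyperparameters whose importance exceeds it and that `max_imp` limits the ranking to the most important ones; both readings are assumptions about the plotting methods, not a description of their implementation. The values are rounded from the output above.

```python
# Importance values as printed by the contour-plot cell (rounded).
importance = {
    "l1": 0.0459,
    "epochs": 0.001,
    "batch_size": 0.001,
    "act_fn": 0.001,
    "optimizer": 0.001,
    "dropout_prob": 100.0,
    "lr_mult": 7.4246,
    "patience": 24.1452,
    "batch_norm": 67.4096,
    "initialization": 0.001,
}

threshold = 1.0  # assumed meaning: keep importance values above this
max_imp = 3      # assumed meaning: number of top hyperparameters to use

# Hyperparameters whose importance exceeds the threshold.
above_threshold = {k: v for k, v in importance.items() if v > threshold}

# The max_imp most important hyperparameters, highest first.
top = sorted(above_threshold, key=above_threshold.get, reverse=True)[:max_imp]

print(sorted(above_threshold))
print(top)
```

Three hyperparameters form three distinct pairs, which is consistent with the three contour figures shown above.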

### Get the Tuned Architecture {#sec-get-spot-results-608}


In [8]:
import pprint
from spotpython.hyperparameters.values import get_tuned_architecture
config = get_tuned_architecture(spot_tuner, fun_control)
pprint.pprint(config)

{'act_fn': Swish(),
 'batch_norm': True,
 'batch_size': 16,
 'dropout_prob': 0.0012790404219919403,
 'epochs': 128,
 'initialization': 'kaiming_uniform',
 'l1': 8,
 'lr_mult': 4.855811791679552,
 'optimizer': 'Adadelta',
 'patience': 4}
