# TensorBoard HyperParameters Logging

Source: https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams

In this notebook, we'll use TensorBoard to compare multiple training runs with different hyperparameter settings. Let's start by loading the TensorBoard extension in Jupyter and clearing the logs directory.

In [1]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

# Clear any logs from previous runs
!rm -rf ./tb_log/ 

Now let's import the required modules. We'll use TensorBoard's `hparams` plugin to explore hyperparameter settings. The plugin offers several other features that this example does not cover.

In [2]:
import tensorflow as tf
from tensorflow import keras
from tensorboard.plugins.hparams import api as hp

2021-12-22 12:24:21.419584: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-12-22 12:24:21.419611: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


##### Load the dataset

We'll again use Fashion MNIST for this example. Pixel values are scaled from the 0-255 range to [0, 1].

In [3]:
fashion_mnist = keras.datasets.fashion_mnist

(x_train, y_train),(x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


##### Build the model

We'll use a function to build a different version of the model based on some hyperparameters. In particular, we'll use only two hyperparameters here for simplicity:
- The number of units in the first fully-connected layer of our NN (`hparams['num_units']`)
- The dropout rate of that same layer (`hparams['dropout']`)
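
To get a feel for how `num_units` affects model size, the trainable parameter count can be worked out by hand. The sketch below assumes 28×28 input images and 10 output classes (as in Fashion MNIST) and the Flatten → Dense → Dropout → Dense architecture used in this notebook. `count_parameters` is a hypothetical helper written for illustration, not part of Keras. The dropout layer has no trainable weights, so the dropout rate does not change the count.

```python
def count_parameters(num_units, input_pixels=28 * 28, num_classes=10):
    """Trainable parameters of Flatten -> Dense(num_units) -> Dropout -> Dense(num_classes)."""
    # Hidden layer: one weight per (input pixel, unit) pair, plus one bias per unit.
    hidden = input_pixels * num_units + num_units
    # Output layer: one weight per (unit, class) pair, plus one bias per class.
    output = num_units * num_classes + num_classes
    return hidden + output


for num_units in (16, 64):
    print(num_units, count_parameters(num_units))
# 16 12730
# 64 50890
```

Going from 16 to 64 units roughly quadruples the number of weights, which is worth keeping in mind when comparing runs.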

In [4]:
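def try_hp_setting(hparams):
    """Build and compile a model for the given hyperparameter settings."""
    model = keras.models.Sequential([
        keras.layers.Flatten(),
        # Width of the hidden layer is the first hyperparameter
        keras.layers.Dense(hparams['num_units'], activation=tf.nn.relu),
        # Dropout rate applied to the hidden layer is the second hyperparameter
        keras.layers.Dropout(hparams['dropout']),
        keras.layers.Dense(10, activation=tf.nn.softmax),
    ])

    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model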

##### Train the model

We'll train the model once for each combination of the two hyperparameters. Each run is identified by an index (`session_num`), and its logs are saved in a separate sub-folder so that TensorBoard can tell the runs apart.

The `hparams` plugin provides a dedicated Keras callback (`hp.KerasCallback`), which logs the hyperparameter values passed to it for each run.
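
The run-to-folder mapping described above can be previewed without training anything. This is a minimal sketch in plain Python: `enumerate_runs` is a hypothetical helper, and the base directory and `run-<index>` naming are taken from the training loop in this notebook. It uses `itertools.product` instead of nested loops, which yields the same ordering.

```python
import itertools


def enumerate_runs(grid, base_dir="tb_log/hparam_tuning"):
    """Return a list of (log_dir, hparams) pairs, one per grid combination."""
    names = list(grid)
    runs = []
    # product() varies the last hyperparameter fastest, like an inner loop.
    combos = itertools.product(*(grid[name] for name in names))
    for session_num, values in enumerate(combos):
        hparams = dict(zip(names, values))
        log_dir = f"{base_dir}/run-{session_num}"
        runs.append((log_dir, hparams))
    return runs


grid = {"num_units": [16, 64], "dropout": [0.1, 0.5]}
for log_dir, hparams in enumerate_runs(grid):
    print(log_dir, hparams)
```

Listing the combinations up front makes it easy to check how many trials a grid will produce before committing to the training time.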

In [5]:
session_num = 0

In [6]:
# for num_units in [512, 1024]:
for num_units in [16, 64]:
    for dropout_rate in [0.1, 0.5]:
        hparams = {'num_units': num_units, 'dropout': dropout_rate}
        run_name = "run-%d" % session_num
        print('--- Starting trial: %s' % run_name)
        print(hparams)

        # Each run writes to its own sub-folder
        log_dir = 'tb_log/hparam_tuning/' + run_name
        # Logs the training and validation metrics of each epoch
        tb_callback = tf.keras.callbacks.TensorBoard(log_dir)
        # Logs the hyperparameter values used for this run
        hp_callback = hp.KerasCallback(log_dir, hparams)

        model = try_hp_setting(hparams)
        model.fit(x_train, y_train, validation_split=0.1, epochs=10, callbacks=[tb_callback, hp_callback])
        session_num += 1

--- Starting trial: run-0
{'num_units': 16, 'dropout': 0.1}


2021-12-22 12:24:23.611826: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-12-22 12:24:23.611854: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-12-22 12:24:23.611878: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (matteo-Inspiron-7591-2n1): /proc/driver/nvidia/version does not exist
2021-12-22 12:24:23.612087: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-22 12:24:23.844730: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 16

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
--- Starting trial: run-1
{'num_units': 16, 'dropout': 0.5}


2021-12-22 12:24:38.953444: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 169344000 exceeds 10% of free system memory.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
--- Starting trial: run-2
{'num_units': 64, 'dropout': 0.1}


2021-12-22 12:24:57.269231: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 169344000 exceeds 10% of free system memory.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
--- Starting trial: run-3
{'num_units': 64, 'dropout': 0.5}


2021-12-22 12:25:16.600845: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 169344000 exceeds 10% of free system memory.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


##### Start TensorBoard

Let's start TensorBoard and look at the results of this experiment. We can still inspect the scalar logs, as in the previous notebook. However, the `HPARAMS` tab is more useful here because it shows each run's hyperparameter values alongside its metrics.

In [7]:
%tensorboard --logdir './tb_log'