# TensorBoard Scalar Logging

Source: https://www.tensorflow.org/tensorboard/get_started


In this notebook, we'll see how to use TensorBoard to examine simple scalar metrics from different training runs. Let us start by importing the required modules and loading the TensorBoard extension for Jupyter.

In [1]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

import tensorflow as tf
from tensorflow import keras

tb_run = 0



Let us download the MNIST handwritten digit dataset, scale the pixel values to the range [0, 1], and build a simple neural network with the Keras Sequential API.

In [2]:
(x_train, y_train),(x_test, y_test) = keras.datasets.mnist.load_data()

x_train, x_test = x_train / 255.0, x_test / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [3]:
def get_model():
    # Flatten each 28x28 image, then apply one hidden layer and a 10-class softmax output.
    model = keras.models.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(512, activation='relu', name='DenseFirst'),
        keras.layers.Dense(10, activation='softmax', name='DenseSecond')
    ])
    # Integer labels (0-9) pair with sparse categorical cross-entropy.
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    return model
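
As a quick sanity check on the model's size, the sketch below counts the trainable parameters by hand. It assumes each `Dense` layer holds one weight per input-output pair plus one bias per output unit, which is the Keras default since the cell above does not disable biases. The helper name `dense_params` is ours for illustration, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return n_inputs * n_units + n_units

flat_size = 28 * 28                    # Flatten turns each 28x28 image into 784 values
first = dense_params(flat_size, 512)   # DenseFirst
second = dense_params(512, 10)         # DenseSecond
total = first + second
print(first, second, total)
```

The total should agree with the parameter count reported by `model.summary()`.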

##### The TensorBoard callback

Let us use the TensorBoard callback to log training information. We keep a run counter, `tb_run`, so that each subsequent training run is saved to its own log directory and can be visualized separately in TensorBoard.

In [4]:
model = get_model()
tb_callback = keras.callbacks.TensorBoard(log_dir='./tb_log/run_{}'.format(tb_run), histogram_freq=1)

model.fit(x=x_train, y=y_train, epochs=10, validation_data=(x_test, y_test), callbacks=[tb_callback])

tb_run += 1

Epoch 1/10
Epoch 2/10
  88/1875 [>.............................] - ETA: 3s - loss: 0.0953 - accuracy: 0.9691

Epoch 3/10
  86/1875 [>.............................] - ETA: 3s - loss: 0.0444 - accuracy: 0.9847

Epoch 4/10
  79/1875 [>.............................] - ETA: 3s - loss: 0.0324 - accuracy: 0.9889

Epoch 5/10
  43/1875 [..............................] - ETA: 4s - loss: 0.0179 - accuracy: 0.9942

Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
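
Because `tb_run` is reset to 0 whenever the first cell is re-run (for example after a kernel restart), a new run would write into `run_0` again, alongside the old event files. One possible way around this, sketched here as an assumption rather than as part of the original notebook, is to derive the counter from the `run_N` directories that already exist. The helper `next_run_index` is a hypothetical name.

```python
import os
import re

def next_run_index(base_dir):
    """Return one more than the highest N among existing 'run_N' sub-directories."""
    if not os.path.isdir(base_dir):
        return 0
    pattern = re.compile(r"^run_(\d+)$")
    indices = []
    for name in os.listdir(base_dir):
        match = pattern.match(name)
        # Only count real directories whose name is exactly 'run_<number>'.
        if match and os.path.isdir(os.path.join(base_dir, name)):
            indices.append(int(match.group(1)))
    return max(indices) + 1 if indices else 0

# Could replace the hard-coded `tb_run = 0` in the first cell.
tb_run = next_run_index('./tb_log')
```

With this in place, each training run keeps its own directory even across notebook sessions.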


##### Start TensorBoard

Start TensorBoard directly in the notebook with the `%tensorboard` magic. To launch it from the command line instead, remove the leading `%` from the following command.

In [5]:
%tensorboard --logdir tb_log
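
TensorBoard treats every sub-directory of `--logdir` that contains event files as a separate run. To check which runs it should pick up, the following minimal sketch walks the log directory and lists those sub-directories. It assumes event file names contain the string `tfevents`, and `list_runs` is a hypothetical helper, not part of TensorBoard.

```python
import os

def list_runs(logdir):
    """Return sorted run names: sub-directories of logdir that contain event files."""
    runs = []
    for root, _dirs, files in os.walk(logdir):
        # Assumption: event files are recognizable by 'tfevents' in their name.
        if any("tfevents" in name for name in files):
            runs.append(os.path.relpath(root, logdir))
    return sorted(runs)

print(list_runs("tb_log"))
```

For a run logged with the Keras callback, the list typically contains entries such as `run_0/train` and `run_0/validation`.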

##### Custom Logging

Let's try logging custom scalars from a custom learning-rate (LR) scheduler. We create a file writer with `tf.summary.create_file_writer` and use it inside the LR scheduling function. The writer saves to a separate `metrics` sub-directory of the run's log directory.

In [6]:

train_writer = tf.summary.create_file_writer('./tb_log/run_{}/metrics'.format(tb_run))

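def my_lr_schedule(epoch, lr):
    # Called by LearningRateScheduler at the start of each epoch with the current LR.
    lr = lr * 0.8  # decay the learning rate by 20% per epoch
    print('My Schedule:', lr, epoch)
    # Log the new LR as a scalar, using the epoch index as the step.
    with train_writer.as_default():
        tf.summary.scalar('learning_rate', data=lr, step=epoch)
    return lr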

In [7]:
model = get_model()
tb_callback = keras.callbacks.TensorBoard(log_dir='./tb_log/run_{}'.format(tb_run), histogram_freq=1)
lr_callback = keras.callbacks.LearningRateScheduler(my_lr_schedule)


model.fit(x=x_train, y=y_train, epochs=10, validation_data=(x_test, y_test), callbacks=[tb_callback, lr_callback])
tb_run += 1

My Schedule: 0.000800000037997961 0
Epoch 1/10
My Schedule: 0.0006400000303983689 1
Epoch 2/10
My Schedule: 0.0005120000336319208 2
Epoch 3/10
My Schedule: 0.00040960004553198815 3
Epoch 4/10
My Schedule: 0.00032768002711236477 4
Epoch 5/10
My Schedule: 0.0002621440216898918 5
Epoch 6/10
My Schedule: 0.00020971521735191345 6
Epoch 7/10
My Schedule: 0.00016777217388153076 7
Epoch 8/10
My Schedule: 0.00013421773910522462 8
Epoch 9/10
My Schedule: 0.00010737419361248613 9
Epoch 10/10
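
The learning rates printed above follow a geometric decay. Assuming the model starts from Adam's default learning rate of 0.001 (the `'adam'` string passed to `model.compile` uses the optimizer's defaults), the rate at epoch n is 0.001 × 0.8^(n+1). A minimal sketch that reproduces the sequence without TensorFlow follows; the small differences from the printed values presumably come from the learning rate being stored as a 32-bit float.

```python
initial_lr = 0.001   # assumed: Keras default learning rate for Adam
decay = 0.8          # factor applied in my_lr_schedule

def expected_lr(epoch):
    """Learning rate after the schedule has been applied for epochs 0..epoch."""
    return initial_lr * decay ** (epoch + 1)

schedule = [expected_lr(epoch) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(epoch, lr)
```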


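Now scroll back up to the TensorBoard window to see the new logs. The `learning_rate` scalar is written under `run_1/metrics`; use TensorBoard's refresh button if the new run has not appeared yet.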