# **Tensorboard and other callbacks**

## **Callbacks**

A callback is an object that can perform actions at various stages of training (end of epoch, start of batch, etc.).

Callbacks can be used to:
- periodically save model checkpoints to disk
- stop training early (e.g. when the model stops improving)
- schedule the learning rate
- inspect internal states and statistics of a model during training
- write logs after every batch of training to monitor your metrics
- run custom logic through user-defined callbacks
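
To make the idea of "actions at various stages of training" concrete, here is a minimal sketch in plain Python. It only mimics the callback pattern: `ToyEarlyStopping` and `toy_fit` are hypothetical names, not Keras classes, and the validation losses are made up. In Keras the stop flag lives on the model (`model.stop_training`) rather than on the callback.

```python
class ToyEarlyStopping:
    """Stop when the monitored value has not improved for `patience` epochs."""

    def __init__(self, monitor="val_loss", patience=2):
        self.monitor = monitor
        self.patience = patience
        self.best = float("inf")
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, logs):
        current = logs[self.monitor]
        if current < self.best:
            # Improvement: remember it and reset the patience counter
            self.best = current
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stop_training = True


def toy_fit(val_losses, callbacks):
    """Stand-in for a training loop: one fake epoch per validation loss."""
    epochs_run = 0
    for epoch, val_loss in enumerate(val_losses):
        epochs_run += 1
        logs = {"val_loss": val_loss}
        # The loop calls every callback at the end of each epoch
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs)
        if any(cb.stop_training for cb in callbacks):
            break
    return epochs_run


stopper = ToyEarlyStopping(patience=2)
epochs_run = toy_fit([0.50, 0.40, 0.41, 0.42, 0.30], [stopper])
print(epochs_run, stopper.best)
```

The loop stops after the fourth epoch because the loss failed to improve twice in a row, so the better value in the fifth position is never reached. This is the usual trade-off when choosing `patience`.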

## **Tensorboard**
![](https://drive.google.com/uc?export=view&id=1h2G9AD5IHXMj08ey16L8AUKiagXhW6nH)

### **Tensorboard alternatives**

- Weights & Biases ([wandb](https://wandb.ai/site))
- Neptune AI ([neptune](https://neptune.ai/))

## **Library import**

In [None]:
import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Input, Dropout, Conv2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model
from tensorflow.config.experimental import list_physical_devices

In [None]:
gpus = list_physical_devices('GPU')
print(len(gpus), "Physical GPUs")

## **Data**

In [None]:
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

X_val, X_test, y_val, y_test = X_test[:9000], X_test[9000:], y_test[:9000], y_test[9000:]

# Image normalization
X_train = X_train / 255
X_val = X_val / 255
X_test = X_test / 255

plt.figure(figsize=(9,9))
x = 1
for i in range(5):
    for j in range(5):
        plt.subplot(5,5,x)
        plt.title(f"Label : {y_train[x]}")
        plt.imshow(X_train[x], cmap="gray");
        plt.axis("off")
        x += 1
        
X_train.shape

## **Compile & Fit Model**

### **No activation function**

In [None]:
model = Sequential()
model.add(Input((28,28)))
model.add(Flatten())
model.add(Dense(32, activation=None))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(learning_rate=0.0005),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

model.summary()
plot_model(model, show_shapes=True)

In [None]:
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    batch_size=32, epochs=10)

In [None]:
plt.figure(figsize=(11,4))

plt.subplot(121)
plt.plot(history.history['sparse_categorical_accuracy'])
plt.plot(history.history['val_sparse_categorical_accuracy'])
N = len(history.history['sparse_categorical_accuracy'])
xticks, labels = np.arange(N), np.arange(1, N+1)
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Accuracy")

plt.subplot(122)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.yscale("log")
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Loss function");

plt.subplots_adjust(wspace=0.3);

### **Sigmoid activation**

In [None]:
model = Sequential()
model.add(Input((28,28)))
model.add(Flatten())
model.add(Dense(32, activation='sigmoid'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(learning_rate=0.0005),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

In [None]:
history2 = model.fit(X_train, y_train, 
                    validation_data=(X_val, y_val),
                    batch_size=32, epochs=10)

In [None]:
plt.figure(figsize=(11,4))

plt.subplot(121)
plt.plot(history2.history['sparse_categorical_accuracy'])
plt.plot(history2.history['val_sparse_categorical_accuracy'])
N = len(history2.history['sparse_categorical_accuracy'])
xticks, labels = np.arange(N), np.arange(1, N+1)
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Accuracy")

plt.subplot(122)
plt.plot(history2.history['loss'])
plt.plot(history2.history['val_loss'])
plt.yscale("log")
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Loss function");

plt.subplots_adjust(wspace=0.3);

### **ReLU activation**

In [None]:
model = Sequential()
model.add(Input((28,28)))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(learning_rate=0.0005),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

In [None]:
history3 = model.fit(X_train, y_train, 
                    validation_data=(X_val, y_val),
                    batch_size=32, epochs=10)

In [None]:
plt.figure(figsize=(11,4))

plt.subplot(121)
plt.plot(history3.history['sparse_categorical_accuracy'])
plt.plot(history3.history['val_sparse_categorical_accuracy'])
N = len(history3.history['sparse_categorical_accuracy'])
xticks, labels = np.arange(N), np.arange(1, N+1)
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Accuracy")

plt.subplot(122)
plt.plot(history3.history['loss'])
plt.plot(history3.history['val_loss'])
plt.yscale("log")
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Loss function");

plt.subplots_adjust(wspace=0.3);

### **Custom activation function**

Mish - Misra 2019 (https://arxiv.org/abs/1908.08681)

Motivation: the activation function brings **non-linearity** into the network. **ReLU** is currently the most widely used activation function in hidden dense and convolutional layers, but it has caveats: its derivative is a step function, and a unit whose input stays negative outputs zero and receives zero gradient, which can cause the **dying ReLU problem**.
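
For reference, the functions discussed here can be written out explicitly. The Mish definition matches the code used in this section. The ReLU derivative is the standard one and is undefined at $x = 0$, where frameworks pick a value by convention.

$$
\begin{aligned}
\mathrm{ReLU}(x) &= \max(0, x), & \mathrm{ReLU}'(x) &= \begin{cases} 0, & x < 0 \\ 1, & x > 0 \end{cases} \\
\mathrm{softplus}(x) &= \ln\left(1 + e^{x}\right), & \mathrm{Mish}(x) &= x \tanh\big(\mathrm{softplus}(x)\big)
\end{aligned}
$$

Unlike ReLU, Mish is smooth and takes small negative values for negative inputs, so its gradient is not identically zero there.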

In [None]:
from tensorflow.keras.layers import Activation
from tensorflow.keras.utils import get_custom_objects

# class Mish(Activation):
#     def __init__(self, **kwargs):
#         super().__init__(self.mish, **kwargs)
#         self.__name__ = 'Mish'

#     def mish(self, inputs):
#         return inputs * tf.math.tanh(tf.math.softplus(inputs))

# get_custom_objects().update({'Mish': Mish})

def mish(inputs):
    return inputs * tf.math.tanh(tf.math.softplus(inputs))

get_custom_objects().update({'Mish': mish})

In [None]:
x = np.linspace(-5, 5, 1000)

plt.plot(x, x, label='Identity')

y_sigmoid = 1 / (1 + np.exp(-x))
plt.plot(x, y_sigmoid, label='Sigmoid')

y_relu = np.maximum(0, x)
plt.plot(x, y_relu, label='ReLU')

y_mish = x * np.tanh(np.log(1 + np.exp(x)))
plt.plot(x, y_mish, label='Mish')

plt.legend()
plt.xlabel('x')
plt.ylabel('Activation')
plt.title('Activation Functions')
plt.grid()
plt.show()
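
The plotting cell computes softplus as `np.log(1 + np.exp(x))`, which is fine on the interval [-5, 5] but can emit overflow warnings for large inputs. A possible alternative, shown here as a sketch in NumPy only, is `np.logaddexp(0, x)`, which evaluates the same quantity without forming `exp(x)` directly. The helper names `softplus` and `mish_np` are illustrative.

```python
import numpy as np


def softplus(x):
    # log(1 + exp(x)) written as log(exp(0) + exp(x)), evaluated in a stable way
    return np.logaddexp(0.0, x)


def mish_np(x):
    # Same formula as the Keras activation above, on NumPy arrays
    return x * np.tanh(softplus(x))


x_check = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
y_check = mish_np(x_check)
print(y_check)
```

For large positive inputs Mish approaches the identity, and for large negative inputs it approaches zero, which the extreme values in `x_check` illustrate.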

In [None]:
model = Sequential()
model.add(Input((28,28)))
model.add(Flatten())
model.add(Dense(32, activation='Mish'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(learning_rate=0.0005),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

In [None]:
history4 = model.fit(X_train, y_train, 
                    validation_data=(X_val, y_val),
                    batch_size=32, epochs=10)

In [None]:
plt.figure(figsize=(11,4))

plt.subplot(121)
plt.plot(history.history['val_sparse_categorical_accuracy'], label="none")
plt.plot(history2.history['val_sparse_categorical_accuracy'], label="Sigmoid")
plt.plot(history3.history['val_sparse_categorical_accuracy'], label="ReLU")
plt.plot(history4.history['val_sparse_categorical_accuracy'], label="Mish")
xticks, labels = np.arange(N), np.arange(1, N+1)
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()

plt.subplot(122)
plt.plot(history.history['val_loss'])
plt.plot(history2.history['val_loss'])
plt.plot(history3.history['val_loss'])
plt.plot(history4.history['val_loss'])
plt.yscale("log")
plt.xticks(xticks, labels)
plt.xlabel("Epoch")
plt.ylabel("Loss function");

plt.subplots_adjust(wspace=0.3);

## **Tensorboard**

https://www.tensorflow.org/tensorboard \
https://keras.io/api/callbacks/tensorboard/

Inside Python code:
```python
callbacks = [TensorBoard(log_dir="logs/name", update_freq="epoch")]

model.fit(X_train, y_train,
          ...
          callbacks=callbacks)
```

Command line:
```
tensorboard --logdir "logs/"
```


Tensorboard inline:
```
%load_ext tensorboard
%tensorboard --logdir logs/
```
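
TensorBoard treats each subdirectory of `--logdir` as a separate run. Re-running a training cell with the same `log_dir` adds new event files to the existing run, which can make the curves hard to read. One common way around this is a timestamped subdirectory per run. The helper `make_logdir` below is a hypothetical name used for illustration only.

```python
from datetime import datetime
from pathlib import Path


def make_logdir(run_name, root="logs"):
    """Return a unique log directory such as logs/<run_name>/<YYYYmmdd-HHMMSS>."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # as_posix() keeps forward slashes on every platform
    return (Path(root) / run_name / stamp).as_posix()


run_logdir = make_logdir("dense_32_relu")
print(run_logdir)
```

The returned string can be passed as `log_dir` to the `TensorBoard` callback. Pointing `tensorboard --logdir logs/` at the root then shows every run side by side.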

In [None]:
from tensorflow.keras.callbacks import TensorBoard

In [None]:
for activation in [None, "sigmoid", "relu", "Mish"]:
    logdir = f"logs/dense_32_{activation}"
    callbacks = [TensorBoard(log_dir=logdir, update_freq="epoch")]
    
    model = Sequential()
    model.add(Input((28,28)))
    model.add(Flatten())
    model.add(Dense(32, activation=activation))
    model.add(Dense(10, activation='softmax'))
    
    model.compile(optimizer=Adam(learning_rate=0.0005),
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])
    
    model.fit(X_train, y_train,
              validation_data=(X_val, y_val),
              batch_size=32, epochs=10,
              callbacks=callbacks)
    
    _, score = model.evaluate(X_test, y_test)
    
    with tf.summary.create_file_writer(logdir+"/test").as_default():
        tf.summary.scalar("epoch_sparse_categorical_accuracy", score, step=10)

## **Images**

https://www.tensorflow.org/tensorboard/image_summaries

In [None]:
import io

def plot(img, true, pred):
    fig = plt.figure(figsize=(5,5))
    plt.title(f"true:{true} pred:{pred}")
    plt.axis("off")
    plt.imshow(img, cmap=plt.cm.binary)
    return fig

def plot_to_image(figure):
    buf = io.BytesIO()
    # Save the figure that was passed in, not whichever figure happens to be current
    figure.savefig(buf, format='png')
    plt.close(figure)
    buf.seek(0)
    image = tf.image.decode_png(buf.getvalue(), channels=4)
    image = tf.expand_dims(image, 0)
    return image

In [None]:
# `model` is the last network trained in the loop above (Mish), so log under that run
logdir = "logs/dense_32_Mish/validation"
with tf.summary.create_file_writer(logdir).as_default():
    for i in range(10):
        img = X_test[i]
        # Add a batch dimension only; the model input is (28, 28), without a channel axis
        y_pred = model.predict(img.reshape(1, 28, 28))
        y_pred = np.argmax(y_pred)
        
        tf.summary.image("10 testing data examples", 
                         plot_to_image(plot(img, y_test[i], y_pred)), max_outputs=25, step=i)

### **Only wrong images**

In [None]:
y_pred = model.predict(X_test)
y_pred = np.argmax(y_pred, axis=1)
wrong = y_pred != y_test

with tf.summary.create_file_writer(logdir).as_default():
    for i in range(10):
        img = X_test[wrong][i]
        true = y_test[wrong][i]
        pred = y_pred[wrong][i]
        tf.summary.image("10 mistakenly classified images", 
                         plot_to_image(plot(img, true, pred)), max_outputs=25, step=i)

### **Confusion matrix**

In [None]:
from matplotlib.colors import LogNorm
from sklearn.metrics import confusion_matrix

def plot_confusion_matrix(y_pred, y_test):
    fig = plt.figure(figsize=(6,6))
    cm = confusion_matrix(y_test, y_pred)
    plt.imshow(cm+1, norm=LogNorm())

    for i in range(10):
        for j in range(10):
            plt.text(j,i,cm[i,j], ha="center", va="center")

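    # confusion_matrix(y_true, y_pred) puts true labels on rows (y axis)
    # and predicted labels on columns (x axis)
    plt.xlabel("Predicted label")
    plt.ylabel("True label")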
    plt.xticks(np.arange(10))
    plt.yticks(np.arange(10))

    return fig

In [None]:
y_pred = model.predict(X_val)
y_pred = np.argmax(y_pred, axis=1)

plot_confusion_matrix(y_pred, y_val);

In [None]:
with tf.summary.create_file_writer(logdir).as_default():
    tf.summary.image("Confusion matrix", plot_to_image(plot_confusion_matrix(y_pred, y_val)), step=0)
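
The orientation of a confusion matrix is easy to mix up, so here is a small sketch in plain Python that builds one by hand. It follows the scikit-learn convention (rows are true classes, columns are predicted classes). The helper names and the toy labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Count (true, predicted) pairs: rows index the true class, columns the prediction."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_recall(cm):
    """Fraction of each true class predicted correctly (None if the class is absent)."""
    return [row[i] / sum(row) if sum(row) else None for i, row in enumerate(cm)]


toy_true = [0, 0, 1, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 2, 2]
toy_cm = confusion_counts(toy_true, toy_pred, n_classes=3)
toy_recall = per_class_recall(toy_cm)

for row in toy_cm:
    print(row)
print(toy_recall)
```

Reading along a row shows where samples of one true class ended up. The diagonal entry divided by the row sum is the recall of that class.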

## **LambdaCallback**

In [None]:
from tensorflow.keras.callbacks import LambdaCallback

In [None]:
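# tf.summary.scalar only records data while a default writer is active,
# so create one and activate it inside the callback function
batch_writer = tf.summary.create_file_writer("logs/write_per_batch/custom")

def batchOutput(batch, logs):
    with batch_writer.as_default():
        tf.summary.scalar('batch_accuracy', data=logs['sparse_categorical_accuracy'], step=batch)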

callbacks = [TensorBoard(log_dir="logs/write_per_batch", update_freq="batch"),
             LambdaCallback(on_batch_end=batchOutput,)]

model = Sequential()
model.add(Input((28,28)))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          batch_size=32, epochs=1,
          callbacks=callbacks)

## **Reduce learning rate**

https://keras.io/api/callbacks/reduce_lr_on_plateau/

https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/ReduceLROnPlateau

In [None]:
from tensorflow.keras.callbacks import ReduceLROnPlateau

In [None]:
callbacks = [TensorBoard(log_dir="logs/dense_128_relu", update_freq="epoch"),
             ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=2,
                               min_lr=0.0, verbose=1)]

model = Sequential()
model.add(Input((28,28)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=Adam(learning_rate=0.0005),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])

history = model.fit(X_train, y_train, 
                    validation_data=(X_val, y_val),
                    batch_size=32, epochs=20,
                    callbacks=callbacks)
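
The rule behind `ReduceLROnPlateau` can be sketched in plain Python to see how `factor`, `patience` and `min_lr` interact. This is a simplified simulation under stated assumptions, not the Keras implementation: it ignores the `min_delta`, `cooldown` and `mode` options, and the validation losses are made up.

```python
def simulate_reduce_on_plateau(val_losses, lr, factor=0.2, patience=2, min_lr=0.0):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    lr_history = []
    for loss in val_losses:
        if loss < best:
            # The monitored value improved: reset the patience counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: shrink the rate, but never below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        lr_history.append(lr)
    return lr_history


fake_val_losses = [1.00, 0.90, 0.95, 0.96, 0.97, 0.98]
lr_history = simulate_reduce_on_plateau(fake_val_losses, lr=0.0005)
print(lr_history)
```

With `factor=0.2` and `patience=2`, two epochs without improvement cut the rate to a fifth. A second plateau cuts it again, so the rate drops quickly once validation loss stalls.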

## **Tuning HyperParams**

https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams

In [None]:
from tensorboard.plugins.hparams import api as hp

neurons = hp.HParam('neurons', hp.Discrete([32, 64, 128]))
activation = hp.HParam('activation', hp.Discrete(["relu", "Mish"]))
learning_rate = hp.HParam('learning_rate', hp.Discrete([1e-3, 1e-4]))

In [None]:
def get_model(neurons, activation, learning_rate):
    model = Sequential()
    model.add(Input((28,28)))
    model.add(Flatten())
    model.add(Dense(neurons, activation=activation))
    model.add(Dense(10, activation='softmax'))
    
    model.compile(optimizer=Adam(learning_rate=learning_rate),
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])
    
    return model

def train_test_model(hparams):
    model = get_model(hparams[neurons], hparams[activation], hparams[learning_rate])

    name = "hp_{0}_{1}_{2}".format(hparams[neurons], hparams[activation], hparams[learning_rate])
    logdir = "logs/hparam_tuning/" + name

    callbacks = [TensorBoard(log_dir=logdir, update_freq="epoch"),
                 hp.KerasCallback(logdir, hparams)]
    
    model.fit(X_train, y_train,
              validation_data=(X_val, y_val),
              batch_size=64, epochs=10, 
              callbacks=callbacks)

    _, accuracy = model.evaluate(X_test, y_test)
    
    with tf.summary.create_file_writer(logdir).as_default():
        tf.summary.scalar("epoch_sparse_categorical_accuracy", accuracy, step=10)

In [None]:
for N in neurons.domain.values:
    for A in activation.domain.values:
        for LR in learning_rate.domain.values:
            hparams = {neurons : N,
                       activation : A,
                       learning_rate : LR}

            print({h.name: hparams[h] for h in hparams})
            train_test_model(hparams)
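
The three nested loops enumerate every combination of the hyperparameter values. The same grid can be produced with `itertools.product`, which scales more gracefully when further hyperparameters are added. The sketch below uses plain lists as stand-ins for the `hp.Discrete` domains, and the variable names are illustrative.

```python
from itertools import product

# Plain lists standing in for the hp.Discrete domains (assumption for illustration)
neuron_values = [32, 64, 128]
activation_values = ["relu", "Mish"]
learning_rate_values = [1e-3, 1e-4]

grid = [
    {"neurons": n, "activation": a, "learning_rate": lr}
    for n, a, lr in product(neuron_values, activation_values, learning_rate_values)
]

print(len(grid))  # 3 * 2 * 2 combinations
print(grid[0])
```

The number of runs is the product of the domain sizes, so a full grid search becomes expensive quickly as values or hyperparameters are added.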

### **Keras tuner**

https://keras.io/keras_tuner/

In [None]:
#!pip install keras_tuner

In [None]:
import keras_tuner as kt

In [None]:
def build_model(hp):
    neurons = hp.Int('neurons', min_value=64, max_value=256, sampling="linear")
    lr = hp.Float('learning_rate', min_value=1e-4, max_value=1e-3, sampling="log")
    activation = hp.Choice('activation', ["relu", "Mish"])
    
    model = Sequential()
    model.add(Input((28,28)))
    model.add(Flatten())
    model.add(Dense(neurons, activation=activation))
    model.add(Dense(10, activation='softmax'))
    
    model.compile(optimizer=Adam(learning_rate=lr),
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])

    return model

In [None]:
logdir = "logs/keras_tuner"
callbacks = [TensorBoard(log_dir=logdir, update_freq="epoch")]

tuner = kt.BayesianOptimization(build_model, 
                                objective=kt.Objective("val_sparse_categorical_accuracy", direction="max"), 
                                max_trials=12,
                                directory=logdir,
                                overwrite=True)

tuner.search(X_train, y_train, 
             validation_data=(X_val, y_val),
             batch_size=64, epochs=10,
             callbacks=callbacks)

In [None]:
best_model = tuner.get_best_models()[0]

tuner.get_best_hyperparameters()[0].values

In [None]:
best_model.summary()

## **Summary**

- Custom activation function
- Keras callbacks
- Tensorboard
- Custom callbacks
- Tune hyperparameters
- Autotuning

## **Assignments**

1) Try a convolutional neural network and tune its hyperparameters
```python
model = Sequential()
model.add(Input((28,28,1)))
model.add(Conv2D(32, (2,2), padding="same", activation='relu'))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
```

2) Try to implement a learning rate scheduler 
- https://keras.io/api/callbacks/learning_rate_scheduler/
- try: linear decay, exponential decay
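
As a possible starting point for the second assignment, the two decay rules can be written as plain Python functions. The Keras `LearningRateScheduler` callback expects a function that takes the epoch index (starting at 0) and the current learning rate and returns the new rate. The factory names, rates and decay constant below are assumptions chosen for illustration.

```python
import math


def make_linear_decay(initial_lr, final_lr, total_epochs):
    """Schedule that moves linearly from initial_lr to final_lr over total_epochs."""
    def schedule(epoch, lr=None):
        # `lr` (the current rate) is accepted for API compatibility but not used
        progress = min(epoch / max(total_epochs - 1, 1), 1.0)
        return initial_lr + progress * (final_lr - initial_lr)
    return schedule


def make_exponential_decay(initial_lr, decay_rate):
    """Schedule that multiplies the rate by exp(-decay_rate) every epoch."""
    def schedule(epoch, lr=None):
        return initial_lr * math.exp(-decay_rate * epoch)
    return schedule


linear = make_linear_decay(initial_lr=5e-4, final_lr=5e-5, total_epochs=10)
exponential = make_exponential_decay(initial_lr=5e-4, decay_rate=0.1)

for epoch in range(10):
    print(epoch, f"{linear(epoch):.2e}", f"{exponential(epoch):.2e}")
```

Either function could then be passed to the callback, for example `LearningRateScheduler(linear)`, and added to the `callbacks` list given to `model.fit`.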