Callbacks can be invoked during training, evaluation, or inference. We can pass a callback to `fit()`, `evaluate()`, or `predict()` through the `callbacks` argument.

https://github.com/stephenjohnmoore/Coursera-Deep-Learning/blob/master/Custom%20Models%2C%20Layers%2C%20and%20Loss%20Functions%20with%20TensorFlow/Week%205%20-%20Bonus%20Content%20-%20Callbacks/C1_W5_Lab_1_exploring-callbacks.ipynb

In [30]:
from __future__ import absolute_import, division, print_function, unicode_literals

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass

import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
import io
from PIL import Image

from tensorflow.keras.callbacks import TensorBoard, EarlyStopping, LearningRateScheduler, ModelCheckpoint, CSVLogger, ReduceLROnPlateau
%load_ext tensorboard

from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model
import os
import numpy as np
import math
import datetime
import pandas as pd

print("Version: ", tf.__version__)
tf.get_logger().setLevel('INFO')

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard
Version:  2.6.0


In [2]:
# Download and prepare the horses or humans dataset

splits, info = tfds.load('horses_or_humans', as_supervised=True, with_info=True, split=['train[:80%]', 'train[80%:]', 'test'])

(train_examples, validation_examples, test_examples) = splits
# each example is an (image tensor, label tensor) tuple

num_examples = info.splits['train'].num_examples
num_classes = info.features['label'].num_classes

Downloading and preparing dataset 153.59 MiB (download: 153.59 MiB, generated: Unknown size, total: 153.59 MiB) to /home/mo/tensorflow_datasets/horses_or_humans/3.0.0...


Dataset horses_or_humans downloaded and prepared to /home/mo/tensorflow_datasets/horses_or_humans/3.0.0. Subsequent calls will reuse this data.


In [28]:
for img, label in train_examples.take(1):
    print("original image shape",img.shape)
    print("image label", label)

original image shape (300, 300, 3)
image label tf.Tensor(0, shape=(), dtype=int64)


In [29]:
SIZE = 150 #@param {type:"slider", min:64, max:300, step:1}
IMAGE_SIZE = (SIZE, SIZE)

def format_image(image, label):
    # Resize to IMAGE_SIZE and scale pixel values from [0, 255] to [0, 1]
    image = tf.image.resize(image, IMAGE_SIZE) / 255.0
    return image, label

In [5]:
BATCH_SIZE = 32 #@param {type:"integer"}

In [32]:
shuffle_buffer_size = num_examples // 4
train_batches = train_examples.shuffle(shuffle_buffer_size)
train_batches = train_batches.map(format_image).batch(BATCH_SIZE).prefetch(1)

validation_batches = validation_examples.map(format_image).batch(BATCH_SIZE).prefetch(1)

test_batches = test_examples.map(format_image).batch(1)
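
The number of batches per epoch follows from the split size and `BATCH_SIZE`. As a rough check in plain Python, assuming the 80% slice of the 1,027 training images holds 821 or 822 examples (the exact rounding is up to `tfds`), the arithmetic below gives the `26/26` step count that appears in the training logs later on. The helper name `steps_per_epoch` is made up for this illustration.

```python
import math

NUM_TRAIN_TOTAL = 1027   # size of the full 'train' split, as reported by tfds above
BATCH_SIZE = 32

def steps_per_epoch(num_examples, batch_size):
    """Number of batches when the final partial batch is kept (the batch() default)."""
    return math.ceil(num_examples / batch_size)

# Shuffle buffer used in the pipeline: a quarter of the full training split.
shuffle_buffer_size = NUM_TRAIN_TOTAL // 4

# Assumption: 'train[:80%]' yields 821 or 822 examples depending on rounding.
candidate_sizes = [math.floor(NUM_TRAIN_TOTAL * 0.8), round(NUM_TRAIN_TOTAL * 0.8)]
steps = [steps_per_epoch(n, BATCH_SIZE) for n in candidate_sizes]

print(shuffle_buffer_size, candidate_sizes, steps)
```

Either rounding leads to the same step count, so the check does not depend on which one `tfds` uses.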

In [24]:
for image_batch, label_batch in train_batches.take(1):
    print("images batch shape", image_batch.shape)
    print("labels batch:", label_batch)

image_batch.shape

images batch shape (32, 150, 150, 3)
labels batch: tf.Tensor([1 0 0 1 1 1 0 1 1 0 1 1 1 0 0 0 1 0 1 1 1 1 0 0 1 1 1 0 1 1 0 0], shape=(32,), dtype=int64)


TensorShape([32, 150, 150, 3])

In [34]:
def build_model(dense_units, input_shape=IMAGE_SIZE + (3,)):
    inputs = tf.keras.layers.Input(shape = input_shape, name="input_layer")
    
    x = Conv2D(16, (3,3), activation='relu', name="first_conv")(inputs)
    x = MaxPooling2D(2,2, name='first_maxpool')(x)
    
    x = Conv2D(32, (3,3), activation='relu', name='second_conv')(x)
    x = MaxPooling2D(2,2, name='second_maxpool')(x)
    
    x = Conv2D(64, (3,3), activation='relu', name='third_conv')(x)
    x = MaxPooling2D(2,2, name='third_maxpool')(x)
    
    x = Flatten()(x)
    x = Dense(dense_units, activation='relu')(x)
    outputs = Dense(2, activation='softmax', name='classifier_dense')(x)
    
    return Model(inputs, outputs)    
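
To see what reaches the `Flatten` layer, the spatial size can be traced by hand. The sketch below assumes the Keras defaults used in `build_model` (3x3 convolutions with `'valid'` padding and stride 1, 2x2 max pooling with stride 2); the helper names `conv_valid` and `max_pool` are illustrative, not part of any library.

```python
def conv_valid(size, kernel=3):
    """Output width of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output width of non-overlapping max pooling (stride equal to pool size)."""
    return size // pool

size = 150                       # SIZE chosen above
channels = 3                     # RGB input
for filters in (16, 32, 64):     # the three Conv2D layers in build_model
    size = max_pool(conv_valid(size))
    channels = filters

flattened = size * size * channels
dense_params = flattened * 256 + 256   # weights + biases for dense_units=256

print(size, channels, flattened, dense_params)
```

Under these assumptions most of the model's parameters sit in the first `Dense` layer, which is worth keeping in mind when changing `SIZE` or `dense_units`.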

In [35]:
# First callback: TensorBoard
!rm -rf logs

model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])

logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = TensorBoard(logdir)

model.fit(train_batches, epochs=10, validation_data=validation_batches,
          callbacks=[tensorboard_callback])


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f4ac877ea20>

In [37]:
%tensorboard --logdir logs

Reusing TensorBoard on port 6006 (pid 6849), started 0:01:15 ago. (Use '!kill 6849' to kill it.)

In [38]:
## ModelCheckpoint: saves the full model after each epoch by default; pass save_weights_only=True to save only the weights

model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
model.fit(train_batches, 
          epochs=5, 
          validation_data=validation_batches, 
          verbose=2,
          callbacks=[ModelCheckpoint('weights.{epoch:02d}-{val_loss:.2f}.h5', verbose=1),
          ])



Epoch 1/5
26/26 - 1s - loss: 0.6776 - accuracy: 0.5316 - val_loss: 0.6663 - val_accuracy: 0.8195

Epoch 00001: saving model to weights.01-0.67.h5
Epoch 2/5
26/26 - 0s - loss: 0.6544 - accuracy: 0.5998 - val_loss: 0.6415 - val_accuracy: 0.7463

Epoch 00002: saving model to weights.02-0.64.h5
Epoch 3/5
26/26 - 0s - loss: 0.6232 - accuracy: 0.6959 - val_loss: 0.6262 - val_accuracy: 0.5610

Epoch 00003: saving model to weights.03-0.63.h5
Epoch 4/5
26/26 - 0s - loss: 0.5848 - accuracy: 0.7153 - val_loss: 0.5445 - val_accuracy: 0.7756

Epoch 00004: saving model to weights.04-0.54.h5
Epoch 5/5
26/26 - 0s - loss: 0.5552 - accuracy: 0.7421 - val_loss: 0.5093 - val_accuracy: 0.8049

Epoch 00005: saving model to weights.05-0.51.h5


<keras.callbacks.History at 0x7f4ca9923f98>
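
The placeholders in the checkpoint path are filled in with ordinary Python string formatting, using the 1-based epoch number and the logged metrics. A minimal sketch in plain Python (no Keras involved), with the values copied from the log above:

```python
filepath_template = 'weights.{epoch:02d}-{val_loss:.2f}.h5'

# (epoch, val_loss) pairs copied from the training log above
logged = [(1, 0.6663), (2, 0.6415), (3, 0.6262), (4, 0.5445), (5, 0.5093)]

# Same formatting rule as the template passed to ModelCheckpoint
filenames = [filepath_template.format(epoch=e, val_loss=v) for e, v in logged]
for name in filenames:
    print(name)
```

Because the metric value is part of the name, each epoch writes a separate file; a fixed name such as `model.h5` is overwritten every epoch instead.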

In [39]:
model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
model.fit(train_batches, 
          epochs=1, 
          validation_data=validation_batches, 
          verbose=2,
          callbacks=[ModelCheckpoint('saved_model', verbose=1)
          ])

26/26 - 1s - loss: 0.6715 - accuracy: 0.5827 - val_loss: 0.6672 - val_accuracy: 0.5317

Epoch 00001: saving model to saved_model
INFO:tensorflow:Assets written to: saved_model/assets




<keras.callbacks.History at 0x7f4cb8be1fd0>

In [40]:
model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
model.fit(train_batches, 
          epochs=2, 
          validation_data=validation_batches, 
          verbose=2,
          callbacks=[ModelCheckpoint('model.h5', verbose=1)
          ])

Epoch 1/2
26/26 - 1s - loss: 0.6604 - accuracy: 0.5791 - val_loss: 0.6327 - val_accuracy: 0.6829

Epoch 00001: saving model to model.h5
Epoch 2/2
26/26 - 0s - loss: 0.6038 - accuracy: 0.7105 - val_loss: 0.6136 - val_accuracy: 0.6000

Epoch 00002: saving model to model.h5


<keras.callbacks.History at 0x7f4cb883cd30>

In [41]:
#### EarlyStopping callback: stop training when the monitored metric stops improving



model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
model.fit(train_batches, 
          epochs=50, 
          validation_data=validation_batches, 
          verbose=2,
          callbacks=[EarlyStopping(
              patience=3,
              min_delta=0.05,
              baseline=0.8,
              mode='min',
              monitor='val_loss',
              restore_best_weights=True,
              verbose=1)
          ])



Epoch 1/50
26/26 - 1s - loss: 0.6695 - accuracy: 0.5815 - val_loss: 0.6559 - val_accuracy: 0.6439
Epoch 2/50
26/26 - 0s - loss: 0.6137 - accuracy: 0.7092 - val_loss: 0.5824 - val_accuracy: 0.7463
Epoch 3/50
26/26 - 0s - loss: 0.5445 - accuracy: 0.7445 - val_loss: 0.5119 - val_accuracy: 0.7659
Epoch 4/50
26/26 - 0s - loss: 0.4768 - accuracy: 0.7786 - val_loss: 0.5110 - val_accuracy: 0.7805
Epoch 5/50
26/26 - 0s - loss: 0.4276 - accuracy: 0.8236 - val_loss: 0.5087 - val_accuracy: 0.7561
Epoch 6/50
26/26 - 0s - loss: 0.3780 - accuracy: 0.8455 - val_loss: 0.3351 - val_accuracy: 0.8488
Epoch 7/50
26/26 - 0s - loss: 0.3178 - accuracy: 0.8735 - val_loss: 0.3076 - val_accuracy: 0.8927
Epoch 8/50
26/26 - 0s - loss: 0.2791 - accuracy: 0.9002 - val_loss: 0.3269 - val_accuracy: 0.8488
Epoch 9/50
26/26 - 0s - loss: 0.2147 - accuracy: 0.9380 - val_loss: 0.1607 - val_accuracy: 0.9610
Epoch 10/50
26/26 - 0s - loss: 0.1727 - accuracy: 0.9672 - val_loss: 0.2057 - val_accuracy: 0.9268
Epoch 11/50
26/26 -

<keras.callbacks.History at 0x7f4cb88906d8>
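
The interplay of `patience` and `min_delta` can be traced by hand on the logged `val_loss` values. The function below is a simplified sketch of the rule, not Keras's implementation: it ignores `baseline` and `restore_best_weights`, and the name `first_stop_epoch` is made up for this illustration.

```python
def first_stop_epoch(val_losses, patience=3, min_delta=0.05):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified rule: an epoch counts as an improvement only if the loss
    drops by more than min_delta below the best value seen so far.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# val_loss for epochs 1-10, copied from the log above
logged_val_loss = [0.6559, 0.5824, 0.5119, 0.5110, 0.5087,
                   0.3351, 0.3076, 0.3269, 0.1607, 0.2057]

stop_on_log = first_stop_epoch(logged_val_loss)
stop_on_flat = first_stop_epoch([1.00, 0.99, 0.98, 0.97])
print(stop_on_log, stop_on_flat)
```

Under this simplified rule the ten logged epochs never accumulate three non-improving epochs in a row, which is consistent with the log continuing into epoch 11. A slowly decreasing loss, as in the second call, still triggers a stop because each step is smaller than `min_delta`.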

In [42]:
#### CSVLogger: streams per-epoch results to a CSV file
model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
csv_file = 'training.csv'

model.fit(train_batches, 
          epochs=5, 
          validation_data=validation_batches, 
          callbacks=[CSVLogger(csv_file)
          ])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f4cb875b7b8>

In [43]:
pd.read_csv(csv_file).head()

   epoch  accuracy      loss  val_accuracy  val_loss
0      0  0.526764  0.671886      0.487805  0.672622
1      1  0.637470  0.630970      0.492683  0.686824
2      2  0.726277  0.568192      0.575610  0.712699
3      3  0.700730  0.580818      0.600000  0.602277
4      4  0.774939  0.502648      0.785366  0.487279


In [44]:
### LearningRateScheduler
# Updates the learning rate during training, once at the start of each epoch

model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
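def step_decay(epoch):
    # Multiply the learning rate by `drop` every `epochs_drop` epochs
    initial_lr = 0.01
    drop = 0.5
    epochs_drop = 1
    # epoch is 0-based, so (1 + epoch) applies the first drop in the very first epoch
    lr = initial_lr * math.pow(drop, math.floor((1 + epoch) / epochs_drop))
    return lr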

model.fit(train_batches, 
          epochs=5, 
          validation_data=validation_batches, 
          callbacks=[LearningRateScheduler(step_decay, verbose=1),
                    TensorBoard(log_dir='./log_dir')])

Epoch 1/5

Epoch 00001: LearningRateScheduler setting learning rate to 0.005.
Epoch 2/5

Epoch 00002: LearningRateScheduler setting learning rate to 0.0025.
Epoch 3/5

Epoch 00003: LearningRateScheduler setting learning rate to 0.00125.
Epoch 4/5

Epoch 00004: LearningRateScheduler setting learning rate to 0.000625.
Epoch 5/5

Epoch 00005: LearningRateScheduler setting learning rate to 0.0003125.


<keras.callbacks.History at 0x7f4cb85b92b0>
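
The schedule can be checked without training anything. The sketch below restates `step_decay` in plain Python with its constants turned into keyword arguments (a change made only for this illustration) and evaluates it for the five epochs of the run above.

```python
import math

def step_decay(epoch, initial_lr=0.01, drop=0.5, epochs_drop=1):
    """Step decay: multiply the rate by `drop` every `epochs_drop` epochs."""
    return initial_lr * math.pow(drop, math.floor((1 + epoch) / epochs_drop))

# Epochs are 0-based when passed to the schedule function.
schedule = [step_decay(e) for e in range(5)]

# Hypothetical variant: halve only every second epoch.
slower = [step_decay(e, epochs_drop=2) for e in range(5)]

print(schedule)
print(slower)
```

With `epochs_drop=1` the first epoch already runs at half of `initial_lr`, which matches the `0.005` reported in the log; a larger `epochs_drop` delays the first drop.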

In [45]:
%tensorboard --logdir log_dir

In [46]:
### ReduceLROnPlateau
# Reduce the learning rate when a metric has stopped improving.

model = build_model(dense_units=256)
model.compile(
    optimizer='sgd',
    loss='sparse_categorical_crossentropy', 
    metrics=['accuracy'])
  
model.fit(train_batches, 
          epochs=50, 
          validation_data=validation_batches, 
          callbacks=[ReduceLROnPlateau(monitor='val_loss', 
                                       factor=0.2, verbose=1,
                                       patience=1, min_lr=0.001),
                     TensorBoard(log_dir='./log_dir')])

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50

Epoch 00010: ReduceLROnPlateau reducing learning rate to 0.0019999999552965165.
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50

Epoch 00018: ReduceLROnPlateau reducing learning rate to 0.001.
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x7f4cb84dcfd0>
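
The two reductions in the log follow from `factor` and `min_lr`. A minimal sketch of the arithmetic in plain Python, assuming the run starts at 0.01 (the default learning rate of the `'sgd'` optimizer string); the helper name `reduced_lr` is illustrative:

```python
def reduced_lr(current_lr, factor=0.2, min_lr=0.001):
    """New rate after one plateau: scale by factor, but never go below min_lr."""
    return max(current_lr * factor, min_lr)

lr = 0.01                    # assumed starting rate (default for 'sgd')
first = reduced_lr(lr)       # first plateau
second = reduced_lr(first)   # second plateau: 0.0004 would undershoot min_lr
third = reduced_lr(second)   # already at the floor, so nothing changes

print(first, second, third)
```

The logged value `0.0019999999552965165` is 0.002 stored as a 32-bit float. Once the rate reaches `min_lr`, no further reductions are reported, which is why the log is silent after epoch 18.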

In [47]:
%tensorboard --logdir log_dir

Reusing TensorBoard on port 6007 (pid 8947), started 0:02:17 ago. (Use '!kill 8947' to kill it.)

In [49]:
model.evaluate(test_batches, return_dict=True)



{'loss': 1.3415966033935547, 'accuracy': 0.734375}