## TensorFlow Callbacks in Action


So, what can you do with these callbacks?
1. You can perform a particular task at the start or end of training, of each batch, or of each epoch.
2. You can periodically save the model state to disk.
3. You can schedule the learning rate to suit your task.
4. You can automatically stop training when a particular condition is met.
5. And you can run your own code during the training process by subclassing the callback base class.



TensorFlow provides a wide range of callbacks in the “tf.keras.callbacks” module. For the full list of callbacks, please visit [TensorFlow’s website](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/Callback).



We will cover the following callbacks:

1. Custom callbacks by subclassing the Callback class.
2. EarlyStopping callback.
3. ReduceLROnPlateau callback.
4. ModelCheckpoint callback.
5. LearningRateScheduler callback.

But first, let’s load the cats_vs_dogs dataset, which has been saved as NumPy arrays.

In [7]:
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np

images_train = np.load("./images_train.npy") / 255
images_valid = np.load("./images_valid.npy") / 255
images_test = np.load("./images_test.npy") / 255
labels_train = np.load("./labels_train.npy")
labels_valid = np.load("./labels_valid.npy")
labels_test= np.load("./labels_test.npy")

print("{} training data examples".format(images_train.shape[0]))
print("{} validation data examples".format(images_valid.shape[0]))
print("{} test data examples".format(images_test.shape[0]))

600 training data examples
300 validation data examples
300 test data examples


In [10]:
images_train

array([[[[0.10588235, 0.05098039, 0.        ],
         [0.11372549, 0.05882353, 0.00784314],
         [0.09019608, 0.03529412, 0.        ],
         ...,
         [0.17254902, 0.10588235, 0.00392157],
         [0.19215686, 0.1254902 , 0.02352941],
         [0.20392157, 0.1372549 , 0.03529412]],

        [[0.10196078, 0.04705882, 0.        ],
         [0.11764706, 0.0627451 , 0.01176471],
         [0.10196078, 0.04705882, 0.00392157],
         ...,
         [0.2       , 0.13333333, 0.03137255],
         [0.21568627, 0.14901961, 0.04705882],
         [0.22352941, 0.15686275, 0.05490196]],

        [[0.08627451, 0.03137255, 0.        ],
         [0.10196078, 0.04705882, 0.        ],
         [0.09411765, 0.03921569, 0.        ],
         ...,
         [0.19215686, 0.1254902 , 0.02352941],
         [0.20392157, 0.1372549 , 0.03529412],
         [0.21176471, 0.14509804, 0.04313725]],

        ...,

        [[0.30196078, 0.21568627, 0.07058824],
         [0.30980392, 0.22352941, 0.07843137]

In [None]:
# Cat Dog - Binary Classification Problem 

# CNN Model 

In [12]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D

def get_compiled_model(compile=True):
  ''' Build the CNN and, if compile is True, compile it for binary classification. '''
  model = Sequential()
  # Two convolutional blocks, each followed by 2x2 max pooling.
  model.add(Conv2D(32, (3,3), activation='relu', padding='SAME', input_shape=(160,160,3)))
  model.add(Conv2D(32, (3,3), activation='relu', padding='SAME'))
  model.add(MaxPool2D(2,2))
  model.add(Conv2D(64, (3,3), activation='relu', padding='SAME'))
  model.add(Conv2D(64, (3,3), activation='relu', padding='SAME'))
  model.add(MaxPool2D(2,2))
  # Classifier head: a single sigmoid unit for the binary label.
  model.add(Flatten())
  model.add(Dense(128, activation='relu'))
  model.add(Dense(1, activation='sigmoid'))

  if compile:
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=[tf.keras.metrics.BinaryAccuracy(name='acc')])
  return model

model = get_compiled_model()

# inspecting the model architecture
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 160, 160, 32)      896       
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 160, 160, 32)      9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 80, 80, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 80, 80, 64)        18496     
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 80, 80, 64)        36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 102400)           

## 1. Custom callbacks by subclassing callback class.

All callbacks derive from the base class “tf.keras.callbacks.Callback.”
By subclassing it, we can run our own code when training, a batch, or an epoch starts or ends.
To do this, we override the methods of the Callback class.
The names of these methods are self-explanatory.
For example, on_train_begin() defines what to do when
training begins.
Let’s see below how to override these methods. We can
also monitor the logs and act on them, generally at
the start or the end of training, a batch, or an epoch.

In [11]:
labels_train

array([0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1,
       0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1,
       1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1,
       0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1,
       0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0,
       0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0,
       0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0,
       0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1,
       0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0,
       0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1,
       0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,
       0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1,
       0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0,
       0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0,

In [5]:
import datetime
from tensorflow.keras.callbacks import Callback

class CustomCallback(Callback):
  def on_train_begin(self,logs=None):
    print("Training is started, at time {}".format(datetime.datetime.now().time()))
  def on_train_end(self, logs=None):
    print("Training is ended at {}".format(datetime.datetime.now().time()))
  def on_train_batch_begin(self, batch, logs=None):
    print('Training: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))
  def on_train_batch_end(self, batch, logs=None):
    print('Training: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))

custom_callback = CustomCallback()

model = get_compiled_model()

model.fit(images_train, labels_train, validation_data=(images_valid, labels_valid), 
          epochs=1, callbacks=[custom_callback])

Training is started, at time 11:11:32.048510
Training: batch 0 begins at 11:11:32.070100
Training: batch 0 ends at 11:11:37.553522
 1/19 [>.............................] - ETA: 0s - loss: 0.6954 - acc: 0.5312Training: batch 1 begins at 11:11:37.558305
Training: batch 1 ends at 11:11:37.600523
Training: batch 2 begins at 11:11:37.600835
Training: batch 2 ends at 11:11:37.636836
 3/19 [===>..........................] - ETA: 0s - loss: 1.9636 - acc: 0.4688Training: batch 3 begins at 11:11:37.638112
Training: batch 3 ends at 11:11:37.672230
Training: batch 4 begins at 11:11:37.672842
Training: batch 4 ends at 11:11:37.708426
Training: batch 5 ends at 11:11:37.741226
Training: batch 6 begins at 11:11:37.741515
Training: batch 6 ends at 11:11:37.772384
Training: batch 7 ends at 11:11:37.807318
Training: batch 8 begins at 11:11:37.807915
Training: batch 8 ends at 11:11:37.838598
Training: batch 9 ends at 11:11:37.872714
Training: batch 10 begins at 11:11:37.873090
Training: batch 10 ends at 1

<tensorflow.python.keras.callbacks.History at 0x7f0978098c10>
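
The hooks above only print timestamps, but the `logs` dictionary passed to each hook can also drive decisions. Below is a minimal sketch of that idea, under stated assumptions: `LossThresholdStopper` is a hypothetical class name, the threshold value is arbitrary, and a tiny synthetic regression task stands in for the cats-vs-dogs data so the example runs quickly.

```python
import numpy as np
import tensorflow as tf


class LossThresholdStopper(tf.keras.callbacks.Callback):
    """Hypothetical callback: stop once the training loss drops below a threshold."""

    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold
        self.stopped_epoch = None  # stays None if the threshold is never reached

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        loss = logs.get("loss")  # training loss reported for this epoch
        if loss is not None and loss < self.threshold:
            self.stopped_epoch = epoch
            self.model.stop_training = True  # ask fit() to stop after this epoch


# Tiny synthetic regression task (y = 2x), used only to keep the sketch fast.
x = np.linspace(-1.0, 1.0, 64, dtype="float32").reshape(-1, 1)
y = 2.0 * x

toy_model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
toy_model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1), loss="mse")

stopper = LossThresholdStopper(threshold=0.01)
history = toy_model.fit(x, y, epochs=100, verbose=0, callbacks=[stopper])
print("epochs run:", len(history.history["loss"]), "stopped at epoch index:", stopper.stopped_epoch)
```

The same pattern should work with the CNN above by passing the callback to model.fit alongside the training and validation arrays.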


## 2. EarlyStopping Callback.


The EarlyStopping callback stops training once the monitored loss or metric has stopped improving.
So, let’s see how one can use this callback.

First, import the callback, then create an instance of the
EarlyStopping callback and pass the arguments that fit our needs.

* “monitor”: the loss or metric to watch, given as a string.
Generally, we pass val_loss and monitor it.

* “min_delta”: a float giving the smallest change that counts as an improvement.
In simple words, you’re telling the callback that the model
is not improving if the monitored value changes by less than min_delta.

* “patience”: how many epochs to wait without improvement.
If no improvement (as defined by “min_delta”) is seen
for that many epochs, training stops.

* “mode”:
By default it’s set to “auto,” in which case the direction is inferred
from the name of the monitored quantity. Setting it explicitly comes in handy when
you’re dealing with a custom loss/metric: set it to “min”
if the model is improving when the value decreases,
or to “max” if it is improving when the value increases.

In [6]:
from tensorflow.keras.callbacks import EarlyStopping


early_stopping = EarlyStopping(monitor='val_loss', 
                               min_delta=0.001, 
                               patience=2, 
                               verbose=0, 
                               mode='min', 
                               baseline=None, 
                               restore_best_weights=False)

model = get_compiled_model()

model.fit(images_train, labels_train, 
          validation_data=(images_valid, labels_valid), 
          epochs=80, 
          callbacks=[early_stopping])

Epoch 1/80
Epoch 2/80
Epoch 3/80
Epoch 4/80
Epoch 5/80
Epoch 6/80
Epoch 7/80
Epoch 8/80
Epoch 9/80


<tensorflow.python.keras.callbacks.History at 0x7f0930241e50>
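
To make the interplay between “min_delta” and “patience” concrete, here is a simplified pure-Python simulation of the stopping rule in “min” mode. It is only a sketch, not the Keras implementation: the function name `epochs_run` and the validation losses are made up for illustration, and options such as “baseline” and “restore_best_weights” are ignored.

```python
def epochs_run(val_losses, min_delta=0.001, patience=2):
    """Return how many epochs run before a min-mode early stop triggers.

    Simplified illustration: an epoch counts as an improvement only when the
    loss falls more than min_delta below the best value seen so far.
    """
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training would stop after this epoch
    return len(val_losses)  # patience never ran out


# Made-up validation losses: the improvement stalls after the second epoch.
losses = [0.70, 0.65, 0.6495, 0.66, 0.64]
print(epochs_run(losses))  # → 4
```

The third epoch improves on 0.65 by only 0.0005, which is below min_delta, so it counts as a stalled epoch; the fourth epoch is the second one without improvement and exhausts the patience of 2.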

## 3. ReduceLROnPlateau.

This callback reduces the learning rate when there is
no improvement in the monitored loss/metric.

The arguments are:

* “monitor”: the name of the loss/metric, as a string;
the learning rate is reduced when this value stops improving.

* “factor”: a float smaller than 1 by which the learning rate is multiplied.
Say your current learning rate is lr; if
no improvement is seen in the monitored loss/metric for “patience” epochs,
the learning rate is scaled by that “factor,”
i.e. new learning rate = lr * factor.

* “verbose”:
Set verbose=1 to print a message whenever the learning rate is reduced,
or verbose=0 to disable it.

The arguments min_delta and mode work the same way as explained for the EarlyStopping callback.

In [7]:
from tensorflow.keras.callbacks import ReduceLROnPlateau



callback  = ReduceLROnPlateau(monitor='val_loss', 
                              factor=0.1, 
                              patience=10, 
                              verbose=0, 
                              mode='auto', 
                              min_delta=0.002, 
                              cooldown=0, 
                              min_lr=0)

model = get_compiled_model()

model.fit(images_train, labels_train, 
          validation_data=(images_valid, labels_valid), 
          epochs=20, 
          callbacks=[callback])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7f08c87da8d0>
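
The rule “new learning rate = lr * factor” can be traced with a small pure-Python simulation. This is a simplified sketch in “min” mode rather than the Keras implementation: `simulate_reduce_lr` is a hypothetical helper, the losses are made up, the cooldown period is left out, and patience is shortened to 2 so the effect shows within a few epochs.

```python
def simulate_reduce_lr(val_losses, lr=0.001, factor=0.1, patience=2,
                       min_delta=0.002, min_lr=0.0):
    """Return the learning rate in effect after each epoch (min mode, no cooldown)."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    lr_per_epoch = []
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # new learning rate = lr * factor
                wait = 0
        lr_per_epoch.append(lr)
    return lr_per_epoch


# Made-up validation losses that plateau at 0.69 from the second epoch on.
losses = [0.70, 0.69, 0.69, 0.69, 0.69, 0.69]
for epoch, lr in enumerate(simulate_reduce_lr(losses), start=1):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

With these inputs the learning rate is cut by a factor of 10 after every two stalled epochs, and “min_lr” acts as a floor below which it is never reduced.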

## 4. ModelCheckpoint

This callback saves model checkpoints during training.

So, let’s see how we can use this callback. We can save
the model checkpoint in the Keras HDF5 (.h5) format or in the TensorFlow SavedModel (.pb)
format. If you pass the argument “filepath=model.h5” (.h5 extension),
it’ll be saved in the Keras format; pass “filepath=model.pb” (.pb extension)
to save in the TensorFlow model format.

Also, there are two options when saving a checkpoint: you can save the entire model (architecture + weights) or just the weights. You choose between them by setting “save_weights_only=False” or “save_weights_only=True,” respectively.

In [8]:
from tensorflow.keras.callbacks import ModelCheckpoint


# Save the full model (architecture + weights) to model.h5 at the end of every epoch.

model_checkpoint_callback = ModelCheckpoint(filepath= "model.h5", 
                                            monitor='val_loss', 
                                            verbose=0, 
                                            save_best_only=False, 
                                            save_weights_only=False, 
                                            mode='min', 
                                            save_freq='epoch')
model = get_compiled_model()


model.fit(images_train, labels_train, 
          validation_data=(images_valid, labels_valid), 
          epochs=20, 
          callbacks=[model_checkpoint_callback])



# Load the saved weights back from disk. model.h5 holds the full model
# (save_weights_only=False); load_weights reads only the weights from it.
model.load_weights("model.h5")

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [9]:
# inspecting the architecture .
model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_16 (Conv2D)           (None, 160, 160, 32)      896       
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 160, 160, 32)      9248      
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 80, 80, 32)        0         
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 80, 80, 64)        18496     
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 80, 80, 64)        36928     
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 40, 40, 64)        0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 102400)           

In [10]:
# evaluating the model on the test set.
model.evaluate(images_test, labels_test)



[4.367551326751709, 0.5433333516120911]

## 5. LearningRateScheduler
>  The simplest way to schedule the learning is to decrease the learning rate 
linearly from a large initial value to a small value. 
This allows large weight changes at the beginning of
the learning process and small changes or fine-tuning towards
the end of the learning process.

Let’s see how to schedule the learning rate. For this, we have to
define an auxiliary function that contains the rules for
altering the learning rate.
And then we can simply pass this auxiliary function
to the “schedule” argument of the LearningRateScheduler class.

In [11]:
from tensorflow.keras.callbacks import LearningRateScheduler


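def lr_function(epoch, lr):
    # epoch is the 0-based epoch index; lr is the current learning rate.
    # Toy rule: keep lr on even epochs, raise it by epoch/1000 on odd epochs.
    if epoch % 2 == 0:
        return lr
    else:
        return lr + epoch/1000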

learning_rate_schedular_callback = LearningRateScheduler(schedule= lr_function ,
                                                         verbose=1)

model = get_compiled_model()

model.fit(images_train, labels_train, 
          validation_data=(images_valid, labels_valid), 
          epochs=10, 
          callbacks=[learning_rate_schedular_callback] )



Epoch 00001: LearningRateScheduler reducing learning rate to 0.0010000000474974513.
Epoch 1/10

Epoch 00002: LearningRateScheduler reducing learning rate to 0.0020000000474974513.
Epoch 2/10

Epoch 00003: LearningRateScheduler reducing learning rate to 0.0020000000949949026.
Epoch 3/10

Epoch 00004: LearningRateScheduler reducing learning rate to 0.005000000094994903.
Epoch 4/10

Epoch 00005: LearningRateScheduler reducing learning rate to 0.004999999888241291.
Epoch 5/10

Epoch 00006: LearningRateScheduler reducing learning rate to 0.009999999888241292.
Epoch 6/10

Epoch 00007: LearningRateScheduler reducing learning rate to 0.009999999776482582.
Epoch 7/10

Epoch 00008: LearningRateScheduler reducing learning rate to 0.01699999977648258.
Epoch 8/10

Epoch 00009: LearningRateScheduler reducing learning rate to 0.016999999061226845.
Epoch 9/10

Epoch 00010: LearningRateScheduler reducing learning rate to 0.025999999061226846.
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f0978b32050>
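
Note that lr_function above raises the learning rate on odd epochs. For comparison, the linear decrease described in the quote at the start of this section could be written as a schedule function like the one below. This is a sketch under assumptions: `linear_decay` and the three constants are illustrative choices, not values taken from the experiment above.

```python
INITIAL_LR = 0.01    # assumed starting learning rate
FINAL_LR = 0.0001    # assumed learning rate for the last epoch
TOTAL_EPOCHS = 10    # should match the epochs argument passed to fit


def linear_decay(epoch, lr=None):
    """Interpolate linearly from INITIAL_LR (epoch 0) to FINAL_LR (last epoch).

    The current lr passed in by the callback is ignored, because the new
    value depends only on the epoch index.
    """
    progress = min(epoch, TOTAL_EPOCHS - 1) / (TOTAL_EPOCHS - 1)
    return INITIAL_LR + (FINAL_LR - INITIAL_LR) * progress


for epoch in range(TOTAL_EPOCHS):
    print(f"epoch {epoch}: lr = {linear_decay(epoch):.5f}")

# It could then be passed to the callback the same way as lr_function:
# LearningRateScheduler(schedule=linear_decay, verbose=1)
```

Because the function takes the epoch index and the current learning rate and returns a float, it has the same shape as lr_function and can replace it in the callback without other changes.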