# <center> <font color='#0B5345'> <b> Predefined callbacks </b> </font> </center>

### <b> <font color='blue'> Table of Contents </font> </b>

- 1. [Libraries](#1)
- 2. [Loading data](#2)
- 3. [Model Building](#3)
- 4. [Training with callbacks](#4)
   - 4.1. [Early Stopping](#4.1)
   - 4.2. [Learning Rate Scheduler](#4.2)
   - 4.3. [TensorBoard](#4.3)
   - 4.4. [Checkpoint](#4.4)
   - 4.5. [Combining callbacks](#4.5)





<a name="1"></a>
## <b> <font color='#138D75'> 1. Libraries </font> </b>

In [21]:
# Suppress TensorFlow INFO and WARNING log messages
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 

import warnings
# Ignore specific warning
warnings.filterwarnings('ignore', message='Allocation of .* exceeds 10% of free system memory')


In [59]:
import tensorflow as tf
from tensorflow.keras import layers, callbacks, models
from tensorflow.keras.models import load_model
import datetime


<a name="2"></a>
## <b> <font color='#138D75'> 2. Loading data </font> </b>

In [14]:
# Load MNIST dataset
mnist = tf.keras.datasets.mnist

# Load data into training and testing sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to be between 0 and 1
x_train, x_test = x_train / 255.0, x_test / 255.0


<a name="3"></a>
## <b> <font color='#138D75'> 3. Model Building </font> </b>

Let's create a simple model; the goal here is to learn about callbacks, not to build a strong classifier.

In [56]:
def build_model():

    # Define the model
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),  # MNIST images are loaded as 28x28 arrays
        tf.keras.layers.Dense(10),
        tf.keras.layers.Dense(10,activation='softmax'),
    ])

    # Compile the model
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
        
    return model

<a name="4"></a>
## <b> <font color='#138D75'> 4. Training with callbacks </font> </b>

<a name="4.1"></a>
### 4.1. Early Stopping

The "Early Stopping Callback" stops training when a monitored metric has stopped improving.

It is a way to prevent overfitting.

First, let's build the simple ("bad") model defined above.

In [57]:
model = build_model()

In [23]:
# Define EarlyStopping callback
early_stopping = callbacks.EarlyStopping(
    monitor='val_loss', 
    patience=3, 
    min_delta=0.001, 
    restore_best_weights=True
)

# Train the model with EarlyStopping callback
history = model.fit(x_train, y_train, epochs=20, 
                    validation_data=(x_test, y_test), 
                    callbacks=[early_stopping])

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc}')

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Test accuracy: 0.9272000193595886


We can see that training stops at epoch 13 because the validation loss did not improve by at least `min_delta` for 3 consecutive epochs (0.2735 -> 0.2826 -> 0.2763).
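
The stopping rule can be sketched in plain Python, without TensorFlow. The snippet below is a simplified model of the `patience` and `min_delta` logic, not Keras's actual implementation; the function name `simulate_early_stopping` and the loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience=3, min_delta=0.001):
    """Return (stopped_epoch, best_epoch) as 0-based indices.

    stopped_epoch is None if training would run to completion.
    """
    best = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last meaningful improvement
    for epoch, loss in enumerate(val_losses):
        # An epoch only counts as an improvement if it beats the best
        # loss so far by more than min_delta.
        if loss < best - min_delta:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses (not the values from the run above).
losses = [0.50, 0.40, 0.35, 0.36, 0.355, 0.3495, 0.37]
stopped_epoch, best_epoch = simulate_early_stopping(losses)
print(stopped_epoch, best_epoch)
```

With `restore_best_weights=True`, the model ends up with the weights from `best_epoch` rather than from the epoch at which training stopped.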

<a name="4.2"></a>
### 4.2. Learning Rate Scheduler

The LearningRateScheduler callback calls a user-supplied function at the start of each epoch and sets the optimizer's learning rate to the value that function returns. A schedule that starts with a larger learning rate and decays it over time can speed up early convergence while still allowing finer updates later in training.

In [58]:
model = build_model()

In [35]:
# Define a simple learning rate schedule function.
# In this example, we use a step-decay schedule where
# the learning rate decreases by a factor of 0.1 every 5 epochs.
def lr_schedule(epoch):
    """
    Returns a learning rate based on the current epoch.
    """
    initial_lr = 0.1
    decay_factor = 0.1
    decay_epochs = 5
    lr = initial_lr * (decay_factor ** (epoch // decay_epochs))
    return lr

# Create a LearningRateScheduler callback
lr_scheduler = callbacks.LearningRateScheduler(lr_schedule)


# Train the model with the learning rate scheduler callback
history = model.fit(x_train, y_train, epochs=12,
                    validation_data=(x_test, y_test), 
                    callbacks=[lr_scheduler])


Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


We can see that when it reaches epoch 6 (after 5 epochs), the learning rate changes from 0.1 to 0.01, and when it reaches epoch 11 (after another 5 epochs), from 0.01 to 0.001.
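
To check these values without training, the schedule function can be evaluated directly. This is a minimal sketch that repeats `lr_schedule` from the cell above; the list name `schedule` is introduced here for illustration only. Keras passes a 0-based epoch index, so index 5 corresponds to the line printed as `Epoch 6/12`.

```python
def lr_schedule(epoch):
    """Step decay: multiply the learning rate by 0.1 every 5 epochs."""
    initial_lr = 0.1
    decay_factor = 0.1
    decay_epochs = 5
    return initial_lr * (decay_factor ** (epoch // decay_epochs))


# Learning rate for each of the 12 epochs (0-based index).
schedule = [lr_schedule(epoch) for epoch in range(12)]

for epoch, lr in enumerate(schedule):
    # epoch + 1 matches the "Epoch N/12" numbering in the training log
    print(f"Epoch {epoch + 1:2d}: lr = {lr:.4f}")
```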

<a name="4.3"></a>
### 4.3. TensorBoard

The TensorBoard callback in TensorFlow is used to visualize and monitor the training process of your neural network models.

In [37]:
model = build_model()

# Define TensorBoard callback
# We specify the log directory where the TensorBoard logs will be saved.
# The log directory includes a timestamp to make it unique for each run.
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S") 
tensorboard_callback = callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

# Train the model with TensorBoard callback
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test), callbacks=[tensorboard_callback])


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7fd6a26a2da0>
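
The timestamped directory naming can be isolated in a small helper. This is only a sketch: `make_log_dir` and the fixed example dates are hypothetical and are used here to show that two runs end up in different subdirectories of the same root.

```python
import datetime
import os


def make_log_dir(root="logs/fit", now=None):
    """Build a per-run log directory path such as logs/fit/20240131-154500."""
    now = now or datetime.datetime.now()
    return os.path.join(root, now.strftime("%Y%m%d-%H%M%S"))


# Two hypothetical runs started 15 minutes apart.
run_a = make_log_dir(now=datetime.datetime(2024, 1, 31, 15, 45, 0))
run_b = make_log_dir(now=datetime.datetime(2024, 1, 31, 16, 0, 0))
print(run_a)
print(run_b)
```

Because every run writes to its own subdirectory, pointing TensorBoard at the parent `logs/` directory shows the runs side by side instead of mixing their curves.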


After training, you can start TensorBoard from the command line by navigating to the directory containing your code and logs:

.....Predefined$ ls


'Callbacks.ipnyb' logs

Start TensorBoard:

...... Predefined$ tensorboard --logdir logs/


Then, we navigate to 'http://localhost:6006'


<br>

<img src="images/TensorBoard.png"/>

<a name="4.4"></a>
### 4.4. Checkpoint

The ModelCheckpoint callback saves the model, or only its weights when `save_weights_only=True`, during training. This lets you resume training from the last saved checkpoint or reuse the saved model for inference later on. It is particularly useful when training takes a long time or is spread over multiple sessions.


We are going to simulate a scenario where the training is interrupted (for example, if the computer crashes) to demonstrate how we can resume training not from the beginning, but from the last saved checkpoint.
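
When the callback is configured with `save_best_only=True` and `mode='min'`, as in the cell below, a checkpoint is written only when the monitored value improves on the best one seen so far. The plain-Python sketch below illustrates that rule under simplified assumptions; `epochs_that_save` and the loss values are hypothetical and not taken from the actual run.

```python
def epochs_that_save(val_losses):
    """Return the 0-based epochs at which a best-only checkpoint is written."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:  # mode='min': a lower val_loss is better
            best = loss
            saved.append(epoch)
    return saved


# Hypothetical validation losses for five epochs.
losses = [0.40, 0.33, 0.35, 0.30, 0.31]
saved = epochs_that_save(losses)
print(saved)
```

If training is interrupted after any epoch, the file on disk holds the model from the last epoch in this list, which is the best one seen up to that point.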

In [42]:
model = build_model()

# Path where the checkpoint is saved. With the default save_weights_only=False,
# the callback saves the full model, which is why load_model() works below.
checkpoint_path = "training/model_checkpoint.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

# Create a callback that saves the model whenever val_loss improves
checkpoint_callback = callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                      monitor='val_loss',
                                      save_best_only=True, # only keep the best model so far
                                      mode='min') # 'min' because a lower val_loss is better



# Train the model with checkpoint callback
model.fit(x_train, y_train, epochs=20, validation_data=(x_test, y_test),
          callbacks=[checkpoint_callback])

Epoch 1/20


INFO:tensorflow:Assets written to: training/model_checkpoint.ckpt/assets


Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20

KeyboardInterrupt: 

In [46]:
# Load the model
model_ckp = load_model(checkpoint_path)

In [48]:
# check
model_ckp.evaluate(x_test,y_test)



[0.27205604314804077, 0.9251000285148621]

In [49]:
# resume training
model_ckp.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test),
          callbacks=[checkpoint_callback])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x7fd69d169150>

<a name="4.5"></a>
### 4.5. Combining callbacks

We can combine callbacks by passing several of them in the `callbacks` list. In the following example, we combine 'Early Stopping' with 'Learning Rate Scheduler'.


In [54]:
model = build_model()

In [55]:
# Define EarlyStopping callback
early_stopping = callbacks.EarlyStopping(
    monitor='val_loss', 
    patience=5, 
    min_delta=0.001, 
    restore_best_weights=True
)


# LR Scheduler
# This function keeps the initial learning rate for the first 3 epochs
# and decreases it exponentially after that.
import math

def scheduler(epoch, lr):
    if epoch < 3:
        return lr
    else:
        return lr * math.exp(-0.1)


lr_scheduler = callbacks.LearningRateScheduler(scheduler)


# Train the model with both callbacks
history = model.fit(x_train, y_train, epochs=20,
                    validation_data=(x_test, y_test),
                    callbacks=[early_stopping, lr_scheduler])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
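
To see what an exponential schedule of this kind does over 20 epochs, it can be evaluated outside of training. This is a sketch under two assumptions: it uses `math.exp` to compute the decay factor, and it starts from 0.001, the default learning rate of the Adam optimizer in Keras. The list name `lrs` is introduced here for illustration, and float32 rounding inside the optimizer is ignored.

```python
import math


def scheduler(epoch, lr):
    """Keep lr unchanged for the first 3 epochs, then decay it by exp(-0.1) per epoch."""
    if epoch < 3:
        return lr
    return lr * math.exp(-0.1)


lr = 0.001  # assumed starting point: Adam's default learning rate
lrs = []
for epoch in range(20):
    # The callback receives the current learning rate and returns the new one.
    lr = scheduler(epoch, lr)
    lrs.append(lr)

for epoch, value in enumerate(lrs):
    print(f"Epoch {epoch + 1:2d}: lr = {value:.6f}")
```

Each epoch from the fourth onward multiplies the learning rate by about 0.905, so after 20 epochs it has been decayed 17 times.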
