# Callbacks

A callback in TensorFlow is an object whose methods Keras calls at specific stages of the training process, such as the start or end of every epoch or batch. Callbacks let you perform actions during training, such as saving model weights, updating the learning rate, stopping training early, or logging metrics.

Here are a few important uses of callbacks:

1. **Model Checkpointing**: This saves the model's weights at different points during training. It's useful because if a long-running training process gets interrupted, you can resume from the last saved state.
2. **Learning Rate Scheduling**: This adjusts the learning rate during training. For example, you might want to decrease the learning rate over time.
3. **Early Stopping**: This stops training when the model's performance on a validation set stops improving. It helps prevent overfitting, where the model fits the training data so closely that it performs poorly on unseen data.

### **Model Checkpointing:**

Model checkpointing is implemented using the **`ModelCheckpoint`** callback. This is useful in case your training procedure gets interrupted. You can save the model or just the weights at regular intervals during training.

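```python
import tensorflow as tf

# Assumes `model` is a compiled Keras model and the data arrays are already loaded.
# Write the model to my_model.h5 only when the validation loss improves.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint("my_model.h5", save_best_only=True)

model.fit(training_images, training_labels, epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[checkpoint_cb])
```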

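In this example, **`ModelCheckpoint`** evaluates the model at the end of each epoch and writes it to a file named "my_model.h5". Because `save_weights_only` defaults to `False`, the file holds the whole model, not only the weights. The **`save_best_only=True`** argument means the file is overwritten only when the monitored metric (by default the validation loss, `val_loss`) is the best seen so far.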
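The `save_best_only=True` rule can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, not the Keras implementation: the helper `epochs_that_save` and the validation-loss values are made up for illustration, and it assumes that a lower monitored value is better.

```python
def epochs_that_save(val_losses):
    """Return the 0-based epochs at which a best-only checkpoint would be written."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:          # strictly better than anything seen so far
            best = loss
            saved.append(epoch)  # the checkpoint file would be overwritten here
    return saved

# Hypothetical validation losses for five epochs
val_losses = [0.52, 0.47, 0.49, 0.44, 0.45]
saved_epochs = epochs_that_save(val_losses)
print(saved_epochs)  # [0, 1, 3]
```

Epochs 2 and 4 are skipped because their loss is worse than the best value recorded before them, so the file on disk always corresponds to the best epoch so far.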

### **Learning Rate Scheduling:**

You can adjust the learning rate during training by using the **`LearningRateScheduler`** callback. This can help the model converge faster or achieve a better result.

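```python
import math
import tensorflow as tf

def scheduler(epoch, lr):
  # Keep the initial learning rate for the first 10 epochs (0-9).
  if epoch < 10:
    return lr
  # Afterwards, multiply by exp(-0.1) each epoch; math.exp returns a plain float.
  return lr * math.exp(-0.1)

lr_schedule_cb = tf.keras.callbacks.LearningRateScheduler(scheduler)

model.fit(training_images, training_labels, epochs=20,
          callbacks=[lr_schedule_cb])
```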

In this example, the learning rate stays the same for the first 10 epochs and is then multiplied by exp(-0.1) ≈ 0.905 at every subsequent epoch, which gives an exponential decay.
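To see the numbers this schedule produces, the rule can be traced in plain Python. This is a sketch under assumptions: `trace_schedule` is a hypothetical helper that mimics how the current learning rate is fed back into the schedule function each epoch, and the starting rate of `0.001` is chosen only for illustration.

```python
import math

def scheduler(epoch, lr):
    # Same rule as above: hold for epochs 0-9, then decay by exp(-0.1) per epoch.
    if epoch < 10:
        return lr
    return lr * math.exp(-0.1)

def trace_schedule(initial_lr, epochs):
    """Apply the scheduler epoch by epoch, feeding each result into the next call."""
    lr = initial_lr
    history = []
    for epoch in range(epochs):
        lr = scheduler(epoch, lr)
        history.append(lr)
    return history

# 0.001 is an assumed starting learning rate, chosen for illustration
history = trace_schedule(0.001, 20)
for epoch in (0, 9, 10, 19):
    print(f"epoch {epoch:2d}: lr = {history[epoch]:.6f}")
```

After the ten decay steps in epochs 10 through 19, the rate has shrunk by a total factor of exp(-1.0), roughly 0.37 of its starting value.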

### **Early Stopping: Patience**

You can stop training when a monitored metric has stopped improving by using the **`EarlyStopping`** callback. This can limit overfitting by ending training before the model starts to memorize the training data.

```python
# Stop when val_loss (the default monitored metric) has not improved for 10 epochs.
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)

model.fit(training_images, training_labels, epochs=100,
          validation_data=(test_images, test_labels),
          callbacks=[early_stopping_cb])
```

In this example, training will stop when the validation loss doesn't improve for 10 epochs. The **`restore_best_weights=True`** argument means the model will keep the weights from the epoch with the best monitored metric, which in this case is the validation loss.
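The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration rather than the Keras code: `simulate_early_stopping` is a hypothetical helper, the loss values are invented, and options such as `min_delta` are ignored.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (last_epoch_run, best_epoch) under a simple patience rule.

    Training stops once `patience` consecutive epochs pass without a new best loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses; a patience of 3 keeps the example short
val_losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.63]
stopped_epoch, best_epoch = simulate_early_stopping(val_losses, patience=3)
print(stopped_epoch, best_epoch)  # 5 2
```

Here the best loss occurs at epoch 2 and training ends at epoch 5, after three epochs without improvement. With `restore_best_weights=True`, the weights from epoch 2 are the ones the model would keep.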

### **Early Stopping: Target accuracy or loss**

Here's an example of a simple callback that stops training when it reaches a certain accuracy:

```python
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # `logs` holds the metrics of the epoch that just finished; it may be None.
    accuracy = (logs or {}).get('accuracy')
    if accuracy is not None and accuracy > 0.9:
      print("\nReached 90% accuracy so cancelling training!")
      self.model.stop_training = True

callbacks = myCallback()

model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])
```

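In this example, the **`on_epoch_end`** method is called at the end of each epoch, and if the training accuracy at that point is above 90%, it stops the training. The `'accuracy'` key is present in `logs` only if the model was compiled with `metrics=['accuracy']`.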

## Load and Normalize the Fashion MNIST dataset

As in the previous lab, you will use the Fashion MNIST dataset for this exercise. You will again normalize the pixel values to help optimize the training.

```python
import tensorflow as tf

# Instantiate the dataset API
fmnist = tf.keras.datasets.fashion_mnist

# Load the dataset
(x_train, y_train), (x_test, y_test) = fmnist.load_data()

# Normalize the pixel values
x_train, x_test = x_train / 255.0, x_test / 255.0
```

## Creating a Callback class

You can create a callback by defining a class that inherits from the [tf.keras.callbacks.Callback](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/Callback) base class. From there, you override the methods that correspond to the points in training where the callback should run. In the cell below, you will use the [on_epoch_end()](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/Callback#on_epoch_end) method to check the loss at the end of each training epoch.

```python
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # Check the loss reported for the epoch that just finished
    loss = (logs or {}).get('loss')
    if loss is not None and loss < 0.4:
      # Stop if threshold is met
      print("\nLoss is lower than 0.4 so cancelling training!")
      self.model.stop_training = True

# Instantiate class
callbacks = myCallback()
```

## Define and compile the model

Next, you will define and compile the model. The architecture will be similar to the one you built in the previous lab. Afterwards, you will set the optimizer, loss, and metrics that you will use for training.

```python
# Define the model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Compile the model
model.compile(optimizer=tf.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

## Train the model

Now you are ready to train the model. To use the callback, pass a list containing the `myCallback` instance you created earlier to the `callbacks` parameter of `fit()`. Run the cell below and observe what happens.

```python
# Train the model with a callback
model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])
```

You will notice that training does not need to complete all 10 epochs. Because the callback runs at the end of each epoch, it can inspect the metrics logged for that epoch and compare them with the threshold you set in the class definition. In this case, training stops after the first epoch whose loss falls below `0.4`.
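The interaction between the epoch-end hook and the `stop_training` flag can be mimicked with a toy loop that needs no TensorFlow. Everything here is a hypothetical stand-in for illustration (`ToyModel`, `LossThresholdCallback`, `run_training`, and the loss values); it is a sketch of the control flow, not how Keras implements `fit()`.

```python
class ToyModel:
    """Hypothetical stand-in for a Keras model; it only carries the stop flag."""
    def __init__(self):
        self.stop_training = False

class LossThresholdCallback:
    """Mirrors the loss-threshold check without depending on TensorFlow."""
    def __init__(self, model, threshold=0.4):
        self.model = model
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        loss = (logs or {}).get("loss")
        if loss is not None and loss < self.threshold:
            self.model.stop_training = True

def run_training(model, callback, losses_per_epoch):
    """Toy loop: one 'epoch' per loss value, ending once the flag is set."""
    epochs_run = 0
    for epoch, loss in enumerate(losses_per_epoch):
        epochs_run += 1
        callback.on_epoch_end(epoch, logs={"loss": loss})
        if model.stop_training:
            break
    return epochs_run

toy_model = ToyModel()
toy_callback = LossThresholdCallback(toy_model)
# Made-up loss values for a 10-epoch budget
losses = [0.62, 0.48, 0.41, 0.38, 0.35, 0.33, 0.31, 0.30, 0.29, 0.28]
epochs_run = run_training(toy_model, toy_callback, losses)
print(f"Stopped after {epochs_run} of {len(losses)} epochs")  # Stopped after 4 of 10 epochs
```

The callback never interrupts an epoch in progress. It only sets a flag, and the loop checks that flag once the epoch has finished.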

*Optional Challenge: Modify the code to make the training stop when the accuracy metric exceeds 60%.*

That concludes this simple exercise on callbacks!