# Ungraded Lab: Using Callbacks to Control Training

In this lab, you will use the [Callbacks API](https://keras.io/api/callbacks/) to stop training once a metric reaches a specified threshold. This saves you from running every epoch after the target has already been met. For example, if you set 1000 epochs and your desired accuracy is reached at epoch 200, training stops automatically at that point. The next sections show how this is implemented.

## Load and Normalize the Fashion MNIST dataset

As in the previous lab, you will use the Fashion MNIST dataset for this exercise. You will also normalize the pixel values again, which helps optimize the training.

In [4]:
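# This notebook runs on Kaggle, so the dataset is downloaded with kagglehub
# and read from the raw IDX files instead of tf.keras.datasets.
import os

import kagglehub
import numpy as np
import tensorflow as tf


def load_images(file_path):
    """Read an IDX image file into a (num_images, 28, 28) uint8 array."""
    with open(file_path, 'rb') as f:
        # Skip the 16-byte header (magic number, image count, rows, columns)
        data = np.frombuffer(f.read(), dtype=np.uint8, offset=16)
    return data.reshape(-1, 28, 28)


def load_labels(file_path):
    """Read an IDX label file into a 1-D uint8 array."""
    with open(file_path, 'rb') as f:
        # Skip the 8-byte header (magic number, label count)
        data = np.frombuffer(f.read(), dtype=np.uint8, offset=8)
    return data


# Download the dataset and get the local directory that contains it
path = kagglehub.dataset_download("zalando-research/fashionmnist")

training_images_path = os.path.join(path, 'train-images-idx3-ubyte')
training_labels_path = os.path.join(path, 'train-labels-idx1-ubyte')
testing_images_path = os.path.join(path, 't10k-images-idx3-ubyte')
testing_labels_path = os.path.join(path, 't10k-labels-idx1-ubyte')

# Load the data
x_training = load_images(training_images_path)
y_train = load_labels(training_labels_path)
x_testing = load_images(testing_images_path)
y_test = load_labels(testing_labels_path)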
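The `offset=16` and `offset=8` arguments above skip the IDX file headers. As a rough illustration of what those headers contain, the sketch below builds two tiny in-memory IDX buffers and parses them with header validation. The function names `parse_idx_images` and `parse_idx_labels` and the synthetic data are assumptions made for this example; they are not part of the lab.

```python
import struct

import numpy as np

IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
LABEL_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension


def parse_idx_images(raw):
    """Parse IDX image bytes, checking the 16-byte big-endian header."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected image magic number: {magic}")
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    return pixels.reshape(count, rows, cols)


def parse_idx_labels(raw):
    """Parse IDX label bytes, checking the 8-byte big-endian header."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected label magic number: {magic}")
    labels = np.frombuffer(raw, dtype=np.uint8, offset=8)
    if labels.size != count:
        raise ValueError("label count does not match the header")
    return labels


# Synthetic stand-ins for the real files: two 28x28 images and two labels
fake_pixels = (np.arange(2 * 28 * 28) % 256).astype(np.uint8)
fake_image_file = struct.pack(">IIII", IMAGE_MAGIC, 2, 28, 28) + fake_pixels.tobytes()
fake_label_file = struct.pack(">II", LABEL_MAGIC, 2) + bytes([3, 7])

images = parse_idx_images(fake_image_file)
labels = parse_idx_labels(fake_label_file)
print(images.shape, labels.tolist())  # (2, 28, 28) [3, 7]
```

Reading the shape from the header, rather than hard-coding `28, 28`, would also catch a file that was truncated or swapped by mistake.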

In [5]:


# Alternative (unused here): load the dataset through the Keras datasets API
# fmnist = tf.keras.datasets.fashion_mnist
# (x_train, y_train), (x_test, y_test) = fmnist.load_data()

# Normalize the pixel values to the range [0, 1]
x_train, x_test = x_training / 255.0, x_testing / 255.0
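A quick way to confirm what the normalization step does is to apply it to a small synthetic array. The sketch below uses random `uint8` data in place of the real images, so the names `fake_images` and `fake_scaled` are assumptions for illustration only.

```python
import numpy as np

# Random uint8 "images" standing in for the real dataset
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing a uint8 array by a Python float promotes the result to float64
fake_scaled = fake_images / 255.0

print(fake_scaled.dtype)
print(fake_scaled.min() >= 0.0, fake_scaled.max() <= 1.0)

# Optional: cast to float32 to halve the memory footprint
fake_scaled32 = fake_scaled.astype(np.float32)
print(fake_scaled32.nbytes, fake_scaled.nbytes)
```

The shape is unchanged by the division; only the dtype and value range differ from the raw data.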

## Creating a Callback class

You can create a callback by defining a class that inherits from the [tf.keras.callbacks.Callback](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/Callback) base class. In that class, you override the methods that determine when the callback runs. In the example below, you will use the [on_epoch_end()](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/Callback#on_epoch_end) method to check the loss at the end of each training epoch.

In [6]:
class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        '''
        Halts the training when the loss falls below 0.4

        Args:
            epoch (integer) - index of epoch (required but unused in the function definition below)
            logs (dict) - metric results from the training epoch
        '''

        # Guard against a missing logs dict or a missing 'loss' entry
        logs = logs or {}
        loss = logs.get('loss')

        # Check the loss
        if loss is not None and loss < 0.4:

            # Stop if threshold is met
            print("\nLoss is lower than 0.4 so cancelling training!")
            self.model.stop_training = True
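To see why setting `stop_training` ends the run, it can help to look at a stripped-down training loop. The sketch below is a framework-free stand-in, not Keras's actual implementation: `ToyModel`, `ToyLossThreshold`, and `toy_fit` are hypothetical names, and the per-epoch losses are supplied by hand instead of being computed.

```python
class ToyModel:
    """Hypothetical stand-in for a Keras model; it only carries the stop flag."""

    def __init__(self):
        self.stop_training = False


class ToyLossThreshold:
    """Hypothetical callback that mirrors the loss check used in this lab."""

    def __init__(self, threshold=0.4):
        self.threshold = threshold
        self.model = None

    def set_model(self, model):
        self.model = model

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        loss = logs.get("loss")
        if loss is not None and loss < self.threshold:
            self.model.stop_training = True


def toy_fit(model, epoch_losses, callbacks):
    """Run one 'epoch' per loss value and return how many epochs ran."""
    for callback in callbacks:
        callback.set_model(model)
    epochs_run = 0
    for epoch, loss in enumerate(epoch_losses):
        epochs_run += 1
        # Each callback sees the epoch's metrics after the epoch finishes
        for callback in callbacks:
            callback.on_epoch_end(epoch, logs={"loss": loss})
        # The loop checks the flag only after the callbacks have run
        if model.stop_training:
            break
    return epochs_run


epochs_run = toy_fit(ToyModel(), [0.58, 0.36, 0.30, 0.25], [ToyLossThreshold(0.4)])
print(epochs_run)  # 2
```

The key point is that the callback never interrupts an epoch. It only raises a flag, and the loop decides to exit once the current epoch is complete.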

## Define and compile the model

Next, you will define and compile the model. The architecture is similar to the one you built in the previous lab. You will then set the optimizer, loss, and metrics used for training.

In [7]:
# Define the model
model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(28,28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Compile the model
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
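To make the `sparse_categorical_crossentropy` setting more concrete, the sketch below computes the same kind of loss by hand with NumPy: for each sample, take the predicted probability of the true class and average the negative logarithms. This is a simplified illustration that assumes integer labels and already-normalized probabilities; the Keras implementation additionally clips probabilities for numerical stability. The helper name and the sample values are made up for the example.

```python
import numpy as np


def sparse_categorical_crossentropy(labels, probs):
    """Mean negative log-probability assigned to the true class."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())


# Two samples, three classes; each row is a softmax-style distribution
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])  # integer class indices, as in Fashion MNIST

loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))  # 0.2899
```

Read this way, a mean loss of 0.4 corresponds to a geometric-mean probability of about exp(-0.4) ≈ 0.67 for the correct class.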

## Train the model

Now you are ready to train the model. To set the callback, pass a list containing an instance of `myCallback` as the `callbacks` parameter. Run the cell below and observe what happens.

In [8]:
# Train the model with a callback
model.fit(x_train, y_train, epochs=10, callbacks=[myCallback()])

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 13s 6ms/step - accuracy: 0.7942 - loss: 0.5781
Epoch 2/10
1869/1875 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.8691 - loss: 0.3621
Loss is lower than 0.4 so cancelling training!
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 12s 6ms/step - accuracy: 0.8691 - loss: 0.3621

<keras.src.callbacks.history.History at 0x7d63e0474340>

You will notice that the training does not need to complete all 10 epochs. Because the callback runs at the end of each epoch, it can inspect the training metrics and compare them with the threshold you set in the function definition. In this case, training stops at the end of the first epoch whose loss falls below `0.4`.

*Optional Challenge: Modify the code to make the training stop when the accuracy metric exceeds 60%.*

That concludes this simple exercise on callbacks!
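One possible approach to the optional challenge is sketched below. It assumes the model was compiled with `metrics=['accuracy']`, so that the `logs` dictionary contains an `'accuracy'` entry. The class name `AccuracyThresholdCallback` and its `threshold` parameter are hypothetical choices for this example, not part of the lab.

```python
import tensorflow as tf


class AccuracyThresholdCallback(tf.keras.callbacks.Callback):
    """Stops training once the training accuracy exceeds a threshold."""

    def __init__(self, threshold=0.6):
        super().__init__()
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        # logs may be None, and 'accuracy' is only present if it was
        # requested as a metric when compiling the model
        accuracy = (logs or {}).get("accuracy")
        if accuracy is not None and accuracy > self.threshold:
            print(f"\nAccuracy is above {self.threshold:.0%} so cancelling training!")
            self.model.stop_training = True
```

It would be passed to training in the same way as before, for example `model.fit(x_train, y_train, epochs=10, callbacks=[AccuracyThresholdCallback(0.6)])`. Making the threshold a constructor argument lets the same class be reused with a different target.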