<a href="https://colab.research.google.com/github/Allen123321/DEMO-DL/blob/master/006_TensorFlow2%E6%95%99%E7%A8%8B_%E8%87%AA%E5%AE%9A%E4%B9%89%E5%9B%9E%E8%B0%83.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

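Custom callbacks are a powerful tool for customizing the behavior of a Keras model during training, evaluation, and inference, including reading or changing the model itself.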

In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf

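## 1 Introduction to Keras callbacks
In Keras, Callback is a Python class meant to be subclassed to provide specific functionality. It defines a set of methods that are called at various stages of training (including the start and end of each batch and epoch), testing, and prediction.

Callbacks are passed as a list through the callbacks argument, and their methods are then invoked at the corresponding stages of training, evaluation, or inference.

Build a model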

In [2]:
def get_model():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(1, activation = 'linear', input_dim = 784))
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.1), loss='mean_squared_error', metrics=['mae'])
    return model

Load the data

In [3]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [4]:

# Define a simple custom callback that tracks the start and end of every batch.

import datetime

class MyCustomCallback(tf.keras.callbacks.Callback):

    def on_train_batch_begin(self, batch, logs=None):
        print('Training: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))

    def on_train_batch_end(self, batch, logs=None):
        print('Training: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))

    def on_test_batch_begin(self, batch, logs=None):
        print('Evaluating: batch {} begins at {}'.format(batch, datetime.datetime.now().time()))

    def on_test_batch_end(self, batch, logs=None):
        print('Evaluating: batch {} ends at {}'.format(batch, datetime.datetime.now().time()))

In [5]:
# Pass the callback in when training

model = get_model()
_ = model.fit(x_train, y_train,
          batch_size=64,
          epochs=1,
          steps_per_epoch=5,
          verbose=0,
          callbacks=[MyCustomCallback()])

Training: batch 0 begins at 09:23:10.356244
Training: batch 0 ends at 09:23:10.847938
Training: batch 1 begins at 09:23:10.848678
Training: batch 1 ends at 09:23:10.850134
Training: batch 2 begins at 09:23:10.850306
Training: batch 2 ends at 09:23:10.851435
Training: batch 3 begins at 09:23:10.851564
Training: batch 3 ends at 09:23:10.852802
Training: batch 4 begins at 09:23:10.852932
Training: batch 4 ends at 09:23:10.853934
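
The timestamps above can be turned into per-batch durations with a little arithmetic. The sketch below is independent of TensorFlow; duration_ms is a hypothetical helper name, and the timestamp strings are copied from the output above. One plausible reading of the result, which the output itself does not confirm, is that the first batch absorbs one-time setup cost.

```python
from datetime import datetime

TIME_FORMAT = '%H:%M:%S.%f'  # matches the format printed by datetime.now().time()


def duration_ms(begin, end):
    """Return the elapsed time between two printed timestamps, in milliseconds."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(begin, TIME_FORMAT)
    return delta.total_seconds() * 1000.0


# Timestamps copied from the training output above.
first_batch_ms = duration_ms('09:23:10.356244', '09:23:10.847938')
second_batch_ms = duration_ms('09:23:10.848678', '09:23:10.850134')

print('batch 0 took {:.3f} ms'.format(first_batch_ms))
print('batch 1 took {:.3f} ms'.format(second_batch_ms))
```

In this run the first batch takes roughly half a second, while the second finishes in under two milliseconds.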


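### 1.1 Model methods that accept callbacks
fit(), fit_generator(): train the model, either on in-memory data or on data from a generator.

evaluate(), evaluate_generator(): evaluate the model, either on in-memory data or on data from a generator.

predict(), predict_generator(): generate predictions, either for in-memory data or for data from a generator.

Note: the three _generator variants are deprecated in later TensorFlow 2 releases, where fit(), evaluate(), and predict() accept generators directly.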

In [7]:
_ = model.evaluate(x_test, y_test, batch_size=128, verbose=0, steps=5,
          callbacks=[MyCustomCallback()])

Evaluating: batch 0 begins at 09:23:47.311024
Evaluating: batch 0 ends at 09:23:47.313384
Evaluating: batch 1 begins at 09:23:47.313553
Evaluating: batch 1 ends at 09:23:47.314869
Evaluating: batch 2 begins at 09:23:47.314976
Evaluating: batch 2 ends at 09:23:47.316052
Evaluating: batch 3 begins at 09:23:47.316174
Evaluating: batch 3 ends at 09:23:47.317379
Evaluating: batch 4 begins at 09:23:47.317480
Evaluating: batch 4 ends at 09:23:47.318413


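## 2 Overview of callback methods
### 2.1 Methods common to training, testing, and prediction

For training, testing, and prediction, the following methods can be overridden.

on_(train|test|predict)_begin(self, logs=None): called at the beginning of fit/evaluate/predict.

on_(train|test|predict)_end(self, logs=None): called at the end of fit/evaluate/predict.

on_(train|test|predict)_batch_begin(self, batch, logs=None): called right before a batch is processed during training/testing/prediction. Within this method, logs is a dict with the keys batch and size, giving the current batch number and the batch size.

on_(train|test|predict)_batch_end(self, batch, logs=None): called at the end of a training/testing/prediction batch. Within this method, logs is a dict containing the metric results.

### 2.2 Training-specific methods
In addition, the following methods are available for training only.

on_epoch_begin(self, epoch, logs=None): called at the beginning of an epoch during training.

on_epoch_end(self, epoch, logs=None): called at the end of an epoch during training.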
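
The order in which these hooks fire during training can be illustrated with a toy loop. The sketch below does not use Keras at all: RecordingCallback and toy_fit are hypothetical stand-ins that only mimic the nesting of the training hooks listed above (no validation step, no model), so treat it as a mental model rather than the library's actual loop.

```python
class RecordingCallback:
    """Toy stand-in for a callback: records the name of every hook that is called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the on_* hooks.
        if name.startswith('on_'):
            return lambda *args, **kwargs: self.calls.append(name)
        raise AttributeError(name)


def toy_fit(callback, epochs, steps_per_epoch):
    """Hypothetical training loop that fires hooks but trains nothing."""
    callback.on_train_begin()
    for epoch in range(epochs):
        callback.on_epoch_begin(epoch)
        for batch in range(steps_per_epoch):
            callback.on_train_batch_begin(batch)
            callback.on_train_batch_end(batch)
        callback.on_epoch_end(epoch)
    callback.on_train_end()


recorder = RecordingCallback()
toy_fit(recorder, epochs=1, steps_per_epoch=2)
for hook_name in recorder.calls:
    print(hook_name)
```

The batch hooks are nested inside the epoch hooks, which in turn are nested inside on_train_begin and on_train_end.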


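### 2.3 Using the logs dict
The logs dict contains the loss value and all the metrics at the end of a batch or epoch. The example below prints the loss and the mean absolute error.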

In [8]:
class LossAndErrorPrintingCallback(tf.keras.callbacks.Callback):

    def on_train_batch_end(self, batch, logs=None):
        print('For batch {}, loss is {:7.2f}.'.format(batch, logs['loss']))

    def on_test_batch_end(self, batch, logs=None):
        print('For batch {}, loss is {:7.2f}.'.format(batch, logs['loss']))

    def on_epoch_end(self, epoch, logs=None):
        print('The average loss for epoch {} is {:7.2f} and mean absolute error is {:7.2f}.'.format(epoch, logs['loss'], logs['mae']))

model = get_model()
_ = model.fit(x_train, y_train,
          batch_size=64,
          steps_per_epoch=5,
          epochs=3,
          verbose=0,
          callbacks=[LossAndErrorPrintingCallback()])

For batch 0, loss is   24.85.
For batch 1, loss is  477.44.
For batch 2, loss is  325.83.
For batch 3, loss is  246.63.
For batch 4, loss is  198.87.
The average loss for epoch 0 is  198.87 and mean absolute error is    8.26.
For batch 0, loss is    6.17.
For batch 1, loss is    6.03.
For batch 2, loss is    5.99.
For batch 3, loss is    5.77.
For batch 4, loss is    6.02.
The average loss for epoch 1 is    6.02 and mean absolute error is    2.03.
For batch 0, loss is    8.70.
For batch 1, loss is    7.12.
For batch 2, loss is    6.48.
For batch 3, loss is    7.38.
For batch 4, loss is    9.22.
The average loss for epoch 2 is    9.22 and mean absolute error is    2.49.
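
In the output above, the loss reported at the end of each epoch equals the value printed for that epoch's last batch. That is consistent with logs['loss'] being a running mean over the batches seen so far in the epoch, although the output alone does not prove it. Assuming that reading and equal-sized batches, the individual batch losses can be recovered as follows; per_batch_from_running_mean is a hypothetical helper name.

```python
def per_batch_from_running_mean(running_means):
    """Recover individual values from a sequence of running means.

    Assumes every batch has the same weight (equal batch sizes).
    """
    values = []
    previous_total = 0.0
    for count, mean in enumerate(running_means, start=1):
        total = mean * count  # sum of the first `count` values
        values.append(total - previous_total)
        previous_total = total
    return values


# Running values printed for epoch 0 in the output above.
epoch0_running = [24.85, 477.44, 325.83, 246.63, 198.87]
epoch0_batches = per_batch_from_running_mean(epoch0_running)
print([round(value, 2) for value in epoch0_batches])
```

Under this assumption, one very large loss on the second batch dominates the epoch's average, and the later batches are far smaller than the printed running values suggest.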


In [9]:
_ = model.evaluate(x_test, y_test, batch_size=128, verbose=0, steps=20,
          callbacks=[LossAndErrorPrintingCallback()])

For batch 0, loss is   19.19.
For batch 1, loss is   19.32.
For batch 2, loss is   19.30.
For batch 3, loss is   19.53.
For batch 4, loss is   19.50.
For batch 5, loss is   19.97.
For batch 6, loss is   20.11.
For batch 7, loss is   20.14.
For batch 8, loss is   20.09.
For batch 9, loss is   20.07.
For batch 10, loss is   19.94.
For batch 11, loss is   19.93.
For batch 12, loss is   20.00.
For batch 13, loss is   19.90.
For batch 14, loss is   19.79.
For batch 15, loss is   19.63.
For batch 16, loss is   19.72.
For batch 17, loss is   19.64.
For batch 18, loss is   19.69.
For batch 19, loss is   19.77.



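## 3 Examples of Keras callbacks
### 3.1 Early stopping at minimum loss
The first example shows a Callback that stops Keras training once the loss reaches its minimum, by setting the attribute model.stop_training (a boolean). Optionally, the user can provide a patience argument to specify how many epochs to wait before training is finally stopped.

Note: tf.keras.callbacks.EarlyStopping provides a more complete and general implementation.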

In [10]:
import numpy as np
class EarlyStoppingAtMinLoss(tf.keras.callbacks.Callback):
    def __init__(self, patience=0):
        super(EarlyStoppingAtMinLoss, self).__init__()
        self.patience = patience
        self.best_weights = None  # weights at the lowest loss seen so far
    def on_train_begin(self, logs=None):
        # Number of epochs waited since the loss last improved
        self.wait = 0
        # Epoch at which training was stopped
        self.stopped_epoch = 0
        # Best loss so far, initialized to infinity
        self.best = np.inf

    def on_epoch_end(self, epoch, logs=None):
        current = logs.get('loss')
        if np.less(current, self.best):
            self.best = current
            self.wait = 0
            # Record the best weights
            self.best_weights = self.model.get_weights()
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                self.model.stop_training = True
                print('Restoring model weights from the end of the best epoch.')
                self.model.set_weights(self.best_weights)
    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0:
            print('Epoch %05d: early stopping' % (self.stopped_epoch + 1))

In [11]:
model = get_model()
_ = model.fit(x_train, y_train,
          batch_size=64,
          steps_per_epoch=5,
          epochs=30,
          verbose=0,
          callbacks=[LossAndErrorPrintingCallback(), EarlyStoppingAtMinLoss()])

For batch 0, loss is   22.20.
For batch 1, loss is  491.81.
For batch 2, loss is  335.79.
For batch 3, loss is  254.13.
For batch 4, loss is  204.92.
The average loss for epoch 0 is  204.92 and mean absolute error is    8.25.
For batch 0, loss is    6.67.
For batch 1, loss is    7.11.
For batch 2, loss is    6.81.
For batch 3, loss is    6.58.
For batch 4, loss is    6.17.
The average loss for epoch 1 is    6.17 and mean absolute error is    2.06.
For batch 0, loss is    6.51.
For batch 1, loss is    5.75.
For batch 2, loss is    5.42.
For batch 3, loss is    5.38.
For batch 4, loss is    5.40.
The average loss for epoch 2 is    5.40 and mean absolute error is    1.89.
For batch 0, loss is   11.22.
For batch 1, loss is   16.06.
For batch 2, loss is   23.59.
For batch 3, loss is   26.06.
For batch 4, loss is   25.30.
The average loss for epoch 3 is   25.30 and mean absolute error is    4.31.
Restoring model weights from the end of the best epoch.
Epoch 00004: early stopping
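
The stopping rule used by the callback can be replayed on a plain list of epoch losses, without TensorFlow. The sketch below copies only the wait/patience bookkeeping from the class above; stopping_epoch is a hypothetical helper name, and the losses are the per-epoch values printed in the run above.

```python
import math


def stopping_epoch(losses, patience=0):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = math.inf
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            # Improvement: remember the new best loss and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Epoch-level losses from the run above.
epoch_losses = [204.92, 6.17, 5.40, 25.30]
stopped = stopping_epoch(epoch_losses, patience=0)
print('stopped at epoch index', stopped)
```

The returned index is 0-based, while the callback prints the stopped epoch plus one, which is why the run above reports epoch 00004. Note that with this particular rule, patience=0 and patience=1 both stop at the first epoch that fails to improve.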


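### 3.2 Learning rate scheduling
A common practice in model training is to change the learning rate as training progresses. The Keras backend exposes the get_value and set_value APIs, which can be used to read and set variables. This example shows how a custom callback can change the learning rate dynamically.

Note: this is only an example implementation; see callbacks.LearningRateScheduler and keras.optimizers.schedules for more general implementations.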
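
Before looking at the callback itself, it may help to see what a schedule function looks like in isolation. The sketch below is a TensorFlow-free illustration: STEP_SCHEDULE and step_lr are hypothetical names, and the (start epoch, learning rate) pairs are illustrative values rather than a recommendation.

```python
import bisect

# Hypothetical (start_epoch, learning_rate) pairs, sorted by start epoch.
STEP_SCHEDULE = [(3, 0.05), (6, 0.01), (9, 0.005), (12, 0.001)]


def step_lr(epoch, initial_lr):
    """Return the learning rate for `epoch` under a piecewise-constant schedule."""
    starts = [start for start, _ in STEP_SCHEDULE]
    # Number of schedule entries whose start epoch is <= the current epoch.
    index = bisect.bisect_right(starts, epoch)
    if index == 0:
        return initial_lr  # before the first boundary, keep the initial rate
    return STEP_SCHEDULE[index - 1][1]


for epoch in (0, 3, 5, 6, 12, 14):
    print(epoch, step_lr(epoch, initial_lr=0.1))
```

Because step_lr depends only on the epoch, it returns the same rate no matter what value the optimizer currently holds. That is one possible design choice for a schedule function handed to a scheduler callback.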

In [12]:

class LearningRateScheduler(tf.keras.callbacks.Callback):
    def __init__(self, schedule):
        super(LearningRateScheduler, self).__init__()
        self.schedule = schedule

    def on_epoch_begin(self, epoch, logs=None):
        if not hasattr(self.model.optimizer, 'learning_rate'):
            raise ValueError('Optimizer must have a "learning_rate" attribute.')
        # Get the current learning rate from the optimizer
        lr = float(tf.keras.backend.get_value(self.model.optimizer.learning_rate))
        # Call the schedule function to get the scheduled learning rate
        scheduled_lr = self.schedule(epoch, lr)
        tf.keras.backend.set_value(self.model.optimizer.learning_rate, scheduled_lr)
        print('Epoch %05d: Learning rate is %6.4f.' % (epoch, scheduled_lr))


Adjust the learning rate by epoch

In [13]:

LR_SCHEDULE = [
    # (epoch to start, learning rate) tuples
    (3, 0.05), (6, 0.01), (9, 0.005), (12, 0.001)
]

def lr_schedule(epoch, lr):
    if epoch < LR_SCHEDULE[0][0] or epoch > LR_SCHEDULE[-1][0]:
        return lr
    for i in range(len(LR_SCHEDULE)):
        if epoch == LR_SCHEDULE[i][0]:
            return LR_SCHEDULE[i][1]
    return lr

model = get_model()
_ = model.fit(x_train, y_train,
          batch_size=64,
          steps_per_epoch=5,
          epochs=15,
          verbose=0,
          callbacks=[LossAndErrorPrintingCallback(), LearningRateScheduler(lr_schedule)])

Epoch 00000: Learning rate is 0.1000.
For batch 0, loss is   23.23.
For batch 1, loss is  486.80.
For batch 2, loss is  334.04.
For batch 3, loss is  252.60.
For batch 4, loss is  203.43.
The average loss for epoch 0 is  203.43 and mean absolute error is    8.37.
Epoch 00001: Learning rate is 0.1000.
For batch 0, loss is    6.56.
For batch 1, loss is    6.60.
For batch 2, loss is    6.01.
For batch 3, loss is    6.15.
For batch 4, loss is    6.10.
The average loss for epoch 1 is    6.10 and mean absolute error is    2.02.
Epoch 00002: Learning rate is 0.1000.
For batch 0, loss is    5.31.
For batch 1, loss is    5.34.
For batch 2, loss is    5.41.
For batch 3, loss is    5.79.
For batch 4, loss is    5.81.
The average loss for epoch 2 is    5.81 and mean absolute error is    1.95.
Epoch 00003: Learning rate is 0.0500.
For batch 0, loss is    5.59.
For batch 1, loss is    5.16.
For batch 2, loss is    5.52.
For batch 3, loss is    5.37.
For batch 4, loss is    5.27.
The average loss for epoch 3 is    5.27 and mean absolute error is    1.