
# [TensorBoard](https://www.tensorflow.org/tensorboard/get_started)

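Basic dashboard types:

| Type | Usage |
| :--- | :--- |
| Scalars | Track loss and metrics, as well as training speed, learning rate, and other scalar values. |
| Graphs | Visualize your model. |
| Histograms | Visualize weights and biases. |
| Distributions | Visualize weights and biases. |

Histograms and Distributions differ mainly in chart style. Both are used mainly to inspect how values are distributed and to spot regions of parameters that have not been updated for a long time.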
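
To make the Histograms/Distributions tabs concrete, here is a minimal sketch that writes histogram summaries by hand with `tf.summary.histogram`. The tag name `dense/kernel`, the temporary log directory, and the random "weights" are all stand-ins chosen for illustration, not values from a real model.

```python
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical log location; any directory passed to --logdir would do.
log_dir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(log_dir)

rng = np.random.default_rng(0)
with writer.as_default():
    for step in range(5):
        # Stand-in for a layer's weights: the spread narrows at each step.
        weights = rng.normal(loc=0.0, scale=1.0 / (step + 1), size=1000)
        tf.summary.histogram("dense/kernel", weights, step=step)
writer.flush()
```

Pointing TensorBoard at `log_dir` should show the same data in both tabs, drawn in the two different chart styles.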

## Using TensorBoard with Keras Model.fit

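``` python 
import datetime

import tensorflow as tf

# create_model(), x_train, y_train, x_test and y_test are assumed to be defined elsewhere.
model = create_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Write each run to its own timestamped directory so runs can be compared.
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
# histogram_freq=1 records weight histograms every epoch.
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

model.fit(x=x_train, 
          y=y_train, 
          epochs=5, 
          validation_data=(x_test, y_test), 
          callbacks=[tensorboard_callback])

# Notebook magic; run `%load_ext tensorboard` first.
# If it fails, run `tensorboard --logdir logs/fit` from the command line instead.
%tensorboard --logdir logs/fit
```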

## Using TensorBoard in a custom training loop

TL;DR:
``` python
# Keep a reference to the writer; it is needed to open the logging context.
train_summary_writer = tf.summary.create_file_writer(train_log_dir)
with train_summary_writer.as_default():
    tf.summary.scalar('loss', train_loss.result(), step=epoch)
    tf.summary.scalar('accuracy', train_accuracy.result(), step=epoch)
```
Full example:

```python 
import datetime

import tensorflow as tf

# create_model, train_step, optimizer, train_dataset and the train_*/test_*
# metrics are assumed to be defined elsewhere.
current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
train_log_dir = 'logs/gradient_tape/' + current_time + '/train'
train_summary_writer = tf.summary.create_file_writer(train_log_dir)

model = create_model() # reset our model

EPOCHS = 5

for epoch in range(EPOCHS):
    for (x_train, y_train) in train_dataset:
        train_step(model, optimizer, x_train, y_train)
    # Log epoch-level metrics so they show up in the Scalars dashboard.
    with train_summary_writer.as_default():
        tf.summary.scalar('loss', train_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', train_accuracy.result(), step=epoch)

    # The test metrics are only meaningful if a test loop (omitted here) updates them.
    template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
    print(template.format(epoch+1,
                          train_loss.result(), 
                          train_accuracy.result()*100,
                          test_loss.result(), 
                          test_accuracy.result()*100))

    # Reset metrics every epoch (on TF < 2.5 the method is reset_states()).
    train_loss.reset_state()
    train_accuracy.reset_state()
```
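
The loop above relies on `train_step`, `optimizer`, and the metric objects without showing them. The following is one possible sketch of those pieces, assuming a small classifier and random data purely for illustration; the layer sizes, the `loss_object` name, and the synthetic dataset are assumptions, not part of the original notes.

```python
import numpy as np
import tensorflow as tf

loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()
# Running aggregates that the training loop reads with .result().
train_loss = tf.keras.metrics.Mean(name="train_loss")
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name="train_accuracy")


def create_model():
    # Hypothetical tiny classifier: 4 features in, 3 classes out.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])


def train_step(model, optimizer, x_train, y_train):
    # Record the forward pass so gradients can be computed.
    with tf.GradientTape() as tape:
        predictions = model(x_train, training=True)
        loss = loss_object(y_train, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    # Update the running metrics for this batch.
    train_loss.update_state(loss)
    train_accuracy.update_state(y_train, predictions)


# Synthetic data, only to make the sketch runnable.
rng = np.random.default_rng(0)
features = rng.normal(size=(64, 4)).astype("float32")
labels = rng.integers(0, 3, size=64)
train_dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(16)

model = create_model()
for x_batch, y_batch in train_dataset:
    train_step(model, optimizer, x_batch, y_batch)

print(float(train_loss.result()), float(train_accuracy.result()))
```

Because the metrics accumulate across batches, the values logged once per epoch are epoch averages, which is why the loop resets them after each epoch.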

## Logging custom scalars via a callback
**Example: learning rate**

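```python 
import tensorflow as tf
from tensorflow import keras

# tf.summary.scalar only writes when a default summary writer is set.
# Assumption: log_dir is the TensorBoard log directory defined earlier.
file_writer = tf.summary.create_file_writer(log_dir + "/metrics")
file_writer.set_as_default()

def lr_schedule(epoch):
    """
    Returns a custom learning rate that decreases as epochs progress.
    """
    learning_rate = 0.2
    if epoch > 10:
        learning_rate = 0.02
    if epoch > 20:
        learning_rate = 0.01
    if epoch > 50:
        learning_rate = 0.005

    # Log the chosen rate so it appears next to loss in the Scalars dashboard.
    tf.summary.scalar('learning rate', data=learning_rate, step=epoch)
    return learning_rate

lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule)

...

training_history = model.fit(
    x_train, # input
    y_train, # output
    batch_size=train_size,
    verbose=0, # Suppress chatty output; use TensorBoard instead
    epochs=100,
    validation_data=(x_test, y_test),
    callbacks=[tensorboard_callback, lr_callback],
)
```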
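
The thresholds in the schedule above use strict `>` comparisons, so each drop takes effect one epoch after the boundary value. A quick sanity check makes this visible; this sketch repeats the schedule without the `tf.summary.scalar` call so that it runs without TensorFlow.

```python
def lr_schedule(epoch):
    """Same piecewise-constant schedule as above, minus the summary logging."""
    learning_rate = 0.2
    if epoch > 10:
        learning_rate = 0.02
    if epoch > 20:
        learning_rate = 0.01
    if epoch > 50:
        learning_rate = 0.005
    return learning_rate


# Probe each boundary and the epoch right after it.
probes = [0, 10, 11, 20, 21, 50, 51]
rates = [lr_schedule(epoch) for epoch in probes]
print(rates)  # [0.2, 0.2, 0.02, 0.02, 0.01, 0.01, 0.005]
```

Epochs are 0-indexed in `LearningRateScheduler`, so epoch index 10 (the eleventh epoch) still trains at 0.2.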

For **visualizing [embeddings](https://www.tensorflow.org/tensorboard/tensorboard_projector_plugin) and [hyperparameter tuning](https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams)**, see the official documentation.

# [callbacks](https://www.tensorflow.org/guide/keras/writing_your_own_callbacks)
1. A callback is a powerful tool to customize the behavior of a Keras model during training, evaluation, or inference.
2. Callbacks are useful to get a view on internal states and statistics of the model during training.
3. You can pass a list of callbacks (as the keyword argument callbacks) to the following model methods:
   - keras.Model.fit()
   - keras.Model.evaluate()
   - keras.Model.predict()
4. Callbacks expose a series of methods; see [An overview of callback methods](https://www.tensorflow.org/guide/keras/writing_your_own_callbacks#an_overview_of_callback_methods).
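
As a rough illustration of the points above, the sketch below defines a callback that only records which hooks fire, then passes it to `fit()`, `evaluate()`, and `predict()`. The class name `EventRecorder`, the one-layer model, and the random data are hypothetical and exist only to make the example runnable.

```python
import numpy as np
import tensorflow as tf


class EventRecorder(tf.keras.callbacks.Callback):
    """Records the order in which a few callback hooks are invoked."""

    def __init__(self):
        super().__init__()
        self.events = []

    def on_train_begin(self, logs=None):
        self.events.append("train_begin")

    def on_epoch_end(self, epoch, logs=None):
        self.events.append(f"epoch_end:{epoch}")

    def on_train_end(self, logs=None):
        self.events.append("train_end")

    def on_test_end(self, logs=None):
        self.events.append("test_end")

    def on_predict_end(self, logs=None):
        self.events.append("predict_end")


rng = np.random.default_rng(0)
x = rng.random((32, 3)).astype("float32")
y = rng.random((32, 1)).astype("float32")

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

recorder = EventRecorder()
model.fit(x, y, epochs=2, verbose=0, callbacks=[recorder])
model.evaluate(x, y, verbose=0, callbacks=[recorder])
model.predict(x, verbose=0, callbacks=[recorder])

print(recorder.events)
```

No validation data is passed to `fit()` here, so the test hooks fire only during the explicit `evaluate()` call.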

## Most common: working with self.model
Here are a few of the things you can do with self.model in a callback:

1. Set self.model.stop_training = True to **immediately interrupt training**.
2. **Mutate hyperparameters** of the optimizer (available as self.model.optimizer), such as self.model.optimizer.learning_rate.
3. **Save the model** at periodic intervals.
4. **Record the output of model.predict() on a few test samples** at the end of each epoch, to use as a sanity check during training.
5. **Extract visualizations of intermediate features** at the end of each epoch, to monitor what the model is learning over time.

Two examples:  
[Early stopping at minimum loss](https://www.tensorflow.org/guide/keras/writing_your_own_callbacks#early_stopping_at_minimum_loss)  
[Learning rate scheduling](https://www.tensorflow.org/guide/keras/writing_your_own_callbacks#learning_rate_scheduling)
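
For reference, here is a condensed sketch in the spirit of the first linked example: a callback that uses `self.model.stop_training` to halt once the training loss stops improving, then restores the best weights. The demo deliberately uses a learning rate of 0.0, a single full batch, and `shuffle=False` so that the loss never improves and the stop is deterministic; those settings, the model, and the data are assumptions made for illustration only.

```python
import numpy as np
import tensorflow as tf


class EarlyStoppingAtMinLoss(tf.keras.callbacks.Callback):
    """Stops training when the loss has not improved for `patience` epochs."""

    def __init__(self, patience=0):
        super().__init__()
        self.patience = patience
        self.best_weights = None

    def on_train_begin(self, logs=None):
        self.wait = 0           # epochs since the last improvement
        self.stopped_epoch = 0  # epoch index at which training was stopped
        self.best = np.inf

    def on_epoch_end(self, epoch, logs=None):
        current = logs.get("loss")
        if np.less(current, self.best):
            # New best loss: remember it and snapshot the weights.
            self.best = current
            self.wait = 0
            self.best_weights = self.model.get_weights()
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                self.model.stop_training = True
                self.model.set_weights(self.best_weights)


rng = np.random.default_rng(0)
x = rng.random((32, 3)).astype("float32")
y = rng.random((32, 1)).astype("float32")

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
# learning_rate=0.0 freezes the weights, so the loss is identical every epoch.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.0), loss="mse")

stopper = EarlyStoppingAtMinLoss(patience=1)
history = model.fit(x, y, batch_size=len(x), shuffle=False, epochs=10,
                    verbose=0, callbacks=[stopper])

print(len(history.history["loss"]), stopper.stopped_epoch)
```

In real training the same callback would be used with a normal optimizer; Keras also ships a built-in `tf.keras.callbacks.EarlyStopping` that covers this use case.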