# How to "see" what's going on during training?

During training, and especially when deciding on early stopping, we would like to have:

- Model graph visualization
- Visualization of measured scalars
    - Training metrics (on the training and held-out data)
    - Weights based metrics (histogram of weights)

So instead of simple printouts, we can get interactive charts, like this:

<img src="https://www.tensorflow.org/images/mnist_tensorboard.png" width=600 height=600>

Generally, TensorBoard's design rests on the premise that it can access a continuously growing **log** of events that occur during training.
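
To make the idea of a growing event log concrete, here is a toy sketch that uses only the Python standard library. It is not TensorBoard's actual file format: the `log_scalar` and `read_scalars` helpers, the JSON-lines layout and the file name are all assumptions made purely for illustration.

```python
import json
import os
import tempfile

def log_scalar(path, tag, value, step):
    """Append one event to the log; earlier events are never rewritten."""
    with open(path, "a") as f:
        f.write(json.dumps({"tag": tag, "value": value, "step": step}) + "\n")

def read_scalars(path, tag):
    """Re-read the whole log and return (step, value) pairs for one tag."""
    with open(path) as f:
        events = [json.loads(line) for line in f]
    return [(e["step"], e["value"]) for e in events if e["tag"] == tag]

# Hypothetical log location, used only for this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "events.jsonl")

# The "training loop" appends events while it runs ...
for step, loss in enumerate([0.9, 0.6, 0.4]):
    log_scalar(log_path, "loss", loss, step)

# ... and a separate viewer can read the log at any time to draw a curve.
curve = read_scalars(log_path, "loss")
```

The writer and the reader share nothing but the file, which is why the viewer can run as a separate process from training.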


## Logging in TF: `tf.summary`

With the help of the [tf.summary](https://www.tensorflow.org/api_docs/python/tf/summary) tools we can collect measurements during training (in TF1, during the `Session`) and save them to a pre-defined folder.

### How to save summaries in "barebones" TF?

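```python
import tensorflow as tf

# `model`, `optimizer`, `dataset`, `train_step`, `loss_fn`, `test_x` and `test_y`
# are assumed to be defined elsewhere.

def train(model, optimizer, dataset, log_freq=10):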
  avg_loss = tf.keras.metrics.Mean(name='loss', dtype=tf.float32)
  for images, labels in dataset:
    loss = train_step(model, optimizer, images, labels)
    avg_loss.update_state(loss)
    if tf.equal(optimizer.iterations % log_freq, 0):
      tf.summary.scalar('loss', avg_loss.result(), step=optimizer.iterations)
      avg_loss.reset_state()  # older TF 2.x releases spell this reset_states()

def test(model, test_x, test_y, step_num):
  # training=False is only needed if there are layers with different
  # behavior during training versus inference (e.g. Dropout).
  loss = loss_fn(model(test_x, training=False), test_y)
  tf.summary.scalar('loss', loss, step=step_num)

train_summary_writer = tf.summary.create_file_writer('/tmp/summaries/train')
test_summary_writer = tf.summary.create_file_writer('/tmp/summaries/test')

with train_summary_writer.as_default():
  train(model, optimizer, dataset)

with test_summary_writer.as_default():
  test(model, test_x, test_y, optimizer.iterations)
```
In the now-default "eager" execution mode, the ops are executed immediately, so the data is saved on the fly. (In TF 1.x "graph mode" one has to fetch and save the summaries explicitly.)


**To sum it up:**
- We create a summary writer with `tf.summary.create_file_writer`
    - We use `/tmp/summaries/train` and `/tmp/summaries/test` in our example, but it can be any folder name.
    - We make a writer the default with `as_default()`, so that every `tf.summary` call inside the `with` block goes to it. (Multiple writers can exist side by side, e.g. one per model or per data split, each with its own log folder.)
- We write an element to the summary (in this case every `log_freq` updates)
- Logs will be written to the appropriate folder



## Using TensorBoard over the log

So if we have a constantly updating log, we can start to use TensorBoard.

TensorBoard is a **separate process** with a built-in **web server** that we have to start on our own; by default it listens on

`http://localhost:6006/`.

To start TensorBoard, use the command 

`tensorboard --logdir ...`

After this, navigate to the web address above, where you can interact with the log data.

A short tutorial on TensorBoard can be found [here](https://medium.com/@anthony_sarkis/tensortoard-quick-start-in-5-minutes-e3ec69f673af), and the documentation [here](https://www.tensorflow.org/guide/summaries_and_TensorBoard).

The TensorBoard approach is scalable: if a separate log collector / viewer machine is used, it can scale to training clusters. (See [here](https://learnk8s.io/infiniteconf2018))

## How to use TensorBoard with Keras?

Keras has a concept of `callbacks`, which are executed at various points in training, such as the beginning / end of a minibatch or the beginning / end of an epoch.
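
The callback idea can be sketched in a few lines of plain Python. This is a toy illustration, not Keras's implementation: `ToyCallback`, `HistoryRecorder` and `toy_fit` are hypothetical names, and the "loss" is a stand-in formula rather than the result of real training.

```python
class ToyCallback:
    """Base class: the hooks do nothing unless a subclass overrides them."""
    def on_epoch_begin(self, epoch):
        pass

    def on_epoch_end(self, epoch, logs):
        pass

class HistoryRecorder(ToyCallback):
    """Example callback that remembers the loss reported after each epoch."""
    def __init__(self):
        self.history = []

    def on_epoch_end(self, epoch, logs):
        self.history.append((epoch, logs["loss"]))

def toy_fit(epochs, callbacks):
    """Stand-in training loop that calls every hook at the right moment."""
    for epoch in range(epochs):
        for cb in callbacks:
            cb.on_epoch_begin(epoch)
        loss = 1.0 / (epoch + 1)  # placeholder for a real training epoch
        for cb in callbacks:
            cb.on_epoch_end(epoch, {"loss": loss})

recorder = HistoryRecorder()
toy_fit(epochs=3, callbacks=[recorder])
```

The training loop knows nothing about what a callback does with the values it receives, which is what lets a logger be plugged in without touching the loop.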

Keras comes with a built-in facility for plugging in a TensorBoard logger as a callback.

The design pattern is as follows:

```python
from keras.callbacks import TensorBoard
...

tensorboard = TensorBoard(log_dir='./logs', update_freq='epoch')
...

model.fit(x_train, y_train_cat,
          epochs=...,
          validation_data=(x_test, y_test_cat),
          callbacks=[tensorboard])
```





The saved logs will be processed by TensorBoard as usual. It is worth noting that there are many customization possibilities, e.g. letting the callback save gradient histograms, weight histograms, etc.

The same general design pattern holds for `ModelCheckpoint`, a callback that saves our model during training; it is especially handy with the `save_best_only=True` parameter enabled.
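
The logic behind "save best only" can be sketched as follows. This is a toy under stated assumptions, not the Keras `ModelCheckpoint` code: `BestOnlyCheckpoint` is a hypothetical class, the validation losses are made-up numbers, and "saving" is reduced to remembering the epoch index.

```python
class BestOnlyCheckpoint:
    """Keeps a checkpoint only when the monitored value improves."""
    def __init__(self):
        self.best = float("inf")  # lower is better for a loss
        self.saved_epochs = []

    def on_epoch_end(self, epoch, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            # A real checkpoint callback would write the model to disk here.
            self.saved_epochs.append(epoch)

checkpoint = BestOnlyCheckpoint()
# Made-up validation losses for five epochs.
for epoch, val_loss in enumerate([0.8, 0.6, 0.7, 0.5, 0.55]):
    checkpoint.on_epoch_end(epoch, val_loss)
```

Epochs where the validation loss got worse are skipped, so the last saved checkpoint is always the best one seen so far.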

## Getting TensorBoard to work in Colab

Since Colab is a remote environment, running a new webserver is not totally straightforward.

Luckily, Google has built a nice integration of TensorBoard into Colab.

### Initialization

Load the TensorBoard extension

```python
%load_ext tensorboard
```

### Add to tf.keras callback

Same as before...

```python
logdir = "logs/"  # assumed here to match the --logdir passed to %tensorboard
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
```

### Start TensorBoard within the notebook using the magic command

```python
%tensorboard --logdir logs/
```