# Ch. 7.2 Inspecting and monitoring deep-learning models using Keras callbacks and TensorBoard
In this section, we introduce methods that give us more visibility into, and control over, computationally intensive models during training: the training process can monitor its own state, report back to its operator, and take corrective action automatically.

## 7.2.1 Using callbacks to act on a model during training
When training a model, there are many things you can't predict from the start. It's difficult to know how many epochs are necessary to get an optimal validation loss. In previous sections, we have trained models with enough epochs that we begin overfitting, using the first run to determine the proper number of epochs to train for, and then launching a new training run from scratch using the optimal number. So many unnecessary computations!

A better approach would be to stop training as soon as we measure that the validation loss is no longer improving, which can be done using a Keras *callback*. A *callback* is an object that is passed to the model in the call to `fit` and that the model calls at various points during training. The callback has the capability of interrupting training, saving a model, loading a different weight set, or altering the model in other ways. Here are some examples of ways to use callbacks:
 - **Model checkpointing** - Saving the current weights of the model at different points during training.
 - **Early stopping** - Interrupting training when the validation loss is no longer improving.
 - **Dynamically adjusting the value of parameters during training** - Such as the learning rate of the optimizer.
 - **Logging training and validation metrics during training, or visualizing representations learned by the model as they are updated** - The Keras progress bar is a callback!
 
Let's review a few of the built-in callbacks from the `keras.callbacks` module:

 - `keras.callbacks.ModelCheckpoint`
 - `keras.callbacks.EarlyStopping`
 - `keras.callbacks.LearningRateScheduler`
 - `keras.callbacks.ReduceLROnPlateau`
 - `keras.callbacks.CSVLogger`

**THE ModelCheckpoint AND EarlyStopping CALLBACKS**

We can use **`EarlyStopping`** to interrupt training once a target metric being monitored has stopped improving for a fixed number of epochs. For instance, this callback allows us to interrupt training as soon as we start overfitting, allowing us to avoid retraining our model for a smaller number of epochs. This callback is typically used in combination with **`ModelCheckpoint`**, which lets us continually save the model during training (and, optionally, save only the current best model):

In [None]:
import keras

# Callbacks are passed to the model via the callbacks argument in fit
# It takes a list of callbacks & we can pass any number of them.
callbacks_list = [
    keras.callbacks.EarlyStopping( # interrupts training when improvement stops
        monitor='val_acc', # monitors model validation accuracy
        patience=1, # interrupts training once val_acc has failed to improve for 1 epoch
    ),
    keras.callbacks.ModelCheckpoint( # saves current weights after every epoch
        filepath='my_model.h5', # path to the destination model file
        monitor='val_loss', # together, these two arguments mean the model file
        save_best_only=True, # is not overwritten unless val_loss has improved
    )
]

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc']) # monitor accuracy, so it should be part of model's metrics

model.fit(x, y,
          epochs=10,
          batch_size=32,
          callbacks=callbacks_list,
          validation_data=(x_val, y_val)) # need to pass validation_data to the call to fit
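
The meaning of `patience` can be easier to reason about with a small pure-Python sketch. This is not Keras's implementation: the function name `early_stopping_epoch` and the loss values are made up for illustration, and the sketch assumes a metric where lower is better and ignores options such as `min_delta`.

```python
def early_stopping_epoch(val_losses, patience=1):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified sketch: an epoch counts as an improvement only if its
    loss is strictly lower than the best loss seen so far.
    """
    best = float('inf')
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses: improvement stalls after epoch 2
losses = [0.60, 0.45, 0.40, 0.42, 0.41, 0.39]
stop_1 = early_stopping_epoch(losses, patience=1)  # stops at epoch 3
stop_3 = early_stopping_epoch(losses, patience=3)  # never triggers
print(stop_1, stop_3)
```

A small `patience` stops quickly but can react to noise in the validation metric; a larger value tolerates short stalls at the cost of a few extra epochs.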

**THE ReduceLROnPlateau CALLBACK**

We can use this callback to reduce the learning rate when the validation loss has stopped improving. Reducing or increasing the learning rate in case of a *loss plateau* is an effective strategy to get out of local minima during training. Let's see an example:

In [None]:
callbacks_list = [
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss', # monitors model's validation loss
        factor=0.1, # divides learning rate by 10 when triggered
        patience=10, # triggered after validation loss has stopped improving for 10 epochs
    )
]

model.fit(x, y,
          epochs=10,
          batch_size=32,
          callbacks=callbacks_list,
          validation_data=(x_val, y_val))
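
As a rough illustration of how `factor` and `patience` interact, the following pure-Python sketch replays a list of validation losses and tracks the resulting learning rate. It is a simplification under stated assumptions, not the Keras implementation: `plateau_schedule`, the initial learning rate, and the loss values are hypothetical, and options such as `cooldown` and `min_lr` are left out.

```python
def plateau_schedule(val_losses, initial_lr=0.001, factor=0.1, patience=2):
    """Return the learning rate in effect after each epoch."""
    lr = initial_lr
    best = float('inf')
    wait = 0  # consecutive epochs without improvement
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = lr * factor  # shrink the learning rate
                wait = 0          # start counting again
        history.append(lr)
    return history


# Hypothetical validation losses with two plateaus
losses = [0.50, 0.40, 0.41, 0.42, 0.35, 0.36, 0.37]
lrs = plateau_schedule(losses, initial_lr=0.001, factor=0.1, patience=2)
for epoch, lr in enumerate(lrs):
    print(epoch, lr)
```

With these values the learning rate is reduced twice: once after the plateau at epochs 2 and 3, and again after the plateau at epochs 5 and 6.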

**WRITING YOUR OWN CALLBACK**

We can also implement our own callbacks if none of the built-in callbacks performs the action we need. Callbacks are implemented by subclassing the class `keras.callbacks.Callback`. We can implement any number of the following transparently named methods:

In [None]:
on_epoch_begin # Called at start of every epoch
on_epoch_end # Called at end of every epoch

on_batch_begin # Called right before processing each batch
on_batch_end # Called right after processing each batch

on_train_begin # Called at the start of training
on_train_end # Called at the end of training

These methods are all called with a **`logs`** argument, which is a dictionary containing information about the previous batch, epoch, or training run: training and validation metrics, and so on. Additionally, the callback has access to the following attributes:
 - **`self.model`** - The model instance from which the callback is being called
 - **`self.validation_data`** - The value of what was passed to `fit` as validation data
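
To show the order in which these hooks fire and how `logs` reaches them, here is a toy sketch that does not depend on Keras. The `Callback` class below is a hypothetical stand-in for `keras.callbacks.Callback`, and `toy_fit` and `LossHistory` are made-up names; a real training loop would also call the batch-level hooks and compute the losses itself.

```python
class Callback:
    """Hypothetical stand-in for keras.callbacks.Callback."""
    def on_train_begin(self, logs=None): pass
    def on_epoch_begin(self, epoch, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass
    def on_train_end(self, logs=None): pass


class LossHistory(Callback):
    """Records the loss reported at the end of each epoch."""
    def on_train_begin(self, logs=None):
        self.losses = []

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.losses.append(logs.get('loss'))


def toy_fit(epoch_losses, callbacks):
    """Mimics the order in which a training loop calls the hooks."""
    for cb in callbacks:
        cb.on_train_begin()
    for epoch, loss in enumerate(epoch_losses):
        for cb in callbacks:
            cb.on_epoch_begin(epoch)
        # ... a real loop would process batches here ...
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs={'loss': loss})
    for cb in callbacks:
        cb.on_train_end()


history = LossHistory()
toy_fit([0.9, 0.7, 0.6], callbacks=[history])
print(history.losses)  # [0.9, 0.7, 0.6]
```

Because each hook has a do-nothing default, a subclass only needs to override the ones it cares about.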
 
Here is a simple example of a custom callback that saves to disk (as NumPy arrays) the activations of every layer of the model at the end of every epoch, computed on the first sample of the validation set:

In [None]:
import keras
import numpy as np

class ActivationLogger(keras.callbacks.Callback):

    def set_model(self, model):
        self.model = model # called by parent model before training, to inform callback of what model will be calling it
        layer_outputs = [layer.output for layer in model.layers]
        self.activations_model = keras.models.Model(model.input,
                                                    layer_outputs) # model instance that returns activations of every layer

    def on_epoch_end(self, epoch, logs=None):
        if self.validation_data is None:
            raise RuntimeError('Requires validation_data.')

        validation_sample = self.validation_data[0][0:1] # obtains the first input sample of validation data
        activations = self.activations_model.predict(validation_sample)
        # .npz is a binary format, so the file must be opened in 'wb' mode
        with open('activations_at_epoch_' + str(epoch) + '.npz', 'wb') as f:
            np.savez(f, *activations) # saves each layer's activations as a separate array

## 7.2.2 Introduction to TensorBoard: the TensorFlow visualization framework
To do good research or develop good models, we need rich, frequent feedback about what’s going on inside our models during experiments. Making progress is an iterative process, or loop: start with an idea and express it as an experiment that attempts to validate or invalidate the idea; run the experiment and process the information it generates; then let that information inspire the next idea. The more iterations of this loop we’re able to run, the more refined and powerful our ideas become. Keras helps us go from idea to experiment in the least possible time, and fast GPUs can help get from experiment to result as quickly as possible. But what about processing the experiment results? That’s where TensorBoard comes in.

![progress loop](images/7_2_2_progress.jpg)

TensorBoard is a browser-based visualization tool that comes packaged with TensorFlow. The main purpose of TensorBoard is to help visually monitor everything that goes on inside a model during training. Several cool features are provided by TensorBoard, all in a browser:
 - Visually monitoring metrics during training
 - Visualizing model architecture
 - Visualizing histograms of activations and gradients
 - Exploring embeddings in 3D
 
Let's take a look at these features with a simple example. We will train a 1D CNN on the IMDB sentiment-analysis task.

**TEXT-CLASSIFICATION MODEL TO USE WITH TENSORBOARD**

In [2]:
# import keras modules
import keras
from keras import layers
from keras.datasets import imdb
from keras.preprocessing import sequence

In [3]:
# Number of words to consider as features
max_features = 2000

# Cuts off texts after this number of words
max_len = 500

In [4]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)

In [5]:
model = keras.models.Sequential()
model.add(layers.Embedding(max_features, 128, input_length=max_len, name='embed'))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1, activation='sigmoid')) # binary_crossentropy expects probabilities
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embed (Embedding)            (None, 500, 128)          256000    
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 494, 32)           28704     
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 98, 32)            0         
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 92, 32)            7200      
_________________________________________________________________
global_max_pooling1d_1 (Glob (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 33        
Total params: 291,937
Trainable params: 291,937
Non-trainable params: 0
_________________________________________________________________
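
The numbers in the summary above can be checked by hand. The sketch below redoes the arithmetic in plain Python, assuming the Keras defaults for these layers (`'valid'` padding and stride 1 for `Conv1D`, and a bias term in every convolutional and dense layer); the helper names `conv1d_params` and `conv1d_length` are made up for illustration.

```python
max_features, embed_dim, max_len = 2000, 128, 500


def conv1d_params(kernel_size, in_channels, filters):
    # one weight per (kernel position, input channel, filter), plus one bias per filter
    return kernel_size * in_channels * filters + filters


def conv1d_length(length, kernel_size):
    # output length with 'valid' padding and stride 1
    return length - kernel_size + 1


embed_params = max_features * embed_dim          # 256000
conv1_params = conv1d_params(7, embed_dim, 32)   # 28704
conv2_params = conv1d_params(7, 32, 32)          # 7200
dense_params = 32 * 1 + 1                        # 33

len_conv1 = conv1d_length(max_len, 7)            # 494
len_pool = len_conv1 // 5                        # 98
len_conv2 = conv1d_length(len_pool, 7)           # 92

total = embed_params + conv1_params + conv2_params + dense_params
print(total)  # 291937
```

The embedding layer accounts for most of the parameters, which is typical for text models with a sizeable vocabulary.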


Before we can use TensorBoard, we need to create a directory (`my_log_dir`) where we'll store the log files it generates. Once this is done, let's launch the training with a `TensorBoard` callback instance. This callback will write log events to disk at the specified location.
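
One way to create the log directory without leaving the notebook is with Python's standard library. This is only a sketch; running `mkdir my_log_dir` in a shell works just as well.

```python
import os

log_dir = 'my_log_dir'
# exist_ok=True avoids an error if the directory is already there
os.makedirs(log_dir, exist_ok=True)
print(os.path.isdir(log_dir))
```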

**TRAINING THE MODEL WITH A TensorBoard CALLBACK**

In [6]:
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir='my_log_dir', # files written here
        histogram_freq=1, # Records activation histograms every 1 epoch
        embeddings_freq=1, # Records embedding data every 1 epoch
    )
]

history = model.fit(x_train, y_train, epochs=20, batch_size=128,
                    validation_split=0.2, callbacks=callbacks)

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


At this point, we can launch the TensorBoard server from the command line, instructing it to read the logs the callback is currently writing. Here is the command to enter:

`tensorboard --logdir=my_log_dir`

We can then browse to http://localhost:6006 and look at the model we are training. In addition to live graphs of the training and validation metrics, we can access the Histograms tab, where we can find pretty visualizations of histograms of activation values taken by our layers.

![metrics](images/7_2_2_tensorboard1.jpg)
![histograms](images/7_2_2_tensorboard2.jpg)

The Embeddings tab gives us a way to inspect the embedding locations and spatial relationships of the 2,000 words in the input vocabulary (the value of `max_features`), as learned by the initial `Embedding` layer. Because the embedding space is 128-dimensional, TensorBoard automatically reduces it to 2D or 3D using a dimensionality-reduction algorithm of our choice: either principal component analysis (PCA) or t-distributed stochastic neighbor embedding (t-SNE).

![clusters](images/7_2_2_tensorboard3.jpg)

In the point cloud shown in the figure above, we can clearly see two clusters: words with a positive connotation and words with a negative connotation. The visualization makes it immediately obvious that embeddings trained jointly with a specific objective result in models that are completely specific to the underlying task. That’s the reason using pretrained generic word embeddings is rarely a good idea.

![graphs](images/7_2_2_tensorboard4.jpg)

The Graphs tab shows an interactive visualization of the graph of low-level TensorFlow operations underlying our Keras model. As you can see, there’s a lot more going on than you would expect. The model we just built may look simple when defined in Keras—a small stack of basic layers—but under the hood, we need to construct a fairly complex graph structure to make it work. A lot of it is related to the gradient-descent process. This complexity differential between what we see and what we’re manipulating is the key motivation for using Keras as our way of building models, instead of working with raw TensorFlow to define everything from scratch. Keras makes our workflow dramatically simpler.

## Wrapping up
 - Keras callbacks provide a simple way to monitor models during training and automatically take action based on the state of the model.
 - When using TensorFlow, TensorBoard is a great way to visualize model activity in your browser. You can use it in Keras models via the TensorBoard callback.