![Graph of the trained model](./images/TrainedModel.png "Image produced by callback")


## Keras Training Visualization

This notebook uses a Keras callback and Matplotlib to display an animated graph of a model being trained.<br />

The model being trained is a linear regression, modeled as a single-node neural network. The callback function described under "Import function that draws and updates the graph" is generic and can be used with any neural network. Its key limitation is that it supports only a single input and a single output.<br />

The basic strategy is to hook into training after each mini-batch. The callback runs the model on the dataset and plots the results and, if desired, the current mean squared error.<br />
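
As a rough illustration of that strategy, here is a minimal sketch in plain Python rather than Keras. The names `train`, `mse`, and `hook`, the learning rate, and the synthetic data are all assumptions made for this example; the notebook itself uses Keras and the `kerviz` callback. The hook only records the error where the real callback would redraw the plot.

```python
import random

def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b over the whole dataset."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, batch_size, epochs, lr, on_batch_end):
    """Plain mini-batch SGD for y = w*x + b; calls the hook after every batch."""
    w, b = -1.0, 0.0  # start from a negative slope, as the notebook does
    for _ in range(epochs):
        for start in range(0, len(xs), batch_size):
            bx = xs[start:start + batch_size]
            by = ys[start:start + batch_size]
            n = len(bx)
            # gradients of the batch MSE with respect to w and b
            grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(bx, by)) / n
            grad_b = sum(2 * (w * x + b - y) for x, y in zip(bx, by)) / n
            w -= lr * grad_w
            b -= lr * grad_b
            on_batch_end(w, b)  # the hook runs once per mini-batch
    return w, b

# synthetic data on the line y = 2x + 1 (hypothetical, not the Lending Club data)
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [2 * x + 1 for x in xs]

history = []  # one full-dataset MSE per batch; a real hook would redraw here

def hook(w, b):
    history.append(mse(w, b, xs, ys))

w, b = train(xs, ys, batch_size=20, epochs=200, lr=0.1, on_batch_end=hook)
print(f"w={w:.2f} b={b:.2f} batches={len(history)}")
```

Evaluating the whole dataset inside the hook is what makes this approach expensive: each batch touches 20 points for training but 200 for the hook.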

### Import Requirements
Note that this uses the qt5 backend for display rather than inline plots. Inline animation is possible, but it is more limited.

In [2]:
import numpy as np
import pandas as pd
%matplotlib qt5
import matplotlib.pyplot as plt

### Load Data
This is publicly available data from the Lending Club (http://bit.ly/2LC6wth) on the performance of loans issued from 2017 to the present. The interest rate is provided in the data. I computed the total losses from fields in their file, did a bit of cleanup, and pickled the result in this file. The dataset is restricted to loans that have defaulted (aka "charged off").

In [3]:
data = pd.read_pickle('data/lend_club_ir_v_losses.pkl')
data.head()

| | int_rate | loss_pct |
|---|---|---|
| 1 | 0.1527 | 0.768256 |
| 8 | 0.2128 | 0.937043 |
| 9 | 0.1269 | 0.823038 |
| 12 | 0.1349 | 0.810327 |
| 14 | 0.1065 | 0.392143 |


### Select Features and Labels

In [4]:
features = data['int_rate'].values
labels = data['loss_pct'].values

### Split into test and training datasets

In [5]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(features.reshape(-1, 1), labels, random_state=42)

### Function that determines how often to update the graph
Designed specifically for the single-input linear model below, this function asks for an update whenever the weight or bias has changed by more than a specified threshold since the last redraw (an absolute change of 0.01 by default). A maximum gap between updates (500 batches by default, set through `min_frequency`) can also be specified.

In [6]:
from keras.layers import Dense

# returns closured callback that will determine how often the graph is redrawn
def get_frequency_callback(**kwargs):

    # parameters
    weight_threshold = kwargs.get('weight_threshold', 0.01)
    bias_threshold = kwargs.get('bias_threshold', 0.01)
    min_frequency = kwargs.get('min_frequency', 500)

    # declared variables that will be retained between invocations of the callback
    layer = None
    w_prev = 0
    b_prev = 0
    batch_prev = 0
    
    # the callback that will actually make the decision to update or not
    def frequency_callback(model, X, y, tot_batches):

        nonlocal layer, w_prev, b_prev, batch_prev
        
        # get the model layer containing the weight and bias
        if layer is None:
            layer = model.get_layer('output')
        
        # get the current value of the weight and bias (one call instead of two)
        kernel, bias = layer.get_weights()
        w = kernel[0][0]
        b = bias[0]

        # assume change was too small for an update
        display = False
        
        # if change in weight or bias exceeds relevant threshold, or it's been too long since the last update 
        if (np.abs(w - w_prev) > weight_threshold or np.abs(b - b_prev) > bias_threshold) \
            or tot_batches - batch_prev > min_frequency:
            
            # update on this iteration
            display = True
            
            # keep track of weight and bias from last update
            w_prev = w
            b_prev = b
            
            # keep track of how long since the last update
            batch_prev = tot_batches
        
        return display
    
    # return closure
    return frequency_callback

Using TensorFlow backend.
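
To try the decision rule without training anything, the same logic can be written against plain numbers. This is a hypothetical, Keras-free variant for illustration only: `make_redraw_decider`, `should_redraw`, and `max_gap` are names made up here, and the weight and bias are passed in directly instead of being read from the model's `output` layer.

```python
def make_redraw_decider(weight_threshold=0.01, bias_threshold=0.01, max_gap=500):
    """Hypothetical stand-in for get_frequency_callback that takes w and b directly."""
    # state retained between calls, as in the closure above
    w_prev, b_prev, batch_prev = 0.0, 0.0, 0

    def should_redraw(w, b, tot_batches):
        nonlocal w_prev, b_prev, batch_prev
        # absolute change since the last redraw, not since the last batch
        moved = abs(w - w_prev) > weight_threshold or abs(b - b_prev) > bias_threshold
        overdue = tot_batches - batch_prev > max_gap
        if moved or overdue:
            w_prev, b_prev, batch_prev = w, b, tot_batches
            return True
        return False

    return should_redraw

decide = make_redraw_decider()
decisions = [
    decide(-1.0, 0.0, 1),       # weight is 1.0 away from the initial 0: redraw
    decide(-0.995, 0.005, 2),   # both changes are below 0.01: skip
    decide(-0.98, 0.005, 3),    # weight moved 0.02 since the last redraw: redraw
    decide(-0.98, 0.005, 504),  # nothing moved, but 501 batches have passed: redraw
]
print(decisions)
```

Because the previous values are only updated on a redraw, many small steps in the same direction eventually add up to one update, which is what keeps the animation smooth late in training.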


## Import function that draws and updates the graph

See kerviz.py for the code.

The callback function should be run *on_batch_end*. It determines whether an update is necessary and, if so, redraws the graph along with the related data.<br />

The enclosing function does some setup and creates a closure over the data that must be retained between calls or known in advance, because Keras passes very little information to the callback.<br />
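
The closure pattern can be sketched as follows. This is not the code in kerviz.py; `make_on_batch_end` and the `state` dictionary are hypothetical names. The only Keras fact relied on is that an `on_batch_end` hook receives a batch index and a `logs` dictionary, with the batch index restarting at 0 each epoch, so anything else (such as a running batch count) has to live in the closure. The loop at the bottom stands in for Keras.

```python
def make_on_batch_end():
    """Return a hook with the (batch, logs) signature plus the state it closes over."""
    state = {"total_batches": 0, "last_loss": None}

    def on_batch_end(batch, logs=None):
        # `batch` restarts at 0 every epoch, so the running total is kept here
        state["total_batches"] += 1
        state["last_loss"] = (logs or {}).get("loss")

    return on_batch_end, state

on_batch_end, state = make_on_batch_end()

# stand-in for Keras: 2 epochs of 3 batches, with a made-up decreasing loss
for epoch in range(2):
    for batch in range(3):
        on_batch_end(batch, {"loss": 1.0 / (1 + epoch * 3 + batch)})

print(state["total_batches"], state["last_loss"])
```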

This function should be reusable for graphing any single-input single-output model in Keras. I have used it, e.g., for multi-layer neural networks.<br />

Expect this to be *super slow*. Especially for simple models, the cost of running and graphing the model will be significantly higher than the cost of training on a single batch. You will get warnings from Keras about slowness.<br />

This is a toy for learning and investigating, so performance is not a primary concern, but it does need to be usable. Several options have a big impact on performance.<br />
<ol>
<li>The **sparsity** options reduce the number of data points used in the scatter plot and in running the model on each pass.</li>
<li>The **frequency** option determines how often to update the graph. I've used the function option to update only when the change to the model parameters is big enough to justify an update, but that function is specific to the model being trained.</li>
<li>Turning off the **loss display** slightly reduces the number of computations, but significantly reduces the amount of drawing.</li>
</ol>
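
To give a feel for what a sparsity setting saves, here is a small sketch. It assumes that a sparsity of `n` means "keep every n-th point", which is a guess for illustration; kerviz.py may thin the data differently. The helper name `thin` is hypothetical.

```python
def thin(points, sparsity):
    """Keep every `sparsity`-th point (an assumed meaning of the option)."""
    return points[::sparsity]

sample = thin(list(range(10)), 3)
print(sample)

# with the training set size reported by Keras during fit and a sparsity of 3
kept = len(thin(range(255480), 3))
print(kept)
```

Under that assumption, a sparsity of 3 cuts the points drawn on each redraw to a third.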

There is also an option to write the updates to files as individual images, which can then be used to create an animation.

In [7]:
from kerviz import get_redraw

## Create and compile the model

In [8]:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.initializers import Constant

# initializing to a negative slope so that the training is more interesting
initializer = Constant(-1.0)

# linear model is an ANN with 1 input node, 1 output node, and linear activation
model = Sequential()
model.add(Dense(1, input_dim=X_train.shape[1], kernel_initializer=initializer, 
                activation='linear', name='output'))

model.compile(loss='mean_squared_error', optimizer='sgd')

epochs = 25
batch_size = 128
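
For reference, the model compiled above can be written out as follows. The symbols $w$, $b$, $\eta$ (learning rate), and $B$ (the current mini-batch) are notation introduced here, not names from the code. This assumes plain SGD without momentum, which is what the `'sgd'` string selects by default.

$$\hat{y}_i = w\,x_i + b$$

$$L(w, b) = \frac{1}{|B|} \sum_{i \in B} \left(\hat{y}_i - y_i\right)^2$$

$$w \leftarrow w - \frac{2\eta}{|B|} \sum_{i \in B} \left(\hat{y}_i - y_i\right) x_i, \qquad b \leftarrow b - \frac{2\eta}{|B|} \sum_{i \in B} \left(\hat{y}_i - y_i\right)$$

Training starts from $w = -1$ (the `Constant(-1.0)` initializer) and $b = 0$ (the default bias initializer for `Dense`), so the early updates have to pull the slope from negative to positive.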


## Build the callback function that will be passed to Keras and train the model

In [9]:
import matplotlib as mpl
import matplotlib.ticker  # make sure the ticker submodule used below is loaded

from keras.callbacks import LambdaCallback

# get closured redraw callback function
# this will also draw the background for the graph
cb_redraw = get_redraw( X_train, y_train, model, batch_size, epochs,
                        frequency=get_frequency_callback(weight_threshold=0.03, bias_threshold=0.02),
                        scatter_sparsity=3, show_loss=True, loss_smoothing=51,
                        title="Linear Regression of Loan Losses vs. Interest Rate",
                        x_label="Interest Rate",
                        y_label="Total Loss (% of Funded Amount)",
                        x_tick_formatter=mpl.ticker.PercentFormatter(xmax=1),
                        y_tick_formatter=mpl.ticker.PercentFormatter(xmax=1),
                        loss_scale=0.8, display_mode='screen')

# wrap callback function in Keras structure, to be called after each batch
redraw_callback = LambdaCallback(on_batch_end=cb_redraw)

# train the model, passing the Keras-wrapped callback function
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, callbacks=[redraw_callback])

Epoch 1/25
   256/255480 [..............................] - ETA: 35:29 - loss: 0.6861
   512/255480 [..............................] - ETA: 25:08 - loss: 0.6558
   768/255480 [..............................] - ETA: 21:35 - loss: 0.6347
  1024/255480 [..............................] - ETA: 19:47 - loss: 0.6167
  1280/255480 [..............................] - ETA: 18:40 - loss: 0.6037
  1536/255480 [..............................] - ETA: 17:54 - loss: 0.5818
  1792/255480 [..............................] - ETA: 17:20 - loss: 0.5672
  2304/255480 [..............................] - ETA: 16:37 - loss: 0.5255
  3072/255480 [..............................] - ETA: 15:59 - loss: 0.4774
  3328/255480 [..............................] - ETA: 15:49 - loss: 0.4620
  3712/255480 [..............................] - ETA: 15:08 - loss: 0.4429


Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.callbacks.History at 0x1a32de4f98>
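
The progress lines above report 255480 training samples per epoch. A quick calculation, using only that figure and the `batch_size` and `epochs` values set earlier, shows how many times the callback is invoked, which is why limiting redraws matters so much.

```python
import math

samples, batch_size, epochs = 255480, 128, 25

batches_per_epoch = math.ceil(samples / batch_size)           # last batch is partial
last_batch_size = samples - (batches_per_epoch - 1) * batch_size
total_callbacks = batches_per_epoch * epochs                   # one call per batch

print(batches_per_epoch, last_batch_size, total_callbacks)
```

That works out to 1996 batches per epoch (the last one holding 120 samples) and 49900 callback invocations over the full run.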