# Analysing training dynamics using the HSIC criterion

This notebook shows how to analyse the training process using the HSIC criterion (Hilbert-Schmidt Independence Criterion) as a measure of dependence. The criterion is computed between the hidden representation and the output label (generalization) and between the hidden representation and the input variable (complexity), which gives a 2-D training trajectory in the information plane. We keep calling it the information plane even though the HSIC criterion has no counterpart in information theory.
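
For reference, a commonly used biased empirical estimator of HSIC on n paired samples is sketched below. The notation (K, L, H, k, l) is introduced here for illustration and is not taken from PyGlow, whose estimator may differ in normalization or kernel parametrization.

```latex
\mathrm{HSIC}(X, Y) = \frac{1}{(n-1)^2}\,\operatorname{tr}(K H L H),
\qquad
K_{ij} = k(x_i, x_j),\quad
L_{ij} = l(y_i, y_j),\quad
H = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top}
```

With a Gaussian kernel, one usual choice is k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)). The value is close to zero when the two variables are independent and grows with the strength of the dependence.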

## Importing glow modules

In [5]:
# importing PyGlow modules
import glow
from glow.layers import Dense, Dropout, Conv2d, Flatten
from glow.datasets import mnist, cifar10
from glow.models import IBSequential
from glow.information_bottleneck.estimator import HSIC

## Load dataset

In [6]:
# hyperparameters
batch_size = 64
num_workers = 3
validation_split = 0.2
num_epochs = 2

# load the dataset
train_loader, val_loader, test_loader = mnist.load_data(
    batch_size=batch_size, num_workers=num_workers, validation_split=validation_split
)

## IB based Sequential Model - IBSequential
IBSequential is a Keras-like Sequential model with extended functionality that supports tracking training dynamics using evaluators (instances of glow.information_bottleneck.Estimator). It evaluates the dynamics using the criterion defined in the evaluator and obtains 2-D information plane coordinates (x-axis: complexity, y-axis: generalization).

In [7]:
model = IBSequential(input_shape=(1, 28, 28), gpu=True, track_dynamics=True, save_dynamics=True)
model.add(Conv2d(filters=16, kernel_size=3, stride=1, padding=1, activation='relu'))
model.add(Flatten())
model.add(Dropout(0.4))
model.add(Dense(500, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(200, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))

Running on CUDA enabled device !


## Attaching dynamics evaluator to the model
Attach the HSIC evaluator to the model. You can also use your own custom criterion to evaluate the training dynamics; to see how, refer to the Custom_criterion_class.ipynb notebook.

In [9]:
# compile the model
model.compile(optimizer='SGD', loss='cross_entropy', metrics=['accuracy'])
model.attach_evaluator(HSIC(kernel='gaussian', gpu=True, sigma=5))
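
To make concrete what an HSIC evaluator with a Gaussian kernel measures, here is a minimal dependency-free sketch of the standard biased empirical estimator, trace(K H L H) / (n - 1)^2. It is not PyGlow's implementation: the helper names (gaussian_gram, center, hsic), the kernel parametrization and the toy data are assumptions made for illustration, and the library may normalize or batch things differently.

```python
import math
import random


def gaussian_gram(samples, sigma):
    """Gram matrix with entries exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(samples)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def center(gram):
    """Double-center a Gram matrix: H K H with H = I - (1/n) * ones."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=5.0):
    """Biased empirical HSIC between two lists of equal-length sample vectors."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, sigma))
    l_gram = gaussian_gram(ys, sigma)
    # trace(K H L H) equals trace((H K H) L) by cyclicity of the trace
    trace = sum(k_centered[i][j] * l_gram[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy data (hypothetical, not MNIST): inputs X, labels Y, hidden representation T
rng = random.Random(0)
inputs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(32)]
labels = [[float(sum(x) > 0)] for x in inputs]
hidden = [[max(0.0, x[0] + x[1]), max(0.0, x[2] - x[3])] for x in inputs]

complexity = hsic(hidden, inputs)      # x-axis: dependence on the input
generalization = hsic(hidden, labels)  # y-axis: dependence on the label
print(f"complexity={complexity:.4f}, generalization={generalization:.4f}")
```

Computing this pair for a layer at successive points during training would trace out that layer's trajectory in the information plane. The nested loops are written for readability; a practical version would use vectorized tensor operations.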

## Training the model

In [10]:
# train the model along with calculating dynamics
model.fit_generator(train_loader, val_loader, num_epochs)



Epoch 1/2
Training loop: 
100%|██████████| 750/750 [05:40<00:00,  1.33it/s]
loss: 2.19 - acc: 0.66
Validation loop: 
100%|██████████| 188/188 [00:04<00:00, 45.92it/s]
loss: 1.87 - acc: 0.50

Epoch 2/2
Training loop: 
100%|██████████| 750/750 [06:28<00:00,  2.69it/s]
loss: 1.75 - acc: 0.78
Validation loop: 
100%|██████████| 188/188 [00:01<00:00, 97.70it/s]
loss: 1.65 - acc: 0.88



