# Visualizing Solvers with TensorBoard

This tutorial assumes familiarity with the tutorial on [Visualization with TensorBoard](minpy_visualization.ipynb). It builds on MinPy's [CNN Tutorial](../cnn_tutorial/cnn_tutorial.rst).


## Equip the CNN Tutorial with Visualization Functions

Set up as in the original tutorial.

In [1]:
"""Convolutional neural network example using only MXNet symbols."""
import sys

from minpy.nn.io import NDArrayIter
# Can also use MXNet IO here
# from mxnet.io import NDArrayIter
from minpy.core import Function
from minpy.nn import layers
from minpy.nn.model import ModelBase
from minpy.nn.solver import Solver
from examples.utils.data_utils import get_CIFAR10_data

# Please uncomment following if you have GPU-enabled MXNet installed.
#from minpy.context import set_context, gpu
#set_context(gpu(0)) # set the global context as gpu(0)

import mxnet as mx

batch_size=128
input_size=(3, 32, 32)
flattened_input_size=3 * 32 * 32
hidden_size=512
num_classes=10

Define the CNN model.

In [2]:
class ConvolutionNet(ModelBase):
    def __init__(self):
        super(ConvolutionNet, self).__init__()
        # Define symbols that use convolution and max pooling to extract better features
        # from the input image.
        net = mx.sym.Variable(name='X')
        net = mx.sym.Convolution(
                data=net, name='conv', kernel=(7, 7), num_filter=32)
        net = mx.sym.Activation(
                data=net, act_type='relu')
        net = mx.sym.Pooling(
                data=net, name='pool', pool_type='max', kernel=(2, 2),
                stride=(2, 2))
        net = mx.sym.Flatten(data=net)
        net = mx.sym.FullyConnected(
                data=net, name='fc1', num_hidden=hidden_size)
        net = mx.sym.Activation(
                data=net, act_type='relu')
        net = mx.sym.FullyConnected(
                data=net, name='fc2', num_hidden=num_classes)
        net = mx.sym.SoftmaxOutput(data=net, name='softmax', normalization='batch')
        # Create forward function and add parameters to this model.
        input_shapes = {'X': (batch_size,) + input_size, 'softmax_label': (batch_size,)}
        self.cnn = Function(net, input_shapes=input_shapes, name='cnn')
        self.add_params(self.cnn.get_params())

    def forward_batch(self, batch, mode):
        out = self.cnn(X=batch.data[0],
                       softmax_label=batch.label[0],
                       **self.params)
        return out

    def loss(self, predict, y):
        return layers.softmax_cross_entropy(predict, y)
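
As a quick sanity check on the network above, the feature-map sizes can be worked out by hand. The sketch below assumes MXNet's defaults for the layers as written (stride 1 and no padding for `Convolution`, and the floor-based "valid" convention for `Pooling`); the helper name `output_size` is made up for illustration.

```python
def output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window (floor convention)."""
    return (size + 2 * pad - kernel) // stride + 1


num_filter = 32                                      # from the 'conv' layer
conv_size = output_size(32, kernel=7)                # 32x32 input, 7x7 kernel
pool_size = output_size(conv_size, kernel=2, stride=2)
flattened = num_filter * pool_size * pool_size       # input width of 'fc1'

print(conv_size, pool_size, flattened)  # 26 13 5408
```

Under these assumptions, `fc1` receives 5408 features per image and maps them to `hidden_size` units.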

Set the argument of `get_CIFAR10_data` to the location of the CIFAR-10 data files. The original tutorial used `argparse` to read this directory from the command line; that is omitted here for convenience in a Jupyter notebook.

Declare the directory for storing log files, which will be used for visualization later.

`visualize` is an optional argument of `Solver` and defaults to `False`. Set `visualize` to `True` and pass the `summaries_dir` argument as well. How the Solver implements visualization is covered later in this tutorial.

In [3]:
def main():
    # Create model.
    model = ConvolutionNet()
    # Create data iterators for training and testing sets.
    data = get_CIFAR10_data('cifar-10-batches-py')
    train_dataiter = NDArrayIter(data=data['X_train'],
                                 label=data['y_train'],
                                 batch_size=batch_size,
                                 shuffle=True)
    test_dataiter = NDArrayIter(data=data['X_test'],
                                label=data['y_test'],
                                batch_size=batch_size,
                                shuffle=False)

    # Declare the directory for storing data, which will be used for visualization with tensorboard later.
    summaries_dir = '/private/tmp/cnn_log'

    # Create solver.
    solver = Solver(model,
                    train_dataiter,
                    test_dataiter,
                    num_epochs=10,
                    init_rule='gaussian',
                    init_config={
                        'stdvar': 0.001
                    },
                    update_rule='sgd_momentum',
                    optim_config={
                        'learning_rate': 1e-3,
                        'momentum': 0.9
                    },
                    verbose=True,
                    print_every=20,
                    visualize=True,
                    summaries_dir=summaries_dir)
    # Initialize model parameters.
    solver.init()
    # Train!
    solver.train()

In [4]:
if __name__ == '__main__':
    main()

(Iteration 1 / 3828) loss: 2.302535
(Iteration 21 / 3828) loss: 2.302051
(Iteration 41 / 3828) loss: 2.291640
(Iteration 61 / 3828) loss: 2.133044
(Iteration 81 / 3828) loss: 2.033680
(Iteration 101 / 3828) loss: 1.995795
(Iteration 121 / 3828) loss: 1.796180
(Iteration 141 / 3828) loss: 1.884282
(Iteration 161 / 3828) loss: 1.702727
(Iteration 181 / 3828) loss: 1.745341
(Iteration 201 / 3828) loss: 1.550407
(Iteration 221 / 3828) loss: 1.405793
(Iteration 241 / 3828) loss: 1.529175
(Iteration 261 / 3828) loss: 1.440347
(Iteration 281 / 3828) loss: 1.859766
(Iteration 301 / 3828) loss: 1.416149
(Iteration 321 / 3828) loss: 1.481019
(Iteration 341 / 3828) loss: 1.501948
(Iteration 361 / 3828) loss: 1.508027
(Iteration 381 / 3828) loss: 1.516997
(Epoch 1 / 10) train acc: 0.501953125, val_acc: 0.4931640625, time: 1253.37731194.
(Iteration 401 / 3828) loss: 1.296929
(Iteration 421 / 3828) loss: 1.496588
(Iteration 441 / 3828) loss: 1.330925
(Iteration 461 / 3828) loss: 1.450040
...

Open a terminal and run the following command:

~~~bash
tensorboard --logdir=summaries_dir
~~~

Here `summaries_dir` stands for the path of the log directory. Note that you do not need to include the `/private` prefix, so in this case the path is `/tmp/cnn_log`.
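
If it helps to avoid typing the path by hand, the command can be assembled from the same variable used in the notebook. This is only a convenience sketch: it prints the command rather than launching TensorBoard, and the path is the example value from this tutorial.

```python
import shlex

# Example log directory from this tutorial (without the /private prefix).
summaries_dir = '/tmp/cnn_log'

# Quote the path so that directories containing spaces are still passed correctly.
command = 'tensorboard --logdir=' + shlex.quote(summaries_dir)
print(command)  # tensorboard --logdir=/tmp/cnn_log
```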

Once you start TensorBoard, you should see the scalar visualizations in the EVENTS section, as shown below. With visualization enabled, the Solver logs the training accuracy, validation accuracy, training loss, and squared L2-norm of the gradient by default.

Note: If you have more than one `SummaryWriter` (two in this case), the data from some `SummaryWriter`s might not be written to the log files immediately. All of the data should be available by the end of training.

![CNN Loss Curve](cnn_loss.png)

![Curve of Squared L2-Norm](cnn_gradient_norm.png)

![Training accuracy](cnn_accuracy.png)

## Implementation Details of the Solver

We now look at how visualization is implemented in the Solver. The snippets below are excerpts, not a complete implementation of the Solver class.

### Step1: Generate SummaryWriters

If `self.visualize` is `True`, two `SummaryWriter`s are created by default: one for training and one for testing.

In [None]:
class Solver(object):
    ...
    def __init__(self, model, train_dataiter, test_dataiter, **kwargs):
        ...
        self.visualize = kwargs.pop('visualize', False)
        
        if self.visualize:
            # Retrieve the summary directory. Create summary writers for training and test.
            self.summaries_dir = kwargs.pop('summaries_dir', '/private/tmp/newlog')
            self.train_writer = SummaryWriter(self.summaries_dir + '/train')
            self.test_writer = SummaryWriter(self.summaries_dir + '/test')
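
The option handling above can be tried in isolation. The sketch below mirrors the `kwargs.pop` pattern with the same defaults, but the function `visualization_dirs` is hypothetical and returns only the directory paths instead of creating `SummaryWriter` objects.

```python
import os


def visualization_dirs(**kwargs):
    """Return the train/test log directories, or None when visualization is off."""
    visualize = kwargs.pop('visualize', False)
    if not visualize:
        return None
    summaries_dir = kwargs.pop('summaries_dir', '/private/tmp/newlog')
    # One subdirectory per writer keeps the two sets of logs separate.
    return {
        'train': os.path.join(summaries_dir, 'train'),
        'test': os.path.join(summaries_dir, 'test'),
    }


print(visualization_dirs())  # visualization is off by default
print(visualization_dirs(visualize=True, summaries_dir='/private/tmp/cnn_log'))
```

Keeping the two writers in separate subdirectories of one `summaries_dir` means a single `--logdir` argument covers both.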

### Step2: Set a Scalar Summary for Squared L2-norm of the Gradient

In [None]:
def _step(self, batch, iteration):
    ...
    if self.visualize:
        grad_norm = 0

    # Perform a parameter update
    for p, w in self.model.params.items():
        dw = grads[p]
        if self.visualize:
            # Accumulate the sum of squared gradient entries over all parameters.
            norm = dw ** 2
            # Keep reducing until a scalar remains.
            while not isinstance(norm, minpy.array.Number):
                norm = sum(norm)
            grad_norm += norm
        config = self.optim_configs[p]
        next_w, next_config = self.update_rule(w, dw, config)
        self.model.params[p] = next_w
        self.optim_configs[p] = next_config

    if self.visualize:
        grad_norm_summary = summaryOps.scalarSummary('squared L2-norm of the gradient', grad_norm)
        self.train_writer.add_summary(grad_norm_summary, iteration)
    ...
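
The quantity being logged is the sum of squared entries over every parameter's gradient. A minimal pure-Python sketch of the same reduction, using nested lists in place of MinPy arrays and made-up gradient values (the function name `squared_l2_norm` is not part of MinPy):

```python
def squared_l2_norm(grads):
    """Sum of squared entries over all gradients in a {name: nested list} dict."""
    def sum_squares(value):
        # A plain number contributes its square; a list contributes the sum of its items.
        if isinstance(value, (int, float)):
            return float(value) ** 2
        return sum(sum_squares(item) for item in value)

    return sum(sum_squares(gradient) for gradient in grads.values())


# Made-up gradients for two parameters.
grads = {
    'fc1_weight': [[3.0, 4.0], [0.0, 0.0]],
    'fc1_bias': [1.0, 2.0],
}
total = squared_l2_norm(grads)
print(total)  # 9 + 16 + 1 + 4 = 30.0
```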

### Step3: Set a Scalar Summary for Training Loss

In [None]:
def train(self):
    """Run optimization to train the model."""
    num_iterations = self.train_dataiter.getnumiterations() * self.num_epochs
    t = 0
    for epoch in range(self.num_epochs):
        start = time.time()
        self.epoch = epoch + 1
        for each_batch in self.train_dataiter:
            self._step(each_batch, t + 1)
            # Maybe print training loss
            if self.verbose and t % self.print_every == 0:
                print('(Iteration %d / %d) loss: %f' %
                      (t + 1, num_iterations, self.loss_history[-1]))
            if self.visualize:
                # Add scalar summaries of training loss.
                loss_summary = summaryOps.scalarSummary('loss', self.loss_history[-1])
                self.train_writer.add_summary(loss_summary, t + 1)

            t += 1
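
Note that the loss summary is written on every iteration, while the console message is printed only when `t % print_every == 0`. The small sketch below reproduces just that counter logic (the helper `printed_iterations` is made up for illustration) and shows why the training log earlier in this tutorial reports iterations 1, 21, 41, and so on with `print_every=20`.

```python
def printed_iterations(num_steps, print_every):
    """Return the 1-based iteration numbers at which the loss would be printed."""
    printed = []
    t = 0
    for _ in range(num_steps):
        # Same condition as in train(); t is 0-based, the printed number is t + 1.
        if t % print_every == 0:
            printed.append(t + 1)
        t += 1
    return printed


print(printed_iterations(65, 20))  # [1, 21, 41, 61]
```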

### Step4: Set a Scalar Summary for Training/Validation Accuracy

In [None]:
def train(self):
    ...
    for epoch in range(self.num_epochs):
        start = time.time()
        self.epoch = epoch + 1
        ...
        # evaluate after each epoch
        train_acc = self.check_accuracy(self.train_dataiter, num_samples=self.train_acc_num_samples)
        val_acc = self.check_accuracy(self.test_dataiter)
        self.train_acc_history.append(train_acc)
        self.val_acc_history.append(val_acc)
        ...
        if self.visualize:
            val_acc_summary = summaryOps.scalarSummary('accuracy', val_acc)
            self.test_writer.add_summary(val_acc_summary, self.epoch)
            train_acc_summary = summaryOps.scalarSummary('accuracy', train_acc)
            self.train_writer.add_summary(train_acc_summary, self.epoch)
        ...
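
Both writers log under the same tag, `'accuracy'`, once per epoch. The toy stand-in below illustrates how the data ends up organized: one series per writer, keyed by tag, so the training and validation curves can be compared under a single name. `RecordingWriter` is hypothetical and only stores values in memory; it is not MinPy's `SummaryWriter`, and the accuracy values are made up.

```python
from collections import defaultdict


class RecordingWriter(object):
    """Hypothetical in-memory stand-in for a summary writer."""

    def __init__(self, run_name):
        self.run_name = run_name
        self.series = defaultdict(list)  # tag -> list of (step, value)

    def add_scalar(self, tag, value, step):
        self.series[tag].append((step, value))


train_writer = RecordingWriter('train')
test_writer = RecordingWriter('test')

# Made-up (train_acc, val_acc) pairs, one per epoch.
accuracy_per_epoch = [(0.50, 0.49), (0.60, 0.57)]
for epoch, (train_acc, val_acc) in enumerate(accuracy_per_epoch, start=1):
    train_writer.add_scalar('accuracy', train_acc, epoch)
    test_writer.add_scalar('accuracy', val_acc, epoch)

for writer in (train_writer, test_writer):
    print(writer.run_name, writer.series['accuracy'])
```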

You can log other quantities in the same way, such as cross entropy, dropout keep probability, or mean values. The figure below shows a result from TensorFlow's tutorial on building a deep convolutional MNIST classifier: [link](https://github.com/tensorflow/tensorflow/blob/r0.11/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py).

![MNIST Result](mnist_result.png)