# Introduction

This notebook shows how to use TensorBoard with TensorFlow. The code below is the same as in the previous notebook, where a CNN was implemented, so it helps to be familiar with that implementation before proceeding.

Here we add only the extra code needed to capture the parameters of the model and visualize them in TensorBoard.

Only the lines added on top of the previous notebook are explained, in comments at the appropriate places.

## TensorBoard
TensorBoard is a visualization tool that comes with the TensorFlow installation, so no additional libraries are required to use it.

The general idea is to write the values we care about to a log file during training and later point TensorBoard at that file to visualize them.

The things we generally visualize for a model are its computational graph, the weights and biases of its layers, the cost and accuracy over time, and the embeddings the model learns for an input.

Anything we wish to store should be given a name so that we can identify it in the visualization. The names can be anything, as long as each one is unique, but following a consistent naming convention makes debugging easier.

Let's get started.

In [1]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# The os module is used to build directory and file paths.
import os

In [2]:
print 'Library versions used.'
print 'Numpy :{} \nTensorflow :{}'.format(np.version.version, tf.__version__)

Library versions used.
Numpy :1.13.1 
Tensorflow :1.2.1


In [3]:
from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets('data/MNIST', one_hot=True)

Extracting data/MNIST/train-images-idx3-ubyte.gz
Extracting data/MNIST/train-labels-idx1-ubyte.gz
Extracting data/MNIST/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/t10k-labels-idx1-ubyte.gz


In [4]:
print 'Data Size'
print 'Training Data \t:', len(data.train.labels)
print 'validation Data :', len(data.validation.labels)
print 'Testing Data \t:', len(data.test.labels)

Data Size
Training Data 	: 55000
validation Data : 5000
Testing Data 	: 10000


In [5]:
print 'Training image vector size :', len(data.train.images[0])

Training image vector size : 784


In [6]:
print 'Number of labels :', len(data.train.labels[0])
print 'Example of a label :', data.train.labels[0]

Number of labels : 10
Example of a label : [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]


In [7]:
data.validation.cls = np.array([label.argmax() for label in data.validation.labels])
data.test.cls = np.array([label.argmax() for label in data.test.labels])

In [8]:
image_size = 28
image_size_flatten = image_size * image_size
num_labels = 10

First, the inputs x and y_true are given names.

In [9]:
x = tf.placeholder(tf.float32, [None, image_size_flatten], name='x')
y_true = tf.placeholder(tf.float32, [None, num_labels], name='labels')

While building the convolution layer:
1. Give it a name. The name is passed in as a parameter so that two ConvLayer instances can be created with different names.
2. Add every parameter of the model that we wish to visualize to the TensorFlow summary. For the ConvLayer we store the weights, the biases and the activations.
3. Define the weights and biases so that they belong to a particular layer, or name scope. For this we use TensorFlow's name_scope function, which groups everything created inside it under a single name. In simple words, it is like saying that they all belong to this name scope or layer.
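
The effect of a name scope on names can be pictured with a small pure-Python stand-in. This is only a sketch of the idea, not TensorFlow code: `NameRegistry`, `name_scope` and `register` are hypothetical names made up for illustration, and the way the real library handles name clashes is not modeled here.

```python
import contextlib


class NameRegistry(object):
    """Hypothetical stand-in for a graph's name bookkeeping (not TensorFlow)."""

    def __init__(self):
        self._scopes = []   # stack of currently open scopes
        self.names = []     # every full name registered so far

    @contextlib.contextmanager
    def name_scope(self, name):
        # Everything registered inside the with-block gets this prefix.
        self._scopes.append(name)
        try:
            yield
        finally:
            self._scopes.pop()

    def register(self, local_name):
        full_name = "/".join(self._scopes + [local_name])
        if full_name in self.names:
            # This sketch simply rejects duplicates.
            raise ValueError("duplicate name: " + full_name)
        self.names.append(full_name)
        return full_name


registry = NameRegistry()
for layer_name in ("conv0", "conv1"):
    with registry.name_scope(layer_name):
        registry.register("W")
        registry.register("b")

print(registry.names)
# ['conv0/W', 'conv0/b', 'conv1/W', 'conv1/b']
```

The same local names `W` and `b` are reused in both layers, yet every full name stays unique because the layer name is passed in as the scope.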

In [10]:
class ConvLayer(object):
    def __init__(self, inpt, filter_size, num_input_channels, num_filters, strides=(1,1,1,1), 
                 padding='SAME', activation=tf.nn.sigmoid, name='conv'):
        
        # create name_scope
        with tf.name_scope(name):
            self.input = inpt            
            filter_shape = [filter_size, filter_size, num_input_channels, num_filters]
            
            # provide name to weights and biases
            self.W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), dtype=tf.float32, name='W')
            self.b = tf.Variable(tf.truncated_normal(filter_shape[-1:], stddev=0.1), dtype=tf.float32, name='b')
            conv_output = tf.nn.conv2d(self.input, filter=self.W, strides=strides, padding=padding)
            conv_output = conv_output + self.b
            self.output = activation(conv_output) if activation is not None else conv_output
            
            self.params = [self.W, self.b]

            # add the parameters to tensorflow summary
            tf.summary.histogram("weights", self.W)
            tf.summary.histogram("biases", self.b)
            tf.summary.histogram("activation", self.output)

In [11]:
class FlattenLayer(object):
    def __init__(self, inpt, shape):
        self.input = inpt        
        self.output = tf.reshape(self.input, shape=shape)

In [12]:
class MaxPoolLayer(object):
    def __init__(self, inpt, ksize=(1, 2, 2, 1), strides=(1, 2, 2, 1), padding="SAME"):
        self.input = inpt
        self.output = tf.nn.max_pool(self.input, ksize=ksize, strides=strides, padding=padding)

In [13]:
class DropoutLayer(object):
    def __init__(self, inpt):
        self.keep_prob = tf.placeholder(tf.float32)
        self.input = inpt
        self.output = tf.nn.dropout(self.input, keep_prob=self.keep_prob)

In [14]:
filter_size1 = 5 
num_filters1 = 16
filter_size2 = 5
num_filters2 = 36
num_channels = 1

In TensorBoard we can also check whether the input being passed is the right one. In our case the input is an image, so we store a sample of 3 of the input images in the summary.

In [15]:
inpt = tf.reshape(x, shape=[-1, image_size, image_size, num_channels])

# Store 3 inputs in the summary
tf.summary.image('input', inpt, 3)

<tf.Tensor 'input:0' shape=() dtype=string>

In [16]:
# ConvLayer now accepts a name parameter, so pass a name when creating the layer
layer0_conv = ConvLayer(inpt, filter_size=filter_size1, num_input_channels = num_channels,
                        num_filters=num_filters1, strides=[1, 1, 1, 1],
                        activation=tf.nn.relu, padding='SAME', name='conv0')
layer0_pool = MaxPoolLayer(layer0_conv.output, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])

In [17]:
# ConvLayer now accepts a name parameter, so pass a name when creating the layer
layer1_conv = ConvLayer(layer0_pool.output, filter_size=filter_size2, num_input_channels = num_filters1,
                        num_filters=num_filters2, strides=[1, 1, 1, 1], 
                        activation=tf.nn.relu, padding="SAME", name='conv1')
layer1_pool = MaxPoolLayer(layer1_conv.output, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])

In [18]:
num_features = layer1_pool.output.get_shape()[1:4].num_elements()
layer2_flatten = FlattenLayer(layer1_pool.output, shape=[-1, num_features])
layer2_dropout = DropoutLayer(layer2_flatten.output)

In [19]:
# As with ConvLayer, define a name_scope and then add the parameters to the summary.
fc_size = 128
# define name_scope
with tf.name_scope('fc'):
    # Note: tf.name_scope prefixes ops and tf.Variable names, but variables created
    # with tf.get_variable ignore it, so these two are named 'fc_w' and 'fc_b'.
    weights_fc = tf.get_variable('fc_w', [num_features, fc_size], tf.float32)
    biases_fc = tf.get_variable('fc_b', [fc_size], tf.float32)
    fc_output = tf.nn.relu(tf.matmul(layer2_dropout.output, weights_fc) + biases_fc)
    
    # add weights and biases to summary
    tf.summary.histogram('weights', weights_fc)
    tf.summary.histogram('biases', biases_fc)

In [20]:
#define name_scope
with tf.name_scope('output_layer'):
    weights_output = tf.get_variable('output_w', [fc_size, num_labels], tf.float32)
    biases_output = tf.get_variable('output_b', [num_labels], tf.float32)
    output = tf.matmul(fc_output, weights_output) + biases_output
    
    # add weights and biases to summary
    tf.summary.histogram('weights', weights_output)
    tf.summary.histogram('biases', biases_output)

In [21]:
y_pred = tf.nn.softmax(output)
y_pred_cls = tf.argmax(y_pred, axis=1)

The cost function also needs to be visualized, to check how the cost of the model varies over time. The weights and biases we added to the summary earlier are viewed as distributions (histograms), but the cost is a single scalar value, so we store it as a scalar summary.
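
To see why the cost is a single number, here is a minimal pure-Python sketch of what a softmax cross-entropy followed by a mean computes. The helper names and the toy logits and labels are made up for illustration; the notebook itself relies on TensorFlow's built-in ops for this.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot):
    # Negative log-probability assigned to the true class.
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


# Toy batch of two examples with three classes (illustrative values only).
batch_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
batch_labels = [[1, 0, 0], [0, 1, 0]]

per_example = [cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels)]
cost = sum(per_example) / len(per_example)  # one scalar for the whole batch

print(len(per_example))  # 2
print(cost)
```

There is one cross-entropy value per example, and averaging them collapses the batch into the single value that is logged as a scalar.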

In [22]:
#define name_scope
with tf.name_scope('cost'):
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=y_true)
    cost = tf.reduce_mean(cross_entropy)
    
    #add the parameter cost to summary
    tf.summary.scalar('cost', cost)

In [23]:
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

The accuracy is another value that helps us understand whether the model is working well.
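
As a rough illustration of what the accuracy measures, the sketch below compares the predicted class (the index of the largest probability) with the true class (the index of the 1 in the one-hot label) and averages the matches. The `argmax` helper and the toy values are assumptions made for this example.

```python
def argmax(values):
    # Index of the largest entry (the first one in case of ties).
    return max(range(len(values)), key=lambda i: values[i])


# Toy predictions and one-hot labels for four examples (illustrative values only).
predictions = [[0.1, 0.7, 0.2],
               [0.8, 0.1, 0.1],
               [0.3, 0.3, 0.4],
               [0.2, 0.5, 0.3]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [0, 1, 0]]

correct = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
accuracy = sum(correct) / float(len(correct))

print(correct)   # [True, True, False, True]
print(accuracy)  # 0.75
```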

In [24]:
#define name_scope
with tf.name_scope('accuracy'):
    correct_prediction = tf.equal(y_pred_cls, tf.argmax(y_true, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    #add accuracy to the summary
    tf.summary.scalar('accuracy', accuracy)

In [25]:
# specify a directory to save the model and the parameters
LOGDIR = '/tmp/mnist/'

Since we added different types of summaries (histogram, image and scalar), we need to tell TensorFlow to combine them into a single op, so that one call evaluates them all and they are written to the same file. This is done with the merge_all function.

In [26]:
merged_summary = tf.summary.merge_all()

Next, let's set up storage for the embeddings that the model has learnt. For this we pass the first 1024 test images through the model and visualize the embedding it has learnt for those inputs.

In [27]:
# Create a variable to hold the embedding.
# Its shape is [number of images passed, size of the last fully connected layer], i.e. [1024, fc_size].
# The last fully connected layer is used because it is the point at which the model has
# extracted all of its features, just before they are passed on to the output layer.
embedding_var = tf.Variable(tf.zeros([1024, fc_size]), name="embedding")

# specify which values should be used for embedding - the last fully connected layer output.
assignment = embedding_var.assign(fc_output)

In [28]:
session = tf.Session()
session.run(tf.global_variables_initializer())

When visualizing the embedding, we would like to see each point as its image, with its true label displayed, so that we can better judge how well the embedding has been learnt. So far we have only passed the embedding values, which would be displayed as plain points. To see images instead, we also need to tell TensorBoard the image and the true label of each of those inputs.

TensorBoard expects these to be passed as specific files: <br/>
A .tsv file containing the true labels of the data. <br/>
A sprite image file that contains all of the images in a single file. <br/>
(You can read more about these files on the TensorFlow website.)
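
This notebook assumes that both files already exist under `data/`. As a hedged sketch of how they could be produced, the pure-Python helpers below write one label per line (a single-column metadata file has no header row) and tile flat images row by row into a square grid. The function names are hypothetical, and writing the resulting grid out as a PNG would still need an image library, which is not shown.

```python
import io
import math


def sprite_grid_size(num_images):
    # Smallest n such that an n x n grid can hold all thumbnails.
    return int(math.ceil(math.sqrt(num_images)))


def write_labels_tsv(class_ids, stream):
    # One label per line, in the same order as the embedded inputs.
    for class_id in class_ids:
        stream.write(u"{}\n".format(class_id))


def make_sprite(images, image_size):
    # images: list of flat lists, each of length image_size * image_size.
    # Returns the sprite as a list of pixel rows; unused cells stay 0.0.
    n = sprite_grid_size(len(images))
    side = n * image_size
    sprite = [[0.0] * side for _ in range(side)]
    for idx, img in enumerate(images):
        row0 = (idx // n) * image_size   # thumbnails are placed row by row
        col0 = (idx % n) * image_size
        for r in range(image_size):
            for c in range(image_size):
                sprite[row0 + r][col0 + c] = img[r * image_size + c]
    return sprite


# For 1024 MNIST images of 28 x 28 pixels:
print(sprite_grid_size(1024))        # 32
print(sprite_grid_size(1024) * 28)   # 896

# Labels are written to an in-memory buffer here instead of a real file.
buf = io.StringIO()
write_labels_tsv([7, 2, 1], buf)
```

With 1024 images the grid is 32 x 32 thumbnails, so the sprite would be 896 x 896 pixels.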

In [29]:
# create a projector config
config = tf.contrib.tensorboard.plugins.projector.ProjectorConfig()
# add an embedding entry to the config
embedding = config.embeddings.add()
# point the entry at the tensor we created for the embedding (embedding_var)
embedding.tensor_name = embedding_var.name
# set metadata_path to the path of the .tsv file containing the true labels
embedding.metadata_path = os.path.join(os.getcwd(), 'data/labels_1024.tsv')
# set the path of the sprite image
embedding.sprite.image_path = os.path.join(os.getcwd(), 'data/sprite_1024.png')
# Since all images are packed into a single file, we need to specify the dimensions of
# each individual image in that file so that TensorBoard can identify them.
embedding.sprite.single_image_dim.extend([28, 28])

Now that we have set up everything we want to visualize, we need to write it to a file.

In [30]:
# create a writer instance that saves to the log directory
writer = tf.summary.FileWriter(LOGDIR)
# write the computation graph to the file
writer.add_graph(session.graph)

In [31]:
# The next line writes a projector_config.pbtxt file to LOGDIR. TensorBoard reads this file at startup.
# It is a config file that records the paths of the sprite image and the .tsv file.
tf.contrib.tensorboard.plugins.projector.visualize_embeddings(writer, config)

In [32]:
batch_size = 64
epoch = 10
saver = tf.train.Saver()

In [33]:
import math
def train(epoch=10, batch_size=64):
    # number of batches per epoch, as an int so it can be used in range() and as a step count
    total_batch = int(math.ceil(float(len(data.train.images))/batch_size))
    print '------------------TRAINING------------------'
    for i in range(epoch):
        avg_cost = 0
        # Save the current embeddings (the initial ones when i is 0). We pass the first 1024 test images
        # as input, with the dropout keep probability set to 1.0 so that no units are dropped.
        session.run(assignment, feed_dict={x: data.test.images[:1024], y_true: data.test.labels[:1024], layer2_dropout.keep_prob:1.0})
        # Save the model with i as the checkpoint suffix, so that we can track the learning at different epochs.
        saver.save(session, os.path.join(LOGDIR, "model.ckpt"), i)

        for j in range(total_batch):
            x_batch, y_batch = data.train.next_batch(batch_size)
            
            feed_dict = {x: x_batch, y_true: y_batch, layer2_dropout.keep_prob:0.5}
            
            # At each batch, compute the summary and write it to file, using the overall batch index as the step.
            s = session.run(merged_summary, feed_dict)
            writer.add_summary(s, i*total_batch+j)

            _, c = session.run([optimizer, cost], feed_dict)
            avg_cost += c / total_batch
        print 'Epoch :{0} completed. Error :{1}'.format(i+1, avg_cost)
    print '-------------TRAINING COMPLETED-------------'
    # Save the final model under the same checkpoint prefix, with the epoch count as suffix.
    saver.save(session, os.path.join(LOGDIR, "model.ckpt"), epoch)

In [34]:
train(epoch=10)
session.close()

------------------TRAINING------------------
Epoch :1 completed. Error :0.218448304452
Epoch :2 completed. Error :0.0743830678518
Epoch :3 completed. Error :0.0593719641093
Epoch :4 completed. Error :0.0471451916324
Epoch :5 completed. Error :0.0416919785725
Epoch :6 completed. Error :0.0365576494207
Epoch :7 completed. Error :0.0318568797937
Epoch :8 completed. Error :0.0288753090377
Epoch :9 completed. Error :0.0262601324376
Epoch :10 completed. Error :0.0234545576143
-------------TRAINING COMPLETED-------------
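
The step passed to `add_summary` in the training loop is the overall batch index. The short sketch below works through that arithmetic for the sizes used in this notebook; `global_step` is a hypothetical helper introduced only for this example.

```python
import math

num_train = 55000   # training set size printed earlier in this notebook
batch_size = 64
epochs = 10

# Number of batches needed to cover the training set once.
batches_per_epoch = int(math.ceil(num_train / float(batch_size)))


def global_step(epoch_index, batch_index):
    # Overall batch index: all batches of the previous epochs plus the current one.
    return epoch_index * batches_per_epoch + batch_index


print(batches_per_epoch)                                 # 860
print(global_step(0, 0))                                 # 0
print(global_step(epochs - 1, batches_per_epoch - 1))    # 8599
```

So the x-axis of the scalar and histogram charts in TensorBoard runs over 8600 steps for this run, one per training batch.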


In [35]:
print('Done training!')
print('Run `tensorboard --logdir=%s` in terminal and see the results in tensorboard.' % LOGDIR)

Done training!
Run `tensorboard --logdir=/tmp/mnist/` in terminal and see the results in tensorboard.
