# TensorBoard and Debugging TensorFlow

From the TensorFlow website:

> The computations you'll use TensorFlow for - like training a massive deep neural network - can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, we've included a suite of visualization tools called TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it.

One key tool for organization is **`tf.name_scope`**. It prefixes the names of all operations created inside it, so they are grouped under one node in TensorBoard's graph view. This is useful when you have a deep neural network and want to distinguish the different layers and operations.

TensorFlow has a special kind of operation called a **summary**, which produces serialized protocol buffers describing a tensor's value. Summaries can be created for scalars, images, histograms, and audio. `tf.summary.FileWriter` writes these summaries to disk, where TensorBoard reads them.
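
As a minimal sketch of how these pieces fit together, the snippet below creates one op inside a name scope, attaches a scalar summary to it, and writes the result with `tf.summary.FileWriter`. It assumes the TF 1.x graph API is reachable as `tensorflow.compat.v1`, which should hold for recent TensorFlow releases; on older 1.x installs a plain `import tensorflow as tf` is expected to behave the same way. The scope name `layer1`, the op name `total`, and the temporary log directory are made up for illustration.

```python
import os
import tempfile

# Assumption: the TF 1.x graph API, reached through the compat module.
import tensorflow.compat.v1 as tf

logdir = tempfile.mkdtemp()  # hypothetical log directory for this sketch

graph = tf.Graph()
with graph.as_default():
    with tf.name_scope("layer1"):
        # Ops created here get the "layer1/" prefix in the graph.
        total = tf.add(tf.constant(1.0), tf.constant(2.0), name="total")
        tf.summary.scalar("total", total)

    # A single op that evaluates every summary defined in this graph.
    merged = tf.summary.merge_all()

    with tf.Session(graph=graph) as sess:
        writer = tf.summary.FileWriter(logdir, sess.graph)
        serialized = sess.run(merged)      # serialized Summary protocol buffer
        writer.add_summary(serialized, 0)  # recorded at global step 0
        writer.close()

scoped_name = total.op.name
event_files = os.listdir(logdir)
print(scoped_name)
print(event_files)
```

Pointing `tensorboard --logdir` at the printed directory's parent or at the directory itself should show both the graph, with `layer1` as a collapsible node, and the scalar.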

To introduce these concepts, we will try to debug a broken neural network with TensorBoard.

In [4]:
import tensorflow as tf

In [5]:
LOGDIR = "/tmp/mnist_tutorial/"
mnist = tf.contrib.learn.datasets.mnist.read_data_sets(train_dir=LOGDIR + "data", one_hot=True)

Extracting /tmp/mnist_tutorial/data/train-images-idx3-ubyte.gz
Extracting /tmp/mnist_tutorial/data/train-labels-idx1-ubyte.gz
Extracting /tmp/mnist_tutorial/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/mnist_tutorial/data/t10k-labels-idx1-ubyte.gz


Notice the summaries below. They allow us to visualize the weights, biases, and activations of our layers.

In [6]:

def conv_layer(input, size_in, size_out, name="conv"):
    with tf.name_scope(name):
        # Pass name= to tf.Variable itself so the variables are labeled W and B in the graph
        w = tf.Variable(tf.zeros([5, 5, size_in, size_out]), name="W")
        b = tf.Variable(tf.zeros([size_out]), name="B")
        conv = tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding="SAME")
        act = tf.nn.relu(conv + b)
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return tf.nn.max_pool(act, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")


def fc_layer(input, size_in, size_out, name="fc"):
    with tf.name_scope(name):
        w = tf.Variable(tf.zeros([size_in, size_out]), name="W")
        b = tf.Variable(tf.zeros([size_out]), name="B")
        act = tf.matmul(input, w) + b
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return act


The code below builds a neural network model and, somewhat hackily, supports switching between one and two layers of each type. Notice the name scopes and how they show up in TensorBoard.

In [7]:

def mnist_model(learning_rate, use_two_fc, use_two_conv, hparam):
    tf.reset_default_graph()
    sess = tf.Session()

    # Set up placeholders, and reshape the data
    x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    tf.summary.image('input', x_image, 3)
    y = tf.placeholder(tf.float32, shape=[None, 10], name="labels")

    if use_two_conv:
        conv1 = conv_layer(x_image, 1, 32, "conv1")
        conv_out = conv_layer(conv1, 32, 64, "conv2")
    else:
        conv1 = conv_layer(x_image, 1, 64, "conv")
        conv_out = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    flattened = tf.reshape(conv_out, [-1, 7 * 7 * 64])


    if use_two_fc:
        fc1 = fc_layer(flattened, 7 * 7 * 64, 1024, "fc1")
        relu = tf.nn.relu(fc1)
        embedding_input = relu
        tf.summary.histogram("fc1/relu", relu)
        embedding_size = 1024
        logits = fc_layer(fc1, 1024, 10, "fc2")
    else:
        embedding_input = flattened
        embedding_size = 7*7*64
        logits = fc_layer(flattened, 7*7*64, 10, "fc")

    with tf.name_scope("xent"):
        xent = tf.reduce_mean(
                tf.nn.softmax_cross_entropy_with_logits(
                        logits=logits, labels=y), name="xent")
        tf.summary.scalar("xent", xent)

    with tf.name_scope("train"):
        train_step = tf.train.AdamOptimizer(learning_rate).minimize(xent)

    with tf.name_scope("accuracy"):
        correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.summary.scalar("accuracy", accuracy)

    summ = tf.summary.merge_all()
    
    sess.run(tf.global_variables_initializer())
    
    # FileWriter writes the information we want to visualize
    writer = tf.summary.FileWriter(LOGDIR + hparam)
    writer.add_graph(sess.graph)
    
    # Training
    for i in range(100):
        batch = mnist.train.next_batch(100)
        if i % 5 == 0:
            [train_accuracy, s] = sess.run([accuracy, summ], feed_dict={x: batch[0], y: batch[1]})
            writer.add_summary(s, i)
            print("Iteration: {} Accuracy: {}".format(i, train_accuracy))
        sess.run(train_step, feed_dict={x: batch[0], y: batch[1]})
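
The `7 * 7 * 64` in the reshape follows from the pooling arithmetic: with `padding="SAME"`, an op with stride `s` maps a spatial size `n` to `ceil(n / s)`. Both branches apply two stride-2 max-pools to the 28x28 input (in the single-conv branch, one inside `conv_layer` and one explicit). A small sketch of that calculation, using a hypothetical helper `same_output_size` that is not part of TensorFlow:

```python
import math


def same_output_size(input_size, stride):
    """Spatial output size of a conv or pool op with padding="SAME"."""
    return int(math.ceil(input_size / float(stride)))


size = 28                         # MNIST images are 28x28
size = same_output_size(size, 1)  # 5x5 conv, stride 1: size unchanged
size = same_output_size(size, 2)  # first 2x2 max-pool, stride 2
size = same_output_size(size, 1)  # second conv (two-conv branch only)
size = same_output_size(size, 2)  # second 2x2 max-pool, stride 2

channels = 64                     # output channels of the last conv layer
flattened_size = size * size * channels
print("spatial size: {}, flattened: {}".format(size, flattened_size))
```

The spatial size comes out as 7, so each example flattens to 3136 values, which is the `size_in` passed to the first fully connected layer.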

In [None]:
"""
Exercise: Spot the bug!

Run the code below and then visualize the TensorBoard output via
tensorboard --logdir /tmp/mnist_tutorial/(____insert prefix_name here____)

Try to fix the bug, change the prefix, and then run again.
"""

prefix_name = "debug"
mnist_model(1e-3, True, False, prefix_name)

TensorBoard also makes it easy to compare runs with different hyperparameters. Below we have a handy function for constructing an experiment name from its hyperparameters.

In [2]:
def make_hparam_string(learning_rate, use_two_fc, use_two_conv):
    conv_param = "conv=2" if use_two_conv else "conv=1"
    fc_param = "fc=2" if use_two_fc else "fc=1"
    return "lr_%.0E,%s,%s" % (learning_rate, conv_param, fc_param)
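
To see what names this produces, here is a small sketch that repeats the helper so it runs on its own and enumerates a grid with `itertools.product`. The extra learning rate `1e-4` is only an example value, not one used elsewhere in this notebook.

```python
import itertools


def make_hparam_string(learning_rate, use_two_fc, use_two_conv):
    # Same formatting as the notebook's helper, repeated so this runs alone.
    conv_param = "conv=2" if use_two_conv else "conv=1"
    fc_param = "fc=2" if use_two_fc else "fc=1"
    return "lr_%.0E,%s,%s" % (learning_rate, conv_param, fc_param)


# Hypothetical grid: the 1e-4 learning rate is an example value only.
learning_rates = [1e-3, 1e-4]
use_two_fcs = [True]
use_two_convs = [True, False]

# product() iterates in the same order as three nested for loops.
grid = itertools.product(learning_rates, use_two_fcs, use_two_convs)
run_names = [make_hparam_string(lr, fc, conv) for lr, fc, conv in grid]

for name in run_names:
    print(name)
```

Each name becomes a subfolder of the log directory, so distinct hyperparameter settings must map to distinct strings. Note that `%.0E` keeps only one significant digit, so learning rates such as `1e-3` and `1.4e-3` would collide.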


For this part of the notebook, try adding a learning rate other than 1e-3. Visualize your results in TensorBoard!

In [8]:

LOGDIR = "/tmp/tuning/"
learning_rates = [1e-3]
use_two_fcs = [True]
use_two_convs = [True, False]

for learning_rate in learning_rates:
    for use_two_fc in use_two_fcs:
        for use_two_conv in use_two_convs:
            # Construct a hyperparameter string for each one (example: "lr_1E-03,conv=2,fc=2")
            hparam = make_hparam_string(learning_rate, use_two_fc, use_two_conv)
            print('Starting run for %s' % hparam)

            # Actually run with the new settings
            mnist_model(learning_rate, use_two_fc, use_two_conv, hparam)
# Print the closing messages once, after every run has finished
print('Done training!')
print('Run `tensorboard --logdir=%s` to see the results.' % LOGDIR)
print('Running on Mac? If you want to get rid of the dialog asking to give '
      'network permissions to TensorBoard, you can provide this flag: '
      '--host=localhost')

Starting run for lr_1E-03,conv=2,fc=2


AttributeError: 'module' object has no attribute 'image'

In your terminal, run:

`tensorboard --logdir /tmp/tuning`

Notice that this is the parent folder of all the experiments you just launched, so TensorBoard picks up the data logged in each subfolder and shows every experiment as a separate run.
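
A rough, stdlib-only illustration of that discovery step is sketched below. It is not TensorBoard's actual implementation: the helper `find_runs`, the temporary folder standing in for `/tmp/tuning`, and the empty placeholder event files are all assumptions made for the example.

```python
import os
import tempfile

# Hypothetical layout: a temporary parent folder stands in for /tmp/tuning,
# and empty placeholder files stand in for real event files.
logdir = tempfile.mkdtemp()
for run_name in ["lr_1E-03,conv=2,fc=2", "lr_1E-03,conv=1,fc=2"]:
    run_dir = os.path.join(logdir, run_name)
    os.makedirs(run_dir)
    open(os.path.join(run_dir, "events.out.tfevents.0.example"), "w").close()


def find_runs(parent):
    """Return subfolders of `parent` that contain at least one event file."""
    runs = []
    for dirpath, _, filenames in os.walk(parent):
        if any("tfevents" in name for name in filenames):
            runs.append(os.path.relpath(dirpath, parent))
    return sorted(runs)


runs = find_runs(logdir)
print(runs)
```

Building paths with `os.path.join`, as here, also avoids depending on a trailing slash in the parent folder name, which plain string concatenation such as `LOGDIR + hparam` requires.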

For the remainder of the time, feel free to play around with other parameter settings. You can try adding more layers, more units in each layer, and various regularization techniques.

## Additional useful resources

https://www.tensorflow.org/get_started/summaries_and_tensorboard