# Using TensorBoard

TensorBoard is a visualization tool that makes it easier to understand, debug, and optimize TensorFlow programs. For example, you can use TensorBoard to visualize the progress of training loss and accuracy, the distribution of the internal weights of a neural net, or the images that pass through the network.

## Summary Operations

To record the state during a TensorFlow session, TensorBoard uses computational nodes called *summary operations* that are added to the computation graph.

There are several kinds of summary operations:
- scalar
- histogram
- tensor summary
- image
- audio

In addition to summary operations, TensorBoard is also able to visualize the computation graph itself. Using the simple regression model for MNIST data, we examine how to attach summary operations to your computation graph and visualize them using TensorBoard.

## The standard setup and learning data

In [1]:
import tensorflow as tf

In [2]:
tf.reset_default_graph()

Fetch and one-hot encode the MNIST data set.

In [3]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('_tmp_mnist_data', one_hot=True)

Extracting _tmp_mnist_data/train-images-idx3-ubyte.gz
Extracting _tmp_mnist_data/train-labels-idx1-ubyte.gz
Extracting _tmp_mnist_data/t10k-images-idx3-ubyte.gz
Extracting _tmp_mnist_data/t10k-labels-idx1-ubyte.gz


## Simple regression model

Let's define a simple regression model, just like we did in the TensorFlow tutorial.

In [4]:
tf.reset_default_graph()

# Inputs
x = tf.placeholder(tf.float32, [None, 784]) # None means that a dimension can be of any length

# Weights
W = tf.Variable(tf.zeros([784, 10]))
# Biases
b = tf.Variable(tf.zeros([10]))

# Model output
y = tf.matmul(x, W) + b

# Correct output
y_ = tf.placeholder(tf.float32, [None, 10])

We make one small change to the TF tutorial: `learning_rate` is defined as a variable, so that we can change it later during training.

In [5]:
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
learning_rate = tf.Variable(0.1)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
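
To make the loss above more concrete, here is a minimal pure-Python sketch of what softmax cross entropy computes for a single example. The helper `softmax_cross_entropy` and the example label are hypothetical and written only for illustration; this is not the TensorFlow implementation, which also handles whole batches.

```python
import math

def softmax_cross_entropy(labels, logits):
    """Cross entropy between a one-hot label vector and softmax(logits)."""
    # Subtract the largest logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(t * lp for t, lp in zip(labels, log_probs))

# W and b start as zeros, so every logit is 0 before any training step.
logits = [0.0] * 10
labels = [0.0] * 10
labels[3] = 1.0  # arbitrary illustrative label: the correct digit is 3

initial_loss = softmax_cross_entropy(labels, logits)
print(initial_loss, math.log(10))
```

Under these assumptions the two printed numbers should agree: with all-zero weights the model assigns probability 1/10 to every digit, so the loss before training is ln 10, roughly 2.3.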

In [6]:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
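
The accuracy node compares the index of the largest model output with the index of the 1 in the one-hot label, and averages the matches over the batch. The following is a rough pure-Python sketch of the same idea; the helper names and the tiny 3-class data are made up for illustration.

```python
def argmax(values):
    """Index of the largest element (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def batch_accuracy(outputs, one_hot_labels):
    """Fraction of rows where the predicted and true class agree."""
    hits = [float(argmax(y) == argmax(t))
            for y, t in zip(outputs, one_hot_labels)]
    return sum(hits) / len(hits)

# Made-up model outputs and one-hot labels for a 3-class problem.
outputs = [[2.0, 0.1, -1.0], [0.3, 0.2, 0.9], [0.0, 1.5, 0.4], [1.0, 0.8, 0.2]]
labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0], [1, 0, 0]]

print(batch_accuracy(outputs, labels))  # 3 of 4 rows match -> 0.75
```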

## Configuring TensorBoard

TensorBoard reads its data from a logging directory. Each project should have its own logging directory, and it helps to give each task we want to investigate separately its own subdirectory. In this tutorial we also add a timestamp to the directory name so that data from different runs does not get mixed up; whether you want this depends on your task.

In [7]:
from datetime import datetime
now = datetime.now()
logdir = "logs/basic-softmax-regression/run-" + now.strftime("%Y-%m-%dT%H-%M-%S") + "/"
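
If you create log directories in several places, the same naming scheme can be wrapped in a small helper. This is only a sketch: `make_run_logdir` and its parameters are hypothetical names, and the fixed date is an arbitrary value used to make the example reproducible.

```python
import os
from datetime import datetime

def make_run_logdir(root, task, now=None):
    """Return '<root>/<task>/run-<timestamp>/' for a single run."""
    now = now or datetime.now()
    run_name = "run-" + now.strftime("%Y-%m-%dT%H-%M-%S")
    # Joining with a trailing "" keeps the trailing path separator.
    return os.path.join(root, task, run_name, "")

# Arbitrary fixed timestamp, only to make the output predictable.
example = make_run_logdir("logs", "basic-softmax-regression",
                          now=datetime(2017, 1, 2, 3, 4, 5))
print(example)
```

Because every run gets its own subdirectory, pointing TensorBoard at the parent directory should show the runs side by side rather than mixed together.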

Next, we define a few TensorBoard summary operations.

In [8]:
tf.summary.scalar('accuracy', accuracy)
tf.summary.scalar('learning_rate', learning_rate)
tf.summary.histogram("weights", W)

<tf.Tensor 'weights:0' shape=() dtype=string>

Using the `merge_all` convenience function, we combine all our summary operations into a single computation node. This merged node *depends* on all the summary nodes. We will see later why this is useful.

In [9]:
merged = tf.summary.merge_all()

## Visualizing the computational graph with TensorBoard

We start by creating a session. All the computational nodes defined above are part of the default graph, and when we create a session without explicitly passing a graph to it, the session automatically attaches to the default graph.

In [10]:
session = tf.Session()

TensorBoard visualizes data from log files written during TensorFlow sessions. To write the summaries to a file, we need to define a log writer. If we pass a graph to it, it will also write the graph to the log file, which allows us to visualize it.

In [11]:
log_writer = tf.summary.FileWriter(logdir, session.graph)

You can now launch TensorBoard and examine the graph. In your terminal, write:

    cd <root directory of tensorgroup-study repository>
    tensorboard --logdir week2/logs

Now open http://0.0.0.0:6006 to see TensorBoard. Go to the `Graph` tab to examine the computational graph.

## Training the model

Next, we initialize global variables.

In [12]:
init = tf.global_variables_initializer()
session.run(init)

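We define a training function that lets us train the model for N epochs. This allows us to manually change the learning rate during training. Note that an "epoch" here means a single training step on one mini-batch of 100 examples, not a full pass over the training data.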

A few things to note about the training function:

1) The `epoch` step counter is defined as a global variable, so that it doesn't reset each time we call the `train` function. The effect of this can be seen later when we examine the TensorBoard data.

2) We pass the `merged` summary operation to the `run` method. If we don't do this, the summary operations are not calculated: TensorFlow only evaluates the parts of the computational graph that the requested nodes depend on, and nothing else depends on the summary operations. We could pass each individual summary operation to the `run` method, but the `merged` node makes it more convenient. As it depends on all the summary operations, passing it is enough to trigger the computation of all summaries.

3) Passing `accuracy` to `run` is not needed from the perspective of TensorBoard. We use it only for printing, but it lets us compare the occasionally printed values with the denser TensorBoard data.
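
The dependency argument in point 2 can be illustrated with a toy graph in plain Python. This is a deliberately simplified model, not how TensorFlow is implemented; `Node`, `toy_run` and the `toy_` variables are hypothetical names chosen so that they do not clash with the notebook's own variables.

```python
class Node:
    """A toy computation node with a name and a list of dependencies."""
    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)

def toy_run(fetches, evaluated=None):
    """Evaluate the fetched nodes plus everything they depend on."""
    evaluated = [] if evaluated is None else evaluated
    for node in fetches:
        toy_run(node.deps, evaluated)      # dependencies come first
        if node.name not in evaluated:     # each node is evaluated once
            evaluated.append(node.name)
    return evaluated

toy_accuracy = Node("accuracy")
toy_acc_summary = Node("accuracy_summary", [toy_accuracy])
toy_weights_summary = Node("weights_summary")
toy_merged = Node("merged", [toy_acc_summary, toy_weights_summary])
toy_train_step = Node("train_step")

print(toy_run([toy_train_step]))              # no summary is evaluated
print(toy_run([toy_merged, toy_train_step]))  # all summaries are evaluated
```

In this toy model, fetching only the training step leaves the summary nodes untouched, while fetching the merged node pulls in every summary it depends on.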

In [13]:
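epoch = 0  # global step counter, shared across calls to train()
def train(epochs):
    global epoch
    for i in range(epochs):
        # Fetch the next mini-batch of 100 training examples
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # Evaluate the summaries and the accuracy, and take one training step
        summary, accuracy_val, _ = session.run([merged, accuracy, train_step],
                                               feed_dict={x: batch_xs, y_: batch_ys})
        # Write the summaries to the log file, indexed by the global step
        log_writer.add_summary(summary, epoch)
        epoch += 1
        if epoch % 100 == 0:
            print("Epoch {}, accuracy = {}".format(epoch, accuracy_val))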
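
The role of the global counter can be seen without TensorFlow. In the sketch below, `toy_train`, `step` and `logged_steps` are hypothetical stand-ins for `train`, `epoch` and the step values passed to `add_summary`.

```python
step = 0            # module-level counter, analogous to `epoch` above
logged_steps = []   # stands in for the steps passed to add_summary

def toy_train(n):
    """Pretend to train for n steps, recording each step index."""
    global step
    for _ in range(n):
        logged_steps.append(step)
        step += 1

toy_train(3)
toy_train(2)
print(logged_steps)  # [0, 1, 2, 3, 4]
```

Because the counter lives outside the function, the second call continues where the first one stopped, so repeated training calls produce one continuous curve instead of several curves that all start at step 0.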

Let's train the model for 1000 epochs. 

In [14]:
train(1000)

Epoch 100, accuracy = 0.8299999833106995
Epoch 200, accuracy = 0.8700000047683716
Epoch 300, accuracy = 0.8899999856948853
Epoch 400, accuracy = 0.9200000166893005
Epoch 500, accuracy = 0.8999999761581421
Epoch 600, accuracy = 0.8600000143051147
Epoch 700, accuracy = 0.9399999976158142
Epoch 800, accuracy = 0.9599999785423279
Epoch 900, accuracy = 0.9300000071525574
Epoch 1000, accuracy = 0.8999999761581421


Go to TensorBoard and examine the scalar values and histograms. You should see that the model has learned, but that the accuracy varies widely from one step to the next. Could this be due to the learning rate?

Next, we manually decrease the learning rate to see if it helps.

In [15]:
session.run([tf.assign(learning_rate, 0.01)])
train(1000)

Epoch 1100, accuracy = 0.9200000166893005
Epoch 1200, accuracy = 0.9100000262260437
Epoch 1300, accuracy = 0.8899999856948853
Epoch 1400, accuracy = 0.9200000166893005
Epoch 1500, accuracy = 0.9200000166893005
Epoch 1600, accuracy = 0.8999999761581421
Epoch 1700, accuracy = 0.9700000286102295
Epoch 1800, accuracy = 0.8799999952316284
Epoch 1900, accuracy = 0.9300000071525574
Epoch 2000, accuracy = 0.8799999952316284


In [16]:
session.run([tf.assign(learning_rate, 0.001)])
train(1000)

Epoch 2100, accuracy = 0.9200000166893005
Epoch 2200, accuracy = 0.9200000166893005
Epoch 2300, accuracy = 0.8999999761581421
Epoch 2400, accuracy = 0.8500000238418579
Epoch 2500, accuracy = 0.8700000047683716
Epoch 2600, accuracy = 0.8899999856948853
Epoch 2700, accuracy = 0.9100000262260437
Epoch 2800, accuracy = 0.9200000166893005
Epoch 2900, accuracy = 0.9399999976158142
Epoch 3000, accuracy = 0.9399999976158142
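
To put rough numbers on the variation, the printed accuracies can be summarized per learning rate. The values below are copied from the output above and rounded to two decimals; the dictionary layout is just one convenient way to organize them.

```python
from statistics import mean

# Accuracies printed above, grouped by the learning rate in effect.
printed = {
    0.1:   [0.83, 0.87, 0.89, 0.92, 0.90, 0.86, 0.94, 0.96, 0.93, 0.90],
    0.01:  [0.92, 0.91, 0.89, 0.92, 0.92, 0.90, 0.97, 0.88, 0.93, 0.88],
    0.001: [0.92, 0.92, 0.90, 0.85, 0.87, 0.89, 0.91, 0.92, 0.94, 0.94],
}

for rate, values in printed.items():
    spread = max(values) - min(values)
    print("lr={}: mean={:.3f}, min={:.2f}, max={:.2f}, spread={:.2f}".format(
        rate, mean(values), min(values), max(values), spread))
```

On these thirty values the mean stays close to 0.90 in all three phases and the spread stays around 0.1. Each printed value is measured on a single batch of 100 examples, so part of the variation may simply be sampling noise; the denser TensorBoard curves are likely a better guide than the printed samples.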
