# Visualizing the Graph and Training Curves Using TensorBoard

So now we have a computation graph that trains a Linear Regression model using Mini-batch Gradient Descent, and we are saving checkpoints at regular intervals. Sounds sophisticated, doesn’t it? 

However, we are still relying on the `print()` function to visualize progress during training. 

There is a better way: enter `TensorBoard`. If you feed it some training stats, it will display nice interactive visualizations of these stats in your web browser (e.g., learning curves). You can also provide it the graph’s definition and it will give you a great interface to browse through it. 

This is very useful to identify errors in the graph, to find bottlenecks, and so on.

The first step is to tweak your program a bit so it writes the graph definition and some training stats—for example, the training error (MSE)—to a log directory that TensorBoard will read from. 

You need to **use a different log directory every time you run your program, or else TensorBoard will merge stats from different runs, which will mess up the visualizations**. 

The simplest solution for this is to include a timestamp in the log directory name. Add the following code at the beginning of the program:

In [1]:
from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "./tf_logs"
logdir = "{}/run-{}".format(root_logdir, now)
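The same naming scheme can be wrapped in a small function so that it is easy to check. This is only a sketch: `make_logdir` is a hypothetical helper name (not part of TensorFlow), and the two timestamps are picked to match the run directories listed later in this section.

```python
from datetime import datetime, timezone


def make_logdir(root_logdir="tf_logs", now=None):
    """Return a per-run log directory path such as 'tf_logs/run-20180426090242'.

    Hypothetical helper for illustration; it only builds the path string.
    """
    if now is None:
        now = datetime.now(timezone.utc)  # current UTC time, as in the cell above
    return "{}/run-{}".format(root_logdir, now.strftime("%Y%m%d%H%M%S"))


# Two runs started at different seconds get different directories.
first = make_logdir(now=datetime(2018, 4, 26, 9, 2, 42))
second = make_logdir(now=datetime(2018, 4, 26, 9, 5, 50))
print(first)   # tf_logs/run-20180426090242
print(second)  # tf_logs/run-20180426090550
```

The timestamp has one-second resolution, so two runs launched within the same second would still share a directory.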

In [2]:
import os  # needed by save_fig() below
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import fetch_california_housing

# To plot pretty figures
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12

# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "tensorflow"

def save_fig(fig_id, tight_layout=True):
    path = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID, fig_id + ".png")
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format='png', dpi=300)

# to make this notebook's output stable across runs
def reset_graph(seed= 2018):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
    
housing = fetch_california_housing()
m,n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m,1)), housing.data]
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]    




In [3]:
reset_graph()
n_epochs = 200
batch_size = 500

learning_rate = 0.01
n_batches = int(np.ceil(m/batch_size))

X = tf.placeholder(tf.float32, shape=(None, n+1), name = 'X')
y = tf.placeholder(tf.float32, shape=(None,1), name = 'y') 
theta  =  tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0), name = 'theta')
y_pred = tf.matmul(X, theta, name = 'predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name = 'mse')
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate)
#optimizer = tf.train.MomentumOptimizer(learning_rate = learning_rate, momentum = 0.9)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
saver = tf.train.Saver() 

mse_summary = tf.summary.scalar('MSE', mse) # <--- here is the scalar
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph()) # <-- here is the writer


The `tf.summary.scalar()` call creates a node in the graph that evaluates the MSE value and writes it to a TensorBoard-compatible binary log string called a summary. The `tf.summary.FileWriter()` call creates a `FileWriter` that you will use to write summaries to logfiles in the log directory. Because the graph is passed as its second argument, the `FileWriter` also writes the graph definition to the log.

Next you need to update the execution phase to evaluate the `mse_summary` node regularly during training (e.g., every 10 mini-batches). This will output a summary that you can then write to the events file using the `file_writer`. Here is the updated code:

In [4]:
def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)  
    indices = np.random.randint(m, size=batch_size)  
    X_batch = scaled_housing_data_plus_bias[indices] 
    y_batch = housing.target.reshape(-1, 1)[indices] 
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            save_path = saver.save(sess, "./temp/my_model.ckpt")
            print("Epoch", epoch, "MSE =", mse.eval(feed_dict = {X:scaled_housing_data_plus_bias, y:housing.target.reshape(-1, 1)}))        
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size) 
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)               
            sess.run(training_op, feed_dict = {X:X_batch, y:y_batch})                        
    #y_pred_value = y_pred.eval(feed_dict = {X:scaled_housing_data_plus_bias, y:housing.target.reshape(-1, 1)})
    best_theta = theta.eval()
    save_path = saver.save(sess, "./temp/my_model_final.ckpt")

file_writer.close()

Epoch 0 MSE = 4.2606196
Epoch 100 MSE = 0.52518654
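To make the `step` arithmetic in the training loop concrete, here is a small pure-Python sketch that lists the global steps at which a summary gets written. It assumes the California housing dataset has 20,640 instances (the value of `m` above), and `logged_steps` is a hypothetical helper name used only for this illustration.

```python
import math

m = 20640          # number of training instances (assumed value of m)
batch_size = 500
n_batches = math.ceil(m / batch_size)


def logged_steps(n_epochs, n_batches, every=10):
    """Global step values at which the loop above would write a summary."""
    return [epoch * n_batches + batch_index
            for epoch in range(n_epochs)
            for batch_index in range(n_batches)
            if batch_index % every == 0]


steps = logged_steps(n_epochs=2, n_batches=n_batches)
print(n_batches)  # 42
print(steps)      # [0, 10, 20, 30, 40, 42, 52, 62, 72, 82]
```

Under these assumptions each epoch contributes five points to the curve, so 200 epochs produce 1,000 MSE values. The spacing is slightly uneven at epoch boundaries (40 is followed by 42) because `batch_index` restarts at 0 every epoch.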


Now run this program: it will create the log directory and write an events file in this directory, containing both the graph definition and the MSE values. Open up a shell and go to your working directory, then type `ls -l tf_logs/run*` to list the contents of the log directory:

```bash
[Data_Science_Python] c.cui $:ls -l tf_logs/run*
total 128
-rw-r--r--  1 caihaocui  staff  62207 Apr 26 19:03 events.out.tfevents.1524733371.192-168-1-103.tpgi.com.au
```

If you run the program a second time, you should see a second directory in the `tf_logs/` directory:

```bash
[Data_Science_Python] c.cui $:ls -l tf_logs/run*
tf_logs/run-20180426090242:
total 128
-rw-r--r--  1 caihaocui  staff  62207 Apr 26 19:03 events.out.tfevents.1524733371.192-168-1-103.tpgi.com.au
tf_logs/run-20180426090550:
total 168
-rw-r--r--  1 caihaocui  staff  83218 Apr 26 19:06 events.out.tfevents.1524733551.192-168-1-103.tpgi.com.au
```
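If you prefer to inspect the log directory from Python rather than from the shell, a sketch along these lines does roughly the same job as the `ls` command. `list_runs` is a hypothetical helper, and the demo builds a throwaway directory tree with a made-up event file name instead of reading real TensorBoard logs.

```python
import tempfile
from pathlib import Path


def list_runs(root_logdir="tf_logs"):
    """Map each run directory name to the names of the event files it contains."""
    root = Path(root_logdir)
    if not root.is_dir():
        return {}
    runs = {}
    for run_dir in sorted(root.glob("run-*")):
        if run_dir.is_dir():
            runs[run_dir.name] = sorted(
                p.name for p in run_dir.glob("events.out.tfevents.*"))
    return runs


# Demo on a temporary directory tree that mimics two runs.
with tempfile.TemporaryDirectory() as tmp:
    for run_name in ("run-20180426090242", "run-20180426090550"):
        run_dir = Path(tmp) / run_name
        run_dir.mkdir()
        # Placeholder file name; real event files carry a timestamp and hostname.
        (run_dir / "events.out.tfevents.0.example-host").touch()
    runs = list_runs(tmp)

for run_name, event_files in runs.items():
    print(run_name, event_files)
```

Calling `list_runs()` with no argument looks in `tf_logs` under the current working directory and returns an empty dictionary if that directory does not exist.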

Great! Now it’s time to fire up the TensorBoard server. You need to activate your virtualenv environment if you created one, then start the server by running the `tensorboard` command, pointing it to the root log directory:

```bash
[Data_Science_Python] c.cui $:tensorboard --logdir=tf_logs
```

This starts the TensorBoard web server, listening on port 6006 (which is “goog” written upside down). Next, open a browser and go to http://0.0.0.0:6006/ (or http://localhost:6006/).

Welcome to TensorBoard! In the Events tab (labeled Scalars in more recent TensorBoard versions) you should see MSE on the right.

## Name Scopes
When dealing with more complex models such as neural networks, the graph can easily become cluttered with thousands of nodes. To avoid this, you can create name scopes to group related nodes. For example, let’s modify the previous code to define the `error` and `mse` ops within a name scope called `"loss"`:

In [5]:
reset_graph()

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")

with tf.name_scope("loss") as scope: # <-- Name Scope
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")

In [6]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init = tf.global_variables_initializer()

mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

In [7]:
n_epochs = 200
batch_size = 500
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

file_writer.flush()
file_writer.close()
print("Best theta:")
print(best_theta)

Best theta:
[[ 2.060158  ]
 [ 0.8348199 ]
 [ 0.11775211]
 [-0.2500834 ]
 [ 0.3308876 ]
 [ 0.00223646]
 [-0.04730813]
 [-0.896435  ]
 [-0.8715286 ]]


In [8]:
print(error.op.name)
print(mse.op.name)

loss/sub
loss/mse


In TensorBoard, the `mse` and `error` nodes now appear inside the `loss` namespace, which appears collapsed by default.
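The grouping follows directly from the op names: everything before the first `/` is the name scope. The sketch below illustrates that idea in plain Python. It is not how TensorBoard is implemented, `group_by_scope` is a hypothetical helper, and the list only contains op names that appear in this section.

```python
from collections import defaultdict


def group_by_scope(op_names):
    """Group op names by their top-level name scope ('' means no scope)."""
    groups = defaultdict(list)
    for name in op_names:
        scope, sep, _ = name.partition("/")
        groups[scope if sep else ""].append(name)
    return dict(groups)


op_names = ["X", "y", "theta", "predictions", "loss/sub", "loss/mse"]
groups = group_by_scope(op_names)
print(groups[""])      # ['X', 'y', 'theta', 'predictions']
print(groups["loss"])  # ['loss/sub', 'loss/mse']
```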

## Inside Jupyter
If you want to take a peek at the graph directly within Jupyter, you can use the `show_graph()` function defined below:

In [9]:
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = b"<stripped %d bytes>"%size
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [11]:
show_graph(tf.get_default_graph())

## Modularity

Suppose you want to create a graph that adds the output of two rectified linear units (ReLU). A ReLU computes a linear function of the inputs and outputs the result if it is positive, and 0 otherwise: $h(x) = \max(w \cdot x + b, 0)$.
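Before building this in TensorFlow, it may help to see the formula evaluated on plain Python lists. This is a minimal sketch with made-up weights, inputs, and biases; `relu` here is a hypothetical helper, not a TensorFlow function.

```python
def relu(w, x, b):
    """h(x) = max(w . x + b, 0) for a weight vector w, input vector x, and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(z, 0.0)


x = [1.0, 2.0, 3.0]
relu1 = relu([0.5, -1.0, 0.25], x, b=1.0)  # z = 0.5 - 2.0 + 0.75 + 1.0 = 0.25
relu2 = relu([-1.0, 0.0, 0.0], x, b=0.5)   # z = -1.0 + 0.5 = -0.5, clipped to 0.0
print(relu1 + relu2)  # 0.25
```

The second unit contributes nothing to the sum because its linear output is negative.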