TensorBoard provides the visualization and tooling needed for machine learning experimentation:

    1. Tracking and visualizing metrics such as loss and accuracy [Scalars]
    2. Visualizing the model graph (ops and layers) [Graphs]
    3. Viewing histograms of weights, biases, or other tensors as they change over time [Histograms, Distributions]
    4. Projecting embeddings to a lower-dimensional space [Projector]
    5. Displaying images, text, and audio data
    6. Profiling TensorFlow programs


In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [None]:
import tensorflow as tf
import datetime

In [None]:
log_dir = "/tmp/logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
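
Because the directory name embeds a timestamp, every run gets its own subdirectory under /tmp/logs/fit, so TensorBoard can show each one as a separate run. A minimal sketch of that naming scheme using only the standard library; the helper name `make_run_dir` and the fixed timestamps are hypothetical, chosen only for illustration:

```python
import datetime
import os


def make_run_dir(base_dir, now):
    """Return a per-run log directory named after the given timestamp."""
    return os.path.join(base_dir, now.strftime("%Y%m%d-%H%M%S"))


# Two runs started a minute apart (fixed timestamps, for illustration only).
first_run = make_run_dir("/tmp/logs/fit", datetime.datetime(2024, 1, 15, 9, 30, 0))
second_run = make_run_dir("/tmp/logs/fit", datetime.datetime(2024, 1, 15, 9, 31, 0))

# Same parent directory, different leaf names.
print(first_run)
print(second_run)
```

Pointing TensorBoard at the parent directory rather than at a single run should then list both runs side by side.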

##### Option 1: Use callbacks to write metrics to TensorBoard

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    
model = create_model()

tf.keras.callbacks.TensorBoard callback parameters:

    log_dir        => path of the directory where the log files to be parsed by TensorBoard are saved.

    histogram_freq => frequency (in epochs) at which to compute activation and weight histograms for the
                      layers of the model. If set to 0, histograms are not computed. Validation data (or
                      a validation split) must be specified for histogram visualizations.

    update_freq    => 'batch', 'epoch', or an integer. With 'batch', the losses and metrics are written to
                      TensorBoard after each batch; with 'epoch', after each epoch. With an integer, say
                      1000, they are written every 1000 batches. Note that writing to TensorBoard too
                      frequently can slow down training.
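
To get a feel for what the update_freq choices cost, the description above can be turned into a back-of-the-envelope count of metric writes per epoch. This is only a sketch of the documented behavior, not how Keras implements it; the helper `writes_per_epoch` is hypothetical, and the figures assume the 60000-sample MNIST training set with the default fit() batch size of 32:

```python
import math


def writes_per_epoch(num_samples, batch_size, update_freq):
    """Rough count of metric writes in one epoch for a given update_freq."""
    num_batches = math.ceil(num_samples / batch_size)
    if update_freq == "epoch":
        return 1
    if update_freq == "batch":
        return num_batches
    # Integer N: one write every N batches.
    return num_batches // update_freq


# Assumed setup: 60000 training samples, batch size 32 (1875 batches per epoch).
for freq in ("epoch", "batch", 1000, 100):
    print(freq, writes_per_epoch(60000, 32, freq))
```

Under these assumptions 'batch' produces 1875 writes per epoch, which is why a coarser integer setting is usually the safer choice for long runs.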

In [None]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Create the TensorBoard callback
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

In [None]:
# Pass tensorboard_callback to fit() so metrics are logged during training
model.fit(x=x_train, 
          y=y_train, 
          epochs=5, 
          validation_data=(x_test, y_test), 
          callbacks=[tensorboard_callback])

In [None]:
%tensorboard --logdir /tmp/logs/fit

Scalars Dashboard:

    The Scalars dashboard shows how the loss and metrics change with every epoch or batch, depending on the
    update_freq setting. You can also use it to track training speed, learning rate, and other scalar values.
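
As an illustration of tracking "other scalar values", the sketch below logs a learning rate so that it would appear as its own chart in the Scalars dashboard. The decay schedule, the tag name 'learning_rate', and the temporary log directory are assumptions made for this example, not part of the notebook's training setup:

```python
import os
import tempfile

import tensorflow as tf

# Temporary directory so the sketch does not mix with the runs under /tmp/logs.
lr_log_dir = tempfile.mkdtemp()
lr_writer = tf.summary.create_file_writer(lr_log_dir)

# Assumed schedule, for illustration: decay the rate by 10% every 100 steps.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=100, decay_rate=0.9)

with lr_writer.as_default():
    for step in range(0, 500, 100):
        # One scalar point per logged step.
        tf.summary.scalar("learning_rate", schedule(step), step=step)
lr_writer.flush()

# The directory should now contain an events.out.tfevents.* file.
event_files = os.listdir(lr_log_dir)
print(event_files)
```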

Graphs Dashboard:

    The Graphs dashboard helps you visualize your model. In this case, it shows the Keras graph of layers,
    which can help you verify that the model is built correctly.

Distributions and Histograms Dashboard:

    The Distributions and Histograms dashboards show the distribution of a tensor over time. This is useful
    for visualizing weights and biases and verifying that they change in an expected way.
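
The callback writes these histograms for you when histogram_freq is non-zero, but the same dashboards can also be fed directly. A minimal sketch, assuming a stand-in tensor whose spread widens at each step (the tag 'fake_weights' and the temporary directory are hypothetical):

```python
import os
import tempfile

import tensorflow as tf

hist_log_dir = tempfile.mkdtemp()
hist_writer = tf.summary.create_file_writer(hist_log_dir)

with hist_writer.as_default():
    for step in range(5):
        # Stand-in for a weight tensor: the spread grows a little every step.
        fake_weights = tf.random.normal([1000], stddev=1.0 + 0.5 * step)
        tf.summary.histogram("fake_weights", fake_weights, step=step)
hist_writer.flush()

hist_event_files = os.listdir(hist_log_dir)
print(hist_event_files)
```

Viewed in TensorBoard, the histogram for each later step should appear wider than the one before it, which is the kind of drift these dashboards are meant to make visible.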


##### Option 2: Using TensorBoard with other methods

In [None]:
# Use the same dataset, converted into shuffled batches of 64.
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))

train_dataset = train_dataset.shuffle(60000).batch(64)
test_dataset = test_dataset.batch(64)
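
Since batch() keeps the final partial batch by default, the number of steps per epoch follows from simple arithmetic. A quick check, assuming the standard MNIST split of 60000 training and 10000 test samples; the helper `batch_layout` is hypothetical:

```python
import math


def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the final batch)."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch


print("train", batch_layout(60000, 64))  # (938, 32)
print("test", batch_layout(10000, 64))   # (157, 16)
```

So each training epoch in the loop further down runs 938 steps, the last of which sees only 32 samples.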

In [None]:
loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()

In [None]:
# Define our metrics
train_loss = tf.keras.metrics.Mean('train_loss', dtype=tf.float32)
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy('train_accuracy')

test_loss = tf.keras.metrics.Mean('test_loss', dtype=tf.float32)
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy('test_accuracy')
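
These metric objects are stateful: each call folds new values into a running total, and result() reads the aggregate so far. A small sketch with made-up numbers (the 'demo_' names and the input values are illustrative only):

```python
import tensorflow as tf

demo_loss = tf.keras.metrics.Mean("demo_loss", dtype=tf.float32)

# Each call adds one more value to the running mean.
demo_loss(2.0)
demo_loss(4.0)
mean_after_two = float(demo_loss.result())
print(mean_after_two)  # 3.0

demo_accuracy = tf.keras.metrics.SparseCategoricalAccuracy("demo_accuracy")

# Two samples: both predictions pick class 1, but only the first label is 1.
labels = tf.constant([1, 0])
probabilities = tf.constant([[0.1, 0.9], [0.2, 0.8]])
demo_accuracy(labels, probabilities)
accuracy_after_batch = float(demo_accuracy.result())
print(accuracy_after_batch)  # 0.5
```

This accumulation is why the metrics need to be reset between epochs; otherwise each epoch's figure would be averaged together with all earlier ones.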

In [None]:
# Define the training and test functions:
def train_step(model, optimizer, x_train, y_train):
    with tf.GradientTape() as tape:
        predictions = model(x_train, training=True)
        loss = loss_object(y_train, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    train_loss(loss)
    train_accuracy(y_train, predictions)

def test_step(model, x_test, y_test):
    # training=False makes it explicit that dropout is disabled during evaluation
    predictions = model(x_test, training=False)
    loss = loss_object(y_test, predictions)

    test_loss(loss)
    test_accuracy(y_test, predictions)

In [None]:
# Set up summary writers to write the summaries to disk in a different logs directory:

current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
train_log_dir = '/tmp/logs/gradient_tape/' + current_time + '/train'
test_log_dir = '/tmp/logs/gradient_tape/' + current_time + '/test'

train_summary_writer = tf.summary.create_file_writer(train_log_dir)
test_summary_writer = tf.summary.create_file_writer(test_log_dir)

Start training. 

Use tf.summary.scalar() within the scope of a summary writer to log metrics (loss and accuracy) to disk during training and testing. You control which metrics are logged and how often. Other tf.summary functions log other types of data.
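
As a hedged illustration of those other tf.summary functions, the sketch below writes a text note and an image next to a scalar. The tags, the note text, and the all-zero 28x28 image are placeholders, and the temporary directory is used only to keep the example separate from the real runs:

```python
import os
import tempfile

import tensorflow as tf

demo_log_dir = tempfile.mkdtemp()
demo_writer = tf.summary.create_file_writer(demo_log_dir)

with demo_writer.as_default():
    # Scalar: a single number per step (placeholder value).
    tf.summary.scalar("loss", 0.25, step=0)
    # Text: free-form notes, shown in the Text dashboard.
    tf.summary.text("notes", "dropout=0.2, optimizer=adam", step=0)
    # Image: a batch shaped [count, height, width, channels].
    tf.summary.image("blank_digit", tf.zeros([1, 28, 28, 1]), step=0)
demo_writer.flush()

demo_event_files = os.listdir(demo_log_dir)
print(demo_event_files)
```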

In [None]:
model = create_model() # reset our model

EPOCHS = 5

for epoch in range(EPOCHS):
    # Batch variables get their own names so the full x_train/y_train arrays are not overwritten
    for (x_batch, y_batch) in train_dataset:
        train_step(model, optimizer, x_batch, y_batch)
    with train_summary_writer.as_default():
        tf.summary.scalar('loss', train_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', train_accuracy.result(), step=epoch)

    # Batch variables get their own names so the full x_test/y_test arrays are not overwritten
    for (x_batch, y_batch) in test_dataset:
        test_step(model, x_batch, y_batch)
    with test_summary_writer.as_default():
        tf.summary.scalar('loss', test_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', test_accuracy.result(), step=epoch)

    template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
    print(template.format(epoch + 1,
                          train_loss.result(),
                          train_accuracy.result() * 100,
                          test_loss.result(),
                          test_accuracy.result() * 100))

    # Reset metrics every epoch (reset_state() replaces the deprecated reset_states() in recent TensorFlow releases)
    train_loss.reset_state()
    test_loss.reset_state()
    train_accuracy.reset_state()
    test_accuracy.reset_state()