##### Copyright 2018 The TensorFlow Authors.

In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# TensorBoard Scalars: Logging basic training metrics in Keras

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/docs/r2/tensorboard_scalars_and_keras_basic.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/tensorboard/blob/master/docs/r2/tensorboard_scalars_and_keras_basic.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>


## Overview

A basic task in machine learning is understanding how key metrics such as loss change as training progresses. These metrics can help you understand if you're [overfitting](https://en.wikipedia.org/wiki/Overfitting), for example, or if you're unnecessarily training for too long. You may want to compare these metrics across different training runs to help debug and improve your model.

TensorBoard's **Scalars Plugin** allows you to visualize these metrics using a simple API with very little effort. These visualizations are called **Scalar Summaries**. 

This tutorial presents very basic examples to help you learn how to use Scalar Summaries when developing your Keras model. You will learn how to use the Keras TensorBoard callback and TensorFlow Summary APIs to visualize default and custom scalars.

## Setup

In [0]:
# Ensure TensorFlow 2.0 is installed.
!pip install -q tf-nightly-2.0-preview
# Load the TensorBoard notebook extension.
%load_ext tensorboard.notebook

In [0]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
from packaging import version

import tensorflow as tf
from tensorflow import keras

import numpy as np

print("TensorFlow version: ", tf.__version__)
assert version.parse(tf.__version__).release[0] >= 2, \
    "This notebook requires TensorFlow 2.0 or above."

## Set up data for a simple regression

You're now going to use Keras to calculate a regression, i.e., find the best line of fit for a paired data set. (While using neural networks and gradient descent is [overkill for this kind of problem](https://stats.stackexchange.com/questions/160179/do-we-need-gradient-descent-to-find-the-coefficients-of-a-linear-regression-mode), it does make for a very easy-to-understand example.)

You're going to use TensorBoard to observe how training and test **loss** change across epochs. Hopefully, you'll see training and test loss decrease over time and then remain steady.

First, generate 1000 data points roughly along the line *y = 0.5x + 2*. Split these data points into training and test sets. Your hope is that the neural net learns this relationship.

In [0]:
data_size = 1000
# 80% of the data is for training.
train_pct = 0.8

train_size = int(data_size * train_pct)

# Create some input data between -1 and 1 and randomize it.
X = np.linspace(-1, 1, data_size)
np.random.shuffle(X)

# Generate the output data.
# f(X) = 0.5X + 2 + noise
Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (data_size, ))

# Split into test and train pairs.
X_train, Y_train = X[:train_size], Y[:train_size]
X_test, Y_test = X[train_size:], Y[train_size:]
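
As a sanity check on data like this, a closed-form least-squares fit should recover a slope near 0.5 and an intercept near 2 without any gradient descent. The following is a minimal sketch in plain Python; it regenerates comparable data on its own, and the fixed seed and the variable names (`xs`, `ys`, `slope`, `intercept`) are assumptions made for illustration, not part of the tutorial's code.

```python
import random

random.seed(0)  # assumption: fixed seed, only so the sketch is reproducible

n = 1000
# Evenly spaced inputs in [-1, 1], as in the tutorial's np.linspace call.
xs = [-1 + 2 * i / (n - 1) for i in range(n)]
# y = 0.5x + 2 plus Gaussian noise with standard deviation 0.05.
ys = [0.5 * x + 2 + random.gauss(0, 0.05) for x in xs]

# Ordinary least squares for a single feature:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).
mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

print("slope:", slope, "intercept:", intercept)
```

If the neural net later learns the relationship well, its predictions should lie close to the line described by these two numbers.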

## Training the model and logging loss

You're now ready to define, train and evaluate your model. 

To log the *loss* scalar as you train, you'll create the Keras TensorBoard callback, specifying a log directory, and pass it to [Model.fit()](https://www.tensorflow.org/api_docs/python/tf/keras/models/Model#fit).

TensorBoard reads log data from the log directory hierarchy. In this notebook, the root log directory is "logs/scalars", suffixed by a timestamped subdirectory. This enables easy identification and selection of training runs as you use TensorBoard and iterate on your model.
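
The directory convention can be sketched on its own, without TensorFlow. The helper name `make_run_dir` and the temporary root are hypothetical and exist only for this illustration; the timestamp format is the one used in this notebook.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def make_run_dir(root):
    """Create and return root/<YYYYmmdd-HHMMSS>, one directory per training run."""
    run_dir = Path(root) / datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


# Assumption: a throwaway root so the sketch does not touch a real logs/ folder.
root = Path(tempfile.mkdtemp()) / "logs" / "scalars"
run_dir = make_run_dir(root)

# Every timestamped subdirectory under the root corresponds to one run.
runs = sorted(p.name for p in root.iterdir() if p.is_dir())
print(runs)
```

Because the timestamp is ordered from year down to second, sorting the directory names alphabetically also sorts the runs chronologically.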
 

In [0]:
logdir="logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)

model = keras.models.Sequential([
    keras.layers.Dense(16, input_dim=1),  # the input shape belongs on the first layer
    keras.layers.Dense(1),
])

model.compile(
    loss='mse', # keras.losses.mean_squared_error
    optimizer=keras.optimizers.SGD(learning_rate=0.2),
)

print("Training ... With default parameters, this takes less than 10 seconds.")
training_history = model.fit(
    X_train, # input
    Y_train, # output
    batch_size=train_size,
    verbose=0, # Suppress chatty output; use TensorBoard instead
    epochs=100,
    validation_data=(X_test, Y_test),
    callbacks=[tensorboard_callback],
)

# 'val_loss' holds the loss on the validation (test) data; 'loss' is the training loss.
print("Average test loss: ", np.average(training_history.history['val_loss']))

## Examining loss using TensorBoard

Now, start TensorBoard, specifying the root log directory.

Wait a few seconds for TensorBoard's UI to spin up. 

In [0]:
%tensorboard --logdir logs/scalars

You may see TensorBoard say that "No dashboards are active for the current data set". That's because initial logging data hasn't been saved yet. As training progresses, the Keras model will start logging data. TensorBoard will periodically refresh and show you your scalar metrics. If you're impatient, you can tap the Refresh arrow on the top right.

As you watch the training progress, note how both training and validation loss rapidly decrease and then remain stable. In fact, you could have stopped training after 25 epochs, because the training didn't improve much after that point.
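
The same judgment can be made programmatically from a list of per-epoch losses, such as `training_history.history['val_loss']`. The sketch below is one simple way to do it; the function name `first_plateau_epoch`, its default thresholds, and the synthetic loss curve are all assumptions chosen for illustration. Keras also ships a `keras.callbacks.EarlyStopping` callback that applies a similar idea during training.

```python
def first_plateau_epoch(losses, min_delta=1e-3, patience=5):
    """Return the epoch of the last meaningful improvement once the loss has
    failed to improve by more than min_delta for `patience` epochs in a row.
    Return None if the loss never plateaus within the given history."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(losses):
        if best - loss > min_delta:
            # Meaningful improvement: remember it and keep going.
            best = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No meaningful improvement for `patience` epochs.
            return best_epoch
    return None


# Synthetic curve: halves every epoch, then flattens out at 0.0025.
synthetic_losses = [max(0.5 ** epoch, 0.0025) for epoch in range(100)]
plateau_epoch = first_plateau_epoch(synthetic_losses)
print("Last useful epoch:", plateau_epoch)
```

Larger `patience` values make the check more tolerant of noisy curves, at the cost of reporting the plateau later.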

Hover over the graph to see specific data points. You can also try zooming in with your mouse, or selecting part of the graph to view more detail.

Using the "Runs" selector on the left, you can select specific runs, or select only training or validation. As you continue to experiment and develop your model, you can compare runs by selecting only those you want to examine.


OK, TensorBoard indicates that the metrics look good! Now see how the model actually behaves in real life.

Given the input data (60, 25, 2), the line *y = 0.5x + 2* should yield (32, 14.5, 3). Does the model agree?
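
The expected values follow directly from the line equation, as this short check in plain Python shows. The names `inputs` and `expected` are illustrative only. Note that all three inputs lie far outside the training range of -1 to 1, so this test asks the model to extrapolate.

```python
# Inputs from the text; the training data only covered x in [-1, 1].
inputs = [60, 25, 2]

# Evaluate y = 0.5x + 2 for each input.
expected = [0.5 * x + 2 for x in inputs]
print(expected)
```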

In [0]:
# Pass a NumPy array so the three values are treated as one batch of inputs.
print(model.predict(np.array([60, 25, 2])))

Not bad!

## Logging custom scalars

What if you want to log custom values, such as a dynamic learning rate? To do that, you need to use the TensorFlow Summary API.

Retrain the regression model and log a custom learning rate. Here's how:

1.  Create a file writer, using ```tf.summary.create_file_writer()```.
2.  Define a custom learning rate function. This will be passed to the Keras [LearningRateScheduler](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/LearningRateScheduler) callback.
3.  Inside the learning rate function, use ```tf.summary.scalar()``` to log the custom learning rate.
4.  Pass the LearningRateScheduler callback to Model.fit().

In general, to log a custom scalar, you need to use ```tf.summary.scalar()``` with a file writer. You can reuse the file writer for multiple scalars.
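
Reusing one file writer for several scalars might look like the following minimal sketch. The temporary log directory, the tag names, and the logged values are placeholders invented for illustration. The sketch scopes the writer with `as_default()` as a context manager and calls `flush()` at the end; both are assumptions about how you might structure this outside of Keras, not steps taken from this tutorial.

```python
import glob
import os
import tempfile

import tensorflow as tf

# Assumption: a throwaway directory instead of logs/scalars/<timestamp>.
logdir = os.path.join(tempfile.mkdtemp(), "custom_scalars")
writer = tf.summary.create_file_writer(logdir)

# One writer, two scalar tags, five steps each.
with writer.as_default():
    for step in range(5):
        tf.summary.scalar("placeholder rate", data=0.2 / (step + 1), step=step)
        tf.summary.scalar("placeholder constant", data=1.0, step=step)

# Make sure buffered events reach the disk before inspecting the directory.
writer.flush()

event_files = glob.glob(os.path.join(logdir, "events.out.tfevents.*"))
print(event_files)
```

Pointing TensorBoard at the parent directory would then show both tags under the same run.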

In [0]:
logdir="logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir + "/metrics")
file_writer.set_as_default()

def lr_schedule(epoch):
  """
  Returns a custom learning rate that decreases as epochs progress.
  """
  learning_rate = 0.2
  if epoch > 10:
    learning_rate = 0.02
  if epoch > 20:
    learning_rate = 0.01
  if epoch > 50:
    learning_rate = 0.005

  tf.summary.scalar('learning rate', data=learning_rate, step=epoch)
  return learning_rate

lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule)
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)

model = keras.models.Sequential([
    keras.layers.Dense(16, input_dim=1),
    keras.layers.Dense(1),
])

model.compile(
    loss='mse', # keras.losses.mean_squared_error
    optimizer=keras.optimizers.SGD(),
)

training_history = model.fit(
    X_train, # input
    Y_train, # output
    batch_size=train_size,
    verbose=0, # Suppress chatty output; use TensorBoard instead
    epochs=100,
    validation_data=(X_test, Y_test),
    callbacks=[tensorboard_callback, lr_callback],
)
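
To see which learning rate applies at which epoch, the thresholds can be tabulated without training anything. The function below, with the hypothetical name `lr_for_epoch`, mirrors the comparisons in `lr_schedule` but leaves out the summary call. Because the comparisons are strict, each new rate takes effect on the epoch after its threshold.

```python
def lr_for_epoch(epoch):
    """Mirror of the thresholds in lr_schedule, without the summary logging."""
    learning_rate = 0.2
    if epoch > 10:
        learning_rate = 0.02
    if epoch > 20:
        learning_rate = 0.01
    if epoch > 50:
        learning_rate = 0.005
    return learning_rate


# Epochs just before and after each threshold (Keras passes 0-indexed epochs).
boundary_epochs = [0, 10, 11, 20, 21, 50, 51, 99]
rates = {epoch: lr_for_epoch(epoch) for epoch in boundary_epochs}
for epoch, rate in rates.items():
    print(f"epoch {epoch:>2}: learning rate {rate}")
```

The "learning rate" graph in TensorBoard should show the same step pattern.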

Let's look at TensorBoard again.

In [0]:
%tensorboard --logdir logs/scalars

Using the "Runs" selector on the left, notice that you have a ```<timestamp>/metrics``` run. Selecting this run shows a "learning rate" graph that allows you to verify the progression of the learning rate during this run.

You can also compare this run's training and validation loss curves against your earlier runs.

How does this model do?

In [0]:
# Pass a NumPy array so the three values are treated as one batch of inputs.
print(model.predict(np.array([60, 25, 2])))