
##Keras custom training and evaluation fundamentals

There are three APIs for building models in Keras:

* The Sequential model, the most approachable API—it’s basically a Python list. As such, it’s limited to simple stacks of layers.
* The Functional API, which focuses on graph-like model architectures. It represents
a nice mid-point between usability and flexibility, and as such, it’s the
most commonly used model-building API.
* Model subclassing, a low-level option where you write everything yourself from
scratch. This is ideal if you want full control over every little thing. However, you
won’t get access to many built-in Keras features, and you will be more at risk of
making mistakes.
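To make the comparison concrete, here is a minimal sketch of the same small classifier written three ways. The layer sizes and the names `sequential_model`, `functional_model`, and `SubclassedModel` are illustrative assumptions, not taken from the text above.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 1. Sequential: a plain stack of layers.
sequential_model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# 2. Functional: layers are called on tensors, which builds a graph of layers.
inputs = keras.Input(shape=(784,))
features = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(features)
functional_model = keras.Model(inputs=inputs, outputs=outputs)

# 3. Subclassing: the forward pass is written by hand in call().
class SubclassedModel(keras.Model):
  def __init__(self):
    super().__init__()
    self.hidden = layers.Dense(64, activation="relu")
    self.classifier = layers.Dense(10, activation="softmax")

  def call(self, inputs):
    return self.classifier(self.hidden(inputs))

subclassed_model = SubclassedModel()

# All three map a batch of 784-dimensional vectors to 10 class probabilities.
dummy_batch = tf.zeros((2, 784))
output_shapes = [
    tuple(m(dummy_batch).shape)
    for m in (sequential_model, functional_model, subclassed_model)
]
```

The three models are interchangeable from the caller's point of view; what changes is how much of the wiring you write yourself.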

<img src='https://github.com/rahiakela/deep-learning-research-and-practice/blob/main/deep-learning-with-python-by-francois-chollet/7-deep-dive-into-keras/images/1.png?raw=1' width='600'/>

##Setup

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

from tensorflow.keras.datasets import mnist

import random
import string
import re

import numpy as np
from matplotlib import pyplot as plt

In [2]:
(images, labels), (test_images, test_labels) = mnist.load_data()

images = images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255

train_images, val_images = images[10000:], images[:10000]
train_labels, val_labels = labels[10000:], labels[:10000]

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [3]:
def get_mnist_model():
  inputs = keras.Input(shape=(28 * 28, ))
  features = layers.Dense(512, activation="relu")(inputs)
  features = layers.Dropout(0.5)(features)
  outputs = layers.Dense(10, activation="softmax")(features)

  model = keras.Model(inputs=inputs, outputs=outputs)

  return model

##Using built-in training and evaluation loops

The built-in `fit()` workflow is solely focused on supervised learning: a setup
where there are known targets (also called labels or annotations) associated with your
input data, and where you compute your loss as a function of these targets and the
model’s predictions.


However, not every form of machine learning falls into this category. There are other setups where no explicit targets are present, such as generative
learning, self-supervised learning (where targets
are obtained from the inputs), and reinforcement learning (where learning is driven by
occasional “rewards,” much like training a dog).

Whenever you find yourself in a situation where the built-in `fit()` is not enough,
you will need to write your own custom training logic.

As a reminder, the contents of a
typical training loop look like this:

* Run the forward pass (compute the model’s output) inside a gradient tape to
obtain a loss value for the current batch of data.
* Retrieve the gradients of the loss with regard to the model’s weights.
* Update the model’s weights so as to lower the loss value on the current batch of data.

These steps are repeated for as many batches as necessary. This is essentially what
`fit()` does under the hood.
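The three steps can be sketched without any framework. The toy problem below (fitting `y = w * x` with mean squared error) and all of its names are assumptions made for illustration. The gradient is derived by hand here, whereas a gradient tape would compute it automatically.

```python
# Toy data generated with w = 2 (an assumption for this sketch).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0              # the single "weight" of the model
learning_rate = 0.01
batch_size = 2
loss_history = []

for epoch in range(100):
  for start in range(0, len(xs), batch_size):
    x_batch = xs[start:start + batch_size]
    y_batch = ys[start:start + batch_size]

    # Step 1: forward pass and loss for the current batch.
    predictions = [w * x for x in x_batch]
    loss = sum((p - y) ** 2 for p, y in zip(predictions, y_batch)) / len(x_batch)

    # Step 2: gradient of the loss with regard to w (hand-derived).
    gradient = sum(
        2 * (p - y) * x for p, y, x in zip(predictions, y_batch, x_batch)
    ) / len(x_batch)

    # Step 3: move the weight in the direction that lowers the loss.
    w -= learning_rate * gradient
    loss_history.append(loss)
```

After the loop, `w` has moved from 0 to very close to 2, and the per-batch loss has shrunk accordingly.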

##Training versus inference

In the low-level training loop examples you’ve seen so far:

* Step 1 (the forward pass) was done via `predictions = model(inputs)`.
* Step 2 (retrieving the gradients computed by the gradient tape) was done via `gradients = tape.gradient(loss, model.weights)`.

Some Keras layers, such as the `Dropout` layer, behave differently during training and during inference. Such layers expose
a `training` Boolean argument in their `call()` method.

Calling `dropout(inputs, training=True)` will drop some activation entries, while calling `dropout(inputs, training=False)` does nothing.
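The effect of the `training` flag can be illustrated with a framework-free sketch. The `dropout` helper below is a hypothetical stand-in, not the Keras implementation. It assumes the inverted-dropout convention, in which surviving entries are scaled by `1 / (1 - rate)` so that the expected sum of the activations stays the same.

```python
import random

def dropout(values, rate, training, rng):
  """Hypothetical stand-in for a dropout layer's call() method."""
  if not training:
    # Inference: pass the activations through unchanged.
    return list(values)
  keep_scale = 1.0 / (1.0 - rate)
  # Training: zero each entry with probability `rate`, scale the survivors.
  return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)
activations = [1.0] * 8

train_output = dropout(activations, rate=0.5, training=True, rng=rng)
inference_output = dropout(activations, rate=0.5, training=False, rng=rng)
```

With `rate=0.5`, every entry of `train_output` is either `0.0` (dropped) or `2.0` (kept and rescaled), while `inference_output` equals the input.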

In addition, note that when you retrieve the gradients of the weights of your
model, you should not use `tape.gradient(loss, model.weights)`, but rather `tape.gradient(loss, model.trainable_weights)`. Indeed, layers and models own two kinds of weights:

* **Trainable weights**—These are meant to be updated via backpropagation to minimize the loss of the model, such as the kernel and bias of a Dense layer.
* **Non-trainable weights**—These are meant to be updated during the forward pass by the layers that own them.

Among Keras built-in layers, the only layer that features non-trainable weights is the `BatchNormalization` layer. The `BatchNormalization`
layer needs non-trainable weights in order to track information about the mean and
standard deviation of the data that passes through it, so as to perform an online
approximation of feature normalization.
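As a rough illustration of state that is updated during the forward pass rather than by backpropagation, the sketch below keeps a moving average of the batch mean. `RunningMeanTracker` is a hypothetical, heavily simplified stand-in: it only centers the data and ignores the variance, scale, and offset that a real normalization layer also handles.

```python
class RunningMeanTracker:
  """Toy layer that owns one piece of non-trainable state: a moving mean."""

  def __init__(self, momentum=0.9):
    self.momentum = momentum
    self.moving_mean = 0.0  # updated in the forward pass, never by gradients

  def __call__(self, batch, training):
    if training:
      batch_mean = sum(batch) / len(batch)
      # Exponential moving average of the batch statistics.
      self.moving_mean = (
          self.momentum * self.moving_mean + (1 - self.momentum) * batch_mean
      )
      center = batch_mean
    else:
      # Inference reuses the statistics accumulated during training.
      center = self.moving_mean
    return [x - center for x in batch]

tracker = RunningMeanTracker(momentum=0.9)
for _ in range(100):
  tracker([4.0, 6.0], training=True)
state_after_training = tracker.moving_mean

inference_output = tracker([4.0, 6.0], training=False)
state_after_inference = tracker.moving_mean
```

The moving mean approaches the true batch mean of 5 during training, and calling the tracker with `training=False` leaves it untouched.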







In [4]:
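def train_step(inputs, targets):
  with tf.GradientTape() as tape:
    # Forward pass in training mode, so that layers such as Dropout are active
    predictions = model(inputs, training=True)
    loss = loss_fn(targets, predictions)

  # Only the trainable weights receive gradients
  gradients = tape.gradient(loss, model.trainable_weights)
  # apply_gradients expects (gradient, variable) pairs, in that order
  optimizer.apply_gradients(zip(gradients, model.trainable_weights))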

##Low-level usage of metrics

You’ve already learned about the metrics API: simply
call `update_state(y_true, y_pred)` for each batch of targets and predictions, and
then use `result()` to query the current metric value:

In [5]:
metric = keras.metrics.SparseCategoricalAccuracy()

targets = [0, 1, 2]
predictions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

metric.update_state(targets, predictions)
current_result = metric.result()
print(f"result: {current_result:.2f}")

result: 1.00


You may also need to track the average of a scalar value, such as the model’s loss. You
can do this via the `keras.metrics.Mean` metric:

In [6]:
values = [0, 1, 2, 3, 4]

mean_tracker = keras.metrics.Mean()
for val in values:
  mean_tracker.update_state(val)
print(f"Mean of values: {mean_tracker.result():.2f}")

Mean of values: 2.00


Remember to use `metric.reset_state()` when you want to reset the current results.
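The `update_state()` / `result()` / `reset_state()` protocol can be mimicked in a few lines of plain Python. The `RunningMean` class below is a hypothetical sketch of the idea, not the Keras implementation. Returning `0.0` before any update is a choice made for this example.

```python
class RunningMean:
  """Hypothetical stand-in for a metric that averages scalar values."""

  def __init__(self):
    self.reset_state()

  def update_state(self, value):
    # Accumulate state; nothing is averaged until result() is called.
    self.total += value
    self.count += 1

  def result(self):
    return self.total / self.count if self.count else 0.0

  def reset_state(self):
    # Forget everything seen so far (for example, at the start of an epoch).
    self.total = 0.0
    self.count = 0

tracker = RunningMean()
for val in [0, 1, 2, 3, 4]:
  tracker.update_state(val)
mean_before_reset = tracker.result()

tracker.reset_state()
tracker.update_state(10)
mean_after_reset = tracker.result()
```

Without the call to `reset_state()`, the second result would blend the new value with the five earlier ones, which is why metrics need resetting between epochs.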

##A complete training and evaluation loop

Let’s combine the forward pass, backward pass, and metrics tracking into a `fit()`-like training step function that takes a batch of data and targets and returns the logs that
would get displayed by the `fit()` progress bar.

In [7]:
model = get_mnist_model()

loss_fn = keras.losses.SparseCategoricalCrossentropy()
optimizer = keras.optimizers.RMSprop()
metrics = [keras.metrics.SparseCategoricalAccuracy()]
loss_tracking_metric = keras.metrics.Mean()

def train_step(inputs, targets):
  with tf.GradientTape() as tape:
    # Run the forward pass. Note that we pass training=True
    predictions = model(inputs, training=True)
    loss = loss_fn(targets, predictions)
  # Run the backward pass. Note that we use model.trainable_weights
  gradients = tape.gradient(loss, model.trainable_weights)
  optimizer.apply_gradients(zip(gradients, model.trainable_weights))

  # Keep track of metrics
  logs = {}
  for metric in metrics:
    metric.update_state(targets, predictions)
    logs[metric.name] = metric.result()

  # Keep track of the loss average
  loss_tracking_metric.update_state(loss)
  logs["loss"] = loss_tracking_metric.result()
  return logs

We will need to reset the state of our metrics at the start of each epoch and before running evaluation.

In [8]:
def reset_metrics():
  for metric in metrics:
    metric.reset_state()
  loss_tracking_metric.reset_state()

We can now lay out our complete training loop. 

Note that we use a `tf.data.Dataset` object to turn our NumPy data into an iterator that iterates over the data in batches of size 32.

In [9]:
training_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
training_dataset = training_dataset.batch(32)

epochs = 3
for epoch in range(epochs):
  reset_metrics()
  for inputs_batch, targets_batch in training_dataset:
    logs = train_step(inputs_batch, targets_batch)
  print(f"Results at the end of epoch {epoch}")
  for key, value in logs.items():
    print(f"...{key}: {value:.4f}")

Results at the end of epoch 0
...sparse_categorical_accuracy: 0.9128
...loss: 0.2926
Results at the end of epoch 1
...sparse_categorical_accuracy: 0.9518
...loss: 0.1684
Results at the end of epoch 2
...sparse_categorical_accuracy: 0.9630
...loss: 0.1386
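To make the batching in the loop above concrete, the sketch below computes the slice boundaries that batches of size 32 produce over the 50,000 training samples. `iterate_batches` is a hypothetical helper written for this illustration. It assumes the final, smaller batch is kept, which is what `batch()` does unless `drop_remainder=True` is passed.

```python
def iterate_batches(num_samples, batch_size):
  """Yield (start, end) index pairs covering the data in order."""
  for start in range(0, num_samples, batch_size):
    yield start, min(start + batch_size, num_samples)

batch_bounds = list(iterate_batches(num_samples=50_000, batch_size=32))

num_batches = len(batch_bounds)                            # calls to train_step per epoch
last_batch_size = batch_bounds[-1][1] - batch_bounds[-1][0]
```

Each epoch therefore calls `train_step()` 1,563 times: 1,562 full batches followed by one batch of 16 samples.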


And here’s the evaluation loop: a simple for loop that repeatedly calls a `test_step()` function, which processes a single batch of data. The `test_step()` function is just a subset of the logic of `train_step()`. 

It omits the code that deals with updating the weights
of the model—that is to say, everything involving the `GradientTape` and the optimizer.

In [10]:
def test_step(inputs, targets):
  # Make predictions. Note that we pass training=False
  predictions = model(inputs, training=False)
  loss = loss_fn(targets, predictions)

  # Keep track of metrics
  logs = {}
  for metric in metrics:
    metric.update_state(targets, predictions)
    logs[f"val_{metric.name}"] = metric.result()

  # Keep track of the loss average
  loss_tracking_metric.update_state(loss)
  logs["val_loss"] = loss_tracking_metric.result()
  return logs

In [11]:
val_dataset = tf.data.Dataset.from_tensor_slices((val_images, val_labels))
val_dataset = val_dataset.batch(32)

reset_metrics()
for inputs_batch, targets_batch in val_dataset:
  logs = test_step(inputs_batch, targets_batch)
print("Evaluation results:")
for key, value in logs.items():
  print(f"...{key}: {value:.4f}")

Evaluation results:
...val_sparse_categorical_accuracy: 0.9685
...val_loss: 0.1275


Congrats—you’ve just reimplemented `fit()` and `evaluate()`!

##Make it fast with tf.function

You may have noticed that your custom loops are running significantly slower than the
built-in `fit()` and `evaluate()`, despite implementing essentially the same logic.
That’s because, by default, TensorFlow code is executed line by line, eagerly, much like
NumPy code or regular Python code. Eager execution makes it easier to debug your
code, but it is far from optimal from a performance standpoint.

It’s more performant to compile your TensorFlow code into a computation graph that
can be globally optimized in a way that code interpreted line by line cannot. The syntax
to do this is very simple: just add a `@tf.function` decorator to any function you want to compile
before executing.

In [12]:
@tf.function
def test_step(inputs, targets):
  # Make predictions. Note that we pass training=False
  predictions = model(inputs, training=False)
  loss = loss_fn(targets, predictions)

  # Keep track of metrics
  logs = {}
  for metric in metrics:
    metric.update_state(targets, predictions)
    logs[f"val_{metric.name}"] = metric.result()

  # Keep track of the loss average
  loss_tracking_metric.update_state(loss)
  logs["val_loss"] = loss_tracking_metric.result()
  return logs

In [13]:
val_dataset = tf.data.Dataset.from_tensor_slices((val_images, val_labels))
val_dataset = val_dataset.batch(32)

reset_metrics()
for inputs_batch, targets_batch in val_dataset:
  logs = test_step(inputs_batch, targets_batch)
print("Evaluation results:")
for key, value in logs.items():
  print(f"...{key}: {value:.4f}")

Evaluation results:
...val_sparse_categorical_accuracy: 0.9685
...val_loss: 0.1275


On the Colab CPU, we go from taking `1.80s` to run the evaluation loop to only `0.8s`. Much faster!

Remember, while you are debugging your code, prefer running it eagerly, without
any `@tf.function` decorator. It’s easier to track bugs this way. Once your code is working
and you want to make it fast, add a `@tf.function` decorator to your training step
and your evaluation step—or any other performance-critical function.
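One way to see the difference between eager and compiled execution is to count how often the Python body of a decorated function actually runs. In the sketch below, `scaled_sum` and `trace_count` are hypothetical names chosen for illustration. The example relies on inputs of identical shape and dtype reusing the same traced graph.

```python
import tensorflow as tf

trace_count = 0  # incremented by plain Python code, so only while tracing

@tf.function
def scaled_sum(x):
  global trace_count
  trace_count += 1  # Python side effect: runs during tracing, not in the graph
  return tf.reduce_sum(x) * 2.0

# Three calls with same-shaped float32 tensors reuse one compiled graph.
results = [float(scaled_sum(tf.constant([1.0, 2.0, 3.0]))) for _ in range(3)]
```

Here `trace_count` ends at 1 even though the function was called three times. This is also why `print()` statements and other Python-side debugging aids behave unexpectedly inside a compiled function, and why debugging eagerly is easier.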

##Custom training loop

Writing your own training loop entirely from scratch provides you with the most flexibility, but you end up writing a lot of code
while simultaneously missing out on many convenient features of `fit()`, such as callbacks
or built-in support for distributed training.

There’s actually a middle ground between
`fit()` and a training loop written from scratch: you can provide a custom training step function and let the framework do the rest.

You can do this by overriding the `train_step()` method of the Model class. This is
the function that is called by `fit()` for every batch of data. You will then be able to call
`fit()` as usual, and it will be running your own learning algorithm under the hood.

In [16]:
# Loss function used to compute the per-batch loss
loss_fn = keras.losses.SparseCategoricalCrossentropy()
# This metric object will be used to track the average of per-batch losses during training and evaluation
loss_tracker = keras.metrics.Mean(name="loss")

class CustomModel(keras.Model):
  def train_step(self, data):
    inputs, targets = data
    with tf.GradientTape() as tape:
      # Run the forward pass on the model itself (self). Note that we pass training=True
      predictions = self(inputs, training=True)
      loss = loss_fn(targets, predictions)
    # Run the backward pass. Note that we use self.trainable_weights
    gradients = tape.gradient(loss, self.trainable_weights)
    # Use the optimizer that was passed to compile()
    self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))

    # Update the loss tracker metric that tracks the average of the loss
    loss_tracker.update_state(loss)
    # Return the average loss so far by querying the loss tracker metric
    return {"loss": loss_tracker.result()}

  @property
  def metrics(self):
    # Metrics listed here are reset automatically at the start of each epoch
    return [loss_tracker]

We can now instantiate our custom model, compile it (we only pass the optimizer, since
the loss is already defined outside of the model), and train it using `fit()` as usual:

In [15]:
inputs = keras.Input(shape=(28 * 28, ))
features = layers.Dense(512, activation="relu")(inputs)
features = layers.Dropout(0.5)(features)
outputs = layers.Dense(10, activation="softmax")(features)

model = CustomModel(inputs, outputs)

model.compile(optimizer=keras.optimizers.RMSprop())
model.fit(train_images, train_labels, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f8e6572e390>

There are a couple of points to note:

* This pattern does not prevent you from building models with the Functional
API. You can do this whether you’re building Sequential models, Functional
API models, or subclassed models.

* You don’t need to use a `@tf.function` decorator when you override `train_step()`, because the framework does it for you.

After you’ve called `compile()`, you get access to the following:

* `self.compiled_loss`—The loss function you passed to `compile()`.
* `self.compiled_metrics`—A wrapper for the list of metrics you passed, which
allows you to call `self.compiled_metrics.update_state()` to update all of
your metrics at once.
* `self.metrics`—The actual list of metrics you passed to `compile()`. Note that it also includes a metric that tracks the loss.



In [17]:
class CustomModel(keras.Model):

  def train_step(self, data):
    inputs, targets = data
    with tf.GradientTape() as tape:
      # Run the forward pass on the model itself (self). Note that we pass training=True
      predictions = self(inputs, training=True)
      # Compute the loss via self.compiled_loss
      loss = self.compiled_loss(targets, predictions)
    # Run the backward pass. Note that we use self.trainable_weights
    gradients = tape.gradient(loss, self.trainable_weights)
    # Use the optimizer that was passed to compile()
    self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))
    # Update the model’s metrics via self.compiled_metrics
    self.compiled_metrics.update_state(targets, predictions)

    # Return a dict mapping metric names to their current value
    return {m.name: m.result() for m in self.metrics}

Let’s try it:

In [18]:
inputs = keras.Input(shape=(28 * 28, ))
features = layers.Dense(512, activation="relu")(inputs)
features = layers.Dropout(0.5)(features)
outputs = layers.Dense(10, activation="softmax")(features)

model = CustomModel(inputs, outputs)

model.compile(optimizer=keras.optimizers.RMSprop(), 
              loss=keras.losses.SparseCategoricalCrossentropy(),
              metrics=[keras.metrics.SparseCategoricalAccuracy()])
model.fit(train_images, train_labels, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f8e683e0150>