<a href="https://colab.research.google.com/github/rahiakela/deep-learning-research-and-practice/blob/main/deep-learning-with-python-by-francois-chollet/7-deep-dive-into-keras/02_keras_training_and_evaluation_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##Keras model fundamentals

There are three APIs for building models in Keras:

* The Sequential model, the most approachable API—it’s basically a Python list. As such, it’s limited to simple stacks of layers.
* The Functional API, which focuses on graph-like model architectures. It represents
a nice mid-point between usability and flexibility, and as such, it’s the
most commonly used model-building API.
* Model subclassing, a low-level option where you write everything yourself from
scratch. This is ideal if you want full control over every little thing. However, you
won’t get access to many built-in Keras features, and you will be more at risk of
making mistakes.
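
To make the comparison concrete, here is a minimal sketch that builds the same two-layer classifier with each of the three APIs. The layer sizes, the input shape, and the class name `TinyClassifier` are arbitrary choices made for illustration; they do not come from the text above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# 1. Sequential: a plain stack of layers.
sequential_model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(2, activation="softmax"),
])

# 2. Functional: layers are called on tensors to build a graph of layers.
inputs = keras.Input(shape=(4,))
features = layers.Dense(8, activation="relu")(inputs)
outputs = layers.Dense(2, activation="softmax")(features)
functional_model = keras.Model(inputs=inputs, outputs=outputs)

# 3. Subclassing: create layers in __init__() and write the forward pass in call().
class TinyClassifier(keras.Model):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.hidden = layers.Dense(8, activation="relu")
        self.classifier = layers.Dense(2, activation="softmax")

    def call(self, inputs):
        return self.classifier(self.hidden(inputs))

subclassed_model = TinyClassifier()

# All three map a batch of shape (3, 4) to class probabilities of shape (3, 2).
batch = np.random.random((3, 4)).astype("float32")
for model in (sequential_model, functional_model, subclassed_model):
    print(type(model).__name__, model(batch).shape)
```

The three models are interchangeable from the caller's point of view; what differs is how much of the structure Keras can inspect for you.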

<img src='https://github.com/rahiakela/deep-learning-research-and-practice/blob/main/deep-learning-with-python-by-francois-chollet/7-deep-dive-into-keras/images/1.png?raw=1' width='600'/>

##Setup

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

from tensorflow.keras.datasets import mnist

import random
import string
import re

import numpy as np

##Using built-in training and evaluation loops

The principle of progressive disclosure of complexity (access to a spectrum of workflows
that go from dead easy to arbitrarily flexible, one step at a time) also applies to
model training. Keras provides you with different workflows for training models. They
can be as simple as calling `fit()` on your data, or as advanced as writing a new training
algorithm from scratch.



In [2]:
def get_mnist_model():
  inputs = keras.Input(shape=(28 * 28, ))
  features = layers.Dense(512, activation="relu")(inputs)
  features = layers.Dropout(0.5)(features)
  outputs = layers.Dense(10, activation="softmax")(features)

  model = keras.Model(inputs=inputs, outputs=outputs)

  return model

Load the data, reserving some for validation.

In [3]:
(images, labels), (test_images, test_labels) = mnist.load_data()

images = images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255

train_images, val_images = images[10000:], images[:10000]
train_labels, val_labels = labels[10000:], labels[:10000]



In [4]:
model = get_mnist_model()

# Compile the model by specifying its optimizer, the loss function to minimize, and the metrics to monitor
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# train the model, optionally providing validation data to monitor performance on unseen data
model.fit(train_images, train_labels, epochs=3, validation_data=(val_images, val_labels))

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f2db7386c50>

In [5]:
# compute the loss and metrics on new data
test_metrics = model.evaluate(test_images, test_labels)
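
Both `fit()` and `evaluate()` return values that are worth inspecting. The sketch below uses a toy model and random stand-in data (assumptions made so the example runs quickly without MNIST) to show the `History` object returned by `fit()` and the `return_dict=True` option of `evaluate()`.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in data: 64 samples, 8 features, 3 classes (illustrative only).
rng = np.random.default_rng(0)
x = rng.random((64, 8)).astype("float32")
y = rng.integers(0, 3, size=(64,))

inputs = keras.Input(shape=(8,))
outputs = layers.Dense(3, activation="softmax")(inputs)
toy_model = keras.Model(inputs=inputs, outputs=outputs)
toy_model.compile(optimizer="rmsprop",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# fit() returns a History object; its .history dict maps each metric name
# to a list holding one value per epoch.
history = toy_model.fit(x, y, epochs=2, batch_size=16,
                        validation_data=(x, y), verbose=0)
for name, values in history.history.items():
    print(name, values)

# With return_dict=True, evaluate() returns a dict keyed by metric name
# instead of a plain list.
results = toy_model.evaluate(x, y, verbose=0, return_dict=True)
print(results)
```

Validation metrics appear in the history under the same names prefixed with `val_`, which is also how callbacks refer to them through their `monitor` argument.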



In [6]:
# compute classification probabilities on new data
predictions = model.predict(test_images)
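
For a softmax classifier like this one, `predict()` returns one row of class probabilities per input sample. A small sketch of turning such rows into class labels follows; the probability values and labels are made up for illustration and stand in for real model output.

```python
import numpy as np

# Hypothetical predictions for 3 samples over 4 classes (each row sums to 1),
# standing in for the array returned by model.predict().
predictions = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.05, 0.05, 0.80, 0.10],
    [0.60, 0.20, 0.10, 0.10],
])

# The predicted class is the index of the highest probability in each row.
predicted_classes = predictions.argmax(axis=-1)
print(predicted_classes)  # → [1 2 0]

# Comparing against (made-up) true labels gives the accuracy on these samples.
true_labels = np.array([1, 2, 3])
accuracy = (predicted_classes == true_labels).mean()
print(accuracy)
```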

There are a couple of ways you can customize this simple workflow:

* Provide your own custom metrics.
* Pass `callbacks` to the `fit()` method to schedule actions to be taken at specific points during training.


##Writing your own metrics

Metrics are key to measuring the performance of your model—in particular, to measuring
the difference between its performance on the training data and its performance
on the test data.

A Keras metric is a subclass of the `keras.metrics.Metric` class. Like layers, a metric
has an internal state stored in TensorFlow variables. Unlike layers, these variables
aren’t updated via backpropagation, so you have to write the state-update logic yourself,
which happens in the `update_state()` method.

For example, here’s a simple custom metric that measures the root mean squared
error (RMSE).
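
For reference, the quantity computed by the implementation in this section can be written as follows. The symbols are notation introduced here, not taken from the original: $N$ is the number of samples seen so far, $C$ the number of classes, $y_{i,c}$ the one-hot encoded label, and $\hat{y}_{i,c}$ the predicted probability. Note that the squared errors are summed over classes and averaged over samples only.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \left( y_{i,c} - \hat{y}_{i,c} \right)^{2}}
```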

In [7]:
class RootMeanSquaredError(keras.metrics.Metric):

  def __init__(self, name="rmse", **kwargs):
    super().__init__(name=name, **kwargs)

    # Define the state variables in the constructor.
    self.mse_sum = self.add_weight(name="mse_sum", initializer="zeros")
    self.total_samples = self.add_weight(name="total_samples", initializer="zeros", dtype="int32")

  def update_state(self, y_true, y_pred, sample_weight=None):
    # To match our MNIST model, we expect categorical predictions and integer labels.
    y_true = tf.one_hot(y_true, depth=tf.shape(y_pred)[1])
    mse = tf.reduce_sum(tf.square(y_true - y_pred))
    self.mse_sum.assign_add(mse)
    num_samples = tf.shape(y_pred)[0]
    self.total_samples.assign_add(num_samples)

  def result(self):
    return tf.sqrt(self.mse_sum / tf.cast(self.total_samples, tf.float32))

  def reset_state(self):
    self.mse_sum.assign(0)
    self.total_samples.assign(0)
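
The reason for keeping running totals as state is that metrics are updated batch by batch, yet the final result should match what you would get from the whole dataset at once. The following NumPy sketch mirrors the state logic above to check this on a tiny example. `StreamingRMSE` is a hypothetical stand-in written for this illustration, not a Keras class, and the data is made up.

```python
import numpy as np

class StreamingRMSE:
    """NumPy stand-in that mirrors the metric's state logic (illustrative only)."""

    def __init__(self):
        self.reset_state()

    def update_state(self, y_true, y_pred):
        one_hot = np.eye(y_pred.shape[1])[y_true]       # integer labels -> one-hot rows
        self.sq_sum += np.sum((one_hot - y_pred) ** 2)  # accumulate squared error
        self.total_samples += y_pred.shape[0]           # accumulate sample count

    def result(self):
        return np.sqrt(self.sq_sum / self.total_samples)

    def reset_state(self):
        self.sq_sum = 0.0
        self.total_samples = 0

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.5, 0.5]])

# Feed the data as two batches of two samples...
streamed = StreamingRMSE()
streamed.update_state(y_true[:2], y_pred[:2])
streamed.update_state(y_true[2:], y_pred[2:])

# ...and as a single batch of four samples.
whole = StreamingRMSE()
whole.update_state(y_true, y_pred)

print(streamed.result(), whole.result())  # → 0.5 0.5
```

Two of the four samples are predicted perfectly and the other two each contribute a squared error of 0.5, so the accumulated sum is 1.0 over 4 samples, giving a result of 0.5 either way.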

Custom metrics can be used just like built-in ones. Let’s test-drive our own metric:

In [8]:
model = get_mnist_model()

# Compile the model by specifying its optimizer, the loss function to minimize, and the metrics to monitor
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy", RootMeanSquaredError()])

# train the model, optionally providing validation data to monitor performance on unseen data
model.fit(train_images, train_labels, epochs=3, validation_data=(val_images, val_labels))

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f2db4944250>

In [9]:
# compute the loss and metrics on new data
test_metrics = model.evaluate(test_images, test_labels)



In [10]:
# compute classification probabilities on new data
predictions = model.predict(test_images)

##Using callbacks

The Keras `callbacks` API will help you
transform your call to `model.fit()` from a paper airplane into a smart, autonomous
drone that can self-introspect and dynamically take action.

A callback is an object (a class instance implementing specific methods) that is
passed to the model in the call to `fit()` and that is called by the model at various
points during training. It has access to all the available data about the state of the
model and its performance, and it can take action: interrupt training, save a model,
load a different weight set, or otherwise alter the state of the model.

Here are some examples of ways you can use callbacks:

* **Model checkpointing**—Saving the current state of the model at different points
during training.
* **Early stopping**—Interrupting training when the validation loss is no longer
improving (and of course, saving the best model obtained during training).
* **Dynamically adjusting the value of certain parameters during training**—Such as the
learning rate of the optimizer.
* **Logging training and validation metrics during training, or visualizing the representations
learned by the model as they’re updated**—The `fit()` progress bar that you’re
familiar with is in fact a callback!
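
Several of the use cases listed above are covered by callbacks that ship with Keras. The sketch below only constructs them; the schedule function, the numeric settings, and the log filename are hypothetical choices for illustration.

```python
from tensorflow import keras

def halve_every_two_epochs(epoch, lr):
    """Hypothetical schedule: halve the learning rate at epochs 2, 4, 6, ..."""
    if epoch > 0 and epoch % 2 == 0:
        return lr * 0.5
    return lr

callbacks_list = [
    # Adjust the learning rate from a user-defined schedule,
    # called with (epoch index, current learning rate) at the start of each epoch.
    keras.callbacks.LearningRateScheduler(halve_every_two_epochs),
    # Alternatively, reduce the learning rate when val_loss stops improving.
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
    # Log per-epoch metrics to a CSV file (hypothetical filename).
    keras.callbacks.CSVLogger("training_log.csv"),
]

# The list is then passed to training: model.fit(..., callbacks=callbacks_list)
```

In practice you would normally pick one of the two learning-rate strategies rather than combining them; both appear here only to show the available options.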



###EarlyStopping and ModelCheckpoint

The `EarlyStopping` callback interrupts training once a target metric being monitored
has stopped improving for a fixed number of epochs. For instance, this callback
allows you to interrupt training as soon as you start overfitting, thus avoiding having to
retrain your model for a smaller number of epochs.

This callback is typically used in combination with `ModelCheckpoint`, which lets you
continually save the model during training (and, optionally, save only the current best
model so far: the version of the model that achieved the best performance at the end of
an epoch).
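
The pure-Python sketch below illustrates the idea behind the patience mechanism and why saving the best model matters. It is a simplified simulation written for this explanation, not the Keras implementation (it ignores options such as `min_delta`), and the validation accuracies are made-up numbers.

```python
def simulate_early_stopping(values, patience, mode="max"):
    """Return (stop_epoch, best_epoch) for a list of per-epoch metric values.

    stop_epoch is None if training would run to the end. Epochs are 0-indexed.
    """
    best_value, best_epoch, wait = None, None, 0
    for epoch, value in enumerate(values):
        if best_value is None:
            improved = True
        elif mode == "max":
            improved = value > best_value
        else:
            improved = value < best_value

        if improved:
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1  # one more epoch without improvement
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

val_accuracy = [0.90, 0.93, 0.94, 0.939, 0.938, 0.95]  # made-up values
stop_epoch, best_epoch = simulate_early_stopping(val_accuracy, patience=2)
print(stop_epoch, best_epoch)  # → 4 2
```

Training stops at epoch 4, but the best value was reached at epoch 2. The weights at the moment of stopping are therefore not the best ones seen, which is why a checkpoint of the best model is useful.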

In [12]:
callbacks_list = [
   # Monitor validation accuracy and interrupt training once it has stopped improving for two epochs
   keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=2),
   # Save the model after each epoch, overwriting the file only when val_loss has improved
   keras.callbacks.ModelCheckpoint(filepath="checkpoint_path.keras", monitor="val_loss", save_best_only=True)
]

In [13]:
model = get_mnist_model()

# Compile the model by specifying its optimizer, the loss function to minimize, and the metrics to monitor
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy", RootMeanSquaredError()])

# train the model, optionally providing validation data to monitor performance on unseen data
model.fit(train_images, train_labels, epochs=3, validation_data=(val_images, val_labels), callbacks=callbacks_list)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f2db45ac990>

To reload the model you’ve saved, use `keras.models.load_model()`:

In [14]:
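# The checkpoint comes from a model compiled with the custom RootMeanSquaredError metric,
# so pass the class via custom_objects to let Keras rebuild it when loading
model = keras.models.load_model("checkpoint_path.keras", custom_objects={"RootMeanSquaredError": RootMeanSquaredError})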

In [15]:
# compute the loss and metrics on new data
test_metrics = model.evaluate(test_images, test_labels)



In [16]:
# compute classification probabilities on new data
predictions = model.predict(test_images)

###Custom callback
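
When no built-in callback does what you need, you can write your own by subclassing `keras.callbacks.Callback` and overriding the hooks you care about, such as `on_epoch_begin()`, `on_epoch_end()`, or `on_train_batch_end()`. The following is a minimal sketch under stated assumptions: the class name `BatchLossRecorder`, the toy model, and the random data are all hypothetical and chosen so the example runs quickly.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

class BatchLossRecorder(keras.callbacks.Callback):  # hypothetical name
    """Records the loss reported after every training batch, grouped by epoch."""

    def on_train_begin(self, logs=None):
        self.losses_by_epoch = []

    def on_epoch_begin(self, epoch, logs=None):
        self.current_epoch_losses = []

    def on_train_batch_end(self, batch, logs=None):
        # logs holds the metrics reported so far for the current epoch.
        self.current_epoch_losses.append((logs or {}).get("loss"))

    def on_epoch_end(self, epoch, logs=None):
        self.losses_by_epoch.append(self.current_epoch_losses)

# Random stand-in data: 64 samples, 8 features, 3 classes (illustrative only).
rng = np.random.default_rng(0)
x = rng.random((64, 8)).astype("float32")
y = rng.integers(0, 3, size=(64,))

inputs = keras.Input(shape=(8,))
outputs = layers.Dense(3, activation="softmax")(inputs)
toy_model = keras.Model(inputs=inputs, outputs=outputs)
toy_model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")

recorder = BatchLossRecorder()
toy_model.fit(x, y, epochs=2, batch_size=16, callbacks=[recorder], verbose=0)

# 64 samples with a batch size of 16 give 4 batches per epoch.
print([len(losses) for losses in recorder.losses_by_epoch])  # → [4, 4]
```

Inside any hook, the callback can also reach the model being trained through `self.model`, which is what makes actions such as saving weights or stopping training possible.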

##Conclusion

In general, the Functional API provides a good trade-off between ease of use and
flexibility. It also gives you direct access to layer connectivity, which is very
powerful for use cases such as model plotting or feature extraction. If you can use
the Functional API (that is, if your model can be expressed as a directed acyclic graph
of layers), I recommend using it over model subclassing.

Functional models that include subclassed layers provide the best of both worlds: high
development flexibility while retaining the advantages of the Functional API.

