# Deep Learning - Nasir Hussain - 2021/09/18

# 7 Working with Keras: A deep dive

## 7.4 Writing your own training and evaluation loops

- built-in fit() workflow is solely focused on supervised learning: 
  - a setup where there are known targets (also called labels or annotations) associated with your input data
  - where you compute your loss as a function of these targets and the model’s predictions

- contents of a typical training loop
  1. Run the forward pass (compute the model’s output) inside a gradient tape to obtain a loss value for the current batch of data.
  2. Retrieve the gradients of the loss with regard to the model’s weights.
  3. Update the model’s weights so as to lower the loss value on the current batch of data.
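
The three steps above can be sketched on a toy problem. This is only a minimal illustration, not the MNIST model used later in these notes: the single weight `w`, the learning rate of 0.1, and the synthetic data are all assumptions chosen for the example.

```python
import tensorflow as tf

# Toy data (assumed for illustration): targets follow y = 3 * x.
inputs = tf.constant([[1.0], [2.0], [3.0], [4.0]])
targets = 3.0 * inputs

# A single trainable weight, deliberately initialized far from 3.
w = tf.Variable([[0.0]])
learning_rate = 0.1  # hypothetical value picked for this toy problem

def training_step(inputs, targets):
    # Step 1: run the forward pass inside a gradient tape to get the loss.
    with tf.GradientTape() as tape:
        predictions = tf.matmul(inputs, w)
        loss = tf.reduce_mean(tf.square(targets - predictions))
    # Step 2: retrieve the gradient of the loss with regard to the weight.
    gradient = tape.gradient(loss, w)
    # Step 3: move the weight against the gradient to lower the loss.
    w.assign_sub(learning_rate * gradient)
    return float(loss)

losses = [training_step(inputs, targets) for _ in range(20)]
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Note that only the forward pass sits inside the `with` block; the gradient retrieval and the weight update happen after the tape has finished recording.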

### 7.4.1 Training versus inference

- low-level training loop
  - step 1 (the forward pass) 
    - was done via predictions = model(inputs, training=True)
  - step 2 (retrieving the gradients computed by the gradient tape) 
    - was done via gradients = tape.gradient(loss, model.trainable_weights)
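
Why the `training` argument matters can be seen with a single `Dropout` layer. The snippet below is a small sketch on made-up input (a vector of ones, assumed here); the rate of 0.5 matches the model defined later in these notes.

```python
import numpy as np
from tensorflow import keras

dropout = keras.layers.Dropout(0.5)
x = np.ones((1, 1000), dtype="float32")

# Inference mode: dropout is a no-op, so the input passes through unchanged.
inference_out = np.array(dropout(x, training=False))

# Training mode: each unit is zeroed with probability 0.5 and the
# survivors are scaled by 1 / (1 - 0.5) = 2 to keep the expected sum.
training_out = np.array(dropout(x, training=True))

print("inference values:", np.unique(inference_out))
print("training values: ", np.unique(training_out))
```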

- Trainable weights
  - These are meant to be updated via backpropagation to minimize the loss of the model, such as the kernel and bias of a Dense layer.
- Non-trainable weights
  - These are meant to be updated during the forward pass by the layers that own them, such as the moving mean and variance of a BatchNormalization layer.
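
The two kinds of weights can be inspected on a `BatchNormalization` layer. This is a hedged sketch rather than part of the original notes: the input width of 4 and the random batch are assumptions made for the example.

```python
import numpy as np
from tensorflow import keras

layer = keras.layers.BatchNormalization()
layer.build((None, 4))  # assumed feature width of 4

# gamma and beta are trainable; the moving mean and variance are not.
print("trainable:", len(layer.trainable_weights))
print("non-trainable:", len(layer.non_trainable_weights))

def snapshot():
    # Copy the current non-trainable weights to NumPy for comparison.
    return [np.array(w.numpy()) for w in layer.non_trainable_weights]

x = np.random.normal(loc=5.0, size=(32, 4)).astype("float32")

before = snapshot()
layer(x, training=False)   # inference: the statistics are only read
after_inference = snapshot()
layer(x, training=True)    # training: the forward pass updates them
after_training = snapshot()

changed = any(not np.array_equal(b, a) for b, a in zip(before, after_training))
print("updated by training-mode forward pass:", changed)
```

No gradient or optimizer is involved here: the layer itself changes its non-trainable weights while computing its output in training mode.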

In [None]:
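# a supervised-learning training step
# (kept inside a string because model, loss_fn, and optimizer are not defined yet)
'''
def train_step(inputs, targets):
  with tf.GradientTape() as tape:
    predictions = model(inputs, training=True)
    loss = loss_fn(targets, predictions)
  # the method is tape.gradient(), and it is called after the tape has finished recording
  gradients = tape.gradient(loss, model.trainable_weights)
  # apply_gradients() expects (gradient, weight) pairs, in that order
  optimizer.apply_gradients(zip(gradients, model.trainable_weights))
'''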

### 7.4.2 Low-level usage of metrics

In [8]:
# simply call update_state(y_true,y_pred) for each batch of targets and predictions
# use result() to query the current metric value

from tensorflow import keras
metric = keras.metrics.SparseCategoricalAccuracy()
targets = [0, 1, 2]
predictions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
metric.update_state(targets, predictions)
current_result = metric.result()
print(f"result: {current_result:.2f}")

result: 1.00


In [9]:
# track the average of a scalar value
values = [0, 1, 2, 3, 4]
mean_tracker = keras.metrics.Mean() 
for value in values:
  mean_tracker.update_state(value) 
print(f"Mean of values: {mean_tracker.result():.2f}")

Mean of values: 2.00
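
Metrics are stateful: each `update_state()` call adds to what earlier calls accumulated, and `reset_state()` clears it. The sketch below illustrates this with two hand-made batches (the targets and predictions are invented for the example).

```python
from tensorflow import keras

metric = keras.metrics.SparseCategoricalAccuracy()

# Batch 1: both predictions match their targets.
metric.update_state([0, 1], [[1.0, 0.0], [0.0, 1.0]])
after_first = float(metric.result())

# Batch 2: both predictions are wrong, so the running accuracy drops.
metric.update_state([0, 1], [[0.0, 1.0], [1.0, 0.0]])
after_second = float(metric.result())

# Resetting clears the accumulated state, e.g. at the start of an epoch.
metric.reset_state()
metric.update_state([1], [[0.0, 1.0]])
after_reset = float(metric.result())

print(after_first, after_second, after_reset)
```

After the second batch the result covers all four samples seen so far, not just the latest batch, which is why a training loop resets its metrics between epochs.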


### 7.4.3 A complete training and evaluation loop

In [12]:
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist
import tensorflow as tf

def get_mnist_model():
  inputs = keras.Input(shape=(28 * 28,))
  features = layers.Dense(512, activation="relu")(inputs)
  features = layers.Dropout(0.5)(features)
  outputs = layers.Dense(10, activation="softmax")(features)
  model = keras.Model(inputs, outputs)
  return model

(images, labels), (test_images, test_labels) = mnist.load_data()
images = images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255
train_images, val_images = images[10000:], images[:10000]
train_labels, val_labels = labels[10000:], labels[:10000]


In [13]:
# Listing 7.21 Writing a step-by-step training loop: the training step function

model = get_mnist_model()

loss_fn = keras.losses.SparseCategoricalCrossentropy()
optimizer = keras.optimizers.RMSprop()
metrics = [keras.metrics.SparseCategoricalAccuracy()]
loss_tracking_metric = keras.metrics.Mean()

def train_step(inputs, targets):
  # Forward pass, recorded by the tape so gradients can be computed
  with tf.GradientTape() as tape:
    predictions = model(inputs, training=True)
    loss = loss_fn(targets, predictions)
  # Backward pass and weight update happen outside the tape context
  gradients = tape.gradient(loss, model.trainable_weights)
  optimizer.apply_gradients(zip(gradients, model.trainable_weights))

  # Update every metric, then the loss tracker once per step
  logs = {}
  for metric in metrics:
    metric.update_state(targets, predictions)
    logs[metric.name] = metric.result()

  loss_tracking_metric.update_state(loss)
  logs["loss"] = loss_tracking_metric.result()
  return logs

In [14]:
# Listing 7.22 Writing a step-by-step training loop: resetting the metrics
def reset_metrics():
  for metric in metrics:
    metric.reset_state()
  # reset the loss tracker once, outside the loop over metrics
  loss_tracking_metric.reset_state()

In [15]:
# Listing 7.23 Writing a step-by-step training loop: the loop itself
training_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
training_dataset = training_dataset.batch(32)
epochs = 3
for epoch in range(epochs):
  reset_metrics()
  for inputs_batch, targets_batch in training_dataset:
    logs = train_step(inputs_batch, targets_batch)
  print(f"Results at the end of epoch {epoch}")
  
  for key, value in logs.items():
    print(f"...{key}: {value:.4f}")

Results at the end of epoch 0
...sparse_categorical_accuracy: 0.9149
...loss: 0.2891
Results at the end of epoch 1
...sparse_categorical_accuracy: 0.9535
...loss: 0.1662
Results at the end of epoch 2
...sparse_categorical_accuracy: 0.9628
...loss: 0.1406


- test_step() function is just a subset of the logic of train_step()
  - It omits the code that deals with updating the weights of the model

In [16]:
# Listing 7.24 Writing a step-by-step evaluation loop
def test_step(inputs, targets):
  predictions = model(inputs, training=False)
  loss = loss_fn(targets, predictions)
  
  logs = {}
  
  for metric in metrics:
    metric.update_state(targets, predictions)
    logs["val_" + metric.name] = metric.result()
  
  loss_tracking_metric.update_state(loss)
  logs["val_loss"] = loss_tracking_metric.result()
  return logs
 
val_dataset = tf.data.Dataset.from_tensor_slices((val_images, val_labels))
val_dataset = val_dataset.batch(32)
reset_metrics() 

for inputs_batch, targets_batch in val_dataset:
  logs = test_step(inputs_batch, targets_batch) 
print("Evaluation results:") 

for key, value in logs.items(): 
  print(f"...{key}: {value:.4f}")

Evaluation results:
...val_sparse_categorical_accuracy: 0.9641
...val_loss: 0.1396


### 7.4.4 Make it fast with tf.function

- It’s more performant to compile your TensorFlow code into a computation graph that can be globally optimized in a way that code interpreted line by line cannot
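
One visible consequence of graph compilation is that the Python body of a decorated function runs only while TensorFlow traces it. The snippet below is a small sketch of that behavior; the function `double` and the list `trace_log` are hypothetical names introduced for the example.

```python
import tensorflow as tf

trace_log = []  # records each time the Python body actually runs

@tf.function
def double(x):
    # Plain Python side effects execute only while TensorFlow traces the
    # function into a graph, not on later calls that reuse that graph.
    trace_log.append("traced")
    return x * 2

a = double(tf.constant([1.0, 2.0]))
b = double(tf.constant([3.0, 4.0]))  # same shape and dtype: graph is reused

print("python body ran", len(trace_log), "time(s)")
print(a.numpy(), b.numpy())
```

Both calls return correct results, but the second one executes the already-built graph instead of re-running the Python code line by line.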

In [17]:
# Listing 7.25 Adding a @tf.function decorator to our evaluation-step function
@tf.function
def test_step(inputs, targets):
  predictions = model(inputs, training=False)
  loss = loss_fn(targets, predictions)
  
  logs = {}
  
  for metric in metrics:
    metric.update_state(targets, predictions)
    logs["val_" + metric.name] = metric.result()
  
  loss_tracking_metric.update_state(loss)
  logs["val_loss"] = loss_tracking_metric.result()
  return logs
 
val_dataset = tf.data.Dataset.from_tensor_slices((val_images, val_labels))
val_dataset = val_dataset.batch(32)
reset_metrics() 

for inputs_batch, targets_batch in val_dataset:
  logs = test_step(inputs_batch, targets_batch) 
print("Evaluation results:") 

for key, value in logs.items(): 
  print(f"...{key}: {value:.4f}")

Evaluation results:
...val_sparse_categorical_accuracy: 0.9641
...val_loss: 0.1396


### 7.4.5 Leveraging fit() with a custom training loop

- Provide a custom training step function and let the framework do the rest.
- Do this by overriding the train_step() method of the Model class. This is the function that fit() calls for every batch of data.

- Example
  - We create a new class that subclasses keras.Model.
  - We override the method train_step(self, data). Its contents are nearly identical to what we used in the previous section. It returns a dictionary mapping metric names (including the loss) to their current values.
  - We implement a metrics property that tracks the model’s Metric instances. This enables the model to automatically call reset_state() on the model’s metrics at the start of each epoch and at the start of a call to evaluate(), so you don’t have to do it by hand.


In [18]:
# Listing 7.26 Implementing a custom training step to use with fit()
loss_fn = keras.losses.SparseCategoricalCrossentropy()
loss_tracker = keras.metrics.Mean(name="loss")

class CustomModel(keras.Model):
  def train_step(self, data):
    inputs, targets = data
    with tf.GradientTape() as tape:
      predictions = self(inputs, training=True)
      loss = loss_fn(targets, predictions)
    gradients = tape.gradient(loss, self.trainable_weights)
    self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))
    loss_tracker.update_state(loss)
    return {"loss": loss_tracker.result()}

  @property
  def metrics(self):
    return [loss_tracker] 

In [19]:
inputs = keras.Input(shape=(28 * 28,))
features = layers.Dense(512, activation="relu")(inputs)
features = layers.Dropout(0.5)(features)
outputs = layers.Dense(10, activation="softmax")(features)
model = CustomModel(inputs, outputs)
 
model.compile(optimizer=keras.optimizers.RMSprop())
model.fit(train_images, train_labels, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f44ae12d3d0>

- After you’ve called compile(), you get access to the following:
  - self.compiled_loss
    - The loss function you passed to compile()
  - self.compiled_metrics
    - A wrapper for the list of metrics you passed, which lets you call self.compiled_metrics.update_state() to update all of your metrics at once
  - self.metrics
    - The actual list of metrics you passed to compile()
    - Note that it also includes a metric that tracks the loss, similar to what we did manually with our loss_tracking_metric earlier

In [20]:
# example: a train_step() that uses the loss and metrics passed to compile()
class CustomModel(keras.Model):
  def train_step(self, data):
    inputs, targets = data
    with tf.GradientTape() as tape:
      predictions = self(inputs, training=True)
      loss = self.compiled_loss(targets, predictions)
    gradients = tape.gradient(loss, self.trainable_weights)
    self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))
    self.compiled_metrics.update_state(targets, predictions)
    return {m.name: m.result() for m in self.metrics} 

inputs = keras.Input(shape=(28 * 28,))
features = layers.Dense(512, activation="relu")(inputs)
features = layers.Dropout(0.5)(features)
outputs = layers.Dense(10, activation="softmax")(features)
model = CustomModel(inputs, outputs)
 
model.compile(
    optimizer=keras.optimizers.RMSprop(),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()])

model.fit(train_images, train_labels, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f44ada74d50>

---