In [1]:
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow
from tensorflow.keras.datasets import mnist

In [2]:
def get_mnist_model():                      # wrap model creation in a function so the model can be rebuilt later
  inputs = keras.Input((28 * 28,))
  features = layers.Dense(512, activation='relu')(inputs)
  features = layers.Dropout(0.5)(features)                 # randomly zero out half of the activations during training to reduce overfitting
  outputs = layers.Dense(10, activation='softmax')(features)
  model = keras.Model(inputs, outputs)
  return model

In [3]:
# Getting the mnist data
(images, labels), (test_images, test_labels) = mnist.load_data()
images = images.reshape((60000, 28 * 28)).astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28)).astype("float32") / 255
train_images, val_images = images[10000:], images[:10000]
train_labels, val_labels = labels[10000:], labels[:10000]

When writing a custom training loop, remember to pass "training=True" during the forward pass so that layers such as Dropout behave in training mode (and not in inference mode).
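
To see why the flag matters, here is a minimal plain-Python sketch of what a dropout layer does in each mode. The `dropout` function and its `rng` parameter are made up for illustration; this is not how Keras implements the layer, only the general idea (inverted dropout), assuming a drop rate of 0.5 as in our model.

```python
import random

def dropout(values, rate, training, rng):
    """Plain-Python sketch of inverted dropout (hypothetical helper, not Keras code)."""
    if not training:
        # Inference mode: activations pass through unchanged.
        return list(values)
    keep_prob = 1.0 - rate
    # Training mode: zero each activation with probability `rate` and
    # scale the survivors so the expected value stays the same.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, rate=0.5, training=True, rng=rng)
infer_out = dropout(activations, rate=0.5, training=False, rng=rng)
print("training :", train_out)
print("inference:", infer_out)
```

In training mode each value is either dropped to zero or doubled; in inference mode the input comes back untouched. Forgetting "training=True" in a custom loop would therefore silently train without dropout.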

Also, when retrieving the gradients of the model's weights, use "model.trainable_weights". These are the weights, such as a Dense layer's kernel 'W' and bias 'b', that are updated via backpropagation to minimize the loss of the model.
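
As a rough illustration of "updating the weights to minimize the loss", here is a toy sketch in plain Python. It assumes a one-feature linear model y = W * x + b, a mean squared error loss, hand-written derivatives and plain gradient descent. All of that is made up for the example: the notebook itself uses GradientTape and RMSprop, which do this work for us.

```python
# Hypothetical toy data generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def loss_and_gradients(W, b):
    """Return the mean squared error and its gradients w.r.t. W and b."""
    n = len(xs)
    errors = [(W * x + b) - y for x, y in zip(xs, ys)]
    loss = sum(e * e for e in errors) / n
    grad_W = sum(2 * e * x for e, x in zip(errors, xs)) / n   # dLoss/dW
    grad_b = sum(2 * e for e in errors) / n                   # dLoss/db
    return loss, grad_W, grad_b

W, b = 0.0, 0.0                 # the "trainable weights"
learning_rate = 0.05
initial_loss, _, _ = loss_and_gradients(W, b)

for step in range(200):
    loss, grad_W, grad_b = loss_and_gradients(W, b)
    # Move each weight a small step against its gradient.
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

final_loss, _, _ = loss_and_gradients(W, b)
print(f"loss went from {initial_loss:.4f} to {final_loss:.6f}")
print(f"W = {W:.3f}, b = {b:.3f}")
```

Only W and b change during the loop, which is exactly the role "model.trainable_weights" plays in the real training step.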

## Low-level usage of metrics

In a low-level training loop, you'll likely want to leverage Keras metrics.

The API is simple: call update_state(y_true, y_pred) for each batch of targets and predictions, and then call the result() method to query the current metric value. Let's see:

In [4]:
metric = keras.metrics.SparseCategoricalAccuracy()
targets = [0, 1, 2]
predictions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
metric.update_state(targets, predictions)
current_result = metric.result()
print(f"result : {current_result:.2f}")

result : 1.00
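
To make the number above less of a black box, here is a plain-Python sketch of what a sparse categorical accuracy computes: the targets are integer class indices, and a prediction counts as correct when its highest score sits at the target index. The function below is a made-up stand-in for illustration, not the Keras implementation.

```python
def sparse_categorical_accuracy(targets, predictions):
    """Fraction of samples whose highest-scoring class equals the integer target."""
    matches = 0
    for target, scores in zip(targets, predictions):
        # Index of the largest score, i.e. the predicted class.
        predicted_class = max(range(len(scores)), key=lambda i: scores[i])
        matches += int(predicted_class == target)
    return matches / len(targets)

targets = [0, 1, 2]
predictions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
perfect = sparse_categorical_accuracy(targets, predictions)

# Same targets, but the last prediction now points at class 1 instead of 2.
one_wrong = sparse_categorical_accuracy(targets, [[1, 0, 0], [0, 1, 0], [0, 1, 0]])
print(f"perfect : {perfect:.2f}")
print(f"one wrong : {one_wrong:.2f}")
```

With all three predictions matching, the accuracy is 1.00, as in the cell above; with one mismatch it drops to two out of three.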


Also, to track the average of a scalar such as the model's loss, you can use the keras.metrics.Mean() metric as follows:

In [5]:
values = [0, 1, 2, 3, 4]
mean_tracker = keras.metrics.Mean()
for val in values:
  mean_tracker.update_state(val)
print(f"Mean of values : {mean_tracker.result():.2f}")

Mean of values : 2.00


Remember to call metric.reset_state() when you want to reset the current results (at the start of a training epoch or at the start of evaluation), just like the reset_state() method used in the custom metric "RootMeanSquaredError()" (see the Testing.ipynb file for reference).
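
Here is a small plain-Python sketch of why the reset matters. The `RunningMean` class is a hypothetical stand-in that mimics the update_state() / result() / reset_state() pattern; it is not the Keras class.

```python
class RunningMean:
    """Hypothetical stand-in for a stateful metric (not keras.metrics.Mean)."""

    def __init__(self):
        self.reset_state()

    def update_state(self, value):
        # Accumulate the running total and the number of values seen.
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset_state(self):
        # Forget everything accumulated so far.
        self.total = 0.0
        self.count = 0

tracker = RunningMean()
for value in [0, 1, 2, 3, 4]:
    tracker.update_state(value)
first_epoch = tracker.result()

# Without a reset, new values are averaged together with the old ones.
tracker.update_state(10)
mixed = tracker.result()

# After a reset, the result reflects only what came afterwards.
tracker.reset_state()
tracker.update_state(10)
after_reset = tracker.result()

print(f"{first_epoch:.2f} {mixed:.2f} {after_reset:.2f}")
```

The "mixed" value blends both runs, while the value after the reset reflects only the new data. That is the behaviour we want at each epoch boundary.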

With all that we've seen, let's now look at a complete training and evaluation loop.

## Complete training and evaluation loop

In [6]:
# writing the training step function
model = get_mnist_model()
loss_fn = keras.losses.SparseCategoricalCrossentropy()
optimizer = keras.optimizers.RMSprop()
metrics = [keras.metrics.SparseCategoricalAccuracy()]   # list of metrics to monitor; they report how well the model is doing
loss_tracking_metric = keras.metrics.Mean()     # keeps track of the average of the losses

def train_step(inputs, targets):                        # one training iteration (one batch) in a custom loop, instead of model.fit()
  with tensorflow.GradientTape() as tape:
    predictions = model(inputs, training=True)      # run the forward pass in training mode, so Dropout is active
    loss = loss_fn(targets, predictions)
  gradients = tape.gradient(loss, model.trainable_weights)
  optimizer.apply_gradients(zip(gradients, model.trainable_weights))

  logs = {}
  for metric in metrics:
    metric.update_state(targets, predictions)   # update the metric's internal state (accumulators)
    logs[metric.name] = metric.result()         # store the metric's current value under its name

  loss_tracking_metric.update_state(loss)
  logs['loss'] = loss_tracking_metric.result()
  return logs

We need to reset our metrics at the beginning of each epoch and before running evaluation. This ensures that each epoch's metric values reflect the performance of that epoch only. Let's see it below:

In [7]:
def reset_metrics():
  for metric in metrics:
    metric.reset_state()
  loss_tracking_metric.reset_state()

Let's lay out our complete training loop. Notice that we use a tensorflow.data.Dataset object, which turns our NumPy arrays into an iterator. from_tensor_slices() splits the data into individual (x, y) pairs, and batch(32) groups them into batches of 32 that TensorFlow can efficiently feed into our model. Let's write the complete training loop:

In [8]:
training_dataset = tensorflow.data.Dataset.from_tensor_slices((
    train_images, train_labels))
training_dataset = training_dataset.batch(32)
epochs = 3

for epoch in range(epochs):
  reset_metrics()
  for inputs_batch, targets_batch in training_dataset:
    logs = train_step(inputs_batch, targets_batch)  # logs returned by train_step: metric and loss names mapped to their current values
  print(f"results at the end of epoch {epoch}")
  for key, values in logs.items():
    print(f"... {key} : {values: .4f}")

results at the end of epoch 0
... sparse_categorical_accuracy :  0.9132
... loss :  0.2906
results at the end of epoch 1
... sparse_categorical_accuracy :  0.9543
... loss :  0.1597
results at the end of epoch 2
... sparse_categorical_accuracy :  0.9638
... loss :  0.1301


And that is a complete training loop. We can also see that our model is learning well: over the three epochs the loss goes down from 0.2906 to 0.1301 while the accuracy goes up from 0.9132 to 0.9638.
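
As a side note, the slicing and batching done by the Dataset object in the loop above can be pictured with a plain-Python sketch. The helpers `from_slices` and `batch` below are made-up names for illustration and say nothing about how TensorFlow implements this internally; they only mirror the behaviour of pairing each example with its label and then grouping the pairs.

```python
def from_slices(features, labels):
    """Yield one (x, y) pair per example (hypothetical helper)."""
    for x, y in zip(features, labels):
        yield x, y

def batch(pairs, batch_size):
    """Group consecutive (x, y) pairs into batches (hypothetical helper)."""
    xs, ys = [], []
    for x, y in pairs:
        xs.append(x)
        ys.append(y)
        if len(xs) == batch_size:
            yield xs, ys
            xs, ys = [], []
    if xs:
        # Final batch, possibly smaller than batch_size.
        yield xs, ys

# Seven toy examples with two features each.
features = [[i, i + 1] for i in range(7)]
labels = list(range(7))

batches = list(batch(from_slices(features, labels), batch_size=3))
for inputs_batch, targets_batch in batches:
    print(len(inputs_batch), targets_batch)
```

Seven examples with a batch size of 3 give two full batches and one leftover batch with a single example, so the last batch of an epoch can be smaller than the others.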