# <font color="#418FDE" size="6.5" uppercase>**Custom Training Loops**</font>

>Last update: 20260125.
    
By the end of this lecture, you will be able to:
- Implement a per-batch training step using tf.GradientTape and a Keras model. 
- Wrap training steps with tf.function to improve performance while preserving debuggability. 
- Track custom metrics within a training loop and reset them appropriately each epoch. 


## **1. Per Batch Train Step**

### **1.1. Model Forward Pass**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_01_01.jpg?v=1769396583" width="250">



>* Forward pass feeds each batch through the model
>* Transforms inputs to predictions; later steps assess quality

>* Forward pass is recorded for automatic differentiation
>* Tracing links inputs, weights, predictions for gradients

>* Training forward pass uses training-specific layer behavior
>* Ensures realistic learning, stable statistics, differentiable predictions
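
The point about training-specific layer behavior can be sketched without TensorFlow. The snippet below is a plain-Python stand-in, not Keras code: `toy_dropout` is a hypothetical helper that mimics how a dropout layer zeroes and rescales activations when `training=True` and passes them through unchanged when `training=False`. The model in this lecture has no dropout layer, so treat this only as an illustration of why the `training` flag matters.

```python
import random


def toy_dropout(values, rate, training, rng):
    """Inverted dropout: active only when training is True."""
    if not training:
        # Inference: activations pass through unchanged.
        return list(values)
    keep_prob = 1.0 - rate
    # Training: drop each value with probability `rate`, rescale the rest.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


# Hypothetical activations from some earlier layer.
rng = random.Random(42)
activations = [1.0, 2.0, 3.0, 4.0]

train_out = toy_dropout(activations, rate=0.5, training=True, rng=rng)
infer_out = toy_dropout(activations, rate=0.5, training=False, rng=rng)

print("training=True :", train_out)
print("training=False:", infer_out)
```

Each training output is either zero or the input scaled by `1 / (1 - rate)`, while the inference output equals the input.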



In [None]:
#@title Python Code - Model Forward Pass

# This script demonstrates a simple model forward pass.
# It focuses on per batch predictions during training.
# We keep outputs small and explanations beginner friendly.

# !pip install tensorflow==2.20.0

# Import required standard libraries safely.
import os
import random
import numpy as np

# Import TensorFlow and Keras submodules.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Select device preference based on GPU availability.
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_name = "GPU"
else:
    device_name = "CPU"

# Print which device type will likely be used.
print("Running on device type:", device_name)

# Load MNIST dataset using Keras helper.
(x_train, y_train), _ = keras.datasets.mnist.load_data()

# Select a very small subset for quick demonstration.
subset_size = 64
x_train_small = x_train[:subset_size]
y_train_small = y_train[:subset_size]

# Normalize images to float32 in range zero one.
x_train_small = x_train_small.astype("float32") / 255.0

# Add channel dimension to match model expectations.
x_train_small = np.expand_dims(x_train_small, axis=-1)

# Validate input shape before building the model.
print("Input batch shape:", x_train_small.shape)

# Build a simple sequential Keras model.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# Compile model to attach loss and optimizer.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

# Take one small batch for the forward pass.
batch_size = 16
x_batch = x_train_small[:batch_size]
y_batch = y_train_small[:batch_size]

# Confirm batch shapes before passing to the model.
print("Batch images shape:", x_batch.shape)
print("Batch labels shape:", y_batch.shape)

# Define a single forward pass using GradientTape.
@tf.function
def forward_pass_step(inputs):
    # Record operations for automatic differentiation.
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
    return predictions

# Run the forward pass on one training batch.
pred_batch = forward_pass_step(x_batch)

# Check that predictions have expected shape.
print("Predictions shape:", pred_batch.shape)

# Convert predictions to numpy for simple inspection.
pred_batch_np = pred_batch.numpy()

# Print predictions for the first three examples.
for i in range(3):
    class_probs = pred_batch_np[i]
    predicted_class = int(np.argmax(class_probs))
    true_class = int(y_batch[i])
    print(
        "Example",
        i,
        "true:",
        true_class,
        "pred:",
        predicted_class,
    )




### **1.2. Batch Loss Calculation**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_01_02.jpg?v=1769396662" width="250">



>* Compare predictions to targets using a loss
>* Aggregate example losses into one scalar batch loss

>* Loss choice depends on task and labels
>* Loss is differentiable scalar used for gradients

>* Loss choice shapes model focus and behavior
>* Monitoring decreasing batch loss shows learning progress
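
The aggregation step can be made concrete with a small plain-Python sketch. It assumes a mean-squared-error loss and a mean reduction over the batch, which is also the default reduction of the Keras loss objects; the helper names and the four-example batch are hypothetical.

```python
def per_example_squared_error(targets, predictions):
    """Return one squared error per example."""
    return [(t - p) ** 2 for t, p in zip(targets, predictions)]


def batch_loss(targets, predictions):
    """Average the per-example losses into a single scalar."""
    losses = per_example_squared_error(targets, predictions)
    return sum(losses) / len(losses)


# Hypothetical batch of four regression targets and predictions.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.5, 2.0, 2.0, 4.5]

example_losses = per_example_squared_error(targets, predictions)
scalar_loss = batch_loss(targets, predictions)

print("Per-example losses:", example_losses)  # [0.25, 0.0, 1.0, 0.25]
print("Scalar batch loss:", scalar_loss)      # 0.375
```

The scalar is what gradients are taken against; the per-example values are only an intermediate result.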



In [None]:
#@title Python Code - Batch Loss Calculation

# This script shows batch loss calculation clearly.
# It uses TensorFlow for a tiny demo model.
# Focus is on per batch loss with GradientTape.

# !pip install tensorflow==2.20.0

# Import required standard libraries safely.
import os
import random
import numpy as np

# Import TensorFlow and Keras submodules.
import tensorflow as tf
from tensorflow import keras

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)

# Set NumPy and TensorFlow seeds also.
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Choose device automatically based on availability.
physical_gpus = tf.config.list_physical_devices("GPU")

# Inform which device type will be used.
if physical_gpus:
    device_type = "GPU"
else:
    device_type = "CPU"

# Print selected device type briefly.
print("Using device type:", device_type)

# Create a tiny synthetic regression dataset.
num_samples = 64
input_dim = 3

# Generate random input features with small values.
features = np.random.randn(num_samples, input_dim).astype("float32")

# Define true weights and bias for targets.
true_w = np.array([[2.0], [-1.0], [0.5]], dtype="float32")

# Compute noiseless targets using matrix multiplication.
targets = features @ true_w + 0.3

# Add small Gaussian noise to targets.
noise = 0.05 * np.random.randn(num_samples, 1).astype("float32")

# Final noisy targets for training batches.
targets = targets + noise

# Wrap data into a tf.data.Dataset object.
dataset = tf.data.Dataset.from_tensor_slices((features, targets))

# Shuffle and batch the dataset for training.
batch_size = 8
dataset = dataset.shuffle(buffer_size=num_samples).batch(batch_size)

# Build a simple Keras regression model.
model = keras.Sequential([
    keras.layers.Input(shape=(input_dim,)),
    keras.layers.Dense(4, activation="relu"),
    keras.layers.Dense(1)
])

# Create a mean squared error loss instance.
loss_fn = keras.losses.MeanSquaredError()

# Create an optimizer for parameter updates.
optimizer = keras.optimizers.SGD(learning_rate=0.1)

# Define a metric to track mean batch loss.
train_loss_metric = keras.metrics.Mean(name="train_loss")

# Define one training step using GradientTape.
@tf.function
def train_step(batch_x, batch_y):
    # Validate shapes before forward pass.
    tf.debugging.assert_rank(batch_x, 2)
    tf.debugging.assert_rank(batch_y, 2)

    # Record operations for automatic differentiation.
    with tf.GradientTape() as tape:
        predictions = model(batch_x, training=True)

        # Compute scalar batch loss from predictions.
        batch_loss = loss_fn(batch_y, predictions)

    # Compute gradients of loss with respect to weights.
    gradients = tape.gradient(batch_loss, model.trainable_variables)

    # Apply gradients to update model parameters.
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    # Update running metric with current batch loss.
    train_loss_metric.update_state(batch_loss)

    # Return scalar batch loss for optional inspection.
    return batch_loss

# Run a tiny custom training loop for two epochs.
num_epochs = 2
for epoch in range(num_epochs):
    # Reset metric state at the start of each epoch.
    train_loss_metric.reset_state()

    # Iterate over small number of batches.
    for step, (batch_x, batch_y) in enumerate(dataset):
        batch_loss_value = train_step(batch_x, batch_y)

        # Print first batch loss each epoch only.
        if step == 0:
            print(
                f"Epoch {epoch + 1}, first batch loss:",
                float(batch_loss_value)
            )

    # Print mean loss across all batches this epoch.
    print(
        f"Epoch {epoch + 1}, mean batch loss:",
        float(train_loss_metric.result())
    )




### **1.3. Gradient Calculation and Update**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_01_03.jpg?v=1769396699" width="250">



>* Gradients show how each weight affects loss
>* Autodiff backpropagates to produce optimizer-ready gradients

>* Optimizer pairs gradients with weights and updates
>* Small gradient-based steps gradually improve model predictions

>* Manage gradient size and target correct variables
>* Stable gradients give smoother training and better models
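
To see what the tape and optimizer do in the simplest case, here is a hand-written sketch for a one-weight linear model `w * x + b` with mean squared error. The gradients are derived by hand rather than by automatic differentiation, gradient clipping is left out, and all names are hypothetical. The data follows the same linear rule (`3.0 * x + 0.5`) used in the code cell for this section.

```python
def mse_and_gradients(w, b, xs, ys):
    """Return the MSE loss and its gradients with respect to w and b."""
    n = len(xs)
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    loss = sum(e * e for e in errors) / n
    grad_w = sum(2.0 * e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(2.0 * e for e in errors) / n
    return loss, grad_w, grad_b


def sgd_step(w, b, grad_w, grad_b, learning_rate):
    """Move each parameter a small step against its gradient."""
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Tiny dataset generated by y = 3.0 * x + 0.5.
xs = [-1.0, 0.0, 1.0]
ys = [3.0 * x + 0.5 for x in xs]

w, b = 0.0, 0.0
_, first_grad_w, first_grad_b = mse_and_gradients(w, b, xs, ys)

losses = []
for _ in range(50):
    loss, grad_w, grad_b = mse_and_gradients(w, b, xs, ys)
    losses.append(loss)
    w, b = sgd_step(w, b, grad_w, grad_b, learning_rate=0.1)

print("First gradients:", first_grad_w, first_grad_b)
print("Learned w, b:", round(w, 3), round(b, 3))
print("First and last loss:", losses[0], losses[-1])
```

Each step pairs one gradient with one parameter, which is the same pairing that `zip(gradients, model.trainable_variables)` expresses in the TensorFlow version.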



In [None]:
#@title Python Code - Gradient Calculation and Update

# This script shows gradient calculation and updates.
# It uses tf.GradientTape with a simple model.
# Focus on one batch training step implementation.

# !pip install tensorflow==2.20.0

# Import required TensorFlow and NumPy modules.
import tensorflow as tf
import numpy as np

# Set deterministic seeds for reproducible behavior.
tf.random.set_seed(7)
np.random.seed(7)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Select device string based on GPU availability.
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_name = "GPU available"
else:
    device_name = "GPU not available"

# Show which type of device will be used.
print("Device status:", device_name)

# Create tiny synthetic input data for demonstration.
inputs = np.linspace(-1.0, 1.0, num=8, dtype=np.float32)
inputs = inputs.reshape(-1, 1)

# Create simple target values using a linear rule.
targets = 3.0 * inputs + 0.5

# Wrap arrays as TensorFlow tensors with float32.
inputs_tf = tf.convert_to_tensor(inputs, dtype=tf.float32)

# Ensure targets tensor has matching shape.
targets_tf = tf.convert_to_tensor(targets, dtype=tf.float32)

# Confirm shapes are compatible for training.
assert inputs_tf.shape == targets_tf.shape

# Build a tiny Keras model with one Dense layer.
# An explicit Input layer avoids the Keras 3 warning about input_shape.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)
])

# Create a mean squared error loss function.
loss_fn = tf.keras.losses.MeanSquaredError()

# Create a simple SGD optimizer with small learning rate.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Define one training step using GradientTape.
@tf.function
def train_step(x_batch, y_batch):
    # Record operations for automatic differentiation.
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss_value = loss_fn(y_batch, predictions)

    # Compute gradients with respect to trainable variables.
    gradients = tape.gradient(loss_value, model.trainable_variables)

    # Optionally clip gradients to avoid extreme values.
    clipped_gradients = [
        tf.clip_by_norm(g, clip_norm=1.0) for g in gradients
    ]

    # Apply gradients to update model parameters.
    optimizer.apply_gradients(
        zip(clipped_gradients, model.trainable_variables)
    )

    # Return scalar loss and gradient norms for inspection.
    grad_norms = [tf.norm(g) for g in clipped_gradients]
    return loss_value, grad_norms

# Run a few manual training steps on the same batch.
num_steps = 3
for step in range(num_steps):
    loss_value, grad_norms = train_step(inputs_tf, targets_tf)
    grad_norms_np = [float(g.numpy()) for g in grad_norms]
    print(
        "Step", step, "loss:", float(loss_value.numpy()),
        "grad_norms:", grad_norms_np
    )

# Show final predictions after gradient updates.
final_preds = model(inputs_tf, training=False)
print("Final predictions:", final_preds.numpy().reshape(-1))



## **2. Mastering tf.function**

### **2.1. Decorating the Training Step**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_02_01.jpg?v=1769396737" width="250">



>* tf.function compiles training steps into fast graphs
>* Reduces Python overhead, speeding many training iterations

>* Define training_step to take tensors, return tensors
>* Call compiled training_step inside simple Python loops

>* Balance graph performance with training transparency and debugging
>* Keep core math in graph, control in Python



In [None]:
#@title Python Code - Decorating the Training Step

# This script shows tf function on training steps.
# It uses a tiny model and dataset subset.
# Focus is decorating training step for speed.

# !pip install tensorflow==2.20.0

# Import required TensorFlow modules.
import tensorflow as tf

# Set deterministic seeds for reproducibility.
tf.random.set_seed(7)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Select device string based on GPU availability.
physical_gpus = tf.config.list_physical_devices("GPU")

# Choose device name for information only.
if physical_gpus:
    device_name = "GPU"
else:
    device_name = "CPU"

# Print which device type is selected.
print("Using device type:", device_name)

# Load MNIST dataset from Keras datasets.
(mnist_x_train, mnist_y_train), _ = tf.keras.datasets.mnist.load_data()

# Normalize images to float32 in range zero one.
mnist_x_train = mnist_x_train.astype("float32") / 255.0

# Add channel dimension to images for Conv2D.
mnist_x_train = mnist_x_train[..., tf.newaxis]

# Take a small subset for quick demonstration.
subset_size = 2048

# Slice the subset safely from training data.
train_images = mnist_x_train[:subset_size]

# Slice corresponding labels for the subset.
train_labels = mnist_y_train[:subset_size]

# Validate shapes before building dataset.
assert train_images.shape[0] == subset_size

# Create tf.data dataset from tensors.
train_ds = tf.data.Dataset.from_tensor_slices((train_images, train_labels))

# Shuffle and batch the dataset for training.
train_ds = train_ds.shuffle(1024, seed=7).batch(64)

# Build a simple sequential convolutional model.
# An explicit Input layer avoids the Keras 3 warning about input_shape.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, (3, 3), activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10)
])

# Define loss function for sparse classification.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Define optimizer with a small learning rate.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

# Define a metric to track mean training loss.
train_loss_metric = tf.keras.metrics.Mean(name="train_loss")

# Define a metric to track training accuracy.
train_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(name="train_acc")

# Define one training step without decoration first.
def plain_train_step(images, labels):
    with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss_value = loss_fn(labels, logits)
    gradients = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss_metric.update_state(loss_value)
    train_acc_metric.update_state(labels, logits)

# Decorate the training step using tf function.
@tf.function
def graph_train_step(images, labels):
    with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss_value = loss_fn(labels, logits)
    gradients = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss_metric.update_state(loss_value)
    train_acc_metric.update_state(labels, logits)

# Run one warmup epoch using plain training step.
for batch_images, batch_labels in train_ds:
    plain_train_step(batch_images, batch_labels)

# Read metrics after warmup epoch.
warmup_loss = float(train_loss_metric.result().numpy())

# Read accuracy metric after warmup epoch.
warmup_acc = float(train_acc_metric.result().numpy())

# Reset metrics before tf function training epoch.
train_loss_metric.reset_state()

# Reset accuracy metric for a clean second epoch.
train_acc_metric.reset_state()

# Run one epoch using the tf function training step.
for batch_images, batch_labels in train_ds:
    graph_train_step(batch_images, batch_labels)

# Read metrics after tf function epoch.
wrapped_loss = float(train_loss_metric.result().numpy())

# Read accuracy metric after tf function epoch.
wrapped_acc = float(train_acc_metric.result().numpy())

# Print a short comparison of both epochs.
print("Warmup epoch loss and accuracy:", round(warmup_loss, 4), round(warmup_acc, 4))

# Print metrics for the tf function decorated epoch.
print("tf.function epoch loss and accuracy:", round(wrapped_loss, 4), round(wrapped_acc, 4))

# Print a final note about tf function usage.
print("Training step decorated with tf.function executed successfully.")



### **2.2. Tracing vs Retracing**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_02_02.jpg?v=1769396782" width="250">



>* Tracing builds an optimized graph from Python code
>* Later calls reuse this graph for faster training

>* Retracing happens when input shapes, dtypes, or Python argument values change
>* Frequent retracing adds overhead and reduces performance

>* Keep inputs consistent to avoid frequent retracing
>* Stable graphs give fast training and easy debugging
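
The caching idea behind tracing can be illustrated with a toy model in plain Python. This is not how TensorFlow implements its trace cache; `ToyTracedFunction` is a hypothetical class that only mirrors the observable behavior: one "trace" per distinct input signature, and a relaxed batch dimension (like `None` in a `tf.TensorSpec`) letting a single trace serve several batch sizes.

```python
class ToyTracedFunction:
    """Toy stand-in for a traced function: one cached entry per signature."""

    def __init__(self, python_fn, relax_batch_dim=False):
        self.python_fn = python_fn
        self.relax_batch_dim = relax_batch_dim
        self.trace_count = 0
        self._cache = {}

    def __call__(self, batch):
        rows, cols = len(batch), len(batch[0])
        # A relaxed batch dimension ignores the number of rows in the key.
        key = (None, cols) if self.relax_batch_dim else (rows, cols)
        if key not in self._cache:
            # First time this signature is seen: "trace" and cache it.
            self.trace_count += 1
            self._cache[key] = self.python_fn
        return self._cache[key](batch)


def row_sums(batch):
    """The computation being 'traced'."""
    return [sum(row) for row in batch]


flexible = ToyTracedFunction(row_sums)
fixed = ToyTracedFunction(row_sums, relax_batch_dim=True)

batch_of_8 = [[1.0] * 4 for _ in range(8)]
batch_of_4 = [[1.0] * 4 for _ in range(4)]

for batch in (batch_of_8, batch_of_8, batch_of_4):
    flexible(batch)
    fixed(batch)

print("Flexible traces:", flexible.trace_count)  # 2
print("Fixed traces:", fixed.trace_count)        # 1
```

Repeating the same shape reuses the cached entry, while a new batch size costs the flexible version a second trace. A smaller final batch at the end of an epoch has the same effect in a real training loop.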



In [None]:
#@title Python Code - Tracing vs Retracing

# This script demonstrates tracing versus retracing simply.
# We use a tiny model and custom tf function steps.
# Focus on input shapes and dtypes affecting tracing.

# !pip install tensorflow==2.20.0

# Import TensorFlow and NumPy libraries.
import tensorflow as tf
import numpy as np

# Set deterministic seeds for reproducibility.
tf.random.set_seed(7)
np.random.seed(7)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Select device string based on GPU availability.
device = "GPU" if tf.config.list_physical_devices("GPU") else "CPU"

# Print which device type will be used.
print("Running on device type:", device)

# Create a simple dense model for demonstration.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="relu"),
    tf.keras.layers.Dense(1)
])

# Create a simple optimizer instance.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Define a basic mean squared error loss.
loss_fn = tf.keras.losses.MeanSquaredError()

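# Define a tf function with fixed input signature.
@tf.function(input_signature=[
    tf.TensorSpec(shape=(None, 4), dtype=tf.float32),
    tf.TensorSpec(shape=(None, 1), dtype=tf.float32)
])
def train_step_fixed(x_batch, y_batch):
    # Python print runs only while tracing, so it reveals each new trace.
    print("Tracing train_step_fixed with input shape:", x_batch.shape)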
    # Record operations for automatic differentiation.
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
    # Compute gradients and apply them.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

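# Define a tf function without fixed signature.
@tf.function
def train_step_flexible(x_batch, y_batch):
    # Same body but no explicit input signature.
    # Python print runs only while tracing, so it reveals each retrace.
    print("Tracing train_step_flexible with input shape:", x_batch.shape)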
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
    # Apply gradients to update model weights.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Create helper to build random batches with given shape.
def make_batch(batch_size, feature_dim):
    # Create random features and targets with correct shapes.
    x = np.random.randn(batch_size, feature_dim).astype("float32")
    y = np.random.randn(batch_size, 1).astype("float32")
    return x, y

# Prepare one batch for the fixed-signature function.
x1, y1 = make_batch(batch_size=8, feature_dim=4)

# Call fixed function twice; the second call reuses the traced graph.
loss1 = train_step_fixed(x1, y1)
loss2 = train_step_fixed(x1, y1)

# Print losses to show normal execution reuse.
print("Fixed signature losses:", float(loss1), float(loss2))

# Prepare batches with different batch sizes for flexible.
x2, y2 = make_batch(batch_size=4, feature_dim=4)

# Call flexible function with first batch size.
loss3 = train_step_flexible(x1, y1)

# Call flexible function with different batch size.
loss4 = train_step_flexible(x2, y2)

# Print losses; the second call used a new batch size, which triggers a retrace.
print("Flexible signature losses:", float(loss3), float(loss4))

# Show final model prediction shape for confirmation.
print("Final prediction shape:", model(x1).shape)



### **2.3. Debugging Autograph Issues**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_02_03.jpg?v=1769396819" width="250">



>* AutoGraph rewrites Python control flow into graph operations
>* In graph mode, Python print runs only during tracing, and dynamic Python patterns can fail

>* Disable tf.function, test pieces in eager mode
>* Re-enable graph mode, simplify code, log with tf.print

>* Read tracing errors to spot incompatible Python patterns
>* Refactor to tensor operations and log with tf.print



In [None]:
#@title Python Code - Debugging Autograph Issues

# This script demonstrates debugging TensorFlow Autograph issues.
# It compares eager and tf function training steps safely.
# It keeps prints minimal while showing key behaviors.

# !pip install tensorflow==2.20.0

# Import required TensorFlow module.
import tensorflow as tf

# Set deterministic random seeds.
tf.random.set_seed(7)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Select available device type string.
device_type = "GPU" if tf.config.list_physical_devices("GPU") else "CPU"

# Print selected device type.
print("Using device type:", device_type)

# Create a tiny synthetic regression dataset.
features = tf.random.normal(shape=(16, 3))

# Create targets as simple linear combination.
targets = tf.reduce_sum(features, axis=1, keepdims=True)

# Validate dataset shapes defensively.
assert features.shape == (16, 3)

# Validate target shape defensively.
assert targets.shape == (16, 1)

# Build a small Keras model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Create an optimizer instance.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Define a simple loss function.
loss_fn = tf.keras.losses.MeanSquaredError()

# Define a metric to track mean loss.
train_metric = tf.keras.metrics.Mean(name="train_loss")

# Define an eager training step for clarity.
def train_step_eager(x_batch, y_batch):
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    train_metric.update_state(loss)
    return loss

# Define a tf function training step with autograph.
@tf.function
def train_step_graph(x_batch, y_batch):
    tf.debugging.assert_shapes([(x_batch, (None, 3)), (y_batch, (None, 1))])
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
        tf.print("Inside graph loss:", loss)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    train_metric.update_state(loss)
    return loss

# Run one eager step to verify behavior.
loss_eager = train_step_eager(features, targets)

# Print eager mode loss value.
print("Eager step loss:", float(loss_eager))

# Reset metric before graph execution.
train_metric.reset_state()

# Run one graph step to compare behavior.
loss_graph = train_step_graph(features, targets)

# Print graph mode loss value.
print("Graph step loss:", float(loss_graph))

# Show metric value after graph step.
print("Tracked metric loss:", float(train_metric.result()))

# Final confirmation message about script completion.
print("Finished Autograph debugging demonstration.")



## **3. Metrics in Training Loops**

### **3.1. Using tf.keras.metrics.Metric**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_03_01.jpg?v=1769396892" width="250">



>* Metric objects track performance across many batches
>* They give flexible, custom metrics for any workflow

>* Metric objects act like reusable measurement gauges
>* They centralize statistics, simplifying code and reducing bugs

>* Define and reuse multiple metrics across training phases
>* Swap or extend metrics without changing loop structure



In [None]:
#@title Python Code - Using tf.keras.metrics.Metric

# This script shows metrics in custom loops.
# It uses TensorFlow 2.20.0 with small data.
# Focus is on tf.keras.metrics.Metric usage.

# !pip install tensorflow==2.20.0

# Import required libraries safely.
import os
import random
import numpy as np
import tensorflow as tf

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Select device preferring GPU when available.
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_name = "GPU"
else:
    device_name = "CPU"
print("Using device type:", device_name)

# Load MNIST dataset from Keras datasets.
(mnist_x_train, mnist_y_train), _ = tf.keras.datasets.mnist.load_data()

# Reduce dataset size for quick demonstration.
train_images = mnist_x_train[:2000].astype("float32") / 255.0
train_labels = mnist_y_train[:2000].astype("int32")

# Add channel dimension to images.
train_images = np.expand_dims(train_images, axis=-1)

# Validate shapes before building dataset.
assert train_images.shape[0] == train_labels.shape[0]
assert train_images.ndim == 4 and train_labels.ndim == 1

# Create small tf.data.Dataset for training.
batch_size = 64
dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))

# Shuffle and batch the dataset.
dataset = dataset.shuffle(buffer_size=2000, seed=seed_value)
dataset = dataset.batch(batch_size)

# Build a simple sequential classification model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Define optimizer and loss function.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

# Create metric objects for loss and accuracy.
train_loss_metric = tf.keras.metrics.Mean(name="train_loss")
train_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(
    name="train_accuracy"
)

# Define one training step using GradientTape.
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss_value = loss_fn(labels, predictions)
    gradients = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    # Update metric states with current batch.
    train_loss_metric.update_state(loss_value)
    train_acc_metric.update_state(labels, predictions)

# Set number of epochs for demonstration.
num_epochs = 3

# Run custom training loop with metrics.
for epoch in range(num_epochs):
    # Reset metric states at epoch start.
    train_loss_metric.reset_state()
    train_acc_metric.reset_state()

    # Iterate over batches in dataset.
    for batch_images, batch_labels in dataset:
        train_step(batch_images, batch_labels)

    # Read metric results after epoch.
    epoch_loss = train_loss_metric.result().numpy()
    epoch_acc = train_acc_metric.result().numpy()

    # Print concise epoch summary line.
    print(
        f"Epoch {epoch + 1}: loss={epoch_loss:.4f}, accuracy={epoch_acc:.4f}"
    )

# Show final metric values after training.
final_loss = train_loss_metric.result().numpy()
final_acc = train_acc_metric.result().numpy()
print("Final tracked loss:", round(float(final_loss), 4))
print("Final tracked accuracy:", round(float(final_acc), 4))



### **3.2. Updating And Reading Metrics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_03_02.jpg?v=1769396940" width="250">



>* Treat each metric as a stateful calculator
>* Update metrics every batch with predictions and loss

>* Updating and reading metrics are separate actions
>* Read metrics at checkpoints; reading doesn't change state

>* Update all metrics every training batch consistently
>* Read metrics at epoch end for reliable snapshots
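
The split between updating and reading can be shown with a small plain-Python stand-in for an accuracy metric. `ToyAccuracy` is a hypothetical class, not the Keras implementation; it only follows the same `update_state` / `result` pattern so the state changes are easy to follow.

```python
class ToyAccuracy:
    """Stateful accuracy: accumulates counts, computes the ratio on demand."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update_state(self, labels, predicted_classes):
        # Updating changes the internal counters.
        for label, predicted in zip(labels, predicted_classes):
            self.correct += int(label == predicted)
            self.total += 1

    def result(self):
        # Reading only computes from the counters; it does not modify them.
        return self.correct / self.total if self.total else 0.0


metric = ToyAccuracy()

# Hypothetical first batch: 3 of 4 predictions are correct.
metric.update_state([0, 1, 2, 1], [0, 1, 1, 1])
first_read = metric.result()
second_read = metric.result()

# Hypothetical second batch: 4 of 4 predictions are correct.
metric.update_state([2, 2, 0, 0], [2, 2, 0, 0])
after_two_batches = metric.result()

print("Two reads after batch 1:", first_read, second_read)  # 0.75 0.75
print("Read after batch 2:", after_two_batches)             # 0.875
```

Reading twice returns the same value, and only a further update changes what the next read reports.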



In [None]:
#@title Python Code - Updating And Reading Metrics

# This script shows metrics in custom loops.
# It focuses on updating and reading metrics.
# Run cells to observe metric behavior clearly.

# !pip install tensorflow==2.20.0

# Import required libraries for TensorFlow training.
import os
import random
import numpy as np
import tensorflow as tf

# Set seeds for reproducible behavior in this demo.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in a compact single line.
print("TensorFlow version:", tf.__version__)

# Select device preference based on GPU availability.
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_type = "GPU"
else:
    device_type = "CPU"

# Print which device type will likely be used.
print("Running on device type:", device_type)

# Create a tiny synthetic classification dataset.
num_samples = 256
num_features = 20
num_classes = 3

# Generate random input features with normal distribution.
X = np.random.randn(num_samples, num_features).astype("float32")

# Generate random integer labels for classification.
y_int = np.random.randint(num_classes, size=(num_samples,))

# Convert integer labels to one hot encoded vectors.
y_onehot = tf.one_hot(y_int, depth=num_classes)

# Wrap arrays into a tf.data.Dataset pipeline.
dataset = tf.data.Dataset.from_tensor_slices((X, y_onehot))

# Set the batch size and shuffle the dataset.
batch_size = 32
dataset = dataset.shuffle(buffer_size=num_samples, seed=seed_value)

# Batch and prefetch for efficient small training loop.
dataset = dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Build a simple Keras model for demonstration.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

# Define an optimizer and a loss function.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
loss_fn = tf.keras.losses.CategoricalCrossentropy()

# Create metric objects for loss and accuracy.
train_loss_metric = tf.keras.metrics.Mean(name="train_loss")
train_acc_metric = tf.keras.metrics.CategoricalAccuracy(
    name="train_accuracy"
)

# Define one training step using GradientTape and metrics.
@tf.function
def train_step(batch_x, batch_y):
    # Record operations for automatic differentiation.
    with tf.GradientTape() as tape:
        preds = model(batch_x, training=True)
        loss_value = loss_fn(batch_y, preds)

    # Compute gradients of loss with respect to weights.
    grads = tape.gradient(loss_value, model.trainable_variables)

    # Apply gradients to update model parameters.
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    # Update loss metric with current batch loss value.
    train_loss_metric.update_state(loss_value)

    # Update accuracy metric with labels and predictions.
    train_acc_metric.update_state(batch_y, preds)

    # Return scalar loss for optional external uses.
    return loss_value

# Validate dataset shapes before starting training.
for sample_x, sample_y in dataset.take(1):
    assert sample_x.shape[1] == num_features
    assert sample_y.shape[1] == num_classes

# Set number of epochs small for quick demonstration.
num_epochs = 3

# Run custom training loop with metric tracking.
for epoch in range(num_epochs):
    # Reset metric states at the start of each epoch.
    train_loss_metric.reset_state()
    train_acc_metric.reset_state()

    # Iterate over batches and perform training steps.
    for batch_x, batch_y in dataset:
        _ = train_step(batch_x, batch_y)

    # Read metric results once after processing all batches.
    epoch_loss = train_loss_metric.result().numpy()
    epoch_acc = train_acc_metric.result().numpy()

    # Print a concise summary line for this epoch.
    print(
        f"Epoch {epoch + 1}: loss={epoch_loss:.4f}, "
        f"accuracy={epoch_acc:.4f}"
    )

# The metrics were not reset after the last epoch, so they still hold its values.
final_loss = float(train_loss_metric.result().numpy())
final_acc = float(train_acc_metric.result().numpy())
print(
    "Final tracked loss:", round(final_loss, 4),
    "accuracy:", round(final_acc, 4),
)



### **3.3. Resetting Metrics Each Epoch**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master TensorFlow 2.20.0/Module_04/Lecture_B/image_03_03.jpg?v=1769397004" width="250">



>* Metrics should reflect only the current epoch
>* Reset each epoch to see true performance trends

>* Reset every metric at the start of each epoch
>* Otherwise results average over all earlier epochs and hide recent improvements

>* Per-epoch resets improve monitoring and debugging decisions
>* They reveal trends, anomalies, and overfitting accurately
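
The effect of skipping the reset can be seen without training a model. The sketch below is a minimal illustration, assuming made-up batch losses (`epoch_batch_losses` is hypothetical data, not output from the lecture's model). It feeds the same values into two `tf.keras.metrics.Mean` objects and resets only one of them at the start of each epoch.

```python
import tensorflow as tf

# Hypothetical batch losses for two epochs, chosen only for illustration.
epoch_batch_losses = [[4.0, 2.0], [1.0, 1.0]]

# One metric is never reset; the other is reset at each epoch start.
accumulating = tf.keras.metrics.Mean(name="no_reset")
per_epoch = tf.keras.metrics.Mean(name="with_reset")

no_reset_results = []
with_reset_results = []

for batch_losses in epoch_batch_losses:
    # Clear only the per-epoch metric before processing this epoch.
    per_epoch.reset_state()
    for loss in batch_losses:
        accumulating.update_state(loss)
        per_epoch.update_state(loss)
    # Read both metrics once the epoch's batches are processed.
    no_reset_results.append(float(accumulating.result()))
    with_reset_results.append(float(per_epoch.result()))

print("Without reset:", no_reset_results)  # [3.0, 2.0]
print("With reset:   ", with_reset_results)  # [3.0, 1.0]
```

In the second epoch every batch loss is 1.0, yet the metric that was never reset reports 2.0 because it still averages in the first epoch's batches.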



In [None]:
#@title Python Code - Resetting Metrics Each Epoch

# This script shows metrics resetting each epoch.
# It uses a tiny model and dataset subset.
# Focus on clear metric behavior in loops.

# !pip install tensorflow==2.20.0

# Import required modules for TensorFlow training.
import random
import numpy as np
import tensorflow as tf

# Set seeds for reproducible training behavior.
seed_value = 7
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Detect whether a GPU is available; TensorFlow places ops on it automatically.
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_name = "GPU"
else:
    device_name = "CPU"

# Print which device type will likely be used.
print("Running training on device type:", device_name)

# Load MNIST dataset using Keras utilities.
(mnist_x_train, mnist_y_train), _ = tf.keras.datasets.mnist.load_data()

# Normalize images to float32 values between zero and one.
mnist_x_train = mnist_x_train.astype("float32") / 255.0

# Add channel dimension to match Conv2D expectations.
mnist_x_train = np.expand_dims(mnist_x_train, axis=-1)

# Use a very small subset for quick demonstration.
subset_size = 512
mnist_x_train = mnist_x_train[:subset_size]
mnist_y_train = mnist_y_train[:subset_size]

# Validate shapes to avoid unexpected training issues.
assert mnist_x_train.shape[0] == subset_size
assert mnist_y_train.shape[0] == subset_size

# Create a tf.data.Dataset for efficient batching.
train_ds = tf.data.Dataset.from_tensor_slices(
    (mnist_x_train, mnist_y_train)
)

# Shuffle fully (the buffer exceeds the subset size) and batch into small groups.
train_ds = train_ds.shuffle(buffer_size=1024, seed=seed_value)
train_ds = train_ds.batch(64)

# Build a simple sequential convolutional model.
# An explicit Input layer replaces the deprecated input_shape argument.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, (3, 3), activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Define optimizer and loss function for classification.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

# Create metric objects for loss and accuracy tracking.
train_loss_metric = tf.keras.metrics.Mean(name="train_loss")
train_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy(
    name="train_accuracy"
)

# Define one training step using GradientTape mechanics.
@tf.function
def train_step(images, labels):
    # Record the forward pass so gradients can be computed.
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss_value = loss_fn(labels, predictions)
    # Compute gradients and apply them to the trainable weights.
    gradients = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    # Accumulate batch results into the epoch-level metrics.
    train_loss_metric.update_state(loss_value)
    train_acc_metric.update_state(labels, predictions)

# Set number of epochs small for quick execution.
num_epochs = 3

# Run custom training loop with metric resets each epoch.
for epoch in range(num_epochs):
    train_loss_metric.reset_state()
    train_acc_metric.reset_state()
    for batch_images, batch_labels in train_ds:
        train_step(batch_images, batch_labels)
    epoch_loss = train_loss_metric.result().numpy()
    epoch_acc = train_acc_metric.result().numpy()
    print(
        f"Epoch {epoch + 1}: loss={epoch_loss:.4f}, accuracy={epoch_acc:.4f}"
    )

# Metrics were reset at each epoch start, so the last printed values
# cover only the final epoch.
print("Final epoch loss and accuracy reflect last epoch only.")



# <font color="#418FDE" size="6.5" uppercase>**Custom Training Loops**</font>


In this lecture, you learned to:
- Implement a per-batch training step using tf.GradientTape and a Keras model. 
- Wrap training steps with tf.function to improve performance while preserving debuggability. 
- Track custom metrics within a training loop and reset them appropriately each epoch. 

In the next module (Module 5), we will cover 'Data Pipelines'.