# <font color="#418FDE" size="6.5" uppercase>**Transforms and Augment**</font>

>Last update: 20260129.
    
By the end of this lecture, you will be able to:
- Apply common preprocessing transforms such as normalization, resizing, and tensor conversion to raw data. 
- Design and configure data augmentation pipelines that improve model robustness without corrupting labels. 
- Integrate transforms into Dataset and DataLoader workflows while monitoring their impact on training. 


## **1. Core Preprocessing Transforms**

### **1.1. Tensor Conversion and Normalization**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_01_01.jpg?v=1769744165" width="250">



>* Convert varied raw data into tensor arrays
>* Tensors enable fast batching, device moves, and training

>* Normalization rescales tensor values to stable ranges
>* It keeps large-valued features from dominating, which helps learning

>* Treat conversion and normalization as one pipeline
>* Reuse the same statistics at training and inference to avoid distribution shift
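
The idea of reusing statistics can be sketched in plain NumPy. This is a minimal illustration rather than the lecture's TensorFlow code: the helper names `channel_stats` and `normalize`, the random arrays, and the `eps` guard are all assumptions made for the example. The statistics are computed from the training split only and then applied unchanged to the validation split.

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std for a batch shaped (N, H, W, C)."""
    mean = images.mean(axis=(0, 1, 2))
    std = images.std(axis=(0, 1, 2))
    return mean, std

def normalize(images, mean, std, eps=1e-7):
    """Standardize each channel; eps (an assumed guard) avoids division by zero."""
    return (images - mean) / (std + eps)

# Hypothetical data: random float32 images already scaled to [0, 1].
rng = np.random.default_rng(0)
train = rng.random((16, 4, 4, 3), dtype=np.float32)
val = rng.random((8, 4, 4, 3), dtype=np.float32)

# Statistics come from the training split only.
mean, std = channel_stats(train)

# The same mean and std are reused for every split.
train_norm = normalize(train, mean, std)
val_norm = normalize(val, mean, std)

print("Train channel means:", train_norm.mean(axis=(0, 1, 2)))
print("Val channel means:", val_norm.mean(axis=(0, 1, 2)))
```

The validation means will generally not be exactly zero, which is expected: the validation split is measured against the training distribution instead of being re-centered on its own.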



In [None]:
#@title Python Code - Tensor Conversion and Normalization

# This script shows tensor conversion basics.
# It uses TensorFlow for simple image tensors.
# Focus on conversion and normalization steps.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import math

# Import TensorFlow and NumPy libraries.
import tensorflow as tf
import numpy as np

# Set deterministic random seeds.
SEED = 42
random.seed(SEED)
np.random.seed(SEED)

# Set TensorFlow random seed.
tf.random.set_seed(SEED)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Create a small fake RGB image.
height, width, channels = 4, 4, 3
fake_image_uint8 = tf.random.uniform(
    shape=(height, width, channels),
    minval=0,
    maxval=256,
    dtype=tf.int32,
)

# Cast random integers to uint8 type.
fake_image_uint8 = tf.cast(fake_image_uint8, tf.uint8)

# Show original image shape and dtype.
print("Original shape:", fake_image_uint8.shape)
print("Original dtype:", fake_image_uint8.dtype)

# Convert image to float32 tensor.
image_float = tf.image.convert_image_dtype(
    fake_image_uint8,
    dtype=tf.float32,
)

# Confirm new dtype and value range.
print("Converted dtype:", image_float.dtype)
print("Min, max after scaling:",
      float(tf.reduce_min(image_float)),
      float(tf.reduce_max(image_float)))

# Define per channel mean and std values.
mean = tf.constant([0.5, 0.5, 0.5], dtype=tf.float32)
std = tf.constant([0.2, 0.2, 0.2], dtype=tf.float32)

# Reshape mean and std for broadcasting.
mean = tf.reshape(mean, (1, 1, 3))
std = tf.reshape(std, (1, 1, 3))

# Apply normalization to the image.
normalized_image = (image_float - mean) / std

# Check normalized image statistics.
mean_value = tf.reduce_mean(normalized_image)
std_value = tf.math.reduce_std(normalized_image)

# Print summary statistics for understanding.
print("Normalized mean (approx):", float(mean_value))
print("Normalized std (approx):", float(std_value))

# Verify normalized tensor shape matches original.
print("Normalized shape:", normalized_image.shape)




### **1.2. Spatial Resizing Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_01_02.jpg?v=1769744217" width="250">



>* Resize images to a common input size
>* Standardized sizes enable efficient tensor batching and processing

>* Resizing changes detail; downscaling loses small features
>* Choose target size based on model and task

>* Preserve aspect ratio to avoid shape distortion
>* Use crop or padding to keep composition realistic
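
One way to preserve aspect ratio is to scale the image until it fits the target and pad the remainder, often called letterboxing. The sketch below uses plain NumPy with nearest-neighbor sampling to stay short; the function names `resize_nearest` and `letterbox`, the zero fill value, and the 20x40 test image are assumptions for illustration only.

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbor resize for an (H, W, C) array."""
    h, w = image.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows[:, None], cols[None, :]]

def letterbox(image, target_h, target_w, fill=0):
    """Resize without distortion, then pad to (target_h, target_w)."""
    h, w = image.shape[:2]
    # A single scale for both axes keeps the aspect ratio intact.
    scale = min(target_h / h, target_w / w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    resized = resize_nearest(image, new_h, new_w)
    # Center the resized image on a canvas filled with the pad value.
    canvas = np.full((target_h, target_w, image.shape[2]), fill, dtype=image.dtype)
    top = (target_h - new_h) // 2
    left = (target_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

# Hypothetical wide image: 20 rows by 40 columns, all ones.
image = np.ones((20, 40, 3), dtype=np.uint8)
out = letterbox(image, 32, 32)
print("Letterboxed shape:", out.shape)
```

Padding keeps every object at its true proportions at the cost of some unused pixels; a center crop makes the opposite trade and may cut content off at the edges.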



In [None]:
#@title Python Code - Spatial Resizing Basics

# This script demonstrates basic spatial resizing concepts.
# We use TensorFlow to resize a few MNIST digit images.
# Focus on output shapes and how interpolation methods differ.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)

# Set NumPy and TensorFlow seeds for consistent behavior.
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one concise line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset with small grayscale digit images.
(x_train, y_train), _ = datasets.mnist.load_data()

# Select a small subset of images for this demo.
num_samples = 4
x_small = x_train[:num_samples]

# Convert NumPy images to float32 tensors with channel dimension.
images = tf.convert_to_tensor(x_small, dtype=tf.float32)
images = tf.expand_dims(images, axis=-1)

# Confirm original image shape is as expected.
print("Original batch shape:", images.shape)

# Define a target spatial size for resizing operation.
target_height, target_width = 56, 56

# Resize images using bilinear interpolation method.
resized_bilinear = tf.image.resize(
    images,
    size=(target_height, target_width),
    method="bilinear",
)

# Resize images using nearest neighbor interpolation.
resized_nearest = tf.image.resize(
    images,
    size=(target_height, target_width),
    method="nearest",
)

# Confirm resized shapes match the chosen target size.
print("Resized bilinear shape:", resized_bilinear.shape)
print("Resized nearest shape:", resized_nearest.shape)

# Compute simple statistics to compare interpolation effects.
orig_mean = tf.reduce_mean(images).numpy()

# Compute means for resized image batches.
mean_bilinear = tf.reduce_mean(resized_bilinear).numpy()
mean_nearest = tf.reduce_mean(resized_nearest).numpy()

# Print summary of mean pixel values for each version.
print("Original mean pixel:", round(float(orig_mean), 3))
print("Bilinear mean pixel:", round(float(mean_bilinear), 3))
print("Nearest mean pixel:", round(float(mean_nearest), 3))

# Show one example of original and resized shapes and labels.
index = 0
print("Example label:", int(y_train[index]))
print("Original shape:", images[index].shape)
print("Bilinear shape:", resized_bilinear[index].shape)
print("Nearest shape:", resized_nearest[index].shape)



### **1.3. Channel and Dtype Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_01_03.jpg?v=1769744267" width="250">



>* Channel order and dtype define data meaning
>* Wrong channel layout makes models misread inputs

>* Dtype controls value precision, storage, and speed
>* Convert ints to floats, avoid mixed-type bugs

>* Standardize channel order for all input sources
>* Choose compatible dtypes to avoid subtle errors
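
Two of the pitfalls above are easy to reproduce in NumPy. The sketch assumes a source that delivers images in BGR channel order (as some image libraries do) and a model that expects RGB in channels-first layout; the function name `to_model_input` is hypothetical. The second part shows how integer arithmetic on `uint8` pixels wraps around silently.

```python
import numpy as np

def to_model_input(image_bgr_uint8):
    """Convert an (H, W, 3) uint8 BGR image to a (3, H, W) float32 RGB array in [0, 1]."""
    if image_bgr_uint8.dtype != np.uint8:
        raise TypeError("expected a uint8 image")
    rgb = image_bgr_uint8[..., ::-1]              # reverse the channel axis: BGR -> RGB
    scaled = rgb.astype(np.float32) / 255.0       # convert before doing any arithmetic
    chw = np.transpose(scaled, (2, 0, 1))         # HWC -> CHW
    return np.ascontiguousarray(chw)

# A pure blue image stored in BGR order: channel 0 holds blue.
bgr = np.zeros((2, 3, 3), dtype=np.uint8)
bgr[..., 0] = 255
chw = to_model_input(bgr)
print("CHW shape and dtype:", chw.shape, chw.dtype)

# uint8 arithmetic wraps modulo 256, which is a classic subtle bug.
pixels = np.array([200, 100], dtype=np.uint8)
wrapped = pixels + pixels
safe = pixels.astype(np.float32) * 2
print("Wrapped:", wrapped, "Safe:", safe)
```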



In [None]:
#@title Python Code - Channel and Dtype Basics

# This script explores channels and dtypes basics.
# It uses TensorFlow to inspect simple image tensors.
# Focus on shapes, channel order, and dtype conversions.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and check version.
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Create a fake RGB image using NumPy.
height, width, channels = 4, 5, 3
np_image_uint8 = np.random.randint(
    0, 256, size=(height, width, channels), dtype=np.uint8
)

# Show basic information about the NumPy image.
print("NumPy image shape:", np_image_uint8.shape)
print("NumPy image dtype:", np_image_uint8.dtype)

# Convert NumPy image to TensorFlow tensor.
tf_image_uint8 = tf.convert_to_tensor(np_image_uint8)
print("Tensor shape (HWC):", tf_image_uint8.shape)

# Check that channels are in last dimension.
print("Channels last dimension size:", tf_image_uint8.shape[-1])

# Convert dtype from uint8 to float32.
tf_image_float = tf.cast(tf_image_uint8, tf.float32)
print("Tensor dtype after cast:", tf_image_float.dtype)

# Scale pixel values to range zero to one.
tf_image_scaled = tf_image_float / 255.0
print("Min and max after scale:",
      float(tf.reduce_min(tf_image_scaled)),
      float(tf.reduce_max(tf_image_scaled)))

# Reorder channels from HWC to CHW layout.
tf_image_chw = tf.transpose(tf_image_scaled, perm=(2, 0, 1))
print("Tensor shape (CHW):", tf_image_chw.shape)

# Validate that channel count is preserved.
assert tf_image_chw.shape[0] == channels

# Create a small grayscale image example.
gray_height, gray_width = 4, 4
np_gray_uint8 = np.random.randint(
    0, 256, size=(gray_height, gray_width), dtype=np.uint8
)

# Convert grayscale image to tensor with explicit channel.
tf_gray = tf.convert_to_tensor(np_gray_uint8, dtype=tf.uint8)
tf_gray_expanded = tf.expand_dims(tf_gray, axis=-1)

# Show grayscale shapes before and after channel expansion.
print("Gray original shape:", tf_gray.shape)
print("Gray with channel shape:", tf_gray_expanded.shape)

# Final confirmation message about channel and dtype handling.
print("Channel and dtype basics demo finished.")



## **2. Designing Safe Augmentations**

### **2.1. Safe Spatial Augmentations**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_02_01.jpg?v=1769744297" width="250">



>* Use gentle geometric changes that keep labels
>* Teach models to ignore position and orientation

>* Match spatial transforms to realistic data geometry
>* Keep magnitudes small so labels stay clearly correct

>* Task and domain define label-safe geometry
>* Randomize transforms within a carefully chosen envelope
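
The "envelope" idea can be made concrete by exposing the bounds as parameters and sampling inside them. The NumPy sketch below is an assumption-laden illustration: `random_flip_and_shift`, `max_shift`, and `flip_prob` are hypothetical names, and edge padding is just one of several reasonable fill strategies.

```python
import numpy as np

def random_flip_and_shift(image, rng, max_shift=2, flip_prob=0.5):
    """Augment an (H, W, C) image with a bounded random flip and shift."""
    # Horizontal flip, applied with probability flip_prob.
    if rng.random() < flip_prob:
        image = image[:, ::-1]
    # Sample vertical and horizontal shifts inside [-max_shift, max_shift].
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Pad with edge pixels, then crop a window offset by the sampled shift.
    pad = ((max_shift, max_shift), (max_shift, max_shift), (0, 0))
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape[:2]
    top, left = max_shift + dy, max_shift + dx
    return padded[top:top + h, left:left + w]

rng = np.random.default_rng(42)
image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
augmented = random_flip_and_shift(image, rng)
print("Augmented shape:", augmented.shape)
```

Setting `max_shift=0` and `flip_prob=0.0` turns the function into the identity, which makes it easy to switch augmentation off for validation or debugging.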



In [None]:
#@title Python Code - Safe Spatial Augmentations

# This script shows safe spatial augmentations.
# We use TensorFlow image utilities for clarity.
# Focus on flips, crops, and small rotations.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds everywhere.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Load a small subset of CIFAR10 data.
(x_train, y_train), _ = datasets.cifar10.load_data()
num_samples = 8
x_train = x_train[:num_samples]
y_train = y_train[:num_samples]

# Confirm shapes are as expected.
print("Subset shape:", x_train.shape, y_train.shape)

# Scale images to float32 values in the range [0, 1].
x_train = x_train.astype("float32") / 255.0

# Define a safe horizontal flip function.
def safe_random_flip(image):
    image = tf.image.random_flip_left_right(image)
    return image

# Define a safe small random crop function.
def safe_random_crop(image):
    image_shape = tf.shape(image)
    height = image_shape[0]
    width = image_shape[1]
    crop_height = tf.cast(0.9 * tf.cast(height, tf.float32), tf.int32)
    crop_width = tf.cast(0.9 * tf.cast(width, tf.float32), tf.int32)
    cropped = tf.image.random_crop(image, (crop_height, crop_width, 3))
    resized = tf.image.resize(cropped, (height, width))
    return resized

# Define a safe small rotation function.
def safe_small_rotation(image):
    angles = tf.random.uniform((), minval=-0.2, maxval=0.2)
    rotated = tfa_image_rotate(image, angles)
    return rotated

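# Rotate a single image by wrapping it in a batch of one.
def tfa_image_rotate(image, angle):
    image = tf.expand_dims(image, axis=0)
    rotated = tfa_rotate_batch(image, angle)
    rotated = tf.squeeze(rotated, axis=0)
    return rotated

# Rotate a batch about the image center using affine transform math.
def tfa_rotate_batch(images, angle):
    cos_a = tf.math.cos(angle)
    sin_a = tf.math.sin(angle)
    # Offsets keep the image center fixed under the rotation.
    # Without them the image would pivot around its top-left corner.
    max_y = tf.cast(tf.shape(images)[1] - 1, tf.float32)
    max_x = tf.cast(tf.shape(images)[2] - 1, tf.float32)
    x_offset = (max_x - (cos_a * max_x - sin_a * max_y)) / 2.0
    y_offset = (max_y - (sin_a * max_x + cos_a * max_y)) / 2.0
    transform = [cos_a, -sin_a, x_offset, sin_a, cos_a, y_offset, 0.0, 0.0]
    transform = tf.convert_to_tensor(transform, dtype=tf.float32)
    transform = tf.reshape(transform, (1, 8))
    transforms = tf.repeat(transform, tf.shape(images)[0], axis=0)
    rotated = tf.raw_ops.ImageProjectiveTransformV3(
        images=images,
        transforms=transforms,
        output_shape=tf.shape(images)[1:3],
        interpolation="BILINEAR",
        fill_mode="REFLECT",
        fill_value=0.0,
    )
    return rotated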

# Compose safe spatial augmentations together.
def apply_safe_spatial_augment(image):
    image = safe_random_flip(image)
    image = safe_random_crop(image)
    image = safe_small_rotation(image)
    return image

# Take one example image and label.
example_image = x_train[0]
example_label = int(y_train[0][0])
print("Original label:", example_label)

# Apply augmentations several times deterministically.
augmented_images = []
for i in range(3):
    tf.random.set_seed(seed_value + i)
    aug = apply_safe_spatial_augment(example_image)
    augmented_images.append(aug)

# Stack augmented images for plotting.
augmented_stack = tf.stack(augmented_images, axis=0)
print("Augmented batch shape:", augmented_stack.shape)

# Plot original and augmented images together.
import matplotlib.pyplot as plt

# Create a simple figure with subplots.
fig, axes = plt.subplots(1, 4, figsize=(8, 3))
axes[0].imshow(example_image)
axes[0].set_title("Original")
axes[0].axis("off")

# Show each augmented image.
for idx in range(3):
    axes[idx + 1].imshow(augmented_stack[idx])
    axes[idx + 1].set_title("Aug" + str(idx + 1))
    axes[idx + 1].axis("off")

# Tighten layout and display plot.
plt.tight_layout()
plt.show()



### **2.2. Color Jitter and Noise**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_02_02.jpg?v=1769744395" width="250">



>* Color jitter simulates realistic lighting and camera changes
>* Keep ranges moderate to avoid changing image meaning

>* Match jitter ranges to realistic domain conditions
>* Visually check samples; avoid label ambiguity

>* Noise simulates real sensor imperfections and artifacts
>* Increase noise slowly so labels stay clear
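
A compact NumPy version of bounded color jitter plus noise might look like the sketch below. The function name `jitter_and_noise` and its default ranges are assumptions chosen for illustration, not recommended values; suitable ranges depend on the domain, as noted above.

```python
import numpy as np

def jitter_and_noise(image, rng, max_brightness=0.1, max_contrast=0.2, noise_std=0.02):
    """Apply bounded brightness/contrast jitter and Gaussian noise to a [0, 1] image."""
    # Brightness: add one random offset to every pixel.
    delta = rng.uniform(-max_brightness, max_brightness)
    # Contrast: stretch or squeeze pixel values around the image mean.
    factor = rng.uniform(1.0 - max_contrast, 1.0 + max_contrast)
    mean = image.mean()
    out = (image - mean) * factor + mean + delta
    # Noise: independent Gaussian perturbation per pixel.
    out = out + rng.normal(0.0, noise_std, size=image.shape)
    # Clip so the result stays a valid image.
    return np.clip(out, 0.0, 1.0).astype(np.float32)

rng = np.random.default_rng(7)
image = rng.random((8, 8, 3)).astype(np.float32)
augmented = jitter_and_noise(image, rng)
print("Augmented range:", float(augmented.min()), float(augmented.max()))
```

Because every magnitude is an explicit argument, noise can be increased gradually across experiments while samples are inspected visually at each step.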



In [None]:
#@title Python Code - Color Jitter and Noise

# This script shows color jitter and noise.
# It uses TensorFlow image preprocessing utilities.
# Focus on safe augmentations that keep labels.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import layers

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Set TensorFlow random seed.
tf.random.set_seed(seed_value)

# Print TensorFlow version once.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset from Keras.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

# Select a small subset for demonstration.
num_samples = 8
x_small = x_train[:num_samples]

# Expand grayscale images to three channels.
x_small_rgb = np.repeat(x_small[..., np.newaxis], 3, axis=3)

# Normalize pixel values to range [0,1].
x_small_rgb = x_small_rgb.astype("float32") / 255.0

# Confirm shapes are as expected.
assert x_small_rgb.shape[0] == num_samples

# Create a simple color jitter layer.
color_jitter_layer = tf.keras.Sequential(
    [
        layers.RandomBrightness(
            factor=0.2,
            value_range=(0.0, 1.0),
            seed=seed_value,
        ),
        layers.RandomContrast(
            factor=0.2,
            seed=seed_value,
        ),
    ]
)

# Define a function to add gentle Gaussian noise.
def add_gaussian_noise(images, stddev=0.05):
    noise = tf.random.normal(
        shape=tf.shape(images),
        mean=0.0,
        stddev=stddev,
        seed=seed_value,
    )
    noisy = images + noise
    noisy = tf.clip_by_value(noisy, 0.0, 1.0)
    return noisy

# Take a small batch as TensorFlow tensor.
images_batch = tf.convert_to_tensor(x_small_rgb[:4])

# Apply color jitter to the batch.
# Passing training=True makes sure the random layers are active.
images_jittered = color_jitter_layer(images_batch, training=True)

# Apply Gaussian noise to the jittered batch.
images_noisy = add_gaussian_noise(images_jittered, stddev=0.05)

# Verify shapes remain unchanged.
assert images_batch.shape == images_jittered.shape

# Import plotting library for visualization.
import matplotlib.pyplot as plt

# Create a figure with three rows of images.
fig, axes = plt.subplots(
    nrows=3,
    ncols=4,
    figsize=(8, 6),
)

# Plot original images in the first row.
for idx in range(4):
    axes[0, idx].imshow(images_batch[idx].numpy())
    axes[0, idx].axis("off")

# Plot color jittered images in the second row.
for idx in range(4):
    axes[1, idx].imshow(images_jittered[idx].numpy())
    axes[1, idx].axis("off")

# Plot jittered plus noisy images in third row.
for idx in range(4):
    axes[2, idx].imshow(images_noisy[idx].numpy())
    axes[2, idx].axis("off")

# Add a compact title explaining each row.
fig.suptitle(
    "Top: original, middle: color jitter, bottom: jitter + noise",
    fontsize=10,
)

# Adjust layout and display the plot.
plt.tight_layout()
plt.show()




### **2.3. Label Safe Augmentations**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_02_03.jpg?v=1769744459" width="250">



>* Augment data while keeping the original label
>* If humans might relabel, augmentation is unsafe

>* Label safety depends entirely on the task
>* Identify which cues labels rely on before augmenting

>* Start with conservative, domain-informed augmentation settings
>* Iteratively adjust, inspect outputs, and consult experts
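
The dependence of label safety on the task can be shown with a toy problem where the true label is computable. In the sketch below the task is hypothetical ("which half of the image is brighter?"), and `true_label` plays the role of a human annotator; `is_label_safe` is an illustrative helper, not a library function.

```python
import numpy as np

def true_label(image):
    """Hypothetical task: report which half of the image is brighter."""
    half = image.shape[1] // 2
    left_sum = image[:, :half].sum()
    right_sum = image[:, half:].sum()
    return "left" if left_sum > right_sum else "right"

def is_label_safe(transform, images):
    """True when the transform never changes the ground-truth label."""
    return all(true_label(transform(img)) == true_label(img) for img in images)

# Two toy images: one bright on the left, one bright on the right.
left = np.zeros((4, 4))
left[:, :2] = 1.0
right = left[:, ::-1].copy()
images = [left, right]

def horizontal_flip(img):
    return img[:, ::-1]

def darken(img):
    return img * 0.8

flip_safe = is_label_safe(horizontal_flip, images)
darken_safe = is_label_safe(darken, images)
print("Flip is label-safe:", flip_safe)
print("Darkening is label-safe:", darken_safe)
```

For this task the flip destroys exactly the cue the label relies on, while a brightness change leaves it intact. For a task such as object recognition in natural photos the verdict on flipping would typically be the opposite.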



In [None]:
#@title Python Code - Label Safe Augmentations

# This script demonstrates label safe augmentations.
# We use TensorFlow image utilities for simple transforms.
# Focus on changes that preserve classification labels.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds everywhere.
SEED = 42
random.seed(SEED)
np.random.seed(SEED)

# Set TensorFlow random seed for reproducibility.
tf.random.set_seed(SEED)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset using Keras helper.
(x_train, y_train), _ = datasets.mnist.load_data()

# Select a tiny subset for quick demonstration.
num_samples = 4
x_small = x_train[:num_samples]
y_small = y_train[:num_samples]

# Validate shapes before further processing.
print("Subset shape:", x_small.shape)

# Expand grayscale images to have channel dimension.
x_small = np.expand_dims(x_small, axis=-1)

# Convert numpy arrays to float32 tensors.
x_small = tf.convert_to_tensor(x_small, dtype=tf.float32)
y_small = tf.convert_to_tensor(y_small, dtype=tf.int32)

# Normalize pixel values to range zero one.
x_small = x_small / 255.0

# Define a label safe augmentation function.
def safe_augment(image: tf.Tensor) -> tf.Tensor:
    # Apply small random rotation preserving digit identity.
    angle_rad = tf.random.uniform((), -0.15, 0.15)
    # Rotate image using bilinear interpolation.
    rotated = tfa_image_rotate(image, angle_rad)
    # Apply mild random brightness jitter.
    jitter = tf.image.random_brightness(rotated, 0.1)
    # Clip values to valid normalized range.
    clipped = tf.clip_by_value(jitter, 0.0, 1.0)
    # Return augmented image tensor.
    return clipped


# Implement rotation using TensorFlow raw ops.
def tfa_image_rotate(image: tf.Tensor, angle: tf.Tensor) -> tf.Tensor:
    # Compute rotation matrix elements from angle.
    cos_a = tf.math.cos(angle)
    sin_a = tf.math.sin(angle)

    # Compute offsets that keep the image center fixed.
    # Without them the image would pivot around its top-left corner.
    max_y = tf.cast(tf.shape(image)[0] - 1, tf.float32)
    max_x = tf.cast(tf.shape(image)[1] - 1, tf.float32)
    x_offset = (max_x - (cos_a * max_x - sin_a * max_y)) / 2.0
    y_offset = (max_y - (sin_a * max_x + cos_a * max_y)) / 2.0

    # Build the 8-parameter projective transform for rotation.
    transform = tf.stack([
        cos_a,
        -sin_a,
        x_offset,
        sin_a,
        cos_a,
        y_offset,
        0.0,
        0.0,
    ])

    # Reshape transform to one row per image in the batch.
    transform = tf.reshape(transform, (1, 8))

    # Add batch dimension to single image.
    image_batched = tf.expand_dims(image, axis=0)

    # Apply projective transform to image.
    rotated = tf.raw_ops.ImageProjectiveTransformV3(
        images=image_batched,
        transforms=transform,
        output_shape=tf.shape(image_batched)[1:3],
        interpolation="BILINEAR",
        fill_mode="REFLECT",
        fill_value=0.0,
    )

    # Remove batch dimension before returning.
    return tf.squeeze(rotated, axis=0)

# Apply safe augmentations to each sample.
augmented_images = []
for i in range(num_samples):
    # Augment each image independently.
    aug_img = safe_augment(x_small[i])

    # Collect augmented image tensors.
    augmented_images.append(aug_img)

# Stack augmented images into single tensor.
augmented_batch = tf.stack(augmented_images, axis=0)

# Confirm shapes of original and augmented.
print("Original batch shape:", x_small.shape)

# Print augmented batch shape for comparison.
print("Augmented batch shape:", augmented_batch.shape)

# Convert one pair to numpy for inspection.
original_example = x_small[0].numpy()
augmented_example = augmented_batch[0].numpy()

# Print small summaries instead of full arrays.
print("Original example min max:", original_example.min(), original_example.max())

# Show augmented example statistics for comparison.
print("Augmented example min max:", augmented_example.min(), augmented_example.max())

# Print labels to emphasize label safety.
print("Original labels:", y_small.numpy())



## **3. Integrating Data Transforms**

### **3.1. Using Transforms Inside Datasets**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_03_01.jpg?v=1769744553" width="250">



>* Dataset automatically transforms samples into model-ready format
>* Keeps training loop simple, consistent, and reproducible

>* Centralized transforms prevent inconsistent preprocessing bugs
>* Every split shares the same deterministic preprocessing

>* Random augmentation is added only to the training pipeline
>* Keeps code clean and data prep centralized
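
The same pattern can be sketched without any framework by following the map-style protocol (`__len__` and `__getitem__`) that PyTorch-style datasets use. Everything here is an illustrative assumption: the class `ArrayDataset`, the helpers `preprocess` and `make_train_transform`, and the random uint8 data.

```python
import numpy as np

class ArrayDataset:
    """Map-style dataset that applies a transform each time a sample is read."""

    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        if self.transform is not None:
            image = self.transform(image)
        # The label is returned untouched by any image transform.
        return image, self.labels[index]

def preprocess(image):
    """Deterministic step shared by every split: uint8 -> float32 in [0, 1]."""
    return image.astype(np.float32) / 255.0

def make_train_transform(rng):
    """Shared preprocessing followed by a random horizontal flip."""
    def transform(image):
        image = preprocess(image)
        if rng.random() < 0.5:
            image = image[:, ::-1]
        return image
    return transform

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(6, 8, 8, 3), dtype=np.uint8)
labels = np.arange(6) % 3

# Training adds augmentation; validation uses preprocessing only.
train_ds = ArrayDataset(images, labels, transform=make_train_transform(rng))
val_ds = ArrayDataset(images, labels, transform=preprocess)

train_image, train_label = train_ds[0]
val_image, val_label = val_ds[0]
print("Train sample:", train_image.shape, train_label)
print("Val sample:", val_image.shape, val_label)
```

Because the transform is an argument, the training loop never needs to know which split it is reading; the choice is made once, where the datasets are constructed.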



In [None]:
#@title Python Code - Using Transforms Inside Datasets

# This script shows transforms inside datasets.
# We use TensorFlow image dataset utilities.
# Focus is integrating preprocessing into pipelines.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import layers

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Set TensorFlow random seed deterministically.
tf.random.set_seed(seed_value)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Define a small image size for speed.
img_height = 32
img_width = 32

# Create a tiny synthetic image dataset.
num_samples = 12
num_classes = 3

# Generate random images and integer labels.
images = tf.random.uniform(
    shape=(num_samples, img_height, img_width, 3),
    minval=0.0,
    maxval=1.0,
)

# Create simple repeating labels.
labels = tf.range(num_samples) % num_classes

# Validate shapes before building dataset.
assert images.shape[0] == labels.shape[0]

# Build a base tf.data Dataset from tensors.
base_ds = tf.data.Dataset.from_tensor_slices((images, labels))

# Define a preprocessing transform function.
def preprocess_example(image, label):
    # Resize image to target size.
    image = tf.image.resize(image, (img_height, img_width))
    # Rescale pixel values from [0, 1] to [-1, 1].
    image = (image - 0.5) * 2.0
    # Return transformed image and unchanged label.
    return image, label


# Define a training augmentation transform function.
def augment_example(image, label):
    # Apply random horizontal flip.
    image = tf.image.random_flip_left_right(image, seed=seed_value)
    # Apply small random brightness change.
    image = tf.image.random_brightness(image, max_delta=0.1)
    # Return augmented image and original label.
    return image, label

# Create dataset with only preprocessing transforms.
val_ds = base_ds.map(preprocess_example, num_parallel_calls=1)

# Create dataset with preprocessing and augmentation.
train_ds = base_ds.map(preprocess_example, num_parallel_calls=1)
train_ds = train_ds.map(augment_example, num_parallel_calls=1)

# Batch both datasets for efficient loading.
train_ds = train_ds.batch(4)
val_ds = val_ds.batch(4)

# Take one batch from each dataset.
train_batch_images, train_batch_labels = next(iter(train_ds))
val_batch_images, val_batch_labels = next(iter(val_ds))

# Print basic information about the batches.
print("Train batch shape:", train_batch_images.shape)
print("Train labels:", train_batch_labels.numpy())
print("Validation batch shape:", val_batch_images.shape)
print("Validation labels:", val_batch_labels.numpy())

# Show that transforms keep labels unchanged.
print("First train label, first val label:",
      int(train_batch_labels[0]), int(val_batch_labels[0]))



### **3.2. Compose Versus Custom Calls**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_03_02.jpg?v=1769744609" width="250">



>* Compose groups transforms into one ordered pipeline
>* Improves consistency, reuse, and understanding of preprocessing

>* Custom transform calls give flexible, dynamic control
>* But scattered logic hurts debugging, reproducibility, and fair evaluation

>* Choose structure or flexibility based on project stability
>* Prototype with custom logic, then formalize composed pipelines
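
The Compose idea itself fits in a few lines of plain Python. The class below is a hypothetical stand-in written for illustration, loosely modeled on the pattern that libraries such as torchvision provide; the two scalar transforms `scale_to_unit` and `center` are assumptions that keep the example tiny.

```python
class Compose:
    """Apply a fixed, ordered list of transforms to a value."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, value):
        # Each transform receives the output of the previous one.
        for transform in self.transforms:
            value = transform(value)
        return value

    def __repr__(self):
        names = ", ".join(
            getattr(t, "__name__", type(t).__name__) for t in self.transforms
        )
        return f"Compose({names})"

def scale_to_unit(pixel):
    """Map a value in [0, 255] to [0, 1]."""
    return pixel / 255.0

def center(pixel):
    """Map a value in [0, 1] to [-1, 1]."""
    return (pixel - 0.5) / 0.5

pipeline = Compose([scale_to_unit, center])
result = pipeline(255)

# Reversing the order gives a different answer, so order is part of the pipeline.
reversed_result = Compose([center, scale_to_unit])(255)

print(pipeline, "->", result)
print("Reversed order ->", reversed_result)
```

Since the pipeline is a single object with a readable `repr`, it can be logged alongside each run, which is harder to do when the same steps are scattered through conditional code.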



In [None]:
#@title Python Code - Compose Versus Custom Calls

# This script compares composed and custom transforms.
# It uses TensorFlow image preprocessing utilities.
# Focus on clarity rather than training performance.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import layers

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load a tiny subset of MNIST digits.
(mnist_x_train, mnist_y_train), _ = tf.keras.datasets.mnist.load_data()

# Select a small batch for this demo.
small_images = mnist_x_train[:8]
small_labels = mnist_y_train[:8]

# Validate shapes before further processing.
print("Small batch shape:", small_images.shape)

# Expand grayscale images to have channel dimension.
small_images = small_images[..., np.newaxis]

# Define a composed preprocessing pipeline function.
def composed_preprocess(image):
    image = tf.image.resize(image, (32, 32))
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.image.random_flip_left_right(image, seed=seed_value)
    return image


# Define a custom manual preprocessing function.
def custom_preprocess(image, index):
    image = tf.cast(image, tf.float32) / 255.0
    if index % 2 == 0:
        image = tf.image.resize(image, (32, 32))
    else:
        image = tf.image.resize(image, (28, 28))
        image = tf.image.resize_with_pad(image, 32, 32)
    if index % 3 == 0:
        image = tf.image.random_flip_left_right(image, seed=seed_value)
    return image

# Build a Dataset using the composed pipeline.
composed_ds = tf.data.Dataset.from_tensor_slices((small_images, small_labels))
composed_ds = composed_ds.map(
    lambda img, lbl: (composed_preprocess(img), lbl),
    num_parallel_calls=tf.data.AUTOTUNE,
)

# Build a Dataset using the custom pipeline.
index_ds = tf.data.Dataset.range(len(small_images))
custom_ds = tf.data.Dataset.from_tensor_slices((small_images, small_labels))
custom_ds = tf.data.Dataset.zip((custom_ds, index_ds))
custom_ds = custom_ds.map(
    lambda pair, idx: (custom_preprocess(pair[0], idx), pair[1]),
    num_parallel_calls=tf.data.AUTOTUNE,
)

# Take one batch from each dataset for inspection.
composed_batch = next(iter(composed_ds.batch(4)))
custom_batch = next(iter(custom_ds.batch(4)))

# Unpack images and labels from batches.
composed_images, composed_labels = composed_batch
custom_images, custom_labels = custom_batch

# Print summary statistics for composed pipeline.
print("Composed batch shape:", composed_images.shape)
print("Composed labels:", composed_labels.numpy())

# Print summary statistics for custom pipeline.
print("Custom batch shape:", custom_images.shape)
print("Custom labels:", custom_labels.numpy())

# Show simple mean pixel values for quick comparison.
print("Composed mean pixel:", float(tf.reduce_mean(composed_images)))
print("Custom mean pixel:", float(tf.reduce_mean(custom_images)))

# Final line confirms script finished successfully.
print("Comparison of composed versus custom transforms done.")



### **3.3. Transform Effects on Metrics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_03_03.jpg?v=1769744676" width="250">



>* Transforms change inputs and shift training metrics
>* Stronger augmentations may lower training scores yet improve generalization

>* Use strong, random transforms only during training
>* Keep validation transforms simple, stable, and consistent

>* Iteratively tweak transforms and watch learning curves
>* Log settings, compare runs, choose effective pipelines
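
Logging settings next to metrics can be as simple as one dictionary per run. The sketch below assumes Keras-style history dictionaries with `accuracy` and `val_accuracy` keys; the helper names `summarize_run` and `best_run` are hypothetical, and the numbers are placeholders invented for the example, not measured results.

```python
def summarize_run(name, config, history):
    """Collect final-epoch accuracy and the train/validation gap for one run."""
    train_acc = history["accuracy"][-1]
    val_acc = history["val_accuracy"][-1]
    return {
        "name": name,
        "config": dict(config),
        "train_acc": train_acc,
        "val_acc": val_acc,
        "gap": train_acc - val_acc,
    }

def best_run(runs):
    """Pick the run with the highest final validation accuracy."""
    return max(runs, key=lambda run: run["val_acc"])

# Placeholder histories for illustration only; these are not real results.
history_plain = {"accuracy": [0.80, 0.90], "val_accuracy": [0.70, 0.75]}
history_aug = {"accuracy": [0.70, 0.85], "val_accuracy": [0.72, 0.80]}

runs = [
    summarize_run("no_aug", {"flip": False, "max_delta": 0.0}, history_plain),
    summarize_run("aug", {"flip": True, "max_delta": 0.1}, history_aug),
]
winner = best_run(runs)

for run in runs:
    print(run["name"], run["config"], "gap:", round(run["gap"], 3))
print("Selected pipeline:", winner["name"])
```

Selecting on validation accuracy rather than training accuracy matters here because augmentation is expected to make the training task harder while the validation data stays unchanged.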



In [None]:
#@title Python Code - Transform Effects on Metrics

# This script shows how transforms affect metrics.
# We compare training with and without simple augmentations.
# Focus on clear metrics using a tiny MNIST subset.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import tensorflow and keras utilities.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print tensorflow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset from keras datasets.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Reduce dataset size for quick demonstration.
train_samples = 4000
test_samples = 1000
x_train = x_train[:train_samples]
y_train = y_train[:train_samples]

# Slice test data for faster evaluation.
x_test = x_test[:test_samples]
y_test = y_test[:test_samples]

# Validate shapes before building datasets.
print("Train shape:", x_train.shape, y_train.shape)
print("Test shape:", x_test.shape, y_test.shape)

# Normalize images to range zero one.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension for convolution layers.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Create base tf dataset without augmentation.
base_train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train)
)

# Shuffle and batch the base training dataset.
base_train_ds = base_train_ds.shuffle(
    buffer_size=train_samples,
    seed=seed_value
).batch(64)

# Create validation dataset with deterministic preprocessing.
val_ds = tf.data.Dataset.from_tensor_slices(
    (x_test, y_test)
).batch(64)

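# Define simple augmentation function for training.
# Note: a horizontal flip is not label-safe for digits (a mirrored 2 is
# no longer a valid 2), so it may lower accuracy in this demo.
def augment_images(images, labels):
    images = tf.image.random_flip_left_right(
        images,
        seed=seed_value
    )
    images = tf.image.random_brightness(
        images,
        max_delta=0.1
    )
    return images, labels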

# Create augmented training dataset using map.
aug_train_ds = base_train_ds.map(
    augment_images,
    num_parallel_calls=tf.data.AUTOTUNE
)

# Prefetch for better pipeline performance.
base_train_ds = base_train_ds.prefetch(tf.data.AUTOTUNE)
aug_train_ds = aug_train_ds.prefetch(tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(tf.data.AUTOTUNE)

# Define a small convolutional model builder.
def build_model():
    model = keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Train model without augmentation for few epochs.
model_no_aug = build_model()
history_no_aug = model_no_aug.fit(
    base_train_ds,
    epochs=3,
    validation_data=val_ds,
    verbose=0
)

# Train model with augmentation using same settings.
model_aug = build_model()
history_aug = model_aug.fit(
    aug_train_ds,
    epochs=3,
    validation_data=val_ds,
    verbose=0
)

# Collect the per-epoch metric histories for both runs.
final_no_aug = history_no_aug.history
final_aug = history_aug.history

# Print concise comparison of training and validation metrics.
print("No aug - train loss, acc:",
      round(final_no_aug["loss"][-1], 4),
      round(final_no_aug["accuracy"][-1], 4))
print("No aug - val loss, acc:",
      round(final_no_aug["val_loss"][-1], 4),
      round(final_no_aug["val_accuracy"][-1], 4))
print("Aug    - train loss, acc:",
      round(final_aug["loss"][-1], 4),
      round(final_aug["accuracy"][-1], 4))
print("Aug    - val loss, acc:",
      round(final_aug["val_loss"][-1], 4),
      round(final_aug["val_accuracy"][-1], 4))




# <font color="#418FDE" size="6.5" uppercase>**Transforms and Augment**</font>


In this lecture, you learned to:
- Apply common preprocessing transforms such as normalization, resizing, and tensor conversion to raw data. 
- Design and configure data augmentation pipelines that improve model robustness without corrupting labels. 
- Integrate transforms into Dataset and DataLoader workflows while monitoring their impact on training. 

In the next Module (Module 5), we will go over 'Computer Vision Models'.