# <font color="#418FDE" size="6.5" uppercase>**Transforms and Augment**</font>

>Last update: 20260129.
    
By the end of this lecture, you will be able to:
- Apply common preprocessing transforms such as normalization, resizing, and tensor conversion to raw data. 
- Design and configure data augmentation pipelines that improve model robustness without corrupting labels. 
- Integrate transforms into Dataset and DataLoader workflows while monitoring their impact on training. 


## **1. Core Preprocessing Transforms**

### **1.1. Tensor Conversion and Normalization**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_01_01.jpg?v=1769673622" width="250">



>* Convert raw images, text, and tables into tensors
>* Normalize tensor values for stable, efficient training

>* Convert varied images to float tensors, rescaled
>* Normalize each color channel using mean and std

>* Tensor conversion and normalization help across modalities
>* They stabilize training by controlling feature scales
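
The conversion and normalization steps above can be sketched without any framework. This is a minimal NumPy sketch under stated assumptions, not the lecture's own implementation: the random `train_uint8` and `new_uint8` arrays are hypothetical stand-ins for real images, and `to_float` and `normalize` are illustrative helper names. It computes per-channel statistics on the training data only and reuses them for new data, which is a common way to keep later inputs on the same scale.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: uint8 RGB images, channels-last (N, H, W, C).
train_uint8 = rng.integers(0, 256, size=(8, 4, 4, 3), dtype=np.uint8)
new_uint8 = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)


def to_float(images_uint8):
    """Convert uint8 pixels in 0-255 to float32 values in 0-1."""
    return images_uint8.astype(np.float32) / 255.0


def normalize(images, mean, std, eps=1e-7):
    """Subtract the per-channel mean and divide by the per-channel std."""
    return (images - mean) / np.maximum(std, eps)


train = to_float(train_uint8)

# Statistics come from the training data only, one value per channel.
channel_mean = train.mean(axis=(0, 1, 2))
channel_std = train.std(axis=(0, 1, 2))

train_norm = normalize(train, channel_mean, channel_std)
# New data reuses the training statistics instead of recomputing them.
new_norm = normalize(to_float(new_uint8), channel_mean, channel_std)

print("Train mean per channel:", train_norm.mean(axis=(0, 1, 2)))
print("Train std per channel:", train_norm.std(axis=(0, 1, 2)))
```

After normalization the training channels should have a mean near 0 and a standard deviation near 1, while the new data will only be approximately so because it reuses the training statistics.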



In [None]:
#@title Python Code - Tensor Conversion and Normalization

# This script shows tensor conversion basics.
# It uses TensorFlow for simple image preprocessing.
# Focus on conversion, scaling, and normalization steps.

# Install TensorFlow only if it is missing from your environment.
# !pip install tensorflow==2.20.0 --quiet

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic seeds for reproducibility.
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset with small grayscale images.
(x_train, y_train), _ = datasets.mnist.load_data()

# Select a tiny subset for this demonstration.
num_samples = 4
x_small = x_train[:num_samples]
y_small = y_train[:num_samples]

# Confirm shapes before further processing.
print("Subset shape:", x_small.shape)

# Convert uint8 images to float32 tensors.
x_tensor = tf.convert_to_tensor(x_small, dtype=tf.float32)

# Add channel dimension for grayscale images.
x_tensor = tf.expand_dims(x_tensor, axis=-1)

# Validate tensor shape after expansion.
print("Tensor shape:", x_tensor.shape)

# Scale pixel values from 0-255 to 0-1.
x_scaled = x_tensor / 255.0

# Compute per channel mean and standard deviation.
channel_mean = tf.reduce_mean(x_scaled, axis=(0, 1, 2))
channel_std = tf.math.reduce_std(x_scaled, axis=(0, 1, 2))

# Avoid division by zero using small epsilon.
epsilon = 1e-7
safe_std = tf.maximum(channel_std, epsilon)

# Apply channel wise normalization transform.
x_normalized = (x_scaled - channel_mean) / safe_std

# Compute statistics before and after normalization.
mean_before = tf.reduce_mean(x_scaled).numpy()
std_before = tf.math.reduce_std(x_scaled).numpy()
mean_after = tf.reduce_mean(x_normalized).numpy()
std_after = tf.math.reduce_std(x_normalized).numpy()

# Print summary of scaling and normalization.
print("Mean before:", round(float(mean_before), 4))
print("Std before:", round(float(std_before), 4))
print("Mean after:", round(float(mean_after), 4))
print("Std after:", round(float(std_after), 4))

# Show one original and normalized pixel range.
idx = 0
orig_min = float(tf.reduce_min(x_scaled[idx]))
orig_max = float(tf.reduce_max(x_scaled[idx]))
norm_min = float(tf.reduce_min(x_normalized[idx]))
norm_max = float(tf.reduce_max(x_normalized[idx]))

# Print pixel ranges to illustrate transformations.
print("Scaled range sample:", round(orig_min, 3), round(orig_max, 3))
print("Norm range sample:", round(norm_min, 3), round(norm_max, 3))




### **1.2. Spatial Resizing Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_01_02.jpg?v=1769673699" width="250">



>* Real images come in many different sizes
>* Resizing standardizes dimensions so models process efficiently

>* Direct scaling can distort image aspect ratios
>* Preserve aspect ratio using resize, crop, or pad

>* Interpolation computes new pixel values when resizing
>* Choose methods that preserve task-relevant visual details
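
The "resize, then pad" option above can be sketched as a letterbox operation. The following is a rough NumPy sketch, assuming a channels-last image and a square target; `resize_nearest` and `letterbox` are hypothetical helper names, and nearest-neighbor sampling is used only to keep the example short, not as a recommendation.

```python
import numpy as np


def resize_nearest(image, new_h, new_w):
    """Resize an (H, W, C) image by picking the nearest source pixel."""
    h, w = image.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows[:, None], cols]


def letterbox(image, target, fill=0.0):
    """Scale to fit inside target x target, then pad to keep the aspect ratio."""
    h, w = image.shape[:2]
    # One scale for both axes, so the content is never stretched.
    scale = min(target / h, target / w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    resized = resize_nearest(image, new_h, new_w)
    # Center the resized content on a constant-filled canvas.
    canvas = np.full((target, target) + image.shape[2:], fill, dtype=image.dtype)
    top = (target - new_h) // 2
    left = (target - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, (new_h, new_w)


# Hypothetical 40x60 image of ones, so padding (zeros) is easy to spot.
image = np.ones((40, 60, 3), dtype=np.float32)
boxed, content_size = letterbox(image, target=64)

print("Output shape:", boxed.shape)
print("Content size inside the canvas:", content_size)
```

Padding keeps every pixel of the original at the cost of some empty border, whereas cropping keeps the canvas full but discards content. Which trade-off is acceptable depends on the task.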



In [None]:
#@title Python Code - Spatial Resizing Basics

# This script shows basic spatial resizing concepts.
# We use TensorFlow to resize small example images.
# Focus on shapes and visual intuition, not training.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import layers

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Configure TensorFlow global random seed deterministically.
tf.random.set_seed(seed_value)

# Print TensorFlow version in one concise line.
print("TensorFlow version:", tf.__version__)

# Create a tiny batch of random RGB images.
num_images = 3
height, width, channels = 40, 60, 3

# Generate random pixel values between zero and one.
images_np = np.random.rand(num_images, height, width, channels)

# Convert NumPy images to TensorFlow tensor.
images_tf = tf.convert_to_tensor(images_np, dtype=tf.float32)

# Show original batch shape before resizing.
print("Original batch shape:", images_tf.shape)

# Define a simple resizing layer to fixed size.
resize_layer = layers.Resizing(height=64, width=64)

# Apply direct scaling resize to the batch.
resized_direct = resize_layer(images_tf)

# Show new shape after direct scaling resize.
print("Direct scaled shape:", resized_direct.shape)

# Define a resizing layer that crops to the target aspect ratio, avoiding distortion.
resize_keep = layers.Resizing(height=64, width=64, crop_to_aspect_ratio=True)

# Apply aspect ratio preserving resize to batch.
resized_keep = resize_keep(images_tf)

# Show shape after aspect ratio preserving resize.
print("Aspect ratio shape:", resized_keep.shape)

# Define a nearest neighbor interpolation resizing layer.
resize_nearest = layers.Resizing(
    height=64,
    width=64,
    interpolation="nearest",
)

# Apply nearest neighbor interpolation to batch.
resized_nearest = resize_nearest(images_tf)

# Show shape after nearest neighbor interpolation.
print("Nearest interpolation shape:", resized_nearest.shape)

# Define a bilinear interpolation resizing layer.
resize_bilinear = layers.Resizing(
    height=64,
    width=64,
    interpolation="bilinear",
)

# Apply bilinear interpolation to the same batch.
resized_bilinear = resize_bilinear(images_tf)

# Show shape after bilinear interpolation resize.
print("Bilinear interpolation shape:", resized_bilinear.shape)

# Compute mean absolute difference between methods.
diff_direct_keep = tf.reduce_mean(
    tf.abs(resized_direct - resized_keep)
)

# Compute difference between nearest and bilinear outputs.
diff_near_bilin = tf.reduce_mean(
    tf.abs(resized_nearest - resized_bilinear)
)

# Print concise numeric comparison of resize strategies.
print("Mean difference direct vs keep:", float(diff_direct_keep))

# Print difference between interpolation strategies for intuition.
print("Mean difference nearest vs bilinear:", float(diff_near_bilin))




### **1.3. Channel and Dtype Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_01_03.jpg?v=1769673773" width="250">



>* Know channel count, meaning, and ordering first
>* Mismatch in channels breaks preprocessing across domains

>* Dtype controls value range, storage, and math
>* Consistent float dtypes prevent subtle training bugs

>* Standardize channels and dtypes early in pipelines
>* This prevents inconsistent transforms and unstable training
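
Two of the pitfalls above, channel ordering and integer dtypes, can be reproduced in a few lines. This is an illustrative NumPy sketch with a hypothetical all-red test image; it shows a layout change from channels-last to channels-first and how `uint8` arithmetic silently wraps around unless the data is cast to float first.

```python
import numpy as np

# Hypothetical image: channels-last (H, W, C), uint8, red channel set to 200.
image_hwc = np.zeros((32, 32, 3), dtype=np.uint8)
image_hwc[..., 0] = 200

# Reorder to channels-first (C, H, W), the layout some frameworks expect.
image_chw = np.transpose(image_hwc, (2, 0, 1))

# Arithmetic in uint8 wraps around modulo 256 instead of saturating.
wrapped = image_hwc[..., 0] + np.uint8(100)

# Casting to float32 first keeps the true value.
exact = image_hwc[..., 0].astype(np.float32) + 100.0

print("Layouts:", image_hwc.shape, image_chw.shape)
print("uint8 result:", wrapped[0, 0], "float32 result:", exact[0, 0])
```

Casting to a float dtype before any brightness or normalization arithmetic avoids this class of silent error.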



In [None]:
#@title Python Code - Channel and Dtype Basics

# This script explains channels and dtypes simply.
# We use TensorFlow to inspect small image tensors.
# Focus is on shapes, channels, and type conversions.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds everywhere.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Load a small subset of CIFAR10 images.
(x_train, y_train), _ = datasets.cifar10.load_data()

# Select first four images for quick demo.
num_samples = 4
x_small = x_train[:num_samples]
y_small = y_train[:num_samples]

# Confirm original shape and dtype information.
print("Original shape:", x_small.shape)
print("Original dtype:", x_small.dtype)

# Show one label to confirm dataset loaded.
print("Example label:", int(y_small[0, 0]))

# Convert numpy batch to TensorFlow tensor.
images_uint8 = tf.convert_to_tensor(x_small)
print("Tensor dtype before cast:", images_uint8.dtype)

# Check that channels are in last dimension.
print("Tensor shape:", images_uint8.shape)

# Extract one image and inspect channel values.
one_image = images_uint8[0]
print("One image shape:", one_image.shape)

# Convert uint8 to float32; convert_image_dtype also rescales values to [0, 1].
images_float = tf.image.convert_image_dtype(images_uint8, tf.float32)
print("Tensor dtype after cast:", images_float.dtype)

# Confirm value range after conversion.
min_val = tf.reduce_min(images_float).numpy()
max_val = tf.reduce_max(images_float).numpy()
print("Value range after cast:", float(min_val), float(max_val))

# Simulate grayscale by averaging color channels.
image_gray = tf.reduce_mean(images_float[0], axis=-1, keepdims=True)
print("Grayscale shape:", image_gray.shape)

# Repeat grayscale channel three times.
image_gray_three = tf.repeat(image_gray, repeats=3, axis=-1)
print("Repeated grayscale shape:", image_gray_three.shape)

# Stack original and grayscale for simple comparison.
stacked = tf.stack([images_float[0], image_gray_three], axis=0)
print("Stacked batch shape:", stacked.shape)




## **2. Label Safe Augmentations**

### **2.1. Spatial Augmentation Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_02_01.jpg?v=1769673815" width="250">



>* Change image geometry while keeping label meaning
>* Simulate real-world viewpoint changes to improve robustness

>* Common spatial augments: crop, pad, flip, rotate
>* Simulate natural viewpoint changes, expand limited datasets

>* Balance useful variation against possible label changes
>* Match augmentation ranges to the domain's natural geometry
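
The pad-and-crop idea above can be sketched as a bounded random shift. This is a minimal NumPy sketch, assuming a channels-last image and zero padding; `random_shift` is a hypothetical helper, and `max_pixels` is the knob that would be matched to how far objects plausibly move in the target domain.

```python
import numpy as np


def random_shift(image, max_pixels, rng):
    """Shift an (H, W, C) image by up to max_pixels, zero-filling the border."""
    h, w = image.shape[:2]
    m = max_pixels
    padded = np.pad(image, ((m, m), (m, m), (0, 0)), mode="constant")
    # Offsets in [0, 2m]; an offset of m leaves the image unshifted.
    top, left = rng.integers(0, 2 * m + 1, size=2)
    return padded[top:top + h, left:left + w]


rng = np.random.default_rng(42)

# Hypothetical 9x9 grayscale image with a single bright pixel in the center.
image = np.zeros((9, 9, 1), dtype=np.float32)
image[4, 4, 0] = 1.0

shifted = random_shift(image, max_pixels=2, rng=rng)
row, col = np.argwhere(shifted[..., 0] == 1.0)[0]
print("Bright pixel moved from (4, 4) to", (int(row), int(col)))
```

Because the shift is bounded by `max_pixels`, the bright pixel can never leave the frame here. Larger bounds would start to push content out of view, which is where label safety begins to depend on the task.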



In [None]:
#@title Python Code - Spatial Augmentation Basics

# This script shows basic spatial augmentations.
# We use TensorFlow image utilities for transformations.
# Focus is on label safe geometric changes.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Set TensorFlow random seed for reproducible results.
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset with small grayscale images.
(x_train, y_train), _ = datasets.mnist.load_data()

# Select a tiny subset for quick demonstration.
num_samples = 4
x_subset = x_train[:num_samples]

y_subset = y_train[:num_samples]

# Normalize images to range [0,1] as float32.
x_subset = x_subset.astype("float32") / 255.0

# Add channel dimension to match expected shape.
x_subset = np.expand_dims(x_subset, axis=-1)

# Confirm shapes are as expected before transforms.
print("Subset shape:", x_subset.shape)

# Define a function for random horizontal flip.
# Caution: shown to illustrate the API; flipping digits can change their meaning (see 2.3).
def random_flip(image):
    image = tf.image.random_flip_left_right(image)
    return image

# Define a function for small random rotation.
def random_rotate(image, max_degrees=15.0):
    radians = max_degrees * np.pi / 180.0
    angle = tf.random.uniform((), -radians, radians)
    image = tfa_image_rotate(image, angle)
    return image

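# Define a helper that rotates about the image center via a projective transform.
def tfa_image_rotate(image, angle):
    c = tf.math.cos(angle)
    s = tf.math.sin(angle)
    # Center coordinates, so the rotation pivots on the middle of the image.
    shape = tf.shape(image)
    cy = (tf.cast(shape[0], tf.float32) - 1.0) / 2.0
    cx = (tf.cast(shape[1], tf.float32) - 1.0) / 2.0
    # Offsets that keep the center pixel fixed under the rotation.
    tx = cx - c * cx + s * cy
    ty = cy - s * cx - c * cy
    transform = [c, -s, tx, s, c, ty, 0.0, 0.0]
    transform = tf.convert_to_tensor(transform)
    transform = tf.reshape(transform, (8,))
    image = tf.expand_dims(image, axis=0)
    image = tf.raw_ops.ImageProjectiveTransformV3(
        images=image,
        transforms=tf.expand_dims(transform, axis=0),
        output_shape=tf.shape(image)[1:3],
        interpolation="BILINEAR",
        fill_mode="REFLECT",
        fill_value=0.0,
    )
    image = tf.squeeze(image, axis=0)
    return image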

# Define a function for random translation using padding.
def random_translate(image, max_pixels=3):
    dx = tf.random.uniform((), -max_pixels, max_pixels + 1, tf.int32)
    dy = tf.random.uniform((), -max_pixels, max_pixels + 1, tf.int32)
    paddings = [[max_pixels, max_pixels], [max_pixels, max_pixels], [0, 0]]
    padded = tf.pad(image, paddings, mode="REFLECT")
    h, w, _ = tf.unstack(tf.shape(image))
    start_y = max_pixels + dy
    start_x = max_pixels + dx
    translated = padded[start_y:start_y + h, start_x:start_x + w, :]
    return translated

# Compose spatial augmentations into one function.
def spatial_augment(image):
    image = random_flip(image)
    image = random_rotate(image)
    image = random_translate(image)
    return image

# Apply augmentations and collect results.
augmented_images = []
for i in range(num_samples):
    img = x_subset[i]
    aug = spatial_augment(img)
    augmented_images.append(aug)

# Convert augmented list to tensor for inspection.
augmented_images = tf.stack(augmented_images, axis=0)

# Verify shapes match original subset shapes.
print("Augmented shape:", augmented_images.shape)

# Print the labels; the augmentations above never modify them.
print("Original labels:", y_subset.tolist())

# Convert augmented images to numpy for simple stats.
aug_np = augmented_images.numpy()

# Print simple statistics to summarize augmentations.
print("Original min, max:", float(x_subset.min()), float(x_subset.max()))

# Print augmented statistics to confirm valid ranges.
print("Augmented min, max:", float(aug_np.min()), float(aug_np.max()))

# Print one example pair of label and pixel mean.
idx = 0
print("Example label:", int(y_subset[idx]), "orig mean:", float(x_subset[idx].mean()))

# Print augmented mean for the same example index.
print("Augmented mean:", float(aug_np[idx].mean()))



### **2.2. Color Jitter and Noise**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_02_02.jpg?v=1769673945" width="250">



>* Adjust brightness, contrast, saturation, and hue randomly
>* Keep labels fixed while improving robustness to appearance

>* Add small pixel noise to mimic imperfections
>* Model learns robust features while labels stay valid

>* Control strength and frequency to avoid label corruption
>* Use realistic ranges, monitor effects on images
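
The appearance changes listed above can be sketched directly on a float image. The following NumPy sketch is an illustration under stated assumptions, not a reimplementation of any Keras layer: `jitter_and_noise` is a hypothetical helper, and its default strengths only loosely mirror the `0.2` and `0.05` values used in this section's code cell.

```python
import numpy as np


def jitter_and_noise(image, rng, brightness=0.2, contrast=0.2, noise_std=0.05):
    """Apply random brightness, contrast, and Gaussian noise to a [0, 1] image."""
    # Brightness: add one random offset to every pixel.
    image = image + rng.uniform(-brightness, brightness)
    # Contrast: scale each pixel's distance from the image mean.
    factor = rng.uniform(1.0 - contrast, 1.0 + contrast)
    mean = image.mean()
    image = (image - mean) * factor + mean
    # Noise: add a small independent perturbation to each pixel.
    image = image + rng.normal(0.0, noise_std, size=image.shape)
    # Clip so the result is still a valid image.
    return np.clip(image, 0.0, 1.0).astype(np.float32)


rng = np.random.default_rng(0)

# Hypothetical 8x8 RGB image with values already in [0, 1].
image = rng.random((8, 8, 3)).astype(np.float32)
augmented = jitter_and_noise(image, rng)

print("Original range:", float(image.min()), float(image.max()))
print("Augmented range:", float(augmented.min()), float(augmented.max()))
```

Keeping the strengths as explicit parameters makes it easy to tighten them if visual inspection shows the augmented images drifting away from realistic appearance.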



In [None]:
#@title Python Code - Color Jitter and Noise

# This script shows label safe color jitter augmentations.
# We use TensorFlow to add jitter and noise examples.
# Focus is on appearance changes preserving image labels.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import layers

# Set deterministic random seeds everywhere.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Create a simple synthetic RGB image tensor.
height, width, channels = 64, 64, 3
base_image = tf.ones((height, width, channels), dtype=tf.float32)

# Add a colored square to the center region.
center_start, center_end = 16, 48
square_color = tf.constant([1.0, 0.0, 0.0], dtype=tf.float32)

# Use tensor operations to draw the square.
center_region = tf.ones((center_end - center_start,
                         center_end - center_start,
                         channels), dtype=tf.float32)
center_region = center_region * square_color

# Insert the square into the base image.
base_image_with_square = tf.tensor_scatter_nd_update(
    base_image,
    indices=tf.reshape(
        tf.stack(
            tf.meshgrid(
                tf.range(center_start, center_end),
                tf.range(center_start, center_end),
                indexing="ij"),
            axis=-1),
        (-1, 2)),
    updates=tf.reshape(center_region, (-1, channels)))

# Confirm image shape is as expected.
print("Base image shape:", base_image_with_square.shape)

# Normalize image to [0,1] range safely.
image = tf.clip_by_value(base_image_with_square, 0.0, 1.0)

# Define a color jitter layer using Keras preprocessing.
color_jitter_layer = tf.keras.Sequential([
    layers.RandomBrightness(factor=0.2,
                            value_range=(0.0, 1.0),
                            seed=seed_value),
    layers.RandomContrast(factor=0.2,
                          seed=seed_value)
])

# Apply color jitter to the image batch.
image_batch = tf.expand_dims(image, axis=0)
color_jittered_batch = color_jitter_layer(image_batch,
                                          training=True)
color_jittered_image = tf.squeeze(color_jittered_batch,
                                  axis=0)

# Define a simple Gaussian noise function.
def add_gaussian_noise(x, stddev=0.05):
    noise = tf.random.normal(shape=tf.shape(x),
                             mean=0.0,
                             stddev=stddev,
                             seed=seed_value)
    noisy = x + noise
    noisy = tf.clip_by_value(noisy, 0.0, 1.0)
    return noisy

# Apply Gaussian noise to the jittered image.
noisy_image = add_gaussian_noise(color_jittered_image,
                                 stddev=0.05)

# Compute simple statistics for each version.
def summarize_image(name, img_tensor):
    img_min = tf.reduce_min(img_tensor).numpy()
    img_max = tf.reduce_max(img_tensor).numpy()
    img_mean = tf.reduce_mean(img_tensor).numpy()
    print(f"{name} min={img_min:.3f} max={img_max:.3f} mean={img_mean:.3f}")

# Show statistics for base, jittered, and noisy images.
summarize_image("Base", image)
summarize_image("Color jittered", color_jittered_image)
summarize_image("Jittered plus noise", noisy_image)

# Build a small dataset pipeline with these transforms.
def preprocess_with_augment(x):
    x = tf.image.resize(x, (height, width))
    x = tf.cast(x, tf.float32) / 255.0
    x = color_jitter_layer(tf.expand_dims(x, 0),
                           training=True)
    x = tf.squeeze(x, axis=0)
    x = add_gaussian_noise(x, stddev=0.05)
    return x

# Load a tiny subset of MNIST digits.
(train_images, train_labels), _ = tf.keras.datasets.mnist.load_data()
train_images = train_images[:32]
train_labels = train_labels[:32]

# Expand grayscale images to RGB channels.
train_images_rgb = np.repeat(train_images[..., np.newaxis],
                             3, axis=-1)

# Create a TensorFlow dataset from arrays.
ds = tf.data.Dataset.from_tensor_slices(
    (train_images_rgb, train_labels))

# Map preprocessing with augmentation and batch.
ds_aug = ds.map(lambda x, y: (preprocess_with_augment(x), y),
                num_parallel_calls=tf.data.AUTOTUNE)
ds_aug = ds_aug.batch(8).prefetch(tf.data.AUTOTUNE)

# Take one batch and show label stability.
for batch_images, batch_labels in ds_aug.take(1):
    print("Augmented batch shape:", batch_images.shape)
    print("Batch labels:", batch_labels.numpy())

# Final line prints confirmation of script completion.
print("Color jitter and noise augmentation demo complete.")




### **2.3. Label Preserving Constraints**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_02_03.jpg?v=1769674030" width="250">



>* Augmentations must keep the original label meaning
>* Decide safe transforms based on task semantics

>* Structured tasks need geometry-consistent label transformations
>* Augment inputs and labels together, avoiding meaning changes

>* Match augmentations to task-specific invariances and limits
>* Continuously inspect augmented data and training behavior
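
For structured labels such as bounding boxes, the point about augmenting inputs and labels together can be made concrete. This is a minimal NumPy sketch: `hflip_with_box` is a hypothetical helper, and the box convention (`x_min, y_min, x_max, y_max` in pixel indices with exclusive upper bounds) is an assumption chosen for the example.

```python
import numpy as np


def hflip_with_box(image, box):
    """Flip an (H, W, C) image left-right and update its bounding box.

    The box is (x_min, y_min, x_max, y_max) in pixel indices, with
    x_max and y_max exclusive. This convention is an assumption.
    """
    width = image.shape[1]
    x_min, y_min, x_max, y_max = box
    flipped = image[:, ::-1]
    # Mirror the horizontal extent; the vertical extent is unchanged.
    return flipped, (width - x_max, y_min, width - x_min, y_max)


# Hypothetical 6x10 image whose object fills exactly the labeled box.
image = np.zeros((6, 10, 1), dtype=np.float32)
box = (2, 1, 5, 4)
image[box[1]:box[3], box[0]:box[2]] = 1.0

flipped_image, flipped_box = hflip_with_box(image, box)
fx0, fy0, fx1, fy1 = flipped_box
inside = bool(flipped_image[fy0:fy1, fx0:fx1].min() == 1.0)

print("Original box:", box, "flipped box:", flipped_box)
print("Object still inside box:", inside)
```

Flipping the image without updating the box would leave the label pointing at empty background, which is exactly the kind of silent label corruption this section warns about.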



In [None]:
#@title Python Code - Label Preserving Constraints

# This script illustrates label preserving constraints.
# We compare safe and unsafe image augmentations.
# Focus is on how transforms affect labels.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Configure TensorFlow random seed deterministically.
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load the MNIST training split; the test split is discarded here.
(train_images, train_labels), _ = datasets.mnist.load_data()

# Confirm expected image and label shapes.
print("Train images shape:", train_images.shape)

# Select a tiny subset for quick demonstration.
subset_size = 8
images_subset = train_images[:subset_size]

# Slice corresponding labels for the subset.
labels_subset = train_labels[:subset_size]

# Normalize images to range zero to one.
images_subset = images_subset.astype("float32") / 255.0

# Add channel dimension to match image format.
images_subset = np.expand_dims(images_subset, axis=-1)

# Define a safe augmentation that preserves labels.
safe_augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(
        factor=0.05,
        fill_mode="nearest"
    ),
    tf.keras.layers.RandomTranslation(
        height_factor=0.05,
        width_factor=0.05
    )
])

# Define an unsafe augmentation for digit orientation.
unsafe_augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip(
        mode="horizontal"
    ),
    tf.keras.layers.RandomRotation(
        factor=0.5
    )
])

# Apply safe augmentation once to the subset.
safe_augmented = safe_augment(images_subset, training=True)

# Apply unsafe augmentation once to the subset.
unsafe_augmented = unsafe_augment(images_subset, training=True)

# Ensure augmented shapes still match original.
print("Safe augmented shape:", safe_augmented.shape)

# Confirm unsafe augmented shape also matches.
print("Unsafe augmented shape:", unsafe_augmented.shape)

# Helper function to summarize one example pair.
def describe_example(index, original, safe_img, unsafe_img, label):
    # Compute simple mean intensity statistics.
    orig_mean = float(np.mean(original))
    safe_mean = float(np.mean(safe_img))

    # Compute mean for unsafe augmented image.
    unsafe_mean = float(np.mean(unsafe_img))

    # Print a compact description line.
    print(
        f"Idx {index} label {label} | "
        f"orig_mean {orig_mean:.3f} "
        f"safe_mean {safe_mean:.3f} "
        f"unsafe_mean {unsafe_mean:.3f}"
    )

# Loop over a few examples and describe them.
max_examples = 5
for i in range(min(max_examples, subset_size)):
    describe_example(
        i,
        images_subset[i],
        safe_augmented[i],
        unsafe_augmented[i],
        int(labels_subset[i])
    )

# Final message linking back to label constraints.
print("Safe transforms keep digit identity, unsafe may flip digits.")




## **3. Transform Integration Patterns**

### **3.1. Transforms in Dataset**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_03_01.jpg?v=1769674112" width="250">



>* Dataset applies transforms when samples are loaded
>* Ensures consistent, model-ready data without altering originals

>* Dataset transforms hide preprocessing from training loop
>* Easily swap datasets and test preprocessing strategies

>* Different datasets use role-specific transform pipelines
>* Ensures fair, stable evaluation and robust training
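
The pattern above, where a dataset applies a transform on access and each split gets its own pipeline, can be sketched in plain Python. Everything here is hypothetical and framework-free: `ArrayDataset`, `eval_transform`, and `train_transform` are illustrative names, and the random arrays stand in for real images.

```python
import numpy as np


class ArrayDataset:
    """Hypothetical minimal dataset that applies a transform on access."""

    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[index]


rng = np.random.default_rng(0)


def eval_transform(image):
    """Deterministic preprocessing shared by every split."""
    return image.astype(np.float32) / 255.0


def train_transform(image):
    """Same preprocessing plus a random augmentation, for training only."""
    image = eval_transform(image)
    noise = rng.normal(0.0, 0.05, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0).astype(np.float32)


# Hypothetical stand-in data: four 8x8 grayscale uint8 images.
images = rng.integers(0, 256, size=(4, 8, 8), dtype=np.uint8)
labels = np.array([0, 1, 2, 3])

train_ds = ArrayDataset(images, labels, transform=train_transform)
val_ds = ArrayDataset(images, labels, transform=eval_transform)

image, label = val_ds[0]
print(image.dtype, image.shape, int(label))
```

The stored `images` array stays untouched as `uint8`; only the returned samples are transformed, and the validation view returns the same result on every access.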



In [None]:
#@title Python Code - Transforms in Dataset

# This script shows transforms inside datasets.
# We use TensorFlow image dataset utilities.
# Focus on integrating preprocessing into dataset.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and image utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Set TensorFlow random seed.
tf.random.set_seed(seed_value)

# Print TensorFlow version once.
print("TensorFlow version:", tf.__version__)

# Load small subset of MNIST data.
(x_train, y_train), _ = datasets.mnist.load_data()

# Select a tiny slice for speed.
num_samples = 512
x_train = x_train[:num_samples]

# Slice labels to match images.
y_train = y_train[:num_samples]

# Validate shapes before building dataset.
assert x_train.shape[0] == y_train.shape[0]

# Print basic dataset information.
print("Train subset shape:", x_train.shape)

# Define a preprocessing transform function.

def preprocess_image(image, label):
    # Convert image to float32 tensor.
    image = tf.cast(image, tf.float32)
    # Add channel dimension for grayscale.
    image = tf.expand_dims(image, axis=-1)
    # Scale pixels to [0, 1], then shift to the range [-0.5, 0.5].
    image = (image / 255.0) - 0.5
    # Return transformed image and original label.
    return image, label

# Create base dataset from NumPy arrays.
base_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))

# Apply preprocessing transform inside dataset.
train_ds = base_ds.map(preprocess_image, num_parallel_calls=1)

# Shuffle and batch the transformed dataset.
train_ds = train_ds.shuffle(buffer_size=num_samples).batch(32)

# Take one batch to inspect shapes.
images_batch, labels_batch = next(iter(train_ds))

# Print batch shapes after transforms.
print("Batch images shape:", images_batch.shape)

# Print batch labels shape for confirmation.
print("Batch labels shape:", labels_batch.shape)

# Compute simple statistics of transformed batch.
batch_mean = tf.reduce_mean(images_batch).numpy()

# Compute standard deviation of transformed batch.
batch_std = tf.math.reduce_std(images_batch).numpy()

# Print summary of transform effects.
print("Batch pixel mean:", float(batch_mean))

# Print standard deviation to show normalization.
print("Batch pixel std:", float(batch_std))

# Show first label to confirm label integrity.
print("First label in batch:", int(labels_batch[0].numpy()))



### **3.2. Composing Transform Calls**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_03_02.jpg?v=1769674177" width="250">



>* Transforms run in a strict, ordered pipeline
>* Each step expects a specific input; the wrong order breaks the pipeline

>* Mix always-on and random transforms thoughtfully
>* Preserve meaning while increasing variation for generalization

>* Choose which transforms run pre- and post-batch
>* Structure stages to reduce work and ease changes
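
Why order matters can be shown with a tiny composition helper. This is a sketch under stated assumptions: `Compose`, `to_float`, and `center` are hypothetical names written for this example, not an API from any particular library.

```python
import numpy as np


class Compose:
    """Hypothetical helper that applies transforms in the listed order."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x


def to_float(x):
    """Convert pixel values in 0-255 to float32 values in 0-1."""
    return x.astype(np.float32) / 255.0


def center(x):
    """Shift values that are expected to lie in 0-1 to -0.5 to 0.5."""
    return x - 0.5


image = np.array([[0, 255], [128, 64]], dtype=np.uint8)

good = Compose([to_float, center])(image)
# Reversed order: center runs on raw 0-255 values before rescaling.
bad = Compose([center, to_float])(image)

print("Correct order range:", float(good.min()), float(good.max()))
print("Wrong order range:", float(bad.min()), float(bad.max()))
```

The reversed pipeline runs without raising an error but produces values that are no longer centered, which is why this kind of mistake is usually caught by checking ranges rather than by waiting for a crash.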



In [None]:
#@title Python Code - Composing Transform Calls

# This script shows composing simple image transforms.
# We use TensorFlow to simulate transform pipelines.
# Focus on per sample and per batch transforms.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow import keras

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)

# Configure NumPy and TensorFlow seeds.
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version briefly.
print("TensorFlow version:", tf.__version__)

# Load a small subset of MNIST digits.
(x_train, y_train), _ = keras.datasets.mnist.load_data()

# Select a tiny subset for quick demonstration.
num_samples = 256
x_small = x_train[:num_samples]

y_small = y_train[:num_samples]

# Validate shapes before building datasets.
assert x_small.ndim == 3
assert y_small.shape[0] == num_samples

# Define a per sample transform: deterministic preprocessing plus one random flip.
def per_sample_transform(image, label):
    # Convert image to float32 tensor.
    image = tf.cast(image, tf.float32)
    image = image / 255.0

    # Add channel dimension for convolutional models.
    image = tf.expand_dims(image, axis=-1)

    # Apply a random horizontal flip augmentation.
    image = tf.image.random_flip_left_right(image, seed=seed_value)

    # Return transformed image and original label.
    return image, label

# Define a simple per batch transform function.
def per_batch_transform(images, labels):
    # Confirm batch rank and image rank.
    tf.debugging.assert_rank(images, 4)

    # Normalize batch to zero mean unit variance.
    mean, variance = tf.nn.moments(images, axes=[0, 1, 2])
    images = (images - mean) / tf.sqrt(variance + 1e-6)

    # Return batch ready for the model.
    return images, labels

# Build a base dataset from NumPy arrays.
base_ds = tf.data.Dataset.from_tensor_slices((x_small, y_small))

# Apply per sample transform inside the dataset.
train_ds = base_ds.map(per_sample_transform, num_parallel_calls=1)

# Shuffle and batch the dataset for training.
train_ds = train_ds.shuffle(buffer_size=num_samples, seed=seed_value)
train_ds = train_ds.batch(32)

# Apply per batch transform after batching.
train_ds = train_ds.map(per_batch_transform, num_parallel_calls=1)

# Prefetch to improve pipeline performance.
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)

# Inspect one batch to see composed effects.
for batch_images, batch_labels in train_ds.take(1):
    # Print batch shapes and basic statistics.
    print("Batch images shape:", batch_images.shape)
    print("Batch labels shape:", batch_labels.shape)

    # Compute mean and std of the batch.
    batch_mean = tf.reduce_mean(batch_images).numpy()
    batch_std = tf.math.reduce_std(batch_images).numpy()

    # Print summary statistics for monitoring.
    print("Batch mean after transforms:", float(batch_mean))
    print("Batch std after transforms:", float(batch_std))

# Build a tiny model to consume transformed batches.
model = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(8, (3, 3), activation="relu"),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])

# Compile the model with simple settings.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly to ensure pipeline integration.
history = model.fit(train_ds, epochs=1, verbose=0)

# Print final training accuracy for confirmation.
print("Training accuracy with composed transforms:", float(history.history["accuracy"][0]))



### **3.3. Tracking Training Metrics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_04/Lecture_B/image_03_03.jpg?v=1769674273" width="250">



>* Track how transforms change training and validation metrics
>* Compare metric curves before and after transform tweaks

>* Track detailed metrics like per-class accuracy
>* Log metrics and transform settings to explain changes

>* Pair metric trends with visual data checks
>* Use this feedback loop to refine transforms
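
Per-class accuracy, mentioned above, takes only a few lines to compute. The sketch below is illustrative: `per_class_accuracy` is a hypothetical helper, the labels and predictions are made up, and the keys in `run_log` (`rotation_factor`, `flip`) are assumed names for whatever transform settings a run actually uses.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, num_classes):
    """Return accuracy per class; classes absent from y_true get NaN."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = np.full(num_classes, np.nan)
    for cls in range(num_classes):
        mask = y_true == cls
        if mask.any():
            # Fraction of samples of this class that were predicted correctly.
            accuracy[cls] = np.mean(y_pred[mask] == cls)
    return accuracy


# Hypothetical labels and predictions for a 3-class problem.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0, 2]

acc = per_class_accuracy(y_true, y_pred, num_classes=3)

# Storing transform settings next to the metrics keeps runs comparable.
run_log = {
    "transforms": {"rotation_factor": 0.05, "flip": False},
    "per_class_accuracy": acc.tolist(),
}
print(run_log)
```

A drop in one class after a transform change, with overall accuracy roughly flat, is a hint that the augmentation may be corrupting labels for that class specifically.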



In [None]:
#@title Python Code - Tracking Training Metrics

# This script shows tracking metrics with transforms.
# We use TensorFlow image data and simple augmentations.
# Focus on how transforms affect training curves.

# Install TensorFlow only if it is missing from your environment.
# !pip install tensorflow --quiet

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow import keras

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)

# Set NumPy and TensorFlow seeds deterministically.
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Detect whether TensorFlow can see a GPU (otherwise report CPU).
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_name = "GPU"
else:
    device_name = "CPU"

# Print which device type will be mainly used.
print("Using device type:", device_name)

# Load MNIST dataset from Keras datasets.
(mnist_x_train, mnist_y_train), (mnist_x_test, mnist_y_test) = (
    keras.datasets.mnist.load_data()
)

# Select a small subset for quick demonstration.
train_subset_size = 4000
test_subset_size = 1000

# Slice the arrays to obtain smaller subsets.
x_train = mnist_x_train[:train_subset_size]
y_train = mnist_y_train[:train_subset_size]

# Slice test arrays similarly for evaluation.
x_test = mnist_x_test[:test_subset_size]
y_test = mnist_y_test[:test_subset_size]

# Validate subset sizes before preprocessing.
assert x_train.shape[0] == train_subset_size
assert x_test.shape[0] == test_subset_size

# Convert to float32 and scale pixel values to the [0, 1] range.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension required for Conv2D layers.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Confirm final shapes are as expected.
print("Train shape:", x_train.shape)
print("Test shape:", x_test.shape)

# Define a simple convolutional model builder.
def build_simple_model():
    model = keras.Sequential(
        [
            keras.layers.Input(shape=(28, 28, 1)),
            keras.layers.Conv2D(16, (3, 3), activation="relu"),
            keras.layers.MaxPooling2D(pool_size=(2, 2)),
            keras.layers.Flatten(),
            keras.layers.Dense(32, activation="relu"),
            keras.layers.Dense(10, activation="softmax"),
        ]
    )
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

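# Create a shuffled, unbatched tf.data.Dataset without augmentation.
base_train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
base_train_ds = base_train_ds.shuffle(1000, seed=seed_value)

# Batch and prefetch only after per-example transforms are mapped.
def finalize(ds):
    return ds.batch(64).prefetch(tf.data.AUTOTUNE)

# Baseline transform: pixels were already scaled to [0, 1] above,
# so this is an identity placeholder for the no-augmentation run.
def normalize_only(image, label):
    return image, label

# Map the baseline transform, then batch and prefetch.
train_ds_no_aug = finalize(base_train_ds.map(normalize_only))

# Define an augmentation function using tf.image.
# Caution: horizontal flips can change what some digits look like,
# so inspect augmented samples before trusting this transform.
def augment_image(image, label):
    image = tf.image.random_flip_left_right(image, seed=seed_value)
    image = tf.image.random_brightness(image, max_delta=0.2)
    # Keep pixel values inside [0, 1] after the brightness shift.
    image = tf.clip_by_value(image, 0.0, 1.0)
    return image, label

# Apply augmentation per example, then batch and prefetch.
train_ds_aug = finalize(base_train_ds.map(augment_image))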

# Prepare validation dataset without augmentation.
val_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test))
val_ds = val_ds.batch(64).prefetch(tf.data.AUTOTUNE)

# Build two identical models for fair comparison.
model_no_aug = build_simple_model()
model_aug = build_simple_model()

# Train model without augmentation silently.
history_no_aug = model_no_aug.fit(
    train_ds_no_aug,
    validation_data=val_ds,
    epochs=3,
    verbose=0,
)

# Train model with augmentation silently.
history_aug = model_aug.fit(
    train_ds_aug,
    validation_data=val_ds,
    epochs=3,
    verbose=0,
)

# Extract final epoch metrics for both runs.
final_no_aug = (
    history_no_aug.history["loss"][-1],
    history_no_aug.history["val_loss"][-1],
    history_no_aug.history["accuracy"][-1],
    history_no_aug.history["val_accuracy"][-1],
)

# Extract augmented run metrics similarly.
final_aug = (
    history_aug.history["loss"][-1],
    history_aug.history["val_loss"][-1],
    history_aug.history["accuracy"][-1],
    history_aug.history["val_accuracy"][-1],
)

# Print a short header explaining the comparison.
print("\nFinal metrics after 3 epochs (no augmentation):")

# Print rounded metrics for the baseline model.
print(
    "train_loss=", round(final_no_aug[0], 3),
    "val_loss=", round(final_no_aug[1], 3),
    "train_acc=", round(final_no_aug[2], 3),
    "val_acc=", round(final_no_aug[3], 3),
)

# Print header for augmented training metrics.
print("\nFinal metrics after 3 epochs (with augmentation):")

# Print rounded metrics for the augmented model.
print(
    "train_loss=", round(final_aug[0], 3),
    "val_loss=", round(final_aug[1], 3),
    "train_acc=", round(final_aug[2], 3),
    "val_acc=", round(final_aug[3], 3),
)
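

To make a comparison like the one above easier to explain later, each run's metrics can be stored next to the transform settings that produced them. The sketch below uses only the standard library; the function name `build_run_record`, the record layout, and the numbers in `example_history` are hypothetical placeholders, not measured results from this lecture.

```python
import json


def build_run_record(run_name, transform_config, history):
    """Bundle transform settings with per-epoch metrics and train/val gaps."""
    # A positive gap means training accuracy is ahead of validation accuracy.
    gaps = [
        round(train - val, 4)
        for train, val in zip(history["accuracy"], history["val_accuracy"])
    ]
    return {
        "run": run_name,
        "transforms": transform_config,
        "accuracy": list(history["accuracy"]),
        "val_accuracy": list(history["val_accuracy"]),
        "accuracy_gap": gaps,
    }


# Placeholder values for illustration only (not real training output).
example_history = {
    "accuracy": [0.80, 0.90, 0.95],
    "val_accuracy": [0.85, 0.90, 0.91],
}
example_config = {"random_flip_left_right": True, "brightness_max_delta": 0.2}

record = build_run_record("with_augmentation", example_config, example_history)

# One JSON object per run is easy to append to a log file and compare later.
print(json.dumps(record))
```

In the cell above, a dictionary such as `history_aug.history` could presumably be passed in place of `example_history`, since it holds per-epoch lists under the same keys. Keeping the whole curve rather than only the final epoch makes it easier to see whether a transform change slows early learning or only shifts the final gap.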



# <font color="#418FDE" size="6.5" uppercase>**Transforms and Augment**</font>


In this lecture, you learned to:
- Apply common preprocessing transforms such as normalization, resizing, and tensor conversion to raw data. 
- Design and configure data augmentation pipelines that improve model robustness without corrupting labels. 
- Integrate transforms into Dataset and DataLoader workflows while monitoring their impact on training. 

In the next Module (Module 5), we will go over 'Computer Vision Models'.