# <font color="#418FDE" size="6.5" uppercase>**CNNs from Scratch**</font>

>Last update: 20260129.
    
By the end of this Lecture, you will be able to:
- Construct a convolutional neural network using Conv2d, pooling, and fully connected layers in PyTorch. 
- Train the CNN on a small image dataset using the standard training loop and appropriate loss and metrics. 
- Analyze model performance using accuracy, confusion matrices, and simple error inspection. 


## **1. Building CNN Layers**

### **1.1. Convolution Layers Essentials**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_01_01.jpg?v=1769676649" width="250">



>* Small filters scan local pixel neighborhoods for patterns
>* Stacked filters capture many features with few parameters

>* Conv2d hyperparameters control filters and feature maps
>* Kernel, stride, padding set detail and resolution

>* Early layers learn simple, generic visual features
>* Deeper layers combine features into high-level concepts
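
The way kernel size, stride, and padding set feature-map resolution can be checked with plain arithmetic. The sketch below is a framework-free illustration, not part of the lecture code: the helper names `conv2d_output_size` and `conv2d_param_count` are hypothetical, and the formulas assume a square kernel with dilation 1.

```python
def conv2d_output_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension (square kernel, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_param_count(in_channels, out_channels, kernel):
    """Learnable weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels


# A 28x28 input with a 3x3 kernel under three common settings.
same_size = conv2d_output_size(28, kernel=3, stride=1, padding=1)     # 28
valid_size = conv2d_output_size(28, kernel=3, stride=1, padding=0)    # 26
strided_size = conv2d_output_size(28, kernel=3, stride=2, padding=1)  # 14

# Eight 3x3 filters over one input channel need very few parameters.
params = conv2d_param_count(in_channels=1, out_channels=8, kernel=3)  # 80

print("same padding:", same_size)
print("no padding:", valid_size)
print("stride 2:", strided_size)
print("parameters:", params)
```

The parameter count does not depend on the image size, which is why stacked filters stay cheap compared with a fully connected layer over the same pixels.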



In [None]:
#@title Python Code - Convolution Layers Essentials

# This script explains basic convolution layers.
# It uses TensorFlow to mimic PyTorch ideas.
# Run cells to see shapes and outputs.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras layers.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version once.
print("TensorFlow version:", tf.__version__)

# Define image and batch dimensions.
batch_size = 4
height = 28
width = 28
channels = 1

# Create a small random image batch.
images = tf.random.uniform(
    shape=(batch_size, height, width, channels),
    minval=0.0,
    maxval=1.0,
)

# Verify the input tensor shape.
print("Input batch shape:", images.shape)

# Define a simple Conv2D layer.
conv_layer = layers.Conv2D(
    filters=8,
    kernel_size=(3, 3),
    strides=(1, 1),
    padding="same",
)

# Apply convolution layer to the images.
conv_output = conv_layer(images)

# Print the convolution output shape.
print("Conv output shape:", conv_output.shape)

# Define a max pooling layer.
pool_layer = layers.MaxPooling2D(
    pool_size=(2, 2),
    strides=(2, 2),
    padding="valid",
)

# Apply pooling to convolution output.
pool_output = pool_layer(conv_output)

# Print the pooled feature map shape.
print("Pooled output shape:", pool_output.shape)

# Show kernel weights shape for intuition.
kernel_weights = conv_layer.kernel
print("Kernel weights shape:", kernel_weights.shape)

# Select one example feature map.
example_map = pool_output[0]

# Compute simple statistics for that map.
mean_val = tf.reduce_mean(example_map).numpy()
max_val = tf.reduce_max(example_map).numpy()
min_val = tf.reduce_min(example_map).numpy()

# Print summary statistics for understanding.
print("Example feature map mean:", float(mean_val))
print("Example feature map max:", float(max_val))
print("Example feature map min:", float(min_val))

# Confirm final tensor rank and size.
print("Example feature map shape:", example_map.shape)




### **1.2. Understanding Pooling Layers**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_01_02.jpg?v=1769676715" width="250">



>* Pooling compresses feature maps, keeping key information
>* Makes models efficient and robust to small shifts

>* Max and average pooling summarize feature map regions
>* Max keeps strongest signals; pooling adds spatial invariance

>* Pooling size and stride control downsampling strength
>* Placement balances detail loss and high-level abstraction
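
To make the difference between max and average pooling concrete, here is a minimal pure-Python sketch that pools a tiny 4x4 grid by hand. The function `pool2d` is a hypothetical helper written for this illustration (no padding, square window); it is not a library API.

```python
def pool2d(image, size=2, stride=2, mode="max"):
    """Pool a 2D list of numbers with a square window and no padding."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect every value covered by the current window.
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            if mode == "max":
                pooled_row.append(max(window))
            else:
                pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


# A 4x4 "image" holding the values 0..15 in row-major order.
image = [[4 * r + c for c in range(4)] for r in range(4)]

max_pooled = pool2d(image, mode="max")
avg_pooled = pool2d(image, mode="avg")

print("Max pooled:", max_pooled)  # [[5, 7], [13, 15]]
print("Avg pooled:", avg_pooled)  # [[2.5, 4.5], [10.5, 12.5]]
```

Each 2x2 window collapses to one number, so both height and width are halved; max pooling keeps the strongest value in the window, while average pooling keeps the mean.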



In [None]:
#@title Python Code - Understanding Pooling Layers

# This script shows how pooling layers change values and shapes.
# It uses TensorFlow to simulate simple pooling.
# Run cells to see how pooling changes images.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and check version.
import tensorflow as tf
print("TensorFlow version:", tf.__version__)

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)

# Create a small fake image batch.
height, width, channels = 8, 8, 1
image_array = np.arange(height * width,
                        dtype=np.float32).reshape(
                        (1, height, width, channels))

# Normalize values to the range zero to one.
image_array = image_array / np.max(image_array)
print("Input batch shape:", image_array.shape)

# Convert numpy array to tensor.
image_tensor = tf.convert_to_tensor(image_array,
                                    dtype=tf.float32)

# Define a simple max pooling layer.
max_pool = tf.keras.layers.MaxPool2D(
    pool_size=(2, 2), strides=(2, 2))

# Define a simple average pooling layer.
avg_pool = tf.keras.layers.AveragePooling2D(
    pool_size=(2, 2), strides=(2, 2))

# Apply max pooling to the image.
max_pooled = max_pool(image_tensor)
print("Max pooled shape:", max_pooled.shape)

# Apply average pooling to the image.
avg_pooled = avg_pool(image_tensor)
print("Avg pooled shape:", avg_pooled.shape)

# Convert pooled tensors back to numpy.
max_pooled_np = max_pooled.numpy().reshape(
    (max_pooled.shape[1], max_pooled.shape[2]))

# Convert average pooled tensor to numpy.
avg_pooled_np = avg_pooled.numpy().reshape(
    (avg_pooled.shape[1], avg_pooled.shape[2]))

# Print original image values summary.
print("Original top left patch:\n",
      image_array[0, :4, :4, 0])

# Print max pooled values summary.
print("Max pooled values:\n", max_pooled_np)

# Print average pooled values summary.
print("Avg pooled values:\n", avg_pooled_np)

# Confirm spatial size reduction effect.
print("Height reduced from", height, "to",
      max_pooled.shape[1])




### **1.3. Flattening and Linear Layers**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_01_03.jpg?v=1769676755" width="250">



>* Convolutions create spatial feature maps highlighting patterns
>* Flattening turns maps into vectors for classification

>* Linear layers weight flattened features for classes
>* Stacked linear layers build higher-level feature combinations

>* Flatten size controls parameters, cost, overfitting risk
>* Balance compression and detail for accurate, efficient decisions
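
The link between flatten size and parameter count can be worked out by hand. The sketch below traces a 28x28 single-channel input through two unpadded 3x3 convolutions (8 and 16 filters) and two 2x2 pooling steps, then counts the weights of the linear layers that follow. The helper names are hypothetical and the layer sizes are illustrative assumptions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window, stride=None):
    """Spatial size after pooling; stride defaults to the window size."""
    stride = window if stride is None else stride
    return (size - window) // stride + 1


def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features


size = 28
size = conv_out(size, kernel=3)  # 26
size = pool_out(size, window=2)  # 13
size = conv_out(size, kernel=3)  # 11
size = pool_out(size, window=2)  # 5

final_channels = 16
flat_features = size * size * final_channels       # 5 * 5 * 16 = 400
hidden_params = linear_params(flat_features, 32)   # 12832
output_params = linear_params(32, 10)              # 330

print("Flattened length:", flat_features)
print("Hidden layer parameters:", hidden_params)
print("Output layer parameters:", output_params)
```

Most of the parameters sit in the first linear layer, so reducing the flattened length (more pooling, fewer channels) is the most direct way to shrink the classifier head.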



In [None]:
#@title Python Code - Flattening and Linear Layers

# This script explains flattening and linear layers.
# We use TensorFlow to mimic PyTorch style layers.
# Focus is on shapes before and after flattening.

# !pip install tensorflow

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras layers.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)

# Set NumPy random seed.
np.random.seed(seed_value)

# Set TensorFlow random seed.
tf.random.set_seed(seed_value)

# Print TensorFlow version once.
print("TensorFlow version:", tf.__version__)

# Define image height, width, channels.
img_height, img_width, img_channels = 28, 28, 1

# Create a small batch of fake images.
batch_size = 4

# Use random numbers to simulate features.
fake_images = np.random.rand(
    batch_size,
    img_height,
    img_width,
    img_channels,
).astype("float32")

# Define a simple convolutional feature extractor.
feature_extractor = keras.Sequential([
    layers.Input(shape=(img_height, img_width, img_channels)),
    layers.Conv2D(
        filters=8,
        kernel_size=3,
        activation="relu",
    ),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=16, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
])

# Pass fake images through feature extractor.
feature_maps = feature_extractor(fake_images)

# Show shape of feature maps before flattening.
print("Feature maps shape:", feature_maps.shape)

# Build a small classifier with flattening.
classifier = keras.Sequential([
    feature_extractor,
    layers.Flatten(),
    layers.Dense(units=32, activation="relu"),
    layers.Dense(units=10, activation="softmax"),
])

# Get flattened features by calling Flatten directly.
flatten_layer = layers.Flatten()

# Compute flattened output for inspection.
flattened_output = flatten_layer(feature_maps)

# Print shape of flattened vector.
print("Flattened vector shape:", flattened_output.shape)

# Confirm flattened length equals product of dimensions.
fm_shape = feature_maps.shape

# Compute expected flattened length safely.
expected_length = int(fm_shape[1] * fm_shape[2] * fm_shape[3])

# Print expected flattened length for comparison.
print("Expected flattened length:", expected_length)

# Print actual flattened length from tensor.
print("Actual flattened length:", int(flattened_output.shape[1]))

# Create fake labels for ten classes.
fake_labels = np.random.randint(0, 10, size=(batch_size,))

# Compile classifier with simple settings.
classifier.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly to show linear layers working.
history = classifier.fit(
    fake_images,
    fake_labels,
    epochs=2,
    batch_size=batch_size,
    verbose=0,
)

# Evaluate model on same tiny batch.
loss, acc = classifier.evaluate(
    fake_images,
    fake_labels,
    verbose=0,
)

# Print final accuracy to confirm pipeline.
print("Tiny batch accuracy after training:", float(acc))




## **2. Training Vision CNNs**

### **2.1. Image Normalization Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_02_01.jpg?v=1769676838" width="250">



>* Raw pixel scales hinder stable CNN training
>* Normalization rescales pixels to a consistent range

>* Scale pixels and standardize using dataset statistics
>* Center channels so CNN learns shapes, not brightness

>* Pretrained models require matching normalization statistics
>* Consistent normalization stabilizes training and evaluation metrics
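
The two normalization steps (scale to zero to one, then standardize with training statistics) can be sketched on a handful of numbers. The pixel values below are made up for illustration; the important detail is that the test data reuses the mean and standard deviation computed on the training data.

```python
import statistics

# Hypothetical raw pixel intensities in the 0..255 range.
train_pixels = [0, 64, 128, 192, 255]
test_pixels = [32, 200]

# Step 1: scale to the range zero to one.
train_scaled = [p / 255.0 for p in train_pixels]
test_scaled = [p / 255.0 for p in test_pixels]

# Step 2: compute statistics on the training data only.
mean = statistics.fmean(train_scaled)
std = statistics.pstdev(train_scaled)  # population std, like np.std

# Step 3: standardize both splits with the same training statistics.
train_norm = [(p - mean) / std for p in train_scaled]
test_norm = [(p - mean) / std for p in test_scaled]

print("Train mean after normalization:", round(statistics.fmean(train_norm), 6))
print("Train std after normalization:", round(statistics.pstdev(train_norm), 6))
print("Normalized test values:", [round(v, 3) for v in test_norm])
```

The normalized test values will generally not have mean zero and unit variance, and that is expected: the goal is a consistent transformation across splits, not identical statistics.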



In [None]:
#@title Python Code - Image Normalization Basics

# This script shows basic image normalization concepts.
# We use TensorFlow to load and normalize MNIST images.
# Focus on scaling and standardizing pixel intensities.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow.keras import datasets

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset using Keras helper.
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()

# Confirm dataset shapes before normalization.
print("Train shape:", x_train.shape, "Test shape:", x_test.shape)

# Select a small subset for quick demonstration.
subset_size = 1000
x_train_small = x_train[:subset_size]
y_train_small = y_train[:subset_size]

# Convert integer pixels to float32 values.
x_train_float = x_train_small.astype("float32")
x_test_float = x_test.astype("float32")

# Show original pixel range using min and max.
print("Original min, max:", x_train_float.min(), x_train_float.max())

# Scale pixels to range zero to one.
x_train_scaled = x_train_float / 255.0
x_test_scaled = x_test_float / 255.0

# Compute the global mean and standard deviation over all training pixels.
mean_value = np.mean(x_train_scaled)
std_value = np.std(x_train_scaled)

# Print computed mean and standard deviation.
print("Scaled mean:", float(mean_value))
print("Scaled std:", float(std_value))

# Standardize data using training statistics.
x_train_norm = (x_train_scaled - mean_value) / std_value
x_test_norm = (x_test_scaled - mean_value) / std_value

# Verify new distribution after normalization.
print("Norm mean:", float(np.mean(x_train_norm)))
print("Norm std:", float(np.std(x_train_norm)))

# Add channel dimension required for CNN inputs.
x_train_norm = np.expand_dims(x_train_norm, axis=-1)
x_test_norm = np.expand_dims(x_test_norm, axis=-1)

# Confirm final shapes are as expected.
print("Final train shape:", x_train_norm.shape)

# Build a simple CNN model for classification.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(
        8,
        (3, 3),
        activation="relu",
    ),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Compile model with optimizer, loss, and accuracy metric.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly on normalized subset with silent output.
history = model.fit(
    x_train_norm,
    y_train_small,
    epochs=2,
    batch_size=64,
    verbose=0,
)

# Evaluate model on normalized test data silently.
loss, acc = model.evaluate(x_test_norm, y_test, verbose=0)

# Print final test accuracy to summarize effect.
print("Test accuracy with normalization:", float(acc))




### **2.2. CrossEntropy Loss for Labels**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_02_02.jpg?v=1769676910" width="250">



>* Cross entropy measures prediction error against true labels
>* Rewards confident correct guesses, penalizes confident mistakes

>* Logits become class probabilities using softmax
>* Cross entropy penalizes uncertain or wrong predictions

>* Large loss on confident mistakes gives strong gradients
>* Stable gradients guide CNN improvement across applications
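
How softmax and cross entropy interact is easiest to see on a single example. The sketch below is a pure-Python illustration with hypothetical three-class logits; it computes the loss as the negative log probability of the true class, using the log-sum-exp trick for numerical stability.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to one."""
    peak = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Negative log probability of the true class, via log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[true_class]


# The true class is 0 in all three cases.
confident_right = cross_entropy([5.0, 0.0, 0.0], true_class=0)
uncertain = cross_entropy([0.0, 0.0, 0.0], true_class=0)
confident_wrong = cross_entropy([0.0, 5.0, 0.0], true_class=0)

print("Confident and right:", round(confident_right, 4))
print("Uncertain:", round(uncertain, 4))
print("Confident and wrong:", round(confident_wrong, 4))
```

The uncertain prediction costs exactly `log(3)` for three classes, the confident correct one costs much less, and the confident mistake costs much more, which is the behavior the bullets above describe.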



In [None]:
#@title Python Code - CrossEntropy Loss for Labels

# This script shows cross entropy loss usage.
# We train a tiny CNN on MNIST digits.
# Focus is on labels and loss behavior.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import tensorflow and keras submodules.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print tensorflow version in one short line.
print("TensorFlow version:", tf.__version__)

# Report whether a GPU is visible; TensorFlow places operations on it automatically.
physical_gpus = tf.config.list_physical_devices("GPU")
if physical_gpus:
    device_name = "GPU"
else:
    device_name = "CPU"

# Print selected device type briefly.
print("Using device:", device_name)

# Load MNIST dataset from keras datasets.
(mnist_x_train, mnist_y_train), (mnist_x_test, mnist_y_test) = (
    keras.datasets.mnist.load_data()
)

# Reduce dataset size for quick demonstration.
train_limit = 2000
test_limit = 500
x_train = mnist_x_train[:train_limit]
y_train = mnist_y_train[:train_limit]

# Slice test data to small subset.
x_test = mnist_x_test[:test_limit]
y_test = mnist_y_test[:test_limit]

# Normalize images to the range zero to one.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension for convolution layers.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Validate shapes before building model.
print("Train shape:", x_train.shape, y_train.shape)

# Define number of classes for digits.
num_classes = 10
input_shape = x_train.shape[1:]

# Build a simple sequential CNN model.
model = keras.Sequential([
    layers.Input(shape=input_shape),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(num_classes),
])

# Use sparse categorical cross entropy: integer labels and raw logits as input.
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Compile model with optimizer and metrics.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss=loss_fn,
    metrics=["accuracy"],
)

# Train model briefly with silent verbose.
history = model.fit(
    x_train,
    y_train,
    epochs=3,
    batch_size=64,
    validation_split=0.1,
    verbose=0,
)

# Evaluate model on held out test data.
test_loss, test_acc = model.evaluate(
    x_test,
    y_test,
    verbose=0,
)

# Print concise evaluation results.
print("Test loss (cross entropy):", round(test_loss, 4))
print("Test accuracy:", round(test_acc, 4))

# Take small batch to inspect predictions.
sample_images = x_test[:5]
sample_labels = y_test[:5]
logits = model.predict(sample_images, verbose=0)

# Convert logits to probabilities with softmax.
probabilities = tf.nn.softmax(logits, axis=-1).numpy()

# Print predicted class and true label pairs.
for idx in range(len(sample_labels)):
    true_label = int(sample_labels[idx])
    pred_label = int(np.argmax(probabilities[idx]))
    true_prob = float(probabilities[idx, true_label])
    print(
        "Sample",
        idx,
        "true:",
        true_label,
        "pred:",
        pred_label,
        "p_true:",
        round(true_prob, 3),
    )




### **2.3. Accuracy Metric Setup**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_02_03.jpg?v=1769676994" width="250">



>* Accuracy counts how many predictions are correct
>* Compute using argmax class, compare, then average

>* Track correct predictions per batch, then aggregate
>* Epoch-wide averaging gives stable, trustworthy accuracy

>* Accuracy is useful but limited, especially with imbalanced classes
>* Track train and validation accuracy to detect overfitting
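
The reason to aggregate counts across batches, rather than average per-batch accuracies, shows up as soon as batch sizes differ. The sketch below uses two hypothetical batches of two-class logits (the last batch is smaller, as often happens at the end of an epoch) and compares both ways of computing accuracy.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Each batch is (logits, labels); the second batch has a single sample.
batches = [
    ([[2.0, 0.1], [0.3, 1.5], [1.2, 0.4], [0.2, 0.9]], [0, 1, 0, 0]),
    ([[0.1, 0.8]], [0]),
]

correct = 0
total = 0
batch_accuracies = []

for logits, labels in batches:
    # Predicted class is the index of the largest logit.
    preds = [argmax(row) for row in logits]
    hits = sum(int(p == y) for p, y in zip(preds, labels))

    # Accumulate raw counts so every sample has equal weight.
    correct += hits
    total += len(labels)
    batch_accuracies.append(hits / len(labels))

epoch_accuracy = correct / total
mean_of_batches = sum(batch_accuracies) / len(batch_accuracies)

print("Epoch accuracy from counts:", epoch_accuracy)
print("Mean of batch accuracies:", mean_of_batches)
```

The count-based value (3 correct out of 5) is the true fraction of correct predictions; the mean of batch accuracies gives the single-sample batch the same weight as the four-sample batch and understates performance here.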



In [None]:
#@title Python Code - Accuracy Metric Setup

# This script shows accuracy metric setup.
# We use TensorFlow for a tiny example.
# Focus is on simple clear accuracy calculation.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic random seeds.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version once.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset from Keras.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Select a small subset for speed.
train_samples = 2000
test_samples = 500
x_train = x_train[:train_samples]
y_train = y_train[:train_samples]

# Slice test data subset.
x_test = x_test[:test_samples]
y_test = y_test[:test_samples]

# Normalize pixel values to the range zero to one.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension for convolution layers.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Validate shapes before building model.
assert x_train.shape[1:] == (28, 28, 1)
assert x_test.shape[1:] == (28, 28, 1)

# Build a tiny CNN model.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(10)
])

# Define loss function for logits.
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Define optimizer with safe learning rate.
optimizer = keras.optimizers.Adam(learning_rate=0.001)

# Prepare metric variables for accuracy.
train_correct = 0
train_total = 0
test_correct = 0
test_total = 0

# Convert labels to int64 for safety.
y_train = y_train.astype("int64")
y_test = y_test.astype("int64")

# Define batch size and epochs.
batch_size = 64
epochs = 2

# Training loop over epochs.
for epoch in range(epochs):
    # Reset counters each epoch.
    train_correct = 0
    train_total = 0

    # Iterate over training batches.
    for start in range(0, train_samples, batch_size):
        end = min(start + batch_size, train_samples)
        x_batch = x_train[start:end]
        y_batch = y_train[start:end]

        # Use GradientTape for training.
        with tf.GradientTape() as tape:
            logits = model(x_batch, training=True)
            loss_value = loss_fn(y_batch, logits)

        # Apply gradients to model weights.
        grads = tape.gradient(loss_value, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Compute predicted class indices.
        preds = tf.argmax(logits, axis=1, output_type=tf.int64)

        # Count correct predictions in batch.
        correct_batch = tf.reduce_sum(tf.cast(preds == y_batch, tf.int32))

        # Update running totals for accuracy.
        train_correct += int(correct_batch.numpy())
        train_total += y_batch.shape[0]

    # Compute epoch training accuracy.
    train_accuracy = train_correct / train_total

    # Evaluate accuracy on test subset.
    test_correct = 0
    test_total = 0

    # Loop over test batches for accuracy.
    for start in range(0, test_samples, batch_size):
        end = min(start + batch_size, test_samples)
        x_batch = x_test[start:end]
        y_batch = y_test[start:end]

        # Forward pass without gradient.
        logits = model(x_batch, training=False)

        # Compute predicted classes for test.
        preds = tf.argmax(logits, axis=1, output_type=tf.int64)

        # Count correct predictions for test.
        correct_batch = tf.reduce_sum(tf.cast(preds == y_batch, tf.int32))

        # Update running test totals.
        test_correct += int(correct_batch.numpy())
        test_total += y_batch.shape[0]

    # Compute epoch test accuracy.
    test_accuracy = test_correct / test_total

    # Print concise accuracy summary.
    print(
        f"Epoch {epoch + 1}: train_acc={train_accuracy:.3f}, test_acc={test_accuracy:.3f}"
    )




## **3. Evaluating CNN Performance**

### **3.1. Reading Confusion Matrices**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_03_01.jpg?v=1769677083" width="250">



>* Confusion matrices summarize predictions for every class
>* Rows show true classes, columns show predicted classes

>* Check diagonal versus off-diagonal counts and patterns
>* Use asymmetries to judge error types and costs

>* Look for repeated confusions between similar classes
>* Use these patterns to guide data collection and model improvements
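
Reading a confusion matrix comes down to comparing each diagonal entry with its row and its column. The sketch below builds a small matrix by hand, with rows as true classes and columns as predicted classes, and derives per-class recall and precision from it. The labels are made up, and the helper names are hypothetical.

```python
def build_confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Diagonal count divided by the row total (all samples of that class)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def per_class_precision(matrix):
    """Diagonal count divided by the column total (all predictions of that class)."""
    precisions = []
    for j in range(len(matrix)):
        column_total = sum(row[j] for row in matrix)
        precisions.append(matrix[j][j] / column_total if column_total else 0.0)
    return precisions


# Hypothetical labels for a three-class problem.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 0, 2]

cm = build_confusion_matrix(y_true, y_pred, num_classes=3)
recall = per_class_recall(cm)
precision = per_class_precision(cm)

for row in cm:
    print(row)
print("Recall per class:", [round(r, 3) for r in recall])
print("Precision per class:", [round(p, 3) for p in precision])
```

Here class 1 is never missed (recall 1.0) but absorbs a mistake from class 0, so its precision drops; this kind of asymmetry is what the off-diagonal cells reveal.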



In [None]:
#@title Python Code - Reading Confusion Matrices

# This script shows confusion matrices for CNN predictions.
# It uses TensorFlow to train a tiny image classifier.
# Focus is on reading confusion matrices and errors.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow import keras

# Import confusion matrix and plotting tools.
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset of handwritten digits.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Select a small subset for quick training.
train_samples = 4000
test_samples = 1000
x_train = x_train[:train_samples]
y_train = y_train[:train_samples]

# Slice test data to the chosen subset.
x_test = x_test[:test_samples]
y_test = y_test[:test_samples]

# Normalize pixel values to range zero to one.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension required by Conv2D.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Confirm shapes are as expected.
print("Train shape:", x_train.shape, y_train.shape)
print("Test shape:", x_test.shape, y_test.shape)

# Build a simple CNN model for classification.
model = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(
        8,
        (3, 3),
        activation="relu",
    ),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(16, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

# Compile model with suitable loss and metric.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly with silent output for speed.
model.fit(
    x_train,
    y_train,
    epochs=2,
    batch_size=64,
    verbose=0,
)

# Evaluate accuracy on the test subset.
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print("Test accuracy:", round(float(acc), 4))

# Get predicted class probabilities for test images.
y_prob = model.predict(x_test, verbose=0)

# Convert probabilities to predicted class indices.
y_pred = np.argmax(y_prob, axis=1)

# Validate prediction and label shapes match.
assert y_pred.shape == y_test.shape

# Compute confusion matrix using sklearn helper.
cm = confusion_matrix(y_test, y_pred, labels=list(range(10)))

# Print a small summary of diagonal correctness.
correct_per_class = np.diag(cm)
print("Correct per class:", correct_per_class)

# Create a simple confusion matrix heatmap plot.
fig, ax = plt.subplots(figsize=(5, 5))

# Show matrix as an image with color scale.
im = ax.imshow(cm, cmap="Blues")

# Add axis labels for true and predicted classes.
ax.set_xlabel("Predicted class")
ax.set_ylabel("True class")

# Add a short title explaining the plot.
ax.set_title("MNIST confusion matrix (subset)")

# Add colorbar to interpret magnitude visually.
fig.colorbar(im, ax=ax)

# Highlight a few typical mistakes for beginners.
mis_idx = np.where(y_pred != y_test)[0]

# If there are mistakes, print first few examples.
if mis_idx.size > 0:
    sample_idx = mis_idx[:5]
    print("Example errors (true, pred):")
    for i in sample_idx:
        print(int(y_test[i]), int(y_pred[i]))

# Display the confusion matrix plot once.
plt.tight_layout()
plt.show()



### **3.2. Inspecting Misclassified Images**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_03_02.jpg?v=1769677166" width="250">



>* Look at images the CNN predicts incorrectly
>* Connect wrong predictions to specific visual challenges

>* Look for repeated patterns in wrong predictions
>* Group errors to uncover systematic model weaknesses

>* Use misclassified images to guide data improvements
>* Target risky errors with specialized data and review



In [None]:
#@title Python Code - Inspecting Misclassified Images

# This script inspects misclassified CNN predictions visually.
# It trains a tiny CNN on MNIST digit images briefly.
# Then it plots a few misclassified test images clearly.

# !pip install tensorflow==2.20.0

# Import required standard libraries safely.
import os
import random
import numpy as np

# Import TensorFlow and Keras utilities.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic random seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print TensorFlow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset using Keras helper.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Reduce dataset size for faster execution.
x_train = x_train[:8000]
y_train = y_train[:8000]
x_test = x_test[:2000]
y_test = y_test[:2000]

# Normalize pixel values to the range zero to one.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension required by Conv2D.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Confirm shapes are as expected.
assert x_train.shape[1:] == (28, 28, 1)
assert x_test.shape[1:] == (28, 28, 1)

# Build a simple convolutional neural network.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Compile model with suitable loss and metric.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly with silent output settings.
history = model.fit(
    x_train,
    y_train,
    epochs=2,
    batch_size=64,
    verbose=0,
    validation_split=0.1,
)

# Evaluate model accuracy on test set.
test_loss, test_acc = model.evaluate(
    x_test,
    y_test,
    verbose=0,
)

# Print concise test accuracy information.
print("Test accuracy:", round(float(test_acc), 4))

# Get predicted class probabilities for test set.
probs = model.predict(x_test, verbose=0)

# Convert probabilities to predicted class indices.
y_pred = np.argmax(probs, axis=1)

# Ensure prediction shape matches labels shape.
assert y_pred.shape == y_test.shape

# Identify indices where predictions are incorrect.
mis_idx = np.where(y_pred != y_test)[0]

# Handle case with very few misclassifications.
num_to_show = min(9, mis_idx.shape[0])

# Print how many misclassified images exist.
print("Total misclassified images:", int(mis_idx.shape[0]))

# Select a small subset of misclassified indices.
selected_idx = mis_idx[:num_to_show]

# Import matplotlib for plotting images.
import matplotlib.pyplot as plt

# Create a grid for misclassified images (at least one row, even with no errors).
cols = 3
rows = max(1, int(np.ceil(num_to_show / cols)))
fig, axes = plt.subplots(rows, cols, figsize=(6, 6))

# Flatten axes array for easier iteration.
axes = np.array(axes).reshape(-1)

# Loop through selected misclassified examples.
for i, ax in enumerate(axes):
    # Clear unused subplot axes if needed.
    if i >= num_to_show:
        ax.axis("off")
        continue

    # Get index of current misclassified image.
    idx = int(selected_idx[i])

    # Extract image, true label, and predicted label.
    img = x_test[idx].squeeze()
    true_label = int(y_test[idx])
    pred_label = int(y_pred[idx])

    # Show grayscale image without axis ticks.
    ax.imshow(img, cmap="gray")
    ax.axis("off")

    # Add informative title with labels.
    ax.set_title(
        f"T:{true_label} P:{pred_label}",
        fontsize=8,
    )

# Adjust layout to avoid overlapping titles.
plt.tight_layout()
plt.show()




### **3.3. Spotting Overfitting Patterns**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master PyTorch 2.10.0/Module_05/Lecture_A/image_03_03.jpg?v=1769677259" width="250">



>* Compare training and validation performance to detect overfitting
>* Watch for widening accuracy and loss gaps over time

>* Overfitting shows in loss curves and errors
>* Watch training versus validation trends to detect it early

>* Check mistakes and sensitivity to small changes
>* Compare subsets to see brittle, non-general features
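
The widening gap between training and validation loss can be turned into a simple stopping rule. The sketch below uses made-up loss curves and two hypothetical helpers: one finds the epoch with the lowest validation loss, the other simulates patience-based early stopping. It is one common heuristic, not the only way to detect overfitting.

```python
def best_epoch(val_losses):
    """1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i]) + 1


def early_stop_epoch(val_losses, patience=2):
    """1-based epoch at which training would stop with the given patience."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)


# Hypothetical curves: training loss keeps falling, validation loss turns upward.
train_loss = [0.90, 0.50, 0.30, 0.18, 0.10, 0.05]
val_loss = [0.95, 0.60, 0.45, 0.47, 0.52, 0.60]

# Generalization gap per epoch (validation minus training loss).
gaps = [v - t for t, v in zip(train_loss, val_loss)]

best = best_epoch(val_loss)
stop = early_stop_epoch(val_loss, patience=2)

print("Gap per epoch:", [round(g, 2) for g in gaps])
print("Best validation loss at epoch:", best)
print("Early stopping would trigger at epoch:", stop)
```

With these numbers the best epoch is 3 and training would stop at epoch 5, after two epochs without improvement; the weights worth keeping are those from the best epoch, not the last one.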



In [None]:
#@title Python Code - Spotting Overfitting Patterns

# This script shows CNN overfitting patterns clearly.
# We use a tiny MNIST subset for quick training.
# Focus on training versus validation loss and accuracy.

# !pip install tensorflow==2.20.0

# Import required standard libraries.
import os
import random
import numpy as np

# Import tensorflow and keras submodules.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Set deterministic seeds for reproducibility.
seed_value = 42
random.seed(seed_value)
np.random.seed(seed_value)
tf.random.set_seed(seed_value)

# Print tensorflow version in one short line.
print("TensorFlow version:", tf.__version__)

# Load MNIST dataset from keras datasets.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values to the range zero to one.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Add channel dimension for convolution layers.
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

# Select small subsets to exaggerate overfitting.
train_samples = 2000
val_samples = 2000
x_train_small = x_train[:train_samples]
y_train_small = y_train[:train_samples]

# Create validation subset from test data (demo only; in practice hold out part of the training set).
x_val_small = x_test[:val_samples]
y_val_small = y_test[:val_samples]

# Confirm shapes are as expected.
print("Train subset shape:", x_train_small.shape)
print("Validation subset shape:", x_val_small.shape)

# Build a small convolutional neural network.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Compile model with suitable loss and metric.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train model and store training history.
history = model.fit(
    x_train_small,
    y_train_small,
    epochs=15,
    batch_size=64,
    validation_data=(x_val_small, y_val_small),
    verbose=0,
)

# Extract loss and accuracy curves from history.
train_loss = history.history["loss"]
val_loss = history.history["val_loss"]
train_acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]

# Print an aligned table of epoch metrics.
print(f"{'Epoch':>5}  {'TrainLoss':>9}  {'ValLoss':>7}  {'TrainAcc':>8}  {'ValAcc':>6}")
for epoch in range(len(train_loss)):
    print(
        f"{epoch + 1:>5}  {train_loss[epoch]:>9.3f}  {val_loss[epoch]:>7.3f}  "
        f"{train_acc[epoch]:>8.3f}  {val_acc[epoch]:>6.3f}"
    )

# Find the epoch with the lowest validation loss; rising loss afterwards suggests overfitting.
best_val_loss = min(val_loss)
best_epoch = val_loss.index(best_val_loss) + 1

# Print simple interpretation about overfitting start.
print("Best validation loss at epoch:", best_epoch)



# <font color="#418FDE" size="6.5" uppercase>**CNNs from Scratch**</font>


In this lecture, you learned to:
- Construct a convolutional neural network using Conv2d, pooling, and fully connected layers in PyTorch. 
- Train the CNN on a small image dataset using the standard training loop and appropriate loss and metrics. 
- Analyze model performance using accuracy, confusion matrices, and simple error inspection. 

In the next Lecture (Lecture B), we will go over 'Transfer Learning'.