# 🎯 Model 2: Modified Convolution Model (Stride = 1)

Based on **Generative Deep Learning (2nd Edition)**, Chapter 2 - Convolutions

This notebook implements Model 2. It has the same architecture as Model 1, but sets stride = 1 explicitly on the convolution layer to show how stride determines the size of the feature maps.

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, optimizers
import numpy as np


## 1. Load and Prepare the Dataset

This notebook uses the MNIST dataset of handwritten digits, the same as Model 1. Images are 28x28 grayscale, with 10 classes (digits 0-9).

In [2]:
# Load MNIST dataset
print("=" * 70)
print("LOADING DATASET")
print("=" * 70)
(x_train_full, y_train_full), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values to [0, 1] range for better training stability
x_train_full = x_train_full.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Expand dimensions to add channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
# Conv2D layers require input shape: (height, width, channels)
x_train_full = np.expand_dims(x_train_full, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

print(f"Training data shape: {x_train_full.shape}")
print(f"Test data shape: {x_test.shape}")

LOADING DATASET
Training data shape: (60000, 28, 28, 1)
Test data shape: (10000, 28, 28, 1)


## 2. Train / Validation Split

Same split as Model 1: 50,000 samples for training, 10,000 for validation.

In [3]:
x_train, x_val = x_train_full[:50000], x_train_full[50000:]
y_train, y_val = y_train_full[:50000], y_train_full[50000:]

print(f"\nTrain set: {x_train.shape[0]} samples")
print(f"Validation set: {x_val.shape[0]} samples")
print(f"Test set: {x_test.shape[0]} samples")


Train set: 50000 samples
Validation set: 10000 samples
Test set: 10000 samples


## 3. Build the Modified Convolution Model (Stride = 1)

**KEY MODIFICATION: stride = (1, 1)**

- Stride = 1 means the filter moves 1 pixel at a time in both directions
- This preserves maximum spatial information from the input
- With valid padding, the output feature map size is: `output_size = floor((input_size - kernel_size) / stride) + 1`
- For a 28x28 input with a 3x3 kernel and stride = 1: (28 - 3) / 1 + 1 = 26, giving a 26x26 feature map
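
The output-size rule for valid padding can be checked with a few lines of plain Python. This is a minimal sketch; the helper name `conv_output_size` is hypothetical and is not part of Keras.

```python
def conv_output_size(input_size: int, kernel_size: int, stride: int = 1) -> int:
    """Spatial output size of a convolution with 'valid' (no) padding."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if kernel_size > input_size:
        raise ValueError("kernel must fit inside the input")
    # Number of stride-sized steps the kernel can take, plus the starting position
    return (input_size - kernel_size) // stride + 1


# Feature-map side length for a 28x28 input and a 3x3 kernel
for stride in (1, 2, 3):
    size = conv_output_size(28, 3, stride)
    print(f"stride={stride}: {size}x{size} feature map")
```

For stride = 1 this gives the 26x26 map used in this notebook, and for stride = 2 it gives 13x13.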

**Why stride = 1 preserves more spatial information:**
- The filter examines every possible position in the input image
- No information is skipped between positions
- This results in a larger feature map (26x26) compared to stride = 2 (13x13)
- More spatial detail is retained, which can help with fine-grained features
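
The claim that a larger stride skips positions can be illustrated with a naive NumPy convolution. This is a sketch for intuition only, not how Keras computes `Conv2D`; `conv2d_valid`, the random image, and the averaging kernel are hypothetical stand-ins.

```python
import numpy as np


def conv2d_valid(image: np.ndarray, kernel: np.ndarray, stride: int = 1) -> np.ndarray:
    """Naive single-channel 2D cross-correlation with 'valid' padding.

    Assumes a square kernel.
    """
    k = kernel.shape[0]
    out_h = (image.shape[0] - k) // stride + 1
    out_w = (image.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride  # top-left corner of the current window
            out[i, j] = np.sum(image[r:r + k, c:c + k] * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.random((28, 28)).astype("float32")      # stand-in for one MNIST digit
kernel = np.ones((3, 3), dtype="float32") / 9.0     # simple averaging filter

map_s1 = conv2d_valid(image, kernel, stride=1)
map_s2 = conv2d_valid(image, kernel, stride=2)

print(map_s1.shape, map_s2.shape)
# The stride-2 map contains only every other row and column of the stride-1 map
print(np.allclose(map_s2, map_s1[::2, ::2]))
```

The stride = 2 output is a subsample of the stride = 1 output, which is the sense in which a larger stride discards spatial detail.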

**Comparison to Model 1:**
- Model 1 also used stride = 1, so this model is architecturally identical
- However, explicitly setting stride = 1 demonstrates the baseline case
- If stride were 2, the feature map would be 13x13, losing spatial resolution
- Stride = 1 keeps the highest spatial resolution the kernel allows, at the cost of a larger feature map and therefore more weights in the dense layer that follows

In [None]:
print("=" * 70)
print("BUILDING MODEL 2: MODIFIED CONVOLUTION (STRIDE = 1)")
print("=" * 70)

model = models.Sequential()

# Input layer (Keras 3.x recommended approach to avoid warnings)
model.add(layers.Input(shape=(28, 28, 1)))

# Convolution layer with stride = 1 (explicitly set)
model.add(layers.Conv2D(filters=32,
                        kernel_size=(3, 3),
                        strides=(1, 1),  # EXPLICITLY SET TO 1
                        padding='valid',
                        activation='relu'))

# Flatten the 2D feature maps into a 1D vector for dense layers
# Output from Conv2D: (batch, 26, 26, 32) -> Flatten -> (batch, 26*26*32)
# With stride = 1, we get 26x26 = 676 spatial positions per filter
# Total flattened size: 26 * 26 * 32 = 21,632 features
model.add(layers.Flatten())

# Dense (fully connected) layer for feature combination
# 32 units: same as Model 1 for direct comparison
model.add(layers.Dense(units=32, activation='relu'))

# Output layer: 10 units for 10 digit classes, softmax for probability distribution
model.add(layers.Dense(units=10, activation='softmax'))

BUILDING MODEL 2: MODIFIED CONVOLUTION (STRIDE = 1)
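
The layer shapes in the comments above also fix the number of trainable parameters, which can be worked out by hand. The sketch below is plain arithmetic on the architecture defined in this cell; the variable names are illustrative.

```python
# Conv2D: 32 filters, each 3x3 over 1 input channel, plus one bias per filter
conv_params = (3 * 3 * 1 + 1) * 32

# Valid padding with stride 1: 28 - 3 + 1 = 26 positions per axis
conv_out = 28 - 3 + 1
flat_features = conv_out * conv_out * 32  # size of the Flatten output

# Dense(32): one weight per flattened feature per unit, plus one bias per unit
dense_params = (flat_features + 1) * 32

# Dense(10): 32 inputs per unit, plus one bias per unit
output_params = (32 + 1) * 10

total_params = conv_params + dense_params + output_params

print(f"Conv2D:    {conv_params:,}")
print(f"Flatten:   {flat_features:,} features (no parameters)")
print(f"Dense(32): {dense_params:,}")
print(f"Dense(10): {output_params:,}")
print(f"Total:     {total_params:,}")
```

Nearly all of the parameters sit in the first dense layer, because the 26x26x32 feature map is flattened without any pooling.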



## 4. Compile the Model

The optimizer and loss match Model 1 for direct comparison: RMSprop with sparse categorical crossentropy, which accepts integer class labels (0-9) directly rather than one-hot vectors.
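
To make the loss concrete, here is a rough NumPy sketch of what sparse categorical crossentropy computes: the mean negative log-probability assigned to the true class. The function below is an illustration rather than Keras's implementation, and the `eps` clipping value and the example probabilities are assumptions.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true: np.ndarray, probs: np.ndarray,
                                    eps: float = 1e-7) -> float:
    """Mean of -log(p[true class]) over a batch of integer labels."""
    probs = np.clip(probs, eps, 1.0)                    # avoid log(0)
    picked = probs[np.arange(len(y_true)), y_true]      # probability of each true class
    return float(np.mean(-np.log(picked)))


# Two made-up softmax outputs over three classes, with integer labels
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])

loss = sparse_categorical_crossentropy(labels, probs)
print(f"loss = {loss:.4f}")  # mean of -ln(0.7) and -ln(0.8)
```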

In [1]:
model.compile(optimizer=optimizers.RMSprop(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


## 5. Model Summary and Convolution Layer Configuration

In [2]:
print("=" * 70)
print("MODEL SUMMARY")
print("=" * 70)
model.summary()

# Extract and display convolution layer configuration
print("\n" + "=" * 70)
print("CONVOLUTION LAYER CONFIGURATION")
print("=" * 70)
for layer in model.layers:
    if isinstance(layer, layers.Conv2D):
        print(f"Layer Name: {layer.name}")
        print(f"  Filters: {layer.filters}")
        print(f"  Kernel Size: {layer.kernel_size}")
        print(f"  Strides: {layer.strides}  <-- EXPLICITLY SET TO (1, 1)")
        print(f"  Padding: {layer.padding}")
        # Get the activation name, falling back to its string form
        activation_name = getattr(layer.activation, '__name__', str(layer.activation))
        print(f"  Activation: {activation_name}")
        print("  Input Shape: (28, 28, 1)")
        print("  Output Shape: (26, 26, 32)  [28-3+1=26 with stride=1, valid padding, 32 filters]")
        print("\n  NOTE: Stride = 1 means the filter examines every pixel position,")
        print("        preserving maximum spatial information in the feature map.")

MODEL SUMMARY



## 6. Train the Model

Training with 5 epochs, batch size 64, RMSprop optimizer, and sparse categorical crossentropy loss. **Key modification: Convolution Stride = (1, 1)**
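
These settings determine how many weight updates the model performs. The short calculation below is a sketch using the split sizes from Section 2; the variable names are illustrative.

```python
import math

train_samples = 50_000
batch_size = 64
epochs = 5

# The final batch of each epoch is smaller when the split is not a multiple of 64
steps_per_epoch = math.ceil(train_samples / batch_size)
last_batch_size = train_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(f"Steps per epoch:       {steps_per_epoch}")
print(f"Size of final batch:   {last_batch_size}")
print(f"Total gradient steps:  {total_updates}")
```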

In [None]:
print("=" * 70)
print("TRAINING MODEL 2: MODIFIED CONVOLUTION (STRIDE = 1)")
print("=" * 70)
print("Training with the following settings:")
print("  - Epochs: 5")
print("  - Batch Size: 64")
print("  - Optimizer: RMSprop")
print("  - Loss: Sparse Categorical Crossentropy")
print("  - Convolution Stride: (1, 1)  <-- KEY MODIFICATION")
print("=" * 70)

# Train the model
history = model.fit(x_train, y_train,
                    epochs=5,
                    batch_size=64,
                    validation_data=(x_val, y_val),
                    verbose=1)

# Extract final training and validation accuracies
final_train_acc = history.history['accuracy'][-1]
final_val_acc = history.history['val_accuracy'][-1]

# Print per-epoch accuracies
print("\n" + "=" * 70)
print("TRAINING HISTORY - ACCURACY PER EPOCH")
print("=" * 70)
for epoch in range(len(history.history['accuracy'])):
    print(f"Epoch {epoch + 1}:")
    print(f"  Training Accuracy:   {history.history['accuracy'][epoch]:.4f}")
    print(f"  Validation Accuracy: {history.history['val_accuracy'][epoch]:.4f}")

## 7. Evaluate on Test Set

In [None]:
print("=" * 70)
print("EVALUATING ON TEST SET")
print("=" * 70)
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=1)
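
The accuracy metric is the fraction of images whose highest-probability class matches the label. The sketch below reproduces that calculation on a small made-up array standing in for the output of `model.predict`; the values are hypothetical and are not results from this model.

```python
import numpy as np

# Hypothetical softmax outputs for four samples over three classes
probs = np.array([[0.1, 0.9, 0.0],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3]])
labels = np.array([1, 0, 1, 1])

predicted = probs.argmax(axis=1)                 # most likely class per sample
accuracy = float(np.mean(predicted == labels))   # fraction of correct predictions

print(f"Predicted classes: {predicted}")
print(f"Accuracy: {accuracy:.2f}")
```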

## 8. Final Results (Formatted for Academic Submission)

In [None]:
print("\n" + "=" * 70)
print("MODEL 2: MODIFIED CONVOLUTION (STRIDE = 1)")
print("=" * 70)
print(f"Final Training Accuracy:   {final_train_acc:.4f}")
print(f"Final Validation Accuracy: {final_val_acc:.4f}")
print(f"Final Test Accuracy:       {test_acc:.4f}")
print(f"Final Test Loss:           {test_loss:.4f}")
print("=" * 70)

print("\nModel Configuration Summary:")
print("  - Architecture: Conv2D -> Flatten -> Dense -> Dense")
print("  - Conv2D: 32 filters, 3x3 kernel, stride = (1, 1), valid padding")
print("  - Stride = 1 preserves maximum spatial resolution (26x26 feature maps)")
print("  - This model is architecturally identical to Model 1")
print("  - Stride = 1 allows the filter to examine every pixel position")
print("  - No pooling, no batch normalization, minimal architecture")
print("=" * 70)

print("\nComparison Notes:")
print("  - Stride = 1 means the convolution filter moves 1 pixel at a time")
print("  - This preserves more spatial information than larger strides")
print("  - Feature map size: 26x26 (with 3x3 kernel and valid padding)")
print("  - If stride were 2, feature map would be 13x13, losing spatial detail")
print("  - Stride = 1 keeps the most fine-grained image detail for this kernel size")
print("=" * 70)