# Notebook 1: Pooling Layers Deep Dive

**Course:** 21CSE558T - Deep Neural Network Architectures  
**Module:** 4 - CNNs (Week 2 of 3)  
**Date:** October 31, 2025  
**Duration:** ~25 minutes

---

## Learning Objectives

By the end of this notebook, you will be able to:
1. Explain the purpose of pooling layers in CNNs
2. Compare Max Pooling, Average Pooling, and Global Average Pooling
3. Calculate output dimensions after pooling operations
4. Understand the parameter-reduction benefits of Global Average Pooling
5. Implement all three pooling types in Keras

---

## Setup

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import (
    Conv2D, MaxPooling2D, AveragePooling2D, 
    GlobalAveragePooling2D, Flatten, Dense
)
from tensorflow.keras.models import Sequential

print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

# Set random seed for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

---

## Part 1: Understanding Pooling - Character: Meera's Photography Studio

**Meet Character: Meera - Portrait Photographer**

**Character: Meera** runs a portrait photography studio. She takes hundreds of photos per session, but clients only want the highlights.

**Her Selection Strategy:**
1. **For action shots:** Pick the SHARPEST image (highest quality) → **Max Pooling**
2. **For group photos:** Blend multiple shots (average expressions) → **Average Pooling**
3. **For portfolio:** One representative photo of the entire session → **Global Average Pooling**

---

## Part 2: Max Pooling - Keep the Strongest Signals

In [None]:
# Create a sample 4×4 feature map
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 3],
    [0, 1, 5, 2],
    [2, 3, 1, 4]
])

print("Original Feature Map (4×4):")
print(feature_map)
print(f"Shape: {feature_map.shape}")

In [None]:
# Manual Max Pooling (2×2, stride 2)
def manual_max_pool(feature_map, pool_size=2, stride=2):
    """
    Manually perform max pooling operation.
    
    Character: Meera picks the sharpest photo from each group!
    """
    h, w = feature_map.shape
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    
    output = np.zeros((out_h, out_w))
    
    for i in range(out_h):
        for j in range(out_w):
            # Extract pool region
            region = feature_map[
                i*stride:i*stride+pool_size,
                j*stride:j*stride+pool_size
            ]
            # Take maximum value
            output[i, j] = np.max(region)
            print(f"Region [{i},{j}]: {region.flatten()} → Max: {output[i,j]}")
    
    return output

# Apply max pooling
print("\nApplying Max Pooling (2×2, stride=2):\n")
max_pooled = manual_max_pool(feature_map)

print("\nMax Pooled Output (2×2):")
print(max_pooled)
print(f"Shape: {max_pooled.shape}")
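
The loop above makes each step visible, but when the stride equals the pool size the same result can be obtained with a reshape. The sketch below is one possible vectorized version; `pool2d_blocks` is a hypothetical helper name, and it assumes both dimensions are divisible by `pool_size`.

```python
import numpy as np

def pool2d_blocks(feature_map, pool_size=2, reduce_fn=np.max):
    """Non-overlapping pooling (stride == pool_size) via reshape.

    Assumes height and width are both divisible by pool_size.
    """
    h, w = feature_map.shape
    # Axes become (block_row, row_in_block, block_col, col_in_block)
    blocks = feature_map.reshape(h // pool_size, pool_size,
                                 w // pool_size, pool_size)
    # Reduce over the two "inside the block" axes
    return reduce_fn(blocks, axis=(1, 3))

fm = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 3],
    [0, 1, 5, 2],
    [2, 3, 1, 4],
])

max_result = pool2d_blocks(fm)                      # max of each 2×2 block
mean_result = pool2d_blocks(fm, reduce_fn=np.mean)  # mean of each 2×2 block
print(max_result)
print(mean_result)
```

Passing `np.mean` as `reduce_fn` turns the same helper into average pooling, which is why the two operations share identical output shapes.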

In [None]:
# Visualize Max Pooling
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Original
im1 = axes[0].imshow(feature_map, cmap='viridis', interpolation='nearest')
axes[0].set_title('Original Feature Map (4×4)', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Width')
axes[0].set_ylabel('Height')
for i in range(4):
    for j in range(4):
        axes[0].text(j, i, f'{feature_map[i,j]:.0f}', 
                    ha='center', va='center', color='white', fontsize=12, fontweight='bold')
plt.colorbar(im1, ax=axes[0])

# Max Pooled
im2 = axes[1].imshow(max_pooled, cmap='viridis', interpolation='nearest')
axes[1].set_title('After Max Pooling (2×2)', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Width')
axes[1].set_ylabel('Height')
for i in range(2):
    for j in range(2):
        axes[1].text(j, i, f'{max_pooled[i,j]:.0f}', 
                    ha='center', va='center', color='white', fontsize=12, fontweight='bold')
plt.colorbar(im2, ax=axes[1])

plt.tight_layout()
plt.show()

print("\n📊 Key Observation: Max pooling keeps strongest features (highest values)")

### Max Pooling in Keras

In [None]:
# Using Keras MaxPooling2D
# Need to add batch and channel dimensions: (batch, height, width, channels)
feature_map_keras = feature_map.reshape(1, 4, 4, 1).astype('float32')

# Create MaxPooling layer
max_pool_layer = MaxPooling2D(pool_size=(2, 2), strides=2)

# Apply pooling
max_pooled_keras = max_pool_layer(feature_map_keras)

print("Keras MaxPooling2D Result:")
print(max_pooled_keras.numpy().squeeze())
print(f"Shape: {max_pooled_keras.shape} → (batch=1, height=2, width=2, channels=1)")

# Verify it matches our manual implementation
print("\n✅ Matches manual implementation:", 
      np.allclose(max_pooled, max_pooled_keras.numpy().squeeze()))

---

## Part 3: Average Pooling - Smooth Features

In [None]:
# Manual Average Pooling
def manual_avg_pool(feature_map, pool_size=2, stride=2):
    """
    Manually perform average pooling operation.
    
    Character: Meera blends photos for smooth composition!
    """
    h, w = feature_map.shape
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    
    output = np.zeros((out_h, out_w))
    
    for i in range(out_h):
        for j in range(out_w):
            region = feature_map[
                i*stride:i*stride+pool_size,
                j*stride:j*stride+pool_size
            ]
            # Take average value
            output[i, j] = np.mean(region)
            print(f"Region [{i},{j}]: {region.flatten()} → Avg: {output[i,j]:.2f}")
    
    return output

# Apply average pooling
print("Applying Average Pooling (2×2, stride=2):\n")
avg_pooled = manual_avg_pool(feature_map)

print("\nAverage Pooled Output (2×2):")
print(avg_pooled)

In [None]:
# Compare Max vs Average Pooling
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Original
im1 = axes[0].imshow(feature_map, cmap='viridis', interpolation='nearest')
axes[0].set_title('Original (4×4)', fontsize=14, fontweight='bold')
for i in range(4):
    for j in range(4):
        axes[0].text(j, i, f'{feature_map[i,j]:.0f}', 
                    ha='center', va='center', color='white', fontsize=11, fontweight='bold')
plt.colorbar(im1, ax=axes[0])

# Max Pooled
im2 = axes[1].imshow(max_pooled, cmap='viridis', interpolation='nearest')
axes[1].set_title('Max Pooling (2×2)\n"Keep Sharpest"', fontsize=14, fontweight='bold')
for i in range(2):
    for j in range(2):
        axes[1].text(j, i, f'{max_pooled[i,j]:.0f}', 
                    ha='center', va='center', color='white', fontsize=11, fontweight='bold')
plt.colorbar(im2, ax=axes[1])

# Average Pooled
im3 = axes[2].imshow(avg_pooled, cmap='viridis', interpolation='nearest')
axes[2].set_title('Average Pooling (2×2)\n"Blend Photos"', fontsize=14, fontweight='bold')
for i in range(2):
    for j in range(2):
        axes[2].text(j, i, f'{avg_pooled[i,j]:.1f}', 
                    ha='center', va='center', color='white', fontsize=11, fontweight='bold')
plt.colorbar(im3, ax=axes[2])

plt.tight_layout()
plt.show()

print("\n📊 Comparison:")
print(f"Max Pooling:     {max_pooled.flatten()}")
print(f"Average Pooling: {avg_pooled.flatten()}")
print("\n💡 Max pooling keeps sharp features, Average pooling smooths them")
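
To see why max pooling is described as keeping sharp features while average pooling smooths them, it can help to compare two pooling windows with the same total activation. The values below are made up purely for illustration.

```python
import numpy as np

# Two hypothetical 2×2 windows (flattened) with the same sum
window_flat = np.array([0.1, 0.1, 0.1, 0.1])   # weak, uniform response
window_spike = np.array([0.0, 0.0, 0.0, 0.4])  # one strong response

for name, window in [("flat", window_flat), ("spike", window_spike)]:
    print(f"{name:5s}  max={window.max():.2f}  mean={window.mean():.2f}")
```

Average pooling cannot tell these two windows apart, whereas max pooling reports the spike at full strength.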

---

## Part 4: Global Average Pooling - The Revolutionary Technique

### The Problem: Parameter Explosion with Flatten + Dense

In [None]:
# Demonstrate parameter explosion problem
print("="*60)
print("PARAMETER EXPLOSION PROBLEM")
print("="*60)

# Assume last conv layer output: 7×7×512
height, width, channels = 7, 7, 512
num_classes = 1000

print(f"\nLast Conv Layer Output: {height}×{width}×{channels}")
print(f"Target: {num_classes} classes (ImageNet)\n")

# Traditional approach (weights only; bias terms are omitted to keep the arithmetic simple)
print("❌ OLD APPROACH (Flatten + Dense):")
print("-" * 60)
flattened_size = height * width * channels
print(f"1. Flatten: {height}×{width}×{channels} = {flattened_size:,} values")

dense_neurons = 4096
params_dense1 = flattened_size * dense_neurons
print(f"2. Dense({dense_neurons}): {flattened_size:,} × {dense_neurons:,} = {params_dense1:,} parameters")

params_dense2 = dense_neurons * dense_neurons
print(f"3. Dense({dense_neurons}): {dense_neurons:,} × {dense_neurons:,} = {params_dense2:,} parameters")

params_output = dense_neurons * num_classes
print(f"4. Dense({num_classes}): {dense_neurons:,} × {num_classes:,} = {params_output:,} parameters")

total_old = params_dense1 + params_dense2 + params_output
print(f"\n📊 Total FC Parameters: {total_old:,} (~{total_old/1e6:.1f} Million!)")
print(f"💾 Memory: ~{total_old * 4 / 1e6:.1f} MB (float32)")

# Modern approach
print("\n" + "="*60)
print("✅ MODERN APPROACH (Global Average Pooling):")
print("-" * 60)
print(f"1. GlobalAveragePooling2D: {height}×{width}×{channels} → {channels} values")
print(f"   (Average each channel: {height}×{width}=49 values → 1 value)")

params_gap_output = channels * num_classes
print(f"2. Dense({num_classes}): {channels} × {num_classes:,} = {params_gap_output:,} parameters")

total_new = params_gap_output
print(f"\n📊 Total Parameters: {total_new:,} (~{total_new/1e6:.2f} Million)")
print(f"💾 Memory: ~{total_new * 4 / 1e6:.1f} MB (float32)")

# Comparison
print("\n" + "="*60)
print("🎯 COMPARISON:")
print("-" * 60)
reduction_factor = total_old / total_new
print(f"Parameter Reduction: {reduction_factor:.0f}× fewer parameters!")
print(f"Old approach: {total_old:,} params")
print(f"New approach: {total_new:,} params")
print(f"Saved: {total_old - total_new:,} parameters")
print("\n💡 Global Average Pooling = far fewer parameters, so less room to overfit!")
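
The counts in the cell above multiply input size by output size, which covers weights but not bias terms. As a rough check that biases do not change the picture, the sketch below repeats the calculation with one bias per output unit; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_in, n_out, use_bias=True):
    """Parameters of a fully connected layer: weights plus optional biases."""
    return n_in * n_out + (n_out if use_bias else 0)

flattened = 7 * 7 * 512  # 25,088 values after Flatten

# Flatten -> Dense(4096) -> Dense(4096) -> Dense(1000)
fc_head = (dense_params(flattened, 4096)
           + dense_params(4096, 4096)
           + dense_params(4096, 1000))

# GlobalAveragePooling2D -> Dense(1000)
gap_head = dense_params(512, 1000)

print(f"Flatten + Dense head: {fc_head:,}")
print(f"GAP head:             {gap_head:,}")
print(f"Ratio:                {fc_head / gap_head:.0f}×")
```

Including biases adds only a few thousand parameters to either head, so the ratio stays at roughly 241×.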

### How Global Average Pooling Works

In [None]:
# Demonstrate Global Average Pooling
# Create sample 7×7×3 feature maps (3 channels for visualization)
sample_feature_maps = np.random.rand(1, 7, 7, 3)

print("Input Feature Maps: Shape", sample_feature_maps.shape)
print("(batch=1, height=7, width=7, channels=3)\n")

# Manual global average pooling
gap_output_manual = np.mean(sample_feature_maps, axis=(1, 2))
print("Global Average Pooling Process:")
print("-" * 60)
for ch in range(3):
    channel_data = sample_feature_maps[0, :, :, ch]
    avg_value = np.mean(channel_data)
    print(f"Channel {ch}: {channel_data.shape} → Average = {avg_value:.4f}")

print(f"\nOutput: {gap_output_manual.shape} = {gap_output_manual.squeeze()}")
print("(One value per channel!)")

In [None]:
# Visualize Global Average Pooling
fig, axes = plt.subplots(1, 4, figsize=(16, 4))

# Show 3 channels
for ch in range(3):
    im = axes[ch].imshow(sample_feature_maps[0, :, :, ch], cmap='viridis')
    axes[ch].set_title(f'Channel {ch}\n(7×7 values)', fontsize=12, fontweight='bold')
    axes[ch].axis('off')
    plt.colorbar(im, ax=axes[ch], fraction=0.046)

# Show GAP result
gap_values = gap_output_manual.squeeze()
axes[3].bar(range(3), gap_values, color=['red', 'green', 'blue'], alpha=0.7)
axes[3].set_title('After Global Avg Pooling\n(3 values)', fontsize=12, fontweight='bold')
axes[3].set_xlabel('Channel')
axes[3].set_ylabel('Average Value')
axes[3].set_xticks(range(3))
axes[3].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print("\n📊 Each 7×7 feature map → Single average value")
print("💡 For 512 channels: 7×7×512 = 25,088 values → 512 values")
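
Because the average is taken over the height and width axes, the length of the output depends only on the number of channels. The NumPy sketch below illustrates this with a few arbitrarily chosen spatial sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
shapes = []

# Spatial sizes picked only for illustration
for h, w in [(7, 7), (10, 10), (5, 9)]:
    fmap = rng.random((1, h, w, 512))   # (batch, height, width, channels)
    pooled = fmap.mean(axis=(1, 2))     # average over height and width
    shapes.append(pooled.shape)
    print(f"{fmap.shape} -> {pooled.shape}")
```

This is one reason a Dense layer placed after global average pooling always sees the same number of inputs, whatever the feature-map size.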

### Global Average Pooling in Keras

In [None]:
# Using Keras GlobalAveragePooling2D
gap_layer = GlobalAveragePooling2D()
gap_output_keras = gap_layer(sample_feature_maps)

print("Keras GlobalAveragePooling2D:")
print(f"Input shape:  {sample_feature_maps.shape}")
print(f"Output shape: {gap_output_keras.shape}")
print(f"Output values: {gap_output_keras.numpy().squeeze()}")

print("\n✅ Matches manual implementation:",
      np.allclose(gap_output_manual, gap_output_keras.numpy()))

---

## Part 5: Architecture Comparison - Old vs Modern

In [None]:
# Build OLD architecture (Flatten + Dense)
print("="*60)
print("OLD ARCHITECTURE (VGG-style)")
print("="*60)

old_model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Flatten(),                      # ❌ Flatten
    Dense(128, activation='relu'),  # ❌ Large Dense
    Dense(10, activation='softmax')
], name='Old_Architecture')

old_model.summary()

# Count parameters
total_params_old = old_model.count_params()
print(f"\n📊 Total Parameters: {total_params_old:,}")

In [None]:
# Build MODERN architecture (Global Average Pooling)
print("="*60)
print("MODERN ARCHITECTURE (ResNet-style head)")
print("="*60)

modern_model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Conv2D(128, (3,3), activation='relu'),  # Extra conv layer
    GlobalAveragePooling2D(),                # ✅ Global Avg Pool
    Dense(10, activation='softmax')          # ✅ Small Dense
], name='Modern_Architecture')

modern_model.summary()

# Count parameters
total_params_modern = modern_model.count_params()
print(f"\n📊 Total Parameters: {total_params_modern:,}")

In [None]:
# Compare architectures
import pandas as pd

print("="*60)
print("🎯 ARCHITECTURE COMPARISON")
print("="*60)

comparison_data = {
    'Architecture': ['Old (Flatten+Dense)', 'Modern (Global Avg Pool)'],
    'Total Parameters': [total_params_old, total_params_modern],
    # Qualitative expectation based on parameter count, not measured here
    'Overfitting Risk': ['Higher', 'Lower'],
    # 4 bytes per float32 parameter
    'Memory Usage': [f'{total_params_old*4/1e6:.2f} MB', f'{total_params_modern*4/1e6:.2f} MB']
}

df = pd.DataFrame(comparison_data)
print(df.to_string(index=False))

reduction = (1 - total_params_modern / total_params_old) * 100
print(f"\n💡 Parameter Reduction: {reduction:.1f}%")
print(f"✅ Modern architecture has {total_params_old/total_params_modern:.1f}× fewer parameters!")
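
The totals reported by `count_params()` can be reproduced by hand, which also shows where the parameters sit. The sketch below traces both models layer by layer; the helper names are hypothetical, and it assumes the Keras defaults used above (no padding, stride 1 for convolutions, one bias per filter or unit).

```python
def conv_params(kernel, c_in, c_out):
    """Conv2D parameters: kernel weights plus one bias per filter."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Dense parameters: weights plus one bias per unit."""
    return n_in * n_out + n_out

def valid_out(n, window, stride=1):
    """Output size of a conv or pooling layer without padding."""
    return (n - window) // stride + 1

# Shared layers: 28 -> conv 3×3 -> pool 2×2 -> conv 3×3 -> pool 2×2
n = 28
n = valid_out(n, 3)      # 26
n = valid_out(n, 2, 2)   # 13
n = valid_out(n, 3)      # 11
n = valid_out(n, 2, 2)   # 5
stem = conv_params(3, 1, 32) + conv_params(3, 32, 64)

# Old: Flatten (5×5×64) -> Dense(128) -> Dense(10)
old_total = stem + dense_params(n * n * 64, 128) + dense_params(128, 10)

# Modern: Conv2D(128) -> GlobalAveragePooling2D -> Dense(10)
modern_total = stem + conv_params(3, 64, 128) + dense_params(128, 10)

print(f"Old:    {old_total:,}")
print(f"Modern: {modern_total:,}")
```

Most of the old model's parameters belong to the single `Dense(128)` layer that follows `Flatten`, which is the layer global average pooling makes unnecessary.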

---

## Part 6: Practice Exercises

### Exercise 1: Calculate Output Dimensions

In [None]:
# TODO: Calculate output dimensions after pooling
# Formula (no padding): output_size = floor((input_size - pool_size) / stride) + 1

def calculate_pool_output_size(input_size, pool_size, stride):
    """
    Calculate output dimension after pooling.
    
    Args:
        input_size: Input dimension (height or width)
        pool_size: Pooling window size
        stride: Stride of pooling operation
    
    Returns:
        Output dimension
    """
    # YOUR CODE HERE
    output_size = (input_size - pool_size) // stride + 1
    return output_size

# Test cases
print("Exercise 1: Calculate Pooling Output Dimensions\n")
test_cases = [
    (28, 2, 2),  # Input: 28×28, Pool: 2×2, Stride: 2
    (32, 2, 2),  # Input: 32×32, Pool: 2×2, Stride: 2
    (56, 3, 2),  # Input: 56×56, Pool: 3×3, Stride: 2
    (224, 7, 7), # Input: 224×224, Pool: 7×7, Stride: 7
]

for input_size, pool_size, stride in test_cases:
    output = calculate_pool_output_size(input_size, pool_size, stride)
    print(f"Input: {input_size}×{input_size}, Pool: {pool_size}×{pool_size}, Stride: {stride} → Output: {output}×{output}")
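
The formula in this exercise covers pooling without padding, which corresponds to `padding="valid"` in Keras. With `padding="same"` the input is padded so that the output size becomes `ceil(input_size / stride)`. The sketch below handles both cases; `pool_output_size` is a hypothetical helper, not part of Keras.

```python
import math

def pool_output_size(input_size, pool_size, stride, padding="valid"):
    """Output dimension of a pooling layer for 'valid' or 'same' padding."""
    if padding == "valid":
        # No padding: windows that do not fit entirely are dropped
        return (input_size - pool_size) // stride + 1
    if padding == "same":
        # Input is padded so every position is covered by some window
        return math.ceil(input_size / stride)
    raise ValueError(f"Unknown padding: {padding!r}")

for size in (28, 7):
    v = pool_output_size(size, 2, 2, "valid")
    s = pool_output_size(size, 2, 2, "same")
    print(f"Input {size}: valid -> {v}, same -> {s}")
```

The two modes agree when the input divides evenly and differ only for sizes such as 7, where `"valid"` discards the leftover row and column.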

### Exercise 2: Build Your Own CNN with Different Pooling Strategies

In [None]:
# TODO: Build three CNN variants with different pooling strategies

# Variant 1: Max Pooling only
model_max = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2,2)),  # Use Max Pooling
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),  # Use Max Pooling
    Flatten(),
    Dense(10, activation='softmax')
], name='Max_Pooling_Model')

# Variant 2: Average Pooling only
# YOUR CODE HERE
model_avg = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32, 32, 3)),
    AveragePooling2D((2,2)),  # Use Average Pooling
    Conv2D(64, (3,3), activation='relu'),
    AveragePooling2D((2,2)),  # Use Average Pooling
    Flatten(),
    Dense(10, activation='softmax')
], name='Avg_Pooling_Model')

# Variant 3: Global Average Pooling
# YOUR CODE HERE
model_gap = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Conv2D(128, (3,3), activation='relu'),
    GlobalAveragePooling2D(),  # Use Global Average Pooling
    Dense(10, activation='softmax')
], name='GAP_Model')

# Compare parameter counts
print("Model Comparison:\n")
print(f"Max Pooling Model:     {model_max.count_params():,} parameters")
print(f"Average Pooling Model: {model_avg.count_params():,} parameters")
print(f"Global Avg Pool Model: {model_gap.count_params():,} parameters")

print("\n💡 Which model has the fewest parameters? Why?")
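
One way to check your answer is to work out the counts by hand. The arithmetic below is a sketch that assumes the Keras defaults used in the three models (no padding, biases enabled); the variable names are illustrative.

```python
# Spatial size: 32 -> conv 3×3 -> 30 -> pool -> 15 -> conv 3×3 -> 13 -> pool -> 6

# Layers shared by all three variants
conv1 = 3 * 3 * 3 * 32 + 32      # Conv2D(32) on 3 input channels
conv2 = 3 * 3 * 32 * 64 + 64     # Conv2D(64) on 32 input channels
stem = conv1 + conv2

# Variants 1 and 2: Flatten (6×6×64) -> Dense(10)
flatten_head = 6 * 6 * 64 * 10 + 10

# Variant 3: extra Conv2D(128) -> GlobalAveragePooling2D -> Dense(10)
extra_conv = 3 * 3 * 64 * 128 + 128
gap_head = 128 * 10 + 10

max_or_avg_total = stem + flatten_head
gap_total = stem + extra_conv + gap_head

print(f"Max / Average pooling models: {max_or_avg_total:,}")
print(f"GAP model:                    {gap_total:,}")
print(f"Classifier head only:         {flatten_head:,} vs {gap_head:,}")
```

The max and average pooling models have identical counts because pooling layers have no parameters. The GAP model's classifier head is much smaller, yet its total is larger here because of the extra `Conv2D(128)` layer.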

---

## Summary

### Key Takeaways

1. **Max Pooling:**
   - Keeps strongest features (maximum value)
   - Most common in CNNs
   - Adds some invariance to small translations
   - 0 learnable parameters

2. **Average Pooling:**
   - Smooths features (average value)
   - Uses every value in the window, so it is less selective than max pooling
   - Less common in modern CNNs
   - 0 learnable parameters

3. **Global Average Pooling:**
   - Replaces Flatten + hidden Dense layers (the final classification Dense layer stays)
   - Large parameter reduction in the classifier head (over 200× in the 7×7×512 example above)
   - Often better generalization, since there are fewer parameters to overfit
   - Used in modern architectures (ResNet, MobileNet)
   - 0 learnable parameters

4. **When to Use:**
   - **Max Pooling:** The usual default between convolutional layers
   - **Average Pooling:** Less often (when smoother downsampling is preferred)
   - **Global Average Pooling:** Modern CNNs (replace Flatten + hidden Dense layers)
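
The translation-invariance point for max pooling can be checked on a toy example. The sketch below moves a single activation by one and by two pixels and compares the pooled outputs; `max_pool_2x2` is a hypothetical helper that assumes even dimensions.

```python
import numpy as np

def max_pool_2x2(x):
    """2×2 max pooling with stride 2 (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

base = np.zeros((4, 4))
base[0, 0] = 1.0                          # single activation at (0, 0)

shift_within = np.roll(base, 1, axis=1)   # moves to (0, 1): same pooling window
shift_across = np.roll(base, 2, axis=1)   # moves to (0, 2): next pooling window

same_after_small_shift = np.array_equal(max_pool_2x2(base),
                                        max_pool_2x2(shift_within))
same_after_large_shift = np.array_equal(max_pool_2x2(base),
                                        max_pool_2x2(shift_across))

print("Unchanged after 1-pixel shift:", same_after_small_shift)
print("Unchanged after 2-pixel shift:", same_after_large_shift)
```

The output is unchanged only while the activation stays inside the same pooling window, so the invariance is local rather than absolute.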

### Character: Meera's Summary

**Character: Meera** says:
- "For most photos (features), I pick the sharpest one (Max Pooling)"
- "For my portfolio summary, I take one representative photo of my entire work (Global Average Pooling)"
- "This way, clients get the highlights without being overwhelmed!"

### Next Steps

In the next notebook, we'll explore **Batch Normalization** - the technique that revolutionized deep learning training!

---

**End of Notebook 1**