# Notebook 6: Pooling Layers

**Week 10 - Module 4: CNN Basics**
**DO3 (October 27, 2025) - Saturday**
**Duration:** 15-20 minutes

## Learning Objectives

1. ✅ **Understand** why pooling is needed
2. ✅ **Implement** max pooling and average pooling
3. ✅ **Calculate** pooling output dimensions
4. ✅ **Compare** pooling vs stride

---

In [None]:
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (12, 6)
print("✅ Setup complete!")

## 1. Why Pooling?

**Three Main Reasons:**

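1. **Translation Invariance**: Small shifts in the input leave the pooled output mostly unchanged
2. **Dimension Reduction**: Smaller feature maps reduce the computational load in later layers
3. **Overfitting Prevention**: Smaller feature maps mean fewer parameters in the layers that follow (pooling itself has none)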

### Example:

If a cat's eye moves 1 pixel, we still want to detect it!
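
The claim can be checked numerically. The sketch below uses a hypothetical helper, `max_pool_2x2` (not part of this notebook's later cells), and a toy 4×4 input with a single strong activation standing in for the "eye". It assumes non-overlapping 2×2 windows and even input dimensions. It also shows the limit of the idea: the invariance is only approximate, because a shift that crosses a window boundary does change the output.

```python
import numpy as np

def max_pool_2x2(image):
    """Hypothetical helper: non-overlapping 2x2 max pooling via reshape.
    Assumes both dimensions of `image` are even."""
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A single strong activation (the "eye") at row 0, column 0
original = np.zeros((4, 4))
original[0, 0] = 1.0

# Same activation moved one pixel right; still inside the top-left 2x2 window
shifted_inside = np.zeros((4, 4))
shifted_inside[0, 1] = 1.0

# Moved one more pixel right; now it falls in the next window
shifted_across = np.zeros((4, 4))
shifted_across[0, 2] = 1.0

same_inside = np.array_equal(max_pool_2x2(original), max_pool_2x2(shifted_inside))
same_across = np.array_equal(max_pool_2x2(original), max_pool_2x2(shifted_across))
print("Shift within window keeps output:", same_inside)
print("Shift across window keeps output:", same_across)
```

The first comparison holds and the second does not, which is why pooling is usually described as giving robustness to small shifts rather than exact invariance.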

---

## 2. Max Pooling

**Max Pooling** = Take maximum value in window

**Example (2×2 max pooling):**

```
Input (4×4):          Output (2×2):
[1, 3, 2, 4]
[5, 6, 7, 8]     →    [6, 8]
[9, 2, 3, 4]          [9, 7]
[1, 5, 6, 7]
```

**Calculation:**
- Top-left: max(1,3,5,6) = 6
- Top-right: max(2,4,7,8) = 8
- Bottom-left: max(9,2,1,5) = 9
- Bottom-right: max(3,4,6,7) = 7

---

In [None]:
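def max_pool2d(image, pool_size=2, stride=2):
    """Apply 2D max pooling (no padding) to a single-channel 2D array."""
    h, w = image.shape
    # Output size per dimension: floor((input - pool_size) / stride) + 1
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1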
    output = np.zeros((out_h, out_w))

    for i in range(out_h):
        for j in range(out_w):
            start_i = i * stride
            start_j = j * stride
            window = image[start_i:start_i+pool_size, start_j:start_j+pool_size]
            output[i, j] = np.max(window)

    return output

# Test
test_img = np.array([[1, 3, 2, 4],
                     [5, 6, 7, 8],
                     [9, 2, 3, 4],
                     [1, 5, 6, 7]])

pooled = max_pool2d(test_img, pool_size=2, stride=2)
print("Input (4×4):")
print(test_img)
print("\nMax Pooled (2×2):")
print(pooled)

## 3. Average Pooling

**Average Pooling** = Take average value in window

Same example:
```
Input (4×4):          Output (2×2):
[1, 3, 2, 4]
[5, 6, 7, 8]     →    [3.75, 5.25]
[9, 2, 3, 4]          [4.25, 5.00]
[1, 5, 6, 7]
```

**When to use which?**
- **Max Pooling**: Most common; preserves the strongest activation in each window
- **Average Pooling**: Smoother; often used near the end of a network (e.g., global average pooling)
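
To make the difference concrete, here is a small sketch with two illustrative 2×2 windows (the values are made up for this example): one with a single strong activation and one with four moderate ones.

```python
import numpy as np

# One strong activation surrounded by zeros (illustrative values)
sparse_window = np.array([[0.0, 0.0],
                          [0.0, 8.0]])

# Four moderate, identical activations (illustrative values)
uniform_window = np.array([[2.0, 2.0],
                           [2.0, 2.0]])

for name, window in [("sparse", sparse_window), ("uniform", uniform_window)]:
    print(f"{name}: max = {window.max():.1f}, mean = {window.mean():.1f}")
```

Both windows have the same mean, so average pooling cannot tell them apart, while max pooling keeps the strong activation of the sparse window.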

---

In [None]:
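def avg_pool2d(image, pool_size=2, stride=2):
    """Apply 2D average pooling (no padding) to a single-channel 2D array."""
    h, w = image.shape
    # Output size per dimension: floor((input - pool_size) / stride) + 1
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1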
    output = np.zeros((out_h, out_w))

    for i in range(out_h):
        for j in range(out_w):
            start_i = i * stride
            start_j = j * stride
            window = image[start_i:start_i+pool_size, start_j:start_j+pool_size]
            output[i, j] = np.mean(window)

    return output

avg_pooled = avg_pool2d(test_img, pool_size=2, stride=2)
print("Average Pooled (2×2):")
print(avg_pooled)
print("\nMax vs Average (top-left window):")
print(f"Max: {pooled[0,0]:.2f} | Avg: {avg_pooled[0,0]:.2f}")

## 4. Global Average Pooling (GAP)

**GAP** = Average each channel over its entire spatial extent (H × W)

- **Input**: H × W × C
- **Output**: 1 × 1 × C

**Use case:** Replace the fully connected layers at the end of a network, producing one value per channel
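
A minimal NumPy sketch of GAP follows. The helper name `global_avg_pool` is hypothetical, and the code assumes a channels-last layout (H, W, C) to match the shapes listed above.

```python
import numpy as np

def global_avg_pool(feature_map):
    """Hypothetical helper: average each channel over its H x W positions.
    Assumes channels-last layout (H, W, C)."""
    return feature_map.mean(axis=(0, 1), keepdims=True)  # shape (1, 1, C)

# Toy feature map: 4 x 4 spatial grid, 3 channels with constant values 1, 2, 3
feature_map = np.ones((4, 4, 3)) * np.array([1.0, 2.0, 3.0])

gap = global_avg_pool(feature_map)
print("Input shape: ", feature_map.shape)
print("Output shape:", gap.shape)
print("Per-channel means:", gap.ravel())
```

Because the output depends only on the number of channels, the same code works for any input height and width.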

---

## Summary

### 🎯 Key Points

1. **Max Pooling**: Keeps strongest activations
2. **Average Pooling**: Smooth aggregation
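3. **Output Size**: $\lfloor (H - F) / S \rfloor + 1$, where $H$ is the input size, $F$ the pool size, and $S$ the stride (same as convolution without padding)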
4. **No Parameters**: Pooling has NO learnable weights!
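
As a quick check of the output-size rule, the sketch below computes $\lfloor (H - F) / S \rfloor + 1$ along one dimension, assuming no padding. The helper name `pool_output_size` is hypothetical, and the input sizes other than 4 are illustrative.

```python
def pool_output_size(input_size, pool_size, stride):
    """Hypothetical helper: output length along one dimension, no padding."""
    if pool_size > input_size:
        raise ValueError("pool_size must not exceed input_size")
    return (input_size - pool_size) // stride + 1

size_4 = pool_output_size(4, 2, 2)        # the 4x4 example from this notebook
size_224 = pool_output_size(224, 2, 2)    # illustrative larger input
size_5 = pool_output_size(5, 2, 2)        # odd size: last row/column is dropped
size_overlap = pool_output_size(5, 3, 1)  # overlapping windows (stride < pool size)

print(size_4, size_224, size_5, size_overlap)
```

The integer division is what implements the floor: when the windows do not tile the input exactly, the leftover rows or columns are ignored.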

### 🔮 Next

**Notebook 7:** Complete CNN Architecture (putting it all together)

---

*Week 10 - Deep Neural Network Architectures (21CSE558T)*
*SRM University - M.Tech Program*