# Day 13 — "Pooling, Downsampling & Hierarchical Feature Extraction"

Pooling lets CNNs zoom out: compressing details while preserving meaning so deeper layers capture shapes, parts, and objects.

## 1. Core Intuition

- Pooling mimics stepping back from an image: details → shapes → scenes.
- Downsampling builds hierarchical, multi-scale features (pixels → edges → textures → parts → objects).

## 2. Why Pooling is Needed

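- Convolution alone is translation-equivariant, not invariant: shifting the input shifts the feature map by the same amount.
- Pooling selects salient features, compresses feature maps, enlarges the receptive fields of later layers, and reduces compute.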
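To make the compute saving concrete, here is a rough sketch that counts multiply-accumulate operations (MACs) for a 3×3 convolution applied before and after a 2×2, stride-2 pooling step. The layer sizes and the helper name `conv_macs` are illustrative assumptions, not part of the course code.

```python
def conv_macs(height, width, c_in, c_out, k):
    """Approximate MACs for a k x k convolution (stride 1, 'same' padding)."""
    return height * width * c_in * c_out * k * k


# Hypothetical layer: 64 -> 64 channels, 3x3 kernel, 32x32 feature map.
macs_before = conv_macs(32, 32, 64, 64, 3)

# A 2x2 pooling step with stride 2 halves both height and width.
macs_after = conv_macs(16, 16, 64, 64, 3)

print(macs_before, macs_after, macs_before // macs_after)
```

Halving each spatial dimension cuts the cost of the following convolution by a factor of four under these assumptions.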

## 3. Max Pooling

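`maxpool(x) = max(x_1, ..., x_n)` over the `n` activations in a window. It keeps the strongest activation, which gives invariance to small shifts (those that keep the peak inside the same window) and sparsity, since only the selected activation receives gradient.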
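The shift-invariance claim holds only locally, which a small experiment can show. This is a minimal sketch assuming non-overlapping 2×2 windows; `max_pool_2x2` is a hypothetical helper written for this example, not the repository's `pool2d`.

```python
import numpy as np


def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling; height and width must be even."""
    h, w = x.shape
    # Split each axis into (block index, offset within block), then reduce the offsets.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


fmap = np.zeros((4, 4))
fmap[0, 0] = 1.0                          # single activation at (0, 0)

shift_within = np.roll(fmap, 1, axis=1)   # activation moves to (0, 1), same window
shift_across = np.roll(fmap, 2, axis=1)   # activation moves to (0, 2), next window

same_within = np.array_equal(max_pool_2x2(fmap), max_pool_2x2(shift_within))
same_across = np.array_equal(max_pool_2x2(fmap), max_pool_2x2(shift_across))
print(same_within, same_across)
```

A one-pixel shift that stays inside the window leaves the pooled map unchanged, while a shift that crosses a window boundary does not.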

## 4. Average Pooling & GAP

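- `avgpool(x) = (1/n)∑x_i` averages the window, keeping smooth context rather than only the peak.
- Global Average Pooling (GAP) averages each channel's whole feature map down to a single value. It can replace a flattened fully connected (FC) head, which cuts parameters and often reduces overfitting.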
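The parameter saving from GAP is easy to check by hand. The sketch below assumes a hypothetical final feature map of 512 channels at 7×7 resolution and a 10-class classifier; the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed layout: (channels, height, width).
features = rng.standard_normal((512, 7, 7))

# GAP: one average per channel, giving a vector of length 512.
gap = features.mean(axis=(1, 2))

num_classes = 10
# Flatten + FC head: one weight per (input value, class) plus one bias per class.
flatten_head_params = 512 * 7 * 7 * num_classes + num_classes
# GAP + FC head: the FC layer sees only 512 inputs.
gap_head_params = 512 * num_classes + num_classes

print(gap.shape, flatten_head_params, gap_head_params)
```

Under these assumptions the GAP head has roughly 49 times fewer parameters than the flattened head.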

## 5. Strided Convolution

A convolution with stride greater than 1 downsamples with learned weights, so the model decides which information to retain. It is a common replacement for fixed pooling.
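As a minimal sketch of the idea, the function below implements a strided "valid" cross-correlation (no kernel flip, as is conventional in deep learning frameworks). The name `strided_conv2d_sketch` is hypothetical and the code is not the repository's `strided_conv2d`. With the kernel fixed to uniform weights of 1/4 and stride 2, it reproduces 2×2 average pooling, which shows that a strided convolution can represent pooling as a special case.

```python
import numpy as np


def strided_conv2d_sketch(x, kernel, stride=1):
    """Valid cross-correlation of a 2D input with a 2D kernel at a given stride."""
    kh, kw = kernel.shape
    # Output size per axis: floor((n - k) / stride) + 1.
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(x[r:r + kh, c:c + kw] * kernel)
    return out


x = np.arange(16, dtype=float).reshape(4, 4)
uniform = np.full((2, 2), 0.25)

conv_out = strided_conv2d_sketch(x, uniform, stride=2)
avg_pool = x.reshape(2, 2, 2, 2).mean(axis=(1, 3))   # 2x2 average pooling
print(np.allclose(conv_out, avg_pool))
```

In a trained network the kernel weights are learned rather than fixed, so the layer is free to settle on something other than a plain average.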

## 6. Python Demo — Pooling & Strided Convolution

`days/day13/code/pooling.py` implements the helpers imported below.

```python
from __future__ import annotations

import sys
from pathlib import Path

import numpy as np


def find_repo_root(marker: str = "days") -> Path:
    """Walk up from the working directory until a folder containing `marker` is found."""
    path = Path.cwd()
    while path != path.parent:
        if (path / marker).exists():
            return path
        path = path.parent
    raise RuntimeError("Run this notebook from inside the repository tree.")


# Make the `days` package importable from the notebook.
REPO_ROOT = find_repo_root()
if str(REPO_ROOT) not in sys.path:
    sys.path.append(str(REPO_ROOT))

from days.day13.code.pooling import pool2d, global_avg_pool, strided_conv2d

# 4x4 toy image and a 2x2 filter.
img = np.array([[1, 3, 2, 8], [4, 6, 5, 2], [7, 1, 0, 3], [2, 9, 4, 1]])
filt = np.array([[1, -1], [-1, 1]])

print("Original:\n", img)
print("MaxPool 2x2:\n", pool2d(img, mode='max'))
print("AvgPool 2x2:\n", pool2d(img, mode='avg'))
print("Global Avg Pool:", global_avg_pool(img))
print("Strided conv (stride=2):\n", strided_conv2d(img, filt, stride=2))
```


```
Original:
 [[1 3 2 8]
 [4 6 5 2]
 [7 1 0 3]
 [2 9 4 1]]
MaxPool 2x2:
 [[6. 8.]
 [9. 4.]]
AvgPool 2x2:
 [[3.5  4.25]
 [4.75 2.  ]]
Global Avg Pool: 3.625
Strided conv (stride=2):
 [[ 0. -9.]
 [13. -6.]]
```


## 7. Visualization — Pooling Movement

`days/day13/code/visualizations.py` plots pooling results and animates a sliding max-pool window.

```python
from days.day13.code.visualizations import plot_pooling_example, anim_sliding_maxpool

RUN_ANIMATIONS = False

if RUN_ANIMATIONS:
    plot_path = plot_pooling_example()
    gif_path = anim_sliding_maxpool()
    print('Saved assets →', plot_path, gif_path)
else:
    print('Set RUN_ANIMATIONS = True to regenerate Day 13 figures in days/day13/outputs/.')
```


```
Set RUN_ANIMATIONS = True to regenerate Day 13 figures in days/day13/outputs/.
```


## 9. Hierarchical Feature Extraction

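Each downsampling step enlarges the receptive field of the layers after it, so deeper layers respond to textures, parts, and objects rather than individual pixels. This is loosely analogous to how human perception moves from details to whole scenes.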
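The growth of the receptive field can be computed with the standard recurrence: each layer adds `(kernel_size - 1) * jump` input pixels, where `jump` is the product of all earlier strides. The sketch below compares three 3×3 convolutions with and without 2×2 pooling in between; the layer stacks are illustrative assumptions, not an architecture from the course.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of (kernel_size, stride) layers."""
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump   # growth is scaled by the accumulated stride
        jump *= stride
    return rf


conv_only = [(3, 1), (3, 1), (3, 1)]
with_pooling = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]

print(receptive_field(conv_only))      # 7
print(receptive_field(with_pooling))   # 18
```

With two pooling steps, the third convolution sees an 18×18 input region instead of 7×7.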

## 10. Pooling vs Strided Convolution

| Property | Pooling | Strided Conv |
| --- | --- | --- |
| Learnable | No | Yes |
| Invariance | Strong (max) | Moderate |
| Sharpness | High (max) | Depends on learned filters |
| Modern usage | Declining | Increasing |


## 11. Mini Exercises

1. Implement max pooling manually and compare it to `torch.nn.MaxPool2d`.
2. Swap max pooling for average pooling in a CNN and compare accuracy.
3. Replace pooling with stride=2 conv; compare results.
4. Visualize intermediate CNN feature maps (edges → textures → parts).
5. Try GAP vs fully connected head on CIFAR-10.

## 12. Key Takeaways

| Point | Meaning |
| --- | --- |
| Pooling compresses & gains invariance | Less sensitive to small shifts. |
| Max pooling highlights strong features | Edge detectors thrive. |
| Average/GAP summarize context | Smooth, parameter-efficient heads. |
| Strided conv learns what to keep | Flexible alternative to pooling. |
| Hierarchical representations | CNNs zoom out layer by layer. |

> Pooling lets networks zoom out—learning not just pixels, but patterns, parts, and objects.