# Day 17 — "Backpropagation Through Convolutions (Computational Graph + Intuition)"

A convolution kernel is a stamp. The forward pass slides the stamp across the input; the backward pass gathers feedback from every imprint to reshape the stamp.

## 1. Core Intuition

- Forward: a sliding window takes a kernel-weighted sum of each local patch.
- Backward: each output pixel sends gradient heat back to the input patch it read and to the shared kernel weights.
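
The two bullets above can be sketched as a pair of loops. This is a minimal illustration, assuming a single-channel input, stride 1, no padding, and cross-correlation (no kernel flip) in the forward pass. The names `naive_conv2d_forward` and `naive_conv2d_backward` are hypothetical and are not the functions in `days/day17/code/conv_backprop.py`.

```python
import numpy as np


def naive_conv2d_forward(x, w):
    """Valid cross-correlation, stride 1: each output sums one weighted patch."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return y


def naive_conv2d_backward(x, w, dy):
    """Scatter view: each output pixel pushes its gradient to its patch and to w."""
    kh, kw = w.shape
    dw = np.zeros_like(w, dtype=float)
    dx = np.zeros_like(x, dtype=float)
    for i in range(dy.shape[0]):
        for j in range(dy.shape[1]):
            dw += dy[i, j] * x[i:i + kh, j:j + kw]  # shared weights accumulate
            dx[i:i + kh, j:j + kw] += dy[i, j] * w  # the patch receives dy * w
    return dw, dx


x = np.arange(25, dtype=float).reshape(5, 5)
w = np.ones((3, 3))
y = naive_conv2d_forward(x, w)
dw, dx = naive_conv2d_backward(x, w, np.ones_like(y))
print(y.shape, dw.shape, dx.shape)
print(dx)  # with w and dy all ones, dx counts the windows covering each pixel
```

With an all-ones kernel and an all-ones upstream gradient, `dx` is 1 in the corners and 9 in the center, which makes the overlap of neighboring windows visible.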

## 2. Mathematical Breakdown

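For the forward pass `y[i,j] = ∑_{u,v} W[u,v] · x[i+u, j+v]` (valid cross-correlation, stride 1) and an upstream gradient `dY`:

- `dW[u,v] = ∑_{i,j} dY[i,j] · x[i+u, j+v]`
- `dX = dY ⋆ rot180(W)`, where `⋆` is a full cross-correlation (`dY` zero-padded by kernel size minus one on every side). This is the same as a full convolution of `dY` with the unflipped `W`.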
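
The formulas above can be checked against central finite differences. The sketch below assumes the loss `L = sum(dY * y)`, so that `dY` is exactly the upstream gradient, and it writes the input gradient in index form, `dX[a,b] = ∑ dY[a-u, b-v] · W[u,v]`, which is what the full correlation with `rot180(W)` expands to. All function names are hypothetical helpers for this check only.

```python
import numpy as np


def forward(x, w):
    """Valid cross-correlation, stride 1."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    return np.array([[np.sum(x[i:i + kh, j:j + kw] * w) for j in range(ow)]
                     for i in range(oh)])


def grad_w(x, dy, k_shape):
    """dW[u,v] = sum over (i,j) of dY[i,j] * x[i+u, j+v]."""
    oh, ow = dy.shape
    dw = np.zeros(k_shape)
    for u in range(k_shape[0]):
        for v in range(k_shape[1]):
            dw[u, v] = np.sum(dy * x[u:u + oh, v:v + ow])
    return dw


def grad_x(w, dy, x_shape):
    """dX[a,b] = sum over (u,v) of dY[a-u, b-v] * W[u,v], skipping out-of-range terms."""
    kh, kw = w.shape
    oh, ow = dy.shape
    dx = np.zeros(x_shape)
    for a in range(x_shape[0]):
        for b in range(x_shape[1]):
            for u in range(kh):
                for v in range(kw):
                    i, j = a - u, b - v
                    if 0 <= i < oh and 0 <= j < ow:
                        dx[a, b] += dy[i, j] * w[u, v]
    return dx


def numerical_grad(loss_fn, arr, eps=1e-6):
    """Central differences; perturbs arr in place and restores each entry."""
    g = np.zeros_like(arr)
    for idx in np.ndindex(arr.shape):
        old = arr[idx]
        arr[idx] = old + eps
        plus = loss_fn()
        arr[idx] = old - eps
        minus = loss_fn()
        arr[idx] = old
        g[idx] = (plus - minus) / (2 * eps)
    return g


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
w = rng.standard_normal((3, 3))
dy = rng.standard_normal((3, 3))


def loss():
    return np.sum(dy * forward(x, w))


dw_err = np.max(np.abs(grad_w(x, dy, w.shape) - numerical_grad(loss, w)))
dx_err = np.max(np.abs(grad_x(w, dy, x.shape) - numerical_grad(loss, x)))
print("max |dW error|:", dw_err, " max |dX error|:", dx_err)
```

Because the loss is linear in `x` and in `w` separately, the finite-difference estimate should agree with the analytic gradients up to floating-point rounding.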

## 3. Python — Naive Conv Forward/Backward

`days/day17/code/conv_backprop.py` implements the functions below.

```python
from __future__ import annotations

import sys
from pathlib import Path

import numpy as np


def find_repo_root(marker: str = "days") -> Path:
    """Walk upward from the working directory until a `days/` folder is found."""
    path = Path.cwd()
    while path != path.parent:
        if (path / marker).exists():
            return path
        path = path.parent
    raise RuntimeError("Run this notebook from inside the repository tree.")


# Make the repository importable so that `days.day17...` resolves.
REPO_ROOT = find_repo_root()
if str(REPO_ROOT) not in sys.path:
    sys.path.append(str(REPO_ROOT))

from days.day17.code.conv_backprop import conv2d_forward, conv2d_backward

x = np.random.randn(5, 5)  # single-channel input
w = np.random.randn(3, 3)  # kernel
y = conv2d_forward(x, w)
# Upstream gradient of ones, i.e. the loss is the sum of all outputs.
dW, dX = conv2d_backward(x, w, np.ones_like(y))
print('Forward output shape:', y.shape)
print('dW shape:', dW.shape, 'dX shape:', dX.shape)
```

```
Forward output shape: (3, 3)
dW shape: (3, 3) dX shape: (5, 5)
```

## 4. Visualization — Gradient Accumulation

`days/day17/code/visualizations.py` animates forward windows and weight-gradient buildup.

```python
from days.day17.code.visualizations import anim_conv_backprop

RUN_ANIMATIONS = False

if RUN_ANIMATIONS:
    gif_path = anim_conv_backprop()
    print('Saved animation →', gif_path)
else:
    print('Set RUN_ANIMATIONS = True to regenerate Day 17 figures in days/day17/outputs/.')
```

```
Set RUN_ANIMATIONS = True to regenerate Day 17 figures in days/day17/outputs/.
```

## 5. Gradient Flow Insight

- Weight sharing ⇒ gradients sum across positions.
- Input gradients use flipped kernels.
- Both gradients are themselves sliding-window correlations, so efficient GPU implementations can reuse convolution logic for the backward pass.
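
The three points above can be seen in a few lines: one correlation routine serves for the forward pass and for both gradients. This sketch assumes single-channel input, stride 1, and no padding. `correlate_valid` is a hypothetical helper built on NumPy's `sliding_window_view`; it shows the mathematical identity only and says nothing about how GPU libraries are actually written.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(a, k):
    """Valid cross-correlation of a with k, using a view of every k-shaped window."""
    windows = sliding_window_view(a, k.shape)  # shape (out_h, out_w, kh, kw)
    return np.einsum("ijuv,uv->ij", windows, k)


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
w = rng.standard_normal((3, 3))

y = correlate_valid(x, w)             # forward pass
dy = rng.standard_normal(y.shape)     # stand-in upstream gradient

# Weight gradient: correlate the input with dY (sums over every position).
dw = correlate_valid(x, dy)

# Input gradient: zero-pad dY, then correlate with the 180-degree-rotated kernel.
pad_h, pad_w = w.shape[0] - 1, w.shape[1] - 1
dy_padded = np.pad(dy, ((pad_h, pad_h), (pad_w, pad_w)))
dx = correlate_valid(dy_padded, np.rot90(w, 2))

# Reference: explicit accumulation over output positions.
dw_ref = np.zeros_like(w)
dx_ref = np.zeros_like(x)
kh, kw = w.shape
for i in range(dy.shape[0]):
    for j in range(dy.shape[1]):
        dw_ref += dy[i, j] * x[i:i + kh, j:j + kw]
        dx_ref[i:i + kh, j:j + kw] += dy[i, j] * w

print(np.allclose(dw, dw_ref), np.allclose(dx, dx_ref))
```

Both comparisons should print `True`: the accumulation over positions and the flipped-kernel correlation are two ways of writing the same sums.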

## 8. Mini Exercises

1. Modify stride/dilation and recompute gradients.
2. Extend to multi-channel convs.
3. Compare naive gradients with PyTorch autograd.
4. Visualize weight gradient heatmaps.
5. Implement backprop for dilated conv.
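
As a possible starting point for exercise 1, here is a stride-aware variant of the naive loops, with a finite-difference spot check on one weight. It is a sketch under the same assumptions as the rest of the day (single channel, no padding); `strided_forward` and `strided_backward` are hypothetical names, and dilation is left to the exercise.

```python
import numpy as np


def strided_forward(x, w, stride=1):
    """Valid cross-correlation where the window moves `stride` pixels per step."""
    kh, kw = w.shape
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            y[i, j] = np.sum(x[r:r + kh, c:c + kw] * w)
    return y


def strided_backward(x, w, dy, stride=1):
    """Same scatter pattern as stride 1; only the patch origin changes."""
    kh, kw = w.shape
    dw = np.zeros_like(w, dtype=float)
    dx = np.zeros_like(x, dtype=float)
    for i in range(dy.shape[0]):
        for j in range(dy.shape[1]):
            r, c = i * stride, j * stride
            dw += dy[i, j] * x[r:r + kh, c:c + kw]
            dx[r:r + kh, c:c + kw] += dy[i, j] * w
    return dw, dx


rng = np.random.default_rng(0)
x = rng.standard_normal((7, 7))
w = rng.standard_normal((3, 3))
y = strided_forward(x, w, stride=2)
dy = rng.standard_normal(y.shape)
dw, dx = strided_backward(x, w, dy, stride=2)

# Spot-check dW[0, 0] with a central difference on L = sum(dy * y).
eps = 1e-6
w_plus, w_minus = w.copy(), w.copy()
w_plus[0, 0] += eps
w_minus[0, 0] -= eps
numeric = np.sum(
    dy * (strided_forward(x, w_plus, 2) - strided_forward(x, w_minus, 2))
) / (2 * eps)
print(y.shape, abs(numeric - dw[0, 0]))
```

The same spot check can be pointed at any entry of `x` or `w` when trying other strides or adding dilation.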

## 9. Key Takeaways

| Point | Meaning |
| --- | --- |
| Backprop slides windows in reverse | Each output contributes gradient to the weights and to its input patch. |
| Weight gradients accumulate | The kernel is shared, so contributions from every position sum. |
| Input gradients = full correlation with the flipped kernel | The chain rule applied to the sliding window. |
| Understanding conv backprop | Essential for designing CNN variants. |

> Convolution backprop is a choreography of gradients—each location reshapes the shared kernel.