# Exercise 00: Understanding Convolutions - From First Principles

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shang-vikas/series1-coding-exercises/blob/main/exercises/blog-04/exercise-00.ipynb)

## Setup

In [None]:
# Install required packages using the kernel's Python interpreter
import sys
import subprocess
import importlib

def install_if_missing(package, import_name=None):
    """Install package if it's not already installed."""
    if import_name is None:
        import_name = package

    try:
        importlib.import_module(import_name)
        print(f"✓ {package} is already installed")
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✓ {package} installed successfully")

# Install required packages
install_if_missing("numpy")
install_if_missing("matplotlib")
install_if_missing("scikit-learn", "sklearn")

## 🧪 Exercise 1: Why Flattening Fails

**Goal:** Show how spatial structure is destroyed.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits

digits = load_digits()
image = digits.images[0]

plt.imshow(image, cmap="gray")
plt.title("Original")
plt.show()

# Flatten
flat = image.flatten()

# Shuffle pixels
np.random.shuffle(flat)
shuffled = flat.reshape(8, 8)

plt.imshow(shuffled, cmap="gray")
plt.title("Shuffled")
plt.show()

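**Ask readers:**

- Are the pixel values the same?
- Is the meaning completely destroyed?
- Why?

This drives locality home: the values are unchanged and only their arrangement differs, so the meaning must live in which pixels sit next to each other.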
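One way to make the point measurable is sketched below. It assumes a small synthetic image (a smooth horizontal ramp, chosen so the block runs without scikit-learn) and a hypothetical helper, `neighbor_roughness`, that averages the absolute difference between horizontally adjacent pixels. Neither is part of the exercise itself. Shuffling keeps the multiset of pixel values intact but makes neighboring pixels unrelated, which shows up as a higher roughness score.

```python
import numpy as np

def neighbor_roughness(img):
    """Mean absolute difference between horizontally adjacent pixels (illustrative helper)."""
    return np.abs(np.diff(img, axis=1)).mean()

# Synthetic 8x8 image: every row ramps smoothly 0, 2, 4, ..., 14 (an assumption for the demo)
ramp = np.tile(np.arange(8) * 2.0, (8, 1))

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
shuffled_ramp = rng.permutation(ramp.flatten()).reshape(8, 8)

# Sorting both images shows they hold exactly the same values
same_values = np.array_equal(np.sort(ramp, axis=None), np.sort(shuffled_ramp, axis=None))

print("Same pixel values:", same_values)
print("Roughness before shuffle:", neighbor_roughness(ramp))
print("Roughness after shuffle: ", neighbor_roughness(shuffled_ramp))
```

A flattened input treats both images as the same bag of 64 numbers; only a model that looks at neighborhoods can tell them apart.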

## 🧪 Exercise 2: Manual Convolution

**Goal:** Implement one 3×3 filter manually.

In [None]:
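def simple_conv(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    h, w = image.shape
    kh, kw = kernel.shape  # read both dimensions so non-square kernels also work
    output = np.zeros((h - kh + 1, w - kw + 1))

    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            # Element-wise product of the current patch and the kernel, then sum
            patch = image[i:i+kh, j:j+kw]
            output[i, j] = np.sum(patch * kernel)

    return output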

**Example edge detector:**

In [None]:
kernel = np.array([
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1]
])

# Apply to real digit image
digits = load_digits()
image = digits.images[0]

# Apply convolution
edges = simple_conv(image, kernel)

# Visualize
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(image, cmap="gray")
axes[0].set_title("Original")
axes[0].axis('off')

axes[1].imshow(edges, cmap="gray")
axes[1].set_title("Edges Detected")
axes[1].axis('off')

plt.tight_layout()
plt.show()

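Let readers see the edges appear. Because the kernel's weights sum to zero, the response is zero in perfectly flat regions and large where intensity changes.

Now they understand convolution mechanically.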
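As a sanity check on the mechanics, the loop version can be compared against a vectorized one. The sketch below is self-contained and uses a synthetic step-edge image rather than the digit. The names `loop_conv` and `windowed_conv` are illustrative, and `sliding_window_view` requires NumPy 1.20 or newer. Like most deep learning libraries, neither version flips the kernel, so strictly speaking this is cross-correlation; for this symmetric kernel the two operations give the same result.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def loop_conv(image, kernel):
    """Valid-mode sliding-window sum of products, written with explicit loops."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def windowed_conv(image, kernel):
    """Same computation, taking every kernel-sized window at once."""
    windows = sliding_window_view(image, kernel.shape)  # shape (h-kh+1, w-kw+1, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)

kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]], dtype=float)

# Synthetic image: dark left half, bright right half (a vertical step edge)
step = np.zeros((8, 8))
step[:, 4:] = 10.0

a = loop_conv(step, kernel)
b = windowed_conv(step, kernel)

print(a.shape)            # (6, 6): an 8x8 input and a 3x3 kernel give 8 - 3 + 1 = 6
print(np.allclose(a, b))  # both implementations agree
print(a[0])               # nonzero only in the two columns that straddle the edge
```

The first output row is zero everywhere except next to the step, where it swings from -30 to 30. That is the kernel responding to change, not to brightness.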

## 🧪 Exercise 3: Weight Sharing Demonstration

Use the same kernel in two image regions.

Show:

- Detector fires in both places.
- No new weights needed.

This connects to parameter efficiency.

In [None]:
# Demonstrate weight sharing
digits = load_digits()
image = digits.images[0]

# Same kernel applied to different regions
kernel = np.array([
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1]
])

# Apply to top-left region
top_left = image[0:3, 0:3]
result1 = np.sum(top_left * kernel)

# Apply to bottom-right region
bottom_right = image[5:8, 5:8]
result2 = np.sum(bottom_right * kernel)

print(f"Top-left region response: {result1}")
print(f"Bottom-right region response: {result2}")
print(f"\nSame kernel, same weights, different locations!")
print(f"Total parameters: {kernel.size} (shared across entire image)")
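On a real digit the two regions hold different content, so the two responses differ. To isolate the weight-sharing idea, the sketch below plants the same pattern at two locations on a synthetic canvas (an assumption for illustration, not part of the dataset) and shows that the one kernel responds identically at both, and stays silent where the pattern is absent.

```python
import numpy as np

kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]], dtype=float)

# Synthetic 8x8 canvas with the same bright spot at two different places
canvas = np.zeros((8, 8))
canvas[1, 1] = 5.0  # center of the top-left 3x3 region
canvas[6, 6] = 5.0  # center of the bottom-right 3x3 region

# The same nine weights are reused at every location
top_left_response = np.sum(canvas[0:3, 0:3] * kernel)
bottom_right_response = np.sum(canvas[5:8, 5:8] * kernel)
empty_response = np.sum(canvas[2:5, 3:6] * kernel)  # region with no pattern

print("Top-left:    ", top_left_response)
print("Bottom-right:", bottom_right_response)
print("Empty region:", empty_response)
```

A fully connected layer would need separate weights to recognize the spot at each position; the kernel needs only its nine.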

## 🧪 Exercise 4: Pooling by Hand

In [None]:
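def max_pool(image, size=2):
    """Non-overlapping max pooling; trailing rows/columns that don't fill a window are dropped."""
    h, w = image.shape
    out_h, out_w = h // size, w // size
    output = np.zeros((out_h, out_w))

    # Iterate over output cells so inputs not divisible by `size` cannot index out of bounds
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*size:(i+1)*size, j*size:(j+1)*size]
            output[i, j] = np.max(patch)

    return output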

In [None]:
# Show before/after pooling
digits = load_digits()
image = digits.images[0]

# Apply max pooling
pooled = max_pool(image, size=2)

# Visualize
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(image, cmap="gray")
axes[0].set_title("Before Pooling (8×8)")
axes[0].axis('off')

axes[1].imshow(pooled, cmap="gray")
axes[1].set_title("After Max Pooling (4×4)")
axes[1].axis('off')

plt.tight_layout()
plt.show()

print(f"Original size: {image.shape}")
print(f"Pooled size: {pooled.shape}")
print(f"Reduction: {image.size / pooled.size:.1f}x")

**Ask:**

- What information disappeared?
- What survived?
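One possible way to answer both questions is sketched below with two hand-made 4×4 images (illustrative values, not taken from the digits dataset). The strongest activation in each 2×2 window survives; its exact position inside the window, and every smaller value, is lost. The helper `max_pool_2x2` is a compact reshape-based variant that assumes even dimensions.

```python
import numpy as np

def max_pool_2x2(img):
    """2x2 max pooling via reshape; assumes both dimensions are even."""
    h, w = img.shape
    # Axes 1 and 3 index the position inside each 2x2 window
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Peaks pushed to the outer corners of each window
a = np.array([[9, 0, 0, 7],
              [0, 0, 0, 0],
              [0, 0, 0, 0],
              [3, 0, 0, 5]])

# Same peaks, moved to the inner corners of each window
b = np.array([[0, 0, 0, 0],
              [0, 9, 7, 0],
              [0, 3, 5, 0],
              [0, 0, 0, 0]])

pooled_a = max_pool_2x2(a)
pooled_b = max_pool_2x2(b)

print(pooled_a)
print(pooled_b)
print("Inputs identical: ", np.array_equal(a, b))
print("Outputs identical:", np.array_equal(pooled_a, pooled_b))
```

Two different inputs collapse to the same output. That is the small-shift tolerance pooling buys, and also the positional detail it throws away.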

## 🧪 Exercise 5: Compare Parameter Count

In [None]:
# Compute parameter count

# Fully connected (8×8 image → 64 inputs → 32 hidden)
image_size = 8 * 8  # 64 inputs
hidden_size = 32
fc_params = image_size * hidden_size

print("Fully Connected Layer:")
print(f"  Input size: {image_size}")
print(f"  Hidden size: {hidden_size}")
print(f"  Parameters: {image_size} × {hidden_size} = {fc_params}")

# Convolution (3×3 kernel, 1 filter)
kernel_size = 3
num_filters = 1
conv_params = kernel_size * kernel_size * num_filters

print("\nConvolutional Layer:")
print(f"  Kernel size: {kernel_size}×{kernel_size}")
print(f"  Number of filters: {num_filters}")
print(f"  Parameters: {kernel_size} × {kernel_size} × {num_filters} = {conv_params}")

print(f"\nParameter ratio: {fc_params / conv_params:.1f}x more parameters in FC layer")
print(f"\nSame edge detector applied everywhere with only {conv_params} parameters!")

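Let readers calculate the scale difference themselves: 2048 weights versus 9, roughly 228 times more (bias terms are left out on both sides).

This hits hard.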
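The gap widens as images grow. The sketch below repeats the count for a few image sizes; the helper names `fc_weights` and `conv_weights` and the chosen sizes (8, 28, 224) are illustrative assumptions, and biases are ignored on both sides, as in the cell above.

```python
def fc_weights(height, width, hidden):
    """Weights in a dense layer from a flattened image to `hidden` units (biases ignored)."""
    return height * width * hidden

def conv_weights(kernel_size, num_filters, in_channels=1):
    """Weights in a conv layer; note that image size does not appear (biases ignored)."""
    return kernel_size * kernel_size * in_channels * num_filters

for side in (8, 28, 224):
    fc = fc_weights(side, side, hidden=32)
    conv = conv_weights(kernel_size=3, num_filters=1)
    print(f"{side}x{side}: FC = {fc:>9,} weights, conv = {conv}, ratio = {fc / conv:,.0f}x")
```

The comparison is deliberately lopsided: the dense layer produces 32 outputs while a single filter produces one feature map. Even with 32 filters, though, the convolutional layer would hold 288 weights regardless of image size.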