# CNN Kernels and Filters: Deep Dive

## Understanding the Building Blocks of Convolutional Neural Networks

### 📚 Learning Objectives
- Understand what kernels/filters are and how they work
- Learn key terminology: stride, padding, dilation
- Visualize convolution operations step-by-step
- Implement basic convolution from scratch
- Explore different types of filters and their effects

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from scipy import ndimage

plt.style.use('default')
np.random.seed(42)

## 1. What are Kernels/Filters?

**Kernels** (also called **filters**) are small matrices that slide across input data to detect specific features.

### Key Concepts:
- **Kernel**: A small matrix (e.g., 3×3, 5×5) containing learnable weights
- **Convolution**: Sliding the kernel over the input and, at each position, summing the element-wise products of the kernel and the patch beneath it
- **Feature Detection**: Each kernel learns to detect specific patterns (edges, textures, shapes)

In [None]:
# Example: Common 3x3 kernels
edge_kernel = np.array([[-1, -1, -1],
                       [-1,  8, -1],
                       [-1, -1, -1]])

blur_kernel = np.array([[1, 1, 1],
                       [1, 1, 1],
                       [1, 1, 1]]) / 9

sharpen_kernel = np.array([[ 0, -1,  0],
                          [-1,  5, -1],
                          [ 0, -1,  0]])

print("Edge Detection Kernel:")
print(edge_kernel)
print("\nBlur Kernel:")
print(blur_kernel)
print("\nSharpen Kernel:")
print(sharpen_kernel)
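
A quick way to build intuition for these three kernels is to look at the sum of their weights. The sketch below applies each kernel to a perfectly flat patch (`flat_patch` is a made-up input for illustration, not part of the cells above). A kernel whose weights sum to 0 should give no response on a flat region, while a kernel whose weights sum to 1 should leave the brightness unchanged.

```python
import numpy as np

edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])
blur_kernel = np.ones((3, 3)) / 9
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]])

# Hypothetical input: a flat 3x3 patch with constant intensity 7
flat_patch = np.full((3, 3), 7.0)

# One convolution step: element-wise product, then sum
responses = {
    "edge": float(np.sum(flat_patch * edge_kernel)),
    "blur": float(np.sum(flat_patch * blur_kernel)),
    "sharpen": float(np.sum(flat_patch * sharpen_kernel)),
}

for name, value in responses.items():
    print(f"{name:8s} response on flat patch: {value:.1f}")
```

The edge kernel returns 0 because there is nothing to detect, while blur and sharpen both return the original intensity; they only change the output where neighbouring pixels differ.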

## 2. How Convolution Works

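Convolution involves sliding a kernel across the input and computing the dot product between the kernel and the underlying patch at each position. Deep learning libraries skip the kernel flip of textbook convolution, so the operation is technically cross-correlation; the implementation in this section follows that convention.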

### Step-by-step Process:
1. Place kernel at top-left of input
2. Compute element-wise multiplication
3. Sum all products to get single output value
4. Slide kernel to next position
5. Repeat until entire input is covered

In [None]:
def simple_convolution(input_matrix, kernel, stride=1):
    """Valid (no padding) 2D cross-correlation, the 'convolution' used in CNNs"""
    input_h, input_w = input_matrix.shape
    kernel_h, kernel_w = kernel.shape
    
    # Output size without padding: floor((input - kernel) / stride) + 1
    output_h = (input_h - kernel_h) // stride + 1
    output_w = (input_w - kernel_w) // stride + 1
    
    output = np.zeros((output_h, output_w))
    
    # (i, j) index the output; the matching input window starts at (i*stride, j*stride)
    for i in range(output_h):
        for j in range(output_w):
            top, left = i * stride, j * stride
            window = input_matrix[top:top+kernel_h, left:left+kernel_w]
            # Element-wise multiply, then sum to a single output value
            output[i, j] = np.sum(window * kernel)
    
    return output

# Example convolution
input_img = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8],
                     [9, 10, 11, 12],
                     [13, 14, 15, 16]])

kernel = np.array([[1, 0],
                  [0, -1]])

result = simple_convolution(input_img, kernel)

print("Input:")
print(input_img)
print("\nKernel:")
print(kernel)
print("\nOutput:")
print(result)
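
To connect the step-by-step process with library behaviour, the sketch below computes the first output value by hand and then compares SciPy's `signal.correlate2d` (kernel applied as written) with `signal.convolve2d` (kernel flipped first). It reuses the same 4×4 input and 2×2 kernel values; the variable names `x`, `k`, `corr` and `conv` are chosen here for illustration.

```python
import numpy as np
from scipy import signal

x = np.arange(1, 17).reshape(4, 4)   # same values as the 4x4 input above
k = np.array([[1, 0],
              [0, -1]])

# Steps 1-3 for the top-left position: multiply element-wise, then sum
top_left = x[0:2, 0:2]
first_value = int(np.sum(top_left * k))   # 1*1 + 2*0 + 5*0 + 6*(-1)

# Cross-correlation: the kernel is applied as written (what CNN layers compute)
corr = signal.correlate2d(x, k, mode="valid")

# Textbook convolution: the kernel is flipped along both axes before sliding
conv = signal.convolve2d(x, k, mode="valid")

print("First output value:", first_value)
print("correlate2d:\n", corr)
print("convolve2d:\n", conv)
```

With this kernel every cross-correlation output is -5 and every true-convolution output is +5, so the flip only changes the sign here. For learned kernels the distinction rarely matters, since the network can simply learn the flipped weights.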

## 3. Key Terminology

### Stride
**Definition**: Number of pixels the kernel moves at each step
- **Stride = 1**: Kernel moves 1 pixel at a time (default)
- **Stride = 2**: Kernel moves 2 pixels, reducing output size
- **Effect**: Larger stride = smaller output, faster computation

In [None]:
# Demonstrate different strides
input_5x5 = np.random.randint(0, 10, (5, 5))
kernel_3x3 = np.array([[1, 0, -1],
                      [1, 0, -1],
                      [1, 0, -1]])

stride_1 = simple_convolution(input_5x5, kernel_3x3, stride=1)
stride_2 = simple_convolution(input_5x5, kernel_3x3, stride=2)

print(f"Input shape: {input_5x5.shape}")
print(f"Stride 1 output shape: {stride_1.shape}")
print(f"Stride 2 output shape: {stride_2.shape}")

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].imshow(input_5x5, cmap='viridis')
axes[0].set_title('Input (5x5)')
axes[1].imshow(stride_1, cmap='viridis')
axes[1].set_title('Stride=1 (3x3)')
axes[2].imshow(stride_2, cmap='viridis')
axes[2].set_title('Stride=2 (2x2)')
plt.tight_layout()
plt.show()
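
Another way to read stride: without padding, a stride-2 convolution produces the same values as a stride-1 convolution with every second row and column kept. The sketch below checks this with a small vectorized helper. `conv2d_valid` is a hypothetical name introduced here, and `sliding_window_view` is assumed to be available (NumPy 1.20 or newer).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(x, k, stride=1):
    """Valid cross-correlation built from a view of every kernel-sized window."""
    windows = sliding_window_view(x, k.shape)   # shape: (H-kh+1, W-kw+1, kh, kw)
    windows = windows[::stride, ::stride]       # keep every stride-th position
    return np.einsum("ijkl,kl->ij", windows, k) # multiply and sum each window

rng = np.random.default_rng(0)
x = rng.integers(0, 10, size=(5, 5))
k = np.array([[1, 0, -1],
              [1, 0, -1],
              [1, 0, -1]])

stride_1 = conv2d_valid(x, k, stride=1)
stride_2 = conv2d_valid(x, k, stride=2)

print(stride_1.shape, stride_2.shape)
print(np.array_equal(stride_2, stride_1[::2, ::2]))
```

This is why a larger stride is cheaper: the skipped positions are never computed, rather than computed and thrown away.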

### Padding
**Definition**: Adding extra pixels around input borders
- **Valid Padding**: No padding (output smaller than input)
- **Same Padding**: Padding chosen so the output has the same size as the input at stride 1 (for an odd k×k kernel, pad (k−1)/2 pixels on each side)
- **Purpose**: Control output size, preserve border information

In [None]:
def add_padding(input_matrix, pad_size, pad_value=0):
    """Add padding around input matrix"""
    return np.pad(input_matrix, pad_size, mode='constant', constant_values=pad_value)

# Demonstrate padding
original = np.array([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]])

padded = add_padding(original, 1)

print("Original (3x3):")
print(original)
print("\nWith padding=1 (5x5):")
print(padded)

# Show effect on convolution output size
kernel_3x3 = np.array([[1, 1, 1],
                      [1, 1, 1],
                      [1, 1, 1]]) / 9

no_pad_result = simple_convolution(original, kernel_3x3)
pad_result = simple_convolution(padded, kernel_3x3)

print(f"\nNo padding output shape: {no_pad_result.shape}")
print(f"With padding output shape: {pad_result.shape}")
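
Stride and padding combine into one output-size rule per spatial dimension: `floor((n + 2p - k) / s) + 1`, where `n` is the input size, `k` the kernel size, `s` the stride and `p` the padding added on each side. A minimal sketch follows; the helper names `conv_output_size` and `same_padding` are made up for this example.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

def same_padding(kernel_size):
    """Per-side padding that preserves the size at stride 1 (odd kernels only)."""
    if kernel_size % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd kernel size")
    return (kernel_size - 1) // 2

print(conv_output_size(5, 3, stride=1))                    # 5x5 input, 3x3 kernel
print(conv_output_size(5, 3, stride=2))                    # same, with stride 2
print(conv_output_size(3, 3, padding=1))                   # padded 3x3 example
print(conv_output_size(28, 5, padding=same_padding(5)))    # 'same' padding, 5x5 kernel
```

The first three calls reproduce the shapes seen in the stride and padding demos (3, 2 and 3), and the last one shows a 5×5 kernel keeping a 28-pixel dimension at 28.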

## 4. Common Filter Types and Their Effects

Different kernels detect different features:

In [None]:
# Create sample image
def create_sample_image():
    img = np.zeros((50, 50))
    # Add rectangle
    img[15:35, 15:35] = 1
    # Add diagonal line
    for i in range(10, 40):
        img[i, i] = 0.5
    return img

sample_img = create_sample_image()

# Define various kernels
kernels = {
    'Horizontal Edge': np.array([[-1, -1, -1],
                                [ 0,  0,  0],
                                [ 1,  1,  1]]),
    
    'Vertical Edge': np.array([[-1, 0, 1],
                              [-1, 0, 1],
                              [-1, 0, 1]]),
    
    'Diagonal Edge': np.array([[ 0, 1, 0],
                              [-1, 0, 1],
                              [ 0,-1, 0]]),
    
    'Gaussian Blur': np.array([[1, 2, 1],
                              [2, 4, 2],
                              [1, 2, 1]]) / 16
}

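# Apply kernels and visualize
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes[0, 0].imshow(sample_img, cmap='gray')
axes[0, 0].set_title('Original Image')

for idx, (name, kernel) in enumerate(kernels.items()):
    row = (idx + 1) // 3
    col = (idx + 1) % 3
    
    # correlate applies the kernel without flipping it, matching simple_convolution
    # and CNN layers; ndimage.convolve would flip the kernel, which inverts the
    # sign of the edge responses
    filtered = ndimage.correlate(sample_img, kernel)
    axes[row, col].imshow(filtered, cmap='gray')
    axes[row, col].set_title(f'{name} Filter')

# Five images in a 2x3 grid: hide the unused last panel
axes[1, 2].axis('off')

plt.tight_layout()
plt.show()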
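
The plots show the effect visually; the numbers make the selectivity explicit. The sketch below uses a tiny hypothetical test image with a single vertical step edge (dark left half, bright right half) and applies the vertical-edge and horizontal-edge kernels with `ndimage.correlate`. The `mode="nearest"` border handling is chosen here so the image borders do not create artificial edges.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 6x6 image: columns 0-2 are dark (0), columns 3-5 are bright (1)
img = np.zeros((6, 6))
img[:, 3:] = 1.0

vertical = np.array([[-1, 0, 1],
                     [-1, 0, 1],
                     [-1, 0, 1]])
horizontal = vertical.T   # same weights as the 'Horizontal Edge' kernel

v_resp = ndimage.correlate(img, vertical, mode="nearest")
h_resp = ndimage.correlate(img, horizontal, mode="nearest")

print("Vertical-edge response, one row:", v_resp[0])
print("Largest horizontal-edge response:", np.abs(h_resp).max())
```

The vertical-edge kernel responds only in the two columns that straddle the step, and the horizontal-edge kernel stays at zero everywhere because intensity never changes from one row to the next.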

## 5. Advanced Concepts

### Dilation
**Definition**: Spacing between kernel elements
- **Dilation = 1**: Standard convolution (default)
- **Dilation > 1**: Dilated/atrous convolution, increases receptive field

In [None]:
def dilated_convolution(input_matrix, kernel, dilation=1):
    """Simple dilated convolution implementation"""
    if dilation == 1:
        return simple_convolution(input_matrix, kernel)
    
    # Create dilated kernel
    k_h, k_w = kernel.shape
    dilated_h = k_h + (k_h - 1) * (dilation - 1)
    dilated_w = k_w + (k_w - 1) * (dilation - 1)
    
    dilated_kernel = np.zeros((dilated_h, dilated_w))
    
    for i in range(k_h):
        for j in range(k_w):
            dilated_kernel[i * dilation, j * dilation] = kernel[i, j]
    
    return simple_convolution(input_matrix, dilated_kernel)

# Demonstrate dilation
input_7x7 = np.random.randint(0, 5, (7, 7))
kernel_3x3 = np.array([[1, 0, -1],
                      [1, 0, -1],
                      [1, 0, -1]])

normal_conv = simple_convolution(input_7x7, kernel_3x3)
dilated_conv = dilated_convolution(input_7x7, kernel_3x3, dilation=2)

print(f"Input shape: {input_7x7.shape}")
print(f"Normal convolution output: {normal_conv.shape}")
print(f"Dilated convolution output: {dilated_conv.shape}")
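
The output shrinks under dilation because the kernel's footprint grows: a k×k kernel with dilation d spans (k − 1) × d + 1 input pixels per side while keeping the same k × k weights. The sketch below builds the dilated kernel with slicing instead of loops; `dilate_kernel` is a hypothetical helper name used only for this illustration.

```python
import numpy as np

def dilate_kernel(kernel, dilation):
    """Insert (dilation - 1) zeros between neighbouring kernel elements."""
    kh, kw = kernel.shape
    out = np.zeros(((kh - 1) * dilation + 1, (kw - 1) * dilation + 1),
                   dtype=kernel.dtype)
    out[::dilation, ::dilation] = kernel   # original weights on a sparse grid
    return out

kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])

for d in (1, 2, 3):
    dk = dilate_kernel(kernel, d)
    print(f"dilation={d}: footprint {dk.shape}, weights {kernel.size}")

print(dilate_kernel(kernel, 2))
```

The footprint grows from 3×3 to 5×5 to 7×7 while the weight count stays at 9, which is how dilation widens the receptive field without adding parameters.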

## 6. Receptive Field

**Definition**: The region in the input that affects a single output pixel

### Calculating Receptive Field:
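- Single layer: equal to the kernel size
- Multiple layers: grows with depth, and grows faster after every strided layer
- Formula (input to output): RF_l = RF_(l-1) + (kernel_size_l - 1) × jump_(l-1), where jump_(l-1) is the product of the strides of all earlier layers and RF_0 = 1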

In [None]:
def calculate_receptive_field(layers):
    """Calculate receptive field for a sequence of conv layers"""
    rf = 1
    stride_product = 1
    
    print("Layer\tKernel\tStride\tReceptive Field")
    print("-" * 40)
    
    for i, (kernel_size, stride) in enumerate(layers):
        rf = rf + (kernel_size - 1) * stride_product
        stride_product *= stride
        print(f"{i+1}\t{kernel_size}\t{stride}\t{rf}")
    
    return rf

# Example CNN architecture
layers = [(3, 1), (3, 1), (3, 2), (3, 1), (3, 2)]
final_rf = calculate_receptive_field(layers)
print(f"\nFinal receptive field: {final_rf}x{final_rf}")
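
The receptive field can also be computed in the opposite direction: start from a single output pixel (size 1) and walk back toward the input, applying `rf = (rf - 1) * stride + kernel_size` at each layer. The sketch below implements both directions as a cross-check; the function names `rf_forward` and `rf_backward` are made up for this example.

```python
def rf_forward(layers):
    """Input to output: each layer adds (kernel - 1) times the accumulated stride."""
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

def rf_backward(layers):
    """Output to input: expand one output pixel back through each layer."""
    rf = 1
    for kernel_size, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel_size
    return rf

layers = [(3, 1), (3, 1), (3, 2), (3, 1), (3, 2)]
print(rf_forward(layers), rf_backward(layers))
```

Both directions give 15 for this architecture. The backward recursion must use the layers in reverse order; applying it from the first layer onward generally gives a different, incorrect number.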

## 7. Practical Implementation with TensorFlow

Let's see how these concepts work in practice:

In [None]:
# Create a small stack of convolutional layers
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),  # declaring the input here avoids the deprecated input_shape argument
    keras.layers.Conv2D(32, (3, 3), strides=1, padding='same', activation='relu'),    # -> (28, 28, 32)
    keras.layers.Conv2D(64, (3, 3), strides=2, padding='valid', activation='relu'),   # -> (13, 13, 64)
    keras.layers.Conv2D(128, (5, 5), strides=1, padding='same', activation='relu')    # -> (13, 13, 128)
])

model.summary()

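# Visualize the filters of a conv layer
# (the model here is untrained, so the plots show the random initial weights;
#  run the same function after training to see the learned filters)
def visualize_filters(model, layer_idx=0):
    """Plot up to 32 filters of a conv layer (first input channel only)"""
    filters = model.layers[layer_idx].get_weights()[0]  # shape: (kh, kw, in_channels, n_filters)
    n_filters = filters.shape[-1]
    
    fig, axes = plt.subplots(4, 8, figsize=(16, 8))
    for ax in axes.flat:
        ax.axis('off')  # also hides panels left empty when n_filters < 32
    for i in range(min(32, n_filters)):
        ax = axes[i//8, i%8]
        ax.imshow(filters[:, :, 0, i], cmap='gray')
        ax.set_title(f'Filter {i+1}')
    
    plt.tight_layout()
    plt.show()

# A forward pass on a dummy input makes sure the layer weights have been created
dummy_input = np.random.random((1, 28, 28, 1))
_ = model(dummy_input)
visualize_filters(model, 0)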
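
The numbers printed by `model.summary()` can be predicted by hand. The sketch below is plain Python and does not call TensorFlow; it assumes square inputs and kernels and uses the usual Keras rules: `'same'` gives `ceil(n / s)`, `'valid'` gives `ceil((n - k + 1) / s)`, and each Conv2D layer has `k * k * in_channels * filters` weights plus one bias per filter. `conv2d_summary` is a hypothetical helper name.

```python
import math

def conv2d_summary(size, in_channels, filters, kernel, stride, padding):
    """Return (output size, parameter count) for one square Conv2D layer."""
    if padding == "same":
        out = math.ceil(size / stride)
    elif padding == "valid":
        out = math.ceil((size - kernel + 1) / stride)
    else:
        raise ValueError(f"unknown padding: {padding}")
    params = kernel * kernel * in_channels * filters + filters  # weights + biases
    return out, params

# (filters, kernel, stride, padding) for the three layers of the model above
specs = [(32, 3, 1, "same"), (64, 3, 2, "valid"), (128, 5, 1, "same")]

size, channels = 28, 1
rows = []
for filters, kernel, stride, padding in specs:
    size, params = conv2d_summary(size, channels, filters, kernel, stride, padding)
    rows.append(((size, size, filters), params))
    channels = filters   # this layer's filters are the next layer's input channels

for shape, params in rows:
    print(shape, params)
```

Comparing these values with the summary table is a useful sanity check: the strided `'valid'` layer is the only one that changes the spatial size, and the 5×5 layer holds most of the parameters.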

## 8. Key Takeaways

### Essential Concepts:
1. **Kernels/Filters**: Small matrices that detect features through convolution
2. **Stride**: Controls how much the kernel moves (affects output size)
3. **Padding**: Adds borders to control output dimensions
4. **Dilation**: Increases receptive field without adding parameters
5. **Receptive Field**: Input region affecting each output pixel

### Practical Tips:
- Start with small kernels (3×3, 5×5)
- Use stride > 1 for downsampling
- Apply padding to maintain spatial dimensions
- Stack multiple layers to detect complex features
- Monitor receptive field growth through network depth

## 9. Practice Exercises

Try these exercises to reinforce your understanding:

In [None]:
# Exercise 1: Create a custom edge detection kernel
# TODO: Design a kernel that detects 45-degree diagonal edges

# Exercise 2: Calculate output dimensions
# Given: Input (224, 224), Kernel (7, 7), Stride 2, Padding 3
# Calculate: Output dimensions

# Exercise 3: Implement depthwise convolution
# TODO: Create a function that applies a separate kernel to each input channel
#       (one kernel per channel, with no summing across channels)

print("Complete the exercises above to practice your understanding!")