# 🏗️ Notebook 05: Simple CNN Architecture

**Purpose:** Design and implement a convolutional neural network from scratch for binary classification.

**What you'll learn:** How Conv2D, ReLU, MaxPool, and fully connected layers work together to classify images.


## 🎯 Concept Primer: CNN Building Blocks

### Convolutional Layers (Conv2D)
- **Purpose:** Detect local patterns (edges, textures, shapes)
- **Parameters:**
  - `in_channels`: Number of input channels (3 for RGB)
  - `out_channels`: Number of filters/feature maps (32, 64, 128)
  - `kernel_size`: Size of convolution window (3×3 common)
  - `padding`: Number of zero pixels added around the input border; with a 3×3 kernel and stride 1, `padding=1` preserves height and width

**Example:** `nn.Conv2d(3, 32, kernel_size=3, padding=1)`
- Input: [Batch, 3, 96, 96]
- Output: [Batch, 32, 96, 96] (with padding=1)
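
The shapes above can be checked directly in PyTorch. This is a minimal sketch: the batch size of 4 and the random input are assumptions for illustration, not part of the exercise.

```python
import torch
import torch.nn as nn

# 3 input channels (RGB) -> 32 feature maps; padding=1 keeps a 3x3 conv size-preserving
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)

x = torch.randn(4, 3, 96, 96)  # hypothetical batch of 4 random "images"
y = conv(x)

print(x.shape)  # torch.Size([4, 3, 96, 96])
print(y.shape)  # torch.Size([4, 32, 96, 96])

# Without padding, a 3x3 kernel trims one pixel from each border: 96 -> 94
y_no_pad = nn.Conv2d(3, 32, kernel_size=3)(x)
print(y_no_pad.shape)  # torch.Size([4, 32, 94, 94])
```

Only the channel dimension changes when `padding=1`; the spatial reduction in this architecture comes from pooling.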

### ReLU Activation
- **Formula:** `ReLU(x) = max(0, x)`
- **Purpose:** Introduces non-linearity (allows learning complex patterns)
- Applied after each convolution (and after the first fully connected layer in this network)
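
As a quick illustration of the formula, the sketch below applies `F.relu` to a handful of hand-picked values (the values themselves are arbitrary):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])
y = F.relu(x)  # negatives become 0, non-negative values pass through unchanged

print(y.tolist())  # [0.0, 0.0, 0.0, 0.5, 2.0]
```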

### MaxPool2D
- **Purpose:** Reduces spatial dimensions, keeps strongest features
- **Typical:** `nn.MaxPool2d(2)` → Halves width and height

**Example:** `nn.MaxPool2d(2)` on [Batch, 32, 96, 96]
- Output: [Batch, 32, 48, 48]
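
To see what "keeps strongest features" means, here is a small sketch on a hand-written 4×4 feature map (the numbers are made up for illustration). Each non-overlapping 2×2 window is replaced by its maximum:

```python
import torch
import torch.nn.functional as F

# One sample, one channel, 4x4 spatial grid: shape [1, 1, 4, 4]
x = torch.tensor([[1., 2., 5., 6.],
                  [3., 4., 7., 8.],
                  [9., 10., 13., 14.],
                  [11., 12., 15., 16.]]).reshape(1, 1, 4, 4)

pooled = F.max_pool2d(x, 2)  # 2x2 window, stride defaults to the window size

print(pooled.shape)               # torch.Size([1, 1, 2, 2])
print(pooled.squeeze().tolist())  # [[4.0, 8.0], [12.0, 16.0]]
```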

### Flatten
- Converts each sample's stack of feature maps ([Channels, Height, Width]) into a 1D vector for the fully connected layers; the batch dimension is kept
- **Example:** [Batch, 128, 12, 12] → [Batch, 18432]
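
A short sketch of the two common ways to flatten in PyTorch, assuming a random tensor with batch size 8 as a stand-in for real feature maps:

```python
import torch

x = torch.randn(8, 128, 12, 12)  # stand-in for the output of the last conv block

flat_view = x.view(x.size(0), -1)        # keep the batch dim, merge the rest
flat_fn = torch.flatten(x, start_dim=1)  # equivalent, via torch.flatten

print(flat_view.shape)                  # torch.Size([8, 18432])
print(torch.equal(flat_view, flat_fn))  # True
```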

### Fully Connected (Linear) Layers
- **Purpose:** Combine global features for classification
- **Final layer:** 1 output neuron for binary classification

### Sigmoid Activation
- **Formula:** `σ(x) = 1 / (1 + e^(-x))`
- **Output:** A value between 0 and 1, interpreted as the probability of the positive class
- Applied to final layer for binary classification
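
The sketch below shows sigmoid mapping raw scores (logits) to probabilities. The logit values and the 0.5 decision threshold are assumptions chosen for illustration; the notebook itself does not fix a threshold.

```python
import torch

logits = torch.tensor([-4.0, 0.0, 4.0])  # arbitrary raw scores
probs = torch.sigmoid(logits)            # roughly 0.018, 0.5, 0.982

# Assumed threshold of 0.5 to turn probabilities into class labels
preds = (probs >= 0.5).long()

print(probs)
print(preds.tolist())  # [0, 1, 1]
```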


## 📐 Architecture Specification

### SimpleCNN: 3 Conv Blocks + 2 FC Layers

```
Input: [Batch, 3, 96, 96]
    ↓
BLOCK 1:
  Conv2D(in=3, out=32, kernel=3, padding=1)
  ReLU
  MaxPool2D(2)
    → [Batch, 32, 48, 48]
    ↓
BLOCK 2:
  Conv2D(in=32, out=64, kernel=3, padding=1)
  ReLU
  MaxPool2D(2)
    → [Batch, 64, 24, 24]
    ↓
BLOCK 3:
  Conv2D(in=64, out=128, kernel=3, padding=1)
  ReLU
  MaxPool2D(2)
    → [Batch, 128, 12, 12]
    ↓
Flatten: [Batch, 128×12×12] = [Batch, 18432]
    ↓
FC1: Linear(18432, 256)
ReLU
    → [Batch, 256]
    ↓
FC2: Linear(256, 1)
Sigmoid
    → [Batch, 1]
    ↓
Squeeze(dim=1)  → [Batch]
```

### Spatial Dimension Tracking
- 96 → 48 → 24 → 12 (three halvings via MaxPool)
- Final flatten size: 128 × 12 × 12 = **18,432**
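
The same bookkeeping can be done in plain Python using the standard output-size formula `floor((size + 2·padding − kernel) / stride) + 1`. The helper names `conv_out` and `pool_out` are hypothetical, introduced here only for this sketch.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Output height/width of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2):
    """Output size of max pooling with stride == kernel and no padding."""
    return size // kernel


size = 96
channels = [32, 64, 128]
for c in channels:
    size = pool_out(conv_out(size))  # conv keeps the size, pool halves it
    print(f"{c} channels -> {size}x{size}")

flatten_size = channels[-1] * size * size
print(flatten_size)  # 18432
```

If the input resolution or the number of blocks changes, rerunning this calculation gives the new `in_features` for the first Linear layer.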


## 📚 Learning Objectives

By the end of this notebook, you will:

1. ✅ Define a `SimpleCNN` class inheriting from `nn.Module`
2. ✅ Implement `__init__()` to define layers
3. ✅ Implement `forward()` to define the computation flow
4. ✅ Instantiate `cnn_model` and print its architecture
5. ✅ Test forward pass on a fake batch to verify output shape: `[Batch]`


## ✅ Acceptance Criteria

Your CNN is correct when:

- [ ] `SimpleCNN` inherits from `nn.Module`
- [ ] `__init__()` defines 3 Conv2D layers, 2 Linear layers
- [ ] `forward()` applies Conv→ReLU→MaxPool three times, then Flatten→FC→ReLU→FC→Sigmoid→Squeeze
- [ ] Forward pass on input `[8,3,96,96]` produces output shape `[8]`
- [ ] Output values are in range [0, 1] (probabilities)
- [ ] `print(cnn_model)` displays all layers
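
The shape and range criteria can be turned into assertions. The sketch below is one possible way to do that: `check_binary_classifier` is a hypothetical helper, and `StandIn` is a deliberately trivial placeholder model (not `SimpleCNN`) used only so the example runs on its own.

```python
import torch
import torch.nn as nn


def check_binary_classifier(model, batch_size=8):
    """Hypothetical helper: assert output shape [batch_size] and values in [0, 1]."""
    model.eval()
    with torch.no_grad():
        out = model(torch.randn(batch_size, 3, 96, 96))
    assert out.shape == (batch_size,), f"expected [{batch_size}], got {list(out.shape)}"
    assert out.min() >= 0 and out.max() <= 1, "outputs must lie in [0, 1]"
    return out


class StandIn(nn.Module):
    """Placeholder, NOT SimpleCNN: averages each channel, then one Linear layer."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3, 1)

    def forward(self, x):
        x = x.mean(dim=(2, 3))                       # [B, 3, 96, 96] -> [B, 3]
        return torch.sigmoid(self.fc(x)).squeeze(1)  # [B, 1] -> [B]


out = check_binary_classifier(StandIn())
print(out.shape)  # torch.Size([8])
```

Once your own model is defined, calling `check_binary_classifier(cnn_model)` would exercise the same two checks.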


---

## 💻 TODO 1: Import PyTorch Libraries


```python
# TODO 1: Import PyTorch
# Hint: import torch
# Hint: import torch.nn as nn
# Hint: import torch.nn.functional as F

# YOUR CODE HERE

print("✅ PyTorch imported")
```


---

## 💻 TODO 2: Define the SimpleCNN Class (`__init__` Method)

**What to define in `__init__`:**

```python
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        
        # TODO: Define convolutional layers
        self.conv1 = nn.Conv2d(???, ???, kernel_size=???, padding=???)
        self.conv2 = ???
        self.conv3 = ???
        
        # TODO: Define fully connected layers
        self.fc1 = nn.Linear(???, ???)
        self.fc2 = nn.Linear(???, ???)
```

**Hints:**
- `conv1`: 3 → 32 channels
- `conv2`: 32 → 64 channels
- `conv3`: 64 → 128 channels
- All convs: kernel_size=3, padding=1
- `fc1`: 18432 → 256
- `fc2`: 256 → 1


```python
# TODO 2: Define SimpleCNN class with __init__ method
# Hint: class SimpleCNN(nn.Module):
# Hint:     def __init__(self):
# Hint:         super(SimpleCNN, self).__init__()
# Hint:         # Define layers here

# YOUR CODE HERE
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        
        # TODO: Define conv1, conv2, conv3
        # TODO: Define fc1, fc2
        pass  # Remove this line when you add code
    
    def forward(self, x):
        # We'll implement this in the next TODO
        pass

print("✅ SimpleCNN class defined (skeleton)")
```


---

## 💻 TODO 3: Implement forward() Method

**What the forward method should do:**

```python
def forward(self, x):
    # Block 1: Conv1 → ReLU → MaxPool
    # TODO: x = self.conv1(x)
    # TODO: x = F.relu(x)
    # TODO: x = F.max_pool2d(x, 2)  # or call an nn.MaxPool2d(2) module defined in __init__
    
    # Block 2: Conv2 → ReLU → MaxPool
    # TODO: ...
    
    # Block 3: Conv3 → ReLU → MaxPool
    # TODO: ...
    
    # Flatten: [Batch, 128, 12, 12] → [Batch, 18432]
    # TODO: x = x.view(x.size(0), -1)  # or torch.flatten(x, 1)
    
    # FC1 → ReLU
    # TODO: x = self.fc1(x)
    # TODO: x = F.relu(x)
    
    # FC2 → Sigmoid → Squeeze
    # TODO: x = self.fc2(x)
    # TODO: x = torch.sigmoid(x)
    # TODO: x = x.squeeze(1)  # [Batch, 1] → [Batch]
    
    return x
```

**Expected shapes at each step:**
- After Conv1+Pool: [B, 32, 48, 48]
- After Conv2+Pool: [B, 64, 24, 24]
- After Conv3+Pool: [B, 128, 12, 12]
- After Flatten: [B, 18432]
- After FC1: [B, 256]
- After FC2: [B, 1]
- After Squeeze: [B]


```python
# TODO 3: Implement the forward() method
# Re-define the class with complete forward() implementation

# YOUR CODE HERE
# Copy the class from TODO 2 and complete the forward() method

print("✅ SimpleCNN forward() method implemented")
```


---

## 💻 TODO 4: Instantiate the Model & Print Architecture


```python
# TODO 4: Instantiate cnn_model and print it
# Hint: cnn_model = SimpleCNN()
# Hint: print(cnn_model)

# YOUR CODE HERE
cnn_model = None  # Replace this line

print("✅ Model instantiated:")
# Print the model
```


---

## 💻 TODO 5: Test Forward Pass with Fake Batch

**What you need to do:**
1. Create a fake batch: `fake_batch = torch.randn(8, 3, 96, 96)`
2. Run forward pass: `outputs = cnn_model(fake_batch)`
3. Print output shape and value range

**Expected output:**
```
✅ Forward pass successful
   Input shape: torch.Size([8, 3, 96, 96])
   Output shape: torch.Size([8])
   Output range: [min_val, max_val]  (should be between 0 and 1)
```


```python
# TODO 5: Test forward pass with a fake batch
# Hint: fake_batch = torch.randn(8, 3, 96, 96)
# Hint: outputs = cnn_model(fake_batch)
# Hint: print(outputs.shape)

# YOUR CODE HERE

print("✅ Forward pass successful")
# Print input shape, output shape, min/max output values
```


---

## 🤔 Reflection Prompts

### Question 1: Why Sigmoid at the End?
We use Sigmoid for the final activation, producing outputs in [0, 1].

**Alternative:** Use no final activation and `BCEWithLogitsLoss` instead of `BCELoss`.
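
Before weighing the two options, it may help to confirm that they compute the same quantity on ordinary inputs. A minimal sketch, with made-up logits and targets:

```python
import torch
import torch.nn as nn

logits = torch.tensor([-2.0, 0.5, 3.0])   # arbitrary raw model outputs
targets = torch.tensor([0.0, 1.0, 1.0])   # arbitrary binary labels

# Option A: apply sigmoid yourself, then BCELoss on probabilities
loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)

# Option B: pass raw logits; the loss applies sigmoid internally
loss_b = nn.BCEWithLogitsLoss()(logits, targets)

print(loss_a.item(), loss_b.item())
print(torch.allclose(loss_a, loss_b, atol=1e-6))  # True
```

Since the values agree here, the trade-offs lie elsewhere, for example in how each option behaves for very large or very small logits and in what the model returns at inference time.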

**Question:** What are the trade-offs between:
- Sigmoid + BCELoss (what we use)
- No Sigmoid + BCEWithLogitsLoss

**Your analysis:**

---

### Question 2: Flattening Calculation
After three MaxPool operations, we have shape [Batch, 128, 12, 12].

**Calculate:**
- How many parameters does `fc1 = nn.Linear(18432, 256)` have?
- Formula: `num_params = (input_size × output_size) + output_size` (weights + biases)
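
To check the formula without giving away the answer, the sketch below applies it to a deliberately small, hypothetical layer (`nn.Linear(10, 4)`, not `fc1`) and compares the result against PyTorch's own parameter count:

```python
import torch.nn as nn

layer = nn.Linear(10, 4)  # hypothetical small layer, not fc1

by_formula = 10 * 4 + 4                                  # weights + biases
by_pytorch = sum(p.numel() for p in layer.parameters())  # counted from the module

print(by_formula, by_pytorch)  # 44 44
```

The same `sum(p.numel() ...)` expression can be used on your real layer to verify your hand calculation.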

**Your calculation:**

---

### Question 3: Receptive Field
Each conv layer with kernel_size=3 expands the receptive field.

**Question:** After 3 conv layers, how much of the original input does a single neuron "see"?
(Hint: This is a theoretical question about the receptive field growing with each layer.)
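
One common way to reason about this is the recurrence "each layer adds `(kernel − 1) × jump` to the receptive field, and each stride multiplies the jump". The sketch below implements that recurrence in a hypothetical helper, `receptive_field`, and applies it to small example stacks rather than to the full network:

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) tuples in forward order.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # the window widens by (kernel - 1) input steps
        jump *= stride             # striding spreads later steps further apart
    return rf


print(receptive_field([(3, 1)]))                  # one 3x3 conv -> 3
print(receptive_field([(3, 1), (3, 1)]))          # two stacked 3x3 convs -> 5
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # conv, 2x2 pool, conv -> 8
```

Note how the pooling layer in the third example makes the following convolution cover more of the input than a plain stack of convolutions would.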

**Your intuition:**

---


## 🚀 Next Steps

Amazing! You've built a CNN from scratch.

**Move to Notebook 06:** Device, Loss, and Optimizer Setup

**Key Takeaway:** Conv→ReLU→Pool extracts features; FC layers classify!
