<a href="https://colab.research.google.com/github/Saibhossain/face-generation-model/blob/main/Learning_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# A CNN is a neural network designed to extract patterns from images using:



*   Convolution
*   Non-linearity (ReLU)
*   Pooling
*   Fully-connected layers



## Instead of relying on hand-crafted features, CNNs learn filters (kernels) directly from raw pixels that detect:

* edges
* textures
* shapes
* faces/features (eyes, mouth)
* concepts (cat, dog, person)



### 1️⃣ The Convolution Operation

We slide a kernel/filter over the image and compute:



```
Output(i,j) = ∑∑ X(i+m, j+n) ⋅ W(m,n)
m=0→k-1, n=0→k-1
```

**Where:**

| Symbol | Meaning |
|--------|---------|
| **X** | Input image |
| **W** | Filter/kernel weights |
| **k** | Kernel size (e.g., 3×3 means k = 3) |
| **i,j** | Top-left corner of the current patch, which is also the position in the output feature map |
| **m,n** | Offsets inside the kernel |

Every filter produces one feature map.

**Example:**
- 1 input image + 32 filters → 32 output feature maps
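
As a minimal sketch of the sum above, assuming a single-channel input, stride 1, and no padding, the loop below computes one feature map by hand and compares it with `torch.nn.functional.conv2d`. The helper name `conv2d_naive` and the toy 5×5 input are illustrative assumptions, not part of the original notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv2d_naive(X, W):
    """Slide a k x k kernel W over a 2-D image X (stride 1, no padding)."""
    k = W.shape[0]
    out_h = X.shape[0] - k + 1
    out_w = X.shape[1] - k + 1
    out = torch.zeros(out_h, out_w)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current patch and the kernel, then sum
            out[i, j] = (X[i:i + k, j:j + k] * W).sum()
    return out

torch.manual_seed(0)
X = torch.randn(5, 5)  # toy 5x5 "image"
W = torch.randn(3, 3)  # one 3x3 filter

naive = conv2d_naive(X, W)
# F.conv2d expects input [batch, channels, H, W] and weight [out_ch, in_ch, k, k]
reference = F.conv2d(X[None, None], W[None, None])[0, 0]
print(naive.shape)
print(torch.allclose(naive, reference, atol=1e-5))

# 32 filters applied to one image give 32 feature maps
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)
feature_maps = conv(X[None, None])
print(feature_maps.shape)
```

The last line illustrates the example above: the channel dimension of the output equals the number of filters.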

---

### 2️⃣ Padding & Stride (Formula)

**Output size formula (per spatial dimension, rounded down):**

```
O = ⌊(I - K + 2P)/S⌋ + 1
```


**Where:**
- **I** = Input size
- **K** = Kernel size  
- **P** = Padding
- **S** = Stride
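
The formula can be checked with a few lines of plain Python. This is a small sketch; the function name `conv_output_size` is a hypothetical helper, and integer division is used on the assumption that fractional results are rounded down, which is PyTorch's default behaviour.

```python
def conv_output_size(i, k, p=0, s=1):
    """Output size along one spatial dimension: (I - K + 2P) // S + 1."""
    return (i - k + 2 * p) // s + 1

# 28x28 input, 3x3 kernel, padding 1, stride 1: the size is preserved
print(conv_output_size(28, 3, p=1, s=1))  # 28
# Same kernel without padding: the map shrinks by 2
print(conv_output_size(28, 3, p=0, s=1))  # 26
# 2x2 window with stride 2: the size is halved
print(conv_output_size(28, 2, p=0, s=2))  # 14
```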

---

### 3️⃣ ReLU Activation

Adds non-linearity to the network:

```
ReLU(x) = max(0, x)
```
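
A quick check of the definition, assuming PyTorch as in the rest of this notebook; the input values are arbitrary. Negative entries become zero and positive entries pass through unchanged.

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

relu_out = torch.relu(x)
print(relu_out.tolist())  # [0.0, 0.0, 0.0, 1.5, 3.0]

# The same result written directly as max(0, x)
manual = torch.clamp(x, min=0)
print(torch.equal(relu_out, manual))  # True
```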

---

### 4️⃣ MaxPooling

Reduces the spatial size of feature maps by keeping only the largest value in each window. With no padding, the output size is:

```
O = ⌊(I - K)/S⌋ + 1
```

**Typical Configuration:**
- 2×2 kernel
- Stride 2
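
A small sketch of the typical configuration, using a hand-made 4×4 input (an assumption chosen so the result is easy to verify by eye). Each non-overlapping 2×2 window is replaced by its maximum, so the 4×4 map becomes 2×2.

```python
import torch
import torch.nn as nn

# Values 0..15 laid out as one 4x4 feature map: [batch=1, channels=1, 4, 4]
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

pool = nn.MaxPool2d(kernel_size=2, stride=2)
y = pool(x)

print(y.shape)           # torch.Size([1, 1, 2, 2])
print(y[0, 0].tolist())  # [[5.0, 7.0], [13.0, 15.0]]
```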

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor()
])
train_data = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
test_data = datasets.MNIST(root="./data", train=False, download=True, transform=transform)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
test_loader = DataLoader(test_data, batch_size=64, shuffle=False)

print("\n--- Training Data Info ---")
print(f"Number of training samples: {len(train_data)}")

# Get an example image and label
example_train_image, example_train_label = train_data[0]
print(f"Shape of a single training image: {example_train_image.shape}")
print(f"Example training label: {example_train_label}")

print("\n--- Test Data Info ---")
print(f"Number of test samples: {len(test_data)}")

# Get an example image and label
example_test_image, example_test_label = test_data[0]
print(f"Shape of a single test image: {example_test_image.shape}")
print(f"Example test label: {example_test_label}")

print("\n--- Dataset Details ---")
print(f"Number of classes: {len(train_data.classes)}")
print(f"Class names: {train_data.classes}")



--- Training Data Info ---
Number of training samples: 60000
Shape of a single training image: torch.Size([1, 28, 28])
Example training label: 5

--- Test Data Info ---
Number of test samples: 10000
Shape of a single test image: torch.Size([1, 28, 28])
Example test label: 7

--- Dataset Details ---
Number of classes: 10
Class names: ['0 - zero', '1 - one', '2 - two', '3 - three', '4 - four', '5 - five', '6 - six', '7 - seven', '8 - eight', '9 - nine']


### How a CNN Trains:

The training process for a Convolutional Neural Network (CNN) generally involves these steps, repeated over several **epochs** (full passes through the training data) and **batches** (subsets of data):

1.  **Forward Pass**: Input data (e.g., an image) is fed into the network. Each layer performs its operation (convolution, activation, pooling, linear transformation) and passes the output to the next layer. This culminates in the final output: one raw score (logit) per class, which a softmax turns into class probabilities.

2.  **Loss Calculation**: The network's output is compared to the true labels using a **loss function** (e.g., `nn.CrossEntropyLoss` for classification). This function quantifies how 'wrong' the network's prediction was.

3.  **Backward Pass (Backpropagation)**: The calculated loss is then used to compute the gradients of the loss with respect to each of the model's parameters (weights and biases). This process propagates the error backward through the network.

4.  **Optimizer Step**: An **optimizer** (e.g., `optim.Adam`) uses these gradients to update the model's parameters. The goal is to adjust the parameters in a direction that reduces the loss in subsequent forward passes.

This cycle continues until the network's performance on the training data (and ideally, unseen validation data) reaches a satisfactory level.
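
The four steps can be sketched on a single batch. The snippet below is only an illustration under stated assumptions: it uses a tiny stand-in linear model and random tensors shaped like MNIST batches rather than the real dataset or the CNN defined later, and it repeats the same batch five times so the loss visibly drops.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Stand-in model and random data, used only to illustrate the four steps
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

images = torch.randn(64, 1, 28, 28)   # fake batch of 64 grayscale images
labels = torch.randint(0, 10, (64,))  # fake class labels in 0..9

losses = []
for step in range(5):
    outputs = model(images)            # 1. forward pass -> logits [64, 10]
    loss = criterion(outputs, labels)  # 2. loss calculation
    optimizer.zero_grad()              # clear gradients left from the last step
    loss.backward()                    # 3. backward pass (backpropagation)
    optimizer.step()                   # 4. optimizer step
    losses.append(loss.item())
    print(f"step {step}: loss = {loss.item():.4f}")
```

`optimizer.zero_grad()` is needed because PyTorch accumulates gradients across `backward()` calls by default.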


### Debugging the Model's Data Flow:

A common way to debug CNNs, especially when dealing with dimension issues, is to print the shape of the tensor `x` as it passes through different layers in the `forward` method. This helps you understand how the spatial dimensions and channel counts change after each operation.

In [None]:
# Modified SimpleCNN with print statements for debugging data flow
class DebugSimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2)

        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2)

        # Linear layer input size needs to be carefully calculated
        # For an input of 28x28:
        # After conv1 (28x28) -> pool1 (14x14)
        # After conv2 (14x14) -> pool2 (7x7)
        # Channels are 16
        self.fc1 = nn.Linear(16 * 7 * 7, 64)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        print(f"Input shape: {x.shape}") # Expected: [batch_size, 1, 28, 28]

        x = self.conv1(x)
        print(f"After conv1: {x.shape}") # Expected: [batch_size, 8, 28, 28] (with padding=1)
        x = self.relu1(x)
        print(f"After relu1: {x.shape}") # Shape remains the same
        x = self.pool1(x)
        print(f"After pool1: {x.shape}") # Expected: [batch_size, 8, 14, 14] (28/2)

        x = self.conv2(x)
        print(f"After conv2: {x.shape}") # Expected: [batch_size, 16, 14, 14] (with padding=1)
        x = self.relu2(x)
        print(f"After relu2: {x.shape}") # Shape remains the same
        x = self.pool2(x)
        print(f"After pool2: {x.shape}") # Expected: [batch_size, 16, 7, 7] (14/2)

        # Flatten the tensor for the fully connected layer
        # x.shape[0] is the batch size
        x = x.view(x.shape[0], -1)
        print(f"After flattening: {x.shape}") # Expected: [batch_size, 16*7*7]

        x = self.fc1(x)
        print(f"After fc1: {x.shape}") # Expected: [batch_size, 64]
        x = self.relu3(x)
        print(f"After relu3: {x.shape}") # Shape remains the same
        x = self.fc2(x)
        print(f"Final output shape: {x.shape}") # Expected: [batch_size, 10]
        return x

# Instantiate the debug model and pass a sample batch to see the print statements
# Note: This will not train the model, but only demonstrate the forward pass.

debug_model = DebugSimpleCNN()

# Get one batch of images from the training loader
sample_images, sample_labels = next(iter(train_loader))

# Perform a forward pass with the sample images
print("\n--- Running Debug Model Forward Pass ---")
_ = debug_model(sample_images)
print("----------------------------------------")

# You can also set a breakpoint in the forward method and inspect variables using a debugger if your IDE supports it.


--- Running Debug Model Forward Pass ---
Input shape: torch.Size([64, 1, 28, 28])
After conv1: torch.Size([64, 8, 28, 28])
After relu1: torch.Size([64, 8, 28, 28])
After pool1: torch.Size([64, 8, 14, 14])
After conv2: torch.Size([64, 16, 14, 14])
After relu2: torch.Size([64, 16, 14, 14])
After pool2: torch.Size([64, 16, 7, 7])
After flattening: torch.Size([64, 784])
After fc1: torch.Size([64, 64])
After relu3: torch.Size([64, 64])
Final output shape: torch.Size([64, 10])
----------------------------------------
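
An alternative that avoids editing `forward` is to attach forward hooks, which PyTorch calls after each module runs. The sketch below is an assumption-laden illustration, not the notebook's own method: it rebuilds the same layer layout as an `nn.Sequential` stand-in, feeds it random input, and records each layer's output shape. The names `net`, `shapes`, and `make_hook` are hypothetical.

```python
import torch
import torch.nn as nn

# Stand-in network with the same layer layout as DebugSimpleCNN
net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 7 * 7, 64), nn.ReLU(), nn.Linear(64, 10),
)

shapes = []  # (layer name, output shape) pairs recorded during the forward pass

def make_hook(name):
    def hook(module, inputs, output):
        shapes.append((name, tuple(output.shape)))
    return hook

handles = [
    layer.register_forward_hook(make_hook(f"{idx}:{layer.__class__.__name__}"))
    for idx, layer in enumerate(net)
]

with torch.no_grad():
    net(torch.randn(4, 1, 28, 28))  # random batch of 4 MNIST-sized images

for name, shape in shapes:
    print(f"{name:<14} {shape}")

# Remove the hooks so later forward passes stay silent
for handle in handles:
    handle.remove()
```

Because the hooks can be removed, the same model can then be trained without flooding the output with shape messages.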


In [None]:
# ----------------------------
# 2. Define and Instantiate DebugSimpleCNN for Training
# ----------------------------
# Using DebugSimpleCNN to observe internal tensor shapes during training
model = DebugSimpleCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# ----------------------------
# 3. Train the CNN with Debug Prints
# ----------------------------
print("\n--- Starting Training with DebugSimpleCNN ---")
for epoch in range(3):  # just 3 epochs for demo
    print(f"\n--- Epoch {epoch+1} ---")
    for batch_idx, (images, labels) in enumerate(train_loader):
        # Print a header for the first batch of the first epoch only.
        # Note: DebugSimpleCNN.forward prints shapes on every call, so every
        # batch (not just the first) still produces debug output.
        if epoch == 0 and batch_idx == 0:
            print(f"\nDebugging forward pass for Batch {batch_idx+1}:")
        outputs = model(images)

        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Note: this is the loss of the last batch in the epoch, not an epoch average
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
print("--- Training Finished ---")

# ----------------------------
# 4. Evaluate the CNN
# ----------------------------
print("\n--- Starting Evaluation ---")
correct = 0
total = 0
model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        # Index of the highest logit along the class dimension is the prediction
        _, predicted = torch.max(outputs, 1)

        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Test Accuracy: {100 * correct / total:.2f}%")
print("--- Evaluation Finished ---")

# ----------------------------
# 5. Predict on a sample
# ----------------------------
print("\n--- Predicting on a Sample ---")
example_image, example_label = test_data[0]
model.eval()
with torch.no_grad():
    # The forward pass prints will trigger here as well
    output = model(example_image.unsqueeze(0))
    predicted_class = output.argmax(dim=1).item()

print("True Label:", example_label)
print("Predicted Label:", predicted_class)
print("--- Prediction Finished ---")


Streaming output truncated to the last 5000 lines.
After relu2: torch.Size([64, 16, 14, 14])
After pool2: torch.Size([64, 16, 7, 7])
After flattening: torch.Size([64, 784])
After fc1: torch.Size([64, 64])
After relu3: torch.Size([64, 64])
Final output shape: torch.Size([64, 10])
Input shape: torch.Size([64, 1, 28, 28])
After conv1: torch.Size([64, 8, 28, 28])
After relu1: torch.Size([64, 8, 28, 28])
After pool1: torch.Size([64, 8, 14, 14])
After conv2: torch.Size([64, 16, 14, 14])
After relu2: torch.Size([64, 16, 14, 14])
After pool2: torch.Size([64, 16, 7, 7])
After flattening: torch.Size([64, 784])
After fc1: torch.Size([64, 64])
After relu3: torch.Size([64, 64])
Final output shape: torch.Size([64, 10])
Input shape: torch.Size([64, 1, 28, 28])
After conv1: torch.Size([64, 8, 28, 28])
After relu1: torch.Size([64, 8, 28, 28])
After pool1: torch.Size([64, 8, 14, 14])
After conv2: torch.Size([64, 16, 14, 14])
After relu2: torch.Size([64, 16, 14, 14])
After pool2: torch.Size