# Self-Study Try-it 17.2: CNN Modifications and Preprocessing

This activity extends the existing LeNet-style CIFAR-10 baseline code by incorporating dropout, variations in filter size and padding, data augmentation, and a deeper network architecture.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

### Step 1: Add Dropout for Regularization
Dropout helps prevent overfitting by randomly "dropping" units during training, forcing the model to learn more robust features.

Modify your LeNet class to include dropout layers:

In [None]:
class LeNetDropout(nn.Module):
    def __init__(self):
        super(LeNetDropout, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5, padding=2)
        self.pool = nn.AvgPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)

        self.dropout = nn.Dropout(p=0.5)  # Dropout with 50% rate after fc1
        # fc1 input size: 32 -> conv1 (5x5, pad 2) 32 -> pool 16 -> conv2 (5x5, no pad) 12 -> pool 6
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 16 channels x 6 x 6 feature map
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.pool(self.sigmoid(self.conv1(x)))
        x = self.pool(self.sigmoid(self.conv2(x)))
        # print(x.shape)  # Uncomment to check the shape before flattening
        x = x.view(-1, 16 * 6 * 6)  # Flatten to (batch, 576)
        x = self.sigmoid(self.fc1(x))
        x = self.dropout(x)  # Apply dropout here
        x = self.sigmoid(self.fc2(x))
        x = self.fc3(x)
        return x
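
To make the mechanism concrete, here is a minimal plain-Python sketch of inverted dropout, the scheme `nn.Dropout` follows: during training each value is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`, while in evaluation mode the input passes through unchanged. The helper name `inverted_dropout` and the toy activations are assumptions made for illustration and are not part of the notebook.

```python
import random


def inverted_dropout(values, p, training, rng):
    """Zero each value with probability p and rescale survivors by 1 / (1 - p).

    Hypothetical helper for illustration; requires 0 <= p < 1.
    In evaluation mode the input is returned unchanged.
    """
    if not training or p == 0.0:
        return list(values)
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10

train_out = inverted_dropout(activations, p=0.5, training=True, rng=rng)
eval_out = inverted_dropout(activations, p=0.5, training=False, rng=rng)

print(train_out)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
print(eval_out)   # identical to the input
```

The rescaling keeps the expected activation the same in training and evaluation, which is why calling `net.train()` and `net.eval()` at the right moments matters.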

### Step 2: Experiment with Filter Size and Padding
You can try different kernel sizes or add padding to control the size of feature maps after convolution.

Example: Change the second convolution to a 3x3 kernel with a padding of 1.

In [None]:
class LeNetDropout1(nn.Module):
    def __init__(self):
        super(LeNetDropout1, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5, padding=2)
        self.pool = nn.AvgPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=3, padding=1)

        self.dropout = nn.Dropout(p=0.75)  # Dropout with 75% rate after fc1
        # fc1 input size: 32 -> conv1 (5x5, pad 2) 32 -> pool 16 -> conv2 (3x3, pad 1) 16 -> pool 8
        self.fc1 = nn.Linear(16 * 8 * 8, 120) # Adjusted based on CIFAR-10 input size and pooling
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.pool(self.sigmoid(self.conv1(x)))
        x = self.pool(self.sigmoid(self.conv2(x)))
        # print(x.shape)  # Uncomment to check the shape before flattening
        x = x.view(-1, 16 * 8 * 8) # Adjusted based on CIFAR-10 input size and pooling
        x = self.sigmoid(self.fc1(x))
        x = self.dropout(x)  # Apply dropout here
        x = self.sigmoid(self.fc2(x))
        x = self.fc3(x)
        return x
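
The effect of kernel size and padding on feature-map size can be checked with simple arithmetic before touching the model. The sketch below uses the standard output-size formula for a convolution or pooling layer with dilation 1; the helper name `conv2d_output_size` is hypothetical. It assumes `conv2` receives a 16x16 map, which is the case here after `conv1` (5x5, padding 2) and one 2x2 pooling step on a 32x32 image.

```python
def conv2d_output_size(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# Try several conv2 settings on a 16x16 input, then apply the 2x2 pooling.
for kernel_size, padding in [(5, 0), (5, 2), (3, 0), (3, 1)]:
    out = conv2d_output_size(16, kernel_size, padding)
    pooled = conv2d_output_size(out, kernel_size=2, stride=2)
    print(f"kernel={kernel_size}, padding={padding}: "
          f"conv -> {out}x{out}, pool -> {pooled}x{pooled}")
```

A 5x5 kernel without padding leaves a 6x6 map after pooling, while the 3x3 kernel with padding 1 leaves 8x8, so the input size of `fc1` has to follow whichever setting is used.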

### Step 3: Enhance Data Augmentation
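To help the model generalize better, add transforms that increase the variation within the training set. Apply the random transforms to the training data only; the test set is just converted to a tensor and normalized.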

In [None]:
# Data preprocessing and augmentation
import torchvision.transforms as transforms

transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),  # Randomly crop with padding
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.247, 0.243, 0.261))
])


transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261))
])
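
As a rough illustration of what the two random transforms do, the plain-Python sketch below pads a toy 4x4 grid with zeros, cuts a random window of the original size, and mirrors rows left to right. `RandomCrop(32, padding=4)` does the same at full scale: it pads the 32x32 image to 40x40 and crops a random 32x32 window. The function names and the toy grid are assumptions for illustration; torchvision's own implementation operates on images and tensors, not nested lists.

```python
import random


def pad_and_random_crop(image, size, padding, rng):
    """Zero-pad a 2D grid on every side, then cut a random size x size window."""
    padded_width = len(image[0]) + 2 * padding
    top_rows = [[0] * padded_width for _ in range(padding)]
    middle = [[0] * padding + list(row) + [0] * padding for row in image]
    bottom_rows = [[0] * padded_width for _ in range(padding)]
    padded = top_rows + middle + bottom_rows
    top = rng.randint(0, len(padded) - size)    # randint bounds are inclusive
    left = rng.randint(0, padded_width - size)
    return [row[left:left + size] for row in padded[top:top + size]]


def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


rng = random.Random(0)
image = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]  # toy 4x4 "image"

cropped = pad_and_random_crop(image, size=4, padding=1, rng=rng)
flipped = horizontal_flip(image)

for row in cropped:
    print(row)  # a shifted copy of the grid, with zeros where padding entered the window
```

Because the crop offset is random, each epoch sees slightly shifted versions of the same image, which is the source of the extra variation.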

### Step 4: Create a Deeper CNN Variant
Add more convolutional layers and increase the feature map depth. Below is an example of a deeper model.

In [None]:
class DeepLeNet(nn.Module):
    def __init__(self):
        super(DeepLeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)  # 3->16 channels
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)  # Using max pooling here
        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
        # Only two 2x2 poolings are applied, so 32x32 shrinks to 8x8 and fc1 sees 64 * 8 * 8 features
        self.fc1 = nn.Linear(64 * 8 * 8, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.relu(self.conv1(x))             # 32x32, no pooling after conv1
        x = self.pool(self.relu(self.conv2(x)))  # 32x32 -> 16x16
        x = self.pool(self.relu(self.conv3(x)))  # 16x16 -> 8x8
        x = x.view(-1, 64 * 8 * 8)               # Flatten to (batch, 4096)
        x = self.dropout(self.relu(self.fc1(x)))
        x = self.dropout(self.relu(self.fc2(x)))
        # print(x.shape) # Add this line to check the shape before the final layer
        x = self.fc3(x)
        return x
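
One way to see what "deeper and wider" costs is to count the convolutional parameters. The sketch below applies the usual count for a square-kernel `Conv2d` with bias (weights plus one bias per output channel) to the convolution layers of `LeNetDropout1` and `DeepLeNet`. The helper name `conv2d_params` is hypothetical, and the fully connected layers are left out on purpose.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel for a square-kernel Conv2d."""
    return out_channels * (in_channels * kernel_size * kernel_size + 1)


# (in_channels, out_channels, kernel_size) for each convolution layer
lenet_convs = [(3, 6, 5), (6, 16, 3)]                 # LeNetDropout1
deep_convs = [(3, 16, 3), (16, 32, 3), (32, 64, 3)]   # DeepLeNet

lenet_total = sum(conv2d_params(*spec) for spec in lenet_convs)
deep_total = sum(conv2d_params(*spec) for spec in deep_convs)

print(f"LeNetDropout1 conv parameters: {lenet_total}")  # 1336
print(f"DeepLeNet conv parameters:     {deep_total}")   # 23584
```

Most of the increase comes from the channel counts rather than from the extra layer, since each 3x3 layer's size grows with the product of its input and output channels.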

In [None]:
# Load CIFAR-10 dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)


In [None]:
# Initialize model, loss function, and optimizer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = LeNetDropout1() # Create an instance of the model
net = net.to(device) # Move the model instance to the device
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.1, momentum=0.9)

In [None]:
# Training loop
for epoch in range(10):  # 10 epochs
    net.train()
    running_loss = 0.0
    for inputs, labels in trainloader:
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch {epoch+1}, Loss: {running_loss/len(trainloader):.4f}")

In [None]:
# Evaluation on test set
net.eval()
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in testloader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = net(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy on CIFAR-10 test set: {100 * correct / total:.2f}%')

### Further Learning

In this section, we'll try changing some of the features of the model and explore how these changes affect the model's performance.

- Increase the `dropout` to 0.75. What is the impact on the accuracy?
- What is the effect of increasing the padding from 2 to 3?
- Increase the number of epochs to 25 and check the impact on loss and overall accuracy.


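## Increase the `dropout` to 0.75. What is the impact on the accuracy?

Dropout randomly zeroes a fraction of activations during training to reduce overfitting. With p=0.75, each unit entering the dropout layer is zeroed with probability 0.75 on every training forward pass, and the surviving activations are scaled by 1/(1-p) = 4. Dropout is inactive in evaluation mode (`net.eval()`).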
### Potential Benefits:
**Stronger regularization**: Forces the network to learn more robust features.

**Better generalization**: May reduce overfitting on small or noisy datasets.

### Potential Drawbacks:
**Underfitting risk**: With 75% of neurons dropped, the model may struggle to learn effectively, especially if the dataset is complex (like CIFAR-10).

**Slower convergence**: Fewer active neurons means weaker gradient signals, which can slow learning.
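
To put numbers on the trade-off, the short sketch below compares p=0.5 and p=0.75 for the 120 units that `fc1` produces in this notebook. The helper name `dropout_summary` is hypothetical; it reports only the expected count of surviving units and the rescale factor, not accuracy.

```python
def dropout_summary(num_units, p):
    """Expected number of surviving units and the training-time rescale factor."""
    keep_prob = 1.0 - p
    return num_units * keep_prob, 1.0 / keep_prob


for p in (0.5, 0.75):
    active, scale = dropout_summary(120, p)  # fc1 outputs 120 units
    print(f"p={p}: about {active:.0f} active units per pass, "
          f"survivors scaled by {scale:.0f}x")
```

At p=0.75 only about 30 of the 120 units carry signal on any given pass, which is why underfitting becomes a realistic risk for a network this small.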

## What is the effect of increasing the padding from 2 to 3?

Increasing padding from 2 to 3 adds more zero pixels around the input, which increases the output spatial dimensions.

With kernel_size=5 and padding=2, the output size remains the same as input (e.g., 32×32).

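With padding=3, the output becomes larger:

$$
\text{Output size} = \left\lfloor \frac{32 + 2 \times 3 - 5}{1} \right\rfloor + 1 = 34
$$

So the feature map becomes 34×34 instead of 32×32. Larger feature maps can change the flattened size that `fc1` expects, so recheck the shape before the first linear layer after changing the padding.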
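
The same arithmetic can be carried through the whole convolutional stack to see what reaches `fc1`. The sketch below assumes the padding change is made on `conv1` of `LeNetDropout1` (5x5 kernel, then 2x2 average pooling, then a 3x3 `conv2` with padding 1, then pooling again) and that pooling rounds down, which is the default behavior. The helper names are hypothetical.

```python
def output_size(size, kernel_size, padding=0, stride=1):
    """Output size of a conv or pooling layer with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def flattened_features(conv1_padding, input_size=32, out_channels=16):
    """Trace LeNetDropout1: conv1 (5x5) -> pool -> conv2 (3x3, pad 1) -> pool."""
    size = output_size(input_size, kernel_size=5, padding=conv1_padding)
    size = output_size(size, kernel_size=2, stride=2)   # AvgPool2d(2, 2)
    size = output_size(size, kernel_size=3, padding=1)  # conv2
    size = output_size(size, kernel_size=2, stride=2)   # AvgPool2d(2, 2)
    return out_channels * size * size


for padding in (2, 3):
    conv1_out = output_size(32, kernel_size=5, padding=padding)
    print(f"padding={padding}: conv1 -> {conv1_out}x{conv1_out}, "
          f"fc1 input = {flattened_features(padding)}")
```

Under these assumptions both settings give 1024 features: the 34x34 map pools to 17x17, and the second pooling rounds 17 down to 8, so the extra border is discarded before the linear layers.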

## Increase the number of epochs to 25 and check the impact on loss and overall accuracy.

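With more epochs, the training loss usually keeps decreasing. Test accuracy tends to improve at first, but it can plateau or even decline if the model starts to overfit, so compare the loss and the test accuracy across epochs rather than assuming that longer training is always better.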

