
---

### Homework Assignment: Comprehensive Exploration of CNN Techniques

#### Objective:
The objective of this assignment is to explore various techniques used in convolutional neural networks (CNNs) for image classification tasks. Specifically, students will experiment with different regularization techniques, weight initialization methods, and learning rate schedules to understand their impact on model performance.

#### Tasks:
1. **Dataset Preparation:**
   - Download the CIFAR-10 dataset, a widely used benchmark dataset for image classification.
   - Preprocess the dataset by normalizing the pixel values, and use its standard split into training (50,000 images) and testing (10,000 images) sets.

2. **Experiment 1: Regularization Techniques:**
   - Implement a CNN model architecture for image classification using PyTorch.
   - Experiment with different regularization techniques:
     - No regularization
     - L2 regularization
     - Dropout regularization
   - Train each model using the training set and evaluate its performance on the testing set.
   - Compare and analyze the impact of each regularization technique on model performance.

3. **Experiment 2: Initialization Techniques:**
   - Implement a CNN model architecture for image classification using PyTorch.
   - Experiment with different weight initialization techniques:
     - Default initialization
     - Xavier initialization
     - Kaiming initialization
   - Train each model using the training set and evaluate its performance on the testing set.
   - Compare and analyze the impact of each initialization technique on model performance.

4. **Experiment 3: Learning Rate Scheduling:**
   - Experiment with different learning rate scheduling techniques:
     - Step decay
     - Exponential decay
     - Cyclic learning rates
   - Train each model using the training set and evaluate its performance on the testing set.
   - Compare and analyze the impact of each learning rate schedule on model performance.


5. **Analysis and Conclusion:**
   - Analyze the results obtained from the experiments conducted in Tasks 2, 3, and 4.
   - Discuss the strengths and weaknesses of each regularization technique, initialization method, and learning rate schedule.
   - Provide insights into how these techniques affect model performance, training convergence, and generalization ability.
   - Propose recommendations for selecting appropriate techniques based on the characteristics of the dataset and task.
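
The three schedules named in Experiment 3 correspond to classes in `torch.optim.lr_scheduler`. The following is a minimal sketch rather than a reference solution: the helper `lr_trace`, the stand-in `nn.Linear` model, and every hyperparameter value are assumptions chosen only to keep the example short. It records the learning rate produced by each scheduler over a few steps.

```python
import torch.nn as nn
import torch.optim as optim

def lr_trace(make_scheduler, steps=6, lr=0.1):
    """Return the learning rate in effect at each of `steps` scheduler steps."""
    model = nn.Linear(4, 2)  # stand-in model; only its parameters matter here
    optimizer = optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    scheduler = make_scheduler(optimizer)
    trace = []
    for _ in range(steps):
        trace.append(optimizer.param_groups[0]["lr"])
        optimizer.step()   # in real training this follows loss.backward()
        scheduler.step()
    return trace

# Hypothetical settings, picked so the changes are visible within six steps
schedules = {
    "step": lambda opt: optim.lr_scheduler.StepLR(opt, step_size=2, gamma=0.1),
    "exponential": lambda opt: optim.lr_scheduler.ExponentialLR(opt, gamma=0.5),
    "cyclic": lambda opt: optim.lr_scheduler.CyclicLR(
        opt, base_lr=0.01, max_lr=0.1, step_size_up=2
    ),
}

traces = {name: lr_trace(make) for name, make in schedules.items()}
for name, trace in traces.items():
    print(name, [round(value, 4) for value in trace])
```

In a full training run, `StepLR` and `ExponentialLR` are usually stepped once per epoch, while `CyclicLR` is usually stepped once per batch. `CyclicLR` also cycles momentum by default, so depending on the PyTorch version, pairing it with Adam may require `cycle_momentum=False`.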

#### Submission Guidelines:
- Students are required to submit a Jupyter Notebook containing the implementation of the CNN models with various techniques, along with necessary explanations, comments, and visualizations.
- Additionally, students must provide a detailed report summarizing their findings, including comparisons of model performance, analysis of techniques, and insights gained from the experimentation.
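
For the performance comparisons in the report, it can help to collect every run's result in one place and render a single table. The sketch below is one possible approach, not a required format: `format_results` is a hypothetical helper, and the two rows reuse the one-epoch accuracies printed by the regularization example in this notebook.

```python
def format_results(results):
    """Render (experiment, setting, test accuracy) rows as a Markdown table."""
    lines = ["| Experiment | Setting | Test accuracy (%) |", "|---|---|---|"]
    for experiment, setting, accuracy in results:
        lines.append(f"| {experiment} | {setting} | {accuracy:.2f} |")
    return "\n".join(lines)

# Accuracies copied from the one-epoch regularization runs recorded in this notebook
results = [
    ("Regularization", "No Regularization", 61.03),
    ("Regularization", "L2 Regularization", 61.92),
]

table = format_results(results)
print(table)
```

Appending one tuple per trained model keeps the notebook and the report consistent, since both can be generated from the same list.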





# Example of a CNN on the CIFAR-10 dataset

In [2]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyperparameters
num_epochs = 1
batch_size = 64
learning_rate = 0.001

# Dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, transform=transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, transform=transform)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

# Model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 512)
        self.fc2 = nn.Linear(512, 10)
        self.dropout = nn.Dropout(0.5)  # Dropout regularization

        # Weight initialization
        nn.init.kaiming_normal_(self.conv1.weight)
        nn.init.kaiming_normal_(self.conv2.weight)
        nn.init.kaiming_normal_(self.fc1.weight)
        nn.init.kaiming_normal_(self.fc2.weight)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = x.view(-1, 64 * 8 * 8)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)  # Apply dropout
        x = self.fc2(x)
        return x

model = CNN().to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
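optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=1e-4)  # weight_decay adds an L2 penalty to the gradients
# Learning rate scheduler: multiply the learning rate by 0.1 every 5 epochs
# (with num_epochs = 1 above, the decay never triggers; raise num_epochs to see its effect)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)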

# Training loop
total_step = len(train_loader)
for epoch in range(num_epochs):
    model.train()
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                   .format(epoch+1, num_epochs, i+1, total_step, loss.item()))
    # Update learning rate
    scheduler.step()

# Test the model
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the largest logit is the predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))


Files already downloaded and verified
Epoch [1/1], Step [100/782], Loss: 1.7917
Epoch [1/1], Step [200/782], Loss: 1.3838
Epoch [1/1], Step [300/782], Loss: 1.2567
Epoch [1/1], Step [400/782], Loss: 1.3939
Epoch [1/1], Step [500/782], Loss: 1.3481
Epoch [1/1], Step [600/782], Loss: 1.1661
Epoch [1/1], Step [700/782], Loss: 1.2332
Accuracy of the model on the test images: 62.84 %
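
The input size of `fc1` in the model above, `64 * 8 * 8`, follows from the layer settings: each 3x3 convolution with `padding=1` keeps the spatial size, and each 2x2 max-pool halves it, so a 32x32 image shrinks to 16x16 and then to 8x8 with 64 channels. The short check below is a sketch that rebuilds just those layers (the standalone variable names are assumptions) and traces a dummy input through them.

```python
import torch
import torch.nn as nn

# Same layer settings as the CNN class above
conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2)

x = torch.zeros(1, 3, 32, 32)  # one dummy CIFAR-10-sized image
shapes = [tuple(x.shape)]
with torch.no_grad():
    x = pool(torch.relu(conv1(x)))  # padding=1 keeps 32x32; pooling halves it to 16x16
    shapes.append(tuple(x.shape))
    x = pool(torch.relu(conv2(x)))  # 16x16 becomes 8x8 after the second pooling
    shapes.append(tuple(x.shape))

flat_features = x.flatten(1).shape[1]
print(shapes)         # [(1, 3, 32, 32), (1, 32, 16, 16), (1, 64, 8, 8)]
print(flat_features)  # 4096, i.e. 64 * 8 * 8
```

If a convolution or pooling layer is added or removed, rerunning a check like this gives the new value to use for the first fully connected layer.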


# Example of using various regularization techniques

In [9]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyperparameters
num_epochs = 1
batch_size = 64
learning_rate = 0.001

# Dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, transform=transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, transform=transform)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

# Model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 512)
        self.fc2 = nn.Linear(512, 10)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = x.view(-1, 64 * 8 * 8)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        return x

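# Experiment 1: Regularization Techniques
# Each entry maps a label to the weight_decay (L2) strength passed to Adam.
# Note: the CNN class above always applies dropout (p=0.5), so "No Regularization"
# here only means no weight decay; make the dropout rate configurable to isolate it.
regularization_methods = {
    "No Regularization": None,
    "L2 Regularization": 1e-4
    # Add more regularization techniques as needed
}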

print("Experiment 1: Regularization Techniques")
for name, regularization in regularization_methods.items():
    print(f"Experimenting with {name}")
    model = CNN().to(device)

    # Loss and optimizer
    criterion = nn.CrossEntropyLoss()
    # weight_decay=0 disables L2 regularization; a positive value enables it
    weight_decay = regularization if regularization is not None else 0.0
    optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

    # Training loop
    for epoch in range(num_epochs):
        model.train()
        for i, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)

            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)

            # Backward and optimize
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            if (i+1) % 100 == 0:
                print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                       .format(epoch+1, num_epochs, i+1, len(train_loader), loss.item()))

    # Test the model
    model.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)  # index of the largest logit is the predicted class
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        accuracy = 100 * correct / total
        print(f'Accuracy of the model on the test images: {accuracy}%')
    print()


Files already downloaded and verified
Experiment 1: Regularization Techniques
Experimenting with No Regularization
Epoch [1/1], Step [100/782], Loss: 1.5685
Epoch [1/1], Step [200/782], Loss: 1.4143
Epoch [1/1], Step [300/782], Loss: 1.2986
Epoch [1/1], Step [400/782], Loss: 1.3840
Epoch [1/1], Step [500/782], Loss: 1.4242
Epoch [1/1], Step [600/782], Loss: 1.2984
Epoch [1/1], Step [700/782], Loss: 1.0885
Accuracy of the model on the test images: 61.03%

Experimenting with L2 Regularization
Epoch [1/1], Step [100/782], Loss: 1.7951
Epoch [1/1], Step [200/782], Loss: 1.3884
Epoch [1/1], Step [300/782], Loss: 1.3318
Epoch [1/1], Step [400/782], Loss: 1.2071
Epoch [1/1], Step [500/782], Loss: 1.6516
Epoch [1/1], Step [600/782], Loss: 1.2574
Epoch [1/1], Step [700/782], Loss: 1.3983
Accuracy of the model on the test images: 61.92%
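
Because the `CNN` class in this example always applies `nn.Dropout(0.5)`, the two runs above differ only in weight decay. One way to cover all three settings listed in Experiment 1 is to make the dropout rate a constructor argument. The sketch below is an illustration under assumptions: `ConfigurableCNN`, `dropout_p`, and the `configs` dictionary are hypothetical names and values, not part of the original notebook.

```python
import torch
import torch.nn as nn

class ConfigurableCNN(nn.Module):
    """Same layout as the CNN above, with an adjustable dropout rate (hypothetical variant)."""
    def __init__(self, dropout_p=0.0):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 512), nn.ReLU(),
            nn.Dropout(dropout_p),  # p=0.0 leaves activations unchanged
            nn.Linear(512, 10),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Each configuration switches on at most one regularizer (values are illustrative)
configs = {
    "No regularization": {"dropout_p": 0.0, "weight_decay": 0.0},
    "L2 regularization": {"dropout_p": 0.0, "weight_decay": 1e-4},
    "Dropout": {"dropout_p": 0.5, "weight_decay": 0.0},
}

models, optimizers = {}, {}
for name, cfg in configs.items():
    models[name] = ConfigurableCNN(dropout_p=cfg["dropout_p"])
    optimizers[name] = torch.optim.Adam(
        models[name].parameters(), lr=0.001, weight_decay=cfg["weight_decay"]
    )
```

Each model and optimizer pair can then be passed through the same training and evaluation loop used above, so that any difference in test accuracy can be attributed to the one setting that changed.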



# Some functions you need to learn, read about, and experiment with:

In [None]:
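# The calls below use self, so they belong inside the model's __init__, after the layers are defined.

# Standard Normal Initialization (mean=0, std=1)
# Note: this is not PyTorch's default. nn.Conv2d and nn.Linear already initialize their weights
# with a Kaiming-uniform scheme when constructed, so "default initialization" means calling no init function.
nn.init.normal_(self.conv1.weight)
nn.init.normal_(self.conv2.weight)
nn.init.normal_(self.fc1.weight)
nn.init.normal_(self.fc2.weight)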

# Xavier Initialization
nn.init.xavier_normal_(self.conv1.weight)
nn.init.xavier_normal_(self.conv2.weight)
nn.init.xavier_normal_(self.fc1.weight)
nn.init.xavier_normal_(self.fc2.weight)

# Kaiming Initialization
nn.init.kaiming_normal_(self.conv1.weight)
nn.init.kaiming_normal_(self.conv2.weight)
nn.init.kaiming_normal_(self.fc1.weight)
nn.init.kaiming_normal_(self.fc2.weight)
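
An alternative to writing one `nn.init` call per layer is to re-initialize a finished model with `Module.apply`, which visits every submodule. The sketch below assumes a hypothetical helper `init_weights` and hypothetical scheme labels (`"default"`, `"xavier"`, `"kaiming"`); it is one possible way to switch schemes for Experiment 2, not the required one.

```python
import torch
import torch.nn as nn

def init_weights(model, scheme="default"):
    """Re-initialize Conv2d/Linear weights in place; scheme names are hypothetical labels."""
    if scheme not in ("default", "xavier", "kaiming"):
        raise ValueError(f"unknown scheme: {scheme}")

    def _init(module):
        if not isinstance(module, (nn.Conv2d, nn.Linear)):
            return
        if scheme == "xavier":
            nn.init.xavier_normal_(module.weight)
        elif scheme == "kaiming":
            nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        # "default": keep the weights PyTorch assigned when the layer was constructed

    model.apply(_init)  # apply() calls _init on every submodule and on the model itself
    return model

torch.manual_seed(0)
layer = init_weights(nn.Linear(512, 256), scheme="kaiming")
print(layer.weight.std().item())  # close to sqrt(2 / 512), about 0.0625
```

Usage with the notebook's model would look like `model = init_weights(CNN(), "xavier").to(device)`, provided the class itself does not already call an initializer in `__init__`.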


In [None]:
# Adam Optimizer
optimizer_adam = optim.Adam(model.parameters(), lr=learning_rate)

# SGD Optimizer
optimizer_sgd = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

# RMSprop Optimizer
optimizer_rmsprop = optim.RMSprop(model.parameters(), lr=learning_rate)
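
Since an optimizer is bound to one model's parameters, comparing optimizers fairly means creating a fresh model and a fresh optimizer for each run. The sketch below shows that pattern on a small synthetic regression problem rather than CIFAR-10; the `optimizer_factories` dictionary, the toy data, the learning rate of 0.01, and the step count are all assumptions made to keep it quick to run.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Factories build a new optimizer for whichever parameters they are given
optimizer_factories = {
    "Adam": lambda params: optim.Adam(params, lr=0.01),
    "SGD": lambda params: optim.SGD(params, lr=0.01, momentum=0.9),
    "RMSprop": lambda params: optim.RMSprop(params, lr=0.01),
}

torch.manual_seed(0)
inputs = torch.randn(64, 3)
targets = inputs @ torch.tensor([[1.0], [-2.0], [0.5]])  # synthetic linear targets

losses = {}
for name, make_optimizer in optimizer_factories.items():
    torch.manual_seed(1)  # identical starting weights for every optimizer
    model = nn.Linear(3, 1)
    optimizer = make_optimizer(model.parameters())
    criterion = nn.MSELoss()
    history = []
    for _ in range(200):
        loss = criterion(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    losses[name] = (history[0], history[-1])
    print(f"{name}: first loss {history[0]:.4f}, last loss {history[-1]:.4f}")
```

The same factory pattern carries over to the CNN experiments: replace the toy model and data with `CNN().to(device)` and the CIFAR-10 loaders, and keep the rest of the loop unchanged across optimizers.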
