# Cat-Dog Classification with L2 Regularization and Accuracy Plotting

This notebook demonstrates how to implement L2 regularization in two ways on a simple convolutional neural network (CNN) for cat-dog classification:

1. **Using the optimizer's weight decay parameter**
2. **Manually adding the L2 norm penalty in the training loop**

It also tracks training accuracy and the L2 norm of the first convolutional layer's weights after each epoch, and plots both at the end so you can observe how regularization affects training.

## Using This Notebook with a Custom Dataset

If you have a custom dataset, make sure it is organized in a structure that is compatible with PyTorch's `ImageFolder` class. Typically, this means you should have a root directory with one subdirectory per class (e.g., `cat` and `dog`). For example:

```
your_dataset_root/
    ├── cat/
    │     ├── image1.jpg
    │     ├── image2.jpg
    │     └── ...
    └── dog/
          ├── image1.jpg
          ├── image2.jpg
          └── ...
```

Then, change the `root` parameter in the `datasets.ImageFolder` call (see the code cell below) to point to your custom dataset directory.
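
Before pointing `ImageFolder` at a new directory, it can help to confirm the layout. The sketch below is one way to do that with only the standard library; the helper name `summarize_image_folder` and the extension list are illustrative assumptions, not part of torchvision.

```python
from pathlib import Path

# Illustrative subset of image extensions; extend as needed for your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def summarize_image_folder(root):
    """Return {class_name: image_count} for each subdirectory of `root`."""
    root = Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

For the layout above, `summarize_image_folder('your_dataset_root')` would return a dictionary with `cat` and `dog` keys. An empty dictionary, or a class with zero images, suggests the path or structure needs fixing.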

No data augmentation is applied here; only basic resizing and normalization are used.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

# Check device
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

# Define a simple CNN for binary classification (cats vs dogs)
class CatDogCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 56 * 56, 128),  # assuming input images are 224x224
            nn.ReLU(),
            nn.Linear(128, 2)  # 2 output classes: cat and dog
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x

# Transformation: only basic resizing and normalization, no data augmentation.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Change 'data/train' to the path of your custom dataset if needed.
train_dataset = datasets.ImageFolder(root='data/train', transform=transform)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

print(f"Number of training samples: {len(train_dataset)}")
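
The `32 * 56 * 56` input size of the first `nn.Linear` layer follows from the 224x224 input: each 3x3 convolution with `padding=1` preserves height and width, and each `MaxPool2d(2)` halves them. The arithmetic can be checked with a small sketch; `conv_out`, `pool_out`, and `flattened_features` are hypothetical helpers that apply the standard output-size formula with dilation 1, not PyTorch functions.

```python
def conv_out(size, kernel_size=3, padding=1, stride=1):
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_out(size, kernel_size=2):
    """Spatial output size of max pooling with stride equal to the kernel size."""
    return (size - kernel_size) // kernel_size + 1

def flattened_features(input_size=224, out_channels=32, num_blocks=2):
    """Flattened feature count after `num_blocks` conv + pool blocks."""
    size = input_size
    for _ in range(num_blocks):
        size = pool_out(conv_out(size))
    return out_channels * size * size

print(flattened_features(224))  # 32 * 56 * 56 = 100352
```

If you change the `Resize` target, the input size of the first `nn.Linear` layer must change with it; for example, 128x128 inputs would need `flattened_features(128)`, which is 32768.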

## Experiment (a): L2 Regularization via Optimizer’s Weight Decay

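This experiment uses the optimizer's built-in `weight_decay` argument to apply L2 regularization. With `optim.Adam`, the term `weight_decay * param` is added to each parameter's gradient before the update, which is coupled L2 regularization rather than the decoupled decay used by `optim.AdamW`. Adjust `weight_decay` to see its effect on the training dynamics. The training function also tracks training accuracy and the L2 norm of the first convolutional layer's weights.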
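
To make the mechanism concrete, the sketch below applies one plain gradient-descent step to a single scalar weight, once with a decay term added to the gradient and once with an explicit penalty of `0.5 * wd * w**2` in the loss. It is pure Python with a hypothetical toy loss, and it uses plain gradient descent instead of Adam to keep the arithmetic readable.

```python
def data_grad(w):
    """Gradient of the toy data loss (w - 3)^2 with respect to w."""
    return 2.0 * (w - 3.0)

def step_with_weight_decay(w, lr=0.1, wd=0.01):
    # Coupled weight decay: wd * w is added to the gradient before the update.
    return w - lr * (data_grad(w) + wd * w)

def loss_with_penalty(w, wd):
    # Toy data loss plus an explicit L2 penalty of 0.5 * wd * w^2.
    return (w - 3.0) ** 2 + 0.5 * wd * w ** 2

def numeric_grad(f, w, eps=1e-6):
    # Central finite difference, used here instead of deriving the gradient by hand.
    return (f(w + eps) - f(w - eps)) / (2 * eps)

def step_with_penalty(w, lr=0.1, wd=0.01):
    grad = numeric_grad(lambda x: loss_with_penalty(x, wd), w)
    return w - lr * grad

w0 = 1.0
print(step_with_weight_decay(w0), step_with_penalty(w0))  # both approximately 1.399
```

The two updates agree because the gradient of `0.5 * wd * w**2` is exactly `wd * w`, the term that weight decay adds.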

In [None]:
# Create model instance for Experiment (a)
model_a = CatDogCNN().to(device)

# Loss function
criterion = nn.CrossEntropyLoss()

# Set weight decay factor (lambda)
weight_decay = 1e-4

# Use an optimizer with built-in weight decay
optimizer_a = optim.Adam(model_a.parameters(), lr=1e-3, weight_decay=weight_decay)

def train_model_a(num_epochs=5):
    model_a.train()
    epoch_accuracy = []
    weight_norms = []
    for epoch in range(num_epochs):
        running_loss = 0.0
        correct = 0
        total = 0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            
            optimizer_a.zero_grad()
            outputs = model_a(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer_a.step()
            
            running_loss += loss.item() * images.size(0)
            
            # Compute accuracy
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
        
        epoch_loss = running_loss / len(train_dataset)
        acc = 100 * correct / total
        epoch_accuracy.append(acc)
        
        # Get L2 norm of the first conv layer weights (detached from autograd)
        weight_norm = model_a.features[0].weight.detach().norm(2).item()
        weight_norms.append(weight_norm)
        
        print(f"[Exp A] Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss:.4f}, Accuracy: {acc:.2f}%")
        print(f"[Exp A] Layer Conv1 L2 norm: {weight_norm:.4f}")
    
    return epoch_accuracy, weight_norms

# Uncomment the line below to run Experiment (a) and capture the accuracy and weight norms
# acc_a, l2_a = train_model_a()

## Experiment (b): L2 Regularization by Manually Adding the L2 Norm

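In this experiment, we apply L2 regularization manually by computing the squared L2 norm of all trainable parameters (weights and biases) and adding `lambda_reg` times that sum to the loss. Note that this penalty contributes `2 * lambda_reg * param` to each gradient, whereas `weight_decay` in Experiment (a) contributes `weight_decay * param`, so with both set to `1e-4` the manual penalty is effectively twice as strong. To match Experiment (a) exactly, scale the penalty by 0.5. The training function tracks training accuracy and the L2 norm of the first convolutional layer's weights as well.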
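
The relationship between a manual penalty and `weight_decay` can be checked numerically. The sketch below is a minimal check on a tiny `nn.Linear` model with a random batch (both hypothetical, chosen only for illustration). It uses `optim.SGD` rather than Adam so that a single step is easy to compare, and it scales the penalty by 0.5 so that its gradient equals `wd * param`.

```python
import copy

import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
wd, lr = 1e-2, 0.1

# Two identical toy models and a random batch (illustrative only).
model_decay = nn.Linear(4, 2)
model_manual = copy.deepcopy(model_decay)
x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
criterion = nn.CrossEntropyLoss()

# (1) Built-in weight decay: the optimizer adds wd * param to each gradient.
opt_decay = optim.SGD(model_decay.parameters(), lr=lr, weight_decay=wd)
opt_decay.zero_grad()
criterion(model_decay(x), y).backward()
opt_decay.step()

# (2) Manual penalty: 0.5 * wd * sum(param^2) has gradient wd * param.
opt_manual = optim.SGD(model_manual.parameters(), lr=lr)
opt_manual.zero_grad()
penalty = sum(p.pow(2).sum() for p in model_manual.parameters())
(criterion(model_manual(x), y) + 0.5 * wd * penalty).backward()
opt_manual.step()

# After one step the two parameter sets should agree up to float error.
params_match = all(
    torch.allclose(p, q, atol=1e-6)
    for p, q in zip(model_decay.parameters(), model_manual.parameters())
)
print(params_match)
```

Without the 0.5 factor, the manual penalty corresponds to `weight_decay = 2 * wd`.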

In [None]:
# Create a new instance of the model for Experiment (b)
model_b = CatDogCNN().to(device)

# Use the same loss function
criterion = nn.CrossEntropyLoss()

# Create an optimizer without built-in weight decay
optimizer_b = optim.Adam(model_b.parameters(), lr=1e-3, weight_decay=0)  # weight_decay is 0 for manual regularization

# Regularization strength
lambda_reg = 1e-4

def train_model_b(num_epochs=5):
    model_b.train()
    epoch_accuracy = []
    weight_norms = []
    for epoch in range(num_epochs):
        running_loss = 0.0
        correct = 0
        total = 0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            
            optimizer_b.zero_grad()
            outputs = model_b(images)
            loss = criterion(outputs, labels)
            
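            # Manually compute the L2 penalty: the sum of squared entries of
            # every trainable parameter (weights and biases)
            l2_penalty = 0.0
            for param in model_b.parameters():
                if param.requires_grad:
                    l2_penalty += torch.sum(param ** 2)
            
            # The gradient of this term is 2 * lambda_reg * param, i.e. twice
            # what weight_decay=lambda_reg contributes in Experiment (a)
            loss += lambda_reg * l2_penalty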
            loss.backward()
            optimizer_b.step()
            
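            # The logged loss includes the L2 penalty, so it is not directly
            # comparable to the loss printed in Experiment (a)
            running_loss += loss.item() * images.size(0)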
            
            # Compute accuracy
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
        
        epoch_loss = running_loss / len(train_dataset)
        acc = 100 * correct / total
        epoch_accuracy.append(acc)
        
        # Get L2 norm of the first conv layer weights (detached from autograd)
        weight_norm = model_b.features[0].weight.detach().norm(2).item()
        weight_norms.append(weight_norm)
        
        print(f"[Exp B] Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss:.4f}, Accuracy: {acc:.2f}%")
        print(f"[Exp B] Layer Conv1 L2 norm: {weight_norm:.4f}")
    
    return epoch_accuracy, weight_norms

# Uncomment the line below to run Experiment (b) and capture the accuracy and weight norms
# acc_b, l2_b = train_model_b()
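
Both training functions record the norm of the first convolutional layer only. If a model-wide figure is more useful, a helper along the following lines could be called at the end of each epoch. `global_l2_norm` is a hypothetical name, and the sketch assumes the norm is taken over all trainable parameters, biases included.

```python
import torch
import torch.nn as nn

def global_l2_norm(model):
    """L2 norm over all trainable parameters, treated as one long vector."""
    squared_sum = sum(
        p.detach().pow(2).sum() for p in model.parameters() if p.requires_grad
    )
    return torch.sqrt(squared_sum).item()

# Tiny check on a layer with known values: sqrt(3^2 + 4^2 + 0^2) = 5.
layer = nn.Linear(2, 1)
with torch.no_grad():
    layer.weight.copy_(torch.tensor([[3.0, 4.0]]))
    layer.bias.zero_()
print(global_l2_norm(layer))  # 5.0
```

For example, `weight_norms.append(global_l2_norm(model_b))` could replace the per-layer value in `train_model_b`.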

## Plotting Accuracy and Weight Norms

After running at least one of the experiments, run the cell below to plot the training accuracy and the L2 norm of the first convolutional layer's weights versus epoch. The cell plots Experiment (a) when its results are available, otherwise Experiment (b); if neither has been run, it falls back to dummy data for demonstration only.

In [None]:
import matplotlib.pyplot as plt

# Select which experiment's history to plot. Run train_model_a() or
# train_model_b() first by uncommenting the call in the corresponding cell above.
# To plot Experiment (b) when both have been run, swap the first two branches.
if 'acc_a' in globals():
    acc_history, l2_history = acc_a, l2_a
elif 'acc_b' in globals():
    acc_history, l2_history = acc_b, l2_b
else:
    # Neither experiment has been run: use dummy data for demonstration only
    acc_history = [60, 65, 70, 75, 80]
    l2_history = [100, 95, 90, 85, 80]

# Plot training accuracy
epochs = range(1, len(acc_history)+1)
plt.figure()
plt.plot(epochs, acc_history, marker='o', label='Training Accuracy (%)')
plt.title('Training Accuracy per Epoch')
plt.xlabel('Epoch')
plt.ylabel('Accuracy (%)')
plt.legend()
plt.grid(True)
plt.show()

# Optionally, plot the L2 norm of weights
plt.figure()
plt.plot(epochs, l2_history, marker='o', color='red', label='Layer Conv1 L2 Norm')
plt.title('L2 Norm of Conv1 Weights per Epoch')
plt.xlabel('Epoch')
plt.ylabel('L2 Norm')
plt.legend()
plt.grid(True)
plt.show()

## Conclusion

This notebook presented two experiments demonstrating L2 regularization in a CNN for cat-dog classification. Experiment (a) uses the optimizer's built-in weight decay, while Experiment (b) computes the squared L2 norm of the parameters manually and adds it, scaled by `lambda_reg`, to the loss. Both experiments track training accuracy and the L2 norm of the first convolutional layer's weights during training. In the final cell, you can plot these metrics to observe how regularization affects model training.

Feel free to adjust hyperparameters (learning rate, regularization strength, etc.) and the model architecture to best suit your custom dataset and application.