# Deep Learning: More Convolutional Neural Networks

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

For today's small group, we will walk through the process of setting up a convolutional neural network ("CNN" for short) using the `pytorch` package!

Recall from lecture that CNNs are generally used to process gridded data or images.

Let's begin by loading one of the toy datasets included in `torchvision`, the companion package to `pytorch`:

In [None]:
# init preprocessing for CIFAR-10 dataset (images are 32x32x3)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # normalize to [-1, 1]
])
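
To see why these values map pixels to [-1, 1]: `ToTensor` scales pixel values to [0, 1], and `Normalize(mean, std)` computes `(x - mean) / std` for each channel. The small check below reproduces that arithmetic on a hand-made tensor (a stand-in, not a real CIFAR-10 image):

```python
import torch

# Stand-in for a ToTensor() output: 3 channels, pixel values in [0, 1]
x = torch.tensor([0.0, 0.5, 1.0]).repeat(3, 1).unsqueeze(-1)  # shape (3, 3, 1)

# Same per-channel mean and std as in the transform above
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Normalize computes (x - mean) / std, so 0 -> -1, 0.5 -> 0, and 1 -> 1
normalized = (x - mean) / std
print(normalized[0].flatten())
```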

In [None]:
batch_size = 100
train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                             download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                            download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

Great! We have image data now. But what does it look like?

Let's plot an example image from each class below using `matplotlib`.

In [None]:
# Plot each class here
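
One possible way to fill in this cell is sketched below. The helper names `first_index_per_class` and `plot_one_per_class` are hypothetical, and the demo runs on random stand-in tensors so that it works without downloading anything. It assumes images were normalized to [-1, 1] as in the transform above.

```python
import matplotlib.pyplot as plt
import torch

def first_index_per_class(labels, num_classes):
    """Return {class_id: index of the first sample with that label}."""
    found = {}
    for idx, label in enumerate(labels):
        label = int(label)
        if label not in found:
            found[label] = idx
        if len(found) == num_classes:
            break
    return found

def plot_one_per_class(images, labels, class_names):
    """Plot one image per class; `images` is (N, 3, H, W) with values in [-1, 1]."""
    indices = first_index_per_class(labels, len(class_names))
    fig, axes = plt.subplots(1, len(class_names), squeeze=False,
                             figsize=(2 * len(class_names), 2.5))
    for class_id, ax in enumerate(axes[0]):
        ax.axis("off")
        ax.set_title(class_names[class_id])
        if class_id in indices:
            img = images[indices[class_id]] * 0.5 + 0.5  # undo Normalize
            # imshow expects (H, W, channels), so move channels last
            ax.imshow(img.permute(1, 2, 0).clamp(0, 1).numpy())
    return fig

# Random stand-in data (NOT real CIFAR-10 images)
demo_images = torch.rand(12, 3, 32, 32) * 2 - 1
demo_labels = torch.tensor([0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2])
fig = plot_one_per_class(demo_images, demo_labels, ["class 0", "class 1", "class 2"])
```

With the real data, `train_dataset.classes` gives the class names and `train_dataset.targets` the labels. A few hundred images stacked with `torch.stack([train_dataset[i][0] for i in range(200)])`, passed with the matching `train_dataset.targets[:200]`, should be enough to cover all ten classes.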

## Building a CNN

Staff:
- Reference lecture: CNNs involve stacking multiple **layers**
- The first part of a CNN stacks repeated blocks of convolutional, activation, and max-pooling layers
- The example code below shows 3 of these stacks of layers!
<p align="left">
    <img src = "https://media.geeksforgeeks.org/wp-content/uploads/20250529121802516451/Convolutional-Neural-Network-in-Machine-Learning.webp" width = "500">
</p>

### Activation Functions

Staff:
- Please briefly review some of the common activation functions (there will be a table on this at the beginning of lecture)
- Discuss 3 of the most common activation functions
- Be sure to give their names in PyTorch, since that is the package this notebook uses
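
As a quick reference for this discussion, the sketch below applies three activations to the same inputs. It assumes ReLU, sigmoid, and tanh are the three chosen; in PyTorch they are available as `nn.ReLU`, `nn.Sigmoid`, and `nn.Tanh`.

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, 0.0, 2.0])

activations = {
    "ReLU": nn.ReLU(),        # max(0, x): zeroes out negative inputs
    "Sigmoid": nn.Sigmoid(),  # squashes inputs into (0, 1)
    "Tanh": nn.Tanh(),        # squashes inputs into (-1, 1)
}

results = {name: fn(x) for name, fn in activations.items()}
for name, values in results.items():
    print(name, values)
```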

Please add comments to this code:

In [None]:
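# Define a simple CNN: three conv blocks followed by one linear classifier
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Block 1: 3 input channels (RGB) -> 32 feature maps.
        # kernel_size=3 with padding=1 keeps the 32x32 spatial size;
        # MaxPool2d(2) then halves it to 16x16.
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),  # normalize each channel over the batch
            nn.ReLU(),           # activation function
            nn.MaxPool2d(2)
        )
        # Block 2: 32 -> 64 feature maps, 16x16 -> 8x8
        self.layer2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        # Block 3: 64 -> 128 feature maps, 8x8 -> 4x4
        self.layer3 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        # Fully connected layer: flattened 128*4*4 features -> 10 class scores
        self.fc = nn.Linear(128*4*4, 10)  # CIFAR-10 has 10 classes

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = out.view(out.size(0), -1)  # flatten to (batch_size, 128*4*4)
        out = self.fc(out)               # raw class scores (logits)
        return out

# Create the model and move its parameters to the selected device
model = CNN().to(device)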
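
The `128*4*4` input size of the final linear layer follows from the three max-pooling steps, each of which halves the 32x32 image. A small sketch that traces the shapes is shown below; `conv_block` is a hypothetical helper that mirrors the blocks in the model, and the input is a batch of zeros rather than real images.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """One conv -> batch norm -> ReLU -> max-pool block, as in the CNN above."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

x = torch.zeros(2, 3, 32, 32)  # stand-in batch of two CIFAR-10-sized images
shapes = [tuple(x.shape)]
for in_ch, out_ch in [(3, 32), (32, 64), (64, 128)]:
    x = conv_block(in_ch, out_ch)(x)
    shapes.append(tuple(x.shape))

for shape in shapes:
    print(shape)

# Flattening everything except the batch dimension gives 128 * 4 * 4 features
flat_features = x.view(x.size(0), -1).shape[1]
print(flat_features)
```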

### Q: What do you think will happen to your CNN as you change the activation function?
Feel free to try this by replacing `nn.ReLU()` in the `CNN` class with another activation!

### A: 
YOUR ANSWER HERE

----
## Training a CNN
Staff:
- Please discuss the selection process for optimizer and loss inputs
- Define what learning rate is
- Give an overview of 2-3 common loss functions and their behavior
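
To ground the discussion of loss functions, here is a small sketch comparing three common choices on hand-picked toy values. The selection of `nn.CrossEntropyLoss`, `nn.MSELoss`, and `nn.L1Loss` is an assumption about which losses will be covered.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Classification: raw scores (logits) for 3 classes, true class is 0
logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([0])

# CrossEntropyLoss applies log-softmax, then takes the negative
# log-probability of the true class
ce = nn.CrossEntropyLoss()(logits, target)
manual_ce = -F.log_softmax(logits, dim=1)[0, target[0]]

# Regression: mean squared error penalizes large errors more heavily
# than mean absolute error does
pred = torch.tensor([2.5, 0.0])
true = torch.tensor([3.0, -1.0])
mse = nn.MSELoss()(pred, true)  # (0.5**2 + 1.0**2) / 2
mae = nn.L1Loss()(pred, true)   # (0.5 + 1.0) / 2

print(ce.item(), manual_ce.item(), mse.item(), mae.item())
```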

[This](https://www.geeksforgeeks.org/machine-learning/epoch-in-machine-learning/) reference will be useful

In [None]:
# Loss and optimizer
learning_rate = 0.001
criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
optimizer = optim.Adam(model.parameters(), lr=learning_rate)  # updates the model's weights
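
The learning rate scales how far each update moves the weights along the negative gradient. The toy sketch below uses plain gradient descent on f(w) = w² (not Adam, and not the CNN above) to illustrate the effect of different values. The function `gradient_descent` is hypothetical and exists only for this illustration.

```python
def gradient_descent(lr, steps=20, start=1.0):
    """Minimize f(w) = w**2, whose gradient is 2*w, with a fixed learning rate."""
    w = start
    for _ in range(steps):
        w = w - lr * 2 * w  # step against the gradient, scaled by lr
    return w

# The minimum is at w = 0. A very small rate barely moves, a moderate
# rate gets close, and a rate that is too large overshoots and diverges.
final_w = {lr: gradient_descent(lr) for lr in (0.001, 0.1, 1.1)}
for lr, w in final_w.items():
    print(f"lr={lr}: w after 20 steps = {w:.4f}")
```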

### Q: What might happen if we changed our loss from __ to __ ?

### A:
YOUR ANSWER HERE

### Q: What happens if the `learning_rate` parameter is too high? Or too low?

### A:
YOUR ANSWER HERE

In [None]:
# Training loop
# STAFF: Please add check for early stopping!!!
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)
        
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
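        running_loss += loss.item()
        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}')

    # Report the average training loss over the whole epoch
    print(f'Epoch [{epoch+1}/{num_epochs}], Average Loss: {running_loss / len(train_loader):.4f}')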
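
For the early-stopping check requested in the staff note, one common pattern is to stop once the validation loss has not improved for a set number of epochs. The sketch below is one way to do this. The `EarlyStopping` class is hypothetical (it is not part of PyTorch), and the loss values are made up purely to exercise the logic.

```python
class EarlyStopping:
    """Signal a stop when validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record this epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Illustrative (made-up) validation losses, one per epoch
illustrative_losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(illustrative_losses, start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break

print(f"Stopped at epoch {stopped_at}, best loss {stopper.best_loss}")
```

In the training loop above, `stopper.step(...)` would be called once per epoch with a loss computed on held-out data, followed by `break` when it returns `True`. This notebook does not define a validation split, so one would need to be carved out of the training set first.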

### Q: What happens if you increase `num_epochs`? Will performance always improve as `num_epochs` increases?

### A:
YOUR ANSWER HERE

----
## Validating a CNN

Staff: Please add comments/explanations as needed to this code!

In [None]:
# Testing
model.eval()  # evaluation mode: BatchNorm uses its running statistics
with torch.no_grad():  # gradients are not needed for evaluation
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # The predicted class is the index of the largest score in each row
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print(f'Test Accuracy: {100 * correct / total:.2f}%')


----

## Analyzing Performance
- Staff: prompt some reflection about the plot below

In [None]:
# Plot accuracy vs. epochs
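
Plotting accuracy against epochs requires recording an accuracy value at the end of every epoch, which the training loop above does not yet do. The sketch below shows the pattern on a tiny synthetic problem. The `accuracy` helper, the toy linear model, and the random data are all stand-ins, not the CNN or CIFAR-10.

```python
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def accuracy(model, loader):
    """Return the percentage of correctly classified samples in `loader`."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for inputs, labels in loader:
            predicted = model(inputs).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return 100 * correct / total

# Synthetic two-class problem: the label depends on the sign of feature 0
torch.manual_seed(0)
features = torch.randn(200, 4)
labels = (features[:, 0] > 0).long()
loader = DataLoader(TensorDataset(features, labels), batch_size=50)

toy_model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.05)
criterion = nn.CrossEntropyLoss()

history = []  # one accuracy value per epoch
for epoch in range(5):
    toy_model.train()
    for inputs, targets in loader:
        optimizer.zero_grad()
        criterion(toy_model(inputs), targets).backward()
        optimizer.step()
    history.append(accuracy(toy_model, loader))

fig, ax = plt.subplots()
ax.plot(range(1, len(history) + 1), history, marker="o")
ax.set_xlabel("Epoch")
ax.set_ylabel("Accuracy (%)")
ax.set_title("Accuracy vs. epochs (synthetic data)")
```

For the CNN, the same `history.append(...)` call would go at the end of each epoch in the training loop, with the batches moved to `device` inside the helper. Recording accuracy on both the training and the held-out data makes any gap between the two curves visible.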