# CSE 151B: Homework 2 Coding
## PyTorch Implementation

Using PyTorch’s `nn.Sequential` container, build a deep convolutional network that classifies the handwritten digits in MNIST.

You are only allowed to use the following in your model design:
- Linear layers (`nn.Linear`)
- `nn.Conv2d`
- `nn.MaxPool2d`
- `nn.BatchNorm2d`
- Dropout layers
- `nn.ReLU` and `nn.Softmax`
- `nn.Flatten`

Your goal is to build a model that achieves **test accuracy ≥ 0.985** with fewer than 1 million parameters.

**Warning**: Your `Sequential` network should consist *only* of `nn` module objects! That means no `torch.nn.functional` calls and no lambda expressions inside your Sequential block. Leaving functional calls or lambda expressions in your model code will result in no credit!
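One way to sanity-check this restriction before submitting is to compare each layer's class name against the allowed list. The sketch below is not part of the assignment or its grading: `ALLOWED_LAYERS` and `disallowed_layers` are hypothetical names, and the check only assumes that the model exposes `children()`, as `nn.Sequential` does.

```python
# Hypothetical helper; the names mirror the allowed list in this notebook.
# Only plain Dropout is listed here; whether other dropout variants count
# is not stated in the assignment.
ALLOWED_LAYERS = {
    "Linear", "Conv2d", "MaxPool2d", "BatchNorm2d",
    "Dropout", "ReLU", "Softmax", "Flatten",
}

def disallowed_layers(model):
    """Return the class names of top-level layers not in ALLOWED_LAYERS."""
    return [
        type(layer).__name__
        for layer in model.children()
        if type(layer).__name__ not in ALLOWED_LAYERS
    ]
```

An empty list means every top-level layer is one of the allowed types; a custom wrapper module would show up under its own class name.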

This notebook provides a skeleton layout. Use whichever parts you find helpful; you do not need to follow its structure. Your submission, however, must follow the zip file format described at the bottom of the notebook.

In [25]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

In [26]:
def get_data_loaders(batch_size) -> tuple[DataLoader, DataLoader]:
    '''
    Return the training and testing MNIST dataloaders.
    '''
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])
    
    train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
    test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

    return train_loader, test_loader
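
`transforms.Normalize((0.1307,), (0.3081,))` computes `(x - mean) / std` on the `[0, 1]` values produced by `ToTensor()`. A small plain-Python sketch of the same arithmetic on single pixel values follows; the `normalize` helper is illustrative and is not a torchvision function.

```python
MEAN, STD = 0.1307, 0.3081  # the constants passed to transforms.Normalize

def normalize(pixel, mean=MEAN, std=STD):
    """Apply (x - mean) / std to one pixel value in [0, 1]."""
    return (pixel - mean) / std

for pixel in (0.0, MEAN, 1.0):
    print(f"{pixel:.4f} -> {normalize(pixel):+.4f}")
```

A black background pixel (0.0) maps to roughly -0.42 and a fully white pixel (1.0) to roughly 2.82, so the mostly dark inputs end up centered near zero.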


In [27]:
def build_model(dropout_prob=0.3) -> nn.Module:
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1),    # (16, 28, 28)
        nn.BatchNorm2d(16),
        nn.ReLU(),
        nn.MaxPool2d(2),                               # (16, 14, 14)

        nn.Conv2d(16, 32, kernel_size=3, padding=1),   # (32, 14, 14)
        nn.BatchNorm2d(32),
        nn.ReLU(),
        nn.MaxPool2d(2),                               # (32, 7, 7)

        nn.Conv2d(32, 64, kernel_size=3, padding=1),   # (64, 7, 7)
        nn.BatchNorm2d(64),
        nn.ReLU(),
        nn.MaxPool2d(2),                               # (64, 3, 3)
        nn.Dropout(dropout_prob),

        nn.Flatten(),                                  # 64 * 3 * 3 = 576
        nn.Linear(576, 64),
        nn.ReLU(),
        nn.Dropout(dropout_prob),
        nn.Linear(64, 10)                              # Raw logits, no Softmax: nn.CrossEntropyLoss applies log-softmax itself
    )
    return model
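
The shape comments in `build_model` can be checked with the standard output-size formula `floor((n + 2*padding - kernel) / stride) + 1`, which covers both `nn.Conv2d` and `nn.MaxPool2d` at their default dilation and rounding settings. This is a minimal sketch in plain Python; `out_size` is an illustrative helper name.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation 1, floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 28   # MNIST images are 28 x 28
sizes = []
for _ in range(3):                                # three conv + pool stages
    size = out_size(size, kernel=3, padding=1)    # 3x3 conv with padding 1 keeps the size
    size = out_size(size, kernel=2, stride=2)     # 2x2 max pool halves it, rounding down
    sizes.append(size)

flat_features = 64 * size * size   # 64 channels after the last conv stage
print(sizes, flat_features)
```

The last pooling step takes a 7 x 7 map to 3 x 3 because the odd row and column are dropped, which is why the first `nn.Linear` layer expects 576 inputs.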


In [28]:
def check_params():
    model = build_model()
    print(f"Number of parameters: {sum(p.numel() for p in model.parameters())}")
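
The count that `check_params` prints can also be reproduced by hand. The sketch below assumes the default PyTorch settings used in `build_model` (biases enabled, and `nn.BatchNorm2d` with one learnable scale and one learnable shift per channel); the helper names are illustrative.

```python
def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k + c_out   # kernel weights + biases

def bn_params(channels):
    return 2 * channels                   # learnable scale and shift

def linear_params(n_in, n_out):
    return n_in * n_out + n_out           # weights + biases

total = (
    conv_params(1, 16, 3) + bn_params(16)
    + conv_params(16, 32, 3) + bn_params(32)
    + conv_params(32, 64, 3) + bn_params(64)
    + linear_params(576, 64)
    + linear_params(64, 10)
)
print(total)  # 61098
assert total < 1_000_000
```

That is about 61 thousand parameters, well under the 1 million limit, with the first `nn.Linear` layer accounting for more than half of them.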

In [29]:
def train(model, optimizer, criterion, train_loader, n_epochs=1):
    '''
    Train the model for `n_epochs` epochs, printing the loss and accuracy after each epoch.
    Returns None (the model is modified in place).
    '''
    model.train()
    for epoch in range(n_epochs):
        running_loss = 0.0
        correct = 0
        total = 0

        for images, labels in train_loader:
            # Zero the gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = model(images)

            # Compute loss
            loss = criterion(outputs, labels)

            # Backward pass and optimization
            loss.backward()
            optimizer.step()

            # Track statistics
            running_loss += loss.item()
            predicted = outputs.argmax(dim=1)  # predicted class = index of the largest logit
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        epoch_loss = running_loss / len(train_loader)
        epoch_acc = correct / total
        print(f"Epoch [{epoch + 1}/{n_epochs}], Loss: {epoch_loss:.4f}, Accuracy: {epoch_acc:.4f}")
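
The loop above feeds raw logits to `criterion` and takes the index of the largest logit as the prediction. Both are consistent with a model that has no final Softmax layer: cross-entropy computed from logits equals the negative log of the softmax probability of the target class, and softmax does not change which class scores highest. A plain-Python sketch for a single example, using made-up logits:

```python
import math

def softmax(logits):
    m = max(logits)                               # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class, computed from logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

logits = [2.0, -1.0, 0.5]   # illustrative values, not model outputs
target = 0
probs = softmax(logits)
loss = cross_entropy(logits, target)

# Same loss either way, and the same predicted class with or without softmax.
assert abs(loss - (-math.log(probs[target]))) < 1e-12
assert probs.index(max(probs)) == logits.index(max(logits))
```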


In [30]:
def test(model, test_loader):
    '''
    Evaluate the model on the test set and print its accuracy. Returns None.
    '''
    model.eval()
    correct = 0
    total = 0

    with torch.no_grad():
        for images, labels in test_loader:
            outputs = model(images)
            predicted = outputs.argmax(dim=1)  # predicted class = index of the largest logit
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = correct / total
    print(f"Test Accuracy: {accuracy:.4f}")


In [31]:
train_loader, test_loader = get_data_loaders(batch_size=64)

criterion = nn.CrossEntropyLoss()
dropout_values = [i / 10 for i in range(10)]

for p in dropout_values:
    print(f"\nTraining with dropout={p:.1f}")
    model = build_model(dropout_prob=p)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    
    train(model, optimizer, criterion, train_loader, n_epochs=5)
    test(model, test_loader)
    torch.save(model, f'hw2_dropout_{p:.1f}.pt')



Training with dropout=0.0
Epoch [1/5], Loss: 0.1362, Accuracy: 0.9607
Epoch [2/5], Loss: 0.0425, Accuracy: 0.9869
Epoch [3/5], Loss: 0.0317, Accuracy: 0.9899
Epoch [4/5], Loss: 0.0261, Accuracy: 0.9914
Epoch [5/5], Loss: 0.0205, Accuracy: 0.9931
Test Accuracy: 0.9910

Training with dropout=0.1
Epoch [1/5], Loss: 0.1471, Accuracy: 0.9572
Epoch [2/5], Loss: 0.0488, Accuracy: 0.9848
Epoch [3/5], Loss: 0.0387, Accuracy: 0.9873
Epoch [4/5], Loss: 0.0311, Accuracy: 0.9899
Epoch [5/5], Loss: 0.0261, Accuracy: 0.9919
Test Accuracy: 0.9916

Training with dropout=0.2
Epoch [1/5], Loss: 0.1792, Accuracy: 0.9474
Epoch [2/5], Loss: 0.0626, Accuracy: 0.9814
Epoch [3/5], Loss: 0.0489, Accuracy: 0.9851
Epoch [4/5], Loss: 0.0415, Accuracy: 0.9873
Epoch [5/5], Loss: 0.0348, Accuracy: 0.9890
Test Accuracy: 0.9925

Training with dropout=0.3
Epoch [1/5], Loss: 0.2286, Accuracy: 0.9310
Epoch [2/5], Loss: 0.0834, Accuracy: 0.9756
Epoch [3/5], Loss: 0.0644, Accuracy: 0.9807
Epoch [4/5], Loss: 0.0534, Accurac

In [32]:
# find your best model, and train it for 10 epochs
best_p = 0.2 # TODO: fill in your best probability
model = build_model(dropout_prob=best_p)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

train(model, optimizer, criterion, train_loader, n_epochs = 10)
test(model, test_loader)
torch.save(model, "hw2_model.pt")

Epoch [1/10], Loss: 0.1844, Accuracy: 0.9474
Epoch [2/10], Loss: 0.0661, Accuracy: 0.9799
Epoch [3/10], Loss: 0.0526, Accuracy: 0.9840
Epoch [4/10], Loss: 0.0417, Accuracy: 0.9868
Epoch [5/10], Loss: 0.0371, Accuracy: 0.9886
Epoch [6/10], Loss: 0.0349, Accuracy: 0.9895
Epoch [7/10], Loss: 0.0306, Accuracy: 0.9905
Epoch [8/10], Loss: 0.0285, Accuracy: 0.9906
Epoch [9/10], Loss: 0.0250, Accuracy: 0.9920
Epoch [10/10], Loss: 0.0226, Accuracy: 0.9927
Test Accuracy: 0.9915
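
Choosing `best_p` can also be done programmatically if the sweep records each test accuracy. The sketch below uses only the three test accuracies that are visible in the sweep output above (the remaining runs are cut off), and `sweep_results` is a hypothetical dictionary that the notebook itself does not build.

```python
# Test accuracies copied from the visible part of the dropout sweep output.
sweep_results = {0.0: 0.9910, 0.1: 0.9916, 0.2: 0.9925}

# Pick the dropout probability with the highest recorded test accuracy.
best_p = max(sweep_results, key=sweep_results.get)
print(best_p, sweep_results[best_p])
```

To fill such a dictionary automatically, `test` would need to return the accuracy in addition to printing it.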


# Submission Instructions

Zip all of your **code** and **model .pt files** into a single file and submit it to the corresponding assignment on Gradescope.