# CAP4453 Robot Vision: Softmax Classifier

This notebook provides a skeleton implementation of a one‑layer linear classifier in PyTorch. You will load the CIFAR‑10 or MNIST dataset, flatten each image into a vector, and train a linear classifier with the softmax (cross‑entropy) loss. The code below is only a starting point; feel free to modify it as needed. It runs on a GPU automatically when one is available and falls back to the CPU otherwise.
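To make the softmax loss concrete, here is a minimal NumPy sketch of what it computes from raw class scores. The function name `softmax_cross_entropy` and the toy inputs are illustrative assumptions, not part of the assignment code; PyTorch's `nn.CrossEntropyLoss` applies the same log‑softmax step internally, which is why a model trained with it returns raw scores rather than probabilities.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy for scores of shape (N, C) and integer labels of shape (N,)."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class for each sample and average.
    return -log_probs[np.arange(len(labels)), labels].mean()

labels = np.array([0, 3, 5, 9])

# All-zero scores give every one of the 10 classes probability 1/10,
# so the loss equals ln(10), roughly 2.3026.
uniform_loss = softmax_cross_entropy(np.zeros((4, 10)), labels)

# Scores that strongly favor the correct class drive the loss toward 0.
confident_logits = np.zeros((4, 10))
confident_logits[np.arange(4), labels] = 20.0
confident_loss = softmax_cross_entropy(confident_logits, labels)

print(uniform_loss, np.log(10), confident_loss)
```

The value ln(10) is a useful reference point: a 10‑class classifier that has learned nothing should start near that loss and near 10% accuracy.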


In [10]:
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import numpy as np
import matplotlib.pyplot as plt

# Choose device (GPU if available, else CPU)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Select dataset: 'cifar10' or 'mnist'
dataset_name = 'cifar10'

if dataset_name.lower() == 'cifar10':
    transform = transforms.Compose([transforms.ToTensor()])
    train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
    test_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
    num_classes = 10
    input_dim = 32*32*3
elif dataset_name.lower() == 'mnist':
    transform = transforms.Compose([transforms.ToTensor()])
    train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
    test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
    num_classes = 10
    input_dim = 28*28
else:
    raise ValueError('Unknown dataset')

# Split train into train/val (fixed seed so every run validates on the same samples)
train_size = int(0.9 * len(train_dataset))
val_size = len(train_dataset) - train_size
train_dataset, val_dataset = torch.utils.data.random_split(
    train_dataset, [train_size, val_size], generator=torch.Generator().manual_seed(0))

batch_size = 128
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=0)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=0)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=0)

print(f"Train samples: {len(train_dataset)}, Validation samples: {len(val_dataset)}, Test samples: {len(test_dataset)}")


Using device: cpu


Train samples: 45000, Validation samples: 5000, Test samples: 10000
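
The value `input_dim = 32*32*3` can be checked on a dummy batch. The sketch below assumes CIFAR‑10 image shapes and uses an all‑zero tensor in place of real data; `batch`, `flat`, and `layer` are illustrative names only.

```python
import torch
import torch.nn as nn

# Dummy batch with CIFAR-10 image shape: (batch, channels, height, width).
batch = torch.zeros(128, 3, 32, 32)

# Keep the batch axis and merge channels, height, and width into one vector.
flat = batch.view(batch.size(0), -1)
print(flat.shape)  # torch.Size([128, 3072])

# A linear layer on the flattened input has one weight per
# (input value, class) pair plus one bias per class.
layer = nn.Linear(32 * 32 * 3, 10)
num_params = sum(p.numel() for p in layer.parameters())
print(num_params)  # 30730 = 3072 * 10 + 10

# One raw score per class for every image in the batch.
scores = layer(flat)
print(scores.shape)  # torch.Size([128, 10])
```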


In [11]:
# Define a simple linear classifier using PyTorch
import torch.nn as nn

class LinearClassifier(nn.Module):
    def __init__(self, input_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten
        return self.fc(x)
    

# Initialize model, loss function and optimizer
model = LinearClassifier(input_dim, num_classes).to(device)
criterion = nn.CrossEntropyLoss()
learning_rate = 1e-3
weight_decay = 1e-4
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

# Training loop skeleton
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()
    train_loss = running_loss / total
    train_acc = 100. * correct / total

    # Evaluate on validation set
    model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = outputs.max(1)
            val_total += labels.size(0)
            val_correct += predicted.eq(labels).sum().item()
    val_acc = 100. * val_correct / val_total
    print(f"Epoch {epoch+1}/{num_epochs}, Train loss: {train_loss:.4f}, Train acc: {train_acc:.2f}%, Val acc: {val_acc:.2f}%")


Epoch 1/10, Train loss: 2.2009, Train acc: 20.72%, Val acc: 26.62%
Epoch 2/10, Train loss: 2.0869, Train acc: 27.66%, Val acc: 29.54%
Epoch 3/10, Train loss: 2.0281, Train acc: 30.33%, Val acc: 31.56%
Epoch 4/10, Train loss: 1.9904, Train acc: 31.80%, Val acc: 33.46%
Epoch 5/10, Train loss: 1.9632, Train acc: 33.05%, Val acc: 33.48%
Epoch 6/10, Train loss: 1.9423, Train acc: 33.78%, Val acc: 34.12%
Epoch 7/10, Train loss: 1.9255, Train acc: 34.27%, Val acc: 33.84%
Epoch 8/10, Train loss: 1.9117, Train acc: 34.78%, Val acc: 33.96%
Epoch 9/10, Train loss: 1.9001, Train acc: 34.95%, Val acc: 34.94%
Epoch 10/10, Train loss: 1.8898, Train acc: 35.50%, Val acc: 35.46%
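
The per‑epoch accuracies in the log above can be plotted directly. This is a minimal sketch that assumes the values are copied by hand from the log; in practice one would append the train and validation accuracy to two lists inside the training loop and plot those instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; drop this line inside a notebook
import matplotlib.pyplot as plt

# Accuracies (%) copied from the training log above.
train_acc = [20.72, 27.66, 30.33, 31.80, 33.05, 33.78, 34.27, 34.78, 34.95, 35.50]
val_acc = [26.62, 29.54, 31.56, 33.46, 33.48, 34.12, 33.84, 33.96, 34.94, 35.46]
epochs = range(1, len(train_acc) + 1)

fig, ax = plt.subplots()
ax.plot(epochs, train_acc, marker="o", label="train")
ax.plot(epochs, val_acc, marker="o", label="validation")
ax.set_xlabel("Epoch")
ax.set_ylabel("Accuracy (%)")
ax.set_title("Linear softmax classifier on CIFAR-10")
ax.legend()
# In a notebook the figure renders inline at the end of the cell;
# call fig.savefig(...) with a path of your choice to keep a copy.
```

Both curves are still rising slowly at epoch 10 and stay within about a point of each other from epoch 5 onward, so the model shows little overfitting at these settings.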


### Next steps

- Experiment with different learning rates, regularization strengths, and batch sizes. In the report, record the validation accuracy obtained with each hyper‑parameter setting (for example, as a screenshot or a copy of the training/validation log).
- After choosing hyper‑parameters, train on the combined train+validation set and report the test accuracy.
- Plot training and validation accuracies over epochs.
- Compare these results with those obtained using a two‑layer network (see the next notebook).
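
The first two steps can be sketched as a small grid search followed by a final run on the recombined data. Everything here is an illustrative assumption rather than the required solution: the random tensors stand in for the real dataset so the block runs without a download, and the helper names (`make_fake_split`, `accuracy`, `train`), the grid values, and the epoch count are hypothetical.

```python
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

torch.manual_seed(0)

def make_fake_split(n):
    """Hypothetical stand-in data: n random 3072-dim 'images' with labels in 0..9."""
    return TensorDataset(torch.randn(n, 3072), torch.randint(0, 10, (n,)))

train_set, val_set, test_set = make_fake_split(512), make_fake_split(128), make_fake_split(128)

def accuracy(model, loader):
    """Return the percentage of correctly classified samples in loader."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.size(0)
    return 100.0 * correct / total

def train(dataset, lr, weight_decay, num_epochs=2, val_loader=None):
    """Train a fresh linear classifier; return it with its per-epoch validation accuracy."""
    model = nn.Linear(3072, 10)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
    criterion = nn.CrossEntropyLoss()
    loader = DataLoader(dataset, batch_size=128, shuffle=True)
    val_history = []
    for _ in range(num_epochs):
        model.train()
        for x, y in loader:
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
        if val_loader is not None:
            val_history.append(accuracy(model, val_loader))
    return model, val_history

# Step 1: score every setting on the validation set only.
val_loader = DataLoader(val_set, batch_size=128)
results = {}
for lr in (1e-3, 1e-2):
    for wd in (0.0, 1e-4):
        _, history = train(train_set, lr, wd, val_loader=val_loader)
        results[(lr, wd)] = history[-1]

# Step 2: retrain with the best setting on train+validation, then touch the test set once.
best_lr, best_wd = max(results, key=results.get)
final_model, _ = train(ConcatDataset([train_set, val_set]), best_lr, best_wd)
test_acc = accuracy(final_model, DataLoader(test_set, batch_size=128))
print(f"best lr={best_lr}, weight_decay={best_wd}, test acc={test_acc:.2f}%")
```

In the notebook itself, `ConcatDataset([train_dataset, val_dataset])` would recombine the two subsets produced by `random_split`. The test set is evaluated only after the hyper‑parameters are fixed, so the reported test accuracy is not biased by the search.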
