# Lab 2: Convolutional Neural Networks

In this lab we classify images from the [CIFAR-10 dataset](https://www.openml.org/d/40926) using convolutional neural networks in **PyTorch**.

Tip: You can run these exercises faster on a GPU. If you don't have one locally, use Google Colab and enable GPU at *Runtime -> Change runtime type*.

In [None]:
if 'google.colab' in str(get_ipython()):
    !pip install --quiet openml

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import openml as oml

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

In [None]:
# Download CIFAR-10 data
cifar = oml.datasets.get_dataset(40926)
X, y, _, _ = cifar.get_data(target=cifar.default_target_attribute, dataset_format='array')

cifar_classes = {0: "airplane", 1: "automobile", 2: "bird", 3: "cat", 4: "deer",
                 5: "dog", 6: "frog", 7: "horse", 8: "ship", 9: "truck"}

# Reshape to (N, Channels, Height, Width), PyTorch's expected format
X = X.reshape((len(X), 3, 32, 32))
y = y.astype(np.int64)
print("X shape:", X.shape, "y shape:", y.shape)

In [None]:
# Visualize some random examples
from random import randint

fig, axes = plt.subplots(1, 5, figsize=(10, 5))
for i in range(5):
    n = randint(0, len(X) - 1)
    axes[i].imshow(X[n].transpose(1, 2, 0).astype(np.uint8))  # CHW -> HWC for plotting
    axes[i].set_xlabel(cifar_classes[int(y[n])])
    axes[i].set_xticks(()); axes[i].set_yticks(())
plt.show()

## Exercise 1: A simple model
* Split the data into 80% training and 20% validation sets
* Normalize the data to [0,1]
* Build a ConvNet with 3 convolutional layers interspersed with `nn.MaxPool2d` layers, and one dense (`nn.Linear`) layer
    * Use at least 32 3x3 filters in the first layer and ReLU activation
    * Otherwise, make rational design choices or experiment a bit to see what works
* You should get at least 60% accuracy
* For training, you can try a batch size of 64 and 20-50 epochs, but feel free to explore this as well
* Plot and interpret the learning curves. Is the model overfitting? How could you improve it further?
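
Hint: The split and normalization steps could look roughly like the sketch below. It assumes pixel values in the range 0 to 255 and uses small random arrays (`X_demo`, `y_demo`, hypothetical names) so that it runs on its own. In the lab, use the real `X` and `y` instead. The stratified split and the fixed seed are optional choices, not requirements.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

# Stand-in for the CIFAR-10 arrays (assumption: pixel values in 0..255).
# In the lab, use the real X and y instead of these two arrays.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(500, 3, 32, 32)).astype(np.float32)
y_demo = rng.integers(0, 10, size=500).astype(np.int64)

# 80/20 split, stratified so both sets keep a similar class balance
X_train, X_val, y_train, y_val = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0)

# Scale pixel values from [0, 255] to [0, 1]
X_train = X_train / 255.0
X_val = X_val / 255.0

# Wrap the arrays as tensors and batch them
train_ds = TensorDataset(torch.from_numpy(X_train), torch.from_numpy(y_train))
val_ds = TensorDataset(torch.from_numpy(X_val), torch.from_numpy(y_val))
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64)

xb, yb = next(iter(train_loader))
print(xb.shape, xb.dtype, yb.dtype)
```

Only the training loader is shuffled. The validation loader keeps a fixed order, which makes it easier to map predictions back to images later.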

Hint: You can define models using `nn.Sequential(...)` and stack layers like `nn.Conv2d`, `nn.ReLU()`, `nn.MaxPool2d`, `nn.Flatten`, and `nn.Linear`.
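
As a starting point, here is a minimal sketch of such a model together with a training loop that records the values needed for learning curves. The helper names (`make_simple_cnn`, `run_epoch`, `fit`), the filter counts, the Adam optimizer, and the learning rate are all illustrative assumptions, not the required solution. The last lines only run a smoke test on random tensors.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def make_simple_cnn(num_classes=10):
    """Three conv layers with max pooling in between, then one dense layer."""
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                       # 32x32 -> 16x16
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                       # 16x16 -> 8x8
        nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                       # 8x8 -> 4x4
        nn.Flatten(),
        nn.Linear(128 * 4 * 4, num_classes),   # raw logits, no softmax
    )

def run_epoch(model, loader, loss_fn, device, optimizer=None):
    """One pass over loader: trains if an optimizer is given, else evaluates."""
    training = optimizer is not None
    model.train(training)
    total_loss, correct, count = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            logits = model(xb)
            loss = loss_fn(logits, yb)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * len(xb)
            correct += (logits.argmax(dim=1) == yb).sum().item()
            count += len(xb)
    return total_loss / count, correct / count

def fit(model, train_loader, val_loader, epochs=20, lr=1e-3):
    """Train and return per-epoch loss and accuracy for both sets."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    loss_fn = nn.CrossEntropyLoss()            # expects logits and int64 labels
    optimizer = optim.Adam(model.parameters(), lr=lr)
    history = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}
    for _ in range(epochs):
        tr_loss, tr_acc = run_epoch(model, train_loader, loss_fn, device, optimizer)
        va_loss, va_acc = run_epoch(model, val_loader, loss_fn, device)
        history["train_loss"].append(tr_loss)
        history["train_acc"].append(tr_acc)
        history["val_loss"].append(va_loss)
        history["val_acc"].append(va_acc)
    return history

# Smoke test on random tensors (not real data): checks shapes and the loop only
demo = TensorDataset(torch.rand(128, 3, 32, 32), torch.randint(0, 10, (128,)))
history = fit(make_simple_cnn(),
              DataLoader(demo, batch_size=64, shuffle=True),
              DataLoader(demo, batch_size=64), epochs=2)
print(history["val_acc"])
```

Plotting `history["train_loss"]` against `history["val_loss"]` with `plt.plot` gives the learning curves. A validation loss that rises while the training loss keeps falling is the usual sign of overfitting.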

## Exercise 2: VGG-like model
* Implement a simplified VGG model by building 3 'blocks' of 2 convolutional layers each
* Do MaxPooling after each block
* The first block should use at least 32 filters, later blocks should use more
* Use 3x3 filters with `padding=1` to preserve spatial dimensions within each block
* Use a dense layer with at least 128 hidden nodes
* Use ReLU activations everywhere (where it makes sense)
* Plot and interpret the learning curves
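
Hint: Since all blocks share the same structure, a small helper function keeps the model definition short. The sketch below is one possible layout. The names `vgg_block` and `make_vgg_like` and the widths (32, 64, 128) are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

def vgg_block(in_channels, out_channels):
    """Two 3x3 convolutions (padding=1 keeps the size), then 2x2 max pooling."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
    )

def make_vgg_like(num_classes=10, widths=(32, 64, 128), hidden=128):
    """Stack one block per entry in widths, followed by a dense head."""
    blocks, in_channels = [], 3
    for width in widths:
        blocks.append(vgg_block(in_channels, width))
        in_channels = width
    spatial = 32 // 2 ** len(widths)   # each pooling halves the size: 32 -> 16 -> 8 -> 4
    return nn.Sequential(
        *blocks,
        nn.Flatten(),
        nn.Linear(widths[-1] * spatial * spatial, hidden), nn.ReLU(),
        nn.Linear(hidden, num_classes),  # no ReLU on the output logits
    )

vgg = make_vgg_like()
print(vgg(torch.zeros(1, 3, 32, 32)).shape)
```

The final layer has no ReLU because `nn.CrossEntropyLoss` expects unbounded logits. This is one place where a ReLU does not make sense.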

## Exercise 3: Regularization
* Explore different ways to regularize your VGG-like model
  * Try adding `nn.Dropout(p)` after every MaxPooling and Dense layer
    * What are good Dropout rates? Try a fixed rate, or increase rates in the deeper layers
  * Try `nn.BatchNorm2d` together with Dropout
    * Think about where batch normalization would make sense
* Plot and interpret the learning curves
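
Hint: One common arrangement, shown here as a sketch rather than the recommended answer, places `nn.BatchNorm2d` between each convolution and its ReLU, and `nn.Dropout` after each pooling step. The helper name `regularized_block` and the dropout rates (0.2 to 0.5, increasing with depth) are untuned assumptions for you to experiment with.

```python
import torch
import torch.nn as nn

def regularized_block(in_channels, out_channels, p_drop):
    """VGG-style block with batch normalization and dropout (illustrative layout)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, 3, padding=1),
        nn.BatchNorm2d(out_channels), nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, 3, padding=1),
        nn.BatchNorm2d(out_channels), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Dropout(p_drop),
    )

regularized_vgg = nn.Sequential(
    regularized_block(3, 32, 0.2),     # dropout rates are guesses, not tuned values
    regularized_block(32, 64, 0.3),
    regularized_block(64, 128, 0.4),
    nn.Flatten(),
    nn.Linear(128 * 4 * 4, 128), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(128, 10),
)

# Dropout and BatchNorm behave differently in train and eval mode:
# in eval mode dropout is disabled and BatchNorm uses its running statistics.
x = torch.rand(8, 3, 32, 32)
regularized_vgg.eval()
with torch.no_grad():
    out_a = regularized_vgg(x)
    out_b = regularized_vgg(x)
print(torch.allclose(out_a, out_b))
```

Remember to call `model.train()` before each training epoch and `model.eval()` before validation. Otherwise dropout stays active during evaluation and the validation curve becomes noisy and pessimistic.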

## Exercise 4: Data Augmentation
* Perform image augmentation using `torchvision.transforms` (e.g. `RandomHorizontalFlip`, `RandomAffine`, `RandomCrop`)
* You will need a custom Dataset class that applies transforms in `__getitem__`
* What is the effect of augmentation? Does it differ with and without Dropout?
* Plot and interpret the learning curves
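
Hint: A custom Dataset of this kind can be quite small. The sketch below uses a hypothetical class name, `AugmentedDataset`, and accepts any callable as `transform`. To keep the example free of extra dependencies it uses a hand-written flip function, `random_hflip`, as a stand-in for `torchvision.transforms.RandomHorizontalFlip`. The random tensors are placeholders for the real training data.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class AugmentedDataset(Dataset):
    """Holds image and label tensors and applies `transform` on every access."""
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        if self.transform is not None:
            image = self.transform(image)   # a fresh random augmentation each time
        return image, self.labels[idx]

def random_hflip(image, p=0.5):
    """Stand-in transform: flip a CHW tensor left-right with probability p."""
    return torch.flip(image, dims=[-1]) if torch.rand(1).item() < p else image

# Placeholder data; in the lab, use the normalized training tensors instead
images = torch.rand(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))

train_aug = AugmentedDataset(images, labels, transform=random_hflip)
val_plain = AugmentedDataset(images, labels)   # no augmentation for validation

xb, yb = next(iter(DataLoader(train_aug, batch_size=8, shuffle=True)))
print(xb.shape, yb.shape)
```

With torchvision installed, a `transforms.Compose([...])` of `RandomHorizontalFlip`, `RandomAffine`, or `RandomCrop` can be passed as `transform` instead, since recent torchvision versions accept CHW tensors. Augment only the training set, so that validation scores stay comparable between runs.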

## Exercise 5: Interpret the misclassifications
Chances are that even your best model is not yet perfect. It is important to understand what kind of errors it still makes.
* Run the validation images through the network and detect all misclassified ones
* Visualize some of the misclassifications. Are they to be expected?
* Compute the confusion matrix (e.g. using `sklearn.metrics.confusion_matrix`). Which classes are often confused?
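
Hint: The steps above can be sketched as follows. The `predict` helper is a hypothetical name, and the linear model and random data are stand-ins so that the example runs on its own. In the lab, pass your trained model and an unshuffled validation loader.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.metrics import confusion_matrix

@torch.no_grad()
def predict(model, loader, device="cpu"):
    """Return (predicted labels, true labels) as NumPy arrays, in loader order."""
    model.eval()
    model.to(device)
    preds, targets = [], []
    for xb, yb in loader:
        preds.append(model(xb.to(device)).argmax(dim=1).cpu())
        targets.append(yb)
    return torch.cat(preds).numpy(), torch.cat(targets).numpy()

# Stand-ins (assumptions): an untrained linear model and random "images"
demo_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
demo_data = TensorDataset(torch.rand(200, 3, 32, 32), torch.randint(0, 10, (200,)))
demo_loader = DataLoader(demo_data, batch_size=64)   # shuffle=False keeps the order

y_pred, y_true = predict(demo_model, demo_loader)

# Indices of misclassified images, usable to index the validation array
wrong_idx = np.flatnonzero(y_pred != y_true)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=list(range(10)))

# Most frequent confusion: largest off-diagonal entry
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
true_cls, pred_cls = np.unravel_index(off_diag.argmax(), off_diag.shape)
print(f"{len(wrong_idx)} errors; most common: true {true_cls} predicted as {pred_cls}")
```

Because the loader is not shuffled, `wrong_idx` lines up with the validation array, so the misclassified images can be plotted the same way as the random examples at the top of the lab.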

## Exercise 6: Interpret the model
Retrain your best model on all the data. Then, visualize the activations (feature maps) for a sample image at each convolutional layer.

Hint: In PyTorch, you can use `register_forward_hook` on a layer to capture its output during a forward pass.
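
A minimal sketch of this approach is shown below. The function name `capture_conv_activations` and the tiny `demo_model` are assumptions for illustration. Replace the model and the random image with your retrained network and a real sample.

```python
import torch
import torch.nn as nn

def capture_conv_activations(model, image):
    """Run one CHW image through model and return {layer name: conv output}."""
    activations, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            activations[name] = output.detach().cpu()
        return hook

    # Attach a hook to every convolutional layer
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            handles.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        model(image.unsqueeze(0))          # add the batch dimension

    # Remove the hooks so later forward passes are unaffected
    for handle in handles:
        handle.remove()
    return activations

# Stand-in model and image (assumptions), both on the CPU
demo_model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 8 * 8, 10),
)
acts = capture_conv_activations(demo_model, torch.rand(3, 32, 32))
for name, act in acts.items():
    print(name, tuple(act.shape))
```

Each captured tensor has shape (1, filters, height, width), so `plt.imshow(acts[name][0, k])` displays feature map `k` of that layer. Note that a hook on `nn.Conv2d` captures the output before the ReLU. Hook the `nn.ReLU` modules instead if you prefer to see the rectified maps.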

Interpret the results. Is your model learning something useful?

## Optional: Take it a step further
* Repeat the exercises with a [higher-resolution version](https://www.openml.org/d/41103) (OpenML ID 41103), or a [version with 100 classes](https://www.openml.org/d/41983) (OpenML ID 41983).