# Lab 1: Neural Networks

In this lab we build dense neural networks on the MNIST dataset using PyTorch.

In [None]:
if 'google.colab' in str(get_ipython()):
    !pip install --quiet openml

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import openml as oml

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split
from sklearn.model_selection import train_test_split

## Load the data

In [None]:
# Download MNIST
mnist = oml.datasets.get_dataset(554)
X, y, _, _ = mnist.get_data(target=mnist.default_target_attribute, dataset_format='array')
X = X.reshape(70000, 28, 28)

# Visualize some examples
from random import randint
fig, axes = plt.subplots(1, 5, figsize=(10, 5))
for i in range(5):
    n = randint(0, len(X) - 1)
    axes[i].imshow(X[n], cmap=plt.cm.gray_r)
    axes[i].set_xlabel(y[n])
    axes[i].set_xticks(()); axes[i].set_yticks(())
plt.show()

In [None]:
# Random 60000/10000 train-test split (shuffled, with a fixed seed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=60000, random_state=0)

## Exercise 1: Preprocessing
* Normalize the data: map pixel values from 0-255 to 0.0-1.0
* Flatten the data from (N, 28, 28) to (N, 784)
* Convert the numpy arrays to PyTorch tensors (`torch.float32` for X, `torch.long` for y)
* Create a `TensorDataset` for training and another for testing
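
The steps above can be sketched as a single helper. This is one possible approach, not the reference solution: the name `to_dataset` is hypothetical, and the random arrays only stand in for `X_train` and `y_train` so the snippet runs without downloading MNIST.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

def to_dataset(X, y):
    """Hypothetical helper: normalize, flatten, and wrap arrays in a TensorDataset."""
    X = np.asarray(X, dtype=np.float32) / 255.0              # 0-255 -> 0.0-1.0
    X = X.reshape(len(X), -1)                                # (N, 28, 28) -> (N, 784)
    X_tensor = torch.from_numpy(X)                           # dtype torch.float32
    y_tensor = torch.as_tensor(np.asarray(y).astype(np.int64))  # dtype torch.long
    return TensorDataset(X_tensor, y_tensor)

# Random stand-in data with the same shape and value range as MNIST images
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(8, 28, 28)).astype(np.float32)
y_demo = rng.integers(0, 10, size=8)
demo_dataset = to_dataset(X_demo, y_demo)
```

In the lab itself, the same helper would be called once on `X_train, y_train` and once on `X_test, y_test`.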

## Exercise 2: Create a neural network model

Implement a `create_model` function that builds a dense neural network using `nn.Sequential`:
* 2 layers: one hidden layer and one output layer
* The number of nodes in each layer should be function parameters
* Add at least one `nn.Dropout` layer for regularization

Consider:
* Input size: 28x28 = 784 (flattened)
* Hidden layer: use `nn.ReLU()` activation
* Output: 10 classes. No softmax is needed, since `nn.CrossEntropyLoss` handles it internally
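
To see why the output layer can return raw logits, the small check below compares `nn.CrossEntropyLoss` on logits with an explicit log-softmax followed by the negative log-likelihood loss. The tensors are made-up values used only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)              # raw network outputs for 4 samples, 10 classes
targets = torch.tensor([3, 0, 7, 1])     # made-up class labels

# CrossEntropyLoss expects raw logits and applies log-softmax itself
loss_from_logits = nn.CrossEntropyLoss()(logits, targets)

# The same quantity computed in two explicit steps
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss_from_logits.item(), loss_manual.item())
```

Adding an extra `nn.Softmax` layer before this loss would apply softmax twice, which is why the model should end with a plain `nn.Linear`.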

In [None]:
def create_model(layer_1_units=32, layer_2_units=10, dropout_rate=0.3):
    pass
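
One possible way to fill in the stub is sketched below. It is a minimal example rather than the expected solution. Placing the dropout layer after the hidden activation is an assumption; the exercise only asks for at least one `nn.Dropout` layer.

```python
import torch
import torch.nn as nn

def create_model(layer_1_units=32, layer_2_units=10, dropout_rate=0.3):
    """Dense network: 784 inputs -> hidden layer -> output logits."""
    return nn.Sequential(
        nn.Linear(28 * 28, layer_1_units),        # hidden layer
        nn.ReLU(),
        nn.Dropout(dropout_rate),                 # regularization (assumed position)
        nn.Linear(layer_1_units, layer_2_units),  # output layer, raw logits
    )

model = create_model()
logits = model(torch.zeros(4, 784))  # dummy batch of 4 flattened images
print(logits.shape)
```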

## Exercise 3: Create a training function

Implement a `train_model` function that:
* Creates `DataLoader`s from the datasets
* Uses `nn.CrossEntropyLoss` and `torch.optim.Adam`
* Runs a training loop for a given number of epochs
* Computes and prints train/validation loss and accuracy each epoch
* Returns a history dict with `loss`, `accuracy`, `val_loss`, `val_accuracy` lists

In [None]:
def train_model(model, train_dataset, val_dataset, epochs=10, batch_size=64, learning_rate=0.001):
    pass
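
A possible shape for the training function is sketched below, under a few assumptions: the helper `run_epoch` is a hypothetical name, training metrics are accumulated during the training pass (so dropout is active while they are measured), and the demo at the end uses random tensors instead of MNIST.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def run_epoch(model, loader, criterion, optimizer=None):
    """Hypothetical helper: one pass over loader; updates weights only if an optimizer is given."""
    training = optimizer is not None
    model.train(training)                      # enables dropout only while training
    total_loss, correct, seen = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for X_batch, y_batch in loader:
            logits = model(X_batch)
            loss = criterion(logits, y_batch)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * len(y_batch)   # undo the batch mean
            correct += (logits.argmax(dim=1) == y_batch).sum().item()
            seen += len(y_batch)
    return total_loss / seen, correct / seen

def train_model(model, train_dataset, val_dataset, epochs=10, batch_size=64, learning_rate=0.001):
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=batch_size)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    history = {"loss": [], "accuracy": [], "val_loss": [], "val_accuracy": []}
    for epoch in range(epochs):
        loss, acc = run_epoch(model, train_loader, criterion, optimizer)
        val_loss, val_acc = run_epoch(model, val_loader, criterion)
        history["loss"].append(loss)
        history["accuracy"].append(acc)
        history["val_loss"].append(val_loss)
        history["val_accuracy"].append(val_acc)
        print(f"Epoch {epoch + 1}/{epochs}: loss={loss:.4f} acc={acc:.4f} "
              f"val_loss={val_loss:.4f} val_acc={val_acc:.4f}")
    return history

# Tiny random stand-in dataset and model, only to show the call
torch.manual_seed(0)
demo_data = TensorDataset(torch.rand(64, 784), torch.randint(0, 10, (64,)))
demo_model = nn.Sequential(nn.Linear(784, 16), nn.ReLU(), nn.Dropout(0.3), nn.Linear(16, 10))
demo_history = train_model(demo_model, demo_data, demo_data, epochs=2, batch_size=16)
```

Using one function for both passes keeps the loss and accuracy bookkeeping identical for training and validation; a separate evaluation loop would work just as well.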

## Exercise 4: Evaluate the model

* Train the model with learning rate 0.003, 50 epochs, batch size 4000, and 20% of the training data held out for validation
* Plot the learning curves (loss and accuracy for train and validation)
* Report performance on the test set

In [None]:
# Helper plotting function
def plot_curve(history):
    epochs = range(1, len(history["accuracy"]) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    ax1.plot(epochs, history["accuracy"], label="Train")
    ax1.plot(epochs, history["val_accuracy"], label="Val")
    ax1.set_title("Accuracy"); ax1.legend()
    ax2.plot(epochs, history["loss"], label="Train")
    ax2.plot(epochs, history["val_loss"], label="Val")
    ax2.set_title("Loss"); ax2.legend()
    plt.tight_layout(); plt.show()
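
The 20% validation split and the test-set report from this exercise could be handled as sketched below. The helper names `split_train_val` and `evaluate` are hypothetical, and the random tensors and single-layer model are placeholders so the snippet runs on its own.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

def split_train_val(dataset, val_fraction=0.2, seed=0):
    """Hypothetical helper: hold out a fraction of the dataset for validation."""
    n_val = int(len(dataset) * val_fraction)
    generator = torch.Generator().manual_seed(seed)   # reproducible split
    return random_split(dataset, [len(dataset) - n_val, n_val], generator=generator)

def evaluate(model, dataset, batch_size=4000):
    """Hypothetical helper: accuracy of model on dataset, with dropout disabled."""
    model.eval()
    correct = 0
    with torch.no_grad():
        for X_batch, y_batch in DataLoader(dataset, batch_size=batch_size):
            correct += (model(X_batch).argmax(dim=1) == y_batch).sum().item()
    return correct / len(dataset)

# Random stand-in data and an untrained placeholder model
torch.manual_seed(0)
demo_full = TensorDataset(torch.rand(100, 784), torch.randint(0, 10, (100,)))
demo_train, demo_val = split_train_val(demo_full)
demo_model = nn.Sequential(nn.Linear(784, 10))
demo_accuracy = evaluate(demo_model, demo_val)
print(len(demo_train), len(demo_val), demo_accuracy)
```

In the lab, the two parts returned by `split_train_val` would be passed to `train_model` with the settings listed above, the returned history to `plot_curve`, and the test `TensorDataset` from Exercise 1 to `evaluate`.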

## Exercise 5: Optimize the model

Try to optimize the model to reach at least **96% test accuracy**. Experiment with:
* The number of hidden layers
* The number of neurons per layer
* Dropout rates
* Learning rate and batch size

Hint: A model with hidden layers of 256, 128, and 64 units and dropout 0.2 should get there.
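
To experiment with the number and width of hidden layers, a builder that takes a list of layer sizes can be convenient. The sketch below is one way to do it; `create_deep_model` and its parameters are hypothetical names, with defaults taken from the hint. It makes no claim about the accuracy a given configuration reaches.

```python
import torch
import torch.nn as nn

def create_deep_model(hidden_units=(256, 128, 64), dropout_rate=0.2,
                      n_inputs=784, n_classes=10):
    """Hypothetical builder: one Linear + ReLU + Dropout group per hidden layer."""
    layers = []
    in_features = n_inputs
    for units in hidden_units:
        layers += [nn.Linear(in_features, units), nn.ReLU(), nn.Dropout(dropout_rate)]
        in_features = units
    layers.append(nn.Linear(in_features, n_classes))   # output layer, raw logits
    return nn.Sequential(*layers)

deep_model = create_deep_model()
deep_logits = deep_model(torch.zeros(2, 784))  # dummy batch of 2 flattened images
print(deep_model)
```

Changing `hidden_units` to, say, `(512, 256)` or adjusting `dropout_rate` then needs no further code changes, which makes it easier to compare configurations.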