# Problem Set 1 - Neural network implementation

Team PS 1 G

Team members:
- Xiaohan Wu 237867

As described in section "3 Neural network implementation" of assignment 1, the goal is to build a shallow neural network from scratch using different approaches. To validate that your code is working and that the network is actually learning something, please use the following MNIST classification task. Finally, please submit proof of the learning progress as described in the assignment.

## Imports

In [32]:
import random
import pandas as pd
import numpy as np
from sklearn import model_selection
import sklearn.datasets as sk_datasets
import torchvision.datasets as torch_datasets
from torchvision import transforms
import torch
import matplotlib.pyplot as plt

from scratch.network import Network
from scratch.res_network import ResNetwork
from pytorch.network import TorchNetwork
from scratch.utils import *

import optuna
from sklearn.metrics import accuracy_score

In [33]:
# Automatically load changes in imported modules
%load_ext autoreload
%autoreload 2

# Explicitly set seeds for reproducibility (Python, NumPy, and PyTorch)
GLOBAL_RANDOM_STATE = 42

random.seed(GLOBAL_RANDOM_STATE)
np.random.seed(GLOBAL_RANDOM_STATE)
torch.manual_seed(GLOBAL_RANDOM_STATE)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## A) Neural Network Classifier from Scratch

### Data

In [44]:
# Load MNIST from a local CSV instead of fetch_openml
data = pd.read_csv("mnist_784.csv")

# Features are all columns except the "class" label column; scale pixels to [0, 1]
x = data.drop("class", axis=1).values.astype("float32") / 255.0
y_cat = data["class"].astype(int).values

# One-hot encode y: row i gets a 1 in column y_cat[i]
y = np.zeros((len(y_cat), 10))
y[np.arange(len(y_cat)), y_cat] = 1

# Use only small subset of data for faster training
x = x[:1000]
y = y[:1000]

# Split data into train and validation set
x_train, x_val, y_train, y_val = model_selection.train_test_split(x, y, test_size=0.2, random_state=GLOBAL_RANDOM_STATE)
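
The one-hot targets built above can be mapped back to digit labels with `argmax`, which is also a common way to compute classification accuracy from a 10-dimensional network output. The following is a minimal sketch of that idea; the helpers `one_hot` and `accuracy` are hypothetical names for illustration and are not taken from `scratch.utils`, and the scores are made up.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Return an (n, num_classes) one-hot matrix for integer labels."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1
    return encoded


def accuracy(y_true_one_hot, y_pred_scores):
    """Fraction of rows whose highest-scoring class matches the target class."""
    true_labels = np.argmax(y_true_one_hot, axis=1)
    pred_labels = np.argmax(y_pred_scores, axis=1)
    return float(np.mean(true_labels == pred_labels))


labels = np.array([3, 0, 9, 3])
targets = one_hot(labels)

# Made-up scores: the first three rows favour the correct class, the last one does not
scores = targets.copy()
scores[3] = 0
scores[3, 5] = 1

print(accuracy(targets, scores))
```

Whether the `Network` class computes its reported accuracies exactly this way is an assumption; its source is not shown here.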

### ML Model & Training

In [51]:
fnn = Network(sizes=[784, 128, 64, 10], learning_rate=0.1, epochs=50)
fnn.fit(x_train, y_train, x_val, y_val, cosine_annealing_lr=False)

Epoch: 1, Training Time: 0.12s, Training Accuracy: 86.88%, Validation Accuracy: 79.00%
Epoch: 2, Training Time: 0.24s, Training Accuracy: 94.00%, Validation Accuracy: 85.50%
Epoch: 3, Training Time: 0.37s, Training Accuracy: 97.00%, Validation Accuracy: 85.50%
Epoch: 4, Training Time: 0.50s, Training Accuracy: 98.88%, Validation Accuracy: 88.00%
Epoch: 5, Training Time: 0.60s, Training Accuracy: 99.50%, Validation Accuracy: 88.50%
Epoch: 6, Training Time: 0.74s, Training Accuracy: 99.88%, Validation Accuracy: 88.00%
Epoch: 7, Training Time: 0.85s, Training Accuracy: 100.00%, Validation Accuracy: 88.50%
Epoch: 8, Training Time: 0.96s, Training Accuracy: 100.00%, Validation Accuracy: 90.00%
Epoch: 9, Training Time: 1.06s, Training Accuracy: 100.00%, Validation Accuracy: 90.00%
Epoch: 10, Training Time: 1.17s, Training Accuracy: 100.00%, Validation Accuracy: 90.00%
Epoch: 11, Training Time: 1.28s, Training Accuracy: 100.00%, Validation Accuracy: 90.00%
Epoch: 12, Training Time: 1.41s, Tra

### Test cosine annealing scheduler

In [52]:
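# Note: this reuses the `fnn` instance trained in the previous cell. Unless `fit`
# re-initializes the weights, training continues from the already-fitted model,
# which would explain why the training accuracy starts at 100%.
fnn.fit(x_train, y_train, x_val, y_val, cosine_annealing_lr=True)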

Epoch: 1, Training Time: 0.13s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 2, Training Time: 0.25s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 3, Training Time: 0.38s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 4, Training Time: 0.49s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 5, Training Time: 0.62s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 6, Training Time: 0.77s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 7, Training Time: 0.89s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 8, Training Time: 1.02s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 9, Training Time: 1.14s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 10, Training Time: 1.27s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 11, Training Time: 1.40s, Training Accuracy: 100.00%, Validation Accuracy: 91.50%
Epoch: 12, Training Time: 1.54
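
For reference, a cosine annealing schedule lowers the learning rate from its initial value towards a minimum along half a cosine wave. The sketch below uses the standard formula; the function name `cosine_annealing_lr`, the `lr_min` parameter, and the choice to reach `lr_min` exactly at the last epoch are assumptions for illustration and may differ from what `Network.fit` does internally.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate for a 0-indexed epoch, decaying from lr_max to lr_min."""
    # progress runs from 0.0 (first epoch) to 1.0 (last epoch)
    progress = epoch / max(total_epochs - 1, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))


# Same initial learning rate and epoch count as the Network cell above
schedule = [cosine_annealing_lr(e, total_epochs=50, lr_max=0.1) for e in range(50)]

for epoch in (0, 12, 25, 37, 49):
    print(f"epoch {epoch + 1:2d}: lr = {schedule[epoch]:.4f}")
```

The schedule decays slowly at the start and end and fastest in the middle, so the large early steps are kept while late epochs make only small updates.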

### Test residual neural network

In [57]:
res_nn = ResNetwork(sizes=[784, 128, 128, 10], learning_rate=0.005, epochs=50)
res_nn.fit(x_train, y_train, x_val, y_val)

Epoch: 1, Training Time: 0.15s, Training Accuracy: 52.25%, Validation Accuracy: 52.00%
Epoch: 2, Training Time: 0.30s, Training Accuracy: 72.75%, Validation Accuracy: 72.00%
Epoch: 3, Training Time: 0.45s, Training Accuracy: 80.25%, Validation Accuracy: 78.50%
Epoch: 4, Training Time: 0.59s, Training Accuracy: 83.38%, Validation Accuracy: 82.00%
Epoch: 5, Training Time: 0.79s, Training Accuracy: 86.62%, Validation Accuracy: 82.00%
Epoch: 6, Training Time: 0.94s, Training Accuracy: 88.75%, Validation Accuracy: 82.50%
Epoch: 7, Training Time: 1.08s, Training Accuracy: 90.00%, Validation Accuracy: 83.00%
Epoch: 8, Training Time: 1.24s, Training Accuracy: 91.25%, Validation Accuracy: 84.00%
Epoch: 9, Training Time: 1.39s, Training Accuracy: 92.62%, Validation Accuracy: 85.00%
Epoch: 10, Training Time: 1.54s, Training Accuracy: 93.38%, Validation Accuracy: 85.50%
Epoch: 11, Training Time: 1.72s, Training Accuracy: 94.12%, Validation Accuracy: 85.00%
Epoch: 12, Training Time: 1.86s, Training
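
The idea behind a residual network is that a layer's input is added back to its output, so the layer only has to learn a correction to the identity. The sketch below shows one possible forward pass for the `[784, 128, 128, 10]` layout, where the two equal hidden sizes allow an identity skip connection. It is an assumption about the general technique, not the actual `ResNetwork` implementation; `residual_forward`, the sigmoid activation, and the random weights are illustrative only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def residual_forward(x, w1, w2, w_out):
    """Forward pass with one identity skip connection around the second hidden layer."""
    h1 = sigmoid(w1 @ x)        # 784 -> 128
    h2 = sigmoid(w2 @ h1) + h1  # 128 -> 128, plus the skip connection
    return w_out @ h2           # 128 -> 10 raw class scores


rng = np.random.default_rng(42)
w1 = rng.normal(scale=0.05, size=(128, 784))
w2 = rng.normal(scale=0.05, size=(128, 128))
w_out = rng.normal(scale=0.05, size=(10, 128))

scores = residual_forward(rng.random(784), w1, w2, w_out)
print(scores.shape)
```

Because of the `+ h1` term, the gradient reaching `h1` has a direct path that bypasses `w2`, which is the usual motivation for skip connections.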

## B) Neural Network Classifier using Torch

### Data

In [60]:
# Define data preprocessing steps: convert images to tensors and normalize to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Download MNIST dataset
train_set = torch_datasets.MNIST('data', train=True, download=True, transform=transform)
val_set = torch_datasets.MNIST('data', train=False, download=True, transform=transform)

# Use only small subset of data for faster training
train_set = torch.utils.data.Subset(train_set, range(1000))
val_set = torch.utils.data.Subset(val_set, range(1000))

# Utilize PyTorch DataLoader for simplified & harmonized loading of data
train_loader = torch.utils.data.DataLoader(train_set, batch_size=1)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=1)


RuntimeError: Error downloading train-images-idx3-ubyte.gz:
Tried https://ossci-datasets.s3.amazonaws.com/mnist/, got:
<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)>
Tried http://yann.lecun.com/exdb/mnist/, got:
HTTP Error 404: Not Found


### ML Model & Training

In [None]:
torch_nn = TorchNetwork(sizes=[784, 128, 64, 10], learning_rate=0.2, epochs=50, random_state=GLOBAL_RANDOM_STATE)
torch_nn.fit(train_loader, val_loader)

## C) Visualize accuracy & hyperparameter tuning

Here, you should compare the accuracy of all trained models. Optionally, you can also show the results of hyperparameter tuning and comment on which hyperparameters work best for this task.
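
One simple way to start such a comparison is to collect the per-epoch validation accuracies and report the best value for each model. The sketch below hard-codes the first 11 epochs printed in the logs above; later epochs are truncated in this document, and `TorchNetwork` has no results because the MNIST download failed, so these numbers are not final. The names `val_accuracy` and `summarize` are hypothetical.

```python
# Validation accuracy (%) for epochs 1-11, copied from the training logs above
val_accuracy = {
    "Network (lr=0.1)": [79.00, 85.50, 85.50, 88.00, 88.50, 88.00,
                         88.50, 90.00, 90.00, 90.00, 90.00],
    "ResNetwork (lr=0.005)": [52.00, 72.00, 78.50, 82.00, 82.00, 82.50,
                              83.00, 84.00, 85.00, 85.50, 85.00],
}


def summarize(history):
    """Return (best accuracy, 1-indexed epoch at which it is first reached)."""
    best = max(history)
    return best, history.index(best) + 1


for name, history in val_accuracy.items():
    best, epoch = summarize(history)
    print(f"{name}: best validation accuracy {best:.2f}% at epoch {epoch}")
```

The same dictionary could be passed to `plt.plot` once the full histories are available, with one line per model.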

In [None]:
### BEGIN SOLUTION ###
 

### END SOLUTION ###