<a href="https://colab.research.google.com/github/elhamod/BA865-2024/blob/main/hands-on/First_Pytorch_NN_solution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to your first PyTorch Neural Net!

## Things we will investigate:

- How to load and pre-process the data.
- How to construct an MLP.
- How to train an MLP (Loss and optimization).
- How to utilize a GPU.
- How the complexity of the model affects its performance.
- How to measure the performance of the model.
- The effects of hyper-parameters:
  - Learning rate.
  - Optimizer.
  - Batch size.
- How to use WandB.
- Using SCC.


## Import some packages

In [47]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

This checks whether a GPU is available:

In [48]:
torch.cuda.is_available()

False
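Rather than hard-coding `.cuda()`, a common pattern is to pick the device once and move both the model and the data to it, so the same code runs with or without a GPU. The following is a minimal sketch: the tiny `nn.Linear` layer and the random batch are stand-ins for illustration, not part of this notebook's model.

```python
import torch
import torch.nn as nn

# Use the GPU when one is present, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model and batch (assumptions for illustration only).
layer = nn.Linear(4, 2).to(device)
batch = torch.randn(3, 4).to(device)

# Model and data live on the same device, so the forward pass works either way.
out = layer(batch)
print(out.device.type, out.shape)
```

With this pattern, the later `.cuda()` calls could be written as `.to(device)` instead.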

Some extra fancy but optional packages:

- `torchmetrics` for calculating accuracy
- `wandb` for logging

In [49]:
# !pip install -U torchmetrics

In [50]:
# !pip install wandb -qU
# import wandb
# wandb.login()

## Hyper-parameters

Define your hyper-parameters here.

In [51]:
# Hyperparameters

# Data
input_size = 28 * 28  # MNIST images are 28x28
output_size = 10  # 10 classes for the digits 0-9
batch_size = 64

# MLP
hidden_size = 128

# Optimization
learning_rate = 0.001
epochs = 3

## Data

Load your dataset and create `DataLoaders` that handle the batching and shuffling.

In [52]:
# Transformations
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

# MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='./data', train=False, transform=transform, download=True)

# Data loaders
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)
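To see what the transform and the loaders do without downloading anything, here is a small sketch on synthetic data. The tensors `fake_images` and `fake_labels` and the name `demo_loader` are made up for illustration. The sketch assumes that `Normalize((0.5,), (0.5,))` computes `(x - 0.5) / 0.5` per pixel, which maps the `[0, 1]` range produced by `ToTensor()` to `[-1, 1]`.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 1,000 fake 1x28x28 "images" in [0, 1] with labels 0-9.
fake_images = torch.rand(1000, 1, 28, 28)
fake_labels = torch.randint(0, 10, (1000,))

# Same arithmetic as Normalize((0.5,), (0.5,)): values move from [0, 1] to [-1, 1].
normalized = (fake_images - 0.5) / 0.5

demo_loader = DataLoader(TensorDataset(normalized, fake_labels), batch_size=64, shuffle=True)

# Each iteration yields one batch of images and the matching labels.
batch_images, batch_labels = next(iter(demo_loader))
print(batch_images.shape)  # torch.Size([64, 1, 28, 28])
print(batch_labels.shape)  # torch.Size([64])

# The last batch is smaller (drop_last defaults to False), so the count rounds up.
num_batches = len(demo_loader)
print(num_batches)  # 16

# Same rule for the 60,000-image MNIST training set with batch_size = 64.
mnist_batches = math.ceil(60000 / 64)
print(mnist_batches)  # 938
```

The 938 batches per epoch are why the training log further down prints batch indices up to 900.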

## Define and create your model

In [53]:
# MLP model
class MLP(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    # Defines the forward pass.
    def forward(self, x):
        # Flatten each image into a vector without relying on the global `input_size`.
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Could also be written as this:
# class MLP(nn.Module):
#     def __init__(self, input_size, hidden_size, output_size):
#         super(MLP, self).__init__()
#         self.model = nn.Sequential(
#             nn.Linear(input_size, hidden_size),
#             nn.ReLU(),
#             nn.Linear(hidden_size, output_size)
#         )

#     def forward(self, x):
#         x = x.view(x.size(0), -1)
#         x = self.model(x)
#         return x


Appending `.cuda()` moves your model to the GPU (this requires a CUDA-capable device).

In [54]:
model = MLP(input_size, hidden_size, output_size)#.cuda()
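One simple way to quantify the complexity of a model is to count its trainable parameters. The sketch below rebuilds the same layer sizes (784 → 128 → 10) as an `nn.Sequential` so that it runs on its own; `demo_model` and `num_params` are names chosen for illustration.

```python
import torch.nn as nn

# Same layer sizes as the MLP in this notebook: 784 -> 128 -> 10.
demo_model = nn.Sequential(
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Each Linear layer holds in_features * out_features weights plus out_features biases.
num_params = sum(p.numel() for p in demo_model.parameters() if p.requires_grad)
print(num_params)  # 101770 = (784 * 128 + 128) + (128 * 10 + 10)
```

Changing `hidden_size` changes this count, which is a convenient number to record when comparing models of different complexity.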

## Loss

For classification, we use cross-entropy.

In [55]:
criterion = nn.CrossEntropyLoss()
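`nn.CrossEntropyLoss` expects raw scores (logits) and integer class labels; it applies log-softmax internally, which is why the model has no softmax layer at the end. The sketch below checks this by hand on random logits (the tensors are made up for illustration) and shows the loss of an uninformed prediction.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw scores for 4 samples and 10 classes
targets = torch.tensor([3, 0, 9, 1])  # integer class labels

loss = nn.CrossEntropyLoss()(logits, targets)

# By hand: log-softmax, then the mean negative log-probability of the true class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), targets].mean()
print(torch.allclose(loss, manual))  # True

# Equal logits mean a uniform guess over 10 classes, giving a loss of ln(10).
uniform_loss = nn.CrossEntropyLoss()(torch.zeros(4, 10), targets)
print(uniform_loss.item(), math.log(10))
```

Since ln(10) ≈ 2.30, a loss near that value on the very first batch means the untrained model is essentially guessing.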

## Optimizer

In [56]:
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
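To compare optimizers, it can help to select one by name so that only a single setting changes between runs. The helper `make_optimizer` below is hypothetical (it is not part of PyTorch or of this notebook), and the momentum value of 0.9 for SGD is an assumed, commonly used default.

```python
import torch.nn as nn
import torch.optim as optim

def make_optimizer(name, params, lr):
    """Hypothetical helper: build an optimizer from a short name."""
    if name == "adam":
        return optim.Adam(params, lr=lr)
    if name == "sgd":
        # momentum=0.9 is an assumed choice, not a value from this notebook.
        return optim.SGD(params, lr=lr, momentum=0.9)
    raise ValueError(f"Unknown optimizer: {name}")

demo_model = nn.Linear(4, 2)  # stand-in for the MLP
demo_optimizer = make_optimizer("sgd", demo_model.parameters(), lr=0.01)
print(type(demo_optimizer).__name__)  # SGD
```

Note that a learning rate that works well for Adam is often not a good choice for SGD, so the two usually need to be tuned separately.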

## Training!

In [57]:
# wandb.init(
#     # Set the project where this run will be logged
#     project="First PyTorch NN",
#     # We pass a run name (otherwise it’ll be randomly assigned, like sunshine-lollypop-10)
#     name="experiment_2",
#     # Track hyperparameters and run metadata
#     config={
#     "learning_rate": learning_rate,
#     "epochs": epochs,
#     "notes for me": "This is a very lovely experiment. please work!"
#     })

In [58]:
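# import torchmetrics
# Define the accuracy metric (recent torchmetrics versions require the task to be specified)
# train_accuracy = torchmetrics.Accuracy(task="multiclass", num_classes=output_size)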


# Training loop
for epoch in range(epochs): # The epochs.
    train_acc = 0  # Reset the count of correct predictions at the start of each epoch.
    for i, (images, labels) in enumerate(train_loader): # The batches.
        # step 1: Zero out the gradients.
        optimizer.zero_grad()

        # step 1.1: Move data to cuda. Make sure the model is on cuda too!
        # images = images.cuda()
        # labels = labels.cuda()

        # step 2: Forward pass
        outputs = model(images)

        # step 3: calculate the loss.
        loss = criterion(outputs, labels)

        # step 4: Backward pass (compute gradients), then update the weights.
        loss.backward()
        optimizer.step()

        # step 5: (optional) calculate accuracy
        # train_accuracy.update(outputs, labels)
        train_acc = train_acc + torch.sum(torch.argmax(outputs, axis=1) == labels)

        # Print the loss
        if i % 100 == 0:
            print("Epoch", epoch + 1, " batch", i, ": ", loss.item())

    # Compute total train accuracy
    # train_acc = train_accuracy.compute()
    # train_accuracy.reset()
    train_acc = train_acc/len(train_dataset)

    print(f'Epoch [{epoch + 1}/{epochs}], Train Accuracy: {train_acc.item():.4f}')
    # wandb.log({"train_accuracy": train_acc, "loss": loss})



Epoch 1  batch 0 :  2.3279385566711426
Epoch 1  batch 100 :  0.4409348964691162
Epoch 1  batch 200 :  0.2814260423183441
Epoch 1  batch 300 :  0.22777119278907776
Epoch 1  batch 400 :  0.4258238673210144
Epoch 1  batch 500 :  0.2658941447734833
Epoch 1  batch 600 :  0.3383975028991699
Epoch 1  batch 700 :  0.14792561531066895
Epoch 1  batch 800 :  0.2226632982492447
Epoch 1  batch 900 :  0.1953068971633911
Epoch [1/3], Train Accuracy: 0.8880
Epoch 2  batch 0 :  0.3055928349494934
Epoch 2  batch 100 :  0.3321334421634674
Epoch 2  batch 200 :  0.17917245626449585
Epoch 2  batch 300 :  0.24423947930335999
Epoch 2  batch 400 :  0.36964285373687744
Epoch 2  batch 500 :  0.1722712516784668
Epoch 2  batch 600 :  0.224585622549057
Epoch 2  batch 700 :  0.16651535034179688
Epoch 2  batch 800 :  0.32655808329582214
Epoch 2  batch 900 :  0.10728760063648224
Epoch [2/3], Train Accuracy: 0.9406
Epoch 3  batch 0 :  0.06462714821100235
Epoch 3  batch 100 :  0.06736205518245697
Epoch 3  batch 200 :  0

## Test

In [62]:
# test_accuracy = torchmetrics.Accuracy(task="multiclass", num_classes=output_size)

# Test the model
test_acc = 0
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)

        # Compute test accuracy
        # test_accuracy.update(outputs, labels)
        test_acc = test_acc + torch.sum(torch.argmax(outputs, axis=1) == labels)

    test_acc = test_acc/len(test_dataset)
    # test_acc = test_accuracy.compute()
    # test_accuracy.reset()

    print(f'Test Accuracy: {test_acc:.4f}')
    # wandb.summary['test_accuracy'] = test_acc



Test Accuracy: 0.9565
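The evaluation loop can also be wrapped in a reusable function. The sketch below is one way to do it: `evaluate` is a hypothetical helper name, and the three-sample dataset and identity-weight model exist only to make the result easy to verify. It also calls `model.eval()`, which makes no difference for the MLP in this notebook (no dropout or batch norm) but is a safe habit.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader):
    """Hypothetical helper: fraction of correctly classified samples in `loader`."""
    model.eval()  # inference behavior for layers such as dropout or batch norm
    correct, total = 0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, labels in loader:
            preds = model(inputs).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

# Toy check: one-hot inputs and identity weights, so every prediction is correct.
demo_inputs = torch.eye(3)
demo_labels = torch.tensor([0, 1, 2])
demo_model = nn.Linear(3, 3, bias=False)
with torch.no_grad():
    demo_model.weight.copy_(torch.eye(3))

demo_loader = DataLoader(TensorDataset(demo_inputs, demo_labels), batch_size=2)
demo_acc = evaluate(demo_model, demo_loader)
print(demo_acc)  # 1.0
```

Accumulating plain Python numbers with `.item()` keeps the result a `float`, which is convenient for printing and logging.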


In [60]:
# wandb.finish()