# PyTorch Tutorial on Neural Networks

## Working with data
This Jupyter Notebook follows the PyTorch quickstart tutorial using the FashionMNIST dataset. The tutorial is available [here](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html).

Open-source text, computer vision and audio datasets are available in the TorchText, TorchVision and TorchAudio libraries.

In [None]:
# Adding imports
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

In [None]:
# Downloading training data from built-in open datasets
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data 
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

The ```Dataset``` is passed as an argument to ```DataLoader```, which wraps it in an iterable that supports batching, sampling, shuffling and data loading. Each element of the iterable is a batch of features and their labels. Here the batch size is 64, so each batch holds 64 samples (the final batch may be smaller).

In [None]:
batch_size = 64

# Create data loaders for training and testing
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break
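
As a rough illustration of the batching described above, the sketch below wraps a small synthetic dataset in a ```DataLoader```. The random ```TensorDataset``` and the names `toy_data` and `toy_loader` are assumptions made for illustration; they stand in for FashionMNIST so the example runs without a download.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for FashionMNIST: 150 random 1x28x28 "images" with labels 0-9
features = torch.rand(150, 1, 28, 28)
labels = torch.randint(0, 10, (150,))
toy_data = TensorDataset(features, labels)

# shuffle=True reorders the samples at the start of every pass over the loader
toy_loader = DataLoader(toy_data, batch_size=64, shuffle=True)

batch_shapes = [tuple(X.shape) for X, y in toy_loader]
num_batches = len(toy_loader)

print(num_batches)   # number of batches: ceil(150 / 64)
print(batch_shapes)  # the last batch holds the leftover 150 - 2 * 64 samples
```

Shuffling is usually applied to the training loader only; the loaders in this notebook keep the default `shuffle=False`.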

## Creating Models

To define a neural network in PyTorch, we create a class that inherits from ```nn.Module```. The layers are defined in the ```__init__``` method, and the ```forward``` method describes how data passes through the network.

To accelerate computation, the model is moved to a GPU (CUDA) or to the MPS backend when one is available; otherwise it runs on the CPU.

In [None]:
# Choose between cpu, gpu or mps device for training
device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

In [None]:
# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)
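
A quick way to sanity-check an architecture like this one is to pass a dummy batch through it and count its parameters. The sketch below assumes the same layer sizes as ```NeuralNetwork``` but writes them as a plain ```nn.Sequential``` so that it is self-contained; `stack` and `dummy_batch` are hypothetical names.

```python
import torch
from torch import nn

# Same layers as the NeuralNetwork class, written as a plain nn.Sequential
# (an assumption made only to keep this sketch self-contained)
stack = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Hypothetical batch of 4 random grayscale 28x28 images
dummy_batch = torch.rand(4, 1, 28, 28)
logits = stack(dummy_batch)

# Each Linear layer has in_features * out_features weights plus out_features biases
num_params = sum(p.numel() for p in stack.parameters())

print(logits.shape)  # one row of 10 class scores per image
print(num_params)
```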

## Optimizing the Model Parameters
To train a model, we need a loss function and an optimizer.
- A ***loss function*** quantifies the difference between the model's predicted outputs and the actual target values. Here we use the [*Cross Entropy loss function*](https://www.datacamp.com/tutorial/the-cross-entropy-loss-function-in-machine-learning).
- An ***optimizer*** defines how the model parameters are updated to reduce the value of the loss function. Here we use [*Stochastic Gradient Descent* (SGD)](https://www.geeksforgeeks.org/ml-stochastic-gradient-descent-sgd/).

In [None]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
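
To make the loss function more concrete, the following sketch compares ```nn.CrossEntropyLoss``` with a manual computation on a tiny example. The logits and targets are made-up values chosen for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Hypothetical logits for 2 samples and 3 classes, with their true class indices
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Built-in loss: expects raw logits, not probabilities
builtin_loss = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, pick the log-probability of the true class,
# negate it, and average over the batch
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(len(targets)), targets].mean()

print(builtin_loss.item(), manual_loss.item())
```

Because the loss applies log-softmax internally, the model above can return raw logits without a final softmax layer.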

In a single training loop, the model makes predictions on the training dataset, which is fed to it in batches, and backpropagates the prediction error computed by the loss function. The optimizer then uses the resulting gradients to adjust the model's parameters.

In [None]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X,y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Computes prediction error with the loss function
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation and optimization
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"Loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
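
The three calls in the backpropagation step can be checked on a tiny model. The sketch below assumes plain SGD without momentum or weight decay, for which each parameter is updated as `parameter - lr * gradient`; the layer size, data, and learning rate are hypothetical.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(3, 2)  # hypothetical tiny model
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
X = torch.rand(5, 3)
y = torch.randint(0, 2, (5,))

loss = nn.CrossEntropyLoss()(layer(X), y)
loss.backward()  # fills .grad on every parameter

# Expected result of plain SGD: parameter - lr * gradient
expected_weight = layer.weight.detach() - 0.1 * layer.weight.grad

optimizer.step()       # applies the update in place
optimizer.zero_grad()  # clears gradients before the next batch

step_matches = torch.allclose(layer.weight.detach(), expected_weight)
# Depending on the PyTorch version, zero_grad() sets gradients to None or to zeros
grads_cleared = layer.weight.grad is None or bool((layer.weight.grad == 0).all())
print(step_matches, grads_cleared)
```

Without the `zero_grad()` call, gradients from successive batches would accumulate in `.grad`.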

In [None]:
# Check the performance against the test dataset to evaluate learning
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, "
          f"Avg loss: {test_loss:>8f}\n")

Training runs over a number of iterations called *epochs*. During each epoch, the model updates its parameters to make more accurate predictions. The accuracy and loss are printed at each epoch; ideally, accuracy increases and loss decreases from one epoch to the next.

In [None]:
epochs = 50
for t in range(epochs):
    print(f"Epoch {t+1}\n-----------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

## Saving and Loading Models
A common way to save a model is to serialize the internal state dictionary (containing the model parameters).

In [None]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

The process for loading a model includes re-creating the model structure and loading the state dictionary into it.

In [None]:
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))
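
The save-and-load round trip can be verified on a small model. This is a minimal sketch: the ```nn.Linear``` layer stands in for ```NeuralNetwork```, and the file name `toy_model.pth` is hypothetical. Passing `map_location` to `torch.load` is optional; it remaps tensors that were saved from a different device.

```python
import os
import tempfile

import torch
from torch import nn

original = nn.Linear(4, 2)  # hypothetical small model standing in for NeuralNetwork

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pth")  # hypothetical file name
    torch.save(original.state_dict(), path)

    # Re-create the same architecture, then load the saved parameters into it
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# Compare every saved tensor with its restored counterpart
params_match = all(
    torch.equal(a, b)
    for a, b in zip(original.state_dict().values(), restored.state_dict().values())
)
print(params_match)
```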

## Making predictions
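The model can now be used to make predictions. The loop below classifies every sample in the test dataset, prints each misclassification, and counts the misses.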

In [None]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
miss_count = 0
with torch.no_grad():
    for i in range(len(test_data)):
        x, y = test_data[i]
        x = x.to(device)
        pred = model(x)
        predicted, actual = classes[pred[0].argmax(0)], classes[y]
        if predicted != actual:
            miss_count += 1
            print(f"Predicted: {predicted}, Actual: {actual}")

print(f"Miss count: {miss_count}")
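
The model outputs raw logits, so a confidence value is not directly available from `pred`. One possible way to obtain one is to apply softmax, as sketched below. The untrained `toy_model` and the random input are assumptions for illustration, so the predicted class itself is meaningless here.

```python
import torch
from torch import nn

classes = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
           "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical untrained stand-in for the model, so its predictions are meaningless
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
toy_model.eval()

x = torch.rand(1, 1, 28, 28)  # one random "image" with an explicit batch dimension
with torch.no_grad():
    logits = toy_model(x)
    probs = torch.softmax(logits, dim=1)  # each row now sums to 1

# Highest probability and the index of the corresponding class
confidence, index = probs[0].max(dim=0)
print(f"Predicted: {classes[index.item()]} ({confidence.item():.1%})")
```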