# PyTorch Getting Started Example
Basic usage of the PyTorch framework, using the FashionMNIST dataset to classify images. Includes some extra notes and code by me for clarity.
https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html

### Imports

In [52]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

### Download training and test data from open datasets.
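The `torchvision` package that accompanies PyTorch provides popular datasets, ready for trying out ML algorithms. The `ToTensor()` transform converts each image to a float tensor with values scaled to [0, 1].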

In [53]:

training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

### Dataloaders
DataLoaders wrap a dataset and serve it in batches. The batches are only randomized if `shuffle=True` is passed; the loaders below use the default, unshuffled order.

In [54]:
batch_size = 64 # number of examples per batch => 938 batches for the 60,000 training images

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
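The number of batches per epoch follows from the dataset size and the batch size. The sketch below checks the arithmetic with a synthetic `TensorDataset` of 60,000 dummy rows (an assumption made so that no download is needed); only the counts matter here, not the data.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

num_examples = 60000  # size of the FashionMNIST training split
batch_size = 64

# Synthetic stand-in dataset: one dummy feature and one dummy label per example.
features = torch.zeros(num_examples, 1)
labels = torch.zeros(num_examples, dtype=torch.int64)
loader = DataLoader(TensorDataset(features, labels), batch_size=batch_size)

num_batches = len(loader)
last_batch_size = num_examples - (num_batches - 1) * batch_size

print(num_batches, math.ceil(num_examples / batch_size))
print(last_batch_size)  # the final batch is smaller than batch_size
```

Because 60,000 is not a multiple of 64, the last batch is a partial one. Passing `drop_last=True` to `DataLoader` would discard it instead.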


### Create the Actual Model

In [55]:
# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

Using cuda device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
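To get a feel for the size of this network, the sketch below rebuilds the same layer stack as a standalone `nn.Sequential` on the CPU (the names `stack` and `dummy_batch` are illustrative only) and counts its trainable parameters. Each `Linear` layer contributes `in_features * out_features` weights plus `out_features` biases.

```python
import torch
from torch import nn

# Same layer sizes as the printed model above, rebuilt standalone on the CPU.
stack = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# (784*512 + 512) + (512*512 + 512) + (512*10 + 10)
total_params = sum(p.numel() for p in stack.parameters())

# A random batch with the same shape as one FashionMNIST batch.
dummy_batch = torch.rand(64, 1, 28, 28)
logits = stack(dummy_batch)

print(total_params)
print(logits.shape)  # one row of 10 class scores per input image
```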


### Specify the Loss Function and Optimizer for Training
The Fashion MNIST dataset has 10 classes.  Since this is a multiclass classification problem, we select cross-entropy loss.  The model outputs one raw score (logit) per class; `nn.CrossEntropyLoss` applies softmax internally to turn these scores into probabilities and penalizes a low probability for the correct class.  For prediction, choose the class with the highest score.

Explanation of multiclass classification: https://machinelearningmastery.com/types-of-classification-in-machine-learning/

In [56]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
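A small worked example may make the loss more concrete. The logits below are made-up scores for a 3-class problem, not model output. The sketch checks that `nn.CrossEntropyLoss` on raw logits matches the negative log of the softmax probability assigned to the correct class.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Made-up scores for one example and three classes; class 0 is the true label.
logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([0])

probs = F.softmax(logits, dim=1)       # probabilities that sum to 1
predicted_class = probs.argmax(dim=1)  # same result as logits.argmax(dim=1)

loss = nn.CrossEntropyLoss()(logits, target)
manual_loss = -torch.log(probs[0, target[0]])

print(probs)
print(predicted_class.item())
print(loss.item(), manual_loss.item())
```

Since softmax preserves the ordering of the scores, taking `argmax` directly on the logits gives the same predicted class.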

### Set up the Training Loop with a Function
This is where the model parameters are adjusted with gradient descent. The function performs one gradient descent step per batch served by the DataLoader, using the loss averaged over the examples in that batch.

In [57]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
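The loop can be sketched on a toy problem to show that the optimizer takes one step per batch, and that a plain SGD step amounts to `weight - lr * grad`. Everything here (the random data, the tiny `nn.Linear` model, the MSE loss, the learning rate) is an assumption chosen to keep the example small, not part of the tutorial.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X = torch.randn(10, 3)
y = torch.randn(10, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=4)  # batches of 4, 4 and 2

model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()
lr = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

steps = 0
all_steps_match = True
for xb, yb in loader:
    loss = loss_fn(model(xb), yb)
    optimizer.zero_grad()  # clear gradients left over from the previous batch
    loss.backward()        # compute gradients for this batch

    # What plain SGD (no momentum, no weight decay) should produce.
    expected_weight = model.weight.detach() - lr * model.weight.grad
    optimizer.step()
    steps += 1
    all_steps_match &= torch.allclose(model.weight.detach(), expected_weight)

print(steps, len(loader))
print(all_steps_match)
```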

### Set up the Test Loop with a Function
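The model is put in evaluation mode with `model.eval()` and gradient tracking is turned off with `torch.no_grad()`, so no parameters are updated; then the test data is used to run predictions.  This checks whether the model can make accurate predictions on unseen data.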

In [58]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
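The two switches used in the test function do different jobs, which the sketch below tries to show on a toy model. The model with a `Dropout` layer is hypothetical and chosen only because dropout behaves differently in training and evaluation mode. `model.eval()` changes layer behavior, while `torch.no_grad()` stops gradient tracking; neither permanently freezes the parameters.

```python
import torch
from torch import nn

# Hypothetical toy model; Dropout is random in train mode, a no-op in eval mode.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))
x = torch.randn(3, 4)

model.eval()           # switch layers such as Dropout to evaluation behavior
with torch.no_grad():  # do not record operations for backpropagation
    out_a = model(x)
    out_b = model(x)

params_still_trainable = all(p.requires_grad for p in model.parameters())

print(out_a.requires_grad)       # outputs carry no gradient history
print(torch.equal(out_a, out_b)) # dropout is disabled, so results repeat
print(params_still_trainable)    # parameters remain trainable afterwards
```

Calling `model.train()` again, as the training function does, restores training behavior for the next epoch.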

### Run the Training and Test Loops
Each epoch repeats the training and testing with the DataLoaders. Note that the DataLoaders above were created without ```shuffle=True```, so the batches arrive in the same order every epoch.  Passing ```shuffle=True``` to the training DataLoader would reshuffle the data each epoch, which increases the diversity of the batches.
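The effect of ```shuffle=True``` can be sketched with a tiny dataset holding the integers 0 to 7. The dataset, the helper `epoch_order`, and the seeded `generator` are assumptions made for this illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.arange(8))

ordered = DataLoader(data, batch_size=4)
shuffled = DataLoader(
    data,
    batch_size=4,
    shuffle=True,
    generator=torch.Generator().manual_seed(0),  # seeded for repeatability
)

def epoch_order(loader):
    """Return the items in the order the loader serves them during one epoch."""
    return [int(v) for (batch,) in loader for v in batch]

ordered_epochs = [epoch_order(ordered) for _ in range(2)]
shuffled_epochs = [epoch_order(shuffled) for _ in range(2)]

print(ordered_epochs)   # identical order in both epochs
print(shuffled_epochs)  # a fresh permutation is drawn each epoch
```

In both cases every example is served exactly once per epoch; shuffling only changes the order and therefore which examples share a batch.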

In [59]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 2.312949  [    0/60000]
loss: 2.299469  [ 6400/60000]
loss: 2.276719  [12800/60000]
loss: 2.271442  [19200/60000]
loss: 2.259832  [25600/60000]
loss: 2.217210  [32000/60000]
loss: 2.234184  [38400/60000]
loss: 2.193502  [44800/60000]
loss: 2.198310  [51200/60000]
loss: 2.182774  [57600/60000]
Test Error: Accuracy: 48.0%, Avg loss: 2.162542 

Epoch 2
-------------------------------
loss: 2.165981  [    0/60000]
loss: 2.162425  [ 6400/60000]
loss: 2.103024  [12800/60000]
loss: 2.125799  [19200/60000]
loss: 2.077706  [25600/60000]
loss: 2.009519  [32000/60000]
loss: 2.049284  [38400/60000]
loss: 1.963391  [44800/60000]
loss: 1.976336  [51200/60000]
loss: 1.929137  [57600/60000]
Test Error: Accuracy: 55.9%, Avg loss: 1.903950 

Epoch 3
-------------------------------
loss: 1.924218  [    0/60000]
loss: 1.906695  [ 6400/60000]
loss: 1.784367  [12800/60000]
