# Python + PyTorch Starter Notebook

Notes from my adventure setting up a Jupyter Notebook for PyTorch on my Mac.

## Does python work?

Select the cell below and press Ctrl + Enter to execute it. I set up conda and had to configure the kernel first (Ctrl-Shift-P, then "Create Jupyter").

In [9]:
1+1

2

## Is PyTorch installed and available in this kernel?

Run this next:

In [1]:
import torch
x = torch.rand(5, 3)
print(x)

tensor([[0.2874, 0.2790, 0.6920],
        [0.2934, 0.1805, 0.2458],
        [0.2753, 0.2174, 0.9389],
        [0.3444, 0.9865, 0.2724],
        [0.0278, 0.8962, 0.5216]])


If you see tensor output like the above, it's working! Your numbers will differ, since `torch.rand` draws fresh random values each time.

## Using PyTorch

Let's get some data, but first, more imports.

In [2]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

In [3]:
# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 26421880/26421880 [00:02<00:00, 9153128.79it/s] 


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 29515/29515 [00:00<00:00, 238018.56it/s]


Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 4422102/4422102 [00:00<00:00, 4579545.49it/s]


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 5148/5148 [00:00<00:00, 5715266.54it/s]

Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw






Now we have some data to use for training. Pass each dataset to a `DataLoader`, which wraps it in an iterable that yields batches and can also shuffle the samples for you (pass `shuffle=True`; the loaders below use the default, which keeps the original order).

In [4]:
batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
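To get a feel for what the loader does, here is a rough plain-Python sketch of batching with optional shuffling. The `iter_batches` helper is hypothetical (it is not part of PyTorch) and only mimics the idea; the real `DataLoader` also stacks samples into tensors and can load data in worker processes.

```python
import random


def iter_batches(n_samples, batch_size, shuffle=False, seed=0):
    """Yield lists of sample indices, one list per batch (hypothetical helper)."""
    indices = list(range(n_samples))
    if shuffle:
        # Reorder the indices so the data is visited in a different order.
        random.Random(seed).shuffle(indices)
    for start in range(0, n_samples, batch_size):
        yield indices[start:start + batch_size]


# Same sizes as this notebook: 60000 training samples, batches of 64.
batches = list(iter_batches(60000, 64))
print(len(batches))      # 938 batches in one pass over the training set
print(len(batches[0]))   # 64 samples in a full batch
print(len(batches[-1]))  # 32 samples left over for the final batch
```

That is why the leading dimension of `X` above is 64: it is the batch size, not a property of the images.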


## Creating a Model

Let's create the model we will use.  The model is an object that represents the architecture of the network.  How many layers?  What kind of layers?  How do we move forward?

After we create an instance of the model, we send it `to` the device it will run on: CUDA if you have a capable GPU, MPS on Apple silicon (my MacBook uses this), and CPU as the fallback. Performance generally gets worse as you move down that list, but it still works.

In [6]:
# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

Using mps device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
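As a sanity check on the printed architecture, the number of trainable parameters can be worked out by hand. This is a small plain-Python sketch; `linear_param_count` is a made-up helper name, and the layer sizes are copied from the printout above.

```python
# Layer sizes copied from the printed model: (in_features, out_features).
layers = [(28 * 28, 512), (512, 512), (512, 10)]


def linear_param_count(in_features, out_features):
    """Weights plus one bias per output unit for a fully connected layer."""
    return in_features * out_features + out_features


counts = [linear_param_count(i, o) for i, o in layers]
total = sum(counts)
print(counts)  # [401920, 262656, 5130]
print(total)   # 669706
```

In the notebook itself, `sum(p.numel() for p in model.parameters())` should report the same total. The `ReLU` and `Flatten` layers contribute nothing because they have no weights.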


## Optimizing the model's Parameters

Training needs two more things: a loss function, which measures how far the model's predictions are from the true labels, and an optimizer, which adjusts the model's parameters to reduce that loss. I need to understand these better, but you're basically selecting from a set of math functions based on how you expect them to perform with the rest of your model, then experimenting to find what works best with your data.

In [8]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
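To make the loss a little less mysterious, here is a plain-Python sketch of cross-entropy for a single sample: the negative log of the softmax probability the model assigns to the correct class. The `cross_entropy` function and the logits are made up for illustration; `nn.CrossEntropyLoss` does the same thing on tensors and, by default, averages over the batch.

```python
import math


def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Ten classes, all logits equal: the model is guessing uniformly.
uniform = cross_entropy([0.0] * 10, target=3)
# Same setup, but the logit for the correct class (index 3) is much larger.
confident = cross_entropy([0.0, 0.0, 0.0, 5.0] + [0.0] * 6, target=3)

print(f"uniform guess:  {uniform:.4f}")   # equals ln(10), about 2.3026
print(f"confident hit:  {confident:.4f}")  # much smaller loss
```

A ten-class model that knows nothing should therefore start with a loss near 2.3, which is roughly where the first training losses further down begin.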

## Training

This function makes one pass over the training data: for each batch, the model makes predictions, measures the loss, and updates its weights accordingly. We'll actually run it soon, when we write the training loop.

In [10]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
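The `zero_grad` / `backward` / `step` trio boils down to "compute the gradient of the loss, then nudge each weight against it". Below is a toy plain-Python sketch with a single made-up parameter and a hand-written gradient; `sgd_step` is a hypothetical helper, not PyTorch's optimizer, and the learning rate is larger than the notebook's `1e-3` so the toy converges quickly.

```python
def sgd_step(w, grad, lr):
    """One plain SGD update: move against the gradient, scaled by lr."""
    return w - lr * grad


w = 0.0   # a single made-up parameter
lr = 0.1  # toy learning rate (assumption; the notebook uses 1e-3)
for _ in range(50):
    grad = 2 * (w - 3.0)  # derivative of the toy loss (w - 3)^2
    w = sgd_step(w, grad, lr)

print(round(w, 4))  # ends up very close to 3.0, the minimum of the toy loss
```

In the real loop, `loss.backward()` computes the gradient for every parameter automatically, and `optimizer.step()` applies this kind of update to all of them at once.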
            

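## Check our progress

The other function we need for our training loop is a test function. It runs the model over the test set without updating any weights and reports the accuracy and average loss. There's a bunch in here I need to read more about.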

In [12]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
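The accuracy line is the densest part of that function, so here is a plain-Python sketch of the same idea on a tiny made-up batch: take the index of the largest logit per sample, compare it with the label, and count the matches. The `argmax` helper, logits, and labels are all invented for illustration.

```python
def argmax(values):
    """Index of the largest value, like pred.argmax(1) does for each row."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up logits for four samples over three classes, with their true labels.
batch_logits = [
    [2.0, 0.1, -1.0],
    [0.3, 0.2, 1.5],
    [0.0, 0.9, 0.4],
    [1.2, 1.1, 0.0],
]
labels = [0, 2, 2, 0]

predictions = [argmax(row) for row in batch_logits]
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

print(predictions)                         # [0, 2, 1, 0]
print(f"Accuracy: {100 * accuracy:.1f}%")  # Accuracy: 75.0%
```

Note the two different denominators in `test`: the correct count is divided by the number of samples, while the summed loss is divided by the number of batches, because each `loss_fn` call already returns a per-batch average.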

## The Training loop!

Here goes:

In [13]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 2.174369  [   64/60000]
loss: 2.169693  [ 6464/60000]
loss: 2.111092  [12864/60000]
loss: 2.127666  [19264/60000]
loss: 2.080482  [25664/60000]
loss: 2.021308  [32064/60000]
loss: 2.042222  [38464/60000]
loss: 1.968672  [44864/60000]
loss: 1.985266  [51264/60000]
loss: 1.898694  [57664/60000]
Test Error: 
 Accuracy: 58.8%, Avg loss: 1.905262 

Epoch 2
-------------------------------
loss: 1.929222  [   64/60000]
loss: 1.909338  [ 6464/60000]
loss: 1.789787  [12864/60000]
loss: 1.837290  [19264/60000]
loss: 1.726349  [25664/60000]
loss: 1.672010  [32064/60000]
loss: 1.695709  [38464/60000]
loss: 1.593846  [44864/60000]
loss: 1.628344  [51264/60000]
loss: 1.508476  [57664/60000]
Test Error: 
 Accuracy: 62.1%, Avg loss: 1.534667 

Epoch 3
-------------------------------
loss: 1.591206  [   64/60000]
loss: 1.564437  [ 6464/60000]
loss: 1.414400  [12864/60000]
loss: 1.489172  [19264/60000]
loss: 1.367419  [25664/60000]
loss: 1.360624  [32064/600

## Wicked cool - let's save it

We've got a model. It's not great, so there's plenty of room to learn and improve the existing code. Let's save the model:

In [14]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Saved PyTorch Model State to model.pth


It saved the file next to this notebook file; it shows up in the explorer panel.

## Loading the model

We can load an existing, saved model from disk like so:

In [15]:
model = NeuralNetwork()
model.load_state_dict(torch.load("model.pth"))

<All keys matched successfully>

In [18]:
# Next let's use it, with gradient tracking disabled.

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

Predicted: "Ankle boot", Actual: "Ankle boot"
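The model's output is a row of raw logits, not probabilities. If you want to see how confident a prediction is, softmax turns the logits into values that sum to 1. This is a plain-Python sketch with made-up logits (they are not the model's real output); in the notebook, `torch.softmax(pred, dim=1)` would do the same on the actual tensor.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max before exponentiating, for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


classes = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Made-up logits for one image; the last entry (Ankle boot) is the largest.
logits = [-1.2, -2.0, -0.5, -1.0, -0.8, 1.1, -0.3, 1.4, 0.2, 2.3]
probs = softmax(logits)
best = max(range(len(probs)), key=lambda i: probs[i])

print(f'Predicted: "{classes[best]}" with probability {probs[best]:.2f}')
```

Softmax keeps the ordering of the logits, so the `argmax` pick is the same either way; it just adds a confidence number.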
