<a href="https://colab.research.google.com/github/lucken99/DataScience_Notes/blob/main/pytorch_quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

[Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)

# Working with data
PyTorch has two primitives for working with data:
`torch.utils.data.DataLoader` and
`torch.utils.data.Dataset`

`Dataset` stores the samples and their corresponding labels, and
`DataLoader` wraps an iterable around the `Dataset`.
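
To make the division of labour concrete, here is a minimal sketch of a custom `Dataset` wrapped in a `DataLoader`. The class `SquaresDataset` and its data are hypothetical and exist only for illustration; a map-style dataset is assumed to need just `__len__` and `__getitem__`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the number i, its label is i squared."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32)
        self.y = self.x ** 2

    def __len__(self):
        # Number of samples; DataLoader uses this to plan its batches.
        return len(self.x)

    def __getitem__(self, idx):
        # Return one (sample, label) pair.
        return self.x[idx], self.y[idx]


dataset = SquaresDataset(10)
loader = DataLoader(dataset, batch_size=4)

# Iterating the loader yields batches of samples and labels.
batch_shapes = [tuple(xb.shape) for xb, yb in loader]
print(len(dataset), batch_shapes)  # 10 [(4,), (4,), (2,)]
```

The `Dataset` only knows how to return one sample at a time; grouping samples into batches is left entirely to the `DataLoader`.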

In [1]:
import torch
from torch import nn
from torch.utils.data import Dataset
from torch.utils.data import DataLoader


PyTorch offers domain-specific libraries such as

`torchtext` <br>
`torchvision` <br>
`torchaudio` <br>

all of which include datasets.

In [2]:
from torchvision import datasets
from torchvision.transforms import ToTensor

The `torchvision.datasets` module contains `Dataset` objects for many real-world vision datasets, such as <br>
**CIFAR**,
**COCO**,
**FashionMNIST**, [(full list here)](https://pytorch.org/vision/stable/datasets.html)

<br>

Every TorchVision `Dataset` accepts two arguments: <br>
`transform` and `target_transform`, which modify the samples and the labels respectively.
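
As a rough sketch of what a `target_transform` might look like, the hypothetical function `one_hot` below turns an integer class index into a one-hot vector. It is purely illustrative: the training code in this notebook uses the integer labels as they are and does not need it.

```python
import torch


def one_hot(label, num_classes=10):
    """Hypothetical target_transform: integer class index -> one-hot float vector."""
    vec = torch.zeros(num_classes, dtype=torch.float)
    vec[label] = 1.0
    return vec


encoded = one_hot(9)
print(encoded)

# It would be passed alongside `transform`, for example:
# datasets.FashionMNIST(root="data", train=True, download=True,
#                       transform=ToTensor(), target_transform=one_hot)
```

The dataset calls `transform` on each sample and `target_transform` on each label when an item is fetched, so any callable with one argument can be used.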

In [3]:
## !rm -r /content/data

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 26421880/26421880 [00:01<00:00, 19123751.99it/s]


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 29515/29515 [00:00<00:00, 289608.48it/s]


Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 4422102/4422102 [00:00<00:00, 5450166.10it/s]


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 5148/5148 [00:00<00:00, 21876673.75it/s]

Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw






We pass the `Dataset` as an argument to `DataLoader`. <br>
This wraps an iterable over our dataset and supports automatic **batching, sampling, shuffling, and multiprocess data loading**.
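
The batching and shuffling behaviour can be seen on a toy example. This is a small sketch with made-up data; `TensorDataset` stands in for a real dataset, and the sizes are chosen so that the last batch is incomplete.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten toy samples with two features each, labelled 0..9 (made-up data).
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10)
toy_data = TensorDataset(features, labels)

# shuffle=True reshuffles the sample order at every epoch;
# the final batch is smaller because 10 is not a multiple of 4.
loader = DataLoader(toy_data, batch_size=4, shuffle=True)
sizes = [len(yb) for _, yb in loader]
print(len(loader), sizes)  # 3 [4, 4, 2]

# drop_last=True discards the incomplete final batch.
full_only = DataLoader(toy_data, batch_size=4, shuffle=True, drop_last=True)
print(len(full_only))  # 2
```

Shuffling is usually applied to the training loader only, since the order of samples does not affect evaluation metrics.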

In [4]:
# Here we define a batch size of 64, i.e. each element
# in the dataloader iterable will return a batch of 64 features and labels.
batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64


# Creating Models
To define a neural network in PyTorch, we create a class that inherits from `nn.Module`. <br>

We define the layers of the network in the `__init__` function and specify how data will pass through the network in the `forward` function.
<br>
To accelerate operations in the neural network, we move it to a GPU (CUDA) or to MPS if one is available; otherwise it stays on the CPU.


In [5]:
# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

Using cpu device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
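
As a sanity check on the printed architecture, the sketch below pushes a fake batch through the same layer layout and counts the learnable parameters. It rebuilds the layers as a plain `nn.Sequential` (the name `net` is an assumption made to keep the snippet self-contained) and uses random input rather than real images.

```python
import torch
from torch import nn

# Same layer layout as NeuralNetwork above, written as a plain Sequential.
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# A fake batch of 64 single-channel 28x28 images (random values, not real data).
fake_batch = torch.rand(64, 1, 28, 28)
logits = net(fake_batch)
print(logits.shape)  # torch.Size([64, 10])

# Each Linear layer holds in_features * out_features weights plus out_features biases.
num_params = sum(p.numel() for p in net.parameters())
print(num_params)  # 669706
```

The output has one row per image and one column per class; these raw scores (logits) are what the loss function consumes during training.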


# Optimizing the Model Parameters

To train a model, we need a **loss function** and an **optimizer**.
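
For intuition about the loss function used in this notebook, the following sketch reproduces `nn.CrossEntropyLoss` by hand on made-up logits and targets. It assumes the default settings (mean reduction, integer class indices as targets).

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # made-up scores for 4 samples, 10 classes
targets = torch.tensor([0, 3, 9, 1])  # made-up class indices

loss = nn.CrossEntropyLoss()(logits, targets)

# The same value by hand: log-softmax, pick the log-probability of the
# true class for each sample, negate, and average over the batch.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), targets].mean()

print(loss.item(), manual.item())
```

Because the softmax is applied inside the loss, the model itself returns raw logits and has no softmax layer at the end.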

In [6]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)


In a single training loop, the model makes predictions on the training dataset (fed to it in batches) and backpropagates the prediction error to adjust the model's parameters.
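
To show what one such parameter adjustment amounts to, here is a minimal sketch of a single step on a tiny stand-in model. The layer, data, learning rate, and MSE loss are all assumptions chosen for brevity; the point is only that plain SGD (no momentum or weight decay) moves each parameter by `-lr * grad`.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)                      # tiny stand-in model
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

X = torch.randn(8, 3)                        # made-up inputs
y = torch.randn(8, 1)                        # made-up targets

loss = nn.functional.mse_loss(layer(X), y)
loss.backward()                              # fills .grad on each parameter

# What plain SGD is expected to do: p <- p - lr * grad
expected_weight = layer.weight.detach() - 0.1 * layer.weight.grad

optimizer.step()                             # apply the update
assert torch.allclose(layer.weight.detach(), expected_weight)

# Gradients accumulate across backward() calls, so they are cleared each step.
optimizer.zero_grad()
grads_cleared = all(
    p.grad is None or not p.grad.any() for p in layer.parameters()
)
print(grads_cleared)  # True
```

Without the `zero_grad()` call, the next `backward()` would add its gradients on top of the old ones and the following update would use the sum.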



In [7]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")




We also check the model's performance against the test dataset to ensure it is learning.
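
The accuracy calculation used for this check can be illustrated on a handful of made-up logits. The tensors below are invented for the example; the expression for `correct` is the same one the test function uses.

```python
import torch

# Made-up logits for 4 samples and 3 classes.
pred = torch.tensor([
    [2.0, 0.1, 0.3],   # argmax -> 0
    [0.2, 1.5, 0.1],   # argmax -> 1
    [0.3, 0.2, 0.9],   # argmax -> 2
    [1.1, 0.4, 0.2],   # argmax -> 0
])
y = torch.tensor([0, 1, 1, 0])  # made-up true labels

# argmax(1) picks the highest-scoring class per row; comparing with y
# gives a boolean tensor whose sum is the number of correct predictions.
correct = (pred.argmax(1) == y).type(torch.float).sum().item()
accuracy = correct / len(y)
print(f"Accuracy: {100 * accuracy:.1f}%")  # Accuracy: 75.0%
```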

In [8]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

The training process is conducted over several iterations (epochs). During each epoch, the model learns parameters to make better predictions.

In [9]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n--------------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
--------------------------------------
loss: 2.306051 [   64/60000]
loss: 2.299984 [ 6464/60000]
loss: 2.277940 [12864/60000]
loss: 2.263947 [19264/60000]
loss: 2.251173 [25664/60000]
loss: 2.202307 [32064/60000]
loss: 2.216742 [38464/60000]
loss: 2.178794 [44864/60000]
loss: 2.180527 [51264/60000]
loss: 2.136618 [57664/60000]
Test Error: 
 Accuracy: 41.7%, Avg loss: 2.141427 

Epoch 2
--------------------------------------
loss: 2.152002 [   64/60000]
loss: 2.145747 [ 6464/60000]
loss: 2.086812 [12864/60000]
loss: 2.096680 [19264/60000]
loss: 2.042767 [25664/60000]
loss: 1.970874 [32064/60000]
loss: 2.001888 [38464/60000]
loss: 1.918114 [44864/60000]
loss: 1.921557 [51264/60000]
loss: 1.843231 [57664/60000]
Test Error: 
 Accuracy: 53.2%, Avg loss: 1.849501 

Epoch 3
--------------------------------------
loss: 1.880788 [   64/60000]
loss: 1.853776 [ 6464/60000]
loss: 1.740224 [12864/60000]
loss: 1.774919 [19264/60000]
loss: 1.663085 [25664/60000]
loss: 1.613996 [32064/60000]
l

# Saving Models
[Saving and loading Models](https://pytorch.org/tutorials/beginner/saving_loading_models.html?highlight=pth%20tar)

A common way to save a model is to serialize the internal state dictionary (containing the model parameters).

In PyTorch, the learnable parameters (i.e. weights and biases) of a `torch.nn.Module` model are contained in the model's parameters (accessed with `model.parameters()`).

A `state_dict` is simply a Python dictionary object that maps the name of each parameter (and persistent buffer) to its tensor.
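
To see what such a dictionary contains, the sketch below lists the entries for a small stand-in model (not the notebook's `NeuralNetwork`; the layer sizes are arbitrary).

```python
from torch import nn

# Small stand-in model with arbitrary layer sizes.
tiny = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

# Keys have the form "<module name>.<parameter name>"; ReLU has no
# parameters, so index 1 does not appear.
shapes = {name: tuple(t.shape) for name, t in tiny.state_dict().items()}
for name, shape in shapes.items():
    print(name, shape)
# 0.weight (3, 4)
# 0.bias (3,)
# 2.weight (2, 3)
# 2.bias (2,)
```

Only tensors are stored, not the Python class, which is why the model structure has to be re-created in code before the state is loaded back.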

In [10]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Saved PyTorch Model State to model.pth


# Loading Models

The process for loading a model includes re-creating the model structure and loading the state dictionary into it.
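
A quick way to convince yourself that the round trip preserves the model is to compare outputs before and after. The sketch below uses a small stand-in architecture and an in-memory buffer instead of a file; both are assumptions made only to keep the example self-contained.

```python
import io

import torch
from torch import nn


def make_model():
    # Stand-in architecture; both copies must be built the same way.
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))


original = make_model()

# Save to an in-memory buffer rather than to disk.
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)
buffer.seek(0)

# A fresh instance starts with its own random weights ...
restored = make_model()
# ... until the saved state dictionary is loaded into it.
restored.load_state_dict(torch.load(buffer))
restored.eval()

x = torch.rand(5, 4)
with torch.no_grad():
    same = torch.allclose(original(x), restored(x))
print(same)  # True
```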

In [11]:
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))

<All keys matched successfully>

We can now make predictions using this model.

In [16]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f"Predicted: '{predicted}', Actual: '{actual}'")

Predicted: 'Ankle boot', Actual: 'Ankle boot'
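
A single sample from the dataset has shape `[1, 28, 28]`, so in the cell above `nn.Flatten` effectively treats the channel dimension as the batch dimension. A slightly more explicit variant adds a batch dimension first and converts the logits to probabilities. The sketch below assumes an untrained stand-in classifier and a random image, so the predicted class itself is meaningless.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Stand-in classifier with the same input and output sizes as the notebook's model.
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
net.eval()

image = torch.rand(1, 28, 28)        # fake [C, H, W] image, like test_data[0][0]
batch = image.unsqueeze(0)           # add a batch dimension -> [1, 1, 28, 28]

with torch.no_grad():
    logits = net(batch)              # shape [1, 10]
    probs = torch.softmax(logits, dim=1)

# Highest probability and the index of the corresponding class.
top_prob, top_class = probs[0].max(dim=0)
print(tuple(batch.shape), top_class.item(), round(top_prob.item(), 3))
```

Softmax does not change which class wins, so `argmax` on the logits and on the probabilities gives the same prediction; the probabilities are only useful when a confidence value is wanted.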
