# Quickstart

This notebook runs through the PyTorch API for common machine learning tasks.

### Working with data

PyTorch has two primitives for working with data:
- torch.utils.data.DataLoader
- torch.utils.data.Dataset

Dataset stores the samples and their corresponding labels. DataLoader wraps an iterable around the Dataset.
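
To make the division of labour concrete, here is a minimal sketch of a map-style Dataset wrapped in a DataLoader. `ToyDataset` and its random contents are hypothetical and exist only for illustration; the real dataset used in this notebook is loaded further down.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical map-style dataset: random 'images' with integer labels."""

    def __init__(self, num_samples=10):
        self.samples = torch.rand(num_samples, 1, 28, 28)  # fake image data
        self.labels = torch.arange(num_samples) % 10       # labels in 0..9

    def __len__(self):
        # Number of samples the dataset can serve
        return len(self.samples)

    def __getitem__(self, idx):
        # One (sample, label) pair; the DataLoader stacks these into batches
        return self.samples[idx], self.labels[idx]


loader = DataLoader(ToyDataset(), batch_size=4)
batch_shapes = [tuple(X.shape) for X, y in loader]
print(batch_shapes)  # [(4, 1, 28, 28), (4, 1, 28, 28), (2, 1, 28, 28)]
```

The Dataset only answers "how many samples?" and "what is sample i?"; batching is handled entirely by the DataLoader.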

In [2]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio, all of which include datasets. 

For this tutorial, we will be using a TorchVision dataset.

The torchvision.datasets module contains Dataset objects for many real-world vision datasets, such as CIFAR and COCO (full list: https://pytorch.org/vision/stable/datasets.html).

In this tutorial, we use the FashionMNIST dataset. Every TorchVision Dataset accepts two arguments, transform and target_transform, to modify the samples and labels respectively.
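
As an illustration of what a target_transform can look like, the sketch below converts an integer class label into a one-hot vector. The function name `to_one_hot` is hypothetical, and the notebook itself does not use a target_transform.

```python
import torch
import torch.nn.functional as F


def to_one_hot(label, num_classes=10):
    """Hypothetical target_transform: integer class label -> one-hot float vector."""
    return F.one_hot(torch.tensor(label), num_classes=num_classes).float()


encoded = to_one_hot(9)
print(encoded)  # a length-10 vector with a single 1.0 at index 9
```

A callable like this could be passed as `target_transform=to_one_hot` when constructing the dataset. The loss function used later in this notebook is given integer labels, so the transform is not needed here.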

In [3]:
# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████████████████████| 26421880/26421880 [00:03<00:00, 7083733.89it/s]


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|█████████████████████████████████| 29515/29515 [00:00<00:00, 264919.19it/s]


Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|█████████████████████████████| 4422102/4422102 [00:11<00:00, 375195.82it/s]


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████████████████████████████| 5148/5148 [00:00<00:00, 5290927.96it/s]

Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw






Pass the Dataset as an argument to DataLoader. This wraps an iterable over our dataset, and supports automatic batching, sampling, shuffling and multiprocess data loading. 

Here we define a batch size of 64, i.e. each element in the dataloader iterable will return a batch of 64 features and labels.
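
The batch size also determines how many batches one pass over the data takes. The arithmetic below is a small sketch that assumes the DataLoader default of keeping a final, smaller batch (`drop_last=False`) and the standard FashionMNIST split sizes of 60,000 training and 10,000 test samples. `batch_layout` is a hypothetical helper.

```python
import math


def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the final batch) when no batch is dropped."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch


train_layout = batch_layout(60000, 64)
test_layout = batch_layout(10000, 64)
print(train_layout)  # (938, 32)
print(test_layout)   # (157, 16)
```

So not every batch holds exactly 64 samples: the last one in each split is smaller.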

In [4]:
batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64


N: This represents the number of samples or examples in the batch. In the context of image data, N would be the batch size, which is the number of images processed together during one iteration of training or inference.

C: This represents the number of channels in the image. For grayscale images, C would be 1, indicating a single channel. For color images represented in RGB format, C would be 3, representing the three color channels (red, green, blue).

H: This represents the height of the image in pixels.

W: This represents the width of the image in pixels.
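
The same layout can be inspected on any tensor. The sketch below uses a zero-filled stand-in for one batch rather than real FashionMNIST data.

```python
import torch

# Stand-in for one batch: 64 grayscale images of 28x28 pixels
batch = torch.zeros(64, 1, 28, 28)

n, c, h, w = batch.shape
single_image_shape = tuple(batch[0].shape)  # indexing the first axis removes N
features_per_image = c * h * w              # values per image once flattened

print(n, c, h, w)           # 64 1 28 28
print(single_image_shape)   # (1, 28, 28)
print(features_per_image)   # 784
```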

### Creating Models

In [5]:
# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

Using mps device


In [6]:
# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)


The code above defines a neural network model using the nn.Module class and prints a summary of the model architecture.

1. Define the Neural Network Class (NeuralNetwork):

    The NeuralNetwork class is defined, which is a subclass of nn.Module, the base class for all neural network modules in PyTorch.

    The __init__ method is called when an instance of the class is created. Inside __init__, the layers of the neural network are defined.
    

2. Define Layers in the __init__ Method:

    - The __init__ method initializes the neural network layers:
        - nn.Flatten(): This layer flattens every dimension except the first (batch) dimension. Here it turns each 1x28x28 input image into a vector of 784 values, so a batch of shape [N, 1, 28, 28] becomes [N, 784].
        - nn.Sequential(): This is a container that sequentially applies a list of layers.
        - nn.Linear(): This defines fully connected (dense) layers. It takes the size of the input and output as parameters.
        - nn.ReLU(): This is the rectified linear unit (ReLU) activation function, applied after each linear layer except the last one.
    - The neural network consists of three linear layers with ReLU activation functions between them.
    - The first linear layer takes 28x28 input features (the size of the flattened image) and outputs 512 features.
    - The second linear layer takes 512 input features and outputs 512 features; the third takes 512 input features and outputs 10.
    - The output of the last linear layer (with 10 output features) represents the logits for each class (in this case, the FashionMNIST dataset has 10 classes).


3. Define the forward Method:

    - The forward method defines the forward pass of the neural network.
    - It takes an input tensor x and passes it through the layers defined in the __init__ method.
    - First, the input tensor is flattened using nn.Flatten().
    - Then, the flattened tensor is passed through the sequential layer defined in the __init__ method.
    - The output of the last linear layer (logits) is returned.
    
    
4. Instantiate the Model and Move to Device (model = NeuralNetwork().to(device)):

    - An instance of the NeuralNetwork class is created.
    - It is then moved to the device specified earlier (CPU, GPU, or MPS) using the to method.
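
The shapes described above can be checked directly. The sketch below rebuilds the same layer stack as a plain nn.Sequential (an illustrative stand-in, not the NeuralNetwork class itself), pushes a random batch through it on the CPU, and counts the trainable parameters.

```python
import torch
from torch import nn

# Same layers as the model above, written as a single Sequential for brevity
stack = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

x = torch.rand(3, 1, 28, 28)                # a random batch of 3 "images"
flat_shape = tuple(nn.Flatten()(x).shape)   # batch dimension is preserved
logits_shape = tuple(stack(x).shape)        # one logit per class per sample

# Each Linear layer holds in_features * out_features weights plus out_features biases
num_params = sum(p.numel() for p in stack.parameters())

print(flat_shape)    # (3, 784)
print(logits_shape)  # (3, 10)
print(num_params)    # 669706
```

The parameter count breaks down as 784*512 + 512, 512*512 + 512, and 512*10 + 10 for the three linear layers.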

### Optimizing the Model Parameters

To train a model, we need a loss function and an optimizer.

In [7]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
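
nn.CrossEntropyLoss applies a log-softmax to the logits and then takes the negative log-likelihood of the target class, averaged over the batch by default. The pure-Python sketch below reproduces that computation for a single sample; `cross_entropy` is a hypothetical helper written for illustration, not the library implementation.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target class for one sample."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]


# Equal logits for 10 classes: the loss is ln(10), about 2.3026
uniform_loss = cross_entropy([0.0] * 10, target=3)

# A large logit on the correct class drives the loss toward zero
confident_loss = cross_entropy([0.0] * 9 + [10.0], target=9)

print(round(uniform_loss, 4))
print(confident_loss < 0.01)
```

This explains why the first losses logged in the training output further down sit close to 2.3: an untrained 10-class model predicts roughly uniformly.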

In a single training loop, the model makes predictions on the training dataset (fed to it in batches) and backpropagates the prediction error to adjust the model's parameters.

In [8]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

1. Function Signature:
    The function train takes four parameters:
    dataloader: The PyTorch DataLoader object containing the training data.
    model: The neural network model to be trained.
    loss_fn: The loss function used to compute the loss.
    optimizer: The optimizer used to update the model parameters.


2. Set Model to Training Mode:
    model.train() sets the model to training mode. This is important because certain layers behave differently during training and evaluation (e.g., dropout layers).


3. Training Loop:
    The function iterates over the batches yielded by the dataloader; enumerate supplies the batch index used for logging.


4. Device Placement:
    X, y = X.to(device), y.to(device): This line moves the input data X and target labels y to the specified device (device).


5. Forward Pass:
    pred = model(X): Performs a forward pass through the neural network model to compute predictions (pred) for the input data X.


6. Compute Loss:
    loss = loss_fn(pred, y): Computes the loss between the predicted values (pred) and the actual target labels (y) using the specified loss function (loss_fn).


7. Backpropagation and Optimization:
    loss.backward(): Computes the gradients of the loss with respect to the model parameters via backpropagation.
    optimizer.step(): Updates the model parameters using the gradients computed during the backward pass.
    optimizer.zero_grad(): Resets the gradients so that they do not carry over into the next iteration. This is necessary because PyTorch accumulates gradients by default.


8. Logging Training Progress:
    if batch % 100 == 0: Logs the training progress every 100 batches.
    loss, current = loss.item(), (batch + 1) * len(X): Extracts the current loss value as a Python float and computes the number of samples processed so far.
    print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]"): Prints the current loss and the progress in terms of the number of samples processed out of the total dataset size.


In summary, the train function encapsulates the training process for the neural network model. It iterates over batches of training data, computes predictions, calculates the loss, performs backpropagation, and updates the model parameters using the specified optimizer. Additionally, it logs the training progress to monitor the performance of the model during training.
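
The need for optimizer.zero_grad() is easiest to see on a single scalar parameter. The sketch below is a toy setup (one parameter `w` and a learning rate of 0.1, both chosen only for illustration) showing gradients accumulating across two backward passes, the SGD update that follows, and the reset.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# First backward pass: d(3*w)/dw = 3
(3 * w).backward()
first_grad = w.grad.item()

# Second backward pass without a reset: the new gradient is added to the old one
(3 * w).backward()
accumulated_grad = w.grad.item()

# SGD update: w <- w - lr * grad = 1.0 - 0.1 * 6.0, i.e. approximately 0.4
optimizer.step()
updated_w = w.item()

# Reset; depending on the PyTorch version the gradient becomes None or zero
optimizer.zero_grad()
grad_cleared = w.grad is None or w.grad.item() == 0.0

print(first_grad, accumulated_grad)  # 3.0 6.0
print(grad_cleared)                  # True
```

Without the reset, every step would use the sum of all gradients computed so far rather than the gradient of the current batch.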

We also check the model’s performance against the test dataset to ensure it is learning.

In [9]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

The test function is responsible for evaluating the performance of a trained neural network model on a test dataset:


1. Function Signature:
    The function test takes three parameters:
    dataloader: The PyTorch DataLoader object containing the test data.
    model: The trained neural network model to be evaluated.
    loss_fn: The loss function used to compute the loss.


2. Initialization:
    size = len(dataloader.dataset): Computes the total size of the test dataset.
    num_batches = len(dataloader): Computes the number of batches in the test dataset.


3. Set Model to Evaluation Mode:
    model.eval(): Sets the model to evaluation mode. This switches layers that behave differently during training and evaluation, such as dropout and batch normalization, to their inference behavior.


4. Initialize Variables for Evaluation:

    test_loss, correct = 0, 0: Initializes variables to accumulate the total test loss and the number of correctly predicted samples.

5. Evaluation Loop:
    The function iterates over batches of data from the test dataloader.
    with torch.no_grad(): This context manager ensures that gradients are not computed during the evaluation process. This reduces memory consumption and speeds up computation.


6. Device Placement and Prediction:

    X, y = X.to(device), y.to(device): Moves the input data X and target labels y to the specified device (device).
    pred = model(X): Performs a forward pass through the neural network model to compute predictions (pred) for the input data X.


7. Compute Test Loss and Accuracy:

    test_loss += loss_fn(pred, y).item(): Computes the test loss for the current batch and accumulates it.
    correct += (pred.argmax(1) == y).type(torch.float).sum().item(): Computes the number of correctly predicted samples in the current batch and accumulates it.


8. Compute Average Test Loss and Accuracy:

    test_loss /= num_batches: Computes the average test loss across all batches.
    correct /= size: Computes the accuracy as the ratio of correctly predicted samples to the total number of samples.


9. Print Evaluation Results:

    Prints the test error, including accuracy and average loss.


In summary, the test function evaluates the trained neural network model on a test dataset, computing the test loss and accuracy. It iterates over batches of test data, computes predictions, and accumulates metrics for evaluation. Finally, it prints out the evaluation results, including the accuracy and average loss on the test dataset.
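
The accuracy bookkeeping can be traced on a handful of made-up values. The logits and labels below are hypothetical (four samples, three classes) and were chosen so that exactly one prediction is wrong.

```python
import torch

# Hypothetical logits for 4 samples and 3 classes
logits = torch.tensor([
    [2.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [1.0, 0.0, 4.0],
    [5.0, 0.0, 0.0],
])
labels = torch.tensor([0, 1, 1, 0])

predicted = logits.argmax(1)  # index of the largest logit in each row
correct = (predicted == labels).type(torch.float).sum().item()
accuracy = correct / len(labels)

print(predicted.tolist())  # [0, 1, 2, 0]
print(correct)             # 3.0
print(accuracy)            # 0.75
```

Note that the test function divides the accumulated loss by the number of batches, because each loss value is already a batch average, but divides the correct count by the number of samples.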

The training process is conducted over several iterations (epochs). During each epoch, the model learns parameters to make better predictions. We print the model’s accuracy and loss at each epoch; we’d like to see the accuracy increase and the loss decrease with every epoch.

In [10]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 2.291703  [   64/60000]
loss: 2.291132  [ 6464/60000]
loss: 2.264377  [12864/60000]
loss: 2.266026  [19264/60000]
loss: 2.250588  [25664/60000]
loss: 2.209043  [32064/60000]
loss: 2.234445  [38464/60000]
loss: 2.184725  [44864/60000]
loss: 2.185010  [51264/60000]
loss: 2.160828  [57664/60000]
Test Error: 
 Accuracy: 38.7%, Avg loss: 2.148675 

Epoch 2
-------------------------------
loss: 2.151424  [   64/60000]
loss: 2.156168  [ 6464/60000]
loss: 2.085038  [12864/60000]
loss: 2.106302  [19264/60000]
loss: 2.063388  [25664/60000]
loss: 1.989804  [32064/60000]
loss: 2.042598  [38464/60000]
loss: 1.946138  [44864/60000]
loss: 1.952910  [51264/60000]
loss: 1.897489  [57664/60000]
Test Error: 
 Accuracy: 57.0%, Avg loss: 1.879922 

Epoch 3
-------------------------------
loss: 1.905232  [   64/60000]
loss: 1.893374  [ 6464/60000]
loss: 1.751979  [12864/60000]
loss: 1.800064  [19264/60000]
loss: 1.705885  [25664/60000]
loss: 1.640492  [32064/600

### Saving Models

A common way to save a model is to serialize its internal state dictionary, which contains the model parameters.

In [18]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Saved PyTorch Model State to model.pth


### Loading Models

The process for loading a model includes re-creating the model structure and loading the state dictionary into it.
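
The save and load steps can be tried end to end on a small stand-in model. The sketch below assumes a hypothetical `make_model` helper and writes to an in-memory buffer instead of a file, so it leaves nothing on disk.

```python
import io

import torch
from torch import nn


def make_model():
    """Hypothetical tiny model; the structure must be re-created before loading."""
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))


source = make_model()
keys = list(source.state_dict().keys())  # one entry per parameter tensor

# Serialize the state dictionary into an in-memory buffer
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# A fresh model starts with different random weights until the state is loaded
restored = make_model()
restored.load_state_dict(torch.load(buffer))

x = torch.rand(5, 4)
same_output = torch.equal(source(x), restored(x))

print(keys)         # ['0.weight', '0.bias', '2.weight', '2.bias']
print(same_output)  # True
```

Only the parameter tensors are stored; the ReLU layer has none, which is why index 1 is absent from the keys.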

In [19]:
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))

<All keys matched successfully>

The model can now be used to make predictions.

In [33]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

Predicted: "Pullover", Actual: "Ankle boot"


1. Define Classes:

    The classes list contains the names of the classes corresponding to the FashionMNIST dataset. Each index in the list corresponds to a class label.


2. Set Model to Evaluation Mode:

    model.eval() sets the model to evaluation mode. This switches layers that behave differently during training and evaluation, such as dropout and batch normalization, to their inference behavior.


3. Perform Inference:

    x, y = test_data[0][0], test_data[0][1] retrieves the first sample (x) and its corresponding label (y) from the test dataset.
    
    with torch.no_grad(): is a context manager that disables gradient calculation during inference. This reduces memory consumption and speeds up computation.
    
    x = x.to(device) moves the input tensor x to the specified device (device).
    
    pred = model(x) performs inference by passing the input tensor through the model, obtaining the predicted output.
    
    predicted, actual = classes[pred[0].argmax(0)], classes[y] retrieves the class names corresponding to the predicted class (determined by the index of the maximum value in the predicted tensor) and the actual class label (y).
    
    print(f'Predicted: "{predicted}", Actual: "{actual}"') prints out the predicted class and the actual class of the sample.


In summary, this code snippet demonstrates how to use a trained neural network model to make a prediction on a single sample from the test dataset and print the predicted class alongside the actual class. Checking individual samples like this is a quick sanity check; the accuracy computed over the whole test set remains the more reliable measure of performance.
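
If a confidence value is wanted alongside the class name, the logits can be passed through a softmax. The sketch below uses hypothetical logits shaped [1, 10], matching the output shape of the model for a single sample; the values are made up for illustration.

```python
import torch

classes = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Hypothetical logits for one sample (batch of size 1, 10 classes)
pred = torch.tensor([[0.1, 0.0, 0.3, 0.2, 0.1, 1.5, 0.0, 2.0, 0.4, 4.0]])

probs = torch.softmax(pred[0], dim=0)  # non-negative values that sum to 1
best = probs.argmax(0).item()          # same index as pred[0].argmax(0)
predicted = classes[best]

print(predicted)  # Ankle boot
print(f"confidence: {probs[best].item():.2f}")
```

Because softmax preserves the ordering of the logits, taking the argmax of the raw logits, as the cell above does, selects the same class.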