<h1> PyTorch in a nutshell </h1> 

PyTorch is a library built for working with tensors and is particularly well suited to building deep learning models for image recognition and language processing. As a Python library, it aims to be easy to use and to produce code that is generally very readable.
In this notebook we will follow the tutorial presented on the [PyTorch page](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html), which mainly focuses on computer vision applications.

The required packages can be installed in a conda environment with:

In [2]:
!conda install pytorch torchvision torchaudio -c pytorch

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



We can then check that the packages are correctly installed by importing them:

In [3]:
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
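
As a quick sanity check after the imports, the following minimal sketch prints the installed version and builds a small tensor from a NumPy array. The variable names are illustrative and not part of the tutorial.

```python
import numpy as np
import torch

print(torch.__version__)  # installed PyTorch version

# Build a 2x3 tensor from a NumPy array; from_numpy keeps the array's dtype.
array = np.arange(6, dtype=np.float32).reshape(2, 3)
tensor = torch.from_numpy(array)

# Element-wise arithmetic and a reduction work much as they do in NumPy.
doubled = tensor * 2
total = doubled.sum().item()
print(tensor.shape, tensor.dtype, total)
```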

<h2> Downloading a standard PyTorch dataset</h2> 

The `torchvision.datasets` module includes several training and test sets that can be downloaded. Following the PyTorch tutorial, we are going to download the _FashionMNIST_ dataset.

In [4]:
# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████████████████████| 26421880/26421880 [00:03<00:00, 7810381.73it/s]


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|█████████████████████████████████| 29515/29515 [00:00<00:00, 847434.20it/s]

Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz





Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|████████████████████████████| 4422102/4422102 [00:00<00:00, 4641692.13it/s]


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████████████████████████████| 5148/5148 [00:00<00:00, 4252122.29it/s]


Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw



We then need to wrap the data in dataloaders and check the shape of the input data:

In [5]:
batch_size = 64 # Each batch yielded by the dataloader contains batch_size features and labels

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for x, y in test_dataloader:
    print(f"Shape of x [N, C, H, W]: {x.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of x [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
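
The batching behaviour can be illustrated without downloading anything. The sketch below assumes a synthetic dataset of 150 blank images (an arbitrary number chosen for illustration) and shows how many batches a `DataLoader` yields and how large each one is.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image dataset: 150 samples shaped like FashionMNIST.
features = torch.zeros(150, 1, 28, 28)
labels = torch.zeros(150, dtype=torch.int64)
dataset = TensorDataset(features, labels)

loader = DataLoader(dataset, batch_size=64)

# Iterate once and record the size of every batch.
batch_sizes = [x.shape[0] for x, y in loader]
print(len(loader), batch_sizes)
```

The last batch is smaller whenever the dataset size is not a multiple of `batch_size`. `DataLoader` also accepts `shuffle=True` to reshuffle the samples at every epoch and `drop_last=True` to discard the incomplete final batch.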


<h2> Defining the neural network</h2> 

We can now define the neural network model. To begin, let's check whether CUDA is available, since training is usually much faster on a GPU.
If it is not, we fall back to MPS (on Apple hardware) or to the CPU.

In [7]:
# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available() # GPU optimization
    else "mps"
    if torch.backends.mps.is_available() # Optimization for mac
    else "cpu"
)
print(f"Using {device} device")



Using cpu device
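
Once a device has been chosen, tensors have to be placed on it explicitly. The following is a minimal sketch with illustrative variable names; for brevity it only distinguishes CUDA from CPU.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create a tensor directly on the device, and move an existing one there.
a = torch.ones(2, 3, device=device)
b = torch.zeros(2, 3).to(device)

# Both operands must live on the same device for this addition to work.
c = a + b
print(c.device)
```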


We can then define the NN class that will be trained on the data. Following the tutorial, we define it as follows:

In [8]:
# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__() # Initialize the nn.Module base class
        self.flatten = nn.Flatten() # Flatten each 1x28x28 image into a 784-element vector for the linear layers
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512), # Input layer: 28*28 input pixels, 512 is the chosen number of hidden nodes
            nn.ReLU(),
            nn.Linear(512, 512), # Hidden layer: 512 inputs mapped to 512 outputs
            nn.ReLU(),
            nn.Linear(512, 10) # Output layer: 512 inputs mapped to 10 outputs, one per class
        )

    def forward(self, x): # Forward pass: compute the logits for a batch of inputs
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
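
To get a feel for the size of this network, the sketch below rebuilds the same layer sizes as a plain `nn.Sequential` (an equivalent stand-in, not the class defined above), counts its trainable parameters and pushes a dummy batch through it.

```python
import torch
from torch import nn

# Same layer sizes as the network above, written as a plain nn.Sequential.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Each Linear layer holds in_features * out_features weights plus out_features biases.
n_params = sum(p.numel() for p in model.parameters())

# A dummy batch of 64 blank images should produce 64 rows of 10 logits.
dummy_batch = torch.zeros(64, 1, 28, 28)
logits = model(dummy_batch)
print(n_params, logits.shape)
```

The count works out to 784*512 + 512 + 512*512 + 512 + 512*10 + 10 = 669,706 parameters.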


<h2> Training the neural network given the input data</h2> 

In order to train the neural network, we first have to define a _loss function_ and an optimizer. 
Both are already available in PyTorch, and following the tutorial we will use:

In [9]:
loss_fn = nn.CrossEntropyLoss() # Cross-entropy between the predicted class distribution and the true labels
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3) # Stochastic gradient descent with a learning rate of 1e-3
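
The behaviour of `nn.CrossEntropyLoss` can be checked on a few hand-written logits. This is a sketch with made-up values: it compares the built-in loss with the mean negative log-probability of the true class, and evaluates the loss of a model that assigns equal scores to all 10 classes.

```python
import math
import torch
from torch import nn
import torch.nn.functional as F

loss_fn = nn.CrossEntropyLoss()

# Uniform logits over 10 classes: every class gets probability 1/10.
uniform_logits = torch.zeros(4, 10)
targets = torch.tensor([0, 3, 5, 9])
uniform_loss = loss_fn(uniform_logits, targets).item()

# The same kind of loss computed by hand: mean negative log-probability of the true class.
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(2), labels].mean().item()
builtin_loss = loss_fn(logits, labels).item()
print(uniform_loss, math.log(10), manual_loss, builtin_loss)
```

The uniform case gives ln(10), roughly 2.303, which is about the loss an untrained 10-class classifier should start from.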

In the training step we then make predictions on the training set and backpropagate the error to update the model parameters:

In [10]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
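
The roles of `optimizer.step()` and `optimizer.zero_grad()` can be seen on a toy problem. The sketch below uses a single hypothetical parameter vector `w` rather than the network, checks that plain SGD moves it by `-lr * gradient`, and shows that gradients add up when they are not reset.

```python
import torch

torch.manual_seed(0)
lr = 1e-3

# One parameter vector and one squared-error loss on a single example.
w = torch.randn(3, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])
optimizer = torch.optim.SGD([w], lr=lr)

loss = (w @ x - 1.0) ** 2
loss.backward()

# Plain SGD (no momentum, no weight decay) moves each parameter by -lr * gradient.
expected = (w - lr * w.grad).detach().clone()
optimizer.step()
step_matches = torch.allclose(w.detach(), expected)

# Gradients accumulate across backward() calls unless they are reset.
v = torch.ones(2, requires_grad=True)
(3 * v.sum()).backward()
first_grad = v.grad.clone()
(3 * v.sum()).backward()
accumulated = torch.equal(v.grad, 2 * first_grad)
print(step_matches, accumulated)
```

This accumulation is why the training loop clears the gradients once per batch.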

To detect overfitting on the training set, we also need to check the model's performance on the test set:

In [11]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
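
The accuracy computation inside `test` can be followed on a tiny example. The logits and labels below are made up for illustration: four samples, three classes, one of them misclassified.

```python
import torch

# Four samples, three classes: each row holds the logits for one sample.
pred = torch.tensor([
    [2.0, 0.1, 0.3],  # highest score for class 0
    [0.2, 1.5, 0.1],  # highest score for class 1
    [0.3, 0.2, 0.9],  # highest score for class 2
    [1.1, 0.4, 0.2],  # highest score for class 0
])
y = torch.tensor([0, 1, 2, 1])  # the last sample is misclassified

# argmax(1) picks the best class per row; comparing with y marks the correct ones.
predicted_classes = pred.argmax(1)
correct = (predicted_classes == y).type(torch.float).sum().item()
accuracy = correct / len(y)
print(predicted_classes.tolist(), accuracy)
```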

The learning process is run over several epochs. If the model is converging, we should see its accuracy improve after each training epoch:

In [12]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 2.291617  [   64/60000]
loss: 2.285248  [ 6464/60000]
loss: 2.264884  [12864/60000]
loss: 2.256673  [19264/60000]
loss: 2.240844  [25664/60000]
loss: 2.196738  [32064/60000]
loss: 2.208365  [38464/60000]
loss: 2.172820  [44864/60000]
loss: 2.161995  [51264/60000]
loss: 2.127469  [57664/60000]
Test Error: 
 Accuracy: 49.7%, Avg loss: 2.124474 

Epoch 2
-------------------------------
loss: 2.133316  [   64/60000]
loss: 2.124659  [ 6464/60000]
loss: 2.056557  [12864/60000]
loss: 2.067950  [19264/60000]
loss: 2.014289  [25664/60000]
loss: 1.951779  [32064/60000]
loss: 1.973492  [38464/60000]
loss: 1.899523  [44864/60000]
loss: 1.888290  [51264/60000]
loss: 1.811873  [57664/60000]
Test Error: 
 Accuracy: 61.8%, Avg loss: 1.815423 

Epoch 3
-------------------------------
loss: 1.852242  [   64/60000]
loss: 1.821786  [ 6464/60000]
loss: 1.693214  [12864/60000]
loss: 1.727600  [19264/60000]
loss: 1.619705  [25664/60000]
loss: 1.581096  [32064/600

We can now make predictions with our model:

In [16]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

Predicted: "Ankle boot", Actual: "Ankle boot"
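
A single test image has shape `[1, 28, 28]` rather than `[N, 1, 28, 28]`, yet the model accepts it because `nn.Flatten` keeps the first dimension and flattens the rest. The sketch below illustrates this with blank tensors, and also shows how logits can be turned into probabilities with a softmax (the logit values are made up).

```python
import torch
from torch import nn

flatten = nn.Flatten()  # default start_dim=1 keeps the first dimension untouched

single_image = torch.zeros(1, 28, 28)  # shape of test_data[0][0]: [C, H, W]
batch = torch.zeros(64, 1, 28, 28)     # shape of a dataloader batch: [N, C, H, W]

single_flat = flatten(single_image)    # the channel dimension plays the role of N
batch_flat = flatten(batch)

# Softmax turns one row of logits into probabilities without changing the argmax.
logits = torch.tensor([[1.0, 3.0, 0.5]])
probs = torch.softmax(logits, dim=1)
print(single_flat.shape, batch_flat.shape, probs.sum().item(), probs.argmax(1).item())
```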


<h2> Saving and loading pre-trained models </h2> 

Training a model properly generally takes a long time. Once training is done, we can therefore save the trained model and simply reload it when we need it.

To save the model we proceed as follows:

In [13]:
torch.save(model.state_dict(), "./Output/FashionMNISTModel.pth") # The ./Output directory must already exist
print("Saved PyTorch Model State to ./Output/FashionMNISTModel.pth")

Saved PyTorch Model State to ./Output/FashionMNISTModel.pth


The saved weights can then be loaded into a new instance of the model:

In [14]:
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("./Output/FashionMNISTModel.pth"))

<All keys matched successfully>
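
A save and load round trip can be verified end to end. The sketch below uses a small stand-in network and a temporary directory (both assumptions made for illustration, not the notebook's model or path) and checks that the restored model reproduces the original outputs.

```python
import os
import tempfile
import torch
from torch import nn

def make_model():
    # Small stand-in network; any nn.Module is saved the same way.
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

original = make_model()
x = torch.rand(2, 1, 28, 28)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pth")  # hypothetical file name
    torch.save(original.state_dict(), path)

    # A fresh instance has different random weights until the saved state is loaded.
    restored = make_model()
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# Switch both models to evaluation mode before comparing their predictions.
original.eval()
restored.eval()
with torch.no_grad():
    outputs_match = torch.allclose(original(x), restored(x))
print(outputs_match)
```

The `map_location` argument is useful when the weights were saved on a GPU machine and are loaded on a CPU-only one.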