# Transformations in PyTorch

Now we're going to run the same workflow in PyTorch, with the aim of using a neural network as our final model. The model will be a small fully connected network, so nothing fancy.

In [1]:
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
plt.rcParams['axes.axisbelow'] = True

import numpy as np

In [2]:
digits = load_digits()

In [3]:
digits.images.shape

(1797, 8, 8)

In [4]:
# Test train split
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(digits.images, digits.target, test_size=0.25, random_state=1337)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=1337)

print(y_train.shape, y_val.shape, y_test.shape)

(1010,) (337,) (450,)


### Model

Build a basic model with two linear layers, using cross-entropy loss and the Adam optimizer, all with default arguments.

In [5]:
import torch
import torch.nn as nn

In [81]:
class NeuralNetwork(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(8*8, hidden, dtype=torch.float64),
            nn.ReLU(),
            nn.Linear(hidden, 10, dtype=torch.float64)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork()

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=64, out_features=32, bias=True)
    (1): ReLU()
    (2): Linear(in_features=32, out_features=10, bias=True)
  )
)


In [None]:
out = model(torch.tensor(X_train[0], dtype=torch.float64).unsqueeze(0))
out

tensor([[ 0.1944, -1.2189, -0.1181,  2.3252,  0.9390,  0.0453, -0.6722, -0.5918,
         -0.3674,  4.4997]], dtype=torch.float64, grad_fn=<AddmmBackward0>)

Great, it all works. As a side note, checking the output of your neural network layers while you're building it is a good idea, just to make sure everything is working as intended.
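One way to make that check explicit is to assert on the output shape rather than eyeballing the tensor. This is a minimal sketch, assuming a stand-in model with the same layer sizes as above and random input in place of the digits data (`sketch_model` and `fake_batch` are illustrative names, not part of the notebook):

```python
import torch
import torch.nn as nn

# Stand-in for the network above: 64 inputs -> 32 hidden units -> 10 class logits.
sketch_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(8 * 8, 32, dtype=torch.float64),
    nn.ReLU(),
    nn.Linear(32, 10, dtype=torch.float64),
)

# A fake batch of 4 images, shaped like the digits data: (N, 8, 8).
fake_batch = torch.rand(4, 8, 8, dtype=torch.float64)
logits = sketch_model(fake_batch)

# Expect one row of 10 logits per image, in the same dtype as the weights.
assert logits.shape == (4, 10)
assert logits.dtype == torch.float64
```

A failing assertion here points at a shape or dtype mismatch much earlier than a cryptic error inside the training loop would.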

Now, we could just stuff all of our data through the network and train it, and this is fine for small datasets. But as your datasets get larger (which they will if you're using more complicated NN architectures), you don't want to implement things like batching by hand. Fortunately, PyTorch comes with some pretty great dataloaders that take care of this for you.
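To get a feel for what a dataloader does before writing a custom dataset, here is a minimal sketch using the built-in `TensorDataset` on random data. The array sizes and variable names are made up for illustration:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# 100 fake 8x8 "images" with integer labels in [0, 10).
fake_images = torch.rand(100, 8, 8, dtype=torch.float64)
fake_labels = torch.randint(0, 10, (100,))

loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=32, shuffle=True)

# The loader yields (images, labels) pairs; the last batch holds the remainder.
batch_sizes = [len(labels) for _, labels in loader]
print(batch_sizes)  # [32, 32, 32, 4]
```

Shuffling changes which samples land in each batch, but not the batch sizes.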

In addition, we also want to normalize our data, and convert everything to tensors. Let's try this now.

As with the custom sklearn classes, a custom `Dataset` is required to implement certain methods: `__init__`, `__len__`, and `__getitem__`.

In [74]:
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils

In [75]:
class DigitsDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        label = self.labels[idx]
        image = self.images[idx]
        if self.transform:
            image = self.transform(image)
        # as_tensor accepts arrays and tensors alike; unsqueeze adds a channel axis
        return torch.as_tensor(image).unsqueeze(0), label

PyTorch has no standard min-max scaler, so we can implement one ourselves. Note that this is overkill: the minimum pixel value of these images is 0, so simply computing `X / max(X)` has the same effect.

Again, there are certain methods we have to define. The `__call__` method plays roughly the role of `fit_transform`: it computes the minimum and maximum of each sample and rescales it in one step.

In [159]:
class MinMax(object):
    def __init__(self, feature_range=(0,1)):
        self.min = feature_range[0]
        self.max = feature_range[1]

    def __call__(self, sample):
        std = (sample - np.min(sample)) / (np.max(sample) - np.min(sample))
        sample_scaled = std * (self.max - self.min) + self.min

        return sample_scaled
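As a quick check of the shortcut mentioned earlier, the sketch below compares min-max scaling with plain division by the maximum. It assumes a made-up image whose smallest pixel is 0; `min_max` is a hypothetical function version of the class above:

```python
import numpy as np

# A made-up 8x8 image with pixel values in [0, 16].
rng = np.random.default_rng(0)
image = rng.integers(0, 17, size=(8, 8)).astype(np.float64)
image[0, 0] = 0.0   # pin the minimum to 0
image[0, 1] = 16.0  # pin the maximum to 16

def min_max(sample, feature_range=(0, 1)):
    """Scale a single sample into feature_range using its own min and max."""
    lo, hi = feature_range
    std = (sample - sample.min()) / (sample.max() - sample.min())
    return std * (hi - lo) + lo

scaled = min_max(image)
shortcut = image / image.max()

# With a minimum of 0 and the default range, the two agree.
assert np.allclose(scaled, shortcut)
```

Both versions divide by zero on a constant image, so a guard would be needed if such samples can occur.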

We can either feed this directly into our dataset, or chain multiple transformations together using `Compose`:

In [171]:
transform = transforms.Compose([
    MinMax()
])

In [172]:
train_dataset = DigitsDataset(X_train, y_train, transform=transform)
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# Evaluated after every epoch, so this uses the validation split; order does not matter here
test_dataset = DigitsDataset(X_val, y_val, transform=transform)
test_dataloader = DataLoader(test_dataset, batch_size=32, shuffle=False)

In [173]:
for X, y in train_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape} {X.dtype}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([32, 1, 1, 8, 8]) torch.float64
Shape of y: torch.Size([32]) torch.int64

Now we write some functions to run training and testing. These are stock functions shamelessly ripped from the PyTorch tutorials (which are very good by the way).

In [174]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

In [175]:
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
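The accuracy line in `test` packs several steps into one expression. A small sketch on made-up logits (4 samples, 3 classes) unpacks it:

```python
import torch

# Made-up logits for a batch of 4 samples and 3 classes.
pred = torch.tensor([[0.1, 2.0, 0.3],
                     [1.5, 0.2, 0.1],
                     [0.0, 0.1, 3.0],
                     [0.9, 0.8, 0.1]])
y = torch.tensor([1, 0, 2, 1])

predicted_class = pred.argmax(1)   # index of the largest logit in each row
hits = predicted_class == y        # boolean tensor, True where the prediction is right
correct = hits.type(torch.float).sum().item()
accuracy = correct / len(y)
print(predicted_class.tolist(), accuracy)  # [1, 0, 2, 0] 0.75
```

In `test`, `correct` is accumulated over all batches and divided by the dataset size, which gives the same result as averaging over every sample at once.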

In [176]:
model = NeuralNetwork(hidden=256)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

epochs = 10
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, criterion, optimizer)
    test(test_dataloader, model, criterion)
print("Done!")

Epoch 1
-------------------------------
loss: 2.298777  [   32/ 1010]
Test Error: 
 Accuracy: 86.9%, Avg loss: 1.748523 

Epoch 2
-------------------------------
loss: 1.680736  [   32/ 1010]
Test Error: 
 Accuracy: 87.2%, Avg loss: 1.059877 

Epoch 3
-------------------------------
loss: 0.973698  [   32/ 1010]
Test Error: 
 Accuracy: 89.3%, Avg loss: 0.638599 

Epoch 4
-------------------------------
loss: 0.626328  [   32/ 1010]
Test Error: 
 Accuracy: 90.8%, Avg loss: 0.462090 

Epoch 5
-------------------------------
loss: 0.395452  [   32/ 1010]
Test Error: 
 Accuracy: 91.4%, Avg loss: 0.359660 

Epoch 6
-------------------------------
loss: 0.381919  [   32/ 1010]
Test Error: 
 Accuracy: 93.2%, Avg loss: 0.288526 

Epoch 7
-------------------------------
loss: 0.198614  [   32/ 1010]
Test Error: 
 Accuracy: 92.6%, Avg loss: 0.256855 

Epoch 8
-------------------------------
loss: 0.186870  [   32/ 1010]
Test Error: 
 Accuracy: 93.8%, Avg loss: 0.223322 

Epoch 9
-------------------------------
loss: 0.154736  [   32/ 1010]
Test Error: 
 Accuracy: 93.8%, Avg loss: 0.223808 

Epoch 10
-------------------------------
loss: 0.152001  [   32/ 1010]
Test Error: 
 Accuracy: 96.4%, Avg loss: 0.180673 

Done!


For more information, I strongly recommend that you check out the [PyTorch tutorials on custom datasets and transformations](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html).