We train, validate and test a machine learning model by optimising its parameters on our data.

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

train_dataloader = DataLoader(training_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
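
As a rough sketch of what `DataLoader` does with `batch_size=64`, the example below batches a synthetic dataset of the same size as the FashionMNIST training split (60000 samples). The names `fake_data` and `fake_loader` are made up for illustration, and the tensors are placeholders rather than real images.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in (assumption): 60000 samples, like the FashionMNIST training split
fake_data = TensorDataset(torch.zeros(60000, 1),
                          torch.zeros(60000, dtype=torch.long))
fake_loader = DataLoader(fake_data, batch_size=64)

# Record how many samples each batch holds
batch_sizes = [len(X) for X, _ in fake_loader]

print(len(fake_loader))                 # 938 batches per epoch
print(batch_sizes[0], batch_sizes[-1])  # 64 32 (the last batch is smaller)
```

Since 60000 is not a multiple of 64, the final batch holds only 32 samples. With 938 batches per epoch, a progress message printed every 100 batches appears 10 times per epoch.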

In [2]:
class NeuralNetwork(nn.Module):
    def __init__(self):

        # Call the constructor of the parent class (nn.Module) so that the
        # layers assigned below are registered as submodules of this model.
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork()

## Optimisation
Optimisation adjusts the model parameters to reduce the model error at each training step. Each step has three parts:
1. optimizer.zero_grad(): reset the gradients of the model parameters to zero at each iteration (they accumulate by default).
2. loss.backward(): backpropagate the prediction loss by computing the gradient of the loss function w.r.t. each model parameter.
3. optimizer.step(): once the gradients are computed, adjust the parameters using the gradients collected in the backward pass.

The training loop in this notebook calls optimizer.zero_grad() after optimizer.step() rather than before loss.backward(). The effect is the same, because the gradients are still cleared before the next backward pass.
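
The three steps can be sketched on a toy problem. This is a minimal illustration under assumptions, not the notebook's model: `toy_model`, `toy_loss_fn`, `toy_optimizer` and the random data are made up for the example. It checks that gradients accumulate when they are not reset, and that a plain SGD step moves each parameter by the learning rate times its gradient.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins (assumptions for illustration only)
toy_model = nn.Linear(3, 2)
toy_loss_fn = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)
X = torch.randn(4, 3)
y = torch.tensor([0, 1, 1, 0])

# Without a reset, a second backward pass adds to the stored gradient
toy_loss_fn(toy_model(X), y).backward()
first_grad = toy_model.weight.grad.clone()
toy_loss_fn(toy_model(X), y).backward()
accumulated = torch.allclose(toy_model.weight.grad, 2 * first_grad)

# One full optimisation step: reset, backpropagate, update
toy_optimizer.zero_grad()
loss = toy_loss_fn(toy_model(X), y)
loss.backward()
weight_before = toy_model.weight.detach().clone()
grad = toy_model.weight.grad.clone()
toy_optimizer.step()

# Plain SGD update rule: new_weight = old_weight - lr * gradient
matches_sgd_rule = torch.allclose(toy_model.weight, weight_before - 0.1 * grad)

print(accumulated, matches_sgd_rule)
```

Both checks should print `True`. The second check holds for SGD without momentum or weight decay, which is the configuration assumed here.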

In [3]:
# Training loop: iterate over training dataset and optimise the model
# parameters to perform its job best with our data.
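def train_loop(dataloader, model, loss_fn, optimizer):
  """
  Train the model for one epoch.

  Iterates over the batches in dataloader, computes the loss with loss_fn
  and updates the model parameters with optimizer.
  """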

  # Get the number of samples in dataset
  size = len(dataloader.dataset)

  # Put model to training mode (important for batch normalisation and dropout layers)
  model.train()

  # Iterate over the mini-batches that the dataloader draws from the dataset
  for batch, (X,y) in enumerate(dataloader):

    # Compute model prediction and loss
    pred = model(X)
    loss = loss_fn(pred, y)

    # Backpropagate, update the parameters, then reset the gradients for the next batch
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    # Every 100 batches, display the current loss and the number of samples processed
    if batch % 100 == 0:
      loss, current = loss.item(), (batch + 1) * len(X)
      print("Loss: {}, [{}/{}]".format(loss, current, size))


# Validation/test loop: iterate over test dataset to check
# if model performance is improving.
def test_loop(dataloader, model, loss_fn):
  """
  Evaluate the model on the data in dataloader.

  Prints the accuracy and the average loss per batch.
  """

  # Get the number of samples in dataset
  size = len(dataloader.dataset)

  # Put model to evaluation mode (important for batch normalisation and dropout layers)
  model.eval()
  num_batches = len(dataloader)
  test_loss, correct = 0, 0

  # Gradients are not needed during evaluation, so torch.no_grad() disables
  # gradient tracking. This avoids unnecessary computation and reduces
  # memory usage.
  with torch.no_grad():
    for X, y in dataloader:
      pred = model(X)
      test_loss += loss_fn(pred, y).item()

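      # pred.argmax(1) gives the predicted class (index of the largest logit)
      # for each sample. Comparing it with y marks each correct prediction,
      # and the sum counts the correct predictions in this batch.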
      correct += (pred.argmax(1) == y).type(torch.float).sum().item()

  test_loss /= num_batches
  correct /= size
  print("Test error: \n Accuracy: {}%, Average loss: {} \n".format(
      100*correct, test_loss))
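
The accuracy count in `test_loop` relies on `(pred.argmax(1) == y)`. The sketch below walks through that expression on hypothetical logits for four samples and three classes. The numbers are made up for illustration.

```python
import torch

# Hypothetical logits: 4 samples (rows), 3 classes (columns)
pred = torch.tensor([[2.0, 0.5, 0.1],
                     [0.2, 0.3, 1.5],
                     [0.9, 1.1, 0.0],
                     [3.0, 0.1, 0.2]])
y = torch.tensor([0, 2, 0, 0])   # true class of each sample

predicted_class = pred.argmax(1)   # index of the largest logit in each row
is_correct = predicted_class == y  # boolean tensor, one entry per sample
num_correct = is_correct.type(torch.float).sum().item()

print(predicted_class.tolist())  # [0, 2, 1, 0]
print(num_correct)               # 3.0
```

Three of the four predictions match the labels, so this batch would add 3 to `correct`. Dividing the running total by the dataset size then gives the accuracy.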

In [4]:
def run_ml_workflow(epochs, train_dataloader, test_dataloader,
                    model, loss_fn, optimizer):
  """
  Run the training loop and the test loop once per epoch.
  """

  for idx in range(epochs):
    print("Epoch {}\n ------------------------".format(idx+1))
    train_loop(train_dataloader, model, loss_fn, optimizer)
    test_loop(test_dataloader, model, loss_fn)

  print("You have successfully trained and tested a machine learning model!!")

We initialise the loss function and optimiser, then pass them to the training and testing loops. Note that you can also change the number of epochs.

In this notebook, we are using the following:
- Loss function: Cross Entropy (**SEE FORMULATION**)
- Optimiser: Stochastic gradient descent (**REVIEW NOTES**)
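
As a hedged sketch of the cross-entropy formulation: for one sample with logits z and true class c, the loss is -log(softmax(z)[c]), and `nn.CrossEntropyLoss` averages this over the batch by default. The example below reproduces that by hand on made-up logits and labels, and compares the result with the built-in loss.

```python
import math

import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Made-up scores and labels: 5 samples, 10 classes
logits = torch.randn(5, 10)
targets = torch.tensor([3, 0, 9, 1, 4])

# Log-probabilities via log-softmax, then pick the entry of the true class
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(5), targets].mean()

builtin_loss = nn.CrossEntropyLoss()(logits, targets)
print(torch.allclose(manual_loss, builtin_loss))  # True

# Equal logits for all 10 classes give a loss of ln(10)
uniform_loss = nn.CrossEntropyLoss()(torch.zeros(1, 10), torch.tensor([0]))
print(uniform_loss.item(), math.log(10))
```

The value ln(10) ≈ 2.303 is what a model that spreads its probability evenly over the ten classes would score, which is close to the loss logged at the start of the first epoch.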

In [5]:
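# Define hyperparameters of the neural network prior to training (epochs, learning rate and batch size).
# Note: batch_size is recorded here for reference only. The DataLoaders above were
# already created with batch_size=64, which is the value the logged progress reflects.
epochs = 20
learning_rate = 1e-03
batch_size = 64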

# Define loss function and optimiser to use for training
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Run machine learning workflow
run_ml_workflow(epochs, train_dataloader, test_dataloader,
                model, loss_fn, optimizer)

Epoch 1
 ------------------------
Loss: 2.296739339828491, [64/60000]
Loss: 2.288170099258423, [6464/60000]
Loss: 2.2718656063079834, [12864/60000]
Loss: 2.2706751823425293, [19264/60000]
Loss: 2.243852138519287, [25664/60000]
Loss: 2.217721700668335, [32064/60000]
Loss: 2.224879503250122, [38464/60000]
Loss: 2.191222906112671, [44864/60000]
Loss: 2.186361789703369, [51264/60000]
Loss: 2.1515936851501465, [57664/60000]
Test error: 
 Accuracy: 39.300000000000004%, Average loss: 2.153448692552603 

Epoch 2
 ------------------------
Loss: 2.162328004837036, [64/60000]
Loss: 2.1563799381256104, [6464/60000]
Loss: 2.100987434387207, [12864/60000]
Loss: 2.116605758666992, [19264/60000]
Loss: 2.0637600421905518, [25664/60000]
Loss: 2.0008583068847656, [32064/60000]
Loss: 2.031008243560791, [38464/60000]
Loss: 1.9532660245895386, [44864/60000]
Loss: 1.9515361785888672, [51264/60000]
Loss: 1.885230302810669, [57664/60000]
Test error: 
 Accuracy: 55.510000000000005%, Average loss: 1.885105704046