#### Importing all dependencies

The first step is to import all the dependencies. We need os to determine the directory in which the CIFAR-10 dataset will be saved on local disk later in this tutorial. We also import torch, the top-level PyTorch package, and from it nn, which allows us to define a neural network module. Finally, we import the DataLoader (for feeding data into the MLP during training), the CIFAR10 dataset, and transforms, which allows us to transform the data before feeding it to the MLP.

In [1]:
import os
import torch
from torch import nn
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader
from torchvision import transforms

#### Defining the MLP neural network class

Next up is defining the MLP class, which inherits from the nn.Module class. nn.Module is the base class for neural networks in PyTorch: it keeps track of the layers and their parameters, which makes it really useful when creating a network. Our class defines two methods: __init__, or the constructor, and forward, which implements the forward pass.

In the constructor, we first invoke the superclass initialization and then define the layers of our neural network. We stack all layers (three densely-connected Linear layers, with ReLU activation functions after the first two) using nn.Sequential. We also add nn.Flatten() at the start. Flatten converts the 3D image representations (channels, height and width) into 1D format, which is necessary for Linear layers. Note that with image data it is often best to use Convolutional Neural Networks. This is out of scope for this tutorial and will be covered in another one.

The forward pass defines how the model reacts to input data, for example during the training process. In our case, it does nothing but feed the data through the neural network layers and return the output.

In [2]:
class MLP(nn.Module):
    '''
    Multilayer Perceptron.
    '''
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 32 * 3, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 10)
        )


    def forward(self, x):
        '''Forward pass'''
        return self.layers(x)
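
To make the effect of each stage on the tensor shapes concrete, here is a minimal sketch that pushes a random batch through the same stack of layers. The batch of four random "images" is an assumption made purely for illustration; it follows PyTorch's (N, channels, height, width) layout, so a CIFAR-10 batch has shape (N, 3, 32, 32).

```python
import torch
from torch import nn

# Hypothetical batch of 4 random "images" in (N, C, H, W) layout
batch = torch.rand(4, 3, 32, 32)

# Flatten keeps the batch dimension and merges the rest: 3 * 32 * 32 = 3072
flat = nn.Flatten()(batch)
print(flat.shape)    # torch.Size([4, 3072])

# Same layer stack as in the MLP class above
layers = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)

# One row of 10 raw scores (logits) per image, one score per CIFAR-10 class
logits = layers(batch)
print(logits.shape)  # torch.Size([4, 10])
```

The final Linear layer has no activation function: the network outputs raw scores, and turning them into a loss value is left to the loss function.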

After defining the class, we can move on and write the runtime code. This is the code that is actually executed when you call the Python script from the terminal, e.g. with python mlp.py. Defining the class does not use it yet; we instantiate it shortly.

The first thing we do in the runtime code is set the seed of the random number generator. Using a fixed seed ensures that this generator is initialized with the same starting value on every run. This benefits reproducibility of your ML findings.
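
As a small illustration of what the fixed seed buys us, the sketch below resets the seed twice and checks that both the random numbers and the initial layer weights come out identical. The tensor and layer sizes are arbitrary choices for the example.

```python
import torch
from torch import nn

# Same seed, same random numbers
torch.manual_seed(42)
first = torch.rand(3)
torch.manual_seed(42)
second = torch.rand(3)
print(torch.equal(first, second))  # True

# The same holds for randomly initialized layer weights
torch.manual_seed(42)
layer_a = nn.Linear(4, 2)
torch.manual_seed(42)
layer_b = nn.Linear(4, 2)
print(torch.equal(layer_a.weight, layer_b.weight))  # True
```

Keep in mind that a fixed seed makes runs repeatable on the same setup; identical results across different hardware or PyTorch versions are not guaranteed.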

Loading and preparing the CIFAR-10 data is a two-step process:

- Initializing the dataset itself, by means of CIFAR10. Here, in order, you specify the directory where the dataset has to be saved, that it must be downloaded, and that the images must be converted into Tensor format.

- Initializing the DataLoader, which takes the dataset, a batch size, the shuffle parameter (whether the samples are drawn in random order) and the number of worker processes to load data with. In PyTorch, data loaders are used for feeding data to the model in minibatches.
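
To see what a DataLoader yields without downloading anything, the following sketch wraps a small synthetic dataset. The TensorDataset with 25 random images is a hypothetical stand-in for CIFAR-10, and num_workers=0 (loading in the main process) is chosen only to keep the example simple.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for CIFAR-10: 25 random images with random labels
images = torch.rand(25, 3, 32, 32)
labels = torch.randint(0, 10, (25,))
toy_dataset = TensorDataset(images, labels)

# Same batch size and shuffling as in the tutorial; no worker processes
toy_loader = DataLoader(toy_dataset, batch_size=10, shuffle=True, num_workers=0)

batch_shapes = []
for inputs, targets in toy_loader:
    # Each iteration yields one minibatch of inputs and matching targets
    batch_shapes.append((tuple(inputs.shape), tuple(targets.shape)))
    print(inputs.shape, targets.shape)
```

With 25 samples and a batch size of 10, the loader yields three minibatches, the last one holding the remaining 5 samples. CIFAR-10's 50,000 training images divide evenly by 10, which gives the 5,000 minibatches per epoch seen during training.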

Next, we initialize the MLP. We also specify the loss function (categorical cross-entropy loss) and the Adam optimizer. The optimizer works on the parameters of the MLP and uses a learning rate of 1e-4.
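
A quick way to build intuition for the loss values is to compute the loss of a model that knows nothing. In the sketch below, the all-zero logits and the target labels are made up for illustration; all-zero logits correspond to a uniform guess over the 10 classes, for which the cross-entropy equals ln(10), roughly 2.3026.

```python
import math
import torch
from torch import nn

loss_function = nn.CrossEntropyLoss()

# Hypothetical model outputs: identical scores for all 10 classes
logits = torch.zeros(4, 10)

# Targets are integer class indices, not one-hot vectors
targets = torch.tensor([0, 3, 7, 9])

# CrossEntropyLoss expects raw logits; it applies log-softmax internally
loss = loss_function(logits, targets)
print(loss.item(), math.log(10))
```

An untrained network should therefore start near a loss of 2.3, and training should push the value down from there.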

#### Defining the training loop

The core part of our runtime code is the training loop. In this loop, we perform the epochs, i.e. full passes over the training dataset. In every epoch, we iterate over the training dataset, perform the forward and backward passes, and perform model optimization.

Step-by-step, these are the things that happen within the loop:

- Here, we use 5 epochs, as defined by the range(0, 5).

- At the start of each epoch, we set the running loss value used for printing to 0.0.

- Per epoch, we iterate over the training dataset - and more specifically, the minibatches within this training dataset as specified by the batch size (set in the trainloader above). Here, we do the following things:
    - We decompose the data into inputs and targets (or x and y values, respectively).
    
    - We zero the gradients in the optimizer, to ensure that it starts freshly for this minibatch.
    
    - We perform the forward pass - which in effect is feeding the inputs to the model, which, recall, was initialized as mlp.
    
    - We then compute the loss value based on the outputs of the model and the ground truth, available in targets.
    
    - This is followed by the backward pass, where the gradients are computed, and optimization, where the model is adapted.
    
    - Finally, we print some statistics - but only at every 500th minibatch, where we report the loss averaged over the last 500 minibatches.

- At the end of the entire process, we print that the training process has finished.
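
Before looking at the full loop, it can help to see a single optimization step in isolation. The sketch below uses a tiny hypothetical linear model and a random minibatch (both invented for this example, not part of the tutorial's MLP) and checks that backward() fills in the gradients and step() changes the weights.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Hypothetical tiny model and random minibatch, for illustration only
model = nn.Linear(8, 3)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
inputs = torch.rand(10, 8)
targets = torch.randint(0, 3, (10,))

weights_before = model.weight.detach().clone()

optimizer.zero_grad()                    # clear gradients left from a previous step
outputs = model(inputs)                  # forward pass
loss = loss_function(outputs, targets)   # compare outputs with the ground truth
loss.backward()                          # backward pass: fills .grad on each parameter
optimizer.step()                         # update the parameters using the gradients

weights_changed = not torch.equal(weights_before, model.weight.detach())
print(model.weight.grad.shape, weights_changed)
```

Zeroing the gradients matters because PyTorch accumulates gradients across backward() calls; without it, each step would also include the gradients of all earlier minibatches.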



In [6]:
if __name__ == '__main__':
  
    # Set fixed random number seed
    torch.manual_seed(42)
    
    # Prepare CIFAR-10 dataset
    dataset = CIFAR10(os.getcwd(), download=True, transform=transforms.ToTensor())
    trainloader = DataLoader(dataset, batch_size=10, shuffle=True, num_workers=4)
    
    # Initialize the MLP
    mlp = MLP()
  
    # Define the loss function and optimizer
    loss_function = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-4)
    
    # Run the training loop
    for epoch in range(0, 5): # 5 epochs
    
        # Print epoch
        print(f'Starting epoch {epoch+1}')
    
        # Set current loss value
        current_loss = 0.0
    
        # Iterate over the DataLoader for training data
        for i, data in enumerate(trainloader, 0):
      
            # Get inputs
            inputs, targets = data
      
            # Zero the gradients
            optimizer.zero_grad()
      
            # Perform forward pass
            outputs = mlp(inputs)
      
            # Compute loss
            loss = loss_function(outputs, targets)
      
            # Perform backward pass
            loss.backward()
      
            # Perform optimization
            optimizer.step()
      
            # Print statistics
            current_loss += loss.item()
            if i % 500 == 499:
                print('Loss after mini-batch %5d: %.5f' % (i + 1, current_loss / 500))
                current_loss = 0.0

    # Process is complete.
    print('Training process has finished.')

Files already downloaded and verified
Starting epoch 1
Loss after mini-batch   500: 2.23692
Loss after mini-batch  1000: 2.09935
Loss after mini-batch  1500: 2.03072
Loss after mini-batch  2000: 1.99956
Loss after mini-batch  2500: 1.93845
Loss after mini-batch  3000: 1.94456
Loss after mini-batch  3500: 1.91624
Loss after mini-batch  4000: 1.90289
Loss after mini-batch  4500: 1.86960
Loss after mini-batch  5000: 1.85643
Starting epoch 2
Loss after mini-batch   500: 1.83056
Loss after mini-batch  1000: 1.83181
Loss after mini-batch  1500: 1.82517
Loss after mini-batch  2000: 1.82263
Loss after mini-batch  2500: 1.81629
Loss after mini-batch  3000: 1.81058
Loss after mini-batch  3500: 1.80163
Loss after mini-batch  4000: 1.77174
Loss after mini-batch  4500: 1.77388
Loss after mini-batch  5000: 1.76247
Starting epoch 3
Loss after mini-batch   500: 1.75368
Loss after mini-batch  1000: 1.76395
Loss after mini-batch  1500: 1.74717
Loss after mini-batch  2000: 1.75415
Loss after mini-batch  