PyTorch Basics

In this notebook we will convert the problem we solved in the regression notebook into a classification problem instead. We will assign each wine quality rating to its own class, so a quality of 0 belongs to a different class from a quality of 1.


In terms of implementation, the biggest difference between a regression solution and multi-class classification is that our model now needs to output 10 values (one for each quality rating) instead of a single value. We will also need to change our loss function.
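
To make the two changes concrete, here is a minimal sketch that compares the tensor shapes involved in each setting. The batch of four examples and the random scores are made up for illustration and are not taken from the wine dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, num_classes = 4, 10

# Regression: one output per example, compared against a float target
reg_outputs = torch.randn(batch_size, 1)
reg_targets = torch.tensor([[6.0], [5.0], [7.0], [6.0]])
mse = nn.MSELoss()(reg_outputs, reg_targets)

# Classification: ten raw scores (logits) per example, compared against
# an integer class index stored in a 1D long tensor
cls_outputs = torch.randn(batch_size, num_classes)
cls_targets = torch.tensor([6, 5, 7, 6])
ce = nn.CrossEntropyLoss()(cls_outputs, cls_targets)

print("regression output shape:    ", tuple(reg_outputs.shape))
print("classification output shape:", tuple(cls_outputs.shape))
print("both losses are scalars:", mse.dim() == 0 and ce.dim() == 0)
```

Note that the classification targets are class indices rather than one-hot vectors, which is the form `nn.CrossEntropyLoss` is used with throughout this notebook.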

In [None]:
# Import Pandas
import pandas as pd

# There are two datasets available, but we'll just work with the larger, white
# wine dataset. Feel free to play around with the red wine dataset.
red_wine_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
white_wine_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv"

# It's a single function call to load a dataset. CSV files typically use commas
# as delimiters between fields, but our dataset uses semicolons so we have to
# specify that with the "delimiter" argument.
all_data = pd.read_csv(white_wine_url, delimiter=';')

print(type(all_data))

<class 'pandas.core.frame.DataFrame'>


In [None]:
target_column = "quality"

# TODO: Extract just the *input* features. Instead of specifying all of the features we
# want, we should drop the feature we *don't* want.
# x_data = ...

# SOLUTION LINE
x_data = all_data.drop(target_column, axis=1)

# TODO: Extract the target feature. We don't want a Series here, but a DataFrame
# with one column (see the above cell)
# y_data = ...

# SOLUTION LINE
y_data = all_data[[target_column]]

# If your implementation is correct the shape of y_data should be (4898, 1)
# So y_data is a 2D table containing 4898 examples and just 1 column.
print("y_data shape:", y_data.shape) 

# Just like with tensors, we can print the shape
num_examples = x_data.shape[0]
num_input_features = x_data.shape[1]

# If your implementation is correct the number of samples should be 4898
print("Number of examples:", num_examples) 

# If your implementation is correct the number of input features should be 11
print("Number of input features:", num_input_features)

# If your implementation is correct the shape of x_data should be (4898, 11)
# which means it is a 2D array where each row represents one example and each
# column represents one feature
print("x_data shape:", x_data.shape )

y_data shape: (4898, 1)
Number of examples: 4898
Number of input features: 11
x_data shape: (4898, 11)


In [None]:
# Import Torch and the dataset utilities we need
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset, random_split

# The percentages for each partition
TRAIN_SPLIT = 0.8
VAL_SPLIT = 0.1
TEST_SPLIT = 0.1
# Ensure that the splits add to 100%
assert TRAIN_SPLIT + VAL_SPLIT + TEST_SPLIT == 1


# TODO: Create two tensors and initialise them with x_data.values and y_data.values.
# The dtype should be torch.float32. DataFrame.values directly returns the 2D
# data in the dataframe, which is what Torch requires to initialise a tensor.
# x_tensor = ...
# y_tensor = ...

# SOLUTION LINE
x_tensor = torch.tensor(x_data.values, dtype=torch.float32)
# SOLUTION LINE
y_tensor = torch.tensor(y_data.values, dtype=torch.float32)

# Now we construct a TensorDataset - a simple class used to associate each x and
# y value in our tensors.
full_dataset = TensorDataset(x_tensor, y_tensor)

In [None]:
# Calculate the number of examples in each partition
train_size = int(TRAIN_SPLIT * len(all_data))
val_size = int(VAL_SPLIT * len(all_data))
test_size = len(all_data) - train_size - val_size

print("Train examples:     ", train_size)
print("Validation examples:", val_size)
print("Test examples:      ", test_size)

# Before we actually split the dataset, we seed Torch's random number generator.
# This ensures that we end up with the exact same partitions every time it's run.
torch.manual_seed(42)

# TODO: Split the dataset using the random_split function we imported earlier.
# The function takes a dataset and a list of partition lengths.
# Hint: We already have all of these variables available
# train_dataset, val_dataset, test_dataset = random_split(...)

# SOLUTION LINE
train_dataset, val_dataset, test_dataset = random_split(full_dataset, [train_size, val_size, test_size])

Train examples:      3918
Validation examples: 489
Test examples:       491
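
As an optional variation on the seeding step, `random_split` also accepts a `generator` argument. The sketch below uses a small made-up dataset of 100 examples (an assumption for illustration, not the wine data) to check that a seeded generator gives the same partitions on repeated calls.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset: 100 examples with one input and one target column
demo_dataset = TensorDataset(torch.arange(100).unsqueeze(1).float(),
                             torch.zeros(100, 1))
lengths = [80, 10, 10]

def split_indices(seed):
    # A dedicated generator keeps the split independent of any other
    # random calls made earlier in the notebook
    gen = torch.Generator().manual_seed(seed)
    parts = random_split(demo_dataset, lengths, generator=gen)
    return [list(p.indices) for p in parts]

first = split_indices(42)
second = split_indices(42)
print("same partitions on both runs:", first == second)
print("partition sizes:", [len(p) for p in first])
```

The global `torch.manual_seed(42)` call used in the cell is enough for this lab; a dedicated generator mainly helps when other random operations run before the split.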


In [None]:
# When you've finished the lab, try modifying the batch size to see what effect
# it has on your results
BATCH_SIZE = 64

# TODO: Construct a DataLoader for each Dataset. The constructor takes three
# arguments - a Dataset, the batch size, and a boolean indicating whether the
# data should be shuffled. We will set shuffle=True for the train dataloader only.

# train_loader = DataLoader(...
# val_loader = DataLoader(...
# test_loader = DataLoader(...

# SOLUTION LINE
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
# SOLUTION LINE
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)
# SOLUTION LINE
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)
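
A quick way to see what a DataLoader yields is to loop over it and look at the batch shapes. The sketch below uses made-up stand-in data with the same layout as the wine tensors (11 input features, one target column); the names `demo_x`, `demo_y` and `demo_loader` are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 200 examples, 11 input features, 1 target column
demo_x = torch.randn(200, 11)
demo_y = torch.randint(3, 10, (200, 1)).float()
demo_loader = DataLoader(TensorDataset(demo_x, demo_y), batch_size=64, shuffle=True)

# Each iteration yields one (inputs, labels) pair
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in demo_loader]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

With 200 examples and a batch size of 64, the last batch holds only the 8 leftover examples, so code that consumes batches should not assume every batch is full.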

### Enable GPU Training *(if available)*
The rise in popularity of deep learning is largely a result of the availability of good Graphics Processing Units. So although it's not required, it's definitely good to utilise a GPU if you can.

It's exceptionally easy to use a compatible GPU in PyTorch - we can do it in just a few lines of code!

In [None]:
# By default we'll assume that GPU acceleration isn't available
device = torch.device("cpu")

# Check if GPU acceleration is available (requires a CUDA-compatible GPU) and
# set the device variable accordingly. If the computer has more than one GPU,
# you can specify which one by replacing 0 with a different index
if torch.cuda.is_available():
    device = torch.device("cuda:0")
    torch.cuda.set_device(device)

print("Training on", device)

Training on cuda:0


### Define the Model
Here you will need to change the MLP so that it outputs 10 features instead of just 1, since to perform classification we need to output a value for each of the 10 quality classes (from 0 to 9).

In [None]:
# Import the neural network module of PyTorch. We access its methods like "nn.Linear"
import torch.nn as nn

# Our model class must subclass nn.Module
class MLP(nn.Module):
    # The __init__ method is similar to a constructor like you find in other
    # languages. We will take the device as an argument to transfer the model to the GPU
    def __init__(self, device):
        super().__init__()
        # TODO: Initialise a Sequential module consisting of the below layers, and
        # store it in the member variable self.seq
        #  - a linear layer mapping from num_input_features to 20 hidden features
        #  - a ReLU activation layer
        #  - a linear layer mapping from 20 hidden features to 10 output features (one per wine quality class)
        # You can look here for an example:
        #     https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html
        # self.seq = nn.Sequential(...

        # SOLUTION LINE
        self.seq = nn.Sequential(nn.Linear(num_input_features, 20),
                          nn.ReLU(),
                          nn.Linear(20, 10))
    
        # The model stays on the CPU by default. Calling the "to" method transfers
        # the model weights to whichever device we specified
        self.to(device)

    # Our forward method simply takes the input batch x, passes it through our
    # Sequential module, and returns the outputs (predictions)
    def forward(self, x):
        return self.seq(x)
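
As a sanity check on the output size, the sketch below builds the same layer layout inline and passes a random batch through it. The batch of 64 random inputs and the names `demo_model` and `demo_batch` are assumptions for illustration; the softmax step is only there to show how the 10 raw scores relate to class probabilities.

```python
import torch
import torch.nn as nn

# Same layer layout as the MLP above, written inline so this runs on its own.
# 11 is the number of input features reported earlier in the notebook.
demo_model = nn.Sequential(nn.Linear(11, 20), nn.ReLU(), nn.Linear(20, 10))

demo_batch = torch.randn(64, 11)
with torch.no_grad():
    # One row of 10 raw scores (logits) per example
    logits = demo_model(demo_batch)
    # Softmax turns each row into probabilities that sum to 1
    probs = torch.softmax(logits, dim=1)

print(tuple(logits.shape))
print(probs.sum(dim=1)[:3])
```

The model itself returns raw scores with no softmax layer, because `nn.CrossEntropyLoss` applies log-softmax internally.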

## Function that can be used to compute accuracy
The function below can be used to compute classification accuracy. It looks at the predicted features for each example (in our case the 10 quality scores) and picks the largest one to represent the predicted class (the predicted quality value). The predicted class is the index of that largest feature in the output tensor, which is then compared against the target.

In [None]:
def compute_accuracy(predictions, targets):
    # Find the index with the highest predicted value - this is the predicted class
    predictions = predictions.argmax(1)
    # Count the number of predictions that match the target
    correct = (predictions == targets).sum().item()
    # Compute the accuracy as the fraction correctly predicted (between 0 and 1)
    acc = correct / len(targets)
    return acc
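
A small worked example may help here. The scores and targets below are made up (three examples, four classes), and the function is repeated so the snippet runs on its own.

```python
import torch

def compute_accuracy(predictions, targets):
    # Same logic as the function above, repeated so the example is self-contained
    predictions = predictions.argmax(1)
    correct = (predictions == targets).sum().item()
    return correct / len(targets)

# Three examples, four classes (made-up scores)
demo_scores = torch.tensor([[0.1, 2.0, 0.3, 0.0],
                            [1.5, 0.2, 0.1, 0.0],
                            [0.0, 0.1, 0.2, 3.0]])
demo_targets = torch.tensor([1, 2, 3])

# Index of the largest score in each row
print(demo_scores.argmax(1))
# Fraction of rows where that index equals the target
print(compute_accuracy(demo_scores, demo_targets))
```

The largest scores sit at indices 1, 0 and 3, so the first and third predictions match their targets and the accuracy is two out of three.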

## Simple Training loop

You will need to make the following changes to make it work for classification:


1.   Change the <font color=red>loss</font> function to <font color=red>nn.CrossEntropyLoss()</font>.
2.   Currently the labels are 2D tensors of shape \[Batch size, 1\]. We need to change them to 1D tensors of shape \[Batch size\]. We can do this using the squeeze function (<font color=red>torch.squeeze(tensor_name)</font>).
3.   Currently the labels have type float32. We need to change them to type long to make them work for classification. The way to convert a tensor to type long is to use the long function (<font color=red>tensor_name.long()</font>).
4.   Call the compute_accuracy function to compute the classification accuracy.
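
Steps 2 and 3 can be sketched on a tiny made-up label tensor. Passing `dim=1` to `torch.squeeze` is an optional variation on the call named above; it removes only the trailing axis, which matters if a batch ever contains a single example.

```python
import torch

# Labels as they come out of the DataLoader: float32 with shape [batch, 1]
raw_labels = torch.tensor([[6.0], [5.0], [7.0], [6.0]])

# dim=1 removes only the trailing axis, so a batch containing a single
# example still ends up with shape [1] rather than a 0-dimensional tensor
labels = torch.squeeze(raw_labels, dim=1).long()

print(tuple(raw_labels.shape), raw_labels.dtype)
print(tuple(labels.shape), labels.dtype)

# Without the squeeze, comparing [batch] predictions against [batch, 1] labels
# broadcasts to a [batch, batch] grid instead of an element-wise comparison
predicted = torch.tensor([6, 5, 5, 6])
print(tuple((predicted == raw_labels).shape))
```

The last line shows why the shape change matters for accuracy as well as for the loss: comparing against unsqueezed labels would silently count matches across the whole grid.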



In [None]:
import torch.optim as optim
import numpy as np

# TODO: Change this loss to use the correct loss for classification
#criterion = nn.MSELoss()

#SOLUTION
criterion = nn.CrossEntropyLoss()

# Use MLP as the model
model = MLP(device)
# Use the SGD optimizer with initial learning rate set to 0.0001
optimizer = optim.SGD(model.parameters(), lr=0.0001)
# The number of times we loop over the entire dataset
total_epochs = 100


for epoch in range(total_epochs):  # loop over the dataset multiple times

    epoch_train_accuracy = []

    # The following is computed in a single pass through the dataset
    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # TODO: Write here code that converts the labels tensor from shape [batchsize, 1] to shape [batchsize]
        #       Next write code that converts the labels tensor to type long().
        #       labels = ....
        #       labels = ....

        # Solution (dim=1 squeezes only the trailing axis, so a batch with a
        # single example keeps shape [1])
        labels = torch.squeeze(labels, dim=1).long()

        # Copy the data to the specified device
        inputs, labels = inputs.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward + backward + optimize
        outputs = model(inputs)

        # Compute the loss using the loss function
        loss = criterion(outputs, labels)

        # TODO: write code here for computing the classification accuracy
        #       by calling the compute_accuracy function written in the previous 
        #       cell.
        #       accuracy = ....

        # solution 
        accuracy = compute_accuracy(outputs, labels)
        epoch_train_accuracy.append(accuracy)

        # Perform backprop using the loss
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # print statistics
    print('epoch: %d loss: %.3f' % (epoch + 1, running_loss / len(train_loader)))
    print('epoch: %d accuracy: %.3f' % (epoch + 1, np.mean(epoch_train_accuracy)))

print('Finished Training')


epoch: 1 loss: 6.389
epoch: 1 accuracy: 0.146
epoch: 2 loss: 1.910
epoch: 2 accuracy: 0.320
epoch: 3 loss: 1.841
epoch: 3 accuracy: 0.326
epoch: 4 loss: 1.755
epoch: 4 accuracy: 0.335
epoch: 5 loss: 1.700
epoch: 5 accuracy: 0.331
epoch: 6 loss: 1.615
epoch: 6 accuracy: 0.343
epoch: 7 loss: 1.561
epoch: 7 accuracy: 0.347
epoch: 8 loss: 1.510
epoch: 8 accuracy: 0.346
epoch: 9 loss: 1.462
epoch: 9 accuracy: 0.360
epoch: 10 loss: 1.430
epoch: 10 accuracy: 0.362
epoch: 11 loss: 1.418
epoch: 11 accuracy: 0.370
epoch: 12 loss: 1.400
epoch: 12 accuracy: 0.380
epoch: 13 loss: 1.395
epoch: 13 accuracy: 0.376
epoch: 14 loss: 1.384
epoch: 14 accuracy: 0.383
epoch: 15 loss: 1.390
epoch: 15 accuracy: 0.388
epoch: 16 loss: 1.381
epoch: 16 accuracy: 0.391
epoch: 17 loss: 1.373
epoch: 17 accuracy: 0.394
epoch: 18 loss: 1.366
epoch: 18 accuracy: 0.399
epoch: 19 loss: 1.357
epoch: 19 accuracy: 0.398
epoch: 20 loss: 1.355
epoch: 20 accuracy: 0.403
epoch: 21 loss: 1.348
epoch: 21 accuracy: 0.420
epoch: 22 

<font color = red> Isn't it so much cooler to see an accuracy measure rather than just a loss! It tells us directly how close we are to 100%. This is one of the benefits of doing classification rather than regression, where there is no equally clear accuracy metric.</font>

## Simple Testing loop

Now modify the testing loop so that it also works for classification. Do all the things you did for the training loop, with the exception of changing the loss function.


In [None]:
running_loss = 0.0
total_test_accuracy = []

# Switch the model to evaluation mode once, before looping over the test set
model.eval()

for i, data in enumerate(test_loader, 0):
    # get the inputs; data is a list of [inputs, labels]
    inputs, labels = data

    # TODO: Write here code that converts the labels tensor from shape [batchsize, 1] to shape [batchsize]
    #       Next write code that converts the labels tensor to type long().
    #       labels = ...
    #       labels = ...

    # Solution
    labels = torch.squeeze(labels, dim=1).long()

    # Copy the data to the specified device
    inputs, labels = inputs.to(device), labels.to(device)

    # Gradients are not needed for evaluation
    with torch.no_grad():
        # Forward pass only
        outputs = model(inputs)

        # Compute the loss using the loss function
        loss = criterion(outputs, labels)
        running_loss += loss.item()

        # TODO: write code here for computing the classification accuracy
        #       by calling the compute_accuracy function
        #       accuracy = ....

        # Solution
        accuracy = compute_accuracy(outputs, labels)

        total_test_accuracy.append(accuracy)

print("test loss: ", running_loss/len(test_loader))
print('test accuracy: ', np.mean(total_test_accuracy))

test loss:  1.19777350127697
test accuracy:  0.48187681686046513
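
One detail worth knowing: averaging per-batch accuracies gives the last, smaller batch the same weight as a full one. The sketch below, using two made-up batches and a hypothetical helper `exact_accuracy`, counts correct predictions over all examples instead.

```python
import torch

def exact_accuracy(batches):
    # batches: iterable of (logits, labels) pairs, labels already 1D long
    correct, total = 0, 0
    for logits, labels in batches:
        correct += (logits.argmax(1) == labels).sum().item()
        total += len(labels)
    return correct / total

# Two made-up batches of different sizes: 4 examples (all correct) and
# 1 example (wrong). Scores are one-hot-like for readability.
big_logits = torch.eye(4)
big_labels = torch.tensor([0, 1, 2, 3])
small_logits = torch.tensor([[1.0, 0.0, 0.0, 0.0]])
small_labels = torch.tensor([2])
batches = [(big_logits, big_labels), (small_logits, small_labels)]

# Per-batch accuracies, averaged without regard to batch size
per_batch = [(l.argmax(1) == y).float().mean().item() for l, y in batches]
mean_of_batch_means = sum(per_batch) / len(per_batch)

print("mean of per-batch accuracies:", mean_of_batch_means)
print("accuracy over all examples:  ", exact_accuracy(batches))
```

In this lab only the final test batch is shorter than the rest, so the two numbers should be close; the gap in the toy example is exaggerated on purpose.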


### Make the results better!
Look at some of the things you tried for improving the regression problem result and see how well they work for classification. What is the highest classification accuracy you can get? 