# Objectives


* To introduce you to PyTorch (https://pytorch.org/docs/stable/index.html), which is one of the most widely used software libraries for neural networks.
>**Remember**: It is your responsibility as a machine learning scientist to read the documentation for any library function you use and to understand thoroughly what it does, whether it validly serves your purpose, and which of its parameters you need to consider.

* To apply the perceptron and multilayer perceptron from the Week 4 lecture to automatic detection of the number of days of ground frost and snow based on other weather variables.

# Section 1 - Load the UK Met (60km, 2010-2022) data

As in the Week 3 lab:

1. Download the data before you get started. It is on the Week 3 page for the module on Canvas (see 'Week 3 Lab Dataset' on the page). The downloaded file will be named *curated_data_1month_2010-2022_nonans.csv*.

2. Then, use the file menu in Google Colab to upload the file to your Colab directory. Once the upload is complete, you should be able to see the file in the listed contents of your Colab directory.

3. You can now run the code in the cell below to load the data.

In [11]:
import csv
import numpy


!ls  /content

data_file_full_path = "/Users/suli/Documents/source/repo/MachineLearning/Week 4/curated_data_1month_2010-2022_nonans.csv"

data_as_list = []

# load the dataset
with open(data_file_full_path) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')

    # Skip the header row, then convert every remaining row to floats
    next(csv_reader, None)
    for row in csv_reader:
        data_as_list.append([float(val) for val in row])
data = numpy.array(data_as_list)

# check its shape
print("\n The dataset has shape: "+str(data.shape))


# get features and labels from the data
# based on the objectives (see the Objectives section)
feat_col = [5, 6, 7, 8, 9, 10, 11]
ground_frost_col = 4
snow_col = 12

feats = data[:, feat_col]
ground_frost_label = data[:, ground_frost_col]
snow_label = data[:, snow_col]


# take a peek
print("\n A peek at the dataset features: \n"+str(feats))
print("\n A peek at the ground frost labels: \n"+str(ground_frost_label))
print("\n A peek at the snow labels: \n"+str(snow_label))


ls: /content: No such file or directory

 The dataset has shape: (10296, 13)

 A peek at the dataset features: 
[[8.93698275e+01 1.02266536e+03 6.45115642e+01 ... 6.45810733e+00
  6.72744772e+00 6.97199793e-01]
 [8.94462109e+01 1.02270800e+03 5.74868117e+01 ... 5.88191052e+00
  6.23064828e+00 1.62952568e+00]
 [8.93435447e+01 1.02243684e+03 6.82935149e+01 ... 4.62830127e+00
  6.29080656e+00 1.17293773e+00]
 ...
 [8.78370293e+01 1.00645706e+03 1.38800195e+01 ... 4.96064026e+00
  1.85626301e+00 7.99709543e+00]
 [8.88116315e+01 1.00662248e+03 2.05853162e+01 ... 4.93635497e+00
  7.75835354e-01 8.46815900e+00]
 [8.27601516e+01 1.00593830e+03 1.05309193e+01 ... 8.38081942e+00
  3.54509758e+00 6.35990610e+00]]

 A peek at the ground frost labels: 
[ 9.84928987 10.85267889 12.97189949 ... 21.7275541  23.77582838
 17.35386163]

 A peek at the snow labels: 
[112.2352382  116.3547495   57.53778808 ... 177.2424627  135.4028786
 140.831213  ]
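
For a purely numeric CSV with a single header row, `numpy.loadtxt` can do the same job as the manual `csv.reader` loop. The sketch below uses a tiny in-memory CSV with made-up column names and values (assumptions for illustration, not the real dataset) so that it runs on its own.

```python
import io
import numpy

# Hypothetical three-column CSV standing in for the real data file
toy_csv = io.StringIO(
    "col_a,col_b,col_c\n"
    "1.0,2.0,3.0\n"
    "4.0,5.0,6.0\n"
)

# skiprows=1 drops the header row; every remaining value is parsed as a float
toy_data = numpy.loadtxt(toy_csv, delimiter=",", skiprows=1)

print(toy_data.shape)  # → (2, 3)
```

With the real file, the `StringIO` object would be replaced by the path stored in `data_file_full_path`.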


# Section 2 - Split into training, validation, and test sets

In [12]:
from sklearn.model_selection import train_test_split

all_ids = numpy.arange(0, feats.shape[0])

random_seed = 1

# First randomly split the data into 70:30 to get the training set
train_set_ids, rem_set_ids = train_test_split(all_ids, test_size=0.3, train_size=0.7,
                                 random_state=random_seed, shuffle=True)


# Then further split the remaining data 50:50 into validation and test sets
val_set_ids, test_set_ids = train_test_split(rem_set_ids, test_size=0.5, train_size=0.5,
                                 random_state=random_seed, shuffle=True)


train_data = feats[train_set_ids, :]
train_ground_frost_labels = ground_frost_label[train_set_ids]
train_snow_labels = snow_label[train_set_ids]

val_data = feats[val_set_ids, :]
val_ground_frost_labels = ground_frost_label[val_set_ids]
val_snow_labels = snow_label[val_set_ids]

test_data = feats[test_set_ids, :]
test_ground_frost_labels = ground_frost_label[test_set_ids]
test_snow_labels = snow_label[test_set_ids]
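
The two-stage split above gives roughly 70% training, 15% validation and 15% test data. As a rough illustration of the same idea without scikit-learn, the sketch below shuffles the row indices once and slices them. The sample count of 100 and the `toy_*` names are assumptions for illustration; this is a stand-in for `train_test_split`, not a reproduction of its exact output.

```python
import numpy

n_samples = 100  # hypothetical dataset size
rng = numpy.random.default_rng(1)
toy_ids = rng.permutation(n_samples)  # shuffled row indices

# Integer arithmetic avoids floating-point rounding surprises
n_train = (7 * n_samples) // 10
n_val = (n_samples - n_train) // 2

toy_train_ids = toy_ids[:n_train]
toy_val_ids = toy_ids[n_train:n_train + n_val]
toy_test_ids = toy_ids[n_train + n_val:]

# The three index sets should be disjoint and together cover every row
all_back = numpy.concatenate([toy_train_ids, toy_val_ids, toy_test_ids])
assert numpy.array_equal(numpy.sort(all_back), numpy.arange(n_samples))

print(len(toy_train_ids), len(toy_val_ids), len(toy_test_ids))  # → 70 15 15
```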

# Section 3 - Scale (i.e. standardize) the input data

In [13]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(feats)
scaled_feats = scaler.transform(feats)
print("\n A peek at the scaled dataset features: \n"+str(scaled_feats))


scaled_train_data = scaled_feats[train_set_ids, :]
scaled_val_data = scaled_feats[val_set_ids, :]
scaled_test_data = scaled_feats[test_set_ids, :]


 A peek at the scaled dataset features: 
[[ 1.32716655  1.57084147 -0.88330353 ...  1.51819974 -0.59716362
  -0.08830932]
 [ 1.34196535  1.57808582 -0.99442844 ...  1.03005351 -0.70725389
   0.32720564]
 [ 1.32207443  1.53201409 -0.82347666 ... -0.03198748 -0.69392287
   0.1237155 ]
 ...
 [ 1.03019683 -1.18300764 -1.68424647 ...  0.24956567 -1.67661336
   3.16507655]
 [ 1.2190197  -1.15490117 -1.57817505 ...  0.2289915  -1.91603506
   3.37501812]
 [ 0.04658465 -1.27114612 -1.73722605 ...  3.1470957  -1.30236928
   2.4354211 ]]
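
The cell above fits the scaler on the full dataset, so the validation and test rows contribute to the means and standard deviations used for scaling. A common alternative is to fit on the training rows only and reuse those statistics for the other sets. Below is a minimal numpy sketch of that alternative, using random toy data and hypothetical names (`toy_feats`, `toy_train`, `toy_val`).

```python
import numpy

rng = numpy.random.default_rng(1)
toy_feats = rng.normal(loc=50.0, scale=10.0, size=(200, 3))  # toy data
toy_train, toy_val = toy_feats[:140], toy_feats[140:]

# The statistics come from the training rows only
train_mean = toy_train.mean(axis=0)
train_std = toy_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# The same training statistics are applied to every set
scaled_toy_train = (toy_train - train_mean) / train_std
scaled_toy_val = (toy_val - train_mean) / train_std

print(scaled_toy_train.mean(axis=0).round(6))
print(scaled_toy_train.std(axis=0).round(6))
```

With scikit-learn, the equivalent would be to call `scaler.fit(train_data)` and then `scaler.transform(...)` on each of the three sets.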


# Section 4 - Train and evaluate a perceptron model


* Implement a perceptron from scratch. (See the Week 4 lecture for the formal definition, i.e. the mathematics, of a perceptron.)
* Train and evaluate the perceptron for classification of the ground frost label into 2 classes. (See the Week 3 lab for preparation of the dataset for classification.)
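
One possible starting point for the first task is sketched below. It applies the classic perceptron learning rule to a tiny, linearly separable toy dataset with labels in {-1, +1}. The class name `ToyPerceptron`, the learning rate and the data are all assumptions for illustration; the Week 4 lecture definition remains the reference.

```python
import numpy


class ToyPerceptron:
    def __init__(self, n_features, learning_rate=1.0):
        self.w = numpy.zeros(n_features)
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, X):
        # Sign of the weighted sum; a sum of exactly 0 is mapped to +1
        return numpy.where(X @ self.w + self.b >= 0, 1, -1)

    def fit(self, X, y, max_epochs=10):
        for _ in range(max_epochs):
            mistakes = 0
            for x_i, y_i in zip(X, y):
                # Misclassified (or on the boundary) when y * (w.x + b) <= 0
                if y_i * (x_i @ self.w + self.b) <= 0:
                    self.w += self.learning_rate * y_i * x_i
                    self.b += self.learning_rate * y_i
                    mistakes += 1
            if mistakes == 0:  # every sample was classified correctly
                break
        return self


# Toy data: three points per class, clearly separable by a line
toy_X = numpy.array([[2.0, 1.0], [3.0, 2.0], [2.0, 3.0],
                     [-2.0, -1.0], [-3.0, -2.0], [-1.0, -3.0]])
toy_y = numpy.array([1, 1, 1, -1, -1, -1])

model = ToyPerceptron(n_features=2).fit(toy_X, toy_y)
toy_acc = numpy.mean(model.predict(toy_X) == toy_y)
print(toy_acc)  # → 1.0
```

On the real data, the perceptron would be trained on the scaled training features and evaluated on the validation and test sets, which are unlikely to be perfectly separable.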

# Section 5 - Train and evaluate a multilayer perceptron (MLP)

In [None]:
import torch
from torch import nn
from torch import optim
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay, f1_score, accuracy_score

### Create the methods to be used

# Create the neural network structure
# for a 3-layer MLP
class three_layer_MLP(nn.Module):
    def __init__(self,
                 input_size,
                 hidden_layer_sizes,
                 output_size):
        super().__init__()
        self.hidden_l1 = nn.Linear(input_size, hidden_layer_sizes[0])
        self.hidden_l2 = nn.Linear(hidden_layer_sizes[0], hidden_layer_sizes[1])
        self.output_l3 = nn.Linear(hidden_layer_sizes[1], output_size)


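    def forward(self, inputs):
        # Note: no activation function is applied between the layers,
        # so the three linear layers compose into a single linear map.
        out = self.hidden_l1(inputs)
        out = self.hidden_l2(out)
        out = self.output_l3(out)
        # Softmax over the class dimension turns the scores into probabilities.
        # Note: nn.CrossEntropyLoss expects raw scores and applies
        # log-softmax internally.
        out = torch.softmax(out, dim=1)
        return out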


# A method for computing performance metrics of interest
def my_metrics(labels, predictions, show_confusion_matrix=True):

    ## First work out which class has been predicted for each data sample.
    ## Finally return the classification performance
    predictions_numpy = predictions.detach().numpy()
    predicted_classes = numpy.argmax(predictions_numpy, axis=1)
    f1_scores = f1_score(labels, predicted_classes, average=None)
    acc = accuracy_score(labels, predicted_classes)

    if show_confusion_matrix:
      print("\n Confusion matrix:")
      confus_mat = confusion_matrix(labels, predicted_classes)
      disp = ConfusionMatrixDisplay(confus_mat)
      disp.plot()
      plt.show()

    return f1_scores, acc


# A class for managing the data for training the model
class MetDataset(Dataset):
    def __init__(self, feats, labels):
        # Convert features from numpy arrays to PyTorch tensors
        self.feats = torch.tensor(feats, dtype=torch.float32)

        # Recode class label -1 to 0 (on a copy, leaving the caller's array
        # unchanged), as PyTorch requires class labels to be numbered from zero
        labels = numpy.where(labels == -1, 0, labels)

        # Convert labels from numpy arrays to PyTorch tensors
        self.labels = torch.tensor(labels, dtype=torch.long)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):

        return self.feats[idx, :], self.labels[idx]
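
The `forward` method above applies no non-linear activation between the `nn.Linear` layers. Composed affine maps are themselves one affine map, so without activations the stacked layers can represent only what a single linear layer can. The numpy sketch below checks this on random toy weights for two layers (all names and sizes are hypothetical) and shows that inserting a ReLU breaks the equivalence.

```python
import numpy

rng = numpy.random.default_rng(1)
x = rng.normal(size=(5, 7))  # 5 toy samples, 7 features
W1, b1 = rng.normal(size=(10, 7)), rng.normal(size=10)
W2, b2 = rng.normal(size=(2, 10)), rng.normal(size=2)

# Two stacked linear layers, using the x @ W.T + b convention of nn.Linear
stacked = (x @ W1.T + b1) @ W2.T + b2

# The same map written as one linear layer
W_single = W2 @ W1
b_single = W2 @ b1 + b2
single = x @ W_single.T + b_single

print(numpy.allclose(stacked, single))  # → True

# With a ReLU in between, the result generally differs from the single layer
with_relu = numpy.maximum(x @ W1.T + b1, 0.0) @ W2.T + b2
print(numpy.allclose(with_relu, single))
```

In PyTorch, a non-linearity such as `torch.relu` would typically be applied to the output of each hidden layer inside `forward`.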



In [18]:
import random


### Train and evaluate the 3-layer MLP

# Ensure reproducibility
# for PyTorch operations that use random numbers internally
random.seed(random_seed)
torch.manual_seed(random_seed)
numpy.random.seed(random_seed)

# Create an instance of the 3-layer MLP
feature_count = train_data.shape[1]
hidden_layer_sizes = [10, 10]
class_count = numpy.unique(ground_frost_label_class).shape[0]
MLP_model = three_layer_MLP(feature_count, hidden_layer_sizes, class_count)


# Set values for hyperparameters
num_epochs = 100
learning_rate = 0.001
batch_size = 50


# Set up the data loading by batch
# With the test and validation sets having only one batch
train_set = MetDataset(scaled_train_data, train_ground_frost_labels_class)
train_dataloader = DataLoader(train_set, batch_size=batch_size)

val_set = MetDataset(scaled_val_data, val_ground_frost_labels_class)
val_dataloader = DataLoader(val_set, batch_size=len(val_set))

test_set = MetDataset(scaled_test_data, test_ground_frost_labels_class)
test_dataloader = DataLoader(test_set, batch_size=len(test_set))



# Set up the SGD optimizer for updating the model weights
optimizer = optim.SGD(MLP_model.parameters(), lr=learning_rate)


# Compute cross entropy loss against the training labels
loss_function = nn.CrossEntropyLoss()



best_model_acc = 0
losses = []

# Iterate over the dataset at two levels:
# 1. Iterate over the batches in the dataset (inner for loop below).
#    One complete pass through the dataset (i.e. having gone over
#    all batches in the dataset once) = one epoch
# 2. Iterate over the specified number of epochs (outer for loop below)
for epoch in range(0, num_epochs):

    # Set the model to training mode
    MLP_model.train()


    for batch, (X_train, y_train) in enumerate(train_dataloader):

      # Zero out the `.grad` buffers,
      # otherwise on the backward pass we'll add the
      # new gradients to the old ones.
      optimizer.zero_grad()

      # Compute the forward pass and then the loss
      train_pred = MLP_model(X_train)
      train_loss = loss_function(train_pred, y_train)
      # Metrics for this batch only (no confusion matrix plot during training)
      train_f1_scores, train_acc = my_metrics(y_train, train_pred, show_confusion_matrix=False)

      # Compute the gradients of the loss with respect to the model parameters
      # by propagating the loss backwards through the network.
      train_loss.backward()

      # Update the model parameters using those gradients
      optimizer.step()

    # Report the loss and accuracy of the last batch in this epoch,
    # as a rough indication of how well training is progressing
    print("epoch: {} - train loss: {:.4f} train acc: {:.4f}".format(
        epoch,
        train_loss.item(),
        train_acc))

    losses.append(train_loss.item())


# Finally, test your model on the test set and get an estimate of its performance.
# First, set the model to evaluation mode
MLP_model.eval()
# Gradients are not needed for evaluation
with torch.no_grad():
  for batch, (X_test, y_test) in enumerate(test_dataloader):
    test_pred = MLP_model(X_test)
    test_f1_scores, test_accuracy = my_metrics(y_test, test_pred, show_confusion_matrix=True)
    print("\n test accuracy: {:2.2f}".format(test_accuracy))
    print('\n The F1 scores for each of the classes are: '+str(test_f1_scores))

# Plot the training loss recorded at the end of each epoch
print("\n Training loss:")
fig, ax = plt.subplots()
ax.plot(numpy.array(losses), 'b-', label='training loss')
ax.set_xlabel('epoch')
ax.set_ylabel('loss')
ax.legend()
plt.show()


NameError: name 'ground_frost_label_class' is not defined
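
`nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and applies log-softmax internally, whereas the model's `forward` already returns softmax probabilities, so softmax is effectively applied twice. The numpy sketch below shows the consequence for two classes: even for a confidently correct prediction, the loss computed from probabilities cannot drop below about 0.313. The values are toy values, the helper names are hypothetical, and this is a re-implementation for illustration, not PyTorch's code.

```python
import numpy


def softmax(scores):
    # Subtract the row maximum for numerical stability
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = numpy.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy_from_scores(scores, labels):
    # Mean negative log-softmax of the true class
    log_probs = numpy.log(softmax(scores))
    return -log_probs[numpy.arange(len(labels)), labels].mean()


logits = numpy.array([[10.0, -10.0]])  # very confident, correct prediction
labels = numpy.array([0])

loss_from_logits = cross_entropy_from_scores(logits, labels)
loss_from_probs = cross_entropy_from_scores(softmax(logits), labels)

print(loss_from_logits)  # close to 0
print(loss_from_probs)   # close to log(1 + exp(-1)), about 0.313
```

One common way to avoid this is to return the output of the last linear layer directly during training and to apply softmax only when probabilities are needed.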

# Section 6 - Train the MLP with early stopping

* Edit the code in Section 5 to include early stopping. (See the Week 4 lecture for information on early stopping.)
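
A minimal sketch of the patience-based variant of early stopping: training stops once the validation loss has failed to improve for `patience` consecutive epochs. The function name, the `patience` and `min_delta` parameters, and the simulated losses are assumptions for illustration; the Week 4 lecture may define the stopping rule differently.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, val_loss in enumerate(val_losses):
        if val_loss < best_loss - min_delta:
            # Improvement: remember it and reset the patience counter
            best_loss = val_loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    # Patience never ran out: training went through every epoch
    return len(val_losses) - 1, best_epoch


# Simulated validation losses, one per epoch (made-up numbers)
simulated_val_losses = [0.9, 0.7, 0.6, 0.62, 0.61, 0.65, 0.5]
stop_epoch, best_epoch = early_stopping_epoch(simulated_val_losses, patience=3)
print(stop_epoch, best_epoch)  # → 5 2
```

In the Section 5 training loop, the validation loss would be computed from `val_dataloader` at the end of each epoch, and a copy of the model weights at the best epoch (for example via `copy.deepcopy(MLP_model.state_dict())`) would be kept for the final test.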

# Section 7 - Explore PyTorch MLP training and evaluation

1. The first part of the code in Section 5 uses *torch.nn.Linear()* to build the 3-layer MLP. What do you think *torch.nn.Linear()* does?

2. Why was *numpy.argmax(predictions_numpy, axis=1)* needed to get predictions from the MLP? (See the *my_metrics(...)* method of the first part of the Section 5 code.)

3. Try different settings for the hyperparameters of the MLP in Section 5, particularly the:
  * output layer activation function
  * loss function
  * number of epochs
  * learning rate
  * batch size
  * optimization algorithm
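
As a hint for question 2: the model outputs one score per class for every sample, while the metric functions expect one class index per sample. A small numpy sketch with made-up probabilities (the names `toy_probs` and `toy_classes` are hypothetical):

```python
import numpy

# Hypothetical model output: 3 samples, 2 classes (each row sums to 1)
toy_probs = numpy.array([[0.9, 0.1],
                         [0.2, 0.8],
                         [0.4, 0.6]])

# axis=1 picks, for each row (sample), the column (class) with the highest score
toy_classes = numpy.argmax(toy_probs, axis=1)
print(toy_classes)  # → [0 1 1]
```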