<h1>NHL Stenden Classification Workshop</h1>

<h2>Import Packages: Configuration</h2>
The following imports are required to run the code. Click the run button to import the packages.

In [1]:
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import random_split
from torchvision import utils
import matplotlib.pyplot as plt
import numpy as np
import random
import torch.nn as nn
import torch.nn.functional as F

import torch.optim as optim

import os

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

<h2>Computing Hardware: Configuration</h2>
Neural Networks can be executed on different hardware devices. The following code block allows the user to indicate whether to run the Neural Network on the CPU or on the GPU.

In [2]:
# Either run the Neural Network on the CPU (run_cpu = True) or GPU (run_cpu = False)
run_cpu = False

# Fall back to the CPU when no CUDA-capable GPU is available
if run_cpu or not torch.cuda.is_available():
    device = torch.device('cpu')
else:
    device = torch.device('cuda')
    torch.cuda.manual_seed_all(42)

print(device)

cuda


<h2>Data: Configuration</h2>
The following code fixes the random seeds, sets the batch size and image size, and defines the transforms that resize and normalize the input images.

<h3>Exercise</h3>
1. <em> Adapt the batch size. What influence does it have on the performance of the model? </em> <br>
2. <em> Add a Horizontal Flip to the list of transformations. What influence does it have on the performance of the model? </em> <br>
3. <em> Choose any other data augmentation that is listed <a href="https://pytorch.org/vision/stable/transforms.html">here</a>. What influence does it have on the performance of the model? </em> <br>
4. <em> Why would you preferably not add data augmentation to the testing set? </em>


In [3]:
# Make experiments reproducible
torch.manual_seed(42)
random.seed(42)
np.random.seed(42)

# Configuration (Feel free to experiment with these values)
batch_size = 4
image_size = 64

# Convert the images to tensors, resize them and normalize them
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Resize((image_size, image_size)),
     #transforms.RandomHorizontalFlip(p=0.5),
     transforms.Normalize((0.5), (0.5))])
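
To make the effect of the normalization step concrete, the sketch below reproduces its arithmetic in plain Python on a few pixel values. It assumes pixel values that are already scaled to [0, 1], as produced by transforms.ToTensor(). The helper names normalize and denormalize are illustrative and not part of torchvision.

```python
# Minimal sketch of the arithmetic behind Normalize(mean=0.5, std=0.5).
# The helper names are hypothetical; torchvision applies the same formula per channel.

def normalize(pixel, mean=0.5, std=0.5):
    # Maps a pixel value from [0, 1] to [-1, 1]
    return (pixel - mean) / std

def denormalize(value, mean=0.5, std=0.5):
    # Inverse mapping, used before displaying an image
    return value * std + mean

pixels = [0.0, 0.25, 0.5, 1.0]
normalized = [normalize(p) for p in pixels]
restored = [denormalize(v) for v in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```

The inverse mapping is the same operation as the img / 2 + 0.5 step in the visualization code further down.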


<h2>Data: Standard Dataset</h2>
The following code denotes where the data is located.

In [4]:
# Configuration: Standard dataset
data_path = './detectionBanana'

<h2>Data: Dataset Division</h2>
The following code loads the dataset from disk. Three different loaders are used for the training, validation and testing images. Each split is read from its own folder (train, valid and test), which keeps the images from overlapping across the splits.

In [5]:
# Uploading a file generates unwanted files. This messes up the dataloading part. The code below ensures that we only load images.
class ImageFolderNB(torchvision.datasets.ImageFolder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def find_classes(self, directory):
        # Sort the class names so that the list order matches the class indices
        list_classes = sorted(d for d in os.listdir(directory) if os.path.isdir(os.path.join(directory, d)) and d[0] != ".")
        dict_classes = {d: i for i, d in enumerate(list_classes)}

        return list_classes, dict_classes

# Dataset division (train, validation and test split)
valid_extension = [".jpg", ".png", ".jpeg"]

# Accept a file only if its extension (compared case-insensitively) is a known image type
def is_image(path):
    return os.path.splitext(path)[1].lower() in valid_extension

train_dataset = ImageFolderNB(root=data_path + '/train', transform=transform, is_valid_file=is_image)
validation_dataset = ImageFolderNB(root=data_path + '/valid', transform=transform, is_valid_file=is_image)
test_dataset = ImageFolderNB(root=data_path + '/test', transform=transform, is_valid_file=is_image)

# Define the different dataloaders
trainloader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size,
                                          shuffle=True)

validloader = torch.utils.data.DataLoader(validation_dataset, batch_size=batch_size,
                                          shuffle=True)

testloader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size,
                                         shuffle=True)

classes = train_dataset.classes

ValueError: 'class_to_index' must have at least one entry to collect any samples.

<h2>Data: Data Science Perspective (Qualitative)</h2>
The following code block visualizes some images. One batch of images (default = 4) is first mapped back from the normalized range to the range [0, 1] and then displayed. Feel free to run this block multiple times to display different images.

In [None]:
# Undo the normalization (back to the range [0, 1]) and display one batch
def imshow(img):
    img = img / 2 + 0.5
    plt.imshow(np.transpose(img.numpy(), (1, 2, 0)))
    plt.show()

# get some random training images
images, labels = next(iter(trainloader))

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join(f'{classes[labels[j]]:5s}' for j in range(len(labels))))

<h2>Data: Data Science Perspective (Quantitative)</h2>
The following code block is responsible for displaying some statistics of the training dataset.

In [None]:
train_labels = {}

images, labels = next(iter(trainloader))

channel, height, width = images[0].shape

print("Images have %d channel(s) with height %d and width %d" % (channel, height, width))

for i, (inputs, labels) in enumerate(trainloader, 0):
    # count the number of instances per class
    for label in labels:
        if label.item() in train_labels:
            train_labels[label.item()] += 1
        else:
            train_labels[label.item()] = 1

print("Number of classes: " + str(len(train_labels.keys())))
print("Number of instances per class:")

for key in sorted(train_labels):
    print("%s: %s" % (classes[key], train_labels[key]))

print("Accuracy of always guessing the most frequent class on train set: " + str(max(train_labels.values()) / sum(train_labels.values())))
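
The class statistics above can be reproduced on a toy example. The sketch below uses made-up label indices (not taken from the workshop dataset) and compares two simple baselines: always predicting the most frequent class, and guessing uniformly at random among the classes.

```python
from collections import Counter

# Hypothetical label indices for a small training set (0 and 1 are class indices)
example_labels = [0, 0, 0, 1, 1, 0, 1, 0]

# Count the number of instances per class
counts = Counter(example_labels)

# Accuracy of a classifier that always predicts the most frequent class
majority_baseline = max(counts.values()) / sum(counts.values())

# Expected accuracy of guessing uniformly at random among the classes
uniform_baseline = 1 / len(counts)

print(dict(counts))       # {0: 5, 1: 3}
print(majority_baseline)  # 0.625
print(uniform_baseline)   # 0.5
```

A trained model should beat both baselines. On an imbalanced dataset, the majority baseline is the harder one to beat.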

<h2>Convolutional Neural Network: Architecture</h2>
The following code defines the Convolutional Neural Network (CNN). Several parameters can be adapted for experimentation. The network consists of several components, including convolutional layers (nn.Conv2d), maximum pooling layers (nn.MaxPool2d), activation functions (F.relu) and fully connected layers (nn.Linear). The last layer outputs raw class scores; nn.CrossEntropyLoss, which is used during training, applies the softmax internally. Whenever you want to reset the model, you can run this cell.

<h3>Exercise</h3>
1. <em> Change the learning rate. Try to experiment with values between 0.1 and 0.0001. What effect does it have on the training behaviour? Is the model performing better or worse? </em> <br>
2. <em> Change the activation functions from F.relu to F.sigmoid. What effect does it have on the model? </em>

In [None]:
# CNN: Network configuration
num_filters_layer_1 = 6
num_filters_layer_2 = 16

filter_size_layer_1 = 5
filter_size_layer_2 = 5

learning_rate = 0.001

num_classes = len(classes)

# Width and height of the feature map after two rounds of convolution (no padding) and 2x2 max pooling
output_conv_dim = int((((image_size - (filter_size_layer_1 - 1)) / 2) - (filter_size_layer_2 - 1)) / 2)

# Definition of the Neural Network and its components
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=channel, out_channels=num_filters_layer_1, kernel_size=filter_size_layer_1)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(in_channels=num_filters_layer_1, out_channels=num_filters_layer_2, kernel_size=filter_size_layer_2)
        self.fc1 = nn.Linear(num_filters_layer_2 * output_conv_dim * output_conv_dim, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
model = net.to(device)
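
The value of output_conv_dim follows from how each layer shrinks the image. The sketch below traces the spatial size through the network step by step, assuming the default configuration above (image size 64, two 5x5 convolutions without padding, 2x2 max pooling with stride 2). The helper functions are illustrative and not part of PyTorch.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    # Output size of a convolution without dilation
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_output_size(size, kernel_size=2, stride=2):
    # Output size of max pooling without padding
    return (size - kernel_size) // stride + 1

size = 64                            # image_size
size = conv_output_size(size, 5)     # conv1: 64 -> 60
size = pool_output_size(size)        # pool:  60 -> 30
size = conv_output_size(size, 5)     # conv2: 30 -> 26
size = pool_output_size(size)        # pool:  26 -> 13

flattened_features = 16 * size * size  # num_filters_layer_2 * 13 * 13
print(size, flattened_features)        # 13 2704
```

The final number is the input size of the first fully connected layer. Changing the image size or the filter sizes changes it accordingly.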

<h2>Training a CNN: Loss Function and Optimizer</h2>
A loss function is used to compute the discrepancy between the prediction of the network and the desired output. This discrepancy is then minimized by executing an optimization algorithm. The following code block defines the loss function and instantiates the optimizer.

<h3>Exercise</h3>
1. <em> Change the SGD optimizer to Adam. What difference does it make on the training process and overall performance of the model? </em>

In [None]:
ce_loss = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=learning_rate, momentum=0.9)
#optimizer = optim.Adam(net.parameters(), lr=learning_rate)

# Start with infinity so that the first validation loss always counts as an improvement
best_validation_loss = float('inf')
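
To illustrate what the cross-entropy loss measures, the sketch below computes it by hand for a single sample in plain Python. The logits are made-up values for a two-class problem, and the function is a simplified stand-in for nn.CrossEntropyLoss applied to one sample, not its actual implementation.

```python
import math

def cross_entropy(logits, label):
    # Log-sum-exp with the maximum subtracted for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # Negative log-probability assigned to the correct class
    return log_sum_exp - logits[label]

# Hypothetical raw scores (logits) for a two-class problem
confident_correct = cross_entropy([4.0, 0.0], label=0)
undecided = cross_entropy([0.0, 0.0], label=0)
confident_wrong = cross_entropy([4.0, 0.0], label=1)

print(confident_correct, undecided, confident_wrong)
```

The loss is close to zero for a confident correct prediction, equals log(2) when the model is undecided between two classes, and grows large for a confident wrong prediction.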

<h2>Training a CNN: Forward and Backward Pass</h2>
The Neural Network is ready to be trained. The following code block loops once over all training inputs (one epoch) and validates the performance.

<h3>Exercise</h3>
1. <em> Run the following code blocks multiple times. What happens to the validation loss? What effect does it have on the overall performance of the model? </em>

In [None]:
running_loss = 0.0
running_loss_val = 0.0

# Training a CNN: Training (Get the inputs; data is a list of [inputs, labels])
for i, (inputs, labels) in enumerate(trainloader, 0):

    inputs, labels = inputs.to(device), labels.to(device)

    # Training a CNN: Optimizer (sets gradients to 0)
    optimizer.zero_grad()

    # Training a CNN: Forward Pass
    outputs = net(inputs)
    loss = ce_loss(outputs, labels)

    # Training a CNN: Backward Pass
    loss.backward()
    optimizer.step()

    running_loss += loss.item()

# i is the index of the last minibatch, so the number of minibatches is i + 1
print("Training loss after " + str(i + 1) + " minibatches is: " + str(running_loss / (i + 1)))

# Training a CNN: Validation (Get the inputs; data is a list of [inputs, labels])
# No gradients are needed for validation, so gradient tracking is disabled
with torch.no_grad():
    for i, (inputs, labels) in enumerate(validloader, 0):

        inputs, labels = inputs.to(device), labels.to(device)

        # Training a CNN: Forward Pass
        outputs = net(inputs)
        loss_val = ce_loss(outputs, labels)

        # .item() converts the loss tensor to a plain Python float
        running_loss_val += loss_val.item()

# Training a CNN: Validation (Store the best model)
if running_loss_val < best_validation_loss:
    PATH = './best.pth'
    torch.save(net.state_dict(), PATH)
    best_validation_loss = running_loss_val

# i is the index of the last minibatch, so the number of minibatches is i + 1
print("Validation loss after " + str(i + 1) + " minibatches is: " + str(running_loss_val / (i + 1)))
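
Running the training cell several times amounts to training for several epochs while keeping only the model with the lowest validation loss. The sketch below shows that bookkeeping in isolation. The validation losses are made-up numbers chosen to show a typical pattern (improvement followed by overfitting), not results from this notebook.

```python
# Hypothetical validation losses for five consecutive epochs (made-up values)
validation_losses = [0.92, 0.71, 0.65, 0.69, 0.74]

best_validation_loss = float('inf')
best_epoch = None

for epoch, validation_loss in enumerate(validation_losses, start=1):
    # Keep a checkpoint only when the validation loss improves
    if validation_loss < best_validation_loss:
        best_validation_loss = validation_loss
        best_epoch = epoch
        # In the notebook, this is the point where torch.save(...) stores best.pth

print(best_epoch, best_validation_loss)  # 3 0.65
```

In this made-up run, the checkpoint from epoch 3 is kept, even though training continued for two more epochs.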

<h2>Testing: Data Science Perspective (Qualitative)</h2>
Before evaluating the performance of our model on the testing set, it is a good idea to get an impression of the test data. The following code block visualizes a few test images.


In [None]:
dataiter = iter(testloader)
images, labels = next(dataiter)

images, labels = images.to(device), labels.to(device)

# print images
imshow(torchvision.utils.make_grid(images.cpu()))
print(' '.join(f'{classes[labels[j]]:5s}' for j in range(len(labels))))

<h2>Testing: Confidence of Prediction</h2>
The code above displays the test images and their labels. It would be interesting to see what the model predicts for them and how confident it is about its decision. The code below normalizes the output of our model with softmax. This results in scores for each class that lie between 0 and 1 and sum to 1. Execute the code block below. What can you say about the resulting figure?

In [None]:
def softmax(x):
    # Subtract the maximum for numerical stability (the result is unchanged)
    e_x = np.exp(x - np.max(x))
    return e_x / np.sum(e_x)

# For visualization purposes, only visualize a maximum of first four predictions of the above images
max_confidence_predictions = 4

if batch_size < max_confidence_predictions:
    max_confidence_predictions = batch_size

fig, _ = plt.subplots(nrows=1, ncols=max_confidence_predictions, sharex=True,
                                    figsize=(12, 6))

fig.suptitle('Probability per class', fontsize=12)

# Run the model once on the batch (no gradients are needed for inference)
with torch.no_grad():
    outputs = net(images)

class_names = [classes[c] for c in range(num_classes)]

for i, ax in enumerate(fig.axes):
    ax.set_xlabel('Probability', fontsize=10)

    if i == 0:
        ax.set_ylabel('Class', fontsize='medium')

    if i != 0:
        ax.set_yticklabels([])

    ax.set_title("Label: " + classes[labels[i]])

    softmax_output = softmax(outputs[i].cpu().numpy())

    # Highlight the class with the highest probability in blue
    color = ['grey' if (x < max(softmax_output)) else 'blue' for x in softmax_output]
    ax.barh(class_names, softmax_output, color=color)
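
The properties of softmax mentioned above can be checked on a small example. The sketch below is a plain-Python version that assumes made-up logits for one image and three classes. It is meant as an illustration, not as a replacement for the NumPy function used in the notebook.

```python
import math

def softmax(scores):
    # Subtracting the maximum avoids overflow and leaves the result unchanged
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw model outputs (logits) for one image and three classes
logits = [2.0, 1.0, 0.1]
probabilities = softmax(logits)

print(probabilities)
print(sum(probabilities))

# Large logits would overflow a naive exp(), but work with the shifted version
print(softmax([1000.0, 999.0]))
```

The class with the largest logit also receives the largest probability, so softmax changes the scale of the scores but not the predicted class.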

<h2>Testing: Accuracy and Confusion Matrix</h2>
The following code block makes predictions on the entire testing set. The overall accuracy and the accuracy per class are computed and displayed below. Additionally, a confusion matrix is constructed to indicate how false positives and false negatives contribute to the accuracy.

<h3>Exercise</h3>
1. <em> One way to improve the performance of the model is to add more training and/or validation images. Use the upload functionality within Jupyter Notebook to upload additional images (more info <a href="https://tljh.jupyter.org/en/latest/howto/content/add-data.html">here</a>). The dataset is located <a href="/tree" target="_blank" rel="noopener noreferrer">here</a>. Do you observe any improvements in the performance of the model? </em> <br>

In [None]:
# count number of (correct) predictions
correct_pred = {classname: 0 for classname in classes}
total_pred = {classname: 0 for classname in classes}

predicts = []
ground_truth = []
with torch.no_grad():
    for (images, labels) in testloader:
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        _, predictions = torch.max(outputs, 1)

        for label, prediction in zip(labels, predictions):
            if label == prediction:
                correct_pred[classes[label]] += 1
            total_pred[classes[label]] += 1
            predicts.append(prediction.cpu().numpy())
            ground_truth.append(label.cpu().numpy())

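for classname, correct_count in correct_pred.items():
    # Guard against classes that do not occur in the testing set
    if total_pred[classname] == 0:
        print(f'Accuracy for class: {classname:5s} is undefined (no test images)')
        continue
    accuracy = 100 * float(correct_count) / total_pred[classname]
    print(f'Accuracy for class: {classname:5s} is {accuracy:.1f} %')

# Overall accuracy: all correct predictions divided by all predictions
total_accuracy = 100 * sum(correct_pred.values()) / sum(total_pred.values())
print(f'Overall accuracy: {total_accuracy:.1f} %')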

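# Pass all class indices explicitly so that the matrix keeps one row and column per class
disp = ConfusionMatrixDisplay(confusion_matrix(ground_truth, predicts, labels=list(range(len(classes)))), display_labels=classes)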
disp.plot(cmap=plt.cm.Blues)
plt.show()
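
To show how a confusion matrix relates to the accuracy figures, the sketch below builds one by hand in plain Python. The ground-truth and predicted labels are made-up values for six test images and two classes. Rows are true classes and columns are predicted classes, which is also the convention used by scikit-learn.

```python
# Hypothetical ground-truth and predicted class indices for six test images
ground_truth = [0, 0, 0, 0, 1, 1]
predicts = [0, 0, 0, 1, 1, 0]
num_classes = 2

# Rows are true classes, columns are predicted classes
matrix = [[0] * num_classes for _ in range(num_classes)]
for truth, predicted in zip(ground_truth, predicts):
    matrix[truth][predicted] += 1

# Per-class accuracy: diagonal entry divided by the row total
per_class = [matrix[c][c] / sum(matrix[c]) for c in range(num_classes)]

# Overall accuracy: sum of the diagonal divided by all predictions
overall = sum(matrix[c][c] for c in range(num_classes)) / len(ground_truth)

# Mean of the per-class accuracies (differs from overall when classes are imbalanced)
macro = sum(per_class) / num_classes

print(matrix)      # [[3, 1], [1, 1]]
print(per_class)   # [0.75, 0.5]
print(overall, macro)
```

In this example, the overall accuracy (4 of 6 correct) is higher than the mean of the per-class accuracies, because the larger class is also the one that is classified better.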