# INFO8010: Homework 2

In the previous homework, you learned how to program your first neural network starting from the very first principles of deep learning. If you managed to solve the last assignment without any problems, **congratulations!** If that was not the case, **don't worry**: here is a second assignment which you can use to get better at deep learning.

In this homework we will see some slightly more complicated deep learning concepts: we will start by taking a look at some of PyTorch's functionalities that are necessary for training deep networks efficiently. We will then train our first neural networks for tackling different image classification tasks, learn to build custom datasets and explore how to train a CNN.

The structure of the notebook is identical to that of the previous homework. As last time, you have to submit the notebook **with your solutions** to the exercises. When you encounter a `# your code` comment, you have to write some code yourself, and you have to discuss the code/results when you see the instruction

> your discussion

Without further ado let's start by importing the libraries we will need throughout this assignment!

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as data

from PIL import Image
from torchvision import datasets, transforms, utils

In [None]:
# As of 2022/02/23, the CIFAR10 dataset SSL certificate is outdated which prevents its download.
# The following deactivates the verification of the SSL certificates, but
# never reproduce this unless you absolutely trust the source.
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

## 1. Dataloaders

Today's first concept is PyTorch's dataloaders. As you have seen during the theoretical lectures, one of the main ingredients for successfully training deep learning models is data, **lots of data**.

As you can easily imagine, it is not possible to load datasets of millions of images into the memory of your machine. Furthermore, these images come in a form that does not make it possible to exploit the tensor operations we have seen in the previous assignment.

To deal with these issues (and many more of them) we can use [dataloaders](https://pytorch.org/docs/stable/data.html), a data loading utility that allows us to deal with large datasets efficiently. In what follows, you are given your first example of dataloader which will use the popular [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset.

In [None]:
transform = transforms.Compose([transforms.ToTensor()])
trainset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
testset = datasets.CIFAR10(root='./data', train=False, transform=transform)

Let's explain what we just did. Thanks to PyTorch's [torchvision](https://pytorch.org/vision/stable/index.html) sub-library, we just downloaded the CIFAR10 dataset on our machine. The dataset was stored in the `./data` folder and comes in two different forms thanks to the use of the `train` flag: a version that can be used as training set, and a version that can be used as testing set. These two datasets are subclasses of `torch`'s `data.Dataset` class. We will see later what this `data.Dataset` class consists in exactly. Torchvision also allows us to define a set of image transformations which we have defined at the beginning of this cell: in this case we would like to convert our images to tensors, see the [documentation](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ToTensor) for an exact description of this transformation.
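
To make the effect of `ToTensor` more concrete, here is a minimal sketch that reproduces its documented behaviour by hand on a made-up 2 x 2 RGB image, using plain `numpy` and `torch` rather than `torchvision` itself. The array `fake_image` is an illustrative assumption and not part of CIFAR10.

```python
import numpy as np
import torch

# A made-up 2x2 RGB image in the usual (height, width, channel) uint8 layout.
fake_image = np.array(
    [[[0, 0, 0], [255, 0, 0]],
     [[0, 255, 0], [255, 255, 255]]],
    dtype=np.uint8,
)

# What ToTensor does for such an image: move the channels first and rescale
# the integer range [0, 255] to floats in [0, 1].
as_tensor = torch.from_numpy(fake_image).permute(2, 0, 1).float() / 255

print(as_tensor.shape)  # channels, height, width
print(as_tensor.min().item(), as_tensor.max().item())
```

The channel-first layout is the one expected by PyTorch's convolutional layers.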

Now that we have defined which dataset we would like to use, and the form in which we would like to have our images, we can create our first data loader. Data loaders are objects over which you can iterate and that load, transform and return mini-batches of inputs/targets at each iteration. The advantage of data loaders is that they can perform pre-processing of the data in parallel, i.e. in several worker processes (their number is set with the `num_workers` argument).
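
As a small illustration of this iteration protocol, the sketch below wraps random tensors (a stand-in for real images, with names of our own choosing) in a `TensorDataset` and loops over the resulting mini-batches. It uses `num_workers=0`, i.e. loading in the main process, only to keep the example simple.

```python
import torch
import torch.utils.data as data

# Synthetic stand-in for an image dataset: 10 "images" of shape 3x32x32
# with binary labels. These names are illustrative only.
toy_inputs = torch.randn(10, 3, 32, 32)
toy_targets = torch.arange(10) % 2
toy_dataset = data.TensorDataset(toy_inputs, toy_targets)

toy_loader = data.DataLoader(toy_dataset, batch_size=4, shuffle=False, num_workers=0)

batch_shapes = []
for x, y in toy_loader:
    batch_shapes.append((tuple(x.shape), tuple(y.shape)))
    print(x.shape, y.shape)
```

Since 10 is not a multiple of 4, the last mini-batch only contains 2 elements; passing `drop_last=True` to the `DataLoader` would discard it instead.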

Here, we create two data loaders that return mini-batches of 4 elements at each iteration. When using stochastic gradient descent (SGD), the training data loader should shuffle the training dataset.

In [None]:
trainloader = data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
testloader = data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

Before training anything, let's take a look at the images we just downloaded.

In [None]:
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

def show_images(img):
    plt.imshow(transforms.functional.to_pil_image(img))
    plt.show()

images, labels = next(iter(trainloader))
show_images(utils.make_grid(images))
print(*[classes[l] for l in labels])

The `transforms` module also comes in very handy for performing other types of data transformations: here is an example which converts the CIFAR10 images to grayscale.

In [None]:
transform = transforms.Compose([transforms.Grayscale(), transforms.ToTensor()])
gray_scaled_trainset = datasets.CIFAR10(root='./data', train=True, transform=transform)
gray_scaled_trainloader = data.DataLoader(gray_scaled_trainset, batch_size=4, shuffle=True, num_workers=2)

images, labels = next(iter(gray_scaled_trainloader))
show_images(utils.make_grid(images))
print(*[classes[l] for l in labels])

### 1.1 Transforms

Al remembered from the theoretical lectures that one way to make neural networks converge faster is to **normalize** the pixel values. He wrote the following code snippet to normalize his training set, but he encountered an error.

In [None]:
# transform = transforms.Compose([
#     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
#     transforms.ToTensor(),
# ])
# bugged_trainset = datasets.CIFAR10(root='./data', train=True, transform=transform)
# bugged_trainloader = data.DataLoader(bugged_trainset, batch_size=4, shuffle=True, num_workers=2)

# images, labels = next(iter(bugged_trainloader))
# show_images(utils.make_grid(images))  # should look weird due to normalization
# print(*[classes[l] for l in labels])

Fix his mistake.

In [None]:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])
trainset = datasets.CIFAR10(root='./data', train=True, transform=transform, download=True)
trainloader = data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

images, labels = next(iter(trainloader))
show_images(utils.make_grid(images))  # should look weird due to normalization
print(*[classes[l] for l in labels])
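
The order matters because `Normalize` operates on tensors, so it has to come after `ToTensor`. As a rough illustration of what the normalization computes, the sketch below applies `(x - mean) / std` per channel by hand with plain `torch`. The tensor `img` is random stand-in data, not a CIFAR10 image.

```python
import torch

torch.manual_seed(0)
img = torch.rand(3, 32, 32)  # stand-in for a ToTensor output, values in [0, 1)

mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)  # one value per channel
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

normalized = (img - mean) / std      # maps [0, 1] to [-1, 1]
restored = normalized * std + mean   # inverse mapping, handy before plotting

print(normalized.min().item(), normalized.max().item())
```

Applying the inverse mapping before calling `show_images` would bring back the natural colours.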

Al also remembers that, with image datasets, a common practice to increase the robustness of neural networks is **data augmentation**. He wants to apply random flips (vertical and horizontal) and random color changes to his training set, but he does not know how to. Could you help him?

In [None]:
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

trainset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
trainloader = data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

images, labels = next(iter(trainloader))
show_images(utils.make_grid(images))
print(*[classes[l] for l in labels])

### 1.2 Running operations on a GPU

As you may know, one important aspect of deep learning is that large models can be trained efficiently on specialized hardware such as Graphics Processing Units (GPUs) or Tensor Processing Units (TPUs). PyTorch allows you to perform operations on GPUs very easily by transferring the models and/or tensors concerned to the GPU.

However, to do so, you need a CUDA compatible GPU.

In [None]:
torch.cuda.is_available()

If the result of the previous cell is `True`, everything is ready to run on the GPU and you can continue. Otherwise it means you do not have any GPU that is compatible with the `torch` version installed on your machine. In this case, we invite you to use [Google Colab](https://colab.research.google.com/) to do the rest of this homework. Do not forget to ask Colab for a GPU (in Runtime > Change runtime type > Hardware accelerator).

In [None]:
device = 'cuda'
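
If you want the same code to keep running on a machine without a GPU, a common pattern is to pick the device at run time. This is only a convenience sketch, and the variable name `fallback_device` is ours.

```python
import torch

# Use the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
fallback_device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

t = torch.ones(2, 2, device=fallback_device)
print(t.device)
```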

Let's compare the speed of tensor operations on GPU and CPU.

In [None]:
A = torch.randn(1000, 100000)
B = torch.randn(100000, 1)

# on CPU
%timeit A @ B

In [None]:
A = torch.randn((1000, 100000), device=device)
B = torch.randn((100000, 1), device=device)

# on GPU
%timeit A @ B
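
One caveat when timing GPU code: CUDA operations are launched asynchronously, so a timer may stop before the computation has actually finished. A hedged sketch of a manual timing helper that calls `torch.cuda.synchronize()` around the measured region is given below. `time_matmul` is a hypothetical helper of our own, and the sketch falls back to the CPU when no GPU is available.

```python
import time
import torch

def time_matmul(a, b, repeats=10):
    """Average wall-clock time (in seconds) of `a @ b` over `repeats` runs."""
    on_gpu = a.is_cuda
    if on_gpu:
        torch.cuda.synchronize()  # finish pending work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    if on_gpu:
        torch.cuda.synchronize()  # wait for the queued products to complete
    return (time.perf_counter() - start) / repeats

dev = 'cuda' if torch.cuda.is_available() else 'cpu'
A_small = torch.randn(1000, 1000, device=dev)
B_small = torch.randn(1000, 1, device=dev)
elapsed = time_matmul(A_small, B_small)
print(f'{elapsed * 1e6:.1f} microseconds per product on {dev}')
```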

Instead of creating a tensor directly on the GPU, you may also transfer a model or a tensor to the GPU. For example, we can transfer a simple MLP to the GPU and then back to the CPU as follows.

In [None]:
# create MLP on CPU
mlp = nn.Sequential(
    nn.Linear(3, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 1),
    nn.Sigmoid(),
)

# forward pass on CPU
x = torch.randn(256, 3)
%timeit mlp(x)

# transfer MLP to GPU (in-place)
mlp.to(device)

# forward pass on GPU
x = x.to(device)
%timeit mlp(x)

# release the GPU memory
mlp.to('cpu')
x = x.to('cpu')

As you may notice, computations are much faster on the GPU. However, data transfers between GPU and CPU (in either direction) are usually very slow, so we recommend reducing them as much as possible. For example, when you want to save your loss after each iteration, prefer `.detach()` over `.cpu()` or `.item()`: detaching drops the reference to the computation graph, which avoids a memory leak, without forcing a transfer to the CPU.
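
A minimal sketch of this advice, assuming a toy model and random data in place of a real training loop: the per-iteration losses are kept as detached tensors and converted to a Python number only once, after the loop.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
toy_model = nn.Linear(4, 1)
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)

losses = []
for _ in range(5):
    x, y = torch.randn(8, 4), torch.randn(8, 1)
    loss = nn.functional.mse_loss(toy_model(x), y)

    toy_optimizer.zero_grad()
    loss.backward()
    toy_optimizer.step()

    # detach() drops the link to the computation graph and keeps the value
    # on its current device, so nothing is copied to the CPU here.
    losses.append(loss.detach())

# A single conversion to a Python float after the loop.
mean_loss = torch.stack(losses).mean().item()
print(mean_loss)
```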

## 2. Classifying the CIFAR10 dataset with an MLP

Now that you know how to handle datasets, we are ready to properly train today's first deep learning model on the CIFAR10 dataset. Before we dive into it, **do not underestimate** the importance of properly pre-processing the data before training neural networks. This step is as important as defining the neural architectures themselves, but is very often overlooked.

In this exercise you are provided with an already defined multi-layer perceptron that you can train to classify CIFAR10 images. The structure of the network is already defined, yet some crucial hyperparameters are missing. It is your job to fill them in and successfully train the network. As part of the exercise, you are also required to monitor the evolution of training: this usually consists of checking how the training and testing losses evolve during training and keeping track of the model's accuracy on the testing set. Report these statistics with some plots. In addition, transfer the network and the mini-batches to the GPU to speed up training.

Fill in the code below, discuss your choices and your results. Are you satisfied with the final accuracy?

In [None]:
from tqdm import tqdm

input_features = 3 * 32 * 32
output_features = 10
hidden_features = 10000
learning_rate = 0.001
num_epochs = 50
batch_size = 100 # given
weight_decay = 0.0001

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

trainset = datasets.CIFAR10(root='./data', train=True, transform=transform_train, download=True)
testset = datasets.CIFAR10(root='./data', train=False, transform=transform_test, download=True)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2, pin_memory=True)

class MLP(nn.Sequential):
    def __init__(self, input_features, output_features, hidden_features):
        super().__init__(
            nn.Flatten(),
            nn.Linear(input_features, hidden_features),
            nn.ReLU(),
            nn.Linear(hidden_features, hidden_features),
            nn.ReLU(),
            nn.Linear(hidden_features, hidden_features),
            nn.ReLU(),
            nn.Linear(hidden_features, output_features),
        )

network = MLP(input_features, output_features, hidden_features)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(network.parameters(), lr=learning_rate, weight_decay=weight_decay)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', factor=0.1, patience=3)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
network.to(device)
criterion.to(device)

def train(num_epochs):
    train_avg_loss = []
    test_avg_loss = []
    test_accuracy = []
    best_accuracy = 0

    for i in tqdm(range(num_epochs)):
        train_losses = []
        test_losses = []

        for x, y in trainloader:
            x, y = x.to(device), y.to(device)
            pred = network(x)
            loss = criterion(pred, y)
            train_losses.append(loss.detach())

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        with torch.no_grad():
            correct = 0

            for x, y in testloader:
                x, y = x.to(device), y.to(device)
                pred = network(x)
                loss = criterion(pred, y)
                test_losses.append(loss)

                y_pred = pred.argmax(dim=-1)
                correct += (y_pred == y).sum().item()

            accuracy = correct / len(testset)
            scheduler.step(accuracy)

            if accuracy > best_accuracy:
                best_accuracy = accuracy
                torch.save(network.state_dict(), 'cifar10_mlp.pth')

        train_avg_loss.append(torch.stack(train_losses).mean())
        test_avg_loss.append(torch.stack(test_losses).mean())
        test_accuracy.append(accuracy)

    return train_avg_loss, test_avg_loss, test_accuracy


In [None]:
train_avg_loss, test_avg_loss, test_accuracy = train(num_epochs)

In [None]:
train_avg_loss_cpu = [train_loss.cpu() for train_loss in train_avg_loss]
test_avg_loss_cpu = [test_loss.cpu() for test_loss in test_avg_loss]

In [None]:
import os

os.makedirs("./MLP", exist_ok=True)  # does not fail if the folder already exists

np.save('./MLP/TrAvgLoss', train_avg_loss_cpu)
np.save('./MLP/TeAvgLoss', test_avg_loss_cpu)
np.save('./MLP/TeAcc', test_accuracy)

Plot the statistics below and discuss your hyperparameter choices.

In [None]:
train_avg_loss_cpu = np.load('./MLP/TrAvgLoss.npy')
test_avg_loss_cpu = np.load('./MLP/TeAvgLoss.npy')
test_accuracy = np.load('./MLP/TeAcc.npy')

In [None]:
# plot the training and testing losses
plt.figure(figsize=(10, 5))
plt.plot(train_avg_loss_cpu, label="Training Loss")
plt.plot(test_avg_loss_cpu, label="Testing Loss")
plt.legend()
plt.title("Training and Testing Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.show()

# plot the testing accuracy
plt.figure(figsize=(10, 5))
plt.plot(test_accuracy)
plt.title("Testing Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.show()

In [None]:
print("Final testing accuracy", test_accuracy[-1])

The chosen parameters are the following:
- `optimizer` = Adam, very popular
- `loss_function` = CrossEntropyLoss, classical for classification
- `input_features` = 3 x 32 x 32 to represent the RGB 32 by 32 pixels images
- `output_features` = 10 representing the 10 classes from the dataset
- `hidden_features` = 10000 units per hidden layer, to make a pretty wide multilayer perceptron (its depth is fixed by the given architecture)
- `learning_rate` = 0.001 usual value for the learning rate
- `epochs` = 50 usual value
- `weight_decay` = 0.0001 usual value for weight decay in the optimizer
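
To put the `hidden_features` choice into perspective, the number of trainable parameters of the `MLP` above can be worked out by hand: each `nn.Linear(n_in, n_out)` holds `n_in * n_out` weights plus `n_out` biases. The helper name `linear_params` is ours.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

input_features = 3 * 32 * 32
hidden_features = 10000
output_features = 10

layer_sizes = [
    (input_features, hidden_features),
    (hidden_features, hidden_features),
    (hidden_features, hidden_features),
    (hidden_features, output_features),
]
total = sum(linear_params(n_in, n_out) for n_in, n_out in layer_sizes)
print(f'{total:,} parameters')  # 230,850,010 parameters
```

This is roughly 230 million parameters, or about 0.9 GB in 32-bit floats, most of them in the two hidden-to-hidden layers.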

We also implemented data augmentation on the training set as seen in the previous snippets to improve our neural network and a scheduler on the optimizer to dynamically tune our learning rate.
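
As an illustration of how such a scheduler reacts, the sketch below feeds `ReduceLROnPlateau` a made-up accuracy curve that stops improving, using a dummy parameter instead of a real network. With `patience=3`, the learning rate is multiplied by `factor` once the metric has failed to improve for more than three consecutive calls.

```python
import torch

dummy_param = torch.nn.Parameter(torch.zeros(1))  # stand-in for network.parameters()
opt = torch.optim.Adam([dummy_param], lr=0.001)
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, mode='max', factor=0.1, patience=3)

# Made-up accuracies: one improvement, then a plateau.
fake_accuracies = [0.40, 0.50, 0.50, 0.50, 0.50, 0.50]
lrs = []
for acc in fake_accuracies:
    opt.step()       # a real loop would train for one epoch here
    sched.step(acc)  # the scheduler only looks at the monitored metric
    lrs.append(opt.param_groups[0]['lr'])

print(lrs)
```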

We reach an accuracy of around 55% on the test set. This is not really impressive, but it is what we expected: we are using a simple multilayer perceptron rather than a convolutional neural network, whose convolutions are better suited to exploiting the spatial structure of images.

## 3. Create a custom dataset

Sometimes you would like to train a model on your own dataset, which will very likely not be part of `torchvision`. To overcome this you can create a custom dataset class which will handle the data for you. This can be done by inheriting from `torch`'s `data.Dataset` class and defining the methods `__len__` and `__getitem__` (see the [documentation](https://pytorch.org/docs/stable/data.html)).
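
Before tackling images, a minimal sketch of these two methods on a toy dataset may help. The class `SquaresDataset` is purely illustrative: item `i` is the pair `(i, i**2)`.

```python
import torch
import torch.utils.data as data

class SquaresDataset(data.Dataset):
    """Toy dataset (illustrative only): item i is the pair (i, i**2)."""

    def __init__(self, size):
        super().__init__()
        self.size = size

    def __len__(self):
        # Number of items; the DataLoader uses it to build its batches.
        return self.size

    def __getitem__(self, index):
        # Build one (input, target) pair on demand.
        x = torch.tensor([float(index)])
        y = torch.tensor([float(index) ** 2])
        return x, y

squares = SquaresDataset(6)
squares_loader = data.DataLoader(squares, batch_size=3, shuffle=False)
first_x, first_y = next(iter(squares_loader))
print(len(squares), first_x.flatten().tolist(), first_y.flatten().tolist())
```

The `DataLoader` takes care of stacking the individual items into mini-batches, so `__getitem__` only ever has to return a single sample.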

In this exercise your goal is to program a custom dataset class which you will later use for training a CNN. We will use the Kaggle Cats and Dogs dataset which you can download from [here](https://www.microsoft.com/en-us/download/details.aspx?id=54765). Note that some images may have different shapes. It is up to you to deal with this elegantly. In addition, some images may be corrupted. You can simply remove those.
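
One possible way to find corrupted files without knowing their names in advance is to let PIL try to parse each file. The sketch below does this on a temporary folder containing one valid image and one empty file. `find_corrupted` is a hypothetical helper, not part of the assignment, and `verify()` may not catch every kind of damage.

```python
import os
import tempfile
from PIL import Image

def find_corrupted(folder):
    """Return the sorted names of the files in `folder` that PIL cannot read."""
    bad = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        try:
            with Image.open(path) as img:
                img.verify()  # integrity check without decoding all pixels
        except (OSError, SyntaxError):
            bad.append(name)
    return bad

# Demo on throwaway files: a valid image and an empty "image".
demo_dir = tempfile.mkdtemp()
Image.new('RGB', (8, 8)).save(os.path.join(demo_dir, 'good.png'))
open(os.path.join(demo_dir, 'broken.jpg'), 'wb').close()

corrupted = find_corrupted(demo_dir)
print(corrupted)
```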

When programming a custom dataset class, you have to start by defining the constructor, which will get as input the location of your dataset, whether the images that will be returned will serve for training or testing, and some other potential attributes. For this exercise we will be using 20000 images for training and 5000 images for testing. For the `__getitem__` function you may find the `PIL.Image.open` useful. Do not forget to transform the images into tensors and return the image labels as well ($0$ or $1$).

In [None]:
import os

!wget https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip
!unzip kagglecatsanddogs_5340.zip

In [None]:
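# Remove the two known corrupted images (Cat/666 and Dog/11702)
from tqdm import tqdm
import os
import random
import shutil

for corrupted_path in ['./PetImages/Cat/666.jpg', './PetImages/Dog/11702.jpg']:
    if os.path.exists(corrupted_path):  # keeps the cell safe to re-run
        os.remove(corrupted_path)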

num_test_images = 2500

src_folder = './PetImages'
dst_folder = './PetImages_CNN'

if os.path.exists(dst_folder):
    shutil.rmtree(dst_folder)

os.makedirs(os.path.join(dst_folder, "Train", "Cat"), exist_ok=True)
os.makedirs(os.path.join(dst_folder, "Train", "Dog"), exist_ok=True)
os.makedirs(os.path.join(dst_folder, "Test", "Cat"), exist_ok=True)
os.makedirs(os.path.join(dst_folder, "Test", "Dog"), exist_ok=True)


for cl in ['Cat', 'Dog']:
    src_class_folder = os.path.join(src_folder, cl)
    images = os.listdir(src_class_folder)
    random.shuffle(images)
    num_images = len(images)
    print("# of images of " + cl + " = " + str(num_images))

    test_images = images[:num_test_images]
    train_images = images[num_test_images:]

    for image in tqdm(test_images):
        src_path = os.path.join(src_folder, cl, image)
        dst_path = os.path.join(dst_folder, 'Test', cl, image)
        shutil.copy(src_path, dst_path)

    for image in tqdm(train_images):
        src_path = os.path.join(src_folder, cl, image)
        dst_path = os.path.join(dst_folder, 'Train', cl, image)
        shutil.copy(src_path, dst_path)

In [None]:
class CatAndDogsDataset(data.Dataset):
    def __init__(self, root_dir, train=True, transform=None):
        """Initializes a dataset containing images and labels."""
        super().__init__()
        self.root_dir = root_dir
        self.transform = transform
        self.file_list = []
        self.train=train
        self.classes = {'Cat' : 0, 'Dog' : 1}

        # Both splits share the same layout: <root_dir>/<Train|Test>/<class>/*.jpg
        split = 'Train' if self.train else 'Test'
        for cl in self.classes:
            dir_path = os.path.join(root_dir, split, cl)
            for filename in os.listdir(dir_path):
                if filename.endswith('.jpg'):
                    self.file_list.append((os.path.join(dir_path, filename), self.classes[cl]))

    def __len__(self):
        """Returns the size of the dataset."""
        return len(self.file_list)

    def __getitem__(self, index):
        """Returns the index-th data item of the dataset."""
        path, label = self.file_list[index]
        img = Image.open(path).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        return img, label

Let us have a quick look at these samples.

In [None]:
def show_images(img):
    plt.imshow(transforms.functional.to_pil_image(img))
    plt.show()

transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor()
])

my_dataset_train = CatAndDogsDataset('./PetImages_CNN', transform=transform, train=True)
my_loader_train = data.DataLoader(my_dataset_train, batch_size=4, shuffle=True, num_workers=2)

my_dataset_test = CatAndDogsDataset('./PetImages_CNN', transform=transform, train=False)
my_loader_test = data.DataLoader(my_dataset_test, batch_size=4, shuffle=True, num_workers=2)

classes = ['Cat', 'Dog']

print("Training Dataset length = ", len(my_dataset_train))
images, labels = next(iter(my_loader_train))
show_images(utils.make_grid(images))
print(*[classes[l] for l in labels])

print("Testing Dataset length = ", len(my_dataset_test))
images, labels = next(iter(my_loader_test))
show_images(utils.make_grid(images))
print(*[classes[l] for l in labels])

## 4. Classifying the Cats and Dogs dataset with a CNN

As we have seen in class, classifying images with a multi-layer perceptron isn't really a good idea. Convolutional Neural Networks (CNN) are in fact a much better option for this task. It is now your job to create your custom CNN and train it on the Cats and Dogs Dataset.

Similarly to what you have done when classifying the CIFAR10 dataset you are again required to report and discuss the performance of your model.

In [None]:
from tqdm import tqdm
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_epochs = 30
learning_rate = 0.001

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()

        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1, stride=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )

        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1, stride=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )

        self.layer3 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1, stride=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )

        self.layer4 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1, stride=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )

        self.layer5 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3, padding=1, stride=1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )

        self.fc1 = nn.Linear(7 * 7 * 256, 512)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(512, 2)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.layer5(out)
        out = out.view(out.size(0), -1)
        out = self.relu(self.fc1(out))
        out = self.fc2(out)
        return out

cats_dogs_CNN = CNN().to(device)
cats_dogs_CNN.train()

optimizer = optim.Adam(params=cats_dogs_CNN.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

transform_train = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

transform_test = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

trainset = CatAndDogsDataset('./PetImages_CNN', train=True, transform=transform_train)
testset = CatAndDogsDataset('./PetImages_CNN', train=False, transform=transform_test)

batch_size = 100

trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=2, pin_memory=True)

In [None]:
def train_CNN():
    train_avg_loss = []
    test_avg_loss = []
    test_accuracy = []

    for epoch in tqdm(range(num_epochs)):
        train_losses = torch.tensor(0.0, device=device)
        test_losses = torch.tensor(0.0, device=device)
        accuracy = torch.tensor(0.0, device=device)

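        cats_dogs_CNN.train()  # BatchNorm normalizes with batch statistics during training
        for x, y in trainloader:
            x, y = x.to(device), y.to(device)
            out = cats_dogs_CNN(x)
            loss = criterion(out, y)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_losses += loss.detach() / len(trainloader)

        cats_dogs_CNN.eval()  # BatchNorm switches to its running statistics for evaluation
        with torch.no_grad():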
            for x, y in testloader:
                x, y = x.to(device), y.to(device)
                out = cats_dogs_CNN(x)
                loss = criterion(out, y)

                accuracy += ((out.argmax(dim=1)==y).float().mean()) / len(testloader)
                test_losses += loss / len(testloader)

        print('Epoch {epoch} : Accuracy = {accuracy:.4f}'.format(epoch=epoch, accuracy=accuracy.item()))
        train_avg_loss.append(train_losses)
        test_avg_loss.append(test_losses)
        test_accuracy.append(accuracy)

    train_avg_loss = torch.stack(train_avg_loss).cpu()
    test_avg_loss = torch.stack(test_avg_loss).cpu()
    test_accuracy = torch.stack(test_accuracy).cpu()

    return train_avg_loss, test_avg_loss, test_accuracy

In [None]:
train_avg_loss, test_avg_loss, test_accuracy = train_CNN()

In [None]:
dst_folder = './CNN'

if os.path.exists(dst_folder):
    shutil.rmtree(dst_folder)

os.makedirs(dst_folder, exist_ok=True)

np.save(os.path.join(dst_folder, 'TrAvgLoss.npy'), train_avg_loss)
np.save(os.path.join(dst_folder, 'TeAvgLoss.npy'), test_avg_loss)
np.save(os.path.join(dst_folder, 'TeAcc.npy'), test_accuracy)

In [None]:
src_folder = './CNN'

train_avg_loss_cpu = np.load(os.path.join(src_folder, 'TrAvgLoss.npy'))
test_avg_loss_cpu = np.load(os.path.join(src_folder, 'TeAvgLoss.npy'))
test_accuracy_cpu = np.load(os.path.join(src_folder, 'TeAcc.npy'))

plt.figure(figsize=(10, 5))
plt.plot(train_avg_loss_cpu, label='Training loss')
plt.plot(test_avg_loss_cpu, label='Test loss')
plt.legend()
plt.title("Training and Testing Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.show()

plt.figure(figsize=(10, 5))
plt.plot(test_accuracy_cpu, label='Test accuracy')
plt.legend()
plt.title("Testing Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.show()

The chosen parameters are the following:
- `optimizer` = Adam, very popular
- `loss` = CrossEntropyLoss, classical for classification
- `learning_rate` = 0.001 usual value for the learning rate
- `epochs` = 30 usual value
- Convolutional network :

INPUT → [[CONV → BN → RELU]x1 → POOL]x5 → [FC → RELU]x1 → FC

We increased the number of output channels of the successive convolutions from 3 (RGB) to 16, 32, 64, 128 and 256. We handled images of 224 x 224 pixels. This architecture seemed to work pretty well.
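
The `7 * 7 * 256` input size of `fc1` follows from these choices. A short worked computation, using the standard output-size formula for convolution and pooling layers (the helper `out_size` is ours):

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 224
channels = [16, 32, 64, 128, 256]
for c in channels:
    size = out_size(size, kernel=3, stride=1, padding=1)  # Conv2d keeps the size
    size = out_size(size, kernel=2, stride=2)             # MaxPool2d(2) halves it
    print(f'{c:>3} channels, {size} x {size}')

flattened = channels[-1] * size * size
print(flattened)  # 12544 = 7 * 7 * 256
```

With a different input resolution, the first argument of `fc1` would have to change accordingly.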

We also implemented data augmentation on the training set as seen in the previous snippets to improve our neural network.
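
Since the `CNN` above contains `nn.BatchNorm2d` layers, it behaves differently in training and in evaluation mode: in training mode each batch is normalized with its own statistics, while in evaluation mode the running averages are used. A small sketch on random data (not the cats and dogs images) makes the difference visible.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)
batch = torch.randn(16, 3, 8, 8) * 5 + 2  # deliberately not zero-mean / unit-variance

bn.train()
out_train = bn(batch)  # normalized with the statistics of this batch

bn.eval()
with torch.no_grad():
    out_eval = bn(batch)  # normalized with the running statistics

print(out_train.mean().item(), out_eval.mean().item())
```

This is why `model.train()` and `model.eval()` are usually called before the training and the evaluation loops respectively.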

In the end we reach an accuracy of 85%, which is pretty good for a simple convolutional neural network. We could probably have obtained better results with a more complex architecture, or one better tailored to the problem, but that intuition will come with practice on different problems and projects.

## Feedback

Now that you are done with this final deep-learning assignment here are some final questions about the exercises you were required to solve:

<span style="color:blue">How much time did you spend on this homework?</span>

3 hours coding + 5 hours training

<span style="color:blue">Do you feel comfortable with what it means to define a neural network and train it?</span>

Yes

<span style="color:blue">Do you think you now have enough preliminary knowledge for successfully starting to work on your course final project?</span>

Yes, with both homeworks done we can dive into the project.

<span style="color:blue">If you had to go through the two homeworks again, is there something you would have liked to explore more or to have explained in more detail?</span>

We think that the basics were nicely covered and that we could always refer to the theoretical course to check the correct coding principles for the different problems.