# Simple architectures but Intermediate PyTorch tools


#### Imports

In [None]:
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import pathlib
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

In [None]:
_ = torch.manual_seed(25)

As usual, we could set the device to "cuda" or "cpu" depending on what your computer has. For now we won't really need a GPU, so we are not going to bother calling `.to(device)` (if we don't, everything stays on the CPU by default). You should understand where to put `.to(device)` if you want to use your GPU, but that's not the point of this TD. One step at a time...
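For reference, here is a minimal sketch of where the device calls would go. The layer and the random batch are hypothetical placeholders, not objects from this TD.

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Both the model and the data must live on the same device
layer = torch.nn.Linear(4, 2).to(device)   # hypothetical tiny model
batch = torch.rand(8, 4).to(device)        # hypothetical batch of 8 samples

output = layer(batch)
print(output.device)
```

Forgetting one of the two `.to(device)` calls is the usual cause of "expected all tensors to be on the same device" errors.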

#### Data

Before we begin, let's gather some data.
Like a chef preparing ingredients before cooking, the data has already been set aside for us by the assistants.

We'll start with a small dataset, as our goal isn't to train the largest model or use the largest dataset just yet. The data we'll be using is a subset of the `Food101` ([credit where it's due](https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/)) dataset plus some images we added ourselves. This dataset is a popular benchmark in computer vision: it contains 1,000 images for each of 101 different types of food, for a total of 101,000 images (75,750 for training and 25,250 for testing).

Instead of 101 food classes though, we're going to start with 3: pizza, steak and sushi.

Download the whole pizza / steak / sushi dataset [here](https://drive.google.com/file/d/1JNiqVEbaOyRIWLc3UwHS4Y6CO60wx1iu/view?usp=sharing) and unzip it. Gathering and preparing data is probably what will take 60% of your time if you do Applied Deep Learning in the real world, but here it's pretty much ready.

If you look at the new folder on your computer, all images of pizza are contained in the `pizza/` directory.

This format (putting the images of a class in a folder whose name is the class name) is popular across many different image classification benchmarks, including [ImageNet](https://www.image-net.org/) (one of the most popular computer vision benchmark datasets).

In [None]:
# Setup train and testing paths
data_path = pathlib.Path(".")  # that is the current directory and assumes the data is in the same folder as this notebook
train_dir = data_path / "Food-3/train"
test_dir = data_path / "Food-3/test"

train_dir, test_dir

Take a closer look at the images. There is a problem, can you spot it? The images do not all have the same dimensions! We are therefore going to learn how to resize images with `torchvision.transforms` (it can do a lot more, but today we're only resizing images).

We will not create a `CustomDataset` with `class CustomDataset(Dataset)` like we did in TD 2a. When we deal with images, audio files, etc. that are nicely organised in folders, there are already implemented classes that we can use. The way they're implemented in PyTorch is similar to how we implemented our own `CustomDataset`, inheriting from the abstract* class `Dataset`.

\* An abstract class is a class we do not want to (and cannot) instantiate. We only want to inherit from it.

Let's use the class [`torchvision.datasets.ImageFolder`](https://pytorch.org/vision/stable/generated/torchvision.datasets.ImageFolder.html#torchvision.datasets.ImageFolder), which is built into `torchvision`.

The class `ImageFolder` inherits from `DatasetFolder`, which inherits from `VisionDataset`, which has a nice method that redefines what happens when we print the object (see below!). When ready-made classes exist, it's usually better to use them, unless you're a senior software engineer ready to spend hours of work on your own.

In [None]:
print(f"Train data:\n{train_data}\nTest data:\n{test_data}")

---
As a comparison, what we got from printing our CustomDataset last time is not great:


In [None]:
import numpy as np
from torch.utils.data import Dataset

# Data Generation
np.random.seed(42)
xs = np.random.rand(100, 1)
ys = 1 + 2 * xs + .1 * np.random.randn(100, 1)

class CustomDataset(Dataset):
    def __init__(self, x_tensor, y_tensor):
        self.x = x_tensor
        self.y = y_tensor

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.x)

xs_tensor = torch.from_numpy(xs).float()
ys_tensor = torch.from_numpy(ys).float()

train_data_custom = CustomDataset(xs_tensor, ys_tensor)
print(train_data_custom)

---
Let's look at one element of our object `train_data`.

We don't have a tensor! That's because we didn't tell PyTorch to transform the image into a tensor. It's not automatic. Let's rewrite our function `data_transform`:

If we look at the [torchvision.transforms.ToTensor() documentation](https://pytorch.org/vision/main/generated/torchvision.transforms.ToTensor.html), we can see that instead of the standard pixel values from 0 to 255, we will get values between 0.0 and 1.0.

Let's recreate our objects `train_data` and `test_data`:

Is 0 pizza, steak or sushi? Our object `torchvision.datasets.ImageFolder` has a useful attribute `classes`.

It's a pizza.

Our images are now tensors with shape `[3, 64, 64]`. We can plot them, but `matplotlib` wants `HWC` (Height, Width, Color channels), so we need to rearrange `[3, 64, 64]` into `[64, 64, 3]`. This is something you should feel comfortable doing (this StackOverflow answer, https://stackoverflow.com/a/51145633/11092636, can help you understand the two main methods `.permute` and `.view`; here we want to swap axes, so we will use `.permute`).
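To see why `.permute` is the right tool here and `.view` is not, here is a small sketch on a hypothetical 2x2 "image" with 3 channels. Both calls produce the target shape, but only `.permute` keeps each pixel's channel values together.

```python
import torch

# A tiny CHW "image": 3 channels, 2x2 pixels, values 0..11
image_chw = torch.arange(3 * 2 * 2).reshape(3, 2, 2)

# permute swaps the axes: pixel (0, 0) keeps its three channel values
image_hwc = image_chw.permute(1, 2, 0)

# view only re-reads the same memory with a new shape: channels get mixed up
image_wrong = image_chw.view(2, 2, 3)

print(image_hwc.shape, image_wrong.shape)
print(image_hwc[0, 0])    # channel values of the top-left pixel
print(image_wrong[0, 0])  # just the first three numbers in memory
```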

If you remember the last TD, after getting our data as a PyTorch `Dataset` we need to turn it into a `DataLoader`.

We'll do so using [`torch.utils.data.DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader).

Reminder: turning our `Dataset`s into `DataLoader`s makes them iterable, so a model can go through them and learn the relationships between samples and targets (features and labels).

To keep things simple, we'll use a `batch_size=1`.
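A minimal sketch with `batch_size=1`, using random tensors as a hypothetical stand-in for the image dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset: 6 random "images" of shape [3, 64, 64] with labels 0..2
fake_images = torch.rand(6, 3, 64, 64)
fake_labels = torch.tensor([0, 1, 2, 0, 1, 2])
demo_dataset = TensorDataset(fake_images, fake_labels)

# shuffle=True is typical for training data; a test loader usually keeps shuffle=False
demo_dataloader = DataLoader(demo_dataset, batch_size=1, shuffle=True)

X, y = next(iter(demo_dataloader))
print(X.shape, y.shape)       # a batch dimension of size 1 is added in front
print(len(demo_dataloader))   # number of batches per epoch
```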

Let's create our neural network

Let's do a forward pass to test the model when it's not been trained.
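The two steps above could be sketched as follows. The architecture is an assumption: a small fully connected network on the flattened `64*64*3` input, which matches the `.view(-1, 64*64*3)` call used later in this TD. The class name `TinyMLP` and the layer sizes are hypothetical.

```python
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """Hypothetical small classifier working on flattened 64x64 RGB images."""

    def __init__(self, in_features=64 * 64 * 3, hidden_units=32, num_classes=3):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_features, hidden_units),
            nn.ReLU(),
            nn.Linear(hidden_units, num_classes),
        )

    def forward(self, x):
        return self.layers(x)


torch.manual_seed(25)
demo_model = TinyMLP()

# Forward pass on one random "image" with the untrained model
fake_batch = torch.rand(1, 3, 64, 64)
demo_model.eval()
with torch.inference_mode():
    logits = demo_model(fake_batch.view(-1, 64 * 64 * 3))  # flatten to [1, 12288]
    probs = logits.softmax(dim=1)                          # logits -> probabilities
    pred = logits.argmax(dim=1)                            # index of the largest logit

print(logits.shape, probs, pred)
```

Before training, the predicted probabilities should be close to uniform, so the prediction is essentially a guess.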

Let's train the model
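A minimal sketch of what a training function might look like. The explicit `model`, `dataloader` and `epochs` arguments, the cross-entropy loss, the learning rate and the random stand-in data are all assumptions made to keep the example self-contained; the notebook's own `train` only takes the optimizer.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def train(optimizer, model, dataloader, epochs=3):
    """Hypothetical training loop returning the average loss of each epoch."""
    loss_fn = nn.CrossEntropyLoss()
    epoch_losses = []
    for _ in range(epochs):
        model.train()
        running_loss = 0.0
        for X, y in dataloader:
            logits = model(X.view(X.shape[0], -1))  # flatten each image
            loss = loss_fn(logits, y)
            optimizer.zero_grad()   # reset gradients from the previous step
            loss.backward()         # backpropagation
            optimizer.step()        # update the weights
            running_loss += loss.item()
        epoch_losses.append(running_loss / len(dataloader))
    return epoch_losses


torch.manual_seed(25)
demo_model = nn.Sequential(nn.Linear(64 * 64 * 3, 16), nn.ReLU(), nn.Linear(16, 3))
demo_loader = DataLoader(
    TensorDataset(torch.rand(6, 3, 64, 64), torch.tensor([0, 1, 2, 0, 1, 2])),
    batch_size=1,
    shuffle=True,
)
optimizer_adam = torch.optim.Adam(demo_model.parameters(), lr=1e-4)

epoch_losses = train(optimizer_adam, demo_model, demo_loader)
print(epoch_losses)
```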

In [None]:
train(optimizer_adam)

Now let's test our trained model:
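A minimal sketch of an accuracy computation over a test `DataLoader`. The `evaluate` helper, the linear stand-in model and the random data are hypothetical; in the notebook you would pass the trained model and the real test loader.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, dataloader):
    """Hypothetical helper: fraction of correctly classified samples."""
    model.eval()
    correct, total = 0, 0
    with torch.inference_mode():  # no gradients needed when testing
        for X, y in dataloader:
            logits = model(X.view(X.shape[0], -1))
            correct += (logits.argmax(dim=1) == y).sum().item()
            total += y.shape[0]
    return correct / total


torch.manual_seed(25)
demo_model = nn.Linear(64 * 64 * 3, 3)  # untrained stand-in model
demo_test_loader = DataLoader(
    TensorDataset(torch.rand(5, 3, 64, 64), torch.randint(0, 3, (5,))),
    batch_size=1,
)

accuracy = evaluate(demo_model, demo_test_loader)
print(f"Test accuracy: {accuracy:.2f}")
```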

We will "scale up" the number of images in the next class, and we will add some techniques we've seen in class today to try to improve our accuracy even further!

Keep in mind though: if you train your model with both optimizers and pick one because its accuracy on your testing set is better, you've effectively used information from the testing set. It is not independent anymore and has inherently become a validation set, so you need an extra independent test set to report your final performance.
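One common way to avoid this, sketched here on random stand-in data, is to carve a validation set out of the training data with `torch.utils.data.random_split`, compare the optimizers on it, and keep the test set untouched until the very end. The 8/2 split sizes are arbitrary.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the full training set: 10 random "images" with labels 0..2
full_train = TensorDataset(torch.rand(10, 3, 64, 64), torch.randint(0, 3, (10,)))

# A seeded generator makes the split reproducible
generator = torch.Generator().manual_seed(25)
train_subset, val_subset = random_split(full_train, [8, 2], generator=generator)

print(len(train_subset), len(val_subset))
```

Model and optimizer choices are then made on `val_subset`, and the test set is only used once to report the final accuracy.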

These are just numbers though. Let's check that the predictions are actually correct on some examples by showing the images, the expected label and the predicted label.

Execute this cell several times to iterate through the testing set and see how your model performs:

In [None]:
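# `my_iterable` is assumed to be an iterator created in an earlier cell,
# e.g. with iter(...) on the test DataLoader
X_test_one_batch, y_test_one_batch = next(my_iterable)
X_test_one, y_test_one = X_test_one_batch[0], y_test_one_batch[0]

# 0. Eval mode
model.eval()

# 1. Forward pass on the flattened image (no gradients needed for inference)
with torch.inference_mode():
    model_output = model(X_test_one.view(-1, 64*64*3))

# 2. Calculate predicted label (index of the largest logit)
test_pred_label = model_output.argmax(dim=1).item()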

# 3. Display
print(f"Predicted class: {class_names[test_pred_label]}")
print(f"Actual class: {class_names[y_test_one]}")
plt.figure(figsize=(10, 7))
plt.imshow(X_test_one.permute(1, 2, 0))
_ = plt.axis("off")