<a href="https://colab.research.google.com/github/Freemanlabs/giz-rwanda-ai-training/blob/master/cv-with-pytorch/01_pytorch_fundamentals/first_steps_with_pytorch.ipynb" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## First steps with PyTorch

### Installing PyTorch

To install PyTorch, we recommend consulting the latest instructions on the official https://pytorch.org website. Below, we will outline the basic steps that will work on most systems.

Depending on how your system is set up, you can typically just use Python’s pip installer and install PyTorch from PyPI by executing the following from your terminal:

In [None]:
#! pip install torch

In [None]:
import torch
import numpy as np

print('PyTorch version:', torch.__version__)

np.set_printoptions(precision=3)

In [None]:
! python -c 'import torch; print(torch.__version__)'

### Creating tensors in PyTorch

Now, let’s consider a few different ways of creating tensors, and then see some of their properties and how to manipulate them. Firstly, we can simply create a tensor from a list or a NumPy array using the `torch.tensor` or the `torch.from_numpy` function as follows:

In [None]:
a = [1, 2, 3]
b = np.array([4, 5, 6], dtype=np.int32)

t_a = torch.tensor(a)
t_b = torch.from_numpy(b)

print(t_a)
print(t_b)

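This resulted in tensors `t_a` and `t_b`, each with `shape=(3,)`. Their data types are adopted from their sources: `t_a` is `int64`, the PyTorch default for Python integers, while `t_b` keeps the `int32` type of the NumPy array. Similar to NumPy arrays, tensors expose these properties through attributes such as `shape` and `dtype`. We can also check whether an object is a tensor, and create a tensor of ones with a given shape: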

In [None]:
torch.is_tensor(a), torch.is_tensor(t_a)

In [None]:
t_ones = torch.ones(2, 3)
t_ones.shape

In [None]:
print(t_ones)
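The same attributes can be inspected on `t_a` and `t_b`. The following is a minimal sketch that rebuilds both tensors so it runs on its own. It also illustrates that `torch.from_numpy` shares memory with the source array, whereas `torch.tensor` copies its input:

```python
import numpy as np
import torch

a = [1, 2, 3]
b = np.array([4, 5, 6], dtype=np.int32)

t_a = torch.tensor(a)       # copies the list; Python ints become int64
t_b = torch.from_numpy(b)   # shares memory with b and keeps its int32 dtype

print(t_a.shape, t_a.dtype)
print(t_b.shape, t_b.dtype)

# Because t_b shares memory with b, an in-place change to b is visible in t_b.
b[0] = 40
print(t_b)
```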

Finally, creating a tensor of random values can be done as follows:

In [None]:
rand_tensor = torch.rand(2,3)
print(rand_tensor)

### Manipulating the data type and shape of a tensor

Learning ways to manipulate tensors is necessary to make them compatible for input to a model or an operation. In this section, you will learn how to manipulate tensor data types and shapes via several PyTorch functions that cast, reshape, transpose, and squeeze (remove dimensions).

The `torch.Tensor.to()` method can be used to change the data type of a tensor to a desired type:

In [None]:
t_a_new = t_a.to(torch.int64)
print(t_a_new.dtype)

See https://pytorch.org/docs/stable/tensor_attributes.html for all other data types.

As you will see in upcoming classes, certain operations require that the input tensors have a certain number of dimensions (that is, rank) associated with a certain number of elements (shape). Thus, we might need to change the shape of a tensor, add a new dimension, or squeeze an unnecessary dimension. 

PyTorch provides useful functions (or operations) to achieve this, such as `torch.transpose()`, `torch.reshape()`, and `torch.squeeze()`. Let’s take a look at some examples:

- Transposing a tensor:

In [None]:
t = torch.rand(3, 5)
t_tr = torch.transpose(t, 0, 1)

print(t.shape, ' --> ', t_tr.shape)

- Reshaping a tensor (for example, from a 1D vector to a 2D array):

In [None]:
t = torch.zeros(30)
t_reshape = t.reshape(5, 6)

print(t_reshape.shape)

- Removing the unnecessary dimensions (dimensions that have size 1, which are not needed):

In [None]:
t = torch.zeros(1, 2, 1, 4, 1)
t_sqz = torch.squeeze(t, 2)

print(t.shape, ' --> ', t_sqz.shape)
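Adding a new dimension, mentioned above, is not covered by the previous examples. As a small sketch, `torch.unsqueeze()` inserts a size-1 dimension at a given position, and calling `torch.squeeze()` without a `dim` argument removes every size-1 dimension at once (the variable names here are illustrative):

```python
import torch

t = torch.zeros(1, 2, 1, 4, 1)

# Without a dim argument, squeeze drops every size-1 dimension.
t_all = torch.squeeze(t)
print(t.shape, ' --> ', t_all.shape)

# unsqueeze inserts a new size-1 dimension at the given position,
# for example to add a leading batch dimension.
t_batched = torch.unsqueeze(t_all, 0)
print(t_all.shape, ' --> ', t_batched.shape)
```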

### Applying mathematical operations to tensors

Applying mathematical operations, in particular linear algebra operations, is necessary for building most machine learning models. In this subsection, we will cover some widely used linear algebra
operations, such as element-wise product, matrix multiplication, and computing the norm of a
tensor.

First, let’s instantiate two random tensors, one with a uniform distribution in the range [-1, 1) and the other with a standard normal distribution:

In [None]:
torch.manual_seed(1)

t1 = 2 * torch.rand(5, 2) - 1
t2 = torch.normal(mean=0, std=1, size=(5, 2))

Note that `torch.rand` returns a tensor filled with random numbers from a uniform distribution on the half-open interval [0, 1), so `2 * torch.rand(5, 2) - 1` lies in [-1, 1).

Notice that `t1` and `t2` have the same shape. Now, to compute the element-wise product of `t1` and `t2`, we can use the following:

In [None]:
t3 = torch.multiply(t1, t2)
print(t3)

To compute the mean, sum, and standard deviation along a certain axis (or axes), we can use `torch.mean()`, `torch.sum()`, and `torch.std()`. For example, the mean of each column in `t1` can be computed as follows:

In [None]:
t4 = torch.mean(t1, axis=0)
print(t4)
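The sum and standard deviation follow the same pattern. The sketch below recreates `t1` so it runs on its own, and uses the `dim` keyword, which is the native PyTorch name for the axis argument. Note that `torch.std()` computes the sample standard deviation (dividing by n - 1) by default:

```python
import torch

torch.manual_seed(1)
t1 = 2 * torch.rand(5, 2) - 1

col_sum = torch.sum(t1, dim=0)    # one value per column
col_std = torch.std(t1, dim=0)    # sample standard deviation per column
row_mean = torch.mean(t1, dim=1)  # one value per row

print(col_sum.shape, col_std.shape, row_mean.shape)
```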

The matrix-matrix product between `t1` and `t2` (that is, $t_{1} \times t_{2}^{T}$, where the superscript $T$ is for transpose) can be computed by using the `torch.matmul()` function as follows:

In [None]:
t5 = torch.matmul(t1, torch.transpose(t2, 0, 1))
print(t5)

On the other hand, computing $t_{1}^{T} \times t_{2}$ is performed by transposing `t1`, resulting in a tensor of size 2×2:

In [None]:
t6 = torch.matmul(torch.transpose(t1, 0, 1), t2)
print(t6)

Finally, the `torch.linalg.norm()` function is useful for computing the $L^{p}$ norm of a tensor. For example, we can calculate the $L^{2}$ norm of each row of `t1` (by passing `dim=1`) as follows:

In [None]:
norm_t1 = torch.linalg.norm(t1, ord=2, dim=1)
print(norm_t1)

To verify that this code snippet computes the $L^{2}$ norm of `t1` correctly, you can compare the results with the following NumPy function: `np.sqrt(np.sum(np.square(t1.numpy()), axis=1))`.

In [None]:
np.sqrt(np.sum(np.square(t1.numpy()), axis=1))

### Split, stack, and concatenate tensors

In this subsection, we will cover PyTorch operations for splitting a tensor into multiple tensors, or the reverse: stacking and concatenating multiple tensors into a single one.

Assume that we have a single tensor, and we want to split it into two or more tensors. For this, PyTorch provides a convenient `torch.chunk()` function, which divides an input tensor into a tuple of tensors. We set the desired number of splits with the integer `chunks` argument, and the dimension to split along with the `dim` argument. If the size of the input tensor along that dimension is divisible by `chunks`, all returned chunks have the same size; otherwise, the last chunk is smaller.

Alternatively, we can provide the desired sizes in a list using the `torch.split()` function. Let’s have a look at an example of both these options:

- Providing the number of splits:

In [None]:
torch.manual_seed(1)

t = torch.rand(6)

print(t)

t_splits = torch.chunk(t, 3)

[item.numpy() for item in t_splits]

In this example, a tensor of size 6 was divided into a tuple of three tensors, each with size 2. If the tensor size is not divisible by the `chunks` value, the last chunk will be smaller.
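The non-divisible case can be checked with a short sketch. It uses `torch.arange` instead of random values so the chunk contents are easy to read. It also shows that `torch.chunk()` may return fewer chunks than requested, because the chunk size is rounded up:

```python
import torch

t = torch.arange(7)

# 7 is not divisible by 3: the chunk size is ceil(7 / 3) = 3,
# so the last chunk holds only one element.
parts = torch.chunk(t, 3)
print([p.tolist() for p in parts])

# Requesting 4 chunks from 6 elements gives chunk size ceil(6 / 4) = 2,
# which yields only 3 chunks.
few = torch.chunk(torch.arange(6), 4)
print(len(few))
```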

- Providing the sizes of different splits:

Alternatively, instead of defining the number of splits, we can also specify the sizes of the
output tensors directly. Here, we are splitting a tensor of size 5 into tensors of sizes 3 and 2:

In [None]:
torch.manual_seed(1)
t = torch.rand(5)

print(t)

t_splits = torch.split(t, split_size_or_sections=[3, 2])
 
[item.numpy() for item in t_splits]
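As the parameter name `split_size_or_sections` suggests, `torch.split()` also accepts a single integer, which is then used as the size of every piece except possibly the last. A minimal sketch with illustrative values:

```python
import torch

t = torch.arange(5)

# An integer sets the size of each piece; the last piece holds the remainder.
pieces = torch.split(t, 2)
print([p.tolist() for p in pieces])
```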

Sometimes, we are working with multiple tensors and need to concatenate or stack them to create a single tensor. In this case, PyTorch functions such as `torch.stack()` and `torch.cat()` come in handy.

For example, let’s create a 1D tensor, `A`, containing 1s with size 3, and a 1D tensor, `B`, containing 0s with size 2, and concatenate them into a 1D tensor, `C`, of size 5:

In [None]:
A = torch.ones(3)
B = torch.zeros(2)

C = torch.cat([A, B], axis=0)
print(C)

If we create 1D tensors `A` and `B`, both with size 3, then we can stack them together to form a 2D tensor, `S`:

In [None]:
A = torch.ones(3)
B = torch.zeros(3)

S = torch.stack([A, B], axis=1)
print(S)
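The dimension passed to `torch.stack()` decides where the new axis is inserted. The following sketch compares stacking along dimension 0 and dimension 1, using the native `dim` keyword:

```python
import torch

A = torch.ones(3)
B = torch.zeros(3)

S0 = torch.stack([A, B], dim=0)  # A and B become the rows
S1 = torch.stack([A, B], dim=1)  # A and B become the columns

print(S0.shape, S1.shape)
```

Unlike `torch.cat()`, which joins tensors along an existing dimension, `torch.stack()` always creates a new one, so all inputs must have exactly the same shape.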

The PyTorch API has many operations that you can use for building a model, processing your data, and more. However, covering every function is outside the scope of this course, where we will focus on the most essential ones. For the full list of operations and functions, you can refer to the documentation page of PyTorch at https://pytorch.org/docs/stable/index.html.

## Building input pipelines in PyTorch

When we are training a deep NN model, we usually train the model incrementally using an iterative optimization algorithm such as stochastic gradient descent, as we have seen in previous classes.

As mentioned at the beginning of this class, `torch.nn` is a module for building NN models. In cases where the training dataset is rather small and can be loaded as a tensor into the memory, we can directly use this tensor for training. In typical use cases, however, when the dataset is too large to fit into the computer memory, we will need to load the data from the main storage device (for example, the hard drive or solid-state drive) in chunks, that is, batch by batch. (Note the use of the term “batch” instead of “mini-batch” in this class to stay close to the PyTorch terminology.) In addition, we may need to construct a data-processing pipeline to apply certain transformations and preprocessing steps to our data, such as mean centering, scaling, or adding noise to augment the training procedure and to prevent overfitting.

Applying preprocessing functions manually every time can be quite cumbersome. Luckily, PyTorch
provides a special class for constructing efficient and convenient preprocessing pipelines. In this section, we will see an overview of different methods for constructing a PyTorch `Dataset` and `DataLoader`, and implementing data loading, shuffling, and batching.

### Creating a PyTorch DataLoader from existing tensors

If the data already exists in the form of a tensor object, a Python list, or a NumPy array, we can easily create a data loader using the `torch.utils.data.DataLoader()` class. It returns an object of the `DataLoader` class, which we can use to iterate through the individual elements in the input dataset. As a simple example, consider the following code, which creates a data loader from a tensor of values from 0 to 5:

In [None]:
from torch.utils.data import DataLoader

t = torch.arange(6, dtype=torch.float32)
data_loader = DataLoader(t)

We can easily iterate through a dataset entry by entry as follows:

In [None]:
for item in data_loader:
    print(item)

If we want to create batches from this dataset, with a desired batch size of 3, we can do this with the `batch_size` argument as follows:

In [None]:
data_loader = DataLoader(t, batch_size=3, drop_last=False)

for i, batch in enumerate(data_loader, 1):
    print(f'batch {i}:', batch)

This will create two batches from this dataset, where the first three elements go into batch #1, and the remaining elements go into batch #2. The optional `drop_last` argument is useful for cases when the number of elements in the tensor is not divisible by the desired batch size. We can drop the last non-full batch by setting `drop_last` to `True`. The default value for `drop_last` is `False`.
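The effect of `drop_last` only becomes visible when the dataset size is not divisible by the batch size. A minimal sketch, assuming a tensor of seven values instead of the six used above:

```python
import torch
from torch.utils.data import DataLoader

t = torch.arange(7, dtype=torch.float32)

# With drop_last=False, the final batch contains the single leftover element.
keep = [batch.tolist() for batch in DataLoader(t, batch_size=3, drop_last=False)]

# With drop_last=True, the incomplete final batch is discarded.
drop = [batch.tolist() for batch in DataLoader(t, batch_size=3, drop_last=True)]

print(keep)
print(drop)
```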

We can always iterate through a dataset directly, but as you just saw, `DataLoader` provides automatic and customizable batching for a dataset.

### Combining two tensors into a joint dataset

Often, we may have the data in two (or possibly more) tensors. For example, we could have a tensor for features and a tensor for labels. In such cases, we need to build a dataset that combines these tensors, which will allow us to retrieve the elements of these tensors in tuples.

In [None]:
from torch.utils.data import Dataset

class JointDataset(Dataset):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __len__(self):
        return len(self.x)
    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

A custom `Dataset` class must contain the following methods to be used by the data loader later on:

- `__init__()`: This is where the initial logic happens, such as reading existing arrays, loading a file, filtering data, and so forth.
- `__len__()`: This returns the number of samples in the dataset, which the data loader needs for sampling and batching.
- `__getitem__()`: This returns the sample corresponding to the given index.

Assume that we have two tensors, `t_x` and `t_y`. Tensor `t_x` holds our feature values, each of size 3, and `t_y` stores the class labels. For this example, we first create these two tensors as follows:

In [None]:
torch.manual_seed(1)

t_x = torch.rand([4, 3], dtype=torch.float32)
t_y = torch.arange(4)

Then we create a joint dataset of `t_x` and `t_y` with the custom `Dataset` class as follows:

In [None]:
joint_dataset = JointDataset(t_x, t_y)

Finally, we can retrieve the first example, or print every example of the joint dataset, as follows:

In [None]:
example = next(iter(joint_dataset))

In [None]:
for example in joint_dataset:
    print('  x: ', example[0], '  y: ', example[1])
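When the joint dataset is simply a pairing of tensors, as here, a custom class is not strictly necessary. As an alternative sketch, the built-in `torch.utils.data.TensorDataset` class (not used elsewhere in this notebook) gives the same index-wise pairing:

```python
import torch
from torch.utils.data import TensorDataset

torch.manual_seed(1)
t_x = torch.rand(4, 3)
t_y = torch.arange(4)

# TensorDataset pairs the tensors along their first dimension.
joint_dataset = TensorDataset(t_x, t_y)

for i in range(len(joint_dataset)):
    x, y = joint_dataset[i]
    print('  x: ', x, '  y: ', y)
```

A custom `Dataset` remains the better fit when loading or preprocessing has to happen per sample, as in the image example later in this notebook.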

### Shuffle, batch, and repeat

As was mentioned in *Week 2, Training Simple Machine Learning Algorithms for Classification*, when training an NN model using stochastic gradient descent optimization, it is important to feed training data as randomly shuffled batches. You have already seen how to specify the batch size using the `batch_size` argument of a data loader object. Now, in addition to creating batches, you will see how to shuffle and reiterate over the datasets. We will continue working with the previous joint dataset.

First, let’s create a shuffled data loader from the `joint_dataset` dataset:

In [None]:
torch.manual_seed(1)
data_loader = DataLoader(dataset=joint_dataset, batch_size=2, shuffle=True)

In order to see what the data examples look like, we can execute the following code:

In [None]:
example_loader = next(iter(data_loader))

Here, each batch contains two data records (x) and the corresponding labels (y). Now we iterate through the data loader batch by batch as follows:

In [None]:
for i, batch in enumerate(data_loader, 1):
    print(f'batch {i}:', 'x:', batch[0],
          '\n         y:', batch[1])

The rows are shuffled without losing the one-to-one correspondence between the entries in `x` and `y`.

In addition, when training a model for multiple epochs, we need to shuffle and iterate over the dataset by the desired number of epochs. So, let’s iterate over the batched dataset twice:

In [None]:
for epoch in range(2):
    print(f'epoch {epoch+1}')
    for i, batch in enumerate(data_loader, 1):
        print(f'batch {i}:', 'x:', batch[0], 
              '\n         y:', batch[1])

This results in two different sets of batches. In the first epoch, the first batch contains a pair of values `[y=1, y=2]`, and the second batch contains a pair of values `[y=3, y=0]`. In the second epoch, the two batches contain the pairs `[y=2, y=0]` and `[y=1, y=3]`, respectively. The exact order may differ depending on your PyTorch version. Because the dataset is reshuffled at the start of every epoch, both the composition of each batch and the order of elements within it change.
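If the shuffling order needs to be reproducible independently of the global seed, one option is to pass a dedicated `torch.Generator` through the `generator` argument of `DataLoader`. The sketch below uses a small stand-in dataset and a hypothetical helper, `epoch_order`, to show that the same seed gives the same batch order:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: row i of t_x belongs to label i.
t_x = torch.arange(8, dtype=torch.float32).reshape(4, 2)
t_y = torch.arange(4)
dataset = TensorDataset(t_x, t_y)

def epoch_order(seed):
    """Return the label batches of one shuffled epoch for a given seed (hypothetical helper)."""
    g = torch.Generator()
    g.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=g)
    return [batch_y.tolist() for _, batch_y in loader]

first = epoch_order(1)
second = epoch_order(1)
print(first)
print(second)
```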

### Creating a dataset from files on your local storage disk

In this section, we will build a dataset from image files stored on disk. There is an image folder associated with the online content of this week. After downloading the folder, you should be able to see six images of cats and dogs in JPEG format.

This small dataset will show how building a dataset from stored files generally works. To accomplish this, we are going to use two additional modules: `Image` in `PIL` to read the image file contents, and `transforms` in `torchvision` to convert the loaded images into tensors and resize them.

Before we start, let’s take a look at the content of these files. We will use the `pathlib` library to generate a list of image files:

In [None]:
import pathlib

imgdir_path = pathlib.Path('images/cat_dog_images')

file_list = sorted([str(path) for path in imgdir_path.glob('*.jpg')])

print(file_list)

Next, we will visualize these image examples using `matplotlib`:

In [None]:
import matplotlib.pyplot as plt
import os
from PIL import Image


fig = plt.figure(figsize=(10, 5))
for i, file in enumerate(file_list):
    img = Image.open(file)
    print('Image shape: ', np.array(img).shape)
    ax = fig.add_subplot(2, 3, i+1)
    ax.set_xticks([]); ax.set_yticks([])
    ax.imshow(img)
    ax.set_title(os.path.basename(file), size=15)
    
plt.tight_layout()
plt.show()

Just from this visualization and the printed image shapes, we can already see that the images have different aspect ratios. If you print the aspect ratios (or data array shapes) of these images, you will see that some images are 900 pixels high and 1200 pixels wide (900×1200), some are 800×1200, and one is 900×742. Later, we will preprocess these images to a consistent size. Another point to consider is that the labels for these images are provided within their filenames. So, we extract these labels from the list of filenames, assigning label `1` to dogs and label `0` to cats:

In [None]:
labels = [1 if 'dog' in os.path.basename(file) else 0
          for file in file_list]
print(labels)

Now, we have two lists: a list of filenames (or paths of each image) and a list of their labels. In the previous section, you learned how to create a joint dataset from two arrays. Here, we apply the same idea to pair each file path with its label:

In [None]:
class ImageDataset(Dataset):
    def __init__(self, file_list, labels):
        self.file_list = file_list
        self.labels = labels

    def __getitem__(self, index):
        file = self.file_list[index]      
        label = self.labels[index]
        return file, label

    def __len__(self):
        return len(self.labels)
    
image_dataset = ImageDataset(file_list, labels)
for file, label in image_dataset:
    print(file, label)

The joint dataset now yields pairs of filenames and labels.

Next, we need to apply transformations to this dataset: load the image content from its file path, decode the raw content, and resize it to a desired size, for example, 80×120. As mentioned before, we use the `torchvision.transforms` module to resize the images and convert the loaded pixels into tensors as follows:

In [None]:
import torchvision.transforms as transforms

img_height, img_width = 80, 120
    
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((img_height, img_width)),
])

Now we update the `ImageDataset` class with the transform we just defined:

In [None]:
class ImageDataset(Dataset):
    def __init__(self, file_list, labels, transform=None):
        self.file_list = file_list
        self.labels = labels
        self.transform = transform
    def __getitem__(self, index):
        img = Image.open(self.file_list[index])        
        if self.transform is not None:
            img = self.transform(img)
        label = self.labels[index]
        return img, label
    def __len__(self):
        return len(self.labels)


image_dataset = ImageDataset(file_list, labels, transform)

Finally, we visualize these transformed image examples using `matplotlib`:

In [None]:
fig = plt.figure(figsize=(10, 6))
for i, example in enumerate(image_dataset):
    ax = fig.add_subplot(2, 3, i+1)
    ax.set_xticks([]); ax.set_yticks([])
    ax.imshow(example[0].numpy().transpose((1, 2, 0)))
    ax.set_title(f'{example[1]}', size=15)
    
plt.tight_layout()
plt.show()

This results in a visualization of the retrieved example images, each resized to 80×120, along with their labels.

The `__getitem__` method in the `ImageDataset` class wraps all of these steps into a single function: loading the raw content (images and labels), converting the images into tensors, and resizing the images. The resulting dataset can be iterated over, and we can apply the other operations from the previous sections, such as shuffling and batching, via a data loader.
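Batching such a dataset with a data loader can be sketched as follows. So that the example runs without the image folder, it uses a hypothetical stand-in class, `RandomImageDataset`, which returns random tensors of the same shape (3×80×120) that the transformed images would have. This is an assumption for illustration, not part of the notebook's pipeline:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class RandomImageDataset(Dataset):
    """Hypothetical stand-in for ImageDataset: yields random 3 x H x W tensors."""
    def __init__(self, labels, height=80, width=120):
        self.labels = labels
        self.height, self.width = height, width

    def __getitem__(self, index):
        img = torch.rand(3, self.height, self.width)
        return img, self.labels[index]

    def __len__(self):
        return len(self.labels)

dataset = RandomImageDataset(labels=[0, 0, 0, 1, 1, 1])
loader = DataLoader(dataset, batch_size=2, shuffle=True)

# Each batch stacks the images along a new leading dimension
# and collects the integer labels into a tensor.
for images, batch_labels in loader:
    print(images.shape, batch_labels)
```

With the real `image_dataset`, the same `DataLoader` call applies, because every image has been resized to a common shape and can therefore be stacked into a batch.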