## Welcome to the deep learning tutorial 👋!

The **goals** of the tutorial are to

1. Learn about the building blocks of PyTorch, from layer creation to regularisation and model training.
2. Train feed-forward neural networks on traditional and computer vision tasks.
3. Explore how more advanced network structures can be leveraged to perform time series analysis, and how convolutional layers impact the performance of neural networks on computer vision tasks.
4. Load pre-trained language models from Hugging Face and use transfer learning to showcase NLP applications.

#### Documentation

- [PyTorch](https://pytorch.org)
- [Hugging Face](https://huggingface.co)

#### Repos

- [Godoy, D. V. (2022) - Deep Learning with PyTorch Step-by-Step](https://github.com/dvgodoy/PyTorchStepByStep)

#### Books

- [Goodfellow et al. (2016) - The Deep Learning Book](https://www.deeplearningbook.org)


In [2]:
import numpy as np
import pandas as pd

import altair as alt
import matplotlib.pyplot as plt
import seaborn as sns

from toolz import pipe
from typing import Dict, Sequence

from dap_taltech.utils.data_getters import DataGetter
from dap_taltech import logger

In [3]:
alt.data_transformers.enable('default', max_rows=None)
getter = DataGetter(local=True)

2023-08-03 15:22:11,545 - TalTech HackWeek 2023 - INFO - Loading data from /home/dampudia/projects/dap_taltech/data/ (data_getters.py:50)


### PyTorch 🚀

In the universe of deep learning, PyTorch shines like a bright star. An open-source library originally developed by the Facebook AI Research lab, PyTorch provides a platform for all your computation needs, with a particular focus on deep neural networks.

At the core of PyTorch, we find tensors - multi-dimensional arrays similar to NumPy ndarrays but with superpowers. These tensors can live on a GPU, which speeds up computation significantly. That speedup has been a key driver of the recent surge in deep learning capability.

In PyTorch, tensors can be declared in one of several ways:

![](https://cdn-images-1.medium.com/max/2000/1*_D5ZvufDS38WkhK9rK32hQ.jpeg)


What sets PyTorch apart is its dynamic computation graph, offering flexibility in building complex architectural models. It also provides excellent support for gradient-based optimization through automatic differentiation, a backbone of backpropagation in neural networks.

Lastly, it's worth mentioning PyTorch's rich ecosystem that includes TorchText, TorchVision, and TorchAudio for text, image, and audio processing, respectively. Together, they make PyTorch a versatile tool, suited to solve a broad range of AI tasks.

In this section we're going to first introduce and then build a standard PyTorch workflow, which largely looks as follows:

<img src="https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/01_a_pytorch_workflow.png" width=900 alt="a pytorch workflow flowchart"/>

So strap in and get ready to launch into the deep learning cosmos with PyTorch! 🚀

In [5]:
import torch

# Check PyTorch version
torch.__version__

'2.0.1+cu117'

With the code below, we make our setup device-agnostic. If we are lucky enough to have a GPU, we will use it to run our tensor operations; in its absence, we fall back on the good old CPU.

In [15]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

Using device: cuda
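
Defining `device` does not move anything by itself: tensors are created on the CPU unless told otherwise. The sketch below shows two common ways to place a tensor on the chosen device; the tensor names (`a`, `b`, `c`) are illustrative and not part of the tutorial's own code.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Option 1: create the tensor directly on the target device.
a = torch.ones(2, 3, device=device)

# Option 2: create it on the CPU, then move (copy) it with .to().
b = torch.ones(2, 3).to(device)

# Operands must live on the same device; the result stays on that device.
c = a + b
print(c.device.type)  # "cuda" or "cpu", depending on the machine

# NumPy only understands CPU memory, so move the tensor back first.
c_np = c.cpu().numpy()
print(c_np.shape)
```

Mixing a CPU tensor with a GPU tensor in one operation raises an error, which is why it pays to route every tensor (and later every model) through the same `device` variable.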


#### A primer on Tensors

Tensors form the essential building blocks in the realm of deep learning libraries, including PyTorch. A tensor, in the simplest terms, is a generalization of vectors and matrices to potentially higher dimensions, with the ability to represent n-dimensional arrays of data. 

In PyTorch, these tensors behave similarly to the ndarray objects in NumPy, but with an added advantage of being usable on a GPU, thereby facilitating faster computational operations. Further, these tensors also keep track of the computational graph and gradients, acting as key players in the automatic differentiation system of PyTorch, which is central to training neural networks. 
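
As a minimal sketch of the gradient tracking mentioned above: gradients are only recorded for tensors created with `requires_grad=True`. The toy function and variable names here are illustrative assumptions.

```python
import torch

# requires_grad=True asks autograd to record every operation on this tensor.
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# loss = sum(w_i ** 2), so d(loss)/d(w_i) = 2 * w_i
loss = (w ** 2).sum()

# Walk the recorded graph backwards and store the gradient in w.grad.
loss.backward()

print(w.grad)  # tensor([2., 4., 6.])
```

This is the same mechanism an optimizer relies on during training: the loss is computed from the parameters, `backward()` fills in each parameter's `.grad`, and the optimizer uses those values to update the parameters.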

So, understanding tensors essentially becomes the stepping stone into the world of deep learning. Let's do just that!

In [33]:
# An empty tensor
tensor = torch.empty(6,4, dtype=torch.float)
print(tensor)

tensor([[9.0880e+37, 4.5581e-41, 9.0880e+37, 4.5581e-41],
        [4.4842e-44, 0.0000e+00, 1.5695e-43, 0.0000e+00],
        [6.8018e-34, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 1.4013e-45, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 1.8077e-43, 0.0000e+00]])


Note that the elements are not necessarily zero: they are whatever values happened to be in memory (RAM or VRAM) at the location where the tensor was allocated. The reason for this is efficiency: if you know you will be writing to all the elements of the tensor, there is no need to spend time initializing them, so `torch.empty()` doesn't.

We can always retrieve the shape of a tensor with `size()`, which returns its dimensions as a `torch.Size` object, a subclass of Python's tuple.

In [35]:
tensor.size()

torch.Size([6, 4])

In PyTorch, you can conveniently convert a tensor that lives on the CPU into a NumPy array, enabling easy integration with existing Python workflows.

In [None]:
tensor.numpy()

array([[9.0880285e+37, 4.5581436e-41, 9.0880285e+37, 4.5581436e-41],
       [4.4841551e-44, 0.0000000e+00, 1.5694543e-43, 0.0000000e+00],
       [6.8018025e-34, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
       [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
       [0.0000000e+00, 0.0000000e+00, 1.4012985e-45, 0.0000000e+00],
       [0.0000000e+00, 0.0000000e+00, 1.8076750e-43, 0.0000000e+00]],
      dtype=float32)

The reverse is also true: you can use a numpy array and transform it to a tensor.

In [36]:
array = np.array([[1,5],[2,3]])
torch.from_numpy(array)

tensor([[1, 5],
        [2, 3]])
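
One detail worth knowing about these conversions: for CPU tensors, both `torch.from_numpy()` and `.numpy()` share the underlying memory rather than copying it. The following sketch illustrates this; the variable names are illustrative.

```python
import numpy as np
import torch

array = np.array([[1, 5], [2, 3]])
shared = torch.from_numpy(array)   # shares memory with `array`

array[0, 0] = 100                  # modify the NumPy array in place...
print(shared[0, 0].item())         # ...and the tensor sees the change: 100

shared[1, 1] = -1                  # the link works in both directions
print(array[1, 1])                 # -1

# Use .clone() when an independent copy is needed.
independent = torch.from_numpy(array).clone()
array[0, 1] = 0
print(independent[0, 1].item())    # still 5
```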

#### Tensor Operations 🛠️

Creating tensors and understanding their basic properties is important, but the real magic of tensors comes into play when we start performing operations on them.

##### Random numbers similar to numpy 🎲

We can easily create a tensor filled with random numbers from a uniform distribution on the interval [0, 1), similar to how we do it in numpy.

In [37]:
tensor = torch.rand(6, 4)
print(tensor)

tensor([[0.0493, 0.1979, 0.6539, 0.9537],
        [0.2489, 0.0957, 0.8417, 0.3472],
        [0.2436, 0.9246, 0.7245, 0.5005],
        [0.4209, 0.7280, 0.3985, 0.6047],
        [0.5587, 0.4407, 0.6836, 0.2276],
        [0.2094, 0.8565, 0.6048, 0.6883]])


##### Construct a matrix filled with zeros and of dtype long 🎯
We can also create tensors filled with zeros or ones, specifying their dtype at creation time.

In [39]:
tensor = torch.zeros(6, 4, dtype=torch.long)
print(tensor)

tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]])


In [40]:
tensor = torch.ones(2, 2, dtype=torch.long)
print(tensor)

tensor([[1, 1],
        [1, 1]])
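
For a constant fill value other than 0 or 1, `torch.full` and `torch.full_like` follow the same pattern. A short sketch, with illustrative variable names:

```python
import torch

# A 2x3 tensor where every element is 7, stored as 64-bit integers.
sevens = torch.full((2, 3), 7, dtype=torch.long)
print(sevens)

# Same shape as `sevens`, but filled with 0.5 and stored as 32-bit floats.
halves = torch.full_like(sevens, 0.5, dtype=torch.float)
print(halves)
```

Unlike `torch.zeros(6, 4)`, `torch.full` takes the shape as a single tuple, because the second positional argument is the fill value.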


##### Construct a tensor directly from data 📚
Directly constructing a tensor from a list of numbers is also an option. Remember, the tensor will copy the data, not reference it.

In [42]:
tensor = torch.tensor([3, 2.5])
print(tensor)

tensor([3.0000, 2.5000])
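
The copy behaviour and the dtype inference of `torch.tensor` can be checked directly. This is a small sketch with illustrative names; it uses a NumPy array as the source so that the copy is observable.

```python
import numpy as np
import torch

source = np.array([3.0, 2.5])
copied = torch.tensor(source)      # torch.tensor always copies its input

source[0] = -1.0                   # changing the source afterwards...
print(copied[0].item())            # ...does not affect the tensor: 3.0

# The dtype is inferred from the data.
print(torch.tensor([3, 2]).dtype)    # torch.int64 (all integers)
print(torch.tensor([3, 2.5]).dtype)  # torch.float32 (default float type)
print(copied.dtype)                  # torch.float64 (kept from the NumPy array)
```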


##### Create tensor based on existing tensor 🔄
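In some cases, you might want to create a tensor that has the same properties as another tensor (same dtype and same device), but filled with different data. For this, PyTorch provides methods such as `new_ones` and `new_zeros`, as well as `*_like` functions such as `torch.randn_like`, which also reuse the shape. Properties are inherited unless you override them explicitly, as the `dtype` arguments in the cells below do.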

In [45]:
tensor = tensor.new_ones(3, 3, dtype=torch.double)
print(tensor)
print(tensor.size())

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
torch.Size([3, 3])


In [46]:
tensor = torch.randn_like(tensor, dtype=torch.float)
print(tensor) 
print(tensor.size())


tensor([[ 1.4549,  1.2867,  0.1769],
        [-0.8134, -0.8832, -0.7480],
        [ 1.0545, -0.4150,  1.3091]])
torch.Size([3, 3])
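
To make the inheritance explicit, here is a small sketch comparing an inherited dtype with an overridden one. The names `base`, `inherited`, `overridden` and `same_shape` are illustrative.

```python
import torch

base = torch.ones(2, 2, dtype=torch.double)

# new_zeros takes a new shape but inherits dtype and device from `base`.
inherited = base.new_zeros(3, 3)
print(inherited.dtype)    # torch.float64

# Passing dtype explicitly overrides the inherited value.
overridden = base.new_zeros(3, 3, dtype=torch.long)
print(overridden.dtype)   # torch.int64

# The *_like functions also reuse the shape of the input tensor.
same_shape = torch.zeros_like(base)
print(same_shape.shape)   # torch.Size([2, 2])
```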


#### Basic Tensor Operations 🧮
Performing arithmetic operations on tensors is easy and intuitive. Under the hood, these operations are highly optimized and can run in parallel on a GPU. This makes PyTorch a powerful tool for numerical computations, similar to NumPy but with support for GPUs and other hardware accelerators.

![](https://devblogs.nvidia.com/wp-content/uploads/2018/05/tesnor_core_diagram.png)

We have a variety of basic operations that can be performed on tensors, such as addition (`torch.add()`), subtraction (`torch.sub()`), division (`torch.div()`), and multiplication (`torch.mul()`). All of them work element-wise. Let's dive in!

Addition

In [47]:
x = torch.rand(5, 3)
y = torch.rand(5, 3)
print(x + y) # Standard Python addition
print(torch.add(x, y)) # PyTorch method

tensor([[1.5656, 0.9265, 1.1569],
        [1.5182, 0.7835, 0.3577],
        [1.1078, 0.7523, 1.3442],
        [1.2108, 0.6473, 0.7450],
        [0.4641, 1.0847, 1.0781]])
tensor([[1.5656, 0.9265, 1.1569],
        [1.5182, 0.7835, 0.3577],
        [1.1078, 0.7523, 1.3442],
        [1.2108, 0.6473, 0.7450],
        [0.4641, 1.0847, 1.0781]])


Subtraction

In [48]:
x = torch.rand(5, 3)
y = torch.rand(5, 3)
print(x - y) # Standard Python subtraction
print(torch.sub(x, y)) # PyTorch method

tensor([[-0.2004,  0.3924, -0.3218],
        [-0.2173, -0.3551, -0.1990],
        [-0.3053,  0.4820, -0.4264],
        [-0.2186, -0.3473, -0.1678],
        [-0.2718, -0.0652,  0.3435]])
tensor([[-0.2004,  0.3924, -0.3218],
        [-0.2173, -0.3551, -0.1990],
        [-0.3053,  0.4820, -0.4264],
        [-0.2186, -0.3473, -0.1678],
        [-0.2718, -0.0652,  0.3435]])


Division

In [49]:
x = torch.rand(5, 3)
y = torch.rand(5, 3)
print(x / y) # Standard Python division
print(torch.div(x, y)) # PyTorch method

tensor([[ 1.3292,  0.5395, 19.5156],
        [ 0.7785,  0.1425,  1.0147],
        [ 5.1637,  3.5159,  0.6167],
        [ 0.2489,  1.5521,  0.5513],
        [ 0.8063,  0.0584,  1.2823]])
tensor([[ 1.3292,  0.5395, 19.5156],
        [ 0.7785,  0.1425,  1.0147],
        [ 5.1637,  3.5159,  0.6167],
        [ 0.2489,  1.5521,  0.5513],
        [ 0.8063,  0.0584,  1.2823]])


Multiplication

In [50]:
x = torch.rand(5, 3)
y = torch.rand(5, 3)
print(x * y) # Standard Python multiplication
print(torch.mul(x, y)) # PyTorch method

tensor([[0.0621, 0.0107, 0.0217],
        [0.3068, 0.2046, 0.6158],
        [0.3619, 0.3332, 0.4316],
        [0.0820, 0.2235, 0.2566],
        [0.0168, 0.3839, 0.6335]])
tensor([[0.0621, 0.0107, 0.0217],
        [0.3068, 0.2046, 0.6158],
        [0.3619, 0.3332, 0.4316],
        [0.0820, 0.2235, 0.2566],
        [0.0168, 0.3839, 0.6335]])


PyTorch also provides an in-place addition method. This means the addition operation is done on the tensor and the result is stored in the same tensor, without creating a new one. Remember that in PyTorch, every method that ends with an underscore performs an in-place operation.

In [51]:
y.add_(x)  # adds x to y
print(y)

tensor([[0.5025, 0.2125, 0.2945],
        [1.1688, 0.9842, 1.5973],
        [1.2199, 1.1722, 1.4238],
        [0.7054, 1.0820, 1.0178],
        [0.2843, 1.2475, 1.6162]])
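
The difference between `add` and `add_` can be verified by checking where the data lives in memory. A minimal sketch, using small constant tensors instead of random ones so the result is predictable:

```python
import torch

x = torch.ones(2, 2)
y = torch.zeros(2, 2)

ptr_before = y.data_ptr()          # memory address of y's first element

out_of_place = y.add(x)            # returns a new tensor, y is unchanged
print(y.sum().item())              # 0.0

y.add_(x)                          # modifies y itself
print(y.sum().item())              # 4.0

ptr_after = y.data_ptr()
print(ptr_before == ptr_after)     # True: same memory, no new allocation
```

In-place operations save memory, but they overwrite values that autograd may need to compute gradients, so they are best used with care on tensors that require gradients.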


PyTorch tensors also support standard numpy-like indexing with all the bells and whistles!

In [53]:
print(x[:, 1])  # Prints the second column of tensor x

tensor([0.1302, 0.6859, 0.6875, 0.2780, 0.5519])
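
A few more indexing patterns, sketched on a tensor with predictable values so the results are easy to follow (the tensor here is illustrative and replaces the random `x` above):

```python
import torch

# 5 rows, 3 columns, filled with 0..14 row by row.
x = torch.arange(15).reshape(5, 3)

first_row = x[0]          # tensor([0, 1, 2])
second_col = x[:, 1]      # tensor([ 1,  4,  7, 10, 13])
block = x[1:3, :2]        # rows 1-2, columns 0-1: tensor([[3, 4], [6, 7]])
large = x[x > 10]         # boolean mask: tensor([11, 12, 13, 14])
scalar = x[2, 2].item()   # a single element as a Python number: 8

print(first_row, second_col, block, large, scalar, sep="\n")
```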


##### Resizing Tensors
PyTorch also provides methods to reshape tensors, similar to NumPy's reshape function. The `view` method returns a tensor with a new shape that shares the same underlying data as the original; the total number of elements must stay the same.

In [61]:
x = torch.randn(6, 6)
print(x)

tensor([[-0.4313, -0.6393,  0.7510, -0.6493,  0.3703,  0.4172],
        [ 0.5227,  0.4034,  0.6747,  0.2088,  0.1143, -0.6383],
        [-0.1340, -0.0990, -0.4202,  0.6485, -0.7047,  0.6283],
        [ 0.6793,  1.3704, -1.4187, -1.3630, -1.3894, -1.6223],
        [-0.1440, -1.5239,  0.7996,  1.2438, -1.0048,  0.2283],
        [-0.8077, -0.9613, -0.4094,  0.0749, -0.3123, -1.1170]])


In [62]:
y = x.view(36)
print(y)

tensor([-0.4313, -0.6393,  0.7510, -0.6493,  0.3703,  0.4172,  0.5227,  0.4034,
         0.6747,  0.2088,  0.1143, -0.6383, -0.1340, -0.0990, -0.4202,  0.6485,
        -0.7047,  0.6283,  0.6793,  1.3704, -1.4187, -1.3630, -1.3894, -1.6223,
        -0.1440, -1.5239,  0.7996,  1.2438, -1.0048,  0.2283, -0.8077, -0.9613,
        -0.4094,  0.0749, -0.3123, -1.1170])


In [63]:
z = x.view(-1, 4)  # the size -1 is inferred from other dimensions
print(z)

tensor([[-0.4313, -0.6393,  0.7510, -0.6493],
        [ 0.3703,  0.4172,  0.5227,  0.4034],
        [ 0.6747,  0.2088,  0.1143, -0.6383],
        [-0.1340, -0.0990, -0.4202,  0.6485],
        [-0.7047,  0.6283,  0.6793,  1.3704],
        [-1.4187, -1.3630, -1.3894, -1.6223],
        [-0.1440, -1.5239,  0.7996,  1.2438],
        [-1.0048,  0.2283, -0.8077, -0.9613],
        [-0.4094,  0.0749, -0.3123, -1.1170]])
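
Two properties of `view` are worth seeing in code: the result shares memory with the original tensor, and `view` only works when the memory layout allows it. The sketch below uses a small illustrative tensor and shows `reshape` as the fallback for the second case.

```python
import torch

x = torch.arange(6).reshape(2, 3)

# A view shares storage with the original: writing to one changes the other.
flat = x.view(6)
flat[0] = 100
print(x[0, 0].item())       # 100

# Transposing changes the strides, so the result is not contiguous in memory.
t = x.t()
print(t.is_contiguous())    # False

# view() cannot express this layout and raises a RuntimeError.
view_failed = False
try:
    t.view(6)
except RuntimeError:
    view_failed = True
print(view_failed)          # True

# reshape() returns a view when possible and copies the data otherwise.
r = t.reshape(6)
print(r)                    # tensor([100,   3,   1,   4,   2,   5])
```

A reasonable rule of thumb is to use `reshape` by default and `view` when you want a guarantee that no copy is made.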


Well done! You have learned how to perform basic operations with PyTorch tensors! The power of PyTorch, however, really shines when building complex neural network models, as we'll see in the next sections. Keep going!