<div>
    <img src="../images/emlyon.png" style="height:60px; float:left; padding-right:10px; margin-top:5px" />
    <span>
        <h1 style="padding-bottom:5px;"> Introduction to Deep Learning </h1>
        <a href="https://masters.em-lyon.com/fr/msc-in-data-science-artificial-intelligence-strategy">[DSAIS]</a> MSc in Data Science & Artificial Intelligence Strategy <br/>
         Paris | © Saeed VARASTEH
    </span>
</div>

## Lecture 01 : PyTorch Basics

This lecture covers the fundamentals of __PyTorch__.

PyTorch is an open source deep learning platform that provides a seamless path from research prototyping to production deployment.

---

In [9]:
import torch
torch.__version__

'2.1.2'

In [10]:
if torch.cuda.is_available():
    device = torch.device('cuda')
elif torch.backends.mps.is_available():
    device = torch.device('mps')
else:
    device = torch.device('cpu')

print('Using device:', device)

Using device: mps



### Torch Tensors

Tensors are the fundamental building block of PyTorch.

#### scalar 
A scalar is a single number and, in tensor-speak, it's a zero-dimensional tensor.

In [2]:
s = torch.tensor(5)
s

tensor(5)

In [3]:
s.ndim

0

In [4]:
s.shape

torch.Size([])

In [5]:
s.dtype

torch.int64

You can set a tensor's datatype with the `dtype` argument:

In [6]:
s = torch.tensor(5, dtype=torch.float32)
s

tensor(5.)

In [7]:
s.dtype

torch.float32

The most common types in PyTorch are `torch.float32` (also available as `torch.float`) and `torch.int64` (also available as `torch.long`).
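As a quick check of the aliases named above, the following sketch compares the dtype objects directly and shows which dtype PyTorch infers from plain Python numbers. The variable names are illustrative, and the float result assumes the default dtype has not been changed with `torch.set_default_dtype`.

```python
import torch

# The short names refer to the very same dtype objects as the long names.
same_float = torch.float is torch.float32
same_long = torch.long is torch.int64

# Python ints are inferred as int64; Python floats use the default float dtype
# (float32, assuming torch.set_default_dtype has not been called).
int_dtype = torch.tensor(5).dtype
float_dtype = torch.tensor(5.0).dtype

print(same_float, same_long)
print(int_dtype, float_dtype)
```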

#### Tensor's `item()`

What if we wanted to retrieve the number from the tensor?

In [8]:
# Get the Python number within a tensor (only works with one-element tensors)
s.item()

5.0
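The comment in the cell above says `item()` only works on one-element tensors. A minimal sketch of what happens otherwise, and of `tolist()` as the usual alternative for larger tensors, might look like this. The exception type has differed between PyTorch versions, so the sketch catches both candidates rather than assuming one.

```python
import torch

v = torch.tensor([3, 5, 6])

try:
    v.item()  # more than one element, so this is expected to fail
    item_failed = False
except (RuntimeError, ValueError):
    # The exact exception type depends on the PyTorch version.
    item_failed = True

# tolist() converts a tensor of any shape to (nested) Python lists.
values = v.tolist()

print(item_failed, values)
```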

#### vector

A vector is a one-dimensional tensor that can contain many numbers.

In [9]:
v = torch.tensor([3, 5, 6])
v

tensor([3, 5, 6])

In [10]:
v.ndim

1

In [11]:
v.shape

torch.Size([3])

<div class="alert-info">
You can tell the number of dimensions a tensor in PyTorch has by the number of square brackets on the outside ([) and you only need to count one side.
</div>

In [12]:
w = torch.tensor([[3, 5, 6]])
w

tensor([[3, 5, 6]])

In [13]:
w.ndim

2

In [14]:
w.shape

torch.Size([1, 3])

#### matrix

Matrices are as flexible as vectors, except they have an extra dimension.

In [15]:
m = torch.tensor([[3, 5, 6], [2, 7, 9]])
m

tensor([[3, 5, 6],
        [2, 7, 9]])

In [16]:
m.shape

torch.Size([2, 3])

### Torch `rand()`, `zeros()`, `ones()` and `arange()`

`torch.rand()` fills a tensor with random numbers drawn uniformly from the interval [0, 1), and its `size` argument can be whatever shape we want.

In [17]:
random_tensor = torch.rand(size=(3, 4))
random_tensor

tensor([[0.5019, 0.7788, 0.7731, 0.3957],
        [0.2505, 0.7455, 0.9595, 0.2289],
        [0.3802, 0.8874, 0.2076, 0.8507]])

Sometimes you'll just want to fill tensors with zeros or ones.

In [18]:
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [19]:
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

You can use `torch.arange(start, end, step)` to do so.

In [20]:
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
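To make the roles of `start`, `end` and `step` concrete, here is a small sketch. The key point is that `end` is exclusive, so reaching 10 requires `end=11`. The variable names are illustrative.

```python
import torch

# step controls the gap between consecutive values; end itself is excluded.
evens = torch.arange(start=0, end=10, step=2)

# To include 10, the end has to be one step past it.
one_to_ten = torch.arange(start=1, end=11, step=1)

print(evens)
print(one_to_ten)
```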

#### Getting information from tensors

- `shape` - what shape is the tensor? (some operations require specific shape rules)
- `dtype` - what datatype are the elements within the tensor stored in?
- `device` - what device is the tensor stored on? (usually GPU or CPU)

In [21]:
zero_to_ten.device

device(type='cpu')

<div class="alert-success">
Generally if you see `torch.cuda` anywhere, the tensor is being used on a GPU. More on this later.
</div>
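As a preview of how the `device` attribute changes, the sketch below picks a device with the same check used at the top of this lecture and moves a tensor onto it with `Tensor.to()`. On a machine without a GPU everything simply stays on the CPU; the variable names are illustrative.

```python
import torch

# Same device selection logic as at the start of the lecture.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

t = torch.arange(start=0, end=10, step=1)  # created on the CPU by default
t_on_device = t.to(device)                 # tensor on the chosen device
back_on_cpu = t_on_device.cpu()            # moved back to the CPU

print(t.device, t_on_device.device, back_on_cpu.device)
```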

### Tensor Operations

In deep learning, data (images, text, video, audio, protein structures, etc) gets represented as tensors.

A model learns by investigating those tensors and performing a series of operations (could be 1,000,000s+) on tensors to create a representation of the patterns in the input data.

These operations often are:

- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication

And that's it! Sure, there are a few more here and there, but these are the basic building blocks of neural networks.

In [22]:
a = torch.tensor(2)
b = torch.tensor(5)
c = a + b
c

tensor(7)

In [23]:
a = torch.tensor(2)
b = torch.tensor(5)
c = a - b
c

tensor(-3)

In [24]:
a = torch.tensor(2)
b = torch.tensor(5)
c = a * b
c

tensor(10)

In [25]:
a = torch.tensor([2,4])
b = torch.tensor([5,3])
c = a * b # or equivalently torch.mul(a,b)
c

tensor([10, 12])
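Division is the one element-wise operation from the list that has no cell of its own, so here is a minimal sketch. Note that `/` performs true division, so dividing two integer tensors produces a floating-point result (float32, assuming the default dtype is unchanged).

```python
import torch

a = torch.tensor([2, 4])
b = torch.tensor([5, 3])

# Element-wise true division: each element of a is divided by the matching element of b.
divided = a / b  # or equivalently torch.div(a, b)

print(divided)
print(divided.dtype)
```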

#### Matrix multiplication (is all you need)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication in the __`torch.matmul()`__ function.

In [26]:
a = torch.tensor([2,4])
b = torch.tensor([5,3])
c = torch.matmul(a,b)
c

tensor(22)
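For two 1-D tensors, `torch.matmul()` computes the dot product, which is why the result above is a single number. The sketch below shows two equivalent spellings: the `@` operator and `torch.dot()`.

```python
import torch

a = torch.tensor([2, 4])
b = torch.tensor([5, 3])

via_matmul = torch.matmul(a, b)  # 2*5 + 4*3
via_operator = a @ b             # @ is shorthand for matmul
via_dot = torch.dot(a, b)        # dot product, only defined for 1-D tensors

print(via_matmul, via_operator, via_dot)
```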

<div class="alert-danger">
One of the most common errors in deep learning is shape errors in matrix multiplication.
</div>

In [27]:
a = torch.tensor([[2,4]])
b = torch.tensor([[5,3]])
c = torch.matmul(a,b)
c

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x2 and 1x2)

How to fix this?

You can perform transposes in PyTorch using either:

- `torch.transpose(input, dim0, dim1)` - where input is the desired tensor to transpose and dim0 and dim1 are the dimensions to be swapped.
- `tensor.T` - where tensor is the desired tensor to transpose.


In [28]:
c = torch.matmul(a,b.T)
c

tensor([[22]])

In [29]:
b.shape

torch.Size([1, 2])

In [30]:
torch.transpose(b,0,1).shape

torch.Size([2, 1])

In [31]:
c = torch.matmul(a, torch.transpose(b,0,1))
c

tensor([[22]])
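The rule behind the error and its fix is that the inner dimensions must match: a tensor of shape `(n, k)` can be multiplied by one of shape `(k, m)`, giving shape `(n, m)`. A small sketch with illustrative shapes:

```python
import torch

A = torch.rand(size=(2, 3))
B = torch.rand(size=(3, 4))

# Inner dimensions match (3 and 3), so the result has shape (2, 4).
C = A @ B

# (2, 3) @ (2, 3) has mismatched inner dimensions (3 and 2) and fails.
try:
    A @ A
    mismatch_failed = False
except RuntimeError:
    mismatch_failed = True

# Transposing the second operand gives (2, 3) @ (3, 2) -> (2, 2).
D = A @ A.T

print(C.shape, mismatch_failed, D.shape)
```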

### Aggregations

In [32]:
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [33]:
print(f"Minimum: {zero_to_ten.min()}")
print(f"Maximum: {zero_to_ten.max()}")
print(f"Sum: {zero_to_ten.sum()}")

Minimum: 0
Maximum: 9
Sum: 45


In [34]:
print(f"Mean: {zero_to_ten.mean()}") # this will error

RuntimeError: mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Long

<div class="alert-danger">
Note: You may find some methods such as torch.mean() require tensors to be in torch.float32 (the most common) or another specific datatype, otherwise the operation will fail.
</div>

#### Changing datatype

You can change the datatypes of tensors using `torch.Tensor.type(dtype=None)` where the dtype parameter is the datatype you'd like to use.

In [35]:
print(f"Mean: {zero_to_ten.type(torch.float32).mean()}")

Mean: 4.5


Shortcut to this:

In [36]:
print(f"Mean: {zero_to_ten.float().mean()}")

Mean: 4.5
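Another common spelling is `Tensor.to(dtype)`. The sketch below also checks that the conversion returns a new tensor and leaves the original's datatype untouched. The variable names are illustrative.

```python
import torch

zero_to_ten = torch.arange(start=0, end=10, step=1)

# .to(dtype) returns a converted tensor; zero_to_ten itself stays int64.
as_float = zero_to_ten.to(torch.float32)
mean_value = as_float.mean()

print(zero_to_ten.dtype, as_float.dtype)
print(f"Mean: {mean_value}")
```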


### Reshaping, Stacking, Squeezing and Unsqueezing

Often you'll want to reshape or change the dimensions of your tensors without actually changing the values inside them.

To do so, some popular methods are:

| Method | One-line description |
| :----- | :----- |
| [`torch.reshape(input, shape)`](https://pytorch.org/docs/stable/generated/torch.reshape.html#torch.reshape) | Reshapes `input` to `shape` (if compatible), can also use `torch.Tensor.reshape()`. |
| [`torch.Tensor.view(shape)`](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html) | Returns a view of the original tensor in a different `shape` but shares the same data as the original tensor. |
| [`torch.stack(tensors, dim=0)`](https://pytorch.org/docs/1.9.1/generated/torch.stack.html) | Concatenates a sequence of `tensors` along a new dimension (`dim`), all `tensors` must be same size. |
| [`torch.squeeze(input)`](https://pytorch.org/docs/stable/generated/torch.squeeze.html) | Squeezes `input` to remove all the dimensions of size `1`. |
| [`torch.unsqueeze(input, dim)`](https://pytorch.org/docs/1.9.1/generated/torch.unsqueeze.html) | Returns `input` with a dimension value of `1` added at `dim`. | 
| [`torch.permute(input, dims)`](https://pytorch.org/docs/stable/generated/torch.permute.html) | Returns a *view* of the original `input` with its dimensions permuted (rearranged) to `dims`. | 

Why do any of these?

Because deep learning models (neural networks) are all about manipulating tensors in some way. And because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. These methods help you make sure the right elements of your tensors mix with the right elements of other tensors.

Let's try them out.

In [37]:
x = torch.arange(1., 8.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

#### `torch.reshape()`

Let's add an extra dimension with `torch.reshape()`.

In [38]:
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

#### `torch.Tensor.view()`

We can also change the view with `torch.Tensor.view()`.

In [39]:
x_view = x.view(1, 7)
x_view, x_view.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

<div class="alert-warning">
Remember though, changing the view of a tensor with torch.Tensor.view() only creates a new view of the same underlying data.
</div>
So changing the view changes the original tensor too.

In [40]:
x_view[:, 0] = 50
x_view, x

(tensor([[50.,  2.,  3.,  4.,  5.,  6.,  7.]]),
 tensor([50.,  2.,  3.,  4.,  5.,  6.,  7.]))
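If you need an independent copy instead, one option is to call `Tensor.clone()` before changing the view. The sketch below first confirms that a plain view points at the same memory (via `data_ptr()`), then shows that edits to a cloned view leave the original alone. The variable names are illustrative.

```python
import torch

x = torch.arange(1., 8.)

# A view starts at the same memory address as the original tensor.
x_view = x.view(1, 7)
shares_memory = x_view.data_ptr() == x.data_ptr()

# clone() copies the data first, so the new view is independent of x.
x_copy = x.clone().view(1, 7)
x_copy[:, 0] = 50

print(shares_memory)
print(x_copy, x)
```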

#### `torch.stack()`

If we wanted to stack our new tensor on top of itself four times, we could do so with `torch.stack()`.

In [41]:
x_stacked = torch.stack([x, x, x, x], dim=0)
x_stacked

tensor([[50.,  2.,  3.,  4.,  5.,  6.,  7.],
        [50.,  2.,  3.,  4.,  5.,  6.,  7.],
        [50.,  2.,  3.,  4.,  5.,  6.,  7.],
        [50.,  2.,  3.,  4.,  5.,  6.,  7.]])
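The `dim` argument decides where the new dimension is inserted. A brief sketch comparing `dim=0` (each copy becomes a row) with `dim=1` (each copy becomes a column), using a fresh tensor with illustrative names:

```python
import torch

x = torch.arange(1., 8.)  # shape (7,)

rows = torch.stack([x, x, x, x], dim=0)  # new dimension first: shape (4, 7)
cols = torch.stack([x, x, x, x], dim=1)  # new dimension second: shape (7, 4)

print(rows.shape, cols.shape)
```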

#### `torch.squeeze()`

How about removing all dimensions of size 1 from a tensor?

To do so you can use `torch.squeeze()`.

In [42]:
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

Previous tensor: tensor([[50.,  2.,  3.,  4.,  5.,  6.,  7.]])
Previous shape: torch.Size([1, 7])


In [43]:
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")


New tensor: tensor([50.,  2.,  3.,  4.,  5.,  6.,  7.])
New shape: torch.Size([7])
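The table earlier also lists `torch.unsqueeze()`, which does the reverse of squeezing: it adds a dimension of size 1 at the position given by `dim`. A minimal sketch with a fresh tensor and illustrative names:

```python
import torch

x = torch.arange(1., 8.)  # shape (7,)

as_row = x.unsqueeze(dim=0)     # shape (1, 7)
as_column = x.unsqueeze(dim=1)  # shape (7, 1)

# squeeze() undoes the added dimension again.
restored = as_column.squeeze()  # shape (7,)

print(as_row.shape, as_column.shape, restored.shape)
```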


#### `torch.permute(input, dims)`

You can also rearrange the order of axes values with `torch.permute(input, dims)`, where the input gets turned into a view with new dims.

In [44]:
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

Previous tensor: tensor([[50.,  2.,  3.,  4.,  5.,  6.,  7.]])
Previous shape: torch.Size([1, 7])


In [45]:
x_permuted = x_reshaped.permute(1, 0) # shifts axis 0->1 and 1->0

print(f"New tensor: {x_permuted}")
print(f"New shape: {x_permuted.shape}")

New tensor: tensor([[50.],
        [ 2.],
        [ 3.],
        [ 4.],
        [ 5.],
        [ 6.],
        [ 7.]])
New shape: torch.Size([7, 1])
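A typical use of `permute()` is reordering image axes. The sketch below assumes a hypothetical random "image" stored as (height, width, colour channels) and rearranges it to (colour channels, height, width). Because the result is a view, a write to the original shows up in the permuted tensor too.

```python
import torch

# Hypothetical image data: height 224, width 224, 3 colour channels.
image_hwc = torch.rand(size=(224, 224, 3))

# New axis order: old axis 2 first, then old axis 0, then old axis 1.
image_chw = image_hwc.permute(2, 0, 1)

# The permuted tensor is a view, so it sees changes made to the original.
image_hwc[0, 0, 0] = 0.5
first_value = image_chw[0, 0, 0].item()

print(image_hwc.shape, image_chw.shape, first_value)
```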


### Tensor Indexing

Sometimes you'll want to select specific data from tensors (for example, only the first column or second row).

To do so, you can use indexing.

__If you've ever done indexing on Python lists or NumPy arrays, indexing in PyTorch with tensors is very similar.__

In [46]:
x = torch.arange(8).reshape(2,4)
x, x.shape

(tensor([[0, 1, 2, 3],
         [4, 5, 6, 7]]),
 torch.Size([2, 4]))

In [47]:
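print(f"Second row: {x[1]}") 
print(f"Row 1, column 2 (chained indexing): {x[1][2]}") 
print(f"Row 1, column 2 (single index): {x[1,2]}") 
print(f"Second row (explicit slice): {x[1,:]}") 

Second row: tensor([4, 5, 6, 7])
Row 1, column 2 (chained indexing): 6
Row 1, column 2 (single index): 6
Second row (explicit slice): tensor([4, 5, 6, 7])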
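Selecting columns and sub-ranges works with the same slice syntax as NumPy. A short sketch on the same 2x4 tensor, with illustrative variable names:

```python
import torch

x = torch.arange(8).reshape(2, 4)

first_column = x[:, 0]        # every row, column 0
last_two_columns = x[:, -2:]  # every row, last two columns
middle_of_row_0 = x[0, 1:3]   # row 0, columns 1 and 2

print(first_column)
print(last_two_columns)
print(middle_of_row_0)
```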


### Tensors & NumPy

The two main methods you'll want to use for NumPy to PyTorch (and back again) are: 
* [`torch.from_numpy(ndarray)`](https://pytorch.org/docs/stable/generated/torch.from_numpy.html) - NumPy array -> PyTorch tensor. 
* [`torch.Tensor.numpy()`](https://pytorch.org/docs/stable/generated/torch.Tensor.numpy.html) - PyTorch tensor -> NumPy array.

In [48]:
import torch
import numpy as np
array = np.arange(8.)
tensor = torch.from_numpy(array)
array, tensor

(array([0., 1., 2., 3., 4., 5., 6., 7.]),
 tensor([0., 1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [49]:
array.dtype, tensor.dtype

(dtype('float64'), torch.float64)

__Note:__ NumPy's default floating-point datatype is `float64`, and if you convert such an array to a PyTorch tensor, it'll keep the same datatype (as above).

However, many PyTorch calculations default to using `float32`.

So if you want to convert your NumPy array (float64) -> PyTorch tensor (float64) -> PyTorch tensor (float32), you can use:

- __`tensor = torch.from_numpy(array).type(torch.float32)`__.
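One detail worth checking when converting: `torch.from_numpy()` shares memory with the source array, while converting to a different datatype produces a separate copy. The sketch below demonstrates both, and then goes back to NumPy with `Tensor.numpy()`. The variable names are illustrative.

```python
import numpy as np
import torch

array = np.arange(8.)

shared = torch.from_numpy(array)                         # float64, shares memory with array
converted = torch.from_numpy(array).type(torch.float32)  # float32 copy

# Changing the array in place is visible through the shared tensor only.
array[0] = 100.

back_to_numpy = converted.numpy()  # float32 NumPy array

print(shared[0].item(), converted[0].item())
print(back_to_numpy.dtype)
```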

### Reproducibility

As you learn more about neural networks and machine learning, you'll start to discover how much randomness plays a part.

Let's say, you create an algorithm capable of achieving X performance and then your friend tries it out to verify you're not crazy.

How could they do such a thing? That's where reproducibility comes in.

In other words, can you get the same (or very similar) results on your computer running the same code as I get on mine?

Let's see a brief example of reproducibility in PyTorch and how to control randomness.

We'll start by creating two random tensors. Since they're random, you'd expect them to be different, right?

In [50]:
random_tensor_A = torch.rand(2)
random_tensor_B = torch.rand(2)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")

Tensor A:
tensor([0.4420, 0.1949])

Tensor B:
tensor([0.4805, 0.2718])



Just as you might've expected, the tensors come out with different values.

But what if you wanted to create two random tensors with the same values?

That's where `torch.manual_seed(seed)` comes in, where seed is an integer (like 42 but it could be anything) that flavours the randomness.

In [51]:
torch.manual_seed(seed=42) 
random_tensor_A = torch.rand(2)

torch.manual_seed(seed=42) 
random_tensor_B = torch.rand(2)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")

Tensor A:
tensor([0.8823, 0.9150])

Tensor B:
tensor([0.8823, 0.9150])
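An alternative to reseeding the global random state is to pass an explicit `torch.Generator` to the sampling call. This is a sketch of that approach rather than the method used above; the generator names are illustrative. Two generators with the same seed produce the same values, while drawing again from one generator continues its stream and gives new values.

```python
import torch

# Two independent generators seeded identically.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

random_tensor_A = torch.rand(2, generator=gen_a)
random_tensor_B = torch.rand(2, generator=gen_b)

# A second draw from gen_a continues the sequence instead of restarting it.
random_tensor_C = torch.rand(2, generator=gen_a)

print(torch.equal(random_tensor_A, random_tensor_B))
print(torch.equal(random_tensor_A, random_tensor_C))
```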



---