<a href="https://colab.research.google.com/github/marine-triquet/pytorch/blob/main/00_pytorch_fundamentales.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

If you have a question: https://github.com/mrdbourke/pytorch-deep-learning/discussions


In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.4.1+cu121


## Introduction to Tensors

### Creating tensors

PyTorch tensors are created using `torch.tensor()`: https://pytorch.org/docs/stable/tensors.html

##### What is a tensor?
A tensor is a way to represent data, similar to a list or an array, but it can have multiple dimensions.
- A scalar (single number, like 7) is a 0-dimensional tensor.
- A vector (a list of numbers, like [1, 2, 3]) is a 1-dimensional tensor.
- A matrix (a table of numbers, like [[1, 2], [3, 4]] ) is a 2-dimensional tensor.

- Scalars are tensors of rank 0 (no direction indicator).
- Vectors, by contrast, are tensors of rank 1 (one basis vector per component).

Tensors explanation: https://www.youtube.com/watch?v=f5liqUk0ZTw

**Explanation:**

**scalar**: This typically refers to a single value, like an integer or float, rather than an array.

**.ndim**: This attribute gives the number of dimensions of a NumPy array (PyTorch tensors have the same attribute).

`scalar = np.array(7)` creates a 0-dimensional array that holds a single value (a scalar), not a list of values.
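As a quick check of the definitions above, here is a minimal NumPy sketch (the variable names are illustrative only) showing that `.ndim` counts dimensions and that `np.array(7)` is a 0-dimensional array wrapping a single value:

```python
import numpy as np

scalar = np.array(7)                 # 0-dimensional array holding one value
vector = np.array([1, 2, 3])         # 1-dimensional
matrix = np.array([[1, 2], [3, 4]])  # 2-dimensional

for name, arr in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(f"{name}: ndim={arr.ndim}, shape={arr.shape}")

# .item() converts a single-element array back to a plain Python number
print(type(scalar.item()))
```

The PyTorch cells that follow mirror this with `torch.tensor`.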


In [2]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [3]:
# scalar has 0 dimension.
scalar.ndim

0

In [4]:
# get tensor back as Python int
scalar.item()

7

In [6]:
scalar.shape

torch.Size([])

In [5]:
# vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [6]:
# vector has 1 dimension. The tensor([7, 7]) has one bracket [ so -> 1 dimension.
vector.ndim

1

In [7]:
vector.shape

torch.Size([2])

In [8]:
# Matrix
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [9]:
# MATRIX has 2 dimensions. The tensor([[7, 8], [9, 10]]) has two brackets [[ so -> 2 dimensions.
MATRIX.ndim

2

In [10]:
MATRIX[0]

tensor([7, 8])

In [11]:
MATRIX[1]

tensor([ 9, 10])

In [12]:
MATRIX.shape

torch.Size([2, 2])

In [13]:
# Tensor
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [14]:
# TENSOR has 3 dimensions. The tensor([[ [1, 2, 3], [3, 6, 9], [2, 4, 5] ]]) has three brackets [[[ so -> 3 dimensions.
TENSOR.ndim

3

In [15]:
TENSOR.shape

torch.Size([1, 3, 3])

In [16]:
TENSOR = torch.tensor([[[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]])
TENSOR

tensor([[[1, 2, 3, 4],
         [1, 2, 3, 4],
         [1, 2, 3, 4]]])

In [17]:
TENSOR.shape

torch.Size([1, 3, 4])

### Random tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers...`

Torch random tensors - https://pytorch.org/docs/stable/generated/torch.rand.html
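To make the "start random, look at data, update" loop concrete, here is a minimal sketch. The toy data (`y = 2 * x`), the learning rate and the hand-written gradient are all assumptions made for illustration; real networks rely on PyTorch's autograd and optimizers instead.

```python
import torch

torch.manual_seed(0)  # only so the sketch is repeatable

# Toy data following y = 2 * x (an assumption made for this illustration)
x = torch.tensor([1.0, 2.0, 3.0])
y = 2 * x

weight = torch.rand(1)  # start with a random number
learning_rate = 0.1     # hypothetical step size

for step in range(20):
    prediction = weight * x                      # look at the data
    error = prediction - y
    gradient = (2 * error * x).mean()            # slope of the mean squared error w.r.t. weight
    weight = weight - learning_rate * gradient   # update the random number

print(weight)  # should end up very close to 2.0
```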

In [7]:
# create a random tensor of size (3, 4)
random_tensor = torch.rand(size=(3, 4))
random_tensor, random_tensor.dtype

(tensor([[0.2230, 0.9860, 0.9525, 0.5211],
         [0.1247, 0.4785, 0.1356, 0.2947],
         [0.8467, 0.3056, 0.5288, 0.5674]]),
 torch.float32)

In [8]:
random_tensor.ndim

2

In [9]:
# create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3)) # height, width, color channel (R, G, B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

In [10]:
# Exercise: create a random tensor
random_tensor = torch.rand(size=(5, 5, 3))
random_tensor, random_tensor.shape, random_tensor.ndim

(tensor([[[2.7856e-01, 3.0005e-01, 1.9766e-01],
          [1.3545e-01, 4.6704e-01, 8.5977e-01],
          [6.1142e-01, 9.2392e-01, 3.2015e-01],
          [9.2629e-01, 9.1122e-03, 3.0616e-01],
          [2.8024e-01, 1.3458e-01, 1.8405e-01]],
 
         [[1.0595e-01, 9.3421e-01, 8.8770e-01],
          [4.2758e-01, 3.0945e-01, 8.4181e-01],
          [5.5741e-01, 5.0787e-01, 1.9940e-01],
          [2.1989e-01, 5.3503e-01, 9.8431e-01],
          [8.2302e-02, 1.4645e-01, 8.9916e-01]],
 
         [[6.4120e-01, 3.4976e-01, 7.4006e-01],
          [9.5285e-01, 8.7655e-01, 7.0968e-01],
          [4.4551e-01, 3.1281e-04, 1.0527e-01],
          [2.9867e-01, 4.8022e-01, 3.7291e-01],
          [3.1850e-01, 9.1013e-01, 2.9481e-01]],
 
         [[5.6556e-01, 8.5962e-02, 5.6532e-01],
          [2.5930e-01, 3.4024e-01, 6.7173e-01],
          [9.5365e-01, 1.8282e-01, 3.5511e-01],
          [9.7838e-01, 1.9970e-01, 7.8456e-01],
          [1.9698e-01, 6.6444e-02, 1.7740e-02]],
 
         [[7.4846e-01, 1.328

### Tensors with zeros and ones



In [11]:
# create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros, zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)

In [12]:
# create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones, ones.dtype

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.float32)

### Creating a range and tensors like

Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.
You can use `torch.arange(start, end, step)` to do so. Note that `end` is exclusive.
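A short sketch of how the three arguments interact (the variable names are illustrative): `end` is not included in the result, and a floating point `step` produces a floating point tensor.

```python
import torch

# `end` is exclusive, so 0..100 in steps of 10 needs end=101
zero_to_hundred = torch.arange(start=0, end=101, step=10)
print(zero_to_hundred)

# A float step yields a floating point tensor
halves = torch.arange(start=0, end=2, step=0.5)
print(halves, halves.dtype)
```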


In [13]:
# create a range of values 0 to 10
zero_to_ten = torch.arange(start=0, end=11, step=1)
zero_to_ten

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [14]:
# create a range of values 1 to 10
one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [15]:
# Sometimes you might want one tensor of a certain type with the same shape as another tensor.
# For example, a tensor of all zeros with the same shape as a previous tensor.

# torch.zeros_like(input=zero_to_ten) returns a tensor of zeros with the same shape (and dtype) as zero_to_ten.

ten_zeros = torch.zeros_like(input=zero_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor datatypes

The default dtype is `torch.float32`, also known as single precision.

PyTorch has several datatypes because of precision in computing.
Precision is the amount of detail used to describe a number.
The higher the precision value (8, 16, 32), the more detail, and hence more data, used to express a number.
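One way to see the trade-off is `torch.finfo`, which reports properties of a floating point dtype. The sketch below (variable names are illustrative) prints the storage size and machine epsilon for three dtypes, then stores the same value at two precisions:

```python
import torch

for dtype in (torch.float16, torch.float32, torch.float64):
    info = torch.finfo(dtype)
    # bits: storage per number, eps: gap between 1.0 and the next representable value
    print(f"{dtype}: bits={info.bits}, eps={info.eps}")

# The same value stored at two precisions
third_16 = torch.tensor(1 / 3, dtype=torch.float16)
third_64 = torch.tensor(1 / 3, dtype=torch.float64)
print(third_16.item(), third_64.item())
```

Lower precision uses less memory per number but represents values less exactly.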

*Note*: Tensor datatype is one of the 3 big errors you'll run into with PyTorch & deep learning:
1. Tensors not the right datatype - check with `tensor.dtype`
2. Tensors not the right shape - check with `tensor.shape`
3. Tensors not on the right device - check with `tensor.device`
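When an operation fails, checking these three attributes is usually the first debugging step. A minimal sketch follows; `describe` is a hypothetical helper written for this note, not part of PyTorch.

```python
import torch

def describe(tensor: torch.Tensor) -> dict:
    """Collect the three attributes worth checking first when an operation fails."""
    return {
        "dtype": tensor.dtype,
        "shape": tuple(tensor.shape),
        "device": tensor.device.type,
    }

a = torch.rand(3, 4)                # float32 by default
b = torch.arange(12).reshape(3, 4)  # int64 by default

print(describe(a))
print(describe(b))
```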


In [16]:
# Default datatype for floating point tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, which infers the dtype from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the default device (CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

In [17]:
# convert a tensor from float32 to float16
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [18]:
float_16_tensor*float_32_tensor

tensor([ 9., 36., 81.])

In [20]:
# create a tensor
some_tensor = torch.rand(3, 4)
some_tensor

tensor([[0.4198, 0.1683, 0.5128, 0.5037],
        [0.0496, 0.7148, 0.6942, 0.4116],
        [0.8012, 0.2449, 0.9377, 0.0477]])

In [21]:
# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.9162, 0.3474, 0.5233, 0.4391],
        [0.0576, 0.3046, 0.3027, 0.4219],
        [0.5175, 0.4747, 0.7358, 0.5820]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


In [24]:
# create a tensor and change the default dtype and device
marine_tensor = torch.tensor([2, 4, 6], dtype=torch.float64, device="cuda")
marine_tensor, marine_tensor.dtype, marine_tensor.device

(tensor([2., 4., 6.], device='cuda:0', dtype=torch.float64),
 torch.float64,
 device(type='cuda', index=0))

### Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication
* Division
* Matrix multiplication


In [33]:
# create a tensor
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [26]:
tensor = torch.tensor([1,2,3])
tensor + 100

tensor([101, 102, 103])

In [31]:
# Multiply tensor by 10
tensor = tensor * 10
tensor

tensor([10, 20, 30])

In [32]:
# Subtract 10 from tensor
tensor - 10

tensor([ 0, 10, 20])

In [34]:
# try out PyTorch built-in functions
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [35]:
torch.add(tensor, 5)

tensor([6, 7, 8])

In [37]:
#


(tensor([1, 2, 2]), tensor([5, 5, 5]))

### Matrix multiplication

There are two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (dot product)
  - use the `torch.matmul()` function: `torch.matmul(tensor1, tensor2)`, or
  - use the `@` operator: `tensor1 @ tensor2`

Info about multiplying Matrices - https://www.mathsisfun.com/algebra/matrix-multiplying.html

The main two rules for matrix multiplication to remember are:

The **inner dimensions** must match:

* `(3, 2) @ (3, 2)` won't work
* `(2, 3) @ (3, 2)` will work
* `(3, 2) @ (2, 3)` will work


The resulting matrix has the shape of the **outer dimensions**:



* `(2, 3) @ (3, 2) -> (2, 2)`
* `(3, 2) @ (2, 3) -> (3, 3)`
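The two rules can be sketched as a small check. `matmul_output_shape` is a hypothetical helper written for this note (it only covers the 2D case), and the shapes are the ones listed above.

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Return the result shape of a 2D matrix multiplication, or None if the inner dimensions differ."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:       # inner dimensions must match
        return None
    return (rows_a, cols_b)    # outer dimensions give the result shape

for shape_a, shape_b in [((3, 2), (3, 2)), ((2, 3), (3, 2)), ((3, 2), (2, 3))]:
    expected = matmul_output_shape(shape_a, shape_b)
    if expected is None:
        print(f"{shape_a} @ {shape_b} -> won't work")
    else:
        result = torch.rand(shape_a) @ torch.rand(shape_b)
        print(f"{shape_a} @ {shape_b} -> {tuple(result.shape)}")
```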










In [47]:
torch.matmul(torch.rand(3, 2) , torch.rand(2, 3))

tensor([[0.5487, 0.8624, 0.1997],
        [0.3730, 0.5939, 0.1454],
        [0.6115, 0.8811, 0.1231]])

In [39]:
# Element-wise multiplication
print(tensor, "*", tensor)
print(f"Equals: {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [40]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [41]:
# Matrix multiplication by hand
1 * 1 + 2 * 2 + 3 * 3

14

In [42]:
%%time
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 1.74 ms, sys: 993 µs, total: 2.73 ms
Wall time: 2.58 ms


In [43]:
%%time
torch.matmul(tensor, tensor)


CPU times: user 432 µs, sys: 24 µs, total: 456 µs
Wall time: 352 µs


tensor(14)

### One of the most common errors in deep learning: Shape errors



In [4]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])
tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

torch.mm(tensor_A, tensor_B) # torch.mm does the same job as torch.matmul for 2D matrices (it does not broadcast); this call raises a shape error


RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [5]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a *Transpose*.

A **Transpose** switches the axes or dimensions of a tensor.


In [8]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [11]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [26]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes:\ntensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}")
print(f"New shapes:\ntensor_A = {tensor_A.shape}, tensor_B = {tensor_B.T.shape}")
print(f"Multiplying:\n{tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions match")
print(f"Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")
#torch.matmul(tensor_A, tensor_B.T), torch.matmul(tensor_A, tensor_B.T).shape

Original shapes:
tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])
New shapes:
tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([2, 3])
Multiplying:
torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions match
Output:

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape: torch.Size([3, 3])


In [40]:
# Exercise 1: matrix multiplication (both tensors are 3x3, so the inner dimensions already match without a transpose)
tensor_x = torch.tensor([[2, 5, 7],
                         [8, 10, 2],
                         [20, 12, 9]])
tensor_y = torch.tensor([[0, 0, 7],
                         [57, 30, 2],
                         [10, 6, 100]])
tensor_x.shape, tensor_y.shape
torch.matmul(tensor_x, tensor_y)

tensor([[ 355,  192,  724],
        [ 590,  312,  276],
        [ 774,  414, 1064]])

### Finding the min, max, mean, sum, etc (aggregation)


In [63]:
# create a tensor
x = torch.arange(0, 100, 10)
x, x.dtype

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), torch.int64)

In [64]:
# Find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [65]:
# Find the max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [66]:
# Find the mean -> error regarding the dtype. We need to change the dtype.
torch.mean(x)

RuntimeError: mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Long

In [73]:
# Find the mean
x = x.type(torch.float32)
torch.mean(x), x.mean()

(tensor(45.), tensor(45.))

In [69]:
# Find the sum
torch.sum(x), x.sum()

(tensor(450.), tensor(450.))

In [70]:
# Let's print all of these values nicely:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
print(f"Mean: {x.mean()}")
print(f"Sum: {x.sum()}")

Minimum: 0.0
Maximum: 90.0
Mean: 45.0
Sum: 450.0


### Finding the Positional min/max

You can also find the index of a tensor where the max or minimum occurs with `torch.argmax()` and `torch.argmin()` respectively.

This is helpful in case you just want the position where the highest (or lowest) value is and not the actual value itself.


In [7]:
# create a tensor
tensor = torch.arange(1, 100, 10)
print(f"Tensor: {tensor}")

# returns index of max and min values. This means the position in the list here.
print(f"Index where max value occurs: {tensor.argmax()}")
print(f"Index where min value occurs: {tensor.argmin()}")
tensor[0], tensor[9]

Tensor: tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])
Index where max value occurs: 9
Index where min value occurs: 0


(tensor(1), tensor(91))

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - Reshapes an input tensor to a defined shape
* View - Returns a view of an input tensor in a certain shape, sharing the same memory as the original tensor
* Stacking - Combines multiple tensors on top of each other (`torch.stack`, `torch.vstack`) or side by side (`torch.hstack`). `torch.stack` requires all tensors to be the same size.
* Squeeze - Removes all dimensions of size 1 from a tensor
* Unsqueeze - Adds a dimension of size 1 to a target tensor
* Permute - Returns a view of the input with its dimensions permuted (swapped) in a certain way
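The cells below use `torch.stack`. As a small sketch of how the three stacking functions named in the list differ on 1D inputs (the tensors `a` and `b` are made up for illustration):

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

stacked = torch.stack([a, b], dim=0)  # adds a new dimension -> shape [2, 3]
vstacked = torch.vstack([a, b])       # stacks row-wise -> shape [2, 3]
hstacked = torch.hstack([a, b])       # joins end to end for 1D inputs -> shape [6]

print(stacked.shape, vstacked.shape, hstacked.shape)
```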


In [1]:
# Let's create a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [2]:
# Reshape - add an extra dimension
x_reshaped = x.reshape(1, 9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [3]:
# Reshape - add an extra dimension (note how the brackets change)
x_reshaped = x.reshape(9, 1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [15]:
# More reshape exercises
a = torch.arange(1, 11) # size 10
a, a.shape

(tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]), torch.Size([10]))

In [17]:
a_reshaped = a.reshape(5, 2) # 5 x 2 = 10 elements, the same number as in the original tensor
a_reshaped, a_reshaped.shape

(tensor([[ 1,  2],
         [ 3,  4],
         [ 5,  6],
         [ 7,  8],
         [ 9, 10]]),
 torch.Size([5, 2]))

In [18]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [19]:
# Changing z changes x (because a view of a tensor shares the same memory as the original input)
# Remember, Tensor.view() only creates a new view of the same data; it does not copy it

# changing z changes x
z[:, 0] = 5 # for each row, replace position 0 with 5.
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [23]:
# Stack tensors on top of each other (dim=0)
x_stacked = torch.stack([x, x, x, x], dim=0) # try changing dim to 1 and see what happens
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [24]:
# Stack tensors along dim=1 (each input tensor becomes a column)
x_stacked = torch.stack([x, x, x, x], dim=1) # compare with the dim=0 result
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [4]:
# Let's test Squeeze
# torch.squeeze() - removes all single dimensions from a target tensor
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")
x_squeezed = x_reshaped.squeeze()
x_squeezed, x_squeezed.shape

Previous tensor: tensor([[1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
Previous shape: torch.Size([9, 1])


(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [15]:
# Let's test Squeeze
# torch.squeeze() - removes all single dimensions from a target tensor

#1. initial tensor
b = torch.zeros(2, 1, 2, 1, 2)
b.shape # Output: torch.Size([2, 1, 2, 1, 2])
# The tensor b has the shape [2, 1, 2, 1, 2], which means it has 5 dimensions.
# The first dimension has size 2.
# The second dimension has size 1 (this is a candidate for squeezing).
# The third dimension has size 2.
# The fourth dimension has size 1 (this is also a candidate for squeezing).
# The fifth dimension has size 2.

# 2. torch.squeeze(b)
c = torch.squeeze(b)
c.shape # Output: torch.Size([2, 2, 2])
# When you call torch.squeeze(b) without specifying any dimensions, it will remove all dimensions of size 1.
# So, it removes the second and fourth dimensions, as they both have size 1.
# The resulting tensor c has shape [2, 2, 2].

# 3. torch.squeeze(b, 0)
c = torch.squeeze(b, 0)
c.shape # Output: torch.Size([2, 1, 2, 1, 2])
# Here, you're asking to squeeze only the 0th dimension (the first dimension).
# However, the 0th dimension has size 2, so torch.squeeze(b, 0) doesn't do anything since torch.squeeze() can only remove dimensions of size 1.
# The shape remains [2, 1, 2, 1, 2].

# 4. torch.squeeze(b, 1)
c = torch.squeeze(b, 1)
c.shape # Output: torch.Size([2, 2, 1, 2])
# Now you're asking to squeeze only dimension 1 (the second dimension).
# Dimension 1 has size 1, so it is removed.
# The resulting tensor c has shape [2, 2, 1, 2].

# 5. torch.squeeze(b, (1, 2, 3))
c = torch.squeeze(b, (1, 2, 3))
c.shape # Output: torch.Size([2, 2, 2])
# This call tries to squeeze dimensions 1, 2 and 3 (the second, third and fourth dimensions).
# Dimensions 1 and 3 (the second and fourth) have size 1, so they are removed.
# Dimension 2 (the third) has size 2, so it is kept.
# The resulting tensor has the shape [2, 2, 2].
# Note: passing a tuple of dims requires PyTorch 2.0 or later.


torch.Size([2, 2, 2])

**In summary:**

`torch.squeeze(x)` removes all dimensions of size 1.

`torch.squeeze(x, dim)` removes the specified dimension `dim` only if its size is 1.

**What is `torch.unsqueeze()`?**

The function `torch.unsqueeze()` adds a dimension of size 1 at the specified position in a tensor's shape. It lets you "expand" a tensor to have more dimensions without changing its data.

Now, let's go through some examples.

In [17]:
# 1. Initial tensor
x = torch.tensor([1, 2, 3, 4])
x.shape # output: torch.Size([4])
#The tensor x is a 1-dimensional tensor with shape [4], meaning it has one axis with 4 elements:
#[1, 2, 3, 4]

torch.Size([4])

In [18]:
# 2. torch.unsqueeze(x, 0)
torch.unsqueeze(x, 0)
# You're asking to add a new dimension at position 0 (the first dimension).
# The shape of the tensor will go from [4] to [1, 4], meaning a new dimension of size 1 is added at the beginning.
# Visual representation:

# The original tensor [1, 2, 3, 4] becomes [[1, 2, 3, 4]] with shape [1, 4].
# This makes the tensor look like a 2D row vector.
# output: tensor([[ 1,  2,  3,  4]])

tensor([[1, 2, 3, 4]])

In [19]:
# 3. torch.unsqueeze(x, 1)
torch.unsqueeze(x, 1)
# You're asking to add a new dimension at position 1 (the second dimension).
# The shape of the tensor will go from [4] to [4, 1], meaning a new dimension of size 1 is added at the end.
# This turns the tensor into a column vector, where each number is now in its own row.

tensor([[1],
        [2],
        [3],
        [4]])

**Summary of `torch.unsqueeze()`**

`torch.unsqueeze(x, 0)` adds a new dimension at the start, resulting in a 2D row vector `([1, 4])`.

`torch.unsqueeze(x, 1)` adds a new dimension at position 1 (the end, for a 1D tensor), resulting in a 2D column vector `([4, 1])`.

In general, `torch.unsqueeze()` is a useful function when you need to adjust the shape of a tensor to perform operations that require tensors of specific dimensions.
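One operation that needs specific dimensions is broadcasting. As a minimal sketch (the multiplication-table example is an illustration chosen for this note), a column of shape `[4, 1]` multiplied by a row of shape `[1, 4]` gives a `[4, 4]` result:

```python
import torch

x = torch.tensor([1, 2, 3, 4])

column = x.unsqueeze(1)  # shape [4, 1]
row = x.unsqueeze(0)     # shape [1, 4]

# Broadcasting a column against a row builds a 4x4 multiplication table
table = column * row
print(table.shape)
print(table)

# unsqueeze followed by squeeze on the same dim returns the original shape
print(x.unsqueeze(0).squeeze(0).shape)
```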

In [5]:
# Let's unsqueeze
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"New tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous tensor: tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])
New tensor: tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


In [25]:
# torch.permute(input, dims)
# You can rearrange the order of axes values. Returns a view of the original tensor input with its dimensions permuted.

x = torch.rand(2, 3, 5)
x.shape # output: torch.Size([2, 3, 5])
# You create a tensor x with random values and the shape [2, 3, 5]. This means the tensor has:
# 2 elements in the first dimension,
# 3 elements in the second dimension,
# 5 elements in the third dimension.
# We can think of this tensor as a 3D block of numbers with 2 layers, where each layer is a 3x5 matrix.

x_permuted = torch.permute(x, (2, 0, 1)) # keep the original x so the two shapes can be compared
x_permuted.shape
# What does torch.permute() do?
# The function torch.permute() allows you to rearrange (or permute) the dimensions of a tensor. It takes a tuple of indices representing the new order of the dimensions.

# Here, the tuple is (2, 0, 1). This means:

# The dimension that was originally at position 2 (third dimension) moves to position 0.
# The dimension that was originally at position 0 (first dimension) moves to position 1.
# The dimension that was originally at position 1 (second dimension) moves to position 2.
# So, the original tensor x has shape [2, 3, 5], and after permuting, the dimensions are rearranged to become [5, 2, 3].

# Visualizing the change:
# Original shape: [2, 3, 5] (2 layers, each a 3x5 matrix).

# First dimension (2): the number of layers.
# Second dimension (3): the number of rows in each layer.
# Third dimension (5): the number of columns in each layer.
# Permuted shape: [5, 2, 3].

# Now the first dimension (5) refers to what was originally the number of columns.
# The second dimension (2) refers to the original number of layers.
# The third dimension (3) refers to the number of rows in each layer.

torch.permute(x, (2, 0, 1)).size()  # Output: torch.Size([5, 2, 3])


torch.Size([5, 2, 3])

**Summary:**

The original tensor has the shape `[2, 3, 5]`, meaning 2 layers of 3x5 matrices.

After applying `torch.permute(x, (2, 0, 1))`, the dimensions are rearranged, and the resulting tensor has the shape `[5, 2, 3]`. Now, the first dimension (5) corresponds to what was originally the third dimension (the number of columns), the second dimension (2) corresponds to the original first dimension (number of layers), and the third dimension (3) corresponds to the original second dimension (number of rows).
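To check this mapping element by element, here is a small sketch. It uses `torch.arange` instead of random values (an assumption made so the numbers are easy to follow) and also shows that the permuted tensor is a view sharing memory with the original.

```python
import torch

x_original = torch.arange(30).reshape(2, 3, 5)
x_permuted = x_original.permute(2, 0, 1)  # shape [5, 2, 3]

# Element (layer i, row j, column k) is found at (k, i, j) after permuting
i, j, k = 1, 2, 4
print(x_original[i, j, k], x_permuted[k, i, j])

# permute returns a view, so writing through one tensor is visible in the other
x_original[0, 0, 0] = 999
print(x_permuted[0, 0, 0])
```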

In [26]:
# Create tensor with specific shape
x_original = torch.rand(size=(224, 224, 3)) # for image for example, [height, width, colour_channels]

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


## Indexing (selecting data from tensors)
Sometimes you'll want to select specific data from tensors (for example, only the first column or second row).

To do so, you can use indexing.

If you've ever done indexing on Python lists or NumPy arrays, indexing in PyTorch with tensors is very similar.

In [33]:
# Create a tensor
import torch
x = torch.arange(1, 10)
x, x.shape

(tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]), torch.Size([9]))

In [34]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [42]:
# Let's index on our new tensor
x[0]

# x[0] means you're selecting the first "batch" (the first element along the first dimension).
# Since the first dimension has size 1, x[0] gives you the entire 3x3 matrix:

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
## IMPORTANT NOTE ##
# With x = torch.arange(1, 10).reshape(1, 3, 3), the first dimension (position 0) has size 1.
# Therefore the only valid index along that dimension is 0: x[0] works, while x[1] raises an IndexError.
# The other two dimensions have size 3, so x[0][1] and x[0][1][1] work as well.

In [50]:
# Let's index on the middle bracket
x[0][0]
# another way is this notation -> x[0, 0]

# x[0] gives you the 3x3 matrix.
# x[0][0] further indexes into the first row of the 3x3 matrix.

tensor([1, 2, 3])

In [52]:
# Let's index on the last bracket
x[0][0][0]
# another way is -> x[0, 0, 0]

tensor(1)

In [56]:
# From tensor([[1, 2, 3],
#.             [4, 5, 6],
#.             [7, 8, 9]])
# I want to have the number 9.
x[0][2][2]

tensor(9)

In [48]:
# Let's index bracket by bracket
print(f"First square bracket:\n{x[0]}")
print(f"Second square bracket: {x[0][0]}")
print(f"Third square bracket: {x[0][0][0]}")

First square bracket:
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Second square bracket: tensor([1, 2, 3])
Third square bracket: 1


In [57]:
# You can also use : to specify "all values in this dimension" and then use a comma (,) to add another dimension.
x[:, 0]

tensor([[1, 2, 3]])

In [64]:
# Get all values of the 0th and 1st dimensions
# (this returns the whole tensor)
x[:,:]

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [63]:
# Get all values of the 0th and 1st dimensions and index 0 of the 2nd dimension
x[:,:, 0]

tensor([[1, 4, 7]])

In [66]:
# Get all values of the 0th and 1st dimensions and index 2 of the 2nd dimension
x[:,:, 2]

tensor([[3, 6, 9]])

In [68]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension
x[0, 0, :]

tensor([1, 2, 3])

In [70]:
# Get index 0 of the 0th dimension, all values of the 1st dimension and index 2 of the 2nd dimension
x[0,:,2]

tensor([3, 6, 9])

In [71]:
x[0, 0, 1]

tensor(2)

In [72]:
x[:, :, 1]

tensor([[2, 5, 8]])

In [75]:
# Index on x to return 9
x[0, 2, 2]

tensor(9)

In [78]:
# Index on x to return 3,6,9
x[:, :, 2]

tensor([[3, 6, 9]])

## PyTorch tensors & NumPy

Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.

The two main methods you'll want to use for NumPy to PyTorch (and back again) are:

* `torch.from_numpy(ndarray)` - NumPy array -> PyTorch tensor.
* `torch.Tensor.numpy()` - PyTorch tensor -> NumPy array.

In [79]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [81]:
# Default dtype of a tensor is float32
marinetensor = torch.arange(1.0, 8.0)
marinetensor.dtype

torch.float32

In [89]:
# We saw that moving data from NumPy -> PyTorch keeps NumPy's default dtype, float64
# We need to convert it to float32
#tensor = torch.from_numpy(array).type(torch.float32)
tensor = torch.from_numpy(array)
tensor_float32 = tensor.type(torch.float32)
tensor_float32.dtype

torch.float32

In [90]:
# Change the array, keep the tensor (array + 1 creates a new array, so the tensor is unaffected)
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [91]:
# Tensor to NumPy array
tensor = torch.ones(7) # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [92]:
# Change the tensor, keep the array the same (tensor + 1 creates a new tensor, so the array is unaffected)
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
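In the cells above, `array + 1` and `tensor + 1` each create a new object, which is why the other side stays unchanged. In-place changes behave differently because `torch.from_numpy()` shares memory with the source array. A minimal sketch of the difference (variable names are illustrative):

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)  # shares memory with `array`

array += 1          # in-place: the tensor sees the change
print(tensor)

array = array + 1   # rebinding: `array` now names a new, separate array
print(tensor)       # unchanged by the line above

# To get an independent tensor, copy explicitly
independent = torch.from_numpy(array).clone()
array += 1
print(independent)
```

Calling `.clone()` is one way to avoid surprises when the NumPy array is modified later.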

## Reproducibility (trying to take the random out of random)

In short, how a neural network works:
`start with random numbers -> tensor operations -> try to make better (again and again and again)`

To reduce the randomness in neural networks and PyTorch, there is the concept of a **random seed**.

Essentially, what the random seed does is "flavor" the randomness.


In [94]:
# create tensor
torch.rand(3, 3)

tensor([[0.2607, 0.7661, 0.6898],
        [0.4536, 0.5392, 0.2817],
        [0.6514, 0.6159, 0.9396]])

In [102]:
# create 2 random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)
print(f"tensor A:\n{random_tensor_A}\n")
print(f"tensor B:\n{random_tensor_B}\n")
print(f"Does tensor A equal tensor B ?")
random_tensor_A == random_tensor_B

tensor A:
tensor([[0.9330, 0.1238, 0.4567, 0.7601],
        [0.5888, 0.0233, 0.9862, 0.3654],
        [0.2073, 0.9378, 0.1193, 0.0148]])

tensor B:
tensor([[0.3147, 0.8298, 0.9045, 0.4734],
        [0.0749, 0.9044, 0.9477, 0.5506],
        [0.9114, 0.0479, 0.8570, 0.3035]])

Does tensor A equal tensor B ?


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [103]:
1 == 1

True

In [1]:
# Let's make some random but reproducible tensors
import torch

# set the random seed
RANDOM_SEED=42

torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and seeing what happens
random_tensor_C = torch.rand(3, 4)

torch.random.manual_seed(seed=RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])

## Running tensors on GPUs (and making faster computations)
Deep learning algorithms require a lot of numerical operations.

And by default these operations are often done on a CPU (central processing unit).

However, there's another common piece of hardware called a GPU (graphics processing unit), which is often much faster at performing the specific types of operations neural networks need (matrix multiplications) than CPUs.

Your computer might have one.

If so, you should look to use it whenever you can to train neural networks because chances are it'll speed up the training time dramatically.

There are a few ways to first get access to a GPU and secondly get PyTorch to use the GPU.

Note: When I reference "GPU" throughout this course, I'm referencing an NVIDIA GPU with CUDA enabled (CUDA is a computing platform and API that allows GPUs to be used for general-purpose computing, not just graphics) unless otherwise specified.



### Getting a GPU

1. Easiest - Use Google Colab for a free GPU (Colab Pro is a paid upgrade)
2. Use your own GPU - takes a little bit of setup and requires investing in a GPU - https://pytorch.org/get-started/locally/
3. Use cloud computing (AWS, GCP, Azure)




In [1]:
!nvidia-smi

Wed Oct  9 07:29:21 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   51C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

### Check for GPU access with PyTorch

In [2]:
# check for GPU
import torch
torch.cuda.is_available()

True

Since PyTorch is capable of running compute on the GPU or CPU, it's best practice to set up device-agnostic code: https://pytorch.org/docs/main/notes/cuda.html#device-agnostic-code

E.g. run on GPU if available, else default to CPU


In [11]:
# Setup device agnostic code - set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [4]:
# Count number of devices
torch.cuda.device_count()

1
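Beyond counting devices, it can help to see which GPU PyTorch has found. The following is a small sketch (the `device_names` variable is just an illustrative name) that lists the name of every visible CUDA device. On a machine without a CUDA GPU, `torch.cuda.device_count()` returns 0, so the list simply stays empty.

```python
import torch

# Collect the name of each CUDA device PyTorch can see.
# With no CUDA GPU available, device_count() is 0 and the list is empty.
device_names = [
    torch.cuda.get_device_name(index)
    for index in range(torch.cuda.device_count())
]
print(device_names)
```

On the Colab runtime shown by `nvidia-smi` earlier, this would be expected to report the Tesla T4, though the exact string depends on the hardware you are assigned.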

###  Getting PyTorch to run on Apple Silicon
In order to run PyTorch on Apple's M1/M2/M3 GPUs you can use the torch.backends.mps module.

Make sure that both macOS and PyTorch are up to date.

You can test if PyTorch has access to a GPU using torch.backends.mps.is_available().

In [6]:
# Check for Apple Silicon GPU
import torch
torch.backends.mps.is_available() # Note: this returns False if you're not running on an Apple Silicon Mac

False

In [7]:
if torch.cuda.is_available():
    device = "cuda" # Use NVIDIA GPU (if available)
elif torch.backends.mps.is_available():
    device = "mps" # Use Apple Silicon GPU (if available)
else:
    device = "cpu" # Default to CPU if no GPU is available

### Putting tensors (and models) on the GPU

We want our tensors/models on the GPU because the GPU generally performs these computations faster than the CPU.




In [12]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3], device="cpu")

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')
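The heading above mentions models as well as tensors. As a hedged sketch, the same `.to(device)` call can be used on a model. The tiny `nn.Linear` layer and the input values below are assumptions made up for illustration; they are not part of the course code.

```python
import torch
from torch import nn

# Device-agnostic setup: use the GPU if one is available, else the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# A tiny illustrative model (hypothetical), moved to the target device
model = nn.Linear(in_features=3, out_features=1).to(device)

# Inputs must live on the same device as the model's parameters
x = torch.tensor([[1.0, 2.0, 3.0]], device=device)
y = model(x)

param_device = next(model.parameters()).device
print(f"Model parameters on: {param_device}")
print(f"Output on: {y.device}")
```

If the model and its input are on different devices, PyTorch raises a runtime error, which is why the device is chosen once and reused for both.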

### Moving tensors back to the CPU

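You need to move a tensor back to the CPU when you want to use it with a library that doesn't support the GPU. For example, NumPy only works with data on the CPU.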



In [13]:
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [14]:
# To fix the GPU tensor with NumPy issue, we can first set it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [15]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
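As the last cell shows, calling `.cpu()` does not change the original tensor; it leaves `tensor_on_gpu` where it is. A minimal device-agnostic sketch of the conversion follows (the names `example_tensor` and `as_numpy` are illustrative). It relies on the fact that `.cpu()` is safe to call even when the tensor is already on the CPU, in which case no copy is made.

```python
import torch

# Pick whichever device is available
device = "cuda" if torch.cuda.is_available() else "cpu"

example_tensor = torch.tensor([1, 2, 3], device=device)

# .cpu() returns a CPU version of the tensor (or the tensor itself if it is
# already on the CPU), so this line works on both kinds of machine
as_numpy = example_tensor.cpu().numpy()

print(as_numpy)
print(example_tensor.device)  # the original tensor stays on its device
```

Writing the conversion this way means the same notebook cell runs whether or not a GPU is present.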

## Exercises

https://www.learnpytorch.io/00_pytorch_fundamentals/