
## 00. PyTorch Fundamentals

Resource notebook: https://colab.research.google.com/drive/17OuMUkwYDuf8WNfYBYOR9Sp9FgbETov2#scrollTo=Acm4WMoEq7Fm

In [None]:
print("Hello I'm excited to learn PyTorch!")

Hello I'm excited to learn PyTorch!


## Introduction to Tensors

### Creating tensors

In [None]:
import torch

In [None]:
# scalar

scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.item()

7

In [None]:
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# Matrix Creation

MATRIX = torch.tensor([[7,8],
                       [9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# TENSOR

TENSOR = torch.tensor([[[1,2,3],
                        [3,6,9],
                        [2,4,5]]])

In [None]:
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [None]:
TENSOR[0][0]

tensor([1, 2, 3])

### Random Tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those random numbers to better represent the data.

`Start with random number -> look at data -> update random numbers -> look at data -> update random numbers`

In [None]:
# Create a random tensor of size (3,4)

random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.3946, 0.5978, 0.7144, 0.9649],
        [0.9779, 0.1482, 0.0139, 0.8450],
        [0.7273, 0.5627, 0.1940, 0.1313]])

In [None]:
# Create a random tensor with similar shape to an image tensor

random_image_size_tensor = torch.rand(size = (3,224,224)) # colour channels, height, width
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([3, 224, 224]), 3)

### Zeros and Ones

In [None]:
# Create a tensor of zeros
zeros = torch.zeros(size = (3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
zeros * random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create a tensor of ones
ones = torch.ones(size = (3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.dtype

torch.float32

### Creating a range of tensors and tensors-like

In [None]:
# torch.range() gives a deprecation message, so use torch.arange() instead

one_to_ten = torch.arange(start = 0, end = 11, step = 1)
one_to_ten

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# Create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input = one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor datatypes

**Note:** Tensor datatypes are behind one of the 3 big errors you will run into with PyTorch and deep learning:

1. Tensors are not the right datatype
2. Tensors are not the right shape
3. Tensors are not on the right device

In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype = None,  # what datatype the tensor is (e.g. torch.float16)
                               device = None, # what device your tensor is on (e.g. "cpu", "cuda")
                               requires_grad=False) # whether or not to track gradients for this tensor's operations
float_32_tensor, float_32_tensor.dtype

(tensor([3., 6., 9.]), torch.float32)

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor, float_16_tensor.dtype

(tensor([3., 6., 9.], dtype=torch.float16), torch.float16)

In [None]:
float_32_tensor * float_16_tensor

tensor([ 9., 36., 81.])

In [None]:
int_32_tensor = torch.tensor([3,6,9], dtype = torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [None]:
float_32_tensor * float_16_tensor * int_32_tensor

tensor([ 27., 216., 729.])
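The multiplications above succeed even though the datatypes differ because PyTorch promotes mixed operands to a common dtype. A minimal sketch of how to inspect that promotion, using `torch.result_type` and `torch.promote_types` (the tensors are recreated here so the block runs on its own):

```python
import torch

float_32_tensor = torch.tensor([3.0, 6.0, 9.0])             # float32 by default
float_16_tensor = float_32_tensor.type(torch.float16)
int_32_tensor = torch.tensor([3, 6, 9], dtype=torch.int32)

# Ask PyTorch which dtype a mixed-dtype operation would produce
print(torch.result_type(float_32_tensor, float_16_tensor))  # torch.float32
print(torch.result_type(float_16_tensor, int_32_tensor))    # torch.float16
print(torch.promote_types(torch.int32, torch.float32))      # torch.float32

# The chained product therefore ends up as float32
product = float_32_tensor * float_16_tensor * int_32_tensor
print(product.dtype)                                         # torch.float32
```

Promotion only covers element-wise arithmetic like this; other operations can still raise datatype errors, so checking `tensor.dtype` remains a useful habit.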

### Getting information from tensors (Tensor Attributes)

1. Tensors are not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors are not the right shape - to get the shape of a tensor, use `tensor.shape`
3. Tensors are not on the right device - to get the device of a tensor, use `tensor.device`

In [None]:
# Create a tensor

some_tensor = torch.rand(3,4)
some_tensor = some_tensor.type(torch.float16)
some_tensor

tensor([[0.3779, 0.1730, 0.2104, 0.1746],
        [0.5278, 0.1490, 0.7461, 0.5713],
        [0.6973, 0.7349, 0.6143, 0.2389]], dtype=torch.float16)

In [None]:
# Find out details about some_tensor

print(some_tensor)
print(f"Datatype of tensor : {some_tensor.dtype}")
print(f"Shape of tensor : {some_tensor.shape}")
print(f"Device tensor is on : {some_tensor.device}")

tensor([[0.3779, 0.1730, 0.2104, 0.1746],
        [0.5278, 0.1490, 0.7461, 0.5713],
        [0.6973, 0.7349, 0.6143, 0.2389]], dtype=torch.float16)
Datatype of tensor : torch.float16
Shape of tensor : torch.Size([3, 4])
Device tensor is on : cpu


### Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [None]:
# Create a tensor

tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
# Subtract 10
tensor - 10

tensor([-9, -8, -7])

### Matrix Multiplication

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication

There are two main rules that matrix multiplication needs to satisfy:
1. The **inner dimensions** must match:
* `(3,2) @ (3,2)` won't work
* `(2,3) @ (3,2)` will work

2. The resulting matrix has the shape of the **outer dimensions**:
* `(2,3) @ (3,2)` -> `(2,2)`
* `(3,2) @ (2,3)` -> `(3,3)`
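The two rules can be sketched as a small shape check. `matmul_output_shape` is a hypothetical helper written for illustration (it is not part of PyTorch) and only handles 2-D shapes:

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Hypothetical helper: 2-D matmul result shape, or None if the inner dimensions differ."""
    if shape_a[1] != shape_b[0]:      # rule 1: inner dimensions must match
        return None
    return (shape_a[0], shape_b[1])   # rule 2: result takes the outer dimensions

print(matmul_output_shape((3, 2), (3, 2)))  # None
print(matmul_output_shape((2, 3), (3, 2)))  # (2, 2)
print(matmul_output_shape((3, 2), (2, 3)))  # (3, 3)

# The same rules checked against real tensors; `@` is shorthand for torch.matmul
a = torch.rand(2, 3)
b = torch.rand(3, 2)
print((a @ b).shape)             # torch.Size([2, 2])
print(torch.matmul(b, a).shape)  # torch.Size([3, 3])
```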


In [None]:

# Element-wise multiplication

print(f"{tensor}, '*', {tensor}, '=', {tensor*tensor}")

tensor([1, 2, 3]), '*', tensor([1, 2, 3]), '=', tensor([1, 4, 9])


In [None]:
# Matrix Multiplication

torch.matmul(tensor,tensor)

tensor(14)

In [None]:
%%time
value = 0
for i in range(len(tensor)):
  value += tensor[i]*tensor[i]
print(value)

tensor(14)
CPU times: user 2.19 ms, sys: 0 ns, total: 2.19 ms
Wall time: 2.12 ms


In [None]:
%%time
torch.matmul(tensor,tensor)

CPU times: user 543 µs, sys: 0 ns, total: 543 µs
Wall time: 448 µs


tensor(14)

### One of the most common errors in deep learning: shape errors

In [None]:
# Shape for matrix multiplication
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])


In [None]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a **transpose**.

A **transpose** switches the axes of a given tensor.

In [None]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [None]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [None]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}")
print(f"New shapes: tensor_A = {tensor_A.shape}, same as above, tensor_B.T = {tensor_B.T.shape}")
print(f"Multiplying : {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match")
print(f"Output: \n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape : {output.shape}")


Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])
New shapes: tensor_A = torch.Size([3, 2]), same as above, tensor_B.T = torch.Size([2, 3])
Multiplying : torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match
Output: 

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape : torch.Size([3, 3])


In [None]:
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

### Finding the min, max, sum, etc. (tensor aggregation)

In [None]:
# Create a tensor

x = torch.arange(0,100,10)

In [None]:
# Find the min

torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
# Find the max

torch.max(x), x.max()

(tensor(90), tensor(90))

In [None]:
# Find the mean - note: torch.mean() requires a floating-point (or complex) tensor, so cast the int64 tensor to float32 first

torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
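To make the datatype requirement concrete, here is a short sketch of what happens when `torch.mean()` is called on an integer tensor, followed by two ways around it. The `dtype` keyword of `torch.mean` casts the input before the reduction; the exact error message may vary between PyTorch versions.

```python
import torch

x = torch.arange(0, 100, 10)   # int64 by default

# Calling mean() directly on an integer tensor raises an error
try:
    torch.mean(x)
except RuntimeError as error:
    print(f"mean() on {x.dtype} failed: {error}")

# Option 1: cast first, then take the mean
mean_cast = x.type(torch.float32).mean()

# Option 2: let torch.mean cast the input through its dtype argument
mean_kwarg = torch.mean(x, dtype=torch.float32)

print(mean_cast, mean_kwarg)   # tensor(45.) tensor(45.)
```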

In [None]:
# Find the sum

torch.sum(x), x.sum()

(tensor(450), tensor(450))

### Finding the positional min and max

In [None]:
# Find the position of the minimum value with argmin -> returns the index where the minimum value occurs

x.argmin()

tensor(0)

In [None]:
x[0]

tensor(0)

In [None]:
# Find the position of the maximum value with argmax

x.argmax()

tensor(9)

In [None]:
x[9]

tensor(90)
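For tensors with more than one dimension, `argmin` and `argmax` also accept a `dim` argument. A small sketch with a made-up 2-D tensor (`scores` is an illustrative name and its values are arbitrary):

```python
import torch

scores = torch.tensor([[10, 50, 30],
                       [70, 20, 90]])

# Without dim, the index refers to the flattened tensor
print(scores.argmax())        # tensor(5)

# With dim, one index is returned per row (dim=1) or per column (dim=0)
print(scores.argmax(dim=1))   # tensor([1, 2])
print(scores.argmin(dim=0))   # tensor([0, 1, 0])
```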

### Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor in a certain shape while sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with dimensions permuted (swapped) in a certain way

In [None]:
# Let's create a tensor

x = torch.arange(1.,10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Change the view
z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))
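Because a view shares memory with the original tensor, changing one changes the other. A minimal sketch (the names `base` and `base_view` are illustrative and chosen so they do not overwrite `x` or `z`):

```python
import torch

base = torch.arange(1., 10.)
base_view = base.view(1, 9)

# Writing through the view also changes the original tensor
base_view[:, 0] = 5
print(base)       # the first element is now 5.
print(base_view)

# Both tensors point at the same underlying storage
print(base.data_ptr() == base_view.data_ptr())  # True
```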

In [None]:
# Stack tensors on top of each other

x_stacked = torch.stack([x,x,x,x], dim = 0)
x_stacked

tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.],
        [1., 2., 3., 4., 5., 6., 7., 8., 9.],
        [1., 2., 3., 4., 5., 6., 7., 8., 9.],
        [1., 2., 3., 4., 5., 6., 7., 8., 9.]])
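The `dim` argument controls where the new dimension goes, and `torch.vstack` / `torch.hstack` behave differently from `torch.stack` for 1-D inputs. A short sketch comparing the resulting shapes (`row` is an illustrative name):

```python
import torch

row = torch.tensor([1., 2., 3.])

# torch.stack always adds a new dimension
print(torch.stack([row, row], dim=0).shape)  # torch.Size([2, 3])
print(torch.stack([row, row], dim=1).shape)  # torch.Size([3, 2])

# vstack places the 1-D tensors on top of each other as rows
print(torch.vstack([row, row]).shape)        # torch.Size([2, 3])

# hstack joins 1-D tensors end to end without adding a dimension
print(torch.hstack([row, row]).shape)        # torch.Size([6])
```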

In [None]:
# torch.squeeze() - removes all single dimensions from a target tensor

print(f"Previous tensor : {x_reshaped}")
print(f"Previous shape : {x_reshaped.shape}")

# Remove extra dimensions from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape : {x_squeezed.shape}")

Previous tensor : tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]])
Previous shape : torch.Size([1, 9])

New tensor: tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape : torch.Size([9])


In [None]:
# torch.unsqueeze() - adds a single dimension to a target tensor at a specific dim

print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim = 1)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
New shape: torch.Size([9, 1])


In [None]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order

x_original = torch.rand(size = (224,224,3)) #height, width, colour_channels

# Permute the original tensor to rearrange the axis (or dim) order
x_permuted = x_original.permute(2,0,1)  # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape : {x_original.shape}")
print(f"Permuted shape : {x_permuted.shape}")

Previous shape : torch.Size([224, 224, 3])
Permuted shape : torch.Size([3, 224, 224])
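Since `permute` returns a view, the permuted tensor shares its data with the original. The sketch below illustrates this with illustrative names (`image`, `image_permuted`) and a marker value of `2.0`, chosen only because `torch.rand` never produces it:

```python
import torch

image = torch.rand(size=(224, 224, 3))     # height, width, colour_channels
image_permuted = image.permute(2, 0, 1)    # colour_channels, height, width

# Changing the original is visible through the permuted view
image[0, 0, 0] = 2.0
print(image_permuted[0, 0, 0])             # tensor(2.)

# The view is not laid out contiguously in memory; .contiguous() makes a copy that is
print(image_permuted.is_contiguous())      # False
contiguous_copy = image_permuted.contiguous()
print(contiguous_copy.is_contiguous())     # True
```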


### Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# Create a tensor

x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Let's index on our new tensor

x[0], x[0,:,:]

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]))

In [None]:
x[0][0], x[0,0,:]

(tensor([1, 2, 3]), tensor([1, 2, 3]))

In [None]:
x[0][0][0], x[0,0,0]

(tensor(1), tensor(1))

In [None]:
x[0,2,2], x[0,:,2]

(tensor(9), tensor([3, 6, 9]))
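The `:` selector means "all values in this dimension", which can be combined with fixed indices in any position. A few more cases as a sketch (the tensor is recreated under the illustrative name `grid`):

```python
import torch

grid = torch.arange(1, 10).reshape(1, 3, 3)

# All of dim 0 and dim 1, but only index 1 of dim 2 (the middle column)
print(grid[:, :, 1])   # tensor([[2, 5, 8]])

# All of dim 0, index 1 of dim 1 and dim 2 (the centre value, keeping one dimension)
print(grid[:, 1, 1])   # tensor([5])

# Index 0 of dim 0 and dim 1, all of dim 2 (the first row)
print(grid[0, 0, :])   # tensor([1, 2, 3])
```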

### PyTorch tensors and NumPy

NumPy is a popular Python library for scientific and numerical computing.
Because of this, PyTorch has functionality to interact with it.

* Data in NumPy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [None]:
# NumPy array to tensor

import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Change the value of array, what will this do to `tensor`?
# (`array + 1` creates a new array, so `tensor` keeps the original values)

array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
torch.arange(1.0, 8.).dtype

torch.float32

In [None]:
# Tensor to NumPy array

tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
# Change the tensor, what happens to `numpy_tensor`?
# (`tensor + 1` creates a new tensor, so `numpy_tensor` keeps the original values)

tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
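The values above stay unchanged because `array + 1` and `tensor + 1` create new objects. With in-place operations the picture is different, since `torch.from_numpy()` and `Tensor.numpy()` share memory with their source on the CPU. A small sketch (the names `shared`, `ones`, `ones_np` and `as_float32` are illustrative):

```python
import numpy as np
import torch

# NumPy -> PyTorch: an in-place change to the array shows up in the tensor
array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)
array += 1                       # in-place, no new array is created
print(shared)                    # values now run from 2. to 8.

# PyTorch -> NumPy: an in-place change to the tensor shows up in the array
ones = torch.ones(7)
ones_np = ones.numpy()
ones.add_(1)                     # the trailing underscore marks an in-place op
print(ones_np)                   # every value is now 2.

# from_numpy keeps NumPy's float64; convert if float32 is needed
as_float32 = torch.from_numpy(array).type(torch.float32)
print(as_float32.dtype)          # torch.float32
```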

## Reproducibility (trying to take the random out of random)

In short how a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them representations of the data -> again -> again -> again...`

To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.
Essentially, what the random seed does is "flavour" the randomness.

In [None]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_A)
print(random_tensor_A==random_tensor_B)

tensor([[0.3488, 0.0105, 0.0191, 0.9960],
        [0.9368, 0.7120, 0.4854, 0.9227],
        [0.9387, 0.1647, 0.7630, 0.9034]])
tensor([[0.3488, 0.0105, 0.0191, 0.9960],
        [0.9368, 0.7120, 0.4854, 0.9227],
        [0.9387, 0.1647, 0.7630, 0.9034]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Let's make some random but reproducible tensors

import torch

RANDOM_SEED = 41
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C==random_tensor_D)

tensor([[0.2364, 0.2266, 0.8005, 0.1692],
        [0.2650, 0.7720, 0.1282, 0.7452],
        [0.8045, 0.6357, 0.5896, 0.6933]])
tensor([[0.2364, 0.2266, 0.8005, 0.1692],
        [0.2650, 0.7720, 0.1282, 0.7452],
        [0.8045, 0.6357, 0.5896, 0.6933]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
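Note that `torch.manual_seed()` is called before each `torch.rand()` above; without the second call, the two tensors would differ. As an alternative sketch, a dedicated `torch.Generator` can carry the seed instead of the global random state (the seed `41` matches the cell above; the variable names are illustrative):

```python
import torch

# Two generators seeded identically produce identical random tensors
gen_a = torch.Generator().manual_seed(41)
gen_b = torch.Generator().manual_seed(41)

t_a = torch.rand(3, 4, generator=gen_a)
t_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(t_a, t_b))   # True

# Drawing again from the same generator without reseeding gives new numbers
t_c = torch.rand(3, 4, generator=gen_a)
print(torch.equal(t_a, t_c))   # False
```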


## Running tensors and PyTorch objects on GPUs (and making faster computations)

`GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch`

## 1. Getting a GPU

1. Easiest - Use Google Colab for a free GPU (options to upgrade as well)
2. Use your own GPU - this requires a bit of setup and the investment of purchasing a GPU; there are lots of options.
3. Use cloud computing - GCP, AWS and Azure allow you to rent computers in the cloud and access them.

## 2. Check for GPU access with PyTorch


In [None]:
# Check for GPU access with PyTorch
import torch
torch.cuda.is_available()

True

Since PyTorch is capable of running compute on the GPU or the CPU, it's best practice to set up device-agnostic code.

E.g. run on the GPU if available, else default to the CPU.

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count number of devices
torch.cuda.device_count()

1

## 3. Putting tensors (and models) on the GPU

The reason we want our tensors/models on the GPU is that using a GPU results in faster computations.

In [None]:
# Create a tensor (default on CPU)

tensor = torch.tensor([1,2,3], device ='cpu')
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
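Models are moved with the same `.to(device)` call. The sketch below uses a single `nn.Linear` layer as a stand-in model (an assumption for illustration; no model is defined in this notebook) and falls back to the CPU when no GPU is available:

```python
import torch
from torch import nn

# Device-agnostic setup, as above
device = "cuda" if torch.cuda.is_available() else "cpu"

# A tiny stand-in model, moved to the target device
model = nn.Linear(in_features=3, out_features=1).to(device)

# Inputs must live on the same device as the model's parameters
inputs = torch.tensor([1.0, 2.0, 3.0], device=device)
output = model(inputs)

print(next(model.parameters()).device)  # device the model's weights are on
print(output.device)                    # same device as the model and inputs
```

If the model and its inputs sit on different devices, the forward pass raises an error, which is the third of the "big errors" listed earlier.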

## 4. Moving tensors back to the cpu

In [None]:
tensor_on_cpu = tensor_on_gpu.to('cpu')


In [None]:
tensor_on_gpu.cpu().numpy()

array([1, 2, 3])

In [None]:
tensor_on_cpu.numpy(), tensor_on_cpu.device

(array([1, 2, 3]), device(type='cpu'))
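NumPy only works with CPU memory, which is why the tensor is moved with `.cpu()` before calling `.numpy()`. A sketch of a device-agnostic conversion follows; `to_numpy` is a hypothetical helper written for this example, not a PyTorch function:

```python
import torch

def to_numpy(tensor):
    """Hypothetical helper: copy a tensor to the CPU (if needed) and convert it to NumPy."""
    # detach() drops gradient tracking, cpu() is a no-op for tensors already on the CPU
    return tensor.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([1, 2, 3], device=device)

# Calling .numpy() directly on a GPU tensor raises an error
if t.is_cuda:
    try:
        t.numpy()
    except (TypeError, RuntimeError) as error:
        print(f"Direct conversion failed: {error}")

print(to_numpy(t))   # [1 2 3]
```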