
## PyTorch Fundamentals

Resource Notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/


In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.2.1+cu121


## Intro to Tensors



Creating Tensors

PyTorch tensors are created using `torch.tensor()`. See https://pytorch.org/docs/stable/tensors.html

* `Scalar` - a single number, 0 dimensions
* `Vector` - 1 dimension
* `Matrix` - a 2-dimensional array of numbers
* `Tensor` - an n-dimensional array of numbers

In [2]:
# Scalar 0 dimensions
scalar = torch.tensor(7)
scalar

tensor(7)

In [3]:
scalar.ndim

0

In [4]:
# Vector 1 dimension
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [5]:
vector.shape

torch.Size([2])

In [6]:
# Matrix 2 dimensions
MATRIX = torch.tensor([[1, 2],
                       [3, 4]])
MATRIX

tensor([[1, 2],
        [3, 4]])

In [7]:
MATRIX.shape, MATRIX.ndim

(torch.Size([2, 2]), 2)

In [8]:
# Tensor n-dimensional array
TENSOR = torch.tensor([[[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [9]:
TENSOR.shape, TENSOR.ndim

(torch.Size([1, 3, 3]), 3)

## Random Tensors



In [10]:
# Create a random tensor of size (3, 4)
torch_rand = torch.rand(3, 4)
torch_rand

tensor([[0.1032, 0.2891, 0.8703, 0.7602],
        [0.6944, 0.9780, 0.4298, 0.4188],
        [0.0345, 0.2508, 0.1709, 0.4688]])

In [11]:
torch_rand.shape, torch_rand.ndim

(torch.Size([3, 4]), 2)

In [12]:
# Create a random tensor with the same shape as an image
Image_rand = torch.rand(size=(3, 224, 224)) # colour channels, height, width
Image_rand.ndim, Image_rand.shape

(3, torch.Size([3, 224, 224]))

## Zeros and Ones

In [13]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 3))
zeros

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

In [14]:
# Create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [15]:
ones.dtype

torch.float32

## Creating a range of tensors

In [16]:
# Use torch.arange(); `end` is exclusive, so end=11 gives the values 0 to 10
one_to_ten = torch.arange(start=0, end=11, step=1)
one_to_ten

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [17]:
# Create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor Data Types

In [18]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,  # None infers the dtype from the data; Python floats become torch.float32
                               device=None, # None uses the current default device (the CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded for autograd
float_32_tensor

tensor([3., 6., 9.])

* `shape` - what shape is the tensor? (some operations require specific shape rules)
* `dtype` - what datatype are the elements within the tensor stored in?
* `device` - what device is the tensor stored on? (usually GPU or CPU)

When you run into issues in PyTorch, the cause is very often one of the three attributes above.

In [19]:
float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))
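As a rough illustration of why these three attributes are worth checking, the sketch below prints them for two tensors that differ only in dtype and then multiplies the tensors together. The helper `describe` and the name `float_16_tensor` are made up for this example and are not part of the notebook.

```python
import torch

def describe(t: torch.Tensor) -> str:
    """Summarise the three attributes that most often cause errors (hypothetical helper)."""
    return f"shape={tuple(t.shape)}, dtype={t.dtype}, device={t.device}"

float_32_tensor = torch.tensor([3.0, 6.0, 9.0])
# Cast a copy to half precision; the original tensor is unchanged
float_16_tensor = float_32_tensor.type(torch.float16)

print(describe(float_32_tensor))
print(describe(float_16_tensor))

# Mixing dtypes in element-wise arithmetic promotes the result to the wider type
product = float_16_tensor * float_32_tensor
print(product.dtype)
```

Printing the attributes before combining tensors is a cheap way to spot a mismatch early.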

## Manipulating Tensors (Tensor Operations)

Tensor Operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication

In [20]:
# Create a Tensor and add 10
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [21]:
# Create a Tensor and multiply by 10
tensor = torch.tensor([1, 2, 3])
tensor * 10

tensor([10, 20, 30])

In [22]:
# Subtract 10
tensor - 10

tensor([-9, -8, -7])

In [23]:
# You can also use torch functions
torch.mul(tensor, 10)

tensor([10, 20, 30])
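Division is in the list of operations above but has no cell of its own, so here is a minimal sketch using the same `tensor`. It assumes nothing beyond the `/` and `//` operators and `torch.div()`.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# True division returns a floating-point result, even for integer input
divided = tensor / 10
print(divided, divided.dtype)

# torch.div() is the function form of the same operation
print(torch.div(tensor, 10))

# Floor division keeps the integer dtype
floored = tensor // 2
print(floored, floored.dtype)

# None of the operations above modify the original tensor
print(tensor)
```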

## Matrix Multiplication

PyTorch implements matrix multiplication in the `torch.matmul()` function.

The two main rules to remember for matrix multiplication are:

* The inner dimensions must match:
  * `(3, 2) @ (3, 2)` won't work
  * `(2, 3) @ (3, 2)` will work
  * `(3, 2) @ (2, 3)` will work
* The resulting matrix has the shape of the outer dimensions:
  * `(2, 3) @ (3, 2)` -> `(2, 2)`
  * `(3, 2) @ (2, 3)` -> `(3, 3)`

Note: "@" in Python is the symbol for matrix multiplication.

Resource: The full rules for matrix multiplication with `torch.matmul()` are documented at https://pytorch.org/docs/stable/generated/torch.matmul.html

Practise: http://matrixmultiplication.xyz/.


* Element-wise multiplication: `[1*1, 2*2, 3*3]` = `[1, 4, 9]`, written `tensor * tensor`
* Matrix multiplication: `[1*1 + 2*2 + 3*3]` = `[14]`, written `tensor.matmul(tensor)`
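The two shape rules above can be checked directly on random tensors. This is only a sketch: the names `a` and `b` are placeholders and the values themselves do not matter, only the shapes.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match, so the result takes the shape of the outer dimensions
print((a @ b).shape)  # (2, 3) @ (3, 2) -> (2, 2)
print((b @ a).shape)  # (3, 2) @ (2, 3) -> (3, 3)

# The @ operator and torch.matmul() compute the same product
same_result = torch.allclose(torch.matmul(a, b), a @ b)
print(same_result)
```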

In [24]:
# Element-wise multiplication
print(tensor, "*", tensor)
print(f"Equals: {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [25]:
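# Matrix multiplication (for two 1-D tensors this is the dot product)
torch.matmul(tensor, tensor)
# The same calculation by hand; as the last expression, this is the value the cell displays
1*1 + 2*2 + 3*3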

14

In [26]:
%%time
value = 0
for i in range(len(tensor)):
    value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 1.64 ms, sys: 324 µs, total: 1.96 ms
Wall time: 3.01 ms


In [27]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 44 µs, sys: 9 µs, total: 53 µs
Wall time: 56.5 µs


tensor(14)

Shape errors are among the most common errors in deep learning.

In [28]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                        [3, 4],
                        [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 8],
                        [9, 10],
                        [10, 11]], dtype=torch.float32)

tensor_A.shape, tensor_B.shape
# The inner dimensions (2 and 3) do not match, so these two tensors can't be matrix multiplied as they are

(torch.Size([3, 2]), torch.Size([3, 2]))
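To see what a shape error looks like in practice, the sketch below attempts the multiplication and catches the exception. The variable `shape_error` is introduced here purely for illustration.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]], dtype=torch.float32)
tensor_B = torch.tensor([[7, 8], [9, 10], [10, 11]], dtype=torch.float32)

try:
    # (3, 2) @ (3, 2): the inner dimensions 2 and 3 differ, so this raises
    torch.matmul(tensor_A, tensor_B)
    shape_error = None
except RuntimeError as error:
    shape_error = error
    print(f"Shape error: {error}")
```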

To fix tensor shape issues, we can change the shape of a tensor using a transpose:
* `tensor.T`, where `tensor` is the tensor to transpose.

In [29]:
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  9., 10.],
        [ 8., 10., 11.]])


In [30]:
tensor_A.shape, tensor_B.T.shape

(torch.Size([3, 2]), torch.Size([2, 3]))

In [31]:
# torch.mm does matrix multiplication for 2-D tensors only (unlike torch.matmul, it does not broadcast)
tensor_C = torch.mm(tensor_A, tensor_B.T)
tensor_C, tensor_C.shape # the shape comes from the outer dimensions of A and B.T

(tensor([[ 23.,  29.,  32.],
         [ 53.,  67.,  74.],
         [ 83., 105., 116.]]),
 torch.Size([3, 3]))

Note: A matrix multiplication like this is sometimes loosely referred to as the dot product of two matrices.



## Finding the min, max, mean, sum etc (Tensor Aggregation)

In [32]:
# Create a tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [33]:
# Find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [34]:
# Find the max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [35]:
# Find the mean - Note: torch.mean() requires a floating-point (or complex) dtype, so the int64 tensor is cast to float32 first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
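The sketch below shows what happens without the cast, plus two ways around it. The names `mean_error`, `mean_via_cast` and `mean_via_dtype` are made up for this example.

```python
import torch

x = torch.arange(0, 100, 10)  # integer input gives dtype torch.int64

try:
    torch.mean(x)  # fails because the mean of an integer tensor is not supported
    mean_error = None
except RuntimeError as error:
    mean_error = error
    print(f"mean() failed on {x.dtype}: {error}")

# Option 1: cast the tensor first
mean_via_cast = x.float().mean()
# Option 2: ask torch.mean() to cast the input before reducing
mean_via_dtype = torch.mean(x, dtype=torch.float32)
print(mean_via_cast, mean_via_dtype)
```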

In [36]:
# Find the sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

Positional Min/Max

You can also find the index at which the maximum or minimum value occurs with `torch.argmax()` and `torch.argmin()` respectively.

In [37]:
# Find the positions of the minimum and maximum values
x, x.argmin(), x.argmax()

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), tensor(0), tensor(9))
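A short sketch of how the returned positions can be used. The 2-D tensor `m` is invented for this example to show the optional `dim` argument.

```python
import torch

x = torch.arange(0, 100, 10)

# argmax/argmin return an index, which can be used to look up the value itself
max_index = x.argmax()
min_index = x.argmin()
print(x[max_index], x[min_index])

# On a 2-D tensor, `dim` chooses the axis to search along
m = torch.tensor([[1, 9, 3],
                  [7, 2, 8]])
flat_index = m.argmax()        # index into the flattened tensor
row_indices = m.argmax(dim=1)  # one index per row
print(flat_index, row_indices)
```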

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a different shape that shares the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped)

In [38]:
# Create a tensor
x = torch.arange(1., 10)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [39]:
# Add an extra dimension
x_reshaped = x.reshape(1, 9) # The new shape must hold exactly 9 elements: (1, 7) fails (7 elements) and so does (2, 9) (18 elements)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [40]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [41]:
# Changing z changes x (because a view of a tensor shares the same memory as the original tensor)
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
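One practical difference between `view()` and `reshape()` follows from this memory sharing. The sketch below is an illustration rather than part of the original notebook, and the names `t`, `flat` and `view_error` are assumptions made for the example.

```python
import torch

x = torch.arange(1., 10)
z = x.view(1, 9)

# A view points at the same underlying memory as the original tensor
shares_memory = z.data_ptr() == x.data_ptr()
print(shares_memory)

# A transposed tensor is not contiguous in memory, so view() cannot reinterpret it
t = torch.arange(6).reshape(2, 3).T
try:
    t.view(6)
    view_error = None
except RuntimeError as error:
    view_error = error
    print(f"view() failed: {error}")

# reshape() handles this case by copying the data when it has to
flat = t.reshape(6)
print(flat)
```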

In [42]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim=0)
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
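The list at the top of this section also mentions `vstack` and `hstack`. A minimal sketch of how they compare with `torch.stack()`, using two small tensors `a` and `b` invented for the example:

```python
import torch

a = torch.tensor([1., 2., 3.])
b = torch.tensor([4., 5., 6.])

stacked_rows = torch.vstack([a, b])        # on top of each other -> shape (2, 3)
stacked_flat = torch.hstack([a, b])        # side by side -> shape (6,)
stacked_cols = torch.stack([a, b], dim=1)  # new dimension at position 1 -> shape (3, 2)

print(stacked_rows.shape, stacked_flat.shape, stacked_cols.shape)
```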

In [43]:
x_reshaped.shape

torch.Size([1, 9])

In [44]:
# Squeezing a tensor removes all dimensions of size 1
x_squeezed = x_reshaped.squeeze()
x_squeezed, x_squeezed.shape

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [45]:
# Unsqueezing a tensor adds a dimension of size 1 at the given position
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
x_unsqueezed, x_unsqueezed.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [46]:
# torch.permute - re-arranges the dimensions of a tensor in a specific order
x_original = torch.rand(size=(224, 224, 3)) # Height, Width, Colour Channels

# Permute the original tensor to rearrange the axis (or dim) order. Indexes start at 0.
x_permuted = x_original.permute(2, 0, 1) # Colour Channels, Height, Width
x_permuted.shape

torch.Size([3, 224, 224])
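Because `permute()` returns a view, the permuted tensor shares memory with the original. The sketch below checks this by writing one value into the original and reading it back through the permuted tensor. The index `(5, 7, 2)` and the value `0.5` are arbitrary choices for illustration.

```python
import torch

x_original = torch.rand(size=(224, 224, 3))  # Height, Width, Colour Channels
x_permuted = x_original.permute(2, 0, 1)     # Colour Channels, Height, Width

# Write to the original at (height=5, width=7, channel=2)...
x_original[5, 7, 2] = 0.5

# ...and read the same element through the view at (channel=2, height=5, width=7)
print(x_original[5, 7, 2], x_permuted[2, 5, 7])
```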

## Indexing (Selecting data from Tensors)

Indexing with PyTorch is very similar to indexing with NumPy.

* Remember that indexes start at 0.
* For a tensor of shape (1, 3, 3), the largest valid index in each dimension is therefore (0, 2, 2).

In [47]:
# Create a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape, x.dim()

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]),
 3)

In [48]:
# Let's index on our new tensor (dim=0)
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [49]:
# Let's index on the middle dimension (dim=1)
x[0][0]

tensor([1, 2, 3])

In [50]:
# Let's index on the last dimension (dim=2)
x[0][2][2]

tensor(9)

In [51]:
# Get all values of the 0th and 1st dimensions, but only index 1 of the 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [52]:
# Get all values of the 0th dimension, but only index 1 of the 1st and 2nd dimensions
x[:, 1, 1]

tensor([5])
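A few more lookups on the same tensor, as a sketch. Chained brackets and comma-separated indexes select the same element, and `:` selects everything along a dimension.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# Comma-separated indexes are equivalent to chained brackets
print(x[0, 2, 2], x[0][2][2])

# Last column: all of dim 0 and dim 1, index 2 of dim 2
last_column = x[:, :, 2]
print(last_column)

# First row: index 0 of dim 0 and dim 1, all of dim 2
first_row = x[0, 0, :]
print(first_row)
```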

## PyTorch Tensors and NumPy

NumPy is a popular Python library for scientific and numerical computing, and PyTorch has functionality to interact with it.

* NumPy array -> PyTorch tensor: `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy array: `torch.Tensor.numpy()`

In [53]:
# Numpy array to Tensor
import torch
import numpy as np

array = np.arange(1.0, 8.0) # NumPy's default dtype is float64
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [54]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy() # The array keeps the tensor's dtype (float32), not NumPy's default
tensor, numpy_tensor


(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
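One detail worth knowing when converting: `torch.from_numpy()` shares memory with the source array rather than copying it. The sketch below illustrates the consequence. The name `tensor_32` is an assumption made for the example.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)  # shares memory with `array`

# An in-place update of the array is visible through the tensor
array += 1
print(tensor)

# Reassigning creates a new array, so the tensor is no longer affected
array = array + 1
print(array, tensor)

# Cast to float32 if later code expects PyTorch's default float dtype
tensor_32 = torch.from_numpy(np.arange(1.0, 8.0)).type(torch.float32)
print(tensor_32.dtype)
```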

## Reproducibility (Trying to take the random out of random)


In short, how a neural network learns:
`start with random numbers -> tensor operations -> try to make better (again and again and again)`

To reduce the randomness in neural networks and PyTorch, we can use a random seed.

In [55]:
# create two random tensors.
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.8156, 0.2430, 0.7002, 0.4180],
        [0.5750, 0.2318, 0.4355, 0.7133],
        [0.4946, 0.9338, 0.3007, 0.9177]])
tensor([[0.9530, 0.9903, 0.7709, 0.8457],
        [0.2952, 0.7544, 0.9463, 0.3990],
        [0.5863, 0.5717, 0.6233, 0.7243]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [56]:
# Let's make some random but reproducible tensors
# Set the random seed
RANDOM_SEED = 42
torch.manual_seed(seed=RANDOM_SEED)
torch_tensor_C = torch.rand(3, 4)

# Reset the seed before each rand() call that should repeat the same values
# Without this, tensor_D would be different to tensor_C
torch.manual_seed(seed=RANDOM_SEED)
torch_tensor_D = torch.rand(3, 4)

print(torch_tensor_C)
print(torch_tensor_D)
print(torch_tensor_C == torch_tensor_D)


tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
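Seeding once makes the whole sequence of random draws reproducible, not each individual draw identical. The sketch below shows this with a helper, `seeded_pair`, which is hypothetical and only exists for this example.

```python
import torch

def seeded_pair(seed: int):
    """Seed once, then draw two tensors in a row (hypothetical helper)."""
    torch.manual_seed(seed)
    first = torch.rand(3, 4)
    second = torch.rand(3, 4)  # continues the same random sequence
    return first, second

first_run = seeded_pair(42)
second_run = seeded_pair(42)

# Within one run the two draws differ...
print(torch.equal(first_run[0], first_run[1]))
# ...but each draw matches its counterpart in the other run
print(torch.equal(first_run[0], second_run[0]))
print(torch.equal(first_run[1], second_run[1]))
```

So resetting the seed is only needed when the same values should be drawn again, as in the cell above.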


## Running on a CPU or GPU

There are several ways to get access to a GPU:

* Use a GPU in Google Colab, for example with Colab Pro - https://colab.research.google.com/notebooks/gpu.ipynb
* Buy your own GPU. There are lots of options, and this post is a good guide - https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
* Use a cloud computing platform such as AWS, GCP or Azure.

In [57]:
!nvidia-smi

Sun May 19 09:01:43 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [59]:
# Check for GPU
torch.cuda.is_available()

True

In [60]:
# Set up device-agnostic code.
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [61]:
# Count number of devices
torch.cuda.device_count()

1
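With a `device` string in hand, tensors can be moved to it with `.to()`. The sketch below is written to run with or without a GPU. The names `tensor_on_device` and `tensor_back_on_cpu` are assumptions made for the example.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Tensors are created on the CPU by default
tensor = torch.tensor([1, 2, 3])
print(tensor.device)

# .to() returns a tensor on the target device; the original stays where it was
tensor_on_device = tensor.to(device)
print(tensor_on_device.device)

# NumPy only works with CPU memory, so move the tensor back before converting
tensor_back_on_cpu = tensor_on_device.cpu().numpy()
print(tensor_back_on_cpu)
```

Calling `.numpy()` directly on a CUDA tensor raises an error, which is why the `.cpu()` step comes first.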