## 00. PyTorch Fundamentals


In [4]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.3.0+cu121


## Introduction to tensors

Creating Tensors

In [5]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [6]:
scalar.ndim

0

In [7]:
# Get tensor back as Python int
scalar.item()

7

In [8]:
# Vector

vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [9]:
vector.ndim

1

In [10]:
vector.shape

torch.Size([2])

In [11]:
# Matrix

matrix = torch.tensor([[7,8], [9 ,10]])

matrix

tensor([[ 7,  8],
        [ 9, 10]])

In [12]:
matrix.ndim

2

In [13]:
matrix[1]

tensor([ 9, 10])

In [14]:
matrix[0]

tensor([7, 8])

In [15]:
matrix.shape

torch.Size([2, 2])

In [29]:
# Tensor

TENSOR = torch.tensor([[[1,2,3],[3,6,9],[2,4,5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [30]:
TENSOR.ndim

3

In [31]:
TENSOR.shape

torch.Size([1, 3, 3])

### Random tensors

Why random tensors?

Random tensors are important because neural networks typically start with tensors full of random numbers and then adjust those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`
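The loop above can be sketched on a toy problem. This is only an illustration under stated assumptions: the data (`y = 2x`), the learning rate of `0.01`, the 100 steps and the use of plain gradient descent are all hypothetical choices made for this sketch, not something the notebook prescribes.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy data that follows y = 2x
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2 * x

# Start with a random number
weight = torch.rand(1, requires_grad=True)
initial_error = ((weight * x - y) ** 2).mean().item()

for _ in range(100):
    # Look at the data: how far off are the current predictions?
    loss = ((weight * x - y) ** 2).mean()
    loss.backward()
    # Update the random number a little in the direction that lowers the error
    with torch.no_grad():
        weight -= 0.01 * weight.grad
        weight.grad.zero_()

final_error = ((weight * x - y) ** 2).mean().item()
print(f"Error before: {initial_error:.4f}, error after: {final_error:.8f}")
print(f"Learned weight: {weight.item():.4f}")
```

The weight starts as a random value between 0 and 1 and ends up close to 2, the value that describes the toy data.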

In [44]:
# Create a random tensor of size (3,4)

random_tensor = torch.rand(3,4)

In [45]:
random_tensor

tensor([[0.4417, 0.7514, 0.3412, 0.5608],
        [0.6536, 0.4727, 0.0050, 0.1855],
        [0.5417, 0.0661, 0.5290, 0.6854]])

In [46]:
random_tensor.ndim


2

In [47]:
random_tensor.shape

torch.Size([3, 4])

In [49]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # Height Width colour channel
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Tensor Zeros and Ones


In [68]:
# Tensor Zeros

tensor_zeros = torch.zeros(3,4)
tensor_zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [70]:
# Tensor Ones
tensor_ones = torch.ones(3,4)
tensor_ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [71]:
tensor_ones.dtype

torch.float32

In [54]:
# Create a range of values, then a tensor of zeros with the same shape (tensors-like)

one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [58]:
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor DataTypes

Note: Tensor datatypes are one of the 3 big sources of errors you'll run into with PyTorch and deep learning:

1. Tensors are not the right datatype
2. Tensors are not the right shape
3. Tensors are not on the right device
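As a small illustration of the first item, here is a sketch of a datatype mismatch and one way to fix it. The tensor names are made up for this example; the point is only that `torch.matmul` expects both inputs to share a dtype.

```python
import torch

int_tensor = torch.tensor([1, 2, 3])          # int64 by default
float_tensor = torch.tensor([1.0, 2.0, 3.0])  # float32 by default

# Mixing int64 and float32 in a matrix multiplication raises a RuntimeError
try:
    torch.matmul(int_tensor, float_tensor)
    dtype_error_raised = False
except RuntimeError as error:
    dtype_error_raised = True
    print(f"Datatype error: {error}")

# Converting one tensor so both share a dtype fixes the problem
fixed = torch.matmul(int_tensor.type(torch.float32), float_tensor)
print(fixed, fixed.dtype)
```

Checking `tensor.dtype`, `tensor.shape` and `tensor.device` before an operation is usually the quickest way to find which of the three issues is at play.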



In [63]:
# Float 32 tensor

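float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,          # datatype of the tensor (None -> float32 for float values)
                               device=None,         # device the tensor lives on (None -> default device)
                               requires_grad=True)  # whether to track gradients for this tensor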
float_32_tensor

tensor([3., 6., 9.], requires_grad=True)

In [62]:
float_32_tensor.dtype

torch.float32

In [64]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16, grad_fn=<ToCopyBackward0>)

In [66]:
float_tensor = float_16_tensor * float_32_tensor
float_tensor


tensor([ 9., 36., 81.], grad_fn=<MulBackward0>)

In [67]:
float_tensor.dtype

torch.float32
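The result above is `float32` because PyTorch promotes mixed datatypes to a common type before computing. A minimal sketch of how to inspect this, using freshly created tensors (the names `a` and `b` are only for illustration):

```python
import torch

a = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
b = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)

# Ask PyTorch which dtype a mix of float16 and float32 promotes to
promoted = torch.promote_types(a.dtype, b.dtype)
print(promoted)

# The actual multiplication follows the same rule
product = a * b
print(product.dtype)

# An integer tensor multiplied by a Python float is promoted to the default float dtype
int_times_float = torch.tensor([1, 2, 3]) * 1.5
print(int_times_float.dtype)
```

Promotion covers element-wise arithmetic like this; other operations can still raise datatype errors, so it is worth checking `dtype` when something fails.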

### Getting information from tensors (tensor attributes)

* Tensors not the right datatype: to get the datatype of a tensor, use `tensor.dtype`
* Tensors not the right shape: to get the shape of a tensor, use `tensor.shape`
* Tensors not on the right device: to get the device of a tensor, use `tensor.device`

In [72]:
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.8374, 0.2822, 0.5475, 0.7662],
        [0.6177, 0.1306, 0.0716, 0.6550],
        [0.2097, 0.9838, 0.6375, 0.4913]])

In [74]:
# Find out details about some_tensor
print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Device tensor is on: {some_tensor.device}")

tensor([[0.8374, 0.2822, 0.5475, 0.7662],
        [0.6177, 0.1306, 0.0716, 0.6550],
        [0.2097, 0.9838, 0.6375, 0.4913]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor is on: cpu


### Manipulating Tensors (tensor operations)

* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix Multiplication



In [78]:
# Create a tensor and add 10 to it

tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [79]:
# Multiply the tensor by 10

tensor*10


tensor([10, 20, 30])

In [80]:
# Subtract 10 from the tensor
tensor-10


tensor([-9, -8, -7])

In [81]:
# Try out PyTorch built-in functions

torch.mul(tensor, 10)

tensor([10, 20, 30])

### Matrix Multiplication

There are two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (for two 1-D tensors this is the dot product)

More information on matrix multiplication: https://www.mathsisfun.com/algebra/matrix-multiplying.html

There are two main rules that matrix multiplication needs to satisfy:

1. The **inner dimensions** must match:

* `(3,2)@(3,2)` won't work

* `(2,3)@(3,2)` will work

* `(3,2)@(2,3)` will work


2. The resulting matrix has the shape of the **outer dimensions**:

* `(2,3)@(3,2)` -> `(2,2)`

* `(3,2)@(2,3)` -> `(3,3)`


In [82]:
# Element wise multiplication

print(tensor, "*", tensor)
print(f"Equals : {tensor * tensor }")


tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals : tensor([1, 4, 9])


In [83]:
# Matrix Multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [84]:
# Matrix multiplication by Hand
1*1 + 2*2 + 3*3

14

In [86]:
%%time
value = 0
for i in range(len(tensor)):
  value +=tensor[i]*tensor[i]
print(value)

tensor(14)
CPU times: user 3.46 ms, sys: 0 ns, total: 3.46 ms
Wall time: 4.03 ms


In [87]:
%%time
torch.matmul(tensor, tensor)


CPU times: user 1.45 ms, sys: 0 ns, total: 1.45 ms
Wall time: 1.76 ms


tensor(14)

### One of the most common errors in deep learning: shape errors

In [88]:
# Shapes for matrix multiplication

tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

torch.mm(tensor_A, tensor_B)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

## Fixing shape errors with a transpose

To fix our tensor shape issue, we can manipulate the shape of one of our tensors using a **transpose**.

A **transpose** switches the axes (dimensions) of a tensor.

In [93]:
tensor_B.shape

torch.Size([3, 2])

In [95]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [98]:
# The matrix multiplication operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}")
print(f"New shapes: tensor_A = {tensor_A.shape}, tensor_B.T = {tensor_B.T.shape}")
print(f"Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])
New shapes: tensor_A = torch.Size([3, 2]), tensor_B.T = torch.Size([2, 3])
Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match
Output:

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape: torch.Size([3, 3])


## Finding the min, max, mean, sum etc (tensor aggregation)

In [101]:
# create a tensor

x= torch.arange(0,100, 10)

x, x.dtype

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), torch.int64)

In [100]:
torch.min(x), torch.max(x)

(tensor(0), tensor(90))

In [105]:
# Find the mean - note: torch.mean() requires a floating point (or complex) datatype, so convert the int64 tensor to float32 first

torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
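To see why the conversion is needed, the sketch below calls `torch.mean()` on the integer tensor directly and then shows two ways around the error. The second one relies on the `dtype` argument of `torch.mean()`, which casts the input before the mean is computed.

```python
import torch

x = torch.arange(0, 100, 10)  # int64 by default

# Calling mean on an integer tensor raises a RuntimeError
try:
    torch.mean(x)
    mean_error_raised = False
except RuntimeError as error:
    mean_error_raised = True
    print(f"Mean error: {error}")

# Option 1: convert the tensor first
mean_a = x.type(torch.float32).mean()

# Option 2: let torch.mean cast the input via its dtype argument
mean_b = torch.mean(x, dtype=torch.float32)

print(mean_a, mean_b)
```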

In [106]:
# Find the sum

torch.sum(x), x.sum()

(tensor(450), tensor(450))

### Find the positional min and max

In [107]:
# Find the position in tensor that has max value with argmax()

torch.argmax(x)

tensor(9)

In [109]:
# Find the position in tensor that has min value with argmin()


torch.argmin(x)

tensor(0)

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape, sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - adds a `1` dimension to a target tensor
* Permute - returns a view of the input with dimensions permuted (swapped) in a certain way
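The operations in the list can be compared side by side by looking at the shapes they produce. This is a compact sketch; the variable names are chosen just for this example.

```python
import torch

x = torch.arange(1., 10.)                # shape [9]

reshaped = x.reshape(3, 3)               # same 9 values arranged as [3, 3]
unsqueezed = x.unsqueeze(dim=0)          # adds a size-1 dimension: [1, 9]
squeezed = unsqueezed.squeeze()          # removes size-1 dimensions: [9]
stacked_vertically = torch.vstack([x, x])    # on top of each other: [2, 9]
stacked_horizontally = torch.hstack([x, x])  # side by side: [18]
permuted = torch.rand(2, 3, 5).permute(2, 0, 1)  # dimensions reordered: [5, 2, 3]

for name, t in [("reshaped", reshaped), ("unsqueezed", unsqueezed),
                ("squeezed", squeezed), ("vstack", stacked_vertically),
                ("hstack", stacked_horizontally), ("permuted", permuted)]:
    print(f"{name}: {tuple(t.shape)}")
```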


In [4]:
# Let's create a tensor

import torch
x= torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [5]:
# Add an extra dimension
x_reshaped = x.reshape(1, 9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [6]:
# Change the view

z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [16]:
# Changing z changes x (because a view of a tensor shares the same memory as the original tensor)

z[:,0] = 5

z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
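One way to confirm the shared memory is to compare data pointers. The sketch below also shows a case where `view` is not possible and `reshape` falls back to copying; the transposed tensor `t` is a made-up example of a non-contiguous tensor.

```python
import torch

x = torch.arange(1., 10.)
z = x.view(1, 9)

# A view points at the same underlying memory as the original tensor
shares_memory = z.data_ptr() == x.data_ptr()
print(f"view shares memory: {shares_memory}")

# A transposed tensor is not contiguous in memory
t = torch.arange(6).reshape(2, 3).T
print(f"t is contiguous: {t.is_contiguous()}")

# view() cannot flatten it without copying, so it raises a RuntimeError
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() still works here, but it has to copy the data
copied = t.reshape(6)
copy_is_separate = copied.data_ptr() != t.data_ptr()
print(copied, f"separate memory: {copy_is_separate}")
```

So `reshape` returns a view when it can and a copy when it must, while `view` never copies.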

In [14]:
# Stack tensors along a new dimension (dim=1 places the copies side by side as columns; dim=0 would stack them on top of each other)

x_stacked = torch.stack([x, x, x, x], dim=1)
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [19]:
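# Squeeze: remove dimensions of size 1
# (assumed input, following the example in the torch.squeeze documentation)
x_zeros = torch.zeros(2, 1, 2, 1, 2)
y = torch.squeeze(x_zeros, 0)          # dim 0 has size 2, so the shape stays [2, 1, 2, 1, 2]
y = torch.squeeze(x_zeros, 1)          # drops dim 1 -> [2, 2, 1, 2]
y = torch.squeeze(x_zeros, (1, 2, 3))  # drops dims 1 and 3 (dim 2 has size 2) -> [2, 2, 2]
y.size()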

torch.Size([2, 2, 2])

## Tensor Permute

Rearranges the dimensions of a target tensor in a specified order.

In [20]:
x = torch.randn(2,3,5)
x.size()

torch.Size([2, 3, 5])

In [21]:
torch.permute(x, (2,0,1)).size()

torch.Size([5, 2, 3])

In [23]:
x_original = torch.rand(size=(224, 224,3)) # height , width and color channel

# Permute the original tensor to rearrange the axis (or dim) order

x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous Shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}") # color channel, height, width


Previous Shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
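Because `permute` returns a view, the permuted tensor shares its data with the original. A small sketch of what that means in practice (the marker value `728218.0` is arbitrary and only there to make the change visible):

```python
import torch

x_original = torch.rand(size=(224, 224, 3))  # height, width, colour channels
x_permuted = x_original.permute(2, 0, 1)     # colour channels, height, width

# Writing to the original is visible through the permuted view
x_original[0, 0, 0] = 728218.0
same_value = x_permuted[0, 0, 0].item() == x_original[0, 0, 0].item()
print(f"permuted view sees the change: {same_value}")

# The view is not contiguous in memory; .contiguous() makes an independent copy
is_contiguous = x_permuted.is_contiguous()
x_copy = x_permuted.contiguous()
copy_is_separate = x_copy.data_ptr() != x_original.data_ptr()
print(f"contiguous: {is_contiguous}, copy has its own memory: {copy_is_separate}")
```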


In [24]:
x_original

tensor([[[0.4134, 0.9984, 0.6394],
         [0.0030, 0.6878, 0.9862],
         [0.9946, 0.2702, 0.8809],
         ...,
         [0.8483, 0.7368, 0.1735],
         [0.4561, 0.7456, 0.1353],
         [0.6806, 0.8913, 0.5277]],

        [[0.8555, 0.2090, 0.2196],
         [0.3207, 0.8743, 0.7876],
         [0.4321, 0.5379, 0.7863],
         ...,
         [0.0498, 0.5838, 0.5965],
         [0.3041, 0.7451, 0.6922],
         [0.8928, 0.7441, 0.0657]],

        [[0.5395, 0.9487, 0.8218],
         [0.9523, 0.7990, 0.6414],
         [0.0110, 0.6943, 0.7148],
         ...,
         [0.1858, 0.6297, 0.6107],
         [0.8090, 0.3495, 0.9138],
         [0.3734, 0.6737, 0.2854]],

        ...,

        [[0.6179, 0.0298, 0.6160],
         [0.6517, 0.0069, 0.4195],
         [0.4183, 0.2952, 0.3762],
         ...,
         [0.0109, 0.2254, 0.7564],
         [0.4316, 0.6147, 0.3592],
         [0.3555, 0.3033, 0.5111]],

        [[0.3777, 0.4190, 0.1893],
         [0.7475, 0.2774, 0.6496],
         [0.

In [25]:
x_permuted

tensor([[[0.4134, 0.0030, 0.9946,  ..., 0.8483, 0.4561, 0.6806],
         [0.8555, 0.3207, 0.4321,  ..., 0.0498, 0.3041, 0.8928],
         [0.5395, 0.9523, 0.0110,  ..., 0.1858, 0.8090, 0.3734],
         ...,
         [0.6179, 0.6517, 0.4183,  ..., 0.0109, 0.4316, 0.3555],
         [0.3777, 0.7475, 0.7933,  ..., 0.3071, 0.5055, 0.1661],
         [0.7886, 0.8603, 0.1490,  ..., 0.0393, 0.5112, 0.1548]],

        [[0.9984, 0.6878, 0.2702,  ..., 0.7368, 0.7456, 0.8913],
         [0.2090, 0.8743, 0.5379,  ..., 0.5838, 0.7451, 0.7441],
         [0.9487, 0.7990, 0.6943,  ..., 0.6297, 0.3495, 0.6737],
         ...,
         [0.0298, 0.0069, 0.2952,  ..., 0.2254, 0.6147, 0.3033],
         [0.4190, 0.2774, 0.5716,  ..., 0.6992, 0.2690, 0.9555],
         [0.6827, 0.5974, 0.6356,  ..., 0.7066, 0.3815, 0.5567]],

        [[0.6394, 0.9862, 0.8809,  ..., 0.1735, 0.1353, 0.5277],
         [0.2196, 0.7876, 0.7863,  ..., 0.5965, 0.6922, 0.0657],
         [0.8218, 0.6414, 0.7148,  ..., 0.6107, 0.9138, 0.

In [34]:
x = torch.randn(1,1,1,1)
x.shape

torch.Size([1, 1, 1, 1])

## Indexing (selecting data from tensors)

In [35]:
import torch

In [41]:
x= torch.arange(1, 10).reshape(1,3,3)

In [42]:
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [43]:
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [45]:
x[0][0][0]

tensor(1)

In [46]:
x[::]

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [56]:
# Get all values of 0th and 1st dimensions but only index 1 of 2nd dimension

x[:,:,1]

tensor([[2, 5, 8]])

In [55]:
# Get all values of the 0th dimension but only index 1 of the 1st and 2nd dimensions

x[:, 1, 1]

tensor([5])

In [57]:
# Index on x to return 9

x[:,2,2]

tensor([9])

In [59]:
# Index on x to return 3, 6, 9
x[:,:, 2]

tensor([[3, 6, 9]])

## PyTorch tensors & NumPy

NumPy is a popular Python library for scientific numerical computing.

Because of this, PyTorch has functionality to interact with it.

* Data in NumPy, want it as a PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [62]:

# Numpy array to tensor

import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [63]:
array.dtype

dtype('float64')

In [64]:
tensor.dtype

torch.float64

In [65]:
torch.arange(1.0, 8.0).dtype

torch.float32

In [67]:
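# Change the value of array, what will this do to `tensor`?

# `array + 1` creates a new array and rebinds the name, so `tensor` does not change
# (the output below shows `array` after this cell was run twice)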

array = array + 1
array, tensor

(array([3., 4., 5., 6., 7., 8., 9.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [68]:
# Tensor to NumPy

tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor


(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [69]:
# Change the tensor (`tensor + 1` creates a new tensor, so numpy_tensor keeps the old values)

tensor = tensor +1

In [70]:
tensor, numpy_tensor


(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
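The cells above use `array = array + 1` and `tensor = tensor + 1`, which create new objects. With in-place changes the picture is different, because `torch.from_numpy()` and `Tensor.numpy()` share memory with their source. A short sketch (variable names are only for this example):

```python
import numpy as np
import torch

# NumPy -> PyTorch: an in-place change to the array shows up in the tensor
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array += 1  # in-place, same memory
print(array, tensor)

# Converting to float32 creates a copy with PyTorch's default float dtype
tensor_float32 = torch.from_numpy(array).type(torch.float32)
print(tensor_float32.dtype)

# PyTorch -> NumPy: an in-place change to the tensor shows up in the array
ones_tensor = torch.ones(7)
ones_array = ones_tensor.numpy()
ones_tensor.add_(1)  # in-place addition
print(ones_tensor, ones_array)
```

If the two should stay independent, copying first (for example with `torch.tensor(array)` or `tensor.clone()`) avoids the shared memory.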

## Reproducibility (trying to take the random out of random)

In short, how a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> .....`



In [71]:
torch.rand(3,3)

tensor([[0.1036, 0.2109, 0.6969],
        [0.7049, 0.0202, 0.8327],
        [0.5645, 0.0660, 0.8927]])

In [74]:
# Set the random seed

RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C==random_tensor_D)





tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
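Note that the seed is set again before the second `torch.rand()` call. A sketch of what happens without that, plus an alternative that uses separate `torch.Generator` objects instead of the global seed (the generator names are hypothetical):

```python
import torch

# Seeding once: the second call continues the random sequence, so the tensors differ
torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)
print(torch.equal(first, second))

# Alternative: give each call its own generator with the same seed
generator_c = torch.Generator().manual_seed(42)
generator_d = torch.Generator().manual_seed(42)
random_c = torch.rand(3, 4, generator=generator_c)
random_d = torch.rand(3, 4, generator=generator_d)
print(torch.equal(random_c, random_d))
```

Explicit generators keep the reproducible part of the code from depending on how many random calls happened earlier in the notebook.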


## Extra resources for reproducibility

* https://pytorch.org/docs/stable/notes/randomness.html

* https://en.wikipedia.org/wiki/Random_seed

## Running tensors and PyTorch objects on GPUs (and making computation faster)

GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes to make everything hunky dory (good).

### Getting a GPU

1. Easiest - use Google Colab for a free GPU (with options to upgrade)

2. Use your own GPU - takes a little bit of setup and requires the investment of purchasing a GPU; there are lots of options: https://www.fool.com/investing/stock-market/market-sectors/information-technology/gpu-stocks/

3. Use cloud computing - GCP, AWS, Azure; these services allow you to rent computers in the cloud and access them

For 2 and 3, PyTorch + GPU drivers (CUDA) take a little bit of setting up; to do this, refer to the PyTorch setup documentation: https://pytorch.org/get-started/locally/



In [1]:
!nvidia-smi

Sun Jun 23 12:17:32 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8              11W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Check for GPU access with PyTorch


In [4]:
# check for GPU access with PyTorch
import torch
torch.cuda.is_available()


True

## Setting up device-agnostic code

Since PyTorch is capable of running compute on the GPU or the CPU, it's best practice to set up device-agnostic code: https://pytorch.org/docs/stable/notes/cuda.html#best-practices

E.g. run on the GPU if available, else default to the CPU.

In [5]:
# Setup device agnostic code

device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [6]:
# Count number of devices

torch.cuda.device_count()

1

## Putting tensors (and models) on the GPU

The reason we want our tensors/models on the GPU is because using a GPU results in faster computations
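The same `.to(device)` call works for models as well as tensors. The sketch below uses a single `nn.Linear` layer as a stand-in model, which is an assumption made for illustration since the notebook has not built a model yet.

```python
import torch
from torch import nn

# Use the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# A stand-in model (hypothetical, just one linear layer) moved to the target device
model = nn.Linear(in_features=3, out_features=1).to(device)

# Inputs must live on the same device as the model
inputs = torch.tensor([[1.0, 2.0, 3.0]]).to(device)
outputs = model(inputs)

print(f"Model device: {next(model.parameters()).device}")
print(f"Output device: {outputs.device}, output shape: {tuple(outputs.shape)}")
```

If the model and its inputs end up on different devices, PyTorch raises an error, which is the third of the big errors listed earlier.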

In [7]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1,2,3])

# tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [14]:
# Move tensor to GPU (if available)

device = "cuda" if torch.cuda.is_available() else "cpu"

tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### Moving tensors back to the CPU

In [12]:
# Move the GPU tensor back to the CPU
tensor_on_cpu = tensor_on_gpu.to("cpu")
tensor_on_cpu

tensor([1, 2, 3])

In [15]:
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [23]:
# To fix the GPU tensor with NumPy issue, we can first move it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu


array([1, 2, 3])

In [22]:
tensor_on_cpu.numpy()

array([1, 2, 3])

In [24]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')