<a href="https://colab.research.google.com/github/Princekay88/my-pytorch-learning-journey/blob/main/pytorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
print("Hello, I'm excited to learn PyTorch!")

Hello, I'm excited to learn PyTorch!


In [None]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


In [None]:
import torch
print(torch.__version__)

2.4.0+cu121


## Tensor Datatypes

Having the wrong tensor datatype is one of the three big errors you'll run into with PyTorch and deep learning:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device


### Getting information from tensors

To check a tensor's datatype - use `tensor.dtype`

To check a tensor's shape - use `tensor.shape`

To check which device a tensor is on - use `tensor.device`


In [None]:
# Create a tensor
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.2797, 0.8295, 0.4115, 0.5190],
        [0.5986, 0.5771, 0.9777, 0.0208],
        [0.7022, 0.0540, 0.0084, 0.9149]])

In [None]:
# Find out details about some_tensor
print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}\n Shape of Tensor: {some_tensor.shape}\n Device of Tensor: {some_tensor.device}")

tensor([[0.2797, 0.8295, 0.4115, 0.5190],
        [0.5986, 0.5771, 0.9777, 0.0208],
        [0.7022, 0.0540, 0.0084, 0.9149]])
Datatype of tensor: torch.float32
 Shape of Tensor: torch.Size([3, 4])
 Device of Tensor: cpu
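As a small sketch of how a datatype mismatch can be inspected and fixed, the snippet below converts a tensor to half precision and then mixes the two dtypes. The variable names are illustrative only and are not used elsewhere in this notebook.

```python
import torch

# float32 is the default dtype for floating-point tensors
float_32_tensor = torch.tensor([3.0, 6.0, 9.0])

# .to() returns a new tensor with the requested dtype
float_16_tensor = float_32_tensor.to(torch.float16)

# Mixing the two dtypes promotes the result to the wider one
product = float_16_tensor * float_32_tensor

print(float_32_tensor.dtype, float_16_tensor.dtype, product.dtype)
```

Checking `.dtype` before an operation is usually the quickest way to track down this kind of error.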


## Manipulating Tensors (tensor operations)

Tensor operations include:

* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [None]:
# Create a tensor
tensor = torch.tensor([1, 2, 3])
# Addition
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiplication
tensor = tensor * 10
tensor

tensor([10, 20, 30])

In [None]:
# Subtraction
tensor - 10


tensor([ 0, 10, 20])

### Matrix multiplication (and how it differs from element-wise multiplication)

There are two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication (the one above)
2. Matrix multiplication (the most common tensor operation you will find inside neural networks). For two 1-D tensors, matrix multiplication reduces to the dot product.

More info on matrix multiplication - https://www.mathsisfun.com/algebra/matrix-multiplying.html

Matrix multiplication must satisfy two main rules:

1. The **inner dimensions** must match:
* `(3,2) @ (3,2)` - won't work
* `(2,3) @ (3,2)` - will work
* `(3,2) @ (2,3)` - will work

2. The resulting matrix has the shape of the **outer dimensions**:
* `(2,3) @ (3,2)` => `(2,2)`
* `(3,2) @ (2,3)` => `(3,3)`
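The two rules can be checked directly with random tensors. This is a minimal sketch; the names `a`, `b` and `shape_error_raised` are made up for the illustration.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match (3 and 3), so the result takes the outer dimensions
print((a @ b).shape)  # (2,3) @ (3,2) => (2,2)
print((b @ a).shape)  # (3,2) @ (2,3) => (3,3)

# Inner dimensions differ (2 and 3), so PyTorch raises a RuntimeError
shape_error_raised = False
try:
    b @ b  # (3,2) @ (3,2)
except RuntimeError as err:
    shape_error_raised = True
    print(f"Shape error: {err}")
```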

In [None]:
# Element wise multiplication
print(tensor, '*', tensor)
print(f"Equals to: {tensor * tensor}")

tensor([10, 20, 30]) * tensor([10, 20, 30])
Equals to: tensor([100, 400, 900])


In [None]:
# Matrix multiplication
# The @ operator also performs matrix multiplication, but torch.matmul is the recommended, more explicit form
torch.matmul(tensor, tensor)

tensor(1400)
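To see where 1400 comes from, the same result can be computed by hand: multiply matching elements and add them up (10*10 + 20*20 + 30*30). The loop below is only a sketch for illustration; `torch.matmul` is the version to use in practice.

```python
import torch

tensor = torch.tensor([10, 20, 30])

# Multiply matching elements and accumulate the sum by hand
manual = 0
for i in range(len(tensor)):
    manual += tensor[i] * tensor[i]

print(manual, torch.matmul(tensor, tensor))
```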

## Shape errors are among the most common errors in deep learning

In [None]:
tensor @ tensor

tensor(1400)

In [None]:
torch.matmul(torch.rand(3,2), torch.rand(2,3))

tensor([[0.6171, 0.1720, 0.4087],
        [0.7427, 0.2379, 0.4358],
        [0.2971, 0.0680, 0.2237]])

In [None]:
#Shapes for matrix multiplication
tensor_A = torch.tensor([[1,2],
                        [3,4],
                        [5,6]])
tensor_B = torch.tensor([[7,10],
                        [8,11],
                        [9,12]])

# torch.mm(tensor_A, tensor_B) # torch.mm is the 2-D-only form of torch.matmul; uncommenting this line raises a shape error


In [None]:
tensor_A.shape, tensor_B.shape # (3,2) @ (3,2) can't be multiplied: the inner dimensions (2 and 3) don't match, which breaks the first rule

(torch.Size([3, 2]), torch.Size([3, 2]))

### To fix our tensor shape issues, we can manipulate the shape of one of our tensors using **transpose**

A **transpose** switches the axes or dimensions of a given tensor

In [None]:
print(f"{tensor_B.T},\n\n {tensor_B.T.shape}")

tensor([[ 7,  8,  9],
        [10, 11, 12]]),

 torch.Size([2, 3])


In [None]:
print(f"{tensor_B},\n\n {tensor_B.shape}")

tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]]),

 torch.Size([3, 2])


In [None]:
print(torch.mm(tensor_B.T, tensor_B))

print(torch.mm(tensor_B.T, tensor_B).shape)

tensor([[194, 266],
        [266, 365]])
torch.Size([2, 2])
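The transpose also fixes the shape mismatch between the two original tensors: `(3,2) @ (2,3)` gives a `(3,3)` result. A short sketch using the same values as above, with `result` as an illustrative name:

```python
import torch

tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])
tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# (3,2) @ (2,3): inner dimensions now match, the result has shape (3,3)
result = torch.mm(tensor_A, tensor_B.T)
print(result)
print(result.shape)
```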


## Finding the min, max, mean, sum, etc. of a tensor (this is called tensor aggregation)
For example, the min of the matrix product above is 194, the smallest value in that tensor.

In [None]:
# Create a tensor
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
# Find the min
x.min(), torch.min(x)

(tensor(0), tensor(0))

In [None]:
# Find the max
x.max(), torch.max(x)

(tensor(90), tensor(90))

In [None]:
# Find the mean
# torch.mean() needs a floating-point (or complex) dtype, so convert x from its default int64 (Long) first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
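The sketch below shows what happens without the conversion, plus two equivalent ways to get the mean of an integer tensor. The names `mean_failed`, `mean_a` and `mean_b` are illustrative.

```python
import torch

x = torch.arange(0, 100, 10)  # dtype is int64 (Long)

# Calling mean directly on an integer tensor raises a RuntimeError
mean_failed = False
try:
    torch.mean(x)
except RuntimeError as err:
    mean_failed = True
    print(f"mean failed on {x.dtype}: {err}")

# Option 1: convert the tensor to float first
mean_a = x.float().mean()

# Option 2: ask torch.mean to cast the input via its dtype argument
mean_b = torch.mean(x, dtype=torch.float32)

print(mean_a, mean_b)
```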

In [None]:
# Find the sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

In [None]:
# Find the index of the min
torch.argmin(x), x.argmin()

(tensor(0), tensor(0))

In [None]:
# Find the index of the max
torch.argmax(x), x.argmax()

(tensor(9), tensor(9))

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape but keeps the same memory as the original tensor
* Stacking - combines multiple tensors along a new dimension (`torch.stack`), on top of each other (`torch.vstack`) or side by side (`torch.hstack`)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - adds a `1` dimension to a target tensor
* Permute - returns a view of the input with dimensions permuted (swapped) in a certain way

In [None]:
# Create a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Add an extra dimension
# NB: the new shape must hold the same number of elements as the original. The 9 elements here fit (1, 9), (9, 1) or (3, 3), but not (1, 8)
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Change the view
z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Changing z changes x (because the view of a tensor shares the same memory as the original input)
z[:, 0] = 5 # Here we changed the first element(element with index of 0) to 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
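If the shared memory is not wanted, one option is to copy the tensor before taking the view. This is a sketch that assumes `clone()` is acceptable for your use case; `z_view` and `z_copy` are illustrative names.

```python
import torch

x = torch.arange(1., 10.)

z_view = x.view(1, 9)          # shares memory with x
z_copy = x.clone().view(1, 9)  # independent copy of the data

z_view[:, 0] = 5    # also changes x[0]
z_copy[:, 1] = 100  # does not touch x

print(x)
```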

In [None]:
# Stack tensors on top of each other
x_stacked = torch.stack([x,x,x,x], dim=0) # dim=0 stacks the copies of x on top of each other as rows
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
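Changing `dim`, or using `torch.vstack` and `torch.hstack`, gives different layouts. The sketch below only compares the resulting shapes; the variable names are illustrative.

```python
import torch

x = torch.arange(1., 10.)  # shape [9]

stacked_rows = torch.stack([x, x, x, x], dim=0)  # each copy becomes a row
stacked_cols = torch.stack([x, x, x, x], dim=1)  # each copy becomes a column
v_stacked = torch.vstack([x, x])                 # on top of each other
h_stacked = torch.hstack([x, x])                 # side by side (1-D tensors are joined end to end)

print(stacked_rows.shape, stacked_cols.shape, v_stacked.shape, h_stacked.shape)
```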

In [None]:
# torch.squeeze - removes all size-1 dimensions from a target tensor
x_reshaped

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [None]:
x_reshaped.shape

torch.Size([1, 9])

In [None]:
x_squeezed = x_reshaped.squeeze() # removes the outer pair of brackets, i.e. the size-1 dimension

In [None]:
x_squeezed.shape # as you can see, it successfully removed the size-1 dimension from the tensor. If the shape were [1,1,9], it would remove both of the size-1 dimensions

torch.Size([9])
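A quick sketch of the `[1,1,9]` case: `squeeze()` with no argument drops every size-1 dimension, while passing `dim` drops only that one. The tensor `y` is made up for the illustration.

```python
import torch

y = torch.zeros(1, 1, 9)

squeezed_all = y.squeeze()       # removes both size-1 dimensions
squeezed_one = y.squeeze(dim=0)  # removes only dimension 0

print(y.shape, squeezed_all.shape, squeezed_one.shape)
```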

In [None]:
# torch.unsqueeze - adds a single dimension to a target tensor at a specific dimension
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
x_unsqueezed
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")


New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


In [None]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size=(224,224,3)) #[height, width, colour channels]

# Permute the original tensor
x_permuted = x_original.permute(2,0,1) # this is rearranged so that colour channels will go 1st, then height, then width

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
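Because `permute` returns a view, the permuted tensor shares memory with the original. The sketch below uses a small illustrative tensor to show that writing to one is visible in the other.

```python
import torch

x_original = torch.rand(size=(4, 4, 3))   # [height, width, colour channels]
x_permuted = x_original.permute(2, 0, 1)  # [colour channels, height, width]

# Writing to the original also changes the permuted view
x_original[0, 0, 0] = 7.0
print(x_original[0, 0, 0], x_permuted[0, 0, 0])
```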


## Indexing (selecting data from tensors)
Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Let's index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# Let's index on the middle bracket (dim=1)
x[0][0] # x[0] selects the first (and only) 3x3 block; the second [0] selects its first row

tensor([1, 2, 3])

In [None]:
# Let's index on the last dimension
x[0][0][0] # this gives us the 0th tensor, the 0th index and the 0th element(inside the 0th index) which is 1

tensor(1)

In [None]:
# You can use : to select all of a target dimension
x[:, 0]

tensor([[1, 2, 3]])

In [None]:
# Get all values of 0th and 1st dimensions but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0 dimension but only the 1 index of 1st and 2nd dimension
x[:, 1, 1]

tensor([5])

In [None]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension
x[0,0,:]

tensor([1, 2, 3])

In [None]:
# Index on x to return 9
x[0,2,2]

tensor(9)

In [None]:
# Index on x to return [3,6,9]
x[:, :, 2]

tensor([[3, 6, 9]])
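Two details from the cells above are worth a small sketch: chained indexing (`x[0][0]`) and comma indexing (`x[0, 0]`) select the same data, and using `:` on a dimension keeps it while using an integer drops it. The names `col_2d` and `col_1d` are illustrative.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# Chained and comma-separated indexing return the same row
print(x[0][0], x[0, 0])

# ":" keeps the 0th dimension, so the result is still 2-D
col_2d = x[:, :, 2]
# An integer index drops the 0th dimension, so the result is 1-D
col_1d = x[0, :, 2]

print(col_2d.shape, col_1d.shape)
```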

## PyTorch tensors and NumPy
Your data might be stored in NumPy, but you want to do some deep learning on it, so you convert it to a PyTorch tensor.
* NumPy array to PyTorch tensor - `torch.from_numpy(ndarray)`
* PyTorch tensor to NumPy array - `torch.Tensor.numpy()`

In [None]:
# Numpy array to tensor
import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # warning: when converting from NumPy to PyTorch, the tensor keeps NumPy's default datatype of float64. To get float32, add .type(torch.float32)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))
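A minimal sketch of the float32 conversion mentioned in the comment above, with `tensor_64` and `tensor_32` as illustrative names:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)  # NumPy defaults to float64

tensor_64 = torch.from_numpy(array)                      # keeps float64
tensor_32 = torch.from_numpy(array).type(torch.float32)  # converted to PyTorch's default float32

print(tensor_64.dtype, tensor_32.dtype)
```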

In [None]:
# Tensor to numpy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
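One thing to keep in mind: for a CPU tensor, `.numpy()` returns an array that shares memory with the tensor. The sketch below shows that an in-place change is visible in both, while an operation that creates a new tensor is not. The name `new_tensor` is illustrative.

```python
import torch

tensor = torch.ones(7)
numpy_tensor = tensor.numpy()

# In-place update: the NumPy array sees the change too
tensor.add_(1)
print(tensor, numpy_tensor)

# Out-of-place update: creates a new tensor, numpy_tensor is unchanged
new_tensor = tensor + 1
print(new_tensor, numpy_tensor)
```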

## Reproducibility (trying to take the random out of random)

In short, how a neural network learns:

`start with random numbers -> perform tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> again...`

To make this randomness reproducible, PyTorch offers the concept of a **random seed**.

Essentially, what the random seed does is "flavour" the randomness: the same seed always produces the same sequence of "random" numbers.

In [None]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.2530, 0.8715, 0.9805, 0.5107],
        [0.4087, 0.1724, 0.4379, 0.7781],
        [0.4013, 0.6967, 0.2959, 0.5770]])
tensor([[0.7601, 0.3599, 0.5362, 0.4459],
        [0.9655, 0.4457, 0.8149, 0.3714],
        [0.1994, 0.5305, 0.4881, 0.1234]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# As you can see above, two random tensors are very unlikely to match each other
# Here we make the random tensors reproducible, so that they match, by setting a random seed

import torch

# Set the random seed
RANDOM_SEED = 42   # Any number can be used; 42 is just a popular choice

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)


tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
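Note that the seed is set again before each `torch.rand` call above. As a sketch of why: setting it only once makes the whole sequence reproducible, but consecutive calls within that sequence still differ. The names `first`, `second` and `third` are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)  # continues the same sequence, so it differs from first

torch.manual_seed(42)      # restart the sequence
third = torch.rand(3, 4)   # matches first again

print(torch.equal(first, second), torch.equal(first, third))
```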


## Running tensors and PyTorch objects on GPUs for faster computations

GPUs give faster computation on numbers, thanks to CUDA and NVIDIA hardware, with PyTorch working behind the scenes to make everything work together.

This involves a few steps:

## 1. Getting a GPU

1. Easiest - use Google Colab for a free GPU (what we're using now), with options to upgrade as well
2. Use your own GPU - takes a little bit of setup and requires the investment of purchasing a GPU; there are lots of options
3. Use cloud computing - GCP, AWS, Azure

In [None]:
!nvidia-smi

Tue Sep 10 10:43:36 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## 2. Check for GPU access with PyTorch

In [None]:
# Check for GPU access
import torch
torch.cuda.is_available()

True

In [None]:
# Set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'
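With `device` defined this way, the same code runs whether or not a GPU is present. As a minimal sketch, a tensor can be created directly on the chosen device through the `device` argument; `tensor_on_device` and `doubled` are illustrative names.

```python
import torch

# Falls back to the CPU when no CUDA GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the chosen device
tensor_on_device = torch.tensor([1, 2, 3], device=device)

# Results of operations stay on the same device as their inputs
doubled = tensor_on_device * 2
print(doubled, doubled.device)
```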

## 3. Putting tensors and models on the GPU

We put our tensors and models on the GPU because it results in faster computations.

In [None]:
# Create a tensor (default is on the CPU)
tensor = torch.tensor([1,2,3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU(if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

## 4. Moving tensors back to the CPU

In [None]:
# A tensor on the GPU can't be converted to NumPy directly
# To fix this, first copy it back to the CPU with .cpu()

tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])
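The failure itself can be caught and handled. The sketch below assumes nothing about the runtime: on a machine without a GPU the tensor is already on the CPU and the first conversion simply succeeds. The names `t` and `arr` are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([1, 2, 3], device=device)

try:
    # Raises a TypeError if t lives on the GPU
    arr = t.numpy()
except TypeError as err:
    print(f"Direct conversion failed: {err}")
    # Copy to the CPU first, then convert
    arr = t.cpu().numpy()

print(arr)
```

Calling `.cpu()` on a tensor that is already on the CPU is harmless, so `t.cpu().numpy()` also works as a one-liner in device-agnostic code.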

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')