<a href="https://colab.research.google.com/github/YuranShi/pytorch-deep-learning-notes/blob/main/Chapter_0_PyTorch_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Class website: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.5.1+cu121


## Introduction to Tensors

#### Creating Tensors
PyTorch tensors are created using `torch.tensor()`. Documentation: https://pytorch.org/docs/stable/tensors.html


In [None]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# vector

vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
print(vector.ndim)
print(vector.shape)

1
torch.Size([2])


In [None]:
# MATRIX

MATRIX = torch.tensor([[7, 8],
                      [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
print(MATRIX.ndim)
print(MATRIX.shape)

2
torch.Size([2, 2])


In [None]:
MATRIX[1]

tensor([ 9, 10])

In [None]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 5, 8],
                        [4, 6, 9]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 5, 8],
         [4, 6, 9]]])

In [None]:
print(TENSOR.ndim)
print(TENSOR.shape)

3
torch.Size([1, 3, 3])


In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 5, 8],
        [4, 6, 9]])

### Random Tensors

Why random tensors?

Random tensors are important because neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`torch.rand` - https://pytorch.org/docs/stable/generated/torch.rand.html

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`

In [None]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.6821, 0.3168, 0.1474, 0.4285],
        [0.7801, 0.7122, 0.9542, 0.1788],
        [0.8073, 0.0574, 0.5179, 0.8664]])

In [None]:
random_tensor.ndim

2

In [None]:
# Create a random tensor with similar shape to an image tensor

random_image_tensor = torch.rand(size=(224, 224, 3)) # height, width, channels
print(random_image_tensor.shape)
print(random_image_tensor.ndim)

torch.Size([224, 224, 3])
3


### Zeros and Ones

In [None]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create a tensor of all ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
# Tensor data type
ones.dtype

torch.float32

### Create a Range of Tensors and Tensors-like

In [None]:
# Use torch.arange()
one_to_ten = torch.arange(start=0, end=1000, step=50)
one_to_ten

tensor([  0,  50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
        700, 750, 800, 850, 900, 950])

In [None]:
# Create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor Datatypes

__Note:__ Tensor dtype is one of the 3 big errors that we'll likely run into with PyTorch & deep learning:

1. Tensors not right dtype
2. Tensors not right shape
3. Tensors not right device

In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # dtype of the tensor (None infers it from the data, float32 here; could also be e.g. torch.float16)
                               device=None, # device the tensor lives on (None uses the default, normally "cpu"; e.g. "cuda" for a GPU)
                               requires_grad=False) # whether to track gradients for this tensor
float_32_tensor

tensor([3., 6., 9.])

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
# Multiply 2 tensors of different dtype
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
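
The multiplication above succeeds because PyTorch promotes mixed dtypes to a common type before computing. The following is a minimal sketch of how to inspect that promotion; the variable names (`a16`, `a32`, `ints`) are illustrative and not part of the class material.

```python
import torch

a16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
a32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)

# float16 * float32 is promoted to the wider type, float32
product = a16 * a32
print(product.dtype)

# torch.promote_types reports the promoted dtype without doing any math
print(torch.promote_types(torch.float16, torch.float32))

# An int64 tensor combined with a float32 tensor also becomes float32
ints = torch.tensor([1, 2, 3])
print((ints * a32).dtype)
```

Promotion does not cover every operation, so checking `tensor.dtype` is still the first step when a dtype error appears.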

#### Tensor Attributes -- Getting information from Tensors

1. Tensors not right dtype - `tensor.dtype` to get the dtype
2. Tensors not right shape - `tensor.shape` to get the shape
3. Tensors not right device - `tensor.device` to get the device

In [None]:
print(float_32_tensor)
print(f"Datatype of tensor: {float_32_tensor.dtype}")
print(f"Shape of tensor: {float_32_tensor.shape}")
print(f"Device of tensor: {float_32_tensor.device}")

tensor([3., 6., 9.])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3])
Device of tensor: cpu


### Manipulating Tensors (Tensor Operations)

Tensor Operations include:
- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix Multiplication


In [None]:
# Addition or use built-in torch.add(tensor, 10)
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiplication or use torch.mul(tensor, 10)
tensor * 10

tensor([10, 20, 30])

In [None]:
# Subtraction
tensor - 10

tensor([-9, -8, -7])

#### Matrix Multiplication

Two main ways of performing multiplication

1. Element-wise multiplication
2. Matrix multiplication (dot product)

- More information: https://www.mathsisfun.com/algebra/matrix-multiplying.html


Matrix multiplication must satisfy 2 rules:
  1. The __inner dimensions__ must match:
   - `(3, 2) @ (3, 2)` won't work
   - `(3, 2) @ (2, 3)` will work
  2. The resulting matrix has the shape of the __outer dimensions__:
   - `(3, 2) @ (2, 3)` -> `(3, 3)`
   - `(4, 1) @ (1, 2)` -> `(4, 2)`
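
The two rules can be sketched as a small shape check. `matmul_output_shape` is a hypothetical helper written for this note (it is not a PyTorch function) and it only handles the 2-D case.

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Return the result shape of a 2-D matmul, or None if the inner dimensions differ."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:      # rule 1: inner dimensions must match
        return None
    return (rows_a, cols_b)   # rule 2: result takes the outer dimensions

print(matmul_output_shape((3, 2), (3, 2)))  # inner dimensions 2 and 3 differ
print(matmul_output_shape((3, 2), (2, 3)))

# Cross-check the prediction against PyTorch
predicted = matmul_output_shape((4, 1), (1, 2))
actual = tuple(torch.matmul(torch.rand(4, 1), torch.rand(1, 2)).shape)
print(predicted == actual)
```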

In [None]:
# Element-wise multiplication
tensor * tensor


tensor([1, 4, 9])

In [None]:
# Matrix multiplication (dot product)
torch.matmul(tensor, tensor) # or directly tensor @ tensor

tensor(14)

In [None]:
# Matrix multiplication dimensions
torch.matmul(torch.rand(3, 2), torch.rand(2, 3)).shape

torch.Size([3, 3])

#### Matrix Multiplication Shape Errors

In [None]:
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 8],
                         [9, 10],
                         [11, 12]])

# torch.mm() multiplies two 2-D matrices (like torch.matmul() for 2-D inputs, but without broadcasting)
torch.mm(tensor_A, tensor_B) # This does not work! Shape does not match

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

#### Transpose

A __transpose__ switches the axes (dimensions) of a given tensor.

In [None]:
tensor_B.T

tensor([[ 7,  9, 11],
        [ 8, 10, 12]])

In [None]:
# If we transpose tensor_B, torch.mm() will work
torch.mm(tensor_A, tensor_B.T)

tensor([[ 23,  29,  35],
        [ 53,  67,  81],
        [ 83, 105, 127]])

In [None]:
torch.mm(tensor_A.T, tensor_B)

tensor([[ 89,  98],
        [116, 128]])

### Tensor Aggregation (Finding min, max, mean, sum, etc)

In [None]:
x = torch.arange(0, 100, 10)
x, x.dtype

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), torch.int64)

In [None]:
# Find the mean
torch.mean(x.type(torch.float32))

tensor(45.)
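
The cast to `float32` above is needed because `torch.mean()` works on floating point (or complex) tensors, while `torch.arange(0, 100, 10)` produces `int64`. A minimal sketch of the failure and two ways around it, assuming a recent PyTorch version where the integer case raises a `RuntimeError`:

```python
import torch

x = torch.arange(0, 100, 10)  # dtype is torch.int64

# torch.mean() on an integer tensor is expected to raise a RuntimeError
try:
    torch.mean(x)
except RuntimeError as err:
    print(f"mean on int64 failed: {err}")

# Two equivalent ways to get a float mean
mean_a = x.float().mean()                    # cast first, then reduce
mean_b = torch.mean(x, dtype=torch.float32)  # let mean() cast the input
print(mean_a, mean_b)
```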

In [None]:
# Find the min
torch.min(x) # or
x.min()

tensor(0)

In [None]:
# Find the max
torch.max(x) # or
x.max()

tensor(90)

In [None]:
# Find the sum
torch.sum(x) # or
x.sum()

tensor(450)

### Find the Positional Min and Max - argmin() and argmax()

In [None]:
# Find the position in the tensor that has the min value
x.argmin()

tensor(0)

In [None]:
# Find the position in the tensor that has the max value
x.argmax()

tensor(9)
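
For tensors with more than one dimension, `argmax()` and `argmin()` also accept a `dim` argument. The sketch below uses a made-up `scores` tensor to show the difference; it is an illustration, not part of the class material.

```python
import torch

scores = torch.tensor([[0.1, 0.7, 0.2],
                       [0.8, 0.1, 0.1]])

# Without dim, argmax returns an index into the flattened tensor
flat_index = scores.argmax()

# With dim=1, argmax returns one index per row
row_indices = scores.argmax(dim=1)

print(flat_index)
print(row_indices)
```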

### Reshaping, Stacking, Squeezing and Unsqueezing Tensors

- __Reshaping:__ reshapes an input tensor to a defined shape
- __View:__ returns a view of an input tensor with a certain shape but keeps the same memory as the original tensor
- __Stacking:__ combines multiple tensors on top of each other (vstack) or side by side (hstack)
- __Squeeze:__ removes all dimensions of size `1` from a target tensor
- __Unsqueeze:__ adds a dimension of size `1` to a target tensor
- __Permute:__ returns a view of the input tensor with its dimensions permuted (swapped) in a certain way
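
The stacking variants in the list above differ mainly in the shape they produce. A short sketch with two illustrative 1-D tensors (`a` and `b` are names chosen for this example):

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

stacked_rows = torch.stack([a, b], dim=0)  # new leading dimension, shape (2, 3)
stacked_cols = torch.stack([a, b], dim=1)  # new trailing dimension, shape (3, 2)
v = torch.vstack([a, b])                   # rows on top of each other, shape (2, 3)
h = torch.hstack([a, b])                   # 1-D inputs joined end to end, shape (6,)

for name, t in [("stack dim=0", stacked_rows), ("stack dim=1", stacked_cols),
                ("vstack", v), ("hstack", h)]:
    print(name, tuple(t.shape))
```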

In [None]:
x = torch.arange(1., 11.)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]), torch.Size([10]))

In [None]:
# Add an extra dimension
x_reshape = x.reshape(1, 10)
x_reshape, x_reshape.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

In [None]:
# Change the view
z = x.view(1, 10)
z, z.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

In [None]:
#  Changing z changes x (because a view of a tensor shares the same memory as the original tensor)
z[:, 0] = 5
z, x

(tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]))
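
If the shared memory is not wanted, one option is to call `clone()` before taking the view. A minimal sketch, with `view_of_x` and `copy_of_x` as illustrative names:

```python
import torch

x = torch.arange(1., 11.)

view_of_x = x.view(1, 10)           # shares memory with x
copy_of_x = x.clone().view(1, 10)   # clone() allocates new memory first

# data_ptr() returns the memory address of the first element
print(view_of_x.data_ptr() == x.data_ptr())
print(copy_of_x.data_ptr() == x.data_ptr())

copy_of_x[:, 0] = 5   # modifies only the copy
print(x[0])           # x is unchanged
```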

In [None]:
# Stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim = 0)
x_stacked

tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]])

In [None]:
# torch.squeeze() -- removes all dimensions of size 1 from a target tensor
print(f"Previous tensor: {x_reshape}")
print(f"Previous shape: {x_reshape.shape}")

# Remove extra dimensions from x_reshape
x_squeezed = x_reshape.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]])
Previous shape: torch.Size([1, 10])

New tensor: tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
New shape: torch.Size([10])


In [None]:
# torch.unsqueeze() -- adds a dimension of size 1 at a specific dim
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension to x_squeezed at dim 0
x_unsqueezed = x_squeezed.unsqueeze(dim = 0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous tensor: tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
Previous shape: torch.Size([10])

New tensor: tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]])
New shape: torch.Size([1, 10])


In [None]:
# torch.permute() -- rearranges the dimensions of a tensor in a specified order
x_original = torch.rand(size = (224, 224, 3)) #[height, width, color_channels]

# Permute the original tensor to rearrange the axis order
x_permute = x_original.permute(2, 0, 1)  # shifts axis 0->1, 1->2, 2->0
print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permute.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
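
Because `permute()` returns a view, writing through the permuted tensor also changes the original. The sketch below uses a small illustrative tensor (`image`, 4 x 5 x 3) instead of the 224 x 224 x 3 one so the effect is easy to check.

```python
import torch

image = torch.zeros(4, 5, 3)              # [height, width, color_channels]
channels_first = image.permute(2, 0, 1)   # [color_channels, height, width]

# Writing through the permuted view also changes the original
channels_first[0, 0, 0] = 1.0
print(image[0, 0, 0])

# The permuted view is no longer contiguous in memory here;
# .contiguous() makes a copy with a standard layout
print(channels_first.is_contiguous())
compact = channels_first.contiguous()
print(compact.is_contiguous(), compact.shape)
```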


### Indexing

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Index on the first bracket (dim=0)
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# Index on the middle bracket (dim=1)
x[0][0]

tensor([1, 2, 3])

In [None]:
# Index on the inner bracket (dim=2)
print(x[0][0][0])
print(x[0][1][2])
print(x[1][0][0]) # dim 0 only has size 1, so index 1 is out of bounds

tensor(1)
tensor(6)


IndexError: index 1 is out of bounds for dimension 0 with size 1

In [None]:
# You can also use ":" to select all of a target dimension
x[:, 0]

tensor([[1, 2, 3]])

In [None]:
# Get all values of 0th and 1st dimensions but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0th dimension but only index 1 of the 1st and 2nd dimensions
x[:, 1, 1]

tensor([5])

In [None]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0, 0, :]

tensor([1, 2, 3])

In [None]:
# Practice -- Get value 9 out of this tensor
print(x[0, 2, 2]) # or x[0][2][2]

# Get value [3, 6, 9] out of this tensor
print(x[: , :, 2])

tensor(9)
tensor([[3, 6, 9]])
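
Note that the second result above keeps the outer bracket. An integer index removes a dimension, while a slice keeps it. A short sketch of the three variants (the variable names are illustrative):

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# An integer index removes that dimension; a slice keeps it
col_with_batch = x[:, :, 2]   # shape (1, 3)
col_flat = x[0, :, 2]         # shape (3,)
col_kept = x[0, :, 2:3]       # shape (3, 1)

print(col_with_batch.shape, col_flat.shape, col_kept.shape)
print(col_flat)
```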


### PyTorch Tensors & Numpy

PyTorch has functionality to interact with NumPy.

* NumPy array -> PyTorch tensor: `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy array: `torch.Tensor.numpy()`

In [None]:
# NumPy array to tensor
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # NumPy's default dtype is float64, and from_numpy() keeps it unless converted
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7)
nparray = tensor.numpy()
tensor, nparray

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
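
One detail worth knowing: `torch.from_numpy()` shares memory with the source array, so an in-place change to the array shows up in the tensor. The sketch below illustrates this; `shared` and `as_float32` are names chosen for this example.

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)          # float64 by default
shared = torch.from_numpy(array)     # shares memory with `array`
as_float32 = torch.from_numpy(array).type(torch.float32)  # dtype conversion makes a copy

array[0] = 100.0   # in-place change to the NumPy array

print(shared)      # reflects the change
print(as_float32)  # unaffected copy
```

Reassigning the array (for example `array = array + 1`) creates a new array and leaves the tensor untouched; only in-place changes are shared.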

### PyTorch Reproducibility

In short, this is how a neural network learns:

`start with random number -> tensor operations -> update random number -> tensor operations -> update -> ... -> ...`

__Random Seed:__ a random seed "flavors" the randomness, so the same seed produces the same sequence of pseudo-random numbers

https://pytorch.org/docs/stable/notes/randomness.html

In [None]:
# Create two random tensors
rand_tensor_A = torch.rand(3, 4)
rand_tensor_B = torch.rand(3, 4)

print(rand_tensor_A)
print(rand_tensor_B)
print(rand_tensor_A == rand_tensor_B)

tensor([[0.3648, 0.8825, 0.4076, 0.0981],
        [0.9180, 0.3257, 0.4372, 0.5776],
        [0.2419, 0.6578, 0.3862, 0.6737]])
tensor([[0.6023, 0.9822, 0.9996, 0.6298],
        [0.0103, 0.8857, 0.7994, 0.1079],
        [0.8637, 0.7173, 0.9532, 0.7379]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Make random but reproducible tensors
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED) # set the seed before creating the random tensor

random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED) # reset the seed before each torch.rand() call to get identical tensors
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
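
An alternative to resetting the global seed is to pass an explicit `torch.Generator` to `torch.rand()`. This is a sketch of that approach, not something the class material uses; `gen_a` and `gen_b` are illustrative names.

```python
import torch

# Two independent generators seeded with the same value
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

t_a = torch.rand(3, 4, generator=gen_a)
t_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(t_a, t_b))   # same seed, same first draw

# Drawing again from the same generator advances its state
t_c = torch.rand(3, 4, generator=gen_a)
print(torch.equal(t_a, t_c))
```

A dedicated generator keeps reproducibility local to one piece of code instead of changing the global random state.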


### Running Tensors and PyTorch Objects on GPUs (Faster Computations)

- __Getting a GPU__

  1. Google Colab
      
      To change the device on Google Colab, go to `Runtime` -> `Change runtime type`

  2. Use your own GPU
  3. Use cloud computing - GCP, AWS, Azure; these services let you rent computers in the cloud and access them remotely
  
  For options 2 and 3, getting PyTorch to work with the GPU takes some extra setup.

In [None]:
!nvidia-smi # Check the device

Tue Dec 17 19:24:37 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
# Check for GPU access with PyTorch
torch.cuda.is_available()

True

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count the number of devices
torch.cuda.device_count()

1

- __Putting Tensors (and Models) on the GPU__

    We want our tensors/models on the GPU because computations on a GPU are generally faster.

In [None]:
# Create a tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
tensor, tensor.device

(tensor([1, 2, 3]), device(type='cpu'))

In [None]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

- __Moving Tensors Back to the CPU__

In [None]:
# A tensor on the GPU cannot be converted to NumPy directly
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [None]:
# To fix this, first copy the tensor back to the CPU, then convert it to NumPy
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])
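
The same pattern can be wrapped in a small function that works on either device. `to_numpy` below is a hypothetical helper written for this note, not a PyTorch API; it also calls `detach()`, which is needed when the tensor tracks gradients.

```python
import torch

# Device-agnostic setup: falls back to the CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

def to_numpy(t):
    """Hypothetical helper: detach from autograd, move to CPU, then convert."""
    return t.detach().cpu().numpy()

# Create the tensor directly on the chosen device
t = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
print(to_numpy(t * 2))
```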

### Exercise and Extra-curriculum

Find the exercises and extra-curriculum on the class website.
https://www.learnpytorch.io/00_pytorch_fundamentals/