<a href="https://colab.research.google.com/github/yusnivtr/pytorch-deep-learning/blob/main/my_notebook/pytorch_00.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.5.1+cu121


## Introduction to Tensors

### Creating tensors
References: https://pytorch.org/docs/stable/tensors.html

In [None]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# Get tensor back as Python int
scalar.item()

7

In [None]:
# vector
vector = torch.tensor([7,7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# Matrix
MATRIX = torch.tensor([[7,8],
                       [9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[0]

tensor([7, 8])

In [None]:
MATRIX[1]

tensor([ 9, 10])

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# Tensor
TENSOR = torch.tensor([[[1,2,3],
                        [3,6,9],
                        [2,4,5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
TENSOR[0] #matrix

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

In [None]:
TENSOR[0][1]

tensor([3, 6, 9])

In [None]:
TENSOR[0][2]

tensor([2, 4, 5])

In [None]:
TENSOR[0][0][1]

tensor(2)

### Random Tensors
Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those random numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`
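
The loop above can be illustrated without any deep learning library. The following is a minimal sketch in plain Python, assuming a toy dataset generated by `y = 3 * x` and a single hypothetical parameter called `weight`. The learning rate and the number of steps are arbitrary choices for illustration, and this is not meant to show how PyTorch trains a network internally.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Toy data (an assumption for illustration): y = 3 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]

# 1. Start with a random number
weight = random.random()
learning_rate = 0.01  # arbitrary small step size

for step in range(200):
    # 2. Look at the data: gradient of the mean squared error w.r.t. weight
    grad = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # 3. Update the number a little in the direction that reduces the error
    weight -= learning_rate * grad

print(round(weight, 3))
```

After enough updates, `weight` ends up very close to 3, the value that generated the data.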

Torch random tensor: https://pytorch.org/docs/stable/generated/torch.rand.html

In [None]:
# Create a random tensor of size (3,4)
random_tensor = torch.rand(3,4)

In [None]:
random_tensor

tensor([[0.3764, 0.4186, 0.9349, 0.9242],
        [0.6483, 0.7116, 0.0636, 0.2390],
        [0.3426, 0.5584, 0.3413, 0.8900]])

In [None]:
random_tensor.ndim

2

In [None]:
random_tensor.shape

torch.Size([3, 4])

In [None]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(3,224,224)) # color channels (R,G,B), height, width
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([3, 224, 224]), 3)

### Zeros and Ones Tensor

In [None]:
# Create a tensor of all zeros
zero = torch.zeros(size=(3,4))
zero

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
zero * random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create a tensor of all ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

### Creating a range and tensors-like

In [None]:
# Use torch.arange
one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# Create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor data types

**Note**: Tensor datatypes are one of the 3 big sources of errors we'll run into with PyTorch and deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [None]:
# Float tensor (named float_32_tensor, but the dtype below is float16, as the output shows)
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype=torch.float16, # datatype of the tensor (e.g. torch.float32, torch.float16)
                               requires_grad=False) # whether or not to track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_32_tensor.dtype

torch.float16

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.], dtype=torch.float16)

In [None]:
int_32_tensor = torch.tensor([3,6,9], dtype=torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [None]:
float_32_tensor * int_32_tensor

tensor([ 9., 36., 81.], dtype=torch.float16)
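
In the cells above, `float_32_tensor` was created with `dtype=torch.float16`, so every result stays `float16`. As a small sketch of what happens when the dtypes really differ, the example below uses freshly named tensors (`float_32`, `float_16`, and `int_32` are names chosen for this illustration) and prints the dtype PyTorch picks for each mixed operation.

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
float_16 = float_32.type(torch.float16)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Mixed-dtype arithmetic promotes the result to a dtype that can hold both inputs
print((float_16 * float_32).dtype)  # float16 * float32
print((float_32 * int_32).dtype)    # float32 * int32
print((float_16 * int_32).dtype)    # float16 * int32

# torch.result_type reports the promoted dtype without doing the computation
print(torch.result_type(float_16, int_32))
```

Mixed operations often work thanks to this promotion, but some functions are stricter about their input dtype, which is where datatype errors tend to appear.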

### Getting information from tensors

1. Tensors not the right datatype: check with `tensor.dtype`
2. Tensors not the right shape: check with `tensor.shape`
3. Tensors not on the right device: check with `tensor.device`


In [None]:
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.3803, 0.8162, 0.8802, 0.8486],
        [0.6200, 0.3850, 0.0588, 0.4088],
        [0.8107, 0.0763, 0.5495, 0.2405]])

In [None]:
print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Device tensor is on: {some_tensor.device}")

tensor([[0.3803, 0.8162, 0.8802, 0.8486],
        [0.6200, 0.3850, 0.0588, 0.4088],
        [0.8107, 0.0763, 0.5495, 0.2405]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor is on: cpu


### Manipulating Tensors (tensor operations)

Tensor operations include:
- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication

In [None]:
# Create a tensor
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
# Subtract 10
tensor - 10

tensor([-9, -8, -7])

In [None]:
# Try out PyTorch built-in functions
torch.mul(tensor,10)

tensor([10, 20, 30])

In [None]:
torch.add(tensor,10)

tensor([11, 12, 13])

### Matrix multiplication
1. Element-wise multiplication
2. Matrix multiplication (dot product)

There are two main rules that matrix multiplication needs to satisfy:
1. The **inner dimensions** must match.
2. The resulting matrix has the shape of the **outer dimensions**.
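
The two rules can be sketched as a small shape check in plain Python. The helper `matmul_shape` is a hypothetical function written for this illustration (it is not part of PyTorch) and only handles 2-D shapes.

```python
def matmul_shape(shape_a, shape_b):
    """Return the result shape of a 2-D matrix multiplication, or raise if the shapes are incompatible."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Rule 1: the inner dimensions must match
    if cols_a != rows_b:
        raise ValueError(f"inner dimensions must match: {cols_a} != {rows_b}")
    # Rule 2: the result takes the shape of the outer dimensions
    return (rows_a, cols_b)

print(matmul_shape((3, 2), (2, 3)))  # (3, 3)
print(matmul_shape((2, 3), (3, 2)))  # (2, 2)
```

Calling `matmul_shape((3, 2), (3, 2))` raises a `ValueError`, because the inner dimensions 2 and 3 differ.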

In [None]:
print(tensor*tensor)

tensor([1, 4, 9])


In [None]:
# Matrix multiplication
torch.matmul(tensor,tensor)

tensor(14)

In [None]:
%%time
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 3.16 ms, sys: 55 µs, total: 3.22 ms
Wall time: 6.42 ms


In [None]:
%%time
torch.matmul(tensor,tensor)

CPU times: user 92 µs, sys: 1.01 ms, total: 1.1 ms
Wall time: 1.27 ms


tensor(14)

## One of the most common errors in deep learning: shape errors

In [None]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])
tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])
# torch.mm(tensor_A,tensor_B) # uncommenting this raises a shape error: the inner dimensions (2 and 3) don't match

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a transpose.
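
As a hedged sketch of the failure and the fix side by side, the block below recreates the two (3, 2) tensors, catches the `RuntimeError` that `torch.mm` raises for mismatched inner dimensions, and then multiplies by the transpose instead.

```python
import torch

tensor_A = torch.tensor([[1, 2], [3, 4], [5, 6]])     # shape (3, 2)
tensor_B = torch.tensor([[7, 10], [8, 11], [9, 12]])  # shape (3, 2)

try:
    torch.mm(tensor_A, tensor_B)  # inner dimensions are 2 and 3, so this fails
except RuntimeError as error:
    print(f"Shape error: {error}")

# Transposing tensor_B gives shape (2, 3), so the inner dimensions now match
result = torch.mm(tensor_A, tensor_B.T)
print(tensor_A.shape, "@", tensor_B.T.shape, "->", result.shape)
```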

In [None]:
tensor_B.T

tensor([[ 7,  8,  9],
        [10, 11, 12]])

In [None]:
torch.matmul(tensor_A,tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

# Finding the min, max, mean, sum, etc (tensor aggregation)

In [None]:
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
# Find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
# Find the max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [None]:
# Find the mean
torch.mean(x)

RuntimeError: mean(): could not infer output dtype. Input dtype must be either a floating point or complex dtype. Got: Long

In [None]:
x.dtype

torch.int64

In [None]:
# Find the mean - note: torch.mean() requires a floating point (or complex) dtype, e.g. float32
torch.mean(x,dtype=torch.float32), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))

In [None]:
# Find the sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

In [None]:
tensor = torch.rand(3,2,3)
tensor

tensor([[[0.1554, 0.2720, 0.5561],
         [0.4484, 0.3094, 0.3881]],

        [[0.6971, 0.4107, 0.4094],
         [0.0212, 0.3106, 0.4306]],

        [[0.3226, 0.9086, 0.3070],
         [0.0554, 0.8809, 0.6469]]])

In [None]:
# Find the positional min and max (without dim, the index refers to the flattened tensor)
torch.argmin(tensor)

tensor(9)

In [None]:
torch.argmax(tensor)

tensor(13)

In [None]:
torch.argmin(tensor,dim=0)

tensor([[0, 0, 2],
        [1, 0, 0]])

In [None]:
torch.argmin(tensor,dim=1)

tensor([[0, 0, 1],
        [1, 1, 0],
        [1, 1, 0]])

In [None]:
torch.argmin(tensor,dim=2)

tensor([[0, 1],
        [2, 0],
        [2, 0]])
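
Without `dim`, `torch.argmin` and `torch.argmax` return an index into the flattened tensor, which is why the first two results above are single numbers (9 and 13). The sketch below, in plain Python with a hypothetical helper named `unravel`, shows how such a flat index maps back to one index per dimension for a row-major tensor of shape (3, 2, 3).

```python
def unravel(flat_index, shape):
    """Convert a flat (row-major) index into one index per dimension."""
    coords = []
    # Work from the last dimension to the first, peeling off one index at a time
    for size in reversed(shape):
        flat_index, position = divmod(flat_index, size)
        coords.append(position)
    return tuple(reversed(coords))

print(unravel(9, (3, 2, 3)))   # (1, 1, 0)
print(unravel(13, (3, 2, 3)))  # (2, 0, 1)
```

These coordinates match the printed tensor: `tensor[1][1][0]` is 0.0212 (the minimum) and `tensor[2][0][1]` is 0.9086 (the maximum).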

# Reshaping, stacking, squeezing and unsqueezing

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape while sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - adds a `1` dimension to a target tensor
* Permute - returns a view of the input with dimensions permuted (swapped) in a certain way
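
A compact sketch of the operations in this list, applied to one small tensor so the resulting shapes can be compared. The variable names are chosen for this illustration only.

```python
import torch

x = torch.arange(1., 7.)               # shape (6,)

reshaped = x.reshape(2, 3)             # new shape, same 6 elements
viewed = x.view(3, 2)                  # shares memory with x
vstacked = torch.vstack([x, x])        # on top of each other
hstacked = torch.hstack([x, x])        # side by side
stacked = torch.stack([x, x], dim=1)   # joins along a new dimension at position 1
unsqueezed = x.unsqueeze(dim=0)        # adds a dimension of size 1 at the front
squeezed = unsqueezed.squeeze()        # removes the dimensions of size 1 again
permuted = reshaped.permute(1, 0)      # swaps the two dimensions

results = {
    "reshaped": reshaped, "viewed": viewed, "vstacked": vstacked,
    "hstacked": hstacked, "stacked": stacked, "unsqueezed": unsqueezed,
    "squeezed": squeezed, "permuted": permuted,
}
for name, t in results.items():
    print(f"{name}: {tuple(t.shape)}")
```

Note that `torch.stack` always creates a new dimension, while `torch.vstack` and `torch.hstack` join along existing ones (for 1-D inputs, `vstack` first treats each tensor as a row).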

In [None]:
import torch

x = torch.arange(1.,10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
x_reshaped = x.reshape(3,3)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]),
 torch.Size([3, 3]))

In [None]:
x_reshaped = x.reshape(9,1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
# Change the view
z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Changing z changes x (because a view of a tensor shares the same memory as the original input)
z[:,0] = 5
z,x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [None]:
# Stack tensors on top of each other
x_vstacked = torch.stack([x,x,x,x],dim=0)
x_hstacked = torch.stack([x,x,x,x],dim=1)
x_vstacked, x_hstacked

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([[5., 5., 5., 5.],
         [2., 2., 2., 2.],
         [3., 3., 3., 3.],
         [4., 4., 4., 4.],
         [5., 5., 5., 5.],
         [6., 6., 6., 6.],
         [7., 7., 7., 7.],
         [8., 8., 8., 8.],
         [9., 9., 9., 9.]]))

In [None]:
# torch.squeeze() - remove all single dimensions from a target tensor
x_reshaped = x_reshaped.reshape(1,9)
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove extra dimensions from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")

Previous tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
Previous shape: torch.Size([1, 9])

New tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape: torch.Size([9])


In [None]:
# torch.unsqueeze() - adds a single dimension to a target tensor at a specific dim

x_squeezed = x_reshaped.squeeze()
print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=1)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
New shape: torch.Size([9, 1])


In [None]:
# torch.permute - rearranges the dimensions of a target tensor
x_original = torch.rand(size=(224,224,3)) # [height,width,colour_channels]

# Permute the original tensor to rearrange the axis (or dim) order
x_permuted = x_original.permute(2,0,1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}") #[colour_channels, height, width]

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


In [None]:
# torch.permute
# Returns a view of the original tensor input with its dimensions permuted.
# So if we change a value in x_permuted, x_original changes too (they share memory)
x_permuted[0,0,0] = 728218
x_original[0,0,0]


tensor(728218.)

# Indexing (selecting data from tensors)

In [None]:
# Create a tensor
import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Let's index on our tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# Let's index on the middle bracket (dim=1)
x[0,0]

tensor([1, 2, 3])

In [None]:
# Let's index on the innermost bracket (last dim)
x[0,0,0]

tensor(1)

In [None]:
x[0,2,2]

tensor(9)

In [None]:
# You can also use ":" to select "all" of a target dimension
x[:,0]

tensor([[1, 2, 3]])

In [None]:
# Get all values of 0th and 1st dimensions but only index 1 of 2nd dimension
x[:,:,1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0 dimension but only the 1 index value of 1st and 2nd dimension
x[:,1,1]

tensor([5])

In [None]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0,0,:]

tensor([1, 2, 3])

In [None]:
# Index on x to return 9
x[0,2,2]

tensor(9)

In [None]:
# Index on x to return 3,6,9
x[:,:,2]

tensor([[3, 6, 9]])

# PyTorch tensors & NumPy

* Data in NumPy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [None]:
# NumPy array to tensor
import torch
import numpy as np

array = np.arange(1.0,8.0)
tensor = torch.from_numpy(array) # warning: when converting from NumPy -> PyTorch, the tensor keeps NumPy's default datatype of float64 unless specified otherwise
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
tensor_torch = torch.arange(1.0,8.0)
tensor_torch.dtype

torch.float32

In [None]:
# Change the value of array, what will this do to `tensor`?
array = array + 1 # creates a new array, so `tensor` keeps the original values
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
numpy_tensor.dtype

dtype('float32')

In [None]:
# Change the tensor, what happens to numpy_tensor?
tensor = tensor + 1
tensor,numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
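
In the cells above, `array = array + 1` and `tensor = tensor + 1` each create a new object, which is why the other side of the conversion did not change. Both `torch.from_numpy()` and `Tensor.numpy()` share memory with their source (for CPU tensors), so in-place modifications do carry over. A minimal sketch of the difference:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)   # shares memory with array

array += 1                         # in-place: the tensor sees the change
print(tensor)

array = array + 1                  # reassignment: `array` now names a new array
print(tensor)                      # unchanged by the line above

ones = torch.ones(7)
numpy_ones = ones.numpy()          # shares memory with ones
ones.add_(1)                       # in-place add: the NumPy array sees the change
print(numpy_ones)
```

If an independent copy is wanted, one option is to copy explicitly, for example with `torch.tensor(array)` or `tensor.clone()`.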

# Reproducibility (trying to take the random out of random)

How a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them a better representation of the data -> again -> again ...`

To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.


In [None]:
import torch
# Create 2 random tensors
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)

tensor([[0.7437, 0.5164, 0.7211, 0.1971],
        [0.4048, 0.6195, 0.0536, 0.2780],
        [0.1231, 0.8399, 0.1779, 0.8618]])
tensor([[0.1529, 0.3767, 0.0449, 0.4920],
        [0.4484, 0.0933, 0.3481, 0.2726],
        [0.6665, 0.8322, 0.6839, 0.3160]])


In [None]:
# Set the random seed
RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)


tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])


Extra resources:
- https://pytorch.org/docs/stable/notes/randomness.html
- https://en.wikipedia.org/wiki/Random_seed

# Running tensors and Pytorch objects on the GPUs

## 1. Getting a GPU
1. Easiest - Use Google Colab for free GPU
2. Use your own GPU
3. Use cloud computing - GCP, AWS, Azure; these services allow you to rent computers in the cloud and access them.

Options 2 and 3 require some setup; refer to the PyTorch documentation for details.

In [None]:
!nvidia-smi

Wed Dec 18 15:41:52 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## 2. Check for GPU access with PyTorch

In [None]:
import torch
torch.cuda.is_available()

True

Since PyTorch is capable of running compute on the GPU or CPU, it's best practice to set up device-agnostic code: https://pytorch.org/docs/stable/notes/cuda.html#best-practices

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count number of devices
torch.cuda.device_count()

1

## 3. Putting tensors (and models) on the GPU
We want our tensors/models on the GPU because computation there is typically faster.

In [None]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1,2,3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

## 4. Moving tensors back to the CPU

In [None]:
# If a tensor is on the GPU, it can't be converted to NumPy directly (uncommenting the line below raises a TypeError)
# tensor_on_gpu.numpy()

In [None]:
# To fix the GPU tensor with NumPy issue, we can first move it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
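
The move-to-CPU step can be wrapped so the same code works whether or not a GPU is present. The sketch below assumes a hypothetical helper named `to_numpy` (it is not a PyTorch function); it calls `detach()` first so that it also works for tensors that track gradients.

```python
import torch

# Device-agnostic setup: use the GPU when available, otherwise the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

def to_numpy(tensor):
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    return tensor.detach().cpu().numpy()

tensor = torch.tensor([1, 2, 3]).to(device)
array = to_numpy(tensor)
print(tensor.device, array)
```

Calling `.cpu()` on a tensor that is already on the CPU simply returns the tensor itself, so the helper is safe to use in both cases.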