<a href="https://colab.research.google.com/github/fatemehalipour/PyTorch-for-Deep-Learning/blob/main/0_PyTorch_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Machine Learning**
Turning things (data) into numbers and finding patterns in those numbers.

If you can build a simple rule-based system that does not require machine learning, do that.

When to use ML?

*   When the traditional approach fails, ML/deep learning may help.
*   When the environment is continually changing, deep learning can adapt to new scenarios.
*   When you want to discover insights within a large collection of data.

When not to use?

*   When you need explainability.
*   When the traditional approach is a better option.
*   When errors are unacceptable, since the outputs of a deep learning model aren't always predictable.
*   When you don't have much data.

# **Machine Learning vs Deep Learning:**
Which one to use depends on how you represent your problem.

*   Structured data (rows and columns) --> ML, e.g. gradient boosted models such as XGBoost.
*   Unstructured data (not in a standardized structure: text/NLP, images, speech) --> DL.

Neural networks work on tensors, so unstructured data is first turned into tensors (a structured, numerical form).

# **What are neural networks?**
Before data can be used in a neural network, it needs to be turned into numbers (numerical encoding). We then pass it through a neural network (choose an appropriate network for your problem) to learn representations/patterns/features/weights. The network produces an output (a learned representation), which we then convert into human-understandable output.
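As a rough illustration of the encode/decode steps described above, here is a minimal sketch that turns text labels into numbers and back again. The label list and variable names are made up for this example, and one-hot encoding is only one of many possible numerical encodings.

```python
import torch
import torch.nn.functional as F

# Hypothetical raw data: human-readable class labels
labels = ["cat", "dog", "cat", "bird"]

# Numerical encoding: map each label to an integer index, then to a one-hot vector
classes = sorted(set(labels))                       # ['bird', 'cat', 'dog']
indices = torch.tensor([classes.index(label) for label in labels])
one_hot = F.one_hot(indices, num_classes=len(classes))

print(indices)        # tensor([1, 2, 1, 0])
print(one_hot.shape)  # torch.Size([4, 3])

# Decoding: convert numerical outputs back into human-understandable labels
decoded = [classes[i] for i in one_hot.argmax(dim=1).tolist()]
print(decoded)        # ['cat', 'dog', 'cat', 'bird']
```

In a real pipeline the decoding step would be applied to the network's outputs rather than directly to the encoded inputs.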

Each layer is usually a combination of linear (straight line) and/or non-linear functions.
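A minimal sketch of one such layer, assuming a toy input with 4 features and 3 outputs (these sizes are arbitrary and chosen only for illustration). It combines a linear function (`nn.Linear`) with a non-linear one (`nn.ReLU`).

```python
import torch
from torch import nn

torch.manual_seed(0)  # only so the sketch is repeatable

# Hypothetical layer sizes, chosen for illustration only
layer = nn.Sequential(
    nn.Linear(in_features=4, out_features=3),  # linear: y = x @ W.T + b
    nn.ReLU(),                                 # non-linear: max(0, y)
)

batch = torch.rand(2, 4)  # 2 samples, 4 numerical features each
output = layer(batch)

print(output.shape)         # torch.Size([2, 3])
print((output >= 0).all())  # tensor(True), since ReLU never outputs negatives
```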

# **Types of Learning**
*   Supervised Learning
*   Unsupervised and Self-supervised Learning
*   Transfer Learning










# **PyTorch**
Most popular research deep learning framework.

Write fast deep learning code in Python.

Able to access many pre-built deep learning models.

Whole stack: pre-process data, model data, deploy model in your application/cloud.

Originally designed and used in-house by Facebook, now used by Tesla, Microsoft, OpenAI.

paperswithcode.com --> papers in ML

# **GPU/TPU**
GPU (Graphics Processing Unit), originally designed for video games, fast at numerical processing.

PyTorch leverages CUDA to enable you to run your ml code on GPUs.

TPU (Tensor Processing Unit), newer, not very common right now.

# **Tensor**
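A tensor is a multi-dimensional array of numbers; it can represent almost any kind of data. Scalars, vectors, and matrices are all tensors.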







# **Coding!**

In [None]:
# GPU information
!nvidia-smi

Thu Oct 26 20:22:54 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    24W / 300W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.0.1+cu118


## Introduction to Tensor


### Creating tensors

PyTorch tensors are created using `torch.tensor()`.

Naming convention: scalars and vectors --> lower case (`scalar`, `vector`)

matrices and tensors --> upper case (`MATRIX`, `TENSOR`)

In [None]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# Get the tensor's value back as a Python int (only works for single-element tensors)
scalar.item()

7

In [None]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[1]

tensor([ 9, 10])

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# TENSOR: can have any number of dimensions
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

### Random Tensors

They are important because the way many neural networks learn is that they start with tensors full of random numbers and then adjust those random numbers to better represent the data.

`Start with random numbers --> look at data --> Update random numbers --> look at data --> Update`
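The loop above can be sketched on a toy problem. This is an illustrative example only: the data (`y = 2 * x`), the learning rate, and the step count are assumptions made for this sketch, and a real network adjusts many random tensors rather than a single number.

```python
import torch

torch.manual_seed(42)

# Toy data following y = 2 * x (an assumption made for this sketch)
x_data = torch.arange(0.0, 1.0, 0.1)
y_data = 2.0 * x_data

# Start with a random number
weight = torch.rand(1, requires_grad=True)
initial_loss = ((weight * x_data - y_data) ** 2).mean().item()

learning_rate = 0.5
for _ in range(200):
    # Look at the data: how wrong is the current weight?
    loss = ((weight * x_data - y_data) ** 2).mean()
    loss.backward()
    # Update the random number, then reset the gradient for the next step
    with torch.no_grad():
        weight -= learning_rate * weight.grad
    weight.grad.zero_()

final_loss = ((weight * x_data - y_data) ** 2).mean().item()
print(f"weight: {weight.item():.4f}")  # close to 2.0
print(f"loss: {initial_loss:.4f} -> {final_loss:.8f}")
```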

In [None]:
# create a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4) # size=(3, 4)
random_tensor

tensor([[0.6902, 0.7093, 0.5496, 0.5096],
        [0.8055, 0.5613, 0.9002, 0.6357],
        [0.3297, 0.6497, 0.8583, 0.8461]])

In [None]:
# create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(3, 224, 224)) # color channel (red, green, blue), height, width
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([3, 224, 224]), 3)

### Zeros and ones

In [None]:
# create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# zero the values out
zeros * random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# create a tensor of all 1s
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
random_tensor.dtype

torch.float32

### Create a range of tensors and tensors-like

In [None]:
# use torch.arange(); `end` is exclusive, so this yields 0 through 10 (despite the variable name)
one_to_ten = torch.arange(start=0, end=11, step=1)
one_to_ten

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor datatypes

Note: tensor datatypes are one of the 3 big sources of errors you'll run into with PyTorch & deep learning:

1.   Tensors not the right datatype
2.   Tensors not the right shape
3.   Tensors not on the right device


In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # datatype of the tensor; None --> default (float32 for floats). float16 trades precision for speed, float64 is more precise
                               device=None, # "cpu" or "cuda": which device the tensor is on
                               requires_grad=False # whether to track gradients for operations on this tensor
                               )
float_32_tensor

tensor([3., 6., 9.])

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
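The multiplication above works even though the datatypes differ because PyTorch promotes the result to the wider type. A small sketch of that behaviour (variable names are made up for this example):

```python
import torch

a16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
b32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
i64 = torch.tensor([3, 6, 9])  # integer literals default to int64

# Mixed-dtype arithmetic is promoted rather than rejected
print((a16 * b32).dtype)            # torch.float32
print((i64 * b32).dtype)            # torch.float32

# torch.result_type reports the promoted dtype without doing the computation
print(torch.result_type(a16, b32))  # torch.float32
```

Not every operation promotes silently, so checking `dtype` is still a useful first step when debugging.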

### Getting information from tensors (tensor attributes)

1.   Datatype: `tensor.dtype`
2.   Shape: `tensor.shape`
3.   Device: `tensor.device`



In [None]:
some_tensor = torch.rand(3, 4)
some_tensor

tensor([[0.0773, 0.3879, 0.0994, 0.8692],
        [0.8600, 0.0489, 0.5178, 0.9792],
        [0.2679, 0.1536, 0.4665, 0.1352]])

In [None]:
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.size()}") # method
print(f"Shape of tensor: {some_tensor.shape}") # attribute
print(f"Device of tensor: {some_tensor.device}")

Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Shape of tensor: torch.Size([3, 4])
Device of tensor: cpu


### Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication (dot product)

Rules of matrix multiplication:
1. The **inner dimensions** must match:
* (3, 2) @ (3, 2) won't work.
* (3, 2) @ (2, 3) will work.
2. The resulting matrix has the shape of the **outer dimensions**:
* (3, 2) @ (2, 3) -> (3, 3)
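A quick sketch of both rules, using random tensors whose values do not matter here (only the shapes do):

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 3)

# Inner dimensions match (2 and 2), so the result takes the outer dimensions
print((A @ B).shape)  # torch.Size([3, 3])
print((B @ A).shape)  # torch.Size([2, 2])

# Inner dimensions do not match (2 and 3), so PyTorch raises a RuntimeError
try:
    A @ A
except RuntimeError as error:
    print(f"Shape error: {error}")
```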

In [None]:
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [None]:
tensor = tensor * 10

In [None]:
tensor - 10

tensor([ 0, 10, 20])

In [None]:
# try out PyTorch built-in functions
torch.mul(tensor, 10)
# torch.add(tensor, 10)

tensor([100, 200, 300])

In [None]:
tensor

tensor([10, 20, 30])

In [None]:
%%time
# Matrix multiplication
torch.matmul(tensor, tensor) # optimized function

CPU times: user 669 µs, sys: 0 ns, total: 669 µs
Wall time: 456 µs


tensor(1400)

### One of the most common errors in deep learning: shape error


In [None]:
torch.matmul(torch.rand(3, 2), torch.rand(3, 2))

RuntimeError: ignored

In [None]:
torch.matmul(torch.rand(3, 3), torch.rand(3, 2))

tensor([[0.4640, 0.4189],
        [0.7003, 0.8077],
        [0.4768, 0.7657]])

In [None]:
torch.mm(torch.rand(3, 3), torch.rand(3, 2)) # torch.mm: matrix multiplication for 2-D tensors only (no broadcasting), unlike torch.matmul

tensor([[0.8265, 1.3614],
        [0.8453, 1.2429],
        [0.5597, 0.5537]])

In [None]:
tensor_A = torch.rand(3, 2)
tensor_B = torch.rand(3, 2)
torch.matmul(tensor_A, tensor_B)

RuntimeError: ignored

In [None]:
torch.matmul(tensor_A, tensor_B.T) # A transpose switches the axes or dimensions of a given tensor

tensor([[0.7043, 0.3703, 0.9215],
        [0.5891, 0.2513, 0.8369],
        [0.5723, 0.2224, 0.8376]])

### Finding the min, max, mean, sum, etc (tensor aggregation)

`torch.mean()` requires a floating point (or complex) tensor, e.g. float32; it raises an error on integer tensors.


In [None]:
x = torch.arange(0, 100, 10)
x, x.dtype

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]), torch.int64)

In [None]:
torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
torch.max(x), x.max()

(tensor(90), tensor(90))

In [None]:
torch.mean(x)

RuntimeError: ignored

In [None]:
torch.mean(x.type(torch.float32))

tensor(45.)

In [None]:
x.type(torch.float32).mean()

tensor(45.)

In [None]:
x.sum(), torch.sum(x)

(tensor(450), tensor(450))

### Find the position min and max

In [None]:
x.argmin(), x.argmax() # useful for turning softmax outputs into a predicted class index

(tensor(0), tensor(9))
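To show why `argmax()` is handy after a softmax layer, here is a small sketch. The `logits` values are hypothetical raw model outputs invented for this example.

```python
import torch

# Hypothetical raw scores (logits) for 3 classes
logits = torch.tensor([1.0, 3.0, 0.5])

# Softmax turns the scores into probabilities that sum to 1
probabilities = torch.softmax(logits, dim=0)

# argmax gives the position (class index) of the highest probability
predicted_class = probabilities.argmax().item()

print(probabilities.sum())  # approximately 1
print(predicted_class)      # 1
```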

### Reshaping, stacking, squeezing, unsqueezing tensors

Often used to fix shape errors.

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape that shares the same memory as the original tensor
* Stacking - combines multiple tensors along a new dimension (`torch.stack`), on top of each other (`vstack`), or side by side (`hstack`)
* Squeezing - removes all dimensions of size `1` from a tensor
* Unsqueezing - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way; shares the same memory
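The stacking variants in the list are easy to mix up, so here is a short sketch comparing the resulting shapes for two 1-D tensors (the tensors `a` and `b` are made up for this example):

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# torch.stack adds a NEW dimension at position `dim`
print(torch.stack([a, b], dim=0).shape)  # torch.Size([2, 3])
print(torch.stack([a, b], dim=1).shape)  # torch.Size([3, 2])

# vstack places the tensors on top of each other (as rows)
print(torch.vstack([a, b]).shape)        # torch.Size([2, 3])

# hstack places them side by side; for 1-D inputs this is a concatenation
print(torch.hstack([a, b]).shape)        # torch.Size([6])
```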

In [None]:
import torch
x = torch.arange(1., 13.)
x, x.shape, x.dtype

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.]),
 torch.Size([12]),
 torch.float32)

In [None]:
x_reshaped = x.reshape(3, 4) # new shape must hold the same number of elements as the original (3 * 4 = 12)
x_reshaped

tensor([[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.]])

In [None]:
z = x.view(2, 6)
z, z.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.],
         [ 7.,  8.,  9., 10., 11., 12.]]),
 torch.Size([2, 6]))

In [None]:
x

tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.])

In [None]:
# changing z changes x because a view of a tensor shares the same memory as the original tensor
z[:, 0] = 5
z, x

(tensor([[ 5.,  2.,  3.,  4.,  5.,  6.],
         [ 5.,  8.,  9., 10., 11., 12.]]),
 tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  5.,  8.,  9., 10., 11., 12.]))

In [None]:
# stack tensors along a new dimension
x_stacked = torch.stack([x, x, x, x], dim=0) # dim=0: result shape (4, 12); dim=1: result shape (12, 4)
x_stacked

tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  5.,  8.,  9., 10., 11., 12.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  5.,  8.,  9., 10., 11., 12.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  5.,  8.,  9., 10., 11., 12.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  5.,  8.,  9., 10., 11., 12.]])

In [None]:
x = torch.tensor([[5., 2.]])
z = torch.squeeze(x)
# x, z
x.shape, z.shape

(torch.Size([1, 2]), torch.Size([2]))

In [None]:
d = torch.unsqueeze(z, dim=0) # add a single dimension to a target tensor at a specific dim
d.shape

torch.Size([1, 2])

In [None]:
x = torch.randn(224, 224, 3) # [height, width, color channel]
x.shape

torch.Size([224, 224, 3])

In [None]:
x_permuted = x.permute(2, 0, 1) # moves axis 0 -> 1, 1 -> 2, 2 -> 0
x_permuted.shape # [color channel, height, width]

torch.Size([3, 224, 224])
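Because `permute` returns a view, the permuted tensor and the original share memory. The sketch below uses a tiny made-up "image" of zeros so the effect is easy to see.

```python
import torch

image = torch.zeros(2, 2, 3)        # [height, width, color channel]
permuted = image.permute(2, 0, 1)   # [color channel, height, width]

# Changing the original also changes the view (same underlying memory)
image[0, 0, 0] = 7.0
print(permuted[0, 0, 0])                          # tensor(7.)
print(image.data_ptr() == permuted.data_ptr())    # True

# A permuted view is usually not contiguous; .contiguous() makes an independent copy
print(permuted.is_contiguous())                   # False
independent = permuted.contiguous()
image[0, 0, 0] = 9.0
print(independent[0, 0, 0])                       # tensor(7.), unaffected by the later change
```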

### Indexing (selecting data from tensors)

Indexing tensors works much like indexing NumPy arrays.

In [None]:
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
x[0, 0]

tensor([1, 2, 3])

In [None]:
# x[0, 2, 2]
x[:, :, 2]

tensor([[3, 6, 9]])

### PyTorch tensors & NumPy
* Data in NumPy, want it as a PyTorch tensor --> `torch.from_numpy(ndarray)`
* PyTorch tensor --> NumPy --> `torch.Tensor.numpy()`

In [None]:
import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Warning: when converting from NumPy to PyTorch, the tensor keeps NumPy's default datatype (float64) unless you convert it
tensor = torch.from_numpy(array).type(torch.float32)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [None]:
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [None]:
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
tensor = tensor + 1
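tensor, numpy_tensor # `tensor + 1` creates a new tensor, so numpy_tensor is unchanged (an in-place op like tensor.add_(1) would change both, since .numpy() shares memory)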

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
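Whether a tensor and an array share memory depends on how they were created and whether the change is made in place. A small sketch of the difference (variable names are chosen for this example):

```python
import numpy as np
import torch

array = np.ones(3)

# torch.from_numpy shares memory with the array
shared = torch.from_numpy(array)
array += 1                       # in-place change, visible through `shared`
print(shared)                    # tensor([2., 2., 2.], dtype=torch.float64)

# torch.tensor copies the data, so later changes to the array are not visible
copied = torch.tensor(array)
array += 1
print(copied)                    # tensor([2., 2., 2.], dtype=torch.float64)
print(shared)                    # tensor([3., 3., 3.], dtype=torch.float64)

# Tensor.numpy() also shares memory (for CPU tensors)
tensor = torch.ones(3)
numpy_view = tensor.numpy()
tensor.add_(1)                   # in-place op, visible through `numpy_view`
print(numpy_view)                # [2. 2. 2.]
```

Reassignments such as `array = array + 1` or `tensor = tensor + 1` create new objects, which is why the earlier cells show the other side unchanged.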

### Reproducibility (trying to take the random out of random)

To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.

Essentially, the random seed "flavours" the randomness. Call `torch.manual_seed()` EVERY TIME before you call `torch.rand()` if you want to get the same values again.

In [None]:
import torch

random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.9780, 0.0446, 0.5232, 0.9139],
        [0.2903, 0.9426, 0.1820, 0.9411],
        [0.8876, 0.9383, 0.4634, 0.9455]])
tensor([[0.4425, 0.0810, 0.2329, 0.3328],
        [0.1571, 0.6194, 0.0780, 0.3318],
        [0.3960, 0.6473, 0.0965, 0.2394]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])


## Running tensors and PyTorch objects on GPUs (and making faster computations)


### Getting GPU

1.   Google Colab for a free GPU
2.   Use your own GPU
3.   Use cloud computing - GCP, AWS, Azure, these services allow you to rent computers on the cloud and access them



In [None]:
!nvidia-smi

Tue Oct  3 15:27:20 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   32C    P0    23W / 300W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### Check for GPU access with PyTorch

Writing device-agnostic code (code that uses the GPU if one is available and falls back to the CPU otherwise) is good practice.

In [None]:
import torch
torch.cuda.is_available()

True

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# count number of devices
torch.cuda.device_count()

1

### Putting tensors (and models) on GPU

The reason we want to do this is that using a GPU results in faster computations.

In [None]:
# default on CPU
tensor = torch.tensor([1, 2, 3])

print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
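Models are moved with the same `.to(device)` call as tensors. The sketch below uses a tiny `nn.Linear` model as a stand-in (the layer sizes are arbitrary assumptions); it runs on the GPU when one is available and on the CPU otherwise.

```python
import torch
from torch import nn

# Device-agnostic setup
device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical tiny model, used only to illustrate moving a model to a device
model = nn.Linear(in_features=3, out_features=1).to(device)

# Tensors can also be created directly on the target device
inputs = torch.rand(4, 3, device=device)

# Model and inputs must be on the same device for this call to work
outputs = model(inputs)
print(outputs.shape, outputs.device)
```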

### Moving tensors back on CPU

NumPy only works on the CPU.

In [None]:
# If a tensor is on the GPU, it can't be converted to NumPy directly
tensor_on_gpu.numpy()

TypeError: ignored

In [None]:
# To fix the issue, first copy the tensor to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')