
# 00. PyTorch Fundamentals

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

1.13.0+cu116


In [None]:
!nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.



## Introduction to tensors

### Creating tensors

Tensors are created using `torch.tensor()`. The `torch.Tensor` class is documented at
https://pytorch.org/docs/stable/tensors.html


In [None]:
# scalar tensors
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
# a scalar has no dimensions
scalar.ndim

0

In [None]:
# get the tensor's value as a Python int
scalar.item()

7

In [None]:
#vector
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX
MATRIX = torch.tensor([[7,8],
                       [9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[0]

tensor([7, 8])

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# TENSOR
TENSOR = torch.tensor([[[1,2,3],
                       [4,5,6],
                       [7,8,9]
                       ]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
TENSOR[0]

# a single 3-by-3 matrix

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
TENSOR[0][0]

tensor([1, 2, 3])

In [None]:
TENSOR_2 = torch.tensor([[[1,2,3],
                          [4,5,6],
                          [7,8,9],
                          [0,0,0]],
                         ])

In [None]:
TENSOR_2.ndim

3

In [None]:
TENSOR_2.shape

torch.Size([1, 4, 3])

### Random tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`

Essentially, the network "makes guesses", compares them with the raw data, then improves those guesses.
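The guess-and-update loop described above can be sketched with plain tensor operations. This is a toy illustration, not how PyTorch trains real models: the data, the single `guess` value, and the `learning_rate` are all made up for the example, and the update rule is a hand-written gradient step for a squared error.

```python
import torch

torch.manual_seed(42)

# Hypothetical toy data: the "true" relationship is targets = 2 * inputs.
inputs = torch.tensor([1.0, 2.0, 3.0])
targets = 2.0 * inputs

# Start with a random number as the first guess for the multiplier.
guess = torch.rand(1)
learning_rate = 0.05  # assumed step size for this sketch

for _ in range(50):
    # Look at the data: how far off is the current guess?
    error = guess * inputs - targets
    # Update the guess in the direction that shrinks the squared error.
    guess = guess - learning_rate * 2 * (error * inputs).mean()

print(guess)  # ends up very close to 2
```

The random starting value does not matter much here; repeated small corrections pull it towards the value that fits the data.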

In [None]:
# Create a random tensor of shape/size (3,4)
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.7319, 0.6070, 0.8908, 0.8887],
        [0.8810, 0.8870, 0.0248, 0.4382],
        [0.4481, 0.9696, 0.2398, 0.3596]])

In [None]:
random_tensor.ndim

2

In [None]:
#create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3)) #represents height, width and color channels(RGB)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Tensors of zeros and ones

In [None]:
# create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
zeros*random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.dtype


torch.float32

### Creating a range of values and tensors like other tensors

In [None]:
# torch.range(0, 10) is deprecated and might be removed; use torch.arange instead
one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [None]:
ten_zeros.dtype

torch.int64

### Tensor datatypes

**Note:** Tensor datatypes are one of the 3 big sources of errors encountered with PyTorch & deep learning:

1.   Tensors are not the right datatype
2.   Tensors are not the right shape
3.   Tensors are not on the right device (GPU, CPU, etc.)





In [None]:
#float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float32, # tensor datatype, e.g. float16/32/64
                               device=None, # which device the tensor is on
                               requires_grad=False # whether to track gradients for operations on this tensor
                              )
float_32_tensor

tensor([3., 6., 9.])

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor.dtype

torch.float16

In [None]:
float_16_tensor * float_32_tensor # different datatypes still work here (the result is promoted to float32), but some operations raise an error on mismatched dtypes

tensor([ 9., 36., 81.])
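To see why the mixed-precision multiplication above works, it can help to check which dtype PyTorch promotes to. The sketch below rebuilds the two tensors under new, illustrative names (`a16`, `b32`) and uses `torch.promote_types` to ask for the promoted dtype before multiplying.

```python
import torch

# Same values stored at two different precisions.
a16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
b32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)

# Ask PyTorch which dtype a mixed operation between these two would use.
promoted = torch.promote_types(a16.dtype, b32.dtype)
product = a16 * b32

print(promoted)       # torch.float32
print(product.dtype)  # torch.float32
```

The lower-precision operand is promoted, so the result is a float32 tensor rather than an error.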

### Getting information from tensors

1.   Tensors not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2.   Tensors not the right shape - to get the shape of a tensor, use `tensor.shape`
3.   Tensors not on the right device (GPU, CPU, etc.) - to get the device of a tensor, use `tensor.device`

In [None]:
#create a tensor
some_tensor = torch.rand(3, 4)
some_tensor

tensor([[0.9700, 0.0617, 0.5680, 0.2557],
        [0.8249, 0.7002, 0.3419, 0.6917],
        [0.0537, 0.0504, 0.6093, 0.9411]])

In [None]:
# find out the information
print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Device tensor is on: {some_tensor.device}")

tensor([[0.9700, 0.0617, 0.5680, 0.2557],
        [0.8249, 0.7002, 0.3419, 0.6917],
        [0.0537, 0.0504, 0.6093, 0.9411]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor is on: cpu


### Manipulating tensors (tensor operations)

Tensor operations:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [None]:
#create a tensor
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [None]:
tensor * 10

tensor([10, 20, 30])

In [None]:
# the original tensor is unchanged: the operations above return new tensors instead of modifying it in place
tensor 

tensor([1, 2, 3])

In [None]:
tensor - 10

tensor([-9, -8, -7])

In [None]:
# PyTorch built-in functions
torch.mul(tensor,10)

tensor([10, 20, 30])

In [None]:
torch.add(tensor, 10)

tensor([11, 12, 13])

### Matrix multiplication

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication (each element is multiplied by the element at the same position)
2. Matrix multiplication (each output element is the dot product of a row and a column)

Two main rules to follow when doing matrix multiplication, the same as in linear algebra:
1. The **inner dimensions** must match
2. The resulting matrix has the shape of the outer dimensions
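The two shape rules can be checked directly on small random tensors. This is a minimal sketch; the names `a`, `b`, and `mismatch_raised` are illustrative only.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(2, 3)

# Inner dimensions match (2 and 2); the result takes the outer dimensions (3, 3).
print(torch.matmul(a, b).shape)  # torch.Size([3, 3])

# Swapping the operands gives (2, 3) @ (3, 2) -> (2, 2).
print(torch.matmul(b, a).shape)  # torch.Size([2, 2])

# Inner dimensions differ (2 vs 3), so PyTorch raises a RuntimeError.
try:
    torch.matmul(a, a)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print(mismatch_raised)  # True
```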



In [None]:
# element-wise multiplication
print(tensor, "*", tensor)
print(f"Equals: {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [None]:
# matrix multiplication
torch.matmul(tensor, tensor) # for two 1-D tensors this is the dot product: 1*1 + 2*2 + 3*3 = 14 (a 0-dim tensor)

tensor(14)

In [None]:
%%time 
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 2.68 ms, sys: 789 µs, total: 3.47 ms
Wall time: 3.89 ms


In [None]:
%%time 
torch.matmul(tensor, tensor)

CPU times: user 80 µs, sys: 0 ns, total: 80 µs
Wall time: 96.1 µs


tensor(14)

Using the built-in torch functions for matrix multiplication is much faster than a Python loop.

### One of the most common errors is a shape error

In [None]:
#shapes for mat mul
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]
                         ])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]
                         ])

torch.mm(tensor_A, tensor_B)

RuntimeError: ignored

In [None]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix the tensor shape issue, we can transpose one of the tensors.

In [None]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]), torch.Size([2, 3]))

In [None]:
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])
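Which tensor gets transposed changes the shape of the result. As a small sketch using the same values as `tensor_A` and `tensor_B` (redefined here as `A` and `B` so the block runs on its own), transposing the first operand instead gives a 2x2 result:

```python
import torch

A = torch.tensor([[1, 2], [3, 4], [5, 6]])      # shape (3, 2)
B = torch.tensor([[7, 10], [8, 11], [9, 12]])   # shape (3, 2)

# (3, 2) @ (2, 3) -> (3, 3)
print(torch.mm(A, B.T).shape)  # torch.Size([3, 3])

# (2, 3) @ (3, 2) -> (2, 2)
result = torch.mm(A.T, B)
print(result)
# tensor([[ 76, 103],
#         [100, 136]])
```

Both products are valid, so the choice depends on which output shape the rest of the computation expects.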

## Tensor aggregations: finding the min, max, mean, sum, etc.

In [None]:
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
torch.max(x), x.max()

(tensor(90), tensor(90))

In [None]:
torch.mean(x) # fails: mean requires a floating-point (or complex) dtype, and x is int64 (long)

RuntimeError: ignored

In [None]:
#change the datatype
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
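The dtype conversion can also be written with the `.float()` shortcut or with `.to()`. The sketch below rebuilds `x` so it runs on its own; `mean_a` and `mean_b` are illustrative names.

```python
import torch

x = torch.arange(0, 100, 10)
print(x.dtype)  # torch.int64

# .float() is shorthand for converting to torch.float32.
mean_a = x.float().mean()

# .to() takes the target dtype explicitly.
mean_b = x.to(torch.float32).mean()

print(mean_a, mean_b)  # tensor(45.) tensor(45.)
```

Neither call changes `x` itself; each returns a new float32 tensor that the mean is computed on.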

In [None]:
#sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

Finding the positional min and max (the index of the smallest and largest value)

In [None]:
x.argmin()

tensor(0)

In [None]:
x.argmax()

tensor(9)

In [None]:
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

## Reshaping, stacking, squeezing and unsqueezing

* Reshape - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor in a certain shape, sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way
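The stacking functions differ in whether they create a new dimension. A short sketch on a 1-D tensor (the names `t`, `stacked_rows`, `stacked_cols`, `v`, and `h` are illustrative) shows the resulting shapes:

```python
import torch

t = torch.tensor([1, 2, 3])

# torch.stack always inserts a new dimension at position `dim`.
stacked_rows = torch.stack([t, t], dim=0)  # shape (2, 3)
stacked_cols = torch.stack([t, t], dim=1)  # shape (3, 2)

# vstack treats each 1-D tensor as a row and piles the rows up.
v = torch.vstack([t, t])                   # shape (2, 3)

# hstack joins 1-D tensors end to end, so the result stays 1-D.
h = torch.hstack([t, t])                   # shape (6,)

print(stacked_rows.shape, stacked_cols.shape, v.shape, h.shape)
```

For 1-D inputs, `vstack` matches `torch.stack(..., dim=0)`, while `hstack` behaves like concatenation rather than like `torch.stack(..., dim=1)`.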

In [None]:
# create a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# reshape to add an extra dimension; the new shape must be compatible with the original
x_reshaped = x.reshape(1, 9) # compatible means the total number of elements stays the same
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# view is similar to reshape, but z is guaranteed to share the same memory as x
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# a view shares the same memory as the original tensor,
# so mutating z changes x as well
z[:, 0] = 5
z, x, x.ndim

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 1)
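One way to confirm memory sharing is to compare `data_ptr()` values, which give the address of a tensor's first element. The sketch below uses made-up names (`base`, `flat_view`, `flat_copy`) and also shows a case where `reshape` has to copy while `view` refuses.

```python
import torch

base = torch.arange(6).reshape(2, 3)

# A view of a contiguous tensor points at the same memory.
flat_view = base.view(6)
print(flat_view.data_ptr() == base.data_ptr())  # True

# Transposing makes the tensor non-contiguous in memory.
transposed = base.t()
print(transposed.is_contiguous())  # False

# reshape falls back to copying when a view is not possible.
flat_copy = transposed.reshape(6)
print(flat_copy.data_ptr() == base.data_ptr())  # False

# view never copies, so here it raises an error instead.
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

print(view_failed)  # True
```

So `reshape` returns a view when it can and a copy when it must, whereas `view` only ever returns a view.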

In [None]:
# stacking tensors along a new dimension
# dim=0 places each tensor in its own row, dim=1 places each tensor in its own column
x_stacked = torch.stack([x, x, x, x], dim=1)
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [None]:
# torch.squeeze() removes every dimension of size 1 from a tensor

x = torch.zeros(2, 1, 2, 1, 2)
print(x.size())
print("\n")
print(x)

torch.Size([2, 1, 2, 1, 2])


tensor([[[[[0., 0.]],

          [[0., 0.]]]],



        [[[[0., 0.]],

          [[0., 0.]]]]])


In [None]:
y = x.squeeze()
y
y.shape

torch.Size([2, 2, 2])
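`squeeze` also accepts a `dim` argument to remove only one specific size-1 dimension. A brief sketch on a tensor of the same shape (`only_dim_1` and `unchanged` are illustrative names):

```python
import torch

x = torch.zeros(2, 1, 2, 1, 2)

# Remove only the size-1 dimension at index 1.
only_dim_1 = x.squeeze(dim=1)
print(only_dim_1.shape)  # torch.Size([2, 2, 1, 2])

# Dimension 0 has size 2, so squeezing it leaves the shape unchanged.
unchanged = x.squeeze(dim=0)
print(unchanged.shape)   # torch.Size([2, 1, 2, 1, 2])
```

Passing `dim` is safer when one of the dimensions, such as a batch dimension, might legitimately have size 1 and should be kept.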

In [None]:
x_squeezed = x_reshaped.squeeze()


In [None]:
# torch.unsqueeze() - adds a dimension of size 1 to a target tensor at a specific position
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

#adding extra dim with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


In [None]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
# returns a view of the original tensor with its dimensions permuted
x_original = torch.rand(size=(224, 224, 3))

# permute the original tensor to rearrange the axis (or dim) order
# e.g. move the color channel to the first dim
x_permuted = x_original.permute(2, 0, 1) # new order: old axis 2 comes first, then old axes 0 and 1
print(f"Original shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Original shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


In [None]:
x_original[0, 0, 0] = 123

In [None]:
x_permuted[0, 0, 0]

tensor(123.)

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to NumPy

In [None]:
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)  # arange gives 9 elements, so reshape(1, 3, 3) works
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]), torch.Size([1, 3, 3]))

In [None]:
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
x[0, 0, :] #or x[0][0][:]

tensor([1, 2, 3])

In [None]:
x[0][0][:]

tensor([1, 2, 3])

In [None]:
x[0][0][0] #or x[0, 0, 0]

tensor(1)
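The `:` slice selects everything along a dimension, which makes it possible to pull out columns as well as rows. A few more selections on the same tensor, as a small sketch (the variable names are illustrative):

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# All of dim 0 and dim 1, but only index 1 of dim 2 (the middle column).
middle_column = x[:, :, 1]
print(middle_column)  # tensor([[2, 5, 8]])

# A single element: dim 0 index 0, row 1, column 1.
center = x[0, 1, 1]
print(center)         # tensor(5)

# The last column of the only matrix.
last_column = x[0, :, 2]
print(last_column)    # tensor([3, 6, 9])

# The last row of the only matrix.
last_row = x[0, 2, :]
print(last_row)       # tensor([7, 8, 9])
```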

## PyTorch tensors & NumPy

* Data in a NumPy array that you want to use with PyTorch tensors -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy array -> `torch.Tensor.numpy()`

In [None]:
#NumPy array to tensor
import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # NumPy's default float dtype is float64, and from_numpy keeps it (PyTorch's own default is float32)
array, tensor



(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
tensor.dtype

torch.float64

In [None]:
# array + 1 creates a new array and rebinds the name, so the tensor (still tied to the old array) is unchanged
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# tensor to numpy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
numpy_tensor.dtype

dtype('float32')

In [None]:
# tensor + 1 creates a new tensor and rebinds the name, so the NumPy array (still tied to the old tensor) is unchanged
tensor = tensor + 1
tensor, numpy_tensor

(tensor([3., 3., 3., 3., 3., 3., 3.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
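The two results above stay unchanged only because `array = array + 1` and `tensor = tensor + 1` build new objects. `torch.from_numpy` and `Tensor.numpy()` share memory with their source, so an in-place change does show up on both sides. A small sketch, with illustrative variable names:

```python
import numpy as np
import torch

# NumPy -> tensor: an in-place update of the array is visible in the tensor.
array = np.arange(1.0, 4.0)
shared_tensor = torch.from_numpy(array)
array += 1
print(shared_tensor)  # tensor([2., 3., 4.], dtype=torch.float64)

# Tensor -> NumPy: an in-place update of the tensor is visible in the array.
ones = torch.ones(3)
shared_array = ones.numpy()
ones.add_(1)
print(shared_array)   # [2. 2. 2.]

# torch.tensor() copies the data, so this tensor is independent of the array.
independent = torch.tensor(array)
array += 1
print(independent)    # tensor([2., 3., 4.], dtype=torch.float64)
```

Use `torch.tensor(array)` when a separate copy is wanted, and `torch.from_numpy(array)` when sharing memory is acceptable.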

## Reproducibility (taking the random out of random by setting a seed)

How a neural network learns:

`start with random numbers -> tensor op -> update the random numbers with a better 'guess' -> so on so forth`

To reduce the randomness in NN and PyTorch, we can set a random seed

In [None]:
import torch

# Create random tensors
random_A = torch.rand(3,4)
random_B = torch.rand(3,4)

print(random_A)
print(random_B)
print(random_A == random_B)

tensor([[0.5323, 0.2941, 0.2280, 0.4425],
        [0.8293, 0.9145, 0.7162, 0.7919],
        [0.9657, 0.4241, 0.2572, 0.6761]])
tensor([[0.6395, 0.1302, 0.8512, 0.6001],
        [0.2483, 0.5048, 0.7175, 0.8031],
        [0.1954, 0.6392, 0.4469, 0.5514]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# set a seed
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_C = torch.rand(3, 4)
torch.manual_seed(RANDOM_SEED) # re-seed so the next call restarts the same random sequence
random_D = torch.rand(3, 4)

print(random_C)
print(random_D)
print(random_C == random_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
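`torch.manual_seed` sets the global seed. An alternative, shown here as a sketch, is to give each call its own `torch.Generator` so the seeding stays local; the names `gen_a`, `gen_b`, and the tensors below are illustrative.

```python
import torch

# Two independent generators seeded with the same value.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

first_a = torch.rand(3, 4, generator=gen_a)
first_b = torch.rand(3, 4, generator=gen_b)

# Same seed, same position in the sequence: identical tensors.
print(torch.equal(first_a, first_b))   # True

# Drawing again from gen_a continues its sequence, so the values differ.
second_a = torch.rand(3, 4, generator=gen_a)
print(torch.equal(first_a, second_a))  # False
```

This also shows why the seed has to be reset before each call in the cell above: a generator only repeats itself when it starts again from the same seed.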


## Running tensors and PyTorch Objects on the GPUs 

GPUs give faster numerical computation.

### 1. Getting a GPU

1. Google Colab offers a free GPU
2. Use your own GPU (you need to buy one)
3. Cloud computing - GCP, Azure, etc.

### 2. Check for GPU access with PyTorch

In [None]:
# check for GPU access with PyTorch
import torch
torch.cuda.is_available()

True

Since PyTorch can run computations on either the GPU or the CPU, it is best practice to set up device-agnostic code:
https://pytorch.org/docs/stable/notes/cuda.html


In [None]:
# setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device


'cuda'

In [None]:
# count number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU

A GPU gives much faster compute, which is why we might want to run on one.

In [None]:
tensor = torch.tensor([1,2,3], device="cpu")

print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# move the tensor to the GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
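A tensor can also be created directly on the target device instead of being moved afterwards. The sketch below is device-agnostic, so it falls back to the CPU when no GPU is present; `on_device`, `moved`, and `result` are illustrative names.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create a tensor directly on the chosen device.
on_device = torch.ones(3, device=device)

# Or create it on the CPU first and move it with .to().
moved = torch.tensor([1, 2, 3]).to(device)

# Both tensors must be on the same device to be combined.
total = on_device + moved

# Copy back to the CPU before converting to NumPy.
result = total.cpu().numpy()
print(on_device.device.type == device)  # True
print(result)                           # [2. 3. 4.]
```

Creating the tensor on the device avoids one host-to-device copy, which can matter for large tensors.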

### 4. Moving tensors back to the CPU (e.g. NumPy only works with CPU tensors)

In [None]:
tensor_on_gpu.numpy() # fails: NumPy cannot read a tensor that lives on the GPU

TypeError: ignored

In [None]:
# to fix the GPU tensor with NumPy issue, first copy the tensor to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

## Exercises 

https://www.learnpytorch.io/00_pytorch_fundamentals/#exercises