##**PyTorch Self Learning Study**

> Reference: PyTorch Documentation, YouTube source
* https://pytorch.org/docs/stable/
* https://www.youtube.com/watch?v=V_xro1bcAuA&t=2598s

##### Start Date: 2024.07.24
##### Notebook Setting: T4 GPU






In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
print(torch.__version__) # torch version: 2.3.1, CUDA version: 12.1

2.3.1+cu121


##1. Introduction to Tensors (Fundamental)


###1.1 Creating tensors and observing their characteristics

In [None]:
#Scalars
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
scalar.item()

7

In [None]:
#vectors
vectors = torch.tensor([1,1])
vectors

tensor([1, 1])

In [None]:
vectors.ndim

1

In [None]:
vectors.shape

torch.Size([2])

In [None]:
#MATRIX
MATRIX = torch.tensor([[1,2],[3,4]])
MATRIX

tensor([[1, 2],
        [3, 4]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
#TENSORS
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

#####Variable names for scalars and vectors are written in lower case
#####Variable names for MATRIX and TENSOR are written in upper case

###1.2 Creating Random Tensors
####Why random tensors?
#####Random tensors matter because neural networks start with tensors full of random numbers and then adjust those numbers to better represent the data.
`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`

In [None]:
#Create random tensors
random_tensor = torch.rand(3,4) # == torch.rand(size=(3,4))
random_tensor

tensor([[0.1682, 0.3457, 0.6053, 0.1593],
        [0.3259, 0.6255, 0.3389, 0.4211],
        [0.9786, 0.5279, 0.8660, 0.3062]])

In [None]:
#Create a random tensor with similar shape to an image tensor
random_image_tensor = torch.rand(size=(224,224,3)) #height, width, colour channel (R,G,B)
# can be torch.rand(size=(3, 224, 224)) => parameters in colour channel, height, width order
random_image_tensor.shape, random_image_tensor.ndim

(torch.Size([224, 224, 3]), 3)

###1.3 Creating Tensors with all zeros and ones

In [None]:
##Tensors with zeros and ones
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
ones = torch.ones(3,4)
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
##Why the tensors above contain floating-point values
#The default dtype of torch.zeros and torch.ones is float32
ones.dtype

torch.float32

####Usage of zero tensors
######Multiplying element-wise by a zero tensor of matching shape sets the whole target tensor to zero; a mask that is zero only in a specific row or column zeroes just that part

```
zeros * random_tensor
```
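
A minimal sketch of the idea above. The `mask` tensor is a hypothetical helper (not part of the original notes) that is zero only in the column we want to clear:

```python
import torch

torch.manual_seed(0)
random_tensor = torch.rand(3, 4)

# Multiplying by an all-zero tensor of the same shape zeroes every element.
all_zero = torch.zeros(3, 4) * random_tensor

# Hypothetical mask: ones everywhere except column 1, which is zero.
mask = torch.ones(3, 4)
mask[:, 1] = 0

# Element-wise multiplication keeps the other columns and clears column 1.
masked = random_tensor * mask
print(all_zero)
print(masked)
```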



###1.4 Creating a tensor from a range and tensors-like


In [None]:
#Use torch.arange(start, end, step)
one_to_ten = torch.arange(1,11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
#Creating tensors-like: zeros_like creates a tensor of zeros with the same shape as the input tensor
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

###1.5 Tensor Datatype
####**Note:** Tensor datatype is one of the three big errors you will run into with PyTorch and Deep Learning:

1. Tensors not right datatype
2. Tensors not right shape
3. Tensors not on right device

In [None]:
#Float32 Tensor
Float_32_tensor = torch.tensor([3., 6., 9.], dtype=None) # dtype=None lets PyTorch infer the dtype; float values default to float32
Float_32_tensor.dtype

torch.float32

In [None]:
Test_tensor = torch.tensor([3., 6., 9.], dtype=torch.float16)
Test_tensor.dtype


torch.float16

In [None]:
Test_tensor = torch.tensor([3.,6.,9.],
                           dtype=torch.float32,#what datatype tensor is
                           device=None,  #what device your tensor is on
                           requires_grad=False) #whether to track gradients with tensors operation

In [None]:
float_16_tensor = Test_tensor.type(torch.float16) # == Test_tensor.type(torch.half)
float_16_tensor


tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor * Test_tensor

tensor([ 9., 36., 81.])
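
The float16 * float32 product above comes back as float32 because PyTorch promotes mixed dtypes to the wider one. A small sketch to check this, with variable names chosen only for illustration:

```python
import torch

half = torch.tensor([3., 6., 9.], dtype=torch.float16)
single = torch.tensor([3., 6., 9.], dtype=torch.float32)

# The dtype PyTorch will pick for an operation on these two tensors.
print(torch.result_type(half, single))  # torch.float32

product = half * single
print(product.dtype)  # torch.float32

# Integer and float tensors also promote to the float dtype.
ints = torch.tensor([1, 2, 3])
mixed = ints * single
print(mixed.dtype)  # torch.float32
```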

In [None]:
#How to check tensor information #Tensor attribute
#datatype = tensor.dtype
#shape = tensor.shape, tensor.size() --> size() is a function
#device = tensor.device
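
The three attributes listed above can be checked on any tensor. A short sketch, where `some_tensor` is just an illustrative name:

```python
import torch

some_tensor = torch.rand(3, 4)

print(f"dtype:  {some_tensor.dtype}")   # attribute
print(f"shape:  {some_tensor.shape}")   # attribute
print(f"size(): {some_tensor.size()}")  # method, returns the same torch.Size
print(f"device: {some_tensor.device}")  # where the tensor lives
```

Printing these three values is usually the first step when debugging a dtype, shape, or device error.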

### 1.6 Manipulating tensor (Tensor Operations)

In [None]:
#Tensor Manipulation (Tensor Operations)
#Addition
#Subtraction
#Multiplication (element-wise)
#Division
#Matrix multiplication (dot product)

In [None]:
#Addition
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [None]:
#Multiplication
tensor * 10

tensor([10, 20, 30])

In [None]:
#Subtraction
tensor - 10

tensor([-9, -8, -7])

In [None]:
#Tensor built-in function
torch.mul(tensor, 10), torch.add(tensor, 10), torch.sub(tensor, 10)

(tensor([10, 20, 30]), tensor([11, 12, 13]), tensor([-9, -8, -7]))

In [None]:
#Tensor Matrix multiplication
torch.matmul(tensor, tensor)
#Or
#tensor @ tensor

tensor(14)
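
To see where the 14 comes from, it helps to compare element-wise multiplication with matrix multiplication on the same 1-D tensor. A minimal sketch:

```python
import torch

tensor = torch.tensor([1, 2, 3])

elementwise = tensor * tensor       # multiplies matching positions: [1, 4, 9]
dot = torch.matmul(tensor, tensor)  # 1*1 + 2*2 + 3*3 = 14

# For 1-D tensors, matmul equals the sum of the element-wise products.
print(elementwise, elementwise.sum(), dot)
```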

#####Two main rules must be satisfied for matrix multiplication:
1. The **inner dimensions** must match
2. The resulting matrix has the shape of the **outer dimensions**

####One of the most common errors in deep learning: **Shape Error**
#####To deal with a shape error, we can use the **transpose**
`Transpose = Tensor.T`
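
A small sketch of both rules and of the transpose fix, using two illustrative random tensors `A` and `B`:

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(3, 2)

# (3, 2) @ (3, 2) fails: the inner dimensions are 2 and 3.
try:
    torch.matmul(A, B)
except RuntimeError as err:
    print("Shape error:", err)

# Transposing B gives (2, 3), so the inner dimensions match: (3, 2) @ (2, 3).
result = torch.matmul(A, B.T)
print(result.shape)  # outer dimensions: torch.Size([3, 3])
```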


In [None]:
rand_tens = torch.rand(3,2)

rand_tens, rand_tens.T


(tensor([[0.5256, 0.0073],
         [0.5425, 0.6632],
         [0.9508, 0.0412]]),
 tensor([[0.5256, 0.5425, 0.9508],
         [0.0073, 0.6632, 0.0412]]))

### 1.7 Tensor Aggregation
####Finding min, max, mean, sum, etc

In [None]:
rand_tensor = torch.arange(0,100,10)
rand_tensor, rand_tensor.min(), rand_tensor.max(), rand_tensor.type(torch.float32).mean()
#Could be torch.min(rand_tensor), torch.max(rand_tensor), torch.mean(rand_tensor.type(torch.float32))

(tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]),
 tensor(0),
 tensor(90),
 tensor(45.))

In [None]:
#Why change the dtype before calling mean()?
rand_tensor.dtype # torch.int64 == Long datatype
# **NOTE**
#mean() requires a floating-point or complex input dtype
#so we convert the tensor to float32 before calling mean()
#The result below also shows that arange() with integer arguments creates an int64 tensor

torch.int64

In [None]:
#Finding the index of the min and max values in the tensor
rand_tensor.argmin(), rand_tensor.argmax() # == torch.argmin(rand_tensor), torch.argmax(rand_tensor)


(tensor(0), tensor(9))

In [None]:
tensor = torch.rand(10,10)
tensor.argmax(), tensor.argmin()

(tensor(1), tensor(81))
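
On a 2-D tensor, `argmax()` without arguments returns an index into the flattened tensor, which is why values such as 81 can appear for a 10x10 tensor. A sketch of turning that flat index back into a row and column, with `grid` as an illustrative name:

```python
import torch

torch.manual_seed(0)
grid = torch.rand(10, 10)

flat_index = grid.argmax().item()              # index into the flattened tensor
row, col = divmod(flat_index, grid.shape[1])   # convert to (row, column)
print(flat_index, row, col)

# argmax along a dimension returns one index per row instead.
per_row = grid.argmax(dim=1)
print(per_row.shape)  # torch.Size([10])
```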

###1.8 Reshaping, stacking, squeezing and unsqueezing tensors
#####These are the tools for dealing with shape or dimension errors
* Reshaping: reshapes a tensor to a defined shape
* View: return a view of an input tensor of certain shape but keep the same memory as the original tensor
* Stacking: combine multiple tensors on top of each other (vstack-vertical stack, hstack-horizontal stack)
* Squeeze: removes all `1` dimensions from a tensor
* Unsqueeze: add a `1` dimension to a target tensor
* Permute: Return a view of the input with dimensions permuted (swapped) in a certain way


In [None]:
# example tensor
x = torch.arange(1., 11.)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]), torch.Size([10]))

In [None]:
#Reshape the tensor
#Add an extra dimension
x.reshape(2,5), x.reshape(5,2)


(tensor([[ 1.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]),
 tensor([[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.],
         [ 7.,  8.],
         [ 9., 10.]]))

In [None]:
# Note that the new shape must hold exactly as many elements as the original tensor
# The following example is invalid because 2*4 = 8 slots cannot hold the 10 elements of x
x.reshape(2,4)

RuntimeError: shape '[2, 4]' is invalid for input of size 10

In [None]:
x.reshape(2,5).shape, x.reshape(5,2).shape

(torch.Size([2, 5]), torch.Size([5, 2]))

In [None]:
# Change the view. Note that a view of a tensor shares the same memory as the original tensor
z = x.view(2,5)
z[:,0] = 5 # Here we modify only z
z, x # Both z and x show the change, because z is a view of x

(tensor([[ 5.,  2.,  3.,  4.,  5.],
         [ 5.,  7.,  8.,  9., 10.]]),
 tensor([ 5.,  2.,  3.,  4.,  5.,  5.,  7.,  8.,  9., 10.]))
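
One practical difference between `view()` and `reshape()`, sketched below: `view()` only works when the memory layout allows it, while `reshape()` falls back to making a copy. The transposed tensor here is just a convenient way to get a non-contiguous layout.

```python
import torch

x = torch.arange(1., 11.)
transposed = x.view(2, 5).T        # shape (5, 2), no longer contiguous in memory
print(transposed.is_contiguous())  # False

# view() needs a compatible memory layout, so this raises a RuntimeError.
try:
    transposed.view(10)
except RuntimeError as err:
    print("view failed:", err)

# reshape() copies the data when a view is not possible.
flat = transposed.reshape(10)
flat[0] = 100.                     # modifies the copy only
print(x[0])                        # x is unchanged: tensor(1.)
```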

In [None]:
# Stack tensors on top of each other

stacked = torch.stack([x,x,x,x], dim=0)
stacked


tensor([[ 5.,  2.,  3.,  4.,  5.,  5.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  5.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  5.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  5.,  7.,  8.,  9., 10.]])

In [None]:
stacked2 = torch.stack([x,x,x,x], dim=1)
stacked2

tensor([[ 5.,  5.,  5.,  5.],
        [ 2.,  2.,  2.,  2.],
        [ 3.,  3.,  3.,  3.],
        [ 4.,  4.,  4.,  4.],
        [ 5.,  5.,  5.,  5.],
        [ 5.,  5.,  5.,  5.],
        [ 7.,  7.,  7.,  7.],
        [ 8.,  8.,  8.,  8.],
        [ 9.,  9.,  9.,  9.],
        [10., 10., 10., 10.]])

In [None]:
#vstack vs hstack
#vstack =  stack tensors in sequence vertically (row wise)
#hstack =  stack tensors in sequence horizontally (column wise)
a = torch.rand(2,2)
b = torch.rand(2,2)
a,b

(tensor([[0.0862, 0.7321],
         [0.7634, 0.0681]]),
 tensor([[0.1537, 0.3410],
         [0.5295, 0.9480]]))

In [None]:
torch.vstack((a,b)), torch.hstack((a,b))

(tensor([[0.0862, 0.7321],
         [0.7634, 0.0681],
         [0.1537, 0.3410],
         [0.5295, 0.9480]]),
 tensor([[0.0862, 0.7321, 0.1537, 0.3410],
         [0.7634, 0.0681, 0.5295, 0.9480]]))
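
For 2-D tensors like the ones above, `vstack` and `hstack` give the same result as `torch.cat` along dim 0 and dim 1. A short sketch that also contrasts `cat` with `stack`:

```python
import torch

a = torch.rand(2, 2)
b = torch.rand(2, 2)

# For 2-D tensors, vstack/hstack match cat along dim 0 / dim 1.
print(torch.equal(torch.vstack((a, b)), torch.cat((a, b), dim=0)))  # True
print(torch.equal(torch.hstack((a, b)), torch.cat((a, b), dim=1)))  # True

# stack adds a new dimension, cat extends an existing one.
stacked = torch.stack((a, b), dim=0)
concatenated = torch.cat((a, b), dim=0)
print(stacked.shape)       # torch.Size([2, 2, 2])
print(concatenated.shape)  # torch.Size([4, 2])
```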

In [None]:
#Squeeze and Unsqueeze
#torch.squeeze = remove all dimensions of size 1 from the target tensor
#torch.unsqueeze = add a dimension of size 1 to a target tensor at a specific dim

x = torch.zeros(1,2)
x, x.size()

(tensor([[0., 0.]]), torch.Size([1, 2]))

In [None]:
x.squeeze(), x.squeeze().shape

(tensor([0., 0.]), torch.Size([2]))

In [None]:
y = torch.zeros(2,1,2,1,2)
y, y.size()

(tensor([[[[[0., 0.]],
 
           [[0., 0.]]]],
 
 
 
         [[[[0., 0.]],
 
           [[0., 0.]]]]]),
 torch.Size([2, 1, 2, 1, 2]))

In [None]:
y.squeeze(), y.squeeze().shape

(tensor([[[0., 0.],
          [0., 0.]],
 
         [[0., 0.],
          [0., 0.]]]),
 torch.Size([2, 2, 2]))

In [None]:
#Unsqueeze with dim = 0
x_squeezed = x.squeeze()
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
x_squeezed, x_unsqueezed

(tensor([0., 0.]), tensor([[0., 0.]]))

In [None]:
#Unsqueeze with dim = 1
x_squeezed = x.squeeze()
x_unsqueezed = x_squeezed.unsqueeze(dim=1)
x_squeezed, x_unsqueezed


(tensor([0., 0.]),
 tensor([[0.],
         [0.]]))

In [None]:
#torch.permute = rearrange the dimension of a target tensor in a specified order
photo = torch.rand(224,224,3)
photo_permuted = photo.permute(2,0,1)
photo.shape, photo_permuted.shape
#The numbers passed to permute() are the indices of the original dimensions
#Note that permute also returns a view that shares the same memory as the original tensor

(torch.Size([224, 224, 3]), torch.Size([3, 224, 224]))
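
A quick sketch to confirm that the permuted tensor is a view: a value written into the original shows up in the permuted tensor.

```python
import torch

photo = torch.rand(224, 224, 3)          # height, width, colour channels
photo_permuted = photo.permute(2, 0, 1)  # colour channels, height, width

# Writing through the original is visible in the permuted view.
photo[0, 0, 0] = 123.
print(photo_permuted[0, 0, 0])                        # tensor(123.)
print(photo.data_ptr() == photo_permuted.data_ptr())  # True: same storage
```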

### 1.9 Tensor Indexing
####Indexing in PyTorch is similar to indexing with Numpy

In [None]:
a = torch.arange(1,10).reshape(1,3,3)
a

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
a[0], a[0,0] # == a[0][0]

(tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]),
 tensor([1, 2, 3]))

In [None]:
# You can use : to select "all" of the target dimension
a[:,:,2] # == indexing all element in 0th and 1st dimensions, but only index 2 of 2nd dimension

tensor([[3, 6, 9]])

In [None]:
a[:,1,1] # same element as a[0][1][1], but ":" keeps dimension 0, so the result is still wrapped in []

tensor([5])
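
The difference between slicing with `:` and indexing with an integer is easiest to see in the resulting shapes. A minimal sketch:

```python
import torch

a = torch.arange(1, 10).reshape(1, 3, 3)

sliced = a[:, 1, 1]   # ":" keeps dimension 0, so the result is 1-D
indexed = a[0, 1, 1]  # an integer index drops dimension 0, so the result is 0-D

print(sliced, sliced.shape)    # tensor([5]) torch.Size([1])
print(indexed, indexed.shape)  # tensor(5) torch.Size([])
```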

####PyTorch tensor and NumPy
#####NumPy is a popular Python library for scientific and numerical computing
#####Because of this, PyTorch has functionality to interact with it
* Data in NumPy -> PyTorch tensor -> `torch.from_numpy(ndarray)`  
* PyTorch tensor -> Data in NumPy -> `torch.Tensor.numpy()`

In [None]:
#NumPy array to Tensor
import torch
import numpy as np

array = np.arange(1.,8.)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
array.dtype #the default data type is float64

dtype('float64')

In [None]:
 tensor.dtype
 #warning: when converting NumPy -> PyTorch, the tensor keeps NumPy's dtype (float64 by default) unless another dtype is specified

torch.float64

In [None]:
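#If we reassign the array, what happens to the tensor?
array = array + 1 # array + 1 builds a new array and rebinds the name
array, tensor
#The tensor is unchanged because it still points at the original array's memory
#An in-place update (array += 1) would change the tensor too, since from_numpy shares memory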

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor
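# On CPU, tensor.numpy() shares memory with the tensor, so in-place changes to the tensor show up in the array
# Reassigning the name (tensor = tensor + 1) creates a new tensor and leaves the array unchanged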

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
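
Whether a change carries over between NumPy and PyTorch depends on whether it happens in place. The sketch below contrasts in-place updates with reassignment in both directions; the variable names `ones` and `ones_np` are illustrative only.

```python
import numpy as np
import torch

array = np.arange(1., 8.)
tensor = torch.from_numpy(array)

# In-place update: the tensor sees it because both share one buffer.
array += 1
print(tensor[0])            # tensor(2., dtype=torch.float64)

# Reassignment creates a new array; the tensor is left untouched.
array = array + 1
print(array[0], tensor[0])  # 3.0 and tensor(2., dtype=torch.float64)

# The same holds in the other direction for CPU tensors.
ones = torch.ones(7)
ones_np = ones.numpy()
ones.add_(1)                # in-place add
print(ones_np[0])           # 2.0

# Converting the dtype makes a copy, which also breaks the link.
tensor32 = tensor.type(torch.float32)
print(tensor32.dtype)       # torch.float32
```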

## 2.0 Reproducibility (Trying to take random out of random)

#####In short, how a neural network learns:
`start with random numbers -> tensor operations -> update random numbers to make them better represent the data -> repeat`
##### To make this randomness reproducible, PyTorch uses the concept of a **random seed**.

#### "A random seed **flavours** the randomness."


In [None]:
#Make random but reproducible tensor
import torch

#Set random seed
RANDOM_SEED = 1234

torch.manual_seed(RANDOM_SEED) # seed the pseudorandom number generator
rand_tensor_A = torch.rand(3,4)
rand_tensor_B = torch.rand(3,4)

print(rand_tensor_A)
print(rand_tensor_B)
print(rand_tensor_A == rand_tensor_B)

tensor([[0.0290, 0.4019, 0.2598, 0.3666],
        [0.0583, 0.7006, 0.0518, 0.4681],
        [0.6738, 0.3315, 0.7837, 0.5631]])
tensor([[0.7749, 0.8208, 0.2793, 0.6817],
        [0.2837, 0.6567, 0.2388, 0.7313],
        [0.6012, 0.3043, 0.2548, 0.6294]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
#Set random seed
RANDOM_SEED = 1234

torch.manual_seed(RANDOM_SEED)
rand_tensor_A = torch.rand(3,4)
torch.manual_seed(RANDOM_SEED) # == reset the random seed
rand_tensor_B = torch.rand(3,4)

print(rand_tensor_A)
print(rand_tensor_B)
print(rand_tensor_A == rand_tensor_B)

tensor([[0.0290, 0.4019, 0.2598, 0.3666],
        [0.0583, 0.7006, 0.0518, 0.4681],
        [0.6738, 0.3315, 0.7837, 0.5631]])
tensor([[0.0290, 0.4019, 0.2598, 0.3666],
        [0.0583, 0.7006, 0.0518, 0.4681],
        [0.6738, 0.3315, 0.7837, 0.5631]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
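
As an alternative to reseeding the global state, a dedicated `torch.Generator` can carry its own seed. This is only a sketch of that option, not something the notes above rely on:

```python
import torch

# A dedicated generator keeps the seed local instead of changing global state.
gen = torch.Generator().manual_seed(1234)
first = torch.rand(3, 4, generator=gen)

gen.manual_seed(1234)  # reset the same generator
second = torch.rand(3, 4, generator=gen)

print(torch.equal(first, second))  # True
```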


### 2.0.0 Accessing a GPU

####Since PyTorch can run computations on either a GPU or a CPU, it's best practice to set up device-agnostic code  

In [3]:
!nvidia-smi

Mon Aug 26 09:47:50 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   42C    P8               9W /  70W |      3MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [1]:
# Setup device agnostic code
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [2]:
#Count the number of available GPUs
torch.cuda.device_count()

1

##2.1 Putting tensors (and models) on GPU

###Reason for putting tensors on GPU: Faster computation

In [5]:
tensor = torch.tensor([1,2,3])
print(tensor, tensor.device)
#Tensors are created on the CPU by default

tensor([1, 2, 3]) cpu


In [6]:
#Move the tensor to the GPU if available
tensor_on_gpu = tensor.to(device) # this is why we defined 'device' earlier
tensor_on_gpu
#In the output, the number after 'cuda:' is the index of the GPU

tensor([1, 2, 3], device='cuda:0')
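
The same `device` variable works for models as well as tensors. The sketch below uses a hypothetical one-layer model (`torch.nn.Linear`), chosen only to illustrate `.to(device)`; it runs on CPU when no GPU is available.

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Create a tensor directly on the chosen device instead of moving it afterwards.
tensor = torch.tensor([1., 2., 3.], device=device)

# Hypothetical tiny model, used only to show that .to(device) works on modules too.
model = torch.nn.Linear(in_features=3, out_features=1).to(device)

# Inputs and parameters must be on the same device for this call to succeed.
output = model(tensor)
print(output.device, next(model.parameters()).device)
```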

In [None]:
# Note that a tensor on the GPU cannot be converted to NumPy directly; this call raises an error
tensor_on_gpu.numpy()

In [None]:
#To fix the error,
#first move the tensor to the CPU, then convert it to NumPy
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

##Exercise and Extra-curricular for fundamentals