<a href="https://colab.research.google.com/github/eliseleahy/Pytorch-Tutorials/blob/main/00_Pytorch_Fundatmentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook - https://www.learnpytorch.io/00_pytorch_fundamentals/

In [38]:
import torch
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
print(torch.__version__)

1.13.0+cu116


In [39]:
!nvidia-smi


NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.



# Introduction to Tensors

Creating tensors

Ref: https://pytorch.org/docs/stable/tensors.html

In [40]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [41]:
scalar.ndim

0

In [42]:
scalar.item()

7

In [43]:
# Vector
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [44]:
vector.shape

torch.Size([2])

In [45]:
# MATRIX
MATRIX = torch.tensor([[7,8],[9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [46]:
MATRIX[0]


tensor([7, 8])

In [47]:
# TENSOR
TENSOR = torch.tensor([[[1,2,3],
                         [3,6,9],
                         [2,4,8]]])
TENSOR


tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 8]]])

In [48]:
TENSOR.shape

torch.Size([1, 3, 3])

In [49]:
TENSOR[0] 

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 8]])

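A tensor can hold one or more matrices. In the example above there is a single 3x3 matrix at TENSOR[0]. Each level of square brackets corresponds to one entry of torch.Size([1, 3, 3]), read from the outside in:

The first (outermost) bracket is the 1 in the torch.Size

The second bracket is the first 3 in the torch.Size

The third (innermost) bracket is the second 3 in the torch.Size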
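
As a minimal sketch of the bracket-to-shape correspondence described above (the variable name `t` is only for illustration), indexing once strips the outermost bracket and therefore the first entry of the shape:

```python
import torch

# Same values as the single 3x3 example in this section
t = torch.tensor([[[1, 2, 3],
                   [3, 6, 9],
                   [2, 4, 8]]])

print(t.ndim)       # number of nested bracket levels
print(t.shape)      # size of each level, outermost first
print(t[0].shape)   # indexing once drops the outermost dimension
print(t[0][1])      # second row of the only matrix
print(t[0, 1, 2])   # a single element: matrix 0, row 1, column 2
```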

In [50]:
TENSOR = torch.tensor([[[[1,2,3], 
                         [4,5,6], 
                         [7,8,9]],
                        [[1,2,3],
                         [3,6,9],
                         [2,4,8]]],
                       [[[1,2,3], 
                         [4,5,6], 
                         [7,8,9]],
                        [[1,2,3],
                         [3,6,9],
                         [2,4,8]]]])

print(TENSOR.shape)
print(TENSOR.ndim)

torch.Size([2, 2, 3, 3])
4


# Random Tensors 

Why random tensors?

Random tensors are important because neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

'start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers'

In [51]:
# Create a random tensor of size (3,4)

random_tensor = torch.rand(3,4)

random_tensor

tensor([[0.3967, 0.1177, 0.7578, 0.3708],
        [0.9396, 0.6968, 0.7557, 0.9778],
        [0.3048, 0.6468, 0.9459, 0.8511]])

In [52]:
#Create a random tensor of a similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, colour channels rgb
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

# Zeros and Ones 

In [53]:
zeros = torch.zeros(3,4)
ones = torch.ones(3,4)
zeros, ones

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]), tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]))

In [54]:
ones.dtype

torch.float32

# Range

Create a tensor holding a range of values with torch.arange, and create tensors shaped like an existing tensor with torch.zeros_like.

In [55]:
one_to_ten=torch.arange(0,10)

In [56]:
torch.arange(0,10, 0.5)

tensor([0.0000, 0.5000, 1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000, 4.0000,
        4.5000, 5.0000, 5.5000, 6.0000, 6.5000, 7.0000, 7.5000, 8.0000, 8.5000,
        9.0000, 9.5000])

In [57]:
# creating tensors like
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# Tensor Data Type

Note: tensor datatypes are one of the 3 big sources of errors you'll run into with PyTorch and deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device 

In [58]:
float_32_tensor = torch.tensor([3.0,6.0,9.0], 
                               dtype=None, # what datatype the tensor is (None infers it from the data, here float32)
                               device=None, # what device the tensor is on, e.g. CPU or GPU
                               requires_grad=False) # whether or not to track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.])

In [59]:
float_32_tensor.dtype

torch.float32

In [60]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor


tensor([3., 6., 9.], dtype=torch.float16)

In [61]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
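
The multiplication above succeeds even though the two tensors have different dtypes, because PyTorch promotes mixed dtypes to a common type for element-wise operations. The sketch below is illustrative only (the names `f32`, `f16` and `i64` are not from the notebook); other operations can be stricter about dtypes.

```python
import torch

f32 = torch.tensor([3.0, 6.0, 9.0])   # float32 by default
f16 = f32.type(torch.float16)
i64 = torch.tensor([1, 2, 3])         # int64 by default

# Element-wise ops promote to a common dtype
print((f16 * f32).dtype)              # the wider float type wins
print((i64 * f32).dtype)              # integer combined with float gives float

# torch.result_type reports the dtype an operation would produce
print(torch.result_type(f16, f32))
```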

### Getting information from tensors

1. Tensors not the right datatype - to get the datatype, use tensor.dtype
2. Tensors not the right shape - to get the shape, use tensor.shape
3. Tensors not on the right device - to get the device, use tensor.device

In [62]:
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.9566, 0.5699, 0.9635, 0.2292],
        [0.2244, 0.8099, 0.7901, 0.7022],
        [0.7634, 0.3294, 0.9701, 0.1725]])

In [63]:
print(some_tensor)
print(some_tensor.dtype)
print(some_tensor.shape)
print(some_tensor.device)

tensor([[0.9566, 0.5699, 0.9635, 0.2292],
        [0.2244, 0.8099, 0.7901, 0.7022],
        [0.7634, 0.3294, 0.9701, 0.1725]])
torch.float32
torch.Size([3, 4])
cpu


### Manipulating Tensors (tensor operations)

Tensor operations include: 
* Addition
* Subtraction 
* Multiplication (element-wise)
* Division 
* Matrix multiplication 

In [64]:
#create a tensor 
tensor = torch.tensor([1,2,3])
tensor+ 10

tensor([11, 12, 13])

In [65]:
# multiply tensor by 10 
tensor*10

tensor([10, 20, 30])

In [66]:
# subtract 10 
tensor - 10

tensor([-9, -8, -7])

In [67]:
# try out PyTorch inbuilt functions 
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [68]:
torch.add(tensor, 10)

tensor([11, 12, 13])

### Matrix Multiplication

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise
2. Matrix multiplication (the dot product, in the case of two vectors)

There are two main rules that matrix multiplication needs to satisfy: 
1. The inner dimensions must match:
* '(3,2) @ (3,2)' - won't work 
* '(2,3) @ (3,2)' - will work
* '(3,2) @ (2,3)' - will work 

2. The resulting matrix has the shape of the outer dimensions:
* '(2,3) @ (3,2)' -> (2,2)
* '(3,2) @ (2,3)' -> (3,3)
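
The two rules above can be checked directly. This is a small sketch with random values (the names `a`, `b` and `mismatch_raised` are illustrative); only the shapes matter here:

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

print((a @ b).shape)  # inner dims 3 and 3 match -> outer dims give (2, 2)
print((b @ a).shape)  # inner dims 2 and 2 match -> outer dims give (3, 3)

try:
    b @ b             # (3, 2) @ (3, 2): inner dims 2 and 3 differ
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"Shape error: {err}")
```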

In [69]:
# Element - wise
print(tensor, "*", tensor)
print(f"Equals: {tensor*tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [70]:
# Matrix multiplication 
torch.matmul(tensor,tensor)

tensor(14)

In [71]:
# Matrix multiplication by hand 
1*1 + 2*2 + 3*3 

14

In [72]:
%%time 
value = 0 
for i in range(len(tensor)):
  value+= tensor[i] * tensor[i]
value

CPU times: user 203 µs, sys: 28 µs, total: 231 µs
Wall time: 238 µs


tensor(14)

In [73]:
%%time
torch.matmul(tensor,tensor)

CPU times: user 67 µs, sys: 9 µs, total: 76 µs
Wall time: 82 µs


tensor(14)

### One of the most common errors in deep learning: shape errors


In [74]:
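# Shapes for matrix multiplication
tensor_A = torch.tensor([[1,2], 
                         [3,4], 
                         [5,6]])

tensor_B = torch.tensor([[7,10], 
                         [8,11],
                         [9,12]])

# torch.mm(tensor_A, tensor_B)  # left commented out: raises a RuntimeError because the inner dimensions of (3,2) @ (3,2) don't match
# Note: torch.mm is matrix multiplication for 2-D tensors; torch.matmul also handles other shapes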

In [75]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issue we can manipulate the shape of one of the tensors by taking its transpose.

In [76]:
tensor_B.T # .T takes the transpose 

tensor([[ 7,  8,  9],
        [10, 11, 12]])

In [77]:
# The matrix multiplication works when tensor_B is transposed 

torch.mm(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

## Finding the min, max, mean, sum, etc. (tensor aggregation)

In [78]:
# create a tensor 
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [79]:
# find min 
torch.min(x), x.min()


(tensor(0), tensor(0))

In [80]:
# find max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [81]:
# Find the mean - note: torch.mean requires a floating point (or complex) dtype, so the int64 tensor is converted to float32 first

torch.mean(x.type(torch.float32))

tensor(45.)
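
To see why the conversion in the cell above is needed, here is a hedged sketch of what happens without it. The exact error message may differ between PyTorch versions, and `mean_failed` is just an illustrative flag:

```python
import torch

x = torch.arange(0, 100, 10)   # int64 by default

try:
    torch.mean(x)              # integer dtypes are not supported by mean
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(f"Datatype error: {err}")

# Converting to a floating point dtype first works
print(x.float().mean())
print(x.to(torch.float64).mean())
```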

In [82]:
#find sum

torch.sum(x), x.sum()

(tensor(450), tensor(450))

In [83]:
#Find indices of the max
torch.topk(x,1)


torch.return_types.topk(
values=tensor([90]),
indices=tensor([9]))

# Finding the positional min and max

In [84]:
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [85]:
# Find the position (index) in the tensor that holds the minimum value 
x.argmin()

tensor(0)

In [86]:
x[0]

tensor(0)

In [87]:
x.argmax()


tensor(9)

In [88]:
x[9]

tensor(90)

# Reshaping, stacking, squeezing and unsqueezing tensors

* Reshape - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape, sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size 1 from a tensor
* Unsqueeze - adds a dimension of size 1 to a target tensor at a given position
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way

In [89]:
# Create a tensor

import torch

x=torch.arange(1.,10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [90]:
# Add an extra dimension
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [91]:
# Change the view

z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [92]:
# Changing z changes x (because a view of a tensor shares the same memory as the original)
z[:,0] = 5
z,x


(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [93]:
# Stack tensors on top of each other 
x_stacked = torch.stack([x,x,x,x], dim=0)
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [94]:
v_stacked = torch.vstack([x,x,x,x])
v_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [95]:
# torch.squeeze() - removes all single dimensions
x_reshaped.shape

torch.Size([1, 9])

In [96]:
x_squeezed = x_reshaped.squeeze()

In [97]:
# removes the 1 dimension
x_reshaped.squeeze().shape



torch.Size([9])

In [98]:
# torch.unsqueeze() adds a single dimension to a target tensor at a dim 
print(f"Previous target: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension with unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")

Previous target: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape: torch.Size([1, 9])


In [99]:
# torch.permute rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size = (224,224,3)) # [height, width, colour_channels]

#permute the original tensor to rearrange the axis or dim order
x_permuted = x_original.permute(2,0,1) # shifts axis 0->1, 1->2, 2->0 

x_original.shape, x_permuted.shape

(torch.Size([224, 224, 3]), torch.Size([3, 224, 224]))
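
Because permute returns a view, the permuted tensor shares memory with the original, just like `view` earlier in this section. The sketch below uses a small tensor of zeros (an illustrative choice, not from the notebook) so the shared value is easy to spot:

```python
import torch

x_original = torch.zeros(2, 3, 4)
x_permuted = x_original.permute(2, 0, 1)   # shape becomes (4, 2, 3)

# Writing to the original is visible through the permuted view
x_original[0, 0, 0] = 99.0
print(x_permuted[0, 0, 0])

# A permuted view is usually not contiguous in memory
print(x_permuted.is_contiguous())

# reshape works here by copying the data when a view is not possible
flat = x_permuted.reshape(-1)
print(flat.shape)
```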

# Indexing (Selecting data from tensors with indexes)

Indexing in PyTorch is similar to indexing in NumPy.



In [103]:
import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape 

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]), torch.Size([1, 3, 3]))

In [102]:
# Let's index on our new tensor 
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [104]:
# Let's index on the middle bracket (dim=1)
x[0,0]

tensor([1, 2, 3])

In [105]:
x[0][0]

tensor([1, 2, 3])

In [106]:
# Let's index on the innermost bracket (dim=2)

x[0][0][0]


tensor(1)

In [107]:
x[0][2][2]


tensor(9)

In [109]:
x[:,:,1]

tensor([[2, 5, 8]])

In [110]:
# Get all values of the 0th dim but only index 1 of the 1st and 2nd dims 
x[:,1,1]

tensor([5])

In [115]:
x[0,2,2], x[:,:,2]

(tensor(9), tensor([[3, 6, 9]]))

# PyTorch tensors and NumPy

NumPy is a popular Python library for scientific and numerical computing.

Because of this, PyTorch has functionality to interact with it. 
 
 * Data in NumPy, want it in a PyTorch tensor -> 'torch.from_numpy(ndarray)'
 * PyTorch tensor -> NumPy -> 'torch.Tensor.numpy()' 

In [119]:
# Numpy array to tensor 

import numpy as np
import torch 

array = np.arange(1.0,8.0)
tensor = torch.from_numpy(array).type(torch.float32) # when converting NumPy -> PyTorch, the tensor keeps NumPy's default dtype of float64 unless specified otherwise
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [120]:
#change value of array
array = array +1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [121]:
#Tensor to numpy array 

tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [123]:
# Change the tensor, what happens to numpy tensor 

tensor = tensor +1 
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
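
In the cells above, `array = array + 1` and `tensor = tensor + 1` create new objects, so the other side does not change. As a hedged sketch of the other case (the names `shared`, `copied`, `t` and `n` are illustrative), in-place updates are visible on both sides because `torch.from_numpy` and `Tensor.numpy()` share memory for CPU data:

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)            # float64 array [1., 2., 3.]
shared = torch.from_numpy(array)       # shares memory with the array
array += 1                             # in-place update
print(shared)                          # reflects the change

# Converting the dtype makes a copy, so the link is broken
copied = torch.from_numpy(array).type(torch.float32)
array += 1
print(copied)                          # unchanged by the second update

t = torch.ones(3)
n = t.numpy()                          # shares memory with the tensor
t += 1                                 # in-place update
print(n)                               # reflects the change
```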

# Reproducibility (trying to take the random out of random)

In short, how a neural network learns:

'start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> again ...'

To reduce the randomness in neural networks and PyTorch, there is the concept of a **random seed**.

Essentially, what the random seed does is "flavour" the randomness, so that the same seed produces the same sequence of random numbers.

In [124]:
import torch 

# create two random tensors 

random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)

print(random_tensor_A == random_tensor_B)


tensor([[0.1846, 0.6973, 0.8350, 0.9715],
        [0.0887, 0.0401, 0.8045, 0.4775],
        [0.5275, 0.5472, 0.7576, 0.8031]])
tensor([[0.3512, 0.6952, 0.6086, 0.4887],
        [0.2887, 0.9779, 0.0767, 0.0663],
        [0.3977, 0.3933, 0.9953, 0.6396]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [126]:
# using  random seed 

import torch 

# set seed

RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)

random_tensor_C = torch.rand(3,4)
torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)

print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
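
Note that the cell above calls `torch.manual_seed` before each `torch.rand`. A small sketch of why (variable names are illustrative): after seeding once, consecutive calls continue the same random sequence, so the second tensor differs from the first until the seed is reset.

```python
import torch

torch.manual_seed(42)
a = torch.rand(3, 4)
b = torch.rand(3, 4)      # continues the sequence, so it differs from a

torch.manual_seed(42)     # resetting the seed restarts the sequence
c = torch.rand(3, 4)

print(torch.equal(a, b))
print(torch.equal(a, c))
```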


# Running tensors and PyTorch objects on GPUs (and making faster computations)

GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch

### 1. Getting a GPU

1. Use Google Colab for a free GPU (with options to upgrade)
2. Use your own GPU - takes a little bit of setup and requires getting a GPU
3. Use cloud computing - GCP, AWS, Azure

In [2]:
!nvidia-smi


Fri Dec 30 13:56:04 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   56C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

### 2. Check for GPU access with PyTorch

In [3]:
import torch
torch.cuda.is_available()

True

Since PyTorch is capable of running on the GPU or CPU, it's best practice to set up device-agnostic code, e.g. run on the GPU if available, else default to the CPU.

In [6]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [8]:
# count no. of devices

torch.cuda.device_count()


1

## 3. Putting Tensors and models on the GPU

The reason we want our tensors/models on the GPU is that using a GPU results in faster computations. 

In [9]:
# Create a tensor (placed on the CPU here, which is also the default)
tensor = torch.tensor([1,2,3], device ="cpu")

print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [11]:
# Move tensor to GPU (if available)

tensor_on_gpu = tensor.to(device)
tensor_on_gpu


tensor([1, 2, 3], device='cuda:0')

## 4. Moving Tensors back to cpu


In [12]:
# If tensor on GPU, can't transform to numpy

#tensor_on_gpu.numpy()

TypeError: ignored

In [16]:
# to fix gpu tensor with numpy issue 
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [17]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
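
Putting the last two steps together, here is a minimal device-agnostic sketch (the names `t` and `as_numpy` are illustrative). It should run whether or not a GPU is present, since `.cpu()` simply returns the tensor itself when it is already on the CPU:

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

t = torch.tensor([1, 2, 3]).to(device)

# Always move to the CPU before converting to NumPy
as_numpy = t.cpu().numpy()

print(t.device)
print(as_numpy)
```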