<a href="https://colab.research.google.com/github/mikeyshean/pytorch-notebooks/blob/master/00_pytorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

1.12.1+cu113


## Intro to Tensors

### Creating tensors

PyTorch tensors are created using `torch.tensor()`: https://pytorch.org/docs/stable/tensors.html

In [None]:
# Scalar

scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# Get tensor back as Python int
scalar.item()

7

In [None]:
# Vector

vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX

MATRIX = torch.tensor([[7,8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[0]

tensor([7, 8])

In [None]:
MATRIX[1]

tensor([ 9, 10])

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# TENSOR
TENSOR = torch.tensor([[[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
MY_TENSOR = torch.tensor([[[1, 2, 3],
                           [4, 5, 6],
                           [7, 8, 9]],
                          [[10, 11, 12],
                           [13, 14, 15],
                           [16, 17, 18]]])
MY_TENSOR

tensor([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[10, 11, 12],
         [13, 14, 15],
         [16, 17, 18]]])

In [None]:
MY_TENSOR.ndim

3

In [None]:
MY_TENSOR.shape

torch.Size([2, 3, 3])

## Random Tensors

Why random tensors?

Random tensors are important because the way many neural networks learn is that they start with tensors full of random numbers and then adjust those random numbers to better represent the data.

`Start with rand -> look at data -> update rand -> look at data -> update rand`
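The loop above can be sketched in a few lines. This is only a minimal illustration, assuming a made-up dataset (`y = 2 * x`), a single random weight `w`, and a hand-picked learning rate of `0.05`; none of these come from the resource notebook.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy data: the relationship to discover is y = 2 * x
x = torch.tensor([1.0, 2.0, 3.0])
y = 2.0 * x

# Start with a random number
w = torch.rand(1, requires_grad=True)

for _ in range(100):
    # Look at the data: how far off is the current guess?
    loss = ((w * x - y) ** 2).mean()
    loss.backward()
    # Update the random number in the direction that reduces the error
    with torch.no_grad():
        w -= 0.05 * w.grad
        w.grad.zero_()

print(w)  # ends up very close to 2.0
```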

In [None]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.0040, 0.0013, 0.7002, 0.3062],
        [0.2118, 0.9720, 0.6754, 0.1215],
        [0.6801, 0.4847, 0.8090, 0.3533]])

In [None]:
# Create a random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(10, 10, 3) # height, width, color channels (RGB)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([10, 10, 3]), 3)

In [None]:
random_tensor.shape

torch.Size([3, 4])

## Zeros and ones

In [None]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.dtype

torch.float32

### Creating a range of tensors and tensors-like

In [None]:
# Use torch.arange()
torch.arange(0, 10)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
one_to_ten = torch.arange(start=1, end=11, step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# Creating tensors like
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor datatypes

**Note:** Tensor datatypes are one of the 3 big sources of errors you will run into with PyTorch & deep learning:
1.  Tensors not the right datatype
2.  Tensors not the right shape
3.  Tensors not on the right device

https://en.wikipedia.org/wiki/Precision_(computer_science)
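Not every dtype mismatch raises an error: many element-wise operations silently promote to the wider type. The sketch below uses `torch.promote_types` to show which dtype a mixed operation produces; the tensor names `a`, `b` and `mixed` are illustrative only.

```python
import torch

a = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
b = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)

# Ask PyTorch which dtype results from combining the two dtypes
promoted = torch.promote_types(a.dtype, b.dtype)

# Element-wise multiplication follows the same promotion rule
product = a * b

# An integer tensor combined with a Python float becomes the default float dtype
mixed = torch.tensor([1, 2, 3]) * 0.5

print(promoted, product.dtype, mixed.dtype)
```

Operations that cannot promote (for example `torch.mean()` on an integer tensor) raise an error instead, which is where an explicit cast is needed.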


In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,  # Datatype of the tensor (None infers it; floats default to torch.float32)
                               device=None, # Device the tensor lives on ("cuda", "cpu")
                               requires_grad=False) # Whether to track gradients for this tensor's operations
float_32_tensor

tensor([3., 6., 9.])

In [None]:
float_32_tensor.dtype

torch.float32

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [None]:
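# Note: despite the name, dtype=torch.bool makes this a boolean tensor (nonzero -> True, zero -> False)
int_tensor = torch.tensor([3, 0, 9], dtype=torch.bool)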
int_tensor

tensor([ True, False,  True])

In [None]:
int_tensor * float_16_tensor

tensor([3., 0., 9.], dtype=torch.float16)

## Getting information from tensors (attributes)

1.  Tensors not the right datatype - to get the datatype from a tensor, use `tensor.dtype`
2.  Tensors not the right shape - to get the shape from a tensor, use `tensor.shape`
3.  Tensors not on the right device - to get the device from a tensor, use `tensor.device`

In [None]:
# Create a tensor
some_tensor = torch.rand(size=(3, 4),dtype=torch.double, device="cpu")
some_tensor

tensor([[0.7857, 0.9127, 0.7844, 0.1379],
        [0.8103, 0.5564, 0.5378, 0.3506],
        [0.4116, 0.7295, 0.2461, 0.9586]], dtype=torch.float64)

In [None]:
# Find out details about tensor
print(f'Data type: {some_tensor.dtype}')
print(f'Shape: {some_tensor.shape}')
print(f'Device: {some_tensor.device}')
print(f'ndim: {some_tensor.ndim}')

Data type: torch.float64
Shape: torch.Size([3, 4])
Device: cpu
ndim: 2


## Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [None]:
# Create a tensor and add 10 to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
tensor

tensor([1, 2, 3])

In [None]:
# Subtract 10
tensor - 10

tensor([-9, -8, -7])

In [None]:
# Try out PyTorch built-in functions
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [None]:
torch.add(tensor, 10)

tensor([11, 12, 13])

### Matrix Multiplication

There are two main ways of performing multiplication in neural networks and deep learning:

1.  Element-wise multiplication
2.  Matrix multiplication

More info: https://www.mathsisfun.com/algebra/matrix-multiplying.html

In [None]:
# Element wise multiplication
print(tensor, "*", tensor)
print(f'Equals: {tensor * tensor}')

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [None]:
# Matrix multiplication with torch.matmul
torch.matmul(tensor, tensor)

tensor(14)

In [None]:
# By Hand
1*1 + 2*2 + 3*3

14

In [None]:
%%time 
torch.matmul(tensor, tensor) # very fast

CPU times: user 160 µs, sys: 0 ns, total: 160 µs
Wall time: 165 µs


tensor(14)

There are two main rules that matrix multiplication must satisfy:
1.  The **inner dimensions** must match:
    - `(3, 2) @ (3, 2)` won't work
    - `(2, 3) @ (3, 2)` will work
    - `(3, 2) @ (2, 3)` will work
2.  The resulting matrix has the shape of the **outer dimensions**:
    - `(2, 3) @ (3, 2)` -> `(2, 2)`
    - `(3, 2) @ (2, 3)` -> `(3, 3)`
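The two rules can be checked before multiplying. The helper `can_matmul` below is a hypothetical name introduced for illustration (it is not a PyTorch function) and only covers the 2-D case.

```python
import torch

def can_matmul(shape_a, shape_b):
    """Hypothetical helper: True if two 2-D shapes have matching inner dimensions."""
    return shape_a[1] == shape_b[0]

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Rule 1: inner dimensions (3 and 3) match, so the product is defined
ok = can_matmul(a.shape, b.shape)

# Rule 2: the result takes the outer dimensions (2 and 2)
result_shape = tuple((a @ b).shape)

# (3, 2) @ (3, 2) fails rule 1 because the inner dimensions are 2 and 3
mismatch = can_matmul((3, 2), (3, 2))

print(ok, result_shape, mismatch)
```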

In [None]:
# Uncommenting the next line raises a RuntimeError because the inner dimensions (2 and 3) don't match
# torch.matmul(torch.rand(3,2), torch.rand(3,2))

In [None]:
torch.matmul(torch.rand(2, 3), torch.rand(3,2))

tensor([[0.9618, 0.7299],
        [1.2641, 1.1081]])

In [None]:
torch.matmul(torch.rand(3, 2), torch.rand(2, 3)) # outer dimensions are 3 x 3

tensor([[0.3828, 1.1994, 0.7984],
        [0.3302, 1.2223, 0.7551],
        [0.3039, 0.8324, 0.5914]])

### One of the most common errors in deep learning: Shape Errors

In [None]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])
tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])
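# torch.mm(tensor_A, tensor_A) # torch.mm is torch.matmul restricted to 2-D matrices (no broadcasting)
torch.matmul(tensor_A, tensor_B) # raises a RuntimeError: both shapes are (3, 2), so the inner dimensions (2 and 3) don't match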

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a **transpose**.

A **transpose** switches the axes or dimensions of a given tensor.

In [None]:
tensor_B

tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]])

In [None]:

tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]), torch.Size([2, 3]))

In [None]:
# The matrix multiplication operation works when tensor_B is transposed
print(f'Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}')
print(f'New shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.T.shape}')
print(f'Multiplying: {tensor_A.shape} @ {tensor_B.T.shape} <- inner dimensions must match')
print("Output:\n")
output = torch.mm(tensor_A, tensor_B.T)
print(output)
print(f'\nOutput shape: {output.shape}')

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])
New shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([2, 3])
Multiplying: torch.Size([3, 2]) @ torch.Size([2, 3]) <- inner dimensions must match
Output:

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

Output shape: torch.Size([3, 3])


## Find the min, max, mean, sum, etc (tensor aggregation)

In [None]:
# Create a tensor
x = torch.arange(5, 100, 10)
x, x.dtype

(tensor([ 5, 15, 25, 35, 45, 55, 65, 75, 85, 95]), torch.int64)

In [None]:
# Find the min
torch.min(x), x.min()

(tensor(5), tensor(5))

In [None]:
# Find the max
torch.max(x), x.max()

(tensor(95), tensor(95))

In [None]:
# Find the mean - note: torch.mean() requires a floating-point (or complex) dtype, so cast the int64 tensor first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(50.), tensor(50.))

In [None]:
# Find the sum
torch.sum(x), x.sum()

(tensor(500), tensor(500))

## Find the positional min and max

In [None]:
x

tensor([ 5, 15, 25, 35, 45, 55, 65, 75, 85, 95])

In [None]:
# Find the position in the tensor that has the minimum value with argmin() -> returns the index position of target tensor where min value occurs
x.argmin()

tensor(0)

In [None]:
x[0]

tensor(5)

In [None]:
# Find the position in the tensor that has the max value with argmax()
x.argmax()

tensor(9)

In [None]:
x[9]

tensor(95)

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - Return a view of an input tensor of certain shape but keep the same memory as the original tensor
* Stacking - combine multiple tensors on top of each other (vstack) or side-by-side (hstack)
* Squeeze - remove all `1` dimensions from a tensor
* Unsqueeze - add a `1` dimension to a target tensor
* Permute - Return a view of the input with dimensions permuted (swapped) in a certain way
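The list mentions `vstack` and `hstack`, so here is a small sketch of how they differ from `torch.stack` on 1-D inputs, plus `squeeze()` called without a `dim`. The tensor `row` and the shapes are illustrative choices.

```python
import torch

row = torch.arange(1., 4.)  # shape (3,)

# vstack places the tensors on top of each other as rows
v = torch.vstack([row, row])

# hstack joins 1-D tensors end to end along the only dimension
h = torch.hstack([row, row])

# stack always creates a new dimension, here at position 1
s = torch.stack([row, row], dim=1)

# squeeze() with no dim removes every size-1 dimension
sq = torch.rand(1, 3, 1).squeeze()

print(v.shape, h.shape, s.shape, sq.shape)
```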

In [None]:
# Let's create a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1, 9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Change the view
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# Changing z changes x (because a view of a tensor shares the same memory as the original tensor)
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [None]:
# Stack tensors on top of each other
x_stack = torch.stack([x, x, x, x], dim=0)
x_stack, x_stack.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([4, 9]))

In [None]:
# torch.unsqueeze() - adds a single dimension to a target tensor at a specified dim (dimension) 
unsqueezed_tensor = torch.unsqueeze(x_stack, 1)
unsqueezed_tensor, unsqueezed_tensor.shape

(tensor([[[5., 2., 3., 4., 5., 6., 7., 8., 9.]],
 
         [[5., 2., 3., 4., 5., 6., 7., 8., 9.]],
 
         [[5., 2., 3., 4., 5., 6., 7., 8., 9.]],
 
         [[5., 2., 3., 4., 5., 6., 7., 8., 9.]]]), torch.Size([4, 1, 9]))

In [None]:
# torch.squeeze() - removes single (size-1) dimensions from a target tensor; dim=1 limits it to that dimension
squeezed_tensor = torch.squeeze(unsqueezed_tensor, dim=1)
squeezed_tensor, squeezed_tensor.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([4, 9]))

In [None]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
# commonly used for image data tensors

x_original = torch.rand(size=(224, 224, 3)) # h, w, RGB

# Permute the original tensor to rearrange the axis (or dim) order:
x_permuted = torch.permute(x_original, (2, 0, 1)) # shifts axis 0->1, 1->2, 2->0

print(f'Old shape: {x_original.shape}')
print(f'New shape: {x_permuted.shape}') # rgb, h, w

Old shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


In [None]:
# permute returns a view, so a change to x_original also shows up in x_permuted
x_original[0, 0, 0] = .5555
x_original[0, 0, 0], x_permuted[0, 0, 0]

(tensor(0.5555), tensor(0.5555))

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy

In [None]:
# Create a tensor
import torch
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]), torch.Size([1, 3, 3]))

In [None]:
# Let's index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:

# Index on middle bracket (dim=1)
x[0, 0]

tensor([1, 2, 3])

In [None]:
# Index on inner most bracket (last dimension)
x[0, 0, 0]

tensor(1)

In [None]:
# You can also use ":" to select "all" of a target dimension
# Try: Get all values of the 0th and 1st dims, but only index 1 of the 2nd dim
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0th dim, but only index 1 of the 1st and 2nd dims
x[:, 1, 1]

tensor([5])

In [None]:
# Get index 0 of the 0th and 1st dims and all values of the 2nd dim
x[0, 0, :]

tensor([1, 2, 3])

## PyTorch tensors & NumPy

NumPy is a popular scientific Python numerical computing library.

As such, PyTorch has functionality to interact with it:

* Data in NumPy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`
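One detail worth knowing about both conversions: on the CPU they share memory with the source, so in-place changes are visible on both sides, while `torch.tensor(ndarray)` makes a copy. A minimal sketch, with illustrative variable names:

```python
import numpy as np
import torch

array = np.arange(1.0, 4.0)

shared = torch.from_numpy(array)  # shares memory with the NumPy array
copied = torch.tensor(array)      # makes an independent copy

# In-place update of the array (array = array + 1 would create a new array instead)
array += 1

# The same holds in the other direction for CPU tensors
t = torch.ones(3)
as_numpy = t.numpy()
t.add_(1)  # in-place add, so as_numpy sees the change

print(shared, copied, as_numpy)
```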



In [None]:
# NumPy array to tensor
import torch
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # warning: when converting from NumPy, PyTorch keeps NumPy's default dtype of float64 unless specified otherwise
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
array.dtype, tensor.dtype, torch.arange(1.0, 8.0).dtype

(dtype('float64'), torch.float64, torch.float32)

In [None]:
tensor = torch.from_numpy(array).type(torch.float32)
tensor.dtype

torch.float32

In [None]:
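# Change the value of array, what will this do to tensor?
array = array + 1  # creates a new array rather than modifying the original in place
array, tensor  # tensor is unchanged: .type(torch.float32) above already made a separate copy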

(array([2., 3., 4., 5., 6., 7., 8.]), tensor([1., 2., 3., 4., 5., 6., 7.]))

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor  # Using default dtype as created from tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
# Change the tensor, what happens to NumPy array?

tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

## Reproducibility (trying to take the random out of random)

To reduce the randomness in neural networks and PyTorch, we use a **random seed**.

PyTorch docs: https://pytorch.org/docs/stable/notes/randomness.html?highlight=reproducibility

In [None]:
import torch

# Create two random tensors
rand_A = torch.rand(3, 4)
rand_B = torch.rand(3, 4)

print(rand_A)
print(rand_B)
print(rand_A == rand_B)

tensor([[0.2326, 0.4205, 0.3738, 0.9848],
        [0.3451, 0.8938, 0.6139, 0.2941],
        [0.4486, 0.8609, 0.3326, 0.4979]])
tensor([[0.9770, 0.9186, 0.3726, 0.9103],
        [0.8038, 0.2859, 0.6773, 0.4935],
        [0.3256, 0.7729, 0.2153, 0.1391]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Let's make random but reproducible tensors

# Set the random seed
RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
rand_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
rand_D = torch.rand(3, 4)

print(rand_C)
print(rand_D)
print(rand_C == rand_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
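The seed is set again before each `torch.rand()` call above for a reason: a seed fixes the whole sequence of random numbers, not every individual draw. A short sketch of what happens when the seed is set only once (variable names are illustrative):

```python
import torch

torch.manual_seed(42)
first = torch.rand(3)
second = torch.rand(3)  # continues the same sequence, so it differs from first

torch.manual_seed(42)   # restart the sequence
again = torch.rand(3)   # reproduces first

print(torch.equal(first, again), torch.equal(first, second))
```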


## Running tensors and PyTorch objects on GPUs

GPUs = faster computation on numbers thanks to CUDA + NVIDIA hardware + PyTorch

For local setup: https://pytorch.org/get-started/locally/

In [None]:
!nvidia-smi

Thu Nov 10 02:27:03 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   57C    P8    10W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               

## Check for GPU access with PyTorch

In [None]:
# Check for GPU access with PyTorch
import torch
torch.cuda.is_available()

True

Write device agnostic code: https://pytorch.org/blog/pytorch-0_4_0-migration-guide/#writing-device-agnostic-code

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count number of devices
torch.cuda.device_count()

1

## Putting tensors (and models) on the GPU

We want to put tensors on the GPU because computations run faster there.

In [None]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### Move tensors back to CPU

In [None]:
# If a tensor is on the GPU, it can't be converted to NumPy directly (this raises a TypeError)
tensor_on_gpu.numpy()

TypeError: ignored

In [None]:
# To fix the GPU tensor with NumPy issue, we can first copy it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')