## PyTorch Fundamentals

In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.0.1+cu118


## Introduction to Tensors
### Creating tensors

#### PyTorch tensors are created using `torch.tensor()`: https://pytorch.org/docs/stable/tensors.html

In [2]:
# Scalar
scaler = torch.tensor(7)
# Get the tensor back as a Python int
scaler.item()

7

In [3]:
scaler.ndim

0

In [4]:
# Vector
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [5]:
vector.ndim

1

In [6]:
vector.shape

torch.Size([2])

In [7]:
# MATRIX
MATRIX = torch.tensor([[5,6],[7,8]])
MATRIX

tensor([[5, 6],
        [7, 8]])

In [8]:
MATRIX.ndim

2

In [9]:
MATRIX.shape

torch.Size([2, 2])

In [10]:
# TENSOR
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]],
                       [[1,2,3],
                        [4,5,6],
                        [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]],

        [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [11]:
TENSOR.ndim

3

In [12]:
TENSOR.shape

torch.Size([2, 3, 3])

### Random tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`start at random numbers -> look at data -> update random numbers -> look at data -> update random numbers`
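As a rough illustration of this loop, the sketch below starts from a single random weight and nudges it toward a known target with plain gradient descent. The toy data (`y = 2 * x`), the learning rate of `0.05` and the variable names are assumptions made for this example, not part of the original notebook.

```python
import torch

torch.manual_seed(0)

# Toy data following y = 2 * x (an assumption made for this sketch)
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

# Start with a random number
weight = torch.rand(1, requires_grad=True)

for _ in range(200):
    prediction = weight * x                  # look at the data
    loss = ((prediction - y) ** 2).mean()    # measure how far off the prediction is
    loss.backward()                          # gradient of the loss w.r.t. the weight
    with torch.no_grad():
        weight -= 0.05 * weight.grad         # update the random number
    weight.grad.zero_()                      # clear the gradient for the next step

print(weight.item())  # ends up close to 2.0
```

Real models repeat the same idea with many weights at once and use an optimizer instead of the manual update.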



In [13]:
# create a random tensor of size (3,4)

random_tensor = torch.rand(3,4)
random_tensor


tensor([[0.0068, 0.1585, 0.9216, 0.6877],
        [0.2269, 0.6162, 0.6780, 0.5107],
        [0.2126, 0.1144, 0.2221, 0.8896]])

In [14]:
random_tensor.ndim

2

In [15]:
random_tensor.shape

torch.Size([3, 4])

In [16]:
# create a random tensor with a shape similar to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, color channel (RGB)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and ones


In [17]:
# creating tensors of zeros and ones
zeros = torch.zeros(size=(2,3,4))
zeros

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [18]:
ones = torch.ones(size=(2,3,4))
ones

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

In [19]:
ones.dtype

torch.float32

### Creating a range and tensors-like

In [20]:
one_to_ten = torch.arange(start=1,end=11,step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [21]:
# create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input= one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor datatype
**Note:** Three big errors you'll run into with PyTorch and deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [22]:
# float32 tensor
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype=None, # None lets PyTorch infer the dtype (float32 for Python floats)
                               device=None, # cpu or gpu (which device the tensor lives on)
                               requires_grad=False) # whether to track gradients
float_32_tensor

tensor([3., 6., 9.])

In [23]:
float_32_tensor.dtype

torch.float32

In [24]:
# float 16 tensor
float_16_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype= torch.float16,
                               device=None,
                               requires_grad=False)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [25]:
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [26]:
float_16_tensor2 = float_32_tensor.type(torch.float16)
float_16_tensor2

tensor([3., 6., 9.], dtype=torch.float16)

In [27]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [28]:
int_32_tensor = float_32_tensor.type(dtype=torch.int32)
int_32_tensor* float_32_tensor

tensor([ 9., 36., 81.])
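The two products above work because PyTorch promotes mixed dtypes to a common one for many element-wise operations. The sketch below, with variable names chosen only for this example, uses `torch.result_type` to ask which dtype such an operation would produce. Not every operation promotes automatically, so dtype mismatches can still raise errors elsewhere.

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
float_16 = float_32.to(torch.float16)
int_32 = float_32.to(torch.int32)

# Ask PyTorch which dtype an operation between two tensors would produce
print(torch.result_type(float_16, float_32))  # torch.float32
print(torch.result_type(int_32, float_32))    # torch.float32

# The products carry the promoted dtype
product_a = float_16 * float_32
product_b = int_32 * float_32
print(product_a.dtype, product_b.dtype)
```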

### Getting information from tensors
1. Tensors not the right datatype: check `tensor.dtype`
2. Tensors not the right shape: check `tensor.shape`
3. Tensors not on the right device: check `tensor.device`

In [29]:
# create a tensor
some_tensor = torch.rand(1,3,4)
some_tensor, some_tensor.dtype, some_tensor.shape, some_tensor.device

(tensor([[[0.8061, 0.8366, 0.1620, 0.8709],
          [0.6122, 0.0712, 0.2415, 0.5822],
          [0.2586, 0.0284, 0.1223, 0.1142]]]),
 torch.float32,
 torch.Size([1, 3, 4]),
 device(type='cpu'))
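When debugging, it can help to collect these three attributes in one place. The helper below is hypothetical (`describe` is not a PyTorch function) and is only a sketch of how that might look.

```python
import torch

def describe(tensor):
    """Hypothetical helper: gather the three attributes that most often cause errors."""
    return {
        "dtype": tensor.dtype,
        "shape": tuple(tensor.shape),
        "device": str(tensor.device),
    }

info = describe(torch.rand(1, 3, 4))
print(info)
```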

### Manipulating tensors (tensor operations)
Tensor operations include:
- addition
- subtraction
- division
- multiplication (element-wise)
- matrix multiplication

Two rules for matrix multiplication:

1. The **inner dimensions** must match:
   - `(3,2) @ (2,3)` will work
   - `(3,2) @ (3,2)` won't work
2. The resulting matrix has the shape of the **outer dimensions**:
   - `(3,2) @ (2,3)` -> `(3,3)`



In [30]:
tensor = torch.tensor([1,2,3])
tensor+10, tensor-10, tensor/10, tensor*10

(tensor([11, 12, 13]),
 tensor([-9, -8, -7]),
 tensor([0.1000, 0.2000, 0.3000]),
 tensor([10, 20, 30]))

In [31]:
# element-wise multiplication
tensor*tensor

tensor([1, 4, 9])

In [32]:
# matrix multiplication (for two 1-D tensors this is the dot product); the built-in method is much faster and simpler than a for loop
torch.matmul(tensor,tensor)

tensor(14)
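To make the comparison with a for loop concrete, the sketch below computes the same dot product by hand and with `torch.matmul`. The variable names `total` and `builtin` are chosen only for this example.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Dot product by hand: multiply matching elements, then add them up
total = 0
for i in range(len(tensor)):
    total += tensor[i] * tensor[i]

# The built-in call gives the same value in a single step
builtin = torch.matmul(tensor, tensor)
print(total, builtin)  # tensor(14) tensor(14)
```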

### One of the most common errors in deep learning: shape errors

In [33]:
torch.matmul(torch.rand(3,2), torch.rand(3,2))

RuntimeError: ignored

In [34]:
torch.matmul(torch.rand(3,2), torch.rand(2,3))

tensor([[0.8104, 1.0350, 0.8779],
        [0.2698, 0.5290, 0.3190],
        [0.4958, 0.7144, 0.5489]])

To fix tensor shape issues, we can change a tensor's shape by taking its transpose.

In [35]:
tensor_A =torch.rand(3,2)
tensor_B =torch.rand(3,2)
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

In [36]:
tensor_B.T.shape

torch.Size([2, 3])

In [37]:
torch.mm(tensor_A, tensor_B.T)

tensor([[0.7902, 0.6117, 0.9896],
        [0.7557, 0.0641, 0.6981],
        [0.6901, 0.6006, 0.8960]])

### Finding min, max, mean, sum, etc (tensor aggregation)

In [38]:
# create a tensor
x= torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [39]:
# find min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [40]:
# find max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [41]:
# find average/mean
torch.mean(x.type(dtype= torch.float32)), x.type(dtype= torch.float32).mean()

(tensor(45.), tensor(45.))
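The cast to `torch.float32` above is needed because `torch.mean` does not accept integer tensors. The sketch below shows the error and an alternative that passes `dtype` directly to `mean`. It assumes a recent PyTorch release, and the error message text may differ between versions.

```python
import torch

x = torch.arange(0, 100, 10)  # dtype is torch.int64

try:
    torch.mean(x)             # integer input is rejected
except RuntimeError as error:
    print("mean failed:", error)

# Passing dtype casts the input before averaging
mean_value = x.mean(dtype=torch.float32)
print(mean_value)
```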

In [42]:
# find sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

### Finding the min and max index using `argmin` and `argmax`

In [43]:
torch.argmin(x), torch.argmax(x)

(tensor(0), tensor(9))

In [44]:
x.argmin(), x.argmax()

(tensor(0), tensor(9))

## Reshaping, stacking, squeezing and unsqueezing tensors
* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor in a certain shape while sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size 1 from a tensor
* Unsqueeze - adds a dimension of size 1 to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way



In [45]:
import torch
x = torch.arange(1.,10.)
x,x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [46]:
# add an extra dimension
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [47]:
# change the view
z= x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [48]:
# a view shares memory with the original tensor, so changing z also changes x
z[:,0]=5
z,x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
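One practical difference between `view` and `reshape` shows up with non-contiguous tensors, such as a transpose. The sketch below, using illustrative variable names, shows `reshape` falling back to a copy where `view` raises an error.

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.T                  # same memory as base, but no longer contiguous

# reshape falls back to copying when a view is impossible
flat_copy = transposed.reshape(6)
shares_memory = flat_copy.data_ptr() == base.data_ptr()
print(shares_memory)                 # False: the result is a copy

try:
    transposed.view(6)               # view never copies, so it raises instead
except RuntimeError as error:
    print("view failed:", error)
```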

In [49]:
# stacking the tensors
torch.stack([x,x,x,x], dim=0)

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
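The `dim` argument controls where the new dimension goes. The sketch below compares `torch.stack` along two dimensions with `torch.vstack` and `torch.hstack` on a small 1-D tensor; the variable names are illustrative.

```python
import torch

x = torch.arange(1., 4.)             # shape (3,)

rows = torch.stack([x, x], dim=0)    # new dimension in front -> shape (2, 3)
cols = torch.stack([x, x], dim=1)    # new dimension at the end -> shape (3, 2)
v = torch.vstack([x, x])             # shape (2, 3), same values as rows
h = torch.hstack([x, x])             # 1-D inputs are joined end to end -> shape (6,)

print(rows.shape, cols.shape, v.shape, h.shape)
```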

In [50]:
# torch.squeeze() removes all dimensions of size 1
x_reshaped, x_reshaped.squeeze()

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [51]:
x_reshaped.shape, x_reshaped.squeeze().shape

(torch.Size([1, 9]), torch.Size([9]))

In [52]:
# torch.unsqueeze() adds a dimension of size 1 at the given index
x_squeezed = x_reshaped.squeeze()
x_unsqueezed =x_squeezed.unsqueeze(0)
x_squeezed, x_unsqueezed

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]))

In [53]:
x_squeezed.shape,x_unsqueezed.shape

(torch.Size([9]), torch.Size([1, 9]))

In [54]:
# torch.permute() rearranges the dimensions of a tensor in the specified order
x_original = torch.rand(size=(224,224,3)) # height, width, RGB
torch.permute(x_original, (2,0,1)).shape # RGB, height, width

torch.Size([3, 224, 224])
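Because `permute` returns a view, the permuted tensor shares memory with the original. The sketch below changes one value in the original and reads it back through the view; the names `image` and `channels_first` are chosen for this example.

```python
import torch

image = torch.rand(size=(224, 224, 3))     # height, width, colour channels
channels_first = image.permute(2, 0, 1)    # colour channels, height, width

image[0, 0, 0] = 0.5                       # change the original...
print(channels_first[0, 0, 0])             # ...and the permuted view sees the change
print(channels_first.shape)
```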

## Indexing (indexing in PyTorch is similar to indexing in NumPy)

In [61]:
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [63]:
# index the innermost element, the first row, and the first (outer) dimension
x[0][0][0], x[0][0], x[0]

(tensor(1),
 tensor([1, 2, 3]),
 tensor([[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]))

In [67]:
# use ":" to select all of a target dimension
# e.g. get all values of the 0th and 1st dimensions but only index 1 of the 2nd dimension
x[:,:,1]

tensor([[2, 5, 8]])
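A few more selections on the same tensor, as a sketch of how comma-separated indices and `:` combine:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

first_row = x[:, 0]       # all of dim 0, index 0 of dim 1 -> tensor([[1, 2, 3]])
middle = x[:, 1, 1]       # all of dim 0, index 1 of dims 1 and 2 -> tensor([5])
same_row = x[0, 0, :]     # same values as x[0][0] -> tensor([1, 2, 3])
last = x[0, 2, 2]         # single element -> tensor(9)

print(first_row, middle, same_row, last)
```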

## PyTorch tensors & NumPy

In [72]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0,8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [80]:
# tensor to NumPy array
ones = torch.ones(10)
numpy_tensor = ones.numpy()
numpy_tensor, ones

(array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32),
 tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]))
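Two details are worth keeping in mind here: `torch.from_numpy` keeps NumPy's default `float64` dtype, and the resulting tensor shares memory with the array. The sketch below shows both, along with a conversion to `float32` that produces an independent copy. The name `tensor_32` is chosen for this example.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)            # float64 by default
tensor = torch.from_numpy(array)       # keeps float64 and shares memory with array

array[0] = 100.0                       # in-place change to the array
print(tensor[0])                       # the tensor sees the change

# Converting the dtype creates a copy, so later changes no longer propagate
tensor_32 = torch.from_numpy(array).type(torch.float32)
array[1] = -1.0
print(tensor_32[1])                    # still 2.0
```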

## Reproducibility (trying to take the random out of random)

In short, how a neural network works:

`start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> again`

To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.
Essentially, the random seed "flavours" the randomness.



In [82]:
import torch
# create two random tensors and check whether they happen to be equal
tensor_A = torch.rand(3,4)
tensor_B = torch.rand(3,4)
print(tensor_A)
print(tensor_B)
print(tensor_A==tensor_B)

tensor([[0.3633, 0.9227, 0.4481, 0.4440],
        [0.1460, 0.8294, 0.5174, 0.5000],
        [0.0071, 0.9985, 0.0834, 0.8468]])
tensor([[0.8420, 0.9703, 0.3454, 0.2158],
        [0.8450, 0.1931, 0.4006, 0.2073],
        [0.0386, 0.6066, 0.1971, 0.1943]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [85]:
# now try the same thing with a random seed
RANDOM_SEED=42

torch.manual_seed(RANDOM_SEED)
tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
tensor_D = torch.rand(3,4)

print(tensor_C)
print(tensor_D)
print(tensor_C == tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
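`torch.manual_seed` sets the global seed. As an alternative sketch, a separate `torch.Generator` can carry its own seed, which keeps the reproducible draws independent of other random calls in the program. The variable names below are illustrative.

```python
import torch

generator = torch.Generator().manual_seed(42)
first = torch.rand(3, 4, generator=generator)

generator.manual_seed(42)              # reset the generator before the second draw
second = torch.rand(3, 4, generator=generator)

print(torch.equal(first, second))      # True
```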


## Running tensors and PyTorch objects on GPUs (and making faster computations)
GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes to make everything hunky dory (good).

### 1. Getting a GPU
1. Easiest - Use Google Colab for a free GPU (options to upgrade as well)
2. Use your own GPU - takes a little bit of setup and requires purchasing a GPU. There are lots of options; see this post for which one to get: https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/
3. Use cloud computing - GCP, AWS and Azure let you rent computers in the cloud and access them

For options 2 and 3, PyTorch + GPU drivers (CUDA) take a little bit of setting up. To do this, refer to the PyTorch setup documentation: https://pytorch.org/get-started/locally/

In [1]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


### 2. Check for GPU access with PyTorch

In [1]:
import torch
torch.cuda.is_available()

True

In [4]:
# Set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [5]:
# count number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU
We want our tensors/models on the GPU because using a GPU results in faster computations.

In [6]:
# create a tensor (default on the CPU)
tensor = torch.tensor([1,2,3])

# tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [7]:
# move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### 4. Moving tensors back to CPU

In [8]:
# if a tensor is on the GPU, it can't be converted to NumPy directly
tensor_on_gpu.numpy()

TypeError: ignored

In [9]:
# to fix the GPU tensor with NumPy issue, first move the tensor to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu


array([1, 2, 3])
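Putting the pieces together, the sketch below runs the same way with or without a GPU: it picks the device, creates the tensor directly on it, and moves it back to the CPU before converting to NumPy. Calling `.cpu()` on a tensor that is already on the CPU simply returns the tensor, so the conversion is safe in both cases.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the chosen device
tensor = torch.tensor([1, 2, 3], device=device)

# Move to the CPU (a no-op if already there) before converting to NumPy
array = tensor.cpu().numpy()
print(array, tensor.device)
```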