PyTorch Fundamentals

In [None]:
import torch
print(torch.__version__)

2.0.1+cu118


# Creating Tensors

In [None]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# Get the tensor's value back as a Python int
scalar.item()

7

In [None]:
#Vector
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX
matrix = torch.tensor([[7,8],
                       [9,10]
                       ])
matrix

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
matrix.ndim

2

In [None]:
matrix[1]

tensor([ 9, 10])

In [None]:
matrix.shape

torch.Size([2, 2])

In [None]:
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]
                        ]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
NEW_TENSOR = torch.tensor([[[[[1,2,3],
                             [4,5,6],
                             [7,8,9]
                             ]]]])
NEW_TENSOR

tensor([[[[[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]]]])

In [None]:
NEW_TENSOR.shape

torch.Size([1, 1, 1, 3, 3])

In [None]:
NEW_TENSOR.ndim

5

### Random tensors

Why Random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those
numbers to better represent the data.

In [None]:
# Create a random tensor
random_tensor = torch.rand(1,3,4)
random_tensor

tensor([[[0.1831, 0.2223, 0.7255, 0.2466],
         [0.8179, 0.3199, 0.0858, 0.7012],
         [0.4456, 0.2947, 0.4373, 0.7721]]])

In [None]:
random_tensor.ndim

3

In [None]:
# Random tensor with similar shape to an Image
random_image_size_tensor = torch.rand(size=(224,224,3)) #height, width and color channels
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

# Zeroes and Ones

In [None]:
# Create a tensor of zeros
zeros = torch.zeros(size = (3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# Create a tensor of ones
ones = torch.ones(size = (3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.dtype

torch.float32

# Creating a range and tensors-like

In [None]:
# Use torch.arange (torch.range is deprecated)
one_to_ten = torch.arange(1,11)
torch.arange(start = 1, end = 1000, step = 77)

tensor([  1,  78, 155, 232, 309, 386, 463, 540, 617, 694, 771, 848, 925])

In [None]:
# Using tensors-like (create a tensor with the same shape as another tensor)
ten_zeros = torch.zeros_like(input = one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# Tensor Datatypes
**Note:** Tensor datatype problems are one of the three big sources of errors you'll run into with PyTorch & deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device
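
As a rough way to check all three at once, the sketch below collects a tensor's dtype, shape, and device so two tensors can be compared before an operation. The helper name `describe` is hypothetical and not part of PyTorch.

```python
import torch

def describe(t: torch.Tensor) -> dict:
    """Hypothetical helper: gather the three attributes behind most tensor errors."""
    return {"dtype": t.dtype, "shape": tuple(t.shape), "device": t.device.type}

a = torch.tensor([3.0, 6.0, 9.0])  # float data defaults to float32
b = torch.tensor([3, 6, 9])        # integer data defaults to int64

info_a, info_b = describe(a), describe(b)
print(info_a)
print(info_b)

# Compare each attribute to spot a mismatch before running an operation
for key in ("dtype", "shape", "device"):
    print(f"same {key}: {info_a[key] == info_b[key]}")
```

Here only the dtype differs, so a dtype-sensitive operation is the one to watch out for.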

In [None]:
# float32 tensor (the default datatype is float32, even if dtype is set to None)
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype = None)
float_32_tensor.dtype

torch.float32

In [None]:
test_tensor = torch.tensor([3.0, 6.0, 9.0],
                   dtype = None, # What datatype the tensor has (e.g. float32, float16 and many more)
                   device = None, # What device the tensor is on (e.g. CPU or GPU)
                   requires_grad = False # Whether or not to track gradients for operations on this tensor
                   )

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [None]:
int_32_tensor = torch.tensor([3, 6, 9], dtype = torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [None]:
float_32_tensor * int_32_tensor

tensor([ 9., 36., 81.])
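
The two mixed-dtype multiplications above succeed because PyTorch promotes the operands to a common dtype for many element-wise operations. The sketch below uses `torch.result_type` to show which dtype an operation would produce; the variable names are illustrative only.

```python
import torch

f16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
f32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
i32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# torch.result_type reports the dtype an operation on the two inputs would produce
print(torch.result_type(f16, f32))  # torch.float32
print(torch.result_type(f32, i32))  # torch.float32

# The actual products carry the promoted dtype
print((f16 * f32).dtype, (f32 * i32).dtype)
```

Not every operation promotes its inputs, so datatype mismatches can still raise errors elsewhere.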

# Getting information from tensors
1. To get datatype of the tensor we can use tensor.dtype
2. To get device of the tensor we can use tensor.device
3. To get shape of the tensor we can use tensor.shape

In [None]:
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.3389, 0.2000, 0.5747, 0.3235],
        [0.0839, 0.8075, 0.9627, 0.0449],
        [0.4312, 0.8169, 0.2137, 0.7343]])

In [None]:
# Finding out details
print(some_tensor)
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of the tensor: {some_tensor.shape}")
print(f"Device of the tensor: {some_tensor.device}")

tensor([[0.3389, 0.2000, 0.5747, 0.3235],
        [0.0839, 0.8075, 0.9627, 0.0449],
        [0.4312, 0.8169, 0.2137, 0.7343]])
Datatype of tensor: torch.float32
Shape of the tensor: torch.Size([3, 4])
Device of the tensor: cpu


### Manipulating Tensors (Tensor Operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [None]:
# Addition
tensor = torch.tensor([1,2,3])
tensor + 100

tensor([101, 102, 103])

In [None]:
# Multiply
tensor * 10

tensor([10, 20, 30])

In [None]:
# Subtract
tensor - 10

tensor([-9, -8, -7])

In [None]:
# Torch built-ins
torch.mul(tensor, 10)

tensor([10, 20, 30])
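
Division is in the list of operations but is not shown in the cells above. A minimal sketch, using a separate variable `values` so the notebook's `tensor` is left untouched:

```python
import torch

values = torch.tensor([1, 2, 3])

# True division of an integer tensor returns a float tensor
divided = values / 10
print(divided, divided.dtype)

# Floor division keeps the integer dtype
floored = torch.div(values, 2, rounding_mode="floor")
print(floored, floored.dtype)

# Function forms of + and -
print(torch.add(values, 100), torch.sub(values, 10))

# None of these operations modify the original tensor
print(values)
```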

# Matrix Multiplication

There are two main ways of performing multiplication in neural networks and deep learning:
1. Element-wise
2. Matrix multiplication (dot product)

There are two main rules that matrix multiplication must satisfy:
1. The **inner dimensions** must match:
* `(3,2) @ (3,2)` won't work
* `(2,3) @ (3,2)` will work
* `(3,2) @ (2,3)` will work

2. The resulting matrix has the shape of the **outer dimensions**:
* `(2,3) @ (3,2)` -> `(2,2)`
* `(3,2) @ (2,3)` -> `(3,3)`
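
The two rules can be checked directly. The sketch below tries each shape pair from the lists above and reports either the result shape or the failure; `try_matmul` is a hypothetical helper written for this illustration.

```python
import torch

def try_matmul(shape_a, shape_b):
    """Hypothetical helper: return the result shape, or None if the shapes are incompatible."""
    a, b = torch.rand(*shape_a), torch.rand(*shape_b)
    try:
        return tuple(torch.matmul(a, b).shape)
    except RuntimeError:
        # Raised when the inner dimensions don't match
        return None

for shape_a, shape_b in [((3, 2), (3, 2)), ((2, 3), (3, 2)), ((3, 2), (2, 3))]:
    print(f"{shape_a} @ {shape_b} -> {try_matmul(shape_a, shape_b)}")
```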

In [None]:
# torch.matmul(torch.rand(3,2), torch.rand(3,2)) will give an error because the inner dimensions (2 and 3) don't match
torch.matmul(torch.rand(2,3), torch.rand(3,2))
torch.matmul(torch.rand(10,3), torch.rand(3,10))

tensor([[0.9390, 0.2322, 0.3028, 0.5742, 1.0554, 0.3257, 1.0075, 0.5071, 0.3234,
         0.5331],
        [0.6370, 0.2416, 0.3771, 0.5792, 0.7246, 0.5286, 0.8658, 0.3306, 0.2825,
         0.5730],
        [0.9926, 0.4504, 0.7978, 1.0792, 1.1428, 1.1408, 1.5497, 0.5062, 0.5154,
         1.1026],
        [1.0737, 0.4404, 0.6376, 1.0400, 1.2178, 0.9608, 1.4863, 0.5487, 0.4790,
         1.0229],
        [1.0529, 0.3168, 0.7388, 0.8186, 1.2188, 0.7945, 1.4457, 0.5736, 0.5000,
         0.8533],
        [1.2006, 0.5072, 0.8521, 1.2144, 1.3747, 1.2139, 1.7683, 0.6166, 0.5831,
         1.2246],
        [1.4189, 0.4326, 0.8433, 1.0886, 1.6260, 0.9630, 1.8489, 0.7640, 0.6232,
         1.0987],
        [0.4484, 0.1186, 0.4041, 0.3324, 0.5302, 0.3750, 0.6641, 0.2530, 0.2418,
         0.3716],
        [0.7651, 0.3641, 0.5253, 0.8493, 0.8696, 0.8440, 1.1466, 0.3814, 0.3684,
         0.8427],
        [1.1214, 0.4696, 0.8037, 1.1278, 1.2852, 1.1316, 1.6538, 0.5773, 0.5470,
         1.1400]])

In [None]:
# Element wise multiplication
print(tensor, '*', tensor)
tensor * tensor

tensor([1, 2, 3]) * tensor([1, 2, 3])


tensor([1, 4, 9])

In [None]:
# Matrix Multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [None]:
%%time
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 741 µs, sys: 0 ns, total: 741 µs
Wall time: 751 µs


In [None]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 84 µs, sys: 0 ns, total: 84 µs
Wall time: 88.5 µs


tensor(14)

### One of the most common errors in deep learning: Shape errors

In [None]:
# Shapes for matrix multiplication

tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

# torch.mm(tensor_A, tensor_B)      # torch.mm is torch.matmul for 2-D matrices only (no broadcasting)
# torch.matmul(tensor_A, tensor_B)  # will give an error: the inner dimensions (2 and 3) don't match

In [None]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a transpose.

A **transpose** switches the dimensions of a given tensor.

In [None]:
tensor_B.T

tensor([[ 7,  8,  9],
        [10, 11, 12]])

In [None]:
tensor_B.T.shape

torch.Size([2, 3])

In [None]:
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

In [None]:
torch.matmul(tensor_A, tensor_B.T).shape

torch.Size([3, 3])

## Finding the min, max, mean, sum etc (tensor aggregation)

In [None]:
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
torch.min(x), x.min() # Same thing

(tensor(0), tensor(0))

In [None]:
torch.max(x), x.max() # Same thing

(tensor(90), tensor(90))

In [None]:
# Find the mean (doesn't work on the long datatype)
# torch.mean(x) will give an error because x has the long (int64) datatype
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
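
To see the failure mentioned in the comments above without breaking the notebook flow, the sketch below catches the error and then applies the float conversion. It uses a separate variable `x_long` (an assumption made for this example) with the same values as `x`.

```python
import torch

x_long = torch.arange(0, 100, 10)  # integer arange gives dtype int64 (long)

try:
    torch.mean(x_long)
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(f"torch.mean on {x_long.dtype} failed: {err}")

# Converting to a floating-point dtype first makes the mean work
mean_a = x_long.float().mean()
mean_b = x_long.to(torch.float32).mean()
print(mean_a, mean_b)
```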

In [None]:
# Finding the sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

## Find the position of the min or max value using argmin and argmax

In [None]:
x.argmin()

tensor(0)

In [None]:
x[0]

tensor(0)

In [None]:
x.argmax()

tensor(9)

In [None]:
x[9]

tensor(90)

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape.
* View - returns a view of an input tensor with a certain shape, sharing the same memory as the original tensor.
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack).
* Squeeze - removes all dimensions of size `1` from a tensor.
* Unsqueeze - adds a dimension of size `1` to a target tensor.
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way.
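
The stacking variants named in the list differ mainly in the shape they produce. A small sketch with two illustrative 1-D tensors (`a` and `b` are names chosen for this example):

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# torch.stack adds a new dimension at the position given by dim
print(torch.stack([a, b], dim=0).shape)  # torch.Size([2, 3])
print(torch.stack([a, b], dim=1).shape)  # torch.Size([3, 2])

# vstack places the tensors on top of each other (one row per tensor)
print(torch.vstack([a, b]).shape)        # torch.Size([2, 3])

# hstack places the tensors side by side (1-D inputs are joined end to end)
print(torch.hstack([a, b]).shape)        # torch.Size([6])
```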

In [None]:
x = torch.arange(1., 13.)
x, x.shape

(tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.]),
 torch.Size([12]))

In [None]:
# Reshaping
# x_reshaped = x.reshape(1,7) will give an error: 1 x 7 = 7 elements, but x has 12, so the shapes are incompatible
x_reshaped = x.reshape(3,4)
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.,  3.,  4.],
         [ 5.,  6.,  7.,  8.],
         [ 9., 10., 11., 12.]]),
 torch.Size([3, 4]))

In [None]:
# Change the view (View shares the same memory as tensor)
# If we change z it will change x because the view shares the same memory
z = x.view(1,12)
z, z.shape

z[:,0] = 5
z,x

(tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.]]),
 tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.]))

In [None]:
# Stacks tensors on top of each other
x_stacked = torch.stack([x,x,x,x], dim = 0)
x_stacked

tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.]])

In [None]:
#Squeeze - removes all single dimensions from a target tensor.
x_reshaped, x_reshaped.shape

(tensor([[ 5.,  2.,  3.,  4.],
         [ 5.,  6.,  7.,  8.],
         [ 9., 10., 11., 12.]]),
 torch.Size([3, 4]))

In [None]:
x_reshaped.squeeze()

tensor([[ 5.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.]])

In [None]:
dem = torch.rand(1,1,10)
dem, dem.shape

(tensor([[[0.4507, 0.1207, 0.0916, 0.4599, 0.8520, 0.7102, 0.4170, 0.6173,
           0.1712, 0.9868]]]),
 torch.Size([1, 1, 10]))

In [None]:
dem.squeeze(), dem.squeeze().shape

(tensor([0.4507, 0.1207, 0.0916, 0.4599, 0.8520, 0.7102, 0.4170, 0.6173, 0.1712,
         0.9868]),
 torch.Size([10]))

In [None]:
# torch.unsqueeze() - adds a single dimension to a target tensor at a specific dim (dimension)
print(f"Previous target: {dem.squeeze()}")
print(f"Previous shape: {dem.squeeze().shape}")

#Add an extra dimension with unsqueeze
x_unsqueezed = dem.squeeze().unsqueeze(dim=0)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"\nNew Shape: {x_unsqueezed.shape}")

Previous target: tensor([0.4507, 0.1207, 0.0916, 0.4599, 0.8520, 0.7102, 0.4170, 0.6173, 0.1712,
        0.9868])
Previous shape: torch.Size([10])

New tensor: tensor([[0.4507, 0.1207, 0.0916, 0.4599, 0.8520, 0.7102, 0.4170, 0.6173, 0.1712,
         0.9868]])

New Shape: torch.Size([1, 10])


In [None]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size = (224,224,3)) #height, width, color_channels
x_permuted = x_original.permute(2,0,1) # New order: old dim 2 (color channels) -> dim 0, old dim 0 -> dim 1, old dim 1 -> dim 2
print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


In [None]:
x_original[0,0,0] = 72818
x_original[0,0,0], x_permuted[0,0,0]

(tensor(72818.), tensor(72818.))

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Indexing on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# Index on middle bracket(dim=1)
x[0][0]

tensor([1, 2, 3])

In [None]:
# Index on the inner bracket (last dimension)
# x[1][0][0] raises an IndexError because dimension 0 has size 1 (only index 0 is valid)
x[0][1][0]

tensor(4)

In [None]:
# You can use ":" to select all of the target dimensions
x[:, 0]

tensor([[1, 2, 3]])

In [None]:
# All elements of the 0th and 1st dimensions, but only index 1 of the 2nd dimension
x[:,:,1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0th dimension, but only index 1 of the 1st and 2nd dimensions
x[:,1,1]

tensor([5])

In [None]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension
x[0, 0, :]

tensor([1, 2, 3])

In [None]:
# Index on x to return 9
x[0][2][2]
# Index on x to return 3,6,9
x[:,:,2]

tensor([[3, 6, 9]])

## PyTorch tensors & NumPy

NumPy is a popular Python library for scientific and numerical computing.

Because of this, PyTorch has functionality to interact with it.

* Data in NumPy, want it in a PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [None]:
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # NumPy -> PyTorch keeps NumPy's default datatype of float64 unless specified otherwise
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Be aware: the default PyTorch datatype is float32, while NumPy's default is float64.
array.dtype

dtype('float64')

In [None]:
# Change the value of array: what will this do to tensor? (Nothing: `array + 1` builds a new array and rebinds the name)
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to numpy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
numpy_tensor.dtype

dtype('float32')

In [None]:
# Change the tensor: what happens to numpy_tensor? (Nothing: `tensor + 1` builds a new tensor and rebinds the name)
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
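
One caveat to the "(Nothing)" results above: `torch.from_numpy()` and `Tensor.numpy()` share memory with their source, so in-place changes do show up on both sides. The sketch below illustrates this with separate, illustrative variable names so the notebook's `array` and `tensor` are left alone.

```python
import numpy as np
import torch

# NumPy -> PyTorch: the tensor shares memory with the array
shared_array = np.arange(1.0, 4.0)
shared_tensor = torch.from_numpy(shared_array)
shared_array += 1                  # in-place update, no new array is created
print(shared_tensor)               # the tensor sees the change

# PyTorch -> NumPy: the array shares memory with the tensor
ones = torch.ones(3)
ones_np = ones.numpy()
ones.add_(1)                       # in-place add (note the trailing underscore)
print(ones_np)                     # the array sees the change

# torch.tensor() copies the data, so the copy is independent
copied = torch.tensor(shared_array)
shared_array += 1
print(copied, shared_tensor)
```

If independent data is needed, copying with `torch.tensor(...)` or `array.copy()` avoids the shared memory.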

## Reproducibility (trying to take the random out of random)

In short, how a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them
better representations of the data -> again -> again -> again...`

To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.

Essentially, what the random seed does is "flavour" the randomness.

In [None]:
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.9059, 0.5239, 0.2736, 0.3612],
        [0.4969, 0.1588, 0.7984, 0.4596],
        [0.7902, 0.1157, 0.1175, 0.2241]])
tensor([[0.7347, 0.1778, 0.3226, 0.4523],
        [0.6920, 0.0179, 0.7264, 0.2352],
        [0.8751, 0.4745, 0.0472, 0.0853]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Let's make some random but reproducible tensors

#Set the random seed
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)

random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
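
The cell above calls `torch.manual_seed()` before each `torch.rand()` call. A small sketch of why the second call matters: seeding once fixes the whole sequence of random numbers, so a second draw without re-seeding continues that sequence instead of repeating it.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)   # no re-seed: continues the same random sequence

torch.manual_seed(42)       # re-seed: restart the sequence from the beginning
third = torch.rand(3, 4)

print(torch.equal(first, second))  # False
print(torch.equal(first, third))   # True
```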


## Running tensors and PyTorch objects on the GPUs (and making faster computations)

GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working together behind the scenes.

### 1. Getting a GPU

1. Easiest - Use Google Colab for a free GPU (options to upgrade as well)
2. Use your own GPU - takes a little bit of setup and requires the investment of purchasing a GPU.
3. Use cloud computing - GCP, AWS, Azure, these services allow you to rent computers on the cloud and access them.

### Check for GPU access with PyTorch

In [None]:
# Check for GPU access with PyTorch
import torch
torch.cuda.is_available()

True

Since PyTorch can run on either the GPU or the CPU, it's best practice to set up device-agnostic code.
E.g. run on the GPU if available, otherwise run on the CPU.

In [None]:
# Setup device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count the number of devices
torch.cuda.device_count()

1

## Putting tensors (and models) on the GPU

The reason we want our tensors/models on the GPU is that using a GPU results in faster computations.

In [None]:
# Create a tensor (default CPU)
tensor = torch.tensor([1,2,3])

# Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [None]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
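
The heading mentions models as well, and the same `.to(device)` call applies to them. The sketch below uses `nn.Linear` as a stand-in model (an assumption, since the notebook does not define a model at this point) and moves both the model and its input to the same device.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in model for illustration: a single linear layer
model = nn.Linear(in_features=3, out_features=1).to(device)

# The input must live on the same device as the model's parameters
inputs = torch.tensor([1.0, 2.0, 3.0]).to(device)
output = model(inputs)

print(next(model.parameters()).device, inputs.device, output.device)
```

Keeping the model and its inputs on the same device avoids the third error from the list earlier (tensors not on the right device).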

## Moving tensors back to the CPU

In [None]:
# If tensor is on GPU, can't transform it to NumPy
# tensor_on_gpu.numpy() will give an error

# To fix the GPU tensor with NumPy issue, we can first set it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')