<a href="https://colab.research.google.com/github/DevP-ai/Pytorch-For-Deep-Learning/blob/main/00_pytorch_fundamental.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals
Resource notebook: https://www.learnpytorch.io/

In [None]:
import torch

In [None]:
torch.__version__

'2.0.1+cu118'

In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Introduction to Tensors

## Creating Tensor

PyTorch tensors are created using `torch.tensor()`. See the documentation: https://pytorch.org/docs/stable/tensors.html

In [None]:
#Scalar 
scalar=torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim #dimension of the tensor

0

In [None]:
scalar.dtype

torch.int64

In [None]:
#Get the Python number within a tensor (only works with one-element tensors)
scalar.item()

7

In [None]:
#Vector
vector=torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
vector.ndim # number of dimension of the vector

1

In [None]:
vector.shape #Shape of the vector

torch.Size([2])

In [None]:
#Matrix
MATRIX=torch.tensor([[1,3],
                     [4,5]])
MATRIX

tensor([[1, 3],
        [4, 5]])

In [None]:
MATRIX.ndim #dimension of the matrix

2

In [None]:
MATRIX.shape # Shape of the matrix

torch.Size([2, 2])

In [None]:
MATRIX[0]  # Retrieve the first row (index 0) of the matrix

tensor([1, 3])

In [None]:
MATRIX[1]  # Retrieve the second row (index 1) of the matrix

tensor([4, 5])

In [None]:
# Tensor
TENSOR=torch.tensor([[[4,5,6],
                      [6,7,8],
                      [10,11,12]]])
TENSOR

tensor([[[ 4,  5,  6],
         [ 6,  7,  8],
         [10, 11, 12]]])

In [None]:
TENSOR.ndim #dimension

3

In [None]:
TENSOR.shape #shape

torch.Size([1, 3, 3])


Alright, it outputs `torch.Size([1, 3, 3])`.

The dimensions go outer to inner.

That means the outermost dimension holds 1 element, and that element is a 3 by 3 matrix.
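One way to check the outer-to-inner reading is to index into the tensor one level at a time. The sketch below rebuilds the same values under a hypothetical name, `demo`, so it runs on its own.

```python
import torch

# Same values as TENSOR above, under a hypothetical name so this cell is self-contained
demo = torch.tensor([[[4, 5, 6],
                      [6, 7, 8],
                      [10, 11, 12]]])

outer = demo[0]        # the single 3x3 block held by the outermost dimension
row = demo[0][1]       # the second row of that block
value = demo[0][1][2]  # the third element of that row

# Each index removes one dimension, reading the shape from left to right
print(demo.shape, outer.shape, row.shape, value.shape)
print(value.item())
```

Each level of indexing strips the leftmost entry from the shape, which is why the shape reads outer to inner.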

### Random tensors

Why random tensors?

Random tensors are important because the way many neural networks learn is that they start with tensors full of random numbers and then adjust those random numbers to better represent the data.

`Start with random number -> look at data -> update random numbers -> look at data -> update random numbers`

We can create a random tensor using `torch.rand()` and passing the `size` parameter.

In [None]:
#Create a random tensor of size (3,4)
random_tensor=torch.rand(size=(3,4))
random_tensor,random_tensor.dtype

(tensor([[0.1481, 0.6836, 0.2349, 0.9521],
         [0.4340, 0.6642, 0.9555, 0.5876],
         [0.6227, 0.2728, 0.8781, 0.6951]]),
 torch.float32)

In [None]:
#Create a random tensor of size (224,224,3)
random_image_size_tensor=torch.rand(size=(224,224,3))
random_image_size_tensor.shape,random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

## Zeros and Ones

Sometimes we want to fill tensors with zeros or ones.

This happens a lot with masking (like masking some of the values in one tensor with zeros to let a model know not to learn them).

Create a tensor full of zeros or ones with `torch.zeros()` or `torch.ones()`, passing the `size` parameter.

In [None]:
#Create a tensor of all zeros
zeros=torch.zeros(size=(3,4))
zeros,zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)

In [None]:
#Create a tensor of all ones
ones=torch.ones(size=(3,4))
ones,ones.dtype

(tensor([[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]),
 torch.float32)
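As a minimal sketch of the masking idea mentioned above, a tensor of ones can act as a "keep everything" mask, and setting some positions to zero hides the matching values. The names `values`, `mask`, and `masked` are illustrative only.

```python
import torch

values = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

# Start from all ones (keep everything), then zero out the last column
mask = torch.ones(size=(2, 3))
mask[:, -1] = 0

# Element-wise multiplication: positions where the mask is 0 become 0
masked = values * mask
print(masked)
```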

## Creating a range and tensors like other tensors

We can use `torch.arange(start,end,step)`

In [None]:
r=torch.range(0,10)  # torch.range() is deprecated
r


tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [None]:
#use torch.arange()
zero_to_ten=torch.arange(start=0,end=10,step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Sometimes we might want a tensor of a certain type with the same shape as another tensor.

We can use `torch.zeros_like(input)` or `torch.ones_like(input)`

In [None]:
ten_zeros=torch.zeros_like(zero_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [None]:
ten_ones=torch.ones_like(zero_to_ten)
ten_ones

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
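By default the `*_like` functions copy the dtype of the input as well as its shape. The sketch below assumes we want the same shape but a different type, which can be requested with the `dtype` keyword. The names `source` and `float_zeros` are illustrative.

```python
import torch

source = torch.arange(start=0, end=10, step=1)  # int64 tensor with 10 elements

# Same shape as source, but explicitly request float32 instead of inheriting int64
float_zeros = torch.zeros_like(source, dtype=torch.float32)

print(source.dtype, float_zeros.dtype, float_zeros.shape)
```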

## Tensor datatypes

In [None]:
float_32_tensor=torch.tensor([3.0,6.0,9.0],
                             dtype=None, # defaults to None, which infers the dtype from the data (torch.float32 for Python floats)
                             device=None, # defaults to None, which uses the current default device (CPU unless changed)
                             requires_grad=False) # if True, operations performed on the tensor are recorded for gradient computation

float_32_tensor.dtype,float_32_tensor.device

(torch.float32, device(type='cpu'))

In [None]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16

## Getting information from tensor

In [None]:
# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.6341, 0.9434, 0.1437, 0.9016],
        [0.7301, 0.2453, 0.0899, 0.4950],
        [0.9425, 0.3555, 0.3599, 0.4394]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


##  Manipulating tensors (tensor operations)

In [None]:
#Addition
tensor=torch.tensor([1,2,4])
tensor+10

tensor([11, 12, 14])

In [None]:
#Multiplication
tensor*10

tensor([10, 20, 40])


Notice how the tensor values above didn't end up being `tensor([110, 120, 140])`. This is because the values inside the tensor don't change unless they're reassigned.

In [None]:
# Tensors don't change unless reassigned
tensor

tensor([1, 2, 4])

In [None]:
# Subtract and reassign
tensor = tensor - 10
tensor

tensor([-9, -8, -6])

In [None]:
# Add and reassign
tensor = tensor + 10
tensor

tensor([1, 2, 4])


PyTorch also has a bunch of built-in functions like `torch.mul()` (short for multiplication; `torch.multiply()` is an alias) and `torch.add()` to perform basic operations.

In [None]:
# Can also use torch functions
torch.multiply(tensor, 10)

tensor([10, 20, 40])

In [None]:
# Original tensor is still unchanged 
tensor

tensor([1, 2, 4])

In [None]:
# Element-wise multiplication (each element multiplies its equivalent, index 0->0, 1->1, 2->2)
print(tensor, "*", tensor)
print("Equals:", tensor * tensor)

tensor([1, 2, 4]) * tensor([1, 2, 4])
Equals: tensor([ 1,  4, 16])


## Matrix multiplication

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication functionality in the `torch.matmul()` method.



In [None]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])


The difference between element-wise multiplication and matrix multiplication is the addition of values: matrix multiplication multiplies matching elements and then sums the products.

In [None]:
# Element-wise matrix multiplication
tensor * tensor

tensor([1, 4, 9])

In [None]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [None]:
# Can also use the "@" symbol for matrix multiplication, though not recommended
tensor @ tensor

tensor(14)
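To make the "addition of values" concrete, the sketch below computes the element-wise product, sums it, and compares the result with `torch.matmul()` for the same 1-D tensor. The variable names are illustrative.

```python
import torch

t = torch.tensor([1, 2, 3])

elementwise = t * t          # multiplies matching positions, no summing
summed = elementwise.sum()   # adds the products together
dot = torch.matmul(t, t)     # for two 1-D tensors, matmul does both steps

print(elementwise, summed, dot)
print(torch.equal(summed, dot))
```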

In [None]:
%%time
# Matrix multiplication by hand 
# (avoid doing operations with for loops at all cost, they are computationally expensive)
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
value

CPU times: user 265 µs, sys: 0 ns, total: 265 µs
Wall time: 271 µs


tensor(14)

In [None]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 39 µs, sys: 0 ns, total: 39 µs
Wall time: 44.3 µs


tensor(14)

## One of the most common errors in deep learning (shape errors)

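Much of deep learning is multiplying and performing other operations on matrices, and matrices have strict rules about which shapes can be combined, so shape mismatches are one of the most common errors you'll run into. For matrix multiplication, the inner dimensions must match: `(3, 2) @ (2, 3)` works, but `(3, 2) @ (3, 2)` does not. The result takes the shape of the outer dimensions, so `(3, 2) @ (2, 3)` gives `(3, 3)`.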

In [None]:
# Shapes need to be compatible: inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B) # (this will error)

RuntimeError: ignored
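A small sketch of how the inner-dimension rule could be checked before multiplying. `can_matmul` is a hypothetical helper written for this example (it only handles 2-D tensors) and is not part of PyTorch.

```python
import torch

def can_matmul(a, b):
    """Hypothetical helper: True if 2-D tensors a and b have matching inner dimensions."""
    return a.shape[-1] == b.shape[0]

a = torch.rand(3, 2)
b = torch.rand(3, 2)

print(can_matmul(a, b))    # inner dimensions are 2 and 3
print(can_matmul(a, b.T))  # inner dimensions are 2 and 2

# The same mismatch surfaces as a RuntimeError if we multiply anyway
try:
    torch.matmul(a, b)
    raised = False
except RuntimeError as err:
    raised = True
    print(f"Shape error: {err}")
```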

We can make matrix multiplication work between tensor_A and tensor_B by making their inner dimensions match.

One of the ways to do this is with a transpose (switch the dimensions of a given tensor).

1. `torch.transpose(input, dim0, dim1)` - where `input` is the desired tensor to transpose and `dim0` and `dim1` are the dimensions to be swapped.

2. `tensor.T` - where `tensor` is the desired tensor to transpose.
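For a 2-D tensor the two options give the same result. The sketch below compares them on a matrix with the same values as `tensor_B`, stored under the illustrative name `m`.

```python
import torch

m = torch.tensor([[7., 10.],
                  [8., 11.],
                  [9., 12.]])

via_function = torch.transpose(m, 0, 1)  # swap dimension 0 and dimension 1
via_attribute = m.T                      # shorthand for the same swap on a 2-D tensor

print(m.shape, via_function.shape)
print(torch.equal(via_function, via_attribute))
```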

In [None]:
# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [None]:
# View tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [None]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output) 
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


You can also use `torch.mm()`, which performs the same matrix multiplication but only accepts 2-D tensors (unlike `torch.matmul()`, it does not broadcast).

In [None]:
# torch.mm gives the same result as matmul for 2-D tensors
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])


Neural networks are full of matrix multiplications and dot products.

The `torch.nn.Linear()` module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input `x` and a weights matrix `A`.

In [None]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)
# This uses matrix multiplication
linear = torch.nn.Linear(in_features=2, # in_features = matches inner dimension of input 
                         out_features=6) # out_features = describes outer value 
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
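To see the matrix multiplication inside the layer, the sketch below recomputes the output by hand from the layer's `weight` and `bias` attributes. It assumes the usual formulation `y = x @ weight.T + bias`, with `weight` stored as `(out_features, in_features)`. The names `layer`, `inputs`, `from_layer`, and `by_hand` are illustrative.

```python
import torch

torch.manual_seed(42)
layer = torch.nn.Linear(in_features=2, out_features=6)

inputs = torch.tensor([[1., 2.],
                       [3., 4.],
                       [5., 6.]])

from_layer = layer(inputs)

# Manual version: (3, 2) @ (2, 6) + (6,) -> (3, 6)
by_hand = torch.matmul(inputs, layer.weight.T) + layer.bias

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(from_layer, by_hand))
```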


## Finding the min, max, mean, sum, etc (aggregation)

In [None]:
# Create a tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error
print(f"Mean: {x.type(torch.float32).mean()}") # won't work without float datatype
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450
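The mean needs a floating-point datatype, and there is more than one way to get there. The sketch below shows three options that should give the same result; which one to use is mostly a matter of style.

```python
import torch

values = torch.arange(0, 100, 10)  # int64 by default

mean_a = values.type(torch.float32).mean()        # convert, then reduce
mean_b = values.float().mean()                    # .float() is shorthand for float32
mean_c = torch.mean(values, dtype=torch.float32)  # cast as part of the reduction

print(mean_a, mean_b, mean_c)
```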


## Positional min/max

We can also find the index of a tensor where the max or minimum occurs with `torch.argmax()` and `torch.argmin()` respectively.


In [None]:
# Create a tensor
tensor = torch.arange(10, 100, 10)
print(f"Tensor: {tensor}")

# Returns index of max and min values
print(f"Index where max value occurs: {tensor.argmax()}")
print(f"Index where min value occurs: {tensor.argmin()}")

Tensor: tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])
Index where max value occurs: 8
Index where min value occurs: 0
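The returned index can be used to look the value back up, and on tensors with more than one dimension the `dim` argument picks the axis to search along. A short sketch with illustrative names:

```python
import torch

values = torch.arange(10, 100, 10)

# Indexing with the argmax result recovers the maximum value itself
max_index = values.argmax()
print(values[max_index], values.max())

# With dim=1, argmax returns one index per row
matrix = torch.tensor([[1, 9, 3],
                       [7, 2, 8]])
row_argmax = matrix.argmax(dim=1)
print(row_argmax)
```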


## Change tensor datatype

In [None]:
# Create a tensor and check its datatype
tensor = torch.arange(10., 100., 10.)
tensor.dtype

torch.float32

In [None]:
# Create a float16 tensor
tensor_float16 = tensor.type(torch.float16)
tensor_float16

tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)

In [None]:
# Create an int8 tensor
tensor_int8 = tensor.type(torch.int8)
tensor_int8

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)

## Reshaping, stacking, squeezing and unsqueezing

In [None]:
# Create a tensor
import torch
x = torch.arange(1., 8.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [None]:
# Change view (keeps same data as original but changes view)
# See more: https://stackoverflow.com/a/54507446/7900723
z = x.view(1, 7)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [None]:
# Changing z changes x
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))
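The heading also mentions stacking, squeezing, and unsqueezing. A minimal sketch of those three operations follows, using a fresh tensor named `base` (an illustrative name) so it does not depend on the modified `x` above.

```python
import torch

base = torch.arange(1., 8.)  # shape [7]

# Stack copies of the tensor along a new dimension
stacked = torch.stack([base, base, base], dim=0)  # shape [3, 7]

# Add a dimension of size 1 at position 0
unsqueezed = base.unsqueeze(dim=0)                # shape [1, 7]

# Remove all dimensions of size 1
squeezed = unsqueezed.squeeze()                   # shape [7]

print(stacked.shape, unsqueezed.shape, squeezed.shape)
```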

## PyTorch tensors & NumPy

In [None]:
# NumPy array to tensor
import torch
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# array + 1 creates a new array, so the tensor keeps its original values
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [None]:
# Tensor to NumPy array
tensor = torch.ones(7) # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
# tensor + 1 creates a new tensor, so the array keeps its original values
tensor = tensor + 1
tensor, numpy_tensor

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
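The cells above leave the other object unchanged because `array + 1` and `tensor + 1` build new objects. `torch.from_numpy()` itself shares memory with the source array, so an in-place change does show up on both sides. The sketch below contrasts that with `torch.tensor()`, which copies the data. The names `source`, `shared`, and `copied` are illustrative.

```python
import numpy as np
import torch

source = np.arange(1.0, 8.0)

shared = torch.from_numpy(source)  # shares memory with the NumPy array
copied = torch.tensor(source)      # makes an independent copy

source += 1  # in-place update: modifies the existing array buffer

print(source)
print(shared)  # follows the in-place change
print(copied)  # still holds the original values
```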

## Reproducibility (trying to take the random out of random)

In [None]:
import torch

# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B

Tensor A:
tensor([[0.2666, 0.6274, 0.2696, 0.4414],
        [0.2969, 0.8317, 0.1053, 0.2695],
        [0.3588, 0.1994, 0.5472, 0.0062]])

Tensor B:
tensor([[0.9516, 0.0753, 0.8860, 0.5832],
        [0.3376, 0.8090, 0.5779, 0.9040],
        [0.5547, 0.3423, 0.6343, 0.3644]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

Just as we might've expected, the tensors come out with different values.

But what if we wanted to create two random tensors with the same values?

In [None]:
import torch
import random

# Set the random seed
RANDOM_SEED=42 # try changing this to different values and see what happens to the numbers below
torch.manual_seed(seed=RANDOM_SEED) 
random_tensor_C = torch.rand(3, 4)

# Have to reset the seed every time a new rand() is called 
# Without this, tensor_D would be different to tensor_C 
torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and seeing what happens
random_tensor_D = torch.rand(3, 4)

print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
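The sketch below shows why the seed has to be reset: a second call to `torch.rand()` without reseeding continues the same random stream and produces different values. The names `first`, `second`, and `third` are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)  # no reseed: continues the random stream

torch.manual_seed(42)
third = torch.rand(3, 4)   # reseeded: restarts the stream from the same point

print(torch.equal(first, second))
print(torch.equal(first, third))
```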

# Getting PyTorch to run on the GPU

In [None]:
# Check for GPU
import torch
torch.cuda.is_available()

True

In [None]:
# Set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# Count number of devices
torch.cuda.device_count()

1

## Putting tensors (and models) on the GPU

In [None]:
# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])

# Tensor not on GPU
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3]) cpu


tensor([1, 2, 3], device='cuda:0')

## Moving tensors back to the CPU

In [None]:
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()

TypeError: ignored


Instead, to get a tensor back to CPU and usable with NumPy we can use `Tensor.cpu()`.

This copies the tensor to CPU memory so it's usable with CPUs.

In [None]:
# Instead, copy the tensor back to cpu
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])


The above returns a copy of the GPU tensor in CPU memory, so the original tensor is still on the GPU.

In [None]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
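Putting these pieces together, a device-agnostic sketch might look like the following. It should run whether or not a GPU is present, since `.cpu()` simply returns the tensor unchanged when it is already on the CPU. The names `t`, `doubled`, and `as_numpy` are illustrative.

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the chosen device
t = torch.tensor([1, 2, 3], device=device)
doubled = t * 2  # runs on whichever device holds t

# Move back to the CPU before converting to NumPy
as_numpy = doubled.cpu().numpy()

print(t.device, as_numpy)
```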

## Exercise

### Create a random tensor with shape (7, 7)

In [None]:
#import torch 
import torch
#Create random tensor
x=torch.rand(size=(7,7))
x,x.shape,x.dtype,x.ndim

(tensor([[0.6648, 0.4388, 0.6558, 0.1240, 0.4555, 0.4320, 0.0959],
         [0.9237, 0.6994, 0.8865, 0.1393, 0.8143, 0.1979, 0.8855],
         [0.3343, 0.7225, 0.5302, 0.0061, 0.3070, 0.2643, 0.3227],
         [0.8878, 0.4780, 0.6033, 0.7427, 0.2313, 0.3060, 0.6087],
         [0.9153, 0.1833, 0.5366, 0.8879, 0.3969, 0.0740, 0.4959],
         [0.2115, 0.1177, 0.9666, 0.3463, 0.5029, 0.8460, 0.8981],
         [0.7044, 0.0807, 0.6602, 0.6961, 0.3827, 0.1809, 0.8744]]),
 torch.Size([7, 7]),
 torch.float32,
 2)

### Perform a matrix multiplication on the (7, 7) tensor above with another random tensor with shape (1, 7) (hint: you may have to transpose the second tensor).

In [None]:
#Create another random tensor
TENSOR=torch.rand(size=(1,7))
TENSOR

tensor([[0.3201, 0.4248, 0.9468, 0.6622, 0.5084, 0.9840, 0.1286]])

In [None]:
t=TENSOR.T
t

tensor([[0.3201],
        [0.4248],
        [0.9468],
        [0.6622],
        [0.5084],
        [0.9840],
        [0.1286]])

In [None]:
# Matrix multiplication: (7, 7) @ (7, 1) -> (7, 1)
result=torch.matmul(x,t)
result.shape

torch.Size([7, 1])