<a href="https://colab.research.google.com/github/darshnkd/pytorch-fundamentals/blob/main/pytorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## PyTorch Fundamentals

In [None]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


In [None]:
import torch
print(torch.__version__)

2.4.0+cu121


## Introduction to Tensors

### Creating tensors

In [None]:
# scalar
scalar = torch.tensor(10)
scalar

tensor(10)

In [None]:
scalar.ndim

0

In [None]:
# get tensor back as python int
scalar.item()

10

In [None]:
# vector
vector = torch.tensor([10,24])
vector

tensor([10, 24])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# MATRIX
MATRIX = torch.tensor([[1,5],
                      [5,8]])
MATRIX

tensor([[1, 5],
        [5, 8]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
MATRIX[1]

tensor([5, 8])

In [None]:
TENSOR = torch.tensor([[[1,2,3],
                       [4,5,6],
                       [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

### Random tensor

Why random tensors?

Random tensors are important because neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`
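
The loop above can be sketched in a few lines. This is only an illustration under assumptions that are not in the notebook: the data (`y = 3 * x`), the seed, the learning rate of `0.5` and the 200 steps are all made up for the example.

```python
import torch

torch.manual_seed(0)  # hypothetical seed, only so the sketch is repeatable

# Toy data: the "true" weight we hope to recover is 3.0 (an assumption for this example)
x = torch.linspace(0, 1, 20)
y = 3.0 * x

# Start with a random number...
w = torch.rand(1, requires_grad=True)
start_error = ((w * x - y) ** 2).mean().item()

# ...then repeatedly look at the data and update the random number
for _ in range(200):
    loss = ((w * x - y) ** 2).mean()  # how wrong is the current weight?
    loss.backward()                   # compute the gradient of the loss w.r.t. w
    with torch.no_grad():
        w -= 0.5 * w.grad             # nudge w in the direction that lowers the loss
    w.grad.zero_()                    # reset the gradient for the next step

end_error = ((w * x - y) ** 2).mean().item()
print(f"error before: {start_error:.4f}, error after: {end_error:.8f}")
print(f"learned weight: {w.item():.4f}")
```

The weight starts as a random value between 0 and 1 and ends up close to 3.0, which is the "update random numbers" step in miniature.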

In [None]:
# Create random tensor of size (3,4)
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.4980, 0.9787, 0.3276, 0.2751],
        [0.4128, 0.6367, 0.2164, 0.8503],
        [0.3674, 0.9757, 0.5627, 0.8998]])

In [None]:
random_tensor.ndim

2

In [None]:
# create random tensor with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, color channels (RGB)
random_image_size_tensor.shape , random_image_size_tensor.ndim


(torch.Size([224, 224, 3]), 3)

### Zeros and ones

In [None]:
zeros = torch.zeros(size=(4,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
ones = torch.ones(size=(5,3))
ones

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

In [None]:
ones.dtype

torch.float32

### Creating a range of tensors and tensors-like

In [None]:
# creating tensor using arange()
one_to_ten = torch.arange(start=1,end=11,step=1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
# create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input = one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Dealing with tensor datatypes

**NOTE**: Tensor datatypes are one of the 3 big errors you'll run into with PyTorch & deep learning:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [None]:
# The default floating-point datatype in PyTorch is float32

In [None]:
# Float 32 tensor
float_32_tensor = torch.tensor([2.,4.,5.,56.,6.],
                               dtype=None,     # datatype of the tensor (e.g. torch.float16); None infers it from the data
                               device=None,    # device the tensor lives on (e.g. "cpu" or "cuda")
                               requires_grad=False) # whether or not to track gradients with this tensor's operations
float_32_tensor

tensor([ 2.,  4.,  5., 56.,  6.])

In [None]:
float_32_tensor.dtype

torch.float32

In [None]:
# convert to float16
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([ 2.,  4.,  5., 56.,  6.], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([   4.,   16.,   25., 3136.,   36.])

In [None]:
int_32_tensor = torch.tensor([1,2,3,4,5], dtype=torch.int32)
int_32_tensor

tensor([1, 2, 3, 4, 5], dtype=torch.int32)

In [None]:
int_32_tensor * float_32_tensor

tensor([  2.,   8.,  15., 224.,  30.])
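
The two products above work even though the dtypes differ, because element-wise operations promote their inputs to a common dtype. A small sketch of how to inspect that, using `torch.result_type` (the variable names below are made up for the example):

```python
import torch

float_32 = torch.tensor([2., 4., 5.])                 # float32 by default
float_16 = float_32.type(torch.float16)               # same values, half precision
int_32 = torch.tensor([1, 2, 3], dtype=torch.int32)

# torch.result_type reports the dtype an operation on the two inputs would produce
mixed_float = torch.result_type(float_16, float_32)
mixed_int = torch.result_type(int_32, float_32)
print(mixed_float, mixed_int)

# The actual product carries the promoted dtype
product = int_32 * float_32
print(product.dtype)
```

Not every operation promotes like this. Some, such as `torch.matmul`, expect both inputs to already share a dtype, which is one way the "not the right datatype" error shows up.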

### Getting information from tensors (tensor attribute)

1. Tensors not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors not the right shape - to get the shape of a tensor, use `tensor.shape`
3. Tensors not on the right device - to get the device of a tensor, use `tensor.device`

In [None]:
some_tensor = torch.rand(4,4)
some_tensor

tensor([[0.8981, 0.4722, 0.6077, 0.6294],
        [0.3986, 0.3900, 0.9575, 0.4441],
        [0.0223, 0.9245, 0.4182, 0.4571],
        [0.5373, 0.7531, 0.7686, 0.0219]])

In [None]:
# Find the details of the tensor
print(some_tensor)
print(f"Datatype of tensor : {some_tensor.dtype}")
print(f"Shape of tensor : {some_tensor.shape}")
print(f"Device of tensor : {some_tensor.device}")

tensor([[0.8981, 0.4722, 0.6077, 0.6294],
        [0.3986, 0.3900, 0.9575, 0.4441],
        [0.0223, 0.9245, 0.4182, 0.4571],
        [0.5373, 0.7531, 0.7686, 0.0219]])
Datatype of tensor : torch.float32
Shape of tensor : torch.Size([4, 4])
Device of tensor : cpu


### Manipulating tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [None]:
# create a tensor and add 10
tensor = torch.tensor([1,3,3])
tensor + 10

tensor([11, 13, 13])

In [None]:
# Multiply tensor by 10
tensor * 10

tensor([10, 30, 30])

In [None]:
# subtract 10 from the tensor
tensor - 10

tensor([-9, -7, -7])

In [None]:
# Try out PyTorch built-in functions
torch.mul(tensor,10)

tensor([10, 30, 30])

In [None]:
torch.add(tensor , 20)

tensor([21, 23, 23])

### Matrix multiplication

Two main ways to perform multiplication in neural networks and deep learning:
1. Element-wise multiplication
2. Matrix multiplication (dot product)

### Two main rules for matrix multiplication
1. The **inner dimensions** must match:
* (3,2) @ (2,3) --> (3,3)
2. The resulting matrix has the shape of the **outer dimensions**:
* (2,3) @ (3,2) --> (2,2)
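
Both rules can be checked directly on small random tensors. A minimal sketch (the names `a` and `b` are just placeholders):

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(2, 3)

# Inner dimensions match (2 and 2), result takes the outer dimensions (3, 3)
print((a @ b).shape)

# Inner dimensions match (3 and 3), result takes the outer dimensions (2, 2)
print((b @ a).shape)

# Inner dimensions do not match (2 and 3), so PyTorch raises a RuntimeError
try:
    a @ a
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"shape error: {err}")
```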

In [None]:
# element wise multiplication
tensor * tensor

tensor([1, 9, 9])

In [None]:
# using matmul func
torch.matmul(tensor,tensor)

tensor(19)

**Using a Python loop for matrix multiplication takes more time than using `torch.matmul`, which is vectorized.**
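
One rough way to see the difference is to time a hand-written loop against `torch.matmul` on the same inputs. The helper `matmul_with_loops` and the matrix sizes below are made up for this sketch, and the timings will vary from machine to machine.

```python
import time
import torch

torch.manual_seed(0)  # hypothetical seed so the inputs are repeatable
a = torch.rand(30, 20)
b = torch.rand(20, 10)

def matmul_with_loops(m1, m2):
    """Naive triple-loop matrix multiplication, for comparison only."""
    rows, inner = m1.shape
    cols = m2.shape[1]
    out = torch.zeros(rows, cols)
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for k in range(inner):
                total += (m1[i, k] * m2[k, j]).item()
            out[i, j] = total
    return out

start = time.perf_counter()
loop_result = matmul_with_loops(a, b)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast_result = torch.matmul(a, b)
fast_seconds = time.perf_counter() - start

print(f"loops:  {loop_seconds:.6f} s")
print(f"matmul: {fast_seconds:.6f} s")

# Both approaches compute the same matrix, up to floating point error
print(torch.allclose(loop_result, fast_result, atol=1e-4))
```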

### One of the most common errors in deep learning is shape errors

In [None]:
# shapes for matrix multiplication
tensor_A = torch.tensor([[3,4],
                         [5,7],
                         [9,6]])

tensor_B = torch.tensor([[9,3],
                         [5,8],
                         [1,5]])

# torch.mm(tensor_A, tensor_B)  # torch.mm also multiplies 2-D matrices, but unlike torch.matmul it does not broadcast
torch.matmul(tensor_A, tensor_B)  # fails: the inner dimensions (2 and 3) do not match

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

To fix our tensor shape issue, we can manipulate the shape of one of our tensors using a **transpose**, which switches the axes (dimensions) of a tensor.

In [None]:
tensor_B , tensor_B.shape

(tensor([[9, 3],
         [5, 8],
         [1, 5]]),
 torch.Size([3, 2]))

In [None]:
tensor_B.T , tensor_B.T.shape

(tensor([[9, 5, 1],
         [3, 8, 5]]),
 torch.Size([2, 3]))

In [None]:
mm = torch.matmul(tensor_A, tensor_B.T)
mm

tensor([[39, 47, 23],
        [66, 81, 40],
        [99, 93, 39]])

In [None]:
mm.shape

torch.Size([3, 3])

### Tensor Aggregation (finding min, max, mean, sum)

In [None]:
# create tensor
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
# find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [None]:
# find the max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [None]:
# find the mean ---> torch.mean() requires a floating point (or complex) dtype, so convert the int64 tensor first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))
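
A short sketch of what happens without the conversion, and two ways around it. The second one relies on the `dtype` keyword of `torch.mean`, which casts the input before averaging.

```python
import torch

x = torch.arange(0, 100, 10)  # integer tensor (int64)

# Calling mean directly on an integer tensor raises a RuntimeError
try:
    torch.mean(x)
    int_mean_failed = False
except RuntimeError as err:
    int_mean_failed = True
    print(f"mean on {x.dtype} failed: {err}")

# Option 1: cast to a floating point dtype first
mean_via_cast = x.type(torch.float32).mean()

# Option 2: let torch.mean cast the input through its dtype argument
mean_via_dtype = torch.mean(x, dtype=torch.float32)

print(mean_via_cast, mean_via_dtype)
```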

In [None]:
# find the sum
torch.sum(x) , x.sum()

(tensor(450), tensor(450))

### Finding the positional min and max

In [None]:
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
# find the position in the tensor that has the minimum value with argmin() --> returns the index of that value
x.argmin()

tensor(0)

In [None]:
# find the position in tensor that has the maximum value with argmax()
x.argmax()

tensor(9)

### Reshaping, stacking, squeezing, and unsqueezing tensors

* ``reshape``: reshape an input tensor to a defined shape
* ``view``: return a view of an input tensor with a certain shape, sharing the same memory as the original
* ``stack``: combine multiple tensors on top of each other (``vstack``) or side by side (``hstack``)
* ``squeeze``: remove all dimensions of size `1` from a tensor
* ``unsqueeze``: add a dimension of size `1` to a tensor
* ``permute``: return a view of the input with its dimensions permuted (swapped) in a certain way
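
The difference between `view` and `reshape` matters once a tensor is no longer contiguous in memory. A minimal sketch, with tensor names chosen only for this example:

```python
import torch

base = torch.arange(6).reshape(2, 3)

# view shares memory with the original: writing through the view changes base
shared = base.view(3, 2)
shared[0, 0] = 10
print(base[0, 0])

# A transpose is a non-contiguous view, so it cannot be flattened with view()
transposed = base.T
try:
    transposed.view(6)
    view_failed = False
except RuntimeError as err:
    view_failed = True
    print(f"view failed: {err}")

# reshape still works here because it falls back to copying the data
copied = transposed.reshape(6)
copied[0] = -1   # changes only the copy
print(copied, base[0, 0])
```

So `reshape` is the safer default, while `view` is useful when you want a guarantee that no copy is made.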

In [None]:
# let's create tensor
import torch
x = torch.arange(1.,10.)
x , x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# add extra dimension
x_reshaped = x.reshape(1,9)
x_reshaped , x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
# create a view of x (view_as(x) keeps the same shape; the view shares memory with x)
z = x.view_as(x)
z , z.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# changing z changes x (because a view of a tensor shares the same memory as the original)
z[0] = 5
z,x

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [None]:
# Stack tensors along a new dimension (dim=1 puts each copy in its own column; dim=0 would put them on top of each other)
x_stacked = torch.stack([x,x,x,x],dim=1)
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [None]:
# hstack: join 1-D tensors end to end
x_hstack = torch.hstack([x,x,x,x])
x_hstack

tensor([5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.,
        5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.])

In [None]:
# vstack: stack tensors on top of each other as rows
x_vstack = torch.vstack([x,x,x,x])
x_vstack

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [None]:
# Squeeze
x_reshaped , x_reshaped.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
x_squeezed = x_reshaped.squeeze()
x_squeezed , x_squeezed.shape

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# torch.squeeze
print(f"previous tensor : {x_reshaped}")
print(f"shape of previous tensor : {x_reshaped.shape}")

# remove all dimensions of size 1 (squeeze)
print(f"squeezed tensor : {x_squeezed}")
print(f"shape of squeezed tensor : {x_squeezed.shape}")

previous tensor : tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
shape of previous tensor : torch.Size([1, 9])
squeezed tensor : tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
shape of squeezed tensor : torch.Size([9])


In [None]:
# torch.unsqueeze
print(f"previous tensor : {x_squeezed}")
print(f"shape of previous tensor : {x_squeezed.shape}")

# add an extra dimension of size 1 at dim 0 (unsqueeze)
print(f"unsqueezed tensor : {x_squeezed.unsqueeze(dim=0)}")
print(f"shape of unsqueezed tensor : {x_squeezed.unsqueeze(dim=0).shape}")


previous tensor : tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
shape of previous tensor : torch.Size([9])
unsqueezed tensor : tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
shape of unsqueezed tensor : torch.Size([1, 9])


In [None]:
# torch.permute - rearrange the dimensions of a target tensor in a specified order
x_original = torch.rand(size=(224,224,3)) # height, width, colour channels

# permute the original tensor to rearrange the axis (or dim) order
x_permuted = x_original.permute(2,0,1) # shifts axis 0->1, 1->2, 2->0

print(f"previous shape : {x_original.shape}")
print(f"permuted shape : {x_permuted.shape}")

previous shape : torch.Size([224, 224, 3])
permuted shape : torch.Size([3, 224, 224])


In [None]:
# permute returns a view, so this write is also visible through x_permuted
x_original[0,0,0] = 0.8783
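
Because `permute` returns a view rather than a copy, the assignment above also shows up in the permuted tensor. A small self-contained check (it recreates the two tensors so it can run on its own):

```python
import torch

x_original = torch.rand(size=(224, 224, 3))   # height, width, colour channels
x_permuted = x_original.permute(2, 0, 1)      # colour channels, height, width

# Write through the original tensor...
x_original[0, 0, 0] = 0.8783

# ...and read the same element back through the permuted view
print(x_original[0, 0, 0], x_permuted[0, 0, 0])

# Both tensors point at the same underlying memory
shares_memory = x_original.data_ptr() == x_permuted.data_ptr()
print(shares_memory)
```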

### Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# create a tensor
import torch
x = torch.arange(1,10).reshape(1,3,3)
x , x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# index on the first (0th) dimension
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
# index on the middle (1st) dimension to get the first row
x[0][0]

tensor([1, 2, 3])

In [None]:
# index on the innermost (last) dimension
x[0][0][0]

tensor(1)

In [None]:
# get the element 9 from the tensor
x[0][2][2]

tensor(9)

In [None]:
# you can also use ":" to select all of the target dimension
x[:,0]

tensor([[1, 2, 3]])

In [None]:
# get all values of 0th and 1st dimension but only index 1 of 2nd dimension
x[:,:,1]

tensor([[2, 5, 8]])

In [None]:
# get all values of the 0th dimension but only index 1 of the 1st and 2nd dimensions
x[:,1,1]

tensor([5])

In [None]:
# get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0,0,:]

tensor([1, 2, 3])

In [None]:
# to get last column
x[:,:,2]

tensor([[3, 6, 9]])

In [None]:
x

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])
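
Two details from the cells above are easy to miss: chained indexing and comma indexing select the same element, and a `:` keeps a dimension while an integer index drops it. A short sketch using the same tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# Chained and comma-separated indexing select the same element
chained = x[0][2][2]
comma = x[0, 2, 2]
print(chained, comma)

# ":" keeps the dimension it is applied to, an integer index removes it
with_slice = x[:, 1, 1]   # keeps the 0th dimension
with_index = x[0, 1, 1]   # drops every dimension, leaving a scalar tensor
print(with_slice.shape, with_index.shape)

# Mixing both: all rows of the middle column
middle_column = x[0, :, 1]
print(middle_column)
```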

### PyTorch tensor & NumPy

NumPy is a popular scientific Python library for numerical computing.
Because of this, PyTorch has functionality to interact with it.

* Data in NumPy, want in a PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [None]:
# numpy array to tensor
import numpy as np
import torch
import tensorflow as tf

array = np.arange(1.0,9.0)
tensor  = torch.from_numpy(array)
tensor_1 = tf.constant(array)

array , tensor , tensor_1

(array([1., 2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64),
 <tf.Tensor: shape=(8,), dtype=float64, numpy=array([1., 2., 3., 4., 5., 6., 7., 8.])>)

In [None]:
array.dtype

dtype('float64')

**NumPy arrays default to dtype float64, while PyTorch tensors default to float32. `torch.from_numpy` keeps the NumPy dtype, which is why the tensor above is float64.**

In [None]:
# array + 10 creates a new array and rebinds the name, so the tensor (which shares memory with the old array) does not change
array = array + 10
array , tensor

(array([11., 12., 13., 14., 15., 16., 17., 18.]),
 tensor([1., 2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64))
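
The tensor stays the same above only because `array + 10` builds a new array. `torch.from_numpy` and `Tensor.numpy()` both share memory with their source on the CPU, so an in-place change is visible on both sides. A minimal sketch (variable names are placeholders):

```python
import numpy as np
import torch

array = np.arange(1.0, 9.0)
tensor = torch.from_numpy(array)     # shares memory with array

array += 10                          # in-place: the tensor sees the change
print(tensor)

array = array + 10                   # new array: the tensor is left untouched
print(array[0], tensor[0])

# The same sharing applies in the other direction
ones = torch.ones(3)
ones_as_numpy = ones.numpy()
ones.add_(1)                         # in-place add on the tensor
print(ones_as_numpy)

# Converting the dtype makes an independent float32 copy
float32_tensor = torch.from_numpy(array).type(torch.float32)
print(float32_tensor.dtype)
```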

In [None]:
# Tensor to numpy array
tensor = torch.ones(10)
numpy_tensor = tensor.numpy()
tensor , numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32))

## Reproducibility (trying to take the random out of random)

In short, how a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again -> again...`

To reduce the randomness in neural networks and PyTorch comes the concept of a **random seed**.

Essentially, what the random seed does is "flavour" the randomness: the numbers still look random, but the same seed always produces the same sequence.

In [None]:
# Create two random tensors
tensor_A  = torch.rand(4,4)
tensor_B = torch.rand(4,4)

print(tensor_A )
print(tensor_B)

print(tensor_A == tensor_B)

tensor([[0.8678, 0.9374, 0.7676, 0.5991],
        [0.2312, 0.9576, 0.1279, 0.7740],
        [0.9189, 0.2320, 0.6329, 0.2045],
        [0.3360, 0.3080, 0.6623, 0.3800]])
tensor([[0.2930, 0.2843, 0.2025, 0.9058],
        [0.9128, 0.0292, 0.9215, 0.0778],
        [0.8883, 0.7673, 0.8364, 0.3900],
        [0.7166, 0.1625, 0.7173, 0.2497]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# let's make some random but reproducible tensor
import torch

# create tensors
torch.manual_seed(42)
tensor_C  = torch.rand(4,4)
torch.manual_seed(42)
tensor_D = torch.rand(4,4)

print(tensor_C )
print(tensor_D)

print(tensor_C == tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936],
        [0.8694, 0.5677, 0.7411, 0.4294]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936],
        [0.8694, 0.5677, 0.7411, 0.4294]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
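
The seed is set twice in the cell above because each call to `torch.rand` advances the random number generator. A short sketch of that behaviour, plus an alternative that keeps the seed local by using a `torch.Generator` (the variable names are made up for the example):

```python
import torch

torch.manual_seed(42)
first = torch.rand(2, 2)
second = torch.rand(2, 2)    # no re-seed: continues the sequence, so it differs from first

torch.manual_seed(42)
replay = torch.rand(2, 2)    # re-seeded: repeats the values of first

print(torch.equal(first, second), torch.equal(first, replay))

# A Generator object carries its own seed and leaves the global one alone
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
local_a = torch.rand(2, 2, generator=gen_a)
local_b = torch.rand(2, 2, generator=gen_b)
print(torch.equal(local_a, local_b))
```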


### Running tensors and PyTorch objects on GPUs (and making faster computations)
GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working together.

### 1. Getting GPUs

1. Easiest - use Google Colab (with an option to upgrade)
2. Cloud computing - use GCP, AWS, or Azure
3. Use your own GPU - requires setup and a good GPU

In [None]:
!nvidia-smi

Sun Sep  8 02:13:00 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

### 2. Check for GPU access with PyTorch

In [None]:
# check for GPU access with PyTorch
import torch
torch.cuda.is_available()

True

In [None]:
# set up device-agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [None]:
# count the number of devices
torch.cuda.device_count()

1

### 3. Putting tensors (and models) on the GPU

The reason we want our tensors/models on the GPU is that using a GPU results in faster computation.

In [None]:
# create tensor (default on CPU )
tensor = torch.tensor([1,2,3])

# tensor not on GPU
tensor, tensor.device

(tensor([1, 2, 3]), device(type='cpu'))

In [None]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### 4. Moving tensor back to CPU

In [None]:
# If a tensor is on the GPU, it can't be converted to NumPy directly
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [None]:
# To fix the GPU tensor with NumPy issue, we can first copy it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [None]:
tensor_on_gpu # it remains unchanged

tensor([1, 2, 3], device='cuda:0')
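
Putting the four steps together, a device-agnostic round trip might look like the sketch below. It runs on a machine without a GPU as well, because `device` falls back to `"cpu"` and `.cpu()` is harmless for a tensor that is already there.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3])           # created on the CPU by default
tensor_on_device = tensor.to(device)       # returns a tensor on the chosen device

doubled = tensor_on_device * 2             # the computation runs on that device

# Tensors can also be created directly on the target device
ones_on_device = torch.ones(3, device=device)

# Bring the result back to the CPU before handing it to NumPy
as_numpy = doubled.cpu().numpy()
print(device, doubled.device, as_numpy)
```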