<a href="https://colab.research.google.com/github/ankit-singh973/Deep_Learning/blob/main/01_PyTorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **PyTorch Fundamentals**


In [2]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.1.0+cu121


In [3]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


## **Introduction to Tensors**

### **Create Tensors**
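> PyTorch tensors are created using `torch.tensor()`. The `torch.Tensor()` constructor also works, but it always returns the default float type (`float32`).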

In [4]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [5]:
scalar.ndim

0

- A scalar has no dimensions

In [6]:
# Get tensor back as Python int
scalar.item()

7

In [7]:
vector = torch.Tensor([7, 7])
vector

tensor([7., 7.])

In [8]:
vector.ndim

1

In [9]:
vector.shape

torch.Size([2])

In [10]:
# Matrix
MATRIX = torch.Tensor([[7, 8],
                      [5, 6]])
MATRIX

tensor([[7., 8.],
        [5., 6.]])

In [11]:
MATRIX.ndim

2

In [12]:
MATRIX.shape

torch.Size([2, 2])

In [13]:
MATRIX[1]

tensor([5., 6.])

In [14]:
MATRIX[0,1]

tensor(8.)

In [15]:
# TENSOR
TENSOR = torch.Tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 6]]])
TENSOR

tensor([[[1., 2., 3.],
         [3., 6., 9.],
         [2., 4., 6.]]])

In [16]:
TENSOR.ndim

3

In [17]:
TENSOR.shape

torch.Size([1, 3, 3])

In [18]:
TENSOR[0]

tensor([[1., 2., 3.],
        [3., 6., 9.],
        [2., 4., 6.]])

In [19]:
TENSOR[0,1]

tensor([3., 6., 9.])

In [20]:
TENSOR[0, 1, 2]

tensor(9.)
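
The cells above mix `torch.tensor()` and `torch.Tensor()`. As a minimal sketch of the difference (the variable names here are illustrative and not part of the notebook), `torch.tensor()` infers the dtype from the data, while `torch.Tensor()` returns the default floating-point type. This is why `vector` and `MATRIX` print with trailing dots.

```python
import torch

# torch.tensor() infers the dtype from the Python values it receives
inferred_int = torch.tensor([7, 7])
inferred_float = torch.tensor([7.0, 7.0])

# torch.Tensor() is the tensor class constructor; with default settings
# it produces float32 regardless of whether the inputs are ints
constructed = torch.Tensor([7, 7])

print(inferred_int.dtype)    # torch.int64
print(inferred_float.dtype)  # torch.float32
print(constructed.dtype)     # torch.float32
```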

### Random Tensors

Why random Tensors?
> Random tensors are important because the way many neural networks learn is that they start with tensors full of random numbers and then adjust those random numbers to better represent the data.

In [21]:
# create a random tensor of size (3, 4)
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.6818, 0.1995, 0.0789, 0.9757],
        [0.1559, 0.6512, 0.0675, 0.8798],
        [0.1160, 0.3137, 0.1566, 0.0553]])

In [22]:
random_tensor.ndim

2

In [23]:
random_tensor.shape

torch.Size([3, 4])

In [24]:
random_tensor[1]

tensor([0.1559, 0.6512, 0.0675, 0.8798])

In [25]:
random_tensor[2,3]

tensor(0.0553)

In [26]:
random_tensor1 = torch.rand([5, 3, 4])
random_tensor1

tensor([[[0.8448, 0.4007, 0.8093, 0.5640],
         [0.7459, 0.8195, 0.3180, 0.7415],
         [0.4973, 0.0406, 0.4349, 0.3122]],

        [[0.1613, 0.8291, 0.5028, 0.3648],
         [0.3478, 0.0632, 0.2863, 0.7646],
         [0.4301, 0.8853, 0.0108, 0.7203]],

        [[0.5900, 0.5283, 0.6767, 0.0858],
         [0.9035, 0.3160, 0.4453, 0.2721],
         [0.5682, 0.7917, 0.3785, 0.5946]],

        [[0.4260, 0.9991, 0.6685, 0.7797],
         [0.1435, 0.3751, 0.3858, 0.9754],
         [0.3246, 0.5738, 0.7935, 0.9021]],

        [[0.6990, 0.7058, 0.2222, 0.3874],
         [0.2675, 0.1593, 0.0357, 0.1829],
         [0.0904, 0.8740, 0.4835, 0.0143]]])

In [27]:
random_tensor1.ndim

3

In [28]:
random_tensor1.shape

torch.Size([5, 3, 4])

In [29]:
random_image_size_tensor = torch.rand(size = (224, 224, 3)) # height, width, color channels
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

## **Zeros and Ones**

In [30]:
# create tensor of all zeros
zeros = torch.zeros(size = (3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [31]:
# multiply two tensors
zeros*random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [32]:
ones = torch.ones(size = (3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [33]:
ones.dtype

torch.float32

## **Create a range of tensors**

In [34]:
# Create a range of values 0 to 10
# use torch.arange(start, end, step)
zero_to_ten = torch.arange(start=0, end=11, step=1)
zero_to_ten

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [35]:
# use torch.arange(start, end, step)
one_to_hun = torch.arange(0, 100, 5)
one_to_hun

tensor([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85,
        90, 95])

In [36]:
# torch.zeros_like(input=t) creates a tensor of zeros
# with the same shape and dtype as t
ten_zeros = torch.zeros_like(input =  zero_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
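
Two details of the cells above are easy to miss: `torch.arange` stops before `end`, and the `*_like` functions copy the dtype as well as the shape. The following is a small sketch of both, with variable names (`template`, `zeros`, `ones`) chosen only for illustration.

```python
import torch

# torch.arange excludes `end`, so end=11 is needed to include 10
zero_to_ten = torch.arange(start=0, end=11, step=1)
print(zero_to_ten[-1].item())  # 10

# *_like functions copy shape and dtype (here int64) from the input
template = torch.arange(0, 100, 5)
zeros = torch.zeros_like(template)
ones = torch.ones_like(template)

print(zeros.shape == template.shape, zeros.dtype)  # True torch.int64
print(ones.sum().item())  # 20, one per element of template
```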

## **Tensor Datatypes**

**Note:** Tensor datatype mismatches are one of the 3 most common errors in PyTorch and deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [37]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype = torch.float32, # what datatype the tensor is, e.g. float32 or float64
                               device = None, # what device the tensor is on, e.g. "cpu" or "cuda"
                               requires_grad = False) # whether to track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.])

In [38]:
float_32_tensor.dtype

torch.float32

In [39]:
# converting float32 to float16 tensor
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [40]:
aas =  float_16_tensor*float_32_tensor
aas

tensor([ 9., 36., 81.])

In [41]:
aas.dtype

torch.float32

In [42]:
int_32_tensor = torch.tensor([3, 6, 9], dtype =torch.int32 )
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [43]:
# multiply tensors of different dtypes
# Element-wise operations like this one promote to a common dtype (here float32),
# but some operations raise an error when the dtypes differ
int_32_tensor*float_32_tensor

tensor([ 9., 36., 81.])
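
The multiplications above succeed because element-wise operations apply type promotion. As a hedged sketch, `torch.result_type` and `torch.promote_types` can be used to predict the output dtype before running the operation. The tensor names below are shortened stand-ins for the notebook's variables.

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
float_16 = float_32.type(torch.float16)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# torch.result_type reports the dtype an element-wise operation would produce
print(torch.result_type(float_16, float_32))  # torch.float32
print(torch.result_type(int_32, float_32))    # torch.float32

# torch.promote_types answers the same question for two dtypes directly
print(torch.promote_types(torch.int32, torch.float16))  # torch.float16
```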

In [44]:
# create a tensor
some_tensor = torch.rand(3, 4)
some_tensor

tensor([[0.3998, 0.1839, 0.4222, 0.9944],
        [0.7264, 0.0052, 0.2004, 0.2320],
        [0.2094, 0.9274, 0.0358, 0.0403]])

In [45]:
# find out details about some_tensor
print(some_tensor,'\n')
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Shape of tensor: {some_tensor.shape} ")
print(f"Device tensor is on: {some_tensor.device}")

tensor([[0.3998, 0.1839, 0.4222, 0.9944],
        [0.7264, 0.0052, 0.2004, 0.2320],
        [0.2094, 0.9274, 0.0358, 0.0403]]) 

Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4]) 
Device tensor is on: cpu


## **Manipulating Tensors (Tensor operations)**

In [46]:
# Create a tensor and add 10 to it
tensor= torch.tensor([1,2,3])
tensor+10

tensor([11, 12, 13])

In [47]:
# multiply tensor by 10
tensor*10

tensor([10, 20, 30])

In [48]:
# subtract by 10
tensor-10

tensor([-9, -8, -7])

In [49]:
# try out PyTorch in-built function
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [50]:
torch.add(tensor, 10)

tensor([11, 12, 13])

## **Matrix Multiplication (dot product)**
`[2x3] X [3x2] = [2x2]`
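
The shape rule in the line above says that the inner dimensions must match and the result takes the outer dimensions. A minimal sketch with random tensors (`left` and `right` are illustrative names):

```python
import torch

left = torch.rand(2, 3)   # shape [2x3]
right = torch.rand(3, 2)  # shape [3x2]

# Inner dimensions (3 and 3) match, so the product is defined
# and takes the outer dimensions: [2x2]
product = torch.matmul(left, right)
print(product.shape)  # torch.Size([2, 2])

# Swapping the operands gives inner dimensions 2 and 2 and a [3x3] result
print(torch.matmul(right, left).shape)  # torch.Size([3, 3])

# The @ operator is shorthand for torch.matmul
print(torch.allclose(left @ right, product))  # True
```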

In [51]:
# element wise multiplication
print(tensor, '*', tensor)
print(f"Equals: {tensor*tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


In [52]:
tensor.shape

torch.Size([3])

In [53]:
# PyTorch matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [54]:
%%time
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 1.68 ms, sys: 0 ns, total: 1.68 ms
Wall time: 1.57 ms


In [55]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 30 µs, sys: 6 µs, total: 36 µs
Wall time: 38.9 µs


tensor(14)

### **One of the most common errors in deep learning: Shape Errors**

In [56]:
# Shape for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# this raises an error because the inner dimensions do not match (3x2 and 3x2)
torch.matmul(tensor_A, tensor_B)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

#### To fix our tensor shape issues, we can **transpose** one of our tensors before multiplying
- A **transpose** switches the axes or dimensions of a given tensor

In [57]:
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [58]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [59]:
'''After taking transpose of tensor_B, we can do the matrix
multiplication operation between tensor_A and tensor_B.T'''
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

## **Finding the min, max, mean, sum, etc. (tensor aggregation)**

In [60]:
# create a tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [61]:
x.dtype

torch.int64

In [62]:
# Find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [63]:
# Find the max
torch.max(x), x.max()

(tensor(90), tensor(90))

In [64]:
'''print the mean.
torch.mean requires a floating point (or complex) dtype.
x is int64, i.e. a long integer, so it must be cast to float first'''
torch.mean(x.type(torch.float32))

tensor(45.)
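
As a sketch of what goes wrong without the cast, the snippet below calls `torch.mean` on the `int64` tensor inside a `try` block, then shows two ways to get a float mean. Passing `dtype=` to `torch.mean` is an alternative to the `.type()` cast used in the notebook.

```python
import torch

x = torch.arange(0, 100, 10)  # dtype is int64

try:
    torch.mean(x)
except RuntimeError as err:
    # mean() rejects integer inputs
    print(f"mean on int64 failed: {type(err).__name__}")

# Two ways to get a float mean: cast first, or ask mean() to cast
mean_via_cast = x.type(torch.float32).mean()
mean_via_dtype = torch.mean(x, dtype=torch.float32)
print(mean_via_cast, mean_via_dtype)  # tensor(45.) tensor(45.)
```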

In [65]:
# Find the sum
torch.sum(x)

tensor(450)

## **Finding positional min and max**
`argmin()` and `argmax()`

In [66]:
# returns index position of min value
torch.argmin(x)

tensor(0)

In [67]:
# returns index position of max value
torch.argmax(x)

tensor(9)

## **Reshaping, stacking, squeezing, and unsqueezing**

* **Reshaping** - reshapes an input tensor to a defined shape
* **View** - returns a view of an input tensor with a certain shape but keeps the same memory as the original tensor
* **Stacking** - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* **Squeeze** - removes all dimensions of size 1 from a tensor
* **Unsqueeze** - adds a dimension of size 1 to a target tensor
* **Permute** - returns a view of the input with dimensions permuted (swapped) in a certain way
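
The operations in the list above can be compared side by side by looking only at the shapes they produce. This is a compact sketch on a six-element tensor, chosen for brevity and not taken from the notebook.

```python
import torch

x = torch.arange(1., 7.)                # shape [6]

reshaped = x.reshape(2, 3)              # shape [2, 3]
viewed = x.view(3, 2)                   # shape [3, 2], shares memory with x
stacked = torch.stack([x, x], dim=0)    # shape [2, 6]
unsqueezed = x.unsqueeze(dim=0)         # shape [1, 6]
squeezed = unsqueezed.squeeze()         # shape [6]
permuted = reshaped.permute(1, 0)       # shape [3, 2]

results = {
    "reshaped": reshaped, "viewed": viewed, "stacked": stacked,
    "unsqueezed": unsqueezed, "squeezed": squeezed, "permuted": permuted,
}
for name, t in results.items():
    print(f"{name}: {tuple(t.shape)}")
```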

In [68]:
# Let's create a tensor
x = torch.arange(1., 11.)
print(x,'\n', x.shape)

tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]) 
 torch.Size([10])


In [69]:
# add extra dimensions
x_reshaped = x.reshape(1, 2, 5)
x_reshaped, x_reshaped.shape

(tensor([[[ 1.,  2.,  3.,  4.,  5.],
          [ 6.,  7.,  8.,  9., 10.]]]),
 torch.Size([1, 2, 5]))

In [70]:
X_reshaped = x.reshape(1, 5, 2)
X_reshaped

tensor([[[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.],
         [ 7.,  8.],
         [ 9., 10.]]])

In [71]:
X_reshaped = x.reshape(5, 2)
X_reshaped

tensor([[ 1.,  2.],
        [ 3.,  4.],
        [ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])

In [72]:
# change the view
z = x.view(1,10)
z, z.shape

(tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 torch.Size([1, 10]))

**Note:** Changing `z` changes `x` (because a `view` of a tensor shares memory with the original input)

In [73]:
z[:, 0] = 5
z, x

(tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]]),
 tensor([ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]))
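
One way to confirm the memory sharing, sketched below, is to compare `data_ptr()` values, which give the address of the first element. If an independent copy is wanted, `clone()` is one option. The name `z_copy` is illustrative.

```python
import torch

x = torch.arange(1., 11.)
z = x.view(1, 10)

# A view points at the same underlying storage as the original
print(x.data_ptr() == z.data_ptr())  # True

# clone() makes an independent copy, so edits no longer propagate
z_copy = x.view(1, 10).clone()
z_copy[:, 0] = 5
print(x[0].item())          # 1.0, x is unchanged
print(z_copy[0, 0].item())  # 5.0
```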

In [74]:
# stack tensors on top of each other
x_stacked = torch.stack([x, x, x, x], dim = 0)
x_stacked

tensor([[ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 5.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]])

In [75]:
y_stacked = torch.stack([x, x, x, x], dim = 1)
y_stacked

tensor([[ 5.,  5.,  5.,  5.],
        [ 2.,  2.,  2.,  2.],
        [ 3.,  3.,  3.,  3.],
        [ 4.,  4.,  4.,  4.],
        [ 5.,  5.,  5.,  5.],
        [ 6.,  6.,  6.,  6.],
        [ 7.,  7.,  7.,  7.],
        [ 8.,  8.,  8.,  8.],
        [ 9.,  9.,  9.,  9.],
        [10., 10., 10., 10.]])

In [76]:
print(x_reshaped)
print(x_reshaped.shape)

tensor([[[ 5.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]])
torch.Size([1, 2, 5])


In [77]:
# squeeze
x_squeezed = x_reshaped.squeeze()
x_squeezed

tensor([[ 5.,  2.,  3.,  4.,  5.],
        [ 6.,  7.,  8.,  9., 10.]])

In [78]:
'''squeeze removes the single dimensions,
i.e. it removes all the 1's from the shape'''
x_squeezed.shape

torch.Size([2, 5])

In [79]:
# torch.unsqueeze adds a dimension of size 1 at the given index (here dim=0)
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(x_unsqueezed, '\n', x_unsqueezed.shape)

tensor([[[ 5.,  2.,  3.,  4.,  5.],
         [ 6.,  7.,  8.,  9., 10.]]]) 
 torch.Size([1, 2, 5])


In [80]:
# torch.unsqueeze adds a dimension of size 1 at the given index (here dim=1)
x_unsqueezed = x_squeezed.unsqueeze(dim=1)
print(x_unsqueezed, '\n', x_unsqueezed.shape)

tensor([[[ 5.,  2.,  3.,  4.,  5.]],

        [[ 6.,  7.,  8.,  9., 10.]]]) 
 torch.Size([2, 1, 5])


In [81]:
# 'torch.permute' --> rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size = (224, 224, 3)) # [height, width, color_channel]

# Permute the original tensor to rearrange the axis (or dim) order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0 -->1, 1-->2, 2-->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}") # [color_channels, height, width]

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])


## **Indexing (selecting data from tensors)**

In [82]:
# create a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [83]:
# lets index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [84]:
x[0, 0]

tensor([1, 2, 3])

In [85]:
# lets index on middle bracket
x[0, 1]

tensor([4, 5, 6])

In [86]:
x[0,2]

tensor([7, 8, 9])

In [87]:
# you can also use ":" to select "all" of a target dimension
# x[:, 0] takes index 0 of dim 1 across all of dim 0
x[:, 0]

tensor([[1, 2, 3]])

In [88]:
x[0, 2,2]

tensor(9)

In [89]:
x[:, 1]

tensor([[4, 5, 6]])

In [90]:
x[:, :, 1]

tensor([[2, 5, 8]])

In [91]:
x[:, 1, 1]

tensor([5])
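
The outputs above differ in how many brackets they keep. As a short sketch of the rule: an integer index drops that dimension, while `:` or a slice keeps it. The tensor is rebuilt here so the snippet runs on its own.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# An integer index removes that dimension; ":" keeps it
print(x[0, 1, 1].shape)    # torch.Size([])
print(x[:, 1, 1].shape)    # torch.Size([1])
print(x[:, :, 1].shape)    # torch.Size([1, 3])

# A slice such as 1:2 keeps the dimension with size 1
print(x[:, 1:2, :].shape)  # torch.Size([1, 1, 3])

# Last column of the inner 3x3 matrix
print(x[0, :, 2])          # tensor([3, 6, 9])
```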

In [92]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


In [93]:
torch.cuda.is_available()

False

In [94]:
# set up device-agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cpu'

In [98]:
import numpy as np

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array) # warning: when converting from NumPy to PyTorch, the tensor keeps NumPy's default dtype of float64 unless you cast it
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [99]:
# change the value of array and see what this will do to the tensor
# (array + 1 creates a new array, so the tensor keeps the original values)
array = array + 1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [100]:
# tensor to numpy array
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [101]:
numpy_tensor.dtype

dtype('float32')
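
The tensor in the cell above stayed unchanged because `array = array + 1` rebinds the name to a new array. As a hedged sketch of the difference, an in-place update (`+=`) does show up in the tensor, since `torch.from_numpy` shares memory with the source array. The name `tensor_f32` is illustrative.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)   # float64, shares memory with array

# In-place update: the tensor sees the change
array += 1
print(tensor[0].item())  # 2.0

# Rebinding: array now names a new object, the tensor keeps the old buffer
array = array + 1
print(array[0], tensor[0].item())  # 3.0 2.0

# An explicit cast converts to float32 (PyTorch's default) and copies the data
tensor_f32 = torch.from_numpy(np.arange(1.0, 8.0)).type(torch.float32)
print(tensor_f32.dtype)  # torch.float32
```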

# **Reproducibility**
> To make the randomness in neural networks reproducible, PyTorch provides the concept of a **random seed**.

In [104]:
# create two random tensors

random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)

print(random_tensor_A,'\n')
print(random_tensor_B, '\n')
print(random_tensor_A == random_tensor_B)

tensor([[0.3864, 0.0291, 0.3550, 0.6227],
        [0.7062, 0.3406, 0.4249, 0.6933],
        [0.4437, 0.1961, 0.8262, 0.4737]]) 

tensor([[0.1267, 0.6655, 0.2378, 0.8509],
        [0.3339, 0.5077, 0.5958, 0.0460],
        [0.1839, 0.8763, 0.9164, 0.5111]]) 

tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [106]:
# Let's make some random but reproducible tensors

# set the random seed
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED) # seeds the global generator; re-seed before each random call that should repeat

random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

print(random_tensor_C,'\n')
print(random_tensor_D, '\n')
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]]) 

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]]) 

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
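
An alternative to re-seeding the global state, sketched here under the assumption that a local source of randomness is acceptable, is to pass a dedicated `torch.Generator` to `torch.rand`. The generator and tensor names are illustrative.

```python
import torch

RANDOM_SEED = 42

# A dedicated generator keeps its own state, separate from the global seed
gen_a = torch.Generator().manual_seed(RANDOM_SEED)
gen_b = torch.Generator().manual_seed(RANDOM_SEED)

tensor_a = torch.rand(3, 4, generator=gen_a)
tensor_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(tensor_a, tensor_b))  # True

# Drawing again from gen_a without re-seeding continues the sequence,
# so the next tensor differs from the first
tensor_c = torch.rand(3, 4, generator=gen_a)
print(torch.equal(tensor_a, tensor_c))  # False
```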


# **Getting PyTorch to run on the GPU**
> We can test if PyTorch has access to a GPU using `torch.cuda.is_available()`.

In [1]:
#check for the GPU with python
import torch
torch.cuda.is_available()

True

In [2]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [3]:
# count number of devices
torch.cuda.device_count()

1

In [5]:
!nvidia-smi

Sat Feb 10 10:47:34 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   44C    P8              10W /  70W |      3MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

# **Putting tensors and models on GPU**
> Running on a GPU usually results in faster computation

In [6]:
# create a tensor (default on cpu)
tensor = torch.tensor([1, 2, 3])

#Tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [7]:
# Move tensor to GPU if available
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

In [10]:
# a tensor on the GPU can't be converted to NumPy directly
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [9]:
# moving tensors back to cpu

tensor_back_on_cpu = tensor_on_gpu.cpu()
tensor_back_on_cpu

tensor([1, 2, 3])

In [12]:
tensor_back_on_cpu.numpy()

array([1, 2, 3])
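
The GPU steps above can be combined into one device-agnostic pattern. This is a minimal sketch that should run on both CPU-only and GPU machines. The names `tensor_on_device` and `as_numpy` are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3])
tensor_on_device = tensor.to(device)   # stays on the CPU if no GPU is found
print(tensor_on_device.device)

# .cpu() is safe to call on either device, so this conversion works everywhere
as_numpy = tensor_on_device.cpu().numpy()
print(as_numpy)  # [1 2 3]
```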