**PyTorch Tutorials**

## Table of Contents

1. [Importing Modules](#Importing-Modules)
2. [Tensor Basics](#Tensor-Basics)
3. [Autograd](#Autograd)
4. [Backpropagation](#Backpropagation)
5. [Gradient Descent](#Gradient-Descent)
6. [Training Pipeline](#Training-Pipeline)
7. [Linear Regression](#Linear-Regression)
8. [Logistic Regression](#Logistic-Regression)
9. [Dataset and Dataloader](#Dataset-and-Dataloader)
10. [Dataset Transforms](#Dataset-Transforms)
11. [Softmax and Crossentropy](#Softmax-and-Crossentropy)
12. [Activation Functions](#Activation-Functions)
13. [Feed Forward Net](#Feed-Forward-Net)
14. [CNN](#CNN)
15. [Tensorboard](#Tensorboard)
16. [Save & Load Models](#Save--Load-Models)


<hr>

**What are Tensors?**

-> A tensor is a generalised array that can have any number of dimensions.

**What is the main difference between arrays and tensors?**

-> Arrays store numbers in multiple dimensions. Tensors are arrays specialised for deep learning: they can run on GPUs for faster computation and can track gradients automatically.

# Importing Modules

In [27]:
import torch 
import numpy as np

# Tensor Basics  

<hr>

**1.) empty tensor**

In [28]:
t_em_1=torch.empty(1)  # 1D tensor (torch.empty does not initialise memory, so the printed values are arbitrary)
print(t_em_1)

t_em_2=torch.empty(1,2) # 2D tensor 
print(t_em_2)

t_em_3=torch.empty(1,2,3) # 3D tensor
print(t_em_3)

tensor([inf])
tensor([[0., 0.]])
tensor([[[6.4104e-30, 2.1468e-42, 1.4013e-45],
         [0.0000e+00, 1.4013e-45, 0.0000e+00]]])


**2.) tensor containing random values**

In [29]:
t_rand_1=torch.rand(1) # 1D tensor containing random values between 0 and 1
print(t_rand_1)

t_rand_2=torch.rand(1,2) # 2D tensor containing random values between 0 and 1
print(t_rand_2)

t_rand_3=torch.rand(1,2,3) # 3D tensor containing random values between 0 and 1
print(t_rand_3)

t_randn_1=torch.randn(1)  # 1D tensor sampled from the standard normal distribution (mean 0, variance 1), so the values are unbounded
print(t_randn_1)

tensor([0.9270])
tensor([[0.3347, 0.6518]])
tensor([[[0.8908, 0.8825, 0.8191],
         [0.5871, 0.6928, 0.9165]]])
tensor([-1.4969])


**3.) tensor initialized with zeros**

In [30]:
t_zero_2=torch.zeros(1,2) # 2D tensor initialised with zeroes
print(t_zero_2)

tensor([[0., 0.]])


**4.) tensor initialized with ones**


In [31]:
t_one_2=torch.ones(1,3) # A 2D tensor initialised with ones
print(t_one_2)

tensor([[1., 1., 1.]])


**Note:** By default, the datatype of a tensor created by these functions is float32
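
As a small illustration of the note above, the sketch below checks the default dtype. The variable names are made up for this example, and it assumes the default dtype has not been changed elsewhere in the session. It also shows that a tensor built from Python integers gets an integer dtype instead.

```python
import torch

# Factory functions such as ones/zeros/rand use the default float dtype
t_default = torch.ones(2)
print(t_default.dtype)              # torch.float32

# The default can be queried directly
print(torch.get_default_dtype())    # torch.float32

# A tensor built from Python ints is an integer tensor, not float32
t_int = torch.tensor([1, 2, 3])
print(t_int.dtype)                  # torch.int64

# Converting to another dtype returns a new tensor
t_double = t_default.to(torch.float64)
print(t_double.dtype)               # torch.float64
```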

**5.) creating tensors of specific data types**

In [32]:
# 1D tensor of float64 type

t_dtype_1=torch.zeros(1, dtype=torch.float64)
print(t_dtype_1)

# 2D tensor of int type (int32)

t_dtype_2=torch.ones(1,2,dtype=torch.int)
print(t_dtype_2)

tensor([0.], dtype=torch.float64)
tensor([[1, 1]], dtype=torch.int32)


**Note:** We can check the size of a tensor with the .size() method

**6.) checking the size of a tensor**

In [33]:
t_size_1=torch.rand(1,3) # A 2D tensor containing random values and having a size of (1,3) 

# Checking the size 
print(t_size_1.size())

torch.Size([1, 3])
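
Besides `.size()`, a few related calls are often handy. The following is a minimal sketch with a made-up tensor `t`, not part of the original cells:

```python
import torch

t = torch.rand(2, 3)

print(t.size())    # torch.Size([2, 3])
print(t.shape)     # same information, exposed as an attribute
print(t.size(0))   # size of the first dimension: 2
print(t.dim())     # number of dimensions: 2
print(t.numel())   # total number of elements: 6
```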


**7.) creating a tensor from a list**

In [34]:
data=[1.5,53.4,5445,355]

# Creating the tensor 
t_list_1=torch.tensor(data)
print(t_list_1)

tensor([1.5000e+00, 5.3400e+01, 5.4450e+03, 3.5500e+02])


**Performing basic operations using tensors**

**Note:** In PyTorch, every method whose name ends with a trailing "_" performs an in-place operation

*Creating the base tensors which will be used to perform the operations*

In [35]:
x=torch.rand(2,2)
y=torch.rand(2,2)

**1.) addition**

In [36]:
# Direct addition
z_add=x+y 

# Addition using pytorch function
z_add_func=torch.add(x,y)

# In-place addition 
y_add_test=y # Note: this does not copy y; y_add_test is a second name for the same tensor
y_add_test.add_(x) # so this in-place addition also changes y (use y.clone() for an independent copy)

# Printing the base tensors

print("BASE TENSORS\n")
print("Base Tensor 1:\n",x)
print("Base Tensor 2:\n",y)

# Printing the resultant tensors 

print("\nRESULTANT TENSORS\n")
print("Resultant Tensor 1 :\n ",z_add) # direct addition
print("Resultant Tensor 2 : \n",z_add_func) # using pytorch add function
print("Resultant Tensor 3 : \n",y_add_test) # using inplace addition

BASE TENSORS

Base Tensor 1:
 tensor([[0.7874, 0.3351],
        [0.1211, 0.7048]])
Base Tensor 2:
 tensor([[1.7649, 0.3626],
        [0.5242, 1.1341]])

RESULTANT TENSORS

Resultant Tensor 1 :
  tensor([[1.7649, 0.3626],
        [0.5242, 1.1341]])
Resultant Tensor 2 : 
 tensor([[1.7649, 0.3626],
        [0.5242, 1.1341]])
Resultant Tensor 3 : 
 tensor([[1.7649, 0.3626],
        [0.5242, 1.1341]])
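
In the output above, "Base Tensor 2" equals the result of the addition because plain assignment does not copy a tensor. The sketch below, using small made-up tensors, contrasts assignment with `.clone()`:

```python
import torch

x = torch.tensor([1.0, 2.0])
y = torch.tensor([10.0, 20.0])

# clone() creates an independent copy with its own memory
y_copy = y.clone()
y_copy.add_(x)
print(y_copy)          # tensor([11., 22.])
print(y)               # tensor([10., 20.])  (unchanged)

# Plain assignment only creates a second name for the same tensor
y_alias = y
y_alias.add_(x)
print(y_alias is y)    # True
print(y)               # tensor([11., 22.])  (changed through the alias)
```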


**2.) subtraction**

In [37]:
# Direct subtraction
z_sub=x-y 

# Subtraction using pytorch function
z_sub_func=torch.sub(x,y)

# In-place subtraction 
y_sub_test=y # Note: this does not copy y; y_sub_test is a second name for the same tensor
y_sub_test.sub_(x) # so this in-place subtraction also changes y

# Printing the base tensors
print("BASE TENSORS\n")
print("Base Tensor 1:\n",x)
print("Base Tensor 2:\n",y)

# Printing the resultant tensors 

print("\nRESULTANT TENSORS\n")
print("Resultant Tensor 1 :\n ",z_sub) # direct subtraction
print("Resultant Tensor 2 : \n",z_sub_func) # using pytorch sub function
print("Resultant Tensor 3 : \n",y_sub_test) # using in-place subtraction

BASE TENSORS

Base Tensor 1:
 tensor([[0.7874, 0.3351],
        [0.1211, 0.7048]])
Base Tensor 2:
 tensor([[0.9775, 0.0275],
        [0.4031, 0.4293]])

RESULTANT TENSORS

Resultant Tensor 1 :
  tensor([[-0.9775, -0.0275],
        [-0.4031, -0.4293]])
Resultant Tensor 2 : 
 tensor([[-0.9775, -0.0275],
        [-0.4031, -0.4293]])
Resultant Tensor 3 : 
 tensor([[0.9775, 0.0275],
        [0.4031, 0.4293]])


**3.) multiplication**

In [38]:
# Direct Multiplication
z_mul=x*y 

# Multiplication using pytorch function
z_mul_func=torch.mul(x,y)

# In-place multiplication 
y_mul_test=y # Note: this does not copy y; y_mul_test is a second name for the same tensor
y_mul_test.mul_(x) # so this in-place multiplication also changes y

# Printing the base tensors
print("BASE TENSORS\n")
print("Base Tensor 1:\n",x)
print("Base Tensor 2:\n",y)

# Printing the resultant tensors 

print("\nRESULTANT TENSORS\n")
print("Resultant Tensor 1 :\n ",z_mul) # direct Multiplication
print("Resultant Tensor 2 : \n",z_mul_func) # using pytorch mul function
print("Resultant Tensor 3 : \n",y_mul_test) # using inplace Multiplication

BASE TENSORS

Base Tensor 1:
 tensor([[0.7874, 0.3351],
        [0.1211, 0.7048]])
Base Tensor 2:
 tensor([[0.7696, 0.0092],
        [0.0488, 0.3026]])

RESULTANT TENSORS

Resultant Tensor 1 :
  tensor([[0.7696, 0.0092],
        [0.0488, 0.3026]])
Resultant Tensor 2 : 
 tensor([[0.7696, 0.0092],
        [0.0488, 0.3026]])
Resultant Tensor 3 : 
 tensor([[0.7696, 0.0092],
        [0.0488, 0.3026]])


**4.) division**

In [39]:
# Direct Division
z_div=x/y 

# Division using pytorch function
z_div_func=torch.div(x,y)

# In-place division 
y_div_test=y # Note: this does not copy y; y_div_test is a second name for the same tensor
y_div_test.div_(x) # so this in-place division also changes y

# Printing the base tensors
print("BASE TENSORS\n")
print("Base Tensor 1:\n",x)
print("Base Tensor 2:\n",y)

# Printing the resultant tensors 

print("\nRESULTANT TENSORS\n")
print("Resultant Tensor 1 :\n ",z_div) # direct Division
print("Resultant Tensor 2 : \n",z_div_func) # using pytorch div function
print("Resultant Tensor 3 : \n",y_div_test) # using inplace Division

BASE TENSORS

Base Tensor 1:
 tensor([[0.7874, 0.3351],
        [0.1211, 0.7048]])
Base Tensor 2:
 tensor([[0.9775, 0.0275],
        [0.4031, 0.4293]])

RESULTANT TENSORS

Resultant Tensor 1 :
  tensor([[ 1.0230, 36.3828],
        [ 2.4808,  2.3293]])
Resultant Tensor 2 : 
 tensor([[ 1.0230, 36.3828],
        [ 2.4808,  2.3293]])
Resultant Tensor 3 : 
 tensor([[0.9775, 0.0275],
        [0.4031, 0.4293]])


**5.) slicing operation**

In [40]:
x_slice=torch.rand(5,6) # Creating a random 2d tensor to perform slicing 

t_slice_1=x_slice[:,2] # To get all the rows of the 3rd column 
t_slice_2=x_slice[:2,3] # To get the first 2 rows of the 4th column 

print("Original Tensor:\n",x_slice)
print("\n Sliced Tensor-1:\n",t_slice_1) # all rows of the 3rd column 
print("\n Sliced Tensor-2:\n",t_slice_2) # first 2 rows of the 4th column 

Original Tensor:
 tensor([[0.7905, 0.9172, 0.8340, 0.0468, 0.4464, 0.1711],
        [0.1757, 0.6912, 0.9022, 0.0520, 0.3623, 0.5038],
        [0.8996, 0.8022, 0.5400, 0.5823, 0.1195, 0.3165],
        [0.6260, 0.5134, 0.4350, 0.9371, 0.4844, 0.4437],
        [0.4337, 0.5514, 0.5513, 0.5181, 0.8524, 0.2866]])

 Sliced Tensor-1:
 tensor([0.8340, 0.9022, 0.5400, 0.4350, 0.5513])

 Sliced Tensor-2:
 tensor([0.0468, 0.0520])


**Note:** To get the actual value of an element in the tensor, we can use the .item() method. It only works when the tensor it is called on contains exactly 1 element

In [41]:
a_test=torch.rand(2,2)
print("Original tensor:\n",a_test)
print("\n",a_test[1,1].item())

Original tensor:
 tensor([[0.6195, 0.1377],
        [0.1856, 0.1210]])

 0.12095630168914795
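
The restriction on `.item()` can be seen in the short sketch below (made-up values). The exact exception type raised for a multi-element tensor may differ between PyTorch versions, so both likely types are caught here as an assumption. `.tolist()` is one way to extract several values at once.

```python
import torch

t = torch.tensor([[1.5, 2.5], [3.5, 4.5]])

# .item() works on a single-element tensor
print(t[1, 1].item())        # 4.5

# On a tensor with several elements it raises an error
item_failed = False
try:
    t.item()
except (ValueError, RuntimeError) as err:
    item_failed = True
    print("item() failed:", err)

# .tolist() converts a tensor of any size to (nested) Python lists
print(t.tolist())            # [[1.5, 2.5], [3.5, 4.5]]
```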


**6.) resizing a tensor**

In [42]:
a_test_2=torch.rand(2,4)

a_resize_1=a_test_2.view(-1,2) # The -1 in the first dimension means that pytorch infers that dimension's size from the total number of elements (here 8/2=4)

print("Original Tensor:\n",a_test_2)
print("\nResized Tensor:\n",a_resize_1)

Original Tensor:
 tensor([[0.7621, 0.9147, 0.2647, 0.2112],
        [0.8561, 0.2782, 0.0176, 0.9660]])

Resized Tensor:
 tensor([[0.7621, 0.9147],
        [0.2647, 0.2112],
        [0.8561, 0.2782],
        [0.0176, 0.9660]])
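
Two properties of `.view()` are worth keeping in mind: the result shares memory with the original tensor, and it only works when the memory layout allows it. The sketch below uses made-up tensors and brings in `.reshape()`, which is not used in the cells above, as a fallback:

```python
import torch

a = torch.arange(8).view(2, 4)   # [[0, 1, 2, 3], [4, 5, 6, 7]]

b = a.view(-1, 2)                # -1 is inferred as 8 / 2 = 4
print(b.shape)                   # torch.Size([4, 2])

# A view shares memory with the original tensor
b[0, 0] = 100
print(a[0, 0].item())            # 100

# view needs a compatible memory layout; a transposed tensor is not contiguous
a_t = a.t()
view_failed = False
try:
    a_t.view(-1)
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

# reshape copies the data when a view is not possible
flat = a_t.reshape(-1)
print(flat.shape)                # torch.Size([8])
```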


***Conversions between numpy arrays and torch tensors***

**1.) converting a tensor to a numpy array**  


In [43]:
a_test_3=torch.rand(1,4)

# Converting it into a numpy array 

a_numpy_1 = a_test_3.numpy()

print(a_test_3,"\nData type: ",a_test_3.dtype)
print("\n",a_numpy_1,"\nData type: ",a_numpy_1.dtype)

tensor([[0.3461, 0.5214, 0.0774, 0.2737]]) 
Data type:  torch.float32

 [[0.34607476 0.5214479  0.07736278 0.27366966]] 
Data type:  float32


**Note:** For a tensor on the CPU, the numpy array and the torch tensor share the same memory, so if one is changed, the other one changes as well

**2.) converting a numpy array into a tensor**

In [44]:
np_array_1=np.zeros((3,3))
print("Numpy Array:\n",np_array_1)

# Converting it into a tensor 

torch_ten_1=torch.from_numpy(np_array_1) 

print("\nTorch tensor\n",torch_ten_1)

Numpy Array:
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Torch tensor
 tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float64)
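
The shared memory between numpy arrays and CPU tensors can be checked directly. This is a minimal sketch with made-up variable names; the last part uses `torch.tensor(...)`, which copies the data, as one way to get an independent tensor.

```python
import numpy as np
import torch

# Tensor -> numpy: both objects use the same memory (CPU tensors)
t = torch.ones(3)
n = t.numpy()
t.add_(1)            # in-place change on the tensor
print(n)             # [2. 2. 2.]

# numpy -> tensor with from_numpy: memory is shared as well
arr = np.zeros(3)
t2 = torch.from_numpy(arr)
arr += 5             # in-place change on the array
print(t2)            # tensor([5., 5., 5.], dtype=torch.float64)

# torch.tensor(...) copies the data, so the result is independent
t3 = torch.tensor(arr)
arr += 1
print(t3)            # tensor([5., 5., 5.], dtype=torch.float64)
```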


<hr>

# Autograd

**Note:** To use autograd, we have to create the tensor with the argument requires_grad=True

**Gradient:** In deep learning, the gradient is the rate of change of the loss with respect to each weight; it tells us in which direction and how strongly to change the weights so that the model's predictions improve

**Back Propagation:** It is the process of computing these gradients by propagating the error backwards through the model; the weights are then updated using the gradients to improve the predictions

**1.) Creating the base tensors on which the operations will be performed**

In [45]:
t_grad_1=torch.tensor([5.76],requires_grad=True)
t_grad_2=torch.rand(1,requires_grad=True)

print(t_grad_1)
print(t_grad_2)

tensor([5.7600], requires_grad=True)
tensor([0.0741], requires_grad=True)


**2.) Performing some operations on the tensor to see how the tracking is done**

In [46]:
# Addition 

t_res_add=torch.add(t_grad_1,t_grad_2) #Main tensor

print(t_res_add)

# Subtraction 

t_res_sub=torch.sub(t_grad_1,t_grad_2) #Main tensor

print(t_res_sub) 

# Multiplication 

t_res_mul=torch.mul(t_grad_1,t_grad_2) #Main tensor

print(t_res_mul)

# Division 

t_res_div=torch.div(t_grad_1,t_grad_2) #Main tensor

print(t_res_div)

tensor([5.8341], grad_fn=<AddBackward0>)
tensor([5.6859], grad_fn=<SubBackward0>)
tensor([0.4270], grad_fn=<MulBackward0>)
tensor([77.6940], grad_fn=<DivBackward0>)


**3.) Calculating the gradients for each operation**

**Note:** To call .backward() on a tensor which contains more than 1 value, we have to pass a vector of the same size into the .backward() method as an argument. 

In [47]:
# Calculating and printing the gradients after each operation 
# Note: .grad accumulates across backward() calls, so every pair printed after the first one is a running sum, not the gradient of that operation alone

# Addition 
print("Gradients for addition operation: \n")
t_res_add.backward()
print(t_grad_1.grad)
print(t_grad_2.grad)

# Subtraction 

print("\nGradients for subtraction operation: \n")
t_res_sub.backward()
print(t_grad_1.grad)
print(t_grad_2.grad)

# Multiplication 

print("\nGradients for multiplication operation: \n")
t_res_mul.backward()
print(t_grad_1.grad)
print(t_grad_2.grad) 

# Division 

print("\nGradients for Division operation: \n")
t_res_div.backward()
print(t_grad_1.grad)
print(t_grad_2.grad)


Gradients for addition operation: 

tensor([1.])
tensor([1.])

Gradients for subtraction operation: 

tensor([2.])
tensor([0.])

Gradients for multiplication operation: 

tensor([2.0741])
tensor([5.7600])

Gradients for Division operation: 

tensor([15.5627])
tensor([-1042.2179])


**4.) Creating gradients for tensors with more than 1 element**

In [48]:
# Creating base tensors with requires_grad=True

t_grad_3=torch.randn(10,requires_grad=True)
t_grad_4=torch.randn(10,requires_grad=True)

# Printing all the elements of the tensor 

print(t_grad_3,"\n")
print(t_grad_4)

tensor([ 0.7947, -1.1164, -1.6054,  0.7655,  0.1070, -2.0595,  0.1088,  0.2675,
         0.7315, -1.4014], requires_grad=True) 

tensor([ 0.0104, -1.3171, -1.0139, -0.2205,  0.5254,  1.1900, -0.6731, -0.0298,
         1.4573, -1.3420], requires_grad=True)


*Creating different types of vectors of size 10 to calculate the gradients*

In [49]:
# If we use a vector of ones, we'll get the gradients for each element of the tensor
v_1=torch.ones(10)
v_1

# If we want the gradient of a specific element n of the tensor, we use a vector that contains 1 at the nth position and zeros at all other positions 

v_2=torch.zeros(10)
v_2[4]=1

*Calculating the gradients*

In [52]:
# Performing a simple operation 
t_res_test_1=torch.add(t_grad_3,t_grad_4)
t_res_test_2=torch.sub(t_grad_3,t_grad_4)

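# Using backward to calculate the gradients of all the elements
# (.grad accumulates: if this cell is re-run without resetting .grad first, values from earlier backward() calls are added to the result)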
print("Operation")
t_res_test_1.backward(v_1)
print(t_grad_3.grad,"\n")
print(t_grad_4.grad,"\n")

# Zeroing out the gradients for the base tensors to perform another operation 

t_grad_3.grad=None
t_grad_4.grad=None

# Using backward to calculate the gradient of the 5th element of the tensor 

t_res_test_2.backward(v_2)
print(t_grad_3.grad,"\n")
print(t_grad_4.grad,"\n")


tensor([1., 1., 1., 1., 2., 1., 1., 1., 1., 1.]) 

tensor([1., 1., 1., 1., 0., 1., 1., 1., 1., 1.]) 

tensor([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]) 

tensor([-0., -0., -0., -0., -1., -0., -0., -0., -0., -0.]) 
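
The role of the vector passed to `.backward()` may be easier to see on an operation whose derivative is not constant. In the sketch below (made-up tensors, squaring instead of the addition and subtraction used above), each output element's gradient is weighted by the matching entry of the vector:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y_i = x_i ** 2, so dy_i/dx_i = 2 * x_i
y = x ** 2
y.backward(torch.ones(3))                    # weight every output by 1
grad_all = x.grad.clone()
print(grad_all)                              # tensor([2., 4., 6.])

x.grad = None                                # reset before the next call

y = x ** 2                                   # rebuild the graph
y.backward(torch.tensor([0.0, 1.0, 0.0]))    # select only the 2nd output
grad_one = x.grad.clone()
print(grad_one)                              # tensor([0., 4., 0.])
```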



**5.) Testing multiple methods to detach the tensor from pytorch tracking for gradient**

**Note:** When we don't want pytorch to track the history of tensor operations and compute gradients, we can use one of several methods: .requires_grad_(False), .detach(), or wrapping the operations in a with torch.no_grad() block.

In [58]:
# Creating test tensors with requires_grad=True

t_grad_5=torch.randn(5,requires_grad=True)
t_grad_6=torch.randn(5,requires_grad=True)
t_grad_7=torch.randn(5,requires_grad=True)
print("Test Tensor-1: \n",t_grad_5)
print("\nTest Tensor-2: \n",t_grad_6)
print("\nTest Tensor-3: \n",t_grad_7)



Test Tensor-1: 
 tensor([ 1.1181,  1.9842, -0.5044,  0.8874,  1.1429], requires_grad=True)

Test Tensor-2: 
 tensor([-0.1952, -1.1459, -1.0293,  2.1160, -0.9339], requires_grad=True)

Test Tensor-3: 
 tensor([-0.6892,  0.6875,  0.8057,  0.5373,  0.6222], requires_grad=True)


In [67]:
# Removing the tensor from pytorch tracking 

# 1.) Using .requires_grad_(False) 
t_grad_5.requires_grad_(False)
print("Tensor after using requires_grad method: \n",t_grad_5)

# 2.) Using .detach_() method (in-place version of .detach())
t_grad_6.detach_()
print("\nTensor after detaching: \n",t_grad_6)

Tensor after using requires_grad method: 
 tensor([ 1.1181,  1.9842, -0.5044,  0.8874,  1.1429])

Tensor after detaching: 
 tensor([-0.1952, -1.1459, -1.0293,  2.1160, -0.9339])


In [69]:
print("\nPrinting the tensor before using torch.no_grad(): \n",t_grad_7)

# 3.) Using with torch.no_grad():
with torch.no_grad():
    t_res_nograd_1=t_grad_7.sum()
    print("\nTensor after using torch.no_grad(): \n",t_res_nograd_1)


Printing the tensor before using torch.no_grad(): 
 tensor([-0.6892,  0.6875,  0.8057,  0.5373,  0.6222], requires_grad=True)

Tensor after using torch.no_grad(): 
 tensor(1.9634)
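
The three approaches differ in what they affect. The following sketch, with made-up tensors, inspects the `requires_grad` flag after each one; note that `.detach()` (without the trailing underscore) returns a new tensor and leaves the original untouched.

```python
import torch

x = torch.ones(3, requires_grad=True)

# 1.) detach() returns a new tensor that is cut off from the graph
y = x.detach()
print(y.requires_grad)        # False
print(x.requires_grad)        # True (x itself is unchanged)

# 2.) Inside torch.no_grad(), results are not tracked
with torch.no_grad():
    z = x * 2
print(z.requires_grad)        # False

# Outside the block, tracking is active again
w = x * 2
print(w.requires_grad)        # True

# 3.) requires_grad_(False) switches tracking off in place
x.requires_grad_(False)
print(x.requires_grad)        # False
```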


**Note:** Gradients accumulate across .backward() calls; we can reset them to 0 by using the method .grad.zero_()
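
To see why resetting matters, the sketch below runs the same backward pass three times, first without and then with `.grad.zero_()`. The tensor `weights` and the operation are made up for illustration.

```python
import torch

weights = torch.ones(4, requires_grad=True)

grads_without_reset = []
for _ in range(3):
    out = (weights * 3).sum()
    out.backward()
    grads_without_reset.append(weights.grad.clone())
# d(out)/d(weights) is 3 for each element, but the stored value keeps growing
print([g[0].item() for g in grads_without_reset])   # [3.0, 6.0, 9.0]

weights.grad.zero_()                                # start again from zero

grads_with_reset = []
for _ in range(3):
    out = (weights * 3).sum()
    out.backward()
    grads_with_reset.append(weights.grad.clone())
    weights.grad.zero_()                            # reset after each step
print([g[0].item() for g in grads_with_reset])      # [3.0, 3.0, 3.0]
```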

<hr>

# Backpropagation

**What is Backpropagation?**

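-> Backpropagation is the process of computing the gradient of the loss with respect to each weight by applying the chain rule backwards through the model. The model improves its accuracy by repeatedly adjusting its weights using these gradients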

**What is a computational graph?**

-> A **computational graph** visually represents mathematical operations in a machine-learning model, breaking down complex calculations into smaller, interconnected steps for efficient processing.

**What is the difference between gradients and local gradients?**

-> A gradient is defined as the overall rate of change of the loss function with respect to a model parameter 

-> A local gradient is defined as the rate of change of an intermediate node’s output with respect to its input in a computational graph.

In simple words, we can say that:

-> The gradient determines how much a weight should change to reduce the error, whereas a local gradient is one factor of it; backpropagation multiplies the local gradients together using the chain rule to compute the full gradient.
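
The relationship between local gradients and the full gradient can be worked through by hand on a tiny graph. The sketch below assumes a made-up one-weight model `y_hat = w * x` with a squared-error loss (neither is defined in the text above), multiplies the local gradients manually, and compares the result with what autograd computes.

```python
import torch

x = torch.tensor(1.0)
y = torch.tensor(2.0)
w = torch.tensor(1.0, requires_grad=True)

# Forward pass, one node at a time
y_hat = w * x          # 1.0
s = y_hat - y          # -1.0
loss = s ** 2          # 1.0

# Local gradient of each node with respect to its input
dloss_ds = 2 * s.item()        # d(s^2)/ds = 2s = -2.0
ds_dyhat = 1.0                 # d(y_hat - y)/dy_hat = 1
dyhat_dw = x.item()            # d(w * x)/dw = x = 1.0

# Chain rule: the full gradient is the product of the local gradients
manual_grad = dloss_ds * ds_dyhat * dyhat_dw
print(manual_grad)             # -2.0

# Autograd performs the same computation in backward()
loss.backward()
print(w.grad.item())           # -2.0
```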