<a href="https://colab.research.google.com/github/kameshcodes/deep-learning-codes/blob/main/pytorch_basics_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

$$\textbf{On Day 1, We Learned}$$

**[Day 1 Notebook](https://github.com/kameshcodes/deep-learning-codes/blob/main/pytorch_basics.ipynb)**


---

# $\textbf{1. Tensor Basics}$
---

<br>

### $\textbf{But Why Tensors and Not Arrays or DataFrames?}$


While NumPy arrays and Pandas DataFrames are useful for numerical computation and data manipulation, PyTorch tensors offer several advantages for machine learning, especially when training deep neural networks:

#### GPU Acceleration:
- **Tensors**: PyTorch tensors can be easily transferred to GPUs, enabling faster computations crucial for training large neural networks.
- **NumPy Arrays/Pandas DataFrames**: Primarily designed for CPU operations. GPU support via libraries like CuPy is less seamless than in PyTorch.

#### Automatic Differentiation:
- **Tensors**: PyTorch's `autograd` package works with tensors to automatically compute gradients, essential for backpropagation in neural network training.
- **NumPy Arrays/Pandas DataFrames**: Do not inherently support automatic differentiation, making manual gradient calculations cumbersome and error-prone.
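
Both points can be illustrated with a minimal sketch, assuming only that `torch` and `numpy` are installed (the variable names are illustrative). It picks a GPU when PyTorch can see one and otherwise falls back to the CPU, so it should run on either kind of machine.

```python
import numpy as np
import torch

# Use the GPU when one is visible to PyTorch, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tensor can live on the chosen device and be tracked by autograd.
t = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
t.sum().backward()              # d(sum)/dt is 1 for every element
grad_on_cpu = t.grad.cpu()      # copy the gradient back to the CPU
print(grad_on_cpu)

# A NumPy array holds the same numbers but carries no gradient machinery.
a = np.array([1.0, 2.0, 3.0])
has_grad_attr = hasattr(a, "grad")
print(has_grad_attr)
```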

<br>

In [None]:
import torch

import random
random.seed(1)  # for reproducibility

## $\textbf{1.1 Tensor Creation}$

## $\textbf{1.2 Tensor Operations}$

## $\textbf{1.3 Slicing in Tensors}$

## $\textbf{1.4 Resize a Tensor}$

## $\textbf{1.5 Numpy <=> Tensor}$

## $\textbf{1.6 Creating Tensors on CUDA GPU}$

# $\textbf{2. AutoGrad in Pytorch}$
---

<br>

PyTorch's $\textbf{autograd}$ is a tool that automatically computes the gradients needed in the backpropagation step when training models, particularly neural networks.

- It works by creating a $\textbf{dynamic computational graph}$ as you perform operations, which makes it very flexible and easy to debug.

- To use autograd, you simply set **`requires_grad=True`** on the tensors you want to track. When you perform operations on these tensors, PyTorch keeps track of them. Later, you can call the $\textbf{backward()}$ method on the final result to compute the gradients, which will be stored in the $\textbf{.grad}$ attribute of the original tensors.
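
As a minimal sketch of that workflow (the function $y = x^2 + 2x$ and the value $x = 3$ are chosen only for illustration):

```python
import torch

x = torch.tensor(3.0, requires_grad=True)  # ask autograd to track x
y = x ** 2 + 2 * x                         # forward pass records the operations
y.backward()                               # backward pass computes dy/dx

# Analytically dy/dx = 2*x + 2, which is 8 at x = 3.
print(x.grad)
```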

<br>

$\text{What is a computational graph?}$

A computational graph is a record of the operations (addition, multiplication, etc.) performed on tensors during the forward pass of a neural network.
- A computational graph is typically represented as a $\text{Directed Acyclic Graph (DAG)}$.
- In PyTorch, computational graphs are dynamic: they are built on the fly as operations execute during the forward pass.
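
The graph can be inspected through the `grad_fn` attribute that PyTorch attaches to the result of each tracked operation. The sketch below uses two illustrative tensors `a` and `b`; the exact class names of the `grad_fn` objects are an implementation detail and may vary between PyTorch versions.

```python
import torch

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
c = a * b      # multiplication node
d = c + a      # addition node that depends on c and a

# Tensors created directly by the user are leaves and have no grad_fn.
print(a.is_leaf, a.grad_fn)

# Results of operations remember the function that produced them.
print(type(c.grad_fn).__name__)
print(type(d.grad_fn).__name__)

# Each node points back to the nodes that produced its inputs (the DAG edges).
n_inputs = len(d.grad_fn.next_functions)
print(n_inputs)
```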


## $\textbf{2.1 Initialize Autograd for a tensor}$

## $\textbf{2.2 Visualizing Computation Graph in PyTorch}$

## $\textbf{2.3 Gradient Calculation}$

# **Day 2**

$$\textbf{Pytorch Day 2}$$

---


# $\textbf{3. BackPropagation}$

$\text{For every operation we perform on tensors, PyTorch creates a computational graph for us.}$

$\text{Suppose we have two tensors x and y:}$
- $\text{We apply an operation f(.) to tensors x and y to get a node representing z=f(x,y).}$
- $\text{At each node z=f(x,y), we compute local gradients, which the chain rule later combines into the final gradient.}$
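
As a sketch of how the local gradients are used, write $L$ for the final loss (a symbol introduced here only for illustration). The chain rule multiplies the gradient arriving at the node by each local gradient:

$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial z}\,\frac{\partial z}{\partial x}, \qquad \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z}\,\frac{\partial z}{\partial y}$$

For instance, if $f(x, y) = x\,y$, the local gradients are $\frac{\partial z}{\partial x} = y$ and $\frac{\partial z}{\partial y} = x$.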



<br>

<center>

![Img](https://polakowo.io/datadocs/assets/1*q1M7LGiDTirwU-4LcFq7_Q.png)

</center>


<br>
<br>
<br>

$\text{The whole concept consists of 3 steps:}$

1. **Forward Pass:** Compute $Loss$
2. **Compute local gradients**
3. **Backward Pass:**  Compute $\frac{d(Loss)}{d\text{weights}}$ using the chain rule.
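
The three steps can be traced by hand in plain Python. This sketch assumes a hypothetical one-weight model $\hat{y} = w x$ with squared-error loss and the illustrative values $x = 1$, $y = 2$, $w = 1$; no autograd is involved.

```python
# Hypothetical scalar example: y_hat = w * x, loss = (y_hat - y) ** 2
x, y, w = 1.0, 2.0, 1.0

# 1. Forward pass: compute the loss
y_hat = w * x
s = y_hat - y
loss = s ** 2

# 2. Local gradients at each node
dloss_ds = 2 * s      # d(s**2)/ds
ds_dyhat = 1.0        # d(y_hat - y)/d(y_hat)
dyhat_dw = x          # d(w * x)/dw

# 3. Backward pass: multiply the local gradients along the path (chain rule)
dloss_dw = dloss_ds * ds_dyhat * dyhat_dw
print(dloss_dw)  # -2.0
```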


## $\textbf{3.1 Backprop in Pytorch}$
---

In [None]:
import torch

In [None]:
# Let's say we have data x, y
x = torch.tensor(1.0)
y = torch.tensor(2.0) # note: autograd stays off for the data, since only the weights are updated

# initialize the weight (fixed at 1.0 here)
w = torch.tensor(1.0, requires_grad=True)

# Step 1: Forward Pass
y_hat = w*x
loss = (y_hat - y)**2

# Step 2: Pytorch will calculate gradient automatically since autograd is on
# Step 3: backward pass
loss.backward()
print(w.grad)

tensor(-2.)


$\text{We then update the weights and repeat the forward and backward passes for several epochs to reach the optimal weights.}$

<br>

## $\textbf{3.2 Gradient Descent in Linear Regression}$
---

### $\text{3.2.1 SLR implementation With Numpy}$

Let's implement simple linear regression from scratch using $NumPy$

In [None]:
import numpy as np

In [None]:
def forward(x):
  return w*x

def loss(y, y_pred):
  return ((y-y_pred)**2).mean()

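# gradient
# MSE: J = (1/N) * sum((w*x - y)**2)
# Gradient: dJ/dw = (1/N) * sum(2 * x * (w*x - y))

def gradient(x,y, y_pred):
  # NOTE: np.dot already sums over the samples and returns a scalar, so .mean()
  # is a no-op here; this returns N times the MSE gradient written above.
  return np.dot(2*x, y_pred-y).mean()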

In [None]:
# Let's assume f = 2*x is the true underlying function
X = np.array([1,2,3,4], dtype=np.float32)
y = np.array(2*X, dtype=np.float32)


w = 0.0  #Initializing weights as zero
lr = 0.005
epochs = 20



print(f"\nPrediction before training: f(5) = {forward(5):.3f}\n")



for epoch in range(epochs):
  y_pred = forward(X)
  l = loss(y, y_pred)
  dw = gradient(X, y, y_pred)
  w = w - lr*dw
  if epoch%1 == 0:
    print(f"epoch {epoch+1}: w = {w:.3f} and loss = {l:.5f}")

print(f"\nPrediction after training: f(5) = {forward(5):.3f}\n")


Prediction before training: f(5) = 0.000

epoch 1: w = 0.600 and loss = 30.00000
epoch 2: w = 1.020 and loss = 14.70000
epoch 3: w = 1.314 and loss = 7.20300
epoch 4: w = 1.520 and loss = 3.52947
epoch 5: w = 1.664 and loss = 1.72944
epoch 6: w = 1.765 and loss = 0.84743
epoch 7: w = 1.835 and loss = 0.41524
epoch 8: w = 1.885 and loss = 0.20347
epoch 9: w = 1.919 and loss = 0.09970
epoch 10: w = 1.944 and loss = 0.04885
epoch 11: w = 1.960 and loss = 0.02394
epoch 12: w = 1.972 and loss = 0.01173
epoch 13: w = 1.981 and loss = 0.00575
epoch 14: w = 1.986 and loss = 0.00282
epoch 15: w = 1.991 and loss = 0.00138
epoch 16: w = 1.993 and loss = 0.00068
epoch 17: w = 1.995 and loss = 0.00033
epoch 18: w = 1.997 and loss = 0.00016
epoch 19: w = 1.998 and loss = 0.00008
epoch 20: w = 1.998 and loss = 0.00004

Prediction after training: f(5) = 9.992
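
The first update in the log above can be reproduced by hand. This sketch, which assumes only `numpy`, evaluates the `np.dot`-based expression used in `gradient` next to a per-sample average at $w = 0$; `grad_sum` and `grad_mean` are illustrative names.

```python
import numpy as np

X = np.array([1, 2, 3, 4], dtype=np.float32)
y = 2 * X
w = 0.0
y_pred = w * X

# np.dot multiplies element-wise and SUMS over the samples.
grad_sum = np.dot(2 * X, y_pred - y)

# Averaging the per-sample terms gives the gradient of the mean squared error.
grad_mean = (2 * X * (y_pred - y)).mean()

print(grad_sum, grad_mean)  # -120.0 -30.0
```

With `lr = 0.005`, the summed gradient moves `w` from 0 to 0.600 in the first step, which matches the first line of the log; the averaged gradient would move it to 0.150.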



### $\text{3.2.2 SLR implementation With Pytorch}$

Let's implement simple linear regression from scratch using PyTorch $Autograd$

In [None]:
def forward(x):
  return w*x

def loss(y, y_pred):
  return ((y-y_pred)**2).mean()

# gradient
# MSE: J = (1/N) * sum((w*x - y)**2)
# Gradient: dJ/dw = (1/N) * sum(2 * x * (w*x - y))

# def gradient(x,y, y_pred):
#   return np.dot(2*x, y_pred-y).mean() # not needed: PyTorch computes the gradient automatically

In [None]:
# Let's assume f = 2*x is the true underlying function
X = torch.tensor([1,2,3,4], dtype=torch.float32)
y = 2*X


w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)  #Initializing weights as zero

lr = 0.005
epochs = 20


print(f"\nPrediction before training: f(5) = {forward(5):.3f}\n")


for epoch in range(epochs):
  # forward pass
  y_pred = forward(X)
  l = loss(y, y_pred)

  # backward pass
  l.backward() # dl/dw: calculate the gradient

  with torch.no_grad():
      w-=(lr*w.grad)

  #zero the grad
  w.grad.zero_()

  if epoch%1 == 0:
      print(f"epoch {epoch+1}: w = {w:.3f} and loss = {l:.5f}")

print(f"\nPrediction after training: f(5) = {forward(5):.3f}\n")


Prediction before training: f(5) = 0.000

epoch 1: w = 0.150 and loss = 30.00000
epoch 2: w = 0.289 and loss = 25.66875
epoch 3: w = 0.417 and loss = 21.96283
epoch 4: w = 0.536 and loss = 18.79194
epoch 5: w = 0.646 and loss = 16.07886
epoch 6: w = 0.747 and loss = 13.75747
epoch 7: w = 0.841 and loss = 11.77124
epoch 8: w = 0.928 and loss = 10.07176
epoch 9: w = 1.008 and loss = 8.61765
epoch 10: w = 1.083 and loss = 7.37348
epoch 11: w = 1.152 and loss = 6.30893
epoch 12: w = 1.215 and loss = 5.39808
epoch 13: w = 1.274 and loss = 4.61873
epoch 14: w = 1.329 and loss = 3.95190
epoch 15: w = 1.379 and loss = 3.38135
epoch 16: w = 1.425 and loss = 2.89317
epoch 17: w = 1.469 and loss = 2.47547
epoch 18: w = 1.508 and loss = 2.11807
epoch 19: w = 1.545 and loss = 1.81227
epoch 20: w = 1.579 and loss = 1.55063

Prediction after training: f(5) = 7.897
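
The loop zeroes `w.grad` after each update because `backward()` adds to `.grad` rather than overwriting it. A minimal sketch with a hypothetical scalar weight and the toy function $2w$ shows the accumulation:

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(w * 2).backward()
first = w.grad.item()          # 2.0

(w * 2).backward()             # no reset in between
accumulated = w.grad.item()    # 4.0: the new gradient was added to the old one

w.grad.zero_()                 # reset, as the training loop does
(w * 2).backward()
after_reset = w.grad.item()    # 2.0 again

print(first, accumulated, after_reset)
```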



In [None]:
# Let's assume f = 2*x is the true underlying function
X = torch.tensor([1,2,3,4], dtype=torch.float32)
y = 2*X


w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)  #Initializing weights as zero

lr = 0.005
epochs = 20


print(f"\nPrediction before training: f(5) = {forward(5):.3f}\n")


for epoch in range(epochs):
  # forward pass
  y_pred = forward(X)
  l = loss(y, y_pred)

  # backward pass
  l.backward() # dl/dw: calculate the gradient

  with torch.no_grad():
      w=w-(lr*w.grad)  # not in-place: rebinds w to a new tensor (this triggers the error below)

  #zero the grad
  w.grad.zero_()

  if epoch%1 == 0:
      print(f"epoch {epoch+1}: w = {w:.3f} and loss = {l:.5f}")

print(f"\nPrediction after training: f(5) = {forward(5):.3f}\n")


Prediction before training: f(5) = 0.000



AttributeError: 'NoneType' object has no attribute 'zero_'

$\text{Remarks: }$

----

- Autograd does not make the gradients less accurate. It applies the chain rule to the operations recorded during the forward pass, so up to floating-point rounding it yields the exact gradient of the loss. The PyTorch run converges more slowly than the NumPy run for a different reason: the NumPy `gradient` function uses `np.dot`, which sums over the 4 samples instead of averaging them, so its trailing `.mean()` has no effect and every NumPy step is 4 times larger than a true MSE gradient step (compare `w = 0.600` with `w = 0.150` after epoch 1). With the same learning rate, the autograd version follows the actual mean-squared-error gradient and therefore takes smaller steps.

- Always update the weights in place with `w -= lr*w.grad`, and **not** with `w = w - lr*w.grad`. The latter raises the error shown above: the assignment rebinds the name `w` to a new tensor, and because that tensor is created inside `torch.no_grad()`, it has `requires_grad=False` and its `.grad` is `None`. The new `w` is no longer the leaf tensor that autograd accumulates gradients for, so `w.grad.zero_()` fails and later backward passes could not update it.
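
The difference between the two update styles can be checked directly. This is a minimal sketch with a hypothetical weight, toy loss, and step size of 0.01, not the training loop itself:

```python
import torch

w = torch.tensor(0.0, requires_grad=True)
loss = (w * 3.0 - 6.0) ** 2
loss.backward()                       # w.grad = 2 * (0*3 - 6) * 3 = -36

original_id = id(w)
with torch.no_grad():
    w -= 0.01 * w.grad                # in-place: the same tensor object is modified
same_object = id(w) == original_id
still_tracked = w.requires_grad
has_grad = w.grad is not None
print(same_object, still_tracked, has_grad)

with torch.no_grad():
    w_new = w - 0.01 * w.grad         # out-of-place: a brand-new tensor
# Created under no_grad, the new tensor is not tracked and has no gradient,
# so calling w_new.grad.zero_() would raise AttributeError.
print(w_new.requires_grad, w_new.grad)
```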

<br>

### $\text{3.2.3 Training Pipeline: Model, Loss, and Optimizer}$


$\text{General Training Pipeline in PyTorch:}$

1. Design the model: (input_size, output_size, forward_pass)
2. Construct the loss and the optimizer
3. Training loop
   - forward pass: compute the prediction
   - backward pass: compute the gradients
   - update the weights

In [None]:
import torch
import torch.nn as nn

In [None]:
# Define input tensor X and target tensor y
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
y = 2 * X

# Define a test input tensor X_test
X_test = torch.tensor([5], dtype=torch.float32)

# Set learning rate and number of epochs
lr = 0.05
epochs = 20

In [None]:
# Get the number of samples and features from input tensor X
n_samples, n_features = X.shape
input_size = n_features
output_size = 1

# Define a simple linear regression model
model = nn.Linear(input_size, output_size)
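
For reference, `nn.Linear` stores a weight matrix and a bias vector and computes `x @ weight.T + bias`. The sketch below checks this on a freshly initialized layer; `layer` and `manual` are illustrative names, and the parameter values are random.

```python
import torch
import torch.nn as nn

layer = nn.Linear(1, 1)   # one input feature, one output feature
X = torch.tensor([[1.0], [2.0], [3.0], [4.0]])

# Recompute the layer output from its parameters.
manual = X @ layer.weight.T + layer.bias
same = torch.allclose(layer(X), manual)

shapes = (tuple(layer.weight.shape), tuple(layer.bias.shape))
print(same, shapes)
```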

In [None]:
# Define mean squared error loss function
loss = nn.MSELoss()

# Define stochastic gradient descent optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
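
With its default settings (no momentum, no weight decay), `torch.optim.SGD` performs the same update as the manual loop earlier: each parameter `p` becomes `p - lr * p.grad`. A minimal check on a hypothetical two-sample batch, assuming those defaults:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(1, 1)
lr = 0.05
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

X = torch.tensor([[1.0], [2.0]])
y = 2 * X
loss = nn.MSELoss()(model(X), y)
loss.backward()

# What the plain gradient-descent rule predicts for every parameter.
expected = [p.detach() - lr * p.grad for p in model.parameters()]

optimizer.step()  # let the optimizer apply the update in place

matches = all(
    torch.allclose(p.detach(), e) for p, e in zip(model.parameters(), expected)
)
print(matches)
```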

In [None]:
# Training loop
for epoch in range(epochs):
    # Forward pass: predict y using the model
    y_pred = model(X)

    # Calculate the loss (nn.MSELoss takes the prediction first, then the target)
    l = loss(y_pred, y)

    # Backward pass: compute gradients
    l.backward()

    # Update model parameters
    optimizer.step()

    # Reset gradients to zero
    optimizer.zero_grad()

    # Print progress every epoch
    if epoch % 1 == 0:
        w, b = model.parameters()
        print(f"epoch {epoch + 1}: w = {w[0].item():.3f} and loss = {l:.5f}")

# Print prediction for the test input after training
print(f"\nPrediction after training: {model(X_test).item():.3f}\n")


epoch 1: w = 1.467 and loss = 37.92918
epoch 2: w = 1.748 and loss = 1.08841
epoch 3: w = 1.797 and loss = 0.08412
epoch 4: w = 1.807 and loss = 0.05517
epoch 5: w = 1.811 and loss = 0.05281
epoch 6: w = 1.814 and loss = 0.05122
epoch 7: w = 1.817 and loss = 0.04970
epoch 8: w = 1.820 and loss = 0.04822
epoch 9: w = 1.823 and loss = 0.04679
epoch 10: w = 1.825 and loss = 0.04540
epoch 11: w = 1.828 and loss = 0.04405
epoch 12: w = 1.831 and loss = 0.04274
epoch 13: w = 1.833 and loss = 0.04147
epoch 14: w = 1.836 and loss = 0.04024
epoch 15: w = 1.838 and loss = 0.03904
epoch 16: w = 1.840 and loss = 0.03788
epoch 17: w = 1.843 and loss = 0.03676
epoch 18: w = 1.845 and loss = 0.03566
epoch 19: w = 1.847 and loss = 0.03460
epoch 20: w = 1.850 and loss = 0.03358

Prediction after training: 9.691



### $\text{3.2.4 Custom Linear Regression Model}$

In [None]:
torch.manual_seed(42)



# data
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
y = 2 * X

X_test = torch.tensor([5], dtype=torch.float32)

lr = 0.05
epochs = 20





class LinearRegression(nn.Module):
  def __init__(self, input_size, output_size):
    super(LinearRegression, self).__init__()
    #define model layers
    self.linear = nn.Linear(input_size, output_size)

  def forward(self, x):
    return self.linear(x)

n_samples, n_features = X.shape
input_size = n_features
output_size = 1

model = LinearRegression(input_size, output_size)





# Define mean squared error loss function
loss = nn.MSELoss()

# Define stochastic gradient descent optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=lr)






for epoch in range(epochs):
    # Forward pass: predict y using the model
    y_pred = model(X)

    # Calculate the loss (nn.MSELoss takes the prediction first, then the target)
    l = loss(y_pred, y)

    # Backward pass: compute gradients
    l.backward()

    # Update model parameters
    optimizer.step()

    # Reset gradients to zero
    optimizer.zero_grad()

    # Print progress every epoch
    if epoch % 1 == 0:
        w, b = model.parameters()
        print(f"epoch {epoch + 1}: w = {w[0].item():.3f} and loss = {l:.5f}")

# Print prediction for the test input after training
print(f"\nPrediction after training: {model(X_test).item():.3f}\n")


epoch 1: w = 1.484 and loss = 7.00944
epoch 2: w = 1.607 and loss = 0.38854
epoch 3: w = 1.632 and loss = 0.20248
epoch 4: w = 1.641 and loss = 0.19171
epoch 5: w = 1.646 and loss = 0.18588
epoch 6: w = 1.652 and loss = 0.18036
epoch 7: w = 1.657 and loss = 0.17500
epoch 8: w = 1.662 and loss = 0.16980
epoch 9: w = 1.667 and loss = 0.16475
epoch 10: w = 1.672 and loss = 0.15986
epoch 11: w = 1.677 and loss = 0.15511
epoch 12: w = 1.682 and loss = 0.15050
epoch 13: w = 1.687 and loss = 0.14603
epoch 14: w = 1.691 and loss = 0.14169
epoch 15: w = 1.696 and loss = 0.13748
epoch 16: w = 1.701 and loss = 0.13339
epoch 17: w = 1.705 and loss = 0.12943
epoch 18: w = 1.709 and loss = 0.12558
epoch 19: w = 1.714 and loss = 0.12185
epoch 20: w = 1.718 and loss = 0.11823

Prediction after training: 9.419
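
The optimizer can see the custom model's weights because `nn.Module` registers the parameters of any sub-module assigned as an attribute. The sketch below repeats the class definition so that it runs on its own and lists what `model.parameters()` would hand to the optimizer:

```python
import torch
import torch.nn as nn

class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()
        # Assigning a layer as an attribute registers its parameters.
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        return self.linear(x)

model = LinearRegression(1, 1)

names = [name for name, _ in model.named_parameters()]
n_params = sum(p.numel() for p in model.parameters())
print(names)     # ['linear.weight', 'linear.bias']
print(n_params)  # 2
```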

