<a href="https://colab.research.google.com/github/Shreejan-git/pytorch-complete-course/blob/main/pytorch_nn_basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import torch
from torch import nn
import matplotlib.pyplot as plt

In [2]:
print(torch.__version__)

2.0.1+cu118


In [3]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


**Notes while creating a custom model:**


1.   A custom neural network module must inherit from nn.Module. https://pytorch.org/docs/stable/generated/torch.nn.Module.html
2.   Always call the parent class constructor with `super().__init__()`. This step is mandatory: it sets up the internal bookkeeping that nn.Module uses to register parameters and submodules. If we skip it, assigning an nn.Parameter raises an error, so the model cannot be trained or saved. (see point 4)
3. A class that inherits from nn.Module must define a forward method. Later, while training, calling the model on the data with model(X_train) runs the forward pass through this method automatically. Note that model.train() takes no data; it only switches the model to training mode.
4. Below, nn.Parameter marks a tensor as a learnable parameter. The module registers it, so it is returned by model.parameters() and updated by the optimizer during training.
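
The notes above can be checked with a small sketch. `TinyModule` is a hypothetical class used only for illustration (it is not part of this notebook); it wraps one tensor in nn.Parameter, keeps another as a plain attribute, and then shows which of the two the module tracks and that calling the module runs forward.

```python
import torch
from torch import nn


class TinyModule(nn.Module):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()  # must run before any parameter is assigned
        self.scale = nn.Parameter(torch.ones(1))  # registered as learnable
        self.offset = torch.zeros(1)  # plain tensor: NOT registered

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scale * x + self.offset


tiny = TinyModule()

# Only tensors wrapped in nn.Parameter show up as parameters.
param_names = [name for name, _ in tiny.named_parameters()]
print(param_names)

# Calling the module like a function runs forward() for us.
out = tiny(torch.tensor([1.0, 2.0]))
print(out)
```

Only `scale` should appear in the printed list, which is why a tensor that is meant to be learned has to go through nn.Parameter.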


In [10]:
class SimpleLinearRegression(nn.Module):
  def __init__(self):
    super().__init__()
    self.weight = nn.Parameter(torch.rand(1,
                               requires_grad=True,
                               dtype=torch.float32))
    self.bias = nn.Parameter(torch.rand(1,
                             requires_grad=True,
                             dtype=torch.float32))

  def forward(self, X: torch.Tensor) -> torch.Tensor:
    return self.weight * X + self.bias

In [11]:
model = SimpleLinearRegression() # Initializing the model

**Resources for loss and optimization functions**

PyTorch loss functions: https://pytorch.org/docs/stable/nn.html#loss-functions , https://neptune.ai/blog/pytorch-loss-functions

PyTorch optimization functions: https://pytorch.org/docs/stable/optim.html

In [7]:
# Defining the loss function.
# Named loss_fn so it matches the name used in the training loop.
loss_fn = nn.L1Loss() # L1Loss is the MAE (mean absolute error) loss function.
print(loss_fn)

L1Loss()
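
To make "MAE" concrete, here is a minimal sketch that compares nn.L1Loss against the mean absolute error computed by hand. The prediction and target values are made up for illustration.

```python
import torch
from torch import nn

mae = nn.L1Loss()  # default reduction is "mean"

pred = torch.tensor([2.0, 4.0, 6.0])    # made-up predictions
target = torch.tensor([1.0, 4.0, 8.0])  # made-up targets

loss_value = mae(pred, target)

# Same quantity by hand: (|1| + |0| + |-2|) / 3
manual = (pred - target).abs().mean()

print(loss_value.item())  # 1.0
print(manual.item())      # 1.0
```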


We can use any optimizer, such as Adam, SGD, or Adagrad.

The 1st parameter of SGD is the model's parameters. We can get them by writing model.parameters().

The 2nd parameter is the learning rate (lr). The higher the value, the bigger each step the optimizer takes along the negative gradient toward a minimum of the loss, and vice versa.
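
The effect of lr can be seen in a single update step. The sketch below is an illustration under simple assumptions: a lone tensor `w` starting at 2.0 and the toy loss w**2 (neither is part of this notebook), so the gradient is 4.0 and plain SGD moves `w` to w - lr * 4.0. The helper `one_sgd_step` is a hypothetical name.

```python
import torch


def one_sgd_step(lr: float) -> float:
    """Take one SGD step on the toy loss w**2, starting from w = 2.0."""
    w = torch.tensor([2.0], requires_grad=True)
    optimizer = torch.optim.SGD(params=[w], lr=lr)

    loss = (w ** 2).sum()
    optimizer.zero_grad()
    loss.backward()   # d(loss)/dw = 2 * w = 4.0
    optimizer.step()  # w <- w - lr * 4.0
    return w.item()


small = one_sgd_step(lr=0.01)  # roughly 1.96: a small step
large = one_sgd_step(lr=0.1)   # roughly 1.6: a ten times bigger step
print(small, large)
```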

In [6]:
# Defining the optimization function.
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.001)


**Setting the training and evaluation loop**

In [8]:
epochs = 1

**Below code explanation:**


1.   Line 3: model.train() This line sets the PyTorch model (model) to training mode. It does not affect gradient tracking; it changes the behavior of layers that act differently during training and evaluation. For example, dropout (if we have one) is active in training mode and disabled in evaluation mode, while batch normalization uses batch statistics in training mode and its stored running statistics in evaluation mode.
2.   Line 6: y_pred = model(X_train) This line performs a forward pass through the model using the training data X_train. It computes the predicted values (y_pred) from the current model parameters, which the loss is then measured against.
3. Line 12 [IMPORTANT] In PyTorch, a backward pass (loss.backward()) adds the new gradients to whatever is already stored in each parameter's .grad attribute. When we train with a single loss function and a single backward pass per iteration, we want only the current iteration's gradients, so we reset them with optimizer.zero_grad() before each backward pass. Without this, gradients from previous iterations accumulate, the parameter updates become incorrect, and training can become unstable or diverge.
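
The accumulation described in point 3 is easy to observe directly. This is a minimal sketch with a single made-up tensor `w` and the toy expression 3 * w, whose gradient with respect to `w` is always 3.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD(params=[w], lr=0.1)

# First backward pass: d(3 * w)/dw = 3
(3 * w).sum().backward()
first = w.grad.clone()

# Second backward pass WITHOUT zeroing: the new gradient is added to the old one
(3 * w).sum().backward()
accumulated = w.grad.clone()

# Clearing the gradients first gives the expected value again
optimizer.zero_grad()
(3 * w).sum().backward()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)  # tensor([3.]) tensor([6.]) tensor([3.])
```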



In [None]:
for epoch in range(epochs):
    # Set the model to training mode
    model.train()

    # Forward pass
    y_pred = model(X_train)

    # Calculate the loss
    loss = loss_fn(y_pred, y_train)

    # Optimizer zero grad
    optimizer.zero_grad()

    # Perform backpropagation on the loss with respect to the parameters of the model.
    # Gradients are computed for the tensors that have requires_grad=True.
    loss.backward()

    # step the optimizer (perform gradient descent)
    optimizer.step()

    # testing
    model.eval()