<a href="https://colab.research.google.com/github/Tankasala25/PyTorch/blob/main/FirstModel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simple Definition of `torch.nn`

`torch.nn` is the PyTorch module that provides the basic building blocks for creating neural networks, including layers, activation functions, and loss functions.
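
As a minimal sketch of these building blocks, the snippet below combines one layer, one activation function, and one loss function. The layer sizes and input values are arbitrary choices for illustration and are not taken from the rest of this notebook.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=3, out_features=2)  # a layer with learnable weights
activation = nn.ReLU()                            # an activation function
loss_fn = nn.MSELoss()                            # a loss function

inputs = torch.ones(4, 3)    # batch of 4 samples, 3 features each
targets = torch.zeros(4, 2)  # one 2-value target per sample

outputs = activation(layer(inputs))
loss_value = loss_fn(outputs, targets)

print(outputs.shape)     # torch.Size([4, 2])
print(loss_value.dim())  # 0, because the loss is a scalar tensor
```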

## 1. Optimizer
The optimizer **updates the model's weights** using the gradients.  
Example: SGD, Adam.
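
A small sketch of a single optimizer update, assuming a hypothetical one-element weight `w` and the toy loss `w ** 2` (both chosen only so the arithmetic is easy to follow):

```python
import torch

# Hypothetical single weight; the values are chosen only for illustration.
w = torch.nn.Parameter(torch.tensor([1.0]))
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()   # d(loss)/dw = 2 * w = 2.0
loss.backward()         # fills w.grad
optimizer.step()        # w <- w - lr * w.grad = 1.0 - 0.1 * 2.0
optimizer.zero_grad()   # reset the gradient before the next step

print(w.item())  # approximately 0.8
```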

---

## 2. backward() Function
The `backward()` function **calculates the gradients** of the loss with respect to each weight.  
Each gradient is stored in the corresponding parameter's `.grad` attribute, where the optimizer reads it.
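
To make this concrete, here is a minimal sketch with one illustrative weight and one data point (the numbers are assumptions picked for easy arithmetic). It shows that the gradient only exists after `backward()` has run:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # illustrative starting weight
x, y = torch.tensor(3.0), torch.tensor(9.0)

loss = (w * x - y) ** 2   # (2*3 - 9)^2 = 9
print(w.grad)             # None: no gradient has been computed yet

loss.backward()           # d(loss)/dw = 2 * (w*x - y) * x = 2 * (-3) * 3
print(w.grad)             # tensor(-18.)
```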

---

## 3. Gradient Descent
Gradient Descent is the overall **process** where:
- `backward()` computes gradients  
- the optimizer moves each weight a small step against its gradient, scaled by the learning rate  
- and repeating these steps gradually reduces the loss  

In simple words:  
**Gradient Descent = calculate gradients + update weights to reduce loss.**
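
The same idea can be written without PyTorch at all. The sketch below is plain Python that computes the gradient of the mean squared error by hand for the data `y = 3 * x`. The starting weight of `0.0` is an arbitrary assumption.

```python
# Toy data following y = 3 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0               # arbitrary starting weight (an assumption for this sketch)
learning_rate = 0.01

def mse(weight):
    """Mean squared error of the predictions weight * x against y."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

for step in range(100):
    # Gradient of the MSE with respect to w: mean of 2 * x * (w*x - y)
    grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
    # Update rule: move the weight against the gradient
    w -= learning_rate * grad

print(round(w, 3), round(mse(w), 6))  # 3.0 0.0
```

Each iteration shrinks the distance between `w` and 3 by the same factor, which is why the loss falls quickly at first and then flattens out.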


In [2]:
import torch
import torch.nn as nn


x=torch.tensor([1,2,3,4],dtype=torch.float32)
y=torch.tensor([3,6,9,12],dtype=torch.float32)

class Linearmodel(nn.Module):
  def __init__(self):
    super().__init__()
    self.weight=nn.Parameter(torch.randn(1,requires_grad=True,dtype=torch.float32))

  def forward(self,x):
    return self.weight*x

model=Linearmodel()

loss=nn.MSELoss()
learning_rate=0.01
optimizer=torch.optim.SGD(model.parameters(),lr=learning_rate)
iters=100

print(f"prediction before training: f(5)= {model(torch.tensor(5.0)).item():.3f}")

for epoch in range(iters):

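  # Forward pass: compute predictions
  y_pred=model(x)

  # Compute the loss (MSELoss expects the prediction first, then the target)
  l=loss(y_pred,y)

  # Backward pass: compute the gradient of the loss w.r.t. the weight
  l.backward()

  # Update the weight using the gradient
  optimizer.step()

  # Reset the gradient so it does not accumulate across iterations
  optimizer.zero_grad()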

  if epoch%10==0:
    print(f"Epoch {epoch+1}: weight={model.weight.item():.3f}, loss={l.item():.8f}")

print(f"prediction After training: f(5)= {model(torch.tensor(5.0)).item():.3f}")

prediction before training: f(5)= -2.347
Epoch 1: weight=0.051, loss=90.28034973
Epoch 11: weight=2.419, loss=3.49922514
Epoch 21: weight=2.886, loss=0.13562892
Epoch 31: weight=2.977, loss=0.00525692
Epoch 41: weight=2.996, loss=0.00020374
Epoch 51: weight=2.999, loss=0.00000790
Epoch 61: weight=3.000, loss=0.00000031
Epoch 71: weight=3.000, loss=0.00000001
Epoch 81: weight=3.000, loss=0.00000000
Epoch 91: weight=3.000, loss=0.00000000
prediction After training: f(5)= 15.000



---

# 📘 **Cell 4: Difference Between Saving the Entire Model and the Model State Dictionary**

## ✅ Side-by-Side Comparison

| Feature | Saving Entire Model | Saving `state_dict` |
|--------|---------------------|----------------------|
| Saves architecture | ✔ Partly (pickles a reference to the class, not its code) | ❌ No |
| Saves weights | ✔ Yes | ✔ Yes |
| Loading requires class definition | ✔ Yes (importable from the same module path) | ✔ Yes |
| Version stability | ❌ Low | ✔ High |
| Production use (Triton, ONNX) | ❌ Not recommended | ✔ Recommended |
| File size | Larger | Smaller |
| PyTorch recommended | ❌ No | ✔ Yes |


In [3]:
model.state_dict()

OrderedDict([('weight', tensor([3.0000]))])

In [5]:
from pathlib import Path

# Create the directory that will hold saved models
MODEL_Path=Path("Models")
MODEL_Path.mkdir(parents=True,exist_ok=True)

# Build the full path of the file to save
Model_name= "Pytorch_FirstModel.pth"
Model_save_Path=MODEL_Path/Model_name

# Save only the state_dict (the learned weight), not the whole model object
print(f"Saving model to path :{Model_save_Path}")
torch.save(obj=model.state_dict(),f=Model_save_Path)


Saving model to path :Models/Pytorch_FirstModel.pth


In [6]:
!ls -l Models

total 4
-rw-r--r-- 1 root root 1718 Nov 18 18:05 Pytorch_FirstModel.pth


# Loading Models in PyTorch: All Key Points

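## 1. Loading the Entire Model
- Use this when the model was saved using: `torch.save(model, "model.pth")`
- Load it directly with: `model = torch.load("model.pth", weights_only=False)` (PyTorch 2.6 and later default to `weights_only=True`, which rejects arbitrary pickled objects such as a full model)
- Restores the module object together with its weights
- Still requires the model class definition to be importable at load time, because pickle stores a reference to the class rather than its code
- Very easy but NOT recommended for production use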
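
A minimal sketch of this round trip, assuming a PyTorch release in which `torch.load` accepts the `weights_only` argument. `nn.Linear` stands in for a custom model class, and the file name is hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

# nn.Linear stands in for a user-defined model class in this sketch.
model = nn.Linear(1, 1)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pth")  # hypothetical file name
    torch.save(model, path)                    # pickles the whole module object

    # weights_only=False allows unpickling arbitrary objects,
    # so only use it on files from a trusted source.
    loaded = torch.load(path, weights_only=False)

print(type(loaded).__name__)                     # Linear
print(torch.equal(loaded.weight, model.weight))  # True
```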

---

## 2. Loading a Model Using state_dict (Recommended)
- Use when saved using: `torch.save(model.state_dict(), "model_state.pth")`
- Requires recreating the exact same model architecture
- Steps:
  1. Define the model class
  2. Create an instance of the model
  3. Load weights using: `model.load_state_dict(torch.load("model_state.pth"))`
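- Most stable and production-ready, because the file holds only tensors keyed by parameter name
- The restored model instance can then be handed to deployment tooling such as ONNX, TorchScript, TensorRT, or Triton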

---

## 3. Loading a Model on CPU or GPU
- Load on CPU: pass `map_location="cpu"` to `torch.load`
- Load on GPU: pass `map_location="cuda"` or a device variable to `torch.load`
- After loading, move the model to the device using: `model.to(device)`
- Needed when the model was trained on one device and is loaded on another
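
The points above can be sketched as follows. `nn.Linear` stands in for the real model class and the file name is hypothetical. The device is chosen at run time, so the same code works with or without a GPU.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(1, 1)  # stand-in for the real model class

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_state.pth")  # hypothetical file name
    torch.save(model.state_dict(), path)

    # map_location remaps the saved tensors onto the target device while loading.
    state = torch.load(path, map_location=device)

restored = nn.Linear(1, 1)
restored.load_state_dict(state)
restored.to(device)  # make sure the module itself lives on the target device

print(next(restored.parameters()).device)
```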

---

## 4. Loading Checkpoints (Model + Optimizer)
- Used when resuming training
- Checkpoint usually contains:
  - epoch number
  - model_state_dict
  - optimizer_state_dict
  - loss value
- Requires:
  1. Recreating the model architecture
  2. Recreating the optimizer
  3. Loading both states from the checkpoint dictionary
- Useful to continue training exactly where it stopped
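
A minimal sketch of saving and restoring such a checkpoint. The dictionary keys mirror the list above, while the epoch number, loss value, file name, and the `nn.Linear` stand-in model are placeholder assumptions.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(1, 1)  # stand-in for the real model class
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pth")  # hypothetical file name

    # Save everything needed to resume training (epoch and loss are placeholders).
    torch.save(
        {
            "epoch": 5,
            "model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict(),
            "loss": 0.25,
        },
        path,
    )

    checkpoint = torch.load(path)

# Recreate the model and optimizer, then restore both states.
resumed_model = nn.Linear(1, 1)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.01)
resumed_model.load_state_dict(checkpoint["model_state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"] + 1  # continue with the next epoch

print(start_epoch)  # 6
```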

---

## 5. Summary
- Entire Model:
  - Easy to load
  - Class definition must still be importable at load time
  - Not production-safe

- state_dict (Recommended):
  - Class must be recreated
  - Loads only weights
  - Industry standard, best for deployment

- Checkpoint:
  - Loads model + optimizer
  - Best for resuming training


In [9]:
# To load a saved state_dict, first create a new instance of the model class
Model_loaded_state=Linearmodel()

# Load the saved weights into the new instance
Model_loaded_state.load_state_dict(torch.load(f=Model_save_Path))


<All keys matched successfully>

In [10]:
Model_loaded_state.state_dict()


OrderedDict([('weight', tensor([3.0000]))])

In [12]:
Model_loaded_state.eval()
with torch.inference_mode():
  y_pred=Model_loaded_state(torch.tensor([5,6,7,8],dtype=torch.float32))

y_pred

tensor([15.0000, 18.0000, 21.0000, 24.0000])