
**End Goal:**

1. The `torch.nn` module
2. The `torch.optim` module

**Improvement:**

1. Building neural networks using the `nn` module
2. Using built-in activation functions
3. Using built-in loss functions
4. Using built-in optimizers

---

### 1️⃣ **`torch.nn` (Neural Network Module)**  
🔹 **Purpose:** Provides ready-made building blocks for defining neural networks.  
🔹 **Why use it?** Instead of manually implementing activation functions, layers, and loss calculations, `torch.nn` provides ready-made components.

🔹 **Key Components:**  
✅ `nn.Linear()` – Fully connected (dense) layers  
✅ `nn.Conv2d()` – Convolutional layers (used in CNNs)  
✅ `nn.ReLU(), nn.Sigmoid(), nn.Tanh()` – Activation functions  
✅ `nn.CrossEntropyLoss(), nn.MSELoss()` – Loss functions  
✅ `nn.Sequential()` – To stack layers in an easy way  

👉 `torch.nn` makes it easy to define and structure deep learning models.
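
As a rough sketch of how these components fit together, the snippet below stacks layers with `nn.Sequential` and scores the result with `nn.MSELoss`. The layer sizes, tensor shapes, and variable names (`net`, `inputs`, `targets`) are illustrative assumptions, not part of the tutorial's model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random tensors are reproducible

# Stack two dense layers with a ReLU in between (sizes chosen for illustration).
net = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

inputs = torch.randn(6, 4)    # 6 samples, 4 features each
targets = torch.randn(6, 1)   # made-up regression targets

predictions = net(inputs)             # forward pass through the whole stack
loss_fn = nn.MSELoss()                # built-in mean squared error
loss = loss_fn(predictions, targets)  # scalar (0-dimensional) tensor

print(predictions.shape)  # torch.Size([6, 1])
print(loss.item())
```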

---

### 2️⃣ **`torch.optim` (Optimization Module)**  
🔹 **Purpose:** Trains the model by updating its weights with an optimization algorithm.  
🔹 **Why use it?** Instead of manually updating weights using gradient descent, `torch.optim` provides efficient optimizers.

🔹 **Key Optimizers:**  
✅ `optim.SGD()` – Stochastic Gradient Descent  
✅ `optim.Adam()` – Adaptive Moment Estimation (widely used)  
✅ `optim.RMSprop()` – Root Mean Square Propagation  

👉 `torch.optim` updates model weights from the computed gradients, so no update rule has to be written by hand.
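
A minimal sketch of the usual optimizer cycle (forward pass, `zero_grad()`, `backward()`, `step()`) is shown below. The random data, the `nn.BCELoss` choice, the learning rate of 0.1, and the 20 iterations are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Made-up binary classification data: 10 samples, 5 features.
features = torch.randn(10, 5)
labels = torch.randint(0, 2, (10, 1)).float()

model = nn.Sequential(nn.Linear(5, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()                              # binary cross-entropy on probabilities
optimizer = optim.SGD(model.parameters(), lr=0.1)   # optimizer owns the parameter updates

losses = []
for epoch in range(20):
    predictions = model(features)         # forward pass
    loss = loss_fn(predictions, labels)   # compute the loss
    optimizer.zero_grad()                 # clear gradients from the previous step
    loss.backward()                       # compute fresh gradients
    optimizer.step()                      # update weights using those gradients
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Swapping `optim.SGD` for `optim.Adam` or `optim.RMSprop` changes only the line that constructs the optimizer; the loop stays the same.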

---

In [2]:
# Create model class
import torch
import torch.nn as nn

class Model(nn.Module):

    def __init__(self, num_features):
        super().__init__()
        self.linear = nn.Linear(num_features, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, features):
        out = self.linear(features)
        out = self.sigmoid(out)
        return out

In [4]:
# Create dataset
features = torch.randn(10, 5)

# Create model
model = Model(features.shape[1])

# Forward pass
output = model(features)
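
Calling `model(features)` goes through `nn.Module.__call__`, which in turn runs `forward`. A small sketch, assuming the same `Model` class as above and a fixed seed added here for reproducibility, that checks the shape and range of the result:

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.linear = nn.Linear(num_features, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, features):
        return self.sigmoid(self.linear(features))

torch.manual_seed(0)
features = torch.randn(10, 5)
model = Model(features.shape[1])

with torch.no_grad():          # no gradients needed just to inspect the output
    output = model(features)   # invokes forward() via __call__

print(output.shape)  # torch.Size([10, 1])
# Sigmoid squashes every value into the open interval (0, 1).
in_unit_interval = bool(((output > 0) & (output < 1)).all())
print(in_unit_interval)
```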

In [6]:
# Model weights and bias
print(model.linear.weight)
print(model.linear.bias)

Parameter containing:
tensor([[ 0.0890,  0.3011,  0.3548, -0.0801,  0.0266]], requires_grad=True)
Parameter containing:
tensor([0.0500], requires_grad=True)
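
The printed weight has shape `[1, 5]` and the bias has shape `[1]`, which matches the computation `sigmoid(features @ W.T + b)`. The following sketch, with standalone `linear` and `sigmoid` layers created here for illustration, compares the module output against that formula written out by hand:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear = nn.Linear(5, 1)
sigmoid = nn.Sigmoid()
features = torch.randn(10, 5)

with torch.no_grad():
    module_output = sigmoid(linear(features))
    # Same computation by hand: weighted sum of the inputs, plus bias, then sigmoid.
    manual_output = torch.sigmoid(features @ linear.weight.T + linear.bias)

print(linear.weight.shape)  # torch.Size([1, 5])
print(linear.bias.shape)    # torch.Size([1])
matches = torch.allclose(module_output, manual_output, atol=1e-6)
print(matches)
```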


In [8]:
# !pip install torchinfo

In [9]:
from torchinfo import summary

summary(model, input_size=(10, 5))

Layer (type:depth-idx)                   Output Shape              Param #
Model                                    [10, 1]                   --
├─Linear: 1-1                            [10, 1]                   6
├─Sigmoid: 1-2                           [10, 1]                   --
Total params: 6
Trainable params: 6
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
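
The parameter count reported by `summary` can also be cross-checked without `torchinfo`. A short sketch, using an equivalent `nn.Sequential` stand-in for the model (an assumption made to keep the example self-contained):

```python
import torch.nn as nn

# Stand-in for the single-layer model: 5 inputs -> 1 output, then sigmoid.
model = nn.Sequential(nn.Linear(5, 1), nn.Sigmoid())

# numel() returns the number of elements in each parameter tensor.
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(total_params)      # 6  (5 weights + 1 bias)
print(trainable_params)  # 6
```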

### 🏆 **Summary:**  
| Module       | Purpose |
|-------------|----------------|
| `torch.nn`  | Defines layers, activation functions, loss functions, etc. |
| `torch.optim` | Optimizes weights during training using different algorithms. |

💡 **Real-world analogy:**  
- `torch.nn` is like a **chef** who prepares the meal (building the model).  
- `torch.optim` is like the **waiter** who adjusts the seasoning based on customer feedback (training process).

---

In [19]:
# New example without a container: each layer is defined and called explicitly
class Model1(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.linear1 = nn.Linear(num_features, 3)
    self.relu = nn.ReLU()
    self.linear2 = nn.Linear(3, 1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, features):
    out = self.linear1(features)
    out = self.relu(out)
    out = self.linear2(out)
    out = self.sigmoid(out)
    return out

In [20]:
# New Example with container
class Model2(nn.Module):

  def __init__(self, num_features):
    super().__init__()
    self.network = nn.Sequential(
        nn.Linear(num_features, 3),
        nn.ReLU(),
        nn.Linear(3, 1),
        nn.Sigmoid(),
    )

  def forward(self, features):
    out = self.network(features)
    return out
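
The two definitions describe the same architecture. One way to see this, sketched below under the assumption that weights are copied by hand from one model to the other, is to index into the `nn.Sequential` container and compare outputs:

```python
import torch
import torch.nn as nn

class Model1(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.linear1 = nn.Linear(num_features, 3)
        self.relu = nn.ReLU()
        self.linear2 = nn.Linear(3, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, features):
        out = self.relu(self.linear1(features))
        return self.sigmoid(self.linear2(out))

class Model2(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(num_features, 3),
            nn.ReLU(),
            nn.Linear(3, 1),
            nn.Sigmoid(),
        )

    def forward(self, features):
        return self.network(features)

torch.manual_seed(0)
model1 = Model1(5)
model2 = Model2(5)

# nn.Sequential supports integer indexing: positions 0 and 2 hold the Linear layers.
with torch.no_grad():
    model2.network[0].weight.copy_(model1.linear1.weight)
    model2.network[0].bias.copy_(model1.linear1.bias)
    model2.network[2].weight.copy_(model1.linear2.weight)
    model2.network[2].bias.copy_(model1.linear2.bias)

features = torch.randn(10, 5)
with torch.no_grad():
    same_output = torch.allclose(model1(features), model2(features))

print(len(model2.network))  # 4
print(same_output)
```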

In [24]:
# Create dataset
features = torch.randn(10, 5)

# Create model
model1 = Model1(features.shape[1])

# Forward pass
output = model1(features)

In [25]:
# Create dataset
features = torch.randn(10, 5)

# Create model
model2 = Model2(features.shape[1])

# Forward pass
output = model2(features)

In [26]:
# Model weights and bias
print(model1.linear1.weight)
print(model1.linear1.bias)

print(model1.linear2.weight)
print(model1.linear2.bias)

Parameter containing:
tensor([[ 0.1387, -0.3192,  0.3674,  0.4197, -0.2175],
        [-0.2342,  0.0012, -0.2923, -0.2506,  0.3070],
        [ 0.3001, -0.4139,  0.0111, -0.2665,  0.2507]], requires_grad=True)
Parameter containing:
tensor([-0.3743,  0.3665, -0.2896], requires_grad=True)
Parameter containing:
tensor([[-0.3729, -0.1021,  0.1874]], requires_grad=True)
Parameter containing:
tensor([0.1306], requires_grad=True)
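
Rather than printing each tensor individually, `named_parameters()` can list every parameter with its name. A small sketch using an `nn.Sequential` stand-in for the two-layer network (parameter names in a container are the layer positions, such as `0.weight`):

```python
import torch.nn as nn

network = nn.Sequential(
    nn.Linear(5, 3),
    nn.ReLU(),
    nn.Linear(3, 1),
    nn.Sigmoid(),
)

# Map each parameter name to its shape; ReLU and Sigmoid contribute no parameters.
shapes = {name: tuple(param.shape) for name, param in network.named_parameters()}

for name, shape in shapes.items():
    print(name, shape)
# 0.weight (3, 5)
# 0.bias (3,)
# 2.weight (1, 3)
# 2.bias (1,)
```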


In [18]:
from torchinfo import summary

summary(model1, input_size=(10, 5))

Layer (type:depth-idx)                   Output Shape              Param #
Model                                    [10, 1]                   --
├─Linear: 1-1                            [10, 3]                   18
├─ReLU: 1-2                              [10, 3]                   --
├─Linear: 1-3                            [10, 1]                   4
├─Sigmoid: 1-4                           [10, 1]                   --
Total params: 22
Trainable params: 22
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
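
The 18 and 4 in the `Param #` column follow from simple arithmetic: one weight per connection plus one bias per neuron. The helper `dense_params` below is a hypothetical name introduced only for this illustration:

```python
def dense_params(in_features, out_features):
    """Parameters in one fully connected layer: a weight per connection plus a bias per neuron."""
    return in_features * out_features + out_features

hidden_params = dense_params(5, 3)   # 5 inputs -> 3 hidden neurons
output_params = dense_params(3, 1)   # 3 hidden neurons -> 1 output neuron
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 18 4 22
```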

**Written flow diagram** for the neural network code above, with:  

- **5 Input Neurons**  
- **1 Hidden Layer with 3 Neurons**  
- **1 Output Layer with 1 Neuron**  

---

### **📜 Written Flow Diagram:**

1️⃣ **Input Layer (5 neurons)**  
   - \( X_1, X_2, X_3, X_4, X_5 \) → Inputs  

2️⃣ **Hidden Layer (3 neurons) with weights**  
   - Each neuron in the hidden layer receives all 5 inputs, meaning it has 5 weights + 1 bias (18 parameters for the whole layer).  
   - Neurons: \( H_1, H_2, H_3 \)  
   - Weight connections:  
     - \( W_{1,1}, W_{1,2}, W_{1,3}, W_{1,4}, W_{1,5} \) → Weights for \( H_1 \)  
     - \( W_{2,1}, W_{2,2}, W_{2,3}, W_{2,4}, W_{2,5} \) → Weights for \( H_2 \)  
     - \( W_{3,1}, W_{3,2}, W_{3,3}, W_{3,4}, W_{3,5} \) → Weights for \( H_3 \)  

3️⃣ **Activation Function (ReLU in the hidden layer)**  
   - Each hidden neuron applies an activation function \( f \) (ReLU in the code above) to its weighted sum plus bias:  
   - \( H_1 = f(\sum_{i=1}^{5} X_i \cdot W_{1,i} + B_1) \)  
   - \( H_2 = f(\sum_{i=1}^{5} X_i \cdot W_{2,i} + B_2) \)  
   - \( H_3 = f(\sum_{i=1}^{5} X_i \cdot W_{3,i} + B_3) \)  

4️⃣ **Output Layer (1 neuron)**  
   - Takes 3 inputs from the hidden layer neurons  
   - Weight connections:  
     - \( W_{o1}, W_{o2}, W_{o3} \) → Weights for the output neuron  
   - Final output, where \( \sigma \) is the sigmoid function used in the code above:  
     - \( Y = \sigma(H_1 \cdot W_{o1} + H_2 \cdot W_{o2} + H_3 \cdot W_{o3} + B_o) \)  
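
The flow above can be traced by hand in plain Python. The sketch below assumes ReLU for the hidden layer and sigmoid for the output, as in the model code, and uses hand-picked weights and inputs (not values from any trained model) so the arithmetic is easy to follow:

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Five inputs X_1..X_5 (illustrative values).
x = [1.0, 0.0, -1.0, 2.0, 0.5]

# One row of five weights per hidden neuron, plus one bias each.
hidden_weights = [
    [0.5, 0.0, 0.0, 0.0, 0.0],   # weights for H_1
    [0.0, 0.0, 1.0, 0.0, 0.0],   # weights for H_2
    [0.0, 0.0, 0.0, 0.5, 0.0],   # weights for H_3
]
hidden_biases = [0.0, 0.0, 0.5]

# H_j = ReLU(sum_i X_i * W_{j,i} + B_j)
hidden = [
    relu(sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias)
    for weights, bias in zip(hidden_weights, hidden_biases)
]

# Output neuron: three weights W_o1..W_o3 and one bias B_o.
output_weights = [1.0, 1.0, -1.0]
output_bias = 1.0

# Y = sigmoid(sum_j H_j * W_oj + B_o)
y = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

print(hidden)  # [0.5, 0.0, 1.5]
print(y)       # 0.5
```

Here `H_2` is clipped to 0 by ReLU because its weighted sum is negative, and the output pre-activation works out to exactly 0, which sigmoid maps to 0.5.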

---

### **📝 Final Summary:**
```
   Input Layer (5 Neurons)  →  Hidden Layer (3 Neurons)  →  Output Layer (1 Neuron)
```
Each connection has its own weight, each neuron adds its own bias, and the result passes through an activation function (ReLU in the hidden layer, sigmoid at the output) before moving on to the next layer.