# 🧩 Problem Statement

### 1. The Problem
A regression model's loss is plateauing (stuck), and we want to see if switching from **SGD** to **Adam** helps it learn faster.

### 2. Steps to Solve
1.  Generate synthetic regression data.
2.  Scale features and targets.
3.  Build a 3-Layer MLP in PyTorch.
4.  Train with SGD (Run A) and Adam (Run B).
5.  Compare results.

### 3. Expected Output
Side-by-side plots of training loss and validation RMSE for both optimizers. The expectation is that Adam converges faster.

### 🔹 Library Imports
#### 2.1 What the line does
Imports the libraries for deep learning (PyTorch), data generation and preprocessing (scikit-learn), data handling (NumPy, pandas), and plotting (Matplotlib). It also creates an `outputs` directory.
#### 2.2 Why it is used
To access pre-built functions so we don't have to write everything from scratch.
#### 2.3 When to use it
At the start of every script.
#### 2.4 Where to use it
Top of the file.
#### 2.5 How to use it
`import torch`
#### 2.6 How it works internally
Python runs each package's initialization code; for PyTorch this loads its compiled C++ backend into the process.
#### 2.7 Output with sample examples
No visible output, but functions become available.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import copy
import os

os.makedirs('outputs', exist_ok=True)

### 🔹 Reproducibility (Seeding)
#### 2.1 What the line does
Sets fixed seeds for random number generators.
#### 2.2 Why it is used
To keep the comparison fair: both runs should start from the same initial weights and see the same batch shuffling.
#### 2.3 When to use it
In any scientific experiment involving randomness.
#### 2.4 Where to use it
Before any random operations.
#### 2.5 How to use it
`torch.manual_seed(42)`
#### 2.6 How it works internally
Initializes the pseudo-random state vector.
#### 2.7 Output with sample examples
Consistent random numbers.

In [None]:
SEED = 42
torch.manual_seed(SEED)
np.random.seed(SEED)
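
A quick way to confirm what seeding does is to reseed and draw again. The sketch below is an illustration only; the names `first` and `second` are not part of the notebook.

```python
import torch

# Draw three random numbers after seeding.
torch.manual_seed(42)
first = torch.rand(3)

# Reseed with the same value and draw again.
torch.manual_seed(42)
second = torch.rand(3)

# The two draws match element for element.
same = torch.equal(first, second)
print(same)
```

This is why `train_model` can reseed at the start of each run and still give both optimizers the same initial weights.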

### 🔹 Data Preparation Function
#### 2.1 What the line does
Defines a function to generate, split, and scale data.
#### 2.2 Why it is used
To keep the main code clean and reusable.
#### 2.3 When to use it
When the data-processing steps are complex enough to clutter the main code.
#### 2.4 Where to use it
Before the training loop.
#### 2.5 How to use it
`loaders, dim, scaler = prepare_data()`
#### 2.6 How it works internally
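Calls scikit-learn to generate, split, and standardize the data, then wraps each split in a PyTorch `DataLoader`.
#### 2.7 Output with sample examples
Returns a dict of `DataLoader`s (`train`, `val`, `test`), the number of input features, and the fitted target scaler.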
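
With 2000 samples, a 30% hold-out followed by a 50/50 split of that hold-out should give a 70/15/15 partition. The following is a minimal check, assuming the same arguments that `prepare_data` passes to scikit-learn; the `sizes` and `n_train_batches` names are only for illustration.

```python
import math

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

SEED = 42

# Same generation and split arguments as in prepare_data.
X, y = make_regression(n_samples=2000, n_features=40, noise=15, random_state=SEED)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=SEED)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=SEED)

sizes = (len(X_train), len(X_val), len(X_test))
print(sizes)

# Number of training batches per epoch at batch_size=64 (the last batch is shorter).
n_train_batches = math.ceil(len(X_train) / 64)
print(n_train_batches)
```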

In [None]:
def prepare_data():
    X, y = make_regression(n_samples=2000, n_features=40, noise=15, random_state=SEED)
    X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=SEED)
    X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=SEED)
    
    # Feature Scaling
    scaler_x = StandardScaler()
    X_train = scaler_x.fit_transform(X_train)
    X_val = scaler_x.transform(X_val)
    X_test = scaler_x.transform(X_test)
    
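    # Target scaling (critical): standardized targets keep the MSE near unit scale.
    # As with the features, the scaler is fit on the training split only.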
    scaler_y = StandardScaler()
    y_train = scaler_y.fit_transform(y_train.reshape(-1, 1)).flatten()
    y_val = scaler_y.transform(y_val.reshape(-1, 1)).flatten()
    y_test = scaler_y.transform(y_test.reshape(-1, 1)).flatten()
    
    train_dataset = TensorDataset(torch.FloatTensor(X_train), torch.FloatTensor(y_train).unsqueeze(1))
    val_dataset = TensorDataset(torch.FloatTensor(X_val), torch.FloatTensor(y_val).unsqueeze(1))
    test_dataset = TensorDataset(torch.FloatTensor(X_test), torch.FloatTensor(y_test).unsqueeze(1))
    
    loaders = {
        'train': DataLoader(train_dataset, batch_size=64, shuffle=True),
        'val': DataLoader(val_dataset, batch_size=64, shuffle=False),
        'test': DataLoader(test_dataset, batch_size=64, shuffle=False)
    }
    return loaders, X_train.shape[1], scaler_y
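
Because the targets are standardized, any RMSE computed on them is in standard-deviation units rather than the original target units. One possible use of the returned `scaler_y`, which the notebook itself never uses, is to convert such an RMSE back. The sketch below uses small made-up arrays purely to show the arithmetic.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up targets and predictions, for illustration only.
y_train = np.array([100.0, 200.0, 300.0, 400.0])
y_true = np.array([150.0, 250.0])
y_pred = np.array([160.0, 230.0])

# Fit the target scaler on the training targets only.
scaler_y = StandardScaler().fit(y_train.reshape(-1, 1))

# RMSE measured on standardized values.
y_true_s = scaler_y.transform(y_true.reshape(-1, 1)).ravel()
y_pred_s = scaler_y.transform(y_pred.reshape(-1, 1)).ravel()
rmse_scaled = np.sqrt(np.mean((y_true_s - y_pred_s) ** 2))

# Multiplying by the scaler's standard deviation recovers original units.
rmse_original = rmse_scaled * scaler_y.scale_[0]

# Reference value computed directly on the unscaled arrays.
rmse_direct = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(rmse_original, rmse_direct)
```

The two values agree because standardization is an affine transform, so every error is divided by the same constant.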

### 🔹 Model Definition (MLP)
#### 2.1 What the line does
Defines the Neural Network architecture.
#### 2.2 Why it is used
To tell PyTorch how to process inputs.
#### 2.3 When to use it
For any Deep Learning task.
#### 2.4 Where to use it
Inside a class inheriting from `nn.Module`.
#### 2.5 How to use it
`model = MLP(input_dim)`
#### 2.6 How it works internally
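`nn.Module` registers each layer's weights and biases as parameters of the model. The computational graph itself is built dynamically each time `forward` runs.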
#### 2.7 Output with sample examples
A model object.

In [None]:
class MLP(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )
        
    def forward(self, x):
        return self.layers(x)
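
To see what the module registers, one can count its parameters and push a dummy batch through it. The sketch below repeats the class definition so it runs on its own; the batch size of 8 and the names `n_params` and `out` are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.layers(x)


model = MLP(input_dim=40)

# Total number of trainable weights and biases across the three Linear layers:
# (40*128 + 128) + (128*64 + 64) + (64*1 + 1)
n_params = sum(p.numel() for p in model.parameters())
print(n_params)

# A dummy batch of 8 samples with 40 features yields one prediction per sample.
with torch.no_grad():
    out = model(torch.randn(8, 40))
print(tuple(out.shape))
```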

### 🔹 Training Loop
#### 2.1 What the line does
Orchestrates the learning process: Forward -> Loss -> Backward -> Step.
#### 2.2 Why it is used
To update weights and minimize error.
#### 2.3 When to use it
For training models.
#### 2.4 Where to use it
In the main logic.
#### 2.5 How to use it
`history, best_rmse = train_model(MLP, input_dim, 'Adam', 1e-3)`
#### 2.6 How it works internally
Calculates gradients via backpropagation and applies optimizer rules.
#### 2.7 Output with sample examples
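Returns a per-epoch history (training loss, validation loss, validation RMSE) and the best validation RMSE, which is also printed.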
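
The "optimizer rules" can be sketched for a single scalar weight. The code below is a simplified, pure-Python rendering of the SGD-with-momentum and Adam updates, not PyTorch's actual implementation: it omits weight decay, dampening, and Nesterov momentum, and the function names and the toy objective f(w) = w² are assumptions made for illustration. The Adam defaults (`beta1=0.9`, `beta2=0.999`, `eps=1e-8`) match the defaults of `optim.Adam`.

```python
import math


def sgd_momentum_step(w, grad, buf, lr, momentum):
    """One SGD-with-momentum update: accumulate a velocity, then step."""
    buf = momentum * buf + grad
    return w - lr * buf, buf


def adam_step(w, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first and second moments."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


def grad_fn(w):
    """Gradient of the toy objective f(w) = w**2."""
    return 2.0 * w


# First step from w = 5.0 with the learning rates used in this notebook.
w_sgd, buf = sgd_momentum_step(5.0, grad_fn(5.0), 0.0, lr=5e-3, momentum=0.9)
w_adam, m, v = adam_step(5.0, grad_fn(5.0), 0.0, 0.0, t=1, lr=1e-3)
print(w_sgd, w_adam)
```

On the first step the SGD move is proportional to the gradient (`lr * 10`), while the Adam move is close to `lr` whatever the gradient's magnitude, because the gradient is divided by the square root of its own running second moment.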

In [None]:
def train_model(model_class, input_dim, optimizer_name, learning_rate, momentum=0.0):
    torch.manual_seed(SEED)
    model = model_class(input_dim)
    criterion = nn.MSELoss()
    
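    if optimizer_name == 'SGD':
        optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum)
    elif optimizer_name == 'Adam':
        # The `momentum` argument is ignored for Adam
        optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    else:
        raise ValueError(f"Unsupported optimizer: {optimizer_name}")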
        
    history = {'train_loss': [], 'val_loss': [], 'val_rmse': []}
    best_val_rmse = float('inf')
    loaders, _, _ = prepare_data()
    
    for epoch in range(40):
        model.train()
        running_loss = 0.0
        for X_batch, y_batch in loaders['train']:
            optimizer.zero_grad()
            outputs = model(X_batch)
            loss = criterion(outputs, y_batch)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        
        avg_train_loss = running_loss / len(loaders['train'])
        
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for X_val, y_val in loaders['val']:
                outputs = model(X_val)
                loss = criterion(outputs, y_val)
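                # Weight each batch by its size so the shorter last batch is not over-counted
                val_loss += loss.item() * X_val.size(0)
        
        # Per-sample MSE over the whole validation set; RMSE is in standardized target units
        avg_val_loss = val_loss / len(loaders['val'].dataset)
        val_rmse = np.sqrt(avg_val_loss)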
        
        history['train_loss'].append(avg_train_loss)
        history['val_loss'].append(avg_val_loss)
        history['val_rmse'].append(val_rmse)
        
        if val_rmse < best_val_rmse:
            best_val_rmse = val_rmse
            
    print(f"Finished {optimizer_name}: Best Val RMSE: {best_val_rmse:.4f}")
    return history, best_val_rmse
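
When per-batch losses are averaged into one epoch-level number, as `avg_train_loss` is, the result is a mean of batch means. That differs slightly from the true per-sample mean whenever the last batch is shorter than the others. The numbers below are made up to exaggerate the effect and are not results from this experiment.

```python
# Made-up per-batch MSE values and batch sizes, for illustration only.
batch_sizes = [64, 64, 64, 64, 44]
batch_mse = [0.10, 0.10, 0.10, 0.10, 0.50]

# Unweighted: every batch counts equally, regardless of how many samples it holds.
mean_of_batches = sum(batch_mse) / len(batch_mse)

# Weighted: every sample counts equally.
weighted_mean = sum(m * n for m, n in zip(batch_mse, batch_sizes)) / sum(batch_sizes)

print(round(mean_of_batches, 4), round(weighted_mean, 4))
```

With realistic losses the gap is usually small, but weighting by batch size is the safer choice when the metric is reported or compared across runs.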

### 🔹 Execution & Visualization
#### 2.1 What the line does
Runs the experiment for SGD and Adam, then plots results.
#### 2.2 Why it is used
To analyze the difference in performance.
#### 2.3 When to use it
At the end of the script.
#### 2.4 Where to use it
Main block.
#### 2.5 How to use it
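Loop over the optimizer configs and call `train_model` for each.
#### 2.6 How it works internally
Calls `train_model` once per configuration (twice in total) and stores each run's history and best validation RMSE in `results`.
#### 2.7 Output with sample examples
One printed best validation RMSE per optimizer, plus two plots: training loss and validation RMSE per epoch.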

In [None]:
optimizers_to_run = [
    {'name': 'SGD', 'lr': 5e-3, 'momentum': 0.9},
    {'name': 'Adam', 'lr': 1e-3, 'momentum': 0.0}
]

results = {}
loaders, input_dim, scaler_y = prepare_data()

for opt_config in optimizers_to_run:
    print(f"Training with {opt_config['name']}...")
    hist, best_rmse = train_model(
        MLP, 
        input_dim, 
        opt_config['name'], 
        opt_config['lr'], 
        opt_config['momentum']
    )
    results[opt_config['name']] = {'history': hist, 'best_rmse': best_rmse}

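plt.figure(figsize=(12, 5))

# Left panel: average training loss (MSE on standardized targets) per epoch
plt.subplot(1, 2, 1)
plt.plot(results['SGD']['history']['train_loss'], label='SGD')
plt.plot(results['Adam']['history']['train_loss'], label='Adam')
plt.title('Training Loss')
plt.xlabel('Epoch')
plt.ylabel('MSE')
plt.legend()

# Right panel: validation RMSE (standardized target units) per epoch
plt.subplot(1, 2, 2)
plt.plot(results['SGD']['history']['val_rmse'], label='SGD')
plt.plot(results['Adam']['history']['val_rmse'], label='Adam')
plt.title('Validation RMSE')
plt.xlabel('Epoch')
plt.ylabel('RMSE')
plt.legend()
plt.tight_layout()
plt.show()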
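
"Converging faster" can also be put into a number, for example the first epoch at which a run's validation RMSE drops to a chosen threshold. The helper `first_epoch_below`, the threshold, and the placeholder curves `run_a` and `run_b` are all hypothetical; in practice the lists would come from `results[name]['history']['val_rmse']`.

```python
def first_epoch_below(values, threshold):
    """Return the 1-based epoch at which values first reach the threshold, or None."""
    for epoch, value in enumerate(values, start=1):
        if value <= threshold:
            return epoch
    return None


# Placeholder curves, not measured results.
toy_history = {
    'run_a': [1.0, 0.8, 0.6, 0.5],
    'run_b': [1.0, 0.6, 0.45, 0.4],
}

summary = {name: first_epoch_below(vals, 0.5) for name, vals in toy_history.items()}
print(summary)
```

A run that never reaches the threshold yields `None`, which keeps it distinguishable from a run that reaches it late.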