## Understanding Deep Learning: Lab 4


## Linear Regression with a Regression Dataset
In this activity, we applied linear regression using PyTorch on the real-world Diabetes dataset from scikit-learn.
We began by loading and normalizing the input features, then converted the data into PyTorch tensors and created a DataLoader for mini-batch training.
Next, we defined a Linear Regression model using a single fully connected layer, trained it with Mean Squared Error (MSE) loss and Stochastic Gradient Descent (SGD), and monitored the training loss over multiple epochs.
Finally, we tested the model by comparing its prediction for a single training sample with the actual target value, which serves as a quick sanity check rather than a full evaluation.

Here is the given problem setup:

<img src="https://i.ibb.co/Fk0pP2wR/Screenshot-2025-09-13-at-2-14-30-PM.png" width="1200">



### IMPLEMENTATION

In [3]:
# ----------------------------
# Linear Regression with PyTorch
# Dataset: Diabetes dataset (real-world regression)
# ----------------------------

import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler

In [4]:
# Load the dataset

data = load_diabetes()
X, y = data.data, data.target   # X = features, y = target (disease progression score)


In [5]:
# Standardize the features to zero mean and unit variance (helps SGD converge)
scaler = StandardScaler()
X = scaler.fit_transform(X)
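
As a rough illustration of what this step does: `StandardScaler` subtracts each column's mean and divides by its standard deviation (the population form, `ddof=0`). The sketch below reproduces that with plain NumPy on a small made-up array. The names `toy` and `standardize` are hypothetical and not part of the lab.

```python
import numpy as np

def standardize(features):
    """Column-wise z-score: (x - mean) / std, using the population std (ddof=0)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)  # NumPy's default is ddof=0
    return (features - mean) / std

# Made-up data: two features on very different scales
toy = np.array([[1.0, 100.0],
                [2.0, 300.0],
                [3.0, 500.0]])

toy_scaled = standardize(toy)
print(toy_scaled.mean(axis=0))  # approximately 0 for each column
print(toy_scaled.std(axis=0))   # approximately 1 for each column
```

After scaling, both columns contribute on a comparable scale, so a single learning rate suits all weights.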

In [6]:
# Convert to PyTorch tensors
inputs = torch.tensor(X, dtype=torch.float32)
targets = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)  # reshape for regression

In [7]:
# Create DataLoader for batch training
train_ds = TensorDataset(inputs, targets)          # wrap tensors in a dataset
train_dl = DataLoader(train_ds, batch_size=16, shuffle=True)  # 16 samples per batch
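
To see how the `DataLoader` splits the data, the sketch below uses random stand-in tensors with the Diabetes dataset's shape (442 samples, 10 features) instead of the real data. The `demo_` names are hypothetical.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in tensors with the Diabetes dataset's shape: 442 samples, 10 features
demo_inputs = torch.randn(442, 10)
demo_targets = torch.randn(442, 1)

demo_dl = DataLoader(TensorDataset(demo_inputs, demo_targets),
                     batch_size=16, shuffle=True)

batch_sizes = [xb.shape[0] for xb, _ in demo_dl]
print(len(demo_dl))      # 28 mini-batches per epoch (27 full + 1 partial)
print(batch_sizes[-1])   # 10 samples in the final, partial batch
```

With `shuffle=True` the sample order changes every epoch, but the number and sizes of the batches stay the same.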

In [8]:
# Define a simple Linear Regression model
class LinearRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.linear = nn.Linear(input_dim, output_dim)  # single linear layer
    
    def forward(self, x):
        return self.linear(x)

In [9]:
# Initialize model
model = LinearRegressionModel(input_dim=inputs.shape[1], output_dim=1)
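
For intuition, a single `nn.Linear` layer computes `y = x @ W.T + b`. The minimal sketch below checks this on a made-up batch. `layer`, `x`, and `manual` are illustrative names, not part of the lab code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(10, 1)   # same shape as the model above: 10 features -> 1 output
x = torch.randn(4, 10)     # a made-up batch of 4 samples

# Reproduce the layer's forward pass by hand: y = x W^T + b
manual = x @ layer.weight.T + layer.bias
matches = torch.allclose(layer(x), manual, atol=1e-6)
print(matches)             # True

num_params = sum(p.numel() for p in layer.parameters())
print(num_params)          # 11 (10 weights + 1 bias)
```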

In [10]:
# Define loss function and optimizer
loss_fn = nn.MSELoss()                     # Mean Squared Error for regression
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # SGD optimizer
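
The following sketch, on made-up tensors, shows what these two objects compute: `nn.MSELoss` averages the squared differences, and one plain SGD step (no momentum or weight decay) moves each weight by `-lr * gradient`. All names here are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)
opt = torch.optim.SGD(layer.parameters(), lr=0.01)
x = torch.randn(5, 3)   # made-up inputs
y = torch.randn(5, 1)   # made-up targets

pred = layer(x)
loss = nn.MSELoss()(pred, y)
manual_mse = ((pred - y) ** 2).mean()   # same quantity, computed by hand

opt.zero_grad()
loss.backward()
w_before = layer.weight.detach().clone()
grad = layer.weight.grad.clone()
opt.step()

# Plain SGD update rule: w_new = w_old - lr * grad
expected_w = w_before - 0.01 * grad
print(torch.allclose(layer.weight.detach(), expected_w))
```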

In [11]:
# Training loop
epochs = 100
for epoch in range(epochs):
    for xb, yb in train_dl:       # get mini-batches
        # Forward pass: compute prediction
        pred = model(xb)
        loss = loss_fn(pred, yb)

        # Backward pass: compute gradients
        optimizer.zero_grad()
        loss.backward()

        # Update parameters
        optimizer.step()
    
    # Print progress every 10 epochs
    # Note: this is the loss of the last mini-batch only, so it fluctuates a lot
    if (epoch+1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}")

Epoch [10/100], Loss: 2471.2607
Epoch [20/100], Loss: 2595.7314
Epoch [30/100], Loss: 2083.4661
Epoch [40/100], Loss: 5486.0869
Epoch [50/100], Loss: 316.9049
Epoch [60/100], Loss: 3338.3445
Epoch [70/100], Loss: 2962.1626
Epoch [80/100], Loss: 3420.6724
Epoch [90/100], Loss: 3563.8625
Epoch [100/100], Loss: 2373.8159
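
The values above are the loss of whichever mini-batch happened to come last in each epoch, which may explain why they jump around. One possible alternative is to track the sample-weighted average loss over the whole epoch. The sketch below does this on made-up linear data, not the Diabetes data, and all `demo_` names are hypothetical.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)
# Made-up linear data standing in for the real inputs/targets
demo_x = torch.randn(200, 10)
true_w = torch.randn(10, 1)
demo_y = demo_x @ true_w + 0.1 * torch.randn(200, 1)
demo_dl = DataLoader(TensorDataset(demo_x, demo_y), batch_size=16, shuffle=True)

demo_model = nn.Linear(10, 1)
demo_loss_fn = nn.MSELoss()
demo_opt = torch.optim.SGD(demo_model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    running, seen = 0.0, 0
    for xb, yb in demo_dl:
        loss = demo_loss_fn(demo_model(xb), yb)
        demo_opt.zero_grad()
        loss.backward()
        demo_opt.step()
        running += loss.item() * xb.shape[0]   # weight each batch by its size
        seen += xb.shape[0]
    epoch_losses.append(running / seen)        # average loss over the epoch

print(f"first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}")
```

Averaging over all batches smooths out the batch-to-batch noise, so the trend across epochs is easier to read.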


In [12]:
# Test the model with a sample
with torch.no_grad():  # no gradients needed for testing
    sample_input = inputs[0].unsqueeze(0)   # take the first sample
    predicted = model(sample_input)
    print("\nExample Prediction:")
    print("Predicted value:", predicted.item())
    print("Actual value   :", targets[0].item())


Example Prediction:
Predicted value: 204.79237365722656
Actual value   : 151.0
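
A single sample says little about overall fit. As a hedged sketch of a broader check, the helper below computes RMSE and R² for any pair of prediction and target tensors. `regression_metrics` and the demo tensors are hypothetical and not part of the lab.

```python
import torch

def regression_metrics(pred, target):
    """Return (RMSE, R^2) for two tensors of the same shape."""
    residual = target - pred
    mse = (residual ** 2).mean()
    ss_res = (residual ** 2).sum()                    # residual sum of squares
    ss_tot = ((target - target.mean()) ** 2).sum()    # total sum of squares
    return mse.sqrt().item(), (1.0 - ss_res / ss_tot).item()

# Made-up values for illustration
demo_target = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
demo_pred = torch.tensor([[1.5], [2.0], [2.5], [4.0]])

rmse, r2 = regression_metrics(demo_pred, demo_target)
print(f"RMSE: {rmse:.4f}, R^2: {r2:.4f}")  # RMSE: 0.3536, R^2: 0.9000
```

In the lab, calling `regression_metrics(model(inputs), targets)` inside `torch.no_grad()` would give figures for the full dataset. Because that data was also used for training, a held-out split would be needed to estimate generalization.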


## Conclusions / Learnings
<div style="border: 2px solid #4CAF50; background-color: #e8f5e9; padding: 15px; border-radius: 10px; margin-top: 10px; margin-bottom: 10px;">
- Training a linear regression model in PyTorch involves defining a simple network, selecting a suitable loss function (MSE), and using an optimizer (SGD) to update weights.  
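- Over many epochs the parameters move toward a least-squares linear fit, but the per-batch loss printed during training fluctuates widely (from about 317 to 5486), so it is a noisy indicator of convergence, and a linear model can only approximate the relationship between inputs and target.  
- Mini-batch training (here, batch size = 16) makes each update cheaper than a full-batch step while giving smoother gradients than single-sample updates.  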

## Key Takeaways

- PyTorch provides <strong>TensorDataset</strong> and <strong>DataLoader</strong> for efficient dataset handling and batching.  
- A simple <strong>linear model</strong> provides a useful baseline, even on a real-world dataset such as Diabetes.  
- The training loop follows the sequence: <strong>Forward pass → Loss calculation → Backpropagation → Optimization</strong>.  
- <strong>The optimizer (SGD)</strong> updates weights step by step based on computed gradients.  
- <strong>Data preprocessing and normalization</strong> can improve training stability and convergence speed.  
