
# 🧪 Lab 5: Build a Linear Regression Model to Predict Health Parameters (BMI, Blood Pressure)

## 🎯 Objective
Implement a **Linear Regression model** in Python that predicts health parameters such as **BMI** or **Blood Pressure**
from other features (e.g., age, height, weight).

---

## 🧠 Concept Recap
- **Linear Regression** is used for predicting continuous outcomes.  
- It assumes a **linear relationship** between dependent (Y) and independent (X) variables.  
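- Example equation with a single feature:  
  \[
  Y = \beta_0 + \beta_1 X + \varepsilon
  \]
  where:
  - \( Y \) = dependent variable (BMI or Blood Pressure)  
  - \( X \) = independent variable (Age, Weight, etc.)  
  - \( \beta_0 \) = intercept and \( \beta_1 \) = slope, both learned by the model  
  - \( \varepsilon \) = error term
- With several features, as in this lab, the model becomes:  
  \[
  Y = \beta_0 + \beta_1 X_1 + \dots + \beta_n X_n + \varepsilon
  \]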
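
To make the single-feature equation concrete, here is a minimal sketch of how the two coefficients can be estimated by ordinary least squares using only NumPy. The toy data, the "true" values 5.0 and 0.3, and the variable names are assumptions made for illustration; they are not part of the lab dataset.

```python
import numpy as np

# Hypothetical toy data: Y depends linearly on X, plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(40, 100, size=200)      # one feature, e.g. weight in kg
eps = rng.normal(0, 1.0, size=200)      # error term
Y = 5.0 + 0.3 * X + eps                 # assumed true intercept 5.0, slope 0.3

# Closed-form ordinary least squares for one feature:
#   slope     = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
#   intercept = mean(Y) - slope * mean(X)
beta_1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
beta_0 = Y.mean() - beta_1 * X.mean()

print(f"beta_0 = {beta_0:.3f}, beta_1 = {beta_1:.3f}")
```

The estimates should land close to the assumed values, and `LinearRegression` from scikit-learn solves the same least-squares problem for any number of features.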


In [None]:
# ✅ Importing required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

In [None]:
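# Synthetic dataset creation
np.random.seed(42)  # fixed seed for reproducibility
n = 100
age = np.random.randint(18, 65, n)       # age in years (upper bound is exclusive)
weight = np.random.randint(45, 100, n)   # weight in kg
height = np.random.normal(1.65, 0.1, n)  # height in meters
# BMI = weight / height^2 (nonlinear in height), plus Gaussian noise
bmi = weight / (height ** 2) + np.random.normal(0, 1, n)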

df = pd.DataFrame({
    'Age': age,
    'Weight': weight,
    'Height': height.round(2),
    'BMI': bmi.round(2)
})

# Display dataset
df.head(10)


In [None]:
# Check data info and summary
print(df.info())
print(df.describe())

# Visualize relationship between Weight, Height, and BMI
sns.pairplot(df, vars=['Weight', 'Height', 'BMI'], kind='reg')
plt.suptitle("Relationship Between Features", y=1.02)
plt.show()


In [None]:
# Independent variable(s): Age, Weight, Height
X = df[['Age', 'Weight', 'Height']]

# Dependent variable: BMI
y = df['BMI']

# Split into train/test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("Training samples:", len(X_train))
print("Testing samples:", len(X_test))


In [None]:
# Initialize model
model = LinearRegression()

# Train model
model.fit(X_train, y_train)

# Coefficients and intercept
print("Intercept (β0):", model.intercept_)
print("Coefficients (β1..βn):", model.coef_)
print("\nFeature mapping:")
for name, coef in zip(X.columns, model.coef_):
    print(f"  {name}: {coef:.4f}")


In [None]:
# Predict on test data
y_pred = model.predict(X_test)

# Evaluate model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse:.2f}")
print(f"R² Score: {r2:.3f}")

# Plot predictions vs actual
plt.figure(figsize=(6,5))
plt.scatter(y_test, y_pred, color='teal', alpha=0.7)
plt.xlabel("Actual BMI")
plt.ylabel("Predicted BMI")
plt.title("Actual vs Predicted BMI")
# Reference line y = x: points on it are predicted exactly
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
plt.show()


In [None]:
# Predict for new input
sample = pd.DataFrame({'Age':[30], 'Weight':[70], 'Height':[1.75]})
predicted_bmi = model.predict(sample)[0]
print(f"Predicted BMI for {sample.to_dict('records')[0]} → {predicted_bmi:.2f}")


In [None]:
# Optional: Predict Blood Pressure (Synthetic Example)
# Generate synthetic data for blood pressure prediction
np.random.seed(10)
n = 100
age = np.random.randint(20, 70, n)
bmi = np.random.normal(25, 4, n)
bp = 80 + 0.6*age + 1.2*bmi + np.random.normal(0, 3, n)

df_bp = pd.DataFrame({'Age': age, 'BMI': bmi, 'BloodPressure': bp})
X_bp = df_bp[['Age', 'BMI']]
y_bp = df_bp['BloodPressure']

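# Train model on the full dataset (no train/test split here, so the
# predictions below are in-sample and will look optimistic)
model_bp = LinearRegression().fit(X_bp, y_bp)
preds = model_bp.predict(X_bp)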

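# Estimates should land near the generating values: Age 0.6, BMI 1.2, intercept 80
print("Model Coefficients:")
# Use `name`/`coef` so the loop does not overwrite the sample size `n`
for name, coef in zip(X_bp.columns, model_bp.coef_):
    print(f"  {name}: {coef:.2f}")
print("Intercept:", round(model_bp.intercept_, 2))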

# Plot Actual vs Predicted
plt.scatter(y_bp, preds, color='purple', alpha=0.6)
plt.xlabel("Actual BP"); plt.ylabel("Predicted BP")
plt.title("Blood Pressure Prediction — Linear Regression")
plt.plot([min(y_bp), max(y_bp)], [min(y_bp), max(y_bp)], 'r--')
plt.show()


## 📌 Summary
- Implemented **Linear Regression** for predicting **BMI** and **Blood Pressure**.  
- Explored data visually using **Seaborn**.  
- Evaluated model performance with **MSE** and **R² Score**.  
- Demonstrated prediction on **new patient data**.  
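
As a companion to the evaluation step, the sketch below computes MSE and R² by hand so the definitions behind `mean_squared_error` and `r2_score` are visible. The helper names `mse_manual` and `r2_manual` and the four sample values are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

def mse_manual(y_true, y_pred):
    """Mean of the squared residuals."""
    return np.mean((y_true - y_pred) ** 2)

def r2_manual(y_true, y_pred):
    """1 - SS_res / SS_tot: share of the variance in y_true explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares around the mean
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted BMI values
y_true = np.array([22.0, 25.0, 30.0, 27.0])
y_pred = np.array([21.0, 26.0, 29.0, 28.0])

print(f"MSE = {mse_manual(y_true, y_pred):.3f}, R² = {r2_manual(y_true, y_pred):.3f}")
# MSE = 1.000, R² = 0.882
```

Every residual here is ±1, so the MSE is exactly 1, while R² compares the residual sum of squares (4) against the spread of the actual values around their mean (34).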

---

## ✅ Viva Questions
1. What is Linear Regression?  
2. What is the role of coefficients and intercept?  
3. How do you evaluate a regression model?  
4. What does the R² score represent?  
5. Can Linear Regression handle nonlinear data?
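
As a starting point for question 5, the sketch below shows one common workaround: transform the features so that the relationship becomes linear. BMI is weight divided by height squared, which is not linear in height, but log(BMI) = log(weight) - 2·log(height) is linear in the log-transformed features. The data here is noise-free and generated only for illustration, and the fit uses `np.linalg.lstsq` rather than the lab's scikit-learn model.

```python
import numpy as np

# Hypothetical noise-free data following the BMI formula
rng = np.random.default_rng(1)
weight = rng.uniform(45, 100, size=100)   # kg
height = rng.uniform(1.5, 1.9, size=100)  # meters
bmi = weight / height ** 2

# Design matrix with an intercept column and log-transformed features
A = np.column_stack([np.ones_like(weight), np.log(weight), np.log(height)])

# Least-squares fit of log(BMI) on [1, log(weight), log(height)]
coef, *_ = np.linalg.lstsq(A, np.log(bmi), rcond=None)

print("intercept, log(weight), log(height):", np.round(coef, 4))
```

Because the data is noise-free, the fitted coefficients should come out very close to 0, 1, and -2, matching the log form of the formula. On the raw features, a linear model can only approximate this relationship.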
