#1. Linear Regression
Linear Regression models the relationship between a dependent variable (y) and one or more independent variables (X) by fitting a linear equation:

y = b0 + b1*x1 + b2*x2 + ... + bn*xn + e

where b0 is the intercept, b1 ... bn are the coefficients, and e is the error term.

Objective: Minimize the sum of squared residuals (Ordinary Least Squares).

Assumptions: Linearity, homoscedasticity (constant error variance), independent errors, and no strong multicollinearity among the features.

Use Case: Predicting house prices, sales forecasting.
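The least-squares objective above can be solved directly with linear algebra. The following is a minimal sketch on hypothetical toy data (`X_toy`, `y_toy`, and the true coefficients 2, 3, -1 are assumptions made for illustration, not part of the housing dataset used later):

```python
import numpy as np

# Hypothetical toy data: y = 2 + 3*x1 - 1*x2 plus small Gaussian noise
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 2))
y_toy = 2.0 + 3.0 * X_toy[:, 0] - 1.0 * X_toy[:, 1] + rng.normal(scale=0.1, size=200)

# Prepend a column of ones so the first coefficient acts as the intercept b0
A = np.column_stack([np.ones(len(X_toy)), X_toy])

# Solve min ||A b - y||^2; lstsq avoids explicitly inverting A.T @ A
coef, *_ = np.linalg.lstsq(A, y_toy, rcond=None)

# Residuals of the fitted line; their squared sum is the quantity OLS minimizes
residuals = y_toy - A @ coef
print("Estimated [b0, b1, b2]:", np.round(coef, 2))
print("Sum of squared residuals:", round(float(residuals @ residuals), 3))
```

The estimated coefficients should land close to the values used to generate the toy data, which is a quick sanity check that the fit is doing what the objective describes.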

Evaluation Metrics:

- MSE (Mean Squared Error): Lower = better.

- R² Score: Closer to 1 = better fit.
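To make the two metrics concrete, here is a small sketch that computes them by hand with NumPy. The helper names `mse` and `r2` and the toy arrays are illustrative assumptions; the notebook itself uses `mean_squared_error` and `r2_score` from scikit-learn.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """1 - SS_res / SS_tot: share of target variance explained by the model."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return float(1.0 - ss_res / ss_tot)

# Hypothetical targets and predictions
y_true_toy = [3.0, 5.0, 7.0, 9.0]
y_pred_toy = [2.5, 5.0, 7.5, 9.0]

print(f"MSE: {mse(y_true_toy, y_pred_toy):.3f}")  # 0.125
print(f"R²:  {r2(y_true_toy, y_pred_toy):.3f}")   # 0.975
```

Note that R² can be negative when a model predicts worse than simply returning the mean of the targets.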

In [None]:
# Setup
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score


In [None]:
# Load data (California Housing)
from sklearn.datasets import fetch_california_housing
data = fetch_california_housing()
X, y = data.data, data.target
df = pd.DataFrame(data.data, columns=data.feature_names)
df['Target'] = data.target
print("Dataset Head:\n", df.head())


In [None]:
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)



In [None]:
# Train
model = LinearRegression()
model.fit(X_train, y_train)



In [None]:
# Evaluate
y_pred = model.predict(X_test)
print(f"MSE: {mean_squared_error(y_test, y_pred):.2f}")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")


In [None]:
# Predict for custom input
custom_input = [X.mean(axis=0)]  # Mean of each feature, wrapped in a list to form one 2D row
prediction = model.predict(custom_input)
print(f"\nPrediction for {custom_input[0]}:\nPredicted Value: {prediction[0]:.2f}")

#2. Decision Tree Regression
Theory
- Non-parametric method that splits data into branches based on feature thresholds.

- Objective: Minimize variance in leaf nodes (e.g., using MSE).

- Pros: Handles non-linear data, no need for feature scaling.

- Cons: Prone to overfitting (use pruning or ensemble methods).
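The split criterion described above can be sketched for a single feature as follows. This is a simplified illustration, not how `DecisionTreeRegressor` is implemented internally: the function name `best_split` and the toy arrays are assumptions, and a real tree searches every feature and then recurses into each branch.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, weighted_variance) of the best single split on one feature."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    best_thr, best_score = None, np.inf
    for i in range(1, len(x_sorted)):
        if x_sorted[i] == x_sorted[i - 1]:
            continue  # no threshold can separate two equal feature values
        left, right = y_sorted[:i], y_sorted[i:]
        # Size-weighted average of the variance in the two candidate leaves
        score = (left.var() * len(left) + right.var() * len(right)) / len(y_sorted)
        if score < best_score:
            best_thr = (x_sorted[i - 1] + x_sorted[i]) / 2  # midpoint between neighbours
            best_score = score
    return best_thr, best_score

# Hypothetical data with an obvious jump between x = 3 and x = 10
x_toy = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y_toy = np.array([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])

thr, score = best_split(x_toy, y_toy)
print(f"Best threshold: {thr}, weighted variance after split: {score:.4f}")
```

On this toy data the chosen threshold falls in the gap between the two clusters, because that split leaves the least variance in each leaf.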

In [None]:
from sklearn.tree import DecisionTreeRegressor


In [None]:
# Train
model = DecisionTreeRegressor(max_depth=3, random_state=42)
model.fit(X_train, y_train)



In [None]:
# Evaluate
y_pred = model.predict(X_test)
print(f"MSE: {mean_squared_error(y_test, y_pred):.2f}")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")


In [None]:
# Predict for custom input
prediction = model.predict(custom_input)
print(f"\nPrediction: {prediction[0]:.2f}")


#3. Random Forest Regression
Theory
- Ensemble of decision trees, averaging predictions to reduce overfitting.

- Objective: Combine multiple trees (via bagging) to improve robustness.

- Hyperparameters: n_estimators (number of trees), max_depth.
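The bagging idea can be sketched in plain NumPy. This is a minimal illustration under stated assumptions: it uses depth-1 trees (stumps) on one hypothetical feature, and the helpers `fit_stump` and `predict_stump` are made up for this example. A random forest typically also randomizes which features each split may consider, which this sketch leaves out.

```python
import numpy as np

def fit_stump(x, y):
    """Fit a depth-1 regression tree on one feature: (threshold, left_mean, right_mean)."""
    best = (np.inf, None, y.mean(), y.mean())
    for thr in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left, right = y[x <= thr], y[x > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    return best[1:]

def predict_stump(stump, x):
    thr, left_mean, right_mean = stump
    if thr is None:  # no valid split was found; predict a constant
        return np.full(len(x), left_mean)
    return np.where(x <= thr, left_mean, right_mean)

# Hypothetical noisy, non-linear data
rng = np.random.default_rng(42)
x_toy = rng.uniform(0, 10, size=200)
y_toy = np.sin(x_toy) + rng.normal(scale=0.3, size=200)

n_estimators = 50
stumps = []
for _ in range(n_estimators):
    # Bootstrap sample: draw n rows with replacement
    idx = rng.integers(0, len(x_toy), size=len(x_toy))
    stumps.append(fit_stump(x_toy[idx], y_toy[idx]))

# Ensemble prediction = average of the individual tree predictions
x_grid = np.linspace(0, 10, 5)
ensemble_pred = np.mean([predict_stump(s, x_grid) for s in stumps], axis=0)
print("Ensemble predictions on grid:", np.round(ensemble_pred, 2))
```

Because each stump sees a different bootstrap sample, the thresholds differ slightly from tree to tree, and averaging smooths out the hard step a single stump would produce.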

In [None]:
from sklearn.ensemble import RandomForestRegressor



In [None]:
# Train
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)



In [None]:
# Evaluate
y_pred = model.predict(X_test)
print(f"MSE: {mean_squared_error(y_test, y_pred):.2f}")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")


#4. Support Vector Regression (SVR)
Theory
- Kernel-based method that fits the best "tube" (ϵ-insensitive) around data.

- Objective: Keep the model as flat as possible while penalizing only deviations larger than a threshold (ϵ); errors inside the tube cost nothing.

- Kernels: RBF (non-linear), linear, or polynomial.
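The ϵ-insensitive "tube" corresponds to a loss that is zero inside the tube and grows linearly outside it. A minimal sketch, where the function name and toy values are illustrative assumptions:

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: max(0, |y_true - y_pred| - epsilon)."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - epsilon)

# Hypothetical targets and predictions with absolute errors of 0.05, 0.5, and 1.0
y_true_toy = np.array([1.0, 2.0, 3.0])
y_pred_toy = np.array([1.05, 2.5, 2.0])

losses = epsilon_insensitive_loss(y_true_toy, y_pred_toy, epsilon=0.1)
print("Per-sample loss:", np.round(losses, 3))
```

The first prediction lies inside the tube and contributes no loss; the other two are penalized only by the amount they exceed ϵ (about 0.4 and 0.9). The `epsilon` argument plays the same role as the `epsilon` parameter passed to `SVR`.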

In [None]:
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler



In [None]:
# Scale features (critical for SVR); fit the scaler on the training split only to avoid leaking test statistics
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)



In [None]:
# Train
model = SVR(kernel='rbf', C=1.0, epsilon=0.1)
model.fit(X_train_scaled, y_train)



In [None]:
# Evaluate
y_pred = model.predict(X_test_scaled)
print(f"MSE: {mean_squared_error(y_test, y_pred):.2f}")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")


#5. Gradient Boosting (XGBoost)
Theory
- Boosting technique that sequentially corrects errors from previous trees.

- Objective: Minimize a loss function (e.g., MSE) by fitting each new tree to the negative gradient of the loss, which for MSE is the residual of the current ensemble.

- Pros: High accuracy, handles missing values.

Data Preprocessing: Unlike SVR, which needs scaled features, tree-based models such as XGBoost can be trained on the unscaled features.
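The sequential error-correcting loop can be sketched from scratch. This is plain gradient boosting with squared loss on hypothetical one-feature data, not XGBoost's actual algorithm (which adds regularization and uses second-order information); `fit_stump` and `predict_stump` are helper names invented for this example.

```python
import numpy as np

def fit_stump(x, y):
    """Fit a depth-1 regression tree on one feature: (threshold, left_mean, right_mean)."""
    best = (np.inf, None, y.mean(), y.mean())
    for thr in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left, right = y[x <= thr], y[x > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    return best[1:]

def predict_stump(stump, x):
    thr, left_mean, right_mean = stump
    if thr is None:
        return np.full(len(x), left_mean)
    return np.where(x <= thr, left_mean, right_mean)

# Hypothetical noisy, non-linear data
rng = np.random.default_rng(0)
x_toy = rng.uniform(0, 10, size=200)
y_toy = np.sin(x_toy) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_estimators = 100

# Start from a constant prediction, then add one small tree per round
pred = np.full_like(y_toy, y_toy.mean())
stumps = []
mse_history = [float(np.mean((y_toy - pred) ** 2))]
for _ in range(n_estimators):
    residuals = y_toy - pred  # negative gradient of 0.5 * (y - pred)^2 w.r.t. pred
    stump = fit_stump(x_toy, residuals)
    pred = pred + learning_rate * predict_stump(stump, x_toy)  # shrunken update
    stumps.append(stump)
    mse_history.append(float(np.mean((y_toy - pred) ** 2)))

print(f"Training MSE before boosting: {mse_history[0]:.4f}")
print(f"Training MSE after {n_estimators} rounds: {mse_history[-1]:.4f}")
```

Here `learning_rate` and `n_estimators` mirror the parameters of the same names passed to `XGBRegressor`: a smaller learning rate takes smaller corrective steps and usually needs more trees.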

In [None]:
# !pip install xgboost
import xgboost as xgb



In [None]:
# Train
model = xgb.XGBRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
model.fit(X_train, y_train)



In [None]:
# Evaluate
y_pred = model.predict(X_test)
print(f"MSE: {mean_squared_error(y_test, y_pred):.2f}")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")
