# Regression Models Catalog

Folder: `01_regression_models`

This notebook documents **20 popular regression models**. Each entry gives a short description, import and fitting snippets, a hyperparameter grid suitable for `GridSearchCV`, and the model's main strengths and weaknesses.

Assumes a standard supervised setup with `X_train, X_test, y_train, y_test`.

```python
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import mean_squared_error, r2_score
```
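The snippets in this notebook assume `X_train, X_test, y_train, y_test` already exist. A minimal sketch of one way to create them follows. The synthetic dataset from `make_regression` and the `evaluate` helper are assumptions made for illustration, not part of the original notebook; substitute your own data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_regression(
    n_samples=500, n_features=10, n_informative=5, noise=10.0, random_state=42
)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


def evaluate(model, X_test, y_test):
    """Hypothetical helper: return RMSE and R^2 on the held-out set."""
    y_pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
    return {"rmse": rmse, "r2": float(r2_score(y_test, y_pred))}


model = LinearRegression().fit(X_train, y_train)
metrics = evaluate(model, X_test, y_test)
print(metrics)
```

The same `evaluate` call works for any fitted model in the catalog, since every estimator here exposes `predict`.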

## Linear Regression

**Description:** Ordinary Least Squares linear regression.

**Importing:**
```python
from sklearn.linear_model import LinearRegression
```

**Fitting:**
```python
model = LinearRegression()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
# Only options that change the fitted coefficients belong in the grid.
param_grid = {
    "fit_intercept": [True, False],
    "positive": [False, True],
}
```

**Strengths:** Simple, interpretable, fast

**Weaknesses:** Assumes linearity, sensitive to outliers
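The grids in this catalog are meant to be passed to `GridSearchCV`. A minimal sketch of that workflow follows, assuming synthetic data from `make_regression` in place of a real dataset and a small illustrative grid; the scoring metric and fold count are choices made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data (assumption for illustration).
X, y = make_regression(n_samples=300, n_features=8, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative grid: 2 x 2 = 4 candidate settings.
param_grid = {"fit_intercept": [True, False], "positive": [False, True]}

search = GridSearchCV(
    LinearRegression(),
    param_grid,
    scoring="neg_root_mean_squared_error",  # higher is better, so RMSE is negated
    cv=5,
)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("cross-validated RMSE:", -search.best_score_)

# best_estimator_ is refit on the full training set by default.
test_r2 = search.best_estimator_.score(X_test, y_test)
print("test R^2:", test_r2)
```

Swapping in another estimator and its grid from this catalog is usually the only change needed.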

## Ridge Regression

**Description:** L2-regularized linear regression.

**Importing:**
```python
from sklearn.linear_model import Ridge
```

**Fitting:**
```python
model = Ridge()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
param_grid = {
    "alpha": [1e-4, 1e-3, 1e-2, 0.1, 1, 10, 100],
    "solver": ["auto", "svd", "lsqr"],
    "max_iter": [1000, 5000],  # only used by iterative solvers such as "lsqr"
}
```

**Strengths:** Handles multicollinearity

**Weaknesses:** No feature selection; coefficients shrink toward zero but are rarely exactly zero
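Because the L2 penalty acts on coefficient magnitudes, the effect of `alpha` depends on feature scale. One common approach, sketched here as an assumption rather than something the notebook prescribes, is to standardize inside a `Pipeline` and tune `alpha` through the step-prefixed name `ridge__alpha`. The data is synthetic.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# Scaling happens inside each CV fold, so no test-fold statistics leak in.
pipe = Pipeline([("scaler", StandardScaler()), ("ridge", Ridge())])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {"ridge__alpha": [1e-4, 1e-3, 1e-2, 0.1, 1, 10, 100]}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2")
search.fit(X, y)

best_alpha = search.best_params_["ridge__alpha"]
print("best alpha:", best_alpha, "CV R^2:", search.best_score_)
```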

## Lasso Regression

**Description:** L1-regularized linear regression with feature selection.

**Importing:**
```python
from sklearn.linear_model import Lasso
```

**Fitting:**
```python
model = Lasso()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
param_grid = {
    "alpha": [1e-4, 1e-3, 1e-2, 0.1, 1],
    "max_iter": [1000, 5000],
    "selection": ["cyclic", "random"],
}
```

**Strengths:** Sparse solutions

**Weaknesses:** Unstable with correlated features
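The sparsity claim can be checked by counting non-zero coefficients as `alpha` grows. The sketch below assumes a synthetic dataset in which only 5 of 20 features are informative; the `alpha` values are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 20 features, only 5 of which actually influence y (assumption for illustration).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

nonzero_counts = {}
for alpha in [0.01, 1.0, 10.0]:
    model = Lasso(alpha=alpha, max_iter=5000).fit(X, y)
    # Coefficients driven exactly to zero correspond to dropped features.
    nonzero_counts[alpha] = int(np.sum(model.coef_ != 0))
    print(f"alpha={alpha}: {nonzero_counts[alpha]} non-zero coefficients")
```

A stronger penalty should leave fewer active features, which is what makes Lasso usable as a feature selector.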

## Elastic Net

**Description:** Combination of L1 and L2 regularization.

**Importing:**
```python
from sklearn.linear_model import ElasticNet
```

**Fitting:**
```python
model = ElasticNet()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
param_grid = {
    "alpha": [1e-4, 1e-3, 1e-2, 0.1, 1],
    "l1_ratio": [0.1, 0.3, 0.5, 0.7, 0.9],  # 0 = pure L2, 1 = pure L1
    "max_iter": [1000, 5000],
}
```

**Strengths:** Balances Lasso and Ridge

**Weaknesses:** More complex tuning

## Decision Tree Regressor

**Description:** Tree-based non-linear regression.

**Importing:**
```python
from sklearn.tree import DecisionTreeRegressor
```

**Fitting:**
```python
model = DecisionTreeRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
param_grid = {
    "max_depth": [None, 5, 10, 20],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
    "max_features": [None, "sqrt", "log2"],
}
```

**Strengths:** Captures non-linearities

**Weaknesses:** Prone to overfitting
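The overfitting tendency shows up as a gap between training and test scores. A minimal sketch, assuming noisy synthetic data, compares an unrestricted tree with one capped at `max_depth=5`:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Noisy synthetic data (assumption for illustration).
X, y = make_regression(n_samples=400, n_features=10, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scores = {}
for depth in [None, 5]:
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # Store (train R^2, test R^2) for each depth setting.
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    print(f"max_depth={depth}: train R^2={scores[depth][0]:.3f}, "
          f"test R^2={scores[depth][1]:.3f}")
```

An unrestricted tree memorizes the training set, so its training score says little about generalization; the grid parameters `max_depth` and `min_samples_leaf` are the usual controls.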

## Random Forest Regressor

**Description:** Ensemble of trees using bagging.

**Importing:**
```python
from sklearn.ensemble import RandomForestRegressor
```

**Fitting:**
```python
model = RandomForestRegressor(n_estimators=200)
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [None, 10, 30],
    "max_features": ["sqrt", "log2"],
    "min_samples_leaf": [1, 2, 5],
}
```

**Strengths:** Averaging many trees reduces variance, so accuracy is good with little tuning

**Weaknesses:** Less interpretable
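Two built-in diagnostics partly offset the interpretability cost: the out-of-bag score and impurity-based feature importances. The sketch below assumes synthetic data with 3 informative features out of 8. Impurity-based importances can be biased toward high-cardinality features, so treat the ranking as a rough guide.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (assumption for illustration).
X, y = make_regression(
    n_samples=300, n_features=8, n_informative=3, noise=5.0, random_state=0
)

# oob_score=True evaluates each tree on the samples left out of its bootstrap.
model = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
model.fit(X, y)

print("OOB R^2:", model.oob_score_)

# Importances are normalized to sum to 1; sort from most to least important.
ranking = np.argsort(model.feature_importances_)[::-1]
for idx in ranking:
    print(f"feature {idx}: {model.feature_importances_[idx]:.3f}")
```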

## Gradient Boosting Regressor

**Description:** Sequential boosting of weak learners.

**Importing:**
```python
from sklearn.ensemble import GradientBoostingRegressor
```

**Fitting:**
```python
model = GradientBoostingRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
```python
param_grid = {
    "n_estimators": [100, 300, 500],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 5],
    "subsample": [0.6, 0.8, 1.0],
}
```

**Strengths:** High predictive power

**Weaknesses:** Sensitive to tuning
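Part of the tuning burden comes from the interaction between `n_estimators` and `learning_rate`. One way to reduce it, sketched here under the assumption of a held-out validation split on synthetic data, is to fit once with many stages and use `staged_predict` to see which stage count scores best:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (assumption for illustration).
X, y = make_regression(n_samples=400, n_features=10, noise=15.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GradientBoostingRegressor(
    n_estimators=300, learning_rate=0.05, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# staged_predict yields predictions after 1, 2, ..., n_estimators stages.
val_mse = [
    mean_squared_error(y_val, y_pred) for y_pred in model.staged_predict(X_val)
]
best_n = int(np.argmin(val_mse)) + 1
print("best number of stages on the validation split:", best_n)
```

This inspects every stage count up to 300 from a single fit, instead of refitting for each `n_estimators` value in the grid.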

## XGBoost Regressor

**Description:** Optimized gradient boosting with regularization.

**Importing:**
```python
from xgboost import XGBRegressor
```

**Fitting:**
```python
model = XGBRegressor(objective='reg:squarederror')
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
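```python
# 3^5 = 243 combinations per CV fold; consider a smaller grid or a randomized search.
param_grid = {
    "n_estimators": [200, 500, 800],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [3, 5, 7],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
}
```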

**Strengths:** Often among the strongest performers on tabular data

**Weaknesses:** Complex tuning

## LightGBM Regressor

**Description:** Histogram-based gradient boosting.

**Importing:**
```python
from lightgbm import LGBMRegressor
```

**Fitting:**
```python
model = LGBMRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- num_leaves: [31,63,127]
- learning_rate: [0.01,0.05,0.1]
- n_estimators: [200,500]
- max_depth: [-1,10,20]

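**Strengths:** Fast training and low memory use, thanks to histogram binning

**Weaknesses:** Leaf-wise tree growth can overfit small datasets; limit `num_leaves` or `max_depth`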

## CatBoost Regressor

**Description:** Boosting with categorical handling.

**Importing:**
```python
from catboost import CatBoostRegressor
```

**Fitting:**
```python
model = CatBoostRegressor(verbose=0)
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- iterations: [300,600,1000]
- learning_rate: [0.01,0.05,0.1]
- depth: [4,6,8]

**Strengths:** Minimal preprocessing

**Weaknesses:** Slower training

## SVR

**Description:** Support Vector Regression.

**Importing:**
```python
from sklearn.svm import SVR
```

**Fitting:**
```python
model = SVR()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- C: [0.1,1,10,100]
- epsilon: [0.01,0.1,0.2]
- kernel: ['linear','rbf','poly']

**Strengths:** Effective in high dimensions

**Weaknesses:** Training time grows more than quadratically with the number of samples; sensitive to feature scaling
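SVR kernels work on distances and dot products between samples, so unscaled features can dominate the fit. A minimal sketch of tuning SVR inside a scaling pipeline follows. The synthetic data, the use of `make_pipeline`, and the reduced kernel list (kept short so the example runs quickly) are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

# make_pipeline names each step after its lowercased class name: "svr".
pipe = make_pipeline(StandardScaler(), SVR())

param_grid = {
    "svr__C": [0.1, 1, 10, 100],
    "svr__epsilon": [0.01, 0.1, 0.2],
    "svr__kernel": ["linear", "rbf"],
}

search = GridSearchCV(pipe, param_grid, cv=3, scoring="r2")
search.fit(X, y)

print("best params:", search.best_params_)
print("CV R^2:", search.best_score_)
```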

## KNN Regressor

**Description:** Instance-based regression.

**Importing:**
```python
from sklearn.neighbors import KNeighborsRegressor
```

**Fitting:**
```python
model = KNeighborsRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- n_neighbors: [3,5,7,11,15]
- weights: ['uniform','distance']
- p: [1,2]

**Strengths:** Simple, intuitive

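**Weaknesses:** Slow inference on large training sets, since each prediction searches the stored samples; sensitive to feature scaling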

## Bayesian Ridge

**Description:** Bayesian linear regression.

**Importing:**
```python
from sklearn.linear_model import BayesianRidge
```

**Fitting:**
```python
model = BayesianRidge()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- alpha_1: [1e-6,1e-5]
- alpha_2: [1e-6,1e-5]
- lambda_1: [1e-6,1e-5]
- lambda_2: [1e-6,1e-5]

**Strengths:** Uncertainty estimation

**Weaknesses:** Slower than OLS
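The uncertainty estimate is exposed through `predict(..., return_std=True)`. A minimal sketch on synthetic data follows; the 1.96 multiplier for an approximate 95% interval is an assumption that treats the predictive distribution as Gaussian.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge

# Synthetic data (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

model = BayesianRidge().fit(X, y)

# return_std=True also returns the standard deviation of the predictive distribution.
y_mean, y_std = model.predict(X[:5], return_std=True)

# Approximate 95% interval under a Gaussian predictive distribution.
lower = y_mean - 1.96 * y_std
upper = y_mean + 1.96 * y_std
for m, lo, hi in zip(y_mean, lower, upper):
    print(f"prediction {m:.1f} (interval {lo:.1f} to {hi:.1f})")
```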

## Huber Regressor

**Description:** Robust regression for outliers.

**Importing:**
```python
from sklearn.linear_model import HuberRegressor
```

**Fitting:**
```python
model = HuberRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- epsilon: [1.1,1.35,1.5,2.0]
- alpha: [1e-4,1e-3,1e-2]
- max_iter: [100,300,1000]

**Strengths:** Outlier resistant

**Weaknesses:** Slower convergence
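Outlier resistance is easiest to see next to ordinary least squares. The sketch below assumes a synthetic line with true slope 3 and corrupts the targets of the 20 samples with the largest `x`; the data and the size of the corruption are illustrative.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(0)

# Clean signal: y = 3x + 2 plus small Gaussian noise (assumption for illustration).
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.5, size=200)

# Corrupt the 20 samples with the largest x so the outliers pull the slope down.
outlier_idx = np.argsort(X[:, 0])[-20:]
y[outlier_idx] -= 50.0

ols = LinearRegression().fit(X, y)
huber = HuberRegressor(epsilon=1.35, max_iter=1000).fit(X, y)

print("OLS slope:  ", ols.coef_[0])
print("Huber slope:", huber.coef_[0])

# outliers_ is a boolean mask of the samples the model treated as outliers.
print("samples flagged as outliers:", int(huber.outliers_.sum()))
```

Smaller `epsilon` values treat more samples as outliers, which is the trade-off the `epsilon` grid explores.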

## Extra Trees Regressor

**Description:** Extremely randomized trees.

**Importing:**
```python
from sklearn.ensemble import ExtraTreesRegressor
```

**Fitting:**
```python
model = ExtraTreesRegressor(n_estimators=300)
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- n_estimators: [200,500]
- max_depth: [None,10,30]
- max_features: ['sqrt','log2']

**Strengths:** Low variance

**Weaknesses:** Low interpretability

## AdaBoost Regressor

**Description:** Boosting focused on hard samples.

**Importing:**
```python
from sklearn.ensemble import AdaBoostRegressor
```

**Fitting:**
```python
model = AdaBoostRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- n_estimators: [50,100,300]
- learning_rate: [0.01,0.1,1.0]
- loss: ['linear','square','exponential']

**Strengths:** Improves weak learners

**Weaknesses:** Sensitive to noise

## Poisson Regressor

**Description:** GLM for count data.

**Importing:**
```python
from sklearn.linear_model import PoissonRegressor
```

**Fitting:**
```python
model = PoissonRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- alpha: [0,0.01,0.1,1]
- max_iter: [100,300,1000]

**Strengths:** Good for counts

**Weaknesses:** Requires a non-negative target and assumes the variance equals the mean, which overdispersed counts violate
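A minimal sketch of fitting count data follows. The counts are simulated from a Poisson distribution with a log-linear mean, and the coefficients in `true_coef` are hypothetical values chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated counts with mean exp(0.5 + X @ true_coef) (assumption for illustration).
X = rng.normal(size=(500, 3))
true_coef = np.array([0.5, -0.3, 0.2])
y = rng.poisson(np.exp(0.5 + X @ true_coef))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = PoissonRegressor(alpha=0.01, max_iter=300).fit(X_train, y_train)

# The log link guarantees strictly positive predicted means.
y_pred = model.predict(X_test)

# score() returns D^2, the fraction of Poisson deviance explained.
d2 = model.score(X_test, y_test)
print("estimated coefficients:", model.coef_)
print("D^2 on the test split:", d2)
```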

## Tweedie Regressor

**Description:** Flexible generalized linear model.

**Importing:**
```python
from sklearn.linear_model import TweedieRegressor
```

**Fitting:**
```python
model = TweedieRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- power: [0,1,1.5,2] (0 = Normal, 1 = Poisson, 1.5 = compound Poisson-Gamma, 2 = Gamma)
- alpha: [0,0.01,0.1,1]
- link: ['identity','log']

**Strengths:** Distribution flexibility

**Weaknesses:** Choosing `power` and `link` requires knowing how the target is distributed

## Passive Aggressive Regressor

**Description:** Online learning regression model.

**Importing:**
```python
from sklearn.linear_model import PassiveAggressiveRegressor
```

**Fitting:**
```python
model = PassiveAggressiveRegressor()
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- C: [0.01,0.1,1,10]
- epsilon: [0.01,0.1,0.2]
- max_iter: [500,1000]

**Strengths:** Very fast

**Weaknesses:** Sensitive to noise
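Online learning here means the model can be updated batch by batch with `partial_fit` instead of seeing all data at once. A minimal sketch follows, assuming a simulated stream of 10 mini-batches and hypothetical true weights in `true_coef`:

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveRegressor

rng = np.random.default_rng(0)

# Simulated linear data (assumption for illustration).
true_coef = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
X = rng.normal(size=(1000, 5))
y = X @ true_coef + rng.normal(scale=0.1, size=1000)

model = PassiveAggressiveRegressor(C=1.0, epsilon=0.1, random_state=0)

# Feed the data in mini-batches of 100 rows, as if it arrived as a stream.
for start in range(0, len(X), 100):
    batch = slice(start, start + 100)
    model.partial_fit(X[batch], y[batch])

r2 = model.score(X, y)
print("learned coefficients:", model.coef_)
print("R^2 after one pass:", r2)
```

Each `partial_fit` call updates the existing weights, so the loop can continue as new batches arrive.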

## Gaussian Process Regressor

**Description:** Non-parametric probabilistic regression.

**Importing:**
```python
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern
```

**Fitting:**
```python
kernel = RBF()
model = GaussianProcessRegressor(kernel=kernel)
model.fit(X_train, y_train)
```

**Hyperparameter Tuning (GridSearch):**
- kernel: [RBF(), Matern()]
- alpha: [1e-10,1e-5,1e-2]
- normalize_y: [True, False]

**Strengths:** Uncertainty estimation

**Weaknesses:** Poor scalability; exact inference costs O(n^3) time and O(n^2) memory in the number of training samples
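The uncertainty estimate comes from `predict(..., return_std=True)`. The sketch below assumes a noisy sine curve as synthetic data and queries one point inside the training range and one far outside it, where the predictive standard deviation should be larger.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Noisy sine curve on [0, 5] (assumption for illustration).
X = rng.uniform(0, 5, size=(40, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=40)

# alpha is added to the kernel diagonal and acts as assumed observation noise.
model = GaussianProcessRegressor(
    kernel=RBF(), alpha=1e-2, normalize_y=True, random_state=0
)
model.fit(X, y)

# One query inside the training range, one far outside it.
X_new = np.array([[2.5], [10.0]])
y_mean, y_std = model.predict(X_new, return_std=True)

for x, m, s in zip(X_new[:, 0], y_mean, y_std):
    print(f"x={x}: mean={m:.3f}, std={s:.3f}")
```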