# 🧺 Bagging Regression

## 📌 What is Bagging?

**Bagging (Bootstrap Aggregating)** is an ensemble technique that:
- Reduces **variance**
- Improves **stability and performance**
- Works by combining predictions from **multiple regressors trained independently**

---

## 📐 Bagging for Regression

### 🎯 Goal:
Improve prediction accuracy by averaging outputs from multiple base regressors trained on **bootstrap samples**.

---

## 🔁 Steps of Bagging Regression

1. **Create bootstrap samples** from the training data  
   (sampling **with replacement**)

2. **Train a separate regressor** (e.g., decision tree) on each bootstrap sample

3. **Aggregate predictions** using **mean**:
   $$
   \hat{y}(x) = \frac{1}{M} \sum_{i=1}^{M} f_i(x)
   $$
   where $f_i(x)$ is the prediction of the $i$-th model and $M$ is the number of base regressors

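The three steps above can be sketched by hand in a few lines. This is a minimal illustration rather than a reference implementation: the helper names `fit_bagged_trees` and `predict_bagged` are hypothetical, and the noisy sine data is made up for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_bagged_trees(X, y, n_estimators=25, seed=0):
    """Steps 1 and 2: train one tree per bootstrap sample (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        # Step 1: draw n row indices with replacement
        idx = rng.integers(0, n, size=n)
        # Step 2: fit a separate regressor on that bootstrap sample
        models.append(DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))
    return models


def predict_bagged(models, X):
    """Step 3: average the base predictions (hypothetical helper)."""
    return np.mean([m.predict(X) for m in models], axis=0)


# Synthetic data, assumed purely for illustration
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

models = fit_bagged_trees(X, y)
y_hat = predict_bagged(models, X[:5])
print(y_hat)
```
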
---

## 🌳 Common Base Learner

- **Decision Trees** are commonly used as base regressors because:
  - They are **high-variance** models
  - Bagging effectively reduces their variance

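One way to see this variance reduction is to refit both models on many freshly drawn training sets and measure how much their predictions move. The sketch below assumes a made-up noisy sine problem, and `sample_training_set` is a hypothetical helper. The base estimator is passed positionally, since its keyword name differs between scikit-learn versions.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_test = np.linspace(-3, 3, 50).reshape(-1, 1)  # fixed evaluation points


def sample_training_set(n=100):
    """Draw a fresh noisy training set (hypothetical helper, synthetic data)."""
    X = rng.uniform(-3, 3, size=(n, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.5, size=n)
    return X, y


tree_preds, bag_preds = [], []
for rep in range(20):
    X, y = sample_training_set()
    tree = DecisionTreeRegressor(random_state=rep).fit(X, y)
    bag = BaggingRegressor(
        DecisionTreeRegressor(), n_estimators=25, random_state=rep
    ).fit(X, y)
    tree_preds.append(tree.predict(X_test))
    bag_preds.append(bag.predict(X_test))

# Variance of the prediction at each test point across training sets, then averaged
tree_var = np.var(tree_preds, axis=0).mean()
bag_var = np.var(bag_preds, axis=0).mean()
print(f"single tree: {tree_var:.3f}  bagged: {bag_var:.3f}")
```
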
---

## 📊 Why Bagging Works for Regression

- Reduces **overfitting** caused by variance in complex models
- Helps generalize better on unseen data
- Each model trains on a **different bootstrap sample**, which adds diversity among the base models

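A standard identity helps make the variance argument concrete. As a simplifying assumption (the symbols $\sigma^2$ and $\rho$ are introduced here for illustration), suppose the $M$ base predictions at a point $x$ each have variance $\sigma^2$ and pairwise correlation $\rho$. Then the variance of their average is:

$$
\operatorname{Var}\left(\frac{1}{M} \sum_{i=1}^{M} f_i(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2
$$

The second term shrinks as $M$ grows, while the first does not. Under this assumption, adding more models helps only up to a point, and the diversity between models (a lower $\rho$) determines how much variance remains.
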
---

## ✅ Advantages

| Benefit               | Description                                      |
|-----------------------|--------------------------------------------------|
| Reduces Variance      | Averages out the errors of unstable models (like trees) |
| Handles Overfitting   | Most useful when base models overfit, e.g. deep trees on noisy data |
| Parallelizable        | Models are trained independently                 |
| Easy to Implement     | Simple averaging strategy                        |

---

## ❌ Limitations

| Limitation                | Description                                                        |
|---------------------------|--------------------------------------------------------------------|
| No Bias Reduction         | Doesn't fix underfitting (high bias)                               |
| Inefficient on Small Data | Each bootstrap sample omits about a third of the distinct training points |
| Less Interpretability     | Multiple models make explanation harder                            |

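The small-data limitation can be checked numerically. The sketch below draws one bootstrap sample of row indices and counts how many distinct rows it contains. The sample size `n` and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # illustrative training-set size

# One bootstrap sample: n row indices drawn with replacement
idx = rng.integers(0, n, size=n)

in_bag_fraction = np.unique(idx).size / n  # distinct rows the model actually sees
oob_fraction = 1 - in_bag_fraction         # rows left out of this sample

# Probability that a given row appears at least once; approaches 1 - 1/e ≈ 0.632
expected_in_bag = 1 - (1 - 1 / n) ** n

print(f"in-bag: {in_bag_fraction:.3f}  out-of-bag: {oob_fraction:.3f}")
print(f"expected in-bag: {expected_in_bag:.3f}")
```
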
---

## 🛠 Scikit-Learn Example

```python
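from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data so the example runs on its own (an assumption; substitute your own X, y)
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = BaggingRegressor(
    estimator=DecisionTreeRegressor(),  # named `base_estimator` before scikit-learn 1.2 (removed in 1.4)
    n_estimators=100,
    oob_score=True,  # score each sample using only the trees that did not train on it
    random_state=42
)

model.fit(X_train, y_train)

# Out-of-bag R^2: a validation-style estimate that needs no separate hold-out set
print("OOB R^2 Score:", model.oob_score_)
print("Test R^2 Score:", model.score(X_test, y_test))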
