# 🗳️ Voting Ensemble in Regression

---

## 🎯 **What is it?**
A voting ensemble for regression combines the predictions of multiple base regressors by averaging their outputs. Unlike classification voting, which takes a majority vote over class labels (or averages class probabilities), regression voting takes a **simple average** or **weighted average** of continuous predictions.

---

## 📊 **How Voting Works in Regression**

### 🎯 **Simple Averaging (Uniform Voting)**
> All models contribute equally to the final prediction

$$\hat{y}_{\text{ensemble}} = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i$$

Where:
- $n$ = number of base models
- $\hat{y}_i$ = prediction from model $i$

### ⚖️ **Weighted Averaging**
> Models contribute based on their performance weights

$$\hat{y}_{\text{ensemble}} = \frac{\sum_{i=1}^{n} w_i \cdot \hat{y}_i}{\sum_{i=1}^{n} w_i}$$

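Where:
- $w_i$ = weight for model $i$ (non-negative)
- If the weights are normalized so that $\sum_{i=1}^{n} w_i = 1$, the denominator equals 1 and the formula reduces to $\sum_{i=1}^{n} w_i \cdot \hat{y}_i$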
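
As a small worked sketch of the two formulas above, the snippet below averages made-up predictions from three hypothetical models. The prediction values and the raw weights are illustrative assumptions, not results from any real model.

```python
import numpy as np

# Hypothetical predictions: rows = models, columns = samples
preds = np.array([
    [10.0, 20.0, 30.0],   # model 1
    [12.0, 18.0, 33.0],   # model 2
    [14.0, 22.0, 27.0],   # model 3
])

# Simple averaging: every model counts equally
simple = preds.mean(axis=0)

# Weighted averaging with raw (un-normalized) weights;
# dividing by their sum is what normalizes them
raw_weights = np.array([2.0, 1.0, 1.0])
weighted = (raw_weights[:, None] * preds).sum(axis=0) / raw_weights.sum()

# np.average performs the same normalization internally
weighted_np = np.average(preds, axis=0, weights=raw_weights)

print(simple)    # [12. 20. 30.]
print(weighted)  # [11.5 20.  30. ]
```

Because model 1 carries half of the total weight, the first weighted prediction moves from 12 toward model 1's value of 10.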

---

## 🧮 **Weight Calculation Strategies**

### 📈 **Performance-Based Weights**

**1. Inverse MSE Weighting:**
$$w_i = \frac{1/\text{MSE}_i}{\sum_{j=1}^{n} 1/\text{MSE}_j}$$

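**2. R² Score Weighting:**
$$w_i = \frac{R^2_i}{\sum_{j=1}^{n} R^2_j}$$

This is only valid when every $R^2_i$ is positive; since $R^2$ can be zero or negative for poor models, clip such scores to 0 or drop those models first.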

**3. Inverse RMSE Weighting:**
$$w_i = \frac{1/\text{RMSE}_i}{\sum_{j=1}^{n} 1/\text{RMSE}_j}$$
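
The three weighting formulas above can be sketched with two small helpers. The function names `inverse_error_weights` and `r2_weights` are hypothetical, the error values are made up for illustration, and clipping negative R² scores to zero is an assumption about how to handle poor models rather than part of the formulas themselves.

```python
import numpy as np

def inverse_error_weights(errors):
    """Normalized weights proportional to 1 / error (works for MSE or RMSE)."""
    errors = np.asarray(errors, dtype=float)
    if np.any(errors <= 0):
        raise ValueError("errors must be strictly positive")
    inv = 1.0 / errors
    return inv / inv.sum()

def r2_weights(r2_scores):
    """Normalized weights proportional to R²; negative scores are clipped to 0."""
    r2 = np.clip(np.asarray(r2_scores, dtype=float), 0.0, None)
    if r2.sum() == 0:
        raise ValueError("at least one model needs a positive R² score")
    return r2 / r2.sum()

mse = np.array([4.0, 1.0, 2.0])                # hypothetical validation MSEs
w_mse = inverse_error_weights(mse)             # [1/7, 4/7, 2/7]
w_rmse = inverse_error_weights(np.sqrt(mse))   # flatter than the MSE-based weights
w_r2 = r2_weights([0.6, 0.9, -0.2])            # third model gets weight 0

print(np.round(w_mse, 3), np.round(w_rmse, 3), np.round(w_r2, 3))
```

Inverse MSE weighting favors the best model more strongly than inverse RMSE weighting, because squaring the error widens the gaps between models.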

### 🎲 **Cross-Validation Based Weights**
```python
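# CV-based weighting; assumes `models` (list of regressors), `X_train`, `y_train` exist
import numpy as np
from sklearn.model_selection import cross_val_score

weights = np.empty(len(models))
for i, model_i in enumerate(models):
    cv_scores = cross_val_score(model_i, X_train, y_train, cv=5,
                                scoring='neg_mean_squared_error')
    weights[i] = 1 / (-cv_scores.mean())  # Lower MSE = higher weight
weights /= weights.sum()  # Normalize so the weights sum to 1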
```

---

## 🏗️ **Implementation Types**

### 🔧 **VotingRegressor (Sklearn)**
> Built-in implementation; averages uniformly by default and accepts an optional `weights` argument

```python
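from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)
from sklearn.linear_model import LinearRegression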

# Create base models
lr = LinearRegression()
rf = RandomForestRegressor(n_estimators=100)
gb = GradientBoostingRegressor(n_estimators=100)

# Voting ensemble
voting_reg = VotingRegressor([
    ('linear', lr),
    ('forest', rf), 
    ('boosting', gb)
])
```
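
`VotingRegressor` also takes a `weights` argument, so performance-based weights can be passed straight to it. The sketch below derives inverse-MSE weights from cross-validation and hands them to the ensemble. The synthetic dataset from `make_regression` and the choice of 50 trees are assumptions made only to keep the example self-contained and fast.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic placeholder data (assumption); swap in a real dataset
X, y = make_regression(n_samples=200, n_features=8, noise=15.0, random_state=0)

estimators = [
    ("linear", LinearRegression()),
    ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("boosting", GradientBoostingRegressor(n_estimators=50, random_state=0)),
]

# Mean cross-validated MSE per model (sklearn reports it negated)
cv_mse = np.array([
    -cross_val_score(est, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    for _, est in estimators
])

# Inverse-MSE weights, normalized to sum to 1
weights = (1.0 / cv_mse) / (1.0 / cv_mse).sum()

weighted_reg = VotingRegressor(estimators, weights=weights.tolist())
weighted_reg.fit(X, y)

for (name, _), w in zip(estimators, weights):
    print(f"{name}: weight = {w:.3f}")
```

Ideally the weights come from data the final evaluation never sees; here they are computed on the training set via cross-validation, which keeps the test set untouched.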

### 🎨 **Custom Weighted Voting**
> Manual implementation with custom weights

```python
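import numpy as np

class WeightedVotingRegressor:
    def __init__(self, models, weights=None):
        self.models = models
        # Default to uniform weights; `weights or ...` would fail on NumPy arrays
        self.weights = [1 / len(models)] * len(models) if weights is None else weights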
    
    def fit(self, X, y):
        for model in self.models:
            model.fit(X, y)
        return self
    
    def predict(self, X):
        predictions = np.array([model.predict(X) for model in self.models])
        return np.average(predictions, axis=0, weights=self.weights)
```

---

## 📊 **Mathematical Properties**

### 🎯 **Bias-Variance Decomposition**

For a simple-average ensemble of $n$ models with individual biases $B_i$ and variances $V_i$:

**Ensemble Bias:**
$$\text{Bias}^2_{\text{ensemble}} = \left(\frac{1}{n}\sum_{i=1}^{n} B_i\right)^2$$

**Ensemble Variance (uncorrelated models):**
$$\text{Var}_{\text{ensemble}} = \frac{1}{n^2}\sum_{i=1}^{n} V_i$$

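**Ensemble Variance (correlated models with equal variance):**
$$\text{Var}_{\text{ensemble}} = \rho \sigma^2 + \frac{1-\rho}{n}\sigma^2$$

Where $\rho$ = average pairwise correlation between model errors and $\sigma^2$ = variance of each model (assumed equal). As $n$ grows, the second term vanishes and the variance floors at $\rho \sigma^2$, which is why adding more highly correlated models helps little.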
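
The correlated-variance formula can be checked numerically. The simulation below is a sketch under stated assumptions: model errors are drawn as zero-mean Gaussians with the same variance $\sigma^2$ and the same pairwise correlation $\rho$, and the values of `n_models`, `sigma2`, and `rho` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, sigma2, rho = 5, 4.0, 0.3
n_trials = 200_000

# Covariance matrix: sigma2 on the diagonal, rho * sigma2 everywhere else
cov = sigma2 * (rho * np.ones((n_models, n_models))
                + (1 - rho) * np.eye(n_models))

# Each row holds one draw of the n_models correlated errors
errors = rng.multivariate_normal(np.zeros(n_models), cov, size=n_trials)

# Variance of the simple-average ensemble error, estimated empirically
empirical = errors.mean(axis=1).var()

# Value predicted by the formula
theoretical = rho * sigma2 + (1 - rho) / n_models * sigma2

print(f"empirical   = {empirical:.3f}")
print(f"theoretical = {theoretical:.3f}")  # 1.760
```

With these settings a single model has variance 4.0, while the five-model average sits near 1.76; the floor $\rho \sigma^2 = 1.2$ is what remains no matter how many such models are added.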

---

## ✅ **Advantages**

```
✓ Simple and intuitive approach
✓ Reduces variance; ensemble bias is the average of the individual biases
✓ Works well with diverse, uncorrelated models
✓ Usually less prone to overfitting than a single high-variance model
✓ Easy to implement and understand
✓ Computationally efficient (parallel training)
✓ Dampens the effect of an occasional bad prediction from one model
```

## ❌ **Disadvantages**

```
✗ Limited improvement if models are highly correlated
✗ Poor models can drag down ensemble performance
✗ No learning of optimal combination weights
✗ Requires storage and prediction from all models
✗ May not capture complex model interactions
✗ No guarantee of beating the best individual model
```

---

## 🎪 **Use Cases & Applications**

### 🏠 **Real Estate Price Prediction**
```python
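from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Combine different approaches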
models = [
    ('location_model', LinearRegression()),      # Linear relationships
    ('feature_model', RandomForestRegressor()),  # Non-linear features  
    ('trend_model', GradientBoostingRegressor()) # Temporal patterns
]
```

### 📈 **Financial Forecasting**
```python
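from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# Different time horizons and approaches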
models = [
    ('technical', SVR()),                    # Technical indicators
    ('fundamental', LinearRegression()),     # Economic fundamentals
    ('sentiment', RandomForestRegressor())   # Market sentiment
]
```

### 🌡️ **Environmental Modeling**
```python
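from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from xgboost import XGBRegressor  # third-party package, installed separately

# Different physical models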
models = [
    ('statistical', Ridge()),               # Statistical patterns
    ('physical', DecisionTreeRegressor()),  # Rule-based physics
    ('hybrid', XGBRegressor())             # Machine learning
]
```

---

## 🛠️ **Best Practices**

### 🎯 **Model Selection Strategy**

| Model Type | Strength | Best Used For |
|------------|----------|---------------|
| **Linear Models** | Simple relationships | Baseline, interpretability |
| **Tree-based** | Non-linear patterns | Feature interactions |
| **Neural Networks** | Complex patterns | High-dimensional data |
| **SVR** | Robust to outliers | Noisy data |

### 📊 **Diversity Maximization**
- **Different algorithms**: Linear, tree-based, neural
- **Different features**: Subsets, transformations
- **Different data**: Bootstrap samples, cross-validation folds
- **Different hyperparameters**: Various complexity levels
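
One way to check whether a set of candidate models is actually diverse is to look at how correlated their test-set residuals are. The sketch below does this for three models on synthetic data. The dataset, the model choices, and their hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data (assumption)
X, y = make_regression(n_samples=300, n_features=6, noise=20.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

models = {
    "ridge": Ridge(alpha=1.0),
    "tree": DecisionTreeRegressor(max_depth=4, random_state=1),
    "forest": RandomForestRegressor(n_estimators=50, random_state=1),
}

# One row of test-set residuals per model
residuals = np.array([
    y_test - model.fit(X_train, y_train).predict(X_test)
    for model in models.values()
])

# Pairwise Pearson correlation between the residual rows
corr = np.corrcoef(residuals)

print(list(models))
print(np.round(corr, 2))
```

Off-diagonal entries close to 1 suggest two models make nearly the same mistakes, so averaging them adds little; lower values indicate more complementary errors.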

---

## 📈 **Performance Evaluation**

### 🎯 **Ensemble vs Individual Models**

```python
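import numpy as np
from sklearn.metrics import r2_score

# Evaluation framework; `models` is a list of (name, fitted_model) pairs
def evaluate_ensemble(models, X_test, y_test):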
    individual_scores = []
    predictions = []
    
    for name, model in models:
        pred = model.predict(X_test)
        score = r2_score(y_test, pred)
        individual_scores.append((name, score))
        predictions.append(pred)
    
    # Ensemble prediction
    ensemble_pred = np.mean(predictions, axis=0)
    ensemble_score = r2_score(y_test, ensemble_pred)
    
    return individual_scores, ensemble_score
```

### 📊 **Key Metrics**

| Metric | Formula | Interpretation |
|--------|---------|----------------|
| **R² Score** | $1 - \frac{SS_{res}}{SS_{tot}}$ | Variance explained |
| **RMSE** | $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$ | Prediction error |
| **MAE** | $\frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i \rvert$ | Average absolute error |
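
The three metrics in the table can be computed with `sklearn.metrics`. The tiny arrays below are made-up values chosen so the results are easy to verify by hand.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical targets and predictions; errors are 1, 0, -1, -1
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 10.0])

r2 = r2_score(y_true, y_pred)                       # 1 - 3/20 = 0.85
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # sqrt(3/4), about 0.866
mae = mean_absolute_error(y_true, y_pred)           # 3/4 = 0.75

print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

Here $SS_{res} = 1 + 0 + 1 + 1 = 3$ and $SS_{tot} = 9 + 1 + 1 + 9 = 20$ around the mean of 6, which gives the R² of 0.85.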

---

## 💻 **Complete Implementation Example**

```python
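import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              VotingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic placeholder data; replace with your own X and y
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)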

# Define base models
base_models = [
    ('linear', LinearRegression()),
    ('forest', RandomForestRegressor(n_estimators=100, random_state=42)),
    ('gradient', GradientBoostingRegressor(n_estimators=100, random_state=42)),
    ('svr', SVR(kernel='rbf'))
]

# Create voting ensemble
voting_reg = VotingRegressor(base_models)

# Train ensemble
voting_reg.fit(X_train, y_train)

# Make predictions
y_pred = voting_reg.predict(X_test)

# Evaluate
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"Ensemble R² Score: {r2:.4f}")
print(f"Ensemble RMSE: {rmse:.4f}")
```

---

## 🎯 **Advanced Techniques**

### 🧠 **Dynamic Weighting**
Adjust weights based on input characteristics, using a gating network $\text{NN}(x)$ that outputs one score per base model:
$$w_i(x) = \text{softmax}(\text{NN}(x))_i$$
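
A minimal sketch of the gating idea follows. It substitutes a single linear layer for the neural network, and the gate parameters `gate_W` and `gate_b` are random and untrained, so the code only illustrates the mechanics of input-dependent weights. The function name `dynamic_weighted_predict` and all shapes are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_weighted_predict(X, base_preds, gate_W, gate_b):
    """Combine base predictions with per-sample weights.

    X:          (n_samples, n_features) inputs
    base_preds: (n_samples, n_models) predictions from the base models
    gate_W/b:   parameters of a linear gate (stand-in for NN(x))
    """
    logits = X @ gate_W + gate_b         # one score per model, per sample
    weights = softmax(logits, axis=1)    # each row sums to 1
    return (weights * base_preds).sum(axis=1), weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))              # 4 samples, 3 features
base_preds = rng.normal(size=(4, 2))     # predictions from 2 hypothetical models
gate_W = rng.normal(size=(3, 2))         # untrained, random gate parameters
gate_b = np.zeros(2)

y_hat, w = dynamic_weighted_predict(X, base_preds, gate_W, gate_b)
print(np.round(w, 3))
print(np.round(y_hat, 3))
```

Because the weights in each row are non-negative and sum to 1, every combined prediction lies between the smallest and largest base prediction for that sample. In practice the gate would be trained, for example by minimizing the ensemble's squared error on held-out data.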

### 🔄 **Stacked Ensemble**
Use a meta-model to learn the optimal combination of base predictions:
$$\hat{y} = \text{MetaModel}(\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)$$
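
scikit-learn provides `StackingRegressor` for this pattern. The sketch below uses `RidgeCV` as the meta-model and synthetic data from `make_regression`; both choices, along with the base models and their settings, are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (assumption)
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("linear", LinearRegression()),
        ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=RidgeCV(),  # meta-model fitted on out-of-fold base predictions
    cv=5,
)
stack.fit(X_train, y_train)

score = stack.score(X_test, y_test)  # R² on the held-out split
print(f"Stacked R2: {score:.3f}")
print("Meta-model coefficients:", stack.final_estimator_.coef_)
```

The meta-model's coefficients play the role of learned combination weights, but unlike voting weights they are not constrained to be non-negative or to sum to 1.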

---

> **💡 Pro Tip**: Voting works best with diverse, moderately accurate models. Avoid including very poor models as they can hurt overall performance!