# Stacking Ensemble Learning

## Problem Type
**Stacking Ensemble Learning** is primarily used for:
- **Supervised Learning**
- **Regression** and **Classification** tasks
- **Applications**: Any predictive modeling problem where multiple base models can be combined to improve performance (e.g., tabular data, image classification, NLP tasks).

### How Stacking Ensemble Learning Works
- **Base Learners:**
  - Stacking uses multiple diverse models (e.g., decision trees, SVMs, neural networks) as base learners, each trained on the same dataset.
- **Meta-Learner:**
  - A meta-learner (often a simple model like linear regression or logistic regression) is trained on the predictions made by the base learners, combining them into a final prediction.
- **Layering:**
  - The models are arranged in layers: the base learners form the first layer, and the meta-learner forms the final layer.
- **Training:**
  - Each base learner is trained on the original training data, and their predictions are used as input features to train the meta-learner.
- **Cross-Validation:**
  - The meta-learner's training features are usually out-of-fold predictions: each training sample is predicted by base learners that did not see it during fitting. This keeps the meta-learner from learning to trust base learners that merely memorized the training data.
- **Model Diversity:**
  - The key to stacking is using a diverse set of base learners that capture different aspects of the data, improving the ensemble's overall performance.
- **Prediction:**
  - During inference, the base learners make predictions on the input data, and the meta-learner combines these predictions to produce the final output.
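
The steps above can be sketched by hand with `cross_val_predict`. This is a minimal illustration rather than a reference implementation: the choice of base learners, the use of class probabilities as meta-features, and the variable names (`oof_features`, `test_features`) are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

# Layer 1: diverse base learners (illustrative choices).
base_learners = [
    RandomForestClassifier(n_estimators=50, random_state=42),
    SVC(probability=True, random_state=42),
]

# Out-of-fold class probabilities: each training row is predicted by a model
# that did not see that row during fitting.
oof_features = np.hstack([
    cross_val_predict(model, X_train, y_train, cv=5, method="predict_proba")
    for model in base_learners
])

# Layer 2: the meta-learner is trained on the out-of-fold predictions.
meta_learner = LogisticRegression(max_iter=1000)
meta_learner.fit(oof_features, y_train)

# Refit each base learner on the full training set for use at inference time.
for model in base_learners:
    model.fit(X_train, y_train)

# Inference: base-learner predictions become the meta-learner's input features.
test_features = np.hstack(
    [model.predict_proba(X_test) for model in base_learners]
)
final_predictions = meta_learner.predict(test_features)
accuracy = (final_predictions == y_test).mean()
print("Manual stacking accuracy:", accuracy)
```

With three classes and two base learners, the meta-learner sees six input features per sample. `StackingClassifier` automates the same flow, so the manual version is mainly useful for seeing where the out-of-fold predictions enter.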

### Key Tuning Parameters
- **`base_learners`:**
  - **Description:** Types and number of base models used in the stacking ensemble.
  - **Impact:** Diversity among base learners helps in capturing different data patterns, improving ensemble performance.
  - **Default:** Common choices include decision trees, SVMs, and neural networks.
- **`meta_learner`:**
  - **Description:** The model used to aggregate predictions from base learners.
  - **Impact:** A simple, robust meta-learner (e.g., logistic regression for classification) can effectively combine diverse base models' predictions.
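  - **Default:** Often a linear model like logistic regression or linear regression. In scikit-learn, `StackingClassifier` defaults to `LogisticRegression` and `StackingRegressor` to `RidgeCV`.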
- **`cross_validation_splits`:**
  - **Description:** Number of folds in cross-validation used for generating base learners' predictions.
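  - **Impact:** With more folds, each base learner is fitted on a larger share of the training data when producing out-of-fold predictions, but every base learner must be refitted once per fold, so training cost grows roughly linearly with the number of folds.
  - **Default:** Typically `5` or `10` folds; scikit-learn's stacking estimators use 5-fold cross-validation when `cv` is not set.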
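
The three settings above are conceptual names. In scikit-learn's stacking estimators they correspond to the `estimators`, `final_estimator`, and `cv` arguments. The sketch below shows that mapping on a regression task. The dataset (`load_diabetes`), the base learners, and the hyperparameter values are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stacking_regressor = StackingRegressor(
    # Base learners: a list of (name, estimator) pairs.
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=42)),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ],
    # Meta-learner: a simple linear model that combines the base predictions.
    final_estimator=Ridge(alpha=1.0),
    # Number of folds used to build the meta-learner's training features.
    cv=5,
)
stacking_regressor.fit(X_train, y_train)

# score() returns R-squared for regressors.
r2 = stacking_regressor.score(X_test, y_test)
print("Stacked R^2 on the test split:", r2)
```

Because the base learners are named, their hyperparameters can be reached with the usual double-underscore syntax (for example `rf__n_estimators`) when running a grid search over the whole stack.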
  
### Pros vs Cons

| Pros                                                  | Cons                                                   |
|-------------------------------------------------------|--------------------------------------------------------|
| Can significantly improve model performance by combining strengths of diverse models | Computationally expensive, especially with many base learners |
| Can reduce variance compared to a single model when base learners make different errors | Requires careful selection and tuning of base learners and meta-learner |
| Versatile and can be applied to both regression and classification tasks | Difficult to interpret and understand the final model due to multiple layers |
| Handles both linear and non-linear relationships effectively | More complex to implement and tune compared to simpler ensemble methods |
| Robust to noisy data if base learners are diverse     | High risk of overfitting if cross-validation or stacking is not properly configured |

### Evaluation Metrics
- **Accuracy (Classification):**
  - **Description:** Ratio of correct predictions to total predictions.
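  - **Good Value:** Higher is better, but what counts as good depends on the class balance; compare against the accuracy of always predicting the majority class.
  - **Bad Value:** Accuracy at or below the majority-class baseline suggests the model has learned little.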
- **Precision (Classification):**
  - **Description:** Proportion of true positives among all positive predictions.
  - **Good Value:** Higher values indicate fewer false positives, especially important in imbalanced datasets.
  - **Bad Value:** Low values suggest many false positives.
- **Recall (Classification):**
  - **Description:** Proportion of actual positives correctly identified.
  - **Good Value:** Higher values indicate fewer false negatives, important in recall-sensitive applications.
  - **Bad Value:** Low values suggest many false negatives.
- **F1 Score (Classification):**
  - **Description:** Harmonic mean of Precision and Recall.
  - **Good Value:** Higher values indicate a good balance between Precision and Recall.
  - **Bad Value:** Low values suggest a poor balance between Precision and Recall.
- **R-squared (Regression):**
  - **Description:** Proportion of variance in the dependent variable explained by the model.
  - **Good Value:** Higher is better; values closer to 1 indicate that the model explains most of the variance.
  - **Bad Value:** Values near 0 mean the model does no better than always predicting the mean; negative values on held-out data mean it does worse.
- **Mean Absolute Error (MAE) (Regression):**
  - **Description:** Measures the average absolute difference between predicted and actual values.
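  - **Good Value:** Lower is better; `0` means perfect predictions. MAE is expressed in the units of the target, so judge it relative to the target's scale.
  - **Bad Value:** Values that are large relative to the target's scale suggest significant prediction errors.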
- **Root Mean Squared Error (RMSE) (Regression):**
  - **Description:** Measures the square root of the average squared difference between predicted and actual values.
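  - **Good Value:** Lower is better; `0` means perfect predictions. RMSE is in the units of the target and penalizes large errors more heavily than MAE.
  - **Bad Value:** Values that are large relative to the target's scale suggest the model's predictions deviate significantly from actual values.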
- **AUC-ROC (Classification):**
  - **Description:** Measures the model's ability to distinguish between classes across all thresholds.
  - **Good Value:** Values closer to 1 indicate strong separability between classes.
  - **Bad Value:** Values near 0.5 suggest random guessing; values well below 0.5 suggest the scores are systematically inverted.
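
As a rough illustration, the metrics above can be computed with `sklearn.metrics`. The labels, scores, and regression targets below are made-up toy values, not results from any model in this notebook, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
    roc_auc_score,
)

# Hypothetical binary labels and predicted positive-class scores.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.6, 0.7, 0.2])
y_pred = (y_score >= 0.5).astype(int)  # assumed threshold of 0.5

classification_metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    # AUC-ROC uses the continuous scores, not the thresholded labels.
    "auc_roc": roc_auc_score(y_true, y_score),
}

# Hypothetical regression targets and predictions.
y_true_reg = np.array([3.0, 5.0, 2.5, 7.0])
y_pred_reg = np.array([2.5, 5.0, 3.0, 8.0])

regression_metrics = {
    "r2": r2_score(y_true_reg, y_pred_reg),
    "mae": mean_absolute_error(y_true_reg, y_pred_reg),
    # RMSE is the square root of the mean squared error.
    "rmse": np.sqrt(mean_squared_error(y_true_reg, y_pred_reg)),
}

for name, value in {**classification_metrics, **regression_metrics}.items():
    print(f"{name}: {value:.3f}")
```

For multi-class problems such as the Iris example, `precision_score`, `recall_score`, and `f1_score` need an `average` argument (for example `average="macro"`), and `roc_auc_score` needs `multi_class` set along with per-class probabilities.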



In [None]:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

In [None]:
# Load a dataset
data = load_iris()
X, y = data.data, data.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Define the base models
base_models = [
    ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ("svc", SVC()),
]

# Define the meta model
meta_model = LogisticRegression()

# Define the stacking classifier
stacking_model = StackingClassifier(
    estimators=base_models,
    final_estimator=meta_model,
    cv=5,
)

# Fit the model
stacking_model.fit(X_train, y_train)

# Make predictions
predictions = stacking_model.predict(X_test)

In [None]:
# Evaluate the stacking model using the predictions computed in the previous cell
print("Stacking Model Accuracy: ", accuracy_score(y_test, predictions))
print(classification_report(y_test, predictions))

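# Evaluate each base model on its own for comparison.
# named_estimators_ holds the base models already fitted on the full
# training set by stacking_model.fit, so no refit is needed.
for name, model in stacking_model.named_estimators_.items():
    base_predictions = model.predict(X_test)
    print(f"{name} Accuracy: ", accuracy_score(y_test, base_predictions))
    print(classification_report(y_test, base_predictions))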