# Bagging, AdaBoost, Stacking, Voting

### Comparison of Ensemble Learning Techniques

| Classifier | Core Principle | Base Learners | Training Method | Key Advantages | Best For | Python Class |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| **Bagging** | **Bootstrap Aggregating**: Parallel training via sampling. | Same type (e.g., Trees). | Independent training on random subsets. | Reduces variance & overfitting. | High-variance models. | `BaggingClassifier` |
| **AdaBoost**| **Adaptive Boosting**: Each model corrects previous errors. | Weak learners (Stumps). | Sequential; weights misclassified samples higher. | Reduces bias, high accuracy. | Binary classification. | `AdaBoostClassifier`|
| **Voting** | **Aggregates predictions** via majority or average. | Heterogeneous types. | Independent training on full dataset. | Simple, robust, no meta-model to train. | Diverse model strengths. | `VotingClassifier` |
| **Stacking** | **Stacked Generalization**: Trains a meta-model to combine results. | Heterogeneous types. | Base models output $\rightarrow$ Meta-model input. | Highest accuracy potential. | Competitions/Final tuning. | `StackingClassifier` |

### Bagging (Bootstrap Aggregating)

Key Insight: By training each model on a different bootstrap sample and aggregating their predictions, Bagging reduces the variance of unstable models. The classic example is the Random Forest, which is Bagging applied to Decision Trees with added random feature selection.

When to Use: Your base model (like a deep Decision Tree) is accurate but overfits (high variance). Bagging stabilizes it.



```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bagging_model = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=100,
    max_samples=0.8,  # Each bootstrap sample draws 80% of the training-set size
    random_state=42
)
```

### AdaBoost (Adaptive Boosting)

Key Insight: It's an adaptive process. The algorithm focuses more and more on the examples that previous models got wrong. This sequential correction helps reduce bias.

When to Use: Your base model is too simple (high bias), like a stump (tree of depth 1). AdaBoost combines many weak learners to create a strong, complex boundary.

Interview Tip: Be ready to explain the weight update formula. Misclassified samples have their weights increased, while correctly classified ones have their weights decreased, so the next learner concentrates on the hard cases.
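One common way to write the update is the textbook discrete AdaBoost rule for labels in $\{-1, +1\}$: with weighted error $\varepsilon$, the learner weight is $\alpha = \frac{1}{2}\ln\frac{1-\varepsilon}{\varepsilon}$ and each sample weight becomes $w_i \leftarrow w_i \exp(-\alpha\, y_i\, h(x_i))$, followed by normalization. The sketch below works through one round on a toy example. The function name `adaboost_weight_update` is hypothetical, and scikit-learn's own implementation (a multi-class variant) may differ in details.

```python
import math


def adaboost_weight_update(weights, y_true, y_pred):
    """One round of the discrete AdaBoost update for labels in {-1, +1}."""
    total = sum(weights)
    # Weighted error rate of the current weak learner.
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    # Learner weight; this sketch assumes 0 < error < 1.
    alpha = 0.5 * math.log((1 - error) / error)
    # t * p is +1 for a correct prediction and -1 for a mistake,
    # so mistakes are scaled up and correct predictions scaled down.
    updated = [
        w * math.exp(-alpha * t * p)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    norm = sum(updated)
    return alpha, [w / norm for w in updated]


weights = [0.2] * 5                # uniform starting weights
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]        # only the last sample is misclassified

alpha, new_weights = adaboost_weight_update(weights, y_true, y_pred)
print(round(alpha, 3))                      # 0.693
print([round(w, 3) for w in new_weights])   # [0.125, 0.125, 0.125, 0.125, 0.5]
```

After normalization the misclassified samples carry half of the total weight, which is why the next weak learner cannot do well by repeating the same mistake.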

### Voting Classifier

Key Insight: It's the simplest form of "model averaging." Hard Voting uses the majority class label, while Soft Voting averages predicted probabilities, which often works better when the base models produce well-calibrated probabilities.

When to Use: You have several good but different models (e.g., a logistic regression, an SVM, and a tree). Voting often yields a more reliable "committee" decision.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

voting_model = VotingClassifier(
    # probability=True enables predict_proba on SVC, which soft voting requires
    estimators=[('lr', LogisticRegression()), ('svc', SVC(probability=True))],
    voting='soft'  # Use 'soft' for probabilistic averaging
)
```
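Hard and soft voting can disagree on the same inputs. The toy calculation below uses plain Python rather than scikit-learn, and the three probabilities are made-up values for hypothetical models, chosen so that one confident model outweighs two hesitant ones.

```python
# Hypothetical predicted probabilities for the positive class from three models.
positive_probs = {"model_a": 0.45, "model_b": 0.45, "model_c": 0.95}

# Hard voting: each model casts one vote for its most likely class.
votes = [1 if p >= 0.5 else 0 for p in positive_probs.values()]
hard_prediction = 1 if sum(votes) > len(votes) / 2 else 0

# Soft voting: average the probabilities first, then threshold.
mean_prob = sum(positive_probs.values()) / len(positive_probs)
soft_prediction = 1 if mean_prob >= 0.5 else 0

print(hard_prediction)       # 0 (two of three models vote for class 0)
print(round(mean_prob, 3))   # 0.617
print(soft_prediction)       # 1 (the confident model tips the average)
```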

### Stacking (Stacked Generalization)

Key Insight: It learns how to combine models. Instead of a simple vote, a meta-learner (like logistic regression) learns from data how to weight the base models' predictions.

When to Use: For squeezing out the last bit of performance in competitions. It's more powerful but also more complex and prone to overfitting if not carefully cross-validated.

Crucial Practice: The base models' predictions for the meta-model must be generated using cross-validation on the training set to prevent data leakage. Scikit-learn's StackingClassifier handles this internally.
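A minimal stacking sketch, assuming a synthetic dataset and an arbitrary pair of base models, might look like this. The `cv=5` argument makes the cross-validated generation of meta-features explicit; the choice of base models and the train/test split are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (an assumption for illustration only).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

stacking_model = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    final_estimator=LogisticRegression(),  # the meta-learner
    cv=5,  # meta-features come from 5-fold out-of-fold predictions
)

stacking_model.fit(X_train, y_train)
stacking_accuracy = stacking_model.score(X_test, y_test)
print(f"Stacking accuracy: {stacking_accuracy:.3f}")
```

The held-out test set is kept separate from the internal cross-validation, so the reported score is not influenced by the data used to train the meta-learner.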

### How to Choose the Right Ensemble Method

Use this simple decision guide:

1. Start with your base model:
   - Is it a complex model that overfits (e.g., a deep Decision Tree)? → Try Bagging.
   - Is it a very simple model that underfits (e.g., a shallow tree)? → Try AdaBoost.
2. Look at your model collection:
   - Do you have several well-performing models of different types? → Try Voting for simplicity or Stacking for maximum performance.
   - Do you want a quick, robust improvement with minimal complexity? → Choose Voting.
   - Are you in a competition or final tuning stage and need the best possible accuracy? → Invest time in Stacking.
3. Consider computation:
   - Bagging is easily parallelized, because its base models are trained independently.
   - AdaBoost is inherently sequential and can be slower. Stacking adds a second stage: its base models can be trained in parallel, but the meta-model has to wait for their cross-validated predictions.

Bagging → Parallel, Bootstrap, reduces Variance.

Boosting (AdaBoost) → Sequential, Reweights errors, reduces Bias.

Voting → Averages predictions, Simple ensemble.

Stacking → Meta-learner, Complex but powerful ensemble.