# Ensemble Methods

Ensemble methods combine multiple models to improve accuracy and robustness. The main types are:

1. Bagging (Bootstrap Aggregating)
2. Boosting (Sequential Learning)
3. Stacking (Blending Multiple Models)

We'll demonstrate each of these with scikit-learn on the breast cancer dataset.

Let's load the data and preprocess it.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

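# Scale features: fit the scaler on the training set only, then apply the
# same transformation to the test set so no test statistics leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)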

<span style="color: #00008B;">**Bagging (Bootstrap Aggregating)**</span>

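Bagging reduces variance by training multiple models on bootstrap samples of the training data (random subsets drawn with replacement) and aggregating their predictions, for example by majority vote.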
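
To make the mechanism concrete, here is a minimal hand-rolled sketch of bagging. It is an illustration under stated assumptions, not how scikit-learn implements `BaggingClassifier`: the helper names `fit_bagged_trees` and `predict_majority` are hypothetical, the hard majority vote is a simplification (the library may aggregate predictions differently), and the sketch reloads the unscaled data under separate variable names so it does not overwrite the notebook's `X_train` and `X_test`.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_bagged_trees(X, y, n_estimators=25, seed=0):
    """Fit one decision tree per bootstrap sample (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    trees = []
    for _ in range(n_estimators):
        # Bootstrap sample: draw n_samples row indices with replacement
        idx = rng.integers(0, n_samples, size=n_samples)
        tree = DecisionTreeClassifier(random_state=int(rng.integers(0, 2**31 - 1)))
        trees.append(tree.fit(X[idx], y[idx]))
    return trees


def predict_majority(trees, X):
    """Aggregate binary (0/1) predictions by majority vote (hypothetical helper)."""
    votes = np.stack([tree.predict(X) for tree in trees])  # shape: (n_trees, n_rows)
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Separate variable names so the notebook's own split is left untouched
X_demo, y_demo = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

demo_trees = fit_bagged_trees(Xtr, ytr)
demo_pred = predict_majority(demo_trees, Xte)
print(f"Hand-rolled bagging accuracy: {np.mean(demo_pred == yte):.2f}")
```

Each tree sees a slightly different resampling of the training rows, so the individual trees disagree on some test points; voting across them averages out part of that disagreement, which is where the variance reduction comes from. An odd number of trees is used here so a binary vote cannot tie.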

In [2]:
# Bagging using Decision Tree
# Pass the base model as `estimator` (scikit-learn >= 1.2); the older
# `base_estimator` keyword was removed in scikit-learn 1.4
bagging_clf = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50, random_state=42)
bagging_clf.fit(X_train, y_train)
y_pred_bagging = bagging_clf.predict(X_test)

# Evaluate
print(f"Bagging Accuracy: {accuracy_score(y_test, y_pred_bagging):.2f}")