# 🌟 Ensemble Methods: Bagging, Boosting & Intro to XGBoost
This notebook helps beginners understand:
- What are Ensemble Methods?
- Bagging using Random Forest
- Boosting using AdaBoost
- Intro to XGBoost
- Visual comparison of model performance

In [None]:
!pip install xgboost matplotlib scikit-learn

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.metrics import accuracy_score
import xgboost as xgb

## 🎯 Generate Classification Dataset

In [None]:
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
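
Before training anything, it can help to confirm what the split actually produced. The snippet below is a small sanity check, not part of the original notebook: it rebuilds the same data and prints the shapes and class counts. The names `train_counts` and `test_counts` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same settings as the cell above
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)

# Number of samples with label 0 and label 1 in each split
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("Train class counts:", train_counts)
print("Test class counts:", test_counts)
```

If the class counts look uneven between the two splits, passing `stratify=y` to `train_test_split` is one way to keep the class proportions the same in both.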

## 🌲 Bagging with Random Forest

In [None]:
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)
rf_pred = rf_model.predict(X_test)
rf_acc = accuracy_score(y_test, rf_pred)
print("Random Forest Accuracy:", rf_acc)
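
To see what bagging does under the hood, here is a minimal hand-rolled sketch: train several decision trees, each on a bootstrap sample of the training data, then take a majority vote. It is a simplification and not how `RandomForestClassifier` is implemented; in particular, a Random Forest also considers only a random subset of features at each split, which this sketch leaves out. The tree count `n_trees = 25` and the variable names are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Same synthetic data settings as the notebook
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

rng = np.random.default_rng(42)
n_trees = 25  # illustrative; smaller than the 100 trees used above
all_preds = []
for _ in range(n_trees):
    # Bootstrap: draw len(X_train) row indices with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    all_preds.append(tree.predict(X_test))

# Majority vote: labels are 0/1, so the mean is the share of trees voting 1
votes = np.mean(all_preds, axis=0)
bagged_pred = (votes >= 0.5).astype(int)

single_tree_acc = accuracy_score(y_test, all_preds[0])
bagged_acc = accuracy_score(y_test, bagged_pred)
print("Single bootstrapped tree accuracy:", single_tree_acc)
print("Bagged vote accuracy:", bagged_acc)
```

Comparing the two printed numbers shows how much the vote changes the result relative to one tree on this particular split.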

## 🚀 Boosting with AdaBoost

In [None]:
ab_model = AdaBoostClassifier(n_estimators=100, random_state=42)
ab_model.fit(X_train, y_train)
ab_pred = ab_model.predict(X_test)
ab_acc = accuracy_score(y_test, ab_pred)
print("AdaBoost Accuracy:", ab_acc)
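
Boosting builds its ensemble one weak learner at a time, so it is instructive to watch test accuracy change as rounds are added. The sketch below uses `staged_predict`, which yields the ensemble's predictions after each boosting round. The checkpoints (1, 10, 50, 100) and the name `staged_acc` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score

# Same synthetic data settings as the notebook
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = AdaBoostClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# One accuracy value per boosting round, in order
staged_acc = [accuracy_score(y_test, pred) for pred in model.staged_predict(X_test)]

for n_rounds in (1, 10, 50, 100):
    # Boosting can stop early, so guard against missing rounds
    if n_rounds <= len(staged_acc):
        print(f"{n_rounds:3d} rounds -> accuracy {staged_acc[n_rounds - 1]:.3f}")
```

The first entry is the accuracy of a single weak learner (a depth-1 tree by default); later entries show what the weighted combination adds.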

## ⚡ Intro to XGBoost

In [None]:
# use_label_encoder is omitted: it is deprecated and ignored by recent XGBoost releases
xgb_model = xgb.XGBClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    eval_metric='logloss',
    random_state=42,
)
xgb_model.fit(X_train, y_train)
xgb_pred = xgb_model.predict(X_test)
xgb_acc = accuracy_score(y_test, xgb_pred)
print("XGBoost Accuracy:", xgb_acc)

## 📊 Compare Model Accuracies

In [None]:
models = ['Random Forest', 'AdaBoost', 'XGBoost']
accuracies = [rf_acc, ab_acc, xgb_acc]

plt.figure(figsize=(8,5))
plt.bar(models, accuracies, color=['green', 'orange', 'blue'])
plt.ylabel('Accuracy')
plt.title('Model Comparison: Bagging vs Boosting')
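# Zoom in on the top of the range, but never cut off the lowest bar
plt.ylim(min(0.8, min(accuracies) - 0.05), 1.0)
plt.grid(axis='y')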
plt.show()
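
A single train/test split can make one model look better than another by chance. As a hedged follow-up, the sketch below scores the two scikit-learn models with 5-fold cross-validation on the same synthetic data and reports the mean and spread. The dictionary names `candidates` and `cv_results` and the choice of 5 folds are assumptions for illustration; XGBoost is left out here only to keep the example dependent on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Same synthetic data settings as the notebook
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, random_state=42)

candidates = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=42),
}

cv_results = {}
for name, model in candidates.items():
    # Five accuracy scores, one per held-out fold
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

If the gap between two models is smaller than the fold-to-fold spread, the ranking in the bar chart should be read with caution.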