# Gradient Boosting vs AdaBoost

This notebook explains **Gradient Boosting** and **AdaBoost** with runnable examples, compares both with **Random Forest**, and plots the first decision tree of each boosting model.

---

## 🌳 What is Boosting?
Boosting is an **ensemble technique** that combines multiple weak learners to form a strong learner. It builds models sequentially such that each new model corrects the errors of the previous ones.

- **Weak Learner**: A model that performs slightly better than random guessing (e.g., decision stump).
- **Strong Learner**: Combined model that achieves high accuracy.
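
To make the idea of a weak learner concrete, the sketch below fits a single decision stump on synthetic data and reports its test accuracy. The dataset and the variable names ending in `_wl` are illustrative assumptions, not part of the original notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data; names ending in _wl are hypothetical
X_wl, y_wl = make_classification(n_samples=1000, n_features=20, random_state=0)
Xtr_wl, Xte_wl, ytr_wl, yte_wl = train_test_split(
    X_wl, y_wl, test_size=0.3, random_state=0
)

# A decision stump: a tree limited to a single split
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(Xtr_wl, ytr_wl)
stump_acc = stump.score(Xte_wl, yte_wl)

print("Stump depth:", stump.get_depth())
print("Stump test accuracy:", stump_acc)
```

Boosting combines many such stumps, each fitted with a different emphasis on the training samples, into one stronger model.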

**Key Boosting Techniques:**
- AdaBoost
- Gradient Boosting
- XGBoost / LightGBM / CatBoost (advanced variants)

---

## 🔁 AdaBoost (Adaptive Boosting)
### Concept:
- Focuses on misclassified data points by assigning them higher weights.
- Trains a sequence of weak learners, where each learner tries to fix the mistakes of the previous one.

### Key Points:
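- Typically uses **decision stumps** (1-level trees); this is the default base estimator in scikit-learn.
- Each model gets a **weight** based on its weighted error: the lower the error, the larger its say in the final prediction.
- Combines models using a weighted majority vote (classification) or a weighted median (regression, as in scikit-learn's `AdaBoostRegressor`).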
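
The reweighting and voting described above can be sketched from scratch. This is a minimal version of discrete AdaBoost for binary labels in {-1, +1}, not the exact algorithm scikit-learn runs internally. The helper names `adaboost_fit` and `adaboost_predict`, the dataset, and the number of rounds are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=20):
    """Discrete AdaBoost sketch for labels in {-1, +1}. Returns (stumps, alphas)."""
    n = len(y)
    w = np.full(n, 1.0 / n)  # start with uniform sample weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1, random_state=0)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        # Weighted error of this stump (w sums to 1); clipped to avoid log(0)
        err = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # learner weight
        # Misclassified points (y * pred == -1) get larger weights
        w = w * np.exp(-alpha * y * pred)
        w = w / w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)


def adaboost_predict(X, stumps, alphas):
    """Weighted majority vote over all stumps."""
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.where(votes >= 0, 1, -1)


# Illustrative data; labels are mapped from {0, 1} to {-1, +1}
X_s, y01 = make_classification(n_samples=500, n_features=10, random_state=0)
y_s = 2 * y01 - 1

stumps, alphas = adaboost_fit(X_s, y_s, n_rounds=20)
pred_s = adaboost_predict(X_s, stumps, alphas)
train_acc = np.mean(pred_s == y_s)
print("Training accuracy of the sketch:", train_acc)
```

Each `alpha` is the learner weight: a stump with a lower weighted error gets a larger `alpha`, and the same `alpha` controls how strongly the weights of its mistakes grow for the next round.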

---

In [None]:
# AdaBoost Example
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

ada = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, random_state=42)
ada.fit(X_train, y_train)
y_pred = ada.predict(X_test)
print("AdaBoost Accuracy:", accuracy_score(y_test, y_pred))

## 🎯 Gradient Boosting
### Concept:
- Optimizes a loss function by adding weak learners in a **stage-wise fashion**.
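- Each new model is trained on the **residual errors** of the current ensemble. More generally, it is fitted to the negative gradient of the loss (the pseudo-residuals), which equals the ordinary residuals for squared-error loss.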

### Key Points:
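- Each tree approximates the **negative gradient of the loss function** with respect to the current predictions.
- Supports both regression and classification.
- Usually slower to train than AdaBoost with stumps, and often more **accurate** when tuned, although the outcome depends on the dataset and hyperparameters.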
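
The stage-wise fitting can be sketched for regression with squared-error loss, where the negative gradient is simply the residual `y - prediction`. This is a simplified illustration, not scikit-learn's implementation. The helper names `gb_fit` and `gb_predict`, the dataset, and the hyperparameter values are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor


def gb_fit(X, y, n_rounds=50, learning_rate=0.1, max_depth=3):
    """Gradient boosting sketch for squared-error loss. Returns (f0, trees)."""
    f0 = y.mean()                    # initial constant prediction
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred         # negative gradient of 0.5 * (y - pred)**2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)       # each tree models what is still unexplained
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees


def gb_predict(X, f0, trees, learning_rate=0.1):
    """Initial constant plus the shrunken sum of all tree outputs."""
    return f0 + learning_rate * sum(t.predict(X) for t in trees)


# Illustrative regression data
X_r, y_r = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

f0, trees = gb_fit(X_r, y_r, n_rounds=50, learning_rate=0.1)
mse_baseline = np.mean((y_r - y_r.mean()) ** 2)
mse_boosted = np.mean((y_r - gb_predict(X_r, f0, trees, learning_rate=0.1)) ** 2)
print("Training MSE, constant baseline:", mse_baseline)
print("Training MSE, after boosting:", mse_boosted)
```

The `learning_rate` shrinks each tree's contribution, so more rounds are needed, but each step is more conservative. For other losses, only the `residuals` line changes to the corresponding negative gradient.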

---

In [None]:
# Gradient Boosting Example
from sklearn.ensemble import GradientBoostingClassifier

gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
gb.fit(X_train, y_train)
y_pred_gb = gb.predict(X_test)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, y_pred_gb))

## 🔍 Difference Between AdaBoost, Gradient Boosting and Random Forest

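| Feature | AdaBoost | Gradient Boosting | Random Forest |
|--------|----------|------------------|----------------|
| Training order | Sequential | Sequential | Independent (parallelizable) |
| Base learner | Decision stumps (typically) | Shallow decision trees | Deep, fully grown trees |
| Focus | Misclassified samples (reweighting) | Residual errors (negative gradients) | Bootstrap aggregation with random feature subsets |
| Optimization | Exponential loss | Any differentiable loss | No explicit loss optimized across trees |
| Overfitting | Sensitive to noisy labels and outliers | Can overfit without regularization (learning rate, tree depth) | Less prone |
| Speed | Fast (stumps are cheap to fit) | Slower (sequential, deeper trees) | Fast (trees can be trained in parallel) |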
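
One way to check the table against a concrete dataset is to fit all three models on the same split and record test accuracy and fit time. The sketch below uses synthetic data and hypothetical variable names ending in `_c`. The numbers it prints depend on the data, the hyperparameters, and the machine, so treat them as a single data point rather than a general ranking.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import train_test_split

# Illustrative synthetic data and split
X_c, y_c = make_classification(n_samples=1000, n_features=20, random_state=42)
Xtr_c, Xte_c, ytr_c, yte_c = train_test_split(
    X_c, y_c, test_size=0.3, random_state=42
)

models = {
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=100, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(Xtr_c, ytr_c)
    fit_seconds = time.perf_counter() - start
    accuracy = model.score(Xte_c, yte_c)  # mean accuracy on the held-out split
    results[name] = (accuracy, fit_seconds)
    print(f"{name}: accuracy={accuracy:.3f}, fit time={fit_seconds:.3f}s")
```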

---

## 🌲 Visualizing Tree for AdaBoost and Gradient Boosting
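Let's visualize the first tree of both AdaBoost and Gradient Boosting to understand the difference. The AdaBoost tree is a classification stump, whereas each Gradient Boosting tree is a regression tree fitted to pseudo-residuals, so its leaf values are real numbers rather than class labels.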

In [None]:
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

# AdaBoost First Tree
plt.figure(figsize=(12, 6))
plot_tree(ada.estimators_[0], filled=True)
plt.title("AdaBoost - First Tree")
plt.show()

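# Gradient Boosting First Tree
# estimators_ is a 2D array indexed by [stage, tree within stage];
# binary classification fits one regression tree per stage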
plt.figure(figsize=(12, 6))
plot_tree(gb.estimators_[0, 0], filled=True)
plt.title("Gradient Boosting - First Tree")
plt.show()