# AdaBoost (Adaptive Boosting)

## What is AdaBoost?

**AdaBoost** (Adaptive Boosting) is an ensemble learning algorithm that combines many weak classifiers into a single strong classifier. It was introduced by Yoav Freund and Robert Schapire in the mid-1990s.

## Key Concepts:

1. **Weak Learners**: Simple models that perform slightly better than random guessing (typically decision stumps - decision trees with depth 1)

2. **Sequential Learning**: Models are trained sequentially, where each new model focuses on correcting the errors of previous models

3. **Weight Adjustment**: After each round, AdaBoost increases the weights of incorrectly classified instances so that subsequent models focus more on difficult cases

4. **Final Prediction**: Weighted vote of all weak learners
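
To make the weight adjustment concrete, here is a small worked example of a single boosting round. It assumes the classic binary formulation, where the learner weight is `alpha = 0.5 * ln((1 - error) / error)`. The sample count and the indices of the misclassified samples are made up for illustration, and library implementations may scale these quantities differently.

```python
import numpy as np

# Toy boosting round: 10 samples, the weak learner gets 3 of them wrong.
# The sample count and the misclassified indices are hypothetical.
n_samples = 10
weights = np.full(n_samples, 1.0 / n_samples)   # equal initial weights
misclassified = np.zeros(n_samples, dtype=bool)
misclassified[[2, 5, 7]] = True

# Weighted error rate of the weak learner
error = weights[misclassified].sum()

# Learner weight (its say in the final vote); larger when the error is smaller
alpha = 0.5 * np.log((1.0 - error) / error)

# Up-weight mistakes, down-weight correct samples, then renormalize
new_weights = weights * np.exp(np.where(misclassified, alpha, -alpha))
new_weights /= new_weights.sum()

print(f"error = {error:.2f}, alpha = {alpha:.4f}")
print(f"weight of a misclassified sample: {new_weights[2]:.4f}")
print(f"weight of a correct sample:       {new_weights[0]:.4f}")
print(f"total weight on mistakes:         {new_weights[misclassified].sum():.4f}")
```

After this update the misclassified samples together carry half of the total weight, which is why the next weak learner cannot ignore them.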

## How AdaBoost Works:

1. Initialize equal weights for all training samples
2. Train a weak learner on the weighted dataset
3. Calculate the learner's weighted error rate and, from it, the learner's weight in the final vote
4. Increase the weights of misclassified samples and renormalize all weights
5. Repeat steps 2-4 for the specified number of estimators
6. Combine all weak learners using weighted voting
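
The six steps above can be sketched from scratch as follows. This is a minimal illustration of discrete AdaBoost for labels in {-1, +1}, not scikit-learn's internal implementation. The helper names `fit_adaboost` and `predict_adaboost` and the small demo dataset are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_adaboost(X, y, n_rounds=20):
    """Discrete AdaBoost for labels in {-1, +1}; returns (stumps, alphas)."""
    n_samples = X.shape[0]
    weights = np.full(n_samples, 1.0 / n_samples)        # step 1: equal weights
    stumps, alphas = [], []
    for _ in range(n_rounds):                            # step 5: repeat
        stump = DecisionTreeClassifier(max_depth=1, random_state=0)
        stump.fit(X, y, sample_weight=weights)           # step 2: weighted fit
        miss = stump.predict(X) != y
        error = weights[miss].sum()                      # step 3: weighted error
        if error >= 0.5:                                 # no better than chance
            break
        error = max(error, 1e-10)                        # guard against log(1/0)
        alpha = 0.5 * np.log((1.0 - error) / error)      # step 3: learner weight
        weights *= np.exp(np.where(miss, alpha, -alpha)) # step 4: reweight
        weights /= weights.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)


def predict_adaboost(stumps, alphas, X):
    """Step 6: sign of the alpha-weighted vote over all stumps."""
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.where(votes >= 0, 1, -1)


# Small demo dataset (hypothetical sizes), with labels mapped from {0, 1} to {-1, +1}
X_demo, y01 = make_classification(n_samples=500, n_features=10, random_state=0)
y_demo = np.where(y01 == 1, 1, -1)

stumps, alphas = fit_adaboost(X_demo, y_demo, n_rounds=20)
train_acc = np.mean(predict_adaboost(stumps, alphas, X_demo) == y_demo)
print(f"{len(stumps)} stumps, training accuracy: {train_acc:.3f}")
```

The rest of this notebook uses `AdaBoostClassifier`, which handles multiclass problems and the `learning_rate` parameter that this sketch leaves out.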

## Advantages:
- Simple to implement
- Works well with weak learners
- Often resists overfitting in practice on clean data, although it is not immune to it
- Few hyperparameters to tune (mainly the number of estimators and the learning rate)

## Disadvantages:
- Sensitive to noisy data and outliers
- Training is sequential, so the weak learners cannot be fit in parallel the way they can in bagging or random forests
- Performance depends on the quality of weak learners

## Step 1: Import Required Libraries

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
import seaborn as sns

## Step 2: Create a Synthetic Dataset

We'll create a classification dataset for demonstration purposes.

In [None]:
# Create a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, 
                           n_redundant=5, random_state=42)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

print(f"Training set shape: {X_train.shape}")
print(f"Testing set shape: {X_test.shape}")
print(f"Number of features: {X_train.shape[1]}")
print(f"Class distribution in training set: {np.bincount(y_train)}")

## Step 3: Train AdaBoost Model

We'll create an AdaBoost classifier with Decision Tree stumps as weak learners.

In [None]:
# Create base estimator (weak learner) - Decision Tree with max_depth=1 (stump)
base_estimator = DecisionTreeClassifier(max_depth=1, random_state=42)

# Create AdaBoost classifier
ada_boost = AdaBoostClassifier(
    estimator=base_estimator,
    n_estimators=50,  # Number of weak learners
    learning_rate=1.0,  # Shrinks each learner's contribution to the ensemble
    random_state=42
)

# Train the model
ada_boost.fit(X_train, y_train)

print("AdaBoost model trained successfully!")
# estimators_ holds the learners actually fitted; boosting can stop before n_estimators
print(f"Number of estimators fitted: {len(ada_boost.estimators_)}")
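
To connect the fitted model back to the "error rate and learner weight" step, the sketch below prints the per-learner values that scikit-learn stores in `estimator_errors_` and `estimator_weights_`. It repeats the data setup so that it runs on its own and relies on the default base learner (a depth-1 tree). The name `model` is chosen for this example only. The exact numbers depend on the installed scikit-learn version: older releases that defaulted to 'SAMME.R' report every vote weight as 1.0.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Same synthetic setup as Step 2, repeated so this block is self-contained
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# The default base learner is a decision stump (max_depth=1)
model = AdaBoostClassifier(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

# Show the weighted error and vote weight of the first few learners
n_fitted = len(model.estimators_)
for i in range(min(5, n_fitted)):
    print(f"learner {i}: weighted error={model.estimator_errors_[i]:.4f}, "
          f"vote weight={model.estimator_weights_[i]:.4f}")
```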

## Step 4: Make Predictions and Evaluate Model

In [None]:
# Make predictions
y_train_pred = ada_boost.predict(X_train)
y_test_pred = ada_boost.predict(X_test)

# Calculate accuracy
train_accuracy = accuracy_score(y_train, y_train_pred)
test_accuracy = accuracy_score(y_test, y_test_pred)

print(f"Training Accuracy: {train_accuracy:.4f}")
print(f"Testing Accuracy: {test_accuracy:.4f}")
print("\n" + "="*50)
print("Classification Report (Test Set):")
print("="*50)
print(classification_report(y_test, y_test_pred))

## Step 5: Visualize Confusion Matrix

In [None]:
# Create confusion matrix
cm = confusion_matrix(y_test, y_test_pred)

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=True)
plt.title('Confusion Matrix - AdaBoost Classifier')
plt.ylabel('Actual Label')
plt.xlabel('Predicted Label')
plt.show()

print(f"\nConfusion Matrix:\n{cm}")

## Step 6: Analyze Feature Importance

In [None]:
# Get feature importances
feature_importance = ada_boost.feature_importances_

# Create a dataframe for better visualization
feature_df = pd.DataFrame({
    'Feature': [f'Feature_{i}' for i in range(len(feature_importance))],
    'Importance': feature_importance
}).sort_values('Importance', ascending=False)

# Plot top 10 features
plt.figure(figsize=(10, 6))
plt.bar(feature_df['Feature'][:10], feature_df['Importance'][:10])
plt.xlabel('Features')
plt.ylabel('Importance')
plt.title('Top 10 Feature Importances in AdaBoost')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

print("Top 10 Most Important Features:")
print(feature_df.head(10))

## Step 7: Effect of Number of Estimators

Let's see how the number of estimators affects model performance.

In [None]:
# Test different numbers of estimators
n_estimators_range = [1, 5, 10, 25, 50, 100, 150, 200]
train_scores = []
test_scores = []

for n_est in n_estimators_range:
    ada = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=1, random_state=42),
        n_estimators=n_est,
        learning_rate=1.0,
        random_state=42
    )
    ada.fit(X_train, y_train)
    
    train_scores.append(ada.score(X_train, y_train))
    test_scores.append(ada.score(X_test, y_test))

# Plot the results
plt.figure(figsize=(10, 6))
plt.plot(n_estimators_range, train_scores, marker='o', label='Training Accuracy')
plt.plot(n_estimators_range, test_scores, marker='s', label='Testing Accuracy')
plt.xlabel('Number of Estimators')
plt.ylabel('Accuracy')
plt.title('AdaBoost Performance vs Number of Estimators')
plt.legend()
plt.grid(True)
plt.show()

print("Performance by Number of Estimators:")
for n_est, train_acc, test_acc in zip(n_estimators_range, train_scores, test_scores):
    print(f"n_estimators={n_est:3d}: Train={train_acc:.4f}, Test={test_acc:.4f}")
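
Refitting a separate model for every ensemble size works, but a similar curve can usually be obtained from a single fit. The sketch below uses `staged_score`, which yields the accuracy after each boosting iteration. It repeats the data setup so that it runs on its own and relies on the default base learner (a depth-1 tree). The names `staged_train`, `staged_test`, and `best_n` are chosen for this example only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Same synthetic setup as Step 2, repeated so this block is self-contained
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit once with the largest ensemble size of interest
ada = AdaBoostClassifier(n_estimators=200, random_state=42)
ada.fit(X_train, y_train)

# staged_score yields one accuracy value per fitted boosting stage
staged_train = np.array(list(ada.staged_score(X_train, y_train)))
staged_test = np.array(list(ada.staged_score(X_test, y_test)))

# Index of the best stage, converted to a 1-based number of estimators
best_n = int(np.argmax(staged_test)) + 1
print(f"Stages evaluated: {len(staged_test)}")
print(f"Best test accuracy {staged_test.max():.4f} at n_estimators={best_n}")
```

Picking `best_n` on the test set is only for illustration. In practice a separate validation split or cross-validation is the safer way to choose the ensemble size.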

## Key Parameters in AdaBoost:

- **estimator**: The base weak learner (default: DecisionTreeClassifier with max_depth=1)
- **n_estimators**: Number of weak learners to train sequentially
- **learning_rate**: Weight applied to each classifier at each boosting iteration (lower values require more estimators)
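- **algorithm**: The boosting variant ('SAMME' or 'SAMME.R' in older scikit-learn releases). Recent releases deprecate 'SAMME.R' and then the parameter itself, leaving 'SAMME' as the only implementation, so check the documentation for your installed version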
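
Because `n_estimators` and `learning_rate` trade off against each other, they are usually tuned together. The following is a minimal sketch using `GridSearchCV`. The grid values, the 3-fold setting, and the smaller dataset named `X_small`, `y_small` are assumptions chosen to keep the example fast, not recommended defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV

# Smaller synthetic dataset (hypothetical size) so the search finishes quickly
X_small, y_small = make_classification(n_samples=500, n_features=20,
                                       n_informative=15, n_redundant=5,
                                       random_state=42)

# Candidate values for the two parameters that interact most strongly
param_grid = {
    "n_estimators": [25, 50, 100],
    "learning_rate": [0.1, 0.5, 1.0],
}

# Cross-validated search over every combination in the grid
search = GridSearchCV(
    AdaBoostClassifier(random_state=42),  # default base learner: decision stump
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_small, y_small)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.4f}")
```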

## Conclusion:

AdaBoost is a powerful ensemble method that:
- Combines weak learners into a strong classifier
- Focuses on difficult-to-classify instances
- Often achieves high accuracy with simple base models
- Works best when each weak learner is reliably better than chance and the labels are reasonably clean

The key to AdaBoost's success is its adaptive nature - it learns from mistakes and focuses on the hardest cases!