# Ensemble Methods - Data Science Koans

Welcome to Notebook 12: Ensemble Methods!

## What You Will Learn
- Random Forest fundamentals and implementation
- Feature importance analysis and interpretation  
- Gradient Boosting and XGBoost techniques
- Voting classifiers for model combination
- Stacking ensembles for meta-learning
- Comparing and evaluating ensemble methods

## Why This Matters
Ensemble methods are among the most powerful techniques in machine learning because they:
- **Reduce Overfitting**: Averaging many models cancels out part of each model's individual error
- **Improve Accuracy**: Combine strengths of different algorithms
- **Increase Robustness**: Aggregated predictions are less sensitive to outliers and noise than any single model
- **Handle Complexity**: Capture non-linear patterns effectively
- **Win Competitions**: Appear regularly in top Kaggle solutions, especially on tabular data

## Key Concepts
- **Bagging**: Train models independently on bootstrap samples of the data, then aggregate them (Random Forest)
- **Boosting**: Train models sequentially, each one focusing on the errors of the ensemble so far (Gradient Boosting, XGBoost)
- **Voting**: Combine predictions through majority vote (hard) or averaged probabilities (soft)
- **Stacking**: Train a meta-model on the base models' predictions to learn how to combine them

## Prerequisites
- Dimensionality Reduction (Notebook 11)
- Understanding of classification and regression
- Experience with scikit-learn

## How to Use
1. Read each koan's objective and theory explanation
2. Implement the TODO sections step by step
3. Run validation to verify correctness
4. Study the feedback and insights provided
5. Progress through increasingly advanced ensemble techniques

Ready to build some powerful ensemble models? Let's go! 🚀

In [None]:
# Setup - Run first!
import sys
sys.path.append('../..')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer, load_wine, make_classification
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.ensemble import (RandomForestClassifier, RandomForestRegressor, 
                              GradientBoostingClassifier, VotingClassifier, 
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.preprocessing import StandardScaler

# XGBoost (may need installation: pip install xgboost)
try:
    import xgboost as xgb
    XGBOOST_AVAILABLE = True
    print("✓ XGBoost available")
except ImportError:
    XGBOOST_AVAILABLE = False
    print("⚠️ XGBoost not available - will use GradientBoostingClassifier instead")

from koans.core.validator import KoanValidator
from koans.core.progress import ProgressTracker

validator = KoanValidator("12_ensemble_methods")
tracker = ProgressTracker()

print("Setup complete!")
print(f"Current progress: {tracker.get_notebook_progress('12_ensemble_methods')}%")

## KOAN 12.1: Random Forest Classifier
**Objective**: Build and train a Random Forest for classification  
**Difficulty**: Advanced

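Random Forest combines many decision trees using bagging (bootstrap aggregating). Each tree trains on a bootstrap sample of the rows, and each split considers only a random subset of the features. The trees' predictions are then averaged (regression) or combined by vote (classification); scikit-learn's `RandomForestClassifier` averages the trees' predicted class probabilities rather than counting hard votes.

**Key Concepts**: 
- **Bootstrap Sampling**: Each tree sees a different sample of the training data, drawn with replacement
- **Feature Randomness**: Each split considers a random subset of features  
- **Averaging**: Combining many decorrelated trees lowers variance and yields a stronger ensemble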
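
As a rough sketch of what bagging does under the hood, the block below builds a tiny forest by hand from plain decision trees. The 25 trees, the seeds, and the hard majority vote are illustrative assumptions; `RandomForestClassifier` handles all of this internally and, in scikit-learn, averages class probabilities instead of counting votes.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

rng = np.random.default_rng(0)
n_trees = 25  # illustrative; an odd count avoids tied votes
trees = []
for i in range(n_trees):
    # Bootstrap sample: draw as many rows as the training set, with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    # max_features="sqrt" makes every split consider a random feature subset
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Labels are 0/1, so the mean over trees is the share of votes for class 1
votes = np.stack([t.predict(X_test) for t in trees])
bagged_pred = (votes.mean(axis=0) > 0.5).astype(int)

bagged_acc = (bagged_pred == y_test).mean()
single_acc = (trees[0].predict(X_test) == y_test).mean()
print(f"single tree: {single_acc:.3f}, bagged vote: {bagged_acc:.3f}")
```

Comparing the two printed accuracies gives a feel for how much aggregation helps on this split; the exact numbers depend on the seeds.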

In [None]:
def build_random_forest_classifier():
    """
    Build and train a Random Forest classifier on the breast cancer dataset.
    
    Steps:
    1. Load the breast cancer dataset
    2. Split into training and test sets
    3. Create RandomForestClassifier with 100 trees
    4. Train the model
    5. Return the trained model and test accuracy
    
    Returns:
        tuple: (trained_model, test_accuracy)
    """
    # Load dataset
    cancer = load_breast_cancer()
    X, y = cancer.data, cancer.target
    
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    
    # TODO: Create RandomForestClassifier with n_estimators=100, random_state=42
    rf_model = None
    
    # TODO: Fit the model on training data
    # rf_model.fit(...)
    
    # TODO: Make predictions and calculate accuracy
    # y_pred = rf_model.predict(...)
    # accuracy = accuracy_score(...)
    accuracy = None
    
    return rf_model, accuracy

@validator.koan(1, "Random Forest Classifier", difficulty="Advanced")
def validate():
    model, accuracy = build_random_forest_classifier()
    
    assert model is not None, "Model is None - did you create the RandomForestClassifier?"
    assert isinstance(model, RandomForestClassifier), "Model should be RandomForestClassifier"
    assert hasattr(model, 'n_estimators'), "Model should have n_estimators attribute"
    assert model.n_estimators == 100, f"Expected 100 estimators, got {model.n_estimators}"
    
    assert accuracy is not None, "Accuracy is None - did you calculate it?"
    assert isinstance(accuracy, (float, np.floating)), "Accuracy should be a float"
    assert 0.8 <= accuracy <= 1.0, f"Accuracy should be reasonable (0.8-1.0), got {accuracy:.3f}"
    
    print(f"✓ Random Forest trained successfully!")
    print(f"  - Number of trees: {model.n_estimators}")
    print(f"  - Test accuracy: {accuracy:.3f}")
    print(f"  - Features used: {model.n_features_in_}")
    
    # Show feature importance preview
    cancer = load_breast_cancer()
    feature_names = cancer.feature_names
    importances = model.feature_importances_
    top_features = np.argsort(importances)[-3:][::-1]
    
    print("  - Top 3 important features:")
    for i, idx in enumerate(top_features):
        print(f"    {i+1}. {feature_names[idx]}: {importances[idx]:.3f}")

validate()

## KOAN 12.2: Feature Importance Analysis
**Objective**: Extract and analyze feature importance from Random Forest  
**Difficulty**: Advanced

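Random Forest provides built-in feature importance scores based on how much each feature decreases impurity, averaged across all trees. This helps identify which variables are most predictive.

**Key Concept**: `feature_importances_` gives a score for each feature. Higher scores mean more important features, and the scores sum to 1.0. Because these impurity-based scores are computed on the training data, they can overstate high-cardinality features, so treat them as a guide rather than a definitive ranking.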
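
To make "averaged across all trees" concrete, here is a small sketch on the iris dataset (chosen only to keep the example separate from the koan below). It recomputes the forest-level scores from the individual trees and shows the `np.argsort` ranking idiom; the 50 trees and the seed are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(iris.data, iris.target)

# Forest-level scores, one per feature
importances = forest.feature_importances_

# The same scores rebuilt by hand: average the per-tree importances,
# then renormalize so they sum to 1
per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)
averaged = averaged / averaged.sum()

# argsort is ascending, so reverse it to rank from most to least important
order = np.argsort(importances)[::-1]
for rank, idx in enumerate(order, start=1):
    print(f"{rank}. {iris.feature_names[idx]}: {importances[idx]:.3f}")
```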

In [None]:
def analyze_feature_importance():
    """
    Train Random Forest on wine dataset and analyze feature importance.
    
    Returns:
        tuple: (feature_names, importance_scores, top_5_indices)
        where top_5_indices are indices of 5 most important features
    """
    # Load wine dataset
    wine = load_wine()
    X, y = wine.data, wine.target
    
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    
    # TODO: Create and train RandomForestClassifier with 200 trees
    rf_model = None
    
    # TODO: Get feature names and importance scores
    feature_names = None  # wine.feature_names
    importance_scores = None  # rf_model.feature_importances_
    
    # TODO: Find indices of top 5 most important features
    # Hint: Use np.argsort()[-5:][::-1] to get top 5 in descending order
    top_5_indices = None
    
    return feature_names, importance_scores, top_5_indices

@validator.koan(2, "Feature Importance Analysis", difficulty="Advanced")
def validate():
    feature_names, importance_scores, top_5_indices = analyze_feature_importance()
    
    assert feature_names is not None, "Feature names is None"
    assert importance_scores is not None, "Importance scores is None" 
    assert top_5_indices is not None, "Top 5 indices is None"
    
    assert len(feature_names) == 13, f"Wine dataset should have 13 features, got {len(feature_names)}"
    assert len(importance_scores) == 13, f"Should have 13 importance scores, got {len(importance_scores)}"
    assert len(top_5_indices) == 5, f"Should have 5 top indices, got {len(top_5_indices)}"
    
    # Check that importance scores sum to approximately 1
    total_importance = np.sum(importance_scores)
    assert abs(total_importance - 1.0) < 0.001, f"Importances should sum to 1.0, got {total_importance:.3f}"
    
    # Check that top indices are in descending order of importance
    for i in range(4):
        curr_imp = importance_scores[top_5_indices[i]]
        next_imp = importance_scores[top_5_indices[i+1]]
        assert curr_imp >= next_imp, "Top indices should be in descending order of importance"
    
    print("✓ Feature importance analysis complete!")
    print(f"  - Total features: {len(feature_names)}")
    print(f"  - Importance sum: {total_importance:.3f}")
    
    print("\n  🏆 Top 5 Most Important Features:")
    for i, idx in enumerate(top_5_indices):
        print(f"    {i+1}. {feature_names[idx]}: {importance_scores[idx]:.3f}")
    
    # Additional insight
    most_important = top_5_indices[0]
    print(f"\n  💡 Most important feature: '{feature_names[most_important]}'")
    print(f"     Accounts for {importance_scores[most_important]*100:.1f}% of the total impurity-based importance")

validate()

## KOAN 12.3: Gradient Boosting Classifier  
**Objective**: Implement gradient boosting for sequential learning  
**Difficulty**: Advanced

Gradient Boosting builds models sequentially, where each new model focuses on correcting errors from previous models. This creates a strong learner from many weak learners.

**Key Concept**: Unlike bagging (Random Forest), boosting trains models in sequence. Each model tries to reduce the residual errors of the ensemble built so far.
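
The sequential idea can be sketched in a few lines for the simplest case, regression with squared loss, where "correcting errors" literally means fitting each new tree to the current residuals. The synthetic sine data, the learning rate of 0.1, and the depth-2 trees are assumptions made for illustration; `GradientBoostingClassifier` applies the same principle to a classification loss.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression problem (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())  # stage 0: predict the mean everywhere
mse_history = [np.mean((y - prediction) ** 2)]

for _ in range(50):
    # For squared loss, the residuals are the negative gradient of the loss
    residuals = y - prediction
    weak_learner = DecisionTreeRegressor(max_depth=2, random_state=0)
    weak_learner.fit(X, residuals)
    # Add a damped version of the new tree's correction to the ensemble
    prediction += learning_rate * weak_learner.predict(X)
    mse_history.append(np.mean((y - prediction) ** 2))

print(f"training MSE: {mse_history[0]:.4f} -> {mse_history[-1]:.4f}")
```

Each stage can only lower the training error here, which is also why boosting can overfit if it runs for too many rounds without a validation check.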

In [None]:
def compare_random_forest_vs_gradient_boosting():
    """
    Compare Random Forest and Gradient Boosting on the same dataset.
    
    Returns:
        dict: Contains both models and their cross-validation scores
    """
    # Create a synthetic dataset for comparison
    X, y = make_classification(
        n_samples=1000, n_features=20, n_informative=10, 
        n_redundant=10, random_state=42
    )
    
    # TODO: Create RandomForestClassifier with 100 estimators, random_state=42
    rf_model = None
    
    # TODO: Create GradientBoostingClassifier with 100 estimators, random_state=42  
    gb_model = None
    
    # TODO: Calculate 5-fold cross-validation scores for both models
    # Hint: Use cross_val_score(model, X, y, cv=5)
    rf_scores = None
    gb_scores = None
    
    return {
        'rf_model': rf_model,
        'gb_model': gb_model, 
        'rf_scores': rf_scores,
        'gb_scores': gb_scores,
        'rf_mean': np.mean(rf_scores) if rf_scores is not None else None,
        'gb_mean': np.mean(gb_scores) if gb_scores is not None else None
    }

@validator.koan(3, "Gradient Boosting Classifier", difficulty="Advanced")
def validate():
    results = compare_random_forest_vs_gradient_boosting()
    
    # Check that models were created
    assert results['rf_model'] is not None, "Random Forest model is None"
    assert results['gb_model'] is not None, "Gradient Boosting model is None"
    
    # Check model types
    assert isinstance(results['rf_model'], RandomForestClassifier), "rf_model should be RandomForestClassifier"
    assert isinstance(results['gb_model'], GradientBoostingClassifier), "gb_model should be GradientBoostingClassifier"
    
    # Check cross-validation scores
    assert results['rf_scores'] is not None, "Random Forest CV scores is None"
    assert results['gb_scores'] is not None, "Gradient Boosting CV scores is None"
    assert len(results['rf_scores']) == 5, "Should have 5 CV scores for Random Forest"
    assert len(results['gb_scores']) == 5, "Should have 5 CV scores for Gradient Boosting"
    
    # Check that scores are reasonable
    rf_mean = results['rf_mean']
    gb_mean = results['gb_mean']
    assert 0.7 <= rf_mean <= 1.0, f"RF accuracy should be reasonable, got {rf_mean:.3f}"
    assert 0.7 <= gb_mean <= 1.0, f"GB accuracy should be reasonable, got {gb_mean:.3f}"
    
    print("✓ Successfully compared Random Forest vs Gradient Boosting!")
    print(f"\n  🌲 Random Forest Results:")
    print(f"     Mean CV Accuracy: {rf_mean:.3f} (±{np.std(results['rf_scores']):.3f})")
    print(f"     Individual scores: {[f'{s:.3f}' for s in results['rf_scores']]}")
    
    print(f"\n  🚀 Gradient Boosting Results:")  
    print(f"     Mean CV Accuracy: {gb_mean:.3f} (±{np.std(results['gb_scores']):.3f})")
    print(f"     Individual scores: {[f'{s:.3f}' for s in results['gb_scores']]}")
    
    # Compare performance
    if rf_mean > gb_mean:
        winner = "Random Forest"
        diff = rf_mean - gb_mean
    else:
        winner = "Gradient Boosting" 
        diff = gb_mean - rf_mean
        
    print(f"\n  🏆 Winner: {winner} (by {diff:.3f})")
    print(f"\n  💡 Key Differences:")
    print(f"     • Random Forest: Parallel training, faster")
    print(f"     • Gradient Boosting: Sequential training, can overfit")

validate()

## KOAN 12.4: XGBoost - Advanced Gradient Boosting
**Objective**: Use XGBoost for state-of-the-art boosting performance  
**Difficulty**: Advanced

XGBoost (Extreme Gradient Boosting) is an optimized gradient boosting library that is a frequent component of winning machine learning competition entries. It has built-in regularization, handles missing values natively, and is engineered for speed.

**Key Concepts**: XGBoost adds regularization terms to the objective, uses second-order derivatives of the loss, and implements techniques such as tree pruning and parallelized tree construction.
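
The role of second-order derivatives and regularization can be illustrated without XGBoost itself. The NumPy sketch below evaluates the regularized leaf weight and split gain used in XGBoost-style tree building, where `g` and `h` are per-sample first and second derivatives of the loss. The helper names and the six-sample toy data are hypothetical; `lam` and `gamma` stand in for the L2 leaf penalty and the minimum split gain (exposed in XGBoost as `reg_lambda` and `gamma`).

```python
import numpy as np

def leaf_weight(g, h, lam=1.0):
    """Optimal leaf value for summed gradients g and Hessians h."""
    return -g.sum() / (h.sum() + lam)

def split_gain(g, h, left_mask, lam=1.0, gamma=0.0):
    """Loss reduction from splitting one leaf into left and right children."""
    def score(gs, hs):
        return gs.sum() ** 2 / (hs.sum() + lam)
    children = score(g[left_mask], h[left_mask]) + score(g[~left_mask], h[~left_mask])
    return 0.5 * (children - score(g, h)) - gamma

# Toy binary problem under logistic loss, starting from p = 0.5 everywhere
y = np.array([0, 0, 0, 1, 1, 1])
p = np.full(6, 0.5)
g = p - y          # first derivative of log loss w.r.t. the raw score
h = p * (1 - p)    # second derivative

# Candidate split that separates the two classes perfectly
left_mask = np.array([True, True, True, False, False, False])

gain_unregularized = split_gain(g, h, left_mask, lam=0.0)
gain_regularized = split_gain(g, h, left_mask, lam=1.0)
left_value = leaf_weight(g[left_mask], h[left_mask], lam=1.0)

print(f"gain with lam=0: {gain_unregularized:.3f}")
print(f"gain with lam=1: {gain_regularized:.3f}")
print(f"left leaf weight: {left_value:.3f}")
```

A larger `lam` shrinks both the gain and the leaf weights, which is how the penalty discourages overly confident splits.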

In [None]:
def train_xgboost_classifier():
    """
    Train XGBoost classifier and compare with regular Gradient Boosting.
    Falls back to GradientBoostingClassifier if XGBoost not available.
    
    Returns:
        dict: Contains model, accuracy, and model type info
    """
    # Load breast cancer dataset
    cancer = load_breast_cancer()
    X, y = cancer.data, cancer.target
    
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    
    if XGBOOST_AVAILABLE:
        # TODO: Create XGBClassifier with n_estimators=100, random_state=42
        # Hint: Use xgb.XGBClassifier(n_estimators=100, random_state=42)
        model = None
        model_type = "XGBoost"
    else:
        # Fallback to Gradient Boosting
        # TODO: Create GradientBoostingClassifier with n_estimators=100, random_state=42
        model = None
        model_type = "GradientBoosting"
    
    # TODO: Train the model
    # model.fit(X_train, y_train)
    
    # TODO: Calculate test accuracy
    # y_pred = model.predict(X_test)
    # accuracy = accuracy_score(y_test, y_pred)
    accuracy = None
    
    return {
        'model': model,
        'accuracy': accuracy, 
        'model_type': model_type,
        'n_estimators': model.n_estimators if model is not None else None
    }

@validator.koan(4, "XGBoost - Advanced Gradient Boosting", difficulty="Advanced")
def validate():
    results = train_xgboost_classifier()
    
    assert results['model'] is not None, "Model is None"
    assert results['accuracy'] is not None, "Accuracy is None"
    assert results['model_type'] in ['XGBoost', 'GradientBoosting'], "Invalid model type"
    
    model = results['model']
    accuracy = results['accuracy']
    
    # Check model properties
    if XGBOOST_AVAILABLE:
        assert hasattr(model, 'n_estimators'), "XGBoost model should have n_estimators"
    else:
        assert isinstance(model, GradientBoostingClassifier), "Should be GradientBoostingClassifier"
        
    assert results['n_estimators'] == 100, f"Should have 100 estimators, got {results['n_estimators']}"
    assert 0.8 <= accuracy <= 1.0, f"Accuracy should be reasonable, got {accuracy:.3f}"
    
    print(f"✓ Successfully trained {results['model_type']} classifier!")
    print(f"  - Model type: {results['model_type']}")
    print(f"  - Number of estimators: {results['n_estimators']}")
    print(f"  - Test accuracy: {accuracy:.3f}")
    
    if results['model_type'] == 'XGBoost':
        print(f"  - XGBoost advantages:")
        print(f"    • Built-in regularization")
        print(f"    • Handles missing values")  
        print(f"    • Optimized performance")
        print(f"    • Feature importance")
    else:
        print(f"  - Using GradientBoostingClassifier (XGBoost not available)")
        print(f"  - Install XGBoost: pip install xgboost")

validate()

## KOAN 12.5: Voting Classifier - Combining Multiple Models
**Objective**: Use VotingClassifier to combine different algorithms  
**Difficulty**: Advanced

Voting classifiers combine predictions from multiple different algorithms. Hard voting uses majority vote, while soft voting averages predicted probabilities for potentially better performance.

**Key Concept**: Diversity in base models is crucial - combining similar models provides little benefit, but combining different algorithm types (tree-based, linear, etc.) can improve performance.
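
Hard and soft voting can disagree, which is easiest to see on a few made-up numbers. The probabilities below are hypothetical outputs from three classifiers, not results from any real model.

```python
import numpy as np

# Hypothetical P(class 1) from three classifiers (columns) on four samples (rows)
proba_class1 = np.array([
    [0.90, 0.45, 0.40],  # one very confident "yes", two mild "no"
    [0.20, 0.30, 0.10],  # all three say "no"
    [0.55, 0.60, 0.05],  # two mild "yes", one very confident "no"
    [0.51, 0.52, 0.90],  # all three say "yes"
])

# Hard voting: each model casts one vote, the majority wins
hard_votes = (proba_class1 > 0.5).astype(int)
hard_pred = (hard_votes.sum(axis=1) >= 2).astype(int)

# Soft voting: average the probabilities, then threshold
soft_pred = (proba_class1.mean(axis=1) > 0.5).astype(int)

print("hard:", hard_pred.tolist())  # [0, 0, 1, 1]
print("soft:", soft_pred.tolist())  # [1, 0, 0, 1]
```

On the first and third samples a single confident model outweighs two lukewarm ones under soft voting, which is what "capturing prediction confidence" means in practice. It also shows why soft voting relies on reasonably calibrated probabilities.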

In [None]:
def create_voting_ensemble():
    """
    Create a voting classifier that combines multiple different algorithms.
    
    We'll combine: Random Forest, Logistic Regression, and SVM
    
    Returns:
        tuple: (voting_classifier, individual_scores, ensemble_score)
    """
    # Load and prepare data
    cancer = load_breast_cancer()
    X, y = cancer.data, cancer.target
    
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    
    # Scale features for SVM and Logistic Regression
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)
    
    # TODO: Create individual base models
    rf_model = None   # RandomForestClassifier(n_estimators=100, random_state=42)
    lr_model = None   # LogisticRegression(random_state=42, max_iter=1000)  
    svm_model = None  # SVC(random_state=42, probability=True)  # probability=True for soft voting
    
    # TODO: Create VotingClassifier with soft voting
    # Hint: VotingClassifier([('rf', rf_model), ('lr', lr_model), ('svm', svm_model)], voting='soft')
    voting_clf = None
    
    # Train individual models on scaled data (VotingClassifier fits its own clones, so these fits only feed the individual scores)
    rf_model.fit(X_train_scaled, y_train) 
    lr_model.fit(X_train_scaled, y_train)
    svm_model.fit(X_train_scaled, y_train)
    
    # TODO: Train the voting classifier
    # voting_clf.fit(X_train_scaled, y_train)
    
    # Calculate individual scores
    rf_score = accuracy_score(y_test, rf_model.predict(X_test_scaled))
    lr_score = accuracy_score(y_test, lr_model.predict(X_test_scaled))
    svm_score = accuracy_score(y_test, svm_model.predict(X_test_scaled))
    
    # TODO: Calculate ensemble score
    # ensemble_score = accuracy_score(y_test, voting_clf.predict(X_test_scaled))
    ensemble_score = None
    
    individual_scores = {
        'Random Forest': rf_score,
        'Logistic Regression': lr_score, 
        'SVM': svm_score
    }
    
    return voting_clf, individual_scores, ensemble_score

@validator.koan(5, "Voting Classifier - Combining Multiple Models", difficulty="Advanced")
def validate():
    voting_clf, individual_scores, ensemble_score = create_voting_ensemble()
    
    assert voting_clf is not None, "Voting classifier is None"
    assert isinstance(voting_clf, VotingClassifier), "Should be VotingClassifier"
    assert individual_scores is not None, "Individual scores is None"
    assert ensemble_score is not None, "Ensemble score is None"
    
    # Check that we have 3 base estimators
    assert len(voting_clf.estimators) == 3, f"Should have 3 estimators, got {len(voting_clf.estimators)}"
    
    # Check voting type
    assert voting_clf.voting == 'soft', f"Should use soft voting, got {voting_clf.voting}"
    
    # Check individual scores are reasonable
    for name, score in individual_scores.items():
        assert 0.7 <= score <= 1.0, f"{name} score should be reasonable, got {score:.3f}"
    
    # Check ensemble score is reasonable  
    assert 0.7 <= ensemble_score <= 1.0, f"Ensemble score should be reasonable, got {ensemble_score:.3f}"
    
    print("✓ Successfully created voting ensemble!")
    print(f"  - Number of base models: {len(voting_clf.estimators)}")
    print(f"  - Voting type: {voting_clf.voting}")
    
    print(f"\n  📊 Individual Model Performance:")
    for name, score in individual_scores.items():
        print(f"    {name}: {score:.3f}")
    
    print(f"\n  🗳️  Ensemble Performance: {ensemble_score:.3f}")
    
    # Check if ensemble improves over individual models
    best_individual = max(individual_scores.values())
    improvement = ensemble_score - best_individual
    
    if improvement > 0:
        print(f"  🎉 Ensemble improved by {improvement:.3f} over best individual!")
    elif improvement > -0.01:  # Small decrease is acceptable
        print(f"  ✓ Ensemble performs similarly to best individual")
    else:
        print(f"  ⚠️ Ensemble underperformed (can happen with small datasets)")
    
    print(f"\n  💡 Voting Ensemble Benefits:")
    print(f"    • Reduces overfitting through diversity")  
    print(f"    • More robust predictions")
    print(f"    • Soft voting can capture prediction confidence")

validate()

## KOAN 12.6: Stacking Ensemble - Meta-Learning  
**Objective**: Implement stacking with a meta-learner  
**Difficulty**: Advanced

Stacking uses a meta-model (blender) to learn how to best combine predictions from base models. Instead of simple voting, the meta-model is trained on the base models' cross-validated (out-of-fold) predictions, so it learns how much to trust each one.

**Key Concept**: Base models make predictions, then a meta-model learns from those predictions to make the final decision. This can capture complex interaction patterns between base models.
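
A hand-rolled version of the two-level idea might look like the sketch below. It uses `cross_val_predict` to build out-of-fold probabilities as training features for the meta-model; the choice of a shallow tree and k-nearest neighbors as base models is an assumption made to keep the example distinct from the koan, and `StackingClassifier` automates these steps.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

base_models = [
    DecisionTreeClassifier(max_depth=3, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]

# Level-1 training features: out-of-fold P(class 1) from each base model,
# so the meta-model never sees a prediction made on a row the base model trained on
meta_train = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# For test time, refit each base model on the full training set
meta_test = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for m in base_models
])

# The meta-model learns how to weigh the base models' probabilities
meta_model = LogisticRegression().fit(meta_train, y_train)
stacked_acc = meta_model.score(meta_test, y_test)
print(f"stacked accuracy: {stacked_acc:.3f}")
print(f"meta-model coefficients: {meta_model.coef_.ravel()}")
```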

In [None]:
def create_stacking_ensemble():
    """
    Create a stacking ensemble with base models and a meta-learner.
    
    Base models: Random Forest, SVM, Decision Tree
    Meta-learner: Logistic Regression
    
    Returns:
        tuple: (stacking_classifier, base_scores, stacking_score)
    """
    # Load and prepare data
    cancer = load_breast_cancer()
    X, y = cancer.data, cancer.target
    
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    
    # Scale features
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)
    
    # TODO: Define base models (estimators for stacking)
    base_models = [
        ('rf', None),   # ('rf', RandomForestClassifier(n_estimators=50, random_state=42))
        ('svm', None),  # ('svm', SVC(random_state=42, probability=True))
        ('dt', None)    # ('dt', DecisionTreeClassifier(random_state=42))
    ]
    
    # TODO: Create StackingClassifier with LogisticRegression as meta-learner
    # Hint: StackingClassifier(estimators=base_models, final_estimator=LogisticRegression(), cv=5)
    stacking_clf = None
    
    # TODO: Fit the stacking classifier
    # stacking_clf.fit(X_train_scaled, y_train)
    
    # Train individual base models for comparison
    base_scores = {}
    for name, model in base_models:
        if model is not None:
            model.fit(X_train_scaled, y_train)
            score = accuracy_score(y_test, model.predict(X_test_scaled))
            base_scores[name] = score
    
    # TODO: Calculate stacking ensemble score
    # stacking_score = accuracy_score(y_test, stacking_clf.predict(X_test_scaled))
    stacking_score = None
    
    return stacking_clf, base_scores, stacking_score

@validator.koan(6, "Stacking Ensemble - Meta-Learning", difficulty="Advanced")
def validate():
    stacking_clf, base_scores, stacking_score = create_stacking_ensemble()
    
    assert stacking_clf is not None, "Stacking classifier is None"
    assert isinstance(stacking_clf, StackingClassifier), "Should be StackingClassifier"
    assert base_scores is not None, "Base scores is None"
    assert stacking_score is not None, "Stacking score is None"
    
    # Check that we have base estimators
    assert len(stacking_clf.estimators) == 3, f"Should have 3 base estimators, got {len(stacking_clf.estimators)}"
    
    # Check final estimator
    assert isinstance(stacking_clf.final_estimator, LogisticRegression), "Meta-learner should be LogisticRegression"
    
    # Check scores are reasonable
    for name, score in base_scores.items():
        assert 0.7 <= score <= 1.0, f"{name} score should be reasonable, got {score:.3f}"
    
    assert 0.7 <= stacking_score <= 1.0, f"Stacking score should be reasonable, got {stacking_score:.3f}"
    
    print("✓ Successfully created stacking ensemble!")
    print(f"  - Number of base models: {len(stacking_clf.estimators)}")
    print(f"  - Meta-learner: {type(stacking_clf.final_estimator).__name__}")
    print(f"  - Cross-validation folds: {stacking_clf.cv}")
    
    print(f"\n  📊 Base Model Performance:")
    for name, score in base_scores.items():
        print(f"    {name.upper()}: {score:.3f}")
    
    print(f"\n  🥞 Stacking Performance: {stacking_score:.3f}")
    
    # Compare with best base model
    best_base_score = max(base_scores.values())
    improvement = stacking_score - best_base_score
    
    if improvement > 0:
        print(f"  🎉 Stacking improved by {improvement:.3f} over best base model!")
    elif improvement > -0.01:
        print(f"  ✓ Stacking performs similarly to best base model")  
    else:
        print(f"  ⚠️ Stacking underperformed (meta-learner may need tuning)")
    
    print(f"\n  💡 Stacking Advantages:")
    print(f"    • Meta-learner adapts to base model strengths")
    print(f"    • Can learn complex combination patterns") 
    print(f"    • Often outperforms simple voting")
    print(f"    • Cross-validated base predictions reduce meta-learner overfitting")

validate()

## KOAN 12.7: Ensemble Method Comparison
**Objective**: Compare all ensemble methods on the same dataset  
**Difficulty**: Advanced

Now let's bring it all together! We'll compare Random Forest, Gradient Boosting, Voting, and Stacking ensembles to see their relative strengths and when each performs best.

**Key Concept**: Different ensemble methods excel in different scenarios. Understanding their trade-offs helps choose the right approach for your problem.
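
When a comparison mixes scale-sensitive models with tree ensembles, one way to keep cross-validation honest is to put the scaler inside a `Pipeline` so it is re-fit on each training fold. The sketch below shows that pattern on two illustrative candidates; the model choices and settings are assumptions, not the koan's required configuration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    # The scaler is fit inside each fold, so held-out rows never influence it
    "scaled logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # Tree ensembles do not need scaling
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

summary = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    summary[name] = (scores.mean(), scores.std())
    print(f"{name:<28} {scores.mean():.3f} ± {scores.std():.3f}")
```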

In [None]:
def comprehensive_ensemble_comparison():
    """
    Compare multiple ensemble methods on the same dataset using cross-validation.
    
    Methods: Random Forest, Gradient Boosting, Voting, Stacking
    
    Returns:
        dict: Cross-validation results for each ensemble method
    """
    # Load and prepare data  
    cancer = load_breast_cancer()
    X, y = cancer.data, cancer.target
    
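    # Scale the features (note: fitting the scaler before cross-validation lets
    # each fold see test-fold statistics; a Pipeline would avoid this small leak)
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)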
    
    # TODO: Define ensemble models
    models = {}
    
    # 1. Random Forest
    models['Random Forest'] = None  # RandomForestClassifier(n_estimators=100, random_state=42)
    
    # 2. Gradient Boosting  
    models['Gradient Boosting'] = None  # GradientBoostingClassifier(n_estimators=100, random_state=42)
    
    # 3. Voting Classifier
    base_models = [
        ('rf', RandomForestClassifier(n_estimators=50, random_state=42)),
        ('lr', LogisticRegression(random_state=42, max_iter=1000)),
        ('svm', SVC(random_state=42, probability=True))
    ]
    models['Voting'] = None  # VotingClassifier(estimators=base_models, voting='soft')
    
    # 4. Stacking Classifier
    stacking_base = [
        ('rf', RandomForestClassifier(n_estimators=50, random_state=42)),
        ('gb', GradientBoostingClassifier(n_estimators=50, random_state=42))
    ]
    models['Stacking'] = None  # StackingClassifier(estimators=stacking_base, final_estimator=LogisticRegression(), cv=3)
    
    # TODO: Calculate cross-validation scores for each model
    results = {}
    for name, model in models.items():
        if model is not None:
            # Use cross_val_score with cv=5
            cv_scores = None  # cross_val_score(model, X_scaled, y, cv=5, scoring='accuracy')
            results[name] = {
                'scores': cv_scores,
                'mean': np.mean(cv_scores) if cv_scores is not None else None,
                'std': np.std(cv_scores) if cv_scores is not None else None
            }
    
    return results

@validator.koan(7, "Ensemble Method Comparison", difficulty="Advanced")
def validate():
    results = comprehensive_ensemble_comparison()
    
    assert results is not None, "Results is None"
    assert len(results) == 4, f"Should have 4 ensemble methods, got {len(results)}"
    
    expected_methods = ['Random Forest', 'Gradient Boosting', 'Voting', 'Stacking']
    for method in expected_methods:
        assert method in results, f"Missing {method} in results"
        assert results[method]['scores'] is not None, f"{method} scores is None"
        assert len(results[method]['scores']) == 5, f"{method} should have 5 CV scores"
        assert 0.7 <= results[method]['mean'] <= 1.0, f"{method} mean score should be reasonable"
    
    print("✓ Successfully compared all ensemble methods!")
    print(f"\n  📊 Cross-Validation Results (5-fold):")
    print(f"  {'Method':<18} {'Mean':<8} {'Std':<8} {'Range'}")
    print(f"  {'-'*50}")
    
    # Sort by mean performance
    sorted_results = sorted(results.items(), key=lambda x: x[1]['mean'], reverse=True)
    
    for i, (method, result) in enumerate(sorted_results):
        mean = result['mean']
        std = result['std'] 
        min_score = np.min(result['scores'])
        max_score = np.max(result['scores'])
        
        print(f"  {method:<18} {mean:.3f}    {std:.3f}    [{min_score:.3f}, {max_score:.3f}]")
    
    # Identify best performer
    best_method, best_result = sorted_results[0]
    print(f"\n  🏆 Best Performer: {best_method} ({best_result['mean']:.3f} ± {best_result['std']:.3f})")
    
    # Performance insights
    print(f"\n  💡 Ensemble Method Characteristics:")
    print(f"    🌲 Random Forest: Fast, interpretable, good baseline")
    print(f"    🚀 Gradient Boosting: Sequential learning, can overfit") 
    print(f"    🗳️  Voting: Combines diverse algorithms, robust")
    print(f"    🥞 Stacking: Meta-learning, most flexible")
    
    print(f"\n  🎯 Choosing the Right Ensemble:")
    print(f"    • Speed needed? → Random Forest")
    print(f"    • Maximum accuracy? → Stacking or XGBoost") 
    print(f"    • Interpretability? → Random Forest + feature importance")
    print(f"    • Robustness? → Voting with diverse models")

validate()

## 🎉 Congratulations!

You have mastered ensemble methods - some of the most powerful techniques in machine learning!

### What You've Accomplished
- ✅ **Random Forest**: Built bagging ensembles with bootstrap sampling
- ✅ **Feature Importance**: Analyzed variable significance in tree ensembles  
- ✅ **Gradient Boosting**: Implemented sequential learning algorithms
- ✅ **XGBoost**: Used state-of-the-art optimized boosting
- ✅ **Voting Classifiers**: Combined diverse algorithms effectively
- ✅ **Stacking**: Implemented meta-learning for optimal combination
- ✅ **Ensemble Comparison**: Evaluated trade-offs between methods

### Key Insights Gained
1. **Ensemble Power**: Multiple weak learners create strong predictors
2. **Diversity Matters**: Different algorithm types improve ensemble performance
3. **Bias-Variance Trade-off**: Bagging reduces variance, boosting reduces bias
4. **Meta-Learning**: Stacking can learn optimal combination strategies
5. **Real-World Impact**: Ensembles dominate ML competitions and applications

### Next Steps  
- **Notebook 13**: Hyperparameter Tuning (optimize your ensembles!)
- **Advanced Topics**: Deep ensemble methods, neural network ensembles
- **Practice**: Apply ensembles to your own datasets and competitions

### Real-World Applications
- **Finance**: Credit scoring, algorithmic trading, fraud detection
- **Healthcare**: Disease diagnosis, drug discovery, medical imaging
- **Technology**: Recommendation systems, search ranking, ad targeting  
- **Science**: Climate modeling, genomics, particle physics
- **Business**: Customer churn, demand forecasting, price optimization

You now have the tools to build world-class predictive models! 🚀

In [None]:
# Final Progress Check
progress = tracker.get_notebook_progress('12_ensemble_methods')
print(f"\n📊 Your Progress: {progress}% complete!")

if progress == 100:
    print("🎉 Outstanding! You've mastered all ensemble method koans!")
    print("🎯 Ready for Notebook 13: Hyperparameter Tuning")
elif progress >= 75:
    print("🌟 Excellent progress! Almost there with ensemble mastery.")
elif progress >= 50:
    print("💪 Great work! You're building powerful ensemble skills.")
else:
    print("🚀 Keep going! Each ensemble technique builds on the last.")

print(f"\n📈 Overall course progress:")
total_notebooks = 15
completed_notebooks = len([nb for nb in range(1, 13) if tracker.get_notebook_progress(f'{nb:02d}_*') == 100])
print(f"   Completed notebooks: {completed_notebooks}/{total_notebooks}")
print(f"   Course progress: {(completed_notebooks/total_notebooks)*100:.1f}%")

print(f"\n🎯 Ensemble Method Mastery Achieved!")
print(f"   You can now build production-ready ML ensembles! 🏆")