# 🔬 Ensemble Methods Deep Dive

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/your-org/anomaly-detection/blob/main/docs/notebooks/06_ensemble_methods_deep_dive.ipynb)

**Difficulty**: Advanced | **Time**: 55 minutes

Master advanced ensemble techniques for anomaly detection. Learn how to combine multiple detectors using voting, stacking, and hierarchical methods, which often yields more robust detection than any single detector.

## 🎯 Learning Objectives

- Understand ensemble learning principles for anomaly detection
- Implement voting, averaging, and stacking ensembles
- Build hierarchical and dynamic ensemble architectures
- Optimize ensemble performance and interpretability
- Deploy ensemble models in production systems

## 📦 Prerequisites

- Complete [Algorithm Comparison Tutorial](02_algorithm_comparison_tutorial.ipynb)
- Understanding of multiple anomaly detection algorithms
- Basic knowledge of machine learning ensemble methods

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import time
from collections import defaultdict
import warnings
warnings.filterwarnings('ignore')

# Interactive widgets
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output

# Machine learning
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import (
    precision_score, recall_score, f1_score, 
    roc_auc_score, classification_report
)
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Set plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ Libraries imported successfully!")
print("🔬 Ready for ensemble methods deep dive!")

## 🏗️ Building Individual Anomaly Detectors

First, let's create a comprehensive suite of individual detectors that we'll combine into ensembles.

In [None]:
class IndividualDetector:
    """Wrapper pairing a scikit-learn detector with its own scaler and timing info."""
    
    def __init__(self, name, detector, scaler=None):
        self.name = name
        self.detector = detector
        self.scaler = scaler or StandardScaler()
        self.is_fitted = False
        self.training_time = 0
        self.prediction_time = 0
    
    def fit(self, X, y=None):
        """Fit the detector to training data."""
        start_time = time.time()
        
        # Scale the data
        X_scaled = self.scaler.fit_transform(X)
        
        # Fit the detector
        self.detector.fit(X_scaled)
        
        self.training_time = time.time() - start_time
        self.is_fitted = True
        
        return self
    
    def predict(self, X):
        """Predict anomalies."""
        if not self.is_fitted:
            raise ValueError(f"Detector {self.name} is not fitted yet.")
        
        start_time = time.time()
        
        # Scale the data
        X_scaled = self.scaler.transform(X)
        
        # Make predictions
        predictions = self.detector.predict(X_scaled)
        
        self.prediction_time = time.time() - start_time
        
        return predictions
    
    def decision_function(self, X):
        """Get raw detector scores (scikit-learn convention: lower = more anomalous)."""
        if not self.is_fitted:
            raise ValueError(f"Detector {self.name} is not fitted yet.")
        
        X_scaled = self.scaler.transform(X)
        
        # Get decision scores
        if hasattr(self.detector, 'decision_function'):
            return self.detector.decision_function(X_scaled)
        elif hasattr(self.detector, 'score_samples'):
            return self.detector.score_samples(X_scaled)
        else:
            # Fallback to predictions converted to scores
            preds = self.detector.predict(X_scaled)
            return preds.astype(float)


class DetectorSuite:
    """Collection of diverse anomaly detectors."""
    
    def __init__(self, contamination=0.1):
        self.contamination = contamination
        self.detectors = self._create_detector_suite()
    
    def _create_detector_suite(self):
        """Create a diverse suite of anomaly detectors."""
        detectors = []
        
        # 1. Isolation Forest variants
        detectors.append(IndividualDetector(
            "IsolationForest_Default",
            IsolationForest(
                contamination=self.contamination,
                random_state=42,
                n_estimators=100
            )
        ))
        
        detectors.append(IndividualDetector(
            "IsolationForest_HighTrees",
            IsolationForest(
                contamination=self.contamination,
                random_state=43,
                n_estimators=200,
                max_samples=0.8
            )
        ))
        
        detectors.append(IndividualDetector(
            "IsolationForest_LowSample",
            IsolationForest(
                contamination=self.contamination,
                random_state=44,
                n_estimators=50,
                max_samples=0.5
            )
        ))
        
        # 2. Local Outlier Factor variants
        detectors.append(IndividualDetector(
            "LOF_Small_Neighborhood",
            LocalOutlierFactor(
                contamination=self.contamination,
                n_neighbors=5,
                novelty=True
            )
        ))
        
        detectors.append(IndividualDetector(
            "LOF_Large_Neighborhood",
            LocalOutlierFactor(
                contamination=self.contamination,
                n_neighbors=20,
                novelty=True
            )
        ))
        
        # 3. One-Class SVM variants
        detectors.append(IndividualDetector(
            "OneClassSVM_RBF",
            OneClassSVM(
                kernel='rbf',
                gamma='scale',
                nu=self.contamination
            )
        ))
        
        detectors.append(IndividualDetector(
            "OneClassSVM_Linear",
            OneClassSVM(
                kernel='linear',
                nu=self.contamination
            )
        ))
        
        detectors.append(IndividualDetector(
            "OneClassSVM_Poly",
            OneClassSVM(
                kernel='poly',
                degree=3,
                nu=self.contamination
            )
        ))
        
        return detectors
    
    def fit_all(self, X, y=None, verbose=True):
        """Fit all detectors in the suite."""
        if verbose:
            print(f"🔧 Fitting {len(self.detectors)} detectors...")
        
        for i, detector in enumerate(self.detectors):
            if verbose:
                print(f"   [{i+1}/{len(self.detectors)}] Fitting {detector.name}...")
            
            try:
                detector.fit(X, y)
                if verbose:
                    print(f"      ✅ Completed in {detector.training_time:.3f}s")
            except Exception as e:
                if verbose:
                    print(f"      ❌ Failed: {e}")
        
        fitted_count = sum(1 for d in self.detectors if d.is_fitted)
        if verbose:
            print(f"\n✅ Successfully fitted {fitted_count}/{len(self.detectors)} detectors")
    
    def predict_all(self, X, return_scores=False):
        """Get predictions from all fitted detectors."""
        predictions = {}
        scores = {}
        
        for detector in self.detectors:
            if detector.is_fitted:
                try:
                    predictions[detector.name] = detector.predict(X)
                    if return_scores:
                        scores[detector.name] = detector.decision_function(X)
                except Exception as e:
                    print(f"❌ Prediction failed for {detector.name}: {e}")
        
        if return_scores:
            return predictions, scores
        return predictions
    
    def evaluate_individual_performance(self, X, y_true):
        """Evaluate performance of individual detectors."""
        predictions = self.predict_all(X)
        results = {}
        
        for name, y_pred in predictions.items():
            # Convert labels to binary (1 for anomaly, 0 for normal)
            y_pred_binary = (y_pred == -1).astype(int)
            y_true_binary = (y_true == -1).astype(int)
            
            results[name] = {
                'precision': precision_score(y_true_binary, y_pred_binary, zero_division=0),
                'recall': recall_score(y_true_binary, y_pred_binary, zero_division=0),
                'f1_score': f1_score(y_true_binary, y_pred_binary, zero_division=0),
                'accuracy': np.mean(y_pred_binary == y_true_binary)
            }
        
        return results

print("✅ Individual detector classes created!")
print("🔧 Ready to build detector suites!")
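
The detector wrappers above pass scikit-learn's output conventions straight through, and those conventions are easy to get backwards. The following minimal sketch, using toy data that is an assumption made only for illustration, shows that `predict` returns `-1` for anomalies and that `decision_function` returns *lower* (negative) values for more anomalous points.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy data (an assumption for illustration): a tight Gaussian cluster
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# Two query points: one at the centre of the cluster, one far outside it
X_query = np.array([[0.0, 0.0], [8.0, 8.0]])

forest = IsolationForest(contamination=0.1, random_state=42).fit(X_train)

labels = forest.predict(X_query)            # 1 = normal, -1 = anomaly
scores = forest.decision_function(X_query)  # higher = more normal, < 0 = anomaly

print("labels:", labels)
print("scores:", scores)
```

Any code that thresholds or combines these scores has to account for this direction, for example by negating them before treating larger values as "more anomalous".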

## 🎯 Ensemble Methods Implementation

Now let's implement various ensemble techniques to combine our individual detectors.
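
Before the full classes, here is a minimal sketch of the three unweighted voting rules on a hand-made vote matrix. The `votes` array is hypothetical: three detectors voting on four samples, using the `-1` (anomaly) / `1` (normal) convention.

```python
import numpy as np

# Hypothetical votes: rows = detectors, columns = samples
votes = np.array([
    [-1, -1,  1,  1],
    [-1,  1,  1,  1],
    [-1, -1, -1,  1],
])

# Number of detectors flagging each sample as anomalous
anomaly_votes = np.sum(votes == -1, axis=0)

# Strict majority: more than half of the detectors must agree
majority = np.where(anomaly_votes > votes.shape[0] / 2, -1, 1)
# Unanimous: every detector must flag the sample
unanimous = np.where(np.all(votes == -1, axis=0), -1, 1)
# Any: a single detector is enough
any_vote = np.where(np.any(votes == -1, axis=0), -1, 1)

print(majority)   # [-1 -1  1  1]
print(unanimous)  # [-1  1  1  1]
print(any_vote)   # [-1 -1 -1  1]
```

The three rules trade precision for recall: unanimous voting flags the fewest samples, "any" voting flags the most, and majority voting sits in between.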

In [None]:
class VotingEnsemble:
    """Voting-based ensemble for anomaly detection."""
    
    def __init__(self, detectors, voting_strategy='majority', weights=None):
        self.detectors = detectors
        self.voting_strategy = voting_strategy
        self.weights = weights
        self.is_fitted = False
    
    def fit(self, X, y=None):
        """Fit all base detectors."""
        for detector in self.detectors:
            if not detector.is_fitted:
                detector.fit(X, y)
        
        self.is_fitted = True
        return self
    
    def predict(self, X):
        """Make ensemble predictions using voting."""
        if not self.is_fitted:
            raise ValueError("Ensemble must be fitted before prediction.")
        
        # Get predictions from all detectors
        all_predictions = []
        for detector in self.detectors:
            if detector.is_fitted:
                predictions = detector.predict(X)
                all_predictions.append(predictions)
        
        if not all_predictions:
            raise ValueError("No fitted detectors available for prediction.")
        
        # Convert to numpy array for easier manipulation
        predictions_array = np.array(all_predictions)  # Shape: (n_detectors, n_samples)
        
        if self.voting_strategy == 'majority':
            # Majority voting: predict anomaly only on a strict majority (ties count as normal)
            anomaly_votes = np.sum(predictions_array == -1, axis=0)
            ensemble_predictions = np.where(
                anomaly_votes > len(all_predictions) / 2, -1, 1
            )
        
        elif self.voting_strategy == 'unanimous':
            # Unanimous voting: predict anomaly only if all agree
            ensemble_predictions = np.where(
                np.all(predictions_array == -1, axis=0), -1, 1
            )
        
        elif self.voting_strategy == 'any':
            # Any voting: predict anomaly if any detector says anomaly
            ensemble_predictions = np.where(
                np.any(predictions_array == -1, axis=0), -1, 1
            )
        
        elif self.voting_strategy == 'weighted':
            # Weighted voting using provided weights
            if self.weights is None:
                raise ValueError("Weights must be provided for weighted voting.")
            if len(self.weights) != len(all_predictions):
                raise ValueError("Need exactly one weight per fitted detector.")
            
            weighted_votes = np.zeros(X.shape[0])
            total_weight = 0
            
            for predictions, weight in zip(all_predictions, self.weights):
                weighted_votes += (predictions == -1).astype(float) * weight
                total_weight += weight
            
            ensemble_predictions = np.where(
                weighted_votes > total_weight / 2, -1, 1
            )
        
        else:
            raise ValueError(f"Unknown voting strategy: {self.voting_strategy}")
        
        return ensemble_predictions
    
    def get_voting_details(self, X):
        """Get detailed voting information for each sample."""
        all_predictions = []
        detector_names = []
        
        for detector in self.detectors:
            if detector.is_fitted:
                predictions = detector.predict(X)
                all_predictions.append(predictions)
                detector_names.append(detector.name)
        
        predictions_df = pd.DataFrame(
            np.array(all_predictions).T,
            columns=detector_names
        )
        
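        # Add voting statistics. Count over the detector columns only, so the
        # helper columns added here are never mistaken for votes.
        votes = predictions_df[detector_names]
        predictions_df['anomaly_votes'] = (votes == -1).sum(axis=1)
        predictions_df['normal_votes'] = (votes == 1).sum(axis=1)
        predictions_df['ensemble_prediction'] = self.predict(X)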
        
        return predictions_df


class AveragingEnsemble:
    """Score averaging ensemble for anomaly detection."""
    
    def __init__(self, detectors, averaging_strategy='mean', weights=None):
        self.detectors = detectors
        self.averaging_strategy = averaging_strategy
        self.weights = weights
        self.threshold = None
        self.is_fitted = False
    
    def fit(self, X, y=None, contamination=0.1):
        """Fit all base detectors and determine threshold."""
        # Fit base detectors
        for detector in self.detectors:
            if not detector.is_fitted:
                detector.fit(X, y)
        
        # Determine threshold based on training data
        training_scores = self.decision_function(X)
        self.threshold = np.percentile(training_scores, (1 - contamination) * 100)
        
        self.is_fitted = True
        return self
    
    def decision_function(self, X):
        """Get ensemble anomaly scores (higher = more anomalous)."""
        # Also called during fit(), so only require that some base detector is fitted
        fitted_detectors = [d for d in self.detectors if d.is_fitted]
        if not fitted_detectors:
            raise ValueError("No fitted detectors available.")
        
        # Get scores from all fitted detectors.
        # scikit-learn detectors return higher scores for *normal* points, so
        # negate them: from here on, higher means more anomalous, which is what
        # the max/min strategies and the percentile threshold assume.
        all_scores = []
        for detector in fitted_detectors:
            try:
                scores = detector.decision_function(X)
                all_scores.append(-scores)
            except Exception as e:
                print(f"⚠️ Skipping {detector.name}: {e}")
        
        if not all_scores:
            raise ValueError("No valid scores obtained from detectors.")
        
        # Convert to numpy array
        scores_array = np.array(all_scores)  # Shape: (n_detectors, n_samples)
        
        # Apply averaging strategy
        if self.averaging_strategy == 'mean':
            ensemble_scores = np.mean(scores_array, axis=0)
        
        elif self.averaging_strategy == 'median':
            ensemble_scores = np.median(scores_array, axis=0)
        
        elif self.averaging_strategy == 'weighted':
            if self.weights is None:
                raise ValueError("Weights must be provided for weighted averaging.")
            
            weighted_scores = np.zeros(X.shape[0])
            total_weight = 0
            
            for scores, weight in zip(all_scores, self.weights):
                weighted_scores += scores * weight
                total_weight += weight
            
            ensemble_scores = weighted_scores / total_weight
        
        elif self.averaging_strategy == 'max':
            # Take maximum (most anomalous) score
            ensemble_scores = np.max(scores_array, axis=0)
        
        elif self.averaging_strategy == 'min':
            # Take minimum (least anomalous) score
            ensemble_scores = np.min(scores_array, axis=0)
        
        else:
            raise ValueError(f"Unknown averaging strategy: {self.averaging_strategy}")
        
        return ensemble_scores
    
    def predict(self, X):
        """Make predictions based on averaged scores."""
        if not self.is_fitted:
            raise ValueError("Ensemble must be fitted before prediction.")
        
        scores = self.decision_function(X)
        predictions = np.where(scores > self.threshold, -1, 1)
        
        return predictions


class StackingEnsemble:
    """Stacking ensemble using a meta-learner."""
    
    def __init__(self, detectors, meta_learner=None):
        self.detectors = detectors
        self.meta_learner = meta_learner or LogisticRegression(random_state=42)
        self.is_fitted = False
    
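    def fit(self, X, y):
        """Fit base detectors and meta-learner.
        
        y must use the -1 (anomaly) / 1 (normal) label convention. The
        meta-learner is trained on in-sample base-detector outputs, so its
        fit can be optimistic.
        """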
        # Fit base detectors
        for detector in self.detectors:
            if not detector.is_fitted:
                detector.fit(X)
        
        # Generate meta-features using base detector predictions
        meta_features = self._generate_meta_features(X)
        
        # Convert labels to binary format for meta-learner
        y_binary = (y == -1).astype(int)
        
        # Fit meta-learner
        self.meta_learner.fit(meta_features, y_binary)
        
        self.is_fitted = True
        return self
    
    def _generate_meta_features(self, X):
        """Generate meta-features from base detector outputs."""
        meta_features = []
        
        for detector in self.detectors:
            if detector.is_fitted:
                # Get both predictions and scores as features
                predictions = detector.predict(X)
                scores = detector.decision_function(X)
                
                meta_features.append(predictions)
                meta_features.append(scores)
        
        return np.column_stack(meta_features)
    
    def predict(self, X):
        """Make ensemble predictions using meta-learner."""
        if not self.is_fitted:
            raise ValueError("Ensemble must be fitted before prediction.")
        
        meta_features = self._generate_meta_features(X)
        meta_predictions = self.meta_learner.predict(meta_features)
        
        # Convert back to anomaly detection format
        return np.where(meta_predictions == 1, -1, 1)
    
    def predict_proba(self, X):
        """Get meta-learner probabilities; column 1 is the anomaly class (label 1)."""
        if not self.is_fitted:
            raise ValueError("Ensemble must be fitted before prediction.")
        
        meta_features = self._generate_meta_features(X)
        return self.meta_learner.predict_proba(meta_features)

print("✅ Ensemble method classes created!")
print("🎯 Ready to build powerful ensemble detectors!")
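
The averaging strategies above combine raw detector scores, but Isolation Forest, LOF, and One-Class SVM report scores on different scales, so a plain mean can be dominated by whichever detector has the widest range. One common remedy, sketched below, is to standardize each detector's scores before averaging. The helper `normalized_mean_scores` and the toy data are assumptions for illustration, not part of the classes above, and feature scaling is omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Toy data (an assumption): 450 Gaussian inliers plus 50 scattered outliers
rng = np.random.default_rng(42)
X_normal = rng.normal(size=(450, 3))
X_outlier = rng.uniform(low=-8.0, high=8.0, size=(50, 3))
X = np.vstack([X_normal, X_outlier])
contamination = 0.1

detectors = [
    IsolationForest(contamination=contamination, random_state=42),
    LocalOutlierFactor(n_neighbors=20, novelty=True, contamination=contamination),
    OneClassSVM(kernel='rbf', gamma='scale', nu=contamination),
]

def normalized_mean_scores(detectors, X_fit, X_eval):
    """Hypothetical helper: z-score each detector's scores, then average."""
    columns = []
    for det in detectors:
        det.fit(X_fit)
        # Negate so that higher = more anomalous
        train = -det.decision_function(X_fit)
        evals = -det.decision_function(X_eval)
        # Standardize with statistics from the fitting data only
        mu, sigma = train.mean(), train.std()
        columns.append((evals - mu) / (sigma + 1e-12))
    return np.mean(columns, axis=0)

scores = normalized_mean_scores(detectors, X, X)

# Flag the top `contamination` fraction of scores as anomalies
threshold = np.percentile(scores, (1 - contamination) * 100)
predictions = np.where(scores > threshold, -1, 1)

print("flagged:", int((predictions == -1).sum()), "of", len(X))
```

Rank-based or min-max normalization are alternatives with the same goal. Z-scoring is used here only because it is short to write down.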

## 📊 Interactive Ensemble Comparison

Let's create an interactive tool to compare different ensemble methods.

In [None]:
class EnsembleComparator:
    """Interactive tool for comparing ensemble methods."""
    
    def __init__(self):
        self.data = None
        self.labels = None
        self.detector_suite = None
        self.ensembles = {}
        self.results = {}
        self.create_widgets()
    
    def create_widgets(self):
        """Create interactive widgets."""
        # Data generation widgets
        self.n_samples_slider = widgets.IntSlider(
            value=1000, min=500, max=5000, step=500,
            description='Samples:',
            style={'description_width': 'initial'}
        )
        
        self.n_features_slider = widgets.IntSlider(
            value=5, min=2, max=15, step=1,
            description='Features:',
            style={'description_width': 'initial'}
        )
        
        self.contamination_slider = widgets.FloatSlider(
            value=0.1, min=0.05, max=0.3, step=0.05,
            description='Contamination:',
            style={'description_width': 'initial'}
        )
        
        # Ensemble selection widgets
        self.ensemble_methods = widgets.SelectMultiple(
            options=[
                'Majority Voting',
                'Unanimous Voting',
                'Any Voting',
                'Mean Averaging',
                'Median Averaging',
                'Max Averaging',
                'Stacking'
            ],
            value=['Majority Voting', 'Mean Averaging', 'Stacking'],
            description='Ensemble Methods:',
            style={'description_width': 'initial'},
            layout={'height': '150px'}
        )
        
        # Control buttons
        self.generate_button = widgets.Button(
            description='Generate Data',
            button_style='primary',
            icon='database'
        )
        
        self.compare_button = widgets.Button(
            description='Compare Ensembles',
            button_style='success',
            icon='chart-bar',
            disabled=True
        )
        
        # Output widgets
        self.status_output = widgets.Output()
        self.results_output = widgets.Output()
        self.plot_output = widgets.Output()
        
        # Event handlers
        self.generate_button.on_click(self.generate_data)
        self.compare_button.on_click(self.compare_ensembles)
    
    def display_interface(self):
        """Display the complete interface."""
        # Configuration panel
        config_panel = widgets.VBox([
            widgets.HTML("<h3>🔧 Configuration</h3>"),
            widgets.HBox([self.n_samples_slider, self.n_features_slider]),
            self.contamination_slider,
            self.ensemble_methods,
            widgets.HBox([self.generate_button, self.compare_button])
        ])
        
        # Results panel
        results_panel = widgets.VBox([
            widgets.HTML("<h3>📊 Results</h3>"),
            self.results_output
        ])
        
        # Main interface
        interface = widgets.VBox([
            widgets.HTML("<h2>🔬 Ensemble Methods Comparison</h2>"),
            widgets.HBox([config_panel, results_panel]),
            self.plot_output,
            self.status_output
        ])
        
        display(interface)
    
    def generate_data(self, button):
        """Generate synthetic dataset for comparison."""
        with self.status_output:
            clear_output(wait=True)
            print("🔄 Generating synthetic dataset...")
        
        try:
            n_samples = self.n_samples_slider.value
            n_features = self.n_features_slider.value
            contamination = self.contamination_slider.value
            
            # Generate normal data
            np.random.seed(42)
            normal_data = np.random.multivariate_normal(
                mean=np.zeros(n_features),
                cov=np.eye(n_features),
                size=int(n_samples * (1 - contamination))
            )
            
            # Generate anomalous data
            n_anomalies = int(n_samples * contamination)
            anomaly_data = np.random.uniform(
                low=-4, high=4,
                size=(n_anomalies, n_features)
            )
            
            # Combine data
            self.data = np.vstack([normal_data, anomaly_data])
            self.labels = np.hstack([
                np.ones(len(normal_data)),
                -np.ones(len(anomaly_data))
            ])
            
            # Shuffle
            indices = np.random.permutation(len(self.data))
            self.data = self.data[indices]
            self.labels = self.labels[indices]
            
            # Create detector suite
            self.detector_suite = DetectorSuite(contamination=contamination)
            
            with self.status_output:
                clear_output(wait=True)
                print(f"✅ Generated dataset: {n_samples} samples, {n_features} features")
                print(f"📊 Contamination: {contamination:.1%} ({n_anomalies} anomalies)")
            
            # Enable comparison button
            self.compare_button.disabled = False
        
        except Exception as e:
            with self.status_output:
                clear_output(wait=True)
                print(f"❌ Error generating data: {e}")
    
    def compare_ensembles(self, button):
        """Compare selected ensemble methods."""
        if self.data is None:
            with self.status_output:
                clear_output(wait=True)
                print("⚠️ Please generate data first!")
            return
        
        with self.status_output:
            clear_output(wait=True)
            print("🔄 Training detectors and building ensembles...")
        
        try:
            # Hold out 30% of the data (stacking also needs labels for its meta-learner)
            split_idx = int(0.7 * len(self.data))
            train_X, test_X = self.data[:split_idx], self.data[split_idx:]
            train_y, test_y = self.labels[:split_idx], self.labels[split_idx:]
            
            # Fit individual detectors on the training split only, so the
            # held-out samples stay unseen
            self.detector_suite.fit_all(train_X, verbose=False)
            
            # Build selected ensembles
            self.ensembles = {}
            selected_methods = self.ensemble_methods.value
            
            if 'Majority Voting' in selected_methods:
                self.ensembles['Majority Voting'] = VotingEnsemble(
                    self.detector_suite.detectors, 'majority'
                )
            
            if 'Unanimous Voting' in selected_methods:
                self.ensembles['Unanimous Voting'] = VotingEnsemble(
                    self.detector_suite.detectors, 'unanimous'
                )
            
            if 'Any Voting' in selected_methods:
                self.ensembles['Any Voting'] = VotingEnsemble(
                    self.detector_suite.detectors, 'any'
                )
            
            if 'Mean Averaging' in selected_methods:
                self.ensembles['Mean Averaging'] = AveragingEnsemble(
                    self.detector_suite.detectors, 'mean'
                )
            
            if 'Median Averaging' in selected_methods:
                self.ensembles['Median Averaging'] = AveragingEnsemble(
                    self.detector_suite.detectors, 'median'
                )
            
            if 'Max Averaging' in selected_methods:
                self.ensembles['Max Averaging'] = AveragingEnsemble(
                    self.detector_suite.detectors, 'max'
                )
            
            if 'Stacking' in selected_methods:
                self.ensembles['Stacking'] = StackingEnsemble(
                    self.detector_suite.detectors
                )
            
            # Fit and evaluate ensembles
            self.results = {}
            
            for name, ensemble in self.ensembles.items():
                with self.status_output:
                    print(f"   🔧 Training {name}...")
                
                start_time = time.time()
                
                if name == 'Stacking':
                    # Stacking needs labels to train its meta-learner
                    ensemble.fit(train_X, train_y)
                else:
                    # Other ensembles don't need labels
                    ensemble.fit(train_X)
                
                # Time only the fit step; base detectors are already fitted,
                # so this mostly measures ensemble-level overhead
                training_time = time.time() - start_time
                
                # Score every ensemble on the same held-out split so the
                # metrics are directly comparable
                predictions = ensemble.predict(test_X)
                eval_labels = test_y
                
                # Evaluate performance
                y_pred_binary = (predictions == -1).astype(int)
                y_true_binary = (eval_labels == -1).astype(int)
                
                self.results[name] = {
                    'precision': precision_score(y_true_binary, y_pred_binary, zero_division=0),
                    'recall': recall_score(y_true_binary, y_pred_binary, zero_division=0),
                    'f1_score': f1_score(y_true_binary, y_pred_binary, zero_division=0),
                    'accuracy': np.mean(y_pred_binary == y_true_binary),
                    'training_time': training_time,
                    'predictions': predictions
                }
            
            # Display results
            self.display_results()
            self.create_comparison_plots()
            
            with self.status_output:
                clear_output(wait=True)
                print(f"✅ Comparison completed for {len(self.ensembles)} ensemble methods!")
        
        except Exception as e:
            with self.status_output:
                clear_output(wait=True)
                print(f"❌ Error during comparison: {e}")
    
    def display_results(self):
        """Display comparison results in a table."""
        with self.results_output:
            clear_output(wait=True)
            
            if not self.results:
                print("No results to display.")
                return
            
            # Create results DataFrame
            results_df = pd.DataFrame(self.results).T
            results_df = results_df[['precision', 'recall', 'f1_score', 'accuracy', 'training_time']]
            
            # Format the results
            html_table = "<div style='max-height: 300px; overflow-y: auto;'>"
            html_table += "<table style='width: 100%; border-collapse: collapse;'>"
            html_table += "<tr style='background-color: #f0f0f0;'>"
            html_table += "<th style='padding: 8px; border: 1px solid #ddd;'>Method</th>"
            html_table += "<th style='padding: 8px; border: 1px solid #ddd;'>Precision</th>"
            html_table += "<th style='padding: 8px; border: 1px solid #ddd;'>Recall</th>"
            html_table += "<th style='padding: 8px; border: 1px solid #ddd;'>F1-Score</th>"
            html_table += "<th style='padding: 8px; border: 1px solid #ddd;'>Accuracy</th>"
            html_table += "<th style='padding: 8px; border: 1px solid #ddd;'>Time (s)</th>"
            html_table += "</tr>"
            
            # Find best performer for each metric
            best_precision = results_df['precision'].idxmax()
            best_recall = results_df['recall'].idxmax()
            best_f1 = results_df['f1_score'].idxmax()
            best_accuracy = results_df['accuracy'].idxmax()
            fastest = results_df['training_time'].idxmin()
            
            for method, row in results_df.iterrows():
                html_table += "<tr>"
                html_table += f"<td style='padding: 8px; border: 1px solid #ddd; font-weight: bold;'>{method}</td>"
                
                # Highlight best values
                precision_style = "background-color: #90EE90;" if method == best_precision else ""
                recall_style = "background-color: #90EE90;" if method == best_recall else ""
                f1_style = "background-color: #90EE90;" if method == best_f1 else ""
                accuracy_style = "background-color: #90EE90;" if method == best_accuracy else ""
                time_style = "background-color: #87CEEB;" if method == fastest else ""
                
                html_table += f"<td style='padding: 8px; border: 1px solid #ddd; {precision_style}'>{row['precision']:.3f}</td>"
                html_table += f"<td style='padding: 8px; border: 1px solid #ddd; {recall_style}'>{row['recall']:.3f}</td>"
                html_table += f"<td style='padding: 8px; border: 1px solid #ddd; {f1_style}'>{row['f1_score']:.3f}</td>"
                html_table += f"<td style='padding: 8px; border: 1px solid #ddd; {accuracy_style}'>{row['accuracy']:.3f}</td>"
                html_table += f"<td style='padding: 8px; border: 1px solid #ddd; {time_style}'>{row['training_time']:.3f}</td>"
                html_table += "</tr>"
            
            html_table += "</table>"
            html_table += "<p style='font-size: 12px; color: #666;'>💡 Green: Best performance | Blue: Fastest training</p>"
            html_table += "</div>"
            
            display(HTML(html_table))
    
    def create_comparison_plots(self):
        """Create visualization comparing ensemble methods."""
        with self.plot_output:
            clear_output(wait=True)
            
            if not self.results:
                return
            
            # Create subplots
            fig = make_subplots(
                rows=2, cols=2,
                subplot_titles=('Performance Metrics', 'Training Time',
                               'Precision vs Recall', 'Method Comparison'),
                specs=[[{"secondary_y": False}, {"secondary_y": False}],
                       [{"secondary_y": False}, {"secondary_y": False}]]
            )
            
            methods = list(self.results.keys())
            
            # 1. Performance metrics bar chart
            metrics = ['precision', 'recall', 'f1_score', 'accuracy']
            colors = ['blue', 'green', 'red', 'orange']
            
            for i, (metric, color) in enumerate(zip(metrics, colors)):
                values = [self.results[method][metric] for method in methods]
                fig.add_trace(
                    go.Bar(
                        name=metric.replace('_', ' ').title(),
                        x=methods,
                        y=values,
                        marker_color=color,
                        opacity=0.7
                    ),
                    row=1, col=1
                )
            
            # 2. Training time
            training_times = [self.results[method]['training_time'] for method in methods]
            fig.add_trace(
                go.Bar(
                    x=methods,
                    y=training_times,
                    name='Training Time',
                    marker_color='purple',
                    showlegend=False
                ),
                row=1, col=2
            )
            
            # 3. Precision vs Recall scatter
            precisions = [self.results[method]['precision'] for method in methods]
            recalls = [self.results[method]['recall'] for method in methods]
            
            fig.add_trace(
                go.Scatter(
                    x=recalls,
                    y=precisions,
                    mode='markers+text',
                    text=methods,
                    textposition="top center",
                    marker=dict(size=12, color='darkblue'),
                    name='Methods',
                    showlegend=False
                ),
                row=2, col=1
            )
            
            # 4. F1-score comparison (bar chart)
            # Plot each method's F1-score side by side
            f1_scores = [self.results[method]['f1_score'] for method in methods]
            
            fig.add_trace(
                go.Bar(
                    x=methods,
                    y=f1_scores,
                    name='F1-Score',
                    marker_color='gold',
                    showlegend=False
                ),
                row=2, col=2
            )
            
            # Update layout
            fig.update_layout(
                height=800,
                title_text="Ensemble Methods Comparison Dashboard",
                showlegend=True
            )
            
            # Update axes labels
            fig.update_xaxes(title_text="Method", row=1, col=1)
            fig.update_yaxes(title_text="Score", row=1, col=1)
            fig.update_xaxes(title_text="Method", row=1, col=2)
            fig.update_yaxes(title_text="Time (seconds)", row=1, col=2)
            fig.update_xaxes(title_text="Recall", row=2, col=1)
            fig.update_yaxes(title_text="Precision", row=2, col=1)
            fig.update_xaxes(title_text="Method", row=2, col=2)
            fig.update_yaxes(title_text="F1-Score", row=2, col=2)
            
            fig.show()

print("✅ EnsembleComparator created!")
print("🎛️ Ready to launch interactive ensemble comparison!")

## 🚀 Launch Interactive Ensemble Comparison

Use the interactive tool below to compare different ensemble methods!
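
As a quick reference for the three voting rules the tool compares, here is a minimal sketch using plain NumPy arrays. The helper name `combine_votes` and the toy vote matrix are assumptions made for illustration; they are not part of the notebook's `VotingEnsemble` class, which may differ in detail.

```python
import numpy as np

def combine_votes(predictions, rule="majority"):
    """Combine detector labels (-1 = anomaly, 1 = normal) with a voting rule.

    predictions: array-like of shape (n_detectors, n_samples).
    """
    predictions = np.asarray(predictions)
    n_detectors = predictions.shape[0]
    # Count how many detectors flag each sample as an anomaly
    anomaly_votes = np.sum(predictions == -1, axis=0)

    if rule == "majority":
        is_anomaly = anomaly_votes > n_detectors / 2
    elif rule == "unanimous":
        is_anomaly = anomaly_votes == n_detectors
    elif rule == "any":
        is_anomaly = anomaly_votes >= 1
    else:
        raise ValueError(f"Unknown rule: {rule}")

    return np.where(is_anomaly, -1, 1)


# Three hypothetical detectors labelling four samples
votes = np.array([
    [-1, -1,  1, 1],
    [-1,  1,  1, 1],
    [-1, -1, -1, 1],
])
majority = combine_votes(votes, "majority")
unanimous = combine_votes(votes, "unanimous")
any_vote = combine_votes(votes, "any")
```

Unanimous voting flags the fewest samples and "any" voting the most, so the three rules trade precision against recall in a predictable order.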

In [None]:
# Create and display the ensemble comparator
comparator = EnsembleComparator()
comparator.display_interface()

print("\n" + "="*60)
print("🔬 ENSEMBLE COMPARISON INSTRUCTIONS:")
print("="*60)
print("1. 🎛️ Configure dataset parameters (samples, features, contamination)")
print("2. 🎯 Select ensemble methods to compare")
print("3. 📊 Click 'Generate Data' to create synthetic dataset")
print("4. 🚀 Click 'Compare Ensembles' to run the comparison")
print("5. 📈 Analyze results in the table and visualizations")
print("="*60)
print("💡 Available Ensemble Methods:")
print("   • Majority Voting: Predict anomaly if a majority of detectors flag it")
print("   • Unanimous Voting: Predict anomaly only if all detectors flag it")
print("   • Any Voting: Predict anomaly if at least one detector flags it")
print("   • Mean Averaging: Average anomaly scores")
print("   • Median Averaging: Use median of anomaly scores")
print("   • Max Averaging: Use maximum anomaly score")
print("   • Stacking: Use meta-learner on base detector outputs")
print("="*60)

## 🏗️ Advanced Ensemble Architectures

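This section builds two more sophisticated architectures: a hierarchical ensemble that combines detectors within groups before combining the groups, and a dynamic ensemble that weights each detector by confidence, diversity, or labeled performance.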
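
To see why grouping matters, the sketch below compares a flat majority vote with a two-level (hierarchical) vote on toy labels. The function names and the toy groups are assumptions for illustration only; the notebook's `HierarchicalEnsemble` wraps its own ensemble classes.

```python
import numpy as np

def majority_vote(labels):
    """Majority vote over rows of -1 (anomaly) / 1 (normal) labels."""
    labels = np.asarray(labels)
    anomaly_votes = np.sum(labels == -1, axis=0)
    return np.where(anomaly_votes > labels.shape[0] / 2, -1, 1)

def hierarchical_vote(groups):
    """Vote inside each group first, then vote across the group results."""
    group_labels = [majority_vote(group) for group in groups]
    return majority_vote(group_labels)


# Hypothetical groups: three similar detectors, plus two single-detector groups.
# Each inner list holds one detector's labels for two samples.
group_a = [[-1, 1], [-1, 1], [-1, -1]]
group_b = [[1, -1]]
group_c = [[1, -1]]

flat = majority_vote(group_a + group_b + group_c)
nested = hierarchical_vote([group_a, group_b, group_c])
```

On the first sample the three detectors in `group_a` outvote everyone in the flat scheme, while the hierarchical scheme gives each group a single vote. This limits the influence of a large family of similar detectors.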

In [None]:
class HierarchicalEnsemble:
    """Hierarchical ensemble with multiple levels of combination."""
    
    def __init__(self, detector_groups, combination_strategy='voting'):
        self.detector_groups = detector_groups  # List of detector groups
        self.combination_strategy = combination_strategy
        self.group_ensembles = []
        self.meta_ensemble = None
        self.is_fitted = False
    
    def fit(self, X, y=None):
        """Fit hierarchical ensemble."""
        # Step 1: Create ensembles for each group
        self.group_ensembles = []
        
        for i, detector_group in enumerate(self.detector_groups):
            print(f"🔧 Training group {i+1} ensemble...")
            
            # Fit individual detectors in the group
            for detector in detector_group:
                if not detector.is_fitted:
                    detector.fit(X)
            
            # Create group ensemble
            if self.combination_strategy == 'voting':
                group_ensemble = VotingEnsemble(detector_group, 'majority')
            else:
                group_ensemble = AveragingEnsemble(detector_group, 'mean')
            
            group_ensemble.fit(X)
            self.group_ensembles.append(group_ensemble)
        
        # Step 2: Create meta-ensemble from group ensembles
        print("🔧 Training meta-ensemble...")
        if self.combination_strategy == 'voting':
            self.meta_ensemble = VotingEnsemble(self.group_ensembles, 'majority')
        else:
            self.meta_ensemble = AveragingEnsemble(self.group_ensembles, 'mean')
        
        # Meta-ensemble doesn't need fitting as sub-ensembles are already fitted
        self.meta_ensemble.is_fitted = True
        
        self.is_fitted = True
        return self
    
    def predict(self, X):
        """Make hierarchical predictions."""
        if not self.is_fitted:
            raise ValueError("Hierarchical ensemble must be fitted first.")
        
        return self.meta_ensemble.predict(X)
    
    def get_group_predictions(self, X):
        """Get predictions from each group ensemble."""
        group_predictions = {}
        
        for i, group_ensemble in enumerate(self.group_ensembles):
            group_predictions[f'Group_{i+1}'] = group_ensemble.predict(X)
        
        return group_predictions


class DynamicEnsemble:
    """Dynamic ensemble that adapts based on data characteristics."""
    
    def __init__(self, detectors, adaptation_strategy='confidence'):
        self.detectors = detectors
        self.adaptation_strategy = adaptation_strategy
        self.detector_weights = None
        self.is_fitted = False
    
    def fit(self, X, y=None):
        """Fit detectors and calculate adaptive weights."""
        # Fit all detectors
        for detector in self.detectors:
            if not detector.is_fitted:
                detector.fit(X)
        
        # Calculate weights based on adaptation strategy
        if self.adaptation_strategy == 'confidence':
            self.detector_weights = self._calculate_confidence_weights(X)
        elif self.adaptation_strategy == 'diversity':
            self.detector_weights = self._calculate_diversity_weights(X)
        elif self.adaptation_strategy == 'performance':
            if y is not None:
                self.detector_weights = self._calculate_performance_weights(X, y)
            else:
                print("⚠️ Performance-based weighting requires labels, using equal weights")
                self.detector_weights = np.ones(len(self.detectors)) / len(self.detectors)
        else:
            # Equal weights
            self.detector_weights = np.ones(len(self.detectors)) / len(self.detectors)
        
        self.is_fitted = True
        return self
    
    def _calculate_confidence_weights(self, X):
        """Calculate weights based on prediction confidence."""
        weights = []
        
        for detector in self.detectors:
            if detector.is_fitted:
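                try:
                    scores = detector.decision_function(X)
                    # Use the spread of scores as a rough confidence proxy.
                    # Score scales differ between algorithms, so this is a heuristic.
                    confidence = np.std(scores)
                    weights.append(confidence)
                except Exception:
                    weights.append(1.0)  # Default weight if scoring is unavailable
            else:
                weights.append(0.0)
        
        # Normalize weights; fall back to equal weights if all are zero
        weights = np.array(weights)
        return weights / np.sum(weights) if np.sum(weights) > 0 else np.ones(len(weights)) / len(weights)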
    
    def _calculate_diversity_weights(self, X):
        """Calculate weights based on prediction diversity."""
        # Get predictions from fitted detectors, remembering their positions
        fitted_idx = [i for i, d in enumerate(self.detectors) if d.is_fitted]
        
        if len(fitted_idx) < 2:
            return np.ones(len(self.detectors)) / len(self.detectors)
        
        predictions_array = np.array([self.detectors[i].predict(X) for i in fitted_idx])
        
        # Weights stay aligned with self.detectors; unfitted detectors keep weight 0
        weights = np.zeros(len(self.detectors))
        
        for row, i in enumerate(fitted_idx):
            # Mean disagreement with every other fitted detector
            diversity = 0
            for other in range(len(fitted_idx)):
                if row != other:
                    diversity += np.mean(predictions_array[row] != predictions_array[other])
            
            weights[i] = diversity / (len(fitted_idx) - 1)
        
        # Normalize weights; fall back to equal weights if all detectors agree
        return weights / np.sum(weights) if np.sum(weights) > 0 else np.ones(len(weights)) / len(weights)
    
    def _calculate_performance_weights(self, X, y):
        """Calculate weights based on individual performance."""
        weights = []
        
        for detector in self.detectors:
            if detector.is_fitted:
                predictions = detector.predict(X)
                
                # Calculate F1-score as performance measure
                y_pred_binary = (predictions == -1).astype(int)
                y_true_binary = (y == -1).astype(int)
                
                f1 = f1_score(y_true_binary, y_pred_binary, zero_division=0)
                weights.append(f1)
            else:
                weights.append(0.0)
        
        # Normalize weights
        weights = np.array(weights)
        return weights / np.sum(weights) if np.sum(weights) > 0 else np.ones(len(weights)) / len(weights)
    
    def predict(self, X):
        """Make dynamic weighted predictions."""
        if not self.is_fitted:
            raise ValueError("Dynamic ensemble must be fitted first.")
        
        # Get predictions from all detectors
        all_predictions = []
        valid_weights = []
        
        for i, detector in enumerate(self.detectors):
            if detector.is_fitted:
                predictions = detector.predict(X)
                all_predictions.append(predictions)
                valid_weights.append(self.detector_weights[i])
        
        if not all_predictions:
            raise ValueError("No fitted detectors available.")
        
        # Weighted voting
        predictions_array = np.array(all_predictions)  # Shape: (n_detectors, n_samples)
        valid_weights = np.array(valid_weights)
        
        # Renormalize so the 0.5 threshold is a weighted majority
        # even if some detectors were skipped
        if np.sum(valid_weights) > 0:
            valid_weights = valid_weights / np.sum(valid_weights)
        
        # Calculate weighted anomaly votes per sample
        anomaly_weights = valid_weights @ (predictions_array == -1).astype(float)
        
        # Make final predictions
        ensemble_predictions = np.where(anomaly_weights > 0.5, -1, 1)
        
        return ensemble_predictions
    
    def get_detector_weights(self):
        """Get current detector weights."""
        if not self.is_fitted:
            return None
        
        return dict(zip(
            [d.name for d in self.detectors],
            self.detector_weights
        ))

print("✅ Advanced ensemble architectures created!")
print("🏗️ Ready for hierarchical and dynamic ensembles!")

## 🧪 Advanced Ensemble Demo

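The demo below builds a synthetic dataset (2,000 samples, 8 features, 12% contamination with three kinds of anomalies) and evaluates the hierarchical ensemble and all three dynamic weighting strategies on it.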
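
Before running the demo, it may help to see how diversity weights interact with the 0.5 weighted-vote threshold. The following is a minimal sketch on toy labels; `diversity_weights` and `weighted_vote` are hypothetical standalone helpers that mirror the idea behind `DynamicEnsemble`, not its exact code.

```python
import numpy as np

def diversity_weights(predictions):
    """Weight each detector by its mean disagreement with the others."""
    predictions = np.asarray(predictions)
    n = predictions.shape[0]
    # disagreement[i, j] = fraction of samples where detectors i and j differ
    disagreement = (predictions[:, None, :] != predictions[None, :, :]).mean(axis=2)
    raw = disagreement.sum(axis=1) / (n - 1)
    total = raw.sum()
    # Equal weights if every detector agrees with every other one
    return raw / total if total > 0 else np.full(n, 1.0 / n)

def weighted_vote(predictions, weights, threshold=0.5):
    """Flag a sample when the weight of anomaly votes exceeds the threshold."""
    predictions = np.asarray(predictions)
    anomaly_mass = weights @ (predictions == -1).astype(float)
    return np.where(anomaly_mass > threshold, -1, 1)


# Two identical detectors and one that disagrees on the second sample
toy = np.array([
    [-1,  1, 1, 1],
    [-1,  1, 1, 1],
    [-1, -1, 1, 1],
])
w = diversity_weights(toy)
labels = weighted_vote(toy, w)
```

Here the dissenting detector receives half of the total weight, yet its lone vote on the second sample only reaches the threshold without exceeding it, so the sample stays normal. Diversity weighting rewards disagreement, which is not the same as rewarding accuracy.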

In [None]:
# Generate demonstration dataset
np.random.seed(42)
n_samples = 2000
n_features = 8
contamination = 0.12

print("🔄 Generating comprehensive demonstration dataset...")

# Generate normal data with multiple clusters
cluster1 = np.random.multivariate_normal([0, 0] + [0] * (n_features-2), np.eye(n_features), size=800)
cluster2 = np.random.multivariate_normal([3, 3] + [0] * (n_features-2), np.eye(n_features) * 0.5, size=600)
cluster3 = np.random.multivariate_normal([-2, 2] + [0] * (n_features-2), np.eye(n_features) * 0.8, size=360)

normal_data = np.vstack([cluster1, cluster2, cluster3])

# Generate diverse types of anomalies
n_anomalies = int(n_samples * contamination)
anomaly1 = np.random.uniform(-6, 6, size=(n_anomalies//3, n_features))  # Scattered outliers
anomaly2 = np.random.multivariate_normal([8, -8] + [0] * (n_features-2), np.eye(n_features) * 0.2, size=n_anomalies//3)  # Distant cluster
anomaly3 = np.random.exponential(2, size=(n_anomalies - 2*(n_anomalies//3), n_features))  # Different distribution

anomaly_data = np.vstack([anomaly1, anomaly2, anomaly3])

# Combine and shuffle
demo_data = np.vstack([normal_data, anomaly_data])
demo_labels = np.hstack([np.ones(len(normal_data)), -np.ones(len(anomaly_data))])

indices = np.random.permutation(len(demo_data))
demo_data = demo_data[indices]
demo_labels = demo_labels[indices]

print(f"✅ Dataset created: {len(demo_data)} samples, {n_features} features")
print(f"📊 Normal samples: {np.sum(demo_labels == 1)}, Anomalies: {np.sum(demo_labels == -1)}")

# Create detector groups for hierarchical ensemble
detector_suite = DetectorSuite(contamination=contamination)
detector_suite.fit_all(demo_data, verbose=False)

# Group detectors by type
isolation_group = [d for d in detector_suite.detectors if 'IsolationForest' in d.name and d.is_fitted]
lof_group = [d for d in detector_suite.detectors if 'LOF' in d.name and d.is_fitted]
svm_group = [d for d in detector_suite.detectors if 'SVM' in d.name and d.is_fitted]

print(f"\n🔧 Detector groups:")
print(f"   Isolation Forest group: {len(isolation_group)} detectors")
print(f"   LOF group: {len(lof_group)} detectors")
print(f"   SVM group: {len(svm_group)} detectors")

# Test hierarchical ensemble
print("\n🏗️ Testing Hierarchical Ensemble...")
# Skip empty groups so that every group ensemble has at least one detector
hierarchical = HierarchicalEnsemble(
    detector_groups=[g for g in (isolation_group, lof_group, svm_group) if g],
    combination_strategy='voting'
)
hierarchical.fit(demo_data)
hierarchical_predictions = hierarchical.predict(demo_data)

# Test dynamic ensemble with different strategies
print("\n⚡ Testing Dynamic Ensembles...")
fitted_detectors = [d for d in detector_suite.detectors if d.is_fitted]

dynamic_confidence = DynamicEnsemble(fitted_detectors, 'confidence')
dynamic_confidence.fit(demo_data)
confidence_predictions = dynamic_confidence.predict(demo_data)

dynamic_diversity = DynamicEnsemble(fitted_detectors, 'diversity')
dynamic_diversity.fit(demo_data)
diversity_predictions = dynamic_diversity.predict(demo_data)

dynamic_performance = DynamicEnsemble(fitted_detectors, 'performance')
dynamic_performance.fit(demo_data, demo_labels)
performance_predictions = dynamic_performance.predict(demo_data)

# Evaluate all ensembles
print("\n📊 ADVANCED ENSEMBLE EVALUATION")
print("="*60)

ensembles = {
    'Hierarchical Voting': hierarchical_predictions,
    'Dynamic (Confidence)': confidence_predictions,
    'Dynamic (Diversity)': diversity_predictions,
    'Dynamic (Performance)': performance_predictions
}

results_summary = []

for name, predictions in ensembles.items():
    y_pred_binary = (predictions == -1).astype(int)
    y_true_binary = (demo_labels == -1).astype(int)
    
    precision = precision_score(y_true_binary, y_pred_binary, zero_division=0)
    recall = recall_score(y_true_binary, y_pred_binary, zero_division=0)
    f1 = f1_score(y_true_binary, y_pred_binary, zero_division=0)
    accuracy = np.mean(y_pred_binary == y_true_binary)
    
    results_summary.append({
        'Method': name,
        'Precision': precision,
        'Recall': recall,
        'F1-Score': f1,
        'Accuracy': accuracy
    })
    
    print(f"{name:25} | P: {precision:.3f} | R: {recall:.3f} | F1: {f1:.3f} | Acc: {accuracy:.3f}")

# Display dynamic ensemble weights
print("\n🔍 DYNAMIC ENSEMBLE WEIGHTS")
print("="*60)

print("\n📊 Confidence-based weights:")
confidence_weights = dynamic_confidence.get_detector_weights()
for detector, weight in confidence_weights.items():
    print(f"   {detector:30} {weight:.3f}")

print("\n🎯 Diversity-based weights:")
diversity_weights = dynamic_diversity.get_detector_weights()
for detector, weight in diversity_weights.items():
    print(f"   {detector:30} {weight:.3f}")

print("\n🏆 Performance-based weights:")
performance_weights = dynamic_performance.get_detector_weights()
for detector, weight in performance_weights.items():
    print(f"   {detector:30} {weight:.3f}")

print("\n✅ Advanced ensemble demonstration completed!")

## 📈 Ensemble Performance Visualization

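The dashboard below compares ensemble metrics, detector weights, pairwise prediction agreement, individual detector F1-scores, and where each ensemble flags anomalies in the first two features.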
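
The agreement matrix in the dashboard is simply the fraction of samples on which two ensembles give the same label. A vectorized sketch, assuming a hypothetical dictionary of toy predictions, could look like this:

```python
import numpy as np

def agreement_matrix(predictions_by_name):
    """Return (names, matrix) where matrix[i, j] is the share of equal labels."""
    names = list(predictions_by_name)
    stacked = np.array([predictions_by_name[name] for name in names])
    # Compare every pair of prediction vectors via broadcasting
    matrix = (stacked[:, None, :] == stacked[None, :, :]).mean(axis=2)
    return names, matrix


# Hypothetical ensemble outputs for four samples (-1 = anomaly, 1 = normal)
toy_predictions = {
    "ensemble_a": np.array([-1,  1, 1,  1]),
    "ensemble_b": np.array([-1, -1, 1,  1]),
    "ensemble_c": np.array([ 1,  1, 1, -1]),
}
names, matrix = agreement_matrix(toy_predictions)
```

The matrix is symmetric with ones on the diagonal. Low off-diagonal values indicate ensembles that disagree often, which is where combining them is most likely to change the outcome.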

In [None]:
# Create comprehensive ensemble performance visualization
def create_ensemble_analysis_plots():
    """Create detailed analysis plots for ensemble methods."""
    
    # Create master subplot figure
    fig = make_subplots(
        rows=3, cols=2,
        subplot_titles=(
            'Performance Comparison', 'Detector Weight Distribution',
            'Prediction Agreement Matrix', 'Individual Detector F1-Scores',
            'Data Distribution (2D Projection)', 'Predicted Anomalies by Ensemble'
        ),
        specs=[
            [{"secondary_y": False}, {"secondary_y": False}],
            [{"secondary_y": False}, {"secondary_y": False}],
            [{"secondary_y": False}, {"secondary_y": False}]
        ]
    )
    
    # 1. Performance comparison bar chart
    methods = [r['Method'] for r in results_summary]
    f1_scores = [r['F1-Score'] for r in results_summary]
    precisions = [r['Precision'] for r in results_summary]
    recalls = [r['Recall'] for r in results_summary]
    
    fig.add_trace(
        go.Bar(name='F1-Score', x=methods, y=f1_scores, marker_color='gold'),
        row=1, col=1
    )
    fig.add_trace(
        go.Bar(name='Precision', x=methods, y=precisions, marker_color='lightblue'),
        row=1, col=1
    )
    fig.add_trace(
        go.Bar(name='Recall', x=methods, y=recalls, marker_color='lightcoral'),
        row=1, col=1
    )
    
    # 2. Detector weight distribution (performance-based)
    detector_names = list(performance_weights.keys())
    weights = list(performance_weights.values())
    
    fig.add_trace(
        go.Bar(x=detector_names, y=weights, marker_color='purple', name='Weights'),
        row=1, col=2
    )
    
    # 3. Prediction agreement matrix
    ensemble_names = list(ensembles.keys())
    agreement_matrix = np.zeros((len(ensemble_names), len(ensemble_names)))
    
    for i, (name1, pred1) in enumerate(ensembles.items()):
        for j, (name2, pred2) in enumerate(ensembles.items()):
            agreement = np.mean(pred1 == pred2)
            agreement_matrix[i, j] = agreement
    
    fig.add_trace(
        go.Heatmap(
            z=agreement_matrix,
            x=ensemble_names,
            y=ensemble_names,
            colorscale='Viridis',
            name='Agreement'
        ),
        row=2, col=1
    )
    
    # 4. Individual detector performance
    individual_results = detector_suite.evaluate_individual_performance(demo_data, demo_labels)
    individual_names = list(individual_results.keys())
    individual_f1s = [individual_results[name]['f1_score'] for name in individual_names]
    
    fig.add_trace(
        go.Bar(x=individual_names, y=individual_f1s, marker_color='orange', name='Individual F1'),
        row=2, col=2
    )
    
    # 5. Data distribution (2D projection of first two features)
    normal_mask = demo_labels == 1
    anomaly_mask = demo_labels == -1
    
    fig.add_trace(
        go.Scatter(
            x=demo_data[normal_mask, 0],
            y=demo_data[normal_mask, 1],
            mode='markers',
            name='Normal',
            marker=dict(color='blue', size=4, opacity=0.6)
        ),
        row=3, col=1
    )
    
    fig.add_trace(
        go.Scatter(
            x=demo_data[anomaly_mask, 0],
            y=demo_data[anomaly_mask, 1],
            mode='markers',
            name='Anomaly',
            marker=dict(color='red', size=6, opacity=0.8)
        ),
        row=3, col=1
    )
    
    # 6. Ensemble prediction comparison (scatter)
    colors = ['gold', 'lightblue', 'lightgreen', 'orange']
    for i, (name, predictions) in enumerate(ensembles.items()):
        pred_anomalies = predictions == -1
        if np.any(pred_anomalies):
            fig.add_trace(
                go.Scatter(
                    x=demo_data[pred_anomalies, 0],
                    y=demo_data[pred_anomalies, 1],
                    mode='markers',
                    name=f'{name} Predictions',
                    marker=dict(color=colors[i], size=8, opacity=0.7,
                               symbol='x' if i % 2 == 0 else 'circle')
                ),
                row=3, col=2
            )
    
    # Update layout
    fig.update_layout(
        height=1200,
        title_text="Comprehensive Ensemble Analysis Dashboard",
        showlegend=True
    )
    
    # Update individual subplot properties
    fig.update_xaxes(title_text="Method", row=1, col=1)
    fig.update_yaxes(title_text="Score", row=1, col=1)
    fig.update_xaxes(title_text="Detector", row=1, col=2)
    fig.update_yaxes(title_text="Weight", row=1, col=2)
    fig.update_xaxes(title_text="Feature 1", row=3, col=1)
    fig.update_yaxes(title_text="Feature 2", row=3, col=1)
    fig.update_xaxes(title_text="Feature 1", row=3, col=2)
    fig.update_yaxes(title_text="Feature 2", row=3, col=2)
    
    return fig

# Create and display the comprehensive analysis
print("🎨 Creating comprehensive ensemble analysis visualization...")
analysis_fig = create_ensemble_analysis_plots()
analysis_fig.show()

# Create summary insights
print("\n💡 ENSEMBLE INSIGHTS AND RECOMMENDATIONS")
print("="*70)

# Find best performing method
# (scores are in-sample: the ensembles were fitted and evaluated on demo_data)
best_method = max(results_summary, key=lambda x: x['F1-Score'])
print(f"🏆 Best Overall Method: {best_method['Method']} (F1: {best_method['F1-Score']:.3f})")

# Analyze weight distributions
top_weighted_detector = max(performance_weights.items(), key=lambda x: x[1])
print(f"🎯 Most Important Detector: {top_weighted_detector[0]} (Weight: {top_weighted_detector[1]:.3f})")

# Method recommendations
print("\n📋 Method Recommendations:")
print("   🚀 High Accuracy: Use performance-weighted dynamic ensemble")
print("   ⚡ Fast Inference: Use majority voting ensemble")
print("   🎛️ Adaptability: Use hierarchical ensemble with diverse groups")
print("   🔍 Interpretability: Use confidence-weighted dynamic ensemble")

print("\n✅ Comprehensive ensemble analysis completed!")

## 🚀 Production Ensemble System

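The class below wraps the earlier pieces into a production-oriented system: it builds the configured ensembles, validates them on a held-out split, keeps the top performers, and falls back to a simple Isolation Forest if prediction fails.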
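
The fallback idea can be shown in isolation. The sketch below is a simplified illustration, assuming models are plain callables; `predict_with_fallback`, `broken_model`, and `threshold_model` are hypothetical names and not part of `ProductionEnsembleSystem`.

```python
import numpy as np

def predict_with_fallback(primary_predict, fallback_predict, X):
    """Return (labels, source), using the fallback if the primary fails."""
    try:
        labels = np.asarray(primary_predict(X))
        # Treat a malformed result the same way as an exception
        if labels.shape[0] != len(X):
            raise RuntimeError("primary returned the wrong number of labels")
        return labels, "primary"
    except Exception as exc:
        print(f"Primary model failed ({exc}); using fallback")
        return np.asarray(fallback_predict(X)), "fallback"


def broken_model(X):
    # Stand-in for an ensemble that fails at prediction time
    raise RuntimeError("model file missing")

def threshold_model(X):
    # Simple stand-in detector: flag rows whose first feature is far from zero
    return np.where(np.abs(X[:, 0]) > 3, -1, 1)


X_demo = np.array([[0.1, 0.0], [5.0, 1.0], [-4.2, 0.3]])
labels, source = predict_with_fallback(broken_model, threshold_model, X_demo)
ok_labels, ok_source = predict_with_fallback(threshold_model, threshold_model, X_demo)
```

Returning the source alongside the labels makes it possible to monitor how often the fallback is used, which is usually worth tracking in a deployed system.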

In [None]:
class ProductionEnsembleSystem:
    """Production-ready ensemble anomaly detection system."""
    
    def __init__(self, config=None):
        self.config = config or self._get_default_config()
        self.detector_suite = None
        self.ensembles = {}
        self.model_registry = {}
        self.performance_metrics = {}
        self.is_fitted = False
    
    def _get_default_config(self):
        """Get default production configuration."""
        return {
            'contamination': 0.1,
            'ensemble_methods': [
                'majority_voting',
                'mean_averaging', 
                'performance_weighted',
                'hierarchical'
            ],
            'enable_model_selection': True,
            'enable_monitoring': True,
            'enable_fallback': True,
            'performance_threshold': 0.7,
            'model_update_strategy': 'periodic',
            'backup_models': 2
        }
    
    def fit(self, X, y=None, validation_split=0.2):
        """Fit the production ensemble system."""
        print("🏭 Initializing Production Ensemble System...")
        
        # Split data for validation
        if validation_split > 0:
            split_idx = int((1 - validation_split) * len(X))
            train_X, val_X = X[:split_idx], X[split_idx:]
            if y is not None:
                train_y, val_y = y[:split_idx], y[split_idx:]
            else:
                train_y, val_y = None, None
        else:
            train_X, val_X = X, X
            train_y, val_y = y, y
        
        # Create and fit detector suite
        print("🔧 Building detector suite...")
        self.detector_suite = DetectorSuite(self.config['contamination'])
        self.detector_suite.fit_all(train_X, verbose=False)
        
        # Build ensemble methods
        print("🎯 Creating ensemble methods...")
        self._build_ensembles(train_X, train_y)
        
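        # Validate and select best models
        if self.config['enable_model_selection'] and val_y is not None:
            print("📊 Validating and selecting best models...")
            self._validate_and_select_models(val_X, val_y)
        else:
            # Without labels or model selection, register every ensemble as-is
            self.model_registry = self.ensembles.copy()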
        
        # Create fallback models
        if self.config['enable_fallback']:
            print("🛡️ Creating fallback models...")
            self._create_fallback_models(train_X, train_y)
        
        self.is_fitted = True
        print("✅ Production ensemble system ready!")
        
        return self
    
    def _build_ensembles(self, X, y):
        """Build all configured ensemble methods."""
        fitted_detectors = [d for d in self.detector_suite.detectors if d.is_fitted]
        
        if 'majority_voting' in self.config['ensemble_methods']:
            self.ensembles['majority_voting'] = VotingEnsemble(
                fitted_detectors, 'majority'
            )
            self.ensembles['majority_voting'].fit(X)
        
        if 'mean_averaging' in self.config['ensemble_methods']:
            self.ensembles['mean_averaging'] = AveragingEnsemble(
                fitted_detectors, 'mean'
            )
            self.ensembles['mean_averaging'].fit(X, contamination=self.config['contamination'])
        
        if 'performance_weighted' in self.config['ensemble_methods'] and y is not None:
            self.ensembles['performance_weighted'] = DynamicEnsemble(
                fitted_detectors, 'performance'
            )
            self.ensembles['performance_weighted'].fit(X, y)
        
        if 'hierarchical' in self.config['ensemble_methods']:
            # Group detectors by type
            groups = self._group_detectors_by_type(fitted_detectors)
            if len(groups) > 1:
                self.ensembles['hierarchical'] = HierarchicalEnsemble(groups)
                self.ensembles['hierarchical'].fit(X)
    
    def _group_detectors_by_type(self, detectors):
        """Group detectors by algorithm type."""
        groups = defaultdict(list)
        
        for detector in detectors:
            if 'IsolationForest' in detector.name:
                groups['isolation'].append(detector)
            elif 'LOF' in detector.name:
                groups['lof'].append(detector)
            elif 'SVM' in detector.name:
                groups['svm'].append(detector)
            else:
                groups['other'].append(detector)
        
        # Return non-empty groups
        return [group for group in groups.values() if len(group) > 0]
    
    def _validate_and_select_models(self, val_X, val_y):
        """Validate ensembles and select best performing ones."""
        validation_results = {}
        
        for name, ensemble in self.ensembles.items():
            try:
                predictions = ensemble.predict(val_X)
                
                y_pred_binary = (predictions == -1).astype(int)
                y_true_binary = (val_y == -1).astype(int)
                
                f1 = f1_score(y_true_binary, y_pred_binary, zero_division=0)
                precision = precision_score(y_true_binary, y_pred_binary, zero_division=0)
                recall = recall_score(y_true_binary, y_pred_binary, zero_division=0)
                
                validation_results[name] = {
                    'f1_score': f1,
                    'precision': precision,
                    'recall': recall,
                    'status': 'good' if f1 >= self.config['performance_threshold'] else 'poor'
                }
                
            except Exception as e:
                validation_results[name] = {
                    'f1_score': 0.0,
                    'precision': 0.0,
                    'recall': 0.0,
                    'status': 'failed',
                    'error': str(e)
                }
        
        self.performance_metrics = validation_results
        
        # Select top performing models
        good_models = {name: metrics for name, metrics in validation_results.items() 
                      if metrics['status'] == 'good'}
        
        if good_models:
            # Keep top performers
            top_models = sorted(good_models.items(), 
                              key=lambda x: x[1]['f1_score'], 
                              reverse=True)[:self.config['backup_models'] + 1]
            
            self.model_registry = {name: self.ensembles[name] for name, _ in top_models}
            print(f"🎯 Selected {len(self.model_registry)} top performing models")
        else:
            print("⚠️ No models met performance threshold, keeping all models")
            self.model_registry = self.ensembles.copy()
    
    def _create_fallback_models(self, X, y):
        """Create simple fallback models for robustness."""
        # Simple Isolation Forest fallback
        fallback_detector = IndividualDetector(
            "Fallback_IsolationForest",
            IsolationForest(
                contamination=self.config['contamination'],
                random_state=42,
                n_estimators=50
            )
        )
        fallback_detector.fit(X)
        
        self.model_registry['fallback'] = fallback_detector
    
    def predict(self, X, method='best', enable_consensus=True):
        """Make production predictions with fallback handling."""
        if not self.is_fitted:
            raise ValueError("System must be fitted before prediction.")
        
        try:
            if method == 'best':
                # Use best performing model
                if self.performance_metrics:
                    best_model = max(self.performance_metrics.items(), 
                                   key=lambda x: x[1]['f1_score'])[0]
                    if best_model in self.model_registry:
                        return self.model_registry[best_model].predict(X)
                
                # Fallback to first available model
                return next(iter(self.model_registry.values())).predict(X)
            
            elif method == 'consensus' and enable_consensus:
                # Consensus prediction from multiple models
                all_predictions = []
                
                for name, model in self.model_registry.items():
                    if name != 'fallback':  # Skip fallback for consensus
                        try:
                            pred = model.predict(X)
                            all_predictions.append(pred)
                        except Exception as e:
                            print(f"⚠️ Model {name} failed: {e}")
                
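                if not all_predictions:
                    # No model produced predictions; let the handler below use the fallback
                    raise RuntimeError("No models available for consensus")
                
                # Majority voting (a tie counts as normal)
                predictions_array = np.array(all_predictions)
                anomaly_votes = np.sum(predictions_array == -1, axis=0)
                return np.where(anomaly_votes > len(all_predictions) / 2, -1, 1)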
            
            elif method in self.model_registry:
                return self.model_registry[method].predict(X)
            
            else:
                raise ValueError(f"Unknown method: {method}")
        
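        except ValueError:
            # Caller errors (e.g. an unknown method name) should surface, not be masked
            raise
        except Exception as e:
            print(f"⚠️ Prediction failed, using fallback: {e}")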
            if 'fallback' in self.model_registry:
                return self.model_registry['fallback'].predict(X)
            else:
                raise RuntimeError("No fallback model available")
    
    def get_system_status(self):
        """Get comprehensive system status."""
        return {
            'is_fitted': self.is_fitted,
            'available_methods': list(self.model_registry.keys()),
            'performance_metrics': self.performance_metrics,
            'configuration': self.config,
            'detector_count': len(self.detector_suite.detectors) if self.detector_suite else 0,
            'ensemble_count': len(self.ensembles)
        }
    
    def get_prediction_explanation(self, X, sample_idx=0):
        """Get explanation for a specific prediction."""
        if not self.is_fitted:
            raise ValueError("System must be fitted before explanation.")
        
        sample = X[sample_idx:sample_idx+1]
        explanations = {}
        
        for name, model in self.model_registry.items():
            if name != 'fallback':
                try:
                    prediction = model.predict(sample)[0]
                    explanations[name] = {
                        'prediction': 'Anomaly' if prediction == -1 else 'Normal',
                        'confidence': 'High' if name in self.performance_metrics and 
                                     self.performance_metrics[name]['f1_score'] > 0.8 else 'Medium'
                    }
                except Exception as e:
                    explanations[name] = {'error': str(e)}
        
        return explanations

print("✅ ProductionEnsembleSystem created!")
print("🏭 Ready for enterprise-grade ensemble anomaly detection!")
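
The consensus branch of `predict` relies on a strict majority vote over `-1`/`1` labels. The sketch below isolates that rule in a standalone function so its edge cases are easier to see; `majority_vote` is a hypothetical helper written for illustration, not a method of the class above.

```python
import numpy as np

def majority_vote(predictions):
    """Combine per-model labels (-1 = anomaly, 1 = normal) by strict majority.

    `predictions` is a list of equal-length 1-D arrays, one per model.
    Hypothetical helper for illustration only.
    """
    if len(predictions) == 0:
        raise ValueError("majority_vote needs at least one prediction array")
    stacked = np.asarray(predictions)              # shape: (n_models, n_samples)
    anomaly_votes = np.sum(stacked == -1, axis=0)  # anomaly votes per sample
    # Strict majority: with an even number of models, a tie counts as normal.
    return np.where(anomaly_votes > stacked.shape[0] / 2, -1, 1)

votes = [
    np.array([-1, -1,  1,  1]),
    np.array([-1,  1,  1, -1]),
    np.array([-1, -1,  1,  1]),
    np.array([ 1,  1,  1, -1]),
]
combined = majority_vote(votes)
print(combined)
```

With four voters, the second and fourth samples split 2-2 and are labelled normal. If ties should lean towards anomalies instead, `>` would become `>=`.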

## 🧪 Production System Demo

Let's demonstrate the complete production ensemble system.

In [None]:
# Initialize and test production system
print("🏭 PRODUCTION ENSEMBLE SYSTEM DEMO")
print("="*60)

# Custom configuration
production_config = {
    'contamination': 0.12,
    'ensemble_methods': [
        'majority_voting',
        'mean_averaging',
        'performance_weighted',
        'hierarchical'
    ],
    'enable_model_selection': True,
    'enable_monitoring': True,
    'enable_fallback': True,
    'performance_threshold': 0.6,
    'model_update_strategy': 'periodic',
    'backup_models': 3
}

# Initialize system
production_system = ProductionEnsembleSystem(production_config)

# Fit system with validation
production_system.fit(demo_data, demo_labels, validation_split=0.3)

# Test different prediction methods
print("\n🎯 Testing prediction methods...")

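# Reuse the first 100 samples for a quick check. They were also passed to fit(),
# so the scores below are optimistic rather than a held-out estimate.
test_sample = demo_data[:100]
test_labels = demo_labels[:100]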

# Best model prediction
best_predictions = production_system.predict(test_sample, method='best')
best_f1 = f1_score(
    (test_labels == -1).astype(int),
    (best_predictions == -1).astype(int),
    zero_division=0
)

# Consensus prediction
consensus_predictions = production_system.predict(test_sample, method='consensus')
consensus_f1 = f1_score(
    (test_labels == -1).astype(int),
    (consensus_predictions == -1).astype(int),
    zero_division=0
)

print(f"   Best Model F1-Score: {best_f1:.3f}")
print(f"   Consensus F1-Score: {consensus_f1:.3f}")

# Get system status
status = production_system.get_system_status()

print("\n📊 SYSTEM STATUS REPORT")
print("="*60)
print(f"🔄 System Status: {'Ready' if status['is_fitted'] else 'Not Ready'}")
print(f"🎯 Available Methods: {', '.join(status['available_methods'])}")
print(f"🔧 Total Detectors: {status['detector_count']}")
print(f"🎪 Ensemble Count: {status['ensemble_count']}")

print("\n📈 Performance Metrics:")
for method, metrics in status['performance_metrics'].items():
    if 'error' not in metrics:
        print(f"   {method:20} | F1: {metrics['f1_score']:.3f} | Status: {metrics['status']}")
    else:
        print(f"   {method:20} | Status: {metrics['status']} | Error: {metrics['error']}")

# Test prediction explanation
print("\n🔍 PREDICTION EXPLANATION (Sample 0)")
print("="*60)
explanation = production_system.get_prediction_explanation(test_sample, sample_idx=0)

for method, details in explanation.items():
    if 'error' not in details:
        print(f"   {method:20} | Prediction: {details['prediction']:8} | Confidence: {details['confidence']}")
    else:
        print(f"   {method:20} | Error: {details['error']}")

# Performance comparison
print("\n🏆 PRODUCTION VS INDIVIDUAL METHODS")
print("="*60)

# Compare with individual detector performance
individual_performance = detector_suite.evaluate_individual_performance(test_sample, test_labels)
avg_individual_f1 = np.mean([perf['f1_score'] for perf in individual_performance.values()])

print(f"   Average Individual F1: {avg_individual_f1:.3f}")
print(f"   Best Ensemble F1:      {best_f1:.3f}")
print(f"   Consensus Ensemble F1: {consensus_f1:.3f}")
if avg_individual_f1 > 0:
    print(f"   Improvement Factor:     {best_f1/avg_individual_f1:.2f}x")
else:
    print("   Improvement Factor:     n/a (average individual F1 is 0)")

print("\n✅ Production ensemble system demo completed successfully!")
print("🎉 System is ready for enterprise deployment!")

## 🎓 Key Takeaways and Best Practices

### 🔬 Ensemble Learning Principles
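- **Diversity is key**: Combine detectors that make different kinds of errors (for example tree-based, density-based, and boundary-based); near-identical detectors add little
- **Weighted combinations**: Weight detectors by validation performance when labels are available; otherwise start from equal weights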
- **Hierarchical architectures**: Group similar detectors before final combination
- **Dynamic adaptation**: Adjust ensemble weights based on data characteristics
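
The weighted-combination principle can be sketched in a few lines. The example assumes each detector exposes an anomaly score where higher means more anomalous and that a validation F1 per detector is already available; `weighted_score_ensemble` and the numbers are hypothetical and only illustrate the idea.

```python
import numpy as np

def weighted_score_ensemble(score_matrix, f1_scores):
    """Combine anomaly scores with weights proportional to validation F1.

    score_matrix: (n_models, n_samples), higher = more anomalous (assumed).
    f1_scores:    one validation F1 per model (assumed to be precomputed).
    """
    scores = np.asarray(score_matrix, dtype=float)
    # Min-max scale each model's scores to [0, 1] so they are comparable.
    lo = scores.min(axis=1, keepdims=True)
    span = scores.max(axis=1, keepdims=True) - lo
    span[span == 0] = 1.0                  # constant scores map to 0
    scaled = (scores - lo) / span

    weights = np.asarray(f1_scores, dtype=float)
    if weights.sum() == 0:
        weights = np.ones_like(weights)    # fall back to a plain mean
    weights = weights / weights.sum()
    return weights @ scaled                # shape: (n_samples,)

scores = [[0.1, 0.2, 0.9],       # detector A
          [10.0, 30.0, 20.0]]    # detector B, on a different scale
combined = weighted_score_ensemble(scores, f1_scores=[0.8, 0.2])
print(combined)
```

Scaling before weighting matters here: without it, detector B's larger raw values would dominate regardless of its weight.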

### 🎯 Ensemble Method Selection
- **Voting methods**: Simple and robust, good for diverse detector outputs
- **Averaging methods**: Suited to detectors that expose continuous scores; normalize the scores to a common scale before averaging
- **Stacking**: Most powerful but requires labeled data for training
- **Dynamic ensembles**: Adapt to changing data patterns automatically
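
To make the stacking trade-off concrete, here is a minimal sketch on synthetic data, assuming scikit-learn detectors as the base layer and a logistic regression as the meta-learner. The data, split sizes, and choice of detectors are illustrative assumptions, not the configuration used earlier in this notebook.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, size=(300, 2)),       # synthetic normal points
               rng.uniform(-6, 6, size=(30, 2))])     # synthetic anomalies
y = np.r_[np.zeros(300, dtype=int), np.ones(30, dtype=int)]  # 1 = anomaly

# One half trains the base detectors, the other half trains the meta-learner.
X_base, X_meta, y_base, y_meta = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

# Base detectors are fitted without labels.
base = [IsolationForest(random_state=0),
        LocalOutlierFactor(novelty=True)]
for det in base:
    det.fit(X_base)

def base_scores(X):
    # score_samples returns higher values for normal points, so negate it
    return np.column_stack([-det.score_samples(X) for det in base])

# The meta-learner is the only component that needs labels.
meta = LogisticRegression(class_weight="balanced")
meta.fit(base_scores(X_meta), y_meta)
anomaly_proba = meta.predict_proba(base_scores(X_meta))[:, 1]
```

Scoring the meta-learner on `X_meta` itself would be optimistic; a third held-out split is needed for an honest estimate.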

### 🏭 Production Considerations
- **Model validation**: Always validate ensemble performance on held-out data
- **Fallback strategies**: Include simple, robust models as fallbacks
- **Monitoring systems**: Track ensemble performance over time
- **Interpretability**: Provide explanations for ensemble decisions
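
Labels are rarely available in production, so one label-free way to approach the monitoring point is to watch the fraction of samples flagged as anomalies and compare it with the expected contamination. The sketch below assumes the `-1`/`1` label convention used in this notebook; `AnomalyRateMonitor`, its window size, and its tolerance are hypothetical choices.

```python
from collections import deque

class AnomalyRateMonitor:
    """Track the flagged-anomaly rate over a sliding window of predictions."""

    def __init__(self, expected_rate, window=500, tolerance=2.0):
        self.expected_rate = expected_rate
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)   # 1 = flagged, 0 = not flagged

    def update(self, predictions):
        # predictions follow the -1 = anomaly, 1 = normal convention
        self.recent.extend(int(p == -1) for p in predictions)
        return self.rate()

    def rate(self):
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def drifted(self):
        # Flag drift when the observed rate is more than `tolerance` times
        # above or below the expected rate.
        r = self.rate()
        return (r > self.expected_rate * self.tolerance
                or r < self.expected_rate / self.tolerance)

monitor = AnomalyRateMonitor(expected_rate=0.12, window=100)
first_rate = monitor.update([1] * 88 + [-1] * 12)
first_alert = monitor.drifted()
second_rate = monitor.update([-1] * 50)      # sudden burst of anomalies
second_alert = monitor.drifted()
print(first_rate, first_alert, second_rate, second_alert)
```

A drifting anomaly rate does not prove the ensemble is wrong, but it is a cheap signal that the data or the models deserve a closer look.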

### ⚡ Performance Optimization
- **Selective ensembles**: Use only high-performing base detectors
- **Efficient voting**: Pre-compute weights and use vectorized operations
- **Parallel prediction**: Run base detectors in parallel when possible
- **Resource management**: Balance accuracy vs computational efficiency
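
The parallel-prediction and vectorized-voting points can be combined in one short sketch. It assumes already-fitted scikit-learn detectors and uses the standard library's `ThreadPoolExecutor`; the detectors, weights, and data are illustrative, and whether threads actually help depends on the detectors involved, so it is worth timing on real workloads.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = rng.normal(size=(500, 3))   # synthetic training data
X_new = rng.normal(size=(50, 3))      # synthetic batch to score

detectors = [
    IsolationForest(random_state=0).fit(X_train),
    LocalOutlierFactor(novelty=True).fit(X_train),
    OneClassSVM(nu=0.1).fit(X_train),
]
weights = np.array([0.5, 0.3, 0.2])   # illustrative weights, computed once

# Run each detector's predict() in its own thread.
with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
    preds = np.array(list(pool.map(lambda d: d.predict(X_new), detectors)))

# Weighted vote in one matrix product: total weight of models voting "anomaly".
anomaly_weight = weights @ (preds == -1).astype(float)
labels = np.where(anomaly_weight > 0.5, -1, 1)
```

Because the weights sum to 1, the `0.5` threshold is a weighted majority; a lower threshold trades precision for recall.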

## 🔗 Next Steps

Continue your learning journey with:
- [Real-Time Streaming Detection](07_real_time_streaming_detection.ipynb)
- [Model Explainability Tutorial](08_model_explainability_tutorial.ipynb)
- [Production Deployment Guide](09_production_deployment_guide.ipynb)

## 🆘 Getting Help

Having trouble with ensemble methods? Check out:
- [API Documentation](../api.md) for detailed function references
- [Algorithm Guide](../algorithms.md) for understanding base detectors
- [Performance Guide](../performance.md) for optimization strategies

---

**🎉 Congratulations!** You've worked through advanced ensemble methods for anomaly detection. You can now build production-oriented ensemble systems that can outperform individual detectors while remaining robust and interpretable.