# 🚀 Comprehensive Federated Learning Experiments with `run_extensive_experiments.py`

This notebook is a guide to `run_extensive_experiments.py`, the script this framework uses to run large-scale federated learning experiments.

## 📚 What You'll Learn

- How to configure and run extensive experiments
- Understanding the experiment matrix (strategies × attacks × datasets)
- Working with the enhanced experiment runner
- Analyzing and visualizing comprehensive results
- Best practices for large-scale FL research

## 🎯 Why Use `run_extensive_experiments.py`?

This script is designed for serious federated learning research that requires:
- **Systematic comparison** of multiple FL strategies
- **Robustness evaluation** under various attack scenarios
- **Statistical significance** through multiple experimental runs
- **Checkpoint/resume** functionality for long-running experiments
- **Parallel execution** for efficient resource utilization

In [None]:
import sys
import os
import subprocess
import time
import threading
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from datetime import datetime
import yaml
import json

# Set up plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Add framework to path
framework_root = Path().absolute().parent
sys.path.insert(0, str(framework_root))

print(f"🔧 Framework root: {framework_root}")
print(f"📁 Current working directory: {Path.cwd()}")

# Check if run_extensive_experiments.py exists
extensive_script = framework_root / "experiment_runners" / "run_extensive_experiments.py"
print(f"✅ Extensive experiments script exists: {extensive_script.exists()}")

## 🔍 Understanding the Experiment Matrix

The `run_extensive_experiments.py` script automatically generates a comprehensive matrix of experiments by combining:

### 🧠 Federated Learning Strategies (17 listed)
1. **Core Strategies**: FedAvg, FedAvgM, FedProx, FedNova, SCAFFOLD, FedAdam
2. **Byzantine-Robust**: Krum, TrimmedMean, Bulyan
3. **Flower Baselines**: DASHA, DepthFL, HeteroFL, FedMeta, FedPer, FjORD, FLANDERS, FedOpt

### ⚔️ Attack Scenarios (7 configurations)
1. **none** - Clean baseline
2. **noise** - Gaussian noise injection
3. **missed** - Client participation failures
4. **failure** - Random client failures
5. **asymmetry** - Data distribution asymmetry
6. **labelflip** - Label flipping attacks
7. **gradflip** - Gradient flipping attacks

### 📊 Datasets (3 available)
- **MNIST** - Handwritten digits (28×28, 10 classes)
- **FMNIST** - Fashion items (28×28, 10 classes)
- **CIFAR10** - Natural images (32×32×3, 10 classes)

In [None]:
# Let's calculate the total number of experiments
strategies = ['fedavg', 'fedavgm', 'fedprox', 'fednova', 'scaffold', 'fedadam', 
              'krum', 'trimmedmean', 'bulyan', 'dasha', 'depthfl', 'heterofl', 
              'fedmeta', 'fedper', 'fjord', 'flanders', 'fedopt']

attacks = ['none', 'noise', 'missed', 'failure', 'asymmetry', 'labelflip', 'gradflip']

datasets = ['MNIST', 'FMNIST', 'CIFAR10']

total_configs = len(strategies) * len(attacks) * len(datasets)

print(f"🧮 Experiment Matrix Calculation:")
print(f"   Strategies: {len(strategies)}")
print(f"   Attacks: {len(attacks)}")
print(f"   Datasets: {len(datasets)}")
print(f"   Total configurations: {total_configs}")
print(f"\n📈 With multiple runs (e.g., 10 runs per config):")
print(f"   Total experiments: {total_configs * 10:,}")
print(f"   Estimated sequential time (assuming 5 min/experiment): {(total_configs * 10 * 5) / 60:.1f} hours")

# Display some example combinations
print(f"\n🎯 Example experiment combinations:")
examples = [
    ('fedavg', 'none', 'MNIST', 'Baseline performance'),
    ('fedprox', 'noise', 'CIFAR10', 'Robust strategy under noise'),
    ('krum', 'labelflip', 'FMNIST', 'Byzantine-robust vs malicious attack'),
    ('scaffold', 'asymmetry', 'MNIST', 'Advanced strategy vs data heterogeneity')
]

for strategy, attack, dataset, description in examples:
    print(f"   {strategy} + {attack} + {dataset}: {description}")
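
The count above can also be expanded into the concrete list of runs. The following is a minimal sketch using `itertools.product`; the helper name `enumerate_experiments` and the dictionary keys are illustrative assumptions, not part of `run_extensive_experiments.py`.

```python
from itertools import product

def enumerate_experiments(strategies, attacks, datasets, num_runs=1):
    """Yield one dict per (strategy, attack, dataset, run) combination."""
    for strategy, attack, dataset, run in product(
        strategies, attacks, datasets, range(num_runs)
    ):
        yield {"strategy": strategy, "attack": attack, "dataset": dataset, "run": run}

# Small illustrative subset of the lists defined above
demo = list(enumerate_experiments(["fedavg", "krum"], ["none", "labelflip"], ["MNIST"], num_runs=2))
print(len(demo))  # 2 strategies x 2 attacks x 1 dataset x 2 runs = 8
print(demo[0])    # {'strategy': 'fedavg', 'attack': 'none', 'dataset': 'MNIST', 'run': 0}
```

Generating the list lazily keeps memory use flat even for the full matrix, and the same generator can feed a progress bar or a work queue.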

## 🏃‍♂️ Running Your First Extensive Experiment

Let's start with a small-scale test to understand how the script works:

In [None]:
def run_extensive_experiments_test():
    """Run a small test version of extensive experiments."""
    
    # Change to framework directory
    original_cwd = os.getcwd()
    os.chdir(framework_root)
    
    try:
        # Build command for test run
        cmd = [
            sys.executable,
            "experiment_runners/run_extensive_experiments.py",
            "--num-runs", "1",  # Just 1 run for testing
            "--test-mode",       # Enable test mode (fewer configurations)
            "--timeout", "300"   # 5 minute timeout per experiment
        ]
        
        print(f"🚀 Running test command: {' '.join(cmd)}")
        print("\n" + "="*60)
        print("🧪 Starting Test Extensive Experiments...")
        print("="*60)
        
        # Execute with real-time output
        process = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, 
            text=True, bufsize=1, universal_newlines=True
        )
        
        # Print output in real-time
        for line in process.stdout:
            print(line.rstrip())
        
        # wait() without a timeout blocks until the child exits and returns its exit code
        return_code = process.wait()
        
        # Report success only when the script exited cleanly
        status = "✅" if return_code == 0 else "❌"
        print(f"\n{status} Test finished with return code: {return_code}")
        return return_code == 0
        
    except Exception as e:
        print(f"❌ Error running test: {e}")
        return False
    finally:
        os.chdir(original_cwd)

# Uncomment the next line to run the test (may take several minutes)
# run_extensive_experiments_test()
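
`Popen.wait()` called without a `timeout` argument blocks until the child process exits, so the cell above relies entirely on the script's own `--timeout` flag. If a hard wall-clock limit on the notebook side is wanted, one option is `subprocess.run(..., timeout=...)`. The sketch below is an assumption about how that could look; `run_with_timeout` is a hypothetical helper, and it trades real-time streaming for captured output.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout_s):
    """Run cmd; return (returncode, output), with returncode None on timeout."""
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,  # subprocess.run kills the child when this expires
        )
        return completed.returncode, completed.stdout
    except subprocess.TimeoutExpired as exc:
        # exc.output holds whatever was captured before the kill (may be None or bytes)
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return None, partial

code, out = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_s=30)
print(code, out.strip())

code, _ = run_with_timeout([sys.executable, "-c", "import time; time.sleep(10)"], timeout_s=1)
print(code)  # None: the child was stopped after 1 second
```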

## ⚙️ Advanced Configuration Options

The `run_extensive_experiments.py` script supports many advanced options:

In [None]:
# Let's explore the command-line options
def show_extensive_experiments_help():
    """Display help for the extensive experiments script."""
    
    original_cwd = os.getcwd()
    os.chdir(framework_root)
    
    try:
        result = subprocess.run([
            sys.executable,
            "experiment_runners/run_extensive_experiments.py",
            "--help"
        ], capture_output=True, text=True)
        
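        print("📋 Available command-line options:")
        print("=" * 50)
        print(result.stdout)
        # A missing script or an import error is reported on stderr, not stdout
        if result.returncode != 0:
            print(f"⚠️ Help command exited with code {result.returncode}:")
            print(result.stderr)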
        
    except Exception as e:
        print(f"❌ Error getting help: {e}")
    finally:
        os.chdir(original_cwd)

show_extensive_experiments_help()
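
The help text is also a convenient way to check which flags the local version of the script accepts before building longer commands. The sketch below extracts long option names with a regular expression; `extract_flags`, `unsupported_flags`, and the sample help text are illustrative assumptions and do not reflect the script's real option list.

```python
import re

def extract_flags(help_text):
    """Return the sorted set of long option names (e.g. '--num-runs') found in help text."""
    return sorted(set(re.findall(r"(?<![\w-])--[a-zA-Z][\w-]*", help_text)))

def unsupported_flags(wanted, help_text):
    """Return the flags in `wanted` that do not appear in the help text."""
    available = set(extract_flags(help_text))
    return [flag for flag in wanted if flag not in available]

# Hypothetical help output, for illustration only
sample_help = """usage: demo.py [-h] [--num-runs NUM_RUNS] [--test-mode]

options:
  -h, --help           show this help message and exit
  --num-runs NUM_RUNS  number of repetitions
  --test-mode          run a reduced matrix
"""
print(extract_flags(sample_help))                                   # ['--help', '--num-runs', '--test-mode']
print(unsupported_flags(["--num-runs", "--parallel"], sample_help))  # ['--parallel']
```

In practice, `result.stdout` from the help call would take the place of `sample_help`.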

## 🛠️ Custom Configuration Examples

Here are some practical examples of how to run extensive experiments with different configurations:

In [None]:
# Example configurations for different research scenarios

def create_experiment_commands():
    """Create example commands for different research scenarios."""
    
    commands = {
        "Quick Test": [
            "python experiment_runners/run_extensive_experiments.py",
            "--num-runs 1",
            "--test-mode",
            "--timeout 300"
        ],
        
        "Statistical Analysis": [
            "python experiment_runners/run_extensive_experiments.py",
            "--num-runs 10",
            "--parallel",
            "--max-workers 4"
        ],
        
        "Byzantine Robustness Study": [
            "python experiment_runners/run_extensive_experiments.py",
            "--num-runs 5",
            "--strategies krum,trimmedmean,bulyan,fedavg",
            "--attacks labelflip,gradflip,none",
            "--datasets MNIST,CIFAR10"
        ],
        
        "Communication Efficiency": [
            "python experiment_runners/run_extensive_experiments.py",
            "--num-runs 3",
            "--strategies fedavg,fednova,scaffold",
            "--attacks none,asymmetry",
            "--rounds 20",
            "--clients 20"
        ],
        
        "Resume from Checkpoint": [
            "python experiment_runners/run_extensive_experiments.py",
            "--resume",
            "--checkpoint-dir enhanced_experiment_results/checkpoints"
        ]
    }
    
    print("🎯 Example Experiment Commands:\n")
    
    for scenario, cmd_parts in commands.items():
        print(f"**{scenario}:**")
        print(f"```bash")
        print(" ".join(cmd_parts))
        print(f"```\n")
        
    return commands

example_commands = create_experiment_commands()
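
The command parts above are written for display, so each entry may hold a flag and its value in one string. To launch one of them from Python, the parts need to become an argv list. This is a small sketch using `shlex.split`; `to_argv` is a hypothetical helper introduced here for illustration.

```python
import shlex
import sys

def to_argv(cmd_parts):
    """Turn ['python script.py', '--num-runs 1'] into an argv list for subprocess."""
    argv = shlex.split(" ".join(cmd_parts))
    # Use the current interpreter instead of whatever 'python' resolves to on PATH
    if argv and argv[0] == "python":
        argv[0] = sys.executable
    return argv

argv = to_argv([
    "python experiment_runners/run_extensive_experiments.py",
    "--num-runs 1",
    "--test-mode",
])
print(argv[1:])
# ['experiment_runners/run_extensive_experiments.py', '--num-runs', '1', '--test-mode']
```

Passing a list to `subprocess.run` avoids `shell=True` and the quoting problems that come with it.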

## 📊 Understanding the Output Structure

The extensive experiments generate comprehensive results in a structured format:

In [None]:
# Let's examine the expected output structure
def show_expected_output_structure():
    """Display the expected output structure from extensive experiments."""
    
    print("📁 Expected Output Directory Structure:")
    print("="*50)
    print("""
enhanced_experiment_results/
├── final_results_YYYYMMDD_HHMMSS.csv        # Main results file
├── final_results_YYYYMMDD_HHMMSS.json       # Backup in JSON format
├── final_results_YYYYMMDD_HHMMSS.csv.gz     # Compressed backup
├── intermediate_results_*.csv                # Periodic saves
├── enhanced_experiment_runner.log            # Detailed execution log
└── checkpoints/
    ├── checkpoint_YYYYMMDD_HHMMSS.yaml      # Experiment state
    └── experiments_status.json              # Progress tracking
    """)
    
    print("\n📊 CSV Results Columns:")
    print("="*30)
    columns = [
        ('algorithm', 'Federated learning strategy used'),
        ('attack', 'Attack type applied (includes parameters)'),
        ('dataset', 'Dataset used for training'),
        ('run', 'Run number (for statistical analysis)'),
        ('client_id', 'Client identifier (-1 for server metrics)'),
        ('round', 'Federated learning round number'),
        ('metric', 'Type of metric (loss, accuracy, precision, etc.)'),
        ('value', 'Actual metric value')
    ]
    
    for col, desc in columns:
        print(f"  {col:<12}: {desc}")

show_expected_output_structure()
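
Because the CSV is in long format (one row per metric value), it is often easier to work with after pivoting so that each metric has its own column. The sketch below assumes the column names listed above; `to_wide` and the tiny example frame are illustrative only.

```python
import pandas as pd

def to_wide(df):
    """Pivot long-format results so each metric becomes its own column."""
    keys = ["algorithm", "attack", "dataset", "run", "client_id", "round"]
    wide = df.pivot_table(index=keys, columns="metric", values="value", aggfunc="mean")
    wide.columns.name = None  # drop the 'metric' label left over from the pivot
    return wide.reset_index()

# Hand-made rows for illustration, not real results
long_df = pd.DataFrame({
    "algorithm": ["fedavg"] * 4,
    "attack": ["none"] * 4,
    "dataset": ["MNIST"] * 4,
    "run": [0] * 4,
    "client_id": [-1] * 4,
    "round": [1, 1, 2, 2],
    "metric": ["accuracy", "loss", "accuracy", "loss"],
    "value": [0.71, 1.20, 0.78, 0.95],
})
wide_df = to_wide(long_df)
print(wide_df[["round", "accuracy", "loss"]])
```

`aggfunc="mean"` only matters if the same key appears more than once; with unique keys it returns the values unchanged.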

## 📈 Analyzing Extensive Experiment Results

Once you have results, here's how to analyze them effectively:

In [None]:
def analyze_extensive_results(results_file=None):
    """Analyze results from extensive experiments."""
    
    # If no specific file provided, look for the most recent one
    if results_file is None:
        results_dir = framework_root / "enhanced_experiment_results"
        if results_dir.exists():
            csv_files = list(results_dir.glob("final_results_*.csv"))
            if csv_files:
                results_file = max(csv_files, key=lambda x: x.stat().st_mtime)
                print(f"📊 Using most recent results file: {results_file.name}")
            else:
                print("⚠️ No results files found. Run experiments first.")
                return create_sample_results_for_demo()
        else:
            print("⚠️ Results directory not found. Creating sample data for demonstration.")
            return create_sample_results_for_demo()
    
    try:
        # Load the results
        df = pd.read_csv(results_file)
        print(f"✅ Loaded {len(df)} records from {results_file}")
        
    except Exception as e:
        print(f"❌ Error loading results: {e}")
        return create_sample_results_for_demo()
    
    return df

def create_sample_results_for_demo():
    """Create sample results data for demonstration purposes."""
    
    print("🎭 Creating sample results for demonstration...")
    
    # Generate realistic sample data
    np.random.seed(42)
    
    strategies = ['fedavg', 'fedprox', 'krum', 'scaffold']
    attacks = ['none', 'noise', 'labelflip']
    datasets = ['MNIST', 'CIFAR10']
    metrics = ['accuracy', 'loss']
    
    data = []
    
    for strategy in strategies:
        for attack in attacks:
            for dataset in datasets:
                for run in range(3):  # 3 runs per configuration
                    for round_num in range(1, 11):  # 10 rounds
                        # Simulate realistic accuracy progression
                        base_acc = 0.7 if dataset == 'MNIST' else 0.6
                        if attack == 'labelflip':
                            base_acc *= 0.8  # Attack degrades performance
                        if strategy == 'krum' and attack != 'none':
                            base_acc *= 1.1  # Robust strategy helps under attack
                        
                        final_acc = base_acc + (round_num * 0.03) + np.random.normal(0, 0.02)
                        final_acc = max(0.1, min(0.95, final_acc))  # Clamp to realistic range
                        
                        data.append({
                            'algorithm': strategy,
                            'attack': attack,
                            'dataset': dataset,
                            'run': run,
                            'client_id': -1,  # Server metric
                            'round': round_num,
                            'metric': 'accuracy',
                            'value': final_acc
                        })
                        
                        # Add corresponding loss
                        loss = 2.0 - (final_acc * 1.8) + np.random.normal(0, 0.1)
                        loss = max(0.1, loss)
                        
                        data.append({
                            'algorithm': strategy,
                            'attack': attack,
                            'dataset': dataset,
                            'run': run,
                            'client_id': -1,
                            'round': round_num,
                            'metric': 'loss',
                            'value': loss
                        })
    
    df = pd.DataFrame(data)
    print(f"✅ Generated {len(df)} sample records")
    return df

# Load or create sample results
results_df = analyze_extensive_results()

# Display basic statistics
print(f"\n📊 Dataset Overview:")
print(f"  Total records: {len(results_df):,}")
print(f"  Unique strategies: {results_df['algorithm'].nunique()}")
print(f"  Unique attacks: {results_df['attack'].nunique()}")
print(f"  Unique datasets: {results_df['dataset'].nunique()}")
print(f"  Metrics tracked: {results_df['metric'].nunique()}")
print(f"  Runs per configuration (distinct run ids): {results_df['run'].nunique()}")

# Show data structure
print(f"\n📋 Sample Data:")
print(results_df.head(10))

## 📊 Comprehensive Results Visualization

Let's create comprehensive visualizations to understand the experimental results:

In [None]:
def create_comprehensive_visualizations(df):
    """Create comprehensive visualizations of the experimental results."""
    
    # Set up the plotting environment
    plt.rcParams['figure.figsize'] = (15, 10)
    
    # 1. Strategy Performance Comparison
    fig, axes = plt.subplots(2, 2, figsize=(18, 12))
    # Emoji are left out of the title: matplotlib's default font has no glyph for them
    fig.suptitle('Comprehensive Federated Learning Experiment Results', fontsize=16, fontweight='bold')
    
    # Filter for final round accuracy
    final_acc = df[(df['metric'] == 'accuracy') & (df['round'] == df['round'].max())]
    
    # Plot 1: Strategy performance across attacks
    strategy_performance = final_acc.groupby(['algorithm', 'attack'])['value'].mean().unstack()
    strategy_performance.plot(kind='bar', ax=axes[0,0], width=0.8)
    axes[0,0].set_title('Strategy Performance Under Different Attacks')
    axes[0,0].set_ylabel('Final Accuracy')
    axes[0,0].legend(title='Attack Type', bbox_to_anchor=(1.05, 1), loc='upper left')
    axes[0,0].tick_params(axis='x', rotation=45)
    
    # Plot 2: Dataset-specific performance
    dataset_performance = final_acc.groupby(['dataset', 'algorithm'])['value'].mean().unstack()
    dataset_performance.plot(kind='bar', ax=axes[0,1], width=0.8)
    axes[0,1].set_title('Strategy Performance by Dataset')
    axes[0,1].set_ylabel('Final Accuracy')
    axes[0,1].legend(title='Strategy', bbox_to_anchor=(1.05, 1), loc='upper left')
    axes[0,1].tick_params(axis='x', rotation=0)
    
    # Plot 3: Learning curves for top strategies
    acc_data = df[df['metric'] == 'accuracy']
    top_strategies = final_acc.groupby('algorithm')['value'].mean().nlargest(4).index
    
    for strategy in top_strategies:
        strategy_data = acc_data[(acc_data['algorithm'] == strategy) & (acc_data['attack'] == 'none')]
        learning_curve = strategy_data.groupby('round')['value'].mean()
        axes[1,0].plot(learning_curve.index, learning_curve.values, marker='o', label=strategy, linewidth=2)
    
    axes[1,0].set_title('Learning Curves (No Attack Scenario)')
    axes[1,0].set_xlabel('Federated Round')
    axes[1,0].set_ylabel('Accuracy')
    axes[1,0].legend()
    axes[1,0].grid(True, alpha=0.3)
    
    # Plot 4: Robustness analysis (performance degradation under attacks)
    baseline_perf = final_acc[final_acc['attack'] == 'none'].groupby('algorithm')['value'].mean()
    attack_perf = final_acc[final_acc['attack'] != 'none'].groupby(['algorithm', 'attack'])['value'].mean()
    
    robustness_data = []
    for strategy in baseline_perf.index:
        baseline = baseline_perf[strategy]
        for attack in final_acc['attack'].unique():
            if attack != 'none' and (strategy, attack) in attack_perf.index:
                degradation = baseline - attack_perf[(strategy, attack)]
                robustness_data.append({
                    'Strategy': strategy,
                    'Attack': attack,
                    'Performance Degradation': degradation
                })
    
    robustness_df = pd.DataFrame(robustness_data)
    if not robustness_df.empty:
        robustness_pivot = robustness_df.pivot(index='Strategy', columns='Attack', values='Performance Degradation')
        sns.heatmap(robustness_pivot, annot=True, fmt='.3f', cmap='RdYlBu_r', ax=axes[1,1], cbar_kws={'label': 'Performance Degradation'})
        axes[1,1].set_title('Strategy Robustness Analysis\n(Lower values = more robust)')
    
    plt.tight_layout()
    plt.show()
    
    return strategy_performance, dataset_performance, robustness_df

# Create visualizations
strategy_perf, dataset_perf, robustness_data = create_comprehensive_visualizations(results_df)
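
The plots above select final-round rows with `df['round'] == df['round'].max()`, which uses one global maximum. If some runs stop early (for example after a timeout), their last round is lower and they drop out of the comparison. A possible alternative is to take the last round of each run separately. The sketch below assumes one row per round within each run (such as the server-side rows); `final_round_rows` is a hypothetical helper.

```python
import pandas as pd

def final_round_rows(df, metric="accuracy"):
    """Return, for each (algorithm, attack, dataset, run), the row of its own last round."""
    subset = df[df["metric"] == metric]
    keys = ["algorithm", "attack", "dataset", "run"]
    # idxmax gives the index label of the highest round within each group
    last_idx = subset.groupby(keys)["round"].idxmax()
    return subset.loc[last_idx].reset_index(drop=True)

# Hand-made rows: fedavg stops after round 2, krum reaches round 3
demo = pd.DataFrame({
    "algorithm": ["fedavg", "fedavg", "krum", "krum", "krum"],
    "attack": ["none"] * 5,
    "dataset": ["MNIST"] * 5,
    "run": [0] * 5,
    "round": [1, 2, 1, 2, 3],
    "metric": ["accuracy"] * 5,
    "value": [0.70, 0.75, 0.68, 0.74, 0.79],
})
print(final_round_rows(demo)[["algorithm", "round", "value"]])
```

With the global-maximum filter, `fedavg` would be missing from this example because it never reaches round 3.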

## 📈 Statistical Analysis and Insights

Let's perform detailed statistical analysis of the results:

In [None]:
def perform_statistical_analysis(df):
    """Perform comprehensive statistical analysis of the experimental results."""
    
    print("📊 STATISTICAL ANALYSIS REPORT")
    print("=" * 50)
    
    # 1. Overall Performance Summary
    final_acc = df[(df['metric'] == 'accuracy') & (df['round'] == df['round'].max())]
    
    print("\n🎯 OVERALL PERFORMANCE SUMMARY")
    print("-" * 35)
    
    # Best performing strategies
    strategy_means = final_acc.groupby('algorithm')['value'].agg(['mean', 'std', 'count'])
    strategy_means = strategy_means.sort_values('mean', ascending=False)
    
    print("\n🏆 Top 5 Strategies (by mean accuracy):")
    for i, (strategy, row) in enumerate(strategy_means.head().iterrows(), 1):
        print(f"  {i}. {strategy:<12}: {row['mean']:.3f} ± {row['std']:.3f} (n={int(row['count'])})")
    
    # 2. Attack Impact Analysis
    print("\n\n⚔️ ATTACK IMPACT ANALYSIS")
    print("-" * 30)
    
    attack_impact = final_acc.groupby('attack')['value'].agg(['mean', 'std', 'count'])
    attack_impact = attack_impact.sort_values('mean', ascending=False)
    
    print("\nAttack severity ranking (higher accuracy = less severe):")
    for i, (attack, row) in enumerate(attack_impact.iterrows(), 1):
        print(f"  {i}. {attack:<12}: {row['mean']:.3f} ± {row['std']:.3f} (n={int(row['count'])})")
    
    # 3. Dataset Difficulty Analysis
    print("\n\n📊 DATASET DIFFICULTY ANALYSIS")
    print("-" * 32)
    
    dataset_difficulty = final_acc.groupby('dataset')['value'].agg(['mean', 'std', 'count'])
    dataset_difficulty = dataset_difficulty.sort_values('mean', ascending=False)
    
    print("\nDataset difficulty ranking (higher accuracy = easier):")
    for i, (dataset, row) in enumerate(dataset_difficulty.iterrows(), 1):
        print(f"  {i}. {dataset:<12}: {row['mean']:.3f} ± {row['std']:.3f} (n={int(row['count'])})")
    
    # 4. Robustness Rankings
    print("\n\n🛡️ ROBUSTNESS ANALYSIS")
    print("-" * 25)
    
    # Calculate robustness score for each strategy
    baseline_scores = final_acc[final_acc['attack'] == 'none'].groupby('algorithm')['value'].mean()
    attack_scores = final_acc[final_acc['attack'] != 'none'].groupby('algorithm')['value'].mean()
    
    robustness_scores = []
    for strategy in baseline_scores.index:
        if strategy in attack_scores.index:
            baseline = baseline_scores[strategy]
            under_attack = attack_scores[strategy]
            robustness = under_attack / baseline  # Ratio of performance retention
            robustness_scores.append((strategy, robustness, baseline, under_attack))
    
    robustness_scores.sort(key=lambda x: x[1], reverse=True)
    
    print("\nStrategy robustness ranking (performance retention under attacks):")
    for i, (strategy, robustness, baseline, under_attack) in enumerate(robustness_scores, 1):
        print(f"  {i}. {strategy:<12}: {robustness:.1%} retention ({baseline:.3f} → {under_attack:.3f})")
    
    # 5. Efficiency Analysis (Learning Speed)
    print("\n\n⚡ LEARNING EFFICIENCY ANALYSIS")
    print("-" * 35)
    
    # Calculate how quickly each strategy reaches 80% of its final performance
    efficiency_scores = []
    
    for strategy in df['algorithm'].unique():
        strategy_data = df[(df['algorithm'] == strategy) & (df['metric'] == 'accuracy') & (df['attack'] == 'none')]
        if len(strategy_data) > 0:
            final_perf = strategy_data[strategy_data['round'] == strategy_data['round'].max()]['value'].mean()
            target_perf = final_perf * 0.8
            
            # Find the round where 80% performance is reached
            round_performance = strategy_data.groupby('round')['value'].mean()
            rounds_to_target = None
            
            for round_num, perf in round_performance.items():
                if perf >= target_perf:
                    rounds_to_target = round_num
                    break
            
            if rounds_to_target is not None:  # round 0 is a valid answer
                efficiency_scores.append((strategy, rounds_to_target, final_perf))
    
    efficiency_scores.sort(key=lambda x: x[1])  # Sort by rounds needed (fewer = more efficient)
    
    print("\nStrategy efficiency ranking (rounds to reach 80% of final performance):")
    for i, (strategy, rounds, final_perf) in enumerate(efficiency_scores, 1):
        print(f"  {i}. {strategy:<12}: {rounds} rounds (final: {final_perf:.3f})")
    
    return {
        'strategy_performance': strategy_means,
        'attack_impact': attack_impact,
        'dataset_difficulty': dataset_difficulty,
        'robustness_scores': robustness_scores,
        'efficiency_scores': efficiency_scores
    }

# Perform statistical analysis
analysis_results = perform_statistical_analysis(results_df)
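
The report above prints mean ± standard deviation. With only a handful of runs per configuration, a confidence interval for the mean is often more informative. The following is a minimal sketch of a percentile bootstrap using NumPy only; `bootstrap_ci`, its defaults, and the accuracy values are assumptions for illustration, not output of the framework.

```python
import numpy as np

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample with replacement and record the mean of each resample
    samples = rng.choice(values, size=(n_boot, values.size), replace=True)
    boot_means = samples.mean(axis=1)
    low, high = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return values.mean(), low, high

# Hypothetical final-round accuracies from 10 runs of one configuration
accs = [0.91, 0.93, 0.90, 0.92, 0.94, 0.89, 0.92, 0.93, 0.91, 0.90]
mean, low, high = bootstrap_ci(accs)
print(f"mean={mean:.3f}, 95% CI=[{low:.3f}, {high:.3f}]")
```

To apply it to real results, pass the final-round accuracy values of a single (algorithm, attack, dataset) group, one value per run.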

## 🎯 Research Recommendations

Based on the analysis, here are actionable recommendations for federated learning research:

In [None]:
def generate_research_recommendations(analysis_results):
    """Generate actionable research recommendations based on the analysis."""
    
    print("🎯 RESEARCH RECOMMENDATIONS")
    print("=" * 35)
    
    # Get top performing strategies
    top_strategies = analysis_results['strategy_performance'].head(3).index.tolist()
    most_robust = [x[0] for x in analysis_results['robustness_scores'][:3]]
    most_efficient = [x[0] for x in analysis_results['efficiency_scores'][:3]]
    
    recommendations = [
        {
            'category': '📈 Performance Optimization',
            'recommendations': [
                f"Consider {', '.join(top_strategies)} as primary strategies for high-accuracy scenarios",
                "Implement hyperparameter tuning for top-performing strategies",
                "Investigate ensemble methods combining multiple top strategies"
            ]
        },
        {
            'category': '🛡️ Security & Robustness',
            'recommendations': [
                f"Use {', '.join(most_robust)} for adversarial environments",
                "Implement early attack detection mechanisms",
                "Develop adaptive defense strategies that switch based on detected threats"
            ]
        },
        {
            'category': '⚡ Efficiency & Scalability',
            'recommendations': [
                f"Choose {', '.join(most_efficient)} for resource-constrained scenarios",
                "Implement adaptive learning rates based on convergence monitoring",
                "Consider client sampling strategies to reduce communication overhead"
            ]
        },
        {
            'category': '🔬 Future Research Directions',
            'recommendations': [
                "Investigate personalized federated learning approaches",
                "Develop privacy-preserving techniques beyond differential privacy",
                "Study federated learning in edge computing environments",
                "Explore cross-silo federated learning with different organizations"
            ]
        },
        {
            'category': '📊 Experimental Best Practices',
            'recommendations': [
                "Run multiple repetitions per configuration (e.g., 10) so that run-to-run variance can be estimated",
                "Include confidence intervals in all performance reports",
                "Test multiple datasets to ensure generalizability",
                "Document all hyperparameters and experimental conditions"
            ]
        }
    ]
    
    for rec_group in recommendations:
        print(f"\n{rec_group['category']}")
        print("-" * len(rec_group['category']))
        for i, rec in enumerate(rec_group['recommendations'], 1):
            print(f"  {i}. {rec}")
    
    # Configuration recommendations
    print(f"\n\n⚙️ CONFIGURATION RECOMMENDATIONS")
    print("-" * 35)
    
    config_recommendations = {
        "High-Performance Setup": {
            "strategies": top_strategies,
            "num_rounds": "15-20",
            "num_clients": "20-50",
            "use_case": "When accuracy is the primary concern"
        },
        "Robust Setup": {
            "strategies": most_robust,
            "num_rounds": "20-30",
            "num_clients": "10-20",
            "use_case": "When operating in adversarial environments"
        },
        "Efficient Setup": {
            "strategies": most_efficient,
            "num_rounds": "10-15",
            "num_clients": "5-15",
            "use_case": "When minimizing communication overhead is critical"
        }
    }
    
    for setup_name, config in config_recommendations.items():
        print(f"\n{setup_name}:")
        print(f"  Strategies: {', '.join(config['strategies'])}")
        print(f"  Rounds: {config['num_rounds']}")
        print(f"  Clients: {config['num_clients']}")
        print(f"  Use case: {config['use_case']}")

# Generate recommendations
generate_research_recommendations(analysis_results)

## 🚀 Running Production Experiments

Now that you understand the framework, here's how to run production-scale experiments:

In [None]:
def create_production_experiment_script():
    """Create a production-ready experiment script."""
    
    # Raw string: keeps the backslash line continuations in the generated script
    script_content = r'''#!/bin/bash
# Production Federated Learning Experiment Script
# Generated by the Comprehensive FL Framework

set -e  # Exit on any error

echo "🚀 Starting Production Federated Learning Experiments"
echo "================================================="

# Configuration
NUM_RUNS=10
MAX_WORKERS=4
TIMEOUT=1800  # 30 minutes per experiment
RESULTS_DIR="production_results_$(date +%Y%m%d_%H%M%S)"

# Create results directory
mkdir -p "$RESULTS_DIR"

echo "📁 Results will be saved to: $RESULTS_DIR"
echo "⚙️ Configuration: $NUM_RUNS runs, $MAX_WORKERS workers, ${TIMEOUT}s timeout"

# Run the extensive experiments.
# The command sits inside the `if` so that `set -e` does not abort the script
# before the failure branch can run.
if python experiment_runners/run_extensive_experiments.py \
    --num-runs "$NUM_RUNS" \
    --parallel \
    --max-workers "$MAX_WORKERS" \
    --timeout "$TIMEOUT" \
    --results-dir "$RESULTS_DIR" \
    --config configuration/enhanced_config.yaml \
    --checkpoint-interval 5 \
    --save-intermediate \
    --verbose
then
    echo "✅ Experiments completed successfully!"
    echo "📊 Results available in: $RESULTS_DIR"
    
    # Optionally, run analysis
    echo "🔍 Running basic analysis..."
    python scripts/analyze_results.py "$RESULTS_DIR"
else
    echo "❌ Experiments failed. Check logs for details."
    exit 1
fi

echo "🎉 Production experiment pipeline completed!"
'''
    
    script_path = framework_root / "run_production_experiments.sh"
    
    try:
        with open(script_path, 'w') as f:
            f.write(script_content)
        
        # Make script executable (Unix-like systems)
        import stat
        script_path.chmod(script_path.stat().st_mode | stat.S_IEXEC)
        
        print(f"✅ Created production experiment script: {script_path}")
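        # The script uses paths relative to the framework root, so run it from there
        print(f"\n🚀 To run production experiments:")
        print(f"   cd {framework_root} && bash {script_path.name}")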
        
    except Exception as e:
        print(f"❌ Error creating script: {e}")

# Create the production script
create_production_experiment_script()
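
The generated script passes `--checkpoint-interval`, and checkpoint/resume is listed among the script's features. This notebook does not show how the runner stores its progress, so the sketch below only illustrates the general idea with a hypothetical status file: the JSON schema (`{"completed": [...]}`), the key format, and the helper names are all assumptions.

```python
import json
import tempfile
from pathlib import Path

def load_completed(status_path):
    """Return the set of completed experiment keys, or an empty set if no file exists."""
    path = Path(status_path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text())["completed"])

def mark_completed(status_path, key):
    """Add `key` to the status file, writing via a temp file so a crash cannot truncate it."""
    path = Path(status_path)
    completed = load_completed(path)
    completed.add(key)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"completed": sorted(completed)}))
    tmp.replace(path)  # rename over the old file in one step

with tempfile.TemporaryDirectory() as d:
    status = Path(d) / "status.json"
    pending = ["fedavg|none|MNIST|0", "krum|labelflip|MNIST|0"]
    mark_completed(status, pending[0])  # pretend the first experiment finished
    todo = [k for k in pending if k not in load_completed(status)]
    print(todo)  # ['krum|labelflip|MNIST|0']
```

On resume, a runner built this way would skip every key already in the file and continue with the remaining ones.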

## 🎓 Summary and Next Steps

Congratulations! You've learned how to use the comprehensive `run_extensive_experiments.py` script for federated learning research.

### 🌟 Key Takeaways

1. **Comprehensive Testing**: The script automatically tests 17 strategies × 7 attack types × 3 datasets (357 configurations)
2. **Robust Execution**: Built-in checkpoint/resume, parallel execution, and error handling
3. **Rich Analysis**: Detailed results in long-form format for statistical analysis
4. **Production Ready**: Suitable for research papers and production deployments

### 🔬 Research Applications

This framework is ideal for:
- **Algorithm Comparison Studies**: Compare multiple FL strategies systematically
- **Security Research**: Evaluate defenses against various attack types
- **Robustness Analysis**: Test performance under different conditions
- **Educational Purposes**: Learn FL concepts through hands-on experimentation

### 📚 Next Steps

1. **Run Your Own Experiments**: Start with test mode, then scale up
2. **Analyze Results**: Use the visualization and analysis tools provided
3. **Customize Configurations**: Modify strategies, attacks, and parameters
4. **Contribute**: Share your findings and improvements with the community

### 🤝 Need Help?

- Check the `README.md` for detailed documentation
- Explore other example notebooks in this directory
- Review the source code in `experiment_runners/`
- Consult the scientific papers referenced in the documentation

Happy experimenting! 🚀