# Multi-Agent Pattern Comparison Tutorial

**Lesson 14 - Task 5.9**: Interactive tutorial comparing 5 multi-agent coordination patterns.

## Learning Objectives

By completing this notebook, you will:
1. Understand 5 multi-agent coordination patterns through simplified simulations
2. Compare trade-offs between patterns (latency vs. quality vs. cost)
3. Learn when to use each pattern based on task characteristics
4. Visualize pattern performance using radar charts

## Pattern Overview

| Pattern | Description | Best For | Trade-offs |
|---------|-------------|----------|------------|
| **Hierarchical** | Central orchestrator delegates to specialists | Complex tasks with clear subtasks | Higher latency, better quality |
| **Diamond** | Broadcast → Generate → Select best | Optimization problems | Higher cost (parallel LLM calls) |
| **P2P** | Peer-to-peer handoff with context | Sequential pipelines | Context loss risk |
| **Collaborative** | Shared workspace with peer review | Creative tasks | Coordination overhead |
| **Adaptive Loop** | Iterative refinement until quality threshold | Quality-critical tasks | Variable latency |
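
The "Best For" column can be read as a rough decision rule. The sketch below encodes it as a lookup table. `suggest_pattern` and the task-trait names are hypothetical, introduced only for illustration, and a real selection would usually weigh more than one trait.

```python
# Hypothetical helper: maps a single dominant task trait to the pattern
# that the overview table lists as "Best For" that trait.
BEST_FOR = {
    "clear_subtasks": "hierarchical",
    "optimization": "diamond",
    "sequential_pipeline": "p2p",
    "creative": "collaborative",
    "quality_critical": "adaptive_loop",
}


def suggest_pattern(task_trait: str) -> str:
    """Return the pattern the table associates with a task trait."""
    try:
        return BEST_FOR[task_trait]
    except KeyError:
        raise ValueError(f"Unknown task trait: {task_trait!r}") from None


print(suggest_pattern("creative"))  # collaborative
```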

## Execution Modes

- **DEMO Mode**: 5 scenarios, simulated execution (~$0, <2 min)
- **FULL Mode**: 30 scenarios, simulated execution (~$0, <5 min)

**Note**: This notebook uses *simulated* pattern execution to demonstrate concepts. For production implementation, see `backend/multi_agent_patterns.py`.

## Setup: Imports and Configuration

In [None]:
# Standard library imports
import json
import time
from pathlib import Path
from typing import Any

# Third-party imports
import matplotlib.pyplot as plt
import numpy as np
from tqdm.auto import tqdm

print("✅ All imports successful")

## Configuration: Select Execution Mode

In [None]:
# ============================================================================
# EXECUTION MODE CONFIGURATION
# ============================================================================

# Change this to "FULL" for comprehensive evaluation
MODE = "DEMO"  # Options: "DEMO" or "FULL"

# Mode-specific configuration
MODE_CONFIG = {
    "DEMO": {
        "num_scenarios": 5,
        "estimated_time": "<2 min",
    },
    "FULL": {
        "num_scenarios": 30,
        "estimated_time": "<5 min",
    },
}

config = MODE_CONFIG[MODE]
print(f"🔧 Mode: {MODE}")
print(f"📊 Scenarios: {config['num_scenarios']}")
print(f"⏱️  Estimated Time: {config['estimated_time']}")
print("\n⚠️  Note: This notebook uses SIMULATED execution to demonstrate pattern concepts")

## Load Test Scenarios

In [None]:
# Load test scenarios
data_path = Path("data/multi_agent_scenarios.json")
assert data_path.exists(), f"Data file not found: {data_path}"

with open(data_path, "r") as f:
    dataset = json.load(f)

# Select scenarios based on mode
all_scenarios = dataset["scenarios"]
scenarios = all_scenarios[:config["num_scenarios"]]

print(f"✅ Loaded {len(scenarios)} test scenarios")
print("\nScenario Preview:")
for i, scenario in enumerate(scenarios[:3], 1):
    print(f"  {i}. [{scenario['expected_pattern'].upper()}] {scenario['query'][:60]}...")
if len(scenarios) > 3:
    print(f"  ... and {len(scenarios) - 3} more scenarios")

## Simplified Pattern Simulation

Instead of making real LLM calls, each simulator below applies fixed multipliers and offsets to a scenario's latency target and quality threshold.
These numbers are illustrative assumptions, not measurements, but they let us compare pattern trade-offs without API costs.
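
The simulators read only a handful of fields from each scenario. A hypothetical record with that shape is sketched below. The field names match what the simulator code accesses, but every value is invented for illustration and the real dataset may carry additional fields.

```python
# Hypothetical scenario record: field names match what the simulators read,
# but every value here is invented for illustration.
example_scenario = {
    "scenario_id": "example-001",
    "query": "Draft three taglines for a product launch",
    "query_type": "brainstorming",       # collaborative simulator checks this
    "difficulty": "medium",              # adaptive loop maps this to an iteration count
    "expected_pattern": "collaborative",
    "reference_trajectory": ["draft", "review", "revise"],  # P2P counts these steps
    "evaluation_criteria": {
        "latency_target": 5.0,           # baseline latency each simulator scales
        "quality_threshold": 0.85,       # baseline quality each simulator offsets
        "agent_utilization": 3,          # hierarchical simulator uses this as worker count
    },
}

criteria = example_scenario["evaluation_criteria"]
print(criteria.get("latency_target", 5.0), criteria.get("quality_threshold", 0.85))
```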

In [None]:
def simulate_hierarchical(scenario: dict[str, Any]) -> dict[str, Any]:
    """Simulate hierarchical pattern execution.
    
    Characteristics:
    - Higher latency due to orchestration overhead
    - Higher quality due to specialist coordination
    - Moderate cost (multiple agents, but sequential)
    """
    base_latency = scenario["evaluation_criteria"].get("latency_target", 5.0)
    num_workers = scenario["evaluation_criteria"].get("agent_utilization", 3)
    
    # Hierarchical adds orchestration overhead but improves quality
    latency = base_latency * 1.2  # 20% overhead for coordination
    cost = num_workers * 0.05  # Cost per worker
    quality = min(1.0, scenario["evaluation_criteria"].get("quality_threshold", 0.85) + 0.05)
    
    return {"latency": latency, "cost": cost, "quality": quality, "agent_count": num_workers}


def simulate_diamond(scenario: dict[str, Any]) -> dict[str, Any]:
    """Simulate diamond (competitive) pattern execution.
    
    Characteristics:
    - Lower latency (parallel execution)
    - Higher cost (N competing agents)
    - High quality (best-of-N selection)
    """
    base_latency = scenario["evaluation_criteria"].get("latency_target", 5.0)
    num_agents = 3  # Typically 3-5 competing agents
    
    latency = base_latency * 0.6  # Parallel execution is faster
    cost = num_agents * 0.08  # Pay for all competing agents
    quality = min(1.0, scenario["evaluation_criteria"].get("quality_threshold", 0.85) + 0.08)
    
    return {"latency": latency, "cost": cost, "quality": quality, "agent_count": num_agents}


def simulate_p2p(scenario: dict[str, Any]) -> dict[str, Any]:
    """Simulate peer-to-peer handoff pattern execution.
    
    Characteristics:
    - Moderate latency (sequential pipeline)
    - Lower cost (minimal overhead)
    - Quality depends on context preservation
    """
    base_latency = scenario["evaluation_criteria"].get("latency_target", 5.0)
    num_steps = len(scenario.get("reference_trajectory", []))
    
    latency = base_latency * 0.9  # Efficient handoffs
    cost = num_steps * 0.04  # Lower cost per step
    quality = scenario["evaluation_criteria"].get("quality_threshold", 0.85) - 0.03  # Context loss risk
    
    return {"latency": latency, "cost": cost, "quality": quality, "agent_count": num_steps}


def simulate_collaborative(scenario: dict[str, Any]) -> dict[str, Any]:
    """Simulate collaborative pattern execution.
    
    Characteristics:
    - Moderate latency (peer review adds time)
    - Moderate cost (shared workspace)
    - High quality for creative tasks
    """
    base_latency = scenario["evaluation_criteria"].get("latency_target", 5.0)
    num_peers = 3
    
    # Collaborative pattern benefits creative tasks.
    # .get() avoids a KeyError when a scenario has no "query_type" field.
    is_creative = scenario.get("query_type") in ["creative_writing", "creative_design", "brainstorming"]
    quality_boost = 0.10 if is_creative else 0.03
    
    latency = base_latency * 1.1  # Peer review overhead
    cost = num_peers * 0.06
    quality = min(1.0, scenario["evaluation_criteria"].get("quality_threshold", 0.85) + quality_boost)
    
    return {"latency": latency, "cost": cost, "quality": quality, "agent_count": num_peers}


def simulate_adaptive_loop(scenario: dict[str, Any]) -> dict[str, Any]:
    """Simulate adaptive loop pattern execution.
    
    Characteristics:
    - Variable latency (depends on iterations to converge)
    - Moderate cost (iterative refinement)
    - Highest quality (meets threshold)
    """
    base_latency = scenario["evaluation_criteria"].get("latency_target", 5.0)
    quality_target = scenario["evaluation_criteria"].get("quality_threshold", 0.85)
    
    # Simulate 2-4 iterations based on difficulty (unknown labels fall back to 3)
    difficulty = scenario.get("difficulty", "medium")
    iterations = {"easy": 2, "medium": 3, "hard": 4}.get(difficulty, 3)
    
    latency = base_latency * (1 + iterations * 0.3)  # Each iteration adds time
    cost = iterations * 0.06
    quality = min(1.0, quality_target + 0.10)  # Meets or exceeds target
    
    return {"latency": latency, "cost": cost, "quality": quality, "agent_count": iterations}


# Pattern simulation functions
pattern_simulators = {
    "hierarchical": simulate_hierarchical,
    "diamond": simulate_diamond,
    "p2p": simulate_p2p,
    "collaborative": simulate_collaborative,
    "adaptive_loop": simulate_adaptive_loop,
}

print("✅ Initialized 5 pattern simulators")
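
To see how the multipliers play out, the sketch below recomputes each pattern's latency for the default 5.0 s target, using the factors from the simulator functions above. The adaptive loop is shown at the "medium" setting of three iterations. `LATENCY_FACTORS` is a name introduced here for illustration.

```python
# Latency multipliers copied from the simulator functions in this notebook.
LATENCY_FACTORS = {
    "hierarchical": 1.2,
    "diamond": 0.6,
    "p2p": 0.9,
    "collaborative": 1.1,
    "adaptive_loop": 1 + 3 * 0.3,  # "medium" difficulty: 3 iterations
}

base_latency = 5.0  # default latency_target used by the simulators

simulated = {name: base_latency * factor for name, factor in LATENCY_FACTORS.items()}

# Print from fastest to slowest.
for name, latency in sorted(simulated.items(), key=lambda item: item[1]):
    print(f"{name:<14} {latency:.2f} s")
```

Under these assumptions the diamond pattern comes out fastest and the adaptive loop slowest, before any random noise is applied.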

## Run Pattern Comparison Experiment

In [None]:
# Run experiment
results = []
total_experiments = len(scenarios) * len(pattern_simulators)

print(f"🚀 Starting simulation: {len(scenarios)} scenarios × {len(pattern_simulators)} patterns = {total_experiments} executions\n")

with tqdm(total=total_experiments, desc="Running simulations") as pbar:
    for scenario in scenarios:
        scenario_id = scenario["scenario_id"]
        query = scenario["query"]
        
        for pattern_name, simulator in pattern_simulators.items():
            # Simulate pattern execution
            metrics = simulator(scenario)
            
            # Add small random variation (±10%) to make it realistic
            noise = np.random.uniform(0.9, 1.1)
            metrics["latency"] *= noise
            metrics["cost"] *= noise
            
            # Record result
            results.append({
                "scenario_id": scenario_id,
                "query": query,
                "pattern": pattern_name,
                "latency": metrics["latency"],
                "cost": metrics["cost"],
                "quality": metrics["quality"],
                "agent_count": metrics["agent_count"],
                "success": True,
            })
            
            pbar.update(1)
            time.sleep(0.01)  # Small delay for visualization

print(f"\n✅ Simulation complete: {len(results)} results collected")
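
Because the noise is drawn from NumPy's global random state without a seed, each run produces slightly different numbers. One way to make runs repeatable is a seeded generator. The sketch below uses the standard library's `random.Random` to stay dependency-free (`np.random.default_rng(seed)` offers a comparable seeded `uniform` method). The seed value and the `apply_noise` helper are assumptions for illustration, not part of the notebook.

```python
import random


def apply_noise(metrics: dict[str, float], rng: random.Random) -> dict[str, float]:
    """Scale latency and cost by one shared factor in [0.9, 1.1]."""
    noise = rng.uniform(0.9, 1.1)
    noisy = dict(metrics)  # copy so the caller's dict is left untouched
    noisy["latency"] *= noise
    noisy["cost"] *= noise
    return noisy


base = {"latency": 6.0, "cost": 0.15, "quality": 0.90}

# Two generators with the same seed yield identical noise.
first = apply_noise(base, random.Random(42))
second = apply_noise(base, random.Random(42))
print(first == second)  # True
```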

## Aggregate Results by Pattern

In [None]:
# Aggregate metrics by pattern
pattern_metrics = {}

for pattern_name in pattern_simulators.keys():
    pattern_results = [r for r in results if r["pattern"] == pattern_name]
    
    pattern_metrics[pattern_name] = {
        "avg_latency": np.mean([r["latency"] for r in pattern_results]),
        "std_latency": np.std([r["latency"] for r in pattern_results]),
        "avg_cost": np.mean([r["cost"] for r in pattern_results]),
        "avg_quality": np.mean([r["quality"] for r in pattern_results]),
        "avg_agents": np.mean([r["agent_count"] for r in pattern_results]),
    }

# Display aggregated metrics
print("\n📊 Pattern Performance Summary (Simulated)\n")
print(f"{'Pattern':<20} {'Latency (s)':<18} {'Cost ($)':<15} {'Quality':<15} {'Avg Agents'}")
print("=" * 90)

for pattern_name, metrics in pattern_metrics.items():
    print(
        f"{pattern_name.upper():<20} "
        f"{metrics['avg_latency']:<6.2f} ±{metrics['std_latency']:<9.2f} "
        f"{metrics['avg_cost']:<15.4f} "
        f"{metrics['avg_quality']:<15.2f} "
        f"{metrics['avg_agents']:.1f}"
    )
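
`np.std` defaults to the population standard deviation (`ddof=0`). With only five scenarios in DEMO mode, the sample standard deviation is noticeably larger, which matters if the ± column is read as an estimate of spread. The sketch below shows the difference using the standard library; the latency values are made up.

```python
import statistics

# Made-up latencies for one pattern across five scenarios (seconds).
latencies = [5.4, 6.1, 5.8, 6.6, 6.1]

population_std = statistics.pstdev(latencies)  # divides by n, like np.std's default
sample_std = statistics.stdev(latencies)       # divides by n - 1, like np.std(..., ddof=1)

print(f"population: {population_std:.3f}")
print(f"sample:     {sample_std:.3f}")
```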

## Visualization: Radar Chart Comparison

In [None]:
# Create radar chart
def create_radar_chart(pattern_metrics: dict[str, dict[str, float]]) -> None:
    """Create radar chart comparing patterns across metrics."""
    # Prepare data (axis labels describe the normalization actually applied)
    metrics_names = ["Speed\n(1 - latency/max)", "Efficiency\n(1 - cost/max)", "Quality"]
    
    # Normalize metrics to 0-1 scale (higher is better)
    max_latency = max(m["avg_latency"] for m in pattern_metrics.values())
    max_cost = max(m["avg_cost"] for m in pattern_metrics.values())
    
    values_by_pattern = {}
    for pattern_name, metrics in pattern_metrics.items():
        # Invert latency and cost (lower is better → higher score).
        # Note: the slowest and the most expensive pattern score exactly 0.
        speed = 1 - (metrics["avg_latency"] / max_latency)
        efficiency = 1 - (metrics["avg_cost"] / max_cost)
        quality = metrics["avg_quality"]
        
        values_by_pattern[pattern_name] = [speed, efficiency, quality]
    
    # Setup radar chart
    angles = np.linspace(0, 2 * np.pi, len(metrics_names), endpoint=False).tolist()
    angles += angles[:1]  # Complete the circle
    
    fig, ax = plt.subplots(figsize=(12, 9), subplot_kw=dict(projection='polar'))
    
    # Plot each pattern
    colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#FFA07A', '#98D8C8']
    for (pattern_name, values), color in zip(values_by_pattern.items(), colors):
        values += values[:1]  # Complete the circle
        ax.plot(angles, values, 'o-', linewidth=2.5, label=pattern_name.upper(), color=color, markersize=8)
        ax.fill(angles, values, alpha=0.2, color=color)
    
    # Styling
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(metrics_names, size=11, weight='bold')
    ax.set_ylim(0, 1)
    ax.set_yticks([0.2, 0.4, 0.6, 0.8, 1.0])
    ax.set_yticklabels(['0.2', '0.4', '0.6', '0.8', '1.0'], size=9)
    ax.grid(True, linestyle='--', alpha=0.7, linewidth=1.2)
    
    # Title and legend
    plt.title('Multi-Agent Pattern Comparison (Simulated)\nHigher values = Better performance', 
              size=16, weight='bold', pad=25)
    plt.legend(loc='upper right', bbox_to_anchor=(1.35, 1.15), fontsize=11, framealpha=0.9)
    
    plt.tight_layout()
    plt.show()

create_radar_chart(pattern_metrics)
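
With `1 - value / max`, the slowest and the most expensive pattern always score exactly 0 on their axis, which can exaggerate the gaps between patterns. One alternative is to score each pattern as `best / value`, so the best pattern scores 1.0 and the others scale proportionally. The sketch below compares the two on made-up latencies. `inverted_score` and `ratio_score` are illustrative names, not part of the notebook.

```python
def inverted_score(value: float, worst: float) -> float:
    """Scoring used in the chart: the worst value maps to 0."""
    return 1 - value / worst


def ratio_score(value: float, best: float) -> float:
    """Alternative: the best (lowest) value maps to 1.0."""
    return best / value


# Made-up average latencies in seconds.
latencies = {"diamond": 3.0, "p2p": 4.5, "adaptive_loop": 9.5}
worst, best = max(latencies.values()), min(latencies.values())

for name, latency in latencies.items():
    print(
        f"{name:<14} inverted={inverted_score(latency, worst):.2f} "
        f"ratio={ratio_score(latency, best):.2f}"
    )
```

Which normalization reads better depends on whether the chart should emphasize the ranking or the relative magnitude of the differences.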

## Pattern Recommendations

In [None]:
# Generate pattern recommendations
fastest = min(pattern_metrics.items(), key=lambda x: x[1]["avg_latency"])[0]
cheapest = min(pattern_metrics.items(), key=lambda x: x[1]["avg_cost"])[0]
highest_quality = max(pattern_metrics.items(), key=lambda x: x[1]["avg_quality"])[0]

recommendations = {
    "hierarchical": {
        "description": "Complex tasks with clear subtask decomposition. Trade higher latency for better quality and coordination.",
        "use_when": ["Task can be broken into independent subtasks", "Need specialist expertise", "Quality matters more than speed"],
        "avoid_when": ["Real-time response required", "Simple tasks", "Limited budget"],
    },
    "diamond": {
        "description": "Optimization problems where you need the best of multiple approaches. Higher cost due to parallel execution.",
        "use_when": ["Need highest quality output", "Multiple valid approaches exist", "Budget allows parallel calls"],
        "avoid_when": ["Cost-sensitive application", "Single obvious solution", "Time is critical"],
    },
    "p2p": {
        "description": "Sequential pipelines with clear handoff points. Lower coordination overhead but context transfer risk.",
        "use_when": ["Clear processing pipeline", "Each step depends on previous", "Cost optimization important"],
        "avoid_when": ["Complex context needed throughout", "Steps can be parallelized", "High accuracy required"],
    },
    "collaborative": {
        "description": "Creative tasks requiring multiple perspectives. Good balance of quality and agent utilization.",
        "use_when": ["Creative or brainstorming tasks", "Multiple viewpoints add value", "Moderate complexity"],
        "avoid_when": ["Single correct answer", "Coordination overhead too high", "Simple factual queries"],
    },
    "adaptive_loop": {
        "description": "Quality-critical tasks where iterative refinement is acceptable. Variable latency based on convergence.",
        "use_when": ["Quality threshold must be met", "Iterative improvement possible", "Time is flexible"],
        "avoid_when": ["Fixed latency requirement", "First attempt usually sufficient", "High cost sensitivity"],
    },
}

print("\n🎯 Pattern Selection Guide\n")
print("=" * 90)
for pattern_name, rec in recommendations.items():
    badges = []
    if pattern_name == fastest:
        badges.append("⚡ FASTEST")
    if pattern_name == cheapest:
        badges.append("💰 CHEAPEST")
    if pattern_name == highest_quality:
        badges.append("🏆 HIGHEST QUALITY")
    
    badge_str = " ".join(badges)
    
    print(f"\n{pattern_name.upper()}: {badge_str}")
    print(f"{rec['description']}")
    print("\n✅ Use when:")
    for point in rec["use_when"]:
        print(f"   - {point}")
    print("\n❌ Avoid when:")
    for point in rec["avoid_when"]:
        print(f"   - {point}")
    print("-" * 90)
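
The three badges each rank patterns on a single metric. When priorities are mixed, a weighted score is one simple way to combine them. The sketch below is an illustration under stated assumptions: the weights, the `weighted_score` helper, and the sample scores are all invented, and it expects scores already normalized to 0-1 with higher meaning better.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of normalized scores (higher is better)."""
    total_weight = sum(weights.values())
    return sum(scores[key] * weight for key, weight in weights.items()) / total_weight


# Invented, already-normalized scores for two patterns.
normalized = {
    "diamond": {"speed": 0.9, "efficiency": 0.2, "quality": 0.93},
    "p2p": {"speed": 0.7, "efficiency": 0.8, "quality": 0.82},
}

# Hypothetical priorities: cost matters twice as much as speed or quality.
weights = {"speed": 1.0, "efficiency": 2.0, "quality": 1.0}

ranked = sorted(
    normalized,
    key=lambda name: weighted_score(normalized[name], weights),
    reverse=True,
)
print(ranked[0])  # p2p
```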

## Save Results to JSON

In [None]:
# Prepare results JSON for dashboard integration
output_data = {
    "experiment_metadata": {
        "mode": MODE,
        "num_scenarios": len(scenarios),
        "num_patterns": len(pattern_simulators),
        "total_executions": len(results),
        "simulation_type": "theoretical",
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
    },
    "pattern_metrics": {
        name: {
            "avg_latency": float(m["avg_latency"]),
            "avg_cost": float(m["avg_cost"]),
            "avg_quality": float(m["avg_quality"]),
            "avg_agents": float(m["avg_agents"]),
        }
        for name, m in pattern_metrics.items()
    },
    "detailed_results": results,
    "pattern_rankings": {
        "fastest": fastest,
        "cheapest": cheapest,
        "highest_quality": highest_quality,
    },
    "recommendations": recommendations,
}

# Save to results directory
results_dir = Path("results")
results_dir.mkdir(exist_ok=True)
output_path = results_dir / "multi_agent_pattern_comparison.json"

with open(output_path, "w") as f:
    json.dump(output_data, f, indent=2)

print(f"\n✅ Results saved to: {output_path}")
print("\n📊 Summary:")
print(f"  - Fastest Pattern: {fastest.upper()}")
print(f"  - Cheapest Pattern: {cheapest.upper()}")
print(f"  - Highest Quality: {highest_quality.upper()}")
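
A downstream consumer such as a dashboard would typically load this file and check that the expected top-level keys are present before using it. A minimal sketch, assuming the structure written above; the `load_results` helper, `REQUIRED_KEYS`, and the temporary stand-in file are for illustration only.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = {"experiment_metadata", "pattern_metrics", "pattern_rankings"}


def load_results(path: Path) -> dict:
    """Load a results file and fail early if top-level keys are missing."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise KeyError(f"Results file is missing keys: {sorted(missing)}")
    return data


# Demonstrate with a small stand-in file rather than real notebook output.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample_results.json"
    sample_path.write_text(
        json.dumps({
            "experiment_metadata": {"mode": "DEMO"},
            "pattern_metrics": {"p2p": {"avg_latency": 4.5}},
            "pattern_rankings": {"fastest": "p2p"},
        }),
        encoding="utf-8",
    )
    loaded = load_results(sample_path)

print(loaded["pattern_rankings"]["fastest"])  # p2p
```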

## Key Takeaways

### Pattern Trade-offs (Based on Simulation)

1. **Hierarchical Pattern** (Manager-Worker)
   - ✅ Best for: Complex tasks with clear decomposition
   - ⚠️ Trade-off: Higher latency due to sequential orchestration
   - 💡 Use when: Quality and coordination matter more than speed
   - 🔧 Real implementation: `HierarchicalAgent` in `backend/multi_agent_patterns.py:156`

2. **Diamond Pattern** (Competitive)
   - ✅ Best for: Finding optimal solution among alternatives
   - ⚠️ Trade-off: Higher cost due to parallel LLM calls
   - 💡 Use when: Budget allows and you need best-of-N selection
   - 🔧 Real implementation: `DiamondAgent` in `backend/multi_agent_patterns.py:336`

3. **P2P Pattern** (Peer-to-Peer Handoff)
   - ✅ Best for: Sequential pipelines with specialized agents
   - ⚠️ Trade-off: Context loss risk during handoffs
   - 💡 Use when: Clear pipeline stages with minimal context dependency
   - 🔧 Real implementation: `P2PAgent` in `backend/multi_agent_patterns.py:633`

4. **Collaborative Pattern** (Shared Workspace)
   - ✅ Best for: Creative tasks requiring multiple perspectives
   - ⚠️ Trade-off: Coordination overhead for merging contributions
   - 💡 Use when: Diverse viewpoints improve output quality
   - 🔧 Real implementation: `CollaborativeAgent` in `backend/multi_agent_patterns.py:838`

5. **Adaptive Loop Pattern** (Iterative Refinement)
   - ✅ Best for: Quality-critical tasks with refinement
   - ⚠️ Trade-off: Variable latency (depends on convergence)
   - 💡 Use when: Quality threshold must be met, time is flexible
   - 🔧 Real implementation: `AdaptiveLoopAgent` in `backend/multi_agent_patterns.py:1033`

### Next Steps

1. **Study real implementations**: Read `backend/multi_agent_patterns.py` for production code
2. **Try different scenarios**: Edit `data/multi_agent_scenarios.json` to test your own tasks
3. **Explore automotive case study**: See `automotive_ai_case_study.ipynb` for real-world application
4. **Read concept tutorials**:
   - `multi_agent_fundamentals.md` - 11 core components
   - `multi_agent_design_patterns.md` - 5 coordination patterns in depth
   - `multi_agent_challenges_evaluation.md` - 6 challenges and evaluation

### Important Note

This notebook uses **simulated execution** to demonstrate pattern concepts without API costs. For production use:
- Use the real implementations in `backend/multi_agent_patterns.py`
- Run unit tests: `pytest tests/test_multi_agent_patterns.py`
- See `tests/test_multi_agent_patterns.py` for usage examples with real LLM calls