# AgorAI Demo: Democratic Multi-Agent Aggregation

Welcome to AgorAI! This notebook demonstrates the key features of the library:

1. **Core Aggregation** - 14+ methods from social choice theory
2. **Benchmarking** - Scientific evaluation with metrics
3. **Visualization** - Publication-quality plots and explanations

**Version:** 0.2.0  
**Date:** November 21, 2025

## Setup

First, let's install AgorAI if needed and import the modules.

In [None]:
# Uncomment to install
# !pip install -e "..[research]"  # From package root (quotes stop the shell from globbing the brackets)
# Or: !pip install "agorai[research]"  # From PyPI (once published)

import sys
sys.path.insert(0, '../src')  # Add package to path

In [None]:
# Import core modules
from agorai.aggregate import aggregate, list_methods
from agorai.benchmarks import evaluate_method, compare_methods, list_benchmarks
from agorai.visualization import (
    plot_utility_matrix,
    plot_aggregation_comparison,
    plot_fairness_tradeoffs,
    explain_decision,
    explain_method
)

# Standard libraries
import numpy as np
import matplotlib.pyplot as plt
import json

# Display settings
%matplotlib inline
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 11

print("✅ Imports successful!")

---

## Part 1: Core Aggregation

Let's start with basic aggregation. We'll create a simple scenario with 3 agents voting on 2 candidates.

### 1.1 Define Utilities

Each row represents an agent's utilities for the candidates.

In [None]:
# Define utility matrix
# Rows = agents, Columns = candidates
utilities = [
    [0.8, 0.2],  # Agent 1 strongly prefers candidate 0
    [0.3, 0.7],  # Agent 2 strongly prefers candidate 1
    [0.5, 0.5],  # Agent 3 is indifferent
]

print("Utility Matrix:")
print(f"Agent 1: {utilities[0]}")
print(f"Agent 2: {utilities[1]}")
print(f"Agent 3: {utilities[2]}")

### 1.2 Try Different Aggregation Methods

Let's see how different methods handle this split decision.

In [None]:
# List all available methods
methods = list_methods()
print(f"Available methods ({len(methods)}):")
for i, method in enumerate(methods, 1):
    print(f"  {i:2d}. {method}")

In [None]:
# Try a few key methods
methods_to_try = ["majority", "atkinson", "maximin", "nash_bargaining"]

print("\nAggregation Results:\n" + "="*60)

for method in methods_to_try:
    result = aggregate(utilities, method=method)
    winner = result['winner']
    scores = result['scores']
    
    print(f"\n{method.upper():20s}")
    print(f"  Winner: Candidate {winner}")
    print(f"  Scores: {scores}")

**Observation:** Different methods can choose different winners, because they optimize for different objectives. Compare the winners and scores printed above to see whether they agree on this particular matrix.
- **Majority**: Count votes (the candidate with the most votes wins)
- **Atkinson**: Balance fairness and efficiency
- **Maximin**: Protect the worst-off agent
- **Nash Bargaining**: Game-theoretic compromise
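
To make these objectives concrete, here is a minimal pure-Python sketch of textbook versions of three of the rules, plus a utilitarian sum as a baseline. It does not call AgorAI, and the library's own implementations may differ (for example in tie-breaking or normalization). The matrix `example` and all function names are hypothetical, chosen so that the rules disagree.

```python
import math

# Hypothetical utility matrix (not the notebook's): rows = agents, columns = candidates.
example = [
    [1.0, 0.5],
    [0.9, 0.5],
    [0.1, 0.5],
]

def column(matrix, j):
    """Utilities that every agent assigns to candidate j."""
    return [row[j] for row in matrix]

def plurality_scores(matrix):
    """One vote per agent, split equally among that agent's top-rated candidates."""
    scores = [0.0] * len(matrix[0])
    for row in matrix:
        best = max(row)
        top = [j for j, u in enumerate(row) if u == best]
        for j in top:
            scores[j] += 1.0 / len(top)
    return scores

def utilitarian_scores(matrix):
    """Sum of utilities per candidate."""
    return [sum(column(matrix, j)) for j in range(len(matrix[0]))]

def maximin_scores(matrix):
    """Utility of the worst-off agent per candidate."""
    return [min(column(matrix, j)) for j in range(len(matrix[0]))]

def nash_scores(matrix):
    """Product of utilities per candidate (Nash product)."""
    return [math.prod(column(matrix, j)) for j in range(len(matrix[0]))]

def winner(scores):
    """Index of the highest score; the lowest index wins ties."""
    return max(range(len(scores)), key=scores.__getitem__)

rules = {
    "plurality": plurality_scores,
    "utilitarian": utilitarian_scores,
    "maximin": maximin_scores,
    "nash_product": nash_scores,
}
winners_by_rule = {name: winner(rule(example)) for name, rule in rules.items()}
print(winners_by_rule)
```

On this matrix the vote-counting and sum-based rules pick candidate 0, while maximin and the Nash product pick candidate 1, because the third agent does very poorly under candidate 0.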

---

## Part 2: Benchmarking

Now let's use the benchmarking module to scientifically evaluate these methods.
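
The cells in this part report a Gini coefficient and a social welfare value. As a rough guide to what such numbers mean, the sketch below computes both from the utilities that agents receive under one candidate. These are the standard textbook formulas; whether AgorAI computes its metrics exactly this way is an assumption, and `gini` and `social_welfare` are illustrative names, not library functions.

```python
# Same matrix as in Section 1.1: rows = agents, columns = candidates.
utilities = [
    [0.8, 0.2],
    [0.3, 0.7],
    [0.5, 0.5],
]

def gini(values):
    """Gini coefficient via mean absolute difference: 0 = perfectly equal."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diffs = sum(abs(a - b) for a in values for b in values)
    return abs_diffs / (2 * n * total)

def social_welfare(values):
    """Utilitarian social welfare: the plain sum of utilities."""
    return sum(values)

for j in range(len(utilities[0])):
    received = [row[j] for row in utilities]  # what each agent gets if candidate j wins
    print(f"Candidate {j}: welfare={social_welfare(received):.3f}, gini={gini(received):.3f}")
```

A lower Gini means the agents' utilities under that candidate are more evenly spread; a higher welfare means the total is larger. The two need not point to the same candidate.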

### 2.1 Available Benchmarks

In [None]:
# List built-in benchmarks
benchmarks = list_benchmarks()

print("Available Benchmarks:\n" + "="*60)
for bench in benchmarks:
    print(f"\n{bench['name']}:")
    print(f"  Description: {bench['description']}")
    print(f"  Test cases: {bench['num_items']}")

### 2.2 Evaluate Single Method

In [None]:
# Evaluate Atkinson method on simple_voting benchmark
results = evaluate_method(
    method="atkinson",
    benchmark="simple_voting",
    metrics=["fairness", "efficiency", "agreement"],
    epsilon=1.0
)

print("Atkinson Method Evaluation (ε=1.0)\n" + "="*60)
print(f"\nBenchmark: {results['benchmark']}")
print(f"Test cases: {results['num_items']}")
print("\nSummary Metrics:")
print("\nFairness:")
for metric, value in results['summary']['fairness'].items():
    print(f"  {metric:25s}: {value:.4f}")

print("\nEfficiency:")
for metric, value in results['summary']['efficiency'].items():
    print(f"  {metric:25s}: {value:.4f}")

print("\nAgreement:")
for metric, value in results['summary']['agreement'].items():
    print(f"  {metric:25s}: {value:.4f}")

### 2.3 Compare Multiple Methods

In [None]:
# Compare three methods
comparison = compare_methods(
    methods=["majority", "atkinson", "maximin"],
    benchmark="simple_voting",
    plot=True,  # Generate plots
    save_results="comparison_results.json"
)

print("Method Comparison\n" + "="*60)
print("\nFairness Rankings (Gini Coefficient - lower is better):")
for rank, method in enumerate(comparison['rankings']['fairness_gini_coefficient'], 1):
    print(f"  {rank}. {method}")

print("\nEfficiency Rankings (Social Welfare - higher is better):")
for rank, method in enumerate(comparison['rankings']['efficiency_social_welfare'], 1):
    print(f"  {rank}. {method}")

In [None]:
# Create comparison table
import pandas as pd

data = []
for method_result in comparison['methods']:
    name = method_result['method']
    summary = method_result['summary']
    
    data.append({
        'Method': name,
        'Gini': summary['fairness']['gini_coefficient'],
        'Atkinson Index': summary['fairness']['atkinson_index'],
        'Social Welfare': summary['efficiency']['social_welfare'],
        'Consensus': summary['agreement']['consensus_score']
    })

df = pd.DataFrame(data)
print("\nDetailed Comparison:")
print(df.to_string(index=False))

---

## Part 3: Visualization

Let's create publication-quality visualizations.

### 3.1 Utility Matrix Heatmap

In [None]:
# Visualize our original utility matrix
plot_utility_matrix(
    utilities,
    agent_labels=["Western Perspective", "Eastern Perspective", "Global South"],
    candidate_labels=["Approve", "Reject"],
    save_path="utility_heatmap.png"
)

print("\n✅ Heatmap saved to: utility_heatmap.png")

### 3.2 Method Comparison Chart

In [None]:
# Compare multiple methods visually
plot_aggregation_comparison(
    utilities,
    methods=["majority", "atkinson", "maximin", "nash_bargaining"],
    highlight_differences=True,
    save_path="method_comparison.png"
)

print("\n✅ Comparison saved to: method_comparison.png")

### 3.3 Fairness-Efficiency Tradeoff

In [None]:
# Visualize fairness-efficiency tradeoffs
plot_fairness_tradeoffs(
    utilities,
    methods=["majority", "borda", "atkinson", "maximin", "nash_bargaining"],
    x_axis="social_welfare",
    y_axis="gini_coefficient",
    save_path="fairness_efficiency_tradeoff.png"
)

print("\n✅ Tradeoff plot saved to: fairness_efficiency_tradeoff.png")
print("\nInterpretation:")
print("  - Bottom-right = Ideal (high welfare, low inequality)")
print("  - Top-left = Poor (low welfare, high inequality)")

---

## Part 4: Natural Language Explanations

AgorAI can explain decisions in plain language!

### 4.1 Explain a Specific Decision

In [None]:
# Get aggregation result
result = aggregate(utilities, method="atkinson", epsilon=1.0)

# Explain the decision
explanation = explain_decision(
    utilities,
    method="atkinson",
    winner=result['winner'],
    scores=result['scores'],
    epsilon=1.0
)

print("Decision Explanation:\n" + "="*60)
print(explanation)

### 4.2 Explain How Methods Work

In [None]:
# Get method guide
guide = explain_method("maximin")

print("Maximin Method Guide:\n" + "="*60)
print(guide)

### 4.3 Compare Explanations Across Methods

In [None]:
# Explain how different methods would decide
methods_to_explain = ["majority", "atkinson", "maximin"]

for method in methods_to_explain:
    result = aggregate(utilities, method=method)
    explanation = explain_decision(
        utilities, method, result['winner'], result['scores']
    )
    
    print("\n" + "="*60)
    print(f"METHOD: {method.upper()}")
    print("="*60)
    print(explanation[:300] + "...")  # Show first 300 chars

---

## Part 5: Advanced Example - Parameter Sweep

Let's explore how the Atkinson method behaves with different inequality aversion parameters.
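
One common way to turn Atkinson's inequality aversion into a scoring rule is the equally-distributed-equivalent (EDE) utility: the arithmetic mean at ε = 0, the geometric mean at ε = 1, and increasingly dominated by the worst-off agent as ε grows. The sketch below assumes that formulation, which may not match AgorAI's `atkinson` method exactly. It uses a hypothetical matrix, and `atkinson_ede` is an illustrative name, not a library function.

```python
import math

def atkinson_ede(values, epsilon):
    """Equally-distributed-equivalent utility; values must be strictly positive."""
    n = len(values)
    if math.isclose(epsilon, 1.0):
        # Limit case epsilon -> 1: geometric mean.
        return math.exp(sum(math.log(v) for v in values) / n)
    power = 1.0 - epsilon
    return (sum(v ** power for v in values) / n) ** (1.0 / power)

# Hypothetical matrix: candidate 0 is great for two agents and poor for one,
# candidate 1 gives everyone the same moderate utility.
example = [
    [1.0, 0.5],
    [0.9, 0.5],
    [0.1, 0.5],
]

def ede_winner(matrix, epsilon):
    """Candidate index with the highest EDE utility at this epsilon."""
    scores = [
        atkinson_ede([row[j] for row in matrix], epsilon)
        for j in range(len(matrix[0]))
    ]
    return max(range(len(scores)), key=scores.__getitem__)

sweep = {i / 10: ede_winner(example, i / 10) for i in range(21)}
for eps, win in sweep.items():
    print(f"epsilon={eps:.1f} -> candidate {win}")
```

In this example the winner moves from candidate 0 to candidate 1 as ε rises, which is the kind of switch the sweep below is designed to reveal. A matrix with less skewed utilities may show no switch at all.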

In [None]:
# Sweep epsilon parameter
epsilons = np.linspace(0.0, 2.0, 21)
winners = []
gini_values = []
welfare_values = []

for epsilon in epsilons:
    # Run aggregation
    result = aggregate(utilities, method="atkinson", epsilon=epsilon)
    winner = result['winner']
    
    # Evaluate fairness and efficiency
    benchmark_data = {
        'name': 'temp',
        'items': [{'utilities': utilities}]
    }
    eval_result = evaluate_method(
        method="atkinson",
        benchmark=benchmark_data,
        epsilon=epsilon
    )
    
    winners.append(winner)
    gini_values.append(eval_result['summary']['fairness']['gini_coefficient'])
    welfare_values.append(eval_result['summary']['efficiency']['social_welfare'])

print("Parameter sweep complete!")

In [None]:
# Plot results
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15, 4))

# Winner vs epsilon
ax1.plot(epsilons, winners, 'o-', linewidth=2, markersize=6)
ax1.set_xlabel('Epsilon (ε)', fontsize=12)
ax1.set_ylabel('Winner (Candidate ID)', fontsize=12)
ax1.set_title('Winner Selection vs Inequality Aversion', fontsize=13, fontweight='bold')
ax1.grid(True, alpha=0.3)
ax1.set_yticks([0, 1])

# Gini vs epsilon
ax2.plot(epsilons, gini_values, 'o-', color='orange', linewidth=2, markersize=6)
ax2.set_xlabel('Epsilon (ε)', fontsize=12)
ax2.set_ylabel('Gini Coefficient', fontsize=12)
ax2.set_title('Fairness vs Inequality Aversion', fontsize=13, fontweight='bold')
ax2.grid(True, alpha=0.3)

# Social welfare vs epsilon
ax3.plot(epsilons, welfare_values, 'o-', color='green', linewidth=2, markersize=6)
ax3.set_xlabel('Epsilon (ε)', fontsize=12)
ax3.set_ylabel('Social Welfare', fontsize=12)
ax3.set_title('Efficiency vs Inequality Aversion', fontsize=13, fontweight='bold')
ax3.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('parameter_sweep.png', dpi=150, bbox_inches='tight')
plt.show()

print("\n✅ Parameter sweep visualization saved to: parameter_sweep.png")
print("\nKey Insights:")
print(f"  - Low ε (utilitarian): Winner = Candidate {winners[0]}")
print(f"  - High ε (egalitarian): Winner = Candidate {winners[-1]}")
print("  - Fairness should improve as ε increases (look for a falling Gini in the middle panel)")
print("  - Efficiency may decrease as ε increases (compare social welfare in the right panel)")

---

## Part 6: Real-World Scenario

Let's apply AgorAI to a realistic scenario: **Content Moderation with Cultural Diversity**

### 6.1 Setup: Multiple Moderators with Different Cultural Backgrounds

In [None]:
# Scenario: 5 moderators from different regions rate 3 possible actions for one piece of content
# Columns: 0 = Remove, 1 = Flag for review, 2 = Allow

moderator_utilities = [
    [0.9, 0.5, 0.1],  # Western moderator (cautious)
    [0.3, 0.6, 0.8],  # Eastern moderator (lenient)
    [0.7, 0.7, 0.3],  # Global South moderator (balanced)
    [0.8, 0.4, 0.2],  # European moderator (cautious)
    [0.4, 0.8, 0.6],  # Middle Eastern moderator (nuanced)
]

moderator_labels = [
    "Western",
    "Eastern",
    "Global South",
    "European",
    "Middle Eastern"
]

decision_labels = ["Remove", "Flag", "Allow"]

print("Content Moderation Scenario")
print("="*60)
print(f"Moderators: {len(moderator_utilities)}")
print(f"Decisions: {len(decision_labels)}")
print("\nUtilities:")
for label, utils in zip(moderator_labels, moderator_utilities):
    print(f"  {label:15s}: {utils}")

### 6.2 Visualize Moderator Preferences

In [None]:
plot_utility_matrix(
    moderator_utilities,
    agent_labels=moderator_labels,
    candidate_labels=decision_labels,
    save_path="content_moderation_utilities.png"
)

print("✅ Visualization saved to: content_moderation_utilities.png")

### 6.3 Compare Aggregation Approaches

In [None]:
# Compare different aggregation approaches
moderation_methods = [
    "majority",           # Simple majority vote
    "atkinson",          # Balance fairness and efficiency
    "maximin",           # Protect minority perspectives
    "nash_bargaining",   # Game-theoretic compromise
]

print("Aggregation Results for Content Moderation\n" + "="*60)

for method in moderation_methods:
    result = aggregate(moderator_utilities, method=method)
    winner = result['winner']
    scores = result['scores']
    
    print(f"\n{method.upper():20s}")
    print(f"  Decision: {decision_labels[winner]}")
    print(f"  Scores: {[f'{s:.3f}' for s in scores]}")

### 6.4 Evaluate Fairness Implications
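
Before running the library metrics, a quick hand computation shows the tension in this scenario. The sketch below uses only the utilities from Section 6.1 and two simple summaries per decision: total utility and the utility of the worst-off moderator. These are illustrative stand-ins, not AgorAI's fairness metrics.

```python
# Copied from Section 6.1: rows = moderators, columns = Remove / Flag / Allow.
moderator_utilities = [
    [0.9, 0.5, 0.1],
    [0.3, 0.6, 0.8],
    [0.7, 0.7, 0.3],
    [0.8, 0.4, 0.2],
    [0.4, 0.8, 0.6],
]
decision_labels = ["Remove", "Flag", "Allow"]

# Summarize each decision by its total utility and its worst-off moderator.
tradeoff = {}
for j, label in enumerate(decision_labels):
    received = [row[j] for row in moderator_utilities]
    tradeoff[label] = {"total": sum(received), "worst_off": min(received)}
    print(f"{label:7s} total={tradeoff[label]['total']:.2f} "
          f"worst_off={tradeoff[label]['worst_off']:.2f}")

most_efficient = max(tradeoff, key=lambda k: tradeoff[k]["total"])
most_protective = max(tradeoff, key=lambda k: tradeoff[k]["worst_off"])
print("Highest total utility:", most_efficient)
print("Best for worst-off moderator:", most_protective)
```

Here "Remove" has the highest total utility while "Flag" leaves the worst-off moderator better off, so an efficiency-oriented rule and a protection-oriented rule can reasonably disagree.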

In [None]:
# Create benchmark from scenario
moderation_benchmark = {
    'name': 'content_moderation',
    'items': [{'utilities': moderator_utilities}]
}

# Evaluate each method
fairness_comparison = []

for method in moderation_methods:
    eval_result = evaluate_method(
        method=method,
        benchmark=moderation_benchmark,
        metrics=["fairness", "efficiency", "agreement"]
    )
    
    fairness_comparison.append({
        'Method': method,
        'Gini': eval_result['summary']['fairness']['gini_coefficient'],
        'Atkinson': eval_result['summary']['fairness']['atkinson_index'],
        'Social Welfare': eval_result['summary']['efficiency']['social_welfare'],
        'Consensus': eval_result['summary']['agreement']['consensus_score']
    })

# Display comparison
df = pd.DataFrame(fairness_comparison)
print("\nFairness Analysis:\n" + "="*60)
print(df.to_string(index=False))

print("\n💡 Interpretation:")
best_fairness = df.loc[df['Gini'].idxmin(), 'Method']
best_welfare = df.loc[df['Social Welfare'].idxmax(), 'Method']
print(f"  - Most fair (lowest Gini): {best_fairness}")
print(f"  - Most efficient (highest welfare): {best_welfare}")
print("  - Trade-off: fairness and efficiency often conflict")

### 6.5 Explain the Recommended Decision

In [None]:
# Use Atkinson for balanced approach
result = aggregate(moderator_utilities, method="atkinson", epsilon=1.0)

explanation = explain_decision(
    moderator_utilities,
    method="atkinson",
    winner=result['winner'],
    scores=result['scores'],
    epsilon=1.0
)

print("Recommended Content Moderation Decision:\n" + "="*60)
print(f"\n🎯 Decision: {decision_labels[result['winner']]}\n")
print(explanation)

---

## Summary & Next Steps

In this demo, you've learned:

1. ✅ **Core Aggregation** - Using 14+ methods from social choice theory
2. ✅ **Benchmarking** - Scientific evaluation with fairness/efficiency metrics
3. ✅ **Visualization** - Creating publication-quality plots
4. ✅ **Explanations** - Understanding decisions in plain language
5. ✅ **Real-World Application** - Content moderation with cultural diversity

### Where to Go Next

**Explore More:**
- Try different aggregation methods
- Create custom benchmarks
- Experiment with parameters (epsilon, weights, thresholds)
- Apply to your own use cases

**Documentation:**
- [Aggregation API](../docs/aggregate.md)
- [Benchmarking Guide](../docs/benchmarks.md)
- [Visualization Guide](../docs/visualization.md)

**Research:**
- Read the [Research Strategy Report](../../AgorAI_Research_Strategy_Report.md)
- Check [AI Research Report](../../AI_Research_2024_2025_Comprehensive_Report.md)
- Explore connections to Constitutional AI, EBMs, MARL

---

**Built with ❤️ for the democratic AI research community**