# 02. Bot Battle Arena

This notebook runs tournaments between different agent types and visualizes their performance.

We'll compare:
- **Random Agent**: Makes random legal moves
- **RuleBased Agent**: Uses hand evaluation heuristics
- **MCTS Agent**: Monte Carlo Tree Search for card play
- **CFR Agent**: Counterfactual Regret Minimization for bidding (optional; requires a trained policy, see the last section)

In [None]:
import sys
import os
import matplotlib.pyplot as plt
import numpy as np
from collections import defaultdict

# Add parent directory to path
sys.path.insert(0, os.path.abspath(os.path.join('..')))

from euchre.engine.state import EuchreGameState, GamePhase
from euchre.agents import RandomAgent, RuleBasedAgent, MCTSAgent

# Import the evaluator functions from utils
from euchre.utils.evaluator import run_tournament, TournamentStats

print('Imports successful!')

# Set up matplotlib style
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

## Tournament 1: Random vs Random (Control)

First, let's establish a baseline by having random agents play against each other.
We expect roughly 50/50 win rates.
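
How close to 50/50 the result lands depends on the number of games. As a rough guide, the sketch below computes an approximate 95% Wilson score interval for an observed win rate. The helper name `wilson_interval` is an illustration and is not part of the `euchre` package.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z=1.96)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    return center - half, center + half


# Compare the interval width for a 100-game and a 20-game tournament
for wins, games in [(50, 100), (10, 20)]:
    low, high = wilson_interval(wins, games)
    print(f"{wins}/{games} wins: {low:.1%} to {high:.1%}")
```

After a tournament has run, the same helper could be called as `wilson_interval(stats_random.team0_wins, stats_random.games_played)`. A result anywhere inside the interval is consistent with evenly matched teams.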

In [None]:
print('Running Random vs Random (100 games)...')
team0 = (RandomAgent('R0'), RandomAgent('R2'))
team1 = (RandomAgent('R1'), RandomAgent('R3'))
stats_random = run_tournament(team0, team1, num_games=100, verbose=False)
stats_random.print_summary()

## Tournament 2: RuleBased vs Random

Now let's see how much better a heuristic-based agent performs compared to random play.

In [None]:
print('Running RuleBased vs Random (100 games)...')
team0 = (RuleBasedAgent('H0'), RuleBasedAgent('H2'))
team1 = (RandomAgent('R1'), RandomAgent('R3'))
stats_heuristic = run_tournament(team0, team1, num_games=100, verbose=False)
stats_heuristic.print_summary()

## Tournament 3: RuleBased vs RuleBased

What happens when two heuristic agents face each other?

In [None]:
print('Running RuleBased vs RuleBased (100 games)...')
team0 = (RuleBasedAgent('H0'), RuleBasedAgent('H2'))
team1 = (RuleBasedAgent('H1'), RuleBasedAgent('H3'))
stats_heuristic_mirror = run_tournament(team0, team1, num_games=100, verbose=False)
stats_heuristic_mirror.print_summary()

## Tournament 4: MCTS vs RuleBased

MCTS uses simulation to find good moves. It's slower but should be stronger.

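**Note:** This tournament runs only 20 games instead of 100 because MCTS is computationally expensive (each agent is given `simulation_time=0.5`). With so few games, the measured win rate is a much noisier estimate than in the earlier tournaments.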

In [None]:
print('Running MCTS vs RuleBased (20 games, this will take a while)...')
team0 = (MCTSAgent('M0', simulation_time=0.5), MCTSAgent('M2', simulation_time=0.5))
team1 = (RuleBasedAgent('H1'), RuleBasedAgent('H3'))
stats_mcts = run_tournament(team0, team1, num_games=20, verbose=False)
stats_mcts.print_summary()

## Visualization 1: Win Rates Comparison

In [None]:
# Collect win rate data
matchups = [
    'Random vs\nRandom',
    'RuleBased vs\nRandom',
    'RuleBased vs\nRuleBased',
    'MCTS vs\nRuleBased'
]

team0_win_rates = [
    stats_random.team0_wins / stats_random.games_played * 100,
    stats_heuristic.team0_wins / stats_heuristic.games_played * 100,
    stats_heuristic_mirror.team0_wins / stats_heuristic_mirror.games_played * 100,
    stats_mcts.team0_wins / stats_mcts.games_played * 100
]

team1_win_rates = [
    stats_random.team1_wins / stats_random.games_played * 100,
    stats_heuristic.team1_wins / stats_heuristic.games_played * 100,
    stats_heuristic_mirror.team1_wins / stats_heuristic_mirror.games_played * 100,
    stats_mcts.team1_wins / stats_mcts.games_played * 100
]

# Create grouped bar chart
x = np.arange(len(matchups))
width = 0.35

fig, ax = plt.subplots(figsize=(12, 6))
bars1 = ax.bar(x - width/2, team0_win_rates, width, label='Team 0 (First)', color='#2E86AB')
bars2 = ax.bar(x + width/2, team1_win_rates, width, label='Team 1 (Second)', color='#A23B72')

ax.set_ylabel('Win Rate (%)', fontsize=12, fontweight='bold')
ax.set_title('Agent Win Rates Across Different Matchups', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(matchups)
ax.set_ylim([0, 100])
ax.axhline(y=50, color='gray', linestyle='--', alpha=0.5, label='50% baseline')
# Build the legend after axhline so the baseline label is included
ax.legend()

# Add value labels on bars
def autolabel(bars):
    for bar in bars:
        height = bar.get_height()
        ax.annotate(f'{height:.1f}%',
                    xy=(bar.get_x() + bar.get_width() / 2, height),
                    xytext=(0, 3),
                    textcoords="offset points",
                    ha='center', va='bottom',
                    fontsize=9, fontweight='bold')

autolabel(bars1)
autolabel(bars2)

plt.tight_layout()
plt.show()

## Visualization 2: Average Scores per Game

In [None]:
# Collect average score data
team0_avg_scores = [
    stats_random.team0_total_score / stats_random.games_played,
    stats_heuristic.team0_total_score / stats_heuristic.games_played,
    stats_heuristic_mirror.team0_total_score / stats_heuristic_mirror.games_played,
    stats_mcts.team0_total_score / stats_mcts.games_played
]

team1_avg_scores = [
    stats_random.team1_total_score / stats_random.games_played,
    stats_heuristic.team1_total_score / stats_heuristic.games_played,
    stats_heuristic_mirror.team1_total_score / stats_heuristic_mirror.games_played,
    stats_mcts.team1_total_score / stats_mcts.games_played
]

# Create grouped bar chart
fig, ax = plt.subplots(figsize=(12, 6))
bars1 = ax.bar(x - width/2, team0_avg_scores, width, label='Team 0', color='#2E86AB')
bars2 = ax.bar(x + width/2, team1_avg_scores, width, label='Team 1', color='#A23B72')

ax.set_ylabel('Average Final Score', fontsize=12, fontweight='bold')
ax.set_title('Average Game Scores by Matchup (Target: 10 points)', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(matchups)
ax.axhline(y=10, color='green', linestyle='--', alpha=0.5, label='Target score')
# Build the legend after axhline so the target-score label is included
ax.legend()

# Add value labels
def autolabel_scores(bars):
    for bar in bars:
        height = bar.get_height()
        ax.annotate(f'{height:.1f}',
                    xy=(bar.get_x() + bar.get_width() / 2, height),
                    xytext=(0, 3),
                    textcoords="offset points",
                    ha='center', va='bottom',
                    fontsize=9, fontweight='bold')

autolabel_scores(bars1)
autolabel_scores(bars2)

plt.tight_layout()
plt.show()

## Visualization 3: Agent Performance Summary

In [None]:
# Create a summary table
import pandas as pd

summary_data = {
    'Matchup': matchups,
    'Team 0 Agent': ['Random', 'RuleBased', 'RuleBased', 'MCTS'],
    'Team 1 Agent': ['Random', 'Random', 'RuleBased', 'RuleBased'],
    'Games Played': [
        stats_random.games_played,
        stats_heuristic.games_played,
        stats_heuristic_mirror.games_played,
        stats_mcts.games_played
    ],
    'Team 0 Win %': [f'{wr:.1f}%' for wr in team0_win_rates],
    'Team 1 Win %': [f'{wr:.1f}%' for wr in team1_win_rates],
    'Avg Hands/Game': [
        stats_random.total_hands / stats_random.games_played,
        stats_heuristic.total_hands / stats_heuristic.games_played,
        stats_heuristic_mirror.total_hands / stats_heuristic_mirror.games_played,
        stats_mcts.total_hands / stats_mcts.games_played
    ]
}

df = pd.DataFrame(summary_data)
df['Avg Hands/Game'] = df['Avg Hands/Game'].round(1)

print('\n' + '='*80)
print('TOURNAMENT SUMMARY TABLE')
print('='*80)
print(df.to_string(index=False))
print('='*80)

## Key Findings

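The tournaments above are expected to show the following patterns. Exact numbers vary from run to run, so compare them with your own output:

1. **Random vs Random**: Should show a ~50/50 win rate (baseline)
2. **RuleBased vs Random**: The RuleBased agent should win significantly more often (typically 60-70%)
3. **RuleBased vs RuleBased**: Should return to ~50/50 when the agents are matched
4. **MCTS vs RuleBased**: MCTS should have a slight edge due to better card play decisions, although with only 20 games a slight edge is hard to separate from sampling noise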

### Agent Strengths:

- **RandomAgent**: Useful for baseline testing, no strategic value
- **RuleBasedAgent**: Strong bidding heuristics, fast execution, good baseline
- **MCTSAgent**: Better card play through simulation, slower but more accurate
- **CFRAgent**: (If trained) Near-optimal bidding strategy for Round 1

## Optional: Test CFR Agent

If you've trained the CFR policy, uncomment and run this cell:

In [None]:
# try:
#     from euchre.agents import CFRAgent
#     
#     print('Running CFR vs RuleBased (100 games)...')
#     team0 = (CFRAgent('C0'), CFRAgent('C2'))
#     team1 = (RuleBasedAgent('H1'), RuleBasedAgent('H3'))
#     stats_cfr = run_tournament(team0, team1, num_games=100, verbose=False)
#     stats_cfr.print_summary()
# except FileNotFoundError:
#     print('CFR policy file not found!')
#     print('Train the policy first with: python -m euchre.training.cfr_trainer')

## Next Steps

1. **Train CFR**: Run `python -m euchre.training.cfr_trainer` to create the policy file
2. **Tune MCTS**: Experiment with different `simulation_time` values
3. **Improve Heuristics**: Modify the RuleBasedAgent's hand evaluation
4. **Build UI**: Create a CLI or web interface to play against the bots
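
For step 2, a sweep over `simulation_time` could be organized along the lines of the sketch below. The helper `sweep_win_rates` and the stand-in `coin_flip_match` are hypothetical names used only for illustration. The stand-in simulates fair coin flips so the block runs without the `euchre` package, and its output says nothing about real agent strength.

```python
import random


def sweep_win_rates(settings, play_match):
    """Run one match per setting and collect Team 0's win rate.

    play_match(setting) must return (team0_wins, games_played).
    """
    results = {}
    for setting in settings:
        wins, games = play_match(setting)
        results[setting] = wins / games
    return results


# Stand-in for a real tournament (simulated coin flips, NOT measured results).
# In this notebook it could be replaced by something like:
#   def play_match(t):
#       team0 = (MCTSAgent('M0', simulation_time=t), MCTSAgent('M2', simulation_time=t))
#       team1 = (RuleBasedAgent('H1'), RuleBasedAgent('H3'))
#       s = run_tournament(team0, team1, num_games=20, verbose=False)
#       return s.team0_wins, s.games_played
def coin_flip_match(simulation_time, num_games=20, seed=0):
    rng = random.Random(seed)
    wins = sum(rng.random() < 0.5 for _ in range(num_games))
    return wins, num_games


rates = sweep_win_rates([0.1, 0.5, 1.0], coin_flip_match)
for t, rate in rates.items():
    print(f"simulation_time={t}: Team 0 win rate {rate:.0%}")
```

Passing the match runner in as a callable keeps the sweep loop independent of how agents are constructed, so the same loop can be reused for other parameters.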