9.4.1 — AI Tournaments & Balance Tooling #49

@SorraTheOrc

Description

Implement AI tournament infrastructure that runs parallel games with varied strategies, world configs, and seeds to produce comparative balance reports.

This is the final task in Phase 9 (AI Player Testing & Validation), building on the completed observer (9.1.1), rule-based action layer (9.2.1), and LLM-enhanced decisions (9.3.1).

Acceptance Criteria

  • Tournament script scripts/run_ai_tournament.py runs 100+ games in parallel with configurable strategies
  • Comparative reports surface win rate deltas and balance anomalies
  • Analysis script scripts/analyze_ai_games.py identifies unused story seeds and overpowered actions
  • Documentation guides designers through balance iteration workflow
  • CI integration runs nightly tournaments and archives results
  • Tests cover tournament execution, result aggregation, and analysis

Priority

Medium (Final Phase 9 task; all AI infrastructure complete)

Estimated Effort

High (3-5 days)

Dependencies

All dependencies are complete. This task can start immediately.

Risks & Mitigations

  • Risk: Parallel execution complexity and resource usage
    • Mitigation: Use process pools with configurable worker limits; test on small batches first
  • Risk: Tournament results difficult to interpret
    • Mitigation: Design clear metrics (win rate, stability curves, story seed coverage); provide example reports
  • Risk: CI tournament runs too slow or expensive
    • Mitigation: Make nightly runs optional; use smaller tick budgets for CI
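The first mitigation (process pools with a configurable worker limit, tried on small batches first) could be sketched roughly as follows. This is only an illustration: `play_game`, `run_batch`, and the job tuple layout are hypothetical names, and the random-walk "game" is a stand-in for the real simulation, which the issue does not specify.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def play_game(job):
    """Stand-in for one simulated game (hypothetical; not the real engine)."""
    strategy, seed, tick_budget = job
    rng = random.Random(seed)  # per-game RNG so a rerun with the same seed matches
    stability = 0.5
    for _ in range(tick_budget):
        # Random walk clamped to [0, 1] as a placeholder for game logic.
        stability = min(1.0, max(0.0, stability + rng.uniform(-0.05, 0.05)))
    return {"strategy": strategy, "seed": seed, "final_stability": stability}


def run_batch(jobs, max_workers=1):
    """Run jobs in-process when max_workers <= 1, otherwise in a process pool."""
    if max_workers <= 1:
        # Serial path: handy for small trial batches and for debugging.
        return [play_game(job) for job in jobs]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with jobs.
        return list(pool.map(play_game, jobs))
```

A small batch can be checked with `run_batch(jobs, max_workers=1)` before raising the limit. On platforms that spawn worker processes, the pooled call should sit under an `if __name__ == "__main__":` guard.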

Implementation Details

From the implementation plan (docs/simul/emergent_story_game_implementation_plan.md):

Tournament Script Requirements:

  • Execute N parallel games with varied AI strategies (BALANCED, AGGRESSIVE, DIPLOMATIC, HYBRID)
  • Support different world configs and random seeds
  • Capture per-game telemetry: final stability, story seed activations, resource efficiency
  • Aggregate results into structured JSON reports
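As a rough sketch of the telemetry and aggregation requirements above, one record per game could be collected and rolled up into a JSON-serializable report. The `GameResult` field names, the report layout, and the `aggregate` helper are assumptions made for illustration; the implementation plan may define a different schema.

```python
import json
from collections import defaultdict
from dataclasses import asdict, dataclass
from statistics import mean


@dataclass
class GameResult:
    """Per-game telemetry (field names are assumed, not taken from the plan)."""
    strategy: str
    world_config: str
    seed: int
    final_stability: float
    story_seeds_activated: list
    resource_efficiency: float


def aggregate(results):
    """Group results by strategy and build a plain dict ready for json.dumps."""
    by_strategy = defaultdict(list)
    for result in results:
        by_strategy[result.strategy].append(result)

    strategies = {}
    for strategy, games in sorted(by_strategy.items()):
        strategies[strategy] = {
            "games": len(games),
            "mean_final_stability": mean(g.final_stability for g in games),
            "mean_resource_efficiency": mean(g.resource_efficiency for g in games),
        }
    return {
        "total_games": len(results),
        "strategies": strategies,
        "games": [asdict(r) for r in results],  # raw per-game rows for later analysis
    }
```

The report can then be written with `json.dumps(aggregate(results), indent=2)`. Keeping the raw per-game rows alongside the summary lets the analysis script recompute metrics without rerunning the tournament.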

Analysis Script Requirements:

  • Compare win rates across strategies
  • Plot average stability curves over time
  • Identify story seeds that never triggered
  • Flag balance outliers (overpowered actions, dominant strategies)
  • Generate human-readable summary reports
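Three of the checks above (win rates, never-triggered story seeds, dominant strategies) could look something like the sketch below. It assumes each game record is a dict with `strategy`, `won`, and `story_seeds_activated` keys, and it flags a strategy as dominant when its win rate exceeds the mean across strategies by an arbitrary `margin`. None of these names or thresholds come from the issue.

```python
from collections import Counter


def win_rates(games):
    """Fraction of games won per strategy."""
    wins, totals = Counter(), Counter()
    for game in games:
        totals[game["strategy"]] += 1
        if game["won"]:
            wins[game["strategy"]] += 1
    return {strategy: wins[strategy] / totals[strategy] for strategy in totals}


def untriggered_seeds(games, known_seeds):
    """Story seeds from known_seeds that no game ever activated."""
    seen = set()
    for game in games:
        seen.update(game["story_seeds_activated"])
    return sorted(set(known_seeds) - seen)


def dominant_strategies(rates, margin=0.15):
    """Strategies whose win rate beats the cross-strategy mean by more than margin."""
    if not rates:
        return []
    baseline = sum(rates.values()) / len(rates)
    return sorted(s for s, rate in rates.items() if rate - baseline > margin)
```

The margin is a tuning knob rather than a recommendation. With few games per strategy, win-rate differences of this size can arise by chance, so a real report would likely want more games or a significance check before flagging anything.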

Folder Structure:

scripts/
  run_ai_tournament.py     # Multi-game execution with parallel workers
  analyze_ai_games.py      # Balance and coverage analysis
tests/ai_player/
  test_tournament.py       # Tournament infrastructure tests
  test_analysis.py         # Analysis logic tests

Next Steps

  1. Design tournament configuration schema (strategies, worlds, seeds, tick budgets)
  2. Implement parallel game execution with result capture
  3. Create analysis tooling for balance reports
  4. Add tournament tests and documentation
  5. Integrate with CI for nightly runs
  6. Update README and implementation plan with workflow examples
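For step 1, one possible shape for the configuration schema is a frozen dataclass that expands into the full job matrix. The field names, defaults, and `jobs` method are hypothetical; only the four strategy names come from the requirements above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TournamentConfig:
    """Hypothetical schema: every strategy x world x seed combination is one game."""
    strategies: tuple = ("BALANCED", "AGGRESSIVE", "DIPLOMATIC", "HYBRID")
    worlds: tuple = ("default",)  # placeholder world config name
    seeds: tuple = (0, 1, 2)
    tick_budget: int = 100  # CI runs could pass a smaller value
    max_workers: int = 4

    def jobs(self):
        """Expand the config into one job dict per game."""
        return [
            {"strategy": s, "world": w, "seed": seed, "tick_budget": self.tick_budget}
            for s, w, seed in product(self.strategies, self.worlds, self.seeds)
        ]
```

With the four strategies and a single world, `TournamentConfig(seeds=tuple(range(25)))` expands to 100 jobs, which matches the 100+ game target in the acceptance criteria.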

Related

  • Tracker: .pm/tracker.md Task 9.4.1
  • Implementation plan: docs/simul/emergent_story_game_implementation_plan.md M9.4 (lines 687-702)
  • Phase 9 overview: See implementation plan Phase 9 section

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)
