Description
Implement AI tournament infrastructure that runs parallel games with varied strategies, world configs, and seeds to produce comparative balance reports.
This is the final task in Phase 9 (AI Player Testing & Validation), building on the completed observer (9.1.1), rule-based action layer (9.2.1), and LLM-enhanced decisions (9.3.1).
Priority
Medium (Final Phase 9 task; all AI infrastructure complete)
Estimated Effort
High (3-5 days)
Dependencies
All dependencies (9.1.1, 9.2.1, 9.3.1) are complete. This task can start immediately.
Risks & Mitigations
- Risk: Parallel execution complexity and resource usage
  - Mitigation: Use process pools with configurable worker limits; test on small batches first
- Risk: Tournament results difficult to interpret
  - Mitigation: Design clear metrics (win rate, stability curves, story seed coverage); provide example reports
- Risk: CI tournament runs too slow or expensive
  - Mitigation: Make nightly runs optional; use smaller tick budgets for CI
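The worker-limit and CI-budget mitigations above could be exposed as command-line options. The following is a minimal sketch, not the planned interface: the flag names (--games, --workers, --ticks, --ci), the defaults, and the CI caps are all assumptions made for illustration.

```python
import argparse
import os

# Assumed caps for CI runs; the real values would come from the tournament config.
CI_TICK_CAP = 50
CI_GAME_CAP = 10


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI for scripts/run_ai_tournament.py."""
    parser = argparse.ArgumentParser(description="Run an AI tournament (sketch).")
    parser.add_argument("--games", type=int, default=100, help="number of games to run")
    parser.add_argument("--workers", type=int, default=4, help="upper bound on worker processes")
    parser.add_argument("--ticks", type=int, default=200, help="tick budget per game")
    parser.add_argument("--ci", action="store_true", help="apply the smaller CI budgets")
    return parser


def resolve_limits(args: argparse.Namespace) -> dict:
    """Clamp the requested values so a run cannot exceed the machine or CI budget."""
    cpu_count = os.cpu_count() or 1
    workers = max(1, min(args.workers, cpu_count))
    ticks = min(args.ticks, CI_TICK_CAP) if args.ci else args.ticks
    games = min(args.games, CI_GAME_CAP) if args.ci else args.games
    return {"workers": workers, "ticks": ticks, "games": games}
```

Clamping in one place keeps the "small batch first" workflow a matter of passing --ci (or low --games and --ticks values) rather than editing code.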
Implementation Details
From the implementation plan (docs/simul/emergent_story_game_implementation_plan.md):
Tournament Script Requirements:
- Execute N parallel games with varied AI strategies (BALANCED, AGGRESSIVE, DIPLOMATIC, HYBRID)
- Support different world configs and random seeds
- Capture per-game telemetry: final stability, story seed activations, resource efficiency
- Aggregate results into structured JSON reports
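The tournament requirements above can be sketched with the standard library's concurrent.futures. This is an illustrative outline under stated assumptions, not the project's implementation: GameSpec, play_game, run_tournament, and the telemetry field names are hypothetical, and play_game returns placeholder values where the real script would drive the Phase 9 AI player.

```python
import concurrent.futures
import json
import random
from dataclasses import asdict, dataclass

STRATEGIES = ("BALANCED", "AGGRESSIVE", "DIPLOMATIC", "HYBRID")


@dataclass(frozen=True)
class GameSpec:
    """One game to play: strategy, world config name, seed, and tick budget."""
    strategy: str
    world: str
    seed: int
    ticks: int


def play_game(spec: GameSpec) -> dict:
    """Hypothetical stand-in for one AI-driven game; returns placeholder telemetry."""
    # A string seed keeps the placeholder values reproducible across processes.
    rng = random.Random(f"{spec.strategy}:{spec.world}:{spec.seed}")
    activated = rng.sample(["seed-a", "seed-b", "seed-c"], k=rng.randint(0, 3))
    return {
        **asdict(spec),
        "final_stability": round(rng.uniform(0.0, 1.0), 3),
        "story_seeds_activated": sorted(activated),
        "resource_efficiency": round(rng.uniform(0.0, 1.0), 3),
    }


def run_tournament(specs, max_workers=4,
                   executor_cls=concurrent.futures.ProcessPoolExecutor) -> dict:
    """Run every spec in a bounded worker pool and aggregate the telemetry."""
    results = []
    with executor_cls(max_workers=max_workers) as pool:
        futures = [pool.submit(play_game, spec) for spec in specs]
        for future in concurrent.futures.as_completed(futures):
            results.append(future.result())
    # Sort so the report does not depend on completion order.
    results.sort(key=lambda g: (g["strategy"], g["world"], g["seed"]))
    return {"games_played": len(results), "games": results}


def report_to_json(report: dict) -> str:
    """Serialize the aggregated report as structured JSON."""
    return json.dumps(report, indent=2, sort_keys=True)
```

Passing the executor class as a parameter is a design choice for testability: unit tests can substitute concurrent.futures.ThreadPoolExecutor to avoid spawning processes, while real runs keep the process pool default.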
Analysis Script Requirements:
- Compare win rates across strategies
- Plot average stability curves over time
- Identify story seeds that never triggered
- Flag balance outliers (overpowered actions, dominant strategies)
- Generate human-readable summary reports
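The analysis requirements above might reduce to a few small functions over the per-game records. This sketch assumes each record carries "strategy", "won", "story_seeds_activated", and "stability_curve" fields; those names, the definition of a win, and the 0.15 dominance margin are assumptions for illustration. Plotting is left out, so only the averaged curve is computed.

```python
from collections import defaultdict
from statistics import mean


def win_rates(games) -> dict:
    """Fraction of games won per strategy."""
    wins, totals = defaultdict(int), defaultdict(int)
    for game in games:
        totals[game["strategy"]] += 1
        wins[game["strategy"]] += 1 if game["won"] else 0
    return {strategy: wins[strategy] / totals[strategy] for strategy in totals}


def untriggered_seeds(games, all_seeds) -> list:
    """Story seeds that no game in the tournament activated."""
    seen = set()
    for game in games:
        seen.update(game["story_seeds_activated"])
    return sorted(set(all_seeds) - seen)


def dominant_strategies(rates: dict, margin: float = 0.15) -> list:
    """Strategies whose win rate exceeds the average by more than `margin`."""
    average = mean(rates.values())
    return sorted(s for s, rate in rates.items() if rate - average > margin)


def average_stability_curve(games) -> list:
    """Tick-wise mean stability, truncated to the shortest recorded curve."""
    curves = [game["stability_curve"] for game in games]
    return [mean(values) for values in zip(*curves)]


def summarize(games, all_seeds) -> str:
    """Human-readable summary of win rates, outliers, and seed coverage."""
    rates = win_rates(games)
    lines = [f"{s}: {rate:.0%} win rate" for s, rate in sorted(rates.items())]
    lines.append("Dominant strategies: " + (", ".join(dominant_strategies(rates)) or "none"))
    lines.append("Never triggered: " + (", ".join(untriggered_seeds(games, all_seeds)) or "none"))
    return "\n".join(lines)
```

A fixed margin is the simplest outlier rule; a real report would probably want a threshold that accounts for sample size.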
Folder Structure:
scripts/
  run_ai_tournament.py    # Multi-game execution with parallel workers
  analyze_ai_games.py     # Balance and coverage analysis
tests/ai_player/
  test_tournament.py      # Tournament infrastructure tests
  test_analysis.py        # Analysis logic tests
Next Steps
- Design tournament configuration schema (strategies, worlds, seeds, tick budgets)
- Implement parallel game execution with result capture
- Create analysis tooling for balance reports
- Add tournament tests and documentation
- Integrate with CI for nightly runs
- Update README and implementation plan with workflow examples
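As a starting point for the first step, the configuration schema could be a small validated dataclass loaded from JSON. This is only a sketch: TournamentConfig, its field names, and the default values are assumptions, and the only detail taken from the plan is the set of strategy names.

```python
import json
from dataclasses import dataclass
from itertools import product

KNOWN_STRATEGIES = {"BALANCED", "AGGRESSIVE", "DIPLOMATIC", "HYBRID"}


@dataclass(frozen=True)
class TournamentConfig:
    """Hypothetical schema: which games to run and under what budgets."""
    strategies: tuple
    worlds: tuple
    seeds: tuple
    tick_budget: int = 200  # assumed default
    max_workers: int = 4    # assumed default

    def __post_init__(self):
        unknown = set(self.strategies) - KNOWN_STRATEGIES
        if unknown:
            raise ValueError(f"unknown strategies: {sorted(unknown)}")
        if self.tick_budget <= 0 or self.max_workers <= 0:
            raise ValueError("tick_budget and max_workers must be positive")

    @classmethod
    def from_json(cls, text: str) -> "TournamentConfig":
        """Parse a JSON document, applying defaults for the optional budgets."""
        raw = json.loads(text)
        return cls(
            strategies=tuple(raw["strategies"]),
            worlds=tuple(raw["worlds"]),
            seeds=tuple(raw["seeds"]),
            tick_budget=raw.get("tick_budget", 200),
            max_workers=raw.get("max_workers", 4),
        )

    def game_matrix(self) -> list:
        """Every (strategy, world, seed) combination the tournament would play."""
        return list(product(self.strategies, self.worlds, self.seeds))
```

Validating at construction time means a typo in a strategy name fails before any worker starts, which is cheaper than discovering it in a nightly run.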
Related
- Tracker: .pm/tracker.md, Task 9.4.1
- Implementation plan: docs/simul/emergent_story_game_implementation_plan.md, M9.4 (lines 687-702)
- Phase 9 overview: See implementation plan Phase 9 section
Acceptance Criteria
- scripts/run_ai_tournament.py runs 100+ games in parallel with configurable strategies
- scripts/analyze_ai_games.py identifies unused story seeds or overpowered actions