
feat(monte-carlo): wire greedy/random AI strategies into harness with CLI flags and CI guardrail #429

Merged
SorraTheOrc merged 2 commits into main from
copilot/extend-monte-carlo-harness-ai-strategies
Mar 13, 2026

Copilot AI commented Mar 13, 2026

The Monte Carlo harness only supported internal legacy strategies (market-greedy, demo-greedy) and lacked the --seeds/--maxTurns CLI flags, making it impossible to evaluate the GreedyStrategy/RandomStrategy AI implementations at scale or gate CI on win-rate balance.

Harness (MainStreetMonteCarlo.ts)

  • Extends MonteCarloStrategy type: 'market-greedy' | 'demo-greedy' | 'greedy' | 'random'
  • Adds createAiPlayerForStrategy(): maps strategy name → MainStreetAiPlayer bound to a deterministic RNG derived from seedToNumber(`${seed}-ai`). Returns null for legacy strategies (preserves existing behaviour)
  • runSeed() now branches: AI strategies use a chooseAction() loop; legacy strategies retain the upfront-plan path
// greedy/random: action loop (one decision per call)
let action = aiPlayer.chooseAction(state);
while (action.type !== 'end-turn' && state.gameResult === 'playing') {
  executeAction(state, action);
  executedAction = true;
  action = aiPlayer.chooseAction(state);
}
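
The name → player mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code from MainStreetMonteCarlo.ts: `SketchAiPlayer`, `hashSeed`, `mulberry32`, `AI_FACTORIES`, and `createAiPlayerSketch` are hypothetical names, the placeholder strategies only pick from a list of strings, and the real seedToNumber() may hash differently.

```typescript
// Hypothetical stand-ins for the real harness types; names are illustrative only.
type MonteCarloStrategy = 'market-greedy' | 'demo-greedy' | 'greedy' | 'random';
type Rng = () => number; // returns a float in [0, 1)

interface SketchAiPlayer {
  chooseAction(options: string[]): string;
}

// Simple string hash (assumption: the real seedToNumber may work differently).
function hashSeed(seed: string): number {
  let h = 0;
  for (let i = 0; i < seed.length; i++) {
    h = (Math.imul(h, 31) + seed.charCodeAt(i)) >>> 0;
  }
  return h;
}

// mulberry32: a small, well-known 32-bit PRNG, used here only for illustration.
function mulberry32(seed: number): Rng {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Only AI strategies have a factory; legacy strategies are deliberately absent.
const AI_FACTORIES: Partial<Record<MonteCarloStrategy, (rng: Rng) => SketchAiPlayer>> = {
  // Placeholder "greedy": always takes the first option.
  greedy: () => ({ chooseAction: (options) => options[0] }),
  // Placeholder "random": picks uniformly using the injected RNG.
  random: (rng) => ({
    chooseAction: (options) => options[Math.floor(rng() * options.length)],
  }),
};

function createAiPlayerSketch(
  strategy: MonteCarloStrategy,
  seed: string,
): SketchAiPlayer | null {
  const factory = AI_FACTORIES[strategy];
  if (!factory) return null; // legacy strategies keep the upfront-plan path
  // Derive the AI's RNG from a suffixed seed so it is reproducible per run.
  return factory(mulberry32(hashSeed(`${seed}-ai`)));
}
```

Returning null for names without a factory lets the caller fall through to the existing legacy path unchanged, which matches the behaviour the bullet list describes.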

CLI (scripts/monte-carlo.ts)

  • Adds --seeds (alias for --runs) and --maxTurns (alias for --max-turns)
  • Accepts greedy and random as valid --strategy values
  • Defaults: strategy → greedy, maxTurns → 25
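
The alias handling could look roughly like the hand-rolled parser below. It is a sketch under assumptions: `CliOptions`, `ALIASES`, and `parseCliSketch` are hypothetical names, the real scripts/monte-carlo.ts may use a different parser and accepts more flags (for example --seed-prefix), and the seeds default of 200 is taken from the issue text rather than from this summary.

```typescript
interface CliOptions {
  strategy: string;
  seeds: number;
  maxTurns: number;
}

// Each accepted spelling maps onto one canonical option name.
const ALIASES: Record<string, keyof CliOptions> = {
  '--seeds': 'seeds',
  '--runs': 'seeds',
  '--maxTurns': 'maxTurns',
  '--max-turns': 'maxTurns',
  '--strategy': 'strategy',
};

const VALID_STRATEGIES = ['market-greedy', 'demo-greedy', 'greedy', 'random'];

function parseCliSketch(argv: string[]): CliOptions {
  // Defaults: strategy and maxTurns per the PR summary, seeds per the issue.
  const opts: CliOptions = { strategy: 'greedy', seeds: 200, maxTurns: 25 };

  // Flags are expected as "--name value" pairs.
  for (let i = 0; i < argv.length; i += 2) {
    const key: keyof CliOptions | undefined = ALIASES[argv[i]];
    const value: string | undefined = argv[i + 1];
    if (key === undefined || value === undefined) {
      throw new Error(`Unknown or incomplete flag: ${argv[i]}`);
    }
    if (key === 'strategy') {
      if (!VALID_STRATEGIES.includes(value)) {
        throw new Error(`Unknown strategy: ${value}`);
      }
      opts.strategy = value;
    } else {
      const n = Number(value);
      if (!Number.isInteger(n) || n <= 0) {
        throw new Error(`Expected a positive integer for ${argv[i]}, got: ${value}`);
      }
      opts[key] = n;
    }
  }
  return opts;
}
```

A caller would typically pass `process.argv.slice(2)`; when both spellings of a flag are given, the later one wins in this sketch.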

npm script (package.json)

Updated monte-carlo to use new flag names and the greedy default:

tsx scripts/monte-carlo.ts --seeds 200 --seed-prefix mc-balance --maxTurns 25 --strategy greedy ...

CI guardrail (tests/main-street/monte-carlo-greedy-guardrail.test.ts)

New Vitest suite that fails the build when the greedy win rate falls outside 20–80% over 100 deterministic seeds on Medium difficulty. It also smoke-tests the random strategy for a valid output range.
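
The core of such a guardrail is a small bounds check on the aggregated win rate. The sketch below shows one way to express it as a pure function; `RunResult`, `winRate`, and `checkGuardrail` are hypothetical names, and treating the 20% and 80% bounds as inclusive is an assumption, since the summary does not say how boundary values are handled.

```typescript
interface RunResult {
  seed: string;
  won: boolean;
}

// Fraction of runs that ended in a win.
function winRate(runs: RunResult[]): number {
  if (runs.length === 0) throw new Error('Cannot compute a win rate from zero runs');
  return runs.filter((r) => r.won).length / runs.length;
}

interface GuardrailResult {
  ok: boolean;
  winRate: number;
  message: string;
}

// Bounds are inclusive here (assumption): exactly 20% or 80% still passes.
function checkGuardrail(runs: RunResult[], min = 0.2, max = 0.8): GuardrailResult {
  const rate = winRate(runs);
  const ok = rate >= min && rate <= max;
  const pct = (rate * 100).toFixed(1);
  const message = ok
    ? `Win rate ${pct}% is within ${min * 100}-${max * 100}%`
    : `Win rate ${pct}% is outside ${min * 100}-${max * 100}%`;
  return { ok, winRate: rate, message };
}
```

Inside a Vitest test, the result could then be asserted with something like `expect(result.ok, result.message).toBe(true)` so that a failing build reports the observed win rate.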

CI workflow (.github/workflows/deploy.yml)

Adds a monte-carlo job that runs the harness and uploads results/ (JSON + CSV) as a 30-day artifact. build-and-deploy now depends on this job.

Original prompt

This section details the original issue you should resolve

<issue_title>Main Street M3: Monte Carlo Harness Extension for AI Strategies</issue_title>
<issue_description>

Problem statement

Extend the Main Street Monte Carlo harness so it can run and evaluate the project AI strategies (greedy + random) at scale, produce JSON and CSV metrics, and gate CI with a configurable win‑rate guardrail. This will let designers and engineers validate economy and difficulty balance deterministically across many seeded runs.

Users

  • Designers: validate economic balance and difficulty targets using repeatable Monte Carlo results.
  • Engineers: run automated stability checks and performance regressions in CI and locally.
  • QA: reproduce failing seeds and verify fixes.

User stories

  • As a designer, I can run 200 seeded Monte Carlo runs with the greedy and random AI and see win rate and key metrics in JSON and CSV so I can validate balance changes.
  • As an engineer, I can invoke npm run monte-carlo -- --strategy <name> --seeds <n> --maxTurns <n> locally or in CI and have the pipeline fail if win rate falls outside the guardrail.
  • As QA, I can repro a single failing seed and inspect the run transcript/metrics to diagnose issues.

Success criteria

  • MonteCarlo harness accepts --strategy (default greedy), --seeds (default 200), and --maxTurns (default 25).
  • Output artifacts: a JSON file containing MonteCarloMetrics and a CSV export of runs; both written to the --out path when provided.
  • CI guardrail: a test that fails the build when greedy strategy win rate on Medium falls outside 20–80%.
  • Strategies wired: greedy and random strategies are discoverable and selectable by name by the harness.
  • Existing Monte Carlo and integration tests remain green after changes.
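
To make the two output artifacts concrete, here is a minimal sketch of serialising per-run rows to CSV alongside a JSON summary. It is not the project's existing exporter: `RunRow`, `csvCell`, `runsToCsv`, and `runsToJson` are hypothetical names, the columns are invented for illustration, and the real MonteCarloMetrics shape is not shown in this page.

```typescript
// Hypothetical per-run row; the real harness records its own fields.
interface RunRow {
  seed: string;
  won: boolean;
  turns: number;
}

// Quote a cell only when it contains a comma, quote, or line break.
function csvCell(value: string | number | boolean): string {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// One header line, then one line per run, newline-terminated.
function runsToCsv(rows: RunRow[]): string {
  const header = ['seed', 'won', 'turns'];
  const lines = rows.map((r) => [r.seed, r.won, r.turns].map(csvCell).join(','));
  return [header.join(','), ...lines].join('\n') + '\n';
}

// A compact JSON summary written next to the CSV.
function runsToJson(rows: RunRow[]): string {
  const wins = rows.filter((r) => r.won).length;
  return JSON.stringify({ runs: rows.length, wins, rows }, null, 2);
}
```

Both functions return strings, so writing them to the --out location is left to the caller (for example with Node's `fs.writeFileSync`).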

Constraints

  • Deterministic RNG/seed handling must be preserved so runs are reproducible.
  • Keep run-time reasonable in CI; default seeds=200 and maxTurns=25 are accepted unless changed.
  • Do not alter game mechanics as part of this task; only add harness wiring, CLI flags, and CI guardrail.
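
One way to check the reproducibility constraint is to run every seed twice and compare the results. The sketch below assumes seeds are named by appending an index to a prefix (an assumption; the page only shows a --seed-prefix flag) and uses hypothetical helpers `fnv1a`, `seedNames`, and `findNonReproducibleSeeds`; the `runSeed` callback stands in for the real harness call.

```typescript
// FNV-1a 32-bit string hash, used here only to fake a deterministic run.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Assumed naming scheme: "<prefix>-1", "<prefix>-2", ...
function seedNames(prefix: string, count: number): string[] {
  return Array.from({ length: count }, (_, i) => `${prefix}-${i + 1}`);
}

// Runs each seed twice and returns the seeds whose results differ.
function findNonReproducibleSeeds<T>(
  seeds: string[],
  runSeed: (seed: string) => T,
): string[] {
  return seeds.filter(
    (seed) => JSON.stringify(runSeed(seed)) !== JSON.stringify(runSeed(seed)),
  );
}

// Example: a fake run whose outcome depends only on the seed string.
const fakeDeterministicRun = (seed: string) => ({ won: fnv1a(seed) % 2 === 0 });
const flaky = findNonReproducibleSeeds(seedNames('mc-balance', 5), fakeDeterministicRun);
console.log(flaky.length === 0 ? 'all seeds reproducible' : `flaky seeds: ${flaky.join(', ')}`);
```

Because the comparison goes through `JSON.stringify`, it only suits results that serialise to plain data; any seed it reports is a candidate for the single-seed repro described in the QA user story.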

Existing state

  • Current harness: example-games/main-street/MainStreetMonteCarlo.ts implements runMonteCarlo and CSV exporter; scripts/monte-carlo.ts is a thin runner.
  • Tests: tests/main-street/monte-carlo-balance.test.ts and related integration tests exist and exercise the harness.
  • AI strategies: Random and Greedy strategies are implemented elsewhere in the Main Street example (per repo search and work-item references).
  • Worklog item: CG-0MMN8V9UU0MF2GHK already exists and is staged to idea and in-progress.

Desired change

  • Extend RunMonteCarloOptions/MonteCarloStrategy to include strategy names for greedy and random (if not already present).
  • Update scripts/monte-carlo.ts and npm script monte-carlo to accept --strategy, --seeds, --maxTurns, and --out.
  • Wire harness to import and call the corresponding AI strategy implementation via a name->factory map.
  • Add a CI test that executes the harness (with defaults) and fails the build when the greedy win rate falls outside 20–80%.
  • Ensure outputs (JSON + CSV) are produced and linked as job artifacts for debugging.

Related work

Potentially related docs

  • example-games/main-street/MainStreetMonteCarlo.ts — current harness implementation and CSV exporter.
  • scripts/monte-carlo.ts — CLI runner that invokes the harness.
  • example-games/main-street/MainStreetAiStrategy.ts — AI strategy implementations and strategy notes.
  • tests/main-street/monte-carlo-balance.test.ts — existing Monte Carlo balance test that will need to remain green.

Potentially related work items

  • Main Street M3: Monte Carlo Harness Extension for AI Strategies (CG-0MMN8V9UU0MF2GHK) — this intake (open, high).
  • Main Street M3: Random AI Strategy (CG-0MMN8UKE10G7GASW) — Random strategy implementation (completed).
  • Main Street: PRD Milestone 3 -- AI, Hints, and Undo System (CG-0MM4REQ4C01X8C08) — parent PRD work item with AI milestone context (in-progress).
  • Tests, Playtests & Monte Carlo Harness (CG-0MMJ8S9FI0E0Q9JP) — prior harness/test work (completed).

Notes / decisions captured from intake

  • Defaults confirmed: seeds=200, maxTurns=25, default strategy=greedy.
  • CI guardrail behavior: treat guardrail as a hard CI failure when win rate is outside 20–80% (requested).
  • Strategy availability: Random and Greedy strategies are present and will be imported into the harness by name.

Open questions

  1. Where should the CI job be defined/placed (existing ci/ workflow or a new workflow)? Suggested default: add to existing main CI workflow as a separate job that runs the harness and uploads artifacts.
  2. Are there required artifact retention or naming conventions for the JSO...


…rt and CI guardrail

- Add 'greedy' and 'random' to MonteCarloStrategy type in MainStreetMonteCarlo.ts
- Wire GreedyStrategy/RandomStrategy from MainStreetAiStrategy.ts via createAiPlayerForStrategy()
- Update runSeed() to use MainStreetAiPlayer for AI strategies (preserving legacy path)
- Add --seeds/--maxTurns CLI aliases in scripts/monte-carlo.ts; update defaults
- Update package.json monte-carlo script to use --strategy greedy
- Add CI guardrail test (20-80% win rate for greedy strategy)
- Add monte-carlo CI job with JSON/CSV artifact upload to deploy.yml

Co-authored-by: SorraTheOrc <250240+SorraTheOrc@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Extend Main Street Monte Carlo harness for AI strategies" to "feat(monte-carlo): wire greedy/random AI strategies into harness with CLI flags and CI guardrail" Mar 13, 2026
Copilot AI requested a review from SorraTheOrc March 13, 2026 08:31
SorraTheOrc marked this pull request as ready for review March 13, 2026 08:55
SorraTheOrc merged commit f3af405 into main Mar 13, 2026
SorraTheOrc deleted the copilot/extend-monte-carlo-harness-ai-strategies branch March 13, 2026 08:55

Development

Successfully merging this pull request may close these issues.

Main Street M3: Monte Carlo Harness Extension for AI Strategies

2 participants