Wire #13 infrastructure into game loop + Docker e2e #16
EventLog emits at every run_tick() stage (7 event types, thread-safe). CostTracker records token usage from agent._last_usage after each deliberation. ActionResolver returns ResolverStats for conflict counting. MetricContext populated with conflicts + CentralPost message tracking. DEFAULT_WEIGHTS redistributed (proportional 20% cut) to activate coordination (0.12) and communication_efficiency (0.08). All 6 scenario YAMLs updated. --ablation runs 8 benchmark passes (full + 7 agent-removed). Weave tracing optional on _call_provider via enable_weave(). Docker entrypoint CRLF fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
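To illustrate what "thread-safe" emission could look like, here is a minimal sketch of an append-only event log guarded by a lock. It is not the PR's actual EventLog: the `Event` fields, the `emit`/`snapshot` method names, and the example event kind are all assumptions made for illustration.

```python
import threading
import time
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Event:
    """One log entry. Field names are hypothetical, not taken from the PR."""
    tick: int
    kind: str  # e.g. "agent_deliberated" (illustrative event type)
    payload: dict[str, Any] = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)


class EventLog:
    """Append-only log; a lock serializes writers so concurrent emits are safe."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: list[Event] = []

    def emit(self, tick: int, kind: str, **payload: Any) -> Event:
        # Build the event outside the lock, hold the lock only for the append.
        event = Event(tick=tick, kind=kind, payload=payload)
        with self._lock:
            self._events.append(event)
        return event

    def snapshot(self) -> list[Event]:
        # Return a copy so readers never iterate a list that is being mutated.
        with self._lock:
            return list(self._events)
```

Returning a copy from `snapshot()` keeps readers (for example a metrics pass at the end of a tick) from observing the list mid-append.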
Entrypoint now seeds ModsConfig.xml (Harmony → Core → Royalty → HeadlessRim → RIMAPI), links Workshop mods from SteamCMD download, replaces game Mods/ dir with merged mods symlink. Runs as root for setup then drops to rimworld user via su. Dockerfile strips CRLF from entrypoint.sh, extends healthcheck for longer startup. docker-compose.yml drops :ro on game volume (entrypoint needs to swap Mods dir). Tested: RIMAPI responds inside container with patched HeadlessRimPatch (PR IlyaChichkov/HeadlessRimPatch#6). IPv6 loopback binding blocks external access through Docker port forwarding; that needs a separate RIMAPI fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mono's HttpListener binds to [::1] regardless of config. Docker port forwarding can't reach loopback. socat bridges 0.0.0.0:8765 (IPv4) to [::1]:8765 (where RIMAPI actually listens). Tested: curl from host gets HTTP 200 JSON response through Docker port mapping. Full e2e validated with patched HeadlessRimPatch (no autoplay) + socat bridge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
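The bridge in this commit is socat. To show what such a bridge does, the sketch below does the same job with Python's asyncio instead: it accepts connections on one address and relays bytes in both directions to another. The function names `start_bridge` and `main` are hypothetical, and this is a stand-in for illustration, not what the container runs.

```python
import asyncio


async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes one way until EOF or a socket error, then close the destination."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except OSError:
        pass  # peer went away; fall through to close
    finally:
        writer.close()


async def start_bridge(listen_host: str, listen_port: int,
                       target_host: str, target_port: int) -> asyncio.AbstractServer:
    """Listen on (listen_host, listen_port) and relay each connection to the target."""

    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(target_host, target_port)
        except OSError:
            client_writer.close()  # target not up yet; drop this client
            return
        # Relay both directions concurrently until each side reaches EOF.
        await asyncio.gather(
            _pipe(client_reader, up_writer),
            _pipe(up_reader, client_writer),
        )

    return await asyncio.start_server(handle, listen_host, listen_port)


async def main() -> None:
    # Same mapping as the commit: IPv4 wildcard in, IPv6 loopback out.
    server = await start_bridge("0.0.0.0", 8765, "::1", 8765)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(main())` inside the container would expose the IPv6-loopback listener on the IPv4 wildcard address, which is the address Docker port forwarding can reach.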
Code Review

Docker & Infrastructure
CRITICAL: Container runs as root
Socat bridge has no supervision
Healthcheck allows 30-minute wait
CRLF fix is good ✓

Orchestration & Scoring
MetricContext wiring: correct ✓
EventLog thread safety: correct ✓
Weight redistribution: correct ✓

Agents & Scripts
Ablation logic: correct ✓
Weave tracing: truly optional ✓
Gap: ablation + save load failure
Gap:

Tests
Good coverage on new features ✓
Gap: scenario YAML weight sums not tested

Action Items
Core instrumentation work is solid: thread safety, metric wiring, and weight math are all correct. Main concerns are Docker security and the ablation save-load gap.
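The weight math and the untested weight-sum gap can both be made concrete with a short sketch. It assumes "proportional 20% cut" means every existing weight is scaled by 0.8 and the freed 0.20 goes to coordination (0.12) and communication_efficiency (0.08). The pre-change metric names and values are hypothetical, as are the function names.

```python
import math

# Hypothetical pre-change weights; the real DEFAULT_WEIGHTS keys are not shown in the PR.
OLD_WEIGHTS = {"survival": 0.40, "efficiency": 0.35, "wellbeing": 0.25}

# Values for the two newly activated metrics, as stated in the commit message.
ACTIVATED = {"coordination": 0.12, "communication_efficiency": 0.08}


def redistribute(old: dict[str, float], activated: dict[str, float]) -> dict[str, float]:
    """Scale existing weights proportionally to make room for the activated ones."""
    freed = sum(activated.values())   # 0.20 in this example
    scale = (1.0 - freed) / sum(old.values())
    new = {name: weight * scale for name, weight in old.items()}
    new.update(activated)
    return new


def check_weights_sum_to_one(weights: dict[str, float], tol: float = 1e-9) -> None:
    """Raise if the weights do not sum to 1.0 within a small tolerance."""
    total = sum(weights.values())
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"weights sum to {total}, expected 1.0")


new_weights = redistribute(OLD_WEIGHTS, ACTIVATED)
check_weights_sum_to_one(new_weights)
```

A test for the scenario files could load each YAML's weight mapping and pass it through the same check, which would close the gap noted in the review.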
Fixes Applied
All review items addressed in
Docker security & reliability
Ablation save-load guard
Test coverage
Verification
Docker e2e Validated
Rebuilt image with all fixes, ran crashlanded scenario (10 ticks, Nemotron 120B paid via OpenRouter):
Stats: 95K tokens, 68 LLM calls, $0.03, 900s wall time
Entrypoint fixes confirmed working:
Ship it.
Summary
--ablation runs 8 benchmark passes (full + 7 agent-removed).
_call_provider().
Test plan
Closes #13
🤖 Generated with Claude Code