
docs(otel): Phase-2 port-audit follow-up after bundles #1 + #2 #144

Merged
slabgorb merged 9 commits into main from docs/otel-phase-2-port-audit
Apr 25, 2026

Conversation

@slabgorb
Owner

Summary

Discovered-state correction for the original Phase-2 emission-rollouts handoff after shipping bundles #1 (state-patch) and #2 (NPC) in the server repo (slabgorb/sidequest-server#49, #50).

The original handoff assumed that all ~26 `SPAN_*` families had live OTEL spans opened in Python production code. The audit found that the Python port (ADR-082) replaced most Rust `info_span!`/`start_as_current_span` emissions with `logger.info(...)` strings or direct `_watcher_publish` calls. The `SPAN_*` constants in `spans.py` are mostly name-string prefixes that are byte-identical to their Rust counterparts, not actual span openings.
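To make the distinction between a defined name constant and an opened span concrete, here is a minimal sketch of how such an audit could be approximated with the standard library. It is an illustration under stated assumptions, not the audit that was actually run: the constant values, the `tracer` object, and the call sites are hypothetical stand-ins (only the `SPAN_*` naming, `start_as_current_span`, `logger.info`, and `_watcher_publish` come from the description above).

```python
import re

# Hypothetical miniature of spans.py; the real constant values are not shown in this PR.
SPANS_PY = '''
SPAN_STATE_PATCH = "state_patch"
SPAN_NPC_UPDATE = "npc.update"
SPAN_CONTENT_RESOLVE = "content.resolve"
'''

# Hypothetical production call sites: one real span opening, one log string,
# one direct publish call.
CALL_SITES = '''
with tracer.start_as_current_span(SPAN_STATE_PATCH):
    apply_patch()
logger.info("npc.update name=%s", name)
_watcher_publish("content.resolve", payload)
'''

# A constant counts as "defined" if it is assigned at the start of a line.
DEFINED = re.compile(r"^(SPAN_[A-Z0-9_]+)\s*=", re.MULTILINE)
# A constant counts as "live" only if it is passed directly to start_as_current_span.
OPENED = re.compile(r"start_as_current_span\(\s*(SPAN_[A-Z0-9_]+)")


def audit_spans(definitions: str, sources: str) -> dict[str, list[str]]:
    """Split defined SPAN_* names into those opened as spans and those never opened."""
    defined = set(DEFINED.findall(definitions))
    opened = set(OPENED.findall(sources))
    return {
        "live": sorted(defined & opened),
        "dead": sorted(defined - opened),
    }


report = audit_spans(SPANS_PY, CALL_SITES)
print(report)
```

A text-level scan like this misses spans opened through a helper or an alias, so its "dead" list is only a starting point for manual review.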

This follow-up:

Files changed

  • New: `docs/superpowers/plans/2026-04-25-otel-phase-2-port-audit-followup.md` (1 file, +174 lines)

Test plan

🤖 Generated with Claude Code

slabgorb and others added 9 commits April 25, 2026 04:07
Design for the proper architectural fix to per-session GameSnapshot
divergence in multiplayer. Replaces the band-aid
_merge_peer_state_into_snapshot helper with a single canonical snapshot
held on SessionRoom and shared by every WS session bound to the slug.
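The shape of that design can be sketched as follows. This is a hedged illustration only: `SessionRoom` and `GameSnapshot` are named in the commit message, but the fields, the `WsSession` class, and the `snapshot` property are assumptions made for the example and are not taken from the spec.

```python
from dataclasses import dataclass, field


@dataclass
class GameSnapshot:
    # Hypothetical fields; the real GameSnapshot schema is not shown here.
    turn: int = 0
    flags: dict[str, bool] = field(default_factory=dict)


@dataclass
class SessionRoom:
    slug: str
    # The single canonical snapshot for every session bound to this slug.
    snapshot: GameSnapshot = field(default_factory=GameSnapshot)


@dataclass
class WsSession:
    # Hypothetical per-connection object: it holds a reference to the room
    # instead of a private copy of the snapshot.
    player_id: str
    room: SessionRoom

    @property
    def snapshot(self) -> GameSnapshot:
        return self.room.snapshot


room = SessionRoom(slug="demo")
alice = WsSession("alice", room)
bob = WsSession("bob", room)

# A change made through one session is visible through the other,
# so there is nothing left to merge between peers.
alice.snapshot.turn += 1
print(bob.snapshot.turn)
```

Because every session reads through the room, per-session divergence cannot arise and a merge helper has nothing to reconcile.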

Constraint that simplifies scope: no saved MP games exist on disk
(multiplayer has never worked end-to-end). No migration path needed.
Band-aid + its 5 merge tests are deleted in the same change.

Out of scope: ADR-028 LLM rewrites, per-recipient narration region
filtering, PlayerState overlay struct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…port)

Step-by-step plan implementing the spec at
docs/superpowers/specs/2026-04-25-shared-room-snapshot-design.md.
9 tasks, 8 commits, ends at 739 passed / 2 skipped on the server +
agents sweep. Sequential single-developer plan suitable for inline
execution as the Bicycle Repair Man dev agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Forensic audit found four root causes of the OTEL dashboard regression
since the Rust→Python port: broken `just otel` recipe, ~80% dead SPAN_*
constants, missing Layer-3 narrative validator, and impoverished
translator. Design specifies a four-phase faithful port that restores
full parity with the Rust contract.

Approved interactively via /superpowers:brainstorming. Self-review pass
fixed an emit-double on json_extraction_result (translator owns; not
the validator) and reclassified SPAN_CONTENT_RESOLVE to FLAT_ONLY due
to volume.

Next: writing-plans for implementation plan.
- Update `just otel` to call `playtest_dashboard.py` via `uv run` instead of
  the deleted `playtest.py --dashboard-only`
- Add `__main__` entry point with argparse to `playtest_dashboard.py`
  (it was a library-only module with no CLI entry point)
- Add `websockets>=12.0` and `rich>=13.0` to orchestrator `pyproject.toml`
  (required by playtest_dashboard.py, previously missing from manifest)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add observability tag, replace stale Rust-only opening blockquote, and
append Python-port note pointing at the canonical telemetry implementation
files in sidequest-server.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pairs with the design spec at
docs/superpowers/specs/2026-04-25-otel-dashboard-restoration-design.md.
This is the 25-task plan (Phase 0–4) executed by the OTEL feat branches
across orchestrator, sidequest-server, and sidequest-ui.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered-state correction for the original Phase-2 emission-rollouts
handoff: the Python port (ADR-082) replaced most Rust info_span!/
start_as_current_span calls with logger.info(...) strings or direct
_watcher_publish calls. The SPAN_* constants in spans.py are mostly
byte-identical-to-Rust name-string prefixes, not actual span openings.

Records:
- Which families are genuinely live in Python (the ones that ship now).
- Which ~7 bundles in the original handoff are port-dead and require
  engine porting before routing.
- The two real Phase-2 workstreams: migrate live _watcher_publish sites
  (Workstream A — what bundles #49 and #50 did), vs. port the missing
  engines (Workstream B — out of scope for telemetry restoration).
- Suggested next bundle picks from Workstream A.
- Open question on whether to bridge port gaps via log-string parsing
  (recommended: no — Illusionism violation per ADR-031).

Closes the loop after PRs slabgorb/sidequest-server#49 and #50.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@slabgorb slabgorb merged commit 4ab175f into main Apr 25, 2026
1 check passed
@slabgorb slabgorb deleted the docs/otel-phase-2-port-audit branch April 25, 2026 15:08