feat(probes): service_loop turn boundaries + phase decomposition (#151 sprinkle 2) #1537
Merged
joelteply merged 1 commit on Jun 6, 2026
…tion (#151 sprinkle 2)

Wires probes at the per-persona service-loop turn boundaries so an operator can reconstruct every turn as a complete record (start → spoke/silent/error → phase timings). Each probe is one line at the seam; none changes behavior.

Per Joel's RTOS-debugger framing `[[jtag-probes-are-rtos-debugger]]`: "timing of anything, so we can hunt down bottlenecks." The `persona.turn.spoke` probe carries the full 5-phase decomposition (recall_ms + admit_ms + compose_ms + respond_ms + say_ms) so a single `jq` query over the JSONL log surfaces which phase is dominating wall-clock without sprinkling N separate `time_sync!` blocks.

## What gets wired (4 new probes)

- `persona.turn.start`: turn entry at the airc boundary (after self-loop and pre-watermark filters, before any cognition call). Fields: persona / persona_id / room_id / lamport / peer_id / text_len. Pair with `persona.turn.spoke`/`silent`/`error` (same lamport) for the complete turn record.
- `persona.turn.spoke`: turn completed successfully with full phase decomposition: response_len / turn_duration_ms / recall_ms / admit_ms / compose_ms / respond_ms / say_ms. The bottleneck-hunting workhorse: one probe per turn, every phase in one line.
- `persona.turn.silent`: persona's own cognitive output was Silence. Fields: reason (verbatim from `PersonaResponse::Silent`) + lamport + persona. Critical for distinguishing "this persona chose silence for THIS message" (cognition signal) from "no persona is responding to anything" (probably a `cognition.analyze.parse` issue; see prior PR).
- `persona.turn.error`: turn failed somewhere downstream of the cognition entry. The `stage = "respond" | "say"` field distinguishes a cognition-cycle failure (inside `respond_inner`) from an airc-publish failure (after cognition succeeded). Also carries the error string + lamport for cross-referencing the upstream traces.
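To make the phase decomposition concrete, here is a minimal sketch of how a consumer of the JSONL log might pick the dominant phase out of one `persona.turn.spoke` record. The `PhaseTimings` struct, its `u64` field types, and the `dominant` / `total_ms` helpers are assumptions for illustration; only the five field names come from the probe description above.

```rust
/// Hypothetical mirror of the five timing fields carried by `persona.turn.spoke`.
/// The u64 millisecond type is an assumption; only the field names are from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PhaseTimings {
    recall_ms: u64,
    admit_ms: u64,
    compose_ms: u64,
    respond_ms: u64,
    say_ms: u64,
}

impl PhaseTimings {
    /// Returns the (field name, milliseconds) pair of the slowest phase.
    /// Ties resolve to the earliest phase in the order listed below.
    fn dominant(&self) -> (&'static str, u64) {
        let phases = [
            ("recall_ms", self.recall_ms),
            ("admit_ms", self.admit_ms),
            ("compose_ms", self.compose_ms),
            ("respond_ms", self.respond_ms),
            ("say_ms", self.say_ms),
        ];
        let mut best = phases[0];
        for phase in phases.iter().skip(1) {
            // Strictly greater, so an earlier phase wins a tie.
            if phase.1 > best.1 {
                best = *phase;
            }
        }
        best
    }

    /// Sum of the five phase timings (not necessarily equal to turn_duration_ms).
    fn total_ms(&self) -> u64 {
        self.recall_ms + self.admit_ms + self.compose_ms + self.respond_ms + self.say_ms
    }
}

fn main() {
    // Made-up numbers for one turn.
    let turn = PhaseTimings {
        recall_ms: 12,
        admit_ms: 3,
        compose_ms: 40,
        respond_ms: 910,
        say_ms: 25,
    };
    let (name, ms) = turn.dominant();
    println!("dominant phase: {name} ({ms} ms of {} ms summed)", turn.total_ms());
}
```

Because every phase rides on the same probe line, this kind of comparison needs no join across records; pairing with `persona.turn.start` by lamport is only needed when the entry-side fields matter.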
## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md` checklist now shows `persona/service_loop.rs::serve_persona_loop_inner` as wired. Remaining open items (`prompt_assembly` standalone, `score_persona` re-wire, `llama_cpp_adapter` per-batch) are noted as "covered indirectly" by the existing probes; re-evaluate if a real debugging session needs deeper visibility there.

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: RTOS breakpoints with variable inspection + timing. One probe per seam.
- `[[no-rust-gates-around-cognition]]`: every probe in this slice is read-only state capture; none changes flow.

card: `b390a821`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (under review)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
joelteply added a commit that referenced this pull request on Jun 6, 2026
#151 sprinkle 3)

Per Joel 2026-06-06: the JTAG probes are "fundamentally something we'd like to make convenient to use to debug this massively complex machine." Exact-match-only filtering hits its ergonomic ceiling fast: `CONTINUUM_PROBE_CLASSES=persona.turn.start,persona.turn.silent,persona.turn.spoke,persona.response.render.prompt,persona.response.render.raw,…` is the kind of incantation that stops getting typed correctly.

This slice replaces the exact-match HashSet lookup with a three-rule matcher that mirrors `tracing_subscriber::EnvFilter`'s shape, the shape every Rust dev already knows from `RUST_LOG`. Same env vars, same sink layer, dramatically shorter operator commands.

## What ships

`routing::probe_file_sink::class_passes_filter`: pure helper, unit-testable in isolation, used by the Layer's `on_event` path. Rules in priority order:

1. **Empty filter set** = "no filter configured." Every class passes. (`CONTINUUM_PROBE_CLASSES` unset.)
2. **`*` is in the set** = explicit "match every class" wildcard. The firehose, distinct from rule 1 in intent per `[[no-fallbacks-ever]]`: empty is "I didn't configure", `*` is "I deliberately want everything."
3. **Exact OR namespace prefix.** `C == F` (exact) or `C.starts_with(F + ".")` (namespace prefix). The `.` guard prevents `persona` from accidentally matching `personality.x`. Same convention as RUST_LOG.
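The three rules above can be sketched as a pure function. This is a standalone illustration, not the code from `routing::probe_file_sink`: the signature (`&HashSet<String>`, `&str`) and the `set` helper are assumptions, and the prefix check uses `strip_prefix` instead of building `F + "."` so it avoids an allocation per comparison.

```rust
use std::collections::HashSet;

/// Sketch of the three-rule class filter. The signature is an assumption.
fn class_passes_filter(filter: &HashSet<String>, class: &str) -> bool {
    // Rule 1: an empty set means no filter was configured, so everything passes.
    if filter.is_empty() {
        return true;
    }
    // Rule 2: an explicit `*` entry deliberately matches every class.
    if filter.contains("*") {
        return true;
    }
    // Rule 3: exact match, or namespace prefix followed by a `.` separator.
    filter.iter().any(|entry| {
        class == entry.as_str()
            || class
                .strip_prefix(entry.as_str())
                .is_some_and(|rest| rest.starts_with('.'))
    })
}

/// Hypothetical helper for building a filter set from string literals.
fn set(items: &[&str]) -> HashSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

fn main() {
    let filter = set(&["persona", "cognition.analyze.parse"]);
    for class in [
        "persona.turn.spoke",
        "personality.x",
        "cognition.analyze.parse",
        "cognition.analyze.enter",
    ] {
        println!("{class}: {}", class_passes_filter(&filter, class));
    }
}
```

The dot guard lives in the `rest.starts_with('.')` check: `personality.x` does start with `persona`, but the remainder is `lity.x`, so it is rejected.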
## Operator-side wins

Before:

```bash
CONTINUUM_PROBE_CLASSES=persona.turn.start,persona.turn.spoke,persona.turn.silent,persona.turn.error,persona.response.enter,persona.response.analyze.result,persona.response.render.prompt,persona.response.render.raw,persona.response.exit.spoke,cognition.analyze.enter,cognition.analyze.noop_single_specialty,cognition.analyze.cache_hit,cognition.analyze.parse
```

After:

```bash
CONTINUUM_PROBE_CLASSES=persona,cognition                      # same coverage, 17-char value
CONTINUUM_PROBE_CLASSES=*                                      # firehose
CONTINUUM_PROBE_CLASSES=persona.turn,cognition.analyze.parse   # exact + namespace, mixed
```

## Tests (+4 new, on top of the existing 5)

`routing::probe_file_sink::tests`:

- `class_filter_namespace_prefix_matches_subclasses`: `persona` prefix matches `persona.turn.spoke` AND `persona.response.render.prompt` but NOT `personality.something` or `cognition.analyze.parse`. Pins the dot guard.
- `class_filter_wildcard_matches_every_class`: `*` captures every class regardless of name.
- `class_filter_combines_exact_and_prefix_in_one_set`: the realistic operator pattern (one specific class + one namespace prefix) works in the same HashSet without rule contention.
- `class_passes_filter_pure_function_unit_tests`: direct unit tests on the helper covering all three rules + the dot guard + the literal-prefix-as-exact edge case. Future refactors of the per-event Layer can't drift the matching contract without breaking this pin.

The existing `class_filter_drops_unallowed_classes` test still passes (exact match is a degenerate case of rule 3).

## Manual + README updated

- `docs/architecture/RTOS-DEBUGGER-PROBES.md`: "How to enable + read" section rewritten with the three-rule spec + example invocations + explicit `*` semantics.
- `README.md`: the "Debugging this substrate" section now leads with the short prefix form (`CONTINUUM_PROBE_CLASSES=persona,cognition`) instead of the long comma-separated list.
  First impression for any new contributor / agent matches the actual usability.

## Why this matters

Joel's framing: the probes are the substrate's debugger for a massively complex machine. A debugger people don't type because the syntax is painful isn't a debugger. Prefix matching is the single cheapest change that takes the JTAG from "in principle usable" to "actually used in every debug session."

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: "easy one liners or it won't happen." Prefix matching shrinks the operator-side line by an order of magnitude.
- `[[observability-is-half-the-architecture]]`: same Layer shape, zero new infrastructure, just better operator UX.
- `[[no-fallbacks-ever]]`: empty filter vs `*` are distinct intents with distinct names. The substrate doesn't silently synthesize `*` from "no env var set."

card: `7d286195`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (merged)
sprinkle 2: #1537 (in review)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
joelteply added a commit that referenced this pull request on Jun 6, 2026
…ne seam (#151 sprinkle 4)

Two beats in one slice, and the second is here because the first exposed the brittleness of the design Joel had already flagged.

## Beat 1: server actually installs the sink

Per Joel 2026-06-06: "Let's perfect debugging as we use it." The `JsonlProbeFileSink` existed and the env vars were documented, but `continuum-core-server`'s `main.rs` was installing a bespoke `FmtSubscriber`, so setting `CONTINUUM_PROBE_FILE` had no effect. The JTAG hardware was unwired.

New helper `routing::install_probe_tracing(config)` composes the substrate-canonical stack in one call:

1. `UriCaptureLayer`: URI ancestry for every probe
2. `ProbeRouterLayer`: in-process broadcast for `debug/probes/{class}/stream` consumers
3. `JsonlProbeFileSink`: disk-persisted JSONL log when `config.probe_file` is set
4. `tracing_subscriber::fmt::layer()`: stderr text logs governed by `RUST_LOG` (falls back to `config.default_filter`)

`main.rs` swaps to the helper. Server logs `probes landing at <path>` at boot when the file sink is configured: the operator-side proof that the env var took effect.

## Beat 2: env-coupling collapsed to ONE seam

Joel 2026-06-06: "That's why env vars are problematic - they are brittle." Cargo's parallel test runner raced two `install_probe_tracing` tests that mutated `CONTINUUM_PROBE_FILE` across threads; `std::env::set_var` is process-global mutable state (and is `unsafe` as of the Rust 2024 edition precisely because of this class of bug). Operators hit the same class of brittleness when env-var inheritance varies across subprocesses or shell contexts.

This commit fixes the brittleness at the design layer instead of working around it in tests:

### `ProbeTracingConfig { probe_file, probe_classes, default_filter }` (new)

Typed boot configuration. `install_probe_tracing` takes ONLY typed values: no env access inside the library function.

### `ProbeTracingConfig::from_env(default_filter)`: THE env seam

The ONE function that touches `std::env`.
Reads `CONTINUUM_PROBE_FILE` + `CONTINUUM_PROBE_CLASSES` (the operator-facing env names stay verbatim) into a typed config. Future alternative sources (config file, CLI flags, hardcoded defaults) become additional constructors on `ProbeTracingConfig` without rippling into the install function.

### Consequences

- **`main.rs`**: `install_probe_tracing(ProbeTracingConfig::from_env("info"))?` makes the env-coupling explicit at the call site, not buried in the library.
- **Tests**: construct `ProbeTracingConfig { probe_file: Some(temp.path()), ... }` directly: zero env mutation, fully parallel-safe. The previous merged-sequential test from the first revision of this slice is now split back into TWO parallel `#[test]` functions.
- **Operator UX**: unchanged. Same `CONTINUUM_PROBE_FILE` + `CONTINUUM_PROBE_CLASSES` env vars, same prefix-match filter rules from PR #1538.

## Tests (+3)

`routing::tracing_init::tests`:

- `install_is_idempotent_with_no_disk_capture`: double-call is safe via `try_init`; typed config means no env mutation; runs parallel-safely against any other test.
- `install_surfaces_open_failed_for_unwritable_path`: typed `ProbeFileSinkError::OpenFailed` surfaces per `[[no-fallbacks-ever]]`; bad path passed as typed value, no env var racing.
- `from_env_reads_documented_env_vars`: pins both populated and empty paths through the env constructor; the ONE test that touches `std::env` (scoped so the brittleness can't leak).

## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md`: "How to enable + read" section now describes the typed-config split:

> The installer takes a typed `ProbeTracingConfig`, NOT env vars
> directly. Env coupling lives at exactly one seam:
> `ProbeTracingConfig::from_env(default_filter)`. This keeps the
> library function parallel-test-safe (no `std::env::set_var`
> racing under `cargo test`) and puts every config source (env,
> file, CLI flags, hardcoded) on equal footing.
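A minimal sketch of the typed-config split described above, using only the standard library. The three field names, `from_env`, and the two env var names come from this commit message; the field types, the comma-splitting and trimming, the treatment of an empty value as unset, and the `from_lookup` constructor are all assumptions added for illustration.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Typed boot configuration. Field types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ProbeTracingConfig {
    probe_file: Option<PathBuf>,
    probe_classes: HashSet<String>,
    default_filter: String,
}

impl ProbeTracingConfig {
    /// Hypothetical constructor (not named in the PR): builds a config from
    /// any key -> value lookup, so tests never have to mutate the process env.
    fn from_lookup(default_filter: &str, lookup: impl Fn(&str) -> Option<String>) -> Self {
        // Assumption: an empty value is treated the same as an unset variable.
        let probe_file = lookup("CONTINUUM_PROBE_FILE")
            .filter(|path| !path.is_empty())
            .map(PathBuf::from);
        // Assumption: classes are comma-separated and surrounding spaces are ignored.
        let probe_classes: HashSet<String> = lookup("CONTINUUM_PROBE_CLASSES")
            .unwrap_or_default()
            .split(',')
            .map(str::trim)
            .filter(|class| !class.is_empty())
            .map(String::from)
            .collect();
        Self {
            probe_file,
            probe_classes,
            default_filter: default_filter.to_string(),
        }
    }

    /// The single env seam: the only place that reads `std::env`.
    fn from_env(default_filter: &str) -> Self {
        Self::from_lookup(default_filter, |key| std::env::var(key).ok())
    }
}

fn main() {
    // Binary entry point: env coupling is visible at the call site.
    let config = ProbeTracingConfig::from_env("info");
    println!("{config:?}");
}
```

Under this shape, a test can pass a closure over a fixed set of key/value pairs and run in parallel with any other test, since nothing calls `std::env::set_var`.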
## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: debugger must be ON in real binaries, not just tests, to debug real problems.
- `[[observability-is-half-the-architecture]]`: same layers, same order, every entry point.
- `[[no-fallbacks-ever]]`: typed-error surfacing (env-var-unset ≠ path-unwritable; the two distinct intents stay distinct).
- The new lesson: process-global mutable state belongs at ONE seam. Library functions take typed values. The brittleness Joel named showed up first in our own tests; the design fix removes the class of bug rather than masking it.

card: `305c8fb9`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (merged)
sprinkle 2: #1537 (merged)
sprinkle 3: #1538 (in review)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
joelteply added a commit that referenced this pull request on Jun 6, 2026
…v coupling at one seam (#151) (#1538)

* feat(probes): namespace-prefix + wildcard filter in JsonlProbeFileSink (#151 sprinkle 3)
* feat(probes): boot wiring + typed-config refactor - env coupling at one seam (#151 sprinkle 4)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Follow-up to PR #1535 (foundation) + #1536 (sprinkle 1), both merged. Wires probes at the per-persona service-loop turn boundaries so an operator can reconstruct every turn as a complete record (start → spoke/silent/error) and find phase-level bottlenecks from one JSONL log.

Per Joel's RTOS-debugger framing `[[jtag-probes-are-rtos-debugger]]`: "timing of anything, so we can hunt down bottlenecks." The `persona.turn.spoke` probe carries the full 5-phase decomposition (`recall_ms` + `admit_ms` + `compose_ms` + `respond_ms` + `say_ms`) so one `jq` query surfaces which phase is dominating wall-clock without N separate timing blocks.

## What gets wired (4 new probes)

- `persona.turn.start`: turn entry at the airc boundary (after self-loop / pre-watermark filters, before any cognition call). Fields: persona / persona_id / room_id / lamport / peer_id / text_len. Pair with the spoke/silent/error variants (same lamport) for the complete turn record.
- `persona.turn.spoke`: turn completed successfully with FULL phase decomposition: response_len / turn_duration_ms / recall_ms / admit_ms / compose_ms / respond_ms / say_ms. The bottleneck-hunting workhorse.
- `persona.turn.silent`: persona's own cognitive output was Silence (`PersonaResponse::Silent` from the canonical cycle). Fields: reason (verbatim from the brain) + lamport + persona.
- `persona.turn.error`: turn failed downstream of cognition entry. `stage = "respond" | "say"` distinguishes a cognition-cycle failure from an airc-publish failure.

## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md` checklist now shows `persona/service_loop.rs::serve_persona_loop_inner` as wired. Remaining open items (`prompt_assembly` standalone, `score_persona` re-wire, `llama_cpp_adapter` per-batch) are noted as covered indirectly by the existing probes; re-evaluate when a real debug session needs deeper visibility there.

## Build / tests

Build clean with `--features metal`. Sink tests 5/5 pass (no new infrastructure; these are call sites against the existing `JsonlProbeFileSink`). The probes are operationally exercised by any scenario that runs `serve_persona_loop` with `CONTINUUM_PROBE_FILE` set.

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: RTOS breakpoints with variable inspection + timing.
- `[[no-rust-gates-around-cognition]]`: every probe in this slice is read-only state capture; none changes flow.

card: `b390a821`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (merged)