feat(probes): sprinkle LLM-boundary visibility into persona::response + cognition::shared_analysis (#151) #1536
Merged
joelteply merged 1 commit on Jun 6, 2026
… + cognition::shared_analysis (#151 sprinkle 1)

Wires probes into the seams Joel named as the diagnostic target: "we what's going in and out of an LLM or its subconscious processes, where they're slow or incorrect."

Probes are non-blocking breakpoints with variable inspection per `[[jtag-probes-are-rtos-debugger]]`. Each call is one line at a seam; the macro shape `probe!(class = "...", field = val, "msg")` mirrors `tracing::info!`, so adding one stays cheap. None of these change behavior: they observe state, they don't decide it.

## What gets wired (9 new probes)

### `persona::response::respond_inner` (3 probes)

- `persona.response.enter`: turn entry. persona / persona_id / room_id / message_id / message_text_len / history_count / known_specialties / media_count / recalled_engrams. Pair with `persona.response.exit.spoke` (same message_id) for a complete turn record.
- `persona.response.analyze.result`: what the analyze stage gave THIS persona. from_cache / model_used / analyze_duration_ms / suggested_angles_count / matched_angle_present + len / intent. Critical signal: a tiny model returning non-empty angles for every specialty on every message is the echo-storm root cause.
- `persona.response.exit.spoke`: final answer. visible_text + len / think_blocks / model_used / total_ms / inference_ms.

### `persona::response::run_render` (2 probes)

- `persona.response.render.prompt`: **what's going INTO the LLM**. Captures `system_message` verbatim plus message_count, estimated_tokens, matched_angle_present, engrams_count, history_count. The single most informative snapshot for "why did the model produce that response?", because the prompt drives the output.
- `persona.response.render.raw`: **what came OUT of the LLM**. Captures `raw_response.text` verbatim before any post-process (think-block strip, leaked-markup strip). Pair with `.render.prompt` to reconstruct the exact model contract for every turn.

### `cognition::shared_analysis::analyze` (4 probes, the "subconscious" stage)

- `cognition.analyze.enter`: input fingerprint at the entry to the shared-analysis verb. Correlates N personas hitting the single-flight cache via the same message_id + cache_key.
- `cognition.analyze.noop_single_specialty`: short-circuit fired (single-specialty room → empty suggested_angles, no LLM call). Distinguishes "I chose silence because no angle matched" from "I chose silence because analyze didn't run"; the two look identical downstream without this signal.
- `cognition.analyze.cache_hit`: L1 hit → N-1 personas skip the LLM. Hit-rate is one of the substrate's load-bearing metrics for multi-persona rooms.
- `cognition.analyze.parse`: the parsed angle decision shape: total angles / non-empty / empty + intent / summary_len / key_concepts_count + analyze_duration_ms. The diagnostic starting point for "why is every persona responding to a trivial greeting?": the empty-vs-non-empty ratio shows whether the LCD-tier model defaults to filling every angle.

## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md` checklist now shows which seams are wired (`respond_inner`, `run_render`, `shared_analysis`) and which are still pending (service_loop turn boundaries, prompt_assembly standalone probes, adapter-level probes). service_loop sprinkle + RAG flexbox probes follow in the next PR.

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: RTOS-style breakpoints with variable inspection. Easy one-liners or it won't happen.
- `[[observability-is-half-the-architecture]]`: structured capture of load-bearing decisions; this slice surfaces the LLM-boundary decisions, which are the substrate's most opaque seam.
- `[[no-rust-gates-around-cognition]]`: probes OBSERVE the cognition, they do not decide for it. Every probe in this slice is read-only state capture; none changes flow.

card: `8d7ca5c3`
parent: #151 (debug the silence-affordance bug INSIDE the cognition by reading the LLM's actual in/out, not by adding a Rust gate around it)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
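The empty-vs-non-empty check that `cognition.analyze.parse` is meant to enable can be sketched as follows. This is only an illustration under assumptions: `AngleFill`, `angle_fill`, and `fills_every_angle` are hypothetical names, the real `suggested_angles` type is not shown in this PR, and treating whitespace-only text as "empty" is a guess.

```rust
/// Counts mirroring the total / non-empty / empty fields that the
/// `cognition.analyze.parse` probe is described as carrying.
/// Hypothetical type, not taken from the codebase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AngleFill {
    pub total: usize,
    pub non_empty: usize,
    pub empty: usize,
}

/// Hypothetical helper: one slice entry per specialty. Blank or
/// whitespace-only text counts as "no angle suggested" (an assumption).
pub fn angle_fill(angles: &[&str]) -> AngleFill {
    let total = angles.len();
    let non_empty = angles.iter().filter(|a| !a.trim().is_empty()).count();
    AngleFill {
        total,
        non_empty,
        empty: total - non_empty,
    }
}

/// Echo-storm smell: more than one specialty in the room and the
/// model produced an angle for every single one of them.
pub fn fills_every_angle(fill: &AngleFill) -> bool {
    fill.total > 1 && fill.empty == 0
}
```

If `fills_every_angle` holds for trivial greetings across many turns, that would point at the analyze model rather than at any individual persona.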
joelteply added a commit that referenced this pull request on Jun 6, 2026
…tion (#151 sprinkle 2)

Wires probes at the per-persona service-loop turn boundaries so an operator can reconstruct every turn as a complete record (start → spoke/silent/error → phase timings). Each probe is one line at the seam; none changes behavior.

Per Joel's RTOS-debugger framing `[[jtag-probes-are-rtos-debugger]]`: "timing of anything, so we can hunt down bottlenecks." The `persona.turn.spoke` probe carries the full 5-phase decomposition (recall_ms + admit_ms + compose_ms + respond_ms + say_ms), so a single `jq` query over the JSONL log surfaces which phase dominates wall-clock time without sprinkling N separate `time_sync!` blocks.

## What gets wired (4 new probes)

- `persona.turn.start`: turn entry at the airc boundary (after self-loop and pre-watermark filters, before any cognition call). persona / persona_id / room_id / lamport / peer_id / text_len. Pair with `persona.turn.spoke`/`silent`/`error` (same lamport) for the complete turn record.
- `persona.turn.spoke`: turn completed successfully with full phase decomposition: response_len / turn_duration_ms / recall_ms / admit_ms / compose_ms / respond_ms / say_ms. The bottleneck-hunting workhorse: one probe per turn, every phase in one line.
- `persona.turn.silent`: persona's own cognitive output was Silence. reason (verbatim from `PersonaResponse::Silent`) + lamport + persona. Critical for distinguishing "this persona chose silence for THIS message" (cognition signal) from "no persona is responding to anything" (probably a `cognition.analyze.parse` issue; see prior PR).
- `persona.turn.error`: turn failed somewhere downstream of the cognition entry. The `stage = "respond" | "say"` field distinguishes a cognition-cycle failure (inside `respond_inner`) from an airc-publish failure (after cognition succeeded). error string + lamport for cross-referencing the upstream traces.

## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md` checklist now shows `persona/service_loop.rs::serve_persona_loop_inner` as wired. Remaining open items (`prompt_assembly` standalone, `score_persona` re-wire, `llama_cpp_adapter` per-batch) are noted as "covered indirectly" by the existing probes; re-evaluate if a real debugging session needs deeper visibility there.

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: RTOS breakpoints with variable inspection + timing. One probe per seam.
- `[[no-rust-gates-around-cognition]]`: every probe in this slice is read-only state capture; none changes flow.

card: `b390a821`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (under review)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
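To make the "which phase is dominating" question concrete, here is a minimal sketch of the comparison an operator would run over one `persona.turn.spoke` record. The field names come from the probe description above; the `TurnPhases` struct, its `u64` millisecond fields, and both methods are assumptions for illustration, not the codebase's types.

```rust
/// The five phase timings carried by `persona.turn.spoke`.
/// Hypothetical struct; the real probe emits these as event fields.
#[derive(Debug, Clone, Copy)]
pub struct TurnPhases {
    pub recall_ms: u64,
    pub admit_ms: u64,
    pub compose_ms: u64,
    pub respond_ms: u64,
    pub say_ms: u64,
}

impl TurnPhases {
    /// Sum of the five phases. This need not equal `turn_duration_ms`,
    /// which may include time spent between phases.
    pub fn phase_sum_ms(&self) -> u64 {
        self.recall_ms + self.admit_ms + self.compose_ms + self.respond_ms + self.say_ms
    }

    /// Name and duration of the slowest phase.
    pub fn dominant(&self) -> (&'static str, u64) {
        let phases = [
            ("recall_ms", self.recall_ms),
            ("admit_ms", self.admit_ms),
            ("compose_ms", self.compose_ms),
            ("respond_ms", self.respond_ms),
            ("say_ms", self.say_ms),
        ];
        phases
            .iter()
            .copied()
            .max_by_key(|&(_, ms)| ms)
            .expect("phase list is never empty")
    }
}
```

Because every phase lands in one record, this comparison needs no joins across log lines, which is the design point of emitting a single probe per turn.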
joelteply added a commit that referenced this pull request on Jun 6, 2026
…tion (#151 sprinkle 2) (#1537)
joelteply added a commit that referenced this pull request on Jun 6, 2026
#151 sprinkle 3)

Per Joel 2026-06-06: the JTAG probes are "fundamentally something we'd like to make convenient to use to debug this massively complex machine." Exact-match-only filtering hits its ergonomic ceiling fast: `CONTINUUM_PROBE_CLASSES=persona.turn.start,persona.turn.silent,persona.turn.spoke,persona.response.render.prompt,persona.response.render.raw,…` is the kind of incantation that stops getting typed correctly.

This slice replaces the exact-match HashSet lookup with a three-rule matcher that mirrors `tracing_subscriber::EnvFilter`'s shape, the shape every Rust dev already knows from `RUST_LOG`. Same env vars, same sink layer, dramatically shorter operator commands.

## What ships

`routing::probe_file_sink::class_passes_filter`: pure helper, unit-testable in isolation, used by the Layer's `on_event` path. Rules in priority order, for a probe class `C` and each filter entry `F`:

1. **Empty filter set** = "no filter configured." Every class passes. (`CONTINUUM_PROBE_CLASSES` unset.)
2. **`*` is in the set** = explicit "match every class" wildcard. The firehose, distinct from rule 1 in intent per `[[no-fallbacks-ever]]`: empty is "I didn't configure", `*` is "I deliberately want everything."
3. **Exact OR namespace prefix.** `C == F` (exact) or `C.starts_with(F + ".")` (namespace prefix). The `.` guard prevents `persona` from accidentally matching `personality.x`. Matching by namespace prefix follows the `RUST_LOG` convention.

## Operator-side wins

Before:

```bash
CONTINUUM_PROBE_CLASSES=persona.turn.start,persona.turn.spoke,persona.turn.silent,persona.turn.error,persona.response.enter,persona.response.analyze.result,persona.response.render.prompt,persona.response.render.raw,persona.response.exit.spoke,cognition.analyze.enter,cognition.analyze.noop_single_specialty,cognition.analyze.cache_hit,cognition.analyze.parse
```

After:

```bash
CONTINUUM_PROBE_CLASSES=persona,cognition                       # same coverage, 17-char value
CONTINUUM_PROBE_CLASSES=*                                       # firehose
CONTINUUM_PROBE_CLASSES=persona.turn,cognition.analyze.parse    # exact + namespace, mixed
```

## Tests (+4 new, on top of the existing 5)

`routing::probe_file_sink::tests`:

- `class_filter_namespace_prefix_matches_subclasses`: the `persona` prefix matches `persona.turn.spoke` AND `persona.response.render.prompt` but NOT `personality.something` or `cognition.analyze.parse`. Pins the dot guard.
- `class_filter_wildcard_matches_every_class`: `*` captures every class regardless of name.
- `class_filter_combines_exact_and_prefix_in_one_set`: the realistic operator pattern (one specific class + one namespace prefix) works in the same HashSet without rule contention.
- `class_passes_filter_pure_function_unit_tests`: direct unit tests on the helper covering all three rules + the dot guard + the literal-prefix-as-exact edge case. Future refactors of the per-event Layer can't drift the matching contract without breaking this pin.

The existing `class_filter_drops_unallowed_classes` test still passes (exact match is a degenerate case of rule 3).

## Manual + README updated

- `docs/architecture/RTOS-DEBUGGER-PROBES.md`: "How to enable + read" section rewritten with the three-rule spec + example invocations + explicit `*` semantics.
- `README.md`: the "Debugging this substrate" section now leads with the short prefix form (`CONTINUUM_PROBE_CLASSES=persona,cognition`) instead of the long comma-separated list. First impression for any new contributor / agent matches the actual usability.

## Why this matters

Joel's framing: the probes are the substrate's debugger for a massively complex machine. A debugger people don't type because the syntax is painful isn't a debugger. Prefix matching is the single cheapest change that takes the JTAG from "in principle usable" to "actually used in every debug session."

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: "easy one liners or it won't happen." Prefix matching shrinks the operator-side line by an order of magnitude.
- `[[observability-is-half-the-architecture]]`: same Layer shape, zero new infrastructure, just better operator UX.
- `[[no-fallbacks-ever]]`: empty filter vs `*` are distinct intents with distinct names. The substrate doesn't silently synthesize `*` from "no env var set."

card: `7d286195`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (merged)
sprinkle 2: #1537 (in review)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
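The three rules described above can be sketched as a pure function. This is a reading of the rule list, not the merged implementation: the signature of `class_passes_filter` (a `&str` class and a `&HashSet<String>` filter set) is an assumption, and `parse_classes` is a hypothetical helper for turning the env value into a set.

```rust
use std::collections::HashSet;

/// Returns true when probe class `class` passes the filter set.
/// Sketch of the three rules; the real signature may differ.
pub fn class_passes_filter(class: &str, filters: &HashSet<String>) -> bool {
    // Rule 1: empty set means "no filter configured"; every class passes.
    if filters.is_empty() {
        return true;
    }
    // Rule 2: explicit wildcard, the deliberate firehose.
    if filters.contains("*") {
        return true;
    }
    // Rule 3: exact match, or namespace prefix guarded by a '.' so that
    // `persona` does not match `personality.x`.
    filters.iter().any(|f| {
        class == f.as_str()
            || class
                .strip_prefix(f.as_str())
                .map_or(false, |rest| rest.starts_with('.'))
    })
}

/// Hypothetical helper: split a comma-separated value such as
/// `persona,cognition` into a filter set, dropping blank entries.
pub fn parse_classes(raw: &str) -> HashSet<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
        .collect()
}
```

Keeping rules 1 and 2 as separate branches, even though both return `true`, preserves the distinction between "not configured" and "deliberately everything" that the doctrine note calls out.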
joelteply added a commit that referenced this pull request on Jun 6, 2026
…ne seam (#151 sprinkle 4)

Two beats in one slice. The second is here because the first exposed the brittleness of the design Joel had already flagged.

## Beat 1: server actually installs the sink

Per Joel 2026-06-06: "Let's perfect debugging as we use it." The `JsonlProbeFileSink` existed and the env vars were documented, but `continuum-core-server`'s `main.rs` was installing a bespoke `FmtSubscriber`, so setting `CONTINUUM_PROBE_FILE` had no effect. The JTAG hardware was unwired.

New helper `routing::install_probe_tracing(config)` composes the substrate-canonical stack in one call:

1. `UriCaptureLayer`: URI ancestry for every probe
2. `ProbeRouterLayer`: in-process broadcast for `debug/probes/{class}/stream` consumers
3. `JsonlProbeFileSink`: disk-persisted JSONL log when `config.probe_file` is set
4. `tracing_subscriber::fmt::layer()`: stderr text logs governed by `RUST_LOG` (falls back to `config.default_filter`)

`main.rs` swaps to the helper. Server logs `probes landing at <path>` at boot when the file sink is configured, which is the operator-side proof that the env var took effect.

## Beat 2: env-coupling collapsed to ONE seam

Joel 2026-06-06: "That's why env vars are problematic, they are brittle." Cargo's parallel test runner raced two `install_probe_tracing` tests that mutated `CONTINUUM_PROBE_FILE` across threads; `std::env::set_var` is process-global mutable state (slated for `unsafe` marking in Rust 2024 precisely because of this class of bug). Operators hit the same class of brittleness when env-var inheritance varies across subprocesses or shell contexts.

This commit fixes the brittleness at the design layer instead of working around it in tests:

### `ProbeTracingConfig { probe_file, probe_classes, default_filter }` (new)

Typed boot configuration. `install_probe_tracing` takes ONLY typed values; there is no env access inside the library function.

### `ProbeTracingConfig::from_env(default_filter)`: THE env seam

The ONE function that touches `std::env`. Reads `CONTINUUM_PROBE_FILE` + `CONTINUUM_PROBE_CLASSES` (the operator-facing env names stay verbatim) into a typed config. Future alternative sources (config file, CLI flags, hardcoded defaults) become additional constructors on `ProbeTracingConfig` without rippling into the install function.

### Consequences

- **`main.rs`**: `install_probe_tracing(ProbeTracingConfig::from_env("info"))?` makes the env-coupling explicit at the call site, not buried in the library.
- **Tests**: construct `ProbeTracingConfig { probe_file: Some(temp.path()), ... }` directly. Zero env mutation, fully parallel-safe. The previous merged-sequential test from the first revision of this slice is now split back into TWO parallel `#[test]` functions.
- **Operator UX**: unchanged. Same `CONTINUUM_PROBE_FILE` + `CONTINUUM_PROBE_CLASSES` env vars, same prefix-match filter rules from PR #1538.

## Tests (+3)

`routing::tracing_init::tests`:

- `install_is_idempotent_with_no_disk_capture`: double-call is safe via `try_init`; typed config means no env mutation; runs parallel-safely against any other test.
- `install_surfaces_open_failed_for_unwritable_path`: typed `ProbeFileSinkError::OpenFailed` surfaces per `[[no-fallbacks-ever]]`; bad path passed as typed value, no env var racing.
- `from_env_reads_documented_env_vars`: pins both populated and empty paths through the env constructor; the ONE test that touches `std::env` (scoped so the brittleness can't leak).

## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md`: "How to enable + read" section now describes the typed-config split:

> The installer takes a typed `ProbeTracingConfig`, NOT env vars
> directly. Env coupling lives at exactly one seam:
> `ProbeTracingConfig::from_env(default_filter)`. This keeps the
> library function parallel-test-safe (no `std::env::set_var`
> racing under `cargo test`) and puts every config source (env,
> file, CLI flags, hardcoded) on equal footing.

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: debugger must be ON in real binaries, not just tests, to debug real problems.
- `[[observability-is-half-the-architecture]]`: same layers, same order, every entry point.
- `[[no-fallbacks-ever]]`: typed-error surfacing (env-var-unset ≠ path-unwritable; the two distinct intents stay distinct).
- The new lesson: process-global mutable state belongs at ONE seam. Library functions take typed values. The brittleness Joel named showed up first in our own tests; the design fix removes the class of bug rather than masking it.

card: `305c8fb9`
parent task: #151
foundation: #1535 (merged)
sprinkle 1: #1536 (merged)
sprinkle 2: #1537 (merged)
sprinkle 3: #1538 (in review)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
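The typed-config split can be illustrated with a small standard-library sketch. The struct name, field names, `from_env`, and the two env var names come from the commit message; the field types, the blank-value handling, and the `from_lookup` constructor are assumptions added here so the parsing can be exercised without touching the process environment.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Typed boot configuration. Field types are assumed, not confirmed.
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeTracingConfig {
    pub probe_file: Option<PathBuf>,
    pub probe_classes: HashSet<String>,
    pub default_filter: String,
}

impl ProbeTracingConfig {
    /// The one env seam: the only place that reads `std::env`.
    pub fn from_env(default_filter: &str) -> Self {
        Self::from_lookup(default_filter, |name| std::env::var(name).ok())
    }

    /// Hypothetical constructor (not in the PR): takes the variable
    /// lookup as a closure so tests never mutate global state.
    pub fn from_lookup(
        default_filter: &str,
        lookup: impl Fn(&str) -> Option<String>,
    ) -> Self {
        // Treat an unset or blank path as "no disk capture" (assumption).
        let probe_file = lookup("CONTINUUM_PROBE_FILE")
            .filter(|v| !v.trim().is_empty())
            .map(PathBuf::from);
        // Comma-separated class list; blank entries are dropped.
        let probe_classes: HashSet<String> = lookup("CONTINUUM_PROBE_CLASSES")
            .unwrap_or_default()
            .split(',')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(str::to_owned)
            .collect();
        Self {
            probe_file,
            probe_classes,
            default_filter: default_filter.to_owned(),
        }
    }
}
```

A test can then pass a closure returning fixed values, or build the struct literally, and run in parallel with any other test because nothing calls `std::env::set_var`.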
joelteply added a commit that referenced this pull request on Jun 6, 2026
…v coupling at one seam (#151) (#1538)

* feat(probes): namespace-prefix + wildcard filter in JsonlProbeFileSink (#151 sprinkle 3)
* feat(probes): boot wiring + typed-config refactor - env coupling at one seam (#151 sprinkle 4)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Follow-up to PR #1535 (foundation, merged). Wires the first batch of probes into the cognition seams Joel named as the diagnostic target: "we what's going in and out of an LLM or its subconscious processes, where they're slow or incorrect."

Per `[[jtag-probes-are-rtos-debugger]]`: probes are non-blocking breakpoints with variable inspection. None of these change behavior: they observe state, they don't decide it. Per `[[no-rust-gates-around-cognition]]`: every probe in this slice is read-only state capture; none changes flow.

## What gets wired (9 new probes across 2 files)

### `persona::response::respond_inner`

- `persona.response.enter`: turn entry. persona / persona_id / room_id / message_id / message_text_len / history_count / known_specialties / media_count / recalled_engrams. Pair with `persona.response.exit.spoke` (same message_id) for a complete turn record.
- `persona.response.analyze.result`: what the analyze stage gave THIS persona. from_cache / model_used / analyze_duration_ms / suggested_angles_count / matched_angle_present / intent. Critical signal: a tiny model returning non-empty angles for every specialty on every message is the echo-storm root cause.
- `persona.response.exit.spoke`: final answer. visible_text + len / think_blocks / model_used / total_ms / inference_ms.

### `persona::response::run_render`

- `persona.response.render.prompt`: what's going INTO the LLM. Captures `assembled.system_message` verbatim plus message_count, estimated_tokens, matched_angle_present, engrams_count, history_count. The single most informative snapshot for cognition bugs (missing instructions, drifted template, wrong angle injection, social-block absence).
- `persona.response.render.raw`: what came OUT of the LLM. Captures `raw_response.text` verbatim before any post-process. Pair with `.render.prompt` to reconstruct the exact model contract for every turn.

### `cognition::shared_analysis::analyze` (the "subconscious" stage)

- `cognition.analyze.enter`: input fingerprint at entry. Correlates N personas hitting the single-flight cache via the same message_id.
- `cognition.analyze.noop_single_specialty`: short-circuit fired (single-specialty room → empty angles, no LLM call). Distinguishes "I chose silence because no angle matched" from "analyze didn't run"; the two are identical downstream without this signal.
- `cognition.analyze.cache_hit`: L1 hit → N-1 personas skip the LLM. Hit-rate is load-bearing for multi-persona rooms.
- `cognition.analyze.parse`: the parsed angle decision shape: total / non-empty / empty + intent / summary_len / key_concepts_count + analyze_duration_ms. The diagnostic starting point for "why is every persona responding to a trivial greeting?": the empty-vs-non-empty ratio shows whether the LCD-tier model defaults to filling every angle.

## Manual updated

`docs/architecture/RTOS-DEBUGGER-PROBES.md` sprinkle checklist now shows which seams are wired (`respond_inner`, `run_render`, `shared_analysis`) and which are still pending (service_loop turn boundaries + timing).

## Build / tests

Build clean with `--features metal`. Sink tests 5/5 pass. No new tests were added: these are probe call-sites, not new infrastructure, and their value is operational, exercised by the existing scenarios that will use `CONTINUUM_PROBE_FILE` + `CONTINUUM_PROBE_CLASSES`.

## Doctrine

- `[[jtag-probes-are-rtos-debugger]]`: RTOS-style breakpoints with variable inspection. One line at each seam.
- `[[no-rust-gates-around-cognition]]`: probes OBSERVE, they do not decide.

card: `8d7ca5c3`
parent task: #151 (debug the silence-affordance bug INSIDE the cognition by reading the LLM's actual in/out)
foundation: #1535 (merged)
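The "pair by message_id" advice in this summary can be sketched as a small pass over parsed probe records. Everything about the record shape is an assumption: `ProbeRecord` and `turns_without_spoke` are hypothetical names, and in a multi-persona room a real pairing would likely need persona_id in the key as well, since several personas can see the same message.

```rust
use std::collections::HashSet;

/// Minimal stand-in for one parsed line of the probe log.
/// Hypothetical shape; the real JSONL schema is not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct ProbeRecord {
    pub class: String,
    pub message_id: String,
}

/// Message ids that have a `persona.response.enter` record but no
/// matching `persona.response.exit.spoke`, in first-seen order.
/// A missing exit is not necessarily a bug: the persona may have
/// chosen silence, which is why this is a starting point only.
pub fn turns_without_spoke(records: &[ProbeRecord]) -> Vec<String> {
    let spoke: HashSet<&str> = records
        .iter()
        .filter(|r| r.class == "persona.response.exit.spoke")
        .map(|r| r.message_id.as_str())
        .collect();
    records
        .iter()
        .filter(|r| r.class == "persona.response.enter")
        .filter(|r| !spoke.contains(r.message_id.as_str()))
        .map(|r| r.message_id.clone())
        .collect()
}
```

For each id this returns, the matching `persona.response.render.prompt` and `.render.raw` records, when present, show what the model was asked and what it answered.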