Skip to content

fix(persona): always-record persona turns including failures#1082

Merged
joelteply merged 2 commits into
canaryfrom
fix/recorder-always-records
May 11, 2026
Merged

fix(persona): always-record persona turns including failures#1082
joelteply merged 2 commits into
canaryfrom
fix/recorder-always-records

Conversation

@joelteply
Copy link
Copy Markdown
Contributor

Summary

Per RTX (continuum-8e97)'s #1068 review finding: `record_turn()` at response.rs:273 only fired on the `Ok`-path of `respond()`. If `respond()` returned `Err` (analyze fail, render fail, any internal `?` propagation), NO fixture landed on disk — direct contradiction of recorder.rs's F1-telemetry header ("F1 cars don't ship without telemetry") and Joel's "evidence is for the debugger, not the trash" rule. The most diagnostic turns (the failures) left zero trail.

What changed

  • `respond()` wraps `respond_inner()` with always-record. Inner-fn pattern carries every `?` exit point so the outer fn captures Ok with the real `PersonaResponse` OR Err with a synthesized `TurnError` recorder artifact + partial `CognitionTrace`.
  • `recorder.rs` adds `record_failed_turn(input, error_msg, total_ms, trace)` writing `{rustResponse: null, rustError: TurnError{error_msg, last_completed_seam, partial_trace_seams, total_ms}, cognitionTrace}`.
  • `PersonaResponse` enum stays pure {Silent, Spoke} — no TerminatedAtSeam widening. Recorder-only artifact preserves separation: runtime value type for success-path semantics; debug fixture for diagnostic.
  • `CognitionTrace` adds `last_seam_name()` + `seam_count()` helpers consumed by the failure-path synthesis.
  • TS `PromptCapture.ts` sink + `SentinelCleanupServerCommand.ts` tombstone deleted — Rust recorder is now SSoT for capture (Joel: "stdout/println invalid for realtime systems").
  • 3 integration test files get `other_persona_names: Vec::new()` to fix pre-existing canary compile drift in test fixtures (PR Persona resource substrate + native multimodal restoration #950 added the field, fixtures didn't update).

Validation class

  • Contract TDD: yes — `record_failed_turn_writes_error_with_partial_trace` asserts fixture shape, `last_completed_seam` correctness, partial trace survival.
  • Failure TDD: yes — same test simulates "render adapter timed out at 30s" with only SEAM_ANALYZE recorded; asserts that seam survives in the fixture.
  • Performance/Resource/Residency VDD: n/a — observability-only diff, no hot-path changes.
  • Browser/UX evidence: precommit browser ping passed.
  • Platform coverage: Mac M5 Pro Metal, cargo test --lib --features metal,accelerate.

Validation

  • `cargo test --lib --features metal,accelerate` — 1922 passed; 0 failed; 28 ignored (includes new `last_seam_name_tracks_most_recent_record`, `seam_count_matches_recorded_seams`, `record_failed_turn_writes_error_with_partial_trace`).
  • Precommit hook: TypeScript clean; browser-ping precommit test passed.
  • Pre-push hook: Rust lib tests passed (64s); Docker native push skipped (worktree had unstaged test fixes at the time — re-runnable per scripts/push-current-arch.sh).

Three-way convergence

  • @airc-8a5e (Mac sibling) flagged scope match on inner-fn + always-record pattern; Q1 architectural ACK.
  • @continuum-8e97 (Windows/RTX) original Make Rust persona recorder the single fixture source #1068 reviewer; ACK with principles review (recorder-only failure capture = correct SSoT, no PersonaResponse widening, partial trace as diagnostic value).
  • @airc-8a5e (Codex) implementation execution + final code review.

Known gaps

  • Pre-existing canary breakage in `tests/qwen35_live_pipeline_diff.rs` (SamplingConfig API drift) and `tests/persona_prompt_token_diagnostic.rs` (format-string syntax) is NOT addressed here — separate PR scope.
  • Followup work: SEAM_RAG_COMPOSE addition + RagSourceTrace{source_id, snippet_hash, score, included|dropped, drop_reason, bytes_in_prompt} in trace.metadata, raised by the doc-vs-shipped-Rust review of docs: define CBAR-like Rust substrate #1081 (CBAR substrate).

Test and others added 2 commits May 11, 2026 14:53
… fixtures

Three integration test files (persona_respond_replay, vision_integration,
fixture_assembly_replay) constructed RespondInput/PersonaContext literals
without the other_persona_names field that was added to those structs in
PR #950 (2c31cc2). The fixtures wouldn't compile, blocking the cargo
--tests build path.

Defensive follow-up to 41aee0c (move prompt capture to rust recorder):
the recorder commit lands cleanly on cargo test --lib (1922/0), but the
broader test build was already broken on canary by the field-add drift.
This commit fixes only the field omission; pre-existing format-string
+ SamplingConfig API drift in qwen35_live_pipeline_diff and
persona_prompt_token_diagnostic remain (separate PR scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply merged commit fb76eae into canary May 11, 2026
3 checks passed
@joelteply joelteply deleted the fix/recorder-always-records branch May 11, 2026 20:16
joelteply added a commit that referenced this pull request May 11, 2026
…ft) (#1086)

Three pre-existing canary breakages in integration tests blocked the
broader cargo --tests build, hiding any other regression that might
land. Fixes are mechanical and isolated to test fixtures:

- llamacpp_metal_throughput.rs / qwen35_live_pipeline_diff.rs: backend
  .generate(...) signature took `temperature: f64` until the SamplingConfig
  refactor; tests still passed `0.0` / `0.7`. Updated to SamplingConfig
  literal (qwen35: explicit greedy, no repeat_penalty so output matches
  the bare-decode reference) and SamplingConfig::chat() (throughput:
  matches what live chat traffic uses).

- persona_prompt_token_diagnostic.rs: format string `"{model_path()}"`
  uses Rust 2024 captured-identifier syntax which doesn't allow function
  calls — emits "expected `}`, found `(`" at compile time. Bound to a
  local + use positional `{}` with `path.display()`.

Same scope as the test-fixture follow-up in 98a6c91 (other_persona_names
field add). Was flagged as out-of-scope in PR #1082's "Known gaps" — now
it can come off that list. Pre-push hook still passes (cargo test --lib
unaffected; this only restores `cargo build --tests` and `cargo test
<integration>`).

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant