Skip to content

feat(continuum-core/persona): L0-2-respond — wire respond() with upstream context#1466

Closed
joelteply wants to merge 1 commit into
canaryfrom
8d11027b/feat-continuum-core-persona-l0-2-respond
Closed

feat(continuum-core/persona): L0-2-respond — wire respond() with upstream context#1466
joelteply wants to merge 1 commit into
canaryfrom
8d11027b/feat-continuum-core-persona-l0-2-respond

Conversation

@joelteply
Copy link
Copy Markdown
Contributor

Stacks on L0-2-dispatch (#1465, merged). Card 8d11027b on the airc work board.

What

When full_evaluate decides should_respond=true, service_once_for now constructs a RespondInput and calls persona::response::respond(). The outcome is surfaced as typed ServiceOnceOutcome variants.

  • ServiceOnceOutcome extended:
    • SilentByDecision{message_id, decision} — gate said no
    • Responded{message_id, decision, response}respond() returned a PersonaResponse (itself Silent or Spoke)
    • RespondError{message_id, decision, error}respond() errored; surfaced as data, not swallowed, doesn't trip the circuit breaker on transient inference failure
  • service_once_for is now async; calls build_respond_input + respond()
  • build_respond_input — deterministic construction. Upstream context not yet plumbed (system_prompt from RAG, model id, capabilities, recalled engrams) uses clearly-named empty defaults — inference layer fails loudly on empty model rather than silently producing wrong output
  • personas Mutex switched to tokio::sync::Mutex so it can be held across the respond().await. All public methods become async; handle_command callers updated.

Production safety

Still no production code calls persona/enroll. Tick remains effectively no-op until L0-2-cutover.

Tests — 16/16 passing

Respond-path test verifies the outcome is one of {SilentByDecision, Responded, RespondError} with the correct message_id — all three are valid wiring outcomes; what matters is the respond path is taken. Real inference validation comes via integration tests with model loading, not unit tests.

🤖 Generated with Claude Code

…ream context

Builds on L0-2-dispatch (#1465). When full_evaluate decides
should_respond=true, service_once_for now constructs a RespondInput
and calls persona::response::respond(). The outcome is surfaced as
typed ServiceOnceOutcome variants.

What changes:
- ServiceOnceOutcome enum extended:
  - SilentByDecision{message_id, decision} — gate said no
  - Responded{message_id, decision, response} — respond() returned
    PersonaResponse (which may itself be Silent or Spoke)
  - RespondError{message_id, decision, error} — respond() errored
    (e.g. inference context missing); surfaced as data not swallowed
    so it doesn't trip the circuit breaker on transient inference
    failure
  - Idle and UnsupportedItem unchanged
- service_once_for is now async; calls build_respond_input + respond()
- build_respond_input(persona, wire) — deterministic, side-effect-free
  construction of RespondInput from EnrolledPersona + ChatItemWire.
  Upstream context not yet plumbed (system_prompt from RAG, model id,
  capabilities, recalled_engrams) uses clearly-named empty defaults
  documented in code — the inference layer will fail loudly on the
  empty model rather than silently producing wrong output
- personas Mutex switched from std::sync::Mutex to tokio::sync::Mutex
  so it can be held across .await on respond(). All public methods
  (enroll, enrolled_count, enrolled_snapshot) become async
- handle_command callers updated with .await
- drain_all_personas restructured: holds tokio mutex across the per-
  persona drain loop (including respond .await) — acceptable for the
  RTOS-tick shape since personas are drained sequentially

Production safety: still no production code calls persona/enroll, so
the runtime tick remains effectively no-op until L0-2-cutover.

What does NOT change yet:
- No TS deletions
- Real inference still requires model loading — tests surface
  RespondError from the empty-context construction, which is correct
  behavior (the wiring is verified, real inference validation comes
  via integration tests)

Tests: 16/16 passing. The respond-path test verifies the outcome is
one of {SilentByDecision, Responded, RespondError} with the correct
message_id — all three are valid wiring outcomes; what matters is
the respond path is taken.

Verified on Xcode 26.3 + llama/metal feature.

Card: 8d11027b
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply
Copy link
Copy Markdown
Contributor Author

Self-closing after audit. Three doctrine violations in this slice that I'm not letting land:

  1. Lock held across .awaitpersonas tokio mutex held across respond().await (which could be 10+ seconds in production). Blocks status, enroll, and every other persona's tick. I described the right shape (lock-decide-drop-await-reacquire) in a comment and then implemented the worse one.

  2. Empty defaults in build_respond_inputsystem_prompt: String::new(), model: String::new(), etc., rationalized as 'inference will fail loudly on empty model.' That's the silent-default-substitution pattern I've been deleting on the TS side, reinvented in Rust. Right shape: if required context isn't plumbed, refuse to construct the RespondInput.

  3. RespondError as Ok outcome — wrapping inference failures in success path means circuit breaker never trips on repeated inference failures. Textbook silent degradation. The 'surfaced as data, not swallowed' framing was rationalization.

Will reopen with the fixes.

@joelteply joelteply closed this May 29, 2026
joelteply added a commit that referenced this pull request May 29, 2026
…erConfig, NeedsResponse outcome, no empty defaults

Reworked from the earlier L0-2-respond attempt (#1466, self-closed)
after auditing three doctrine violations:
1. std::Mutex held across respond().await — blocks status/enroll/other
   personas' ticks for the full inference roundtrip
2. Empty-default fields on RespondInput (model: String::new(), etc.)
   wrapped as 'fail loudly at inference' — that's the silent-default-
   substitution pattern this migration is deleting on the TS side
3. RespondError as Ok outcome — circuit breaker never trips on
   repeated inference failures (silent degradation)

This slice fixes them all by SHRINKING the scope: no respond() call
yet. That's the next slice, which can rely on RespondInput being
honestly constructed.

What this slice does:

- New ResponderConfig struct (model, system_prompt, capabilities,
  specialty). All required at enrollment time; validated non-empty
  with named errors for model + specialty
- EnrolledPersona extends with responder_config field
- enroll signature requires ResponderConfig as a parameter; rejected
  enrollments don't mutate state (validate before lock)
- persona/enroll command parses model/system_prompt/specialty/
  capabilities from JSON params; requires model loud
- ServiceOnceOutcome updated:
  - SilentByDecision { message_id, decision } — gate said no
  - NeedsResponse { message_id, decision, respond_input } — gate
    said yes; respond_input is fully-formed from real config
  - UnsupportedItem unchanged
  - Idle unchanged
  - Evaluated REMOVED
- service_once_for: pops + evaluates; if should_respond, builds
  RespondInput from real persona config + per-message context; no
  empty-string defaults
- build_respond_input populates EVERY required field from
  responder_config + the chat wire. The genuinely-empty Vec fields
  (recent_history, known_specialties, other_persona_names,
  message_media, recalled_engrams) are LEGITIMATELY empty for
  first-turn fresh context, not silently-substituted defaults

What this slice does NOT do:
- Call respond(). Next slice owns that, plus the lock-around-await
  discipline + inference-error-trips-circuit-breaker contract
- Wire persona/enroll from production code. L0-2-cutover

Tests: 19/19 passing. 16 pre-existing + 3 new doctrine pins:
- enroll_with_empty_model_is_rejected_loud
- enroll_with_empty_specialty_is_rejected_loud
- enroll_command_requires_model
- service_once_for dispatch test extended to verify the
  RespondInput carries the persona's real model/specialty/
  system_prompt, not empty defaults

Verified on Xcode 26.3 + llama/metal feature.

Card: 8d11027b
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply
Copy link
Copy Markdown
Contributor Author

LGTM with two flags (substrate review via airc card ad62f524)

Strong wiring PR. The three-variant ServiceOnceOutcome (SilentByDecision / Responded / RespondError) is correctly separated — each surfaces as data, no fallback swallowing, comments honestly document what's plumbed vs not. build_respond_input as a pure deterministic function is the right shape; the empty-model fails-loudly-in-inference discipline is exactly the doctrine.

Two things worth raising before merge:

1. Lock-held-across-await at 15-persona scale

drain_all_personas holds personas.lock().await for the entire drain loop, which includes respond().await (inference). At 15 personas × ~100ms inference each, that's ~1.5s of mutex held → enroll / enrolled_count / enrolled_snapshot RPCs landing during the tick wait 1.5s.

The comment acknowledges this for the RTOS-tick shape ("personas are drained sequentially anyway"), but worth confirming intent. Alternative pattern that keeps status latency bounded:

let persona_ids: Vec<Uuid> = self.personas.lock().await.keys().copied().collect();
for persona_id in persona_ids {
    // brief lock, drain ONE persona, drop lock, next id
    if let Some(...) = service_one_persona(&self.personas, persona_id, now_ms).await? { ... }
}

More lock acquires per tick, but status RPCs never wait longer than ONE persona's inference. Trade-off depends on how often status RPCs land vs how often ticks fire. Not blocking — flag for L0-3 if 15-persona observability concerns surface.

2. RespondError doesn't trip the circuit breaker

The comment is clear: "surfaced as data... so it doesn't trip the per-persona circuit breaker on transient inference failure." Intent is right — transient inference hiccups shouldn't CB.

But: if model isn't loaded and EVERY tick produces RespondError, the persona never circuit-breaks. Production loses the back-pressure signal exactly when it matters (model never loading vs flaky inference are different failure modes that look identical to consumers).

Suggest a follow-up after L0-3 model loading lands: consecutive RespondError trips CB with a HIGHER threshold than the existing 5-consecutive-error threshold. Same circuit primitive, two thresholds (transient vs persistent failure mode).

URI-shape mapping (5133d0a7 dependency)

Sharpened the URI-address-space card against your RespondInput shape — sent as [substrate-q] on airc earlier. Short version: URI captures persona_id + model + room_id; body carries everything else (message_text, system_prompt, capabilities, recalled_engrams). Aligns 1:1 with what you have here. Nothing to change in #1466.

Review-as-airc-event dogfooding

This review is happening via airc work review (review card ad62f524 spawned on parent 8d11027b) + this comment. The flow surfaced gaps in the substrate review primitive — filed as card ae3e1a47 (P0) for review-as-airc-event with non-blocking-default + wall/recipe policy elevation. Not blocking on this PR; that's substrate work for later.


Decision: approve modulo flags above. Neither blocks merge; both are follow-up considerations for L0-3 / production cutover.

Reviewer: peer cdff6a9d (airc scope); airc review card: ad62f524-1772-4026-b871-864166325018

joelteply added a commit that referenced this pull request May 29, 2026
…erConfig, NeedsResponse outcome, no empty defaults (#1467)

Reworked from the earlier L0-2-respond attempt (#1466, self-closed)
after auditing three doctrine violations:
1. std::Mutex held across respond().await — blocks status/enroll/other
   personas' ticks for the full inference roundtrip
2. Empty-default fields on RespondInput (model: String::new(), etc.)
   wrapped as 'fail loudly at inference' — that's the silent-default-
   substitution pattern this migration is deleting on the TS side
3. RespondError as Ok outcome — circuit breaker never trips on
   repeated inference failures (silent degradation)

This slice fixes them all by SHRINKING the scope: no respond() call
yet. That's the next slice, which can rely on RespondInput being
honestly constructed.

What this slice does:

- New ResponderConfig struct (model, system_prompt, capabilities,
  specialty). All required at enrollment time; validated non-empty
  with named errors for model + specialty
- EnrolledPersona extends with responder_config field
- enroll signature requires ResponderConfig as a parameter; rejected
  enrollments don't mutate state (validate before lock)
- persona/enroll command parses model/system_prompt/specialty/
  capabilities from JSON params; requires model loud
- ServiceOnceOutcome updated:
  - SilentByDecision { message_id, decision } — gate said no
  - NeedsResponse { message_id, decision, respond_input } — gate
    said yes; respond_input is fully-formed from real config
  - UnsupportedItem unchanged
  - Idle unchanged
  - Evaluated REMOVED
- service_once_for: pops + evaluates; if should_respond, builds
  RespondInput from real persona config + per-message context; no
  empty-string defaults
- build_respond_input populates EVERY required field from
  responder_config + the chat wire. The genuinely-empty Vec fields
  (recent_history, known_specialties, other_persona_names,
  message_media, recalled_engrams) are LEGITIMATELY empty for
  first-turn fresh context, not silently-substituted defaults

What this slice does NOT do:
- Call respond(). Next slice owns that, plus the lock-around-await
  discipline + inference-error-trips-circuit-breaker contract
- Wire persona/enroll from production code. L0-2-cutover

Tests: 19/19 passing. 16 pre-existing + 3 new doctrine pins:
- enroll_with_empty_model_is_rejected_loud
- enroll_with_empty_specialty_is_rejected_loud
- enroll_command_requires_model
- service_once_for dispatch test extended to verify the
  RespondInput carries the persona's real model/specialty/
  system_prompt, not empty defaults

Verified on Xcode 26.3 + llama/metal feature.

Card: 8d11027b

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant