Skip to content

feat(persona): spawn_persona_service helper (#133 slice 12)#1508

Merged
joelteply merged 1 commit into
canaryfrom
feat/persona-headless-host-loop
Jun 2, 2026
Merged

feat(persona): spawn_persona_service helper (#133 slice 12)#1508
joelteply merged 1 commit into
canaryfrom
feat/persona-headless-host-loop

Conversation

@joelteply
Copy link
Copy Markdown
Contributor

Summary

  • Adds spawn_persona_service(hosted, runtime, opts, rt_handle) -> JoinHandle<Result<ServeOutcome, String>> — the composition seam between the substrate's planning surface (slices 5-11, merged in feat(persona): citizen substrate + cognition cache hierarchy foundation (slices 1-6) #1507) and the headless boot loop.
  • Net-additive only: no change to crate::ipc::start_server's existing bootstrap loop. Slice 13 wires this helper into the boot path.
  • 3-line composition: up-cast Arc<Airc> to Arc<dyn AircTranscriptReader> (zero-cost; impl at airc_source.rs:74), construct AircPersonaConversation, rt_handle.spawn(serve_persona_loop(...)).

Doctrine

  • [[no-stdio-piping-for-process-ipc]]: substrate talks to airc only via Arc<PersonaAircRuntime> + airc-lib sockets — no stdin/stdout.
  • [[substrate-is-a-good-citizen-on-the-host]]: caller controls tokio scheduling pool via the supplied Handle; no hidden thread spawns.
  • [[no-fallbacks-ever]]: JoinHandle resolves with Err(message) from serve_persona_loop failures; no implicit retry / no substitution.

Why split helper from wire-up

The helper is a 3-line composition; the wire-up (slice 13) rewrites the IPC boot loop at ~ipc/mod.rs:1024 to:

  1. After bootstrap_one(&intent) succeeds, build the inference profile via build_profile (slice 5).
  2. Materialize the adapter via LlamaCppPersonaAdapterFactory (slice 9).
  3. Construct HostedPersona (slice 9).
  4. Call spawn_persona_service (this PR) to start hosting.
  5. Track JoinHandles for graceful shutdown.

That's a load-bearing boot-flow rewrite. Splitting keeps each commit reviewable — this PR is the unit-testable composition seam; slice 13 is the consumer.

Reference docs

  • docs/architecture/CBAR-SUBSTRATE-ARCHITECTURE.md — substrate lifecycle that the spawned task plugs into
  • Per-module docs touched: src/workers/continuum-core/src/persona/host.rs (new), src/workers/continuum-core/src/persona/mod.rs (registers the module)
  • Memory: [[constitutional-design-always-a-next-step]] — this PR ships the helper + names slice 13's consumer in the doc-comment so future authors don't dead-end

Test plan

  • cargo build --bin airc_chat_demo --no-default-features --features livekit-webrtc,accelerate,llama/mac-cpu-only,load-dynamic-ort passes (verified locally before push)
  • cargo test --lib persona:: slice-related modules stay green (verified locally — service_loop, supervisor, spawner, spawner_module all pass)
  • No unit test for spawn_persona_service itself — composition is 3 lines, types check; integration test happens via slice 13's IPC boot path
  • Pre-existing persona::allocator::test_allocate_no_keys failure on canary is tracked separately (task Build(deps-dev): Bump @types/node from 22.14.0 to 24.7.0 #138), unrelated to this PR

🤖 Generated with Claude Code

The composition seam between the substrate's planning surface
(slices 5-11) and the headless boot loop (slice 13).

`spawn_persona_service(hosted, runtime, opts, rt_handle)`:
  1. Up-cast the persona's `Arc<Airc>` to `Arc<dyn
     AircTranscriptReader>` for the RAG layer (zero-cost — same
     pointer, different vtable view; impl already exists at
     `airc_source.rs:74`).
  2. Wrap `Arc<PersonaAircRuntime>` in `AircPersonaConversation`
     (slice 11) — production conversation that knows how to talk
     to the live daemon.
  3. `rt_handle.spawn(serve_persona_loop(hosted, &mut conversation,
     reader, opts))` — slice 10's loop runs on the caller's tokio
     pool.

Returns a `JoinHandle<Result<ServeOutcome, String>>` so the slice-13
boot path can collect handles for graceful shutdown (.abort() on
server stop, or just .await for steady-state ServeOutcome capture).

Net-additive — does NOT touch the existing IPC boot loop. Slice 13
rewires `crate::ipc::start_server` (~line 1024) so that after
`bootstrap_one(&intent)` succeeds, the boot path builds the
inference profile + adapter + HostedPersona and calls
`spawn_persona_service` to start hosting the persona. Splitting the
helper from the wire-up keeps each commit reviewable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@joelteply
Copy link
Copy Markdown
Contributor Author

Adversarial review — slice 12 spawn_persona_service

Verdict: APPROVE with two non-blocking notes.

Reviewed against the eight requested axes. The 3-line composition is correct, the lifetimes are sound, the up-cast is essentially zero-cost, slice 13 is unblocked, and the doctrines hold. Findings below in order of severity.


Soundness ✅

Arc<Airc>Arc<dyn AircTranscriptReader> up-cast (host.rs:90)

let reader: Arc<dyn AircTranscriptReader> = runtime.airc().clone();
  • runtime.airc() returns &Arc<Airc> (airc_runtime.rs:251).
  • .clone() yields Arc<Airc> — ONE relaxed atomic increment on the strong-count word of the existing ArcInner<Airc>. No heap alloc.
  • The assignment is an unsized coercion Arc<T> → Arc<dyn Trait> where impl AircTranscriptReader for airc_lib::Airc (airc_source.rs:74). The data pointer stays identical; the Arc fat pointer gains a vtable view. Object-safe because async_trait desugars to Pin<Box<dyn Future ...>>-returning methods (no Self, no generics, Send + Sync bounds satisfied).
  • Result: same ArcInner slot, two Arc handles (runtime's airc field + this reader), strong-count = 2 once the task is spawned. No alloc beyond the count bump.

Doc-comment accuracy (host.rs:87-89): The "same heap pointer, different vtable view" claim is correct for the coercion itself. Strictly speaking there's one atomic increment from .clone(), but that's a single uncontended op against a hot cache line, not a "cost" in any architectural sense. I would not block on this; if you want to be pedantic, change "zero-cost" to "near-zero-cost — one atomic refcount bump, no heap alloc, no vtable indirection on the Arc itself". Optional.


Lifetime / leak audit ✅

The spawned task captures by move: hosted (owns Arc<dyn AIProviderAdapter>), runtime (Arc<PersonaAircRuntime>), opts, and the freshly-built conversation (which also holds Arc<PersonaAircRuntime>).

  • Cycle check: Arc<PersonaAircRuntime> is held by both runtime (the local var) AND conversation.runtime (airc_persona_conversation.rs:53). Both are dropped together when the task finishes or is aborted. Strong-count goes 2 → 0 (modulo any other holders, e.g. the registry, which is the intended permanent holder). No cycle — neither side reaches back into the task.
  • Adapter lifetime: hosted.adapter is Arc<dyn AIProviderAdapter>. Per service_loop.rs:174 it's cloned into the RAG path each turn. On task abort, the loop's local adapter Arc and the hosted.adapter Arc both drop; if the adapter is sole-owned at that point, the GGUF unloads via the adapter's own Drop. Clean.
  • .abort() semantics: tokio::task::JoinHandle::abort() schedules cancellation at the next .await point. The loop is fully await-driven (subscribe, RAG, inference, say), so cancellation is responsive within one async step. All owned Arcs in the task's stack frame drop on the cancellation path.

Cancellation safety ⚠️ NOTE (non-blocking)

If .abort() lands mid-say or mid-inference:

  • The inference adapter's generate_text will be dropped mid-future. LlamaCppAdapter is in-process Candle; dropping its future cancels token generation cleanly (no externally-observable state). ✅
  • The airc subscribe stream is owned by AircPersonaConversation.stream (airc_persona_conversation.rs:63). On task drop the stream drops; airc-lib's EventStream is expected to send an unsubscribe / close on its own Drop. I did not verify airc-lib's EventStream::drop behavior in this review — if that close path is missing, the daemon will keep the subscription alive until heartbeat timeout. Worth a one-line check on the airc-lib side before slice 13 starts wiring shutdown.
  • The persona's say is airc_lib::Airc::say — atomic from the daemon's perspective (single publish). Cancellation mid-flight either lands or doesn't; no half-state.

Not blocking. This is the right shape for slice 12; the audit belongs in slice 13's shutdown smoke test.


Slice 13 brick-wall check ✅

Walked the proposed slice-13 path against ipc/mod.rs:1062:

  1. bootstrap_one(&intent) returns PersonaInstanceInfo and INTERNALLY registers the runtime in PersonaAircRuntimeRegistry (persona_instance_manager.rs:217). Slice 13 will need to either (a) call registry.get(info.persona_id) to recover the Arc<PersonaAircRuntime>, or (b) refactor bootstrap_one to return the Arc directly. Option (a) is the smaller diff and keeps the registry as the single owner. No brick wall, but worth naming in the host.rs doc-comment that slice 13 must look the runtime back up via the registry.
  2. The adapter materialization path (LlamaCppPersonaAdapterFactorymaterialize_adapters) is already in place from slice 9. Slice 13 will need to plumb MaterializedPersonaPlan through bootstrap_one's return — feasible from existing primitives.
  3. rt_handle is trivially tokio::runtime::Handle::current() inside the boot loop (already inside tokio::spawn).
  4. Tracking JoinHandles for graceful shutdown: caller pushes into a Vec<JoinHandle<Result<ServeOutcome, String>>> in the boot scope and .abort() each on shutdown. The JoinHandle<Result<ServeOutcome, String>> shape supports this directly.

No hidden cyclic ownership. Slice 13 author has the surface they need.


API ergonomics ✅

JoinHandle<Result<ServeOutcome, String>> is the right shape:

  • The Result<ServeOutcome, String> is the inner result of serve_persona_loop. Surfacing it preserves the no-fallbacks doctrine — the supervisor sees structured success (with outcome counters) vs structured error.
  • JoinHandle<ServeOutcome> with errors logged would VIOLATE [[no-fallbacks-ever]] by swallowing failure into logs.
  • A typed error (ServeError enum) would be marginally better than String, but serve_persona_loop itself returns Result<ServeOutcome, String> (service_loop.rs:162), so this PR can't improve typing without changing slice 10's surface. Defer to a future cleanup pass.

Doctrine conformance ✅

  • [[no-stdio-piping-for-process-ipc]]: ✅ — all I/O routes through PersonaAircRuntimeairc_lib typed sockets. No stdin/stdout anywhere.
  • [[no-if-statements-use-llms-for-cognition]]: ✅ — the helper itself contains zero decision gates; it composes and spawns. The substrate's pre-LLM filters (high-water-mark, self-skip, non-text) live inside serve_persona_loop and AircPersonaConversation::next_message, both downstream of this helper. The LLM still decides.
  • [[no-fallbacks-ever]]: ✅ — failures propagate through Result<ServeOutcome, String>; no implicit retry, no substitution.
  • [[substrate-is-a-good-citizen-on-the-host]]: ✅ — rt_handle is supplied by the caller; no tokio::spawn against a global runtime, no std::thread::spawn. Caller controls scheduling.
  • Lifetime = JoinHandle lifetime: ✅ — supervisor abort cleanly cascades through the loop's await points.

Testability ⚠️ NOTE (non-blocking)

The 3-line helper is too thin to unit-test meaningfully without standing up:

  • A live Arc<PersonaAircRuntime> (needs a real airc daemon for Airc::attach_as)
  • A real HostedPersona (needs a real adapter)

The PR body's claim — "no unit test for spawn_persona_service itself — composition is 3 lines, types check; integration test happens via slice 13's IPC boot path" — is correct. Adding a trait abstraction for the spawn surface itself would be over-engineering: the helper IS the abstraction.

The one thing that COULD be unit-tested without a daemon is "does spawn_persona_service return a JoinHandle whose abort actually cancels the inner loop?" — but that requires fabricating a HostedPersona + PersonaAircRuntime outside their constructors, which both crates lock down. Integration test in slice 13 is correct.


Missing pieces ⚠️ NOTE

The doc-comment at host.rs:17-41 correctly names slice 13's consumer + the wire-up points, satisfying [[constitutional-design-always-a-next-step]] — no dead-end. Two small additions would harden it:

  1. Naming the registry lookup: explicitly say "slice 13 recovers the Arc<PersonaAircRuntime> via PersonaAircRuntimeRegistry::get(info.persona_id) after bootstrap_one". The current wording leaves slice 13 to discover this themselves.
  2. Subscribe lazy-init reminder: the conversation's subscribe stream is lazy (airc_persona_conversation.rs:103); the spawn helper itself does NOT cause an airc daemon round-trip. Worth one line so slice 13 knows boot is still cheap even with N personas.

Both are doc-only nits. Optional.


Summary

Axis Finding
Soundness Up-cast correct, near-zero-cost (one atomic increment).
Lifetimes No cycles. .abort() cleanly drops adapter + runtime + conversation.
Cancellation Clean at all await points. Verify EventStream::drop in airc-lib (slice 13 work, not blocking).
Slice 13 path Unblocked. Will need registry.get(persona_id) to recover runtime; doc could name this.
API shape JoinHandle<Result<ServeOutcome, String>> is correct — preserves no-fallbacks.
Doc accuracy "zero-cost" is slightly loose; one atomic op. Optional nit.
Testability Integration-only is correct given the 3-line surface.
Doctrine All five honored.

Verdict: APPROVE. Ship. The two notes (doc-comment additions + airc-lib EventStream::drop audit) belong in slice 13's wire-up commit, not as a blocker on this seam.

@joelteply joelteply merged commit 980f731 into canary Jun 2, 2026
3 checks passed
@joelteply joelteply deleted the feat/persona-headless-host-loop branch June 2, 2026 04:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant