feat(inference): inference-llm PR-5 — Runtime registration by joelteply · Pull Request #1404 · CambrianTech/continuum

joelteply · 2026-05-18T18:21:55Z

Summary

PR-5 of inference-llm. Wires InferenceLlmModule into the Runtime so it's callable from the cognition path via inference/llm/request commands.

Pure Rust, zero TS, 20-line diff.

What lands

Add "inference-llm" to EXPECTED_MODULES in runtime/runtime.rs
runtime.register(Arc::new(InferenceLlmModule::new())) in ipc/mod.rs alongside the existing InferenceModule registration
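The two changes above can be sketched as follows. This is a minimal illustration, not the repository's code: the `ServiceModule` trait shape, the `Runtime` struct, the `verify_expected` method name, and the exact error string format are all assumptions made for the example. Only `EXPECTED_MODULES`, `InferenceModule`, `InferenceLlmModule::new()`, and `runtime.register(Arc::new(...))` come from the PR text.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the runtime's module trait.
pub trait ServiceModule: Send + Sync {
    fn name(&self) -> &'static str;
}

/// Names the runtime insists on at boot (illustrative subset, not the real list).
pub const EXPECTED_MODULES: &[&str] = &["inference", "inference-llm"];

pub struct InferenceModule;
pub struct InferenceLlmModule;

impl InferenceLlmModule {
    /// Bus-less, stub-backed constructor, as used by this PR.
    pub fn new() -> Self {
        InferenceLlmModule
    }
}

impl ServiceModule for InferenceModule {
    fn name(&self) -> &'static str {
        "inference"
    }
}

impl ServiceModule for InferenceLlmModule {
    fn name(&self) -> &'static str {
        "inference-llm"
    }
}

/// Hypothetical runtime: a registry keyed by module name.
#[derive(Default)]
pub struct Runtime {
    modules: HashMap<&'static str, Arc<dyn ServiceModule>>,
}

impl Runtime {
    pub fn register(&mut self, module: Arc<dyn ServiceModule>) {
        self.modules.insert(module.name(), module);
    }

    /// Assumed shape of the boot-time check: fail on the first expected
    /// module that was never registered.
    pub fn verify_expected(&self) -> Result<(), String> {
        for name in EXPECTED_MODULES {
            if !self.modules.contains_key(name) {
                return Err(format!("missing {}", name));
            }
        }
        Ok(())
    }
}
```

With this shape, forgetting the `register` call for `InferenceLlmModule` surfaces as a boot failure rather than as a silent "unknown command" at request time.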

Design choices

Constructed via .new() (bus-less, stub-backed) rather than .with_bus_and_adapter(). Reason: with_bus_and_adapter requires an AIProviderAdapter Arc, which would couple PR-5's runtime registration to a specific LlamaCppAdapter init lifecycle. The substrate's LlamaCppAdapter is owned by AIProviderModule's adapter registry with its own initialization phase; threading the adapter Arc here would either duplicate the registration or create an init-ordering dependency this slice shouldn't introduce.
The stub-backed registration is still useful: it exposes the inference/llm/request command surface to the cognition path so downstream PRs (turn-execute chaining drain-turn-frame → response_prompt → inference/llm/request) can wire against the real command name. Bus + adapter integration is a follow-up PR that updates the construction call here.
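The trade-off between the two constructors can be sketched like this. It is a hedged illustration only: the fields, the `Bus` type, the `generate` method on `AIProviderAdapter`, the `UppercaseAdapter` test double, and the `handle_request` / `is_stub_backed` / `has_bus` helpers are hypothetical. The PR confirms only that `.new()` is bus-less and stub-backed and that `.with_bus_and_adapter()` requires an `AIProviderAdapter` Arc.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the AIProviderAdapter trait.
pub trait AIProviderAdapter: Send + Sync {
    fn generate(&self, prompt: &str) -> String;
}

/// Hypothetical stand-in for the bus handle.
pub struct Bus;

/// Illustrative adapter used only to exercise the adapter-backed path.
pub struct UppercaseAdapter;

impl AIProviderAdapter for UppercaseAdapter {
    fn generate(&self, prompt: &str) -> String {
        prompt.to_uppercase()
    }
}

pub struct InferenceLlmModule {
    bus: Option<Arc<Bus>>,
    adapter: Option<Arc<dyn AIProviderAdapter>>,
}

impl InferenceLlmModule {
    /// Bus-less, stub-backed: needs nothing from other modules' init phases.
    pub fn new() -> Self {
        Self { bus: None, adapter: None }
    }

    /// Requires an adapter Arc, so the caller must already hold one.
    pub fn with_bus_and_adapter(bus: Arc<Bus>, adapter: Arc<dyn AIProviderAdapter>) -> Self {
        Self { bus: Some(bus), adapter: Some(adapter) }
    }

    pub fn is_stub_backed(&self) -> bool {
        self.adapter.is_none()
    }

    pub fn has_bus(&self) -> bool {
        self.bus.is_some()
    }

    /// The command surface is identical in both modes; only the backend differs.
    pub fn handle_request(&self, prompt: &str) -> String {
        match &self.adapter {
            Some(adapter) => adapter.generate(prompt),
            None => format!("[stub] {}", prompt),
        }
    }
}
```

Because callers see the same `handle_request` surface either way, swapping the construction call in a follow-up PR would not change the command name that downstream code wires against.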

Test plan

cargo build --features metal,accelerate --lib clean
EXPECTED_MODULES enforcement validates at boot: if the registration is missing, the runtime fails with a "missing inference-llm" error
Pre-push gate clean
No new test fixtures needed — the module's existing 44/44 tests cover the trait-impl correctness; this PR just plumbs construction into runtime startup

Stack

feat(inference): inference-llm PR-1 — typed event surface (MODULE-CATALOG §II) #1387 — inference-llm PR-1: typed event surface
feat(inference): inference-llm PR-2 — InferenceLlmModule ServiceModule impl (stub-backed) #1391 — inference-llm PR-2: ServiceModule impl (stub-backed)
feat(inference): inference-llm PR-3a — canonical ArtifactKeys + publishing helpers #1392 — inference-llm PR-3a: bus keys + publishing helpers
feat(inference): inference-llm PR-3b — InferenceLlmModule auto-publishes via bus hook (pure Rust) #1393 — inference-llm PR-3b: auto-publish wiring
feat(inference): inference-llm PR-4 — adapter integration (translation + new constructors) #1395 — inference-llm PR-4: adapter integration (translation + new constructors)
This PR — inference-llm PR-5: Runtime registration
FOLLOW-UP — adapter Arc wiring when LlamaCppAdapter init phase is integrated with Runtime startup

🤖 Generated with Claude Code

Wires InferenceLlmModule into the Runtime so it's callable from the cognition path via inference/llm/request commands.

What lands
- Add "inference-llm" to EXPECTED_MODULES in runtime/runtime.rs
- runtime.register(Arc::new(InferenceLlmModule::new())) in ipc/mod.rs alongside the existing InferenceModule registration

Design choices
- Constructed via the .new() (bus-less, stub-backed) constructor rather than .with_bus_and_adapter(). Reason: the with_bus_and_adapter constructor requires an AIProviderAdapter Arc, which would couple PR-5's runtime registration to a specific LlamaCppAdapter init lifecycle. The substrate's LlamaCppAdapter is owned by AIProviderModule's adapter registry with its own initialization phase; threading the adapter Arc here would either duplicate the registration or create an init-ordering dependency this slice shouldn't introduce.
- The stub-backed registration is still useful: it exposes the inference/llm/request command surface to the cognition path so downstream PRs (turn-execute that chains drain-turn-frame → response_prompt → inference/llm/request) can wire against the real command name. Bus + adapter integration is a follow-up PR that updates the construction call here.

What is NOT changed
- AIProviderModule + LlamaCppAdapter unchanged
- All InferenceLlmModule trait impl logic unchanged (PR-2/3/4 work intact)
- The stub vs real-adapter swap point stays exactly where PR-4 put it: with_bus_and_adapter constructor + run_adapter_inference function

Tests
- cargo build --features metal,accelerate --lib clean (no new test fixtures needed: the module's existing 44/44 tests cover the trait-impl correctness; this PR just plumbs construction into runtime startup)
- EXPECTED_MODULES enforcement validates at boot: if the registration is missing the runtime fails with "missing inference-llm" error
- Pre-push gate clean

Stack
- #1387 PR-1: typed event surface
- #1391 PR-2: ServiceModule impl (stub-backed)
- #1392 PR-3a: bus keys + publishing helpers
- #1393 PR-3b: auto-publish wiring
- #1395 PR-4: adapter integration (translation + new constructors)
- THIS PR (PR-5): Runtime registration
- FOLLOW-UP: adapter Arc wiring when LlamaCppAdapter init phase is integrated with Runtime startup

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…mpt -> inference (#1409) (#1415)

* feat(persona): Lane D — Rust persona/turn-execute chains drain -> prompt -> inference (#1409)

Adds the `persona/turn-execute` command in CognitionModule that executes a full persona turn in ONE Rust hop:

drain inbox
-> wrap in PersonaTurnFrame
-> derive ResponsePrompt (lazy)
-> build InferenceRequest (prompt_text path)
-> dispatch `inference/llm/request` via the global command_executor (routes to InferenceLlmModule registered in PR-5 #1404)
-> bundle replayRecord + inferenceResponse
-> persist replay record (v2 schema with response_prompt captured from #1412)

Files changed:

* src/persona/turn_frame.rs: new `ResponsePrompt::to_prompt_text` helper that flattens system_prompt + chat messages into a single deterministic plain-text prompt for adapter-based engines (LlamaCppAdapter, cloud adapters). Format: "<system>\n\nrole: content\nrole: content\n..." Empty system_prompt produces no leading paragraph; lowercase role matches the on-the-wire PromptRole serde format.
* src/modules/cognition.rs: new `persona/turn-execute` command.
  Inputs:
  - persona_id (required)
  - window_ms (default 80), max_items (default 16)
  - composition_artifact_id (default Uuid::nil())
  - max_tokens (default 512), max_duration_ms (default 10_000)
  Returns: { "replayRecord": PersonaTurnFrameReplayRecord | null, "inferenceResponse": InferenceResponse | null }
  Empty drain returns the null pair (no-op, not Err). Missing persona returns typed Err per Joel's never-swallow rule.

Tests (+9, all green):

* persona::turn_frame (6 new, total 18):
  - to_prompt_text_renders_each_message_as_role_colon_content
  - to_prompt_text_prepends_system_prompt_when_present
  - to_prompt_text_skips_empty_system_prompt
  - to_prompt_text_handles_mixed_roles_in_order
  - to_prompt_text_handles_no_messages
  - to_prompt_text_empty_prompt_returns_empty_string
* modules::cognition::turn_execute_tests (3 new):
  - turn_execute_persona_not_found_returns_typed_error
  - turn_execute_empty_drain_returns_null_bundle
  - turn_execute_bad_max_items_returns_typed_error

The dispatch-success path (drain -> dispatch -> inference response) runs through `command_executor::executor()` which is only initialized at runtime startup (ipc/mod.rs). Tests that exercise the executor live in the integration suite; unit-tests here cover the param-parse + short-circuit + persona-not-found paths.

Builds atop #1412 (v2 schema with response_prompt) and #1404 (InferenceLlmModule runtime registration). Closes alpha card #1409.

Why one command: the TS persona loop previously executed each stage with its own IPC round-trip (drain, then build prompt, then call inference): 3 round-trips per turn, and prompt-building lived in TS. Lane D pulls all three into the substrate so (a) the prompt is built in Rust where the turn-frame lives, (b) the production replay record carries the exact prompt that fed inference, (c) the persona turn becomes one observable unit on the bus.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persona,#1409): force turn-execute through Rust registry (#1417)

* fix(persona,#1409): force turn-execute through Rust registry
* fix(runtime,#1409): use unlimited concurrency contract for cognition

---------

Co-authored-by: Test <test@test.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
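The `to_prompt_text` format described in the commit message above ("<system>\n\nrole: content\nrole: content\n...") could look roughly like the sketch below. The struct and enum definitions are assumptions for illustration, including the set of roles and the field names. The choice to emit no trailing newline, and to return the bare system prompt when there are no messages, is also an assumption, since the commit message does not pin those cases down.

```rust
/// Assumed role set; the real PromptRole may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PromptRole {
    System,
    User,
    Assistant,
}

impl PromptRole {
    /// Lowercase name, mirroring the lowercase serde format the commit mentions.
    fn as_str(self) -> &'static str {
        match self {
            PromptRole::System => "system",
            PromptRole::User => "user",
            PromptRole::Assistant => "assistant",
        }
    }
}

/// Hypothetical message shape.
pub struct PromptMessage {
    pub role: PromptRole,
    pub content: String,
}

/// Hypothetical prompt shape: a system prompt plus ordered chat messages.
pub struct ResponsePrompt {
    pub system_prompt: String,
    pub messages: Vec<PromptMessage>,
}

impl ResponsePrompt {
    /// Flatten into "<system>\n\nrole: content\nrole: content".
    /// An empty system prompt contributes no leading paragraph, and a fully
    /// empty prompt yields an empty string.
    pub fn to_prompt_text(&self) -> String {
        let mut parts: Vec<String> = Vec::new();
        if !self.system_prompt.is_empty() {
            parts.push(self.system_prompt.clone());
        }
        let body = self
            .messages
            .iter()
            .map(|m| format!("{}: {}", m.role.as_str(), m.content))
            .collect::<Vec<_>>()
            .join("\n");
        if !body.is_empty() {
            parts.push(body);
        }
        parts.join("\n\n")
    }
}
```

The output depends only on the input order and contents, which is what makes the flattened prompt deterministic and therefore safe to store in a replay record.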

joelteply merged commit 1c0656b into canary May 18, 2026
1 check passed

joelteply deleted the feat/inference-llm-runtime-register-pr5 branch May 18, 2026 18:22

github-actions Bot added the size: S label May 18, 2026

This was referenced May 18, 2026

Lane G: refresh Alpha Gap state after Rust cognition stack #1408

Open

feat(persona): Lane D — Rust persona/turn-execute chains drain -> prompt -> inference (#1409) #1415

Merged


joelteply merged 1 commit into canary from feat/inference-llm-runtime-register-pr5
