feat(inference): add LlamaCppAdapter::try_new + NoLocalModelLoadable typed error by joelteply · Pull Request #1089 · CambrianTech/continuum

joelteply · 2026-05-11T20:45:50Z

Lane A PR-2 — runtime visibility for install-time-no-Qwen state

Pairs with #1085 (install fix for the SOURCE of the no-Qwen state) by making the runtime VISIBILITY of "no local model loadable" testable + integrable.

Background

The @continuum-8e97 RTX 5090 install (2026-05-11) had the CUDA stack ready and VRAM available, yet zero personas replied. Root cause: carl install seeded no Qwen GGUF. The existing LlamaCppAdapter::new() would have panicked with the right message, but the adapter is constructed LAZILY (on the first generate_text call). Personas silent-skip pre-resolver, so that call never happened: the adapter never tried to load and the panic was never reached.

Changes

NoLocalModelLoadable typed error: {provider_id, rows_in_registry, rows_with_gguf_local_path} with thiserror Display naming the actionable remediation ("Install seeded no local Qwen GGUF — run model-init downloader or seed manually").
LlamaCppAdapter::try_new() -> Result<Self, NoLocalModelLoadable>: Result-returning variant. Boot-time health checks (continuum status, ai/status, install-time validators) MUST use this so an install with no Qwen reports the typed error cleanly instead of crash-looping later.
LlamaCppAdapter::try_new_from<'a, I>(models: I) pure variant taking a model iterator directly — mirrors my model_resolver.rs pattern. Lets tests assemble synthetic registries without going through the global() singleton.
Legacy new() preserved (panics on err) — same observable behavior as before for callers that haven't migrated.
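
The shape described above can be sketched as follows. This is a minimal illustration, not the PR's code: `ModelRow` and `new_from` are hypothetical stand-ins, the hard-coded provider id is an assumption, the Display text is a paraphrase, and `Display` is written by hand so the sketch depends only on std (the PR derives it with thiserror).

```rust
use std::fmt;
use std::path::PathBuf;

/// Hypothetical registry row; the real row type is not shown in this PR.
#[derive(Debug, Clone)]
pub struct ModelRow {
    pub id: String,
    pub gguf_local_path: Option<PathBuf>,
}

/// Field names follow the PR description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NoLocalModelLoadable {
    pub provider_id: String,
    pub rows_in_registry: usize,
    pub rows_with_gguf_local_path: usize,
}

impl fmt::Display for NoLocalModelLoadable {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Paraphrased wording (assumption); it keeps the "model-init" hint the tests look for.
        write!(
            f,
            "no local model loadable for provider '{}': {} registry rows, {} with a GGUF path on disk; run model-init downloader or seed manually",
            self.provider_id, self.rows_in_registry, self.rows_with_gguf_local_path
        )
    }
}

impl std::error::Error for NoLocalModelLoadable {}

#[derive(Debug)]
pub struct LlamaCppAdapter {
    pub default_model: String,
    pub model_path: PathBuf,
}

impl LlamaCppAdapter {
    /// Pure variant: takes the rows directly, so tests can pass a synthetic registry.
    pub fn try_new_from<'a, I>(models: I) -> Result<Self, NoLocalModelLoadable>
    where
        I: IntoIterator<Item = &'a ModelRow>,
    {
        let candidates: Vec<&ModelRow> = models.into_iter().collect();
        let with_path: Vec<&ModelRow> = candidates
            .iter()
            .copied()
            .filter(|m| m.gguf_local_path.is_some())
            .collect();

        match with_path.first() {
            Some(row) => Ok(Self {
                default_model: row.id.clone(),
                model_path: row.gguf_local_path.clone().expect("filtered on is_some"),
            }),
            None => Err(NoLocalModelLoadable {
                provider_id: "llamacpp-local".to_string(), // assumed provider id
                rows_in_registry: candidates.len(),
                rows_with_gguf_local_path: with_path.len(),
            }),
        }
    }

    /// Legacy-style wrapper (hypothetical name): same failure, surfaced as a panic.
    pub fn new_from<'a, I>(models: I) -> Self
    where
        I: IntoIterator<Item = &'a ModelRow>,
    {
        Self::try_new_from(models).unwrap_or_else(|e| panic!("{e}"))
    }
}
```

Keeping the panicking constructor as a thin wrapper over the Result-returning one means both paths report the same message; only the delivery differs.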

Tests (3/3 pass)

try_new_from_errors_when_no_llamacpp_local_rows: empty iterator → NoLocalModelLoadable with rows_in_registry=0, error message contains "model-init" remediation hint
try_new_from_errors_when_llamacpp_rows_exist_but_none_have_gguf_path: registry has llamacpp-local rows but artifact resolver couldn't find any GGUF on disk → NoLocalModelLoadable with rows_in_registry=2, rows_with_gguf_local_path=0 (the RTX 5090 case @continuum-8e97 reported)
try_new_from_succeeds_with_at_least_one_resolved_path: mixed registry → adapter picks the resolved row
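
The three tests correspond to three distinguishable registry states. A small sketch of that distinction, driven only by the two counts carried in the error; the `RegistryState` enum and `classify` function are hypothetical names for illustration and do not appear in the PR:

```rust
/// Hypothetical classification of the registry states the tests exercise.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegistryState {
    /// No llamacpp-local rows at all (first test).
    Empty,
    /// Rows exist but none resolved to a GGUF on disk (second test, the RTX 5090 case).
    RowsWithoutGguf,
    /// At least one row has a resolved path (third test).
    Loadable,
}

/// Maps the two counts onto one of the three states.
pub fn classify(rows_in_registry: usize, rows_with_gguf_local_path: usize) -> RegistryState {
    if rows_with_gguf_local_path > 0 {
        RegistryState::Loadable
    } else if rows_in_registry > 0 {
        RegistryState::RowsWithoutGguf
    } else {
        RegistryState::Empty
    }
}
```

Because both counts travel in the error value, a caller can tell "nothing registered" apart from "registered but nothing downloaded" without extra probing.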

Validation

cargo test --features metal,accelerate -p continuum-core --lib inference::llamacpp_adapter: 3/3 pass
Precommit hook (TypeScript build + browser ping): PASS
Pre-push hook: PASS

Out of scope (separate followups)

Wire try_new() into a runtime boot health check (Lane A PR-3 or ai/status integration) so the typed error surfaces to operators via jtag command output. PR-2 ships the primitive; integration is next.
When an explicit GGUF path doesn't exist on disk, the artifact resolver silently falls through to other resolvers (artifacts.rs:73). Worth a separate audit, but it doesn't change PR-2's contract.
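
For the first follow-up, a boot-time check could consume the Result without panicking. The sketch below is an assumption about what such integration might look like, not the planned design: `ProviderHealth` and `check_provider` are hypothetical names, and the function is generic over any error that implements `Display` so it does not depend on the adapter's real types.

```rust
use std::fmt::Display;

/// Hypothetical health report a status command could print.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProviderHealth {
    Ready { provider: String },
    Unavailable { provider: String, reason: String },
}

/// Turns a constructor-style Result into a report instead of unwrapping it.
/// The probe would be something like the result of `LlamaCppAdapter::try_new()`.
pub fn check_provider<T, E: Display>(provider: &str, probe: Result<T, E>) -> ProviderHealth {
    match probe {
        Ok(_) => ProviderHealth::Ready {
            provider: provider.to_string(),
        },
        // The typed error's Display text carries the remediation hint to the operator.
        Err(e) => ProviderHealth::Unavailable {
            provider: provider.to_string(),
            reason: e.to_string(),
        },
    }
}
```

The failure stays loud: it is reported at boot rather than deferred until a persona first tries to generate text.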

🤖 Generated with Claude Code

…typed error

Lane A PR-2: surfaces install-time-no-Qwen as observable runtime health rather than process panic.

Pairs with #1085 (install fix for the SOURCE of the no-Qwen state) by making the runtime VISIBILITY of "no local model loadable" testable + integrable.

Background: continuum-8e97 RTX 5090 install (2026-05-11) had cuda stack ready, VRAM available, zero personas replying; root cause was no Qwen GGUF seeded. The existing `LlamaCppAdapter::new()` would have panicked with the right message, but is constructed LAZILY (first generate_text call). Personas silent-skip pre-resolver, so the panic was never reached. Adapter never tried to load.

Changes:
- New typed error `NoLocalModelLoadable { provider_id, rows_in_registry, rows_with_gguf_local_path }` with thiserror Display naming the actionable remediation ("Install seeded no local Qwen GGUF — run model-init downloader or seed manually").
- New `LlamaCppAdapter::try_new() -> Result<Self, NoLocalModelLoadable>`: Result-returning variant. Boot-time health checks (continuum status, ai/status, install-time validators) MUST use this so an install with no Qwen seeded reports the typed error cleanly instead of crash-looping later when a persona attempts to invoke.
- New `LlamaCppAdapter::try_new_from<'a, I>(models: I)` pure variant taking a model iterator directly, mirroring my model_resolver.rs pattern. Lets tests assemble synthetic registries without going through the global() singleton. `try_new()` calls `try_new_from(global().models_for_provider("llamacpp-local"))`.
- Legacy `LlamaCppAdapter::new()` preserved (panics on err): same observable behavior as before for callers that haven't migrated.

3 tests covering the contract:
- try_new_from_errors_when_no_llamacpp_local_rows: empty iterator → NoLocalModelLoadable with rows_in_registry=0, error message contains "model-init" remediation hint
- try_new_from_errors_when_llamacpp_rows_exist_but_none_have_gguf_path: registry has llamacpp-local rows but artifact resolver couldn't find any GGUF on disk → NoLocalModelLoadable with rows_in_registry=2, rows_with_gguf_local_path=0 (the RTX 5090 case Codex's #1085 + upstream model-init bug produces)
- try_new_from_succeeds_with_at_least_one_resolved_path: mixed registry (one resolved, one not) → adapter picks resolved row, model_path + default_model match

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib inference::llamacpp_adapter: 3/3 pass

Out of scope (separate followups):
- Wire `try_new()` into a runtime boot health check (Lane A PR-3 or ai/status integration), surfaces the typed error to operators via jtag command output. PR-2 ships the primitive; integration is next.
- The artifact resolver behavior when explicit gguf path doesn't exist on disk: silently falls through to other resolvers (artifacts.rs:73). Worth a separate audit but doesn't change PR-2's contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

joelteply · 2026-05-11T20:55:51Z

Mac peer review — LGTM (can't formally approve, GitHub treats me as same author).

Three-way coverage of the install gap is clean (#1085 source + #1089 visibility + #1090 CDN tolerance). Reviewed code + tests:

Strengths:

Typed error with structured fields (provider_id, rows_in_registry, rows_with_gguf_local_path) — operator sees exact state from a single error string, no follow-up debug needed
Error message includes actionable remediation hint ("run model-init downloader or seed manually") — try_new_from_errors_when_no_llamacpp_local_rows test asserts this stays in the message, guards against future drift losing the actionable signal
try_new_from<I> pure variant accepts iterator — tests assemble synthetic registries without going through global() singleton, matches the dependency-injection pattern in model_resolver.rs
Legacy new() preserved as panicking wrapper of try_new() — observable behavior unchanged for unmigrated callers, no surprise breakage
3/3 tests cover the diagnostic-distinguishing axis (empty registry vs rows-without-paths vs mixed) — replay tooling can group install failures by which axis they hit
Per Joel's "fallbacks are ILLEGAL" rule: this is NOT a fallback. Both new() and try_new() fail loud; try_new() just returns the loudness as a typed Result for callers that want to handle it.

Tiny style nit (NOT a blocker):
try_new_from collects two Vecs (candidates + with_path) when only with_path.first() is consumed. Could collapse to a single models.iter().find(|m| m.gguf_local_path.is_some()).ok_or_else(...). Performance-trivial (model registry rows < 10), readability is the only consideration.
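
One consideration for the nit: `find` stops at the first match, so on a single-use iterator it cannot also supply rows_in_registry and rows_with_gguf_local_path for the error. A minimal sketch of a single pass that keeps the first resolved path and both counts without building any Vec, assuming the counts must still be reported. Rows are modeled here as `Option<&str>` paths, and `first_loadable` is a hypothetical helper, not code from the PR:

```rust
/// Single pass over the rows' optional GGUF paths.
/// Returns (first resolved path, total rows, rows with a path).
pub fn first_loadable<'a, I>(paths: I) -> (Option<&'a str>, usize, usize)
where
    I: IntoIterator<Item = Option<&'a str>>,
{
    let mut first = None;
    let mut total = 0;
    let mut with_path = 0;

    for path in paths {
        total += 1;
        if let Some(p) = path {
            with_path += 1;
            // Keep only the first resolved path; later ones are just counted.
            if first.is_none() {
                first = Some(p);
            }
        }
    }

    (first, total, with_path)
}
```

Unlike `find`, this always walks the whole iterator, which is the price of exact counts; with fewer than ten registry rows the difference is negligible either way.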

Ship it. Three-leg install fix lands cleanly when #1085 + #1089 + #1090 all merge.

…1129)

* feat(persona): typed Engram + admission membrane types (#1121 PR-1)

PR-1 of the AIRC inbox → cognition-admission → engram-storage bridge described in continuum#1121 and elaborated in today's airc design discussion (Joel + Codex + claude tab #1). Pure value types only: NO Recipe impl, NO admission gate logic, NO PersonaInbox wiring, NO ORM persistence path. Subsequent PRs layer those over these types.

Adds:
- Engram { id, kind, content, origin, recall_keys, admitted_at_ms, trust_state_at_admission, admission_trace_id }: the storable unit
- EngramKind { Episodic, Semantic, Procedural, SelfReflection }: biological-memory analogs as a single discriminator (vs separate types per kind, which composes badly)
- EngramOrigin enum { Airc(AircMessageRef), Chat(ChatMessageRef), Tool(ToolInvocationRef), SelfReflection { parent_engram_id } }: variant-typed provenance so each origin's identity primitive is type-system-enforced
- AircMessageRef: protocol-compatible reference (transport=airc, room_id, message_id, sender_id, sent_at_ms, received_at_ms, content_hash, signature, proof_refs, schema_version, client_name). Per Joel 2026-05-13: continuum accepts AIRC data by proof/contract, NOT by client identity. Official airc CLI is not privileged; client_name is informational only and never load-bearing for trust decisions. Any producer emitting valid envelopes is acceptable.
- ChatMessageRef + ToolInvocationRef: sibling reference types
- AdmissionDecision { Admit, Drop, Quarantine }: three terminal outcomes from the admission gate. Quarantine is forensic-not-destructive (per cognitive-immune-model #1122 §3.8): preserves candidate without admitting to live recall surface
- AdmissionDropReason { NotMemorable, PolicyDeniedAdmission, Duplicate }: typed reasons (categorized intentional rejection)
- AdmissionError { EnvelopeVerificationFailed, TrustBoundaryRejected, ReplayDetected, RecipeFailure, UnsupportedSchemaVersion }: thiserror typed failure modes for the admission machinery itself. Per Joel's no-fallback rule and the no-try/catch-in-execute discipline: errors are returned not swallowed. Same shape as NoLocalModelLoadable (#1089) and NoMultimodalBase (#1074).
- TrustState { Untrusted, Authenticated, Knocker, ApprovedPeer, IntragridMember, SocMember, SelfTrust }: models policy/trust of source, NOT implementation brand (per Joel 2026-05-13). Ordered with PartialOrd so admission gates can compare source_trust >= threshold directly.

Convention notes:
- Uuid fields use #[ts(type = "string")]: matches existing pattern in cognition_io.rs / channel_items.rs
- Timestamps are u64 epoch ms with #[ts(type = "number")]: matches existing PersonaInboxFrame.oldest_timestamp pattern. Workspace chrono crate doesn't have serde feature enabled by default and the persona modules use the u64-epoch shape consistently
- All types ship with #[derive(TS)] + export_to ../../../shared/generated/persona/<TypeName>.ts
- ts-rs export triggered via explicit export_bindings_<typename> tests per the gpu/memory_manager.rs pattern

Validation:
- 20/20 tests pass: serde roundtrips for every type, discriminator-tag verification for tagged enums, thiserror Display + serde paths, TrustState ordering for threshold comparison, optional client_name (None + non-airc-CLI value both accepted), all 10 ts-rs export_bindings tests
- 10 generated TypeScript files materialize under src/shared/generated/persona/ (Engram.ts, EngramKind.ts, EngramOrigin.ts, AircMessageRef.ts, ChatMessageRef.ts, ToolInvocationRef.ts, AdmissionDecision.ts, AdmissionDropReason.ts, AdmissionError.ts, TrustState.ts)

Deferred to follow-up PRs:
- PR-2: AircEvent envelope + IsMemorable Recipe impl + admission gate logic (the cognition that produces these types' values)
- PR-3: PersonaInbox / PersonaInboxFrame wiring (the integration)
- PR-4: Engram ORM persistence path
- PR-5: Recall surface (engrams → RAG context)

Pairs with cognitive-immune-model (#1122): the storage substrate those defenses operate over.
Pairs with forge-alloy proof contracts (#1119): same typed-Rust-with-ts-rs-export discipline applied to the runtime cognition layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persona): export generated engram bindings

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
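
The TrustState threshold comparison mentioned in that commit message can be illustrated briefly. This sketch assumes the ordering comes from `#[derive(PartialOrd)]`, which orders enum variants by declaration order; whether the real type derives or hand-implements the ordering is not stated, and `meets_threshold` is a hypothetical helper. The variant order is copied from the commit message.

```rust
/// Variants listed lowest to highest trust; derived ordering follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustState {
    Untrusted,
    Authenticated,
    Knocker,
    ApprovedPeer,
    IntragridMember,
    SocMember,
    SelfTrust,
}

/// True when the source's trust is at or above the gate's threshold.
pub fn meets_threshold(source_trust: TrustState, threshold: TrustState) -> bool {
    source_trust >= threshold
}
```

With a derived ordering, reordering the variants silently changes which sources pass a gate, so the declaration order is part of the contract.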

github-actions Bot added the size: M label May 11, 2026

joelteply mentioned this pull request May 11, 2026

fix(install,#1087): make per-VRM download failures non-fatal #1090

Merged

joelteply merged commit 05481f3 into canary May 11, 2026
3 checks passed

joelteply deleted the feat/local-provider-health-check branch May 11, 2026 21:04

joelteply mentioned this pull request May 13, 2026

feat(persona): typed Engram + admission membrane types (#1121 PR-1) #1129

Merged

6 tasks

joelteply merged 1 commit into canary from feat/local-provider-health-check
