Add sensory-bar requirement profile to model resolver (Position 1, PR #1072) #1074
Merged
Conversation
…1072)

Per Joel 2026-05-11 ("every standard persona has sensory I/O and WebRTC presence; text-only is a compatibility mode, not the product. NO COMPROMISE.") and PR #1072's sensory persona alpha contract.

ModelRequirement gains:
- silicon_residency: SiliconResidencyRequirement field
  * GpuOrUnifiedMemoryOnly (alpha bar — no silent CPU fallback)
  * AnySilicon (tests + adapter/compat paths only)
- standard_persona(host) constructor — bundles {Chat, Vision, AudioInput, AudioOutput} + GpuOrUnifiedMemoryOnly + PreferLocal. Standard personas go through this; freelance struct construction is for non-alpha paths.
- standard_persona_local_only(host) variant — locks LocalOnly for air-gapped / M-series default install.

ResolutionError gains two typed buckets so failures are operator-actionable:
- NoMultimodalBase{registry_count, required_sensory_capabilities} — fires when ANY filter empties candidates AND the requirement included the Vision+AudioInput bundle. Names the FORGE GAP directly: ship a multimodal base for this tier. Distinct from the generic NoModelMatchesRequirement, which still covers non-sensory failures.
- SiliconResidencyViolated{rejected_model_id, actual_silicon} — fires when the resolved model's silicon (Cpu, Cloud, etc.) violates the residency requirement. Names what WOULD have run + the silicon it would have landed on.

The resolver pipeline gains a 5th gate (silicon_residency) that runs after ranking and before returning. The is_sensory_query check at the start routes ALL filter-empty errors through NoMultimodalBase when the requirement included the multimodal sensory bundle.
Tests: 25/25 cognition::model_resolver pass (was 16; +9 new):
- standard_persona_constructor_bundles_the_alpha_bar
- standard_persona_local_only_constructor_locks_provider_policy
- current_registry_state_fails_alpha_bar_naming_the_forge_gap (intentional pin: today's registry has NO local multimodal base; this passes by asserting NoMultimodalBase fires; updates to assert success when the forge ships one)
- standard_persona_resolves_when_multimodal_local_base_exists (synthetic multimodal local model + M1 8GB host → resolves on UnifiedMemory)
- standard_persona_rejects_cpu_silicon_no_silent_fallback (CPU host + multimodal local model present → SiliconResidencyViolated)
- standard_persona_rejects_cloud_silicon_under_gpu_residency_with_prefer_local_fallback (PreferLocal but only cloud satisfies bundle → SiliconResidencyViolated on gpt-4o)
- existing missing_capability_errors_no_fallback regression-converted from irrefutable let-binding to match (3 error variants now)

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib cognition::model_resolver: 25/25 pass
- cargo test --features metal,accelerate -p continuum-core --lib model_registry: 13/13 pass (no schema changes; just confirms the cross-module surface isn't disturbed)
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope for this PR (separate followup PRs):
- Wiring standard_persona() into the actual seed/persona-init code path (Lane A territory — TS adapter/lifecycle integration)
- Adding a hardware-detection probe that populates HostCapability
- Forging the multimodal local base GGUFs the resolver demands at every tier (Position 3 territory)
- Re-enabling qwen2-audio-7b in models.toml (substrate work blocked by vision+audio mtmd Metal OOM — not this PR)

This is the typed primitive. Subsequent PRs wire it through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 11, 2026
) Position 1 PR #1074 shipped the typed primitive (standard_persona(host)). Without a probe, every caller has to construct HostCapability by hand — the resolver is callable but not used. This is the production probe.

cognition/host_capability_probe.rs (pure, single file, ~270 lines):
- detect_host_capability(gpu_monitor: &dyn GpuMonitor, system_info: &System) -> Result<HostCapability, ProbeError>
- Maps GpuMonitor::platform to TargetSilicon and dispatches device-name pattern-matching:
  * metal → UnifiedMemory + Apple-Silicon tier (M1Uma8Gb, M1Uma16Gb, M2UmaProMax, M3UmaProMax) from CPU brand + total memory bucket
  * cuda → Gpu + Sm70..Sm120 tier from device name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, etc.)
  * vulkan → Gpu + VulkanAmd
  * mock → M1Uma16Gb (test fixture)
- ProbeError variants:
  * UnknownGpuDevice{platform, device_name} — pattern-match miss; loud fail per Joel's NO COMPROMISE rule (no silent CpuOnly fallback)
  * UnsupportedPlatform{platform} — fires when GpuMonitor reports an unrecognized platform string

Pattern ordering is load-bearing in nvidia_sm_tier(): A100 must be checked before A10/A40 because "A10" is a substring of "A100" — the tests cover this regression vector explicitly, and a comment in the source calls it out.
Tests: 6/6 cognition::host_capability_probe pass:
- mock_platform_returns_test_fixture
- unsupported_platform_errors_loudly
- nvidia_pattern_match_resolves_known_skus (9 device fixtures)
- nvidia_unknown_sku_errors_no_silent_fallback
- apple_silicon_tier_mapping
- export_bindings_probeerror

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe: 6/6
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope (separate followups):
- Wiring detect_host_capability() into the actual server boot path so HostCapability becomes a runtime singleton callers can read
- Re-detect on hardware-change events (battery, thermal throttle)
- Memory-share heuristic (currently total_mem / 2; the right number needs adaptive_throughput integration to coordinate with leases)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 11, 2026
Adds scripts/bench-blackwell-vl.sh: a Docker-based reproducer that builds llama.cpp upstream HEAD with CUDA arch sm_120, downloads Qwen2-VL-7B Q4_K_M + mmproj, then runs llama-bench (text-only) and llama-mtmd-cli (vision smoke). Uses the named volume qwen-vl-bench-work for idempotent re-runs. CUDA_ARCH/MODEL_REPO/MODEL_FILE/MMPROJ_FILE/TEST_IMAGE_URL are all env-overridable so the harness works on other GPU tiers.

Adds docs/benchmarks/blackwell-rtx5090-qwen-vl.md: measured numbers from the first run on an RTX 5090 (pp512=12345 t/s, tg128=215 t/s text-only; tg=201 t/s vision-conditioned, ~2.6s total for 4015 image tokens + 28 output tokens, 1290 MiB mmproj footprint). Documents the actual #1072 forge gap (no single model in models.toml has all 4 standard_persona caps: Chat/Vision/AudioInput/AudioOutput) and proposes 3 paths forward (wait for Qwen-Omni GGUF, tier-aware audio re-enable, or multi-model virtual StandardPersona dispatch via a RequirementProfile extension).

Per the #1072 sensory persona alpha contract + the #1074 standard_persona requirement profile. Establishes the per-tier perf baseline; does not modify models.toml or the resolver.
joelteply pushed a commit to RebelTechPro/continuum that referenced this pull request May 13, 2026
…F PR-2)

Per-pattern ratchet on src/system/user/server/, mirroring PR CambrianTech#1091's LOC ratchet shape. Tracks three anti-patterns under the persona surface:
- fallback_mention (case-insensitive, baseline 83): Joel 2026-04-22 — "fallbacks have ruined this project ... they are ILLEGAL." The WORD count proxies conceptual presence; comments saying "no fallback here" count too.
- direct_adapter_instantiation (baseline 12): matches `new <Name>Adapter(`. The TS surface should request providers via the ModelRequirement → ResolvedModel resolver shipped in CambrianTech#1066/CambrianTech#1074, not instantiate adapters directly.
- direct_api_key_env_read (baseline 0): matches `process.env.*API_KEY`. Cloud key lookup belongs in the Rust provider registry per Codex's CambrianTech#1077 boundary. Locks 0 in.

Per-pattern monotonic decrease: any pattern growing fails CI; shrinkage is allowed and surfaces a hint to --update-baseline post-merge. Same 3-mode shape as PR CambrianTech#1091: default check / --update-baseline / --verbose.

Validated locally: a clean tree passes (3 patterns hold); intentional +2 fallback growth fails with the named pattern, delta, and actionable Rust target paths.

Lane F (PR CambrianTech#1084 alpha workstreams). Companion to CambrianTech#1091 — extends docs/architecture/TS-PERSONA-COGNITION-RATCHET.md with the new gate. Independent CI workflow (~5s, shell + python only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 13, 2026
…1129)

* feat(persona): typed Engram + admission membrane types (#1121 PR-1)

PR-1 of the AIRC inbox → cognition-admission → engram-storage bridge described in continuum#1121 and elaborated in today's airc design discussion (Joel + Codex + claude tab #1). Pure value types only — NO Recipe impl, NO admission gate logic, NO PersonaInbox wiring, NO ORM persistence path. Subsequent PRs layer those over these types.

Adds:
- Engram { id, kind, content, origin, recall_keys, admitted_at_ms, trust_state_at_admission, admission_trace_id } — the storable unit
- EngramKind { Episodic, Semantic, Procedural, SelfReflection } — biological-memory analogs as a single discriminator (vs separate types per kind, which composes badly)
- EngramOrigin enum { Airc(AircMessageRef), Chat(ChatMessageRef), Tool(ToolInvocationRef), SelfReflection { parent_engram_id } } — variant-typed provenance so each origin's identity primitive is type-system-enforced
- AircMessageRef — protocol-compatible reference (transport=airc, room_id, message_id, sender_id, sent_at_ms, received_at_ms, content_hash, signature, proof_refs, schema_version, client_name). Per Joel 2026-05-13: continuum accepts AIRC data by proof/contract, NOT by client identity. The official airc CLI is not privileged; client_name is informational only and never load-bearing for trust decisions. Any producer emitting valid envelopes is acceptable.
- ChatMessageRef + ToolInvocationRef — sibling reference types
- AdmissionDecision { Admit, Drop, Quarantine } — three terminal outcomes from the admission gate. Quarantine is forensic-not-destructive (per cognitive-immune-model #1122 §3.8) — preserves the candidate without admitting it to the live recall surface
- AdmissionDropReason { NotMemorable, PolicyDeniedAdmission, Duplicate } — typed reasons (categorized intentional rejection)
- AdmissionError { EnvelopeVerificationFailed, TrustBoundaryRejected, ReplayDetected, RecipeFailure, UnsupportedSchemaVersion } — thiserror-typed failure modes for the admission machinery itself. Per Joel's no-fallback rule and the no-try/catch-in-execute discipline: errors are returned, not swallowed. Same shape as NoLocalModelLoadable (#1089) and NoMultimodalBase (#1074).
- TrustState { Untrusted, Authenticated, Knocker, ApprovedPeer, IntragridMember, SocMember, SelfTrust } — models policy/trust of the source, NOT implementation brand (per Joel 2026-05-13). Ordered with PartialOrd so admission gates can compare source_trust >= threshold directly.

Convention notes:
- Uuid fields use #[ts(type = "string")] — matches the existing pattern in cognition_io.rs / channel_items.rs
- Timestamps are u64 epoch ms with #[ts(type = "number")] — matches the existing PersonaInboxFrame.oldest_timestamp pattern. The workspace chrono crate doesn't have the serde feature enabled by default, and the persona modules use the u64-epoch shape consistently
- All types ship with #[derive(TS)] + export_to ../../../shared/generated/persona/<TypeName>.ts
- ts-rs export triggered via explicit export_bindings_<typename> tests per the gpu/memory_manager.rs pattern

Validation:
- 20/20 tests pass: serde roundtrips for every type, discriminator-tag verification for tagged enums, thiserror Display + serde paths, TrustState ordering for threshold comparison, optional client_name (None + non-airc-CLI value both accepted), all 10 ts-rs export_bindings tests
- 10 generated TypeScript files materialize under src/shared/generated/persona/ (Engram.ts, EngramKind.ts, EngramOrigin.ts, AircMessageRef.ts, ChatMessageRef.ts, ToolInvocationRef.ts, AdmissionDecision.ts, AdmissionDropReason.ts, AdmissionError.ts, TrustState.ts)

Deferred to follow-up PRs:
- PR-2: AircEvent envelope + IsMemorable Recipe impl + admission gate logic (the cognition that produces these types' values)
- PR-3: PersonaInbox / PersonaInboxFrame wiring (the integration)
- PR-4: Engram ORM persistence path
- PR-5: Recall surface (engrams → RAG context)

Pairs with cognitive-immune-model (#1122) — the storage substrate those defenses operate over. Pairs with forge-alloy proof contracts (#1119) — the same typed-Rust-with-ts-rs-export discipline applied to the runtime cognition layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persona): export generated engram bindings

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Position 1 — typed sensory capability requirement + GPU residency
Implements the resolver-side primitive for the sensory persona alpha contract codified in #1072. NO COMPROMISE per Joel 2026-05-11.
Scope
`cognition::model_resolver` gains:

- `SiliconResidencyRequirement` enum: `GpuOrUnifiedMemoryOnly` (alpha bar) | `AnySilicon` (test/compat opt-out)
- `ModelRequirement.silicon_residency` field — required, no default
- `ModelRequirement::standard_persona(host)` — bundles `{Chat, Vision, AudioInput, AudioOutput}` + `GpuOrUnifiedMemoryOnly` + `PreferLocal`. Standard personas go through this; freelance struct construction is for non-alpha paths.
- `ModelRequirement::standard_persona_local_only(host)` — strict variant for air-gapped / M-series default install.
- `ResolutionError::NoMultimodalBase` — fires when ANY filter empties candidates AND the requirement included Vision+AudioInput. Names the FORGE GAP directly.
- `ResolutionError::SiliconResidencyViolated` — fires when the resolved silicon (Cpu, Cloud, etc.) violates the residency requirement. Names the model that would have run + where.

Validation
- `cargo test --features metal,accelerate -p continuum-core --lib cognition::model_resolver`: 25/25 pass (was 16, +9 new sensory-bar tests)
- `cargo test --features metal,accelerate -p continuum-core --lib model_registry`: 13/13 pass
- `npx tsx scripts/build-with-loud-failure.ts`: TypeScript clean (ts-rs regenerated 4 cognition/* binding files)

Key tests pinning the contract
- `current_registry_state_fails_alpha_bar_naming_the_forge_gap` — today's registry has NO local multimodal base; this passes by asserting `NoMultimodalBase` fires. Updates to assert success when the forge ships one.
- `standard_persona_rejects_cpu_silicon_no_silent_fallback` — proves the GPU residency gate refuses CPU even when capabilities match.
- `standard_persona_rejects_cloud_silicon_under_gpu_residency_with_prefer_local_fallback` — proves PreferLocal + GpuOrUnifiedMemoryOnly errors on cloud-only candidates rather than silently shipping a cloud answer.

Out of scope (followup PRs)
This is the typed primitive. Subsequent PRs wire it through:
- Wiring `standard_persona()` into the seed/persona-init code path so personas are actually resolved through it.
- A hardware-detection probe that populates `HostCapability`.
- Re-enabling qwen2-audio-7b in `models.toml` (currently commented out due to vision+audio mtmd Metal OOM — substrate work, not this PR).
- Forging the multimodal local bases the resolver demands — the new `ResolutionError` variants for its loud-fail buckets name that gap.

🤖 Generated with Claude Code