
Add sensory-bar requirement profile to model resolver (Position 1, PR #1072) #1074

Merged

joelteply merged 3 commits into canary from feat/sensory-requirement-profile on May 13, 2026

Conversation

@joelteply
Contributor

Position 1 — typed sensory capability requirement + GPU residency

Implements the resolver-side primitive for the sensory persona alpha contract codified in #1072. NO COMPROMISE per Joel 2026-05-11.

Scope

cognition::model_resolver gains:

  • SiliconResidencyRequirement enum: GpuOrUnifiedMemoryOnly (alpha bar) | AnySilicon (test/compat opt-out)
  • ModelRequirement.silicon_residency field — required, no default
  • ModelRequirement::standard_persona(host) — bundles {Chat, Vision, AudioInput, AudioOutput} + GpuOrUnifiedMemoryOnly + PreferLocal. Standard personas go through this; freelance struct construction is for non-alpha paths.
  • ModelRequirement::standard_persona_local_only(host) — strict variant for air-gapped / M-series default install.
  • ResolutionError::NoMultimodalBase — fires when ANY filter empties candidates AND requirement included Vision+AudioInput. Names the FORGE GAP directly.
  • ResolutionError::SiliconResidencyViolated — fires when resolved silicon (Cpu, Cloud, etc.) violates residency requirement. Names the model that would have run + where.
  • New 5th filter (silicon_residency) runs after ranking, before returning.
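
The residency requirement above can be sketched as a small predicate. This is illustrative only: the `SiliconResidencyRequirement` variant names come from the PR text, while the `Silicon` enum shape and the `permits` helper are assumptions, not the actual resolver API.

```rust
// Sketch only — `Silicon` variants and `permits` are illustrative assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Silicon {
    Gpu,
    UnifiedMemory,
    Cpu,
    Cloud,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SiliconResidencyRequirement {
    /// Alpha bar: no silent CPU or cloud fallback.
    GpuOrUnifiedMemoryOnly,
    /// Test / adapter-compat opt-out.
    AnySilicon,
}

impl SiliconResidencyRequirement {
    pub fn permits(self, silicon: Silicon) -> bool {
        match self {
            Self::AnySilicon => true,
            Self::GpuOrUnifiedMemoryOnly => {
                matches!(silicon, Silicon::Gpu | Silicon::UnifiedMemory)
            }
        }
    }
}

fn main() {
    // The alpha bar admits GPU and unified memory, nothing else.
    assert!(SiliconResidencyRequirement::GpuOrUnifiedMemoryOnly.permits(Silicon::UnifiedMemory));
    assert!(!SiliconResidencyRequirement::GpuOrUnifiedMemoryOnly.permits(Silicon::Cpu));
    println!("residency predicate holds");
}
```

Because the field is required with no default, every `ModelRequirement` construction site has to state its residency stance explicitly.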

Validation

  • cargo test --features metal,accelerate -p continuum-core --lib cognition::model_resolver: 25/25 pass (was 16, +9 new sensory-bar tests)
  • cargo test --features metal,accelerate -p continuum-core --lib model_registry: 13/13 pass
  • npx tsx scripts/build-with-loud-failure.ts: TypeScript clean (ts-rs regenerated 4 cognition/* binding files)

Key tests pinning the contract

  • current_registry_state_fails_alpha_bar_naming_the_forge_gap — today's registry has NO local multimodal base; this passes by asserting NoMultimodalBase fires, and will be updated to assert success once the forge ships one.
  • standard_persona_rejects_cpu_silicon_no_silent_fallback — proves the GPU residency gate refuses CPU even when capabilities match.
  • standard_persona_rejects_cloud_silicon_under_gpu_residency_with_prefer_local_fallback — proves PreferLocal+GpuOrUnifiedMemoryOnly errors on cloud-only candidates rather than silently shipping a cloud answer.

Out of scope (followup PRs)

This is the typed primitive. Subsequent PRs wire it through:

  1. Lane A territory: wire standard_persona() into the seed/persona-init code path so personas are actually resolved through it.
  2. Probe: hardware-detection that populates HostCapability.
  3. Position 3 territory: forge the multimodal local base GGUFs the resolver demands at every tier (the resolver currently fails-loud on every host because no such model is in the registry — that failure IS the forge gap surfacing).
  4. Re-enable qwen2-audio-7b in models.toml (currently commented out due to vision+audio mtmd Metal OOM — substrate work, not this PR).
  5. Position 2 (test(sensory): Position 2 alpha-contract WebRTC sensory smoke #1073) consumes the new ResolutionError variants for its loud-fail buckets.

🤖 Generated with Claude Code

…1072)

Per Joel 2026-05-11 ("every standard persona has sensory I/O and WebRTC
presence; text-only is a compatibility mode, not the product. NO
COMPROMISE.") and PR #1072's sensory persona alpha contract.

ModelRequirement gains:
- silicon_residency: SiliconResidencyRequirement field
  - GpuOrUnifiedMemoryOnly (alpha bar — no silent CPU fallback)
  - AnySilicon (tests + adapter/compat paths only)
- standard_persona(host) constructor — bundles {Chat, Vision, AudioInput,
  AudioOutput} + GpuOrUnifiedMemoryOnly + PreferLocal. Standard personas
  go through this; freelance struct construction is for non-alpha paths.
- standard_persona_local_only(host) variant — locks LocalOnly for
  air-gapped / M-series default install.

ResolutionError gains two typed buckets so failures are operator-actionable:
- NoMultimodalBase{registry_count, required_sensory_capabilities}
  fires when ANY filter empties candidates AND requirement included the
  Vision+AudioInput bundle. Names the FORGE GAP directly: ship a
  multimodal base for this tier. Distinct from the generic
  NoModelMatchesRequirement which still covers non-sensory failures.
- SiliconResidencyViolated{rejected_model_id, actual_silicon} fires when
  the resolved model's silicon (Cpu, Cloud, etc.) violates the residency
  requirement. Names what WOULD have run + the silicon it would have
  landed on.

Resolver pipeline gains a 5th gate (silicon_residency) that runs after
ranking and before returning. The is_sensory_query check at the start
routes ALL filter-empty errors through NoMultimodalBase when the
requirement included the multimodal sensory bundle.
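
The post-ranking gate described above can be shaped roughly like this. The error payload fields (`rejected_model_id`, `actual_silicon`) follow the commit text; the gate's signature and the surrounding types are assumptions for illustration.

```rust
// Illustrative sketch of the 5th (silicon_residency) gate; real resolver
// types and the gate's exact signature are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Silicon {
    Gpu,
    UnifiedMemory,
    Cpu,
    Cloud,
}

#[derive(Debug, PartialEq)]
pub enum ResolutionError {
    SiliconResidencyViolated {
        rejected_model_id: String,
        actual_silicon: Silicon,
    },
}

/// Runs after ranking, before returning the winner.
pub fn silicon_residency_gate(
    model_id: &str,
    actual: Silicon,
    gpu_or_unified_only: bool,
) -> Result<(), ResolutionError> {
    if gpu_or_unified_only && !matches!(actual, Silicon::Gpu | Silicon::UnifiedMemory) {
        // Loud fail: name what WOULD have run and where it would have landed.
        return Err(ResolutionError::SiliconResidencyViolated {
            rejected_model_id: model_id.to_string(),
            actual_silicon: actual,
        });
    }
    Ok(())
}

fn main() {
    let err = silicon_residency_gate("gpt-4o", Silicon::Cloud, true);
    assert!(err.is_err());
    println!("gate rejects cloud silicon under the alpha bar");
}
```

The key property pinned by the tests is that the gate errors rather than degrading: a capability-matching candidate on the wrong silicon is a typed failure, not a fallback.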

Tests: 25/25 cognition::model_resolver pass (was 16; +9 new):
- standard_persona_constructor_bundles_the_alpha_bar
- standard_persona_local_only_constructor_locks_provider_policy
- current_registry_state_fails_alpha_bar_naming_the_forge_gap (intentional
  pin: today's registry has NO local multimodal base; this passes by
  asserting NoMultimodalBase fires; updates to assert success when the
  forge ships one)
- standard_persona_resolves_when_multimodal_local_base_exists (synthetic
  multimodal local model + M1 8GB host → resolves on UnifiedMemory)
- standard_persona_rejects_cpu_silicon_no_silent_fallback (CPU host +
  multimodal local model present → SiliconResidencyViolated)
- standard_persona_rejects_cloud_silicon_under_gpu_residency_with_prefer_local_fallback
  (PreferLocal but only cloud satisfies bundle → SiliconResidencyViolated
  on gpt-4o)
- existing missing_capability_errors_no_fallback test converted from an
  irrefutable let-binding to a match (now covers 3 error variants)

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib
  cognition::model_resolver: 25/25 pass
- cargo test --features metal,accelerate -p continuum-core --lib
  model_registry: 13/13 pass (no schema changes; just confirms cross-
  module isn't disturbed)
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope for this PR (separate followup PRs):
- Wiring standard_persona() into the actual seed/persona-init code path
  (Lane A territory — TS adapter/lifecycle integration)
- Adding hardware-detection probe that populates HostCapability
- Forging the multimodal local base GGUFs the resolver demands at every
  tier (Position 3 territory)
- Re-enabling qwen2-audio-7b in models.toml (substrate work blocked by
  vision+audio mtmd Metal OOM — not this PR)

This is the typed primitive. Subsequent PRs wire it through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply marked this pull request as ready for review May 11, 2026 17:34
joelteply added a commit that referenced this pull request May 11, 2026
)

Position 1 PR #1074 shipped the typed primitive (standard_persona(host)).
Without a probe, every caller has to construct HostCapability by hand —
the resolver is callable but not used. This is the production probe.

cognition/host_capability_probe.rs (pure, single file, ~270 lines):
- detect_host_capability(gpu_monitor: &dyn GpuMonitor, system_info: &System)
  -> Result<HostCapability, ProbeError>
- Maps GpuMonitor::platform to TargetSilicon and dispatches device-name
  pattern-matching:
  * metal → UnifiedMemory + Apple-Silicon tier (M1Uma8Gb, M1Uma16Gb,
    M2UmaProMax, M3UmaProMax) from CPU brand + total memory bucket
  * cuda → Gpu + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120,
    H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, etc.)
  * vulkan → Gpu + VulkanAmd
  * mock → M1Uma16Gb (test fixture)
- ProbeError variants:
  * UnknownGpuDevice{platform, device_name} — pattern-match miss; loud
    fail per Joel's NO COMPROMISE rule (no silent CpuOnly fallback)
  * UnsupportedPlatform{platform} — fires when GpuMonitor reports an
    unrecognized platform string

Pattern-ordering is load-bearing in nvidia_sm_tier(): A100 must be
checked before A10/A40 because "A10" is a substring of "A100" — the
tests cover this regression vector explicitly. Comment in the source
calls it out.
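
The substring hazard called out above can be shown in a few lines. The RTX 5090/H100/A100/T4/V100 tier mappings follow the commit text; the return type, the A10/A40 tier (real A10/A40 cards are sm_86, but that tier is not named in the commit), and the matching logic are illustrative assumptions.

```rust
// Sketch of the load-bearing pattern ordering; not the actual probe code.
fn nvidia_sm_tier(device_name: &str) -> Option<&'static str> {
    let name = device_name.to_ascii_uppercase();
    if name.contains("RTX 5090") {
        return Some("Sm120");
    }
    if name.contains("H100") {
        return Some("Sm90");
    }
    // Load-bearing order: A100 must be checked before A10/A40,
    // because "A10" is a substring of "A100".
    if name.contains("A100") {
        return Some("Sm80");
    }
    if name.contains("A10") || name.contains("A40") {
        return Some("Sm86"); // tier assumed, not named in the commit
    }
    if name.contains("T4") {
        return Some("Sm75");
    }
    if name.contains("V100") {
        return Some("Sm70");
    }
    None // caller maps this to UnknownGpuDevice — loud fail, no CpuOnly fallback
}

fn main() {
    assert_eq!(nvidia_sm_tier("NVIDIA A100-SXM4-80GB"), Some("Sm80"));
    assert_eq!(nvidia_sm_tier("NVIDIA A10"), Some("Sm86"));
    println!("A100 wins before the A10 arm fires");
}
```

Swapping the A100 and A10 arms would silently classify every A100 as an A10-tier device, which is exactly the regression vector the fixtures pin.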

Tests: 6/6 cognition::host_capability_probe pass:
- mock_platform_returns_test_fixture
- unsupported_platform_errors_loudly
- nvidia_pattern_match_resolves_known_skus (9 device fixtures)
- nvidia_unknown_sku_errors_no_silent_fallback
- apple_silicon_tier_mapping
- export_bindings_probeerror

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib
  cognition::host_capability_probe: 6/6
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope (separate followups):
- Wiring detect_host_capability() into the actual server boot path so
  HostCapability becomes a runtime singleton callers can read
- Re-detect on hardware-change events (battery, thermal throttle)
- Memory-share heuristic (currently total_mem / 2; the right number
  needs adaptive_throughput integration to coordinate with leases)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 11, 2026
Adds scripts/bench-blackwell-vl.sh: Docker-based reproducer that builds
llama.cpp upstream HEAD with CUDA arch sm_120, downloads Qwen2-VL-7B
Q4_K_M + mmproj, runs llama-bench (text-only) and llama-mtmd-cli (vision
smoke). Uses named volume qwen-vl-bench-work for idempotent re-runs.
CUDA_ARCH/MODEL_REPO/MODEL_FILE/MMPROJ_FILE/TEST_IMAGE_URL all
env-overridable so the harness works on other GPU tiers.

Adds docs/benchmarks/blackwell-rtx5090-qwen-vl.md: measured numbers from
the first run on RTX 5090 (pp512=12345 t/s, tg128=215 t/s text-only;
tg=201 t/s vision-conditioned, ~2.6s total for 4015 image tokens + 28
output tokens, 1290 MiB mmproj footprint). Documents the actual #1072
forge gap (no single model in models.toml has all 4 standard_persona
caps: Chat/Vision/AudioInput/AudioOutput) and proposes 3 paths forward
(wait for Qwen-Omni GGUF, tier-aware audio re-enable, or multi-model
virtual StandardPersona dispatch via RequirementProfile extension).

Per #1072 sensory persona alpha contract + #1074 standard_persona
requirement profile. Establishes the per-tier perf baseline; does not
modify models.toml or the resolver.
joelteply pushed a commit to RebelTechPro/continuum that referenced this pull request May 13, 2026
…F PR-2)

Per-pattern ratchet on src/system/user/server/, mirroring PR CambrianTech#1091's
LOC ratchet shape. Tracks three anti-patterns under the persona surface:

  - fallback_mention (case-insensitive, baseline 83): Joel 2026-04-22 —
    "fallbacks have ruined this project ... they are ILLEGAL." The WORD
    count proxies conceptual presence; comments saying "no fallback
    here" count too.
  - direct_adapter_instantiation (baseline 12): matches `new <Name>Adapter(`.
    TS surface should request providers via the ModelRequirement →
    ResolvedModel resolver shipped in CambrianTech#1066/CambrianTech#1074, not instantiate
    adapters directly.
  - direct_api_key_env_read (baseline 0): matches `process.env.*API_KEY`.
    Cloud key lookup belongs in the Rust provider registry per Codex's
    CambrianTech#1077 boundary. Locks 0 in.

Per-pattern monotonic-decrease (any pattern growing fails CI; shrinkage
allowed and surfaces a hint to --update-baseline post-merge). Same
3-mode shape as PR CambrianTech#1091: default check / --update-baseline / --verbose.

Validated locally: clean tree passes (3 patterns hold), intentional
+2 fallback growth fails with named pattern + delta + actionable Rust
target paths.
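
A minimal sketch of the per-pattern monotonic ratchet described above, in the script's shell+python register. The three pattern names and baselines mirror the commit text; the regexes, the `check` function, and the hint wording are assumptions, not the shipped script.

```python
import re

# Baselines mirror the commit text; regex details are illustrative assumptions.
PATTERNS = {
    "fallback_mention": (re.compile(r"fallback", re.IGNORECASE), 83),
    "direct_adapter_instantiation": (re.compile(r"new \w+Adapter\("), 12),
    "direct_api_key_env_read": (re.compile(r"process\.env\.\w*API_KEY"), 0),
}


def check(source: str) -> list[str]:
    """Return a failure message for every pattern whose count grew past baseline.

    Shrinkage is allowed and only surfaces a hint to update the baseline.
    """
    failures = []
    for name, (pattern, baseline) in PATTERNS.items():
        count = len(pattern.findall(source))
        if count > baseline:
            failures.append(f"{name}: {count} > baseline {baseline} (+{count - baseline})")
        elif count < baseline:
            print(f"hint: {name} shrank to {count}; run --update-baseline post-merge")
    return failures


if __name__ == "__main__":
    # A direct env read trips the locked-in 0 baseline immediately.
    assert check("const k = process.env.OPENAI_API_KEY;")
    print("ratchet sketch behaves as described")
```

The monotonic-decrease property falls out of the asymmetry: growth is a hard failure, shrinkage only prints a hint, so baselines can only ratchet down.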

Lane F (PR CambrianTech#1084 alpha workstreams). Companion to CambrianTech#1091 — extends
docs/architecture/TS-PERSONA-COGNITION-RATCHET.md with the new gate.
Independent CI workflow (~5s, shell + python only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply merged commit bdb4fa6 into canary May 13, 2026
3 checks passed
@joelteply joelteply deleted the feat/sensory-requirement-profile branch May 13, 2026 17:26
joelteply added a commit that referenced this pull request May 13, 2026
…1129)

* feat(persona): typed Engram + admission membrane types (#1121 PR-1)

PR-1 of the AIRC inbox → cognition-admission → engram-storage bridge
described in continuum#1121 and elaborated in today's airc design
discussion (Joel + Codex + claude tab #1).

Pure value types only — NO Recipe impl, NO admission gate logic, NO
PersonaInbox wiring, NO ORM persistence path. Subsequent PRs layer
those over these types.

Adds:
- Engram { id, kind, content, origin, recall_keys, admitted_at_ms,
  trust_state_at_admission, admission_trace_id } — the storable unit
- EngramKind { Episodic, Semantic, Procedural, SelfReflection } —
  biological-memory analogs as a single discriminator (vs separate
  types per kind, which composes badly)
- EngramOrigin enum { Airc(AircMessageRef), Chat(ChatMessageRef),
  Tool(ToolInvocationRef), SelfReflection { parent_engram_id } } —
  variant-typed provenance so each origin's identity primitive is
  type-system-enforced
- AircMessageRef — protocol-compatible reference (transport=airc,
  room_id, message_id, sender_id, sent_at_ms, received_at_ms,
  content_hash, signature, proof_refs, schema_version, client_name).
  Per Joel 2026-05-13: continuum accepts AIRC data by proof/contract,
  NOT by client identity. Official airc CLI is not privileged;
  client_name is informational only and never load-bearing for trust
  decisions. Any producer emitting valid envelopes is acceptable.
- ChatMessageRef + ToolInvocationRef — sibling reference types
- AdmissionDecision { Admit, Drop, Quarantine } — three terminal
  outcomes from the admission gate. Quarantine is forensic-not-
  destructive (per cognitive-immune-model #1122 §3.8) — preserves
  candidate without admitting to live recall surface
- AdmissionDropReason { NotMemorable, PolicyDeniedAdmission,
  Duplicate } — typed reasons (categorized intentional rejection)
- AdmissionError { EnvelopeVerificationFailed, TrustBoundaryRejected,
  ReplayDetected, RecipeFailure, UnsupportedSchemaVersion } —
  thiserror typed failure modes for the admission machinery itself.
  Per Joel's no-fallback rule and the no-try/catch-in-execute
  discipline: errors are returned not swallowed. Same shape as
  NoLocalModelLoadable (#1089) and NoMultimodalBase (#1074).
- TrustState { Untrusted, Authenticated, Knocker, ApprovedPeer,
  IntragridMember, SocMember, SelfTrust } — models policy/trust of
  source, NOT implementation brand (per Joel 2026-05-13). Ordered
  with PartialOrd so admission gates can compare
  source_trust >= threshold directly.
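
The ordered-ladder comparison described above leans on Rust's derived ordering, which follows variant declaration order. The variant names come from the commit text; the gate helper is an illustrative assumption.

```rust
// Sketch: derived PartialOrd/Ord order variants by declaration order,
// so Untrusted < Authenticated < ... < SelfTrust.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustState {
    Untrusted,
    Authenticated,
    Knocker,
    ApprovedPeer,
    IntragridMember,
    SocMember,
    SelfTrust,
}

/// Hypothetical admission check: compare source trust against a threshold
/// directly, per the PR's `source_trust >= threshold` shape.
pub fn admission_gate_passes(source_trust: TrustState, threshold: TrustState) -> bool {
    source_trust >= threshold
}

fn main() {
    assert!(admission_gate_passes(TrustState::SocMember, TrustState::ApprovedPeer));
    assert!(!admission_gate_passes(TrustState::Knocker, TrustState::IntragridMember));
    println!("trust ladder orders as declared");
}
```

One consequence of deriving the order is that the ladder is encoded once, in the enum declaration, rather than in ad-hoc comparison tables at each gate.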

Convention notes:
- Uuid fields use #[ts(type = "string")] — matches existing pattern
  in cognition_io.rs / channel_items.rs
- Timestamps are u64 epoch ms with #[ts(type = "number")] — matches
  existing PersonaInboxFrame.oldest_timestamp pattern. Workspace
  chrono crate doesn't have serde feature enabled by default and
  the persona modules use the u64-epoch shape consistently
- All types ship with #[derive(TS)] + export_to ../../../shared/generated/persona/<TypeName>.ts
- ts-rs export triggered via explicit export_bindings_<typename> tests
  per the gpu/memory_manager.rs pattern

Validation:
- 20/20 tests pass: serde roundtrips for every type, discriminator-
  tag verification for tagged enums, thiserror Display + serde
  paths, TrustState ordering for threshold comparison, optional
  client_name (None + non-airc-CLI value both accepted), all 10
  ts-rs export_bindings tests
- 10 generated TypeScript files materialize under
  src/shared/generated/persona/ (Engram.ts, EngramKind.ts,
  EngramOrigin.ts, AircMessageRef.ts, ChatMessageRef.ts,
  ToolInvocationRef.ts, AdmissionDecision.ts, AdmissionDropReason.ts,
  AdmissionError.ts, TrustState.ts)

Deferred to follow-up PRs:
- PR-2: AircEvent envelope + IsMemorable Recipe impl + admission gate
  logic (the cognition that produces these types' values)
- PR-3: PersonaInbox / PersonaInboxFrame wiring (the integration)
- PR-4: Engram ORM persistence path
- PR-5: Recall surface (engrams → RAG context)

Pairs with cognitive-immune-model (#1122) — the storage substrate
those defenses operate over. Pairs with forge-alloy proof contracts
(#1119) — same typed-Rust-with-ts-rs-export discipline applied to
the runtime cognition layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persona): export generated engram bindings

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
