Define Rust persona runtime alpha contract #1062
LGTM. This is the brainstorm we just had, landing as Rust types. Sibling already LGTM'd in parallel; adding a distinct architectural read. Maps to the 6 primitives we converged on tonight

PR Sequence A-F is the right ladder

The sequence in the doc (A: contract, B: TS adapter obeys, C: Rust runs turn, D: model resolver, E: memory admission, F: data canonical handles) walks the brainstorm's "smallest skeleton" → "layer each primitive without breaking trace fingerprint" pattern exactly. PR A (this) is the contract-only baseline. Each subsequent PR can be tested against the SAME contract: did we break the planner's promised wave shape? Easy regression check.

Two non-blocking observations

(1) (2)

LGTM ship.
|
Durable input on PR #1062: going deep on three of Codex's listed surfaces (scheduler/traits, handle/id boundaries, max-safe parallelism). These shape PRs B-F.

Scheduler/traits: the plan IS the trait

A read of PR B-C from the doc: TS adapter has to OBEY the plan (B); Rust runtime has to PRODUCE the same observable result executing the plan (C). The temptation will be to define a

Better shape: don't define an executor trait. The PLAN is the contract. PR B implements

Why this matters: if you define a Rust trait + corresponding TS trait now, you create a 4-corner versioning problem (Rust-trait-version × TS-trait-version × Rust-impl × TS-impl). Skip the trait, treat the plan as the wire boundary, and the impl on each side becomes a black box. Same shape as

Concrete suggestion for PR B: define

Handle/id no-copy boundaries

The PR establishes two SHA-hashed keys:

Concrete suggestion: PR B should establish the cache pattern explicitly. Pseudocode:

// Bad: send full context every turn
const result = await callRust('persona/run-turn', { ...plan, full_persona_contexts })
// Good: send context handle, Rust resolves from its own cache
const missingKeys = await callRust('cognition/get-missing-context-keys', { keys: plan.persona_plans.map(p => p.persona_context_key) })
if (missingKeys.length > 0) {
await callRust('cognition/cache-context', { contexts: missingKeys.map(k => buildContext(k)) })
}
const result = await callRust('cognition/run-turn', { plan }) // Rust looks up by key

The first call learns what's missing; the second uploads only what's missing; the third executes by handle. Same pattern as Git's pack negotiation. On a steady-state room (same persona contexts), nothing is missing, so the second call is skipped: zero copy after the warm-up turn. Same pattern for

Max-safe parallelism: cloud lanes are missing
Concrete suggestion: add

let wave = if is_local_provider(...) {
    generation_order / max_concurrent_local_generations
} else {
    let lane_cap = cloud_lane_caps.get(provider).unwrap_or(&4);
    cloud_provider_order_seen[provider] / lane_cap // per-provider counter
};

This is policy-as-data per the brainstorm. Defaults can ship in

Tiny
LGTM ship.
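For the Handle/id point above, here is a minimal sketch of what the Rust side of the three-call handshake could look like: report missing keys, accept only those contexts, then resolve by key at run time. Every name here (`ContextCache`, `PersonaContext`, `RunTurnError`, the method names) is hypothetical and not taken from the PR; the actual context payload and key types are assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a resolved persona context (real payload not shown in the PR).
#[derive(Debug, Clone, PartialEq)]
pub struct PersonaContext {
    pub body: String,
}

#[derive(Debug, PartialEq)]
pub enum RunTurnError {
    /// Fail loudly rather than run a turn without its context.
    MissingContext(String),
}

/// Rust-side cache keyed by the hashed context key (assumed to be a String here).
#[derive(Default)]
pub struct ContextCache {
    by_key: HashMap<String, PersonaContext>,
}

impl ContextCache {
    /// Call 1: which of the plan's keys are not cached yet?
    pub fn missing_keys(&self, keys: &[String]) -> Vec<String> {
        keys.iter()
            .filter(|k| !self.by_key.contains_key(*k))
            .cloned()
            .collect()
    }

    /// Call 2: store only the contexts the caller was told were missing.
    pub fn cache_contexts(&mut self, contexts: Vec<(String, PersonaContext)>) {
        self.by_key.extend(contexts);
    }

    /// Call 3: resolve every key by handle; error on the first miss.
    pub fn resolve_for_turn(&self, keys: &[String]) -> Result<Vec<&PersonaContext>, RunTurnError> {
        keys.iter()
            .map(|k| {
                self.by_key
                    .get(k)
                    .ok_or_else(|| RunTurnError::MissingContext(k.clone()))
            })
            .collect()
    }
}
```

On a warm cache `missing_keys` returns an empty list, so the caller never has to serialize a context again; a turn that references an uncached key surfaces as a typed error instead of a silent fallback.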
|
Sanity-check on ResourceClass shape (11 variants)

Largely right, four observations:
Lease / handle semantics
The "Pipes Carry Leases" doc principle suggests these matter. Concrete suggestion: extend

Dependency wakeups

This is correct (planner stays pure), but the doc should say "waker is caller-side; this module decides admission, not when to retry." Otherwise PR B/C readers might assume the planner notifies them.

ORM/inference/WebRTC/Bevy executable guard

Yes, this is the right shape: ONE module, 4 different real workloads, cross-dependency chain (webrtc.frame_decoded → bevy.texture). The test asserts the chain breaks cleanly when a dep is missing. Sharp. Two extensions worth adding:
Sort policy nit
Tiny
LGTM ship.
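To make the "decides admission, not when to retry" wording concrete, here is a small sketch of a pure admission check. The `Request`, `Admission`, and `admit` names are hypothetical, and modelling dependencies as plain strings is an assumption; the PR's real types may differ.

```rust
use std::collections::HashSet;

/// Hypothetical request: a named unit of work plus the events it depends on.
pub struct Request {
    pub name: String,
    pub depends_on: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum Admission {
    Admitted,
    /// Lists the unmet dependencies; when to retry is the caller's decision.
    Blocked { missing: Vec<String> },
}

/// Pure decision: no clocks, no channels, no wakers. Same inputs, same answer.
pub fn admit(request: &Request, satisfied: &HashSet<String>) -> Admission {
    let missing: Vec<String> = request
        .depends_on
        .iter()
        .filter(|dep| !satisfied.contains(*dep))
        .cloned()
        .collect();
    if missing.is_empty() {
        Admission::Admitted
    } else {
        Admission::Blocked { missing }
    }
}
```

With this shape, a `bevy.texture` request that depends on `webrtc.frame_decoded` comes back `Blocked` until the caller records that event and asks again; the function never notifies anyone.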
* test(sensory): add Position 2 alpha-contract WebRTC sensory smoke

Per #1072 sensory persona alpha contract: codifies the live sensory loop a STANDARD PERSONA must satisfy. Resolves multimodal model via cognition/resolve-model (Position 1 dependency), spawns LiveKitAgent, publishes test audio question + known image as video frame, asserts persona's TTS response + transcription mentions image content.

Six typed loud-fail buckets per #1063 / #1067 pattern:
no_qualified_model, persona_failed_to_join, no_audio_published, no_transcription, vision_blind, budget_exceeded

Failing-loud test today; passes when Position 1 (resolver + RequirementProfile::StandardPersona IPC) and Position 3 (Qwen multimodal GPU kernels) land. Bar is the test, not the impl. No silent CPU fallback, no degraded text-only pass, no retry on failure (per #1070 / #1072 standing rules).

* test(persona): multi-persona response timing regression smoke

Codifies the fairness bar Mac+Windows smoke surfaced post #1057-1060: storm IS fixed (CPU stays flat) BUT first-claim-wins coordination is too sticky (only 1 of N personas replies). This test makes that failure mode explicit so the eventual fix has an executable green-vs-red signal.

Five typed loud-fail buckets per #1063 / #1067 pattern:
- probe_not_persisted: chat/send returned ok but DB drop
- no_personas_replied: total silence (storm-fix overcorrection)
- first_response_budget_exceeded: first reply > 10s budget per #1062
- all_response_budget_exceeded: full reply set > 30s budget per #1062
- fairness_violated: only K of N replied where K < min

Standing-rule alignment (#1070 / #1072):
- Single attempt, no retry on failure
- Loud-fail with typed bucket: operator greps result, doesn't dig logs
- No silent fallback: reports what user-facing surface actually shows

Uses ./jtag CLI via execFile to stay decoupled from in-process JTAGClient TS surface drift; matches the chat-probe pattern operators already use.

---------

Co-authored-by: Test <test@test.com>
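As an illustration of the five-bucket classification in the timing smoke, the sketch below maps one observation to either a pass or exactly one typed failure. It is not the test's actual code (the commit describes a harness driving the ./jtag CLI); the `Observation` shape, the `classify` name, the check order, and the one-reply-per-persona assumption are all hypothetical. Only the bucket names and the 10s / 30s budgets come from the commit message.

```rust
use std::time::Duration;

/// The five typed buckets named in the commit message.
#[derive(Debug, PartialEq)]
pub enum TimingFailure {
    ProbeNotPersisted,
    NoPersonasReplied,
    FirstResponseBudgetExceeded,
    AllResponseBudgetExceeded,
    FairnessViolated { replied: usize, min_required: usize },
}

/// Hypothetical observation record for a single probe.
pub struct Observation {
    pub probe_persisted: bool,
    /// Delay from probe to each reply (assumed: one entry per replying persona).
    pub reply_delays: Vec<Duration>,
}

/// Single attempt, no retry: returns Ok(()) or exactly one bucket.
/// The order of the checks is an assumption, not taken from the commit.
pub fn classify(obs: &Observation, min_repliers: usize) -> Result<(), TimingFailure> {
    let first_budget = Duration::from_secs(10);
    let all_budget = Duration::from_secs(30);

    if !obs.probe_persisted {
        return Err(TimingFailure::ProbeNotPersisted);
    }
    let (Some(first), Some(last)) =
        (obs.reply_delays.iter().min(), obs.reply_delays.iter().max())
    else {
        return Err(TimingFailure::NoPersonasReplied);
    };
    if *first > first_budget {
        return Err(TimingFailure::FirstResponseBudgetExceeded);
    }
    if *last > all_budget {
        return Err(TimingFailure::AllResponseBudgetExceeded);
    }
    let replied = obs.reply_delays.len();
    if replied < min_repliers {
        return Err(TimingFailure::FairnessViolated { replied, min_required: min_repliers });
    }
    Ok(())
}
```

Returning a single enum variant keeps the "operator greps result" property: the sticky first-claim-wins case shows up as `FairnessViolated { replied: 1, .. }` rather than as a generic timeout.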