Fix persona response storm backpressure #1057

Merged: joelteply merged 1 commit into canary from fix/runtime-backpressure-cognition-storm on May 7, 2026
@joelteply (Contributor)

Summary

  • enforce Runtime ModuleConfig.max_concurrency with per-module semaphores
  • serialize cognition, ai_provider, and embedding modules to prevent native threadpool fanout during persona bursts
  • debounce normal same-room PersonaInbox chat wakeups so Rust channel consolidation gets a conversation chunk instead of one wakeup per message
  • hard-gate undirected non-human echo storms while preserving human messages and direct mentions
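The debounce and gate bullets above can be illustrated with a minimal sketch. It is not the code from this PR: `ChatMessage`, `RoomWakeDebouncer`, `admitsDuringStorm`, and the 500 ms quiet window are hypothetical names and values chosen for illustration, and time is passed in explicitly so the behavior is easy to follow without real timers.

```typescript
// Hypothetical message shape; the real PersonaInbox types are not shown in this PR.
interface ChatMessage {
  roomId: string;
  senderIsHuman: boolean;
  mentionsPersona: boolean;
  text: string;
}

// Collects same-room messages and releases them as one batch once the room
// has been quiet for `quietMs`. Each new message pushes the deadline back.
class RoomWakeDebouncer {
  private readonly pending = new Map<string, { batch: ChatMessage[]; dueAt: number }>();

  constructor(private readonly quietMs: number) {}

  push(msg: ChatMessage, now: number): void {
    const entry = this.pending.get(msg.roomId) ?? { batch: [], dueAt: 0 };
    entry.batch.push(msg);
    entry.dueAt = now + this.quietMs;
    this.pending.set(msg.roomId, entry);
  }

  // Returns one consolidated batch per room whose quiet window has elapsed.
  drainDue(now: number): Map<string, ChatMessage[]> {
    const ready = new Map<string, ChatMessage[]>();
    this.pending.forEach((entry, roomId) => {
      if (now >= entry.dueAt) {
        ready.set(roomId, entry.batch);
        this.pending.delete(roomId);
      }
    });
    return ready;
  }
}

// Assumed gate rule while a storm is detected: only human messages and
// direct mentions get through; undirected non-human chatter is dropped.
function admitsDuringStorm(msg: ChatMessage): boolean {
  return msg.senderIsHuman || msg.mentionsPersona;
}

// Usage: three messages inside the window produce a single wakeup batch.
const debouncer = new RoomWakeDebouncer(500);
const mk = (text: string): ChatMessage => ({
  roomId: "general",
  senderIsHuman: false,
  mentionsPersona: false,
  text,
});
debouncer.push(mk("a"), 0);
debouncer.push(mk("b"), 200);
debouncer.push(mk("c"), 400);
const tooEarly = debouncer.drainDue(800); // deadline is 400 + 500 = 900
const ready = debouncer.drainDue(900);
```

Under these assumptions the consumer sees one wakeup carrying the whole conversation chunk rather than one wakeup per message, which is the effect the third bullet describes. How the actual change detects a storm, and which messages count as "normal", is not stated here.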

Validation

  • npm stop cleared the runaway local Continuum process before patch validation
  • npx vitest run system/user/server/tests/validation/PersonaInboxDebounce.test.ts --reporter=verbose
  • ./node_modules/.bin/tsc --noEmit --project . --pretty false
  • cargo check --manifest-path src/workers/Cargo.toml -p continuum-core --features metal,accelerate, run from a clean worktree with vendor/llama.cpp initialized

Notes

  • Pre-existing Rust warnings remain; no new Rust hard errors.
  • The first temp worktree lacked initialized vendor/llama.cpp, so initial cargo/pre-push attempts failed on missing CMakeLists.txt. Clean worktree validation passed after submodule init.

@joelteply joelteply merged commit 76e0439 into canary May 7, 2026
3 checks passed
@joelteply joelteply deleted the fix/runtime-backpressure-cognition-storm branch May 7, 2026 19:02
joelteply pushed a commit that referenced this pull request May 11, 2026
Codifies the fairness bar Mac+Windows smoke surfaced post #1057-1060:
storm IS fixed (CPU stays flat) BUT first-claim-wins coordination is too
sticky (only 1 of N personas replies). This test makes that failure mode
explicit so the eventual fix has an executable green-vs-red signal.

Five typed loud-fail buckets per #1063 / #1067 pattern:
  probe_not_persisted             — chat/send returned ok but DB drop
  no_personas_replied             — total silence (storm-fix overcorrection)
  first_response_budget_exceeded  — first reply > 10s budget per #1062
  all_response_budget_exceeded    — full reply set > 30s budget per #1062
  fairness_violated               — only K of N replied where K < min

Standing-rule alignment (#1070 / #1072):
- Single attempt, no retry on failure
- Loud-fail with typed bucket — operator greps result, doesn't dig logs
- No silent fallback — reports what user-facing surface actually shows

Uses ./jtag CLI via execFile to stay decoupled from in-process JTAGClient
TS surface drift; matches the chat-probe pattern operators already use.
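
The five buckets in the commit message above could map onto a single classification step like the following. This is a hedged sketch, not the test from the commit: `ProbeObservation`, `classifyProbe`, the `minRepliers` field, and the order of the checks are assumptions. Only the bucket names and the 10 s and 30 s budgets come from the message.

```typescript
// Bucket names taken from the commit message.
type FailureBucket =
  | "probe_not_persisted"
  | "no_personas_replied"
  | "first_response_budget_exceeded"
  | "all_response_budget_exceeded"
  | "fairness_violated";

// Hypothetical observation record gathered by a single probe attempt.
interface ProbeObservation {
  probePersisted: boolean;      // did the probe message reach the DB?
  replyLatenciesMs: number[];   // one entry per persona that replied
  minRepliers: number;          // assumed fairness floor (K must be >= this)
}

const FIRST_RESPONSE_BUDGET_MS = 10000; // 10s budget from the commit message
const ALL_RESPONSE_BUDGET_MS = 30000;   // 30s budget from the commit message

// Returns the first matching failure bucket, or null when the probe passes.
// The check order here is an assumption, not taken from the commit.
function classifyProbe(obs: ProbeObservation): FailureBucket | null {
  if (!obs.probePersisted) return "probe_not_persisted";
  if (obs.replyLatenciesMs.length === 0) return "no_personas_replied";

  const first = Math.min(...obs.replyLatenciesMs);
  const last = Math.max(...obs.replyLatenciesMs);
  if (first > FIRST_RESPONSE_BUDGET_MS) return "first_response_budget_exceeded";
  if (last > ALL_RESPONSE_BUDGET_MS) return "all_response_budget_exceeded";
  if (obs.replyLatenciesMs.length < obs.minRepliers) return "fairness_violated";
  return null;
}

// Usage: the sticky first-claim-wins case, where one of three personas replies.
const sticky = classifyProbe({
  probePersisted: true,
  replyLatenciesMs: [2000],
  minRepliers: 3,
});
```

Returning one typed bucket per run fits the standing rules quoted above: a single attempt, no retry, and a result an operator can grep for instead of digging through logs.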
joelteply added a commit that referenced this pull request May 11, 2026
* test(sensory): add Position 2 alpha-contract WebRTC sensory smoke

Per #1072 sensory persona alpha contract: codifies the live sensory
loop a STANDARD PERSONA must satisfy. Resolves multimodal model via
cognition/resolve-model (Position 1 dependency), spawns LiveKitAgent,
publishes test audio question + known image as video frame, asserts
persona's TTS response + transcription mentions image content.

Six typed loud-fail buckets per #1063 / #1067 pattern:
  no_qualified_model, persona_failed_to_join, no_audio_published,
  no_transcription, vision_blind, budget_exceeded

Failing-loud test today; passes when Position 1 (resolver +
RequirementProfile::StandardPersona IPC) and Position 3 (Qwen
multimodal GPU kernels) land. Bar is the test, not the impl.

No silent CPU fallback, no degraded text-only pass, no retry on
failure (per #1070 / #1072 standing rules).

* test(persona): multi-persona response timing regression smoke

Codifies the fairness bar Mac+Windows smoke surfaced post #1057-1060:
storm IS fixed (CPU stays flat) BUT first-claim-wins coordination is too
sticky (only 1 of N personas replies). This test makes that failure mode
explicit so the eventual fix has an executable green-vs-red signal.

Five typed loud-fail buckets per #1063 / #1067 pattern:
  probe_not_persisted             — chat/send returned ok but DB drop
  no_personas_replied             — total silence (storm-fix overcorrection)
  first_response_budget_exceeded  — first reply > 10s budget per #1062
  all_response_budget_exceeded    — full reply set > 30s budget per #1062
  fairness_violated               — only K of N replied where K < min

Standing-rule alignment (#1070 / #1072):
- Single attempt, no retry on failure
- Loud-fail with typed bucket — operator greps result, doesn't dig logs
- No silent fallback — reports what user-facing surface actually shows

Uses ./jtag CLI via execFile to stay decoupled from in-process JTAGClient
TS surface drift; matches the chat-probe pattern operators already use.

---------

Co-authored-by: Test <test@test.com>