fix: prevent splitSegments infinite recursion and add global background work throttling #355

Merged
BYK merged 3 commits into main from fix/distillation-recursion-and-background-throttling
May 16, 2026

BYK commented May 16, 2026

Summary

Fixes two distinct background errors reported by a user who imported old chats from 5 projects simultaneously:

  1. Recursive crash during idle distillation (RangeError: Maximum call stack size exceeded) in splitSegments() when a single temporal message exceeds maxSegmentTokens (16384 tokens / ~49KB content). Added a base case to prevent infinite recursion on indivisible oversized messages.

  2. Upstream 429 Too Many Requests — background LLM work (distillation, curation) had no global concurrency limit, allowing N idle sessions to fire N×4 simultaneous requests. Added a global p-limit(2) background limiter with a circuit breaker that trips on 429 and pauses all background work for the Retry-After duration.

Changes

Bug 1: splitSegments() infinite recursion

  • packages/core/src/distillation.ts: Added if (messages.length <= 1) return [messages] base case after the totalTokens <= maxTokens guard (1 line)
  • packages/core/test/distillation.test.ts: Added 2 test cases — single oversized message and oversized message among normal messages
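
The effect of the base case can be illustrated with a minimal sketch. The Message shape, the totalTokens helper, and this version of findSplitIndex are assumptions made for illustration and are not the code in distillation.ts; only the two guards at the top of splitSegments mirror what the PR describes.

```typescript
// Hypothetical message shape; the real type in distillation.ts may differ.
interface Message {
  id: string;
  tokens: number;
}

const totalTokens = (messages: Message[]): number =>
  messages.reduce((sum, m) => sum + m.tokens, 0);

// Assumed splitter: cut near the token midpoint. For a 1-element array it
// returns messages.length, which is the situation described in the PR.
function findSplitIndex(messages: Message[]): number {
  const half = totalTokens(messages) / 2;
  let running = 0;
  for (let i = 0; i < messages.length; i++) {
    running += messages[i].tokens;
    if (running >= half) {
      // Keep both halves non-empty whenever there are 2+ messages.
      return Math.max(1, Math.min(i + 1, messages.length - 1));
    }
  }
  return messages.length;
}

function splitSegments(messages: Message[], maxTokens: number): Message[][] {
  if (totalTokens(messages) <= maxTokens) return [messages];
  // Base case: a single oversized message cannot be split any further.
  // Without it, left = same message and right = empty, so recursion never ends.
  if (messages.length <= 1) return [messages];

  const index = findSplitIndex(messages);
  return [
    ...splitSegments(messages.slice(0, index), maxTokens),
    ...splitSegments(messages.slice(index), maxTokens),
  ];
}
```

In this sketch an oversized message ends up alone in its own segment, and the caller is left to decide how to handle a segment that still exceeds the limit.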

Bug 2: Background 429 rate limiting

  • packages/gateway/src/background-limiter.ts (new): Global p-limit(2) concurrency limiter + circuit breaker module
  • packages/gateway/src/llm-adapter.ts: Trips circuit breaker on non-urgent 429 responses
  • packages/gateway/src/idle.ts: Wraps doIdleWork() through runBackground() — at most 2 sessions do idle work simultaneously
  • packages/gateway/src/pipeline.ts: Wraps incremental distillation and in-flight curation through runBackground(). Urgent distillation (client waiting, gradient in overflow) is intentionally not throttled. Added cleanup in resetPipelineState().
  • packages/gateway/test/background-limiter.test.ts (new): 7 tests for concurrency limiting and circuit breaker behavior
  • packages/gateway/package.json: Added p-limit dependency
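
The concurrency half of the limiter can be sketched as follows. To stay dependency-free, this sketch replaces p-limit with a small hand-rolled limiter (createLimiter, activeCount, and pendingCount here are illustrative helpers, not the p-limit package), and the runBackground signature is an assumption rather than the one in background-limiter.ts.

```typescript
type Task<T> = () => Promise<T>;

// Hand-rolled stand-in for p-limit(n): at most `max` tasks run at once,
// the rest wait in FIFO order.
function createLimiter(max: number) {
  let active = 0;
  const queue: Array<() => void> = [];

  const startNext = (): void => {
    if (active >= max) return;
    const start = queue.shift();
    if (start) start();
  };

  const limit = <T>(task: Task<T>): Promise<T> =>
    new Promise<T>((resolve, reject) => {
      queue.push(() => {
        active++;
        Promise.resolve()
          .then(task)
          .then(resolve, reject) // errors propagate to the caller
          .then(() => {
            active--;
            startNext(); // hand the free slot to the next queued task
          });
      });
      startNext();
    });

  return Object.assign(limit, {
    activeCount: () => active,
    pendingCount: () => queue.length,
  });
}

// One module-level limiter shared by every session in the process.
const backgroundLimit = createLimiter(2);

function runBackground<T>(task: Task<T>): Promise<T> {
  return backgroundLimit(task);
}
```

With this shape, five sessions submitting idle work at the same moment would leave two tasks active and three queued, instead of all five hitting the upstream at once.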

Design decisions

  • Concurrency = 2: Allows one distillation + one curation to run concurrently. Conservative enough to prevent thundering-herd on 5+ sessions while maintaining reasonable throughput.
  • Circuit breaker: When any background call gets a 429, all other background work pauses for the Retry-After duration (default 60s). Only extends, never shortens an active pause.
  • Urgent distillation excluded: The gradient is in overflow and the client is effectively waiting — these must not be throttled.
  • Cross-process: The p-limit and circuit breaker are per-process, so across processes they dampen load rather than cap it globally. Each of the 5 lore run processes independently backs off on 429.
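
The "only extends, never shortens" rule for the circuit breaker can be sketched like this. The name tripCircuitBreaker, the retryAfterMs helper, and the injectable `now` parameter are assumptions for illustration; isBackgroundPaused and the 60s default come from the PR text. The sketch handles only the delta-seconds form of Retry-After, not the HTTP-date form.

```typescript
const DEFAULT_PAUSE_MS = 60_000; // default pause when Retry-After is missing or unusable

let pausedUntil = 0; // epoch ms; 0 means the breaker is closed

// Parse Retry-After given as delta-seconds, falling back to the default.
function retryAfterMs(header: string | null): number {
  if (header === null || header.trim() === "") return DEFAULT_PAUSE_MS;
  const seconds = Number(header);
  return Number.isFinite(seconds) && seconds >= 0
    ? seconds * 1000
    : DEFAULT_PAUSE_MS;
}

// Hypothetical name: called when a non-urgent request receives a 429.
function tripCircuitBreaker(header: string | null, now: number = Date.now()): void {
  // Math.max means a later, shorter Retry-After cannot shorten an active pause.
  pausedUntil = Math.max(pausedUntil, now + retryAfterMs(header));
}

function isBackgroundPaused(now: number = Date.now()): boolean {
  return now < pausedUntil;
}
```

Passing `now` explicitly keeps the sketch easy to test without fake timers; production code would normally rely on the default.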

BYK added 3 commits May 16, 2026 00:26
…nd work throttling

Two fixes for issues triggered by importing 5 projects simultaneously:

1. splitSegments() infinite recursion (RangeError: Maximum call stack size exceeded)
   — a single message exceeding maxSegmentTokens (16384) caused unbounded recursion
   because findSplitIndex() returned messages.length for a 1-element array, producing
   left=same message, right=empty. Added base case: if (messages.length <= 1) return.

2. Background 429 Too Many Requests — no global concurrency limit on background LLM
   calls meant N idle sessions could fire N×4 simultaneous requests. Added a global
   p-limit(2) background limiter with a circuit breaker that trips on 429, pausing all
   background work for the Retry-After duration. Wired into idle scheduler, pipeline
   incremental distillation, and in-flight curation. Urgent distillation (client waiting)
   is intentionally excluded.
…time, add tests, fix docs

- Re-check isBackgroundPaused() when task reaches front of p-limit queue,
  not just at submission time (prevents queued tasks from executing after
  a 429 trips the breaker while they wait)
- Fix module doc comment: auto-import is NOT wrapped by runBackground
  (it runs sequentially per-process; circuit breaker still protects via
  llm-adapter.ts)
- Add clearQueue() behavior comment in resetBackgroundLimiter()
- Add test for error propagation through runBackground
- Add test for circuit breaker tripping while tasks are queued
- Add .d.ts/.d.ts.map build artifacts to .gitignore
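
The double check described in the first bullet can be sketched as below. This is an illustration under stated assumptions: the Limit type stands in for a p-limit instance, tripBreaker is a hypothetical helper, and resolving skipped tasks to undefined is a guess at the behavior, not something the commit message states.

```typescript
type Task<T> = () => Promise<T>;
// Assumed limiter shape; the PR uses a p-limit instance in this role.
type Limit = <R>(task: Task<R>) => Promise<R>;

let pausedUntil = 0; // epoch ms until which background work is paused

function isBackgroundPaused(now: number = Date.now()): boolean {
  return now < pausedUntil;
}

// Hypothetical helper: called when a background request gets a 429.
function tripBreaker(pauseMs: number, now: number = Date.now()): void {
  pausedUntil = Math.max(pausedUntil, now + pauseMs);
}

// Skipped tasks resolve to undefined (an assumption about the real behavior).
function runBackground<T>(limit: Limit, task: Task<T>): Promise<T | undefined> {
  // Check 1: at submission time.
  if (isBackgroundPaused()) return Promise.resolve(undefined);
  return limit<T | undefined>(async () => {
    // Check 2: when the limiter actually starts the task, which may be
    // long after submission if the queue was full.
    if (isBackgroundPaused()) return undefined;
    return task();
  });
}
```

Without the second check, a task queued just before a 429 would still fire once a slot opened, even though the breaker had tripped in the meantime.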
…tion

The resetWorkerModelState test used absolute callCount assertions that
broke when another test file's async resetPipelineState() cleared the
model cache mid-test, causing unexpected re-fetches against the mock.
Fix: reset cache + counter before assertions, use relative deltas.
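
The relative-delta approach can be sketched as follows. The mock, the cache, and the fetchesDuring helper are hypothetical stand-ins, and resetWorkerModelState is modeled here as a simple cache clear, which may not match the real function.

```typescript
// Hypothetical mock and cache standing in for the worker model state.
let callCount = 0;
const modelCache = new Map<string, string>();

function getModel(id: string): string {
  const cached = modelCache.get(id);
  if (cached !== undefined) return cached;
  callCount++; // counts fetches against the mock
  const model = `model:${id}`;
  modelCache.set(id, model);
  return model;
}

function resetWorkerModelState(): void {
  modelCache.clear();
}

// Measure how many fetches an action causes, independent of whatever
// other tests did to the shared counter beforehand.
function fetchesDuring(action: () => void): number {
  const before = callCount;
  action();
  return callCount - before;
}
```

An assertion such as "this action caused exactly one fetch" then holds no matter what value callCount had when the test started.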
BYK merged commit 92b1fbd into main May 16, 2026
7 checks passed
BYK deleted the fix/distillation-recursion-and-background-throttling branch May 16, 2026 11:14
craft-deployer (bot) mentioned this pull request May 16, 2026
6 tasks