fix: consolidation retry storm, idle curation frequency, and session memory leak by BYK · Pull Request #473 · BYK/loreai

BYK · 2026-05-27T10:38:40Z

Summary

Fixes three user-reported issues: excessive Sonnet API usage, 5GB RAM consumption, and knowledge consolidation stuck in an infinite retry loop. All three are interconnected — the consolidation loop was a major driver of the Sonnet overhead.

Changes

Consolidation retry storm (Issue 3 + Issue 1)

Consolidation cooldown (idle.ts): tracks per-project {attemptedAt, entryCount}. When consolidation runs but produces no changes (the LLM correctly concludes all entries are unique), the project enters a 1-hour cooldown. The cooldown clears when the entry count changes (curation adds/removes entries). Previously, consolidation retried every 30-60s indefinitely: 15-30 wasted Sonnet calls per 30-minute idle period.
Stronger consolidation prompt (prompt.ts): added a "FORCED EVICTION" step — when merging/trimming isn't enough, the LLM MUST delete least-valuable entries to reach the target. The user prompt now states "must remove at least N entries."
Curation creation gate (curator.ts): when entry count is at or above maxEntries, curation runs with skipCreate: true, preventing the ratchet effect where entries grow monotonically.
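The cooldown rule above can be sketched roughly as follows. This is a minimal illustration, not the code in idle.ts: only the `attemptedAt` and `entryCount` fields come from the description, while `ConsolidationCooldown`, `recordNoop`, `isCoolingDown`, and the explicit `now` parameter are hypothetical names chosen for the example.

```typescript
// Hypothetical sketch of a per-project consolidation cooldown.
// Only {attemptedAt, entryCount} are taken from the PR description.
interface ConsolidationAttempt {
  attemptedAt: number; // epoch ms of the last consolidation that changed nothing
  entryCount: number;  // entry count observed at that attempt
}

const CONSOLIDATION_COOLDOWN_MS = 60 * 60 * 1000; // 1 hour

class ConsolidationCooldown {
  private readonly attempts = new Map<string, ConsolidationAttempt>();

  /** Record a consolidation run that produced no changes. */
  recordNoop(projectId: string, entryCount: number, now: number): void {
    this.attempts.set(projectId, { attemptedAt: now, entryCount });
  }

  /** True when consolidation should be skipped for this project. */
  isCoolingDown(projectId: string, entryCount: number, now: number): boolean {
    const last = this.attempts.get(projectId);
    if (!last) return false;

    // Curation added or removed entries: the old verdict is stale, so retry.
    if (last.entryCount !== entryCount) {
      this.attempts.delete(projectId);
      return false;
    }

    // The cooldown window has elapsed: allow another attempt.
    if (now - last.attemptedAt >= CONSOLIDATION_COOLDOWN_MS) {
      this.attempts.delete(projectId);
      return false;
    }

    return true;
  }
}
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the sketch deterministic and easy to test; the actual implementation may do this differently.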

Excessive Sonnet API usage (Issue 1)

Cost-aware idle curation (idle.ts): the idle path was using raw afterTurns=3 while the inline path uses afterTurns * curationMultiplier (=6 for Sonnet, =9 for Opus). As a result, idle curation was firing 2x more often than intended for Sonnet-class models. The idle path now applies the same multiplier.
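As a rough illustration of the threshold arithmetic, the sketch below computes the effective turn count before curation fires. The multiplier values 2 and 3 are inferred from the numbers quoted above (3 × 2 = 6, 3 × 3 = 9); the `ModelClass` type, the `"other"` fallback of 1, and the function names are assumptions made for the example, not the project's API.

```typescript
// Hypothetical sketch: both the inline and idle paths share one threshold.
type ModelClass = "sonnet" | "opus" | "other";

// Multipliers inferred from the PR text; the "other" fallback is an assumption.
function curationMultiplier(model: ModelClass): number {
  switch (model) {
    case "sonnet":
      return 2;
    case "opus":
      return 3;
    default:
      return 1;
  }
}

/** Effective number of turns that must accumulate before curation runs. */
function curationThreshold(afterTurns: number, model: ModelClass): number {
  return afterTurns * curationMultiplier(model);
}

/** Idle-path check using the cost-aware threshold instead of raw afterTurns. */
function shouldCurateOnIdle(
  turnsSinceCuration: number,
  afterTurns: number,
  model: ModelClass,
): boolean {
  return turnsSinceCuration >= curationThreshold(afterTurns, model);
}
```

With `afterTurns = 3`, a Sonnet session that has accumulated 3 turns no longer triggers idle curation; it has to reach 6.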

Session memory leak (Issue 2)

Session eviction (idle.ts, pipeline.ts, gradient.ts, index.ts): sessions idle > 1 hour are evicted from all in-memory Maps. Persists final cost/gradient state to SQLite before cleanup. Cleans up: gradient state, curation tracker, cost tracking, auth, billing prefix, warmup auth, and pipeline satellite Maps (headerSessionIndex, ltmSessionCache, ltmPinnedText, stableLtmCache, cwdWarned). New evictSession() exported from core for clean single-session gradient cleanup.
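The eviction pass described above might look something like the following sketch. It assumes every per-session store is keyed by session ID and exposes a `delete` method (true for both `Map` and `Set`); `SessionStores`, `evictIdleSessions`, and the `persist` callback are hypothetical names, and the real `evictSession()` signature is not shown in this description.

```typescript
// Hypothetical sketch of idle-session eviction across several in-memory stores.
const SESSION_IDLE_LIMIT_MS = 60 * 60 * 1000; // 1 hour

// Anything keyed by session ID that can drop a key (Map and Set both qualify).
interface SessionKeyed {
  delete(sessionId: string): boolean;
}

interface SessionStores {
  lastActive: Map<string, number>; // session ID -> epoch ms of last activity
  satellites: SessionKeyed[];      // every other per-session store to clean up
}

/** Evicts sessions idle for more than one hour and returns their IDs. */
function evictIdleSessions(
  stores: SessionStores,
  now: number,
  persist: (sessionId: string) => void,
): string[] {
  const evicted: string[] = [];
  stores.lastActive.forEach((seenAt, sessionId) => {
    if (now - seenAt <= SESSION_IDLE_LIMIT_MS) return;
    // Persist final state first so nothing is lost when the Maps are cleared.
    persist(sessionId);
    for (const store of stores.satellites) store.delete(sessionId);
    stores.lastActive.delete(sessionId);
    evicted.push(sessionId);
  });
  return evicted;
}
```

Keeping the list of satellite stores in one place is the point of the sketch: a leak of this kind typically comes from one Map being forgotten during cleanup.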

…memory leak

- Add 1-hour cooldown for failed consolidation attempts (stops wasting Sonnet calls when LLM correctly concludes all entries are unique)
- Apply cost-aware curation multiplier in idle path (was using raw afterTurns=3 instead of afterTurns*2=6 for Sonnet; 2x too frequent)
- Strengthen consolidation prompt with forced-eviction fallback (LLM must now reduce to target count, not just try)
- Gate curation entry creation at maxEntries limit (prevents ratchet effect where entries grow monotonically)
- Add session eviction after 1 hour idle (frees gradient state, recall store, LTM caches, cost tracking, auth; all persisted to SQLite)
- Export evictSession() from core for clean single-session cleanup

## Summary

Follow-up to #473. Onur's logs revealed **1143 knowledge entries**, far beyond what the single-pass consolidation can handle. The previous consolidation sent all entries in one prompt, but with 1143 entries (~343K tokens of input) this overflows the context window, and the 4096 output token budget can only express ~80-100 delete ops (vs the ~1118 needed).

## Context from Onur's logs

```
entry count 1143 exceeds maxEntries 25 — running consolidation
entry count 1143 exceeds maxEntries 25 — running consolidation
entry count 1143 exceeds maxEntries 25 — running consolidation
...repeating every ~60s...
cost-tracker: worker overhead=$1140.8017 (distillation-only=$2.5246)
```

The retry storm (#473) burned $1,138 in consolidation calls that could never succeed due to the token budget constraint.

## Changes

Adds batched consolidation mode in `curator.ts`:

- When entries ≤ 50: unchanged, sends all entries in a single prompt
- When entries > 50 (batched mode): takes the **lowest-confidence** entries (tail of the confidence-sorted list from `forProject()`) as candidates for deletion. Each pass targets removing ~25 entries (half the batch).
- The idle scheduler's cooldown (from #473) clears when entry count changes, automatically triggering the next batch on the following idle tick.
- Converges to `maxEntries` over multiple passes: 1143 → 1118 → 1093 → ... → 25

For Onur's case: ~45 passes × 1 Sonnet call each ≈ $2-3 total to clean up 1143 entries, spread across idle periods, versus the previous behavior of infinite retries that never made progress.
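The batching rule and the pass count quoted in the #474 description can be sketched as below. This is an illustration under assumptions: the batch size of 50 is inferred from "~25 entries (half the batch)", the input is assumed to be sorted by confidence in descending order, and `KnowledgeEntry`, `selectConsolidationCandidates`, and `estimatePasses` are hypothetical names rather than functions from `curator.ts`.

```typescript
// Hypothetical sketch of candidate selection for batched consolidation.
interface KnowledgeEntry {
  id: string;
  confidence: number;
}

const SINGLE_PASS_LIMIT = 50; // from the PR: up to 50 entries go in one prompt
const BATCH_SIZE = 50;        // assumption: "half the batch" is ~25 entries

/**
 * Picks the entries to show the LLM in one consolidation pass.
 * Expects `entries` sorted by confidence, highest first.
 */
function selectConsolidationCandidates(entries: KnowledgeEntry[]): KnowledgeEntry[] {
  if (entries.length <= SINGLE_PASS_LIMIT) return entries;
  // Batched mode: only the lowest-confidence tail is offered for deletion.
  return entries.slice(-BATCH_SIZE);
}

/** Rough number of passes needed if each pass removes `removedPerPass` entries. */
function estimatePasses(
  entryCount: number,
  maxEntries: number,
  removedPerPass: number,
): number {
  if (entryCount <= maxEntries) return 0;
  return Math.ceil((entryCount - maxEntries) / removedPerPass);
}

// For the case in the logs: (1143 - 25) / 25 = 44.72, so 45 passes.
const passesForLoggedCase = estimatePasses(1143, 25, 25);
```

The estimate is only approximate, since the LLM may remove more or fewer than 25 entries in a given pass.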

BYK self-assigned this May 27, 2026

BYK merged commit 48cc25b into main May 27, 2026
7 checks passed

BYK deleted the fix/consolidation-loop-memory-leak branch May 27, 2026 10:45

BYK mentioned this pull request May 27, 2026

fix: batch consolidation for large entry counts #474

Merged

BYK merged 1 commit into main from fix/consolidation-loop-memory-leak
