feat(sleep-cycle): memory budgeting, per-agent warm_tier hard cap (Sprint E) #103
Merged
…e eviction

Adds memory budgeting: an optional WARM_TIER_MAX_PER_AGENT env var that caps warm_tier size per agent. When exceeded, the sleep cycle archives the lowest-importance rows (not the oldest) to cold_tier until the count matches the cap.

Opt-in: default (unset or 0) preserves existing behavior. Existing deployments are unaffected.

Design:
- Capacity eviction runs as Phase 2b in the sleep cycle, immediately after the threshold-based eviction in Phase 2. The two passes have distinct invariants: threshold eviction operates on absolute importance (evict below X), capacity eviction operates on relative rank (evict bottom N). Keeping them separate makes log events and result fields independently meaningful.
- Value = importance. Sleep cycle Phase 1 already computes importance as a weighted blend (recency, frequency, centrality, reflection, stability). Reusing it avoids duplicate scoring logic.
- Graduated memories are NOT exempt from capacity eviction. Cap is a hard limit; graduation affects retrieval scoring, not budgeting. Documented inline.
- Eviction preserves namespace on the archived row (cold_tier has a namespace column as of migration-v3.1).
- New SleepCycleResult.capacity_evicted field is optional: it is set only when a cap is configured, keeping existing callers unaffected.

Config:
- WARM_TIER_MAX_PER_AGENT env var (server.ts).
- SleepCycleConfig.warmTierMaxPerAgent (types.ts).
- .env.example documents the opt-in with lowest-importance-first note.

No schema migration. The existing warm_tier_importance_idx (agent_id, importance DESC) already supports the "bottom N by importance" query efficiently.

Tests (tests/memory-budgeting.test.ts, 307 lines, 7 cases):
1. Cap disabled (default): no evictions.
2. Below cap: no evictions.
3. Exactly at cap: no evictions.
4. Over cap: evicts exactly (count - cap) rows, bottom-importance first, with namespace preserved in cold_tier.
5. Cross-namespace cap: cap is per-agent across all namespaces.
6. Threshold + capacity interaction: both passes run, counts sum.
7. Graduated row gets capacity-evicted when it is lowest-importance.
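The two eviction passes described in the commit message can be sketched in memory as below. This is an illustration only, not the code in this PR: the `WarmRow` and `EvictionResult` shapes, the `runEvictionPasses` name, and the `threshold` parameter are all hypothetical, and the real sleep cycle works against the warm_tier and cold_tier tables rather than arrays.

```typescript
// Hypothetical in-memory model of a warm_tier row; field names are assumptions.
interface WarmRow {
  id: string;
  namespace: string;
  importance: number; // assumed to be already computed by Phase 1
  graduated: boolean;
}

// Hypothetical result shape for this sketch.
interface EvictionResult {
  evicted: WarmRow[];          // Phase 2: importance below the threshold
  capacityEvicted?: WarmRow[]; // Phase 2b: only set when a cap is configured
  remaining: WarmRow[];
}

function runEvictionPasses(
  rows: WarmRow[],
  threshold: number,
  cap?: number,
): EvictionResult {
  // Phase 2: threshold eviction works on absolute importance.
  const evicted = rows.filter((r) => r.importance < threshold);
  let remaining = rows.filter((r) => r.importance >= threshold);

  // Cap unset or 0 means no cap: skip Phase 2b and leave the field unset.
  if (!cap || cap <= 0) {
    return { evicted, remaining };
  }

  // Phase 2b: capacity eviction works on relative rank (bottom N).
  // Graduated rows are ranked like any other row; they are not exempt.
  const overflow = Math.max(0, remaining.length - cap);
  const ranked = [...remaining].sort((a, b) => a.importance - b.importance);
  const capacityEvicted = ranked.slice(0, overflow);

  const dropped = new Set(capacityEvicted.map((r) => r.id));
  remaining = remaining.filter((r) => !dropped.has(r.id));

  return { evicted, capacityEvicted, remaining };
}
```

Because the cap is applied to all of an agent's rows at once, rows from different namespaces compete in the same ranking, which matches test case 5. Each returned row keeps its `namespace`, so an archiving step could copy it to cold_tier unchanged.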
This was referenced Apr 18, 2026
Summary
Adds memory budgeting per the ROADMAP Phase 2 milestone: "Hard limits on warm-tier size per agent with intelligent eviction. When an agent reaches capacity, the lowest-value memories are archived, not the oldest."
Opt-in via the `WARM_TIER_MAX_PER_AGENT` env var. The default (unset or 0) means no cap, so existing deployments are unaffected.
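One way the opt-in could be read from the environment is sketched below. The `parseWarmTierCap` helper is hypothetical and not taken from server.ts. Treating non-numeric or negative values as "no cap" is an assumption of this sketch; the PR only specifies the behavior for unset and 0.

```typescript
// Hypothetical helper: turns the raw env value into an optional cap.
// Returns undefined when no cap should be applied.
function parseWarmTierCap(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined; // unset
  const parsed = Number.parseInt(raw, 10);
  // 0 disables the cap per the PR; NaN and negatives are treated the same
  // way here, which is an assumption of this sketch.
  if (!Number.isFinite(parsed) || parsed <= 0) return undefined;
  return parsed;
}
```

A caller would presumably pass `process.env.WARM_TIER_MAX_PER_AGENT` and store the result in `SleepCycleConfig.warmTierMaxPerAgent`.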
Design
No schema migration
The existing `warm_tier_importance_idx (agent_id, importance DESC)` already supports the bottom-N-by-importance query efficiently.
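For illustration, the bottom-N lookup might be expressed as a parameterized query like the one below. This is a sketch under assumptions: the `buildBottomNQuery` helper and the `id` column are hypothetical, the `$1`/`$2` placeholders assume a PostgreSQL-style driver, and the actual query in this PR may differ. The `agent_id` and `importance` columns come from the index definition above.

```typescript
// Hypothetical shape for a parameterized query (text plus positional values).
interface ParameterizedQuery {
  text: string;
  values: (string | number)[];
}

// Builds the "bottom N by importance" lookup for one agent, where
// N = currentCount - cap. Returns null when nothing needs evicting.
function buildBottomNQuery(
  agentId: string,
  currentCount: number,
  cap: number,
): ParameterizedQuery | null {
  const overflow = currentCount - cap;
  if (cap <= 0 || overflow <= 0) return null; // cap disabled, or at/below cap
  return {
    // Ascending order on (agent_id, importance) is the reverse of the
    // index's DESC order, which a database that supports backward index
    // scans can serve from the same index.
    text:
      "SELECT id, namespace FROM warm_tier " +
      "WHERE agent_id = $1 ORDER BY importance ASC LIMIT $2",
    values: [agentId, overflow],
  };
}
```

For example, an agent holding 12 rows under a cap of 10 yields a query with `LIMIT 2`.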
Changes
Tests (`tests/memory-budgeting.test.ts`, 307 lines, 7 cases)
All tests use explicit timestamps; none rely on sleeps.
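The explicit-timestamp approach can be sketched as follows. The fixture shape, helper names, and reference date are hypothetical and not taken from tests/memory-budgeting.test.ts. The point is only that row ages are derived from a fixed `now`, so the tests never have to wait for real time to pass.

```typescript
// Arbitrary fixed reference time for this sketch (an assumption).
const FIXED_NOW = new Date("2026-01-01T00:00:00Z");
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical fixture row carrying an explicit timestamp.
interface TimedFixture {
  id: string;
  lastAccessedAt: Date;
}

// Creates a row that was last accessed `ageDays` before the reference time.
function makeFixture(id: string, ageDays: number, now: Date = FIXED_NOW): TimedFixture {
  return { id, lastAccessedAt: new Date(now.getTime() - ageDays * MS_PER_DAY) };
}

// Age is computed against an injected `now`, never the wall clock.
function ageInDays(row: TimedFixture, now: Date = FIXED_NOW): number {
  return (now.getTime() - row.lastAccessedAt.getTime()) / MS_PER_DAY;
}
```

With this pattern a recency-sensitive score is reproducible across runs, because both the row timestamps and the reference time are fixed inputs.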
Test plan
Out of scope