fix: prevent layer-4 stickiness trap and refactor compression stages by BYK · Pull Request #300 · BYK/loreai

BYK · 2026-05-14T09:28:09Z

Summary

Fix critical stickiness bug (gradient.ts:1533): Sessions hitting emergency mode (layer 4) were trapped there indefinitely — stickiness guard pinned effectiveMinLayer = 4 forever since message count only grows after urgent distillation. This caused cache busts every turn for the rest of the session, the root cause of long-session cost blowups. Fix: exclude layer 4 from stickiness (layers 1-3 only, where dropping back would bust a warm cache).
Replace hardcoded layer blocks with data-driven stage table: Three separate if (effectiveMinLayer <= N) blocks for layers 1-3 collapsed into a COMPRESSION_STAGES table + loop. Same behavior, but escalation path is now visible, tunable, and extensible (new stage = one table row).
Add importance-aware distillation trimming: When stages limit distillation count, selectDistillations() scores by recency (70%) + content signals (30%: decisions, gotchas, architecture, meta-distilled) instead of blind slice(-N). Results re-sorted chronologically for Approach C prefix cache safety.

Cost analysis

Layer-4 stickiness trap was causing every turn after emergency to pay full cache-write cost (~$1.00/turn at Opus pricing with 160K context). With the fix, emergency is transient (1-2 turns), then falls back to layer 1 where cache is warm. Over a 100-turn Opus session, this saves $50-80+ for sessions that hit emergency.

Testing

Typecheck: all 4 packages pass
Tests: 84/84 gradient tests pass

Fix a critical bug where the sticky layer guard trapped sessions at emergency mode (layer 4) indefinitely after a single emergency trigger, causing cache busts every turn for the rest of the session — the root cause of long-session cost blowups. Changes: - Exclude layer 4 from stickiness guard (layers 1-3 only) since emergency already blows the cache and stickiness there is pure downside - Replace three hardcoded layer blocks (1-3) with a data-driven compression stage table and loop — same behavior, tunable, extensible - Add importance-aware distillation trimming (selectDistillations) that keeps decisions/gotchas/architecture entries longer under pressure, replacing blind slice(-N)

- Fix rawWindowCache reset skipped when effectiveMinLayer > 2 (regression) - Fix selectDistillations recency formula: use (length-1) divisor so oldest entry gets 0.0 and newest gets full 0.7 recency weight - Tighten importanceBonus regexes: remove overly broad 'fix'/'error'/'pattern' matches that fire on most distillations, keep distinctive signals only - Hoist CompressionStage type and COMPRESSION_STAGES to module level - Simplify urgent distillation logic (collapse redundant if/else if) - Clean up .lore.md: remove garbled curator entries, fix contradictory title

BYK added 2 commits May 14, 2026 09:27

BYK merged commit bcff2c9 into main May 14, 2026
7 checks passed

BYK deleted the fix/gradient-stickiness-stage-table branch May 14, 2026 09:39

craft-deployer Bot mentioned this pull request May 14, 2026

publish: BYK/loreai@0.19.0 #328

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent layer-4 stickiness trap and refactor compression stages#300

fix: prevent layer-4 stickiness trap and refactor compression stages#300
BYK merged 2 commits into
mainfrom
fix/gradient-stickiness-stage-table

BYK commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

BYK commented May 14, 2026

Summary

Cost analysis

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant