
fix: prevent layer-4 stickiness trap and refactor compression stages #300

Merged
BYK merged 2 commits into main from fix/gradient-stickiness-stage-table
May 14, 2026


@BYK BYK commented May 14, 2026

Summary

  • Fix critical stickiness bug (gradient.ts:1533): Sessions that hit emergency mode (layer 4) were trapped there indefinitely. The stickiness guard pinned effectiveMinLayer = 4 for the rest of the session, since the message count only grows after urgent distillation. This busted the cache on every subsequent turn and was the root cause of long-session cost blowups. Fix: exclude layer 4 from stickiness, so it applies to layers 1-3 only, where dropping back would bust a warm cache.

  • Replace hardcoded layer blocks with data-driven stage table: Three separate if (effectiveMinLayer <= N) blocks for layers 1-3 collapsed into a COMPRESSION_STAGES table + loop. Same behavior, but escalation path is now visible, tunable, and extensible (new stage = one table row).

  • Add importance-aware distillation trimming: When stages limit distillation count, selectDistillations() scores by recency (70%) + content signals (30%: decisions, gotchas, architecture, meta-distilled) instead of blind slice(-N). Results re-sorted chronologically for Approach C prefix cache safety.
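
The first two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in gradient.ts: only `effectiveMinLayer`, `CompressionStage` and `COMPRESSION_STAGES` are names from this PR, while `stickyMinLayer`, `pickStage`, the `maxDistillations` field and all table values are hypothetical.

```typescript
// Hypothetical sketch. Field names and numeric values are assumptions.

const EMERGENCY_LAYER = 4;

// Stickiness guard: keep the previous layer as a floor for layers 1-3 only.
// Layer 4 has already rewritten the prompt, so pinning it would repeat the
// cache bust on every later turn. Returning 0 means "no floor".
function stickyMinLayer(previousLayer: number): number {
  const isSticky = previousLayer >= 1 && previousLayer < EMERGENCY_LAYER;
  return isSticky ? previousLayer : 0;
}

interface CompressionStage {
  layer: number;
  // Assumed knob: how many distillations the stage keeps (Infinity = all).
  maxDistillations: number;
}

// One row per stage; the escalation order is the row order.
const COMPRESSION_STAGES: readonly CompressionStage[] = [
  { layer: 1, maxDistillations: Infinity },
  { layer: 2, maxDistillations: 10 },
  { layer: 3, maxDistillations: 3 },
];

// Walk the table from the sticky floor upward and return the first stage
// whose output fits. A null result means the caller must escalate to
// emergency mode (layer 4).
function pickStage(
  effectiveMinLayer: number,
  fitsBudget: (stage: CompressionStage) => boolean,
): CompressionStage | null {
  for (const stage of COMPRESSION_STAGES) {
    if (stage.layer < effectiveMinLayer) continue; // below the sticky floor
    if (fitsBudget(stage)) return stage;
  }
  return null;
}
```

In this shape, a turn that follows an emergency starts again from the first row of the table instead of being pinned at layer 4, and adding a stage means adding one row.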

Cost analysis

The layer-4 stickiness trap made every turn after an emergency pay the full cache-write cost (~$1.00/turn at Opus pricing with 160K context). With the fix, emergency mode is transient (1-2 turns), after which the session falls back to layer 1, where the cache is warm. Over a 100-turn Opus session, this saves $50-80+ for sessions that hit emergency mode.
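
A back-of-the-envelope version of this estimate, as a sketch. The 160K context size comes from the description above; the per-million-token prices are assumptions chosen to be consistent with the ~$1.00/turn figure and should be checked against current pricing. `costPerTurn` and `trappedSessionSavings` are hypothetical helpers, and the model ignores output tokens and any uncached suffix.

```typescript
// All prices are assumptions for illustration, not quoted from this PR.
const CONTEXT_TOKENS = 160_000;        // from the PR description
const CACHE_WRITE_USD_PER_MTOK = 6.25; // assumed cache-write price
const CACHE_READ_USD_PER_MTOK = 0.5;   // assumed cache-read price

// Cost of sending `tokens` of context at a given price per million tokens.
function costPerTurn(tokens: number, usdPerMTok: number): number {
  return (tokens / 1_000_000) * usdPerMTok;
}

// A trapped turn rewrites the whole cache; a warm turn only reads it.
const coldTurnUsd = costPerTurn(CONTEXT_TOKENS, CACHE_WRITE_USD_PER_MTOK);
const warmTurnUsd = costPerTurn(CONTEXT_TOKENS, CACHE_READ_USD_PER_MTOK);

// Savings for a session that would otherwise have spent `trappedTurns`
// turns pinned at layer 4.
function trappedSessionSavings(trappedTurns: number): number {
  return trappedTurns * (coldTurnUsd - warmTurnUsd);
}

console.log(trappedSessionSavings(60).toFixed(2));
console.log(trappedSessionSavings(85).toFixed(2));
```

Under these assumed prices the gap is a little over $0.90 per turn, so a session that hits emergency mode early in a 100-turn run and stays trapped for 60 to 85 turns lands in the $50-80 range quoted above.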

Testing

  • Typecheck: all 4 packages pass
  • Tests: 84/84 gradient tests pass

BYK added 2 commits May 14, 2026 09:27
Fix a critical bug where the sticky layer guard trapped sessions at emergency
mode (layer 4) indefinitely after a single emergency trigger, causing cache
busts every turn for the rest of the session — the root cause of long-session
cost blowups.

Changes:
- Exclude layer 4 from stickiness guard (layers 1-3 only) since emergency
  already blows the cache and stickiness there is pure downside
- Replace three hardcoded layer blocks (1-3) with a data-driven compression
  stage table and loop — same behavior, tunable, extensible
- Add importance-aware distillation trimming (selectDistillations) that keeps
  decisions/gotchas/architecture entries longer under pressure, replacing
  blind slice(-N)
- Fix rawWindowCache reset skipped when effectiveMinLayer > 2 (regression)
- Fix selectDistillations recency formula: use (length-1) divisor so oldest
  entry gets 0.0 and newest gets full 0.7 recency weight
- Tighten importanceBonus regexes: remove overly broad 'fix'/'error'/'pattern'
  matches that fire on most distillations, keep distinctive signals only
- Hoist CompressionStage type and COMPRESSION_STAGES to module level
- Simplify urgent distillation logic (collapse redundant if/else if)
- Clean up .lore.md: remove garbled curator entries, fix contradictory title
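
A rough sketch of the importance-aware trimming and the corrected recency formula described in the changes above. `selectDistillations` and `importanceBonus` are names from this commit; the `Distillation` shape, the regexes, and the flat all-or-nothing bonus are simplifying assumptions, not the actual implementation.

```typescript
// Hypothetical shape: entries are assumed to arrive in chronological order.
interface Distillation {
  text: string;
}

const RECENCY_WEIGHT = 0.7;
const IMPORTANCE_WEIGHT = 0.3;

// Assumed patterns standing in for the "distinctive signals" the commit keeps.
const SIGNALS: readonly RegExp[] = [/\bdecision\b/i, /\bgotcha\b/i, /\barchitecture\b/i];

// Simplification: full bonus if any signal matches, otherwise none.
function importanceBonus(text: string): number {
  return SIGNALS.some((re) => re.test(text)) ? IMPORTANCE_WEIGHT : 0;
}

function selectDistillations(all: readonly Distillation[], limit: number): Distillation[] {
  if (limit <= 0) return [];
  if (all.length <= limit) return all.slice();

  // (length - 1) divisor: the oldest entry scores 0.0 on recency and the
  // newest gets the full 0.7.
  const denom = Math.max(all.length - 1, 1);

  return all
    .map((d, index) => ({
      d,
      index,
      score: RECENCY_WEIGHT * (index / denom) + importanceBonus(d.text),
    }))
    // Highest score first; ties go to the more recent entry.
    .sort((a, b) => b.score - a.score || b.index - a.index)
    .slice(0, limit)
    // Restore chronological order so the kept entries form a stable prefix.
    .sort((a, b) => a.index - b.index)
    .map((entry) => entry.d);
}

const history: Distillation[] = [
  { text: "explored repo layout" },
  { text: "decision: keep layer 4 out of stickiness" },
  { text: "ran typecheck" },
  { text: "updated tests" },
  { text: "merged branch" },
];
console.log(selectDistillations(history, 3).map((d) => d.text));
```

With a limit of 3, a blind slice(-3) would keep the last three entries and drop the decision. In this sketch the decision entry outranks "ran typecheck" and survives, and the result is still returned in chronological order.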
@BYK BYK merged commit bcff2c9 into main May 14, 2026
7 checks passed
@BYK BYK deleted the fix/gradient-stickiness-stage-table branch May 14, 2026 09:39
@craft-deployer craft-deployer Bot mentioned this pull request May 14, 2026
6 tasks