feat(rule_engine): rule TTL with stale demotion + entropy-based ordering (Lu 2022) #74
Red-team finding A7: obsolete rules accumulate indefinitely after
graduation, eventually contaminating injection output ("zombie rules").
Add a TTL check at injection time. Any RULE-tier lesson whose
`sessions_since_fire` meets or exceeds `ttl_sessions` (default 50) is
demoted back to PATTERN with a new `stale=True` flag, rather than
deleted — preserving history for user review while removing its
RULE-tier injection dominance. Emits `rule_demoted_ttl` on the event
bus per demotion so the UI can warn the user.
- Add `Lesson.stale: bool = False`
- Add `rule_engine.DEFAULT_TTL_SESSIONS = 50`
- Add `rule_engine.demote_stale_rules(lessons, ttl_sessions, bus)`
- Wire the demotion as step 0 of `apply_rules`, configurable via
`ttl_sessions=` (pass 0 to disable)
- Extend `test_rule_engine_v2.py` with `TestRuleTTL` (11 cases)
Uses the pre-existing `Lesson.sessions_since_fire` counter — already
incremented by `self_improvement.py` at session close — as the TTL
clock. No new timestamp field required.
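
A minimal sketch of the demotion step described above, not the merged implementation. The `Lesson` stand-in, the string tier values, the `ListBus` class, its `emit(name, payload)` method, and the event payload shape are all assumptions made for illustration; only the names `demote_stale_rules`, `DEFAULT_TTL_SESSIONS`, `sessions_since_fire`, `stale`, and `rule_demoted_ttl` come from this PR.

```python
from dataclasses import dataclass

DEFAULT_TTL_SESSIONS = 50  # default stated in this PR


@dataclass
class Lesson:
    # Hypothetical minimal stand-in; the real Lesson dataclass has more fields.
    text: str
    tier: str = "RULE"  # assumed string tiers: RULE / PATTERN / INSTINCT
    sessions_since_fire: int = 0
    stale: bool = False


class ListBus:
    # Hypothetical event bus that records events in a list.
    def __init__(self):
        self.events = []

    def emit(self, name, payload):
        self.events.append((name, payload))


def demote_stale_rules(lessons, ttl_sessions=DEFAULT_TTL_SESSIONS, bus=None):
    """Demote RULE-tier lessons whose TTL has expired; return the demoted ones."""
    if ttl_sessions <= 0:  # passing 0 disables the check
        return []
    demoted = []
    for lesson in lessons:
        expired = lesson.sessions_since_fire >= ttl_sessions
        if lesson.tier == "RULE" and expired:
            lesson.tier = "PATTERN"  # demoted, not deleted
            lesson.stale = True      # kept for user review
            demoted.append(lesson)
            if bus is not None:
                bus.emit("rule_demoted_ttl", {"text": lesson.text})
    return demoted
```

Running this as step 0 means a stale rule never reaches RULE-tier formatting in the same call, while the lesson itself stays available for review.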
Lu et al. 2022, "Fantastically Ordered Prompts and Where to Find Them" (https://arxiv.org/abs/2104.08786), showed that the same demonstrations in different orders span random-guess to SOTA performance in ICL. Previous behavior within a tier was a uniform secrets-based shuffle, which optimizes only the security property (no confidence leakage), not ICL quality.
Add a compute-cheap GlobalE-style order selector:
- `_ordering_entropy` scores a permutation by Shannon entropy of categories across primacy / middle / recency zones (Liu 2023 "Lost in the Middle" U-curve informs the zone weighting).
- `choose_entropy_ordering` samples 8 random permutations by default and returns the highest-scoring one, with a `(task_type, rule_set_hash)` cache for reuse across sessions.
- `format_rules_for_prompt` gains `entropy_search=True` (default) and `task_type=""` arguments; delegates in-tier ordering to the new selector. Tier ordering (RULE > PATTERN > INSTINCT) and the no-confidence-leak property are preserved.
- When callers supply `shuffle_seed`, the cross-session cache is bypassed so per-seed determinism still holds (test contract).
Perf on a 10-rule set: ~196 us added on cold cache, ~17 us warm. No new dependencies; pure-logic SDK layer. The LLM-based surrogate proposed in Lu 2022 remains a future upgrade for deployments with live calibration budget.
Tests: 12 new cases in `TestEntropyOrdering` inside the existing `tests/test_rule_engine_v2.py`. Pre-existing security test `test_different_seeds_different_order` still passes under the seed-bypasses-cache rule.
Summary
Two commits on `rule_engine.py` per gap-analysis synthesis. RULE-tier lessons past their TTL are flagged `stale=True` and demoted. Closes the "forever-valid" assumption that kept graduated rules in the injection pool indefinitely.
Files
- `src/gradata/_types.py`: adds `stale: bool = False` on the `Lesson` dataclass (interleaves cleanly with main's `_contradiction_streak` from PR #31, "feat(sdk): Brain.add_rule API + profile gating in runners + codex/cline/continue exports")
- `src/gradata/rules/rule_engine.py`: TTL check + entropy-ranked selection
- `tests/test_rule_engine_v2.py`: +12 `TestEntropyOrdering` cases
Commits
- `2711cc2` feat(rule_engine): rule TTL with stale-flag demotion
- `4e6d295` perf(rule_engine): entropy-based ordering per Lu 2022
Tests
2280 pass (+12), ruff clean on touched files.
Ref
Lu et al. 2022 — "Fantastically Ordered Prompts and Where to Find Them" (entropy-based demo ordering for in-context learning).
Co-Authored-By: Gradata noreply@gradata.ai