feat(rule_engine): rule TTL with stale demotion + entropy-based ordering (Lu 2022) #74

Merged
Gradata merged 2 commits into main from worktree-agent-a081f7ba on Apr 15, 2026

@Gradata Gradata commented Apr 15, 2026

Summary

Two commits on rule_engine.py per gap-analysis synthesis:

  1. Rule TTL with stale-flag demotion — rules not reinforced within a window get flagged stale=True and demoted. Closes the "forever-valid" assumption that kept graduated rules in the injection pool indefinitely.
  2. Entropy-based ordering (Lu et al. 2022): within each tier, samples candidate rule orderings and keeps the one whose categories have the highest Shannon entropy across primacy, middle, and recency zones, replacing the previous uniform shuffle. Addresses the "lowest-confidence-rule-that-actually-matches" selection problem surfaced in the sim20 transparency demands.

Files

Commits

  • 2711cc2 feat(rule_engine): rule TTL with stale-flag demotion
  • 4e6d295 perf(rule_engine): entropy-based ordering per Lu 2022

Tests

2280 pass (+12), ruff clean on touched files.

Ref

Lu et al. 2022 — "Fantastically Ordered Prompts and Where to Find Them" (entropy-based demo ordering for in-context learning).

Co-Authored-By: Gradata <noreply@gradata.ai>

Gradata added 2 commits April 15, 2026 01:05
Red-team finding A7: obsolete rules accumulate indefinitely after
graduation, eventually contaminating injection output ("zombie rules").

Add a TTL check at injection time. Any RULE-tier lesson whose
`sessions_since_fire` meets or exceeds `ttl_sessions` (default 50) is
demoted back to PATTERN with a new `stale=True` flag, rather than
deleted — preserving history for user review while removing its
RULE-tier injection dominance. Emits `rule_demoted_ttl` on the event
bus per demotion so the UI can warn the user.

- Add `Lesson.stale: bool = False`
- Add `rule_engine.DEFAULT_TTL_SESSIONS = 50`
- Add `rule_engine.demote_stale_rules(lessons, ttl_sessions, bus)`
- Wire the demotion as step 0 of `apply_rules`, configurable via
  `ttl_sessions=` (pass 0 to disable)
- Extend `test_rule_engine_v2.py` with `TestRuleTTL` (11 cases)

Uses the pre-existing `Lesson.sessions_since_fire` counter — already
incremented by `self_improvement.py` at session close — as the TTL
clock. No new timestamp field required.
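
The demotion step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `rule_engine.py`: the `Lesson` shape is reduced to the fields the commit message mentions plus a placeholder `text`, the string tier names are assumed, the event bus is modelled as a plain callable `bus(event_name, lesson)`, and returning the list of demoted lessons is an assumption made here for testability.

```python
from dataclasses import dataclass

DEFAULT_TTL_SESSIONS = 50  # default stated in the commit message


@dataclass
class Lesson:
    # Hypothetical minimal shape; the real Lesson type has more fields.
    text: str
    tier: str = "RULE"
    sessions_since_fire: int = 0
    stale: bool = False


def demote_stale_rules(lessons, ttl_sessions=DEFAULT_TTL_SESSIONS, bus=None):
    """Demote RULE-tier lessons whose TTL has expired and return them."""
    if ttl_sessions <= 0:
        # Passing 0 disables the TTL check entirely.
        return []
    demoted = []
    for lesson in lessons:
        expired = lesson.sessions_since_fire >= ttl_sessions
        if lesson.tier == "RULE" and expired:
            # Demote instead of deleting, so history stays reviewable.
            lesson.tier = "PATTERN"
            lesson.stale = True
            demoted.append(lesson)
            if bus is not None:
                # Assumed bus interface: a callable taking (event, payload).
                bus("rule_demoted_ttl", lesson)
    return demoted
```

A caller would run this before any ordering or formatting, which matches the "step 0 of `apply_rules`" wiring described in the commit.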

Lu et al. 2022 "Fantastically Ordered Prompts and Where to Find
Them" (https://arxiv.org/abs/2104.08786) showed the same demonstrations
in different orders span random-guess to SOTA performance in ICL.
Previous behavior within a tier was a uniform secrets-based shuffle —
optimizing only the security property (no confidence leakage), not ICL
quality.

Add a compute-cheap GlobalE-style order selector:

- `_ordering_entropy` scores a permutation by Shannon entropy of
  categories across primacy / middle / recency zones (Liu 2023
  "Lost in the Middle" U-curve informs the zone weighting).
- `choose_entropy_ordering` samples 8 random permutations by default
  and returns the highest-scoring one, with a
  `(task_type, rule_set_hash)` cache for reuse across sessions.
- `format_rules_for_prompt` gains `entropy_search=True` (default) and
  `task_type=""` arguments; delegates in-tier ordering to the new
  selector. Tier ordering (RULE > PATTERN > INSTINCT) and the
  no-confidence-leak property are preserved.
- When callers supply `shuffle_seed`, the cross-session cache is
  bypassed so per-seed determinism still holds (test contract).
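
As an illustration of the scoring and selection steps listed above, here is a minimal sketch. It is not the code from the PR: the three-way zone split, the zone weights `(1.0, 0.5, 1.0)`, and the `category_of` callback are assumptions chosen for the example, and caching is left out.

```python
import math
import random
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy in bits of a list of labels; 0.0 for an empty list."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ordering_entropy(categories, zone_weights=(1.0, 0.5, 1.0)):
    """Weighted sum of category entropy over primacy, middle, recency zones.

    The weights are hypothetical; they only mimic a U-shaped emphasis
    on the start and end of the prompt.
    """
    n = len(categories)
    third = max(1, n // 3)
    zones = (
        categories[:third],           # primacy zone
        categories[third:n - third],  # middle zone
        categories[n - third:],       # recency zone
    )
    return sum(w * shannon_entropy(z) for w, z in zip(zone_weights, zones))


def choose_entropy_ordering(rules, category_of, samples=8, rng=None):
    """Sample random permutations and return the highest-scoring one."""
    rng = rng or random.Random()
    best, best_score = list(rules), -1.0
    for _ in range(samples):
        candidate = list(rules)
        rng.shuffle(candidate)
        score = ordering_entropy([category_of(r) for r in candidate])
        if score > best_score:
            best, best_score = candidate, score
    return best
```

Under these assumed weights, an ordering that alternates two categories scores higher than one that groups them, because every zone then contains a mix of categories rather than a single one.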

Perf on a 10-rule set: ~196 us added on cold cache, ~17 us warm.
No new dependencies; pure-logic SDK layer. The LLM-based surrogate
proposed in Lu 2022 remains a future upgrade for deployments with
live calibration budget.

Tests: 12 new cases in `TestEntropyOrdering` inside the existing
`tests/test_rule_engine_v2.py`. Pre-existing security test
`test_different_seeds_different_order` still passes under the
seed-bypasses-cache rule.

coderabbitai Bot commented Apr 15, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 5fd7215 and 4e6d295.

📒 Files selected for processing (3)
  • src/gradata/_types.py
  • src/gradata/rules/rule_engine.py
  • tests/test_rule_engine_v2.py

@Gradata Gradata merged commit b16e45c into main Apr 15, 2026
16 checks passed
@Gradata Gradata deleted the worktree-agent-a081f7ba branch April 17, 2026 19:46