fix(rule_engine): restore ranking inside tree retrieval (was flat-rating to 1.0) by Gradata · Pull Request #169 · Gradata/gradata

Gradata · 2026-05-04T23:52:40Z

Summary

apply_rules_with_tree was hard-coding relevance=1.0 for every returned AppliedRule (src/gradata/rules/rule_engine/_engine.py:444), silently bypassing FSRS / scope-weight ranking.

Council finding (~/.hermes/council_reports/council_2026-05-04T15-53-40.md, Skeptic): this is a correctness bug — adding a token-budget cap on top of broken ranking would just buy 'confidently-wrong rules with finer granularity'. Must fix before any cap work.

Fix (option c — hybrid)

Tree filters to a wide path-aware candidate pool (4× max_rules).
Standard ranker scores the pool: relevance = scope-weight × correction-type (CT) boost, then sorts by (state_priority, difficulty, relevance, effective_confidence).
Conflict filtering against rule_graph behaves the same as in apply_rules.
Relevance floor of 0.05 keeps tree-selected lessons in the result when scope_json is empty.
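
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in _engine.py: the `Candidate` class, the `rank_pool` function, and the field names are hypothetical stand-ins, and the sort direction (larger values first for every key) is an assumption, since the PR text names the sort keys but not their direction. Only the 4× pool size and the 0.05 floor come from the description.

```python
from dataclasses import dataclass

RELEVANCE_FLOOR = 0.05  # floor value quoted in the PR description
POOL_MULTIPLIER = 4     # "4x max_rules" candidate pool from the PR description


@dataclass
class Candidate:
    """Hypothetical stand-in for a tree-selected lesson (not the real Gradata type)."""
    rule_id: str
    state_priority: int
    difficulty: float
    scope_weight: float  # assumed to be 0.0 when scope_json is empty
    ct_boost: float
    effective_confidence: float


def rank_pool(pool, max_rules):
    """Score a candidate pool and return the top max_rules as (rule_id, relevance)."""
    scored = []
    for cand in pool[: POOL_MULTIPLIER * max_rules]:
        relevance = cand.scope_weight * cand.ct_boost
        if relevance <= 0:
            # Keep tree-selected lessons in play instead of scoring them at zero.
            relevance = RELEVANCE_FLOOR
        scored.append((cand, relevance))
    # Assumption: larger values rank first for every key in the tuple.
    scored.sort(
        key=lambda pair: (
            pair[0].state_priority,
            pair[0].difficulty,
            pair[1],
            pair[0].effective_confidence,
        ),
        reverse=True,
    )
    return [(cand.rule_id, relevance) for cand, relevance in scored[:max_rules]]
```

In this sketch a lesson with an empty scope still appears in the output, but with relevance 0.05 rather than a flat 1.0, so it sorts below any lesson that actually matched the scope.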

Migration

The path column on lesson_transitions was added by the inline migration at src/gradata/_migrations/__init__.py:96 — that migration was the trigger for the tree-retrieval shortcut.

Test plan

tests/test_tree_retrieval_ranking.py (3 new): proves relevance is no longer flat-rated to 1.0 and ranking preserves confidence ordering. Confirmed FAILING on pre-fix code, PASSING after fix.
tests/test_rule_engine_ranking_invariant.py (2 new): legacy-compat — top-K and relevance scores must match between apply_rules and apply_rules_with_tree on a no-path corpus.
Full rule/tree suite: 811 passed, 2 skipped (pytest -k 'rule or tree').
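
For readers without the test files at hand, the two properties the regression tests check can be expressed as small predicates. This is only a sketch of the shape of those checks: `is_flat_rated` and `confidence_order_preserved` are hypothetical helpers, not functions from the Gradata test suite, and the confidence check assumes the other sort keys are equal across the compared rules.

```python
def is_flat_rated(relevances, value=1.0, tol=1e-9):
    """True when every relevance equals `value`, the symptom of the original bug."""
    return bool(relevances) and all(abs(r - value) <= tol for r in relevances)


def confidence_order_preserved(confidences):
    """True when returned rules are in non-increasing confidence order."""
    return all(a >= b for a, b in zip(confidences, confidences[1:]))
```

A regression test in this style would assert `not is_flat_rated(...)` on the relevance scores returned for a path-bearing corpus, and `confidence_order_preserved(...)` on the confidences of the returned rules.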

Layering check

No Layer 0 → Layer 2 imports introduced. Public apply_rules signature unchanged.

Risk

Low. Tree path now produces better ranking on path-bearing corpora; legacy (no-path) corpora unchanged via fallback delegation.
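
The fallback delegation mentioned here amounts to a guard at the top of the tree entry point. A minimal sketch, assuming lessons are dict-like with an optional `path` key; the function name and the injected `flat_ranker` / `tree_ranker` callables are hypothetical and stand in for apply_rules and the tree ranking path:

```python
def apply_with_tree_fallback(lessons, max_rules, flat_ranker, tree_ranker):
    """Use the tree path only when at least one lesson carries a path."""
    has_path = any(lesson.get("path") for lesson in lessons)
    if not has_path:
        # Legacy / pre-migration corpus: behave exactly like the flat ranker.
        return flat_ranker(lessons, max_rules)
    return tree_ranker(lessons, max_rules)
```

Because the legacy branch returns the flat ranker's result untouched, top-K and relevance scores on a no-path corpus are identical by construction, which is the invariant the second test module locks in.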

…ing to 1.0)

apply_rules_with_tree previously hard-coded relevance=1.0 for every returned AppliedRule, silently bypassing the FSRS / scope-weight ranker.

Council finding (council_2026-05-04T15-53-40.md, Skeptic perspective): this is a correctness bug. Adding a token-budget cap on top of broken ranking just produces 'confidently-wrong rules with finer granularity'.

Fix (option c, hybrid): tree retrieval now filters to a wide path-aware candidate pool (4x max_rules), then runs the same scoring stack as apply_rules over that pool: scope-weight relevance × CT boost, sorted by (state_priority, difficulty, relevance, effective_confidence). A small relevance floor (0.05) keeps tree-selected lessons in the result when their scope_json is empty.

Backwards compat: when no lesson carries a path (legacy / pre-migration corpora), apply_rules_with_tree still delegates to apply_rules. This is locked in by tests/test_rule_engine_ranking_invariant.py.

The 'path' column was added by the inline migration at src/gradata/_migrations/__init__.py:96 (lesson_transitions).

Tests:
- tests/test_tree_retrieval_ranking.py: three regression tests proving relevance is no longer flat-rated to 1.0 and ranking preserves confidence ordering.
- tests/test_rule_engine_ranking_invariant.py: top-K + relevance equivalence between apply_rules and apply_rules_with_tree on a no-path corpus.

coderabbitai · 2026-05-04T23:52:55Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: eb9052f5-fda1-4d24-baa5-82bc85a5c034

📥 Commits

Reviewing files that changed from the base of the PR and between e42c9f3 and e1deb6d.

📒 Files selected for processing (3)

Gradata/src/gradata/rules/rule_engine/_engine.py
Gradata/tests/test_rule_engine_ranking_invariant.py
Gradata/tests/test_tree_retrieval_ranking.py

📝 Walkthrough

Summary

Bug Fix: Fixed apply_rules_with_tree hard-coding relevance=1.0 for all returned rules, bypassing proper FSRS/scope-weight ranking.

Key Changes:

Tree-retrieved candidate pool now undergoes full ranking pass (4× max_rules) with scope-weight relevance scoring and correction-type boost
Multi-criteria sorting applied (state priority, difficulty, relevance, effective confidence) matching apply_rules behavior
Conflict filtering against rule_graph preserves parity with apply_rules
Relevance floor of 0.05 prevents tree-selected lessons from being dropped when scope_json is empty
No public API signature changes; backward compatible with legacy no-path corpora

Tests Added:

test_tree_retrieval_ranking.py (3 tests) — validates non-flat relevance scores and confidence-based ordering
test_rule_engine_ranking_invariant.py (2 tests) — ensures apply_rules_with_tree matches apply_rules on legacy corpora

Risk: Low; improves correctness for path-bearing corpora, unchanged behavior for legacy corpora.

Walkthrough

apply_rules_with_tree now performs full ranking instead of hard-coding relevance=1.0. When lessons lack path data, it delegates to apply_rules. Otherwise, it retrieves a wider candidate pool, scores by scope-weight with correction-type boost, sorts by state priority and computed relevance, filters conflicts, and returns AppliedRule objects with dynamic relevance scores. Two test modules validate backward compatibility and ranking correctness.
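
The conflict-filtering step in this walkthrough is not spelled out in the PR text. One plausible reading is a greedy pass over the ranked list that drops any rule conflicting with a higher-ranked rule already kept. The sketch below illustrates that reading only: `filter_conflicts` is a hypothetical name, and the `conflicts` mapping (rule id to a set of conflicting rule ids) is an assumed simplification of whatever rule_graph actually stores.

```python
def filter_conflicts(ranked_ids, conflicts):
    """Greedy pass: keep each rule unless it conflicts with a rule already kept."""
    kept = []
    for rule_id in ranked_ids:
        clashes = any(
            # Treat conflicts as symmetric, whichever side recorded the edge.
            rule_id in conflicts.get(other, set())
            or other in conflicts.get(rule_id, set())
            for other in kept
        )
        if not clashes:
            kept.append(rule_id)
    return kept
```

Under this reading the filter depends on ranking order, which is one reason flat-rated relevance matters: with every score pinned to 1.0, which side of a conflict survives is decided by the remaining sort keys alone.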

Changes

Rule Engine Ranking Refinement

| Layer / File(s) | Summary |
|---|---|
| Core Ranking Logic `Gradata/src/gradata/rules/rule_engine/_engine.py` (lines 410–525) | `apply_rules_with_tree` delegates to `apply_rules` when lessons have no `path`. Otherwise, retrieves a wider candidate pool via `RuleTree`, computes scope-weight relevance with correction-type boost, floors non-positive values to 0.05 epsilon, sorts candidates by `(state_priority, difficulty, relevance, effective_confidence)`, applies conflict filtering, and formats `AppliedRule` with computed relevance instead of constant 1.0. |
| Legacy Backward Compatibility Tests `Gradata/tests/test_rule_engine_ranking_invariant.py` | Added `test_top_k_equivalent_when_no_lessons_have_path` to verify `apply_rules_with_tree` delegates to `apply_rules` when no `path` exists, ensuring identical `rule_id` ordering and relevance scores. Added `test_relevance_not_flat_rated_when_paths_present` to assert non-degenerate relevance when lessons include `path`. |
| Tree Retrieval Ranking Tests `Gradata/tests/test_tree_retrieval_ranking.py` | Added regression suite with `_make(...)` helper and three tests validating that `apply_rules_with_tree` produces non-constant relevance scores, correctly ranks high-confidence lessons above low-confidence ones regardless of input order, and places the highest-confidence lesson at the top. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Gradata/gradata#15: Modifies apply_rules with domain-scoping and bus parameter additions; this PR aligns apply_rules_with_tree ranking logic with those apply_rules semantics (state priority, difficulty, relevance, effective confidence).

Suggested labels

bug

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.20.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

✔ Opengrep OSS
✔ Basic security coverage for first-party code vulnerabilities.

 Loading rules from local config...
[00.23][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal

Gradata merged commit eded405 into main May 4, 2026
6 of 8 checks passed

greptile-apps Bot reviewed May 4, 2026

Gradata deleted the fix/tree-retrieval-ranking branch May 4, 2026 23:52

coderabbitai Bot added the bug Something isn't working label May 4, 2026

coderabbitai Bot mentioned this pull request May 5, 2026

feat(v0.7.0): gradata_recall MCP tool + universal hook adapters + audit CLI #171

Open

Gradata merged 1 commit into main from fix/tree-retrieval-ranking
