fix(rule_engine): restore ranking inside tree retrieval (was flat-rating to 1.0) #169

Merged
Gradata merged 1 commit into main from fix/tree-retrieval-ranking
May 4, 2026

@Gradata Gradata commented May 4, 2026

Summary

apply_rules_with_tree was hard-coding relevance=1.0 for every returned AppliedRule (src/gradata/rules/rule_engine/_engine.py:444), silently bypassing FSRS / scope-weight ranking.

Council finding (~/.hermes/council_reports/council_2026-05-04T15-53-40.md, Skeptic): this is a correctness bug — adding a token-budget cap on top of broken ranking would just buy 'confidently-wrong rules with finer granularity'. Must fix before any cap work.

Fix (option c — hybrid)

  1. Tree filters to a wide path-aware candidate pool (4× max_rules).
  2. Standard ranker scores the pool: scope-weight × correction-type (CT) boost, sorted by (state_priority, difficulty, relevance, effective_confidence).
  3. Conflict filtering against rule_graph matches apply_rules parity.
  4. Relevance floor of 0.05 keeps tree-selected lessons in the result when scope_json is empty.
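
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in _engine.py: the Candidate type, the rank_pool function, and the descending sort direction are all assumptions made for the example, and conflict filtering against rule_graph (step 3) is left out. Only the 4× pool size, the 0.05 floor, and the sort-key order come from this PR.

```python
from dataclasses import dataclass

RELEVANCE_FLOOR = 0.05  # floor stated in this PR
POOL_MULTIPLIER = 4     # "4x max_rules" stated in this PR


@dataclass
class Candidate:
    """Hypothetical stand-in for a tree-selected lesson (not the real Gradata type)."""
    rule_id: str
    state_priority: int          # assumption: larger value ranks first
    difficulty: float            # assumption: larger value ranks first
    scope_weight: float          # assumed to be 0.0 when scope_json is empty
    ct_boost: float              # correction-type multiplier
    effective_confidence: float


def rank_pool(tree_hits, max_rules):
    """Score a wide tree-selected pool, then keep the top max_rules entries."""
    # Step 1: keep a candidate pool wider than the final result.
    pool = tree_hits[: POOL_MULTIPLIER * max_rules]

    scored = []
    for cand in pool:
        # Step 2: relevance is scope weight times the correction-type boost.
        relevance = cand.scope_weight * cand.ct_boost
        # Step 4: non-positive relevance is floored so the lesson is not dropped.
        if relevance <= 0.0:
            relevance = RELEVANCE_FLOOR
        scored.append((cand, relevance))

    # Sort by the key tuple named in the PR; descending order is an assumption.
    scored.sort(
        key=lambda pair: (
            pair[0].state_priority,
            pair[0].difficulty,
            pair[1],
            pair[0].effective_confidence,
        ),
        reverse=True,
    )
    return [(cand.rule_id, relevance) for cand, relevance in scored[:max_rules]]
```

The point of the sketch is that relevance is computed per candidate rather than set to a constant, so it can act as a tie-breaker between lessons that share state_priority and difficulty.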

Migration

The path column on lesson_transitions was added by the inline migration at src/gradata/_migrations/__init__.py:96 — that migration was the trigger for the tree-retrieval shortcut.

Test plan

  • tests/test_tree_retrieval_ranking.py (3 new): proves relevance is no longer flat-rated to 1.0 and ranking preserves confidence ordering. Confirmed FAILING on pre-fix code, PASSING after fix.
  • tests/test_rule_engine_ranking_invariant.py (2 new): legacy-compat — top-K and relevance scores must match between apply_rules and apply_rules_with_tree on a no-path corpus.
  • Full rule/tree suite: 811 passed, 2 skipped (pytest -k 'rule or tree').
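
For readers without the test files open, the two properties the new regression tests check could be expressed along these lines. The helper names and the input shapes (plain lists of relevance values and of (rule_id, effective_confidence) pairs) are hypothetical; the real tests call apply_rules_with_tree, whose signature is not shown in this PR.

```python
def assert_not_flat_rated(relevances):
    """Fail if every relevance equals 1.0, which was the pre-fix symptom."""
    assert relevances, "expected at least one applied rule"
    assert not all(r == 1.0 for r in relevances), "relevance is flat-rated to 1.0"


def assert_confidence_order_preserved(applied):
    """Fail if rules are not returned in descending confidence order.

    `applied` is a list of (rule_id, effective_confidence) pairs in the order
    the engine returned them, assuming all other sort keys are equal.
    """
    confidences = [conf for _, conf in applied]
    assert confidences == sorted(confidences, reverse=True), (
        "higher-confidence rule ranked below a lower-confidence one"
    )
```

Against the pre-fix behavior the first helper would fail, since every returned relevance was 1.0.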

Layering check

No Layer 0 → Layer 2 imports introduced. Public apply_rules signature unchanged.

Risk

Low. Tree path now produces better ranking on path-bearing corpora; legacy (no-path) corpora unchanged via fallback delegation.
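
The fallback delegation mentioned above amounts to a guard at the top of the tree entry point. A rough sketch, assuming lessons expose an optional path attribute; has_any_path, apply_with_tree, and the two ranker callables are illustrative names, not the engine's actual API:

```python
def has_any_path(lessons):
    """True if at least one lesson carries a non-empty path."""
    return any(getattr(lesson, "path", None) for lesson in lessons)


def apply_with_tree(lessons, max_rules, flat_ranker, tree_ranker):
    """Delegate to the flat ranker on legacy corpora, else use the tree path.

    flat_ranker and tree_ranker are hypothetical callables standing in for
    apply_rules and the new tree-plus-ranking pipeline respectively.
    """
    if not has_any_path(lessons):
        # Legacy / pre-migration corpus: results must match the flat ranker.
        return flat_ranker(lessons, max_rules)
    return tree_ranker(lessons, max_rules)
```

Because the no-path branch returns the flat ranker's output unchanged, top-K and relevance equivalence on legacy corpora follows directly, which is what the invariant test locks in.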

…ing to 1.0)

apply_rules_with_tree previously hard-coded relevance=1.0 for every
returned AppliedRule, silently bypassing the FSRS / scope-weight
ranker. Council finding (council_2026-05-04T15-53-40.md, Skeptic
perspective): this is a correctness bug — adding a token-budget cap
on top of broken ranking just produces 'confidently-wrong rules with
finer granularity'.

Fix (option c, hybrid): tree retrieval now filters to a wide
path-aware candidate pool (4x max_rules), then runs the same scoring
stack as apply_rules over that pool — scope-weight relevance × CT
boost, sorted by (state_priority, difficulty, relevance,
effective_confidence). A small relevance floor (0.05) keeps tree-
selected lessons in the result when their scope_json is empty.

Backwards compat: when no lesson carries a path (legacy / pre-
migration corpora), apply_rules_with_tree still delegates to
apply_rules — locked in by tests/test_rule_engine_ranking_invariant.py.

The 'path' column was added by the inline migration at
src/gradata/_migrations/__init__.py:96 (lesson_transitions).

Tests:
- tests/test_tree_retrieval_ranking.py — three regression tests
  proving relevance is no longer flat-rated to 1.0 and ranking
  preserves confidence ordering.
- tests/test_rule_engine_ranking_invariant.py — top-K + relevance
  equivalence between apply_rules and apply_rules_with_tree on a
  no-path corpus.
@Gradata Gradata merged commit eded405 into main May 4, 2026
6 of 8 checks passed

@Gradata Gradata deleted the fix/tree-retrieval-ranking branch May 4, 2026 23:52

coderabbitai Bot commented May 4, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: eb9052f5-fda1-4d24-baa5-82bc85a5c034

📥 Commits

Reviewing files that changed from the base of the PR and between e42c9f3 and e1deb6d.

📒 Files selected for processing (3)
  • Gradata/src/gradata/rules/rule_engine/_engine.py
  • Gradata/tests/test_rule_engine_ranking_invariant.py
  • Gradata/tests/test_tree_retrieval_ranking.py

📝 Walkthrough

Summary

Bug Fix: Fixed apply_rules_with_tree hard-coding relevance=1.0 for all returned rules, bypassing proper FSRS/scope-weight ranking.

Key Changes:

  • Tree-retrieved candidate pool now undergoes full ranking pass (4× max_rules) with scope-weight relevance scoring and correction-type boost
  • Multi-criteria sorting applied (state priority, difficulty, relevance, effective confidence) matching apply_rules behavior
  • Conflict filtering against rule_graph preserves parity with apply_rules
  • Relevance floor of 0.05 prevents tree-selected lessons from being dropped when scope_json is empty
  • No public API signature changes; backward compatible with legacy no-path corpora

Tests Added:

  • test_tree_retrieval_ranking.py (3 tests) — validates non-flat relevance scores and confidence-based ordering
  • test_rule_engine_ranking_invariant.py (2 tests) — ensures apply_rules_with_tree matches apply_rules on legacy corpora

Risk: Low; improves correctness for path-bearing corpora, unchanged behavior for legacy corpora.

Walkthrough

apply_rules_with_tree now performs full ranking instead of hard-coding relevance=1.0. When lessons lack path data, it delegates to apply_rules. Otherwise, it retrieves a wider candidate pool, scores by scope-weight with correction-type boost, sorts by state priority and computed relevance, filters conflicts, and returns AppliedRule objects with dynamic relevance scores. Two test modules validate backward compatibility and ranking correctness.

Changes

Rule Engine Ranking Refinement

| Layer / File(s) | Summary |
| --- | --- |
| Core Ranking Logic: Gradata/src/gradata/rules/rule_engine/_engine.py (lines 410–525) | apply_rules_with_tree delegates to apply_rules when lessons have no path. Otherwise, retrieves a wider candidate pool via RuleTree, computes scope-weight relevance with correction-type boost, floors non-positive values to 0.05 epsilon, sorts candidates by (state_priority, difficulty, relevance, effective_confidence), applies conflict filtering, and formats AppliedRule with computed relevance instead of constant 1.0. |
| Legacy Backward Compatibility Tests: Gradata/tests/test_rule_engine_ranking_invariant.py | Added test_top_k_equivalent_when_no_lessons_have_path to verify apply_rules_with_tree delegates to apply_rules when no path exists, ensuring identical rule_id ordering and relevance scores. Added test_relevance_not_flat_rated_when_paths_present to assert non-degenerate relevance when lessons include path. |
| Tree Retrieval Ranking Tests: Gradata/tests/test_tree_retrieval_ranking.py | Added regression suite with _make(...) helper and three tests validating that apply_rules_with_tree produces non-constant relevance scores, correctly ranks high-confidence lessons above low-confidence ones regardless of input order, and places the highest-confidence lesson at the top. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Gradata/gradata#15: Modifies apply_rules with domain-scoping and bus parameter additions; this PR aligns apply_rules_with_tree ranking logic with those apply_rules semantics (state priority, difficulty, relevance, effective confidence).

Suggested labels

bug


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.20.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

✔ Opengrep OSS
✔ Basic security coverage for first-party code vulnerabilities.

Loading rules from local config...
[00.23][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal



Labels

bug Something isn't working
