Skip to content

feat(jit,graduation): BM25 for JIT ranking + raise Beta LB default to 0.85#101

Merged
Gradata merged 6 commits intomainfrom
feat/jit-bm25-and-beta-lb
Apr 17, 2026
Merged

feat(jit,graduation): BM25 for JIT ranking + raise Beta LB default to 0.85#101
Gradata merged 6 commits intomainfrom
feat/jit-bm25-and-beta-lb

Conversation

@Gradata
Copy link
Copy Markdown
Owner

@Gradata Gradata commented Apr 17, 2026

Summary

  • BM25 for JIT ranking (src/gradata/hooks/jit_inject.py): replaces Jaccard with BM25 via optional bm25s dep when installed; falls back to Jaccard so the SDK stays zero-required-deps. Same pattern as rules/rule_ranker.py. BM25 weights rare terms over generic ones — better signal for matching drafts to rule descriptions.
  • Beta LB default 0.70 -> 0.85 (src/gradata/enhancements/self_improvement/_graduation.py): per S105 recommendation. Gate stays opt-in (GRADATA_BETA_LB_GATE off by default) so this is a no-op for anyone who hasn't enabled it.

Test plan

  • pytest tests/test_jit_inject.py — 34/34 (32 existing + 2 new: BM25 path, Jaccard fallback)
  • pytest tests/test_agent_graduation.py tests/test_graduation_notification.py tests/test_pattern_graduation_integration.py — 74 passed, 2 xfailed
  • pytest tests/test_wiring_compound.py — 14/14

Generated with Gradata

Gradata and others added 5 commits April 16, 2026 23:57
…ompts

Two measured waste eliminations:

1. inject_brain_rules: mandatory_reminder block restated every mandatory
   rule verbatim in a recency tag. mandatory_block already places them in
   primacy position — the reminder was defensive redundancy costing
   ~50-200 tok/session per mandatory rule.

2. context_inject: MIN_MESSAGE_LEN raised 10 → 60 to skip brain FTS
   search on trivial follow-ups ("ok", "yes", "continue"). Previous
   threshold only filtered 1-9 char messages, causing ~400 tok of
   wasted context per ack-style reply.

Council audit (adversarial, synthesized from Skeptic + Architect +
Pragmatist) found my earlier measurement methodology used synthetic
inputs that masked real per-operation costs. Real heavy session
(10 prompts, 30 edits, 5 agents) accumulates ~9.6K gradata tokens;
context_inject alone can contribute 3.5K when prompts are substantive.

Tests: 2884 passed, 1 preexisting cp1252 Windows unicode failure
in test_rule_to_hook (unrelated to injection).

Co-Authored-By: Gradata <noreply@gradata.ai>
…imits

Rule_enforcement was re-injecting SessionStart rules on every Write/Edit,
costing ~165 tok/edit for zero added primacy value. Default it off behind
GRADATA_RULE_ENFORCEMENT=1 opt-in for ablation.

Also add env knobs for tuning injection volume without redeploying:
- context_inject: GRADATA_CONTEXT_INJECT kill switch + MIN_MESSAGE_LEN
  raised 60 -> 100 so ack-style replies skip FTS cost
- inject_brain_rules: GRADATA_MAX_RULES, GRADATA_MIN_CONFIDENCE,
  GRADATA_MAX_META_RULES for tuning SessionStart payload

Tests updated to opt into rule_enforcement where they exercise the hook
and to use a >=100 char message for context_inject.

Co-Authored-By: Gradata <noreply@gradata.ai>
Telemetry showed 10 SessionStart events in a long session re-injected
1.9KB each (~3.7k tokens total) — duplicative because the compact summary
already carries rules from the prior session, and the new session's
primacy slot is consumed by that summary.

Skip inject_brain_rules when source is 'compact' or 'resume'. Opt back
in with GRADATA_INJECT_ON_COMPACT=1 for ablation.

Co-Authored-By: Gradata <noreply@gradata.ai>
…d metas

When a meta-rule is injected, its source_lesson_ids are added to a mutex
set; individual RULE/PATTERN leaves whose _lesson_id() is in that set are
skipped so the principle isn't double-injected as both the meta and its
concrete leaves.

Freed slots get refilled: rank_rules is now called with a 3x overshoot
so the render loop has candidates beyond the MAX_RULES cap; the cap is
enforced at render time (cluster_lines + individual_lines <= MAX_RULES)
rather than at rank time.

On current brain: 1822B -> 1319B per SessionStart (27% smaller), and
leaf categories diversify from {LEADS, DEMO_PREP, DRAFTING} to
{DRAFTING, TONE, DATA_INTEGRITY, ARCHITECTURE, CODE} while the LEADS +
DEMO_PREP principles still land via metas.

Opt-out: GRADATA_META_RULE_MUTEX=0 for ablation. Mirrors the existing
cluster-vs-meta mutex pattern (meta_covered_categories).

Co-Authored-By: Gradata <noreply@gradata.ai>
… 0.85

JIT injection (src/gradata/hooks/jit_inject.py) now uses BM25 via the
optional bm25s dep when installed, with Jaccard as a graceful fallback.
Matches the pattern already used in rules/rule_ranker.py. BM25 weights
rare terms higher than generic ones, which is exactly what we want when
matching a fresh draft against a corpus of rule descriptions.

Beta LB gate default threshold raised from 0.70 -> 0.85 per S105
recommendation — S105 ablation showed the 0.70 floor still lets through
format-based false graduations. Gate remains opt-in (GRADATA_BETA_LB_GATE
off by default), so this is a no-op for anyone who hasn't enabled it.

Tests: all 34 jit_inject tests pass (32 existing + 2 new covering BM25
path and Jaccard fallback). Graduation + wiring tests: 88/88 pass.

Co-Authored-By: Gradata <noreply@gradata.ai>
Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

📝 Walkthrough
  • JIT ranking now BM25-first with Jaccard fallback: rank_rules_for_draft() uses BM25 scoring via optional bm25s (falls back to existing Jaccard path if unavailable or on error)
  • Beta LB threshold raised: default parsed GRADATA_BETA_LB_THRESHOLD changed from 0.70 → 0.85; note the GRADATA_BETA_LB_GATE remains opt-in (disabled by default), so no behavioral change unless the gate is enabled
  • Rule enforcement off by default: GRADATA_RULE_ENFORCEMENT must be set to "1" to enable per-edit rule injection (hook returns None when not enabled)
  • New environment knobs for runtime behavior: GRADATA_CONTEXT_INJECT, GRADATA_MAX_RULES, GRADATA_MIN_CONFIDENCE, GRADATA_MAX_META_RULES, GRADATA_MAX_REMINDERS, GRADATA_INJECT_ON_COMPACT, GRADATA_MIN_MESSAGE_LEN, GRADATA_MAX_CONTEXT_LEN, GRADATA_META_RULE_MUTEX
  • Context injection defaults changed and made configurable: MIN_MESSAGE_LEN default increased (from 10 → 100) and exposed as GRADATA_MIN_MESSAGE_LEN; MAX_CONTEXT_LEN exposed as GRADATA_MAX_CONTEXT_LEN
  • Conditional/opt-in rule injection behavior: brain rule re-injection skips "compact" and "resume" SessionStart sources unless GRADATA_INJECT_ON_COMPACT=1
  • Meta-rule mutex extension: tracks meta_covered_lesson_ids and suppresses leaf rules whose lesson_id is already covered by injected meta-rules when enabled
  • Removed mandatory-reminder output: section no longer produced; only remains
  • Robust parsing/hardening: safer parsing/validation for GRADATA_BETA_LB_THRESHOLD and GRADATA_BETA_LB_MIN_FIRES (guard against NaN/inf, clamp threshold to [0,1], floor min_fires at 0), plus minor lint/import/annotation cleanups and stricter tests

Breaking changes:

  • Rule enforcement disabled by default (requires GRADATA_RULE_ENFORCEMENT=1)
  • MIN_MESSAGE_LEN default raised to 100
  • output removed

No security fixes or new public API methods were introduced.

Walkthrough

Adds environment-configurable feature gates and thresholds across multiple hooks, integrates BM25-first ranking with Jaccard fallback for JIT rule selection, extends meta-rule mutexing to suppress by lesson IDs, tightens parsing/clamping of beta LB threshold/min-fires, and removes the <mandatory-reminder> block from outputs/tests.

Changes

Cohort / File(s) Summary
Graduation / Threshold Parsing
src/gradata/enhancements/self_improvement/_graduation.py
Changed default parsed GRADATA_BETA_LB_THRESHOLD from 0.70 → 0.85; added math/non-finite handling, clamping to [0.0,1.0], and safer exception fallbacks; clamped GRADATA_BETA_LB_MIN_FIRES to non-negative ints.
Context Injection Hook
src/gradata/hooks/context_inject.py
Added os import; made MIN_MESSAGE_LEN and MAX_CONTEXT_LEN configurable via environment; added early-exit kill-switch when GRADATA_CONTEXT_INJECT != "1".
Rule Enforcement Hook
src/gradata/hooks/rule_enforcement.py
Added os import; hook now no-ops unless GRADATA_RULE_ENFORCEMENT="1"; MAX_REMINDERS made env-configurable.
Inject Brain Rules
src/gradata/hooks/inject_brain_rules.py
Env-configurable rule thresholds (MAX_RULES, MIN_CONFIDENCE, MAX_META_RULES), skip reinjection for compact/resume sources (unless allowed), larger ranker candidate overshoot with final budget enforcement, extended meta-rule mutexes to track meta_covered_lesson_ids and suppress leaf rules by lesson id; removed construction of <mandatory-reminder> in final render.
JIT Inject / Ranking
src/gradata/hooks/jit_inject.py
Added optional BM25 integration with import-time feature gate; new _bm25_scores_for_draft() helper; BM25-first scoring with Jaccard fallback retained; refactored ranking flow to apply BM25 scores when available.
Tests — adjustments & new cases
tests/test_hooks_intelligence.py, tests/test_hooks_safety.py, tests/test_jit_inject.py, tests/test_rule_enforcement_scope.py, tests/test_self_healing_fix.py
Updated test inputs and environment setup (explicitly enable rule enforcement where required), added BM25 availability tests for jit_inject, added autouse fixture enabling GRADATA_RULE_ENFORCEMENT, and removed tests/assertions tied to <mandatory-reminder>.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the two main changes: BM25 implementation for JIT ranking and Beta LB threshold increase from 0.70 to 0.85.
Description check ✅ Passed The description is directly related to the changeset, explaining both main features (BM25 for JIT ranking and Beta LB default threshold change) and providing a test plan demonstrating validation of the changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/jit-bm25-and-beta-lb

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the feature label Apr 17, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/gradata/hooks/rule_enforcement.py (1)

97-101: ⚠️ Potential issue | 🟠 Major

Fix CI-blocking Ruff SIM103 in domain check.

The conditional pattern at line 98 triggers the SIM103 lint failure. Simplify the boolean return logic as shown below.

Proposed fix
-    decl_domain = scope.get("domain")
-    if decl_domain and file_domain and decl_domain != file_domain:
-        return False
+    decl_domain = scope.get("domain")
+    if decl_domain and file_domain:
+        return decl_domain == file_domain
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/rule_enforcement.py` around lines 97 - 101, The current
domain mismatch check uses an if/return pattern that triggers Ruff SIM103;
replace the conditional block that uses decl_domain = scope.get("domain") and
file_domain with a single boolean return expression such as: return decl_domain
is None or file_domain is None or decl_domain == file_domain (or equivalently
return not (decl_domain and file_domain and decl_domain != file_domain)) to
simplify the logic in the domain check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/gradata/enhancements/self_improvement/_graduation.py`:
- Around line 60-64: Parse and validate the env vars rather than blindly
casting: read GRADATA_BETA_LB_THRESHOLD into threshold with float(...), reject
NaN/inf and clamp into [0.0, 1.0] (fallback to 0.85 on invalid input), and read
GRADATA_BETA_LB_MIN_FIRES into min_fires with int(...), treating non-numeric
values as the default 5 and enforcing a non-negative minimum (e.g., min_fires =
max(1, parsed) or at least max(0, parsed) depending on intended floor) so
negative values can't bypass the observation floor; use math.isnan/
math.isfinite checks and handle exceptions around float/int conversion, updating
the variables threshold and min_fires in the _graduation.py block.

In `@src/gradata/hooks/context_inject.py`:
- Around line 18-19: The module-level env parsing for MIN_MESSAGE_LEN and
MAX_CONTEXT_LEN can raise ValueError during import; change it to a fail-safe
parse that never throws: replace the direct int(os.environ.get(...)) usage with
a small safe parser used for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN (e.g., a helper
like parse_env_int or inline try/except) that returns the default when the env
value is missing or malformed and logs nothing so startup is not blocked; update
references to MIN_MESSAGE_LEN and MAX_CONTEXT_LEN only (the constants named in
the diff) to use the safely parsed values.

In `@src/gradata/hooks/inject_brain_rules.py`:
- Around line 47-50: Currently MAX_RULES, MAX_META_RULES and MIN_CONFIDENCE are
parsed at import time which can raise and break SessionStart; move the parsing
of os.environ for GRADATA_MAX_RULES, GRADATA_MAX_META_RULES and
GRADATA_MIN_CONFIDENCE into the module's main() (or the hook entrypoint) so
parsing happens at runtime, handle any ValueError/TypeError by falling back
silently to the existing defaults (10, 5, 0.60) and log/debug if needed but do
not raise, and clamp MIN_CONFIDENCE into the [0.0, 1.0] range before use (e.g.,
min(max(parsed_value, 0.0), 1.0)); update references in the code to use the
local variables (MAX_RULES, MAX_META_RULES, MIN_CONFIDENCE) obtained inside
main()/inject_brain_rules entry so import-time evaluation is removed.
- Around line 301-304: The current logic only limits individual_lines via
render_budget and allows cluster_lines to push the total over MAX_RULES; fix by
enforcing the MAX_RULES cap on the combined payload: either truncate
cluster_lines to at most MAX_RULES (and set render_budget = 0) or, after
computing cluster_lines + individual_lines, trim the concatenated list to
MAX_RULES before returning/using it. Update the logic around render_budget,
cluster_lines, individual_lines, and MAX_RULES (inside the same mutex-protected
section) so the final sent/returned list can never exceed MAX_RULES.

In `@src/gradata/hooks/jit_inject.py`:
- Around line 111-114: The type annotation for the parameter `candidates` in
`_bm25_scores_for_draft` uses a quoted forward reference `"Lesson"` which is
redundant with `from __future__ import annotations`; remove the quotes so the
annotation reads `list[tuple[Lesson, str, str]]` (and similarly drop quotes
anywhere else in this function signature using `"Lesson"`) to resolve Ruff UP037
and unblock CI.

In `@src/gradata/hooks/rule_enforcement.py`:
- Line 39: Protect MAX_REMINDERS import-time parsing by wrapping the
int(os.environ.get("GRADATA_MAX_REMINDERS", "5")) conversion in a safe block:
try to parse the env var into an int, on ValueError or other exceptions fall
back silently to the default (5) and ensure the final value is clamped to
non-negative with max(0, parsed_value); update the symbol MAX_REMINDERS in
rule_enforcement.py accordingly (no exceptions should propagate during import)
and optionally emit a non-blocking debug/warn via the module logger if
available.

In `@tests/test_hooks_intelligence.py`:
- Around line 182-185: The test relies on environment defaults and can be flaky;
update the patched environment in tests/test_hooks_intelligence.py so patch.dict
includes deterministic values for GRADATA_CONTEXT_INJECT and
GRADATA_MIN_MESSAGE_LEN (e.g., "0" or other known safe values) alongside
GRADATA_BRAIN_DIR, ensuring the call to context_main({"message": ...}) runs the
same regardless of runner environment; modify the existing patch.dict invocation
near the context_main call to add these two keys so the test no longer depends
on host env.

In `@tests/test_jit_inject.py`:
- Around line 142-150: The test test_falls_back_to_jaccard_when_bm25_unavailable
should assert exact expected output instead of truthiness: set _BM25_AVAILABLE
False and bm25s None as you already do, call rank_rules_for_draft(...) and
replace "assert ranked and ranked[0][0].category == 'X'" with a concrete
assertion such as asserting ranked is a list of expected length and that
ranked[0][0].category == "X" (e.g., assert len(ranked) >= 1 and
ranked[0][0].category == "X") so failures show the specific mismatch; use the
existing helpers _lesson and the ranked variable to verify the exact category
and position.

---

Outside diff comments:
In `@src/gradata/hooks/rule_enforcement.py`:
- Around line 97-101: The current domain mismatch check uses an if/return
pattern that triggers Ruff SIM103; replace the conditional block that uses
decl_domain = scope.get("domain") and file_domain with a single boolean return
expression such as: return decl_domain is None or file_domain is None or
decl_domain == file_domain (or equivalently return not (decl_domain and
file_domain and decl_domain != file_domain)) to simplify the logic in the domain
check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 79c4b69e-7fce-43a3-a448-fafe0c3d7674

📥 Commits

Reviewing files that changed from the base of the PR and between 357fbdd and 5bf3acf.

📒 Files selected for processing (10)
  • src/gradata/enhancements/self_improvement/_graduation.py
  • src/gradata/hooks/context_inject.py
  • src/gradata/hooks/inject_brain_rules.py
  • src/gradata/hooks/jit_inject.py
  • src/gradata/hooks/rule_enforcement.py
  • tests/test_hooks_intelligence.py
  • tests/test_hooks_safety.py
  • tests/test_jit_inject.py
  • tests/test_rule_enforcement_scope.py
  • tests/test_self_healing_fix.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

  • tests/test_hooks_intelligence.py
  • tests/test_self_healing_fix.py
  • tests/test_hooks_safety.py
  • tests/test_rule_enforcement_scope.py
  • tests/test_jit_inject.py
src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

  • src/gradata/hooks/rule_enforcement.py
  • src/gradata/hooks/context_inject.py
  • src/gradata/enhancements/self_improvement/_graduation.py
  • src/gradata/hooks/jit_inject.py
  • src/gradata/hooks/inject_brain_rules.py
src/gradata/hooks/**

⚙️ CodeRabbit configuration file

src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).

Files:

  • src/gradata/hooks/rule_enforcement.py
  • src/gradata/hooks/context_inject.py
  • src/gradata/hooks/jit_inject.py
  • src/gradata/hooks/inject_brain_rules.py
🪛 GitHub Actions: SDK CI
src/gradata/hooks/rule_enforcement.py

[error] 98-100: ruff: SIM103 Return the negated condition directly

src/gradata/enhancements/self_improvement/_graduation.py

[error] 9-30: ruff: I001 Import block is un-sorted or un-formatted (organize imports)

src/gradata/hooks/jit_inject.py

[error] 112-112: ruff: UP037 Remove quotes from type annotation

🔇 Additional comments (2)
tests/test_rule_enforcement_scope.py (1)

13-17: Good isolation: module-wide opt-in fixture is the right approach.

This keeps scope tests deterministic under the new default-off rule-enforcement behavior.

tests/test_hooks_safety.py (1)

107-107: Nice update: explicit env gating in tests matches runtime behavior.

These changes correctly opt tests into rule enforcement and keep expectations aligned with the new default-off semantics.

Also applies to: 120-120, 129-129

Comment thread src/gradata/enhancements/self_improvement/_graduation.py Outdated
Comment on lines +18 to +19
MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make env integer parsing fail-safe for hook startup.

Lines 18-19 can throw on malformed env values (GRADATA_MIN_MESSAGE_LEN, GRADATA_MAX_CONTEXT_LEN) during import, which prevents this hook from loading.

Proposed fix
-MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
-MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+    try:
+        value = int(os.environ.get(name, str(default)))
+    except (TypeError, ValueError):
+        return default
+    return max(minimum, value)
+
+
+MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0)
+MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)

As per coding guidelines, src/gradata/hooks/** requires that errors are silent and never block the tool chain.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))
def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
try:
value = int(os.environ.get(name, str(default)))
except (TypeError, ValueError):
return default
return max(minimum, value)
MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0)
MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/context_inject.py` around lines 18 - 19, The module-level
env parsing for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN can raise ValueError during
import; change it to a fail-safe parse that never throws: replace the direct
int(os.environ.get(...)) usage with a small safe parser used for MIN_MESSAGE_LEN
and MAX_CONTEXT_LEN (e.g., a helper like parse_env_int or inline try/except)
that returns the default when the env value is missing or malformed and logs
nothing so startup is not blocked; update references to MIN_MESSAGE_LEN and
MAX_CONTEXT_LEN only (the constants named in the diff) to use the safely parsed
values.

Comment on lines +47 to +50
MAX_RULES = int(os.environ.get("GRADATA_MAX_RULES", "10"))
MIN_CONFIDENCE = float(os.environ.get("GRADATA_MIN_CONFIDENCE", "0.60"))
# Meta-rules are high-level principles — separate cap from MAX_RULES.
MAX_META_RULES = int(os.environ.get("GRADATA_MAX_META_RULES", "5"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't parse hook env knobs at import time.

A malformed GRADATA_MAX_RULES, GRADATA_MAX_META_RULES, or GRADATA_MIN_CONFIDENCE now raises during module import and disables SessionStart entirely. MIN_CONFIDENCE also flows through without any [0,1] clamping. These should be parsed inside main() with silent fallback.

As per coding guidelines, "errors must be silent (never block the tool chain)" and "Confidence values must be in [0.0, 1.0]".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/inject_brain_rules.py` around lines 47 - 50, Currently
MAX_RULES, MAX_META_RULES and MIN_CONFIDENCE are parsed at import time which can
raise and break SessionStart; move the parsing of os.environ for
GRADATA_MAX_RULES, GRADATA_MAX_META_RULES and GRADATA_MIN_CONFIDENCE into the
module's main() (or the hook entrypoint) so parsing happens at runtime, handle
any ValueError/TypeError by falling back silently to the existing defaults (10,
5, 0.60) and log/debug if needed but do not raise, and clamp MIN_CONFIDENCE into
the [0.0, 1.0] range before use (e.g., min(max(parsed_value, 0.0), 1.0)); update
references in the code to use the local variables (MAX_RULES, MAX_META_RULES,
MIN_CONFIDENCE) obtained inside main()/inject_brain_rules entry so import-time
evaluation is removed.

Comment on lines +301 to +304
# Total <brain-rules> entries = cluster_lines + individual_lines.
# Enforce MAX_RULES here (after mutex) so freed slots get refilled from
# the overshoot pool, and the final block still respects the budget.
render_budget = max(0, MAX_RULES - len(cluster_lines))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

MAX_RULES is still exceeded when multiple clusters qualify.

This budget only caps individual_lines. If len(cluster_lines) > MAX_RULES, the final cluster_lines + individual_lines payload still exceeds the configured limit, so the new overshoot path can inject more entries than operators asked for.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/inject_brain_rules.py` around lines 301 - 304, The current
logic only limits individual_lines via render_budget and allows cluster_lines to
push the total over MAX_RULES; fix by enforcing the MAX_RULES cap on the
combined payload: either truncate cluster_lines to at most MAX_RULES (and set
render_budget = 0) or, after computing cluster_lines + individual_lines, trim
the concatenated list to MAX_RULES before returning/using it. Update the logic
around render_budget, cluster_lines, individual_lines, and MAX_RULES (inside the
same mutex-protected section) so the final sent/returned list can never exceed
MAX_RULES.

Comment thread src/gradata/hooks/jit_inject.py
}

MAX_REMINDERS = 5
MAX_REMINDERS = int(os.environ.get("GRADATA_MAX_REMINDERS", "5"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard GRADATA_MAX_REMINDERS parsing to avoid import-time crashes.

Line 39 can raise ValueError on invalid env input and fail hook loading before main() can return safely. Also clamp to non-negative to avoid unexpected slicing behavior.

Proposed fix
-MAX_REMINDERS = int(os.environ.get("GRADATA_MAX_REMINDERS", "5"))
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+    try:
+        value = int(os.environ.get(name, str(default)))
+    except (TypeError, ValueError):
+        return default
+    return max(minimum, value)
+
+
+MAX_REMINDERS = _env_int("GRADATA_MAX_REMINDERS", 5, minimum=0)

As per coding guidelines, src/gradata/hooks/** requires that errors are silent and never block the tool chain.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/rule_enforcement.py` at line 39, Protect MAX_REMINDERS
import-time parsing by wrapping the int(os.environ.get("GRADATA_MAX_REMINDERS",
"5")) conversion in a safe block: try to parse the env var into an int, on
ValueError or other exceptions fall back silently to the default (5) and ensure
the final value is clamped to non-negative with max(0, parsed_value); update the
symbol MAX_REMINDERS in rule_enforcement.py accordingly (no exceptions should
propagate during import) and optionally emit a non-blocking debug/warn via the
module logger if available.

Comment on lines 182 to 185
with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
patch("gradata.brain.Brain", return_value=mock_brain):
result = context_main({"message": "How do I set up the pipeline for new prospects?"})
result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Pin context-inject env knobs in this test to avoid host-env flakiness.

This test currently relies on external defaults. If GRADATA_CONTEXT_INJECT or GRADATA_MIN_MESSAGE_LEN is set in the runner environment, it can fail nondeterministically.

Proposed fix
-    with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
+    with patch.dict(
+        os.environ,
+        {
+            "GRADATA_BRAIN_DIR": str(tmp_path),
+            "GRADATA_CONTEXT_INJECT": "1",
+            "GRADATA_MIN_MESSAGE_LEN": "100",
+        },
+    ), \
          patch("gradata.brain.Brain", return_value=mock_brain):
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
patch("gradata.brain.Brain", return_value=mock_brain):
result = context_main({"message": "How do I set up the pipeline for new prospects?"})
result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})
with patch.dict(
os.environ,
{
"GRADATA_BRAIN_DIR": str(tmp_path),
"GRADATA_CONTEXT_INJECT": "1",
"GRADATA_MIN_MESSAGE_LEN": "100",
},
), \
patch("gradata.brain.Brain", return_value=mock_brain):
result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_hooks_intelligence.py` around lines 182 - 185, The test relies on
environment defaults and can be flaky; update the patched environment in
tests/test_hooks_intelligence.py so patch.dict includes deterministic values for
GRADATA_CONTEXT_INJECT and GRADATA_MIN_MESSAGE_LEN (e.g., "0" or other known
safe values) alongside GRADATA_BRAIN_DIR, ensuring the call to
context_main({"message": ...}) runs the same regardless of runner environment;
modify the existing patch.dict invocation near the context_main call to add
these two keys so the test no longer depends on host env.

Comment thread tests/test_jit_inject.py Outdated
- ruff UP037: drop quoted "Lesson" from jit_inject _bm25_scores_for_draft
  annotation (from __future__ import annotations is already present)
- ruff I001: organize imports in _graduation.py
- harden GRADATA_BETA_LB_THRESHOLD parsing: guard NaN/inf, clamp to [0,1],
  catch TypeError alongside ValueError; GRADATA_BETA_LB_MIN_FIRES floored
  at 0
- test: assert exact len and category in Jaccard-fallback test instead of
  bare truthiness

Tests: 34 jit + 88 graduation/wiring pass. Lint clean.

Co-Authored-By: Gradata <noreply@gradata.ai>
Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/gradata/hooks/jit_inject.py (1)

173-177: ⚠️ Potential issue | 🟠 Major

Clamp and validate confidence before filtering and scoring.

Line 175 trusts raw lesson.confidence without type/finiteness/range checks. Non-numeric/NaN/inf values can break comparisons or leak invalid ranking weights, and out-of-range values violate the SDK confidence contract.

Suggested fix
@@
 import json
 import logging
+import math
 import os
@@
 def rank_rules_for_draft(
@@
-    candidates: list[tuple[Lesson, str, str, float]] = []
+    try:
+        min_confidence = float(min_confidence)
+    except (TypeError, ValueError):
+        min_confidence = DEFAULT_MIN_CONFIDENCE
+    if not math.isfinite(min_confidence):
+        min_confidence = DEFAULT_MIN_CONFIDENCE
+    min_confidence = min(1.0, max(0.0, min_confidence))
+
+    candidates: list[tuple[Lesson, str, str, float]] = []
     for lesson in lessons:
-        conf = getattr(lesson, "confidence", 0.0)
+        raw_conf = getattr(lesson, "confidence", 0.0)
+        try:
+            conf = float(raw_conf)
+        except (TypeError, ValueError):
+            continue
+        if not math.isfinite(conf):
+            continue
+        conf = min(1.0, max(0.0, conf))
         if conf < min_confidence:
             continue

As per coding guidelines, "Confidence values must be in [0.0, 1.0]."

Also applies to: 194-205

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/jit_inject.py` around lines 173 - 177, Validate and clamp
lesson confidence before using it: when you read confidence via conf =
getattr(lesson, "confidence", 0.0) (in the loop over lessons) coerce non-numeric
values to 0.0, ensure the value is finite, convert to float, and clamp into the
[0.0, 1.0] range before applying the min_confidence filter and before any
scoring logic; apply the same validation/clamping in the later scoring block
(the candidate scoring section around lines 194-205) so all comparisons and
ranking weights use a normalized confidence value for lessons, and treat
invalid/NaN/inf as 0.0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/gradata/hooks/jit_inject.py`:
- Around line 173-177: Validate and clamp lesson confidence before using it:
when you read confidence via conf = getattr(lesson, "confidence", 0.0) (in the
loop over lessons) coerce non-numeric values to 0.0, ensure the value is finite,
convert to float, and clamp into the [0.0, 1.0] range before applying the
min_confidence filter and before any scoring logic; apply the same
validation/clamping in the later scoring block (the candidate scoring section
around lines 194-205) so all comparisons and ranking weights use a normalized
confidence value for lessons, and treat invalid/NaN/inf as 0.0.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a52c044d-8d67-41c4-b71e-1e6c111bd475

📥 Commits

Reviewing files that changed from the base of the PR and between 5bf3acf and 3ff7a21.

📒 Files selected for processing (3)
  • src/gradata/enhancements/self_improvement/_graduation.py
  • src/gradata/hooks/jit_inject.py
  • tests/test_jit_inject.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

  • tests/test_jit_inject.py
src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

  • src/gradata/hooks/jit_inject.py
  • src/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/**

⚙️ CodeRabbit configuration file

src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).

Files:

  • src/gradata/hooks/jit_inject.py
🔇 Additional comments (4)
src/gradata/enhancements/self_improvement/_graduation.py (2)

45-46: Docstring default is aligned with runtime behavior.

Line 45 now reflects the 0.85 default threshold used by the gate logic.


59-71: Beta gate env parsing/validation hardening is solid.

Lines 62-71 correctly handle non-finite input, clamp threshold to [0.0, 1.0], catch TypeError/ValueError, and floor min_fires at 0.

src/gradata/hooks/jit_inject.py (1)

111-147: BM25 path degrades safely without breaking the hook contract.

Good defensive shape here: Line 120 short-circuits when BM25 is unavailable, and Lines 135-137 fail closed to Jaccard fallback instead of blocking execution.

tests/test_jit_inject.py (1)

129-141: New BM25/fallback tests are well-targeted and deterministic.

These cover both branches clearly and assert concrete outcomes (top category and exact fallback result size), which tightens regression detection.

Also applies to: 142-151

@Gradata Gradata merged commit cedd97a into main Apr 17, 2026
4 of 17 checks passed
@Gradata Gradata deleted the feat/jit-bm25-and-beta-lb branch April 17, 2026 19:50
Gradata added a commit that referenced this pull request Apr 19, 2026
Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109),
Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101),
gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66
ruff-violation fixes (#100, #103).

Co-Authored-By: Gradata <noreply@gradata.ai>
Gradata added a commit that referenced this pull request Apr 19, 2026
Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109),
Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101),
gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66
ruff-violation fixes (#100, #103).

Co-Authored-By: Gradata <noreply@gradata.ai>
Gradata added a commit that referenced this pull request Apr 20, 2026
* docs: v0.6.1 changelog entry covering post-v0.6.0 work through PR #114

Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109),
Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101),
gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66
ruff-violation fixes (#100, #103).

Co-Authored-By: Gradata <noreply@gradata.ai>

* docs(changelog): qualify env-centralization claim

CodeRabbit flagged that GRADATA_RULE_VERIFIER is still read directly in
rule_pipeline.py. Weaken the claim rather than block on the migration.

Co-Authored-By: Gradata <noreply@gradata.ai>

---------

Co-authored-by: Gradata <noreply@gradata.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant