feat(jit,graduation): BM25 for JIT ranking + raise Beta LB default to 0.85 by Gradata · Pull Request #101 · Gradata/gradata

Gradata · 2026-04-17T10:33:43Z

Summary

BM25 for JIT ranking (src/gradata/hooks/jit_inject.py): replaces Jaccard with BM25 via optional bm25s dep when installed; falls back to Jaccard so the SDK stays zero-required-deps. Same pattern as rules/rule_ranker.py. BM25 weights rare terms over generic ones — better signal for matching drafts to rule descriptions.
Beta LB default 0.70 -> 0.85 (src/gradata/enhancements/self_improvement/_graduation.py): per S105 recommendation. Gate stays opt-in (GRADATA_BETA_LB_GATE off by default) so this is a no-op for anyone who hasn't enabled it.

Test plan

pytest tests/test_jit_inject.py — 34/34 (32 existing + 2 new: BM25 path, Jaccard fallback)
pytest tests/test_agent_graduation.py tests/test_graduation_notification.py tests/test_pattern_graduation_integration.py — 74 passed, 2 xfailed
pytest tests/test_wiring_compound.py — 14/14

Generated with Gradata

…ompts Two measured waste eliminations: 1. inject_brain_rules: mandatory_reminder block restated every mandatory rule verbatim in a recency tag. mandatory_block already places them in primacy position — the reminder was defensive redundancy costing ~50-200 tok/session per mandatory rule. 2. context_inject: MIN_MESSAGE_LEN raised 10 → 60 to skip brain FTS search on trivial follow-ups ("ok", "yes", "continue"). Previous threshold only filtered 1-9 char messages, causing ~400 tok of wasted context per ack-style reply. Council audit (adversarial, synthesized from Skeptic + Architect + Pragmatist) found my earlier measurement methodology used synthetic inputs that masked real per-operation costs. Real heavy session (10 prompts, 30 edits, 5 agents) accumulates ~9.6K gradata tokens; context_inject alone can contribute 3.5K when prompts are substantive. Tests: 2884 passed, 1 preexisting cp1252 Windows unicode failure in test_rule_to_hook (unrelated to injection). Co-Authored-By: Gradata <noreply@gradata.ai>

…imits Rule_enforcement was re-injecting SessionStart rules on every Write/Edit, costing ~165 tok/edit for zero added primacy value. Default it off behind GRADATA_RULE_ENFORCEMENT=1 opt-in for ablation. Also add env knobs for tuning injection volume without redeploying: - context_inject: GRADATA_CONTEXT_INJECT kill switch + MIN_MESSAGE_LEN raised 60 -> 100 so ack-style replies skip FTS cost - inject_brain_rules: GRADATA_MAX_RULES, GRADATA_MIN_CONFIDENCE, GRADATA_MAX_META_RULES for tuning SessionStart payload Tests updated to opt into rule_enforcement where they exercise the hook and to use a >=100 char message for context_inject. Co-Authored-By: Gradata <noreply@gradata.ai>

Telemetry showed 10 SessionStart events in a long session re-injected 1.9KB each (~3.7k tokens total) — duplicative because the compact summary already carries rules from the prior session, and the new session's primacy slot is consumed by that summary. Skip inject_brain_rules when source is 'compact' or 'resume'. Opt back in with GRADATA_INJECT_ON_COMPACT=1 for ablation. Co-Authored-By: Gradata <noreply@gradata.ai>

…d metas When a meta-rule is injected, its source_lesson_ids are added to a mutex set; individual RULE/PATTERN leaves whose _lesson_id() is in that set are skipped so the principle isn't double-injected as both the meta and its concrete leaves. Freed slots get refilled: rank_rules is now called with a 3x overshoot so the render loop has candidates beyond the MAX_RULES cap; the cap is enforced at render time (cluster_lines + individual_lines <= MAX_RULES) rather than at rank time. On current brain: 1822B -> 1319B per SessionStart (27% smaller), and leaf categories diversify from {LEADS, DEMO_PREP, DRAFTING} to {DRAFTING, TONE, DATA_INTEGRITY, ARCHITECTURE, CODE} while the LEADS + DEMO_PREP principles still land via metas. Opt-out: GRADATA_META_RULE_MUTEX=0 for ablation. Mirrors the existing cluster-vs-meta mutex pattern (meta_covered_categories). Co-Authored-By: Gradata <noreply@gradata.ai>

… 0.85 JIT injection (src/gradata/hooks/jit_inject.py) now uses BM25 via the optional bm25s dep when installed, with Jaccard as a graceful fallback. Matches the pattern already used in rules/rule_ranker.py. BM25 weights rare terms higher than generic ones, which is exactly what we want when matching a fresh draft against a corpus of rule descriptions. Beta LB gate default threshold raised from 0.70 -> 0.85 per S105 recommendation — S105 ablation showed the 0.70 floor still lets through format-based false graduations. Gate remains opt-in (GRADATA_BETA_LB_GATE off by default), so this is a no-op for anyone who hasn't enabled it. Tests: all 34 jit_inject tests pass (32 existing + 2 new covering BM25 path and Jaccard fallback). Graduation + wiring tests: 88/88 pass. Co-Authored-By: Gradata <noreply@gradata.ai>

greptile-apps

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

coderabbitai · 2026-04-17T10:33:58Z

📝 Walkthrough

JIT ranking now BM25-first with Jaccard fallback: rank_rules_for_draft() uses BM25 scoring via optional bm25s (falls back to existing Jaccard path if unavailable or on error)
Beta LB threshold raised: default parsed GRADATA_BETA_LB_THRESHOLD changed from 0.70 → 0.85; note the GRADATA_BETA_LB_GATE remains opt-in (disabled by default), so no behavioral change unless the gate is enabled
Rule enforcement off by default: GRADATA_RULE_ENFORCEMENT must be set to "1" to enable per-edit rule injection (hook returns None when not enabled)
New environment knobs for runtime behavior: GRADATA_CONTEXT_INJECT, GRADATA_MAX_RULES, GRADATA_MIN_CONFIDENCE, GRADATA_MAX_META_RULES, GRADATA_MAX_REMINDERS, GRADATA_INJECT_ON_COMPACT, GRADATA_MIN_MESSAGE_LEN, GRADATA_MAX_CONTEXT_LEN, GRADATA_META_RULE_MUTEX
Context injection defaults changed and made configurable: MIN_MESSAGE_LEN default increased (from 10 → 100) and exposed as GRADATA_MIN_MESSAGE_LEN; MAX_CONTEXT_LEN exposed as GRADATA_MAX_CONTEXT_LEN
Conditional/opt-in rule injection behavior: brain rule re-injection skips "compact" and "resume" SessionStart sources unless GRADATA_INJECT_ON_COMPACT=1
Meta-rule mutex extension: tracks meta_covered_lesson_ids and suppresses leaf rules whose lesson_id is already covered by injected meta-rules when enabled
Removed mandatory-reminder output: section no longer produced; only remains
Robust parsing/hardening: safer parsing/validation for GRADATA_BETA_LB_THRESHOLD and GRADATA_BETA_LB_MIN_FIRES (guard against NaN/inf, clamp threshold to [0,1], floor min_fires at 0), plus minor lint/import/annotation cleanups and stricter tests

Breaking changes:

Rule enforcement disabled by default (requires GRADATA_RULE_ENFORCEMENT=1)
MIN_MESSAGE_LEN default raised to 100
output removed

No security fixes or new public API methods were introduced.

Walkthrough

Adds environment-configurable feature gates and thresholds across multiple hooks, integrates BM25-first ranking with Jaccard fallback for JIT rule selection, extends meta-rule mutexing to suppress by lesson IDs, tightens parsing/clamping of beta LB threshold/min-fires, and removes the <mandatory-reminder> block from outputs/tests.

Changes

Cohort / File(s)	Summary
Graduation / Threshold Parsing `src/gradata/enhancements/self_improvement/_graduation.py`	Changed default parsed `GRADATA_BETA_LB_THRESHOLD` from 0.70 → 0.85; added math/non-finite handling, clamping to [0.0,1.0], and safer exception fallbacks; clamped `GRADATA_BETA_LB_MIN_FIRES` to non-negative ints.
Context Injection Hook `src/gradata/hooks/context_inject.py`	Added `os` import; made `MIN_MESSAGE_LEN` and `MAX_CONTEXT_LEN` configurable via environment; added early-exit kill-switch when `GRADATA_CONTEXT_INJECT != "1"`.
Rule Enforcement Hook `src/gradata/hooks/rule_enforcement.py`	Added `os` import; hook now no-ops unless `GRADATA_RULE_ENFORCEMENT="1"`; `MAX_REMINDERS` made env-configurable.
Inject Brain Rules `src/gradata/hooks/inject_brain_rules.py`	Env-configurable rule thresholds (`MAX_RULES`, `MIN_CONFIDENCE`, `MAX_META_RULES`), skip reinjection for compact/resume sources (unless allowed), larger ranker candidate overshoot with final budget enforcement, extended meta-rule mutexes to track `meta_covered_lesson_ids` and suppress leaf rules by lesson id; removed construction of `<mandatory-reminder>` in final render.
JIT Inject / Ranking `src/gradata/hooks/jit_inject.py`	Added optional BM25 integration with import-time feature gate; new `_bm25_scores_for_draft()` helper; BM25-first scoring with Jaccard fallback retained; refactored ranking flow to apply BM25 scores when available.
Tests — adjustments & new cases `tests/test_hooks_intelligence.py`, `tests/test_hooks_safety.py`, `tests/test_jit_inject.py`, `tests/test_rule_enforcement_scope.py`, `tests/test_self_healing_fix.py`	Updated test inputs and environment setup (explicitly enable rule enforcement where required), added BM25 availability tests for jit_inject, added autouse fixture enabling `GRADATA_RULE_ENFORCEMENT`, and removed tests/assertions tied to `<mandatory-reminder>`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

feat(ranking): BM25 context relevance + Thompson sampling + unified ranker #91 — Closely related BM25 integration and ranking changes in jit_inject (similar BM25-first, Jaccard-fallback approach).
feat(wiring): canary + rules.injected + scipy Beta PPF + Beta LB gate #86 — Changes to Beta LB gating thresholds and parsing logic impacting the same graduation gate.
feat(hooks): inject meta-rules into LLM context at session start #45 — Related modifications to inject_brain_rules.py meta-rule handling and suppression logic.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically identifies the two main changes: BM25 implementation for JIT ranking and Beta LB threshold increase from 0.70 to 0.85.
Description check	✅ Passed	The description is directly related to the changeset, explaining both main features (BM25 for JIT ranking and Beta LB default threshold change) and providing a test plan demonstrating validation of the changes.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/jit-bm25-and-beta-lb

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/gradata/hooks/rule_enforcement.py (1)

97-101: ⚠️ Potential issue | 🟠 Major

Fix CI-blocking Ruff SIM103 in domain check.

The conditional pattern at line 98 triggers the SIM103 lint failure. Simplify the boolean return logic as shown below.

Proposed fix

-    decl_domain = scope.get("domain")
-    if decl_domain and file_domain and decl_domain != file_domain:
-        return False
+    decl_domain = scope.get("domain")
+    if decl_domain and file_domain:
+        return decl_domain == file_domain

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/rule_enforcement.py` around lines 97 - 101, The current
domain mismatch check uses an if/return pattern that triggers Ruff SIM103;
replace the conditional block that uses decl_domain = scope.get("domain") and
file_domain with a single boolean return expression such as: return decl_domain
is None or file_domain is None or decl_domain == file_domain (or equivalently
return not (decl_domain and file_domain and decl_domain != file_domain)) to
simplify the logic in the domain check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/gradata/enhancements/self_improvement/_graduation.py`:
- Around line 60-64: Parse and validate the env vars rather than blindly
casting: read GRADATA_BETA_LB_THRESHOLD into threshold with float(...), reject
NaN/inf and clamp into [0.0, 1.0] (fallback to 0.85 on invalid input), and read
GRADATA_BETA_LB_MIN_FIRES into min_fires with int(...), treating non-numeric
values as the default 5 and enforcing a non-negative minimum (e.g., min_fires =
max(1, parsed) or at least max(0, parsed) depending on intended floor) so
negative values can't bypass the observation floor; use math.isnan/
math.isfinite checks and handle exceptions around float/int conversion, updating
the variables threshold and min_fires in the _graduation.py block.

In `@src/gradata/hooks/context_inject.py`:
- Around line 18-19: The module-level env parsing for MIN_MESSAGE_LEN and
MAX_CONTEXT_LEN can raise ValueError during import; change it to a fail-safe
parse that never throws: replace the direct int(os.environ.get(...)) usage with
a small safe parser used for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN (e.g., a helper
like parse_env_int or inline try/except) that returns the default when the env
value is missing or malformed and logs nothing so startup is not blocked; update
references to MIN_MESSAGE_LEN and MAX_CONTEXT_LEN only (the constants named in
the diff) to use the safely parsed values.

In `@src/gradata/hooks/inject_brain_rules.py`:
- Around line 47-50: Currently MAX_RULES, MAX_META_RULES and MIN_CONFIDENCE are
parsed at import time which can raise and break SessionStart; move the parsing
of os.environ for GRADATA_MAX_RULES, GRADATA_MAX_META_RULES and
GRADATA_MIN_CONFIDENCE into the module's main() (or the hook entrypoint) so
parsing happens at runtime, handle any ValueError/TypeError by falling back
silently to the existing defaults (10, 5, 0.60) and log/debug if needed but do
not raise, and clamp MIN_CONFIDENCE into the [0.0, 1.0] range before use (e.g.,
min(max(parsed_value, 0.0), 1.0)); update references in the code to use the
local variables (MAX_RULES, MAX_META_RULES, MIN_CONFIDENCE) obtained inside
main()/inject_brain_rules entry so import-time evaluation is removed.
- Around line 301-304: The current logic only limits individual_lines via
render_budget and allows cluster_lines to push the total over MAX_RULES; fix by
enforcing the MAX_RULES cap on the combined payload: either truncate
cluster_lines to at most MAX_RULES (and set render_budget = 0) or, after
computing cluster_lines + individual_lines, trim the concatenated list to
MAX_RULES before returning/using it. Update the logic around render_budget,
cluster_lines, individual_lines, and MAX_RULES (inside the same mutex-protected
section) so the final sent/returned list can never exceed MAX_RULES.

In `@src/gradata/hooks/jit_inject.py`:
- Around line 111-114: The type annotation for the parameter `candidates` in
`_bm25_scores_for_draft` uses a quoted forward reference `"Lesson"` which is
redundant with `from __future__ import annotations`; remove the quotes so the
annotation reads `list[tuple[Lesson, str, str]]` (and similarly drop quotes
anywhere else in this function signature using `"Lesson"`) to resolve Ruff UP037
and unblock CI.

In `@src/gradata/hooks/rule_enforcement.py`:
- Line 39: Protect MAX_REMINDERS import-time parsing by wrapping the
int(os.environ.get("GRADATA_MAX_REMINDERS", "5")) conversion in a safe block:
try to parse the env var into an int, on ValueError or other exceptions fall
back silently to the default (5) and ensure the final value is clamped to
non-negative with max(0, parsed_value); update the symbol MAX_REMINDERS in
rule_enforcement.py accordingly (no exceptions should propagate during import)
and optionally emit a non-blocking debug/warn via the module logger if
available.

In `@tests/test_hooks_intelligence.py`:
- Around line 182-185: The test relies on environment defaults and can be flaky;
update the patched environment in tests/test_hooks_intelligence.py so patch.dict
includes deterministic values for GRADATA_CONTEXT_INJECT and
GRADATA_MIN_MESSAGE_LEN (e.g., "0" or other known safe values) alongside
GRADATA_BRAIN_DIR, ensuring the call to context_main({"message": ...}) runs the
same regardless of runner environment; modify the existing patch.dict invocation
near the context_main call to add these two keys so the test no longer depends
on host env.

In `@tests/test_jit_inject.py`:
- Around line 142-150: The test test_falls_back_to_jaccard_when_bm25_unavailable
should assert exact expected output instead of truthiness: set _BM25_AVAILABLE
False and bm25s None as you already do, call rank_rules_for_draft(...) and
replace "assert ranked and ranked[0][0].category == 'X'" with a concrete
assertion such as asserting ranked is a list of expected length and that
ranked[0][0].category == "X" (e.g., assert len(ranked) >= 1 and
ranked[0][0].category == "X") so failures show the specific mismatch; use the
existing helpers _lesson and the ranked variable to verify the exact category
and position.

---

Outside diff comments:
In `@src/gradata/hooks/rule_enforcement.py`:
- Around line 97-101: The current domain mismatch check uses an if/return
pattern that triggers Ruff SIM103; replace the conditional block that uses
decl_domain = scope.get("domain") and file_domain with a single boolean return
expression such as: return decl_domain is None or file_domain is None or
decl_domain == file_domain (or equivalently return not (decl_domain and
file_domain and decl_domain != file_domain)) to simplify the logic in the domain
check.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 79c4b69e-7fce-43a3-a448-fafe0c3d7674

📥 Commits

Reviewing files that changed from the base of the PR and between 357fbdd and 5bf3acf.

📒 Files selected for processing (10)

src/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/context_inject.py
src/gradata/hooks/inject_brain_rules.py
src/gradata/hooks/jit_inject.py
src/gradata/hooks/rule_enforcement.py
tests/test_hooks_intelligence.py
tests/test_hooks_safety.py
tests/test_jit_inject.py
tests/test_rule_enforcement_scope.py
tests/test_self_healing_fix.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (3)

tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

tests/test_hooks_intelligence.py
tests/test_self_healing_fix.py
tests/test_hooks_safety.py
tests/test_rule_enforcement_scope.py
tests/test_jit_inject.py

src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

src/gradata/hooks/rule_enforcement.py
src/gradata/hooks/context_inject.py
src/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/jit_inject.py
src/gradata/hooks/inject_brain_rules.py

src/gradata/hooks/**

⚙️ CodeRabbit configuration file

src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).

Files:

src/gradata/hooks/rule_enforcement.py
src/gradata/hooks/context_inject.py
src/gradata/hooks/jit_inject.py
src/gradata/hooks/inject_brain_rules.py

🪛 GitHub Actions: SDK CI

src/gradata/hooks/rule_enforcement.py

[error] 98-100: ruff: SIM103 Return the negated condition directly

src/gradata/enhancements/self_improvement/_graduation.py

[error] 9-30: ruff: I001 Import block is un-sorted or un-formatted (organize imports)

src/gradata/hooks/jit_inject.py

[error] 112-112: ruff: UP037 Remove quotes from type annotation

🔇 Additional comments (2)

tests/test_rule_enforcement_scope.py (1)

13-17: Good isolation: module-wide opt-in fixture is the right approach.

This keeps scope tests deterministic under the new default-off rule-enforcement behavior.

tests/test_hooks_safety.py (1)

107-107: Nice update: explicit env gating in tests matches runtime behavior.

These changes correctly opt tests into rule enforcement and keep expectations aligned with the new default-off semantics.

Also applies to: 120-120, 129-129

coderabbitai · 2026-04-17T10:40:23Z

+MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
+MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))


⚠️ Potential issue | 🟠 Major

Make env integer parsing fail-safe for hook startup.

Lines 18-19 can throw on malformed env values (GRADATA_MIN_MESSAGE_LEN, GRADATA_MAX_CONTEXT_LEN) during import, which prevents this hook from loading.

Proposed fix

-MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100")) -MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000")) +def _env_int(name: str, default: int, *, minimum: int = 0) -> int: + try: + value = int(os.environ.get(name, str(default))) + except (TypeError, ValueError): + return default + return max(minimum, value) + + +MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0) +MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)

As per coding guidelines, src/gradata/hooks/** requires that errors are silent and never block the tool chain.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))

MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))

def _env_int(name: str, default: int, *, minimum: int = 0) -> int:

try:

value = int(os.environ.get(name, str(default)))

except (TypeError, ValueError):

return default

return max(minimum, value)

MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0)

MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/gradata/hooks/context_inject.py` around lines 18 - 19, The module-level env parsing for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN can raise ValueError during import; change it to a fail-safe parse that never throws: replace the direct int(os.environ.get(...)) usage with a small safe parser used for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN (e.g., a helper like parse_env_int or inline try/except) that returns the default when the env value is missing or malformed and logs nothing so startup is not blocked; update references to MIN_MESSAGE_LEN and MAX_CONTEXT_LEN only (the constants named in the diff) to use the safely parsed values.

coderabbitai · 2026-04-17T10:40:23Z

+MAX_RULES = int(os.environ.get("GRADATA_MAX_RULES", "10"))
+MIN_CONFIDENCE = float(os.environ.get("GRADATA_MIN_CONFIDENCE", "0.60"))
+# Meta-rules are high-level principles — separate cap from MAX_RULES.
+MAX_META_RULES = int(os.environ.get("GRADATA_MAX_META_RULES", "5"))


⚠️ Potential issue | 🟠 Major

Don't parse hook env knobs at import time.

A malformed GRADATA_MAX_RULES, GRADATA_MAX_META_RULES, or GRADATA_MIN_CONFIDENCE now raises during module import and disables SessionStart entirely. MIN_CONFIDENCE also flows through without any [0,1] clamping. These should be parsed inside main() with silent fallback.

As per coding guidelines, "errors must be silent (never block the tool chain)" and "Confidence values must be in [0.0, 1.0]".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/gradata/hooks/inject_brain_rules.py` around lines 47 - 50, Currently MAX_RULES, MAX_META_RULES and MIN_CONFIDENCE are parsed at import time which can raise and break SessionStart; move the parsing of os.environ for GRADATA_MAX_RULES, GRADATA_MAX_META_RULES and GRADATA_MIN_CONFIDENCE into the module's main() (or the hook entrypoint) so parsing happens at runtime, handle any ValueError/TypeError by falling back silently to the existing defaults (10, 5, 0.60) and log/debug if needed but do not raise, and clamp MIN_CONFIDENCE into the [0.0, 1.0] range before use (e.g., min(max(parsed_value, 0.0), 1.0)); update references in the code to use the local variables (MAX_RULES, MAX_META_RULES, MIN_CONFIDENCE) obtained inside main()/inject_brain_rules entry so import-time evaluation is removed.

coderabbitai · 2026-04-17T10:40:23Z

+    # Total <brain-rules> entries = cluster_lines + individual_lines.
+    # Enforce MAX_RULES here (after mutex) so freed slots get refilled from
+    # the overshoot pool, and the final block still respects the budget.
+    render_budget = max(0, MAX_RULES - len(cluster_lines))


⚠️ Potential issue | 🟠 Major

MAX_RULES is still exceeded when multiple clusters qualify.

This budget only caps individual_lines. If len(cluster_lines) > MAX_RULES, the final cluster_lines + individual_lines payload still exceeds the configured limit, so the new overshoot path can inject more entries than operators asked for.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/gradata/hooks/inject_brain_rules.py` around lines 301 - 304, The current logic only limits individual_lines via render_budget and allows cluster_lines to push the total over MAX_RULES; fix by enforcing the MAX_RULES cap on the combined payload: either truncate cluster_lines to at most MAX_RULES (and set render_budget = 0) or, after computing cluster_lines + individual_lines, trim the concatenated list to MAX_RULES before returning/using it. Update the logic around render_budget, cluster_lines, individual_lines, and MAX_RULES (inside the same mutex-protected section) so the final sent/returned list can never exceed MAX_RULES.

coderabbitai · 2026-04-17T10:40:23Z

 }

-MAX_REMINDERS = 5
+MAX_REMINDERS = int(os.environ.get("GRADATA_MAX_REMINDERS", "5"))


⚠️ Potential issue | 🟠 Major

Guard GRADATA_MAX_REMINDERS parsing to avoid import-time crashes.

Line 39 can raise ValueError on invalid env input and fail hook loading before main() can return safely. Also clamp to non-negative to avoid unexpected slicing behavior.

Proposed fix

-MAX_REMINDERS = int(os.environ.get("GRADATA_MAX_REMINDERS", "5")) +def _env_int(name: str, default: int, *, minimum: int = 0) -> int: + try: + value = int(os.environ.get(name, str(default))) + except (TypeError, ValueError): + return default + return max(minimum, value) + + +MAX_REMINDERS = _env_int("GRADATA_MAX_REMINDERS", 5, minimum=0)

As per coding guidelines, src/gradata/hooks/** requires that errors are silent and never block the tool chain.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/gradata/hooks/rule_enforcement.py` at line 39, Protect MAX_REMINDERS import-time parsing by wrapping the int(os.environ.get("GRADATA_MAX_REMINDERS", "5")) conversion in a safe block: try to parse the env var into an int, on ValueError or other exceptions fall back silently to the default (5) and ensure the final value is clamped to non-negative with max(0, parsed_value); update the symbol MAX_REMINDERS in rule_enforcement.py accordingly (no exceptions should propagate during import) and optionally emit a non-blocking debug/warn via the module logger if available.

coderabbitai · 2026-04-17T10:40:23Z

    with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
         patch("gradata.brain.Brain", return_value=mock_brain):
-            result = context_main({"message": "How do I set up the pipeline for new prospects?"})
+            result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})



⚠️ Potential issue | 🟠 Major

Pin context-inject env knobs in this test to avoid host-env flakiness.

This test currently relies on external defaults. If GRADATA_CONTEXT_INJECT or GRADATA_MIN_MESSAGE_LEN is set in the runner environment, it can fail nondeterministically.

Proposed fix

- with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \ + with patch.dict( + os.environ, + { + "GRADATA_BRAIN_DIR": str(tmp_path), + "GRADATA_CONTEXT_INJECT": "1", + "GRADATA_MIN_MESSAGE_LEN": "100", + }, + ), \ patch("gradata.brain.Brain", return_value=mock_brain):

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \

patch("gradata.brain.Brain", return_value=mock_brain):

result = context_main({"message": "How do I set up the pipeline for new prospects?"})

result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})

with patch.dict(

os.environ,

{

"GRADATA_BRAIN_DIR": str(tmp_path),

"GRADATA_CONTEXT_INJECT": "1",

"GRADATA_MIN_MESSAGE_LEN": "100",

},

), \

patch("gradata.brain.Brain", return_value=mock_brain):

result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/test_hooks_intelligence.py` around lines 182 - 185, The test relies on environment defaults and can be flaky; update the patched environment in tests/test_hooks_intelligence.py so patch.dict includes deterministic values for GRADATA_CONTEXT_INJECT and GRADATA_MIN_MESSAGE_LEN (e.g., "0" or other known safe values) alongside GRADATA_BRAIN_DIR, ensuring the call to context_main({"message": ...}) runs the same regardless of runner environment; modify the existing patch.dict invocation near the context_main call to add these two keys so the test no longer depends on host env.

- ruff UP037: drop quoted "Lesson" from jit_inject _bm25_scores_for_draft annotation (from __future__ import annotations is already present) - ruff I001: organize imports in _graduation.py - harden GRADATA_BETA_LB_THRESHOLD parsing: guard NaN/inf, clamp to [0,1], catch TypeError alongside ValueError; GRADATA_BETA_LB_MIN_FIRES floored at 0 - test: assert exact len and category in Jaccard-fallback test instead of bare truthiness Tests: 34 jit + 88 graduation/wiring pass. Lint clean. Co-Authored-By: Gradata <noreply@gradata.ai>

greptile-apps

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/gradata/hooks/jit_inject.py (1)

173-177: ⚠️ Potential issue | 🟠 Major

Clamp and validate confidence before filtering and scoring.

Line 175 trusts raw lesson.confidence without type/finiteness/range checks. Non-numeric/NaN/inf values can break comparisons or leak invalid ranking weights, and out-of-range values violate the SDK confidence contract.

Suggested fix

@@
 import json
 import logging
+import math
 import os
@@
 def rank_rules_for_draft(
@@
-    candidates: list[tuple[Lesson, str, str, float]] = []
+    try:
+        min_confidence = float(min_confidence)
+    except (TypeError, ValueError):
+        min_confidence = DEFAULT_MIN_CONFIDENCE
+    if not math.isfinite(min_confidence):
+        min_confidence = DEFAULT_MIN_CONFIDENCE
+    min_confidence = min(1.0, max(0.0, min_confidence))
+
+    candidates: list[tuple[Lesson, str, str, float]] = []
     for lesson in lessons:
-        conf = getattr(lesson, "confidence", 0.0)
+        raw_conf = getattr(lesson, "confidence", 0.0)
+        try:
+            conf = float(raw_conf)
+        except (TypeError, ValueError):
+            continue
+        if not math.isfinite(conf):
+            continue
+        conf = min(1.0, max(0.0, conf))
         if conf < min_confidence:
             continue

As per coding guidelines, "Confidence values must be in [0.0, 1.0]."

Also applies to: 194-205

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/gradata/hooks/jit_inject.py` around lines 173 - 177, Validate and clamp
lesson confidence before using it: when you read confidence via conf =
getattr(lesson, "confidence", 0.0) (in the loop over lessons) coerce non-numeric
values to 0.0, ensure the value is finite, convert to float, and clamp into the
[0.0, 1.0] range before applying the min_confidence filter and before any
scoring logic; apply the same validation/clamping in the later scoring block
(the candidate scoring section around lines 194-205) so all comparisons and
ranking weights use a normalized confidence value for lessons, and treat
invalid/NaN/inf as 0.0.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/gradata/hooks/jit_inject.py`:
- Around line 173-177: Validate and clamp lesson confidence before using it:
when you read confidence via conf = getattr(lesson, "confidence", 0.0) (in the
loop over lessons) coerce non-numeric values to 0.0, ensure the value is finite,
convert to float, and clamp into the [0.0, 1.0] range before applying the
min_confidence filter and before any scoring logic; apply the same
validation/clamping in the later scoring block (the candidate scoring section
around lines 194-205) so all comparisons and ranking weights use a normalized
confidence value for lessons, and treat invalid/NaN/inf as 0.0.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a52c044d-8d67-41c4-b71e-1e6c111bd475

📥 Commits

Reviewing files that changed from the base of the PR and between 5bf3acf and 3ff7a21.

📒 Files selected for processing (3)

src/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/jit_inject.py
tests/test_jit_inject.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (3)

tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

tests/test_jit_inject.py

src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

src/gradata/hooks/jit_inject.py
src/gradata/enhancements/self_improvement/_graduation.py

src/gradata/hooks/**

⚙️ CodeRabbit configuration file

src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).

Files:

src/gradata/hooks/jit_inject.py

🔇 Additional comments (4)

src/gradata/enhancements/self_improvement/_graduation.py (2)

45-46: Docstring default is aligned with runtime behavior.

Line 45 now reflects the 0.85 default threshold used by the gate logic.

59-71: Beta gate env parsing/validation hardening is solid.

Lines 62-71 correctly handle non-finite input, clamp threshold to [0.0, 1.0], catch TypeError/ValueError, and floor min_fires at 0.

src/gradata/hooks/jit_inject.py (1)

111-147: BM25 path degrades safely without breaking the hook contract.

Good defensive shape here: Line 120 short-circuits when BM25 is unavailable, and Lines 135-137 fail closed to Jaccard fallback instead of blocking execution.

tests/test_jit_inject.py (1)

129-141: New BM25/fallback tests are well-targeted and deterministic.

These cover both branches clearly and assert concrete outcomes (top category and exact fallback result size), which tightens regression detection.

Also applies to: 142-151

Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109), Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101), gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66 ruff-violation fixes (#100, #103). Co-Authored-By: Gradata <noreply@gradata.ai>

* docs: v0.6.1 changelog entry covering post-v0.6.0 work through PR #114 Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109), Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101), gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66 ruff-violation fixes (#100, #103). Co-Authored-By: Gradata <noreply@gradata.ai> * docs(changelog): qualify env-centralization claim CodeRabbit flagged that GRADATA_RULE_VERIFIER is still read directly in rule_pipeline.py. Weaken the claim rather than block on the migration. Co-Authored-By: Gradata <noreply@gradata.ai> --------- Co-authored-by: Gradata <noreply@gradata.ai>

Gradata and others added 5 commits April 16, 2026 23:57

greptile-apps Bot reviewed Apr 17, 2026

View reviewed changes

coderabbitai Bot added the feature label Apr 17, 2026

coderabbitai Bot requested changes Apr 17, 2026

View reviewed changes

greptile-apps Bot reviewed Apr 17, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 17, 2026

View reviewed changes

Gradata merged commit cedd97a into main Apr 17, 2026
4 of 17 checks passed

Gradata mentioned this pull request Apr 17, 2026

chore: fix 67 ruff violations across SDK (unblock CI) #103

Merged

3 tasks

coderabbitai Bot mentioned this pull request Apr 17, 2026

feat: multi-tenant SDK + cloud alignment (tenant_id, visibility, clusters, sync_state) #102

Merged

6 tasks

Gradata deleted the feat/jit-bm25-and-beta-lb branch April 17, 2026 19:50

This was referenced Apr 20, 2026

fix(tests): clean up Sprites-side tests from Gradata/tests/ #122

Merged

feat(hooks): kill switches, tacit signals, cloud URL fix, emit refactor #134

Merged

perf(tokens): autoresearch loop 99.2% reduction (5513→42) #136

Merged

		MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
		MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))

-MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
-MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+    try:
+        value = int(os.environ.get(name, str(default)))
+    except (TypeError, ValueError):
+        return default
+    return max(minimum, value)
+MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0)
+MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)

Conversation

Gradata commented Apr 17, 2026

Summary

Test plan

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Apr 17, 2026 •

edited

Loading