feat(jit,graduation): BM25 for JIT ranking + raise Beta LB default to 0.85#101
…ompts
Two measured waste eliminations:
1. inject_brain_rules: mandatory_reminder block restated every mandatory
rule verbatim in a recency tag. mandatory_block already places them in
primacy position — the reminder was defensive redundancy costing
~50-200 tok/session per mandatory rule.
2. context_inject: MIN_MESSAGE_LEN raised 10 → 60 to skip brain FTS
search on trivial follow-ups ("ok", "yes", "continue"). Previous
threshold only filtered 1-9 char messages, causing ~400 tok of
wasted context per ack-style reply.
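The length gate in item 2 can be sketched as follows. This is a minimal illustration, not the hook's actual code: the helper name `should_search_brain` is hypothetical, though the `GRADATA_MIN_MESSAGE_LEN` env knob appears elsewhere in this PR.

```python
import os

# Illustrative ack-filter: skip the brain FTS search for trivial follow-ups.
# The 60-char default mirrors this commit's 10 -> 60 change.
MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "60"))


def should_search_brain(message: str) -> bool:
    """Return False for ack-style replies like 'ok', 'yes', 'continue'."""
    return len(message.strip()) >= MIN_MESSAGE_LEN
```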
Council audit (adversarial, synthesized from Skeptic + Architect +
Pragmatist) found my earlier measurement methodology used synthetic
inputs that masked real per-operation costs. Real heavy session
(10 prompts, 30 edits, 5 agents) accumulates ~9.6K gradata tokens;
context_inject alone can contribute 3.5K when prompts are substantive.
Tests: 2884 passed, 1 preexisting cp1252 Windows unicode failure
in test_rule_to_hook (unrelated to injection).
Co-Authored-By: Gradata <noreply@gradata.ai>
…imits

Rule_enforcement was re-injecting SessionStart rules on every Write/Edit, costing ~165 tok/edit for zero added primacy value. Default it off behind GRADATA_RULE_ENFORCEMENT=1 opt-in for ablation.

Also add env knobs for tuning injection volume without redeploying:
- context_inject: GRADATA_CONTEXT_INJECT kill switch + MIN_MESSAGE_LEN raised 60 -> 100 so ack-style replies skip FTS cost
- inject_brain_rules: GRADATA_MAX_RULES, GRADATA_MIN_CONFIDENCE, GRADATA_MAX_META_RULES for tuning SessionStart payload

Tests updated to opt into rule_enforcement where they exercise the hook and to use a >=100 char message for context_inject.

Co-Authored-By: Gradata <noreply@gradata.ai>
Telemetry showed 10 SessionStart events in a long session re-injected 1.9KB each (~3.7k tokens total) — duplicative because the compact summary already carries rules from the prior session, and the new session's primacy slot is consumed by that summary.

Skip inject_brain_rules when source is 'compact' or 'resume'. Opt back in with GRADATA_INJECT_ON_COMPACT=1 for ablation.

Co-Authored-By: Gradata <noreply@gradata.ai>
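The source gate above can be sketched as a small predicate. A hedged sketch only: the function name and payload shape are illustrative, while the `source` values and the `GRADATA_INJECT_ON_COMPACT` opt-in come from the commit message.

```python
import os


def should_inject_rules(payload: dict) -> bool:
    """Skip SessionStart rule injection for compact/resume sessions, where the
    compact summary already carries the prior session's rules."""
    source = payload.get("source", "startup")
    if source in {"compact", "resume"}:
        # Opt back in for ablation runs.
        return os.environ.get("GRADATA_INJECT_ON_COMPACT") == "1"
    return True
```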
…d metas
When a meta-rule is injected, its source_lesson_ids are added to a mutex
set; individual RULE/PATTERN leaves whose _lesson_id() is in that set are
skipped so the principle isn't double-injected as both the meta and its
concrete leaves.
Freed slots get refilled: rank_rules is now called with a 3x overshoot
so the render loop has candidates beyond the MAX_RULES cap; the cap is
enforced at render time (cluster_lines + individual_lines <= MAX_RULES)
rather than at rank time.
On current brain: 1822B -> 1319B per SessionStart (27% smaller), and
leaf categories diversify from {LEADS, DEMO_PREP, DRAFTING} to
{DRAFTING, TONE, DATA_INTEGRITY, ARCHITECTURE, CODE} while the LEADS +
DEMO_PREP principles still land via metas.
Opt-out: GRADATA_META_RULE_MUTEX=0 for ablation. Mirrors the existing
cluster-vs-meta mutex pattern (meta_covered_categories).
Co-Authored-By: Gradata <noreply@gradata.ai>
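The mutex-plus-refill flow described above can be sketched as below. Assumptions are flagged: the function name `render_rules` and the dict shapes are hypothetical, while the mutex set, the render-time cap, and the MAX_RULES default of 10 come from this PR.

```python
# Sketch of the meta-rule mutex: a meta-rule suppresses its source lessons so
# the same principle is not injected twice; ranking overshoots (3x in the PR)
# so slots freed by the mutex refill from lower-ranked leaves.
MAX_RULES = 10


def render_rules(meta_rules: list[dict], ranked_leaves: list[dict]) -> list[str]:
    covered: set[str] = set()  # lesson ids already carried by an injected meta
    cluster_lines: list[str] = []
    for meta in meta_rules:
        cluster_lines.append(meta["text"])
        covered.update(meta["source_lesson_ids"])

    # Cap enforced at render time: cluster_lines + individual_lines <= MAX_RULES.
    render_budget = max(0, MAX_RULES - len(cluster_lines))
    individual_lines: list[str] = []
    for leaf in ranked_leaves:  # ranked with overshoot beyond MAX_RULES
        if len(individual_lines) >= render_budget:
            break
        if leaf["lesson_id"] in covered:
            continue  # principle already lands via the meta
        individual_lines.append(leaf["text"])
    return cluster_lines + individual_lines
```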
… 0.85

JIT injection (src/gradata/hooks/jit_inject.py) now uses BM25 via the optional bm25s dep when installed, with Jaccard as a graceful fallback. Matches the pattern already used in rules/rule_ranker.py. BM25 weights rare terms higher than generic ones, which is exactly what we want when matching a fresh draft against a corpus of rule descriptions.

Beta LB gate default threshold raised from 0.70 -> 0.85 per S105 recommendation — S105 ablation showed the 0.70 floor still lets through format-based false graduations. Gate remains opt-in (GRADATA_BETA_LB_GATE off by default), so this is a no-op for anyone who hasn't enabled it.

Tests: all 34 jit_inject tests pass (32 existing + 2 new covering BM25 path and Jaccard fallback). Graduation + wiring tests: 88/88 pass.

Co-Authored-By: Gradata <noreply@gradata.ai>
📝 Walkthrough
Breaking changes:
No security fixes or new public API methods were introduced.

Walkthrough: Adds environment-configurable feature gates and thresholds across multiple hooks, integrates BM25-first ranking with Jaccard fallback for JIT rule selection, extends meta-rule mutexing to suppress by lesson IDs, tightens parsing/clamping of beta LB threshold/min-fires, and removes the …

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
🧪 Generate unit tests (beta)
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/gradata/hooks/rule_enforcement.py (1)
97-101: ⚠️ Potential issue | 🟠 Major

Fix CI-blocking Ruff SIM103 in domain check.
The conditional pattern at line 98 triggers the SIM103 lint failure. Simplify the boolean return logic as shown below.
Proposed fix
- decl_domain = scope.get("domain")
- if decl_domain and file_domain and decl_domain != file_domain:
-     return False
+ decl_domain = scope.get("domain")
+ if decl_domain and file_domain:
+     return decl_domain == file_domain

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/hooks/rule_enforcement.py` around lines 97 - 101, The current domain mismatch check uses an if/return pattern that triggers Ruff SIM103; replace the conditional block that uses decl_domain = scope.get("domain") and file_domain with a single boolean return expression such as: return decl_domain is None or file_domain is None or decl_domain == file_domain (or equivalently return not (decl_domain and file_domain and decl_domain != file_domain)) to simplify the logic in the domain check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/enhancements/self_improvement/_graduation.py`:
- Around line 60-64: Parse and validate the env vars rather than blindly
casting: read GRADATA_BETA_LB_THRESHOLD into threshold with float(...), reject
NaN/inf and clamp into [0.0, 1.0] (fallback to 0.85 on invalid input), and read
GRADATA_BETA_LB_MIN_FIRES into min_fires with int(...), treating non-numeric
values as the default 5 and enforcing a non-negative minimum (e.g., min_fires =
max(1, parsed) or at least max(0, parsed) depending on intended floor) so
negative values can't bypass the observation floor; use math.isnan/
math.isfinite checks and handle exceptions around float/int conversion, updating
the variables threshold and min_fires in the _graduation.py block.
In `@src/gradata/hooks/context_inject.py`:
- Around line 18-19: The module-level env parsing for MIN_MESSAGE_LEN and
MAX_CONTEXT_LEN can raise ValueError during import; change it to a fail-safe
parse that never throws: replace the direct int(os.environ.get(...)) usage with
a small safe parser used for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN (e.g., a helper
like parse_env_int or inline try/except) that returns the default when the env
value is missing or malformed and logs nothing so startup is not blocked; update
references to MIN_MESSAGE_LEN and MAX_CONTEXT_LEN only (the constants named in
the diff) to use the safely parsed values.
In `@src/gradata/hooks/inject_brain_rules.py`:
- Around line 47-50: Currently MAX_RULES, MAX_META_RULES and MIN_CONFIDENCE are
parsed at import time which can raise and break SessionStart; move the parsing
of os.environ for GRADATA_MAX_RULES, GRADATA_MAX_META_RULES and
GRADATA_MIN_CONFIDENCE into the module's main() (or the hook entrypoint) so
parsing happens at runtime, handle any ValueError/TypeError by falling back
silently to the existing defaults (10, 5, 0.60) and log/debug if needed but do
not raise, and clamp MIN_CONFIDENCE into the [0.0, 1.0] range before use (e.g.,
min(max(parsed_value, 0.0), 1.0)); update references in the code to use the
local variables (MAX_RULES, MAX_META_RULES, MIN_CONFIDENCE) obtained inside
main()/inject_brain_rules entry so import-time evaluation is removed.
- Around line 301-304: The current logic only limits individual_lines via
render_budget and allows cluster_lines to push the total over MAX_RULES; fix by
enforcing the MAX_RULES cap on the combined payload: either truncate
cluster_lines to at most MAX_RULES (and set render_budget = 0) or, after
computing cluster_lines + individual_lines, trim the concatenated list to
MAX_RULES before returning/using it. Update the logic around render_budget,
cluster_lines, individual_lines, and MAX_RULES (inside the same mutex-protected
section) so the final sent/returned list can never exceed MAX_RULES.
In `@src/gradata/hooks/jit_inject.py`:
- Around line 111-114: The type annotation for the parameter `candidates` in
`_bm25_scores_for_draft` uses a quoted forward reference `"Lesson"` which is
redundant with `from __future__ import annotations`; remove the quotes so the
annotation reads `list[tuple[Lesson, str, str]]` (and similarly drop quotes
anywhere else in this function signature using `"Lesson"`) to resolve Ruff UP037
and unblock CI.
In `@src/gradata/hooks/rule_enforcement.py`:
- Line 39: Protect MAX_REMINDERS import-time parsing by wrapping the
int(os.environ.get("GRADATA_MAX_REMINDERS", "5")) conversion in a safe block:
try to parse the env var into an int, on ValueError or other exceptions fall
back silently to the default (5) and ensure the final value is clamped to
non-negative with max(0, parsed_value); update the symbol MAX_REMINDERS in
rule_enforcement.py accordingly (no exceptions should propagate during import)
and optionally emit a non-blocking debug/warn via the module logger if
available.
In `@tests/test_hooks_intelligence.py`:
- Around line 182-185: The test relies on environment defaults and can be flaky;
update the patched environment in tests/test_hooks_intelligence.py so patch.dict
includes deterministic values for GRADATA_CONTEXT_INJECT and
GRADATA_MIN_MESSAGE_LEN (e.g., "0" or other known safe values) alongside
GRADATA_BRAIN_DIR, ensuring the call to context_main({"message": ...}) runs the
same regardless of runner environment; modify the existing patch.dict invocation
near the context_main call to add these two keys so the test no longer depends
on host env.
In `@tests/test_jit_inject.py`:
- Around line 142-150: The test test_falls_back_to_jaccard_when_bm25_unavailable
should assert exact expected output instead of truthiness: set _BM25_AVAILABLE
False and bm25s None as you already do, call rank_rules_for_draft(...) and
replace "assert ranked and ranked[0][0].category == 'X'" with a concrete
assertion such as asserting ranked is a list of expected length and that
ranked[0][0].category == "X" (e.g., assert len(ranked) >= 1 and
ranked[0][0].category == "X") so failures show the specific mismatch; use the
existing helpers _lesson and the ranked variable to verify the exact category
and position.
---
Outside diff comments:
In `@src/gradata/hooks/rule_enforcement.py`:
- Around line 97-101: The current domain mismatch check uses an if/return
pattern that triggers Ruff SIM103; replace the conditional block that uses
decl_domain = scope.get("domain") and file_domain with a single boolean return
expression such as: return decl_domain is None or file_domain is None or
decl_domain == file_domain (or equivalently return not (decl_domain and
file_domain and decl_domain != file_domain)) to simplify the logic in the domain
check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 79c4b69e-7fce-43a3-a448-fafe0c3d7674
📒 Files selected for processing (10)
src/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/context_inject.py
src/gradata/hooks/inject_brain_rules.py
src/gradata/hooks/jit_inject.py
src/gradata/hooks/rule_enforcement.py
tests/test_hooks_intelligence.py
tests/test_hooks_safety.py
tests/test_jit_inject.py
tests/test_rule_enforcement_scope.py
tests/test_self_healing_fix.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_hooks_intelligence.py
tests/test_self_healing_fix.py
tests/test_hooks_safety.py
tests/test_rule_enforcement_scope.py
tests/test_jit_inject.py
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/hooks/rule_enforcement.py
src/gradata/hooks/context_inject.py
src/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/jit_inject.py
src/gradata/hooks/inject_brain_rules.py
src/gradata/hooks/**
⚙️ CodeRabbit configuration file
src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).
Files:
src/gradata/hooks/rule_enforcement.py
src/gradata/hooks/context_inject.py
src/gradata/hooks/jit_inject.py
src/gradata/hooks/inject_brain_rules.py
🪛 GitHub Actions: SDK CI
src/gradata/hooks/rule_enforcement.py
[error] 98-100: ruff: SIM103 Return the negated condition directly
src/gradata/enhancements/self_improvement/_graduation.py
[error] 9-30: ruff: I001 Import block is un-sorted or un-formatted (organize imports)
src/gradata/hooks/jit_inject.py
[error] 112-112: ruff: UP037 Remove quotes from type annotation
🔇 Additional comments (2)
tests/test_rule_enforcement_scope.py (1)
13-17: Good isolation: module-wide opt-in fixture is the right approach. This keeps scope tests deterministic under the new default-off rule-enforcement behavior.
tests/test_hooks_safety.py (1)
107-107: Nice update: explicit env gating in tests matches runtime behavior. These changes correctly opt tests into rule enforcement and keep expectations aligned with the new default-off semantics.
Also applies to: 120-120, 129-129
MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))
Make env integer parsing fail-safe for hook startup.
Lines 18-19 can throw on malformed env values (GRADATA_MIN_MESSAGE_LEN, GRADATA_MAX_CONTEXT_LEN) during import, which prevents this hook from loading.
Proposed fix
-MIN_MESSAGE_LEN = int(os.environ.get("GRADATA_MIN_MESSAGE_LEN", "100"))
-MAX_CONTEXT_LEN = int(os.environ.get("GRADATA_MAX_CONTEXT_LEN", "2000"))
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+ try:
+ value = int(os.environ.get(name, str(default)))
+ except (TypeError, ValueError):
+ return default
+ return max(minimum, value)
+
+
+MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0)
+MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)

As per coding guidelines, src/gradata/hooks/** requires that errors are silent and never block the tool chain.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
    try:
        value = int(os.environ.get(name, str(default)))
    except (TypeError, ValueError):
        return default
    return max(minimum, value)


MIN_MESSAGE_LEN = _env_int("GRADATA_MIN_MESSAGE_LEN", 100, minimum=0)
MAX_CONTEXT_LEN = _env_int("GRADATA_MAX_CONTEXT_LEN", 2000, minimum=1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/hooks/context_inject.py` around lines 18 - 19, The module-level
env parsing for MIN_MESSAGE_LEN and MAX_CONTEXT_LEN can raise ValueError during
import; change it to a fail-safe parse that never throws: replace the direct
int(os.environ.get(...)) usage with a small safe parser used for MIN_MESSAGE_LEN
and MAX_CONTEXT_LEN (e.g., a helper like parse_env_int or inline try/except)
that returns the default when the env value is missing or malformed and logs
nothing so startup is not blocked; update references to MIN_MESSAGE_LEN and
MAX_CONTEXT_LEN only (the constants named in the diff) to use the safely parsed
values.
MAX_RULES = int(os.environ.get("GRADATA_MAX_RULES", "10"))
MIN_CONFIDENCE = float(os.environ.get("GRADATA_MIN_CONFIDENCE", "0.60"))
# Meta-rules are high-level principles — separate cap from MAX_RULES.
MAX_META_RULES = int(os.environ.get("GRADATA_MAX_META_RULES", "5"))
Don't parse hook env knobs at import time.
A malformed GRADATA_MAX_RULES, GRADATA_MAX_META_RULES, or GRADATA_MIN_CONFIDENCE now raises during module import and disables SessionStart entirely. MIN_CONFIDENCE also flows through without any [0,1] clamping. These should be parsed inside main() with silent fallback.
As per coding guidelines, "errors must be silent (never block the tool chain)" and "Confidence values must be in [0.0, 1.0]".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/hooks/inject_brain_rules.py` around lines 47 - 50, Currently
MAX_RULES, MAX_META_RULES and MIN_CONFIDENCE are parsed at import time which can
raise and break SessionStart; move the parsing of os.environ for
GRADATA_MAX_RULES, GRADATA_MAX_META_RULES and GRADATA_MIN_CONFIDENCE into the
module's main() (or the hook entrypoint) so parsing happens at runtime, handle
any ValueError/TypeError by falling back silently to the existing defaults (10,
5, 0.60) and log/debug if needed but do not raise, and clamp MIN_CONFIDENCE into
the [0.0, 1.0] range before use (e.g., min(max(parsed_value, 0.0), 1.0)); update
references in the code to use the local variables (MAX_RULES, MAX_META_RULES,
MIN_CONFIDENCE) obtained inside main()/inject_brain_rules entry so import-time
evaluation is removed.
# Total <brain-rules> entries = cluster_lines + individual_lines.
# Enforce MAX_RULES here (after mutex) so freed slots get refilled from
# the overshoot pool, and the final block still respects the budget.
render_budget = max(0, MAX_RULES - len(cluster_lines))
MAX_RULES is still exceeded when multiple clusters qualify.
This budget only caps individual_lines. If len(cluster_lines) > MAX_RULES, the final cluster_lines + individual_lines payload still exceeds the configured limit, so the new overshoot path can inject more entries than operators asked for.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/hooks/inject_brain_rules.py` around lines 301 - 304, The current
logic only limits individual_lines via render_budget and allows cluster_lines to
push the total over MAX_RULES; fix by enforcing the MAX_RULES cap on the
combined payload: either truncate cluster_lines to at most MAX_RULES (and set
render_budget = 0) or, after computing cluster_lines + individual_lines, trim
the concatenated list to MAX_RULES before returning/using it. Update the logic
around render_budget, cluster_lines, individual_lines, and MAX_RULES (inside the
same mutex-protected section) so the final sent/returned list can never exceed
MAX_RULES.
}

- MAX_REMINDERS = 5
+ MAX_REMINDERS = int(os.environ.get("GRADATA_MAX_REMINDERS", "5"))
Guard GRADATA_MAX_REMINDERS parsing to avoid import-time crashes.
Line 39 can raise ValueError on invalid env input and fail hook loading before main() can return safely. Also clamp to non-negative to avoid unexpected slicing behavior.
Proposed fix
-MAX_REMINDERS = int(os.environ.get("GRADATA_MAX_REMINDERS", "5"))
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+ try:
+ value = int(os.environ.get(name, str(default)))
+ except (TypeError, ValueError):
+ return default
+ return max(minimum, value)
+
+
+MAX_REMINDERS = _env_int("GRADATA_MAX_REMINDERS", 5, minimum=0)

As per coding guidelines, src/gradata/hooks/** requires that errors are silent and never block the tool chain.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/hooks/rule_enforcement.py` at line 39, Protect MAX_REMINDERS
import-time parsing by wrapping the int(os.environ.get("GRADATA_MAX_REMINDERS",
"5")) conversion in a safe block: try to parse the env var into an int, on
ValueError or other exceptions fall back silently to the default (5) and ensure
the final value is clamped to non-negative with max(0, parsed_value); update the
symbol MAX_REMINDERS in rule_enforcement.py accordingly (no exceptions should
propagate during import) and optionally emit a non-blocking debug/warn via the
module logger if available.
  with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
       patch("gradata.brain.Brain", return_value=mock_brain):
-     result = context_main({"message": "How do I set up the pipeline for new prospects?"})
+     result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})
Pin context-inject env knobs in this test to avoid host-env flakiness.
This test currently relies on external defaults. If GRADATA_CONTEXT_INJECT or GRADATA_MIN_MESSAGE_LEN is set in the runner environment, it can fail nondeterministically.
Proposed fix
- with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
+ with patch.dict(
+ os.environ,
+ {
+ "GRADATA_BRAIN_DIR": str(tmp_path),
+ "GRADATA_CONTEXT_INJECT": "1",
+ "GRADATA_MIN_MESSAGE_LEN": "100",
+ },
+ ), \
       patch("gradata.brain.Brain", return_value=mock_brain):

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
with patch.dict(
    os.environ,
    {
        "GRADATA_BRAIN_DIR": str(tmp_path),
        "GRADATA_CONTEXT_INJECT": "1",
        "GRADATA_MIN_MESSAGE_LEN": "100",
    },
), \
    patch("gradata.brain.Brain", return_value=mock_brain):
    result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_hooks_intelligence.py` around lines 182 - 185, The test relies on
environment defaults and can be flaky; update the patched environment in
tests/test_hooks_intelligence.py so patch.dict includes deterministic values for
GRADATA_CONTEXT_INJECT and GRADATA_MIN_MESSAGE_LEN (e.g., "0" or other known
safe values) alongside GRADATA_BRAIN_DIR, ensuring the call to
context_main({"message": ...}) runs the same regardless of runner environment;
modify the existing patch.dict invocation near the context_main call to add
these two keys so the test no longer depends on host env.
- ruff UP037: drop quoted "Lesson" from jit_inject _bm25_scores_for_draft annotation (from __future__ import annotations is already present)
- ruff I001: organize imports in _graduation.py
- harden GRADATA_BETA_LB_THRESHOLD parsing: guard NaN/inf, clamp to [0,1], catch TypeError alongside ValueError; GRADATA_BETA_LB_MIN_FIRES floored at 0
- test: assert exact len and category in Jaccard-fallback test instead of bare truthiness

Tests: 34 jit + 88 graduation/wiring pass. Lint clean.

Co-Authored-By: Gradata <noreply@gradata.ai>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/gradata/hooks/jit_inject.py (1)
173-177: ⚠️ Potential issue | 🟠 Major

Clamp and validate confidence before filtering and scoring. Line 175 trusts raw lesson.confidence without type/finiteness/range checks. Non-numeric/NaN/inf values can break comparisons or leak invalid ranking weights, and out-of-range values violate the SDK confidence contract.

Suggested fix

@@
 import json
 import logging
+import math
 import os
@@ def rank_rules_for_draft(
@@
-    candidates: list[tuple[Lesson, str, str, float]] = []
+    try:
+        min_confidence = float(min_confidence)
+    except (TypeError, ValueError):
+        min_confidence = DEFAULT_MIN_CONFIDENCE
+    if not math.isfinite(min_confidence):
+        min_confidence = DEFAULT_MIN_CONFIDENCE
+    min_confidence = min(1.0, max(0.0, min_confidence))
+
+    candidates: list[tuple[Lesson, str, str, float]] = []
     for lesson in lessons:
-        conf = getattr(lesson, "confidence", 0.0)
+        raw_conf = getattr(lesson, "confidence", 0.0)
+        try:
+            conf = float(raw_conf)
+        except (TypeError, ValueError):
+            continue
+        if not math.isfinite(conf):
+            continue
+        conf = min(1.0, max(0.0, conf))
         if conf < min_confidence:
             continue

As per coding guidelines, "Confidence values must be in [0.0, 1.0]."
Also applies to: 194-205
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/hooks/jit_inject.py` around lines 173 - 177, Validate and clamp lesson confidence before using it: when you read confidence via conf = getattr(lesson, "confidence", 0.0) (in the loop over lessons) coerce non-numeric values to 0.0, ensure the value is finite, convert to float, and clamp into the [0.0, 1.0] range before applying the min_confidence filter and before any scoring logic; apply the same validation/clamping in the later scoring block (the candidate scoring section around lines 194-205) so all comparisons and ranking weights use a normalized confidence value for lessons, and treat invalid/NaN/inf as 0.0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/gradata/hooks/jit_inject.py`:
- Around line 173-177: Validate and clamp lesson confidence before using it:
when you read confidence via conf = getattr(lesson, "confidence", 0.0) (in the
loop over lessons) coerce non-numeric values to 0.0, ensure the value is finite,
convert to float, and clamp into the [0.0, 1.0] range before applying the
min_confidence filter and before any scoring logic; apply the same
validation/clamping in the later scoring block (the candidate scoring section
around lines 194-205) so all comparisons and ranking weights use a normalized
confidence value for lessons, and treat invalid/NaN/inf as 0.0.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a52c044d-8d67-41c4-b71e-1e6c111bd475
📒 Files selected for processing (3)
src/gradata/enhancements/self_improvement/_graduation.pysrc/gradata/hooks/jit_inject.pytests/test_jit_inject.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_jit_inject.py
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/hooks/jit_inject.pysrc/gradata/enhancements/self_improvement/_graduation.py
src/gradata/hooks/**
⚙️ CodeRabbit configuration file
src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).
Files:
src/gradata/hooks/jit_inject.py
🔇 Additional comments (4)
src/gradata/enhancements/self_improvement/_graduation.py (2)
45-46: Docstring default is aligned with runtime behavior. Line 45 now reflects the 0.85 default threshold used by the gate logic.
59-71: Beta gate env parsing/validation hardening is solid. Lines 62-71 correctly handle non-finite input, clamp threshold to [0.0, 1.0], catch TypeError/ValueError, and floor min_fires at 0.

src/gradata/hooks/jit_inject.py (1)
111-147: BM25 path degrades safely without breaking the hook contract. Good defensive shape here: Line 120 short-circuits when BM25 is unavailable, and Lines 135-137 fail closed to Jaccard fallback instead of blocking execution.
tests/test_jit_inject.py (1)
129-141: New BM25/fallback tests are well-targeted and deterministic. These cover both branches clearly and assert concrete outcomes (top category and exact fallback result size), which tightens regression detection.
Also applies to: 142-151
* docs: v0.6.1 changelog entry covering post-v0.6.0 work through PR #114

  Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109), Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101), gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66 ruff-violation fixes (#100, #103).

  Co-Authored-By: Gradata <noreply@gradata.ai>

* docs(changelog): qualify env-centralization claim

  CodeRabbit flagged that GRADATA_RULE_VERIFIER is still read directly in rule_pipeline.py. Weaken the claim rather than block on the migration.

  Co-Authored-By: Gradata <noreply@gradata.ai>

---------

Co-authored-by: Gradata <noreply@gradata.ai>
Summary
- BM25 for JIT ranking (src/gradata/hooks/jit_inject.py): replaces Jaccard with BM25 via the optional bm25s dep when installed; falls back to Jaccard so the SDK stays zero-required-deps. Same pattern as rules/rule_ranker.py. BM25 weights rare terms over generic ones — better signal for matching drafts to rule descriptions.
- Beta LB default raised to 0.85 (src/gradata/enhancements/self_improvement/_graduation.py): per S105 recommendation. Gate stays opt-in (GRADATA_BETA_LB_GATE off by default) so this is a no-op for anyone who hasn't enabled it.

Test plan

- pytest tests/test_jit_inject.py — 34/34 (32 existing + 2 new: BM25 path, Jaccard fallback)
- pytest tests/test_agent_graduation.py tests/test_graduation_notification.py tests/test_pattern_graduation_integration.py — 74 passed, 2 xfailed
- pytest tests/test_wiring_compound.py — 14/14

Generated with Gradata