feat: meta-rule discovery pipeline + behavioral extraction #19
Conversation
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough
Adds a deterministic archetype-based behavioral-instruction extractor (with optional LLM fallback), integrates it into brain.correct() as the primary extraction path with a legacy fallback, adds persistent correction-pattern storage/querying, and expands unit and E2E tests covering the correction → graduation → meta-rule workflow.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Client
    participant Brain as brain.correct()
    participant BehavioralExt as behavioral_extractor.extract_instruction()
    participant Archetype as detect_archetype()
    participant Gen as generate_instruction()
    participant LLM as llm_provider (optional)
    participant Legacy as extract_behavioral_instruction()
    Client->>Brain: correct(draft, final, primary, category)
    Brain->>BehavioralExt: extract_instruction(draft, final, classification, category, llm_provider)
    BehavioralExt->>Archetype: detect_archetype(draft, final, classification)
    Archetype-->>BehavioralExt: ArchetypeMatch
    BehavioralExt->>Gen: generate_instruction(match, category)
    Gen-->>BehavioralExt: instruction or None
    alt instruction missing or low confidence
        BehavioralExt->>LLM: extract(...) (optional)
        LLM-->>BehavioralExt: refined_instruction or error
    end
    alt instruction obtained
        BehavioralExt-->>Brain: instruction
        Brain->>Brain: desc = instruction
    else fallback
        Brain->>Legacy: extract_behavioral_instruction(diff, primary, cache)
        Legacy-->>Brain: legacy_desc or exception
        Brain->>Brain: desc = legacy_desc or primary.description
    end
    Brain-->>Client: corrected result (with desc)
```
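In code form, the fallback chain above amounts to roughly the following sketch (not the real `_core.py`; the function shape and the `diff_text` argument are assumptions for illustration):

```python
from gradata.enhancements.behavioral_extractor import extract_instruction
from gradata.enhancements.edit_classifier import extract_behavioral_instruction


def describe_correction(brain, draft: str, final: str, primary, category: str, diff_text: str) -> str:
    """Sketch of the two-stage description extraction shown in the diagram."""
    instruction = None
    try:
        # Stage 1: deterministic archetype-based extractor (may consult an optional LLM)
        instruction = extract_instruction(draft, final, primary, category=category)
    except Exception:
        instruction = None  # fall through to the legacy path
    if not instruction:
        # Stage 2: legacy keyword-template extractor over the word diff
        instruction = extract_behavioral_instruction(diff_text, primary, cache=brain._instruction_cache)
    # Final fallback: the classifier's own description
    return instruction or primary.description
```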
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_core_behavioral.py (1)
42-44: 🧹 Nitpick | 🔵 Trivial
Consider strengthening the fallback test assertion.
The assertion only checks for presence of state markers (`INSTINCT` or `PATTERN`), not whether a specific fallback description was used. Consider also verifying that the lesson description doesn't contain the mocked string (to confirm the fallback path was actually taken).
Suggested enhancement

```diff
 lessons_path = Path(d) / "lessons.md"
 if lessons_path.exists():
     content = lessons_path.read_text(encoding="utf-8")
     assert "INSTINCT" in content or "PATTERN" in content
+    # Verify fallback was used (mocked string should NOT appear)
+    assert "Use casual, direct tone" not in content
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_core_behavioral.py` around lines 42 - 44, The current test only asserts presence of a state marker; also assert that the lesson text used the fallback by ensuring the mocked description value is not present and/or that the description equals the expected fallback string: after reading lessons_path into content, keep the existing assert "INSTINCT" in content or "PATTERN" in content and add an assertion that the mocked description string used in the test mock (replace with the actual mocked value) is not in content (or alternatively assert content contains the known fallback phrase), referencing the variables lessons_path and content in tests/test_core_behavioral.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/enhancements/behavioral_extractor.py`:
- Line 314: The parameter `category` on generate_instruction(match:
ArchetypeMatch, category: str = "") is unused—remove the `category` parameter
from the function signature and any default, and then update all call sites that
pass a category (e.g., where generate_instruction(...) is invoked) to call
generate_instruction(match) instead; alternatively, if you intend to vary
templates by category, implement usage inside generate_instruction (e.g., branch
on the category string) and keep the signature—choose one approach and make the
corresponding change consistently across the codebase.
- Around line 416-426: The _try_llm_extract function currently swallows all
exceptions with a bare except; update it to catch Exception as e and log the
exception at debug (or debug + stack) before returning None so failures from
llm_provider.extract are visible; keep the existing flow that returns refined
only if _is_actionable(refined) and otherwise returns None. Target the
_try_llm_extract function and the llm_provider.extract call when adding the
logging; use the module logger (e.g., logger.debug or similar) to record the
exception and any contextual info (draft, final, classification) without
changing return behavior.
In `@src/gradata/enhancements/meta_rules_storage.py`:
- Around line 340-342: PATTERN_SEVERITY_WEIGHTS are aggregation multipliers
(range ~0.5–2.5) used for pattern graduation scoring and are intentionally on a
different scale than self_improvement.SEVERITY_WEIGHTS (which are clamped to
[0,1]); update the nearby comment or docstring in meta_rules_storage.py to
explicitly state that PATTERN_SEVERITY_WEIGHTS are multipliers/aggregation
weights (not per-item severity scores) and mention their numeric range and
relation to self_improvement.SEVERITY_WEIGHTS so future readers won’t expect
clamped [0,1] values.
- Around line 445-465: The SQLite connection opened via sqlite3.connect (conn)
can leak if conn.execute(...).fetchall() raises; wrap the DB usage in a safe
context by either using the connection as a context manager (with
sqlite3.connect(...) as conn:) or surrounding the execute/fetchall with
try/finally and calling conn.close() in finally; update the block that creates
conn, sets conn.row_factory, runs the SELECT (the execute(...).fetchall() call)
and closes conn to ensure conn.close() always runs even on exceptions.
- Around line 345-367: The pattern-tracking helpers (ensure_pattern_table,
upsert_correction_pattern, upsert_correction_patterns_batch,
query_graduation_candidates) are never called so extracted patterns aren’t
persisted; wire them into the correction pipeline by: ensure_pattern_table is
invoked during DB initialization, call extract_patterns from brain.correct() and
immediately persist results using upsert_correction_patterns_batch (or
upsert_correction_pattern for single entries) with the current session_id and
severity metadata, and in brain.end_session() call query_graduation_candidates
to fetch cross-session candidates and incorporate/persist any graduations into
meta-rules (or the existing meta-rule persistence flow); reference these
functions when adding the calls so the persistence happens whenever corrections
are made and at session teardown.
In `@tests/test_pipeline_e2e.py`:
- Around line 90-109: The test assumes discover_meta_rules(...) returns
meta-rules but that function is cloud-gated in the OSS build and may return an
empty list; update the test in test_meta_rule_discovery_from_related_corrections
to handle the gated case by either skipping the test when cloud-gating is active
(use pytest.skip with a clear message) or by asserting conditionally: call
discover_meta_rules(rule_lessons, ...), and if it returns an empty list treat
the test as skipped/expected (or assert len(metas) == 0 with an informative
message), otherwise run the existing assertions on the first meta (referencing
discover_meta_rules, Lesson, and LessonState to locate the relevant code).
- Around line 21-27: Remove the hardcoded Windows path by stopping insertion of
"C:/Users/..." into sys.path; instead only respect an optional
GRADATA_CLOUD_PATH environment variable or prefer the package import fallback.
Update the block that sets _cloud_path (created via os.environ.get) and the
sys.path.insert call so it only inserts when the env var is set and non-empty,
then attempt importing discover_meta_rules and merge_into_meta accordingly
(symbols: _cloud_path, sys.path.insert, discover_meta_rules, merge_into_meta).
Ensure the tests rely on the package import fallback (from
gradata.enhancements.meta_rules) when no env var is provided.
---
Outside diff comments:
In `@tests/test_core_behavioral.py`:
- Around line 42-44: The current test only asserts presence of a state marker;
also assert that the lesson text used the fallback by ensuring the mocked
description value is not present and/or that the description equals the expected
fallback string: after reading lessons_path into content, keep the existing
assert "INSTINCT" in content or "PATTERN" in content and add an assertion that
the mocked description string used in the test mock (replace with the actual
mocked value) is not in content (or alternatively assert content contains the
known fallback phrase), referencing the variables lessons_path and content in
tests/test_core_behavioral.py.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8e51391b-acd7-475c-a256-a29e1d38b150
📒 Files selected for processing (6)
src/gradata/_core.py, src/gradata/enhancements/behavioral_extractor.py, src/gradata/enhancements/meta_rules_storage.py, tests/test_convergence_gate.py, tests/test_core_behavioral.py, tests/test_pipeline_e2e.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_convergence_gate.py, tests/test_core_behavioral.py, tests/test_pipeline_e2e.py
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (`from __future__ import annotations` required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/_core.py, src/gradata/enhancements/meta_rules_storage.py, src/gradata/enhancements/behavioral_extractor.py
🪛 GitHub Actions: CI
tests/test_pipeline_e2e.py
[error] 102-102: pytest failed in TestPipelineE2E.test_meta_rule_discovery_from_related_corrections. AssertionError: 4 RULE-graduated PROCESS lessons should produce at least 1 meta-rule (assert len(metas) >= 1); got len(metas)=0.
🪛 Ruff (0.15.9)
tests/test_convergence_gate.py
[warning] 48-49: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
tests/test_pipeline_e2e.py
[error] 87-87: Ambiguous variable name: l
(E741)
[error] 209-209: Ambiguous variable name: l
(E741)
src/gradata/enhancements/behavioral_extractor.py
[warning] 54-54: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
[warning] 168-168: Too many return statements (14 > 6)
(PLR0911)
[warning] 168-168: Too many branches (15 > 12)
(PLR0912)
[warning] 196-196: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 239-239: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 278-278: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 314-314: Too many return statements (19 > 6)
(PLR0911)
[warning] 314-314: Too many branches (18 > 12)
(PLR0912)
[warning] 314-314: Unused function argument: category
(ARG001)
[error] 424-425: try-except-pass detected, consider logging the exception
(S110)
[warning] 424-424: Do not catch blind exception: Exception
(BLE001)
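The B905 findings above boil down to declaring each `zip()`'s truncation behavior explicitly; for example (variable and helper names assumed):

```python
# strict=False preserves today's silent truncation on unequal lengths;
# strict=True would raise ValueError and surface alignment bugs instead.
for draft_sent, final_sent in zip(draft_sentences, final_sentences, strict=False):
    compare_sentences(draft_sent, final_sent)
```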
🔇 Additional comments (7)
tests/test_pipeline_e2e.py (2)
86-88: Ambiguous variable name `l` is acceptable in list comprehensions.
The static analysis flags `l` as ambiguous (E741), but in context (`for l in lessons`) it's a common and readable pattern for "lesson". No change required, but consider `lesson` for clarity if refactoring.
Also applies to: 208-216
79-81: Good assertion pattern: checks specific severity values.
The assertion verifies the severity value is one of the expected discrete values rather than just checking truthiness.
src/gradata/enhancements/behavioral_extractor.py (2)
476-478: `extract_instruction` returns `None` when the category has no fallback.
If the category is not in `_GENERIC_FALLBACKS` (e.g., "UNKNOWN" or a typo), the function returns `None`. This is documented in the return type, but callers in `_core.py` handle this correctly via `behavioral_desc or primary.description`. Just noting this is intentional.
168-307: Archetype detection logic is comprehensive.
The 12-archetype taxonomy with sentence-level analysis and ordered resolution is well-designed. The confidence values are appropriately in [0.0, 1.0], and the fallback chain (prefix → hedging → constraint → action step → tone → length → reorder → factual → format → content removal → word replacement → fallback) covers the expected correction patterns.
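As a minimal sketch of that ordered resolution (these toy detectors stand in for the module's real twelve; names and thresholds are assumptions):

```python
from dataclasses import dataclass


@dataclass
class ArchetypeMatch:
    name: str
    confidence: float  # kept within [0.0, 1.0]


def _detect_hedging(draft: str, final: str) -> ArchetypeMatch | None:
    # Toy detector: hedging words present in the draft but removed in the final
    hedges = {"maybe", "perhaps", "possibly"}
    draft_has = any(h in draft.lower() for h in hedges)
    final_has = any(h in final.lower() for h in hedges)
    return ArchetypeMatch("hedging_removed", 0.8) if draft_has and not final_has else None


def _detect_length(draft: str, final: str) -> ArchetypeMatch | None:
    # Toy detector: final is dramatically shorter than the draft
    return ArchetypeMatch("shortened", 0.6) if len(final) < len(draft) * 0.5 else None


_DETECTORS = [_detect_hedging, _detect_length]  # the real module orders 12 of these


def detect_archetype(draft: str, final: str) -> ArchetypeMatch:
    """First detector that fires wins; list order encodes priority."""
    for detector in _DETECTORS:
        match = detector(draft, final)
        if match is not None:
            return match
    return ArchetypeMatch("fallback", 0.3)
```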
src/gradata/_core.py (1)
228-250: Clean integration of behavioral extractor with proper fallback chain.
The two-stage extraction sequence is well-implemented:
- Try the archetype-based `extract_instruction()` first
- Fall back to keyword templates via `extract_behavioral_instruction()` if step 1 returns falsy
- Final fallback to `primary.description`

The `isinstance` guard on `_instruction_cache` (line 243) correctly handles an uninitialized/None cache. Exception handling preserves the fallback to `primary.description`.
tests/test_convergence_gate.py (1)
48-55: Mock updated correctly to match new extraction flow.
The patch target now correctly reflects the updated `brain_correct()` flow where `extract_instruction` from `behavioral_extractor` is called first. The non-None return value ensures the primary extraction path is exercised without triggering fallbacks.
tests/test_core_behavioral.py (1)
15-18: Mocks correctly updated to test both extraction paths.
Both tests properly patch the new `extract_instruction` function:
- `test_correct_uses_behavioral_description`: returns a string → verifies it appears in lessons
- `test_correct_falls_back_to_old_description`: returns `None` → verifies fallback to the old description path

This provides good coverage of the two-stage extraction sequence.
Also applies to: 33-36
```python
# Template Generation
# ---------------------------------------------------------------------------


def generate_instruction(match: ArchetypeMatch, category: str = "") -> str:
```
Unused category parameter in generate_instruction().
The category parameter is declared but never used in the function body. Either remove it or use it (e.g., for category-specific template variations).
Proposed fix (remove unused parameter)

```diff
-def generate_instruction(match: ArchetypeMatch, category: str = "") -> str:
+def generate_instruction(match: ArchetypeMatch) -> str:
     """Generate an imperative behavioral instruction from an archetype match."""
```

And update the call site at line 461:

```diff
-    instruction = generate_instruction(match, category)
+    instruction = generate_instruction(match)
```

🧰 Tools
🪛 Ruff (0.15.9)
[warning] 314-314: Too many return statements (19 > 6)
(PLR0911)
[warning] 314-314: Too many branches (18 > 12)
(PLR0912)
[warning] 314-314: Unused function argument: category
(ARG001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/behavioral_extractor.py` at line 314, The parameter
`category` on generate_instruction(match: ArchetypeMatch, category: str = "") is
unused—remove the `category` parameter from the function signature and any
default, and then update all call sites that pass a category (e.g., where
generate_instruction(...) is invoked) to call generate_instruction(match)
instead; alternatively, if you intend to vary templates by category, implement
usage inside generate_instruction (e.g., branch on the category string) and keep
the signature—choose one approach and make the corresponding change consistently
across the codebase.
```python
def _try_llm_extract(llm_provider, draft: str, final: str, classification) -> str | None:
    """Try LLM extraction, return result or None on failure."""
    if llm_provider is None:
        return None
    try:
        refined = llm_provider.extract(draft, final, classification)
        if refined and _is_actionable(refined):
            return refined
    except Exception:
        pass
    return None
```
Silent exception swallowing loses diagnostic information.
The bare except Exception: pass hides all errors from LLM extraction, making debugging difficult when the LLM provider misbehaves. At minimum, log at debug level.
Proposed fix

```diff
+import logging
+
+_log = logging.getLogger("gradata")
+
 def _try_llm_extract(llm_provider, draft: str, final: str, classification) -> str | None:
     """Try LLM extraction, return result or None on failure."""
     if llm_provider is None:
         return None
     try:
         refined = llm_provider.extract(draft, final, classification)
         if refined and _is_actionable(refined):
             return refined
-    except Exception:
-        pass
+    except Exception as e:
+        _log.debug("LLM extraction failed: %s", e)
     return None
```

🧰 Tools
🪛 Ruff (0.15.9)
[error] 424-425: try-except-pass detected, consider logging the exception
(S110)
[warning] 424-424: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/behavioral_extractor.py` around lines 416 - 426, The
_try_llm_extract function currently swallows all exceptions with a bare except;
update it to catch Exception as e and log the exception at debug (or debug +
stack) before returning None so failures from llm_provider.extract are visible;
keep the existing flow that returns refined only if _is_actionable(refined) and
otherwise returns None. Target the _try_llm_extract function and the
llm_provider.extract call when adding the logging; use the module logger (e.g.,
logger.debug or similar) to record the exception and any contextual info (draft,
final, classification) without changing return behavior.
```python
def ensure_pattern_table(db_path: str | Path) -> None:
    """Create correction_patterns table if it doesn't exist."""
    conn = sqlite3.connect(str(db_path))
    try:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS correction_patterns (
                pattern_hash TEXT NOT NULL,
                category TEXT NOT NULL,
                representative_text TEXT NOT NULL,
                session_id INTEGER NOT NULL,
                severity TEXT DEFAULT 'minor',
                severity_weight REAL DEFAULT 1.0,
                created_at TEXT DEFAULT (datetime('now')),
                UNIQUE(pattern_hash, session_id)
            )
        """)
        conn.execute("""
            CREATE INDEX IF NOT EXISTS idx_patterns_hash
            ON correction_patterns(pattern_hash)
        """)
        conn.commit()
    finally:
        conn.close()
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for usages of the new pattern tracking functions outside of tests and the storage module itself
echo "=== Searching for upsert_correction_pattern calls ==="
rg -n "upsert_correction_pattern" --type py -g '!tests/*' -g '!*_storage.py'
echo "=== Searching for ensure_pattern_table calls ==="
rg -n "ensure_pattern_table" --type py -g '!tests/*' -g '!*_storage.py'
echo "=== Searching for query_graduation_candidates calls ==="
rg -n "query_graduation_candidates" --type py -g '!tests/*' -g '!*_storage.py'
Repository: Gradata/gradata
Length of output: 510
🏁 Script executed:
# First, let's understand the structure and see the actual function definitions
wc -l src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 109
🏁 Script executed:
# Read the storage module to understand the functions
sed -n '345,475p' src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 4547
🏁 Script executed:
# Check if these functions are exported from the module
rg -n "__all__" src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Search for any calls to these functions within the storage module itself
rg -n "ensure_pattern_table|upsert_correction_pattern|query_graduation_candidates" src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 231
🏁 Script executed:
# Check the brain.correct() and brain.end_session() implementations
fd -name "*.py" -path "*/gradata/*" | xargs grep -l "def correct\|def end_session" | grep -v test
Repository: Gradata/gradata
Length of output: 289
🏁 Script executed:
# Check the top of the storage module for imports and exports
head -50 src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 1440
🏁 Script executed:
# Find files that define brain.correct() and brain.end_session()
fd -name "*.py" -path "*/gradata/*" type f | head -20
Repository: Gradata/gradata
Length of output: 289
🏁 Script executed:
# Search for brain class and correct method
rg "def correct\(" src/gradata/ --type py -A 5
Repository: Gradata/gradata
Length of output: 1803
🏁 Script executed:
# Search for brain end_session
rg "def end_session\(" src/gradata/ --type py -A 5
Repository: Gradata/gradata
Length of output: 716
🏁 Script executed:
# Look at the test file mentioned in the review
cat -n src/gradata/tests/test_pipeline_e2e.py | grep -A 30 "test_correction_pattern_tracking"
Repository: Gradata/gradata
Length of output: 129
🏁 Script executed:
# Find test files with "pattern" in the name
fd ".*test.*\.py" src/gradata/tests/ | xargs grep -l "correction_pattern_tracking\|correction_patterns"
Repository: Gradata/gradata
Length of output: 166
🏁 Script executed:
# Read the brain.py file to see correct() implementation
wc -l src/gradata/brain.py
Repository: Gradata/gradata
Length of output: 84
🏁 Script executed:
# Read brain.correct() implementation
rg "def correct\(" src/gradata/brain.py -A 50
Repository: Gradata/gradata
Length of output: 2969
🏁 Script executed:
# Read brain.end_session() implementation
rg "def end_session\(" src/gradata/brain.py -A 40
Repository: Gradata/gradata
Length of output: 2340
🏁 Script executed:
# Check for TODO/FIXME comments in meta_rules_storage.py
rg "TODO|FIXME|XXX|HACK" src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Check the _core module which is imported
fd "_core.py" src/gradata/
Repository: Gradata/gradata
Length of output: 79
🏁 Script executed:
# Check _core.py for brain_correct implementation
rg "def brain_correct" src/gradata/_core.py -A 60
Repository: Gradata/gradata
Length of output: 2809
🏁 Script executed:
# Check _core.py for brain_end_session implementation
rg "def brain_end_session" src/gradata/_core.py -A 60
Repository: Gradata/gradata
Length of output: 2723
🏁 Script executed:
# Search for any usage of pattern tracking functions in _core.py
rg "ensure_pattern_table|upsert_correction_pattern|query_graduation_candidates" src/gradata/_core.py
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Find test files (check broader directory)
fd "test.*\.py" src/gradata/ | head -20
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Search for pattern tracking test
rg "test_correction_pattern_tracking" src/gradata/
Repository: Gradata/gradata
Length of output: 167
🏁 Script executed:
# Search all Python files for the test function name
rg "test_correction_pattern_tracking" src/
Repository: Gradata/gradata
Length of output: 167
🏁 Script executed:
# Check if there are test directories
find src -type d -name "*test*" 2>/dev/null
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Search for files with "pattern" and "test" in name
rg -l "pattern.*test|test.*pattern" src/ --type py
Repository: Gradata/gradata
Length of output: 167
🏁 Script executed:
# Check the full brain_correct to see if pattern tracking is there
rg "def brain_correct" src/gradata/_core.py -A 150 | tail -100
Repository: Gradata/gradata
Length of output: 4236
🏁 Script executed:
# Search for any pattern tracking in _core.py
rg "pattern" src/gradata/_core.py -i
Repository: Gradata/gradata
Length of output: 1628
🏁 Script executed:
# Look for imports of pattern functions in _core.py
rg "from.*meta_rules_storage|import.*ensure_pattern_table|import.*upsert_correction" src/gradata/_core.py
Repository: Gradata/gradata
Length of output: 159
🏁 Script executed:
# Look at the _core.py end_session more completely to check for pattern tracking
rg "def brain_end_session" src/gradata/_core.py -A 200 | tail -150
Repository: Gradata/gradata
Length of output: 8124
🏁 Script executed:
# Search for any PATTERN_SEVERITY_WEIGHTS constant which is referenced in pattern functions
rg "PATTERN_SEVERITY_WEIGHTS" src/gradata/
Repository: Gradata/gradata
Length of output: 552
🏁 Script executed:
# Check the entire meta_rules_storage.py for context about these functions
head -100 src/gradata/enhancements/meta_rules_storage.py | tail -50
Repository: Gradata/gradata
Length of output: 1765
🏁 Script executed:
# Check if there's a correction_patterns table referenced anywhere
rg "correction_patterns" src/gradata/ --type py
Repository: Gradata/gradata
Length of output: 1199
🏁 Script executed:
# Check git log for recent changes to meta_rules_storage.py
git log --oneline -20 src/gradata/enhancements/meta_rules_storage.py 2>/dev/null || echo "Git not available"
Repository: Gradata/gradata
Length of output: 125
🏁 Script executed:
# Verify the line ranges mentioned in the comment
sed -n '370,396p' src/gradata/enhancements/meta_rules_storage.py | head -20
Repository: Gradata/gradata
Length of output: 933
🏁 Script executed:
# Check for any test files that might reference pattern tracking
find src -name "*.py" -type f | xargs grep -l "correction_pattern_tracking\|correction_patterns"
Repository: Gradata/gradata
Length of output: 149
🏁 Script executed:
# Look for any PR or issue context in comments
rg "cross-session|pattern.*tracking|correction_patterns" src/gradata/enhancements/meta_rules_storage.py -B 2 -A 2
Repository: Gradata/gradata
Length of output: 1661
🏁 Script executed:
# Check for any TODO/FIXME comments about pattern tracking integration
rg "TODO|FIXME|TODO:|FIXME:|future|defer|deferred" src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 93
🏁 Script executed:
# Check the comments and docstrings around pattern functions for context
sed -n '340,475p' src/gradata/enhancements/meta_rules_storage.py | grep -A 2 -B 2 "def\|#"
Repository: Gradata/gradata
Length of output: 763
🏁 Script executed:
# Verify the exact line numbers for the functions mentioned
rg -n "^def ensure_pattern_table|^def upsert_correction_pattern|^def upsert_correction_patterns_batch|^def query_graduation_candidates" src/gradata/enhancements/meta_rules_storage.py
Repository: Gradata/gradata
Length of output: 231
Pattern tracking functions are not integrated into the correction pipeline.
These functions (ensure_pattern_table, upsert_correction_pattern, upsert_correction_patterns_batch, query_graduation_candidates) are defined and fully implemented, but are never called from brain.correct() or brain.end_session(). The PR mentions "cross-session pattern tracking," but the architecture shows these helpers are orphaned—corrections don't populate the correction_patterns table. In brain.correct(), extract_patterns() is called but results are not persisted; in brain.end_session(), only meta-rules are discovered and persisted.
Is this intentional (deferred to a future PR), or should these be wired into the pipeline now?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/meta_rules_storage.py` around lines 345 - 367, The
pattern-tracking helpers (ensure_pattern_table, upsert_correction_pattern,
upsert_correction_patterns_batch, query_graduation_candidates) are never called
so extracted patterns aren’t persisted; wire them into the correction pipeline
by: ensure_pattern_table is invoked during DB initialization, call
extract_patterns from brain.correct() and immediately persist results using
upsert_correction_patterns_batch (or upsert_correction_pattern for single
entries) with the current session_id and severity metadata, and in
brain.end_session() call query_graduation_candidates to fetch cross-session
candidates and incorporate/persist any graduations into meta-rules (or the
existing meta-rule persistence flow); reference these functions when adding the
calls so the persistence happens whenever corrections are made and at session
teardown.
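Concretely, the wiring that prompt describes might look like this sketch (attribute names such as `brain.db_path` and `brain.session_id`, the batch-upsert signature, and the `persist_meta_rule` hook are assumptions, not the SDK's real API):

```python
from gradata.enhancements.meta_rules_storage import (
    ensure_pattern_table,
    query_graduation_candidates,
    upsert_correction_patterns_batch,
)


def persist_correction_patterns(brain, patterns) -> None:
    """Call from brain.correct() right after extract_patterns(...)."""
    ensure_pattern_table(brain.db_path)
    upsert_correction_patterns_batch(brain.db_path, patterns, session_id=brain.session_id)


def graduate_cross_session_patterns(brain) -> None:
    """Call from brain.end_session() to fold repeated patterns into meta-rules."""
    for candidate in query_graduation_candidates(brain.db_path, min_sessions=2, min_score=3.0):
        brain.persist_meta_rule(candidate)  # hypothetical hook into the existing meta-rule flow
```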
```python
conn = sqlite3.connect(str(db_path))
conn.row_factory = sqlite3.Row
rows = conn.execute(
    """SELECT
        pattern_hash,
        category,
        representative_text,
        COUNT(DISTINCT session_id) AS distinct_sessions,
        SUM(severity_weight) AS weighted_score,
        MIN(created_at) AS first_seen,
        MAX(created_at) AS last_seen,
        GROUP_CONCAT(DISTINCT session_id) AS session_ids
    FROM correction_patterns
    GROUP BY pattern_hash
    HAVING COUNT(DISTINCT session_id) >= ?
       AND SUM(severity_weight) >= ?
    ORDER BY weighted_score DESC
    """,
    (min_sessions, min_score),
).fetchall()
conn.close()
```
Potential resource leak if query fails.
The connection is opened at line 445 but conn.close() is called outside a try/finally block. If fetchall() raises an exception, the connection will leak.
Proposed fix

```diff
 def query_graduation_candidates(
     db_path: str | Path,
     min_sessions: int = 2,
     min_score: float = 3.0,
 ) -> list[dict]:
     """Find correction patterns ready for meta-rule graduation.

     Returns patterns where:
     - Distinct sessions >= min_sessions
     - Sum of severity weights >= min_score
     """
     conn = sqlite3.connect(str(db_path))
-    conn.row_factory = sqlite3.Row
-    rows = conn.execute(
-        """SELECT
-            pattern_hash,
-            category,
-            representative_text,
-            COUNT(DISTINCT session_id) AS distinct_sessions,
-            SUM(severity_weight) AS weighted_score,
-            MIN(created_at) AS first_seen,
-            MAX(created_at) AS last_seen,
-            GROUP_CONCAT(DISTINCT session_id) AS session_ids
-        FROM correction_patterns
-        GROUP BY pattern_hash
-        HAVING COUNT(DISTINCT session_id) >= ?
-           AND SUM(severity_weight) >= ?
-        ORDER BY weighted_score DESC
-        """,
-        (min_sessions, min_score),
-    ).fetchall()
-    conn.close()
-    return [dict(r) for r in rows]
+    try:
+        conn.row_factory = sqlite3.Row
+        rows = conn.execute(
+            """SELECT
+                pattern_hash,
+                category,
+                representative_text,
+                COUNT(DISTINCT session_id) AS distinct_sessions,
+                SUM(severity_weight) AS weighted_score,
+                MIN(created_at) AS first_seen,
+                MAX(created_at) AS last_seen,
+                GROUP_CONCAT(DISTINCT session_id) AS session_ids
+            FROM correction_patterns
+            GROUP BY pattern_hash
+            HAVING COUNT(DISTINCT session_id) >= ?
+               AND SUM(severity_weight) >= ?
+            ORDER BY weighted_score DESC
+            """,
+            (min_sessions, min_score),
+        ).fetchall()
+        return [dict(r) for r in rows]
+    finally:
+        conn.close()
```
+ conn.close()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def query_graduation_candidates(
    db_path: str | Path,
    min_sessions: int = 2,
    min_score: float = 3.0,
) -> list[dict]:
    """Find correction patterns ready for meta-rule graduation.

    Returns patterns where:
    - Distinct sessions >= min_sessions
    - Sum of severity weights >= min_score
    """
    conn = sqlite3.connect(str(db_path))
    try:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(
            """SELECT
                pattern_hash,
                category,
                representative_text,
                COUNT(DISTINCT session_id) AS distinct_sessions,
                SUM(severity_weight) AS weighted_score,
                MIN(created_at) AS first_seen,
                MAX(created_at) AS last_seen,
                GROUP_CONCAT(DISTINCT session_id) AS session_ids
            FROM correction_patterns
            GROUP BY pattern_hash
            HAVING COUNT(DISTINCT session_id) >= ?
               AND SUM(severity_weight) >= ?
            ORDER BY weighted_score DESC
            """,
            (min_sessions, min_score),
        ).fetchall()
        return [dict(r) for r in rows]
    finally:
        conn.close()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/meta_rules_storage.py` around lines 445 - 465, The
SQLite connection opened via sqlite3.connect (conn) can leak if
conn.execute(...).fetchall() raises; wrap the DB usage in a safe context by
either using the connection as a context manager (with sqlite3.connect(...) as
conn:) or surrounding the execute/fetchall with try/finally and calling
conn.close() in finally; update the block that creates conn, sets
conn.row_factory, runs the SELECT (the execute(...).fetchall() call) and closes
conn to ensure conn.close() always runs even on exceptions.
```python
try:
    _cloud_path = os.environ.get("GRADATA_CLOUD_PATH", "C:/Users/olive/SpritesWork/brain/cloud-only")
    sys.path.insert(0, _cloud_path)
    from meta_rules import discover_meta_rules, merge_into_meta  # type: ignore[import]
except ImportError:
    from gradata.enhancements.meta_rules import discover_meta_rules
```
Hardcoded Windows path violates test guidelines.
Line 22 contains a hardcoded Windows-specific path C:/Users/olive/SpritesWork/brain/cloud-only. This will fail on CI/other environments and violates the coding guidelines for tests. The fallback import on line 26 works, but the hardcoded path should be removed or made configurable.
Proposed fix

```diff
-# Try cloud-only override first (real discovery), fall back to SDK stubs
-try:
-    _cloud_path = os.environ.get("GRADATA_CLOUD_PATH", "C:/Users/olive/SpritesWork/brain/cloud-only")
-    sys.path.insert(0, _cloud_path)
-    from meta_rules import discover_meta_rules, merge_into_meta  # type: ignore[import]
-except ImportError:
-    from gradata.enhancements.meta_rules import discover_meta_rules
+# Try cloud-only override first (real discovery), fall back to SDK stubs
+_cloud_path = os.environ.get("GRADATA_CLOUD_PATH")
+if _cloud_path:
+    sys.path.insert(0, _cloud_path)
+try:
+    from meta_rules import discover_meta_rules, merge_into_meta  # type: ignore[import]
+except ImportError:
+    from gradata.enhancements.meta_rules import discover_meta_rules
```

As per coding guidelines: "tests/**: no hardcoded paths"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
_cloud_path = os.environ.get("GRADATA_CLOUD_PATH")
if _cloud_path:
    sys.path.insert(0, _cloud_path)
try:
    from meta_rules import discover_meta_rules, merge_into_meta  # type: ignore[import]
except ImportError:
    from gradata.enhancements.meta_rules import discover_meta_rules
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_pipeline_e2e.py` around lines 21 - 27, Remove the hardcoded
Windows path by stopping insertion of "C:/Users/..." into sys.path; instead only
respect an optional GRADATA_CLOUD_PATH environment variable or prefer the
package import fallback. Update the block that sets _cloud_path (created via
os.environ.get) and the sys.path.insert call so it only inserts when the env var
is set and non-empty, then attempt importing discover_meta_rules and
merge_into_meta accordingly (symbols: _cloud_path, sys.path.insert,
discover_meta_rules, merge_into_meta). Ensure the tests rely on the package
import fallback (from gradata.enhancements.meta_rules) when no env var is
provided.
New: behavioral_extractor.py
- 12-archetype sentence-level extraction for actionable lesson descriptions
- Replaces word-diff summaries with imperative instructions
- Template-based with upgrade path for LLM refinement

New: correction_patterns table (meta_rules_storage.py)
- Cross-session pattern tracking with batch upsert
- Graduation candidate query for repeated patterns

Changed: _core.py
- brain.correct() uses behavioral_extractor as primary extraction path
- Falls back to keyword templates

New: test_pipeline_e2e.py (12 tests)
- Pattern tracking, correction logging, injection formatting
- Cloud-dependent tests skip on CI

1699 tests passing, 0 failures.

Co-Authored-By: Gradata <noreply@gradata.ai>
Force-pushed from 0ca07b8 to 7680808 (Compare)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/gradata/_core.py (1)
232-253: ⚠️ Potential issue | 🟠 Major
Preserve the legacy extractor when the new path throws.
Right now the fallback to `extract_behavioral_instruction()` only happens when `extract_instruction()` returns a falsy value. If the new extractor raises, line 251 skips the legacy path and drops straight to `primary.description`, so a bug in the new code disables the old behavior entirely.
Proposed fix
```diff
-try:
-    from gradata.enhancements.behavioral_extractor import extract_instruction
-    behavioral_desc = extract_instruction(
-        draft, final, primary, category=cat,
-    )
-    if not behavioral_desc:
-        # Fallback to keyword templates
-        from gradata.enhancements.edit_classifier import (
-            extract_behavioral_instruction,
-        )
-        from gradata.enhancements.instruction_cache import InstructionCache
-        if not isinstance(brain._instruction_cache, InstructionCache):
-            brain._instruction_cache = InstructionCache(
-                lessons_path.parent / "instruction_cache.json"
-            )
-        behavioral_desc = extract_behavioral_instruction(
-            diff, primary, cache=brain._instruction_cache,  # type: ignore[arg-type]
-        )
-    desc = behavioral_desc or primary.description
-except Exception as e:
-    _log.debug("Behavioral extraction failed: %s", e)
-    desc = primary.description
+behavioral_desc = None
+try:
+    from gradata.enhancements.behavioral_extractor import extract_instruction
+    behavioral_desc = extract_instruction(
+        draft, final, primary, category=cat,
+    )
+except Exception as e:
+    _log.debug("Behavioral extractor failed; trying legacy fallback: %s", e)
+
+if not behavioral_desc:
+    try:
+        from gradata.enhancements.edit_classifier import (
+            extract_behavioral_instruction,
+        )
+        from gradata.enhancements.instruction_cache import InstructionCache
+        if not isinstance(brain._instruction_cache, InstructionCache):
+            brain._instruction_cache = InstructionCache(
+                lessons_path.parent / "instruction_cache.json"
+            )
+        behavioral_desc = extract_behavioral_instruction(
+            diff, primary, cache=brain._instruction_cache,  # type: ignore[arg-type]
+        )
+    except Exception as e:
+        _log.debug("Legacy behavioral extraction failed: %s", e)
+
+desc = behavioral_desc or primary.description
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_core.py` around lines 232 - 253, The try/except currently swallows exceptions from extract_instruction and directly falls back to primary.description, skipping the legacy extractor; update _core.py so that when extract_instruction raises you still attempt the legacy path: in the except block (or by restructuring the try) call extract_behavioral_instruction(...) using the same cache initialization logic around InstructionCache and brain._instruction_cache, assign its result to behavioral_desc, and then set desc = behavioral_desc or primary.description (same as the success path) so that legacy extraction runs on both falsy returns and exceptions from extract_instruction.
♻️ Duplicate comments (1)
tests/test_pipeline_e2e.py (1)
23-24: ⚠️ Potential issue | 🟠 Major
Remove the developer-specific cloud path from the test module.
Falling back to `C:/Users/olive/...` makes this suite host-specific and brittle on CI/non-Windows machines. Only prepend `GRADATA_CLOUD_PATH` when the env var is actually set; otherwise rely on the package import fallback.
As per coding guidelines, "tests/**: Test files. Verify: no hardcoded paths".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_pipeline_e2e.py` around lines 23 - 24, The test currently falls back to a hardcoded developer path via the _cloud_path variable and unconditionally calls sys.path.insert(0, _cloud_path); remove the hardcoded fallback and only prepend to sys.path when the GRADATA_CLOUD_PATH environment variable is set (i.e., check os.environ.get("GRADATA_CLOUD_PATH") and if it's truthy call sys.path.insert(0, path)); ensure no other hardcoded paths remain and rely on normal package imports when the env var is absent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/enhancements/behavioral_extractor.py`:
- Around line 416-425: In _try_llm_extract replace the nonexistent
llm_provider.extract call by building a prompt from draft, final, and
classification and invoking llm_provider.complete(prompt, max_tokens=...,
timeout=...) (use appropriate token/timeout values or existing defaults), then
pass the returned text to _is_actionable and return it if actionable; also
remove the silent except and log the caught Exception (include the exception
message and context) so failures to call complete are visible.
In `@src/gradata/enhancements/meta_rules_storage.py`:
- Around line 447-458: The graduation SELECT currently groups by pattern_hash
but selects non-aggregated columns category and representative_text, yielding
arbitrary rows; change the query to explicitly pick a representative row per
pattern_hash (e.g., use a subquery or window function like ROW_NUMBER() OVER
(PARTITION BY pattern_hash ORDER BY created_at DESC) to pick the
latest/highest-weight row) and then join that representative row to the
aggregates (COUNT(DISTINCT session_id), SUM(severity_weight), MIN(created_at),
MAX(created_at), GROUP_CONCAT(...)) so category and representative_text come
from the chosen representative (reference the query built where rows =
conn.execute(...) against correction_patterns).
In `@tests/test_core_behavioral.py`:
- Around line 16-17: The tests currently allow regressions because both
assertions are conditional on lessons_path.exists(); update tests in
tests/test_core_behavioral.py to assert lessons_path.exists() (i.e., that
lessons.md was created) before checking contents, and add a new test case that
patches gradata.enhancements.behavioral_extractor.extract_instruction to raise
an exception (use side_effect=Exception(...)) to simulate the _core.py failure
mode, then run the flow and assert the legacy description (the content written
by the legacy extractor) is persisted into lessons.md; reference the mocked
function name extract_instruction and the file/variable lessons_path/lessons.md
to locate where to add these checks.
In `@tests/test_pipeline_e2e.py`:
- Around line 67-70: The test is forcing every simulated correction to "major"
by using result.get("outcome", "major") when building session_corrections for
end_session(); instead, propagate the real severity from the correction itself
(or fall back to result/outcome if needed). Update the session_corrections
construction in tests/test_pipeline_e2e.py to set "severity" using
correction.get("severity", result.get("outcome", "major")) (or
correction["severity"] if you expect it always present), so the value returned
by brain.correct() and/or the correction entry drives severity-aware graduation
logic in end_session().
---
Outside diff comments:
In `@src/gradata/_core.py`:
- Around line 232-253: The try/except currently swallows exceptions from
extract_instruction and directly falls back to primary.description, skipping the
legacy extractor; update _core.py so that when extract_instruction raises you
still attempt the legacy path: in the except block (or by restructuring the try)
call extract_behavioral_instruction(...) using the same cache initialization
logic around InstructionCache and brain._instruction_cache, assign its result to
behavioral_desc, and then set desc = behavioral_desc or primary.description
(same as the success path) so that legacy extraction runs on both falsy returns
and exceptions from extract_instruction.
---
Duplicate comments:
In `@tests/test_pipeline_e2e.py`:
- Around line 23-24: The test currently falls back to a hardcoded developer path
via the _cloud_path variable and unconditionally calls sys.path.insert(0,
_cloud_path); remove the hardcoded fallback and only prepend to sys.path when
the GRADATA_CLOUD_PATH environment variable is set (i.e., check
os.environ.get("GRADATA_CLOUD_PATH") and if it's truthy call sys.path.insert(0,
path)); ensure no other hardcoded paths remain and rely on normal package
imports when the env var is absent.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 36c6accd-8e32-4391-8f3a-6b8b3129e802
📒 Files selected for processing (6)
src/gradata/_core.py, src/gradata/enhancements/behavioral_extractor.py, src/gradata/enhancements/meta_rules_storage.py, tests/test_convergence_gate.py, tests/test_core_behavioral.py, tests/test_pipeline_e2e.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_convergence_gate.py, tests/test_core_behavioral.py, tests/test_pipeline_e2e.py
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (`from __future__ import annotations` required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/_core.py, src/gradata/enhancements/meta_rules_storage.py, src/gradata/enhancements/behavioral_extractor.py
🪛 Ruff (0.15.9)
tests/test_convergence_gate.py
[warning] 48-49: Use a single with statement with multiple contexts instead of nested with statements
(SIM117)
tests/test_pipeline_e2e.py
[error] 93-93: Ambiguous variable name: l
(E741)
[error] 220-220: Ambiguous variable name: l
(E741)
src/gradata/enhancements/behavioral_extractor.py
[warning] 54-54: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
[warning] 168-168: Too many return statements (14 > 6)
(PLR0911)
[warning] 168-168: Too many branches (15 > 12)
(PLR0912)
[warning] 196-196: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 239-239: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 278-278: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 314-314: Too many return statements (19 > 6)
(PLR0911)
[warning] 314-314: Too many branches (18 > 12)
(PLR0912)
[warning] 314-314: Unused function argument: category
(ARG001)
[error] 424-425: try-except-pass detected, consider logging the exception
(S110)
[warning] 424-424: Do not catch blind exception: Exception
(BLE001)
Use getattr() instead of direct attribute access for spawn_queue to satisfy Pyright's type narrowing after hasattr() check.

Co-Authored-By: Gradata <noreply@gradata.ai>
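The pattern behind that commit, roughly (the `worker` object holding `spawn_queue` is an assumed stand-in):

```python
# Before: Pyright won't narrow worker.spawn_queue's type after hasattr()
if hasattr(worker, "spawn_queue"):
    worker.spawn_queue.put(task)  # flagged by Pyright

# After: getattr() with a None default yields a local that Pyright can narrow
queue = getattr(worker, "spawn_queue", None)
if queue is not None:
    queue.put(task)
```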
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.
Fixes applied successfully: fixed 94 file(s) based on 4 unresolved review comments.
Files modified:
Commit: the changes have been pushed to this branch.
Time taken:
Fixed 94 file(s) based on 4 unresolved review comments.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/gradata/_core.py (1)
232-255: ⚠️ Potential issue | 🟠 Major
Keep the legacy extractor as the fallback when the new extractor throws.
Right now the broad `try` wraps both branches. If `extract_instruction()` raises, this code drops straight to `primary.description` and never calls `extract_behavioral_instruction()`, so the explicit fallback path disappears on exceptions.
Proposed fix
```diff
-try:
-    from gradata.enhancements.behavioral_extractor import (
-        extract_instruction,
-    )
-    behavioral_desc = extract_instruction(
-        draft, final, primary, category=cat,
-    )
-    if not behavioral_desc:
-        # Fallback to keyword templates
-        from gradata.enhancements.edit_classifier import (
-            extract_behavioral_instruction,
-        )
-        from gradata.enhancements.instruction_cache import InstructionCache
-        if not isinstance(brain._instruction_cache, InstructionCache):
-            brain._instruction_cache = InstructionCache(
-                lessons_path.parent / "instruction_cache.json"
-            )
-        behavioral_desc = extract_behavioral_instruction(
-            diff, primary, cache=brain._instruction_cache,  # type: ignore[arg-type]
-        )
-    desc = behavioral_desc or primary.description
-except Exception as e:
-    _log.debug("Behavioral extraction failed: %s", e)
-    desc = primary.description
+behavioral_desc = None
+try:
+    from gradata.enhancements.behavioral_extractor import (
+        extract_instruction,
+    )
+    behavioral_desc = extract_instruction(
+        draft, final, primary, category=cat,
+    )
+except Exception as e:
+    _log.debug("Primary behavioral extraction failed: %s", e)
+
+if not behavioral_desc:
+    try:
+        from gradata.enhancements.edit_classifier import (
+            extract_behavioral_instruction,
+        )
+        from gradata.enhancements.instruction_cache import InstructionCache
+        if not isinstance(brain._instruction_cache, InstructionCache):
+            brain._instruction_cache = InstructionCache(
+                lessons_path.parent / "instruction_cache.json"
+            )
+        behavioral_desc = extract_behavioral_instruction(
+            diff, primary, cache=brain._instruction_cache,  # type: ignore[arg-type]
+        )
+    except Exception as e:
+        _log.debug("Legacy behavioral extraction failed: %s", e)
+
+desc = behavioral_desc or primary.description
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_core.py` around lines 232 - 255, The try currently wraps both the new extractor and the fallback, so if extract_instruction(draft, final, primary, category=cat) raises you skip the fallback; change the logic so only the call to extract_instruction is wrapped in a try/except: call extract_instruction inside a small try block (catch Exception as e and log the failure), and if it returns falsy or raises, then proceed to import and call extract_behavioral_instruction(diff, primary, cache=brain._instruction_cache) (creating InstructionCache into brain._instruction_cache from lessons_path.parent / "instruction_cache.json" if needed) to produce behavioral_desc; finally set desc = behavioral_desc or primary.description.
♻️ Duplicate comments (5)
src/gradata/enhancements/meta_rules_storage.py (2)
345-486: ⚠️ Potential issue | 🟠 Major
The new correction-pattern pipeline is still not reachable from production flow.
These helpers are implemented here, but the provided `src/gradata/_core.py` still only counts `extract_patterns(...)` results and never calls `upsert_correction_patterns_batch(...)`, and `brain_end_session()` still does not call `query_graduation_candidates()`. As shipped, `correction_patterns` never influences graduation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/enhancements/meta_rules_storage.py` around lines 345 - 486, The production flow never persists or uses correction patterns: ensure extract_patterns results are upserted and graduation queries are invoked by wiring the new helpers into the core session lifecycle—call upsert_correction_patterns_batch (or upsert_correction_pattern for single items) from the code path that currently handles extract_patterns results in src/gradata/_core.py so patterns are saved per session, and modify brain_end_session to call query_graduation_candidates (with appropriate min_sessions/min_score) and feed any returned candidates into the existing graduation path so correction_patterns influence meta-rule graduation.
445-486: ⚠️ Potential issue | 🟡 Minor
Always close the SQLite connection on query errors.
`conn.close()` only runs after `fetchall()` succeeds. If the SELECT fails, this connection leaks.
Proposed fix
```diff
 def query_graduation_candidates(
     db_path: str | Path,
     min_sessions: int = 2,
     min_score: float = 3.0,
 ) -> list[dict]:
@@
     """
     conn = sqlite3.connect(str(db_path))
-    conn.row_factory = sqlite3.Row
-    rows = conn.execute(
-        """WITH representative AS (
-            SELECT
-                pattern_hash,
-                category,
-                representative_text,
-                ROW_NUMBER() OVER (PARTITION BY pattern_hash ORDER BY created_at DESC) AS rn
-            FROM correction_patterns
-        ),
-        aggregates AS (
-            SELECT
-                pattern_hash,
-                COUNT(DISTINCT session_id) AS distinct_sessions,
-                SUM(severity_weight) AS weighted_score,
-                MIN(created_at) AS first_seen,
-                MAX(created_at) AS last_seen,
-                GROUP_CONCAT(DISTINCT session_id) AS session_ids
-            FROM correction_patterns
-            GROUP BY pattern_hash
-            HAVING COUNT(DISTINCT session_id) >= ?
-               AND SUM(severity_weight) >= ?
-        )
-        SELECT
-            r.pattern_hash,
-            r.category,
-            r.representative_text,
-            a.distinct_sessions,
-            a.weighted_score,
-            a.first_seen,
-            a.last_seen,
-            a.session_ids
-        FROM representative r
-        INNER JOIN aggregates a ON r.pattern_hash = a.pattern_hash
-        WHERE r.rn = 1
-        ORDER BY a.weighted_score DESC
-        """,
-        (min_sessions, min_score),
-    ).fetchall()
-    conn.close()
-    return [dict(r) for r in rows]
+    try:
+        conn.row_factory = sqlite3.Row
+        rows = conn.execute(
+            """WITH representative AS (
+                SELECT
+                    pattern_hash,
+                    category,
+                    representative_text,
+                    ROW_NUMBER() OVER (PARTITION BY pattern_hash ORDER BY created_at DESC) AS rn
+                FROM correction_patterns
+            ),
+            aggregates AS (
+                SELECT
+                    pattern_hash,
+                    COUNT(DISTINCT session_id) AS distinct_sessions,
+                    SUM(severity_weight) AS weighted_score,
+                    MIN(created_at) AS first_seen,
+                    MAX(created_at) AS last_seen,
+                    GROUP_CONCAT(DISTINCT session_id) AS session_ids
+                FROM correction_patterns
+                GROUP BY pattern_hash
+                HAVING COUNT(DISTINCT session_id) >= ?
+                   AND SUM(severity_weight) >= ?
+            )
+            SELECT
+                r.pattern_hash,
+                r.category,
+                r.representative_text,
+                a.distinct_sessions,
+                a.weighted_score,
+                a.first_seen,
+                a.last_seen,
+                a.session_ids
+            FROM representative r
+            INNER JOIN aggregates a ON r.pattern_hash = a.pattern_hash
+            WHERE r.rn = 1
+            ORDER BY a.weighted_score DESC
+            """,
+            (min_sessions, min_score),
+        ).fetchall()
+        return [dict(r) for r in rows]
+    finally:
+        conn.close()
```
Verify each finding against the current code and only fix it if needed. In `@src/gradata/enhancements/meta_rules_storage.py` around lines 445 - 486, The DB connection created with sqlite3.connect(...) and closed via conn.close() can leak if conn.execute(...) or fetchall() raises; wrap the usage of conn (the sqlite3.Connection) in a try/finally or use the context manager form (with sqlite3.connect(str(db_path)) as conn:) so that conn.close() always runs, ensuring the query block that calls conn.execute(...).fetchall() and the subsequent return [dict(r) for r in rows] still work but the connection is closed on errors; update the code around the connect/execute/fetchall/conn.close calls accordingly.

tests/test_core_behavioral.py (1)
34-44: ⚠️ Potential issue | 🟠 Major
These tests still don't prove the legacy fallback ran.
`"INSTINCT" in content or "PATTERN" in content` and `len(content) > 0` will pass even when `_core.py` skips the legacy extractor and persists `primary.description`. Patch `gradata.enhancements.edit_classifier.extract_behavioral_instruction` to return a sentinel string and assert that exact string is written in both the `None` and exception branches.
As per coding guidelines, "tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness".
Proposed fix
```diff
 def test_correct_falls_back_to_old_description():
@@
-    with patch(
-        "gradata.enhancements.behavioral_extractor.extract_instruction",
-        return_value=None,
-    ):
+    with patch(
+        "gradata.enhancements.behavioral_extractor.extract_instruction",
+        return_value=None,
+    ), patch(
+        "gradata.enhancements.edit_classifier.extract_behavioral_instruction",
+        return_value="LEGACY SENTINEL",
+    ):
         brain.correct(
             draft="Dear Sir, We are pleased.",
             final="Hey, here's the deal.",
         )
@@
-    assert "INSTINCT" in content or "PATTERN" in content
+    assert "LEGACY SENTINEL" in content
@@
 def test_correct_persists_legacy_on_extractor_failure():
@@
-    with patch(
-        "gradata.enhancements.behavioral_extractor.extract_instruction",
-        side_effect=Exception("Simulated extraction failure"),
-    ):
+    with patch(
+        "gradata.enhancements.behavioral_extractor.extract_instruction",
+        side_effect=Exception("Simulated extraction failure"),
+    ), patch(
+        "gradata.enhancements.edit_classifier.extract_behavioral_instruction",
+        return_value="LEGACY SENTINEL",
+    ):
         brain.correct(
             draft="Dear Sir, We are pleased to inform you.",
             final="Hey, here's what we decided.",
         )
@@
-    assert "INSTINCT" in content or "PATTERN" in content or len(content) > 0
+    assert "LEGACY SENTINEL" in content
```
Also applies to: 52-63
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_core_behavioral.py` around lines 34 - 44, Update the test to verify the legacy fallback by making gradata.enhancements.edit_classifier.extract_behavioral_instruction return a unique sentinel string (e.g., "LEGACY_SENTINEL") and then assert that this exact sentinel is written to lessons.md in both the None and exception branches; specifically patch/mock extract_behavioral_instruction to return the sentinel and replace the loose checks ("INSTINCT" or "PATTERN" / len>0) with an exact equality/assertion against the sentinel, ensuring coverage for paths where _core.py falls back to persisting primary.description.src/gradata/enhancements/behavioral_extractor.py (1)
316-317: 🧹 Nitpick | 🔵 Trivial
Drop the unused `category` parameter from `generate_instruction()`.
`category` is still threaded through the API but never read, which makes the generator look category-aware when it is not.
Proposed fix
```diff
-def generate_instruction(match: ArchetypeMatch, category: str = "") -> str:
+def generate_instruction(match: ArchetypeMatch) -> str:
@@
-    instruction = generate_instruction(match, category)
+    instruction = generate_instruction(match)
```
Also applies to: 479-479
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/enhancements/behavioral_extractor.py` around lines 316 - 317, The function generate_instruction currently accepts an unused category parameter; remove this parameter from its signature (def generate_instruction(match: ArchetypeMatch) -> str) and from any other duplicate definition or overload (the other occurrence at the second generate_instruction), and update all call sites to stop passing category so the API no longer falsely suggests category-awareness; ensure any documentation/type hints referencing generate_instruction's category argument are cleaned up as well.tests/test_pipeline_e2e.py (1)
20-25: ⚠️ Potential issue | 🟠 Major
Remove the machine-specific cloud path from the test bootstrap.
Falling back to `C:/Users/olive/...` makes this suite environment-specific. Only prepend `GRADATA_CLOUD_PATH` when it is explicitly set; otherwise rely on the package import fallback.
As per coding guidelines, "tests/**: Test files. Verify: no hardcoded paths".
Proposed fix
```diff
 # Try cloud-only override first (real discovery), fall back to SDK stubs
 _CLOUD_DISCOVERY = False
+_cloud_path = os.environ.get("GRADATA_CLOUD_PATH")
+if _cloud_path:
+    sys.path.insert(0, _cloud_path)
 try:
-    _cloud_path = os.environ.get("GRADATA_CLOUD_PATH", "C:/Users/olive/SpritesWork/brain/cloud-only")
-    sys.path.insert(0, _cloud_path)
     from meta_rules import discover_meta_rules, merge_into_meta  # type: ignore[import]
     _CLOUD_DISCOVERY = True
 except ImportError:
     from gradata.enhancements.meta_rules import discover_meta_rules
```
Verify each finding against the current code and only fix it if needed. In `@tests/test_pipeline_e2e.py` around lines 20 - 25, The test bootstrap currently hardcodes a machine-specific fallback path in _cloud_path and always prepends it to sys.path; change this so GRADATA_CLOUD_PATH is used only when explicitly set: if os.environ.get("GRADATA_CLOUD_PATH") is truthy, set _cloud_path to that value and insert it into sys.path before attempting from meta_rules import discover_meta_rules, merge_into_meta, otherwise do not modify sys.path and let the normal package import fallback occur; keep the existing _CLOUD_DISCOVERY flag and import statement names unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/_context_compile.py`:
- Line 35: The current re.split call (parts = re.split(r"\s*[—\-\-]\s*", stem,
maxsplit=1)) uses a character class and incorrectly splits on single hyphens;
update the regex to match the whole separator (em dash, double hyphen, or a
hyphen used as a separator with surrounding spaces) so names with internal
hyphens aren't split: replace the pattern with something like
r"(?:\s*—\s*|\s*--\s*|\s+-\s+)" in the re.split call (referencing the same parts
variable and the re.split invocation).
In `@src/gradata/enhancements/behavioral_extractor.py`:
- Around line 211-214: The current substring checks on _CONSTRAINT_WORDS falsely
match parts of words; change the detection to use regex word-boundary searches
instead: for each word in _CONSTRAINT_WORDS use re.search(r'\b' +
re.escape(word) + r'\b', final, flags=re.IGNORECASE) and ensure the same word is
NOT found in draft using the same word-boundary search before adding to
new_constraints; keep the subsequent call to
_find_sentence_containing(new_match, ...) the same (use the matched constraint
token), and add an import for the re module if missing.
In `@tests/test_pipeline_e2e.py`:
- Around line 95-96: The test is asserting the wrong set of severity labels;
update the assertion that checks the variable severity (computed as
result.get("outcome") or result.get("data", {}).get("severity")) to use the
SDK's real labels: "as-is", "minor", "moderate", "major", and "discarded"
instead of the current ("trivial","minor","moderate","major","rewrite"); ensure
the assertion accepts only that exact set.
---
Outside diff comments:
In `@src/gradata/_core.py`:
- Around line 232-255: The try currently wraps both the new extractor and the
fallback, so if extract_instruction(draft, final, primary, category=cat) raises
you skip the fallback; change the logic so only the call to extract_instruction
is wrapped in a try/except: call extract_instruction inside a small try block
(catch Exception as e and log the failure), and if it returns falsy or raises,
then proceed to import and call extract_behavioral_instruction(diff, primary,
cache=brain._instruction_cache) (creating InstructionCache into
brain._instruction_cache from lessons_path.parent / "instruction_cache.json" if
needed) to produce behavioral_desc; finally set desc = behavioral_desc or
primary.description.
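Sketched against the names used in this prompt (the surrounding `_core.py` context is assumed, not quoted), the narrowed control flow would look roughly like:

```python
# Illustrative only; variable and logger names come from the review comment.
behavioral_desc = None
try:
    behavioral_desc = extract_instruction(draft, final, primary, category=cat)
except Exception:
    logger.exception("behavioral extraction failed; falling back to legacy extractor")

if not behavioral_desc:
    from gradata.enhancements.edit_classifier import extract_behavioral_instruction
    # The fallback now runs whether the new extractor returned None or raised.
    behavioral_desc = extract_behavioral_instruction(diff, primary, cache=brain._instruction_cache)

desc = behavioral_desc or primary.description
```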
---
Duplicate comments:
In `@src/gradata/enhancements/behavioral_extractor.py`:
- Around line 316-317: The function generate_instruction currently accepts an
unused category parameter; remove this parameter from its signature (def
generate_instruction(match: ArchetypeMatch) -> str) and from any other duplicate
definition or overload (the other occurrence at the second
generate_instruction), and update all call sites to stop passing category so the
API no longer falsely suggests category-awareness; ensure any documentation/type
hints referencing generate_instruction's category argument are cleaned up as
well.
In `@src/gradata/enhancements/meta_rules_storage.py`:
- Around line 345-486: The production flow never persists or uses correction
patterns: ensure extract_patterns results are upserted and graduation queries
are invoked by wiring the new helpers into the core session lifecycle—call
upsert_correction_patterns_batch (or upsert_correction_pattern for single items)
from the code path that currently handles extract_patterns results in
src/gradata/_core.py so patterns are saved per session, and modify
brain_end_session to call query_graduation_candidates (with appropriate
min_sessions/min_score) and feed any returned candidates into the existing
graduation path so correction_patterns influence meta-rule graduation.
- Around line 445-486: The DB connection created with sqlite3.connect(...) and
closed via conn.close() can leak if conn.execute(...) or fetchall() raises; wrap
the usage of conn (the sqlite3.Connection) in a try/finally or use the context
manager form (with sqlite3.connect(str(db_path)) as conn:) so that conn.close()
always runs, ensuring the query block that calls conn.execute(...).fetchall()
and the subsequent return [dict(r) for r in rows] still work but the connection
is closed on errors; update the code around the
connect/execute/fetchall/conn.close calls accordingly.
In `@tests/test_core_behavioral.py`:
- Around line 34-44: Update the test to verify the legacy fallback by making
gradata.enhancements.edit_classifier.extract_behavioral_instruction return a
unique sentinel string (e.g., "LEGACY_SENTINEL") and then assert that this exact
sentinel is written to lessons.md in both the None and exception branches;
specifically patch/mock extract_behavioral_instruction to return the sentinel
and replace the loose checks ("INSTINCT" or "PATTERN" / len>0) with an exact
equality/assertion against the sentinel, ensuring coverage for paths where
_core.py falls back to persisting primary.description.
In `@tests/test_pipeline_e2e.py`:
- Around line 20-25: The test bootstrap currently hardcodes a machine-specific
fallback path in _cloud_path and always prepends it to sys.path; change this so
GRADATA_CLOUD_PATH is used only when explicitly set: if
os.environ.get("GRADATA_CLOUD_PATH") is truthy, set _cloud_path to that value
and insert it into sys.path before attempting from meta_rules import
discover_meta_rules, merge_into_meta, otherwise do not modify sys.path and let
the normal package import fallback occur; keep the existing _CLOUD_DISCOVERY
flag and import statement names unchanged.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3a290ff3-aba0-4383-b14b-d129664a0a91
📒 Files selected for processing (6)
- src/gradata/_context_compile.py
- src/gradata/_core.py
- src/gradata/enhancements/behavioral_extractor.py
- src/gradata/enhancements/meta_rules_storage.py
- tests/test_core_behavioral.py
- tests/test_pipeline_e2e.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (`from __future__ import annotations` required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
- src/gradata/_context_compile.py
- src/gradata/enhancements/meta_rules_storage.py
- src/gradata/enhancements/behavioral_extractor.py
- src/gradata/_core.py
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
- tests/test_core_behavioral.py
- tests/test_pipeline_e2e.py
🪛 Ruff (0.15.9)
src/gradata/enhancements/behavioral_extractor.py
[warning] 56-56: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
[warning] 170-170: Too many return statements (14 > 6)
(PLR0911)
[warning] 170-170: Too many branches (15 > 12)
(PLR0912)
[warning] 198-198: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 241-241: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 280-280: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 316-316: Too many return statements (19 > 6)
(PLR0911)
[warning] 316-316: Too many branches (18 > 12)
(PLR0912)
[warning] 316-316: Unused function argument: category
(ARG001)
[warning] 438-438: Do not catch blind exception: Exception
(BLE001)
tests/test_pipeline_e2e.py
[error] 102-102: Ambiguous variable name: l
(E741)
[error] 229-229: Ambiguous variable name: l
(E741)
src/gradata/_core.py
[warning] 105-105: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 498-498: Missing return type annotation for private function _lesson_key
Add return type annotation: str
(ANN202)
[warning] 536-536: Do not catch blind exception: Exception
(BLE001)
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully
Fixed 3 file(s) based on 3 unresolved review comments. Files modified:

Commit:
The changes have been pushed to the
Time taken:
Fixed 3 file(s) based on 3 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
Actionable comments posted: 7
♻️ Duplicate comments (1)
src/gradata/_context_compile.py (1)
35-35: ⚠️ Potential issue | 🟠 Major
Reintroduce en-dash support in the separator regex.
This pattern dropped `–` (en dash), so stems like `Jane Doe – Acme` won't split anymore, which regresses extraction and diverges from `src/gradata/_tag_taxonomy.py` and `src/gradata/_fact_extractor.py`.
Proposed fix
```diff
-    parts = re.split(r"(?:\s*—\s*|\s*--\s*|\s+-\s+)", stem, maxsplit=1)
+    parts = re.split(r"(?:\s*[—–]\s*|\s*--\s*|\s+-\s+)", stem, maxsplit=1)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_context_compile.py` at line 35, The separator regex used in the re.split call that assigns parts (re.split(..., stem, maxsplit=1)) no longer matches the en-dash (–), breaking splits like "Jane Doe – Acme"; update the pattern used in the parts = re.split(...) expression to include the en-dash (either as the literal character or \u2013) alongside the existing em-dash/hyphen variants so it matches `—`, `–`, `--`, and ` - `, keeping behavior consistent with src/gradata/_tag_taxonomy.py and src/gradata/_fact_extractor.py.
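For illustration, a standalone check of the proposed pattern on representative stems (plain `re`, no project code assumed):

```python
import re

# Proposed separator: em dash, en dash, double hyphen, or a single hyphen
# only when surrounded by spaces (so internal hyphens survive).
SEP = re.compile(r"(?:\s*[—–]\s*|\s*--\s*|\s+-\s+)")

for stem in ("Jane Doe – Acme", "Jane Doe — Acme", "Mary-Anne Smith -- Globex"):
    print(SEP.split(stem, maxsplit=1))
# ['Jane Doe', 'Acme'], ['Jane Doe', 'Acme'], ['Mary-Anne Smith', 'Globex']
```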
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/enhancements/behavioral_extractor.py`:
- Around line 53-57: The comment on the ArchetypeMatch dataclass for the
confidence field uses an EN DASH; update the comment in the ArchetypeMatch
definition (the confidence: float comment) to replace the EN DASH (`–`) with a
standard hyphen-minus (`-`) so it reads e.g. `# 0.0-1.0`, ensuring editors/tools
won't misinterpret the character.
- Around line 473-474: The docstring for the llm_provider is incorrect: update
the description to match the actual interface used by _try_llm_extract — instead
of extract(draft, final, classification) -> str, document that llm_provider
exposes a complete(prompt: str, max_tokens: int = ..., timeout: int = ...) ->
str (or similar synchronous completion API that accepts a prompt and optional
parameters and returns the model text); reference llm_provider and the
_try_llm_extract function in the behavioral_extractor module and ensure the
docstring mentions the expected parameters (prompt, max_tokens, timeout) and
return type (str).
- Around line 196-199: Zip calls pairing draft_sent_sets and final_sent_sets
(and the other zip usages involving draft_sents/final_sents) are currently
unchecked; change the zip(...) usages to zip(..., strict=True) to force length
equality at runtime. Update the list comprehension that builds added_sents
(using final_sents, final_sent_sets, draft_sent_sets and _sentence_overlap) and
the other places where zip is used (the other pairings referenced in this file)
to include strict=True so mismatched lengths raise immediately.
In `@tests/test_pipeline_e2e.py`:
- Around line 209-211: The test currently asserts a float equality on
m.context_weights["sales"] using ==; change this to use pytest.approx by
replacing the direct equality with assert m.context_weights["sales"] ==
pytest.approx(1.5) and ensure pytest is imported in the test module (add import
pytest at top if missing) so floating-point comparisons follow the project's
guideline.
- Around line 101-103: The list comprehension uses an ambiguous loop variable
name `l`; rename it to a descriptive name like `lesson` in the comprehension
that builds `process_lessons` (where you call fresh_brain._load_lessons()) so it
becomes [lesson for lesson in lessons if lesson.category == "PROCESS"], and
update any references to the loop variable (the comprehension itself and nearby
code/comments) to use `lesson` for clarity.
- Around line 228-236: The loop uses an ambiguous single-letter variable "l";
rename it to "lesson" throughout the loop that builds promoted (the for loop
iterating over lessons) and update all references where Lesson(...),
lesson.category, lesson.date, lesson.description are used so the logic is
unchanged (keep the condition checking lesson.category == "PROCESS" and
constructing Lesson(..., state=LessonState.RULE, confidence=0.90) for promoted).
Ensure the variable name change is applied consistently in that block to avoid
shadowing or unresolved names.
- Around line 251-258: The test test_same_correction_twice_same_session
currently only asserts r1 and r2 are not None; update it to assert concrete
expected values from fresh_brain.correct (inspect the return shape) — for
example assert r1 contains an identifier (e.g., "id" in r1) and that r1["id"] ==
r2["id"] if deduplication should return the same record, or assert r1["created"]
is True and r2["created"] is False (or assert a "deduplicated" flag) to
explicitly verify deduplication behavior when calling fresh_brain.correct with
the same SALES_CORRECTIONS[0] and session=95.
---
Duplicate comments:
In `@src/gradata/_context_compile.py`:
- Line 35: The separator regex used in the re.split call that assigns parts
(re.split(..., stem, maxsplit=1)) no longer matches the en-dash (–), breaking
splits like "Jane Doe – Acme"; update the pattern used in the parts =
re.split(...) expression to include the en-dash (either as the literal character
or \u2013) alongside the existing em-dash/hyphen variants so it matches `—`,
`–`, `--`, and ` - `, keeping behavior consistent with
src/gradata/_tag_taxonomy.py and src/gradata/_fact_extractor.py.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0075571e-9add-451b-a01f-9d2832ee5e6f
📒 Files selected for processing (3)
- src/gradata/_context_compile.py
- src/gradata/enhancements/behavioral_extractor.py
- tests/test_pipeline_e2e.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (`from __future__ import annotations` required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
- src/gradata/_context_compile.py
- src/gradata/enhancements/behavioral_extractor.py
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_pipeline_e2e.py
🪛 Ruff (0.15.9)
tests/test_pipeline_e2e.py
[error] 102-102: Ambiguous variable name: l
(E741)
[error] 229-229: Ambiguous variable name: l
(E741)
src/gradata/enhancements/behavioral_extractor.py
[warning] 56-56: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
[warning] 170-170: Too many return statements (14 > 6)
(PLR0911)
[warning] 170-170: Too many branches (15 > 12)
(PLR0912)
[warning] 198-198: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 242-242: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 281-281: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
[warning] 317-317: Too many return statements (19 > 6)
(PLR0911)
[warning] 317-317: Too many branches (18 > 12)
(PLR0912)
[warning] 317-317: Unused function argument: category
(ARG001)
[warning] 439-439: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (8)
src/gradata/enhancements/behavioral_extractor.py (5)
317-317: Unused `category` parameter.
The `category` parameter is declared but never used in the function body. This was flagged in a previous review.

1-31: LGTM - Module setup and imports.
Proper use of `from __future__ import annotations`, logging setup, and the TYPE_CHECKING guard for the `EditClassification` import.

64-96: LGTM - Detection vocabulary constants.
The frozensets and compiled regex patterns are well-organized and appropriate for the detection logic.

103-164: LGTM - Sentence-level helper functions.
Clean, pure functions with clear purposes. `_sentence_overlap` correctly returns a Jaccard index in [0, 1], and `_to_imperative` handles a good variety of prefix patterns.
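For reference, the Jaccard overlap is simple enough to sketch inline; this is an illustrative stand-in, not the module's actual `_sentence_overlap` body:

```python
def sentence_overlap(a: set[str], b: set[str]) -> float:
    """Jaccard index of two token sets: |A ∩ B| / |A ∪ B|, always in [0, 1]."""
    if not a and not b:
        return 0.0  # assumed convention for two empty sentences
    return len(a & b) / len(a | b)

# Two of four distinct tokens shared -> 0.5
assert sentence_overlap({"use", "casual", "tone"}, {"use", "formal", "tone"}) == 0.5
```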
419-445: LGTM - LLM extraction with proper error handling.
The function now correctly uses `llm_provider.complete()` with appropriate parameters and logs failures at warning level. The broad exception catch is acceptable here since we want to gracefully handle any LLM provider failure.

tests/test_pipeline_e2e.py (3)

22-28: Hardcoded Windows path still present.
Line 23 contains a hardcoded Windows-specific path `C:/Users/olive/SpritesWork/brain/cloud-only`. This was flagged in a previous review but remains unchanged. The fallback import works, but the hardcoded path will cause issues in CI and other environments.

45-84: LGTM - Test fixtures and helper function.
`SALES_CORRECTIONS` provides good test data, and `_simulate_session` properly extracts severity with a robust fallback chain matching the SDK's return structure.

282-304: LGTM - Pattern tracking test with specific assertions.
Good test coverage for the correction pattern tracking functionality. Assertions check specific values (`pattern_hash`, `distinct_sessions >= 3`, `weighted_score >= 3.0`).
```python
@dataclass
class ArchetypeMatch:
    archetype: Archetype
    confidence: float  # 0.0–1.0
    context: dict  # archetype-specific extracted data
```
🧹 Nitpick | 🔵 Trivial
Minor: EN DASH in comment.
Line 56 uses an EN DASH (–) instead of a hyphen-minus (-) in the comment # 0.0–1.0. While functionally harmless, it can cause issues with some tools and editors.
Proposed fix
```diff
 @dataclass
 class ArchetypeMatch:
     archetype: Archetype
-    confidence: float  # 0.0–1.0
+    confidence: float  # 0.0-1.0
     context: dict  # archetype-specific extracted data
```
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 56-56: Comment contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/behavioral_extractor.py` around lines 53 - 57, The
comment on the ArchetypeMatch dataclass for the confidence field uses an EN
DASH; update the comment in the ArchetypeMatch definition (the confidence: float
comment) to replace the EN DASH (`–`) with a standard hyphen-minus (`-`) so it
reads e.g. `# 0.0-1.0`, ensuring editors/tools won't misinterpret the character.
```python
draft_sent_sets = [set(s.lower().split()) for s in draft_sents]
final_sent_sets = [set(s.lower().split()) for s in final_sents]
added_sents = [s for s, ws in zip(final_sents, final_sent_sets)
               if not any(_sentence_overlap(ws, ds) > 0.5 for ds in draft_sent_sets)]
```
🧹 Nitpick | 🔵 Trivial
Consider adding strict=True to zip() calls for defensive coding.
Lines 198, 242, and 281 use zip() without the strict= parameter. While the lists are constructed together and guaranteed to have the same length, adding strict=True (Python 3.10+) would catch unexpected mismatches during future refactoring.
Example for line 198
```diff
-    added_sents = [s for s, ws in zip(final_sents, final_sent_sets)
+    added_sents = [s for s, ws in zip(final_sents, final_sent_sets, strict=True)
                    if not any(_sentence_overlap(ws, ds) > 0.5 for ds in draft_sent_sets)]
```
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 198-198: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/behavioral_extractor.py` around lines 196 - 199, Zip
calls pairing draft_sent_sets and final_sent_sets (and the other zip usages
involving draft_sents/final_sents) are currently unchecked; change the zip(...)
usages to zip(..., strict=True) to force length equality at runtime. Update the
list comprehension that builds added_sents (using final_sents, final_sent_sets,
draft_sent_sets and _sentence_overlap) and the other places where zip is used
(the other pairings referenced in this file) to include strict=True so
mismatched lengths raise immediately.
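A quick standalone demonstration of what `strict=True` buys (plain Python 3.10+, no project code assumed):

```python
# Without strict, zip() silently truncates to the shorter iterable:
assert list(zip([1, 2, 3], ["a", "b"])) == [(1, "a"), (2, "b")]

# With strict=True, a length mismatch fails loudly instead:
try:
    list(zip([1, 2, 3], ["a", "b"], strict=True))
except ValueError as exc:
    print(exc)  # "zip() argument 2 is shorter than argument 1"
```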
```
llm_provider: Optional LLM provider for refinement of low-confidence matches.
    Interface: llm_provider.extract(draft, final, classification) -> str
```
Docstring/implementation mismatch for llm_provider interface.
The docstring states:
llm_provider: Optional LLM provider for refinement of low-confidence matches.
Interface: llm_provider.extract(draft, final, classification) -> str
But the implementation in _try_llm_extract (line 435) calls llm_provider.complete(prompt, max_tokens=100, timeout=10). Update the docstring to reflect the actual interface.
Proposed fix
```diff
     category: Correction category (DRAFTING, PROCESS, etc.)
-    llm_provider: Optional LLM provider for refinement of low-confidence matches.
-        Interface: llm_provider.extract(draft, final, classification) -> str
+    llm_provider: Optional LLM provider for refinement of low-confidence matches.
+        Interface: llm_provider.complete(prompt, *, max_tokens, timeout) -> str
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
llm_provider: Optional LLM provider for refinement of low-confidence matches.
    Interface: llm_provider.complete(prompt, *, max_tokens, timeout) -> str
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/behavioral_extractor.py` around lines 473 - 474, The
docstring for the llm_provider is incorrect: update the description to match the
actual interface used by _try_llm_extract — instead of extract(draft, final,
classification) -> str, document that llm_provider exposes a complete(prompt:
str, max_tokens: int = ..., timeout: int = ...) -> str (or similar synchronous
completion API that accepts a prompt and optional parameters and returns the
model text); reference llm_provider and the _try_llm_extract function in the
behavioral_extractor module and ensure the docstring mentions the expected
parameters (prompt, max_tokens, timeout) and return type (str).
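To make the expected contract explicit, a hypothetical `typing.Protocol` for the provider might read as follows (the Protocol name and default values are illustrative, not part of the SDK):

```python
from typing import Protocol

class LLMProvider(Protocol):
    """Assumed shape of the provider that _try_llm_extract calls."""

    def complete(self, prompt: str, *, max_tokens: int = 100, timeout: int = 10) -> str:
        """Return the model's completion text for a single prompt."""
        ...
```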
```python
lessons = fresh_brain._load_lessons()
process_lessons = [l for l in lessons if l.category == "PROCESS"]
assert len(process_lessons) > 0, "Should have PROCESS lessons after 3 corrections"
```
🧹 Nitpick | 🔵 Trivial
Rename ambiguous variable l to lesson.
The variable name l is ambiguous (easily confused with 1 or I). Use a descriptive name like lesson for clarity.
Proposed fix
```diff
 lessons = fresh_brain._load_lessons()
-process_lessons = [l for l in lessons if l.category == "PROCESS"]
+process_lessons = [lesson for lesson in lessons if lesson.category == "PROCESS"]
 assert len(process_lessons) > 0, "Should have PROCESS lessons after 3 corrections"
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
lessons = fresh_brain._load_lessons()
process_lessons = [lesson for lesson in lessons if lesson.category == "PROCESS"]
assert len(process_lessons) > 0, "Should have PROCESS lessons after 3 corrections"
```
🧰 Tools
🪛 Ruff (0.15.9)
[error] 102-102: Ambiguous variable name: l
(E741)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_pipeline_e2e.py` around lines 101 - 103, The list comprehension
uses an ambiguous loop variable name `l`; rename it to a descriptive name like
`lesson` in the comprehension that builds `process_lessons` (where you call
fresh_brain._load_lessons()) so it becomes [lesson for lesson in lessons if
lesson.category == "PROCESS"], and update any references to the loop variable
(the comprehension itself and nearby code/comments) to use `lesson` for clarity.
```python
assert m.applies_when == ["task_type=sales", "session_type=sales"]
assert m.never_when == ["task_type=system"]
assert m.context_weights["sales"] == 1.5
```
Use pytest.approx for float comparison.
Line 211 compares a float value directly with ==. Per coding guidelines, floating-point comparisons should use pytest.approx to handle potential precision issues from serialization/deserialization.
Proposed fix
```diff
 assert m.applies_when == ["task_type=sales", "session_type=sales"]
 assert m.never_when == ["task_type=system"]
-assert m.context_weights["sales"] == 1.5
+assert m.context_weights["sales"] == pytest.approx(1.5)
```
As per coding guidelines: "tests/**: floating point comparisons use pytest.approx"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_pipeline_e2e.py` around lines 209 - 211, The test currently
asserts a float equality on m.context_weights["sales"] using ==; change this to
use pytest.approx by replacing the direct equality with assert
m.context_weights["sales"] == pytest.approx(1.5) and ensure pytest is imported
in the test module (add import pytest at top if missing) so floating-point
comparisons follow the project's guideline.
```python
promoted = []
for l in lessons:
    if l.category == "PROCESS":
        promoted.append(Lesson(
            date=l.date, state=LessonState.RULE, confidence=0.90,
            category=l.category, description=l.description,
        ))
    else:
        promoted.append(l)
```
🧹 Nitpick | 🔵 Trivial
Rename ambiguous variable l to lesson.
Same issue as line 102 - the variable name l should be renamed for clarity.
Proposed fix
```diff
 # Promote lessons to RULE (simulating what graduation does over many sessions)
 promoted = []
-for l in lessons:
-    if l.category == "PROCESS":
+for lesson in lessons:
+    if lesson.category == "PROCESS":
         promoted.append(Lesson(
-            date=l.date, state=LessonState.RULE, confidence=0.90,
-            category=l.category, description=l.description,
+            date=lesson.date, state=LessonState.RULE, confidence=0.90,
+            category=lesson.category, description=lesson.description,
         ))
     else:
-        promoted.append(l)
+        promoted.append(lesson)
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
promoted = []
for lesson in lessons:
    if lesson.category == "PROCESS":
        promoted.append(Lesson(
            date=lesson.date, state=LessonState.RULE, confidence=0.90,
            category=lesson.category, description=lesson.description,
        ))
    else:
        promoted.append(lesson)
```
🧰 Tools
🪛 Ruff (0.15.9)
[error] 229-229: Ambiguous variable name: l
(E741)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_pipeline_e2e.py` around lines 228 - 236, The loop uses an
ambiguous single-letter variable "l"; rename it to "lesson" throughout the loop
that builds promoted (the for loop iterating over lessons) and update all
references where Lesson(...), lesson.category, lesson.date, lesson.description
are used so the logic is unchanged (keep the condition checking lesson.category
== "PROCESS" and constructing Lesson(..., state=LessonState.RULE,
confidence=0.90) for promoted). Ensure the variable name change is applied
consistently in that block to avoid shadowing or unresolved names.
```python
def test_same_correction_twice_same_session(self, fresh_brain):
    corr = SALES_CORRECTIONS[0]
    r1 = fresh_brain.correct(draft=corr["draft"], final=corr["final"],
                             category=corr["category"], session=95)
    r2 = fresh_brain.correct(draft=corr["draft"], final=corr["final"],
                             category=corr["category"], session=95)
    assert r1 is not None
    assert r2 is not None
```
🧹 Nitpick | 🔵 Trivial
Strengthen assertions beyond truthiness checks.
The assertions only verify that results are not None. Per coding guidelines, assertions should check specific values. Consider adding assertions that verify expected deduplication behavior (e.g., both corrections succeeded, or a deduplication indicator is present).
Proposed improvement
```diff
 def test_same_correction_twice_same_session(self, fresh_brain):
     corr = SALES_CORRECTIONS[0]
     r1 = fresh_brain.correct(draft=corr["draft"], final=corr["final"],
                              category=corr["category"], session=95)
     r2 = fresh_brain.correct(draft=corr["draft"], final=corr["final"],
                              category=corr["category"], session=95)
-    assert r1 is not None
-    assert r2 is not None
+    assert r1 is not None, "First correction should succeed"
+    assert r2 is not None, "Duplicate correction should succeed (not crash)"
+    # Verify both have valid severity outcomes
+    assert r1.get("outcome") or r1.get("data", {}).get("severity")
+    assert r2.get("outcome") or r2.get("data", {}).get("severity")
```
As per coding guidelines: "tests/**: assertions check specific values not just truthiness"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_pipeline_e2e.py` around lines 251 - 258, The test
test_same_correction_twice_same_session currently only asserts r1 and r2 are not
None; update it to assert concrete expected values from fresh_brain.correct
(inspect the return shape) — for example assert r1 contains an identifier (e.g.,
"id" in r1) and that r1["id"] == r2["id"] if deduplication should return the
same record, or assert r1["created"] is True and r2["created"] is False (or
assert a "deduplicated" flag) to explicitly verify deduplication behavior when
calling fresh_brain.correct with the same SALES_CORRECTIONS[0] and session=95.
Merged github/main into feat/sdk-hook-port. Resolved conflicts in _core.py, behavioral_extractor.py, meta_rules_storage.py, test_pipeline_e2e.py. Kept main's meta-rule logic + feature branch's CodeRabbit fixes. 1775 tests pass. Co-Authored-By: Gradata <noreply@gradata.ai>
…y + behavioral extraction
- Remove hardcoded Windows path in test_pipeline_e2e.py, use env var only (P1)
- Wrap query_graduation_candidates in try/finally to prevent SQLite connection leak (P1)
- Restore en-dash (U+2013) in _context_compile.py prospect name splitting regex (P1)
- Fix tautological assertion in test_core_behavioral.py fallback test (Major)
- Fix docstring: llm_provider.extract() -> llm_provider.complete() (Major)
- Remove unused category parameter from generate_instruction() (Minor)
- Downgrade _try_llm_extract exception log from warning to debug (Minor)
- Use pytest.approx for float comparison in test_pipeline_e2e.py (Minor)

Co-Authored-By: Gradata <noreply@gradata.ai>
Summary
What changed
- behavioral_extractor.py
- meta_rules_storage.py
- _core.py
- test_pipeline_e2e.py
- test_convergence_gate.py
- test_core_behavioral.py

Test plan
Generated with Gradata
Greptile Summary
This PR introduces a sentence-level archetype-based behavioral extraction pipeline as the primary extraction path, replacing the previous word-diff approach. It adds 12 correction archetypes (`behavioral_extractor.py`), cross-session pattern tracking with severity-weighted graduation scoring (`meta_rules_storage.py`), wires the new extractor into the core correction flow (`_core.py`), and backs everything with 12 new e2e pipeline tests.

Key changes:

- `behavioral_extractor.py`: new 497-line module implementing 12 archetypes, a template generator, a quality gate, and an LLM refinement hook
- `meta_rules_storage.py`: new `correction_patterns` table with batch upsert and a graduation-candidate query
- `_core.py`: switched to `extract_instruction()` as the primary path, with a keyword-template fallback when it returns `None`
- `_context_compile.py`: regex separator pattern refined to `(?:\s*—\s*|\s*--\s*|\s+-\s+)` (the en-dash drop from the previous review remains unaddressed)
- `test_pipeline_e2e.py`: 12 new e2e tests, including cloud-gated `@_requires_cloud` tests and a pattern-tracking roundtrip test
Safe to merge after fixing the llm_provider docstring mismatch; all runtime paths use the correct .complete() interface today.
The core logic is solid — archetype detection, template generation, quality gate, and the _core.py wiring are all correct. The previously flagged connection leak and hardcoded path are still open but non-blocking. The one new P1 is a docstring documenting the wrong interface method for llm_provider, which causes AttributeError for future custom provider authors — a targeted one-line fix. All 1699 tests pass.
src/gradata/enhancements/behavioral_extractor.py (docstring interface mismatch), src/gradata/_context_compile.py (still missing en-dash in separator regex, missing from future import annotations)
Vulnerabilities
No security concerns identified. The LLM prompt built in `_try_llm_extract` embeds `draft` and `final` text directly, but since this is an internal SDK module and both values come from authenticated user corrections (not external untrusted input), there is no injection risk in the current usage path.

Important Files Changed
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["brain.correct(draft, final)"] --> B{Category converged?}
    B -- Yes --> C[Use primary.description]
    B -- No --> D["extract_instruction(draft, final, classification)"]
    D --> E[detect_archetype]
    E --> F[generate_instruction via template]
    F --> G{_is_actionable?}
    G -- Yes, confidence >= 0.60 --> H[Return instruction]
    G -- Yes, confidence < 0.60 --> I{llm_provider connected?}
    I -- Yes --> J[_try_llm_extract → .complete]
    I -- No --> H
    J -- Refined --> K[Return refined]
    J -- Failed --> H
    G -- No --> L{llm_provider connected?}
    L -- Yes --> J
    L -- No --> M[Generic category fallback]
    H --> N{behavioral_desc returned?}
    M --> N
    N -- Yes --> O[Use behavioral_desc as lesson description]
    N -- No --> P["Fallback: extract_behavioral_instruction (keyword templates)"]
    P --> Q[Store lesson]
    O --> Q
```

Prompt To Fix All With AI
Reviews (5): Last reviewed commit: "fix: apply CodeRabbit auto-fixes"
Context used:
Rule 1: Never use print() ... (source)