feat(hooks): opt-out env kill switches for 6 SDK hooks + audit fixes #133
Local SQLite and cloud Supabase schemas diverged (wide `tenant_id` + `data_json` vs narrow `brain_id` + `data` jsonb, plus table rename `correction_patterns` -> `corrections`). Added `_transform_row` per-table mapper with deterministic uuid5 ids so repeat pushes upsert cleanly. `_scrub` strips NUL bytes and lone UTF-16 surrogates that Postgres JSONB rejects. `_post` dedupes within each batch, honors `_TABLE_REMAP`, and chunks large pushes to avoid PostgREST's opaque "Empty or invalid json" body-limit errors. `GRADATA_SUPABASE_URL` / `GRADATA_SUPABASE_SERVICE_KEY` now work as aliases so one .env serves both backend and SDK. Co-Authored-By: Gradata <noreply@gradata.ai>
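A minimal sketch of the two ideas above — deterministic uuid5 ids so repeat pushes upsert, and scrubbing the characters Postgres JSONB rejects. The namespace constant and function names are illustrative, not the SDK's actual `_transform_row` / `_scrub` internals:

```python
import uuid

_NS = uuid.UUID("00000000-0000-0000-0000-000000000000")  # hypothetical namespace

def make_row_id(table: str, brain_id: str, local_pk: str) -> str:
    # uuid5 is a pure function of its inputs, so the same local row
    # always maps to the same cloud id across repeated pushes.
    return str(uuid.uuid5(_NS, f"{table}:{brain_id}:{local_pk}"))

def scrub(value):
    # Strip NUL bytes and lone UTF-16 surrogates — both rejected by JSONB.
    if isinstance(value, str):
        value = value.replace("\x00", "")
        return "".join(ch for ch in value if not 0xD800 <= ord(ch) <= 0xDFFF)
    if isinstance(value, dict):
        return {k: scrub(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value
```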
…provider synth
Phase 1 of the learning-pipeline revamp. Rule graduation now flows through the canonical _graduation.graduate() path (strict > for INSTINCT->PATTERN, >= for PATTERN->RULE) instead of the inline duplicate in rule_pipeline. The injection hook reads a persistent brain_prompt.md gated by an AUTO-GENERATED header, regenerated only at session_close after the pipeline fires. LLM synthesis gets a two-provider path: the anthropic SDK (ANTHROPIC_API_KEY) with a claude CLI fallback (Max-plan OAuth), so users without an exportable key still get synthesis. The meta-rule deterministic fallback now warns loudly instead of silently discarding. Drops five env-flag gates in favour of file-based signals.
Co-Authored-By: Gradata <noreply@gradata.ai>
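A hedged sketch of what a two-provider path like this can look like. The model id, error handling, and function shape are assumptions, not the SDK's actual rule_synthesizer code; `claude -p` is the CLI invocation named elsewhere in this PR:

```python
import os
import shutil
import subprocess

def synthesize(prompt: str) -> str | None:
    # Provider 1: Anthropic SDK, if the user has an exportable key.
    if os.environ.get("ANTHROPIC_API_KEY"):
        try:
            import anthropic
            client = anthropic.Anthropic()
            msg = client.messages.create(
                model="claude-sonnet-4-5",  # illustrative model id
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
            return msg.content[0].text
        except Exception:
            pass  # fall through to the CLI path
    # Provider 2: claude CLI (Max-plan OAuth), no exportable key required.
    if shutil.which("claude"):
        out = subprocess.run(["claude", "-p", prompt],
                             capture_output=True, text=True)
        if out.returncode == 0:
            return out.stdout
    return None  # no provider available; caller degrades gracefully
```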
Adds --cloud / --no-cloud flags to the doctor CLI command and the underlying diagnose() function. Flips the default cloud endpoint to api.gradata.ai/api/v1. Covers new behaviour with test_doctor_cloud.py (all passing). Co-Authored-By: Gradata <noreply@gradata.ai>
Regex coverage was brittle to shorthand: real corrections like
"Why r you not asking" and "Why flag.. we dont skip" slipped the
\bwhy (did|would|are) you\b pattern and never became IMPLICIT_FEEDBACK
events. That silently breaks Gradata's core promise ("learn from any
correction").
Adds:
- negation: dont/cant/shouldnt (no-apostrophe variants), never
- reminder: "again" marker, "dont forget"
- challenge: "why r u", "why not/r/are/is/does", "why word..",
"how come", "you missed/forgot/failed/didnt"
All 8 target phrases now detect. 25 existing implicit-feedback tests
remain green.
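Illustrative patterns — not the SDK's exact list — showing how the shorthand variants above can be folded into the existing regexes:

```python
import re

CHALLENGE = re.compile(
    r"\bwhy (r|are|did|would) (you|u)\b"
    r"|\bwhy (not|is|does)\b"
    r"|\bhow come\b"
    r"|\byou (missed|forgot|failed|didn'?t)\b",
    re.IGNORECASE,
)
NEGATION = re.compile(r"\b(don'?t|can'?t|shouldn'?t|never)\b", re.IGNORECASE)
REMINDER = re.compile(r"\bagain\b|\bdon'?t forget\b", re.IGNORECASE)

# The two real corrections quoted above now match.
assert CHALLENGE.search("Why r you not asking")
assert NEGATION.search("Why flag.. we dont skip")
```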
Co-Authored-By: Gradata <noreply@gradata.ai>
14 new tests pinning the regex expansion from 5a6da45. Covers real corrections observed this session ("Why r you not asking council", "Why flag.. we don't skip we do work") plus shorthand cases (dont / cant / again / you missed / how come). Dual-signal cases assert both types detect. Full suite: 37 passed, 1 pre-existing skip. Co-Authored-By: Gradata <noreply@gradata.ai>
Five post-launch metrics with precise definitions (activation, D7 retention, time-to-first-graduation, free->Pro conversion, correction-rate decay). Numeric triggers: pivot <20% activation + flat decay at D30; kill <100 installs at D60; scale >1K installs + >=5% conversion at D90. Monday 30-min retro agenda. Source: Card 8 of the pre-launch gap analysis. Co-Authored-By: Gradata <noreply@gradata.ai>
The source-provenance docstring referenced "cloud-side LLM synthesis" which is stale since the graduation-cloud-gate was removed. Synthesis runs on the user's machine via rule_synthesizer.py's two-provider path (Anthropic SDK with user's key, or Claude Code Max CLI OAuth). Co-Authored-By: Gradata <noreply@gradata.ai>
Graduation and meta-rule LLM synthesis run entirely locally as of a few sessions ago (rule_synthesizer.py uses user's own Anthropic key or Claude Code Max CLI OAuth). The Pro-tier inclusion list incorrectly still claimed "cloud runs better graduation engine" and implied a cloud-enhanced sqlite-vec path. Rewrite the inclusion list + philosophy paragraph to match reality: free is functionally complete; Pro is visualization, history, export, and the future community corpus. NOTE: this file is listed in .gitignore per the earlier "untrack private files" cleanup. Force-added at request. Co-Authored-By: Gradata <noreply@gradata.ai>
Test was checking the pre-transform local key name. _cloud_sync._transform_row correctly emits brain_id (cloud schema) from tenant_id (local schema); the assertion was stale. Co-Authored-By: Gradata <noreply@gradata.ai>
Previously nothing wrote to lesson_applications — the table existed
(onboard.py), was size-checked (_validator.py), and synced to cloud
(_cloud_sync.py), but no code ever inserted a row. The compound-quality
story had no evidence: rules claimed to fire with no receipt.
Now:
- inject_brain_rules writes one PENDING row per injected rule (cluster
members included), storing {category, description, task} in context so
session_close can attribute outcomes back to specific rules.
- session_close resolves PENDING rows at end-of-waterfall:
REJECTED if any CORRECTION/IMPLICIT_FEEDBACK/RULE_FAILURE in the
session shares the lesson's category (or description substring).
CONFIRMED otherwise (rule survived the session).
Both paths are best-effort — DB missing, schema drift, or IO errors
degrade silently rather than blocking injection or session close.
Unblocks the Card 6 MVP day-14 metric: "did a graduated rule actually
fire and survive?" — the answer now has a row-level audit trail.
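A minimal sketch of the receipt flow, assuming hypothetical column names (the real schema lives in onboard.py):

```python
import json
import sqlite3

def record_pending(conn: sqlite3.Connection, rule_id: str, category: str,
                   description: str, task: str, session_id: str) -> None:
    # One PENDING row per injected rule — the "receipt" that it fired.
    ctx = json.dumps({"category": category, "description": description,
                      "task": task})
    conn.execute(
        "INSERT INTO lesson_applications (rule_id, session_id, outcome, context)"
        " VALUES (?, ?, 'PENDING', ?)",
        (rule_id, session_id, ctx),
    )

def resolve_pending(conn: sqlite3.Connection, session_id: str,
                    rejected_categories: set[str]) -> None:
    # session_close: REJECTED if a correction in this session shares the
    # lesson's category, CONFIRMED otherwise (the rule survived).
    rows = conn.execute(
        "SELECT id, context FROM lesson_applications"
        " WHERE session_id = ? AND outcome = 'PENDING'",
        (session_id,),
    ).fetchall()
    for row_id, ctx_json in rows:
        category = json.loads(ctx_json).get("category")
        outcome = "REJECTED" if category in rejected_categories else "CONFIRMED"
        conn.execute("UPDATE lesson_applications SET outcome = ? WHERE id = ?",
                     (outcome, row_id))
```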
Co-Authored-By: Gradata <noreply@gradata.ai>
Sweeps the remaining docs that still claimed cloud gated any part of the learning loop. Actual architecture (as of the graduation-local pivot): the local SDK owns correction capture, graduation, meta-rule clustering and LLM synthesis (via the user's Anthropic key or Claude Code Max OAuth), rule-to-hook promotion, and manifest computation. Cloud owns dashboard/visualization, cross-device sync, team brains, managed backups, and future opt-in corpus donation.
Files touched:
- docs/cloud/overview.md — capability matrix, architecture diagram, use-when guidance.
- docs/architecture/cloud-monolith-v2.md — cloud-side workload framing.
- docs/architecture/multi-tenant-future-proofing.md — proprietary boundary, verification flow.
- docs/concepts/meta-rules.md — synthesis is local, not cloud-gated.
- docs/cloud/dashboard.md — dashboard visualizes local output, does not re-synthesize.
README.md was already accurate; no changes there.
Co-Authored-By: Gradata <noreply@gradata.ai>
Silent-failure-hunter CRITICAL-1:
- inject_brain_rules: wrap lesson_applications connection in try/finally
and escalate OperationalError to warning (missing-table surfaces).
Silent-failure-hunter CRITICAL-2:
- _cloud_sync.push: per-row try/except on _transform_row so one bad row
no longer propagates and kills the whole push batch.
Leak scan blockers:
- Delete docs/pre-launch-plan.md and docs/gradata-marketing-strategy.md
from the public repo; add both to .gitignore. These contain kill
triggers, pricing, and PII that belong in the private brain vault only.
Code-reviewer BLOCKER-3:
- _doctor._check_vector_store returns status="ok" with FTS5 detail in
the detail field, restoring the documented status vocabulary
({ok, warn, fail, skip, missing, error}).
Test-coverage gaps:
- Add tests/test_rule_synthesizer.py — both providers absent, empty
input, cache hit, CLI fallback on SDK raise, malformed output.
- Add IMPLICIT_FEEDBACK → REJECTED integration test to
test_lesson_applications.py.
Verification: full suite 3802 pass, 22 skip, 2 xfailed.
Gradata is fully local-first now. Cloud-gate stubs and "requires cloud" skip markers were legacy artifacts from an earlier architecture where discovery/synthesis lived server-side. This commit finishes the port:
- meta_rules.discover_meta_rules + merge_into_meta run locally: category grouping + greedy semantic-similarity clustering, zombie filter on RULE-state lessons below 0.90, decay after 20 sessions, count/(count+3) confidence smoothing.
- Drop @_requires_cloud markers from test_bug_fixes, test_llm_synthesizer, test_meta_rule_generalization, test_multi_brain_simulation, test_pipeline_e2e. These tests now exercise the local impl directly.
- Retire the api_key-kwarg-on-merge_into_meta path (session-close rule_synthesizer drives LLM distillation now).
- Update fixtures to realistic prose so they survive the noise filter that rejects "cut:/added:" edit-distance summaries.
- Bump the test_meta_rules confidence assertion to the smoothed formula.
- Add docs/LEGACY_CLEANUP.md tracking the remaining cloud-gate vestiges (deprecated adapter shims, cloud docs, stale module docstrings).
Suite: 3809 passed, 14 skipped, 2 xfailed.
Co-Authored-By: Gradata <noreply@gradata.ai>
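For reference, the count/(count+3) smoothing mentioned above behaves like this — confidence approaches 1.0 asymptotically, so a lesson seen once scores 0.25 rather than a brittle 1.0:

```python
def smoothed_confidence(count: int) -> float:
    # count/(count+3): slow to trust, never fully certain.
    return count / (count + 3)

assert smoothed_confidence(1) == 0.25
assert smoothed_confidence(9) == 0.75
assert round(smoothed_confidence(27), 2) == 0.9  # clears the 0.90 zombie filter
```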
…xtures
discover_meta_rules is implemented now (local-first). The
if not metas: pytest.skip('discover_meta_rules not yet implemented')
guards were vestiges from the cloud-only era — convert to real asserts.
Also bump 0.88-confidence RULE-state fixtures to 0.90 so they survive
the zombie filter (RULE at <0.90 is treated as a decayed rule).
Suite: 3813 passed, 10 skipped, 2 xfailed.
Remaining skips are all legit:
- test_file_lock.py (2): Windows vs POSIX platform gates
- test_integration_workflow.py (5): require ANTHROPIC/OPENAI keys, cost money
- test_mem0_adapter.py::test_real_mem0_roundtrip: requires MEM0_API_KEY
- test_meta_rules.py::test_with_real_data: requires GRADATA_LESSONS_PATH env
xfails (2) are tracked for v0.7 reconciliation in test docstring.
Co-Authored-By: Gradata <noreply@gradata.ai>
Found while clearing remaining skipped/xfailed tests.
Bug: agent_graduation._update_lesson_confidence had confidence = max(0.0, confidence - MISFIRE_PENALTY), but MISFIRE_PENALTY = -0.15 (negative). Subtracting a negative added confidence on rejection. Test test_rejection_decreases_confidence was xfail'd with 'API drift, reconcile in v0.7' — it was a real bug.
Fix: align with the canonical _confidence.py usage (confidence + MISFIRE_PENALTY).
Other cleanups in the same pass:
- test_agent_graduation: drop both xfail markers. test_lesson_graduates_to_pattern was also wrong on its own terms — with ACCEPTANCE_BONUS=0.20 the lesson graduates straight to RULE (stronger than PATTERN). Accept either state.
- test_integration_workflow: delete the stale module-level skipif guarding 5 tests behind ANTHROPIC/OPENAI keys they never actually use. They only exercise local brain.correct/convergence/efficiency — no network.
- test_mem0_adapter: delete test_real_mem0_roundtrip (a live-API smoke test already covered by the 20+ fake-client tests in the same file).
- test_meta_rules: delete test_with_real_data — a dev-time exploration script with zero asserts, requiring the GRADATA_LESSONS_PATH env var.
Suite: 3820 passed, 3 skipped, 0 xfailed, 0 failed. The remaining 3 skips are test_file_lock.py POSIX paths that require fcntl, which does not exist on Windows. Complementary Windows paths skip on Linux — running on each platform covers all 4. Cannot be eliminated.
From 22 skipped + 2 xfailed to 3 skipped + 0 xfailed.
Co-Authored-By: Gradata <noreply@gradata.ai>
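A toy reproduction of the sign bug and its fix — the constant value is from the commit message; the variable names are illustrative:

```python
MISFIRE_PENALTY = -0.15  # negative, matching the canonical _confidence.py constants

confidence = 0.50
buggy = max(0.0, confidence - MISFIRE_PENALTY)  # subtracting a negative adds 0.15
fixed = max(0.0, confidence + MISFIRE_PENALTY)  # adding the signed constant subtracts

assert round(buggy, 2) == 0.65  # bug: rejection *increased* confidence
assert round(fixed, 2) == 0.35  # fix: rejection decreases it, as intended
```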
CRITICAL fixes:
- C1: rewrite the meta_rules.py module docstring. It still said 'require Gradata Cloud' / 'no-ops in the open-source build', which directly contradicted the local-first implementation in the same file. Now describes the real algorithm. Closes LEGACY_CLEANUP item #3.
- C2: drop the owner-name string from _NOISE_PATTERNS. The other entries are format-based (cut:/added:/content change) and filter just fine.
- C3: generalize the name-prefix strip regex in _build_principle from the hardcoded 'Oliver:' to a generic 'Name:' pattern.
HIGH fixes:
- H1: update the _update_lesson_confidence docstring to stop quoting the old -0.25 number and instead point at the canonical constants.
- H2: _apply_decay no longer mutates MetaRule in place — it uses dataclasses.replace() so refresh_meta_rules' persisted inputs aren't silently modified.
- H3: add a comment explaining why the call-site threshold=0.20 is intentionally looser than _cluster_by_similarity's 0.35 default (the category pre-filter handles most noise, and recall matters more here).
Suite clean on touched areas.
Co-Authored-By: Gradata <noreply@gradata.ai>
…tocol
Closes #127: HandoffWatchdog fires a preemptive resume-doc at 0.65 pressure (GRADATA_HANDOFF_THRESHOLD override), writes a compact Markdown handoff, and emits a handoff.triggered event so auto-compaction isn't the first signal the agent is out of budget.
Closes #128: MultimodalEmbedder Protocol + MultimodalInput validation + TextOnlyEmbedder default + embed_any router. The user supplies their own multimodal provider (Gemini, Voyage, CLIP); Gradata never hosts the endpoint. Falls back to text-only when no multimodal embedder is configured.
Both are provider-agnostic, local-first, and covered by unit tests (18 handoff + 20 embedder). Full suite: 3853 passed, 3 skipped.
Co-Authored-By: Gradata <noreply@gradata.ai>
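A minimal sketch of the pressure math described above, assuming the [0, 1] clamping shown in the review walkthrough; everything beyond the GRADATA_HANDOFF_THRESHOLD variable is illustrative:

```python
import os

def measure_pressure(tokens_used: int, tokens_max: int) -> float:
    # Clamp to [0, 1] so an over-budget session never reports pressure > 1.
    if tokens_max <= 0:
        return 0.0
    return max(0.0, min(1.0, tokens_used / tokens_max))

# 0.65 default, overridable via env, per the commit message.
THRESHOLD = float(os.environ.get("GRADATA_HANDOFF_THRESHOLD", "0.65"))

assert measure_pressure(130_000, 200_000) == 0.65
assert measure_pressure(250_000, 200_000) == 1.0  # over budget still clamps
```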
- HandoffWatchdog._fired is now init=False/repr=False/compare=False so the guard cannot be bypassed via the constructor and doesn't leak into equality.
- The _hash_vector zero-norm branch now returns a zero vector instead of an unnormalised one, honouring the Protocol's normalisation contract.
- Add a test covering the handoff.triggered event-emission path so an _events.emit signature drift can't silently regress.
Co-Authored-By: Gradata <noreply@gradata.ai>
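A sketch of that field configuration on a plain dataclass — the init/repr/compare flags are from the commit; the class shape around them is illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class HandoffWatchdog:
    threshold: float = 0.65
    # Guard flag: not constructor-settable, hidden from repr, and excluded
    # from equality comparisons.
    _fired: bool = field(default=False, init=False, repr=False, compare=False)

w = HandoffWatchdog()
w._fired = True
assert w == HandoffWatchdog()  # compare=False keeps equality stable
# HandoffWatchdog(_fired=True) would raise TypeError: init=False blocks the bypass.
```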
test_capture_rule_failure.py reached out of Gradata/ via parents[4] to load .claude/hooks/reflect/scripts/capture_learning.py — a private Claude Code hook that is not part of the public SDK. The test would skip on every machine except the author's worktree, adding a phantom "skipped" count in CI for every downstream user. If we want coverage for the matcher, rewrite it as a pure unit test against a function exposed by the SDK, or keep it on the private side next to the hook it exercises. Suite after removal: 3854 passed, 2 skipped (the two legitimate POSIX tests in test_file_lock.py that run on Linux CI). Co-Authored-By: Gradata <noreply@gradata.ai>
Wires the watchdog to the next agent's context: when HandoffWatchdog
fires and writes a handoff doc, the new SessionStart hook loads the
most recent unconsumed *.handoff.md from {brain_dir}/handoffs/, wraps
it in <handoff>...</handoff>, and returns it to Claude Code. The agent
sees the handoff before brain-rules (primacy) and picks up where the
prior agent left off.
After injection the file moves to handoffs/consumed/ so the next
session won't re-inject it. Oversized bodies are truncated
(GRADATA_HANDOFF_MAX_CHARS, default 4000). Embedded </handoff> literals
are escaped so a hostile body cannot close our wrapper early.
Helpers added to gradata.contrib.patterns.handoff:
- default_handoff_dir(brain_dir) → Path (canonical location)
- pick_latest_unconsumed(dir) → Path | None
- consume_handoff(path) → moves to consumed/ subdir
Tests: +16 hook tests + 9 helper tests = 41 total on handoff+hook.
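Hypothetical usage of the three helpers, with the signatures as listed above (the brain_dir location is illustrative):

```python
from pathlib import Path

from gradata.contrib.patterns.handoff import (
    consume_handoff,
    default_handoff_dir,
    pick_latest_unconsumed,
)

brain_dir = Path.home() / ".gradata"          # illustrative location
handoff_dir = default_handoff_dir(brain_dir)  # {brain_dir}/handoffs/
doc = pick_latest_unconsumed(handoff_dir)     # Path | None
if doc is not None:
    body = doc.read_text()   # hook wraps this in <handoff>...</handoff>
    consume_handoff(doc)     # moves it to handoffs/consumed/
```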
Co-Authored-By: Gradata <noreply@gradata.ai>
Handoff now carries the timestamp of the rules the prior agent was operating under. On next SessionStart, inject_handoff writes a .handoff_active.json sentinel. inject_brain_rules reads it and, when lessons.md has not changed since the snapshot, suppresses the ranked <brain-rules> block — the handoff already carries that continuity. Mandatory directives, disposition, meta-rules, and the brain_prompt short-circuit still fire; only the ranked block is skipped. Gated by GRADATA_HANDOFF_RULES_DELTA=1 (default on). Co-Authored-By: Gradata <noreply@gradata.ai>
Sub-agent spawns were re-injecting rules already present in the parent session's context — measured ~500-2500 wasted tokens per multi-agent workflow. agent_precontext now reads brain_dir/.last_injection.json (written by inject_brain_rules on SessionStart) and skips any rule whose full_id appears in the parent manifest. Gated by GRADATA_SUBAGENT_DEDUP=1 (default on). Silent on missing manifest — falls back to full injection. Matches the feature-flag pattern used by the handoff-delta optimization. Co-Authored-By: Gradata <noreply@gradata.ai>
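A sketch of that dedup gate. The env-var name, manifest path, and missing-manifest fallback are from the commit; the manifest key (full_ids) and function shape are hypothetical:

```python
import json
import os
from pathlib import Path

def dedup_rules(rules, brain_dir: Path):
    # Opt-out flag, default on, matching the handoff-delta pattern.
    if os.environ.get("GRADATA_SUBAGENT_DEDUP", "1") != "1":
        return rules
    manifest_path = brain_dir / ".last_injection.json"
    try:
        manifest = json.loads(manifest_path.read_text())
    except (OSError, ValueError):
        return rules  # silent fallback to full injection
    seen = set(manifest.get("full_ids", []))  # hypothetical manifest key
    return [r for r in rules if r.full_id not in seen]
```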
brain_prompt.md had no size cap and grew unconstrained as the lesson corpus matured, costing 500-3000 tokens per session on the primary injection path. Add GRADATA_MAX_BRAIN_PROMPT_CHARS (default 4000) with truncation marker, matching the inject_handoff pattern. Co-Authored-By: Gradata <noreply@gradata.ai>
context_inject fires on every UserPromptSubmit and returned FTS snippets that frequently overlapped with rules already in the <brain-rules> block — ~200-500 wasted tokens per prompt. Drops any snippet with >70% Jaccard token overlap against an injected rule description. Reads brain_dir/.last_injection.json for the comparison corpus. Gated by GRADATA_CONTEXT_DEDUP=1 with threshold override via GRADATA_CONTEXT_DEDUP_THRESHOLD. Co-Authored-By: Gradata <noreply@gradata.ai>
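A minimal sketch of the Jaccard gate; the default threshold encoding and function names are illustrative:

```python
import os

def jaccard(a: str, b: str) -> float:
    # Token-set Jaccard: |intersection| / |union|.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

THRESHOLD = float(os.environ.get("GRADATA_CONTEXT_DEDUP_THRESHOLD", "0.7"))

def keep_snippet(snippet: str, injected_rules: list[str]) -> bool:
    # Drop FTS snippets that mostly restate an already-injected rule.
    return all(jaccard(snippet, rule) <= THRESHOLD for rule in injected_rules)
```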
_emit_event ran unconditionally before the 'if not ranked: return' guard, writing a JIT_INJECTION entry for every UserPromptSubmit even when zero rules matched. Most prompts are zero-match, so this was the dominant source of events.jsonl write amplification and hot-path I/O overhead. Moved the emit after the empty-guard so only successful injections emit — matches the success-only pattern in inject_handoff. Co-Authored-By: Gradata <noreply@gradata.ai>
…tart hooks
Projects with a superset JS replacement (e.g. the Sprites overlay) can now disable the Python SDK hook without patching SDK source. Default is on — setting the env var to "0" skips the hook and returns None.
Vars added (default "1"):
- GRADATA_BRAIN_MAINTAIN — Stop, brain_maintain.py
- GRADATA_SESSION_PERSIST — Stop, session_persist.py
- GRADATA_SECRET_SCAN — PreToolUse, secret_scan.py
- GRADATA_CONFIG_PROTECTION — PreToolUse, config_protection.py
- GRADATA_DUPLICATE_GUARD — PreToolUse, duplicate_guard.py
- GRADATA_CONFIG_VALIDATE — SessionStart, config_validate.py
secret_scan additionally emits a stderr warning when disabled — it is the sole line of defense against credential commits, so a silent opt-out on a misconfigured project is too risky.
Hook-overlap audit 2026-04-21 (.tmp/hook-overlap-audit-2026-04-21.md): items 10-14 + 17. Eliminates ~8-20s per Stop, ~200-400 tok per edit, ~1500 tok per session of duplicate work when a JS superset is active.
Tests: 3908 passed, 2 skipped (baseline 3828/2, +80 from unrelated).
Co-Authored-By: Gradata <noreply@gradata.ai>
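A sketch of the opt-out pattern these vars share, including the secret_scan stderr warning; the helper and hook names here are illustrative:

```python
import os
import sys

def _enabled(var: str) -> bool:
    # Opt-out semantics: any value except "0" keeps the hook active.
    return os.environ.get(var, "1") != "0"

def secret_scan_hook(payload):
    if not _enabled("GRADATA_SECRET_SCAN"):
        # Sole credential guard — warn loudly rather than vanish silently.
        print("gradata: secret_scan disabled via GRADATA_SECRET_SCAN=0",
              file=sys.stderr)
        return None  # hook skipped
    ...  # normal scan path
```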
Walkthrough

This PR shifts Gradata's architecture from cloud-centric to local-first by moving core learning-loop operations (graduation, meta-rule synthesis, rule-to-hook promotion) into the SDK, repositioning cloud as a visualization and sharing layer. It introduces local meta-rule discovery, a handoff mechanism for context-pressure handling, rule synthesis caching, RAG embedder infrastructure, enhanced cloud sync with row transformation, and numerous hook enhancements with configurable kill switches.
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant SDK as SDK (Local)
participant Cache as Rule Synthesis Cache
participant Anthropic as Anthropic SDK
participant Claude as Claude CLI
participant LLM as LLM Provider
SDK->>SDK: Aggregate rules (mandatory, clustered, meta, disposition)
activate SDK
SDK->>Cache: Check deterministic cache key
Cache-->>SDK: Cache hit? Return cached block
alt Cache Miss
SDK->>Anthropic: Try ANTHROPIC_API_KEY path
alt SDK Available
Anthropic->>LLM: POST to claude-opus-4-7
LLM-->>Anthropic: <brain-wisdom>...</brain-wisdom>
Anthropic-->>SDK: Return synthesized block
else SDK Unavailable
SDK->>Claude: Fallback to 'claude -p' CLI
Claude->>LLM: CLI invocation --model ...
LLM-->>Claude: Output with <brain-wisdom>
Claude-->>SDK: Return extracted block
end
SDK->>Cache: Write cache (best-effort)
end
SDK-->>SDK: Return brain-wisdom block or None on failure
deactivate SDK
```

```mermaid
sequenceDiagram
participant App as Application
participant Watchdog as HandoffWatchdog
participant Synth as Synthesizer Callable
participant FS as File System
participant Events as Events System
App->>Watchdog: measure_pressure(tokens_used, tokens_max)
Watchdog-->>App: pressure [0,1] (clamped)
App->>Watchdog: check(tokens_used, tokens_max)
activate Watchdog
Watchdog->>Watchdog: Compute pressure
alt Pressure >= Threshold & Not Yet Fired
Watchdog->>Synth: invoke synthesizer()
Synth-->>Watchdog: HandoffDoc
Watchdog->>FS: Write handoff_dir/[task_id].[agent].[ts].handoff.md
FS-->>Watchdog: File written
Watchdog->>Events: emit("handoff.triggered")
Events-->>Watchdog: Event sent (or skipped)
Watchdog->>Watchdog: Mark _fired = True
Watchdog-->>App: Return HandoffDoc
else Below Threshold or Already Fired
Watchdog-->>App: Return None
end
deactivate Watchdog
App->>Watchdog: reset()
Watchdog->>Watchdog: Set _fired = False
```

```mermaid
sequenceDiagram
participant SDK as SDK Row Buffer
participant Transform as Row Transformer
participant Dedup as Deduplication
participant Scrub as Payload Sanitizer
participant HTTP as HTTP Batch Post
participant Cloud as Cloud Table
SDK->>Transform: For each SQLite row
activate Transform
Transform->>Transform: Coerce types (e.g., session to int|None)
Transform->>Transform: Parse JSON text columns
Transform->>Transform: Pack extra fields → data
Transform->>Transform: Generate deterministic UUID
Transform-->>SDK: Transformed row
deactivate Transform
SDK->>Dedup: Collect transformed rows
Dedup->>Dedup: Group by id, keep first of each
Dedup-->>SDK: Deduplicated batch
SDK->>Scrub: Sanitize for JSONB
Scrub->>Scrub: Remove NUL bytes recursively
Scrub-->>SDK: Sanitized payload
SDK->>HTTP: POST /[remapped_table] batch
HTTP->>Cloud: Send deduplicated, sanitized payload
Cloud-->>HTTP: 200 OK (count)
HTTP-->>SDK: Return accepted row count
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
#133 added opt-out env vars (GRADATA_SECRET_SCAN=0, _CONFIG_PROTECTION=0, _SESSION_PERSIST=0, etc.) that disable the corresponding hook. Dev shells often leave these set, which then flips 10 hook safety/intelligence tests from green to failing locally even though the code is correct. Session-scoped autouse fixture pops the seven kill-switches for the whole test session and restores them on teardown. Co-Authored-By: Gradata <noreply@gradata.ai>
Watchdog gap: the repo only had sdk-publish.yml (tag-triggered), so PRs shipped without pytest ever running. #133's hermeticity bug slipped through because CodeRabbit reviews code but doesn't execute tests. - Matrix: Python 3.11 + 3.12 (matches pyproject `requires-python = >=3.11`) - Scope: PRs touching Gradata/** and pushes to main - No lint yet — ruff surfaces 622 pre-existing errors that need a separate clean-up pass before it can be a blocking gate. Co-Authored-By: Gradata <noreply@gradata.ai>
…or (#134)

* fix(implicit_feedback): restore GAP signal category dropped in hook dedup

The hook-overlap audit removed the JS implicit-feedback hook as redundant with the SDK version, but verifier caught that the SDK SIGNAL_MAP was missing the GAP category: "what about", "you forgot/missed/skipped/dropped/ignored", "did you check/verify/test/review". CHALLENGE_PATTERNS already catches "you didn't/missed/forgot/failed" but lost the "what about" and "did you check" variants. Adding GAP_PATTERNS restores strict parity with the removed JS hook.
Tests: 48 implicit_feedback + hooks_intelligence tests pass.
Co-Authored-By: Gradata <noreply@gradata.ai>

* feat(implicit_feedback): emit tacit OUTPUT_ACCEPTED on silent follow-ups

Users rarely type "looks good" — they just send the next task. brain.correct() logs every CORRECTION but explicit approval words fire 20x less, making the correction ratio look broken (2289% over the last 14 days). Emit a tacit OUTPUT_ACCEPTED when a substantive follow-up prompt (>=30 chars) has no negation / challenge / reminder / gap signals.
- Adds `mode: "explicit" | "tacit"` to the event payload so the audit script can distinguish signal strength.
- Short acks ("ok", "go", "thanks") stay below the threshold — they are too ambiguous to infer prior-turn acceptance.
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix(hooks): eliminate false-positive REJECTED outcomes + dead code

session_close previously flagged rules as REJECTED when a correction's 30-char prefix happened to match any substring of the lesson description. "never hardcode secrets" and "never hardcode port numbers" collided on "never hardcode " and quietly poisoned the graduation pipeline. Require the shorter side to be ≥40 chars and to be a full substring match (either direction) before rejecting.
Also remove a dead `payload.get("tool_output", "")` expression in claude_code.py whose return value was never captured.
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix(cloud): align SDK base URL with /api/v1 prefix

The cloud API now mounts routes under /api/v1 (gradata-cloud@04a272f). The SDK was posting to /v1/telemetry/metrics — 404s. Rebases the default base to https://api.gradata.ai/api/v1 and trims the /v1 prefix from the per-call paths. Also applies ruff formatting to the security regression tests (no behavior change).
Co-Authored-By: Gradata <noreply@gradata.ai>

* refactor(hooks): collapse emit boilerplate into _base.emit_hook_event helper

Five hooks were repeating the same resolve_brain_dir() → BrainContext.from_brain_dir() → emit() dance. Centralize into emit_hook_event() so new hooks don't re-learn the pattern and failures log uniformly.
- _base.py: add emit_hook_event(event_type, source, data, brain_dir=None)
- implicit_feedback, agent_graduation, tool_failure_emit, self_review: migrated
- Net -13 lines, identical external behavior, all 90 hook tests pass
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix(sdk): audit-driven bug batch — unreachable code, shared-state mutation, off-by-one

From autoresearch audits on patterns/, enhancements/, and the top-level SDK:
- rag.py: two-pass query expansion was unreachable (Stage 3 unconditionally returned before Stage 4). Moved expansion inside the Stage 3 gate so cfg.two_pass actually takes effect when non-empty results exist.
- parallel.py: DependencyGraph.run mutated task.input_data on the shared ParallelTask instance. Re-running the graph saw stale upstream outputs. Use dataclasses.replace to scope the resolved input to the current run.
- guardrails.py: two dead expressions (.lower() and str()) whose results were discarded; removed.
- _confidence.py: sessions_since_fire off-by-one — reset to 0 then immediately += 1 produced a systematic overcount for fired lessons. Track via flag and skip the increment on fire. Added a defensive severity default for the fragile ternary on the CONTRADICTING path.
- meta_rules.py:685: refresh_meta_rules mutated existing_metas in place despite its contract; use dataclasses.replace so callers' references stay pristine.
- brain.py:_resolve_pending: held a SQLite connection open across lessons_lock. Close before acquiring the file lock; re-open only for the final UPDATE.
All 670 affected tests pass.
Co-Authored-By: Gradata <noreply@gradata.ai>

* test(conftest): scrub hook kill-switch env vars for hermetic runs

#133 added opt-out env vars (GRADATA_SECRET_SCAN=0, _CONFIG_PROTECTION=0, _SESSION_PERSIST=0, etc.) that disable the corresponding hook. Dev shells often leave these set, which then flips 10 hook safety/intelligence tests from green to failing locally even though the code is correct. A session-scoped autouse fixture pops the seven kill-switches for the whole test session and restores them on teardown.
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf(sdk): MEDIUM fixes — skip DDL reruns, drop O(n) dupe scan, log swallowed exceptions

- _events.py:_ensure_table: cache schema-initialized state per db_path so the 10+ CREATE/ALTER/INDEX DDL statements run once per process instead of on every emit() call. PRAGMAs still re-run per connection.
- reflection.py: the CritiqueChecklist duplicate-name scan was O(n²) via list.count in a loop; use Counter once.
- reporting.py: three `except Exception: pass` blocks in build_brain_briefing silently dropped rule/quality/correction extraction errors. Log at DEBUG so misconfigurations are diagnosable without changing the silent-return contract.
All 180/167 affected tests pass.
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix(rules): audit HIGH batch — non-deterministic rule_id, O(N×E) sort, duplicate TaskType

- _engine.py: replace `hash(lesson.description) % 10000` with `_make_rule_id(lesson)`. Python hash() is per-process randomized via PYTHONHASHSEED, so RuleCache and RuleGraph lookups keyed on rule_id broke across runs.
- _engine.py: pre-compute a difficulty_by_cat dict once before the sort to collapse O(N × E) compute_rule_difficulty calls inside the sort key to O(E + N).
- scope.py: merge duplicate `TaskType(name="research", ...)` entries. The second entry (with sales-flavored keywords) was dead — the first match always won. Unified the keyword list in the primary entry.
Tests: 418 passed (rules + scope + rule_engine scope).
Co-Authored-By: Gradata <noreply@gradata.ai>

* ci(sdk): add pytest on pull_request

Watchdog gap: the repo only had sdk-publish.yml (tag-triggered), so PRs shipped without pytest ever running. #133's hermeticity bug slipped through because CodeRabbit reviews code but doesn't execute tests.
- Matrix: Python 3.11 + 3.12 (matches pyproject `requires-python = >=3.11`)
- Scope: PRs touching Gradata/** and pushes to main
- No lint yet — ruff surfaces 622 pre-existing errors that need a separate clean-up pass before it can be a blocking gate.
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix(sdk): audit CRITICAL + HIGH batch — hash determinism, shared-state mutation, conn leaks, O(N^2)

CRITICAL
- integrations/embeddings.py: the trigram local-embedding used Python's built-in hash(), which is per-process randomized (PYTHONHASHSEED). The same text embedded across processes yielded different vectors, silently corrupting cosine similarity + clustering. Switched to md5 truncated to 8 bytes — stable, fast, deterministic.
- integrations/openai_adapter.py: patched_create mutated the caller's messages list and dict entries in place. Any caller that reused the list across calls permanently accumulated rules on the system message. Now clones the list + dicts and routes the clone to the underlying client via kwargs/args.
- sidecar/watcher.py: _try_emit_via_brain created a new Brain (fresh SQLite conn) on every detected change, never closed. Now caches a single self._brain_instance lazily on first use.
HIGH
- graph.py: O(N*E) node lookup inside the graduation-edge loop replaced with a single nodes_by_id dict. The O(M*E) "any(e.target == mr_id for e in edges)" per meta-rule replaced with a merged_into_targets set.
- mcp_server.py: the top-level except in _dispatch now logs via _log.exception with a traceback before returning the error dict.
- middleware/_core.py: RuleSource.load now mtime-caches parsed+filtered lessons when sourced from lessons.md. Previously every on_llm_start / on_llm_end re-read and re-parsed the file.
- integrations/langchain_adapter.py: 3 bare "except Exception: pass" blocks in load_memory_variables + save_context now log at debug level.
- rules/rule_tracker.py: get_rule_history previously pulled 500 events and filtered Python-side. Now queries with tags_json LIKE 'rule:<id>' — uses the tag the emitter already writes.
Tests: updated test_integrations.py to assert that messages passed into the underlying client (not the caller's list) contain injected rules + that the caller's list stays unmutated. 249 targeted tests pass.
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix(sdk): audit CRITICAL+HIGH batch 3 — conn leaks, double disk reads, silent except, dead code

CRITICAL
- brain.py::review_pending: conn.close() was outside try/finally. If fetchall() or row materialization raised, the SQLite handle leaked. Switched to `with contextlib.closing(get_connection(...)) as conn:`.
- _brain_manifest.py::generate: the session-count cross-check did get_connection + conn.close() with the close outside finally. Same fix.
- _manifest_metrics.py::_quality_metrics: previously read lessons.md twice per call (once here, once inside `_lesson_distribution`). Read once, pass the text through.
HIGH
- _manifest_helpers.py::_count_events, _get_tables: conn.close() only on the happy path. Switched to contextlib.closing.
- _manifest_metrics.py::_quality_metrics second conn block: same fix.
- _manifest_metrics.py:221: dead list-comprehension whose result was immediately discarded — deleted.
- brain.py::correct: the telemetry `except Exception: pass` now debug-logs so failures are visible.
- rules/rule_engine/_scoring.py::validate_assumptions: the bare `except: pass` on the scope_json parse now logs at debug level.
Tests: 602 passed (brain + manifest + scoring + confidence scope).
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf(brain): push _search_events fallback filter into SQL; log manifest parse errors

- brain.py::_search_events: the term filter now runs in SQL (LOWER(data_json) LIKE) instead of fetching 500 rows and Python-filtering. An empty query returns [] early.
- brain.py: delete the dead `with contextlib.suppress(ImportError): pass` trailer.
- cloud/client.py::_read_local_manifest: a corrupt brain.manifest.json now logs at warning level before returning an empty dict, instead of silently shipping empty payloads to the cloud.
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf(brain): batch 5 — lessons cache, env hoist, pairwise embed dedup, scope_json helper

- brain.py::_load_lessons: mtime-keyed parse cache so apply_brain_rules and other read-only callers reuse parsed lessons instead of re-parsing on every call. apply_brain_rules switched to the cached loader; the enhancements import check moved to find_spec so pyright sees it as a capability probe.
- _graduation.py: hoist Beta-LB env reads out of the per-lesson loop via a new _read_beta_lb_config() called once per graduate() invocation.
- similarity.py: expose semantic_vector / similarity_from_vectors so callers comparing one probe against many stored strings precompute stored vectors once (O(N*M) tokenization -> O(N+M)).
- _graduation.py dedup gate: precompute existing-rule vectors outside the candidate loop and use similarity_from_vectors.
- _manifest_metrics.py: add _parse_ts_utc() so naive/aware timestamp mixes from SQLite coerce to aware UTC before subtraction.
- _scoring.py::lesson_scope: shared scope_json parse helper; _engine.py and validate_assumptions now use it instead of inlining the try/json.loads pattern. Removes the unused logging import from _scoring.py.
Full suite: 3908 passed, 2 skipped.
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf(confidence): hoist json import to module top

Removes 7 redundant `import json as _json_*` statements from hot paths (parse_lessons per-meta-line, format_lessons per-lesson). Python caches imports so the cost is modest, but the stuttered aliases obscure intent.
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf: batch 7 — brain cache invalidation, loop_detection O(1), q_learning index

- brain.py: invalidate lessons + rule cache after patch_rule/forget/rollback/_resolve_pending writes; wrap _resolve_pending sqlite connections in contextlib.closing; cache the self_improvement capability check in __init__; add logger.debug to silent excepts in session/manifest.proof
- loop_detection.py: use Counter alongside deque for O(1) repeat detection (was O(window_size) per record)
- q_learning_router.py: hoist hmac/logging/platform/time imports to module level; precompute an agent_index dict for O(1) lookup (was O(N) list.index per update_reward)
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf: batch 8 — hoist env reads, wrap sqlite, logger.debug silent excepts

- meta_rules.py: _resolve_principle_creds() hoists GRADATA_LLM_* env reads out of the per-category loop; _try_llm_principle accepts precomputed creds
- reporting.py: wrap the health-report sqlite3 connection in contextlib.closing; replace two `except: pass` with logger.debug
- router_warmstart.py: wrap the warm-start sqlite connection in contextlib.closing (was leaked if an exception landed between connect and close)
- contrib/enhancements/quality_gates.py: wrap the success-report sqlite in contextlib.closing; replace `except: pass` with logger.debug
- brain.py: lineage() now uses get_connection() (consistent with the rest of brain.py) instead of raw sqlite3.connect
- test_agentic_synthesis.py: update mocks to accept the new creds kwarg
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf: batch 9 — stable RRF hashing, O(N^2) fix, precompute word-sets, log swallowed excepts

- rag.py: replace non-deterministic hash() with zlib.crc32 for RRF chunk IDs (PYTHONHASHSEED randomisation was silently breaking dedupe across processes/restarts)
- rag.py: order_by_relevance_position no longer uses list.insert(0, ...) — was O(N^2) per call, now O(N) via head/tail split + reverse
- rag.py: two-pass expansion + NaiveRAG.retrieve silent excepts now log at debug instead of masking misconfigured backends
- tree_of_thoughts.py: precompute rule_word_sets once outside the _default_scorer closure (was O(N*M) re-tokenisation per candidate x existing_rule)
- rule_context_bridge.py: wrap the WAL checkpoint conn in contextlib.closing; log the swallow
- brain.py: hoist the dataclasses import to module level (was inside health())
Co-Authored-By: Gradata <noreply@gradata.ai>

* perf: batch 10 — stable IDs, sqlite closing, dict-indexed intent registry

- rule_graph.py: wrap add_rule_relationship and get_related_rules sqlite connections in contextlib.closing (were leaked on exception)
- rule_tree.py: replace non-deterministic hash() with zlib.crc32 for lesson IDs written to persisted .md frontmatter (was changing across processes, breaking cross-run ID stability)
- contrib/patterns/memory.py: use heapq.nlargest for retrieve() when limit < len(matches); full sort only when returning everything
- contrib/patterns/orchestrator.py: mirror _REGISTERED_INTENT_PATTERNS into a dict index for O(1) classify_request lookup (was O(N) linear scan per classify call)
Co-Authored-By: Gradata <noreply@gradata.ai>

* chore: track handoff watchdog hooks in .claude/hooks/

Whitelist the user-prompt handoff-watchdog and session-start handoff-inject hooks (plus their dispatchers) so fresh clones keep context-pressure handling wired into Claude Code. Everything else under .claude/ remains ignored.
Co-Authored-By: Gradata <noreply@gradata.ai>

* chore(gitignore): drop dispatcher whitelist, keep only watchdog hooks

Refine the .gitignore watchdog carve-out — track only the two hook files themselves; leave dispatcher wiring machine-local.
Co-Authored-By: Gradata <noreply@gradata.ai>

* fix: address CodeRabbit review findings on #134

- brain.py: _resolve_pending re-checks resolution IS NULL and verifies rowcount in the UPDATE; prevents lost-race overwrites returning resolved=true when another worker already resolved
- rag.py: two-pass expanded retrieval returns the original user query (not expanded_query) so downstream telemetry/logging never surfaces mined corpus terms as user input
- cloud/sync.py: _normalize_api_base() upgrades legacy https://api.gradata.ai bases (no /api/v1 segment) on load; older cloud-config.json files self-heal instead of silently POSTing to unversioned endpoints
- hooks/session_close.py: enforce the 40-char floor on BOTH lesson_desc and the rejecting desc; gating only one side let short lessons match long descriptions via prefix containment
- hooks/implicit_feedback.py: drop forgot|missed from GAP_PATTERNS (already owned by CHALLENGE_PATTERNS); raise the tacit threshold to 60 chars and skip messages that look like questions — "can you explain ..." no longer counts as tacit acceptance
- guardrails.py: block_reason now reports only the first failing input check, aligning with the GuardedResult docstring contract

---------

Co-authored-by: Gradata <noreply@gradata.ai>
Summary
Kill switches added (default "1")
- GRADATA_BRAIN_MAINTAIN
- GRADATA_SESSION_PERSIST
- GRADATA_SECRET_SCAN
- GRADATA_CONFIG_PROTECTION
- GRADATA_DUPLICATE_GUARD
- GRADATA_CONFIG_VALIDATE

secret_scan additionally emits a stderr warning when disabled — it is the only credential guard, so a silent opt-out on a misconfigured project is too risky.

Estimated savings when JS superset active

~8-20 s per Stop, ~200-400 tokens per edit, and ~1500 tokens per session of duplicate work eliminated.
Test plan
- `pytest tests/` — 3908 passed, 2 skipped
- Code review (pr-review-toolkit:code-reviewer) — approved with a secret_scan warning ask, addressed
- Default "1" keeps every hook active (.claude/settings.json unchanged)

Generated with Gradata