Merged
Commits
26 commits
a44a224
feat(sync): transform local rows to cloud schema + scrub JSONB payloads
Gradata Apr 20, 2026
f91d555
feat(pipeline): canonical graduation + persistent brain_prompt + two-…
Gradata Apr 21, 2026
d542533
feat(doctor): add cloud-health probing to gradata doctor
Gradata Apr 21, 2026
5a6da45
fix(implicit_feedback): catch text-speak corrections (r/u/dont/cant)
Gradata Apr 21, 2026
1a497e8
test(implicit_feedback): cover text-speak and multi-signal inputs
Gradata Apr 21, 2026
7340ebb
docs: add pre-launch plan with numeric pivot/kill/scale triggers
Gradata Apr 21, 2026
0b797b7
docs(meta_rules): llm_synth now runs locally, not cloud-side
Gradata Apr 21, 2026
2c65bf2
docs(marketing): correct stale cloud-graduation claims in Pro tier
Gradata Apr 21, 2026
f141efd
fix(tests): assert brain_id not tenant_id in cloud push test
Gradata Apr 21, 2026
d668bab
feat(lesson_applications): close the compound-quality audit loop
Gradata Apr 21, 2026
978e4c7
docs: truth-pass cloud-vs-SDK boundary across architecture + concepts
Gradata Apr 21, 2026
61ce3b1
fix(ultrareview): address 4-agent review before public push
Gradata Apr 21, 2026
509bf92
feat(meta_rules): port local-first discovery, unskip cloud-gated tests
Gradata Apr 21, 2026
2a78164
test(pipeline_e2e): remove stale 'not yet implemented' skips, bump fi…
Gradata Apr 21, 2026
03ddb6f
fix(graduation): correct MISFIRE_PENALTY sign in agent_graduation
Gradata Apr 21, 2026
90a993d
review: address 3 CRITICAL + 3 HIGH from PR #126 review
Gradata Apr 21, 2026
3698836
feat: context-pressure handoff watchdog + multimodal RAG embedder pro…
Gradata Apr 21, 2026
5c24a26
fix: address code review on PR #130
Gradata Apr 21, 2026
b49b66f
chore(tests): remove private-hook test leaking into public SDK
Gradata Apr 21, 2026
43a1905
feat(hooks): SessionStart hook injects handoff into next agent
Gradata Apr 21, 2026
cc00f3b
feat(handoff): v2 rules-snapshot delta to save warm-resume tokens
Gradata Apr 21, 2026
635db13
feat(agent-precontext): dedup sub-agent rules against parent injection
Gradata Apr 21, 2026
944bf00
feat(brain-prompt): cap brain_prompt.md at 4000 chars
Gradata Apr 21, 2026
a9375e3
feat(context-inject): dedup FTS snippets against injected rules
Gradata Apr 21, 2026
59b1418
fix(jit-inject): stop emitting events.jsonl on zero-match prompts
Gradata Apr 21, 2026
f531086
feat(hooks): opt-out env kill switches for 6 Stop/PreToolUse/SessionS…
Gradata Apr 21, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -135,6 +135,7 @@ Gradata/docs/STRESS_TEST_PROTOCOL.md
Gradata/docs/GRADATA-LAUNCH-STRATEGY.md
Gradata/docs/GTM-Execution-Plan.md
Gradata/docs/gradata-marketing-strategy.md
Gradata/docs/pre-launch-plan.md
Gradata/docs/gradata-comparison-table.md
Gradata/docs/ablation-experiment-s93.md
Gradata/docs/ARCHITECTURE.md
53 changes: 53 additions & 0 deletions Gradata/docs/LEGACY_CLEANUP.md
@@ -0,0 +1,53 @@
# Legacy Cloud-Gate Cleanup Tracker

As of 2026-04-20, Gradata is fully local-first. Cloud-gate stubs and
"cloud-only" fallbacks are legacy concepts that should be removed.

## Principle

- Every feature must run locally with no external service.
- `gradata_cloud_backup/` is a private backup, not a gate.
- LLM-assisted synthesis uses the user's own provider (Anthropic SDK key or
Claude Code Max OAuth via `claude -p`). Never a Gradata-hosted endpoint.
- Tests and fixtures should exercise the local implementation directly.

## Known legacy items to retire

### 1. Deprecated adapter shims (scheduled v0.8.0)
- `src/gradata/integrations/anthropic_adapter.py` → `middleware.wrap_anthropic`
- `src/gradata/integrations/langchain_adapter.py` → `middleware.LangChainCallback`
- `src/gradata/integrations/crewai_adapter.py` → `middleware.CrewAIGuard`
Warnings are in place; remove the modules and their tests at v0.8.0.

### 2. `_cloud_sync.py` terminology
File posts to an optional external dashboard — fine to keep, but the
module docstring should make clear it is optional telemetry, not a
mandatory cloud dependency. Callers already tolerate absence.

### 3. ~~Docstring drift in `meta_rules.py`~~ (fixed in PR #126)
Module header now describes the local clustering algorithm and points
at `rule_synthesizer` for LLM-assisted distillation. Closed.

### 4. Test-level cloud gating
Former `@_requires_cloud` / `skipif` markers were deleted in this cycle.
If any new test reintroduces a cloud gate, delete the gate instead — the
feature should either be local-first or not ship.

### 5. `api_key` kwarg on `merge_into_meta`
The old `merge_into_meta(..., api_key=...)` path routed into
`synthesise_principle_llm` directly. Current architecture drives LLM
distillation from `rule_synthesizer` at session close instead. The kwarg
is still accepted via `**kwargs` for forward compatibility but performs
no work — remove after one release.
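
One way the no-op `api_key` path described above could look, as a minimal sketch only. `merge_into_meta_sketch` is a hypothetical stand-in, not the SDK's real signature, and the `DeprecationWarning` is an assumption: the tracker only says the kwarg is accepted and does no work.

```python
import warnings


def merge_into_meta_sketch(rules, **kwargs):
    """Hypothetical stand-in for merge_into_meta; not the SDK's real signature."""
    # Accept the legacy kwarg so old call sites keep working, but ignore its value.
    if "api_key" in kwargs:
        kwargs.pop("api_key")
        # Assumption: warn callers so the kwarg can be removed after one release.
        warnings.warn(
            "api_key is ignored; LLM distillation now runs from rule_synthesizer",
            DeprecationWarning,
            stacklevel=2,
        )
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    # Placeholder result: the real merge logic is out of scope for this sketch.
    return {"source_rules": list(rules)}
```

Popping the kwarg before the `TypeError` check keeps the function strict about every other unexpected argument, so removing the shim later is a one-branch change.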

### 6. Doc sweep
`docs/cloud/` should be audited for pages that imply cloud is required.
Rewrite as "optional managed hosting" or delete.

## How to retire an item

1. Grep for the symbol / doc string.
2. Delete the code path and any tests that exercise it.
3. Update the module docstring.
4. Bump the deprecation note in `CHANGELOG`.
5. Run the full suite.
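
Step 1 can be scripted when plain `grep` is not convenient. The helper below is a hypothetical sketch, not part of the repo: `find_symbol` and its default suffix list are illustrative names chosen here.

```python
from pathlib import Path


def find_symbol(root, symbol, suffixes=(".py", ".md")):
    """Return (path, line_number, stripped_line) for each line under root containing symbol."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular files with the suffixes we care about.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

For example, `find_symbol("src", "_requires_cloud")` would list any cloud-gate marker that has crept back in, assuming the code lives under `src`.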
7 changes: 5 additions & 2 deletions Gradata/docs/architecture/cloud-monolith-v2.md
@@ -5,8 +5,11 @@ Redis (cache), Kafka (queue), Elasticsearch (search), and Pinecone
(vectors) for gradata-cloud workloads — no new vendors.

Design goal: one Postgres instance, RLS-isolated per tenant, carrying
every cloud-side workload the SDK needs. Local SQLite stays the source
of truth for writes; cloud is the pushable reflection + shared surface.
the cloud-side visualization and sharing workloads. Local SQLite stays
the source of truth and runs graduation, synthesis, and rule-to-hook
promotion locally. Cloud is a downstream reflection — it mirrors events
and rules for dashboards, team sharing, and managed backups, but does
not gate or re-run the learning loop.

## What v2 adds

14 changes: 7 additions & 7 deletions Gradata/docs/architecture/multi-tenant-future-proofing.md
@@ -13,13 +13,13 @@
- Embeddings stored as BLOB (`brain_embeddings`); FTS5 via `brain_fts`.
- `events.scope` column exists (default 'local') — partial seed for tenant scoping, not used.
- `sync_state` table exists per source but not cloud-bound.
- Proprietary scoring/graduation code in `gradata_cloud_backup/`.
- Proprietary dashboard / team-sharing code in `gradata_cloud_backup/`. Graduation runs locally in the OSS SDK.
- Open SDK is Apache-2.0 — cannot require cloud to run.

## Architectural Decisions (Lock In Now)

### 1. Local-first stays the source of truth
SDK writes to local SQLite + jsonl. Cloud is a **sync target + shared meta-rule source + proprietary scoring service**. Do NOT migrate SDK storage to Postgres. Reasons: privacy, offline, open source, speed.
SDK writes to local SQLite + jsonl and runs the full learning loop (graduation, synthesis, rule-to-hook promotion) locally. Cloud is a **sync target + dashboard + future team + future shared-corpus surface** — not a gate on the local loop. Do NOT migrate SDK storage to Postgres. Reasons: privacy, offline, open source, speed.

### 2. Supabase is the cloud target
Postgres + Auth + RLS + pgvector + Realtime in one project. Free tier covers pre-revenue. Alternative (Neon + Clerk + own RLS) costs weeks you don't have.
@@ -36,9 +36,9 @@ Add `visibility TEXT` to `meta_rules`, `rules` (if separate table emerges):
- `global` — Gradata-curated, pushed to all tenants (e.g., quality_gates, truth_protocol)

### 5. Proprietary boundary
- **Open SDK** writes raw events, computes local diffs, injects rules.
- **Cloud (proprietary)** owns: graduation scoring, cross-tenant meta-rule mining, profiling, billing, licensing.
- Clean interface: SDK posts events → Cloud returns scored rules. Stateless call.
- **Open SDK** writes raw events, computes local diffs, injects rules, graduates lessons, and synthesizes meta-rules locally (BYO API key or Claude Code Max OAuth).
- **Cloud (proprietary)** owns: dashboard/visualization, cross-tenant meta-rule corpus (opt-in donation), team sharing, billing, licensing.
- Clean interface: SDK pushes events + graduated rules to cloud. Cloud reflects them back through UI. Cloud never re-runs graduation.

### 6. Schema versioning
Add `schema_version INT` to event envelope + a `migrations` table. Forward-only migrations. SDK refuses to run against incompatible brain.
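
A minimal sketch of the forward-only check, assuming a `migrations` table with a single `version` column and a hypothetical `SUPPORTED_SCHEMA_VERSION` constant. The table layout and the function names are illustrative, not the SDK's actual schema.

```python
import sqlite3

SUPPORTED_SCHEMA_VERSION = 2  # hypothetical: newest version this SDK build understands


def current_version(conn):
    """Highest applied migration version, or 0 for a fresh brain."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS migrations (version INTEGER PRIMARY KEY)"
    )
    row = conn.execute("SELECT MAX(version) FROM migrations").fetchone()
    return row[0] or 0


def open_brain(conn, migrations):
    """Apply pending migrations in order; refuse a brain newer than this SDK."""
    version = current_version(conn)
    if version > SUPPORTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"brain schema v{version} is newer than supported v{SUPPORTED_SCHEMA_VERSION}"
        )
    # Forward-only: walk upward from the stored version, never downward.
    for target in range(version + 1, SUPPORTED_SCHEMA_VERSION + 1):
        with conn:  # commits on success, rolls back the pending transaction on error
            conn.execute(migrations[target])
            conn.execute("INSERT INTO migrations (version) VALUES (?)", (target,))
    return current_version(conn)
```

Here `migrations` is a dict mapping each version number to one SQL statement. Recording the version in the same `with conn:` block as the change keeps the recorded version from running ahead of the schema.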
@@ -116,9 +116,9 @@ Files to create:
### Phase 3 — Verification (half day)

10. Spin up a **test tenant** (not Oliver, not user #2). Run full flow:
- Onboard → writes local brain → syncs to cloud → pulls global rules → corrects a draft → rule graduates → syncs back
- Onboard → writes local brain → corrects a draft → rule graduates **locally** → syncs reflection up to cloud → dashboard renders.
- Verify RLS: test tenant cannot see Oliver's events (SQL probe)
- Ablation: disable cloud sync → SDK still works fully offline
- Ablation: disable cloud sync → SDK still works fully offline, including graduation + synthesis.

### Phase 4 — Explicitly deferred

2 changes: 1 addition & 1 deletion Gradata/docs/cloud/dashboard.md
@@ -1,6 +1,6 @@
# Dashboard

The Gradata Cloud dashboard is a Next.js app at [app.gradata.ai](https://app.gradata.ai). It wraps the same data the local `brain.manifest.json` exposes, plus Cloud-only views for meta-rule synthesis, team management, and the operator console.
The Gradata Cloud dashboard is a Next.js app at [app.gradata.ai](https://app.gradata.ai). It visualizes the same data the local `brain.manifest.json` exposes, plus Cloud-only views for team management and the operator console. Meta-rule synthesis runs locally in the SDK — the dashboard renders the results, it does not re-run them.

<!-- Screenshot placeholders will land after the dashboard design pass. -->

16 changes: 7 additions & 9 deletions Gradata/docs/cloud/overview.md
@@ -1,6 +1,6 @@
# Gradata Cloud

Gradata Cloud is the hosted dashboard and back-end that complements the open-source SDK. The SDK keeps running locally; Cloud adds synchronization, cross-device continuity, team sharing, meta-rule synthesis, and an operator view for engineering teams.
Gradata Cloud is the hosted dashboard that complements the open-source SDK. **The SDK is functionally complete on its own** — graduation, meta-rule synthesis, rule-to-hook promotion, and every piece of the learning loop run locally. Cloud adds visualization, cross-device continuity, team sharing, and managed backups on top of that local loop.

## What's in the SDK vs the Cloud

@@ -14,15 +14,14 @@ Gradata Cloud is the hosted dashboard and back-end that complements the open-sou
| Search (FTS5 + optional embeddings) | Yes | Yes |
| Cross-platform export (`.cursorrules`, `BRAIN-RULES.md`, ...) | Yes | Yes |
| Meta-rule **clustering** | Yes | Yes |
| Meta-rule **synthesis** (LLM-generated principles) | Placeholder | Yes |
| Meta-rule **synthesis** (local LLM via your own key or Claude Code Max OAuth) | Yes | Yes |
| Dashboard with charts | No | Yes |
| Cross-device sync of a brain | No | Yes |
| Team brains (shared rules, per-member overrides) | No | Yes |
| Operator view (customer KPIs, alerts) | No | Yes |
| Cloud-side rule evaluation and A/B harness | No | Yes |
| Managed backups | No | Yes |

The SDK is Apache-2.0 and will stay permissively open. Cloud is a hosted SaaS tier with team features, corpus aggregation, and brain marketplace on top.
The SDK is Apache-2.0 and will stay permissively open. Cloud is a hosted SaaS tier that **visualizes** the local learning loop — it does not gate, override, or re-run it. Team features and brain marketplace build on top later.

## When to self-host vs use Cloud

@@ -34,10 +33,10 @@ The SDK is Apache-2.0 and will stay permissively open. Cloud is a hosted SaaS ti

**Use Cloud if:**

- Get meta-rule synthesis out of the box (no LLM wiring on your side).
- You want a dashboard to watch your brain mature (graduations, correction-rate decay, compound-quality score).
- Teams can maintain shared, version-controlled brains across multiple operators.
- Includes dashboard, alerts, and billing.
- Managed backups and cross-device sync handled for you.
- Operator / alerting view for engineering leads.

## Architecture

@@ -48,14 +47,13 @@ flowchart LR
end
subgraph Cloud["Gradata Cloud"]
C[Sync API] --> D[Postgres + pgvector]
D --> E[Meta-rule synthesis]
D --> F[Dashboard]
D --> G[Operator view]
end
A <-->|optional<br/>outbound only| C
A -->|optional<br/>outbound only| C
```

The SDK talks to Cloud only when you opt in with an API key. Sync is outbound: your local brain is the source of truth, Cloud holds a mirror plus derived metrics.
The SDK talks to Cloud only when you opt in with an API key. Sync is strictly outbound and read-only from Cloud's perspective: your local brain is the source of truth, Cloud holds a mirror plus derived metrics. Cloud never mutates your local state or re-runs graduation.

## Getting an API key

6 changes: 3 additions & 3 deletions Gradata/docs/concepts/meta-rules.md
@@ -44,10 +44,10 @@ Clustering uses a combination of:

Minimum group size is controlled by `min_group_size=3` in `discover_meta_rules()`.
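
To illustrate only the threshold, here is a simplified sketch that groups rules by a single category label and drops groups below `min_group_size`. This is not the SDK's clustering algorithm, whose signals are elided in this hunk. `group_rules_sketch` and the `(category, text)` input shape are assumptions made for the example.

```python
from collections import defaultdict


def group_rules_sketch(rules, min_group_size=3):
    """Group (category, text) pairs by category; keep only groups at or above the threshold."""
    groups = defaultdict(list)
    for category, text in rules:
        groups[category].append(text)
    # Groups smaller than min_group_size are too thin to justify a meta-rule.
    return {c: texts for c, texts in groups.items() if len(texts) >= min_group_size}
```

With the default of 3, two related rules never produce a meta-rule on their own; a third matching rule is what makes the group eligible.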

!!! info "Cloud vs open source"
In the open-source SDK, meta-rule **clustering** runs locally but the **principle synthesis** step requires [Gradata Cloud](../cloud/overview.md). Without cloud, `discover_meta_rules()` returns an empty list and `merge_into_meta()` produces a placeholder meta-rule with correct IDs and confidence but `principle = "(requires Gradata Cloud)"`.
!!! info "Local by default"
Meta-rule clustering **and** principle synthesis both run locally. Synthesis uses whichever LLM path you've configured: your own Anthropic API key (set `ANTHROPIC_API_KEY`) or the Claude Code Max OAuth path via `claude -p`. Cloud is not required for any of it — the full `[rule, rule, rule] → "Verify before acting"` pipeline runs in the OSS SDK.

The math, the events, and the storage are all open. Only the LLM-driven synthesis that turns `[rule, rule, rule] → "Verify before acting"` is cloud-gated.
Cloud becomes relevant when you want a hosted dashboard, cross-device sync, team brains, or (future) opt-in corpus donation. It does not re-synthesize or override what graduated locally.
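
A rough sketch of how the two local synthesis paths might be detected. The function name, the return labels, and the preference order (API key first, then the `claude` CLI) are assumptions for illustration; the docs name the two paths but not how the SDK chooses between them.

```python
import os
import shutil


def pick_synthesis_backend(env=None, which=shutil.which):
    """Return 'anthropic_api', 'claude_cli', or None if no local LLM path is configured."""
    env = os.environ if env is None else env
    # Path 1: the user's own Anthropic API key.
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic_api"
    # Path 2: the Claude Code CLI on PATH, which would be invoked as `claude -p`.
    if which("claude"):
        return "claude_cli"
    return None
```

Passing `env` and `which` as parameters keeps the check testable without touching the real environment.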

## Confidence
