diff --git a/CREDITS.md b/CREDITS.md new file mode 100644 index 00000000..bb5a4480 --- /dev/null +++ b/CREDITS.md @@ -0,0 +1,41 @@ +# Credits & Intellectual Lineage + +Gradata synthesizes ideas from decades of research and engineering practice. Standing on the shoulders of giants isn't stealing — it's the whole point of an open ecosystem. This document credits the work that shaped Gradata. + +## Research foundations + +- **Constitutional AI** (Anthropic, 2022) — the self-critique + revision loop under `sdk/src/gradata/enhancements/rule_verifier.py` is inspired by the RLAIF methodology introduced in *"Constitutional AI: Harmlessness from AI Feedback"* (Bai et al., 2022). +- **Half-life regression** (Settles & Meeder, ACL 2016) — confidence decay curves in the graduation engine draw on *"A Trainable Spaced Repetition Model for Language Learning"* and the Wozniak/Duolingo two-component memory model. +- **Generative agents** (Park et al., Stanford 2023/2024) — *"Generative Agents: Interactive Simulacra of Human Behavior"* and *"Generative Agent Simulations of 1,000 People"* (2024) validate our simulation-first design methodology; the latter demonstrated generative agents are ~85% as accurate as humans on survey responses. +- **MT-Bench / LLM-as-judge** (Zheng et al., NeurIPS 2023) — scoring methodology in `brain/scripts/brain_benchmark.py` adapts the multi-judge consensus approach from *"Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena"*. +- **Self-preference bias in LLM judges** (2024) — informs our anonymization step before judging to control known evaluator biases. +- **Grammarly ROI study** (2024) — the "19 days saved per year" framing informs our *Est. Time Saved* KPI. +- **Copilot RCT** (Peng et al., 2023) — *"The Impact of AI on Developer Productivity: Evidence from GitHub Copilot"* reported a 55.8% speedup on a controlled coding task and anchors our developer-impact benchmarks. +- **SuperMemo 2 / two-component memory** (Wozniak, 1995) — retrievability + stability decomposition underpinning our confidence decay model. +- **Persona transparency** (AAAI 2025) — persona documentation requirements for simulation research inform how we publish MiroFish panels. + +## Architectural inspirations + +- **Mem0** — shared memory-first framing for AI agents. Gradata's difference: we learn from corrections, not just recall facts. +- **Letta** (formerly MemGPT) — agent state persistence patterns. Gradata's difference: state is rules, graduated from evidence rather than stored conversations. +- **EverMind / EverMemOS** (TCCI, 2025) — reported 92.3% on the LoCoMo memory-recall benchmark. Gradata is complementary: it adds the correction-learning layer on top of memory recall. +- **The 15 agentic patterns** — orchestrator, reflection, memory, rule_engine, RAG, tree-of-thoughts, and the rest are standard LLM-app primitives. Gradata builds the *enhancements* layer (`diff_engine`, `quality_gates`, `truth_protocol`, `meta_rules`, `rule_verifier`) on top of these primitives. + +## Research methodology + +- **MiroFish expert-panel simulation** — multi-round structured debate across grounded personas. Our adaptation lives in `brain/scripts/mirofish_sim.py`. The methodology is our own synthesis of published simulation work (Park et al.; Anthropic's Constitutional AI). +- **Mann-Kendall trend test** — autocorrelation-aware statistical validation used in our convergence checks. +- **OASIS framework** — indirect influence on our batch-run pattern for stress tests. +- **Karpathy-style autoresearch** — iterative self-improvement with verification gates. Our optimization runner adopts this pattern. + +## Open-source dependencies + +Key libraries Gradata is built on: FastAPI, Next.js, Recharts, Supabase, Stripe, Pydantic, pytest, Vitest, Tailwind, Radix UI, React. Full dependency lists are in `pyproject.toml` and `package.json`. + +## What's new here + +Gradata's novel contribution is the **graduation pipeline + correction tracking + compound proof** — the data dynamics that make personal AI learning work. Not the patterns. Not the libraries. The loop. + +## License + +Gradata is **AGPL-3.0** — use it, fork it, host it. If you host a SaaS version, your modifications must be open-source too. This is deliberate: the common ecosystem wins, the fork-and-close play doesn't. diff --git a/README.md b/README.md index 345dc98f..31b94e84 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,17 @@ -# Gradata — AI that learns your judgment +# Gradata + +## AI that learns your judgment, not just your preferences. [![Tests](https://github.com/Gradata/gradata/actions/workflows/test.yml/badge.svg)](https://github.com/Gradata/gradata/actions/workflows/test.yml) [![PyPI](https://img.shields.io/pypi/v/gradata)](https://pypi.org/project/gradata/) [![Python](https://img.shields.io/pypi/pyversions/gradata)](https://pypi.org/project/gradata/) [![License](https://img.shields.io/badge/license-AGPL--3.0-blue)](LICENSE) -Every correction you make teaches your AI something. Gradata captures those corrections, extracts the behavioral instruction behind them, and graduates it into a rule. Over time, your AI stops needing corrections. It converges on your judgment. +Install the SDK, use Claude or GPT like you already do, and correct it when it's wrong. Gradata turns your repeat corrections into durable rules the AI carries forward — automatically. Unlike prompt engineering, which asks you to guess what the model needs, Gradata learns from what you actually fix. + +- **Local-first.** Your brain stays on your machine. AGPL-3.0 — fork it, host it, change it. +- **Proven.** Simulation-validated learning loop (MiroFish panel methodology + published research on behavioral learning). +- **Measurable.** *Est. Time Saved*, *Mistakes Caught*, *Sessions to Graduate* — honest metrics, not vanity. Not generally more intelligent. Calibrated to you. @@ -15,6 +21,10 @@ pip install gradata Works with any LLM. Python 3.11+. Zero required dependencies. +## Intellectual lineage + +Gradata synthesizes research from Constitutional AI (Anthropic, 2022), Duolingo's half-life regression (Settles & Meeder, ACL 2016), the Copilot RCT efficacy study (Peng et al., 2023), SuperMemo's two-component memory model (Wozniak, 1995), MT-Bench LLM-as-judge (Zheng et al., NeurIPS 2023), and the 15 agentic patterns (orchestrator, reflection, memory, rule_engine, and the rest). It stands alongside Mem0, Letta, and EverMind as an open memory system — with one difference: Gradata learns from your corrections, not just recalls facts. What's new is the graduation pipeline that turns repeated mistakes into durable rules, validated by multi-agent simulation. See [CREDITS.md](./CREDITS.md) for the full list. + ## Quick Start ```python diff --git a/docs/public-launch-narrative.md b/docs/public-launch-narrative.md new file mode 100644 index 00000000..589f52fc --- /dev/null +++ b/docs/public-launch-narrative.md @@ -0,0 +1,66 @@ +# Public-Launch Narrative — Draft for Review + +This PR prepares Gradata's marketing + credits narrative before going public. It is separate from the cleanup PR so the voice/framing decisions can be reviewed without noise from refactors. + +## Why this PR exists + +Going public invites two predictable reactions: + +1. *"This is just Mem0 / Letta / a prompt library."* +2. *"You didn't invent any of this. You copied X."* + +Both reactions dissolve if we lead with transparent attribution. The goal of this PR is to turn potential "you copied X" accusations into credited prior art, and to sharpen the one-sentence pitch so it lands for three different buyers (founder-engineer, OSS believer, enterprise). + +## Pieces shipped in this PR + +### 1. `CREDITS.md` (new, repo root) + +Transparent-synthesis narrative. Credits: + +- Research foundations (Constitutional AI, Duolingo half-life regression, Generative Agents, MT-Bench, SuperMemo, Copilot RCT, Grammarly ROI, Persona Transparency Checklist) +- Architectural inspirations (Mem0, Letta, EverMind, 15 agentic patterns) +- Research methodology (MiroFish, Mann-Kendall, OASIS, Karpathy autoresearch) +- Open-source dependencies (summary + pointers to `pyproject.toml` and `package.json`) +- "What's new here" — the graduation pipeline + correction tracking + compound proof + +All citations are real papers (Park et al., Settles/Meeder Duolingo, Peng Copilot RCT, Zheng MT-Bench, Wozniak SuperMemo, Anthropic Constitutional AI). No invented citations. + +### 2. `README.md` (rewritten top, new section) + +**Before:** H1 was `# Gradata — AI that learns your judgment`, then a descriptive paragraph. + +**After:** + +- Cleaner H1: `# Gradata` +- Tagline H2: `## AI that learns your judgment, not just your preferences.` +- Three-bullet product pitch targeting three buyers at once: + - Founder-engineer: pragmatic ROI framing ("use Claude or GPT like you already do") + - OSS believer: AGPL-3.0 + local-first framing + - Enterprise: "simulation-validated" + "honest metrics, not vanity" +- New `## Intellectual lineage` paragraph pointing to `CREDITS.md` + +Preserved: all badges, install instructions, Quick Start, mermaid diagrams, ablation table, CLI, architecture, Community, Contributing, License. Nothing removed beyond the original opening paragraphs, which were absorbed into the new framing. + +### 3. `docs/public-launch-narrative.md` (this file) + +Explains what changed and why, for review before merge. + +## What this PR does NOT do + +- Does not change product pricing or feature claims. +- Does not touch `marketing/` — that has its own narrative track. +- Does not invent citations — every paper listed is one Oliver approved. +- Does not remove existing README scaffolding (install, quick start, diagrams, license). + +## Suggested review checklist + +- [ ] Does the one-sentence pitch in README land for all three buyer profiles? +- [ ] Is `CREDITS.md` generous enough that "you copied X" arguments feel tired on arrival? +- [ ] Are there any sources we should add (papers, systems) before public launch? +- [ ] Is AGPL-3.0 framing in CREDITS the right voice (confident, not apologetic)? + +## Follow-ups (not in this PR) + +- Marketing site lineage page (separate work under `marketing/`). +- `RESEARCH.md` with longer-form academic framing once the paper is ready. +- Hacker News / launch-post alignment with this same vocabulary.