simplegraph-agentic

A lightweight, framework-agnostic persistent memory graph for AI coding assistants.

Your AI agent builds structured knowledge about your codebase — bugs that keep recurring, decisions that were deliberate, code areas that are dangerous — and retains it across sessions without bloating every context window. Works with Antigravity, Cursor, Claude Code, GitHub Copilot, and any tool that accepts custom instructions.


The Problem

Every AI coding session starts cold. The agent doesn't remember:

  • The bug you fixed three times that keeps coming back
  • The architectural decision that was intentional, not accidental
  • The code area where a subtle change broke production last month
  • The patterns your team agreed to never use

You re-explain the same context, or worse — the agent confidently undoes a hard-won decision because it has no history.

Why Not Just a README, Wiki, or CLAUDE.md?

Flat docs load everything every time. For non-trivial codebases this wastes context tokens and dilutes the signal. A 500-word CLAUDE.md that covers "coding style" gives the AI no actionable intelligence about where the real risks are.

simplegraph-agentic takes a different approach:

Tiered Loading — 16× Fewer Tokens at Session Start

Measured on a production codebase with 62 nodes across 26 files (a complex full-stack PWA with identity, messaging, CDN caching, and a compactor pipeline):

| Approach | Tokens per session start | Tokens per task |
|---|---|---|
| simplegraph (tiered) | ~600 | ~2,400 |
| Monolith (flat file) | ~9,700 | ~9,700 |
| No memory (re-explain each time) | 0 up front, ~500–2,000 per re-explanation | compounds |

16× reduction at session start. 4× reduction for a typical task. The savings compound across every request in a session — the AI reads ~600 tokens once, then loads only the 2–3 files relevant to the current task.

Run bash scripts/token_benchmark.sh on your own graph to measure your reduction ratio.
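
The 16× and 4× figures follow from the table by simple division. The sketch below only reproduces that arithmetic from the approximate token counts quoted above. It is not the logic of scripts/token_benchmark.sh, and the constant and function names are illustrative.

```python
# Approximate token counts copied from the benchmark table above.
TIERED_SESSION_START = 600
TIERED_PER_TASK = 2_400
MONOLITH = 9_700  # a flat file costs the same at session start and per task


def reduction_ratio(baseline_tokens: int, tiered_tokens: int) -> float:
    """Return how many times fewer tokens the tiered approach loads."""
    return baseline_tokens / tiered_tokens


session_ratio = reduction_ratio(MONOLITH, TIERED_SESSION_START)
task_ratio = reduction_ratio(MONOLITH, TIERED_PER_TASK)

print(f"session start: {session_ratio:.1f}x fewer tokens")  # 16.2x
print(f"typical task:  {task_ratio:.1f}x fewer tokens")     # 4.0x
```

Your own ratios will differ with graph size and with how many files a typical task touches.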

Compared to Other Approaches

| Approach | Strengths | simplegraph advantage |
|---|---|---|
| CLAUDE.md / .cursorrules | Simple, zero setup | Flat files load everything every time. At 62 nodes of real knowledge, that's 9,700 tokens wasted per request. simplegraph loads ~600. |
| Aider repo-map | Auto-generates structural map | Structural maps answer "where is X?" but not "what went wrong here?" or "why was this decision made?" simplegraph captures intent and history, not just structure. |
| Vector DB (Mem0, etc.) | Scales to huge corpora, fuzzy retrieval | Requires infrastructure (DB server, embeddings). Retrieval is probabilistic: it might not surface the one invariant that prevents a regression. simplegraph is deterministic, git-native, and reviewable. |
| Fine-tuned models | Encoded knowledge | Expensive, opaque, stale the moment code changes. simplegraph updates in the same commit as the code. |

Typed, Linked Nodes — Follow Risk Chains

Nodes have types (Component, Invariant, Regression, Decision, Watchlist) and typed edges. An agent can follow a chain like:

AUTH_SERVICE --VIOLATED_BY--> REG_TOKEN_LEAK (×3) --FIXED_BY--> DEC_ROTATE_ON_REFRESH

This means: "the Auth service has a regression that's happened 3 times, and there's a deliberate architectural decision about how to prevent it." That chain tells the agent exactly what to be careful about and why, in two hops across three nodes rather than 500 words.
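
To make the idea of following typed edges concrete, here is a minimal sketch that holds the chain above in memory and walks it. The `Edge` tuple and `follow_chain` function are hypothetical names for illustration. The graph itself is stored as plain markdown, and this README does not specify how an agent parses it.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    edge_type: str
    target: str


# Hypothetical in-memory copy of the example chain.
EDGES = [
    Edge("AUTH_SERVICE", "VIOLATED_BY", "REG_TOKEN_LEAK"),
    Edge("REG_TOKEN_LEAK", "FIXED_BY", "DEC_ROTATE_ON_REFRESH"),
]


def follow_chain(start: str, edges: list[Edge]) -> list[str]:
    """Walk outgoing edges from `start`, taking the first unvisited target each time."""
    path = [start]
    visited = {start}
    current = start
    while True:
        outgoing = [e for e in edges
                    if e.source == current and e.target not in visited]
        if not outgoing:
            return path
        edge = outgoing[0]
        path += [f"--{edge.edge_type}-->", edge.target]
        visited.add(edge.target)
        current = edge.target


print(" ".join(follow_chain("AUTH_SERVICE", EDGES)))
# AUTH_SERVICE --VIOLATED_BY--> REG_TOKEN_LEAK --FIXED_BY--> DEC_ROTATE_ON_REFRESH
```

The `visited` set keeps the walk from looping if a graph ever contains a cycle.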

Priority and Heat — Load Critical Context First

Nodes carry a Priority field (HIGH / MEDIUM / LOW) derived from concrete signals:

| Signal | Priority |
|---|---|
| REGRESSED_N_TIMES >= 2 or on a Watchlist | HIGH |
| LastUpdated within 14 days | MEDIUM |
| Stable, no flags | LOW |

When the routing table points to multiple files, the agent loads HIGH-priority nodes first. LOW-priority nodes are skipped unless the task directly touches them.
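
The priority rules and loading order described above can be sketched as two small functions. This is an illustration under stated assumptions: the function names and node ids are hypothetical, "within 14 days" is treated as inclusive, and the rules are checked from HIGH down so a node matching several signals gets the highest one.

```python
from datetime import date, timedelta
from typing import Optional

RECENT_WINDOW = timedelta(days=14)  # assumption: 14 days inclusive
PRIORITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def derive_priority(regressed_n_times: int, on_watchlist: bool,
                    last_updated: Optional[date], today: date) -> str:
    """Map the concrete signals from the table to HIGH / MEDIUM / LOW."""
    if regressed_n_times >= 2 or on_watchlist:
        return "HIGH"
    if last_updated is not None and today - last_updated <= RECENT_WINDOW:
        return "MEDIUM"
    return "LOW"


def load_order(priorities: dict[str, str], touched: set[str]) -> list[str]:
    """Order node ids HIGH first; drop LOW nodes the task does not touch."""
    kept = [node for node, prio in priorities.items()
            if prio != "LOW" or node in touched]
    # sorted() is stable, so nodes of equal priority keep their original order.
    return sorted(kept, key=lambda node: PRIORITY_ORDER[priorities[node]])


today = date(2025, 1, 31)
priorities = {
    "COMP_BILLING": derive_priority(0, False, date(2024, 6, 1), today),
    "REG_TOKEN_LEAK": derive_priority(3, False, date(2024, 6, 1), today),
    "DEC_ROTATE_ON_REFRESH": derive_priority(0, False, date(2025, 1, 25), today),
}
print(load_order(priorities, touched=set()))
# ['REG_TOKEN_LEAK', 'DEC_ROTATE_ON_REFRESH']
```

Passing `touched={"COMP_BILLING"}` would keep the LOW-priority node, placed last.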


Benefits

For Individual Developers

| Pain point | How simplegraph-agentic helps |
|---|---|
| Agent re-introduces a fixed bug | Regression nodes with REGRESSED_N_TIMES teach the agent which areas are fragile |
| Agent refactors away a deliberate pattern | Decision nodes document the why, not just the what |
| Agent generates code that violates team conventions | Anti-patterns file lists what the AI should never generate |
| Agent wastes tokens reading irrelevant context | Task routing table directs loading to only relevant files |
| Agent can't find where code lives | Auto-map (generated by ctags) gives structural awareness without manual docs |
| Session starts cold every time | Graph persists across sessions: committed to git, always available |

For Teams

| Pain point | How simplegraph-agentic helps |
|---|---|
| New team members repeat old mistakes | The graph captures institutional knowledge as typed, searchable nodes |
| AI tools make different decisions for different devs | The graph is committed and shared, so everyone's AI reads the same truth |
| Multi-repo projects lose cross-boundary context | Shared graph captures API contracts, org-wide invariants, and cross-repo regressions |
| Graph updates conflict in PRs | One-file-per-component + append-only convention = clean merges |
| Large codebase overflows the index | Hierarchical routing (domain indexes) scales to any size |

Quickstart

Using the installer (recommended)

git clone https://github.com/karstom/simplegraph-agentic.git
cd simplegraph-agentic
bash setup.sh /path/to/your/project

The script will:

  1. Copy core/ into your project (optionally shared/ for multi-repo setups)
  2. Ask which AI tool you use and install the right adapter
  3. Run the consistency check
  4. Print next steps — including the seed prompt

Manual install

  1. Copy core/ into your project root.
  2. Pick an adapter from adapters/ — see the Adapter Matrix.
  3. Run the seed prompt in scripts/seed_prompt.md in your AI tool to bootstrap the graph.
  4. Commit core/.

Multi-Repo / Team Projects

For projects spanning multiple repos:

  1. Copy core/ into each individual repo for per-repo knowledge.
  2. Copy shared/ into a dedicated org-memory repo or monorepo root for cross-repo knowledge.
  3. In each repo's core/graph_index.md, set the shared graph path.
  4. Configure shared/auto_map_config.yaml to list all repos for shared structural mapping.

See shared/graph_index.md for the full multi-repo setup guide.


Adapter Matrix

| AI Tool | Adapter file | Install path |
|---|---|---|
| Antigravity | adapters/antigravity/SKILL.md | .agent/skills/memory/SKILL.md |
| Cursor | adapters/cursor/memory.mdc | .cursor/rules/memory.mdc |
| Claude Code | adapters/claude-code/CLAUDE_MEMORY.md | Paste section into CLAUDE.md |
| GitHub Copilot | adapters/copilot/copilot-instructions-memory.md | Merge into .github/copilot-instructions.md |
| Generic | adapters/generic/AGENT_MEMORY.md | Paste into custom instructions |

The generic adapter works with ChatGPT Projects, Gemini Gems, Windsurf, Aider, Cline, or any tool that accepts a persistent system prompt.


Graph Structure

core/
├── graph_index.md          # Mandatory session-start read (~50 lines)
├── anti_patterns.md        # What the AI should NEVER generate
├── invariants.md           # Hard rules (e.g. "never call X without Y")
├── regressions.md          # Bugs + REGRESSED_N_TIMES counters
├── decisions.md            # Architectural choices with rationale
├── watchlists.md           # Dangerous code areas + open issues
├── HOW_TO_UPDATE.md        # When and how to update the graph
├── components/             # One file per major service/module
├── archive/
│   └── resolved_regressions.md
├── auto_map.md             # (generated, gitignored) structural repo map
└── .scratchpad.md          # (gitignored) session-local AI notes

For multi-repo teams, a shared/ directory adds:

shared/
├── graph_index.md          # Cross-repo index + setup guide
├── auto_map_config.yaml    # Repos to include in shared auto-map
├── invariants.md           # API contracts, org-wide rules
├── regressions.md          # Cross-repo bugs
├── decisions.md            # Platform-level architectural choices
└── watchlists.md           # Integration boundaries

Node Types

| Type | Purpose | Priority signal |
|---|---|---|
| Component | A service, module, or subsystem | |
| Invariant | A hard rule that must never be violated | HIGH if has VIOLATED_BY edge |
| Regression | A bug that has occurred, especially recurring ones | HIGH if REGRESSED_N_TIMES ≥ 2 |
| Decision | An intentional architectural choice with documented rationale | |
| Watchlist | A dangerous code area requiring extra caution | HIGH by definition |
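
As a rough illustration of how the type-specific priority signals in this table could be modeled, the sketch below uses an enum and a dataclass. The `Node` fields and the example node id are assumptions made for the example. The actual node layout lives in the markdown files under core/ and is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum


class NodeType(Enum):
    COMPONENT = "Component"
    INVARIANT = "Invariant"
    REGRESSION = "Regression"
    DECISION = "Decision"
    WATCHLIST = "Watchlist"


@dataclass
class Node:
    node_id: str                        # hypothetical identifier, e.g. "REG_TOKEN_LEAK"
    node_type: NodeType
    regressed_n_times: int = 0          # meaningful for Regression nodes
    has_violated_by_edge: bool = False  # meaningful for Invariant nodes


def is_high_priority(node: Node) -> bool:
    """Apply the per-type HIGH signals listed in the Node Types table."""
    if node.node_type is NodeType.WATCHLIST:
        return True  # HIGH by definition
    if node.node_type is NodeType.INVARIANT:
        return node.has_violated_by_edge
    if node.node_type is NodeType.REGRESSION:
        return node.regressed_n_times >= 2
    # The table lists no type-specific signal for Component or Decision.
    return False


leak = Node("REG_TOKEN_LEAK", NodeType.REGRESSION, regressed_n_times=3)
print(leak.node_id, is_high_priority(leak))  # REG_TOKEN_LEAK True
```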

Edge Types

| Edge | Meaning |
|---|---|
| DEPENDS_ON | This node requires the target to function correctly |
| CAUSES | Violating this node causes the target problem |
| MITIGATES | This node reduces the risk of the target |
| FIXED_BY | This regression was resolved by the target |
| VIOLATED_BY | This invariant was broken by the target regression |
| CONTAINS | This Watchlist or Component contains the target |
Scripts

| Script | Purpose |
|---|---|
| setup.sh | Interactive installer: copies scaffold + adapter into your project |
| scripts/auto_map.sh | Generates structural repo map from ctags (requires Universal Ctags) |
| scripts/auto_map_shared.sh | Generates combined public API map across multiple repos |
| scripts/consistency_check.sh | Verifies no broken edge references in the graph |
| scripts/stale_check.sh | Flags nodes with old dates or dead file references |
| scripts/token_benchmark.sh | Measures token efficiency: compare tiered vs monolith |
| scripts/seed_prompt.md | One-shot prompt to bootstrap the graph from a cold start |

How the Graph Grows

You don't build the graph all at once. It accumulates naturally:

  1. Day 1: Run the seed prompt → AI generates initial Component and Decision nodes
  2. Week 1: Fix a bug → add a Regression node in the same commit
  3. Week 2: Discover a dangerous pattern → add a Watchlist entry
  4. Month 1: Notice a bug keeps recurring → REGRESSED_N_TIMES increments, AI automatically treats it as high-risk
  5. Ongoing: Review the stale check periodically → clean up nodes that no longer apply

The graph is a living document. Low quality at seed time is fine — it improves through real usage.


Scaling

| Project size | Strategy |
|---|---|
| Solo / small project (<10 components) | Single graph_index.md with flat routing table |
| Medium project (10-30 components) | Same, but consider splitting multi-node files (per-node files) if merge conflicts increase |
| Large project (30+ components) | Hierarchical routing: domain-level indexes |
| Multi-repo / microservices | Per-repo core/ + shared org-level graph |

Design Principles

  1. Zero infrastructure. No databases, no servers, no build steps. Plain markdown + git.
  2. Opinionated about staying small. 5 high-signal nodes beat 50 shallow ones.
  3. AI writes the graph alongside the code. Graph updates go in the same commit as the fix.
  4. Tiered loading. The agent reads 50 lines at session start, not 5,000.
  5. Git-native. The graph is committed, versioned, branched, and reviewed like code.

Compatibility

  • Any language, any framework — the graph is plain markdown.
  • Any AI tool — adapters are the only tool-specific piece.
  • Git-friendly — commit the core/ directory; the whole team shares the same memory.
  • No required dependencies: the bash scripts are optional checks, and only auto-map needs Universal Ctags.

Contributing

See CONTRIBUTING.md.

License

MIT
