You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
docs: reposition around always-fresh graph + optional LLM enhancement
Shift core messaging from "zero cloud / fully local" to "always-fresh
incremental graph at zero cost, optionally enhanced with your LLM."
- README: new hero tagline, add "Why most tools can't keep up" section,
reframe feature comparison around rebuild speed and LLM-optional mode
- COMPETITIVE_ANALYSIS: lead with incremental builds and dual-mode as
top differentiators, add LLM provider integration to Tier 2 roadmap
- FOUNDATION: principle 1 becomes "graph is always current", principle 4
becomes "zero-cost core, LLM-enhanced when you choose", update
competitive position around the three questions no competitor answers
@@ -77,11 +77,12 @@ Ranked by weighted score across 6 dimensions (each 1–5):
77
77
78
78
| Strength | Details |
79
79
|----------|---------|
80
-
|**Zero-dependency deployment**|`npm install` and done. No Docker, no cloud, no API keys needed. Most competitors require Docker (Memgraph, Neo4j, Dgraph, Qdrant) or cloud APIs |
80
+
|**Always-fresh graph (incremental rebuilds)**| File-level MD5 hashing means only changed files are re-parsed. Change 1 file in a 3,000-file project → rebuild in under a second. No other tool in this space offers this. Competitors re-index everything from scratch — making them unusable in commit hooks, watch mode, or agent-driven loops |
81
+
|**Zero-cost core, LLM-enhanced when you choose**| The full graph pipeline (parse, resolve, query, impact analysis) runs with no API keys, no cloud, no cost. LLM features (richer embeddings, semantic search) are an optional layer on top — using whichever provider the user already works with. Competitors either require cloud APIs for core features (code-graph-rag, autodev-codebase) or offer no AI enhancement at all (CKB, axon). Nobody else offers both modes in one tool |
82
+
|**Data goes only where you send it**| Your code reaches exactly one place: the AI agent you already chose (via MCP). No additional third-party services, no surprise cloud calls. Competitors like code-graph-rag, autodev-codebase, and Claude-code-memory send your code to additional AI providers beyond the agent you're using |
81
83
|**Dual engine architecture**| Only project with native Rust (napi-rs) + automatic WASM fallback. Others are pure Rust OR pure JS/Python — never both |
82
84
|**Single-repo MCP isolation**| Security-conscious default: tools have no `repo` property unless `--multi-repo` is explicitly enabled. Most competitors default to exposing everything |
83
-
|**Incremental builds**| File-hash-based skip of unchanged files. Some competitors re-index everything |
84
-
|**Platform binaries**| Published `@optave/codegraph-{platform}-{arch}` optional packages — true npm-native distribution |
85
+
|**Zero-dependency deployment**|`npm install` and done. No Docker, no external databases, no Python, no SCIP toolchains. Published platform-specific binaries (`@optave/codegraph-{platform}-{arch}`) resolve automatically |
85
86
|**Import resolution depth**| 6-level priority system with confidence scoring — more sophisticated than most competitors' resolution |
86
87
87
88
---
@@ -135,6 +136,7 @@ Ranked by weighted score across 6 dimensions (each 1–5):
135
136
### Tier 2: High impact, medium effort
136
137
| Feature | Inspired by | Why |
137
138
|---------|------------|-----|
139
+
|**Optional LLM provider integration**| code-graph-rag, autodev-codebase | Bring-your-own provider (OpenAI, etc.) for richer embeddings and AI-powered search. Enhancement layer only — core graph never depends on it. No other tool offers both zero-cost local and LLM-enhanced modes in one package |
138
140
|**Compound MCP tools**| CKB |`explore`/`understand` meta-tools that batch deps + fn + map into single responses. Biggest token-savings opportunity |
139
141
|**Token counting on responses**| glimpse, arbor | tiktoken-based counts so agents know context budget consumed |
140
142
|**Node classification**| arbor | Auto-tag Entry Point / Core / Utility / Adapter from in-degree/out-degree patterns |
@@ -153,10 +155,10 @@ Ranked by weighted score across 6 dimensions (each 1–5):
153
155
| Feature | Why skip |
154
156
|---------|----------|
155
157
| Memgraph/Neo4j/KuzuDB | Our SQLite = zero Docker, simpler deployment. Query gap matters less than simplicity |
156
-
| Multi-provider AI | We're deliberately cloud-free — that's a feature, not a limitation |
157
158
| SCIP indexing | Would require maintaining SCIP toolchains per language. Tree-sitter + native Rust is the right bet |
158
159
| CrewAI multi-agent | Overengineered for a code analysis tool. Keep the scope focused |
159
160
| Clipboard/LLM-dump mode | Different product category (glimpse). We're a graph tool, not a context-packer |
161
+
| Cloud APIs for core features | We will add LLM provider support, but as an **optional enhancement layer** — the core graph must always work with zero API keys and zero cost. This is the opposite of code-graph-rag's approach where cloud APIs are required for core functionality |
Copy file name to clipboardExpand all lines: FOUNDATION.md
+31-26Lines changed: 31 additions & 26 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -8,27 +8,27 @@
8
8
9
9
## Why Codegraph Exists
10
10
11
-
There are 20+ code analysis and code graph tools in the open-source ecosystem. Most require Docker, Python environments, cloud API keys, or external databases. None of them ship as a single npm package with native performance.
11
+
There are 20+ code analysis and code graph tools in the open-source ecosystem. They all force a choice: **fast local analysis with no AI, or powerful AI features that require full re-indexing through cloud APIs on every change.** None of them give you an always-current graph that you can rebuild on every commit and optionally enhance with the LLM provider you already use.
12
12
13
-
Codegraph exists to be **the code intelligence engine for the JavaScript ecosystem** — the one you `npm install` and it just works, on every platform, with nothing else to set up.
13
+
Codegraph exists to be **the code intelligence engine that keeps up with your commits** — an always-fresh graph that works at zero cost out of the box, with optional LLM enhancement through the provider you choose. Your code only goes where you send it.
14
14
15
15
---
16
16
17
17
## Core Principles
18
18
19
19
These principles define what codegraph is and is not. Every feature decision, PR review, and architectural choice should be measured against them.
20
20
21
-
### 1. Zero-infrastructure deployment
21
+
### 1. The graph is always current
22
22
23
-
**Codegraph must never require anything beyond `npm install`.**
23
+
**Codegraph must rebuild fast enough to run on every commit, every save, in every agent loop.**
24
24
25
-
No Docker. No external databases. No cloud accounts. No API keys for core functionality. No Python. No Go toolchain. No manual compilation steps.
25
+
This is our single most important differentiator. Every competitor in this space either re-indexes from scratch on every change (making them unusable in tight loops) or requires cloud API calls baked into the rebuild pipeline (making them slow and costly to run frequently).
26
26
27
-
SQLite is our database because it's embedded. WASM grammars are our fallback because they run everywhere Node.js runs. Optional dependencies (`@huggingface/transformers`, `@modelcontextprotocol/sdk`) are lazy-loaded and degrade gracefully.
27
+
File-level MD5 hashing means only changed files are re-parsed. Change one file in a 3,000-file project → rebuild in under a second. This makes commit hooks, watch mode, and AI-agent-triggered rebuilds practical. The graph is never stale.
28
28
29
-
This is our single most important differentiator. Every competitor that adds Docker to their install instructions loses users we should capture.
29
+
The core pipeline is pure local computation — tree-sitter + SQLite. No API calls, no network latency, no cost. This isn't about being anti-cloud. It's about being fast enough that the graph can stay current without waiting on anything external.
30
30
31
-
*Test: can a developer on a fresh machine run `npm install @optave/codegraph && codegraph build .`with zero prior setup? If not, we broke this principle.*
31
+
*Test: after changing one file in a 1000-file project, does `codegraph build .`complete in under 500ms? Can it run in a commit hook without the developer noticing?*
32
32
33
33
### 2. Native speed, universal reach
34
34
@@ -52,15 +52,17 @@ This principle extends beyond import resolution. When we add features — dead c
52
52
53
53
*Test: does every query result include enough context for the consumer to judge its reliability?*
54
54
55
-
### 4. Incremental by default
55
+
### 4. Zero-cost core, LLM-enhanced when you choose
56
56
57
-
**Never re-parse what hasn't changed.**
57
+
**The full graph works with no API keys. AI features are an optional layer on top.**
58
58
59
-
File-level MD5 hashing tracks what changed between builds. Only modified files get re-parsed, and their stale nodes/edges are cleaned before re-insertion. This makes watch-mode and AI-agent loops practical — rebuilds drop from seconds to milliseconds.
59
+
The core pipeline — parse, resolve, store, query, impact analysis — runs entirely locally with zero cost. No accounts, no API keys, no cloud calls. This is the mode that runs on every commit.
60
60
61
-
This is not a feature flag. It's the default behavior. The graph is always fresh with minimum work.
61
+
LLM-powered features (richer embeddings, semantic search, AI-enhanced analysis) are an optional enhancement layer. When enabled, they use whichever provider the user already works with (OpenAI, etc.). Your code goes to exactly one place: the provider you chose. No additional third-party services, no surprise cloud calls.
62
62
63
-
*Test: after changing one file in a 1000-file project, does `codegraph build .` complete in under 500ms?*
63
+
This dual-mode approach is unique in the competitive landscape. Competitors either require cloud APIs for core functionality (code-graph-rag, autodev-codebase) or offer no AI enhancement at all (CKB, axon, arbor). Nobody else offers both modes in one tool.
64
+
65
+
*Test: does every core command (`build`, `query`, `fn`, `deps`, `impact`, `diff-impact`, `cycles`, `map`) work with zero API keys? Are LLM features additive, never blocking?*
64
66
65
67
### 5. Embeddable first, CLI second
66
68
@@ -116,34 +118,37 @@ Staying in our lane means we can be embedded inside tools that do those things
116
118
- Features that improve **result quality**: fuzzy search, confidence scoring, node classification, compound queries that reduce agent round-trips
117
119
- Features that improve **speed**: faster native parsing, smarter incremental builds, lighter-weight search alternatives (FTS5/TF-IDF alongside full embeddings)
118
120
- Features that improve **embeddability**: better programmatic API, streaming results, output format options
121
+
-**Optional LLM provider integration**: bring-your-own provider (OpenAI, etc.) for richer embeddings, AI-powered search, and enhanced analysis — always as an additive layer that never blocks the core pipeline (Principle 4)
- Features that require non-npm dependencies — violates Principle 1
130
+
- Features that require non-npm dependencies — keeps deployment simple
128
131
129
132
---
130
133
131
134
## Competitive Position
132
135
133
136
As of February 2026, codegraph is **#7 out of 22** in the code intelligence tool space (see [COMPETITIVE_ANALYSIS.md](./COMPETITIVE_ANALYSIS.md)).
134
137
135
-
Six tools rank above us on feature breadth and community size. But none of them occupy our niche: **the npm-native, zero-config, dual-engine code intelligence library.**
138
+
Six tools rank above us on feature breadth and community size. But none of them can answer yes to all three questions:
139
+
140
+
1.**Can you rebuild the graph on every commit in a large codebase?** — Only codegraph has incremental builds. Everyone else re-indexes from scratch.
141
+
2.**Does the core pipeline work with zero API keys and zero cost?** — Tools like code-graph-rag and autodev-codebase require cloud APIs for core features. Codegraph's full graph pipeline is local and costless.
142
+
3.**Can you optionally enhance with your LLM provider?** — Local-only tools (CKB, axon, arbor) have no AI enhancement path. Cloud-dependent tools force it. Only codegraph makes it optional.
| Fast local analysis **or** AI-powered features | Both — zero-cost core + optional LLM layer |
147
+
| Full re-index on every change **or** stale graph | Always-current graph via incremental builds |
148
+
| Code goes to multiple cloud services **or** no AI at all | Code goes only to the one provider you chose |
149
+
| Docker + Python + external DB **or** nothing works |`npm install` and done |
145
150
146
-
Our path to #1 is not feature parity with every competitor. It's making codegraph **the obvious default for any JavaScript developer or tool that needs code intelligence** — because it's the only one that doesn't ask them to leave the npm ecosystem.
151
+
Our path to #1 is not feature parity with every competitor. It's being **the only code intelligence tool where the graph is always current, works at zero cost, and optionally gets smarter with the LLM you already use.**
0 commit comments