
Commit a403acc

docs: reposition around always-fresh graph + optional LLM enhancement
Shift core messaging from "zero cloud / fully local" to "always-fresh incremental graph at zero cost, optionally enhanced with your LLM."

- README: new hero tagline, add "Why most tools can't keep up" section, reframe feature comparison around rebuild speed and LLM-optional mode
- COMPETITIVE_ANALYSIS: lead with incremental builds and dual-mode as top differentiators, add LLM provider integration to Tier 2 roadmap
- FOUNDATION: principle 1 becomes "graph is always current", principle 4 becomes "zero-cost core, LLM-enhanced when you choose", update competitive position around the three questions no competitor answers
1 parent bd12063 commit a403acc

3 files changed: +86 −52 lines changed

COMPETITIVE_ANALYSIS.md

Lines changed: 7 additions & 5 deletions
@@ -17,7 +17,7 @@ Ranked by weighted score across 6 dimensions (each 1–5):
 | 4 | 3.9 | [harshkedia177/axon](https://github.com/harshkedia177/axon) | 29 | Python | None | 11-phase pipeline, KuzuDB, Leiden community detection, dead code, change coupling |
 | 5 | 3.8 | [anrgct/autodev-codebase](https://github.com/anrgct/autodev-codebase) | 111 | TypeScript | None | 40+ languages, 7 embedding providers, Cytoscape.js visualization, LLM reranking |
 | 6 | 3.7 | [Anandb71/arbor](https://github.com/Anandb71/arbor) | 85 | Rust | MIT | Native GUI, confidence scoring, architectural role classification, fuzzy search, MCP |
-| **7** | **3.6** | **[@optave/codegraph](https://github.com/optave/codegraph)** || **JS/Rust** | **Apache-2.0** | **Dual engine (native Rust + WASM), 11 languages, SQLite, MCP, semantic search, zero-cloud** |
+| **7** | **3.6** | **[@optave/codegraph](https://github.com/optave/codegraph)** || **JS/Rust** | **Apache-2.0** | **Sub-second incremental rebuilds, dual engine (native Rust + WASM), 11 languages, MCP, zero-cost core + optional LLM enhancement** |
 | 8 | 3.4 | [Durafen/Claude-code-memory](https://github.com/Durafen/Claude-code-memory) | 72 | Python | None | Memory Guard quality gate, persistent codebase memory, Voyage AI + Qdrant |
 | 9 | 3.3 | [NeuralRays/codexray](https://github.com/NeuralRays/codexray) | 2 | TypeScript | MIT | 16 MCP tools, TF-IDF semantic search (~50MB), dead code, complexity, path finding |
 | 10 | 3.2 | [al1-nasir/codegraph-cli](https://github.com/al1-nasir/codegraph-cli) | 11 | Python | MIT | CrewAI multi-agent system, 6 LLM providers, browser explorer, DOCX export |
@@ -77,11 +77,12 @@ Ranked by weighted score across 6 dimensions (each 1–5):
 
 | Strength | Details |
 |----------|---------|
-| **Zero-dependency deployment** | `npm install` and done. No Docker, no cloud, no API keys needed. Most competitors require Docker (Memgraph, Neo4j, Dgraph, Qdrant) or cloud APIs |
+| **Always-fresh graph (incremental rebuilds)** | File-level MD5 hashing means only changed files are re-parsed. Change 1 file in a 3,000-file project → rebuild in under a second. No other tool in this space offers this. Competitors re-index everything from scratch — making them unusable in commit hooks, watch mode, or agent-driven loops |
+| **Zero-cost core, LLM-enhanced when you choose** | The full graph pipeline (parse, resolve, query, impact analysis) runs with no API keys, no cloud, no cost. LLM features (richer embeddings, semantic search) are an optional layer on top — using whichever provider the user already works with. Competitors either require cloud APIs for core features (code-graph-rag, autodev-codebase) or offer no AI enhancement at all (CKB, axon). Nobody else offers both modes in one tool |
+| **Data goes only where you send it** | Your code reaches exactly one place: the AI agent you already chose (via MCP). No additional third-party services, no surprise cloud calls. Competitors like code-graph-rag, autodev-codebase, and Claude-code-memory send your code to additional AI providers beyond the agent you're using |
 | **Dual engine architecture** | Only project with native Rust (napi-rs) + automatic WASM fallback. Others are pure Rust OR pure JS/Python — never both |
 | **Single-repo MCP isolation** | Security-conscious default: tools have no `repo` property unless `--multi-repo` is explicitly enabled. Most competitors default to exposing everything |
-| **Incremental builds** | File-hash-based skip of unchanged files. Some competitors re-index everything |
-| **Platform binaries** | Published `@optave/codegraph-{platform}-{arch}` optional packages — true npm-native distribution |
+| **Zero-dependency deployment** | `npm install` and done. No Docker, no external databases, no Python, no SCIP toolchains. Published platform-specific binaries (`@optave/codegraph-{platform}-{arch}`) resolve automatically |
 | **Import resolution depth** | 6-level priority system with confidence scoring — more sophisticated than most competitors' resolution |
 
 ---
@@ -135,6 +136,7 @@ Ranked by weighted score across 6 dimensions (each 1–5):
 ### Tier 2: High impact, medium effort
 | Feature | Inspired by | Why |
 |---------|------------|-----|
+| **Optional LLM provider integration** | code-graph-rag, autodev-codebase | Bring-your-own provider (OpenAI, etc.) for richer embeddings and AI-powered search. Enhancement layer only — core graph never depends on it. No other tool offers both zero-cost local and LLM-enhanced modes in one package |
 | **Compound MCP tools** | CKB | `explore`/`understand` meta-tools that batch deps + fn + map into single responses. Biggest token-savings opportunity |
 | **Token counting on responses** | glimpse, arbor | tiktoken-based counts so agents know context budget consumed |
 | **Node classification** | arbor | Auto-tag Entry Point / Core / Utility / Adapter from in-degree/out-degree patterns |
@@ -153,10 +155,10 @@ Ranked by weighted score across 6 dimensions (each 1–5):
 | Feature | Why skip |
 |---------|----------|
 | Memgraph/Neo4j/KuzuDB | Our SQLite = zero Docker, simpler deployment. Query gap matters less than simplicity |
-| Multi-provider AI | We're deliberately cloud-free — that's a feature, not a limitation |
 | SCIP indexing | Would require maintaining SCIP toolchains per language. Tree-sitter + native Rust is the right bet |
 | CrewAI multi-agent | Overengineered for a code analysis tool. Keep the scope focused |
 | Clipboard/LLM-dump mode | Different product category (glimpse). We're a graph tool, not a context-packer |
+| Cloud APIs for core features | We will add LLM provider support, but as an **optional enhancement layer** — the core graph must always work with zero API keys and zero cost. This is the opposite of code-graph-rag's approach where cloud APIs are required for core functionality |
 
 ---
 
FOUNDATION.md

Lines changed: 31 additions & 26 deletions
@@ -8,27 +8,27 @@
 
 ## Why Codegraph Exists
 
-There are 20+ code analysis and code graph tools in the open-source ecosystem. Most require Docker, Python environments, cloud API keys, or external databases. None of them ship as a single npm package with native performance.
+There are 20+ code analysis and code graph tools in the open-source ecosystem. They all force a choice: **fast local analysis with no AI, or powerful AI features that require full re-indexing through cloud APIs on every change.** None of them give you an always-current graph that you can rebuild on every commit and optionally enhance with the LLM provider you already use.
 
-Codegraph exists to be **the code intelligence engine for the JavaScript ecosystem** — the one you `npm install` and it just works, on every platform, with nothing else to set up.
+Codegraph exists to be **the code intelligence engine that keeps up with your commits** — an always-fresh graph that works at zero cost out of the box, with optional LLM enhancement through the provider you choose. Your code only goes where you send it.
 
 ---
 
 ## Core Principles
 
 These principles define what codegraph is and is not. Every feature decision, PR review, and architectural choice should be measured against them.
 
-### 1. Zero-infrastructure deployment
+### 1. The graph is always current
 
-**Codegraph must never require anything beyond `npm install`.**
+**Codegraph must rebuild fast enough to run on every commit, every save, in every agent loop.**
 
-No Docker. No external databases. No cloud accounts. No API keys for core functionality. No Python. No Go toolchain. No manual compilation steps.
+This is our single most important differentiator. Every competitor in this space either re-indexes from scratch on every change (making them unusable in tight loops) or requires cloud API calls baked into the rebuild pipeline (making them slow and costly to run frequently).
 
-SQLite is our database because it's embedded. WASM grammars are our fallback because they run everywhere Node.js runs. Optional dependencies (`@huggingface/transformers`, `@modelcontextprotocol/sdk`) are lazy-loaded and degrade gracefully.
+File-level MD5 hashing means only changed files are re-parsed. Change one file in a 3,000-file project → rebuild in under a second. This makes commit hooks, watch mode, and AI-agent-triggered rebuilds practical. The graph is never stale.
 
-This is our single most important differentiator. Every competitor that adds Docker to their install instructions loses users we should capture.
+The core pipeline is pure local computation — tree-sitter + SQLite. No API calls, no network latency, no cost. This isn't about being anti-cloud. It's about being fast enough that the graph can stay current without waiting on anything external.
 
-*Test: can a developer on a fresh machine run `npm install @optave/codegraph && codegraph build .` with zero prior setup? If not, we broke this principle.*
+*Test: after changing one file in a 1000-file project, does `codegraph build .` complete in under 500ms? Can it run in a commit hook without the developer noticing?*
 
 ### 2. Native speed, universal reach
 
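The commit-hook test in the rewritten principle can be wired up directly. A hypothetical `.git/hooks/pre-commit` sketch (it assumes `@optave/codegraph` is installed in the repo and invokable via `npx`; adjust to your setup):

```shell
#!/bin/sh
# Hypothetical pre-commit hook: refresh the code graph before every commit.
# Incremental hashing means only the files touched since the last build are
# re-parsed, so on a warm index this should finish in well under a second.
npx codegraph build . || {
  echo "codegraph build failed; commit aborted" >&2
  exit 1
}
```

Make the hook executable (`chmod +x .git/hooks/pre-commit`) and every commit keeps the graph current with no extra workflow step.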
@@ -52,15 +52,17 @@ This principle extends beyond import resolution. When we add features — dead c
 
 *Test: does every query result include enough context for the consumer to judge its reliability?*
 
-### 4. Incremental by default
+### 4. Zero-cost core, LLM-enhanced when you choose
 
-**Never re-parse what hasn't changed.**
+**The full graph works with no API keys. AI features are an optional layer on top.**
 
-File-level MD5 hashing tracks what changed between builds. Only modified files get re-parsed, and their stale nodes/edges are cleaned before re-insertion. This makes watch-mode and AI-agent loops practical — rebuilds drop from seconds to milliseconds.
+The core pipeline — parse, resolve, store, query, impact analysis — runs entirely locally with zero cost. No accounts, no API keys, no cloud calls. This is the mode that runs on every commit.
 
-This is not a feature flag. It's the default behavior. The graph is always fresh with minimum work.
+LLM-powered features (richer embeddings, semantic search, AI-enhanced analysis) are an optional enhancement layer. When enabled, they use whichever provider the user already works with (OpenAI, etc.). Your code goes to exactly one place: the provider you chose. No additional third-party services, no surprise cloud calls.
 
-*Test: after changing one file in a 1000-file project, does `codegraph build .` complete in under 500ms?*
+This dual-mode approach is unique in the competitive landscape. Competitors either require cloud APIs for core functionality (code-graph-rag, autodev-codebase) or offer no AI enhancement at all (CKB, axon, arbor). Nobody else offers both modes in one tool.
+
+*Test: does every core command (`build`, `query`, `fn`, `deps`, `impact`, `diff-impact`, `cycles`, `map`) work with zero API keys? Are LLM features additive, never blocking?*
 
 ### 5. Embeddable first, CLI second
 
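The "additive, never blocking" contract in the rewritten Principle 4 can be sketched as a thin wrapper: compute the local result first, and run the LLM layer only when a provider is configured — and never let the enhancement turn a working query into a failing one. All function names below are hypothetical illustrations, not the codegraph API:

```javascript
// Hypothetical sketch of a zero-cost core with an optional LLM layer.
// coreSearch/llmRerank are stand-ins, not real codegraph functions.

function coreSearch(query, files) {
  // Local, zero-cost path: a naive substring match stands in for the
  // real graph/FTS query. Works with no API key at all.
  return files.filter((f) => f.includes(query));
}

async function llmRerank(results, query, provider) {
  // Stand-in for a call to the user's chosen LLM provider; here we just
  // reverse the list to mark where the enhancement hook sits.
  return [...results].reverse();
}

async function search(query, files, { provider } = {}) {
  const results = coreSearch(query, files); // always runs, zero cost
  if (!provider) return results;            // core mode: return as-is
  try {
    return await llmRerank(results, query, provider); // optional layer
  } catch {
    return results; // enhancement is additive, never blocking
  }
}
```

The try/catch is the key design choice: a provider outage degrades to core-mode results instead of failing the query, which is what "additive, never blocking" means in practice.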
@@ -116,34 +118,37 @@ Staying in our lane means we can be embedded inside tools that do those things
 - Features that improve **result quality**: fuzzy search, confidence scoring, node classification, compound queries that reduce agent round-trips
 - Features that improve **speed**: faster native parsing, smarter incremental builds, lighter-weight search alternatives (FTS5/TF-IDF alongside full embeddings)
 - Features that improve **embeddability**: better programmatic API, streaming results, output format options
+- **Optional LLM provider integration**: bring-your-own provider (OpenAI, etc.) for richer embeddings, AI-powered search, and enhanced analysis — always as an additive layer that never blocks the core pipeline (Principle 4)
 
 ### We will not build
 
-- External database backends (Memgraph, Neo4j, Qdrant, etc.) — violates Principle 1
-- Cloud API integrations for core functionality — violates Principle 1
+- External database backends (Memgraph, Neo4j, Qdrant, etc.) — violates Principle 1 (speed) and zero-infrastructure goal
+- Cloud API calls in the core pipeline — violates Principle 1 (the graph must always rebuild in under a second) and Principle 4 (zero-cost core)
 - AI-powered code generation or editing — violates Principle 8
 - Multi-agent orchestration — violates Principle 8
 - Native desktop GUI — outside our lane; we're a library
-- Features that require non-npm dependencies — violates Principle 1
+- Features that require non-npm dependencies — keeps deployment simple
 
 ---
 
 ## Competitive Position
 
 As of February 2026, codegraph is **#7 out of 22** in the code intelligence tool space (see [COMPETITIVE_ANALYSIS.md](./COMPETITIVE_ANALYSIS.md)).
 
-Six tools rank above us on feature breadth and community size. But none of them occupy our niche: **the npm-native, zero-config, dual-engine code intelligence library.**
+Six tools rank above us on feature breadth and community size. But none of them can answer yes to all three questions:
+
+1. **Can you rebuild the graph on every commit in a large codebase?** — Only codegraph has incremental builds. Everyone else re-indexes from scratch.
+2. **Does the core pipeline work with zero API keys and zero cost?** — Tools like code-graph-rag and autodev-codebase require cloud APIs for core features. Codegraph's full graph pipeline is local and costless.
+3. **Can you optionally enhance with your LLM provider?** — Local-only tools (CKB, axon, arbor) have no AI enhancement path. Cloud-dependent tools force it. Only codegraph makes it optional.
 
-| What competitors need | What codegraph needs |
-|-----------------------|----------------------|
-| Docker (Memgraph, Neo4j, Qdrant, Dgraph) | Nothing |
-| Python environment | Nothing |
-| Cloud API keys (OpenAI, Gemini, Voyage AI) | Nothing |
-| Manual Rust/Go compilation | Nothing |
-| External secret management setup | Nothing |
-| `npm install @optave/codegraph` | That's it |
+| What competitors force you to choose | What codegraph gives you |
+|--------------------------------------|--------------------------|
+| Fast local analysis **or** AI-powered features | Both — zero-cost core + optional LLM layer |
+| Full re-index on every change **or** stale graph | Always-current graph via incremental builds |
+| Code goes to multiple cloud services **or** no AI at all | Code goes only to the one provider you chose |
+| Docker + Python + external DB **or** nothing works | `npm install` and done |
 
-Our path to #1 is not feature parity with every competitor. It's making codegraph **the obvious default for any JavaScript developer or tool that needs code intelligence** — because it's the only one that doesn't ask them to leave the npm ecosystem.
+Our path to #1 is not feature parity with every competitor. It's being **the only code intelligence tool where the graph is always current, works at zero cost, and optionally gets smarter with the LLM you already use.**
 
 ---
 