A minimal port of codebase-memory-mcp as a pi-coding-agent extension.
Instead of a Go binary + tree-sitter + SQLite, the extension runs entirely inside the Node.js process that already hosts pi:
| MCP component | Extension equivalent |
|---|---|
| Go binary + CGO | Node.js built-ins — zero native deps |
| tree-sitter AST | Per-language regex, line-by-line with quick-filter |
| Content-hash incremental index | MD5 per file, async stat batch, same skip-if-unchanged logic |
| SQLite WAL database | .pi-codebase.bin next to your project (v8 serialized) |
| 11 MCP tools via stdio | 5 pi tools + 1 slash-command |
# Install directly from GitHub:
pi install git:github.com/R-Dson/pi-codebase
# Or via npm:
pi install npm:pi-codebase-memory
- GitHub: R-Dson/pi-codebase
- npm: pi-codebase-memory
Full scan — walks the project, extracts symbols, writes .pi-codebase.bin.
codebase_index()
codebase_index({ root_path: "/my/app" })
Supported languages: TypeScript · JavaScript · Python · Go · Rust · Java · C# · PHP · C · C++ · Ruby · Swift · Kotlin · Shell · Perl · Dart · Lua · Scala · R
Ignored directories: node_modules, .git, dist, build, .next, __pycache__,
target, .cache, vendor, .venv, venv, coverage, .nyc_output, out
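As a rough illustration of how the ignore list can be applied during a pure Node.js walk, here is a minimal sketch. The helper name `walk` and the constant `IGNORED_DIRS` are hypothetical and not taken from the extension's source; the real discovery step may use `find` and async I/O instead.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative } from "node:path";

// Directory names copied from the ignore list above.
const IGNORED_DIRS = new Set([
  "node_modules", ".git", "dist", "build", ".next", "__pycache__",
  "target", ".cache", "vendor", ".venv", "venv", "coverage", ".nyc_output", "out",
]);

// Hypothetical helper: collect file paths (relative to `root`) and never
// descend into a directory whose name is in IGNORED_DIRS.
function walk(root: string, dir: string = root, found: string[] = []): string[] {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      if (!IGNORED_DIRS.has(entry.name)) walk(root, full, found);
    } else if (entry.isFile()) {
      found.push(relative(root, full));
    }
  }
  return found;
}

// Usage: build a throwaway tree and walk it. Only src/a.ts should be reported.
const demoRoot = mkdtempSync(join(tmpdir(), "walk-demo-"));
mkdirSync(join(demoRoot, "src"));
mkdirSync(join(demoRoot, "node_modules", "dep"), { recursive: true });
writeFileSync(join(demoRoot, "src", "a.ts"), "export const a = 1;\n");
writeFileSync(join(demoRoot, "node_modules", "dep", "index.js"), "");
console.log(walk(demoRoot));
```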
Re-parses only files whose MD5 content hash has changed since the last index run. Unchanged files are reused verbatim — identical to the MCP's incremental reindex strategy. Falls back to a full scan when no prior index exists.
codebase_update() # check everything since last run
codebase_update({ root_path: "/my/app" })
Output tells you how many files were +added, -removed, or ~changed.
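The skip-if-unchanged logic can be sketched as a comparison of two path-to-hash maps. This is an illustrative sketch only: `md5`, `diffHashes`, and the `HashDiff` shape are hypothetical names, and the extension may well store more per file than a single digest.

```typescript
import { createHash } from "node:crypto";

type HashMap = Map<string, string>; // relative path -> MD5 hex digest

interface HashDiff {
  added: string[];
  removed: string[];
  changed: string[];
  unchanged: string[];
}

// MD5 of a file's content, hex-encoded.
function md5(content: string): string {
  return createHash("md5").update(content).digest("hex");
}

// Compare the previous index with the current state of the tree.
// Only `added` and `changed` files would need to be re-parsed.
function diffHashes(previous: HashMap, current: HashMap): HashDiff {
  const result: HashDiff = { added: [], removed: [], changed: [], unchanged: [] };
  for (const [path, hash] of current) {
    const old = previous.get(path);
    if (old === undefined) result.added.push(path);
    else if (old !== hash) result.changed.push(path);
    else result.unchanged.push(path);
  }
  for (const path of previous.keys()) {
    if (!current.has(path)) result.removed.push(path);
  }
  return result;
}

// Usage: b.ts was edited, c.ts was deleted, d.ts is new.
const lastRun: HashMap = new Map([
  ["a.ts", md5("let a = 1")],
  ["b.ts", md5("let b = 2")],
  ["c.ts", md5("let c = 3")],
]);
const thisRun: HashMap = new Map([
  ["a.ts", md5("let a = 1")],
  ["b.ts", md5("let b = 20")],
  ["d.ts", md5("let d = 4")],
]);
const hashDiff = diffHashes(lastRun, thisRun);
console.log(`+${hashDiff.added.length} -${hashDiff.removed.length} ~${hashDiff.changed.length}`);
// prints "+1 -1 ~1"
```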
Query the in-memory index — much faster than grep for structural questions.
Equivalent to search_graph in the MCP.
codebase_search({ query: "Handler" }) # name regex
codebase_search({ kind: "class" }) # by kind
codebase_search({ query: "process", file_pattern: "api" })
codebase_search({ kind: "function", limit: 100 })
Supported kinds: function · method · class · interface · type · variable · struct · enum · trait · module · route · http_call · macro · protocol · extension · object
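Conceptually, a query like the ones above is a filter over an in-memory symbol list. The sketch below is an assumption about how such a filter could look, not the extension's code: the `SymbolEntry` shape, the substring interpretation of `file_pattern`, and the default limit of 50 are all hypothetical.

```typescript
interface SymbolEntry {
  name: string;
  kind: string;
  file: string;
  line: number;
}

interface SearchOptions {
  query?: string;        // treated as a regular expression on the symbol name
  kind?: string;         // exact match on the symbol kind
  file_pattern?: string; // assumption: plain substring match on the file path
  limit?: number;        // assumption: defaults to 50 when omitted
}

// Hypothetical filter over an in-memory symbol list.
function searchSymbols(index: SymbolEntry[], opts: SearchOptions): SymbolEntry[] {
  const nameRe = opts.query ? new RegExp(opts.query) : null;
  const limit = opts.limit ?? 50;
  const hits: SymbolEntry[] = [];
  for (const sym of index) {
    if (nameRe && !nameRe.test(sym.name)) continue;
    if (opts.kind && sym.kind !== opts.kind) continue;
    if (opts.file_pattern && !sym.file.includes(opts.file_pattern)) continue;
    hits.push(sym);
    if (hits.length >= limit) break;
  }
  return hits;
}

// Usage with a tiny made-up index.
const sampleIndex: SymbolEntry[] = [
  { name: "processOrder", kind: "function", file: "src/api/orders.ts", line: 12 },
  { name: "OrderHandler", kind: "class", file: "src/api/handlers.ts", line: 3 },
  { name: "processBatch", kind: "function", file: "src/jobs/batch.ts", line: 40 },
];
console.log(searchSymbols(sampleIndex, { query: "process", file_pattern: "api" }));
```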
Find every usage of a symbol across the project.
Equivalent to trace_call_path(direction="inbound") in the MCP.
Search back-end priority:
- ripgrep (rg) — if installed; fastest, cross-platform including native Windows
- grep — Unix (Linux / macOS / WSL)
- Pure Node.js — always available; slower on large trees but works everywhere
codebase_refs({ symbol: "processOrder" })
codebase_refs({ symbol: "UserService", file_pattern: "*.ts" })
codebase_refs({ symbol: "main", limit: 200 })
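The back-end priority described for codebase_refs amounts to a simple fallback chain. The sketch below shows one way to express it; `isOnPath` and `pickBackend` are hypothetical names, and probing with `--version` is an assumption (some minimal grep builds may not support that flag).

```typescript
import { spawnSync } from "node:child_process";

type Backend = "rg" | "grep" | "node";

// Probe: true if `cmd --version` can be spawned and exits with status 0.
function isOnPath(cmd: string): boolean {
  const result = spawnSync(cmd, ["--version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Fallback chain: ripgrep first, then grep on Unix-like systems,
// then a pure Node.js scan that is always available.
function pickBackend(
  probe: (cmd: string) => boolean = isOnPath,
  platform: string = process.platform,
): Backend {
  if (probe("rg")) return "rg";
  if (platform !== "win32" && probe("grep")) return "grep";
  return "node";
}

// Usage: report what this machine would use.
console.log(`reference search back-end: ${pickBackend()}`);
```

Passing the probe as a parameter keeps the selection logic testable without depending on what happens to be installed.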
High-level overview: file counts per language, symbol counts per kind, index age,
root directory listing. Equivalent to get_graph_schema.
codebase_schema()
The output also reports the platform and which external tools are in use
(find, grep, rg), so you know exactly what the extension relies on.
/codebase → index status (root, file/symbol count, age, platform info)
| Language | Kinds |
|---|---|
| TypeScript / TSX | function, arrow function, class, interface, type, enum, method, route, http_call |
| JavaScript / JSX | function, arrow function, class, method, route, http_call |
| Python | function, method, class, route |
| Go | function, method, struct, interface, type |
| Rust | function, struct, enum, trait, type, module |
| Java | class, interface, enum, method, route |
| C# | class, interface, enum, struct, function, route |
| PHP | function, class, interface, route |
| C | function, struct, enum, type, macro |
| C++ | class, struct, enum, function, method, type, macro |
| Ruby | class, module, method |
| Swift | class, struct, protocol, enum, function, method, type, extension |
| Kotlin | class, interface, function, method, type, enum, object |
| Shell | function |
| Perl | function, module, class |
| Dart | class, function, method, enum, type, mixin |
| Lua | function, module |
| Scala | class, object, trait, function, method, type, enum |
| R | function |
Signatures are captured up to 200 characters, enough to show full generic bounds
in Rust (pub fn foo<T: Serialize + Clone>()) and long Java return types.
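To make the line-by-line approach concrete, here is a heavily simplified sketch of regex extraction with a quick-filter and the 200-character signature cap. The patterns, the `Extracted` shape, and the function name `extractSymbols` are hypothetical; the extension's per-language specs are certainly more thorough than these three TypeScript patterns.

```typescript
interface Extracted {
  name: string;
  kind: string;
  line: number;      // 1-based line number
  signature: string; // trimmed source line, capped at MAX_SIGNATURE characters
}

const MAX_SIGNATURE = 200;

// Cheap pre-check: lines without any of these keywords are skipped outright.
const quickFilter = /\b(function|class|interface)\b/;

// Hypothetical, simplified patterns for TypeScript declarations.
const patterns: { kind: string; re: RegExp }[] = [
  { kind: "function", re: /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/ },
  { kind: "class", re: /^\s*(?:export\s+)?(?:abstract\s+)?class\s+([A-Za-z_$][\w$]*)/ },
  { kind: "interface", re: /^\s*(?:export\s+)?interface\s+([A-Za-z_$][\w$]*)/ },
];

function extractSymbols(source: string): Extracted[] {
  const out: Extracted[] = [];
  const lines = source.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const text = lines[i];
    if (!quickFilter.test(text)) continue; // most lines stop here
    for (const { kind, re } of patterns) {
      const m = re.exec(text);
      if (m) {
        out.push({ name: m[1], kind, line: i + 1, signature: text.trim().slice(0, MAX_SIGNATURE) });
        break;
      }
    }
  }
  return out;
}

// Usage on a small snippet.
const sampleSource = [
  "import x from 'y';",
  "export async function loadUser(id: string): Promise<User> {",
  "  return db.get(id);",
  "}",
  "export class UserService {",
  "}",
].join("\n");
console.log(extractSymbols(sampleSource));
```

Because each line is matched independently, a syntax error elsewhere in the file does not prevent symbols from being found, at the cost of missing declarations that span unusual line layouts.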
| Environment | Discovery | Symbol extraction | Reference search |
|---|---|---|---|
| Linux / macOS | find (fast) | Node.js regex | rg → grep |
| WSL | find (fast) | Node.js regex | rg → grep |
| Native Windows | Node.js walk | Node.js regex | rg → JS scan |
Install ripgrep (winget install ripgrep / brew install ripgrep / apt install ripgrep) to get the fastest reference search on all platforms.
# Day 1 — initial index
codebase_index() → writes .pi-codebase.bin
# Day 2, session start → index reloaded automatically from .pi-codebase.bin
# After editing a few files
codebase_update() → only changed files are re-parsed (hash diff)
# After a big refactor
codebase_index() → full re-scan (safe to run at any time)
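The automatic reload in the workflow above depends on the index being persisted in v8's serialization format. A minimal round-trip sketch follows, assuming a hypothetical `StoredIndex` shape; the fields actually written to .pi-codebase.bin are not documented here and may differ.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { deserialize, serialize } from "node:v8";

// Hypothetical on-disk shape; v8 serialization preserves Map instances,
// which JSON.stringify would not.
interface StoredIndex {
  root: string;
  indexedAt: number;
  hashes: Map<string, string>;
}

function saveIndex(file: string, index: StoredIndex): void {
  writeFileSync(file, serialize(index));
}

function loadIndex(file: string): StoredIndex {
  return deserialize(readFileSync(file)) as StoredIndex;
}

// Usage: write to a temporary directory and read it back.
const storeDir = mkdtempSync(join(tmpdir(), "pi-index-demo-"));
const storeFile = join(storeDir, ".pi-codebase.bin");
saveIndex(storeFile, {
  root: "/my/app",
  indexedAt: Date.now(),
  hashes: new Map([["src/a.ts", "0cc175b9c0f1b6a831c399e269772661"]]),
});
const reloaded = loadIndex(storeFile);
console.log(reloaded.root, reloaded.hashes.size);
```

Note that `deserialize` should only be used on files the extension wrote itself, since the format is tied to the V8 version and is not meant for untrusted input.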
Add .pi-codebase.bin to .gitignore if you prefer not to commit it:
echo ".pi-codebase.bin" >> .gitignore
# Structural overview of an unfamiliar repo
You: "What does this codebase look like?"
→ codebase_index() then codebase_schema()
# Find all HTTP handlers
You: "Where are the route handlers?"
→ codebase_search({ query: "Handler|Route|Controller", kind: "function" })
# Call-site tracing
You: "What calls processPayment?"
→ codebase_refs({ symbol: "processPayment" })
# Dead-code hint
You: "Find all exported functions in the billing package"
→ codebase_search({ query: "^[A-Z]", kind: "function", file_pattern: "billing" })
# After editing
You: "I just moved some files around, update the index"
→ codebase_update()
| Feature | MCP | This extension |
|---|---|---|
| Requires Go + CGO | ✅ | ❌ — zero external deps |
| tree-sitter AST accuracy | ✅ | ❌ (per-language regex instead) |
| Content-hash incremental index | ✅ | ✅ MD5, async stat batch, same strategy |
| Call-graph edges (multi-hop) | ✅ | ❌ (use codebase_refs for single-hop) |
| Cross-service HTTP linking | ✅ | ❌ |
| Cypher-like query language | ✅ | ❌ |
| Dead-code detection | ✅ | ❌ |
| Works inside pi without MCP | ❌ | ✅ |
| Modular, no build step | ❌ | ✅ |
| Persistent index | ✅ | ✅ |
| Reference search | ✅ | ✅ (rg / grep / JS) |
| Symbol search | ✅ | ✅ |
| Schema / overview | ✅ | ✅ |
| Windows support | ❌ (WSL only) | ✅ (native + WSL) |
| Resilient to broken syntax | | ✅ regex keeps working |
codebase-memory/
├── index.ts # Entry point — state, events, registration (~140 lines)
├── types.ts # Interfaces, constants, language specs with quickFilter (~420 lines)
├── indexing.ts # File discovery, symbol extraction, full/incremental index (~340 lines)
├── search.ts # ripgrep / grep / JS reference search (~90 lines)
└── tools.ts # Helpers + 5 tool registrations with renderers (~480 lines)
The indexer is optimized for speed:
- Concurrent I/O — semaphore-based worker pool keeps N files in flight simultaneously
- Async stat batch — incrementalIndex stats all files in parallel, not sequentially
- Per-language quick-filter — a single cheap regex skips ~85% of lines before running expensive pattern matches
- Native crypto — MD5 via Node's C++ crypto module (hardware-accelerated)
- v8 serialization — binary index persistence is 5–10× faster than JSON
- Auto-tuned thread pool — UV_THREADPOOL_SIZE set to max(cpus × 2, 32) at startup
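The concurrent I/O item can be illustrated with a bounded worker pool that keeps at most N tasks in flight. This is a generic sketch of the technique, not the extension's implementation; `mapWithLimit` is a hypothetical name, and the timer stands in for real file reads.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as the input.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none are left.
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}

// Usage: simulate parsing six files, two at a time.
const fakeFiles = ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts", "f.ts"];
mapWithLimit(fakeFiles, 2, async (file) => {
  await new Promise((resolve) => setTimeout(resolve, 5)); // stand-in for file I/O
  return file.toUpperCase();
}).then((parsed) => console.log(parsed.join(" ")));
```

Claiming the next index synchronously before awaiting is what keeps two runners from processing the same item.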
Real-world result on the Linux kernel (64,770 files, 7M+ symbols): ~24 seconds on a modern machine.
MIT