codebase-memory — pi-coding-agent extension

A minimal port of codebase-memory-mcp as a pi-coding-agent extension.

Instead of a Go binary, tree-sitter, and SQLite, the extension runs entirely inside the Node.js process that already hosts pi:

| MCP component | Extension equivalent |
| --- | --- |
| Go binary + CGO | Node.js built-ins, zero native deps |
| tree-sitter AST | Per-language regex, line-by-line with quick-filter |
| Content-hash incremental index | MD5 per file, async stat batch, same skip-if-unchanged logic |
| SQLite WAL database | `.pi-codebase.bin` next to your project (v8 serialized) |
| 11 MCP tools via stdio | 5 pi tools + 1 slash-command |

Installation

# Install directly from GitHub:
pi install git:github.com/R-Dson/pi-codebase

# Or via npm:
pi install npm:pi-codebase-memory

Resources

GitHub: R-Dson/pi-codebase
npm: pi-codebase-memory

Tools

`codebase_index`

Full scan — walks the project, extracts symbols, writes .pi-codebase.bin.

codebase_index()
codebase_index({ root_path: "/my/app" })

Supported languages: TypeScript · JavaScript · Python · Go · Rust · Java · C# · PHP · C · C++ · Ruby · Swift · Kotlin · Shell · Perl · Dart · Lua · Scala · R

Ignored directories: node_modules, .git, dist, build, .next, __pycache__, target, .cache, vendor, .venv, venv, coverage, .nyc_output, out
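
A directory walk that honours the ignore list above could be sketched as follows. This is an illustration only: the `walk` helper and the `IGNORED_DIRS` constant are hypothetical names, not the extension's actual API, and the real indexer may discover files differently (for example via `find` on Unix).

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Directory names copied from the "Ignored directories" list above.
const IGNORED_DIRS = new Set([
  "node_modules", ".git", "dist", "build", ".next", "__pycache__", "target",
  ".cache", "vendor", ".venv", "venv", "coverage", ".nyc_output", "out",
]);

// Hypothetical helper: collect every regular file under `root`,
// never descending into a directory whose name is on the ignore list.
function walk(root: string, found: string[] = []): string[] {
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      if (!IGNORED_DIRS.has(entry.name)) walk(full, found);
    } else if (entry.isFile()) {
      found.push(full);
    }
  }
  return found;
}
```

Matching on the directory name alone (rather than the full path) means a nested `packages/foo/node_modules` is skipped as well.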

`codebase_update` (incremental)

Re-parses only files whose MD5 content hash has changed since the last index run. Unchanged files are reused verbatim — identical to the MCP's incremental reindex strategy. Falls back to a full scan when no prior index exists.

codebase_update()                        # check everything since last run
codebase_update({ root_path: "/my/app" })

The output reports how many files were added (+), removed (-), or changed (~).
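
The skip-if-unchanged idea can be sketched as a diff between two maps of content hashes. The names below (`HashMap`, `md5`, `diffHashes`) are hypothetical and chosen for illustration; the extension's internal data structures may look different.

```typescript
import { createHash } from "node:crypto";

// Assumed shape: relative file path -> MD5 hex digest of the file content.
type HashMap = Map<string, string>;

function md5(content: string | Buffer): string {
  return createHash("md5").update(content).digest("hex");
}

interface IndexDiff {
  added: string[];     // present now, absent from the previous index
  removed: string[];   // present in the previous index, gone now
  changed: string[];   // present in both, hash differs -> re-parse
  unchanged: string[]; // present in both, same hash -> reuse old symbols
}

function diffHashes(previous: HashMap, current: HashMap): IndexDiff {
  const diff: IndexDiff = { added: [], removed: [], changed: [], unchanged: [] };
  current.forEach((hash, file) => {
    const old = previous.get(file);
    if (old === undefined) diff.added.push(file);
    else if (old !== hash) diff.changed.push(file);
    else diff.unchanged.push(file);
  });
  previous.forEach((_hash, file) => {
    if (!current.has(file)) diff.removed.push(file);
  });
  return diff;
}
```

Only the `added` and `changed` lists need to go back through symbol extraction; entries for `unchanged` files can be carried over from the previous index.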

`codebase_search`

Query the in-memory index — much faster than grep for structural questions. Equivalent to search_graph in the MCP.

codebase_search({ query: "Handler" })                     # name regex
codebase_search({ kind: "class" })                        # by kind
codebase_search({ query: "process", file_pattern: "api" })
codebase_search({ kind: "function", limit: 100 })

Supported kinds: function · method · class · interface · type · variable · struct · enum · trait · module · route · http_call · macro · protocol · extension · object
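
Conceptually, a search over the in-memory index is a filter over a list of symbol records. The sketch below assumes a record shape (`SymbolEntry`), substring matching for `file_pattern`, and a default limit of 50; all three are assumptions made for illustration, not documented behaviour.

```typescript
// Hypothetical record shape for one indexed symbol.
interface SymbolEntry {
  name: string;
  kind: string;
  file: string;
  line: number;
}

// Mirrors the parameter names used in the examples above.
interface SearchOptions {
  query?: string;        // regular expression tested against the symbol name
  kind?: string;         // exact kind, e.g. "class"
  file_pattern?: string; // assumed here to be a plain substring of the path
  limit?: number;
}

function searchSymbols(index: SymbolEntry[], opts: SearchOptions): SymbolEntry[] {
  const nameRe = opts.query ? new RegExp(opts.query) : null;
  const limit = opts.limit ?? 50; // assumed default
  const results: SymbolEntry[] = [];
  for (const sym of index) {
    if (results.length >= limit) break; // stop as soon as the limit is reached
    if (nameRe && !nameRe.test(sym.name)) continue;
    if (opts.kind && sym.kind !== opts.kind) continue;
    if (opts.file_pattern && !sym.file.includes(opts.file_pattern)) continue;
    results.push(sym);
  }
  return results;
}
```

Because every filter is a cheap check on data already in memory, no file needs to be read during the query.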

`codebase_refs`

Find every usage of a symbol across the project. Equivalent to trace_call_path(direction="inbound") in the MCP.

Search back-end priority:

1. ripgrep (rg): used if installed; fastest, and cross-platform including native Windows
2. grep: Unix (Linux / macOS / WSL)
3. Pure Node.js: always available; slower on large trees but works everywhere

codebase_refs({ symbol: "processOrder" })
codebase_refs({ symbol: "UserService", file_pattern: "*.ts" })
codebase_refs({ symbol: "main", limit: 200 })
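
The back-end priority can be pictured as a small fallback chain. The sketch below is an assumption about how such a choice might be made: `chooseBackend` and `isOnPath` are hypothetical names, and probing a tool with `--version` is just one way to detect it.

```typescript
import { spawnSync } from "node:child_process";

type Backend = "rg" | "grep" | "node";

// Assumed detection strategy: a tool counts as available if
// `<cmd> --version` can be spawned and exits with status 0.
function isOnPath(cmd: string): boolean {
  const result = spawnSync(cmd, ["--version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Pick the first available back-end in priority order.
// `has` is injectable so the decision can be exercised without real binaries.
function chooseBackend(
  platform: string,
  has: (cmd: string) => boolean = isOnPath,
): Backend {
  if (has("rg")) return "rg";
  if (platform !== "win32" && has("grep")) return "grep";
  return "node"; // pure Node.js scan, always available
}
```

A call such as `chooseBackend(process.platform)` would then resolve the back-end once at startup rather than on every query.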

`codebase_schema`

High-level overview: file counts per language, symbol counts per kind, index age, root directory listing. Equivalent to get_graph_schema.

codebase_schema()

The output also reports the platform and which external tools are available (find, grep, rg), so you can see which discovery and search back-ends the extension is using.

Command

/codebase   →  index status (root, file/symbol count, age, platform info)

What gets extracted

| Language | Kinds |
| --- | --- |
| TypeScript / TSX | function, arrow function, class, interface, type, enum, method, route, http_call |
| JavaScript / JSX | function, arrow function, class, method, route, http_call |
| Python | function, method, class, route |
| Go | function, method, struct, interface, type |
| Rust | function, struct, enum, trait, type, module |
| Java | class, interface, enum, method, route |
| C# | class, interface, enum, struct, function, route |
| PHP | function, class, interface, route |
| C | function, struct, enum, type, macro |
| C++ | class, struct, enum, function, method, type, macro |
| Ruby | class, module, method |
| Swift | class, struct, protocol, enum, function, method, type, extension |
| Kotlin | class, interface, function, method, type, enum, object |
| Shell | function |
| Perl | function, module, class |
| Dart | class, function, method, enum, type, mixin |
| Lua | function, module |
| Scala | class, object, trait, function, method, type, enum |
| R | function |

Signatures are captured up to 200 characters, which is enough to show full generic bounds in Rust (such as `pub fn foo<T: Serialize + Clone>()`) and long Java return types.
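
To make the regex approach concrete, here is a minimal sketch of line-by-line extraction with a quick-filter and the 200-character signature cap. The patterns are simplified stand-ins written for this example (they cover only three TypeScript kinds) and are not the extension's actual rules; `extractSymbols` is a hypothetical name.

```typescript
interface ExtractedSymbol {
  name: string;
  kind: string;
  line: number;      // 1-based line number
  signature: string; // trimmed source line, capped at MAX_SIGNATURE
}

const MAX_SIGNATURE = 200;

// Cheap pre-check: lines without any of these keywords cannot match below.
const quickFilter = /\b(function|class|interface)\b/;

// Illustrative patterns only; a real language spec would have many more.
const patterns: Array<{ kind: string; re: RegExp }> = [
  { kind: "function", re: /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/ },
  { kind: "class", re: /^\s*(?:export\s+)?(?:abstract\s+)?class\s+([A-Za-z_$][\w$]*)/ },
  { kind: "interface", re: /^\s*(?:export\s+)?interface\s+([A-Za-z_$][\w$]*)/ },
];

function extractSymbols(source: string): ExtractedSymbol[] {
  const symbols: ExtractedSymbol[] = [];
  source.split(/\r?\n/).forEach((text, i) => {
    if (!quickFilter.test(text)) return; // most lines are rejected here
    for (const { kind, re } of patterns) {
      const m = re.exec(text);
      if (m) {
        symbols.push({
          name: m[1],
          kind,
          line: i + 1,
          signature: text.trim().slice(0, MAX_SIGNATURE),
        });
        break; // first matching pattern wins
      }
    }
  });
  return symbols;
}
```

Because each line is matched on its own, a syntax error elsewhere in the file does not stop extraction. The trade-off is that declarations split across lines in unusual ways can be missed.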

Platform support

| Environment | Discovery | Symbol extraction | Reference search |
| --- | --- | --- | --- |
| Linux / macOS | `find` (fast) | Node.js regex | `rg` → `grep` |
| WSL | `find` (fast) | Node.js regex | `rg` → `grep` |
| Native Windows | Node.js walk | Node.js regex | `rg` → JS scan |

Install ripgrep (winget install ripgrep / brew install ripgrep / apt install ripgrep) to get the fastest reference search on all platforms.

Persistence & incremental workflow

# Day 1 — initial index
codebase_index()          →  writes .pi-codebase.bin

# Day 2, session start    →  index reloaded automatically from .pi-codebase.bin

# After editing a few files
codebase_update()         →  only changed files are re-parsed (hash diff)

# After a big refactor
codebase_index()          →  full re-scan (safe to run at any time)

Add .pi-codebase.bin to .gitignore if you prefer not to commit it:

echo ".pi-codebase.bin" >> .gitignore
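
For readers curious about the on-disk format, the following sketch shows how an index could be written and reloaded with Node's built-in `v8` module. The `PersistedIndex` shape and the `saveIndex` / `loadIndex` helpers are hypothetical; only `v8.serialize` and `v8.deserialize` are real Node.js APIs.

```typescript
import * as fs from "node:fs";
import * as v8 from "node:v8";

// Assumed shape of what gets persisted; the real index holds more.
interface PersistedIndex {
  version: number;
  indexedAt: number;           // epoch milliseconds of the last run
  hashes: Map<string, string>; // file path -> MD5 hex digest
}

function saveIndex(file: string, index: PersistedIndex): void {
  // v8.serialize handles Map, Set, Date, and typed arrays natively,
  // so no JSON conversion step is needed.
  fs.writeFileSync(file, v8.serialize(index));
}

function loadIndex(file: string): PersistedIndex | null {
  if (!fs.existsSync(file)) return null;
  try {
    return v8.deserialize(fs.readFileSync(file)) as PersistedIndex;
  } catch {
    // Unreadable or incompatible data: treat it as "no prior index"
    // so the caller can fall back to a full scan.
    return null;
  }
}
```

One consequence of this format is that the file is binary and tied to the V8 serialization format, which is a further reason to keep it out of version control.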

Workflow examples

# Structural overview of an unfamiliar repo
You: "What does this codebase look like?"
  → codebase_index() then codebase_schema()

# Find all HTTP handlers
You: "Where are the route handlers?"
  → codebase_search({ query: "Handler|Route|Controller", kind: "function" })

# Call-site tracing
You: "What calls processPayment?"
  → codebase_refs({ symbol: "processPayment" })

# Dead-code hint (Go convention: exported names start with an uppercase letter)
You: "Find all exported functions in the billing package"
  → codebase_search({ query: "^[A-Z]", kind: "function", file_pattern: "billing" })

# After editing
You: "I just moved some files around, update the index"
  → codebase_update()

Comparison with codebase-memory-mcp

| Feature | MCP | This extension |
| --- | --- | --- |
| Requires Go + CGO | ✅ | ❌ (zero external deps) |
| tree-sitter AST accuracy | ✅ | ⚠️ regex (resilient to syntax errors) |
| Content-hash incremental index | ✅ | ✅ MD5, async stat batch, same strategy |
| Call-graph edges (multi-hop) | ✅ | ❌ (use `codebase_refs` for single-hop) |
| Cross-service HTTP linking | ✅ | ❌ |
| Cypher-like query language | ✅ | ❌ |
| Dead-code detection | ✅ | ❌ |
| Works inside pi without MCP | ❌ | ✅ |
| Modular, no build step | ❌ | ✅ |
| Persistent index | ✅ | ✅ |
| Reference search | ✅ | ✅ (rg / grep / JS) |
| Symbol search | ✅ | ✅ |
| Schema / overview | ✅ | ✅ |
| Windows support | ❌ (WSL only) | ✅ (native + WSL) |
| Resilient to broken syntax | ⚠️ | ✅ regex keeps working |

File structure

codebase-memory/
├── index.ts       # Entry point — state, events, registration (~140 lines)
├── types.ts       # Interfaces, constants, language specs with quickFilter (~420 lines)
├── indexing.ts    # File discovery, symbol extraction, full/incremental index (~340 lines)
├── search.ts      # ripgrep / grep / JS reference search (~90 lines)
└── tools.ts       # Helpers + 5 tool registrations with renderers (~480 lines)

Performance

The indexer is optimized for speed:

- Concurrent I/O: a semaphore-based worker pool keeps N files in flight simultaneously
- Async stat batch: incrementalIndex stats all files in parallel, not sequentially
- Per-language quick-filter: a single cheap regex skips ~85% of lines before running expensive pattern matches
- Native crypto: MD5 via Node's built-in crypto module, which runs in native code (OpenSSL) rather than in JavaScript
- v8 serialization: binary index persistence is 5–10× faster than JSON
- Auto-tuned thread pool: UV_THREADPOOL_SIZE set to max(cpus × 2, 32) at startup
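
The "N files in flight" idea can be illustrated with a small counting semaphore. This is a generic sketch of the technique, not the extension's code: `Semaphore` and `mapWithLimit` are hypothetical names, and the real pool may be structured differently.

```typescript
// Minimal counting semaphore: at most `permits` holders at any time.
class Semaphore {
  private permits: number;
  private waiters: Array<() => void> = [];

  constructor(permits: number) {
    if (permits < 1) throw new RangeError("permits must be >= 1");
    this.permits = permits;
  }

  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    // No permit free: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next();    // pass the permit directly to the next waiter
    else this.permits++; // nobody waiting: return it to the pool
  }
}

// Run `worker` over all items with at most `limit` calls in flight.
// Results keep the order of `items`.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const sem = new Semaphore(limit);
  return Promise.all(
    items.map(async (item) => {
      await sem.acquire();
      try {
        return await worker(item);
      } finally {
        sem.release();
      }
    }),
  );
}
```

Bounding concurrency this way keeps the I/O queue full without opening every file at once, which matters on trees with tens of thousands of files.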

Real-world result on the Linux kernel (64,770 files, 7M+ symbols): ~24 seconds on a modern machine.

License

MIT
