codebase-oracle

Semantic search across all your local repos, via MCP or CLI.

codebase-oracle builds one semantic index over every git repo under a root directory, then exposes it to agents via MCP and to humans via a CLI. The vector store lives on your machine. Embeddings are computed by OpenAI by default; configure Ollama instead to keep them fully local. Indexing is incremental: only new and changed files are re-embedded. Built for agents first, humans second.
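To make "incremental" concrete, here is a minimal sketch of one common way to decide which files need re-embedding: compare a content hash against a stored manifest. This is an illustration under assumptions, not the project's actual code; the Manifest shape and the planReindex helper are hypothetical names, and the real indexer may detect changes differently (see docs/architecture.md).

```typescript
import { createHash } from "node:crypto";

// Hypothetical manifest: file path -> SHA-256 of the content that was last embedded.
type Manifest = Map<string, string>;

function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Compare current file contents against the manifest and report what needs work.
function planReindex(
  manifest: Manifest,
  files: Map<string, string>, // path -> current content
): { embed: string[]; remove: string[] } {
  const embed: string[] = [];
  for (const [path, content] of files) {
    // A missing or different hash means the file is new or has changed.
    if (manifest.get(path) !== sha256(content)) embed.push(path);
  }
  // Paths that are in the manifest but no longer on disk should have their chunks dropped.
  const remove = [...manifest.keys()].filter((path) => !files.has(path));
  return { embed, remove };
}
```

Unchanged files never reach the embedding provider, which is what keeps repeated runs cheap.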

Try it in 60 seconds

git clone https://github.com/LanNguyenSi/codebase-oracle.git
cd codebase-oracle
npm install && npm run build

# point at the directory holding your git repos, set your key
export ORACLE_SCAN_ROOT=~/code
export OPENAI_API_KEY=sk-...

# build the index, then ask a question
npm run index
npm run query -- "where do we handle auth?"

Or wire it into Claude Code as an MCP server:

claude mcp add codebase-oracle -- npx tsx src/mcp-server.ts

From any Claude Code session on the same machine you can now call oracle_search, oracle_query, oracle_expand, and oracle_list_repos against the shared index.

What a search looks like

oracle_search with query="where do we read AGENT_TASKS_TOKEN" returns matching chunks with line-number locations:

[1] src/auth/token.ts:14-32 (agent-tasks-cli):
function loadToken(): string {
  const value = process.env.AGENT_TASKS_TOKEN;
  if (!value) throw new Error("AGENT_TASKS_TOKEN missing");
  return value;
}

---

[2] backend/src/middleware/auth.ts:8-21 (agent-tasks):
export function requireToken(req, res, next) {
  const token = req.headers.authorization?.replace(/^Bearer /, "");
  if (token !== process.env.AGENT_TASKS_TOKEN) return res.sendStatus(401);
  next();
}
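If you post-process this output in a script, the header line of each result carries the rank, file path, line range, and repo. The following is a small sketch of a parser for that header, assuming the format stays exactly as shown in the sample above; ChunkLocation and parseHeader are hypothetical names, not part of codebase-oracle.

```typescript
// Hypothetical structured form of a result header such as
// "[1] src/auth/token.ts:14-32 (agent-tasks-cli):"
interface ChunkLocation {
  rank: number;
  path: string;
  startLine: number;
  endLine: number;
  repo: string;
}

// [rank] path:start-end (repo):
const HEADER = /^\[(\d+)\] (.+):(\d+)-(\d+) \((.+)\):$/;

// Returns null for any line that is not a result header (code lines, separators).
function parseHeader(line: string): ChunkLocation | null {
  const m = HEADER.exec(line.trim());
  if (!m) return null;
  return {
    rank: Number(m[1]),
    path: m[2],
    startLine: Number(m[3]),
    endLine: Number(m[4]),
    repo: m[5],
  };
}
```

For example, parseHeader("[2] backend/src/middleware/auth.ts:8-21 (agent-tasks):") yields the path backend/src/middleware/auth.ts, lines 8 to 21, in repo agent-tasks.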

oracle_list_repos shows what's indexed and how fresh each repo is:

- agent-tasks — 1842 chunks across 287 files (indexed 2026-04-27T10:14:02Z, 14 min ago)
- agent-tasks-cli — 421 chunks across 68 files (indexed 2026-04-27T10:14:18Z, 14 min ago)

Next steps

If you want to...	Read
Wire it into Claude Code (MCP setup, the four tools, HTTP MCP auth)	docs/mcp.md
Switch to Ollama, change embedding models, customise scan filters	docs/configuration.md
Understand how the index is built (chunking, embeddings, sqlite-vec)	docs/architecture.md
Migrate from v0.2 (JSONL) or pick up v0.4 line numbers	docs/upgrades.md

CLI reference

The CLI auto-loads .env from the repo root if present.

npm run index                            # build/refresh the index over ORACLE_SCAN_ROOT
npm run index -- --path /path/to/repos   # custom scan root
npm run query -- "what is the audit system?"
npm run query -- -r my-repo "where is the schema defined?"
npm run query -- -k 20 "list all API endpoints"
npm run dev -- search "evaluateTransitionRules"
npm run watch                            # keep the index fresh in the background

Flag	Description
`-r, --repo <name>`	Filter results to a specific repo
`-k, --limit <n>`	Number of chunks to retrieve (default: 12)
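Conceptually, the two flags correspond to a repo filter followed by a top-k cut over similarity scores. The sketch below shows that idea with brute-force cosine similarity over in-memory vectors. It is a simplified illustration under assumptions: the Chunk shape and topK function are hypothetical, and the real index is backed by sqlite-vec (see docs/architecture.md) rather than a linear scan.

```typescript
// Hypothetical in-memory chunk; the real store is described in docs/architecture.md.
interface Chunk {
  id: string;
  repo: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Mirrors -r (optional repo filter) and -k (result limit, 12 by default).
function topK(
  query: number[],
  chunks: Chunk[],
  opts: { k?: number; repo?: string } = {},
): { id: string; score: number }[] {
  const k = opts.k ?? 12;
  return chunks
    .filter((c) => opts.repo === undefined || c.repo === opts.repo)
    .map((c) => ({ id: c.id, score: cosine(query, c.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}
```

A larger -k trades more context for more noise, which is why a broad question such as "list all API endpoints" is a natural case for raising it.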

Watch mode runs a chokidar watcher over the scan root and re-embeds changed files after a short debounce. A repo newly added under the scan root needs one explicit npm run index to back-fill it before watch mode picks up subsequent edits. See docs/architecture.md for details.
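The debounce matters because editors often write a file several times in quick succession, and each write would otherwise trigger an embedding call. Below is a minimal sketch of quiet-period debouncing with an explicit clock so it is easy to test. The PendingChanges class and the 500 ms window in the usage note are assumptions for illustration; the project's actual debounce logic and interval may differ.

```typescript
// Hypothetical helper: tracks changed paths and releases each one only after
// it has been quiet for `quietMs` milliseconds.
class PendingChanges {
  private readonly quietMs: number;
  private readonly lastSeen = new Map<string, number>();

  constructor(quietMs: number) {
    this.quietMs = quietMs;
  }

  // Record a change event for `path` at time `now` (milliseconds).
  touch(path: string, now: number): void {
    this.lastSeen.set(path, now);
  }

  // Return, and forget, every path whose last change is at least quietMs old.
  drain(now: number): string[] {
    const ready: string[] = [];
    for (const [path, seen] of this.lastSeen) {
      if (now - seen >= this.quietMs) ready.push(path);
    }
    for (const path of ready) this.lastSeen.delete(path);
    return ready;
  }
}
```

In a watcher, file events would call touch(path, Date.now()) and a periodic timer would call drain(Date.now()) and re-embed whatever it returns. With a 500 ms window, a file saved three times within a second is embedded once.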

Two use cases

For agents (primary). A local Claude Code or other MCP client talks to the oracle's MCP server over stdio. The agent runs oracle_search / oracle_query / oracle_expand / oracle_list_repos against a shared, pre-built index: it never has to scan the filesystem, embed anything, or burn its own context on grep output. The index is scanned once and shared by every agent, matching is semantic instead of regex-based, and no embedding is computed twice.

For humans. The CLI is useful for spot checks, debugging the index, or terminal-driven answers without going through an agent.

Development

npm run build          # TypeScript compilation
npm test               # vitest run
npx tsc --noEmit       # type check only

License

MIT. See docs/architecture.md#credits for inspiration and prior art.
