Auto-documenting repo tool with local RAG retrieval.
- Generates "what-exists" reference docs for each source file (LLM, on push only)
- Maintains them incrementally — a SHA-256 manifest means only changed files are re-documented/re-embedded
- Serves everything through a local hybrid (vector + BM25) index:
  - an MCP server so Claude Code retrieves only the relevant slices (token savings)
  - a MkDocs Material site for humans
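The incremental behaviour described above can be sketched as a digest comparison. This is a minimal illustration, not repodoc's actual code: the manifest layout (repo-relative path mapped to a hex digest) and the helper names `digest`, `hash_tree`, and `diff_manifest` are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def hash_tree(root: Path, pattern: str = "*.py") -> dict[str, str]:
    """Map each matching file (repo-relative POSIX path) to its digest."""
    return {
        p.relative_to(root).as_posix(): digest(p.read_bytes())
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }


def diff_manifest(old: dict[str, str], new: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (changed_or_added, removed) paths between two manifests."""
    changed = sorted(p for p, h in new.items() if old.get(p) != h)
    removed = sorted(p for p in old if p not in new)
    return changed, removed


# In-memory example with hypothetical files; real use would build `new`
# with hash_tree(Path(...)) and load `old` from the stored manifest.
old = {"a.py": digest(b"x = 1\n"), "b.py": digest(b"y = 2\n")}
new = {"a.py": digest(b"x = 1\n"), "b.py": digest(b"y = 3\n"), "c.py": digest(b"")}
changed, removed = diff_manifest(old, new)
print(changed, removed)  # only b.py (edited) and c.py (new) need work
```

Only the paths in `changed` would need to be re-documented and re-embedded; paths in `removed` would have their docs and index entries dropped.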
All retrieval is local and free (sentence-transformers + LanceDB). The only API cost is generation, at push time — and even that can be moved to a local model via Ollama.
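How the vector and BM25 result lists are merged is not stated here. One common way to combine two rankings is reciprocal rank fusion (RRF), sketched below as an assumption rather than as repodoc's method. The document paths are hypothetical, and `k = 60` is the constant conventionally used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of doc ids into one scored ranking.

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical hits from the two retrievers for one query.
vector_hits = ["docs/needs.md", "docs/sim.md", "docs/ui.md"]
bm25_hits = ["docs/sim.md", "docs/save.md", "docs/needs.md"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{score:.4f}  {doc_id}")
```

Documents that rank well in both lists rise to the top, without having to put cosine similarities and BM25 scores on a common scale.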
cd c:\apps\repodoc
uv venv --python 3.12
uv pip install -e ".[index,mcp,site]"

The generator needs ANTHROPIC_API_KEY (or [generator] provider = "ollama" in config).
repodoc init <repo> # one-time: writes <repo>/.repodoc/config.toml + manifest
repodoc status --repo <repo> # what changed since last run
repodoc sync --repo <repo> # generate + index changed files (what CI calls)
repodoc query --repo <repo> "how does the needs system work?"
repodoc serve-mcp --repo <repo> # stdio MCP server for Claude Code
repodoc build-site --repo <repo> # MkDocs site -> <repo>/.repodoc/site
repodoc reindex --full --repo <repo> # after changing the embedding model

Useful flags: repodoc generate --dry-run (no LLM calls), --limit N (cap spend per run).
.mcp.json in your workspace:
{
  "mcpServers": {
    "gps3-docs": {
      "command": "c:/apps/repodoc/.venv/Scripts/repodoc.exe",
      "args": ["serve-mcp", "--repo", "c:/apps/gps3"]
    }
  }
}

Tools exposed: search_docs(query, k, doc_class), get_doc(path), list_docs().
- CI (preferred): see gps3/.github/workflows/repodoc.yml, which runs repodoc sync on push to the default branch and commits regenerated docs. Needs the repodoc repo on GitHub plus an ANTHROPIC_API_KEY secret.
- Local hook (opt-in): a .git/hooks/pre-push script that runs repodoc sync --repo . on every push. Each push therefore calls the LLM (costs money), so CI with caching is usually better.
Both model slots live in <repo>/.repodoc/config.toml:
| Slot | Change | Consequence |
|---|---|---|
| [generator] | e.g. anthropic → ollama + big local model | none; the next generate uses it |
| [embeddings] | e.g. nomic-embed → Qwen3-Embedding 8B | auto-detected → one-time full re-index |
Provenance: every indexed chunk is tagged source, generated-reference, or planning, and retrieval surfaces the tag so agents treat planning docs as intent, not ground truth.
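A minimal sketch of what such tagging could look like follows. The three tag values come from the sentence above; the `Chunk` type, the helper functions, and the sample chunks are hypothetical, and the link to the `doc_class` parameter of `search_docs` is a guess based on its name.

```python
from dataclasses import dataclass
from typing import Literal, Optional

DocClass = Literal["source", "generated-reference", "planning"]


@dataclass(frozen=True)
class Chunk:
    path: str
    text: str
    doc_class: DocClass  # provenance tag carried alongside the text


def filter_chunks(chunks: list[Chunk], doc_class: Optional[DocClass] = None) -> list[Chunk]:
    """Keep only chunks with the given provenance, or all if None."""
    if doc_class is None:
        return list(chunks)
    return [c for c in chunks if c.doc_class == doc_class]


def render_hit(chunk: Chunk) -> str:
    """Format a hit so the provenance tag is visible to the caller."""
    return f"[{chunk.doc_class}] {chunk.path}: {chunk.text}"


# Hypothetical index contents.
chunks = [
    Chunk("src/needs.py", "def decay(need): ...", "source"),
    Chunk("docs/ref/needs.md", "decay() lowers a need each tick.", "generated-reference"),
    Chunk("plans/needs-v2.md", "Needs should eventually interact.", "planning"),
]

for hit in filter_chunks(chunks):
    print(render_hit(hit))
```

Because the tag travels with every hit, an agent can weigh a `planning` chunk as stated intent and prefer `source` or `generated-reference` chunks when it needs to know what the code does today.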