A semantic knowledge graph for your codebase
Extract the why behind code — not the what.
The decisions behind code — why JWT over sessions? why this schema? why 5 pipeline stages? — are scattered across PR threads, commit messages, and tribal knowledge. AI can read syntax, but humans can't see the reasoning.
ShadowRepo scans your repo, extracts structured semantic specs, organizes them into a feature tree, and detects when code drifts from those specs.
graph LR
A["🗂 Your Codebase<br/><sub>source · git history · docs</sub>"]
B["📐 .shadowrepo/<br/><sub>feature tree · spec graph · coverage</sub>"]
C["👥 Your Team<br/><sub>engineers · PMs · new hires</sub>"]
A -- "/build" --> B
B -- "/render" --> C
B -. "/check · /update" .-> B
style A fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#1e3a5f
style B fill:#ede9fe,stroke:#8b5cf6,stroke-width:2px,color:#3b0764
style C fill:#dcfce7,stroke:#22c55e,stroke-width:2px,color:#14532d
ShadowRepo ships as a Claude Code plugin. Two install paths:
End users — install via marketplace (recommended):
/plugin marketplace add VW-ai/ShadowRepo-Skill
/plugin install shadowrepo@shadowrepo
Contributors — local dev with live edits:
git clone https://github.com/VW-ai/ShadowRepo-Skill.git
cd ShadowRepo-Skill
source ./dev-install.sh # adds `claude-sr` alias to your shell rc
claude-sr # launches Claude Code with shadowrepo loaded
Then, in any project:
/shadowrepo-build
That's it. ShadowRepo scans your codebase and builds the knowledge graph.
| Command | Purpose | When to use |
|---|---|---|
| `/shadowrepo-build` | Scan repo, build feature tree + spec graph | First-time setup |
| `/shadowrepo-check` | Detect code-spec drift | After code changes |
| `/shadowrepo-update` | Fix drifted specs | When drift is found |
| `/shadowrepo-render` | Generate docs from specs | For onboarding, reviews |
| `/shadowrepo-help` | Show all capabilities | Anytime |
graph LR
B["<b>/build</b><br/>Scan repo, extract<br/>feature tree + specs"]
C["<b>/check</b><br/>Detect code-spec<br/>drift"]
U["<b>/update</b><br/>Sync drifted<br/>specs to code"]
R["<b>/render</b><br/>Generate docs<br/>for the team"]
B --> C --> U --> R
R -.->|"code changes"| C
style B fill:#dbeafe,stroke:#3b82f6,stroke-width:2px,color:#1e1e1e
style C fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#1e1e1e
style U fill:#ede9fe,stroke:#8b5cf6,stroke-width:2px,color:#1e1e1e
style R fill:#dcfce7,stroke:#22c55e,stroke-width:2px,color:#1e1e1e
All output lives in .shadowrepo/ — plain JSON, git-diffable, zero dependencies.
.shadowrepo/
├── features.json # Feature tree (product → domain → module)
├── specs.json # Semantic specs with anchors + relations
├── coverage.json # File-level spec coverage
└── meta.json # Repo metadata + scan timestamp
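Because the output is plain JSON, any script can read it without ShadowRepo installed. The sketch below is a minimal illustration, not part of the plugin: `load_shadowrepo` is a hypothetical helper name, and it makes no assumption about the top-level shape of each file beyond it being valid JSON.

```python
import json
from pathlib import Path

# File names come from the layout above; the parsed shapes are not assumed.
SHADOWREPO_FILES = ("features.json", "specs.json", "coverage.json", "meta.json")


def load_shadowrepo(repo_root="."):
    """Return {file name: parsed JSON} for each output file that exists."""
    out_dir = Path(repo_root) / ".shadowrepo"
    loaded = {}
    for name in SHADOWREPO_FILES:
        path = out_dir / name
        if path.is_file():  # skip files a partial or older scan did not write
            loaded[name] = json.loads(path.read_text(encoding="utf-8"))
    return loaded
```

Calling `load_shadowrepo(".")` in a repo that has not been scanned yet simply returns an empty dict.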
Real output from scanning a social media monitoring pipeline:
Feature Tree — hierarchical product decomposition
{
"feature_id": "pipeline",
"name": "Filtering Pipeline",
"type": "business",
"description": "5-stage information filtering from social media firehose",
"key_files": ["src/pipeline/orchestrator.py"],
"parent": null
}
{
"feature_id": "pipeline/s2-embedding",
"name": "Semantic Retrieval",
"type": "business",
"description": "Cosine similarity via text-embedding-3-small",
"key_files": ["src/pipeline/s2_embedding.py"],
"parent": "pipeline"
}
Spec: a decision with anchors and relations
{
"spec_id": "pipeline/decision/recall-funnel-precision-at-s4",
"feature_name": "pipeline",
"type": "decision",
"summary": "S1-S3 optimize for recall (don't lose relevant posts), S4 optimizes for precision (remove false positives).",
"confidence": 0.9,
"anchors": [
{ "file": "src/pipeline/s3_triage.py", "symbols": ["triage"] },
{ "file": "src/pipeline/s4_review.py", "symbols": ["review"] }
],
"relations": [
{ "type": "depends_on", "target_spec_id": "core-ai/intent/litellm-abstraction" }
]
}
Feature → Spec hierarchy: how it all connects
pipeline ← feature
├── decision/recall-over-precision ← why: early stages keep all candidates
├── intent/multi-stage-filtering ← why: progressive narrowing
├── constraint/5s-per-stage ← why: SLA budget
│
└── s2-embedding ← sub-feature
└── decision/drop-keyword-faiss ← why: cosine similarity outperformed BM25
Every spec belongs to a feature. A feature owns files. Specs explain why those files exist.
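The hierarchy above can be rebuilt from the records themselves. The following sketch uses the `parent` field of each feature and the `feature_name` field of each spec, both taken from the example output. It assumes that `feature_name` holds the owning feature's `feature_id` (true in the example, not confirmed as a rule) and that the records arrive as flat lists. `index_tree` and `render` are illustrative names, not ShadowRepo functions.

```python
from collections import defaultdict

# Records copied from the example output, trimmed to the fields used here.
features = [
    {"feature_id": "pipeline", "name": "Filtering Pipeline", "parent": None},
    {"feature_id": "pipeline/s2-embedding", "name": "Semantic Retrieval", "parent": "pipeline"},
]
specs = [
    {
        "spec_id": "pipeline/decision/recall-funnel-precision-at-s4",
        "feature_name": "pipeline",  # assumed to match a feature_id
        "type": "decision",
    },
]


def index_tree(features, specs):
    """Group child feature ids by parent and spec ids by owning feature."""
    children = defaultdict(list)
    for feature in features:
        children[feature["parent"]].append(feature["feature_id"])
    owned = defaultdict(list)
    for spec in specs:
        owned[spec["feature_name"]].append(spec["spec_id"])
    return children, owned


def render(feature_id, children, owned, depth=0):
    """Return indented lines for one feature, its specs, then its sub-features."""
    pad = "  " * depth
    lines = [f"{pad}{feature_id}"]
    lines += [f"{pad}  [spec] {spec_id}" for spec_id in owned.get(feature_id, [])]
    for child in children.get(feature_id, []):
        lines += render(child, children, owned, depth + 1)
    return lines


children, owned = index_tree(features, specs)
# Root features are the ones whose parent is null (None after JSON parsing).
tree_lines = [line for root in children[None] for line in render(root, children, owned)]
print("\n".join(tree_lines))
```

Keeping the parent link on the child record means a feature can be added or moved by editing a single JSON object, which keeps diffs small.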
A good spec captures the why ("recall funnel because early stages shouldn't lose posts"), not the what ("pipeline has 5 stages").
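Anchors are what make a spec checkable against the code: each one names a file and the symbols the reasoning applies to. As a rough illustration of the idea only, the sketch below flags anchors that no longer resolve. This is not how `/shadowrepo-check` works (that command is a skill executed by Claude); `missing_anchors` is a hypothetical helper, and the symbol test is a plain substring match rather than real parsing.

```python
from pathlib import Path


def missing_anchors(spec, repo_root="."):
    """Return (file, symbol) pairs from spec["anchors"] that no longer resolve.

    A symbol of None means the anchored file itself is missing.
    """
    missing = []
    for anchor in spec.get("anchors", []):
        path = Path(repo_root) / anchor["file"]
        if not path.is_file():
            missing.append((anchor["file"], None))
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for symbol in anchor.get("symbols", []):
            if symbol not in text:  # naive: substring match, not a parser
                missing.append((anchor["file"], symbol))
    return missing
```

An empty result only says the anchored names still exist. Whether the code still matches the recorded reasoning is a semantic question that a check like this cannot answer.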
No server. No dependencies. No scaffolding.
Skills are natural language programs that Claude executes using its native tools (Read, Grep, Glob, Bash, Write).
| Layer | Role | Components |
|---|---|---|
| Skills | Workflow logic | /build · /check · /update · /render · /preview |
| Stdlib | Extraction methodology | methodology · data-model · quality-gates · recursion-engine · git-ops · file-discovery |
| Contracts | JSON schemas | spec · feature · scope · check-result · merge-result |
Core execution is recursive — sense → understand → extract → split → recurse → merge. The feature tree emerges bottom-up. Parallel agents write temp JSON; a synthesizer merges the results.
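The "parallel agents write temp JSON, a synthesizer merges" step can be pictured with ordinary threads and files. This is a sketch under loud assumptions: the real agents are Claude sub-tasks rather than Python threads, and `extract_scope`, `merge`, `build`, and the placeholder spec ids are all invented for illustration. Deduplicating by `spec_id` and sorting the result is one plausible way to keep repeated runs stable, not a documented behavior.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract_scope(scope, tmp_dir):
    """Stand-in for one agent: write the specs found in `scope` to temp JSON."""
    specs = [{"spec_id": f"{scope}/intent/placeholder", "feature_name": scope}]
    out = Path(tmp_dir) / f"{scope.replace('/', '__')}.json"
    out.write_text(json.dumps(specs), encoding="utf-8")
    return out


def merge(tmp_dir):
    """Stand-in synthesizer: read every temp file, dedupe by spec_id, sort."""
    merged = {}
    for path in sorted(Path(tmp_dir).glob("*.json")):
        for spec in json.loads(path.read_text(encoding="utf-8")):
            merged.setdefault(spec["spec_id"], spec)  # first writer wins
    return [merged[spec_id] for spec_id in sorted(merged)]


def build(scopes):
    """Run one worker per scope, then merge their temp files into one list."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        with ThreadPoolExecutor() as pool:
            # list() forces completion and surfaces any worker exception
            list(pool.map(lambda scope: extract_scope(scope, tmp_dir), scopes))
        return merge(tmp_dir)


result = build(["pipeline", "pipeline/s2-embedding"])
print([spec["spec_id"] for spec in result])
```

Having each worker write its own file means no two workers ever touch the same output, so the merge is the only place where conflicts have to be resolved.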
| Principle | What it means |
|---|---|
| Density over coverage | ~150–200 high-quality specs per medium repo, not thousands of shallow ones |
| Tree, not flat list | Features form a hierarchy (product → domain → module), enabling drill-down |
| Transparent | Every extraction step is visible; users can intervene and guide |
| Idempotent | Running /build twice produces the same result |
| Zero infrastructure | JSON files in your repo, committed alongside your code |
Marketplace install:
/plugin uninstall shadowrepo
Contributor (dev) install — removes the claude-sr alias and any legacy skill symlinks left over from pre-plugin installs:
./dev-install.sh remove
Built as a Claude Code Plugin · No servers, no dependencies, just specs