A personal LLM-maintained wiki, inspired by Andrej Karpathy's LLM Wiki pattern.
You drop raw sources in — podcast transcripts, articles, blog posts, threads, notes — and a coding agent (Claude Code) reads them, builds an interlinked Obsidian vault, and keeps it current as you add more sources.
The wiki becomes a persistent, compounding artifact. Each page reads less like a summary and more like a Wikipedia article — synthesized across dozens of sources, with multiple points of view stacked side by side.
Three layers:
- `raw/` — immutable source documents. The agent reads from here but never modifies them.
- `wiki/` — the agent-generated Obsidian vault. Entity, concept, topic, debate, and episode pages with full citations.
- `CLAUDE.md` — the operating manual. Tells the agent how to ingest sources, structure pages, and maintain the wiki.
- Obsidian — for browsing the vault
- Claude Code
```bash
git clone https://github.com/<your-username>/knowledge-base.git
cd knowledge-base
```

- Launch Obsidian → Open folder as vault → select this directory.
- The `wiki/` folder is what you'll mostly browse. Use the graph view to see how everything connects.
From the repo root, launch Claude Code:
```bash
claude
```

Claude Code automatically reads `CLAUDE.md` on startup and follows its operating instructions. That single file is what turns the agent from a generic assistant into a disciplined wiki maintainer.
The `scripts/` folder has a few small fetchers that drop content straight into `raw/inbox/`:
```bash
pip install -r requirements.txt
cp .env.example .env  # then edit .env

# Pull recent posts from any Substack / RSS feed (paywalled or public)
python3 scripts/fetch_substack.py --rss-url https://stratechery.passport.online/feed/rss/<your-token> --limit 20

# Pull captions from a YouTube channel (podcast or otherwise)
python3 scripts/fetch_youtube_captions.py --channel @DwarkeshPatel --limit 5

# Or batch a list of channels (edit the array inside the script)
./scripts/fetch_all_channels.sh 5

# Or pipe arbitrary text in as a note
pbpaste | python3 scripts/paste_ingest.py "my note title"
```

All fetchers write to `raw/inbox/`, where Claude Code picks them up on the next ingest.
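The fetchers are intentionally small. As a rough sketch of what a stdin-to-inbox ingester in the spirit of `paste_ingest.py` could look like (this is not the repo's actual script; the date-prefixed filename and slug scheme are assumptions):

```python
import sys
from datetime import date
from pathlib import Path

def ingest_note(title: str, text: str, inbox: Path = Path("raw/inbox")) -> Path:
    """Write pasted text into the inbox as a dated markdown note."""
    inbox.mkdir(parents=True, exist_ok=True)
    # Turn the title into a filesystem-safe slug (assumed naming convention)
    slug = "-".join("".join(c if c.isalnum() else " " for c in title.lower()).split())
    path = inbox / f"{date.today().isoformat()}-{slug}.md"
    path.write_text(f"# {title}\n\n{text}\n", encoding="utf-8")
    return path

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pbpaste | python3 paste_ingest.py "my note title"
    print(ingest_note(sys.argv[1], sys.stdin.read()))
```

Anything written this way lands in `raw/inbox/` like any other fetched source and gets classified on the next ingest.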
- Drop the raw content (transcript, article, note) into `raw/inbox/`.
- In Claude Code, say: "classify and route inbox" — the agent moves each file to the correct folder under `raw/` and writes its `metadata.yaml`.
- Then say: "ingest the routed files" — the agent reads each source end-to-end and updates the wiki.
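The schema for `metadata.yaml` is defined in `CLAUDE.md`. As a rough illustration only (field names and values here are assumptions, not the repo's actual schema), a routed podcast source might carry:

```yaml
# Illustrative only; see CLAUDE.md for the actual schema
title: "Example Podcast #42: Jane Doe on scaling"
type: podcast
source_url: "https://example.com/episode-42"
published: 2024-06-01
routed_to: raw/podcasts/
```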
The agent will:
- Create or update entity pages (people + orgs)
- Create or update concept and topic pages
- Detect debates between named people and create debate pages
- Append every claim with full attribution (who said it, where, timestamp)
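In practice, an attributed claim on a page might render something like this (the exact citation format is whatever `CLAUDE.md` prescribes; the name, claim, and link here are purely illustrative):

```markdown
- GPU supply, not data, is the binding constraint through 2026
  (Jane Doe, [Example Podcast #42](../sources/example-podcast-42.md), 00:41:10)
```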
Ask the agent anything about the wiki's content. It reads `index.md` to find relevant pages, drills in, and synthesizes an answer with citations. Per Rule 6 in `CLAUDE.md`, useful answers can be filed back into the wiki as new pages — nothing evaporates into chat history.
```
knowledge-base/
├── CLAUDE.md              # Operating manual for the agent
├── README.md
├── index.md               # Auto-maintained catalog of all wiki pages (created on first ingest)
├── log.md                 # Append-only activity log (created on first ingest)
├── raw/                   # Immutable source documents
│   ├── inbox/             # Drop zone for unclassified sources
│   ├── podcasts/
│   ├── articles/
│   ├── blogs/
│   ├── notes/
│   └── threads/
└── wiki/                  # Agent-generated Obsidian pages
    ├── sources/
    ├── entities/
    ├── concepts/
    ├── topics/
    ├── debates/
    └── episodes/
```
The wiki is governed by nine non-negotiable rules in `CLAUDE.md`. The most important:
- Editor, not writer. Every claim must trace to something a real person said in a real source. No extrapolation.
- Attribute everything. Every claim includes who said it, which source, and a timestamp or section anchor.
- Topics are canonical, sources are accretive. The wiki is organized by topics — every new source contributes claims into existing pages instead of creating duplicates.
- Flag, don't resolve. When two credible people disagree, create a debate page with both positions cited. Never editorialize.
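Under that last rule, a debate page might look roughly like the following (a hypothetical sketch; the real template and citation format live in `CLAUDE.md`, and the names here are placeholders):

```markdown
# Debate: Will scaling alone get us to AGI?

## Position: Yes (Jane Doe, placeholder name)
- Claim as stated in the source, with citation and timestamp.

## Position: No (John Smith, placeholder name)
- Counter-claim as stated in its source, with citation and timestamp.
```

Both positions stand side by side, fully cited, with no verdict from the agent.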
Read `CLAUDE.md` in full before your first ingest.
MIT