A hands-on implementation of the LLM-Wiki pattern described by Andrej Karpathy. Applied here to Alice's Adventures in Wonderland (Ch. I–III) as a proof of concept.
An LLM-Wiki is a persistent, LLM-maintained knowledge base built from raw source documents. The key principle: the LLM does all the maintenance work. The human provides sources and asks questions.
Rather than re-deriving knowledge on every query, the LLM builds up a structured wiki of markdown pages and keeps them incrementally updated. The wiki becomes richer with every ingest and query operation.
Original concept: Andrej Karpathy's gist
sources/ ← Raw documents (immutable, human-curated)
↓
wiki/ ← LLM-maintained markdown pages (never edit manually)
↓
SCHEMA.md ← Rules and conventions governing the LLM's work
| File | Role |
|---|---|
| SCHEMA.md | The LLM's instruction manual: page types, frontmatter, naming, operations |
| index.md | Auto-maintained table of contents, organized by category |
| log.md | Append-only chronological log of all operations |
wiki/
concepts/ ← Abstract ideas, themes, theories
entities/ ← Persons, places, works, organizations
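As a rough illustration of how a page title could map onto this layout, here is a minimal Python sketch. The kebab-case file naming is an assumption inferred from the `innocence-as-disruption` name that appears in the log; the actual naming rules live in SCHEMA.md, and `slugify` and `page_path` are hypothetical helpers, not part of the project.

```python
import re
from pathlib import Path

# Assumed mapping from the frontmatter `type` value to its wiki subfolder.
FOLDERS = {"concept": "concepts", "entity": "entities"}


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def page_path(title: str, page_type: str, wiki_root: str = "wiki") -> Path:
    """Return the (hypothetical) file path for a page of the given type."""
    return Path(wiki_root) / FOLDERS[page_type] / f"{slugify(title)}.md"


print(page_path("Innocence as Disruption", "concept").as_posix())
```

In the workflow described here the LLM chooses file names itself; a helper like this would only be useful for tooling that inspects the wiki from outside.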
Every wiki page begins with YAML frontmatter:
---
title:
type: concept | entity
entity_type: person | place | work | organization
sources: []
tags: []
created: YYYY-MM-DD
updated: YYYY-MM-DD
related: []
---

Three operations are defined as local Claude Code slash commands (.claude/commands/):
INGEST: Process a new source document. The LLM reads the source, identifies concepts and entities,
creates or updates 5–15 wiki pages, updates index.md, and appends to log.md.
QUERY: Answer a question using only the wiki as the knowledge base. The LLM reads index.md,
opens relevant pages, and synthesizes an answer with citations. If the answer reveals
a new insight not yet in the wiki, a new page is created.
LINT: Health-check the wiki for orphaned pages, missing cross-references, incomplete frontmatter,
contradictions, and broken index links. The command writes a report to lint-report.md.
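The frontmatter part of this health check is mechanical enough to sketch in plain Python. The example below is an illustration only, not the project's lint command (which is an LLM prompt): it assumes flat `key: value` frontmatter as shown above, treats `entity_type` as required only for entity pages (an assumption), and uses hypothetical function names. A real checker would more likely use a YAML parser.

```python
SCALAR_KEYS = ["title", "type", "created", "updated"]  # must be present and non-empty
LIST_KEYS = ["sources", "tags", "related"]             # must at least be present


def parse_frontmatter(text: str) -> dict[str, str]:
    """Return raw key/value strings found between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing marker never found


def frontmatter_problems(text: str) -> list[str]:
    """List the frontmatter problems found in one page's text."""
    fields = parse_frontmatter(text)
    if not fields:
        return ["missing, empty, or unterminated frontmatter"]
    problems = [f"missing or empty field: {k}" for k in SCALAR_KEYS if not fields.get(k)]
    problems += [f"missing field: {k}" for k in LIST_KEYS if k not in fields]
    # Assumption: entity_type only applies to pages whose type is "entity".
    if fields.get("type") == "entity" and not fields.get("entity_type"):
        problems.append("missing or empty field: entity_type")
    return problems


page = "---\ntitle:\ntype: entity\nsources: []\n---\nBody text"
for problem in frontmatter_problems(page):
    print(problem)
```

Checks such as contradictions between pages need judgment rather than pattern matching, which is why they are left to the LLM.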
sources/alice-ch1-3.txt — Project Gutenberg plain text, chapters I–III (633 lines)
Concepts (5)
- Caucus Race — satirical ruleless race; everyone wins
- Identity and Self — Alice's crisis of self as she transforms
- Innocence as Disruption — Alice's naivety causes unintended chaos (emerged from QUERY)
- Logic and Nonsense — Carroll's core technique
- Transformation and Size — Drink Me / Eat Me mechanics
Entities — Persons (6)
Entities — Places (3)
[2026-04-21 16:45] INGEST sources/alice-ch1-3.txt — 9 entity pages, 4 concept pages
[2026-04-21 18:00] QUERY "Welchen Sinn hat Dinah?" — new concept: innocence-as-disruption
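Because log.md is append-only and line-oriented, entries like the two above are easy to read back with a script. The following Python sketch assumes every entry follows the shape `[timestamp] OPERATION target`, optionally followed by a dash-separated note, as in these two examples. `LogEntry` and `parse_log_line` are hypothetical names, and the real log format is whatever SCHEMA.md prescribes.

```python
import re
from datetime import datetime
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    operation: str
    target: str
    note: str


# \u2014 is the long dash that separates the target from the note in the log lines.
LOG_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\] (\w+) (.*?)(?: \u2014 (.*))?$"
)


def parse_log_line(line: str) -> LogEntry:
    """Split one log line into timestamp, operation, target, and note."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    stamp, operation, target, note = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    return LogEntry(when, operation, target, note or "")


entry = parse_log_line(
    "[2026-04-21 16:45] INGEST sources/alice-ch1-3.txt \u2014 9 entity pages, 4 concept pages"
)
print(entry.operation, entry.target)
```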
What works well
- The SCHEMA.md reliably guides the LLM to produce consistent, well-linked pages
- QUERY operations can discover concepts not extracted during INGEST
- Git history serves as a second audit trail alongside log.md
- Cross-references between pages emerge naturally
Known limitations
- Cold-start retrieval is slow: a fresh Claude session must read SCHEMA + index + pages before answering
- Mitigation without code: keep sessions open; enrich index.md with more metadata so fewer pages need to be opened
- At scale (100+ pages), a vector index would replace the LLM's manual index scan
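If a small amount of code were acceptable, a step between the manual index scan and a full vector index could be a keyword prefilter over the index descriptions. The Python sketch below is a deliberately simple stand-in: it ranks pages by word overlap with the question and does not use embeddings. The page paths and the `rank_pages` helper are hypothetical, and the descriptions reuse the concept summaries listed earlier.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_pages(question: str, index: dict[str, str], limit: int = 3) -> list[str]:
    """Return up to `limit` page paths whose descriptions share words with the question."""
    wanted = tokenize(question)
    scored = [(len(wanted & tokenize(desc)), path) for path, desc in index.items()]
    # Highest overlap first; ties are broken alphabetically by path.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [path for score, path in ranked if score > 0][:limit]


# Hypothetical index: path -> one-line description taken from index.md.
index = {
    "wiki/concepts/caucus-race.md": "Caucus Race: satirical ruleless race; everyone wins",
    "wiki/concepts/identity-and-self.md": "Identity and Self: Alice's crisis of self as she transforms",
}
print(rank_pages("Who wins the race?", index))
```

The richer the per-page description in index.md, the better a filter like this can narrow the set of pages a fresh session has to open.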
Design decisions
- English throughout (better embedding model coverage for future retrieval)
- YAML frontmatter on all pages (machine-readable metadata)
- Slash commands are project-local (.claude/commands/); nothing global
- Sources are tracked in git alongside the wiki (simplicity over size optimization)