LLM Wiki

A CLI tool that ingests PDF documents into a structured, LLM-maintained Markdown wiki. An LLM reads each document, extracts key concepts, and produces interlinked wiki pages with source citations — building a knowledge base incrementally.

How It Works

PDF ─→ Marker (OCR + layout) ─→ structured blocks ─→ LLM prompt ─→ wiki pages

1. Parse: Marker extracts text with OCR, layout detection, and page numbers
2. Prompt: The parsed content is combined with the wiki schema and current index into a single LLM prompt
3. Generate: The LLM (Ollama) produces wiki pages with YAML frontmatter, cross-links ([[page-name]]), and source citations (source.pdf, p.3)
4. Store: Pages are written to a flat Markdown wiki with an auto-maintained index.md and append-only log.md
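
The Prompt and Generate steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Block`, `build_prompt`, `ingest`, and the prompt layout are hypothetical names chosen for the example, and parsing and storing are left to the caller.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Block:
    """One parsed chunk of text plus its page number (hypothetical shape)."""
    page: int
    text: str


class LLM(Protocol):
    """Anything that turns a prompt into generated text."""
    def generate(self, prompt: str) -> str: ...


def build_prompt(schema: str, index: str, source: str, blocks: list[Block]) -> str:
    """Combine the wiki schema, the current index, and parsed content into one prompt."""
    # Tag every block with its source and page so the model can cite it.
    body = "\n".join(f"[{source}, p.{b.page}] {b.text}" for b in blocks)
    return f"# Schema\n{schema}\n\n# Current index\n{index}\n\n# Document\n{body}"


def ingest(source: str, blocks: list[Block], schema: str, index: str, llm: LLM) -> str:
    """Run the prompt and generate steps and return the raw model output."""
    return llm.generate(build_prompt(schema, index, source, blocks))
```

Passing the LLM in as a protocol keeps the orchestration testable with a stub that never touches Ollama.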

Prerequisites

Python 3.12+
uv
Ollama running locally (recommended: native install on macOS for Apple Silicon GPU support)

Setup

# Install dependencies
uv sync

# Start Ollama and pull the model
ollama serve &
ollama pull qwen2.5:7b

# Or use Docker (note: requires sufficient memory allocation)
docker compose up -d

Configuration lives in config.yml:

wiki:
  dir: wiki

ollama:
  model: qwen2.5:7b
  url: http://localhost:11434
  timeout: 300
  num_ctx: 16384

Environment variables override config values (precedence: CLI flag > env var > config.yml > defaults):

| Variable | Default | Description |
| --- | --- | --- |
| `LLM_WIKI_OLLAMA_MODEL` | `qwen2.5:7b` | Ollama model name |
| `LLM_WIKI_OLLAMA_URL` | `http://localhost:11434` | Ollama API base URL |
| `LLM_WIKI_DIR` | `wiki` | Wiki output directory |

Usage

# Ingest a single PDF
uv run llm-wiki ingest path/to/document.pdf

# Ingest all PDFs in a directory
uv run llm-wiki ingest-all path/to/pdfs/

# Show wiki status
uv run llm-wiki status

All configuration options can be overridden via CLI flags:

uv run llm-wiki ingest doc.pdf --model qwen2.5:3b --wiki-dir output/

Wiki Structure

wiki/
├── _schema.md          # LLM instructions (page format, rules, output delimiters)
├── index.md            # Auto-maintained page catalog
├── log.md              # Append-only ingestion log
├── source-summary.md   # One per ingested document
├── concept-page.md     # Topics spanning multiple sources
└── entity-page.md      # Named things (models, standards, locations)
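
A flat-file store with this layout might look roughly like the sketch below. The function names, the log line format, and the index format are assumptions made for illustration; only the file names (index.md, log.md, _schema.md) come from the tree above.

```python
from datetime import datetime, timezone
from pathlib import Path

# Files that are part of the wiki machinery rather than content pages.
RESERVED = {"index", "log", "_schema"}


def write_page(wiki_dir: Path, name: str, content: str) -> Path:
    """Write (or overwrite) one page in the flat wiki directory."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    path = wiki_dir / f"{name}.md"
    path.write_text(content, encoding="utf-8")
    return path


def append_log(wiki_dir: Path, message: str) -> None:
    """Append one timestamped entry to log.md; existing entries are never rewritten."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with (wiki_dir / "log.md").open("a", encoding="utf-8") as log:
        log.write(f"- {stamp} {message}\n")


def rebuild_index(wiki_dir: Path) -> list[str]:
    """Regenerate index.md from the pages on disk and return the page names."""
    names = sorted(p.stem for p in wiki_dir.glob("*.md") if p.stem not in RESERVED)
    lines = ["# Index", ""] + [f"- [[{name}]]" for name in names]
    (wiki_dir / "index.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return names
```

Rebuilding the index from the directory contents, rather than editing it incrementally, keeps it consistent even if a page is added or removed by hand.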

Pages use YAML frontmatter with source tracking and [[wiki-links]] for cross-references.
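
As an illustration of how such pages can be processed, the sketch below separates the frontmatter from the body and collects the [[wiki-links]]. The helper names and the frontmatter fields in the usage example are hypothetical; the frontmatter is returned as raw text so no YAML library is needed.

```python
import re

# Matches [[page-name]]; aliases such as [[page|label]] are deliberately not handled.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def split_frontmatter(page: str) -> tuple[str, str]:
    """Return (frontmatter, body); frontmatter is '' when the page has none."""
    if page.startswith("---\n"):
        end = page.find("\n---\n", 3)
        if end != -1:
            return page[4:end], page[end + 5:]
    return "", page


def wiki_links(page: str) -> list[str]:
    """Return the sorted, de-duplicated link targets found in the page body."""
    _, body = split_frontmatter(page)
    return sorted({target.strip() for target in WIKI_LINK.findall(body)})
```

A usage example with made-up frontmatter fields:

```python
page = "---\ntitle: Example\nsources:\n  - source.pdf\n---\nSee [[concept-page]] and [[entity-page]].\n"
print(wiki_links(page))
```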

Project Structure

src/llm_wiki/
├── cli.py          # Typer CLI (ingest, ingest-all, status)
├── config.py       # YAML config loading with env var overrides
├── parser.py       # Marker PDF parsing + LLM text formatting
├── llm.py          # LLM Protocol + OllamaLLM implementation
├── ingestion.py    # Orchestrator: parse → prompt → generate → store
└── wiki_store.py   # Flat-file wiki (read/write pages, index, log)
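
The "LLM Protocol + OllamaLLM implementation" split in llm.py could look something like the following. This is a sketch, not the project's code: the method names are assumptions, and it assumes Ollama's non-streaming `/api/generate` endpoint called through the standard library, whereas the real module may use a different endpoint or client. The default values mirror config.yml.

```python
import json
import urllib.request
from typing import Protocol


class LLM(Protocol):
    """Structural interface: any object with a matching generate() satisfies it."""
    def generate(self, prompt: str) -> str: ...


class OllamaLLM:
    def __init__(
        self,
        model: str = "qwen2.5:7b",
        url: str = "http://localhost:11434",
        timeout: int = 300,
        num_ctx: int = 16384,
    ) -> None:
        self.model = model
        self.url = url.rstrip("/")
        self.timeout = timeout
        self.num_ctx = num_ctx

    def build_payload(self, prompt: str) -> dict:
        """Build the JSON body for a single non-streaming generation request."""
        return {
            "model": self.model,
            "prompt": prompt,
            "stream": False,
            "options": {"num_ctx": self.num_ctx},
        }

    def generate(self, prompt: str) -> str:
        """POST the prompt to the Ollama server and return the generated text."""
        request = urllib.request.Request(
            f"{self.url}/api/generate",
            data=json.dumps(self.build_payload(prompt)).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=self.timeout) as response:
            return json.loads(response.read())["response"]
```

Because `LLM` is a `Protocol`, tests can substitute a stub with a `generate` method without subclassing anything.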

Development

# Run tests
uv run pytest

# Run only unit tests
uv run pytest -m unit

# Lint
uv run ruff check src/ tests/

# Format
uv run ruff format src/ tests/

# Type check
uv run ty check src/
