RawToWise

LLM Knowledge Compiler — drop raw documents, get a structured markdown wiki.

Install · Quick Start · How It Works · Contributing


raw/ (papers, articles, URLs)
  → rtw compile → wiki/ (structured .md with backlinks)
                    → rtw query → answers accumulate in wiki
                    → rtw lint  → detect contradictions, fill gaps

Inspired by Andrej Karpathy's LLM knowledge base workflow: it turns his "hacky collection of scripts" into a real tool.

Why RawToWise?

| Problem | RawToWise |
| --- | --- |
| RAG requires vector DB infra | No vector DB — LLM navigates via index + backlinks |
| Chat answers disappear | Exploration = accumulation — every query enriches the wiki |
| PKM requires manual organizing | Drop and forget — put files in raw/, LLM handles the rest |
| Vendor lock-in (NotebookLM, etc.) | Plain markdown — works in Obsidian, VSCode, or any editor |

Install

curl -fsSL https://raw.githubusercontent.com/vericontext/rawtowise/main/install.sh | bash
Other install methods
# Via pipx
pipx install git+https://github.com/vericontext/rawtowise.git

# Via uv
uv tool install git+https://github.com/vericontext/rawtowise.git

# From source
git clone https://github.com/vericontext/rawtowise.git && cd rawtowise && pip install -e .

Requires an Anthropic API key. The rtw init command will prompt you to set it up.

Quick Start

# 1. Initialize a project
rtw init --name "AI Research"

# 2. Ingest sources
rtw ingest https://example.com/article
rtw ingest "https://en.wikipedia.org/wiki/Transformer_(deep_learning)"
rtw ingest paper.pdf
rtw ingest ./my-articles/

# 3. Compile into a wiki
rtw compile

# 4. Ask questions (answers stream in real-time)
rtw query "What are the key debates in this field?"

# 5. Health check
rtw lint

How It Works

Ingest — Fetch URLs (via Jina Reader), copy local files, and clean web boilerplate. Sources are stored in raw/.

Compile — LLM extracts key concepts from all sources, generates interlinked wiki articles with [[backlinks]] and [source: filename] citations, and builds an index. Articles are generated in parallel for speed.
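
Because the [[backlink]] and [source: filename] markup is plain text, compiled articles are easy to post-process. A minimal sketch (illustrative only, not RawToWise's actual implementation) of pulling both out of a generated article:

```python
import re

# Hypothetical helpers: match [[backlinks]] and [source: ...] citations
# in the markup format described above.
BACKLINK = re.compile(r"\[\[([^\]]+)\]\]")
CITATION = re.compile(r"\[source:\s*([^\]]+)\]")

def extract_links(article: str):
    """Return (backlink targets, cited source filenames) for one article."""
    return BACKLINK.findall(article), CITATION.findall(article)

links, sources = extract_links(
    "The [[Transformer]] relies on [[attention]]. [source: paper.pdf]"
)
# links  -> ['Transformer', 'attention']
# sources -> ['paper.pdf']
```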

Query — LLM reads the wiki index, finds relevant articles, and synthesizes an answer. Answers stream to the terminal and are saved to output/ for future reference.

Lint — LLM audits the wiki for contradictions, coverage gaps, stale information, and suggests new questions to explore.

Commands

| Command | Description |
| --- | --- |
| `rtw init` | Initialize a new project (creates dirs + config, prompts for API key) |
| `rtw ingest <source>` | Ingest URL, file, or directory into `raw/` |
| `rtw compile` | Compile sources into wiki (incremental by default) |
| `rtw compile --full` | Full recompile from scratch |
| `rtw compile --dry-run` | Estimate token usage and cost |
| `rtw query "question"` | Ask the wiki (streamed output) |
| `rtw query "..." --format table` | Output as markdown table |
| `rtw query "..." --deep` | Deep research mode (longer output) |
| `rtw lint` | Run wiki health check |
| `rtw stats` | Show wiki statistics |

Project Structure

my-research/
├── rtw.yaml              # Configuration
├── .env                  # API key (auto-created by rtw init, gitignored)
├── raw/                  # Raw sources — you add files here
│   ├── articles/         #   Web articles (auto-sorted)
│   └── papers/           #   PDFs (auto-sorted)
├── wiki/                 # LLM-generated wiki — don't edit manually
│   ├── _index.md         #   Master index
│   ├── _sources.md       #   Source catalog
│   └── concepts/         #   Concept articles with [[backlinks]]
├── output/               # Query results
│   └── queries/          #   Saved answers
└── .rtw/                 # Internal state (compile state, debug logs)
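
The skeleton above takes only a few lines to reproduce. An illustrative sketch of what `rtw init` sets up on disk, assuming nothing beyond the directory names shown in the tree (this is not the tool's actual code):

```python
from pathlib import Path

def init_layout(root: str) -> None:
    """Create the RawToWise project skeleton shown above."""
    base = Path(root)
    for d in ("raw/articles", "raw/papers", "wiki/concepts",
              "output/queries", ".rtw"):
        (base / d).mkdir(parents=True, exist_ok=True)
    # Config contents are filled in interactively by `rtw init`.
    (base / "rtw.yaml").touch()
```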

Configuration

rtw.yaml (auto-generated by rtw init):

version: 1
name: "My Research"

llm:
  compile: claude-sonnet-4-6      # Fast model for compilation
  query: claude-sonnet-4-6        # Query answering
  lint: claude-haiku-4-5-20251001 # Economical model for health checks

compile:
  strategy: incremental
  max_concepts: 200
  language: en                    # Wiki language
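
Keys omitted from `rtw.yaml` presumably fall back to defaults. A hypothetical sketch of overlaying a partial config on the defaults above (the `merge` helper and `DEFAULTS` dict are assumptions for illustration, with values taken from the example config):

```python
DEFAULTS = {
    "version": 1,
    "llm": {"compile": "claude-sonnet-4-6", "query": "claude-sonnet-4-6"},
    "compile": {"strategy": "incremental", "max_concepts": 200, "language": "en"},
}

def merge(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay user-supplied keys on the defaults."""
    out = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

# A user config that only changes the wiki language keeps every other default.
cfg = merge(DEFAULTS, {"compile": {"language": "de"}})
```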

Viewing the Wiki

The compiled wiki is plain markdown with [[wiki-links]]. Best viewed with:

  • Obsidian — open wiki/ as a vault. Graph view shows concept connections.
  • VSCode + Foam — [[backlink]] support with graph visualization.
  • Any markdown viewer — files are standard .md, readable anywhere.

Cost

RawToWise uses the Anthropic API. You pay only for what you use.

| Operation | Estimated cost (USD) |
| --- | --- |
| Ingest 1 article | ~$0.02 |
| Compile 5 sources | ~$1-2 |
| Single query | ~$0.05-0.15 |
| Lint | ~$0.50 |

Use rtw compile --dry-run to estimate before compiling.
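
A dry-run estimate boils down to token counts times per-token prices. A back-of-envelope sketch (the per-million-token prices below are placeholders for illustration, not Anthropic's actual rates; check their pricing page for your model):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_price_per_mtok: float = 3.00,
                  out_price_per_mtok: float = 15.00) -> float:
    """Rough API cost in USD for one compile, given token counts."""
    return (input_tokens * in_price_per_mtok
            + output_tokens * out_price_per_mtok) / 1_000_000

# Compiling 5 sources of ~20k input tokens each, ~10k output tokens per source:
cost = estimate_cost(5 * 20_000, 5 * 10_000)
# -> 1.05, in the same ballpark as the "~$1-2" estimate above
```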

Roadmap

See open issues labeled roadmap for planned features, including:

  • PDF ingestion
  • YouTube transcript support
  • True incremental compile
  • Multi-LLM support (OpenAI, Ollama)
  • Obsidian plugin
  • MCP server for AI agents

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Uninstall

curl -fsSL https://raw.githubusercontent.com/vericontext/rawtowise/main/uninstall.sh | bash

License

MIT
