knowledge-base

Build a personal RAG over your own iMessages, WhatsApp, and email. Ask questions about your message history in plain English. Bring your own LLM.

$ kb ask "what did I say about quitting my job in 2023"

What it is

A small Python CLI that:

  1. Reads your local Apple data — ~/Library/Messages/chat.db (iMessage) and ~/Library/Mail/ (Apple Mail). WhatsApp via ChatStorage.sqlite if you've exported it.
  2. Chunks conversations by 30-minute activity windows and email threads by message.
  3. Embeds and indexes everything locally with ChromaDB (default embeddings: all-MiniLM-L6-v2 via sentence-transformers).
  4. Lets you ask natural-language questions. Top-N chunks get retrieved and sent to an LLM as context. The LLM answers, citing dates and people.
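The 30-minute windowing in step 2 can be sketched as follows. This is a minimal illustration, not the tool's actual chunker; it assumes messages arrive as (unix_timestamp, text) tuples:

```python
def chunk_by_window(messages, window_secs=1800):
    """Group messages into conversation chunks: start a new chunk
    whenever the gap since the previous message exceeds window_secs
    (30 minutes by default)."""
    chunks, current = [], []
    for ts, text in sorted(messages):
        if current and ts - current[-1][0] > window_secs:
            chunks.append(current)
            current = []
        current.append((ts, text))
    if current:
        chunks.append(current)
    return chunks
```

Gap-based windows keep a rapid back-and-forth in one chunk while splitting conversations separated by hours, so retrieved context stays coherent.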

It's a personal-data RAG, Karpathy-style: not what you say you believe, but what 11 years of unguarded text reveals about you.

Privacy

Everything is local by default. Your messages never leave your machine.

  • The parser reads Apple's local SQLite databases directly. No iCloud, no Apple sign-in.
  • Embeddings run locally via sentence-transformers — no API calls.
  • The default LLM endpoint is Ollama on localhost:11434. If you keep the default, your messages never touch a third-party server.
  • If you point the LLM at a hosted model (OpenAI, Anthropic, OpenRouter), the retrieved chunks for each kb ask query are sent to that provider as context. Your full corpus still stays on disk.

chromadb_data/ is .gitignore'd. Do not commit your indexed data.

Install

Requires Python 3.9+ and macOS (for the iMessage / Mail readers).

git clone https://github.com/gabrielkagan/knowledge-base.git
cd knowledge-base
python -m venv .venv && source .venv/bin/activate
pip install -e .

macOS Full Disk Access — your terminal needs it to read ~/Library/Messages/chat.db. Grant it in:

System Settings → Privacy & Security → Full Disk Access → enable for your terminal app (Terminal, iTerm, Ghostty, etc.)

Pick an LLM backend

The kb ask command speaks the OpenAI chat-completions API, so it works with any OpenAI-compatible endpoint. Configure via env vars:

Backend                  OPENAI_BASE_URL                LLM_MODEL                    OPENAI_API_KEY
Ollama (local, default)  http://localhost:11434/v1      qwen2.5:14b                  any string
LM Studio (local)        http://localhost:1234/v1       <model loaded in app>        any string
OpenAI                   https://api.openai.com/v1      gpt-4o-mini                  your OpenAI key
Anthropic                https://api.anthropic.com/v1   claude-sonnet-4-5            your Anthropic key
OpenRouter               https://openrouter.ai/api/v1   qwen/qwen-2.5-72b-instruct   your OpenRouter key

Defaults are Ollama + Qwen 2.5 14B. To run that:

brew install ollama
ollama serve &
ollama pull qwen2.5:14b

To switch to OpenAI:

export OPENAI_BASE_URL=https://api.openai.com/v1
export OPENAI_API_KEY=sk-...
export LLM_MODEL=gpt-4o-mini
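Because every backend speaks the same chat-completions API, the request body kb ask POSTs to $OPENAI_BASE_URL/chat/completions has the same shape regardless of provider. A sketch of what such a payload might look like (the function name and prompt wording are illustrative, not the tool's actual code):

```python
import os

def build_chat_request(question, context_chunks):
    """Assemble an OpenAI-style chat-completions body from the
    top-N retrieved chunks. Prompt wording here is illustrative."""
    context = "\n\n".join(context_chunks)
    return {
        "model": os.environ.get("LLM_MODEL", "qwen2.5:14b"),
        "messages": [
            {"role": "system",
             "content": "Answer from the user's message history, citing dates and people."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    }
```

Swapping providers is then purely a matter of changing the base URL, model name, and key; the payload never changes.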

Usage

One-time ingest (reads chat.db + Mail, chunks, embeds, indexes):

kb ingest

Re-running is idempotent: chunks have content-hashed IDs, so already-indexed chunks are skipped.
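The content-hashed IDs work roughly like this, per the "Deterministic chunk IDs" design note later in this README (the function name is mine; a sketch, not the tool's exact code):

```python
import hashlib

def chunk_id(source, date, text):
    """Deterministic chunk ID: SHA-256 over source, date, and the
    first 200 characters of the chunk text. Re-ingesting the same
    chunk yields the same ID, so the index can skip it."""
    key = f"{source}|{date}|{text[:200]}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```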

Ask a question:

kb ask "summarize my relationship with X"
kb ask "what did I think about crypto in 2021" --n 20
kb ask "emails from my landlord about the lease" --source email
kb ask "what did I tell mom about the job offer" --person "+1234567890"

Show what was retrieved without sending to the LLM:

kb ask "..." --raw

Stats:

kb stats

How it works

chat.db (iMessage)  ─┐
ChatStorage.sqlite  ─┼──> parser ──> chunker ──> embedder ──> ChromaDB
~/Library/Mail/     ─┘                              │              │
                                                    │              │
                                                    └─ MiniLM ─────┘
                                                       (local)

         ─────────────────────────────────────────────────────
         kb ask "..."  ──>  ChromaDB top-N retrieval  ──>  LLM
                                                            │
                                                  answer + citations

Key design choices:

  • Conversation-aware chunking. iMessages get grouped into 30-minute activity windows so context stays coherent. Emails are chunked per-message; long ones split on paragraph breaks.
  • Local embeddings. ChromaDB's default all-MiniLM-L6-v2 runs on CPU, no API calls, no key required.
  • Deterministic chunk IDs. SHA-256 of source + date + first 200 chars. Re-ingesting is a no-op for existing chunks.
  • Tapback filtering. iMessage reactions (associated_message_type 2000–3005) are skipped — they pollute search.
  • Apple epoch handling. chat.db stores timestamps as seconds-since-2001 or nanoseconds-since-2001 depending on iOS version. The parser handles both.
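The Apple-epoch quirk in the last bullet can be handled with a small magnitude heuristic. This is a sketch under my own threshold choice, not necessarily the parser's exact logic:

```python
from datetime import datetime, timezone

APPLE_EPOCH = 978307200  # Unix timestamp of 2001-01-01 00:00:00 UTC

def apple_ts_to_utc(raw):
    """Convert a chat.db `date` value to an aware UTC datetime.
    Seconds-since-2001 values stay below ~1e9 until 2032, so
    anything above 1e11 is treated as nanoseconds."""
    if raw > 10**11:
        raw /= 1_000_000_000
    return datetime.fromtimestamp(APPLE_EPOCH + raw, tz=timezone.utc)
```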

File layout

kb/
  imessage.py    iMessage chat.db reader (handles, chats, group vs DM, tapback filter)
  email.py       Apple Mail .emlx parser (strips quoted replies, basic HTML cleanup)
  chunker.py     Conversation-window chunking + email-thread chunking
  index.py       ChromaDB indexing + retrieval
  cli.py         CLI: ingest / ask / stats / browse

License

MIT.
