MindVault

A local-first second brain that turns your AI conversation exports, Obsidian notes, and documents into a searchable, conversational memory system.

Everything runs on your machine. Your notes and conversations stay local; only web search queries go out.

What it does

Ingests Claude, ChatGPT, and other AI conversation exports, Obsidian vaults, PDFs, and plain text files
Indexes content into a multi-layer memory system (raw → compressed → structured → linked)
Remembers entities, decisions, and goals extracted from every chat
Retrieves using hybrid scoring — summaries first, raw text only when needed
Chats interactively with six reasoning modes powered by a council of AI voices
Searches the web automatically when your memory doesn't have a confident answer
Saves sessions — resume any previous conversation exactly where you left off

Quick start

Prerequisites

Python 3.11+
Ollama with two models:

ollama pull nomic-embed-text   # vector search
ollama pull llama3.2           # chat and summarization
ollama serve                   # start Ollama if not running
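
If you want to confirm both models are available before running setup, a small check along these lines may help. It is a sketch, not part of MindVault: the helper names are hypothetical, and it assumes Ollama is listening on its default address (http://localhost:11434) and exposes its standard /api/tags model listing.

```python
import json
import urllib.request

# Models named in the prerequisites above.
REQUIRED_MODELS = ("nomic-embed-text", "llama3.2")


def missing_models(installed_names, required=REQUIRED_MODELS):
    """Return the required models that have no installed match.

    Ollama reports names with a tag suffix such as ':latest', so only the
    part before the first colon is compared.
    """
    installed = {name.split(":", 1)[0] for name in installed_names}
    return [model for model in required if model not in installed]


def fetch_installed_models(base_url="http://localhost:11434"):
    """Ask a running Ollama server which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        payload = json.load(response)
    return [entry["name"] for entry in payload.get("models", [])]
```

With Ollama running, `missing_models(fetch_installed_models())` returns an empty list when both pulls succeeded.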

Install

git clone https://github.com/calebthecm/MindVault
cd MindVault
python -m venv .venv && source .venv/bin/activate
pip install -e .               # installs deps + registers the mindvault CLI

Or without the CLI shortcut:

pip install -r requirements.txt

First run

mindvault setup                # or: python -m mindvault setup

Add your data

Drop your AI export folder into the project directory. PDFs and .txt/.md files can live anywhere; point the ingester at them manually, e.g. mindvault ingest ./folder/.

Provider	How to export
Claude	claude.ai → Settings → Export Data (folder starting with `data-`)
ChatGPT	chatgpt.com → Settings → Data Controls → Export Data

Index and chat

mindvault ingest               # index everything
mindvault chat                 # start talking to your brain

Running MindVault

Three equivalent ways to run it — use whichever you prefer:

# After pip install -e . (recommended)
mindvault
mindvault chat
mindvault ingest

# As a Python module (no CLI install needed; dependencies from requirements.txt still required)
python -m mindvault
python -m mindvault chat
python -m mindvault ingest

# Legacy script (still works)
python mindvault.py
python mindvault.py chat
python mindvault.py ingest

Commands

mindvault                           chat (default)
mindvault chat                      interactive REPL
mindvault chat --resume             resume last session
mindvault chat --resume <id>        resume specific session
mindvault ingest                    auto-discover and index all exports
mindvault ingest ./folder/          index a specific folder
mindvault ingest --force            re-index even if already processed
mindvault notes                     regenerate Obsidian notes
mindvault setup                     first-run configuration wizard
mindvault stats                     show index and session statistics
mindvault sessions                  list resumable sessions
mindvault consolidate               merge near-duplicate memories

During a chat session:

Shift+Tab            cycle reasoning mode
/help                show all commands
/web <query>         search the web (DuckDuckGo, no API key needed)
/search <term>       search memory without LLM — shows scored results
/note <text>         quick-capture a note (indexed on next ingest)
/forget <topic>      suppress matching chunks from future retrieval
/mode [name]         show or switch mode (CHAT, PLAN, DECIDE, DEBATE, REFLECT, EXPLORE)
/sources             show which memories were used in the last answer
/remember <fact>     save a fact to this session
/private             toggle private vault inclusion
/resume              interactive session picker
/clear               clear conversation history
/quit, /exit         end session (compresses and saves automatically)

Web search

MindVault searches the web automatically when memory confidence is low, or on demand:

/web what is the current price of ETH?
/web latest news on local SEO in 2025

Uses DuckDuckGo: no API key, no Docker, no setup. Configure in mindvault/config.py:

WEB_SEARCH_AUTO_THRESHOLD = 0.45   # auto-search when best memory score is below this
WEB_SEARCH_MAX_RESULTS    = 5      # results to include in context

Set WEB_SEARCH_AUTO_THRESHOLD = 0 to disable auto-search.
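
The auto-search rule can be pictured roughly as below. This is an illustrative sketch rather than MindVault's actual code: `should_auto_search` is a hypothetical name, and treating an empty result list as a best score of 0 is an assumption.

```python
WEB_SEARCH_AUTO_THRESHOLD = 0.45  # default from config.py; 0 disables auto-search


def should_auto_search(memory_scores, threshold=WEB_SEARCH_AUTO_THRESHOLD):
    """Return True when the best memory score falls below the threshold."""
    if threshold <= 0:
        # A threshold of 0 switches auto-search off entirely.
        return False
    # Assumption: no memories at all counts as a best score of 0.
    best = max(memory_scores, default=0.0)
    return best < threshold
```

Under these assumptions, scores of 0.30 and 0.41 would trigger a web search, while a single memory scoring 0.80 would not.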

Reasoning modes

MindVault has six modes, cycled with Shift+Tab in the prompt bar.

Mode	What it does
💬 CHAT	Standard RAG — retrieve memories, synthesize an answer
📋 PLAN	Break the task into structured, actionable steps
🗳 DECIDE	Five-voice council votes; tally + majority verdict shown
⚖ DEBATE	FOR vs AGAINST, then a moderated verdict
🔍 REFLECT	Deep synthesis — what does your brain really know about this?
🕸 EXPLORE	Graph traversal — follows memory links to surface surprises

The council is five internal voices with distinct personalities:

Voice	Orientation
📊 The Analyst	Evidence-first, skeptical, quantitative
🚀 The Visionary	Big-picture, creative, optimistic
🔧 The Pragmatist	What's actionable right now
😈 The Devil	Challenges every assumption, finds the flaw
📜 The Historian	Patterns across time; what past memory reveals
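
To make the DECIDE tally concrete, here is a minimal sketch of counting council votes and picking a majority verdict. The function name and the vote format are hypothetical; how MindVault actually collects each voice's vote is not shown here.

```python
from collections import Counter


def tally_votes(votes):
    """Count votes and return (tally, verdict).

    votes maps a voice name to the option it chose. The verdict is the
    option backed by more than half of the voices, or None if no option
    reaches a majority.
    """
    if not votes:
        return {}, None
    tally = Counter(votes.values())
    top_option, top_count = tally.most_common(1)[0]
    verdict = top_option if top_count > len(votes) / 2 else None
    return dict(tally), verdict


council = {
    "Analyst": "no",
    "Visionary": "yes",
    "Pragmatist": "yes",
    "Devil": "no",
    "Historian": "yes",
}
tally, verdict = tally_votes(council)
```

With five voices and a yes/no question there is always a majority; the None case only arises when the voices split across three or more options.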

How it works

Memory layers

Layer	What	Used for
Raw	Original text chunks	Fallback when summaries aren't confident enough
Compressed	LLM-generated summaries per session/document	Primary retrieval context
Structured	Extracted entities (persons, projects, decisions, goals)	Entity-boosted retrieval
Linked	Relationships between memories via shared entities + wikilinks	Graph traversal in EXPLORE mode
Web	Live DuckDuckGo results	Augments memory for current/unknown topics
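
The Linked layer can be thought of as a graph whose edges connect memories that mention the same entity. The sketch below builds such a graph and walks it breadth-first, which is one plausible reading of what EXPLORE mode does. All names here are hypothetical, wikilinks are left out, and the two-hop limit is an assumption.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_links(memory_entities):
    """Link every pair of memories that share at least one entity.

    memory_entities maps a memory id to the set of entity names found in it.
    """
    by_entity = defaultdict(set)
    for memory_id, entities in memory_entities.items():
        for entity in entities:
            by_entity[entity].add(memory_id)

    links = defaultdict(set)
    for members in by_entity.values():
        for a, b in combinations(sorted(members), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def neighbors_within(links, start, max_hops=2):
    """Return memory ids reachable from start in at most max_hops steps."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in sorted(links.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


memories = {
    "m1": {"Acme"},
    "m2": {"Acme", "Q3 launch"},
    "m3": {"Q3 launch"},
    "m4": {"Unrelated"},
}
links = build_links(memories)
```

Here m1 and m3 share no entity directly, but m3 still surfaces from m1 through m2, which is the kind of indirect connection graph traversal is meant to find.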

Retrieval scoring

score = 0.5 × embedding_similarity
      + 0.2 × entity_overlap
      + 0.2 × recency
      + 0.1 × importance

Compressed summaries are searched first. Raw chunks are fetched only when the compressed score falls below COMPRESSED_SCORE_THRESHOLD (default 0.75). EXPLORE mode additionally walks memory_links to pull in related neighbors.
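
As a worked example of the formula, the sketch below computes the weighted score and the raw-chunk fallback decision. It assumes each signal is already normalized to the range 0 to 1 and that "confidence" means the best compressed score; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Per-memory inputs to the score, each assumed to lie in [0, 1]."""
    embedding_similarity: float
    entity_overlap: float
    recency: float
    importance: float


def hybrid_score(s):
    """Weighted sum using the weights from the formula above."""
    return (
        0.5 * s.embedding_similarity
        + 0.2 * s.entity_overlap
        + 0.2 * s.recency
        + 0.1 * s.importance
    )


def needs_raw_fallback(compressed_scores, threshold=0.75):
    """True when no compressed summary scores at or above the threshold.

    The default mirrors COMPRESSED_SCORE_THRESHOLD from the configuration.
    """
    return max(compressed_scores, default=0.0) < threshold


example = hybrid_score(Signals(0.8, 0.5, 0.5, 1.0))
```

For the example signals the terms are 0.4 + 0.1 + 0.1 + 0.1, giving a score of about 0.7, which is below 0.75 and would therefore pull in raw chunks as well.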

Session lifecycle

During chat: turns saved live + entities extracted per exchange (background)
On exit: LLM compresses the session into a 2–4 sentence summary
Summary embedded and stored in the compressed memory layer
Resume anytime with --resume or /resume during chat
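
The save and resume steps could look something like the sketch below, which writes a session to a gzip-compressed JSON file like the `.json.gz` files under `sessions/`. The record fields (`id`, `summary`, `turns`) and function names are assumptions for illustration, not MindVault's actual schema.

```python
import gzip
import json
from pathlib import Path


def save_session(session_id, turns, summary, directory="sessions"):
    """Write one session as gzip-compressed JSON and return its path."""
    path = Path(directory) / f"{session_id}.json.gz"
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"id": session_id, "summary": summary, "turns": turns}
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        json.dump(record, handle)
    return path


def load_session(path):
    """Read a session back so the conversation can be resumed."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)
```

Loading the file restores the full turn history, while the short summary is what would be embedded into the compressed memory layer.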

Configuration

All settings in mindvault/config.py:

Variable	Default	What it controls
`LLM_MODEL`	`llama3.2`	Model for summarization, chat, extraction
`EMBEDDING_MODEL`	`nomic-embed-text`	Vector search embeddings
`CHAT_TOP_K`	`8`	Chunks retrieved per query
`COMPRESSED_SCORE_THRESHOLD`	`0.75`	Below this, also fetch raw chunks
`WEB_SEARCH_AUTO_THRESHOLD`	`0.45`	Auto web search below this memory score (0 = off)
`SUGGEST_FOLLOWUPS`	`True`	Suggest follow-up questions after each answer
`WRITE_SESSIONS_TO_VAULT`	`True`	Write session summary notes to Obsidian on exit
`CHAT_INCLUDE_PRIVATE`	`False`	Include private vault by default

Storage

Path	What
`brain.db`	SQLite: ingestion tracking, entities, links, importance scores
`.qdrant/`	Qdrant: vector index (raw + compressed collections)
`sessions/`	Compressed chat sessions (`.json.gz`)
`notes/`	Quick-captured notes via `/note` (indexed on next ingest)
`My Brain/`	Obsidian vault — business, projects, general knowledge
`Private Brain/`	Obsidian vault — personal content (separate collection)
`data-*/`	Export folders (excluded from git)

Privacy

All processing is local by default.
Web search uses DuckDuckGo's anonymous API — no account, no tracking.
My Brain and Private Brain are in separate Qdrant collections — private content is never implicitly included in responses.
.gitignore excludes all personal data: vaults, exports, sessions, databases.

Requirements

qdrant-client       vector database
httpx               HTTP client (LLM + web requests)
python-dotenv       .env file loading
pypdf               PDF ingestion
prompt_toolkit      TUI and interactive input
rich                markdown rendering in terminal
ddgs                web search (no API key)
trafilatura         web page content extraction
