Reads your Markdown vault and surfaces what you believe but never wrote down.
You've written 400+ daily notes. They link to each other. They were useful in the moment. None of them feed into what you're working on now. The claim from six months ago that the last four weeks of work depend on is still in there — you just can't find it. Basalt finds it.
You're a developer using Obsidian (or Logseq, or a folder of .md files)
as project context for Claude / Cursor / your editor of choice. You don't
want another note-taking methodology. You want your existing vault to do
more for you, without moving anything, and without sending it anywhere.
You've tried PARA, Zettelkasten, BASB, Smart Connections. They worked exactly as long as you were reading the book about them. Basalt has no methodology to impose — it sits on top of whatever you already have.
Basalt is the only second-brain compiler in this category that doesn't require Claude Code, doesn't write to your vault, and doesn't make a single network call in the Open tier. Standalone Python. Read-only. Local-first. Three load-bearing properties — every other shipped project in this space gives up at least one of them.
Basalt compiles a longitudinal model of you — claims, priorities, drift, theses — and exposes it as cognitive verbs. It sits atop your existing vault. It does not replace your editor. It runs locally.
The signature output is The First Brief: a single page with citation-grounded sections, each ending in a one-click commit. Four unlocks, all shipping: Implicit Thesis · Contradiction · Drift · Connection.
Status: Phase 0 complete. All four site-advertised unlocks ship in v0.0.8, plus the bonus 5th unlock (Buried Insight) and a Calibration Layer that grades past findings. Verified on a 1,683-note production vault. Phase 1 adds LLM synthesis (named theses, classifier-validated contradictions). See virtuosoai.dev/basalt.
- Python 3.12+
- Ollama running locally
- The embedding model: ollama pull nomic-embed-text
- A Markdown vault (Obsidian, Logseq, or a plain folder of .md files; anything works)
python3.12 -m venv .venv && source .venv/bin/activate
pip install basalt-vault
Or with MCP server:
pip install 'basalt-vault[mcp]'
Try the demo (no vault required; it uses a sample vault bundled with the repo):
basalt demo
Or set up your own vault. The interactive wizard handles it in ~10 seconds:
basalt init # interactive: vault path, ollama URL, embed model
basalt index # walk vault, embed every note (~1-3 min for 2k notes)
basalt brief --section all
On a first interactive basalt init, the wizard renders a sample Brief from the bundled vault right after "Set.", so you see what Basalt does before you've indexed a single note of your own.
If you'd rather skip the wizard, the env-var path still works:
basalt index --vault ~/path/to/your-vault
basalt brief --section all
Building from source:
git clone https://github.com/virtexvirtuoso/basalt.git && cd basalt && pip install -e .
Want Basalt inside Obsidian? The companion plugin lives in ./obsidian/. It spawns basalt-mcp and renders findings as a native Obsidian modal with clickable wikilinks. Manual install for now (Community Plugins submission pending). See obsidian/README.md for setup + the macOS GUI PATH gotcha.
| Verb | Site language | What it actually does |
|---|---|---|
| Implicit Thesis (v0) | "The thing you keep saying without realizing you're saying the same thing." | Tight-neighborhood (near-clique) clusters of 3-15 notes converging on an unnamed through-line. Centroid's load-bearing sentence is the proxy thesis statement |
| Contradiction (v0) | "The two notes you wrote that can't both be true." | Pairs of same-topic notes whose load-bearing sentences carry asymmetric negation, reversal markers, or polarity pairs (ship ↔ kill, works ↔ broken). v0 is heuristic — output is candidates, not verdicts |
| Drift | "What you say is the priority versus what you actually spent the week on." | Stated priority (project-folder note count) vs lived priority (daily-note mentions over a 30-day window). Surfaces the largest divergence — "Moonshot is your stated #9 but lived #1" |
| Connection | "The two ideas in different folders that turn out to be the same idea." | Pairs of notes across different top-level folders, no wikilink between them, embedding similarity ≥ 0.78 |
| Buried Insight | (bonus 5th unlock) | A note you wrote once and never returned to, that recent notes still cite — explicit links plus semantic validators |
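The Connection row is the easiest verb to picture in code. Here is a minimal sketch of the three filters it names (different top-level folder, no existing wikilink, similarity ≥ 0.78), written in pure Python so it runs anywhere. The function names and the shape of `embeddings` and `links` are hypothetical and are not taken from Basalt's source.

```python
from itertools import combinations
from math import sqrt

MIN_SIM = 0.78  # threshold quoted in the table above


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_folder(path):
    """First path component, e.g. '02-Projects' for '02-Projects/a.md'."""
    return path.split("/", 1)[0]


def connection_candidates(embeddings, links, min_sim=MIN_SIM):
    """Return (similarity, path_a, path_b) tuples, strongest first.

    embeddings: dict of note path -> vector (hypothetical shape)
    links: set of frozenset({path_a, path_b}) for existing wikilinks
    """
    found = []
    for a, b in combinations(sorted(embeddings), 2):
        if top_folder(a) == top_folder(b):
            continue  # same top-level folder: not a cross-folder connection
        if frozenset((a, b)) in links:
            continue  # already linked: nothing to surface
        sim = cosine(embeddings[a], embeddings[b])
        if sim >= min_sim:
            found.append((round(sim, 2), a, b))
    return sorted(found, reverse=True)


# Toy vectors standing in for real embeddings.
notes = {
    "01-Ideas/edge.md": [1.0, 0.0, 0.2],
    "02-Projects/bot.md": [0.9, 0.1, 0.2],
    "02-Projects/other.md": [0.0, 1.0, 0.0],
}
pairs = connection_candidates(notes, links=set())
```

Only the first two notes survive all three filters. A real vault would want a vectorized similarity scan rather than this pairwise loop.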
| Verb | Promotion | What it needs |
|---|---|---|
| Implicit Thesis v1 | named — not just a cluster | LLM synthesis pass over the v0 cluster — produces the through-line sentence the user hasn't written |
| Contradiction v1 | proven — not just a candidate | LLM pairwise compatibility classifier filtering v0 heuristic candidates |
| Drift v1 | auto-audited | Re-run drift on the current window during basalt audit — auto-falsify drift findings whose shares re-converged |
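Drift, shipped or auto-audited, comes down to comparing two rankings of the same projects: stated (project-folder note count) and lived (daily-note mentions in the window). A minimal sketch of that comparison follows. The function names and the tie-breaking by project name are assumptions for illustration, and the counts are made up.

```python
def ranks(counts):
    """Map each project to its 1-based rank, highest count first."""
    ordered = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}


def largest_drift(folder_note_counts, daily_mentions):
    """Return (project, stated_rank, lived_rank) for the widest rank gap."""
    stated = ranks(folder_note_counts)
    lived = ranks(daily_mentions)
    shared = sorted(stated.keys() & lived.keys())
    name = max(shared, key=lambda n: abs(stated[n] - lived[n]))
    return name, stated[name], lived[name]


# Hypothetical counts: notes per project folder vs mentions in daily notes.
folder_note_counts = {"Platform": 120, "Vault": 80, "Moonshot": 5}
daily_mentions = {"Moonshot": 22, "Platform": 6, "Vault": 1}

result = largest_drift(folder_note_counts, daily_mentions)
print(f"{result[0]} is your stated #{result[1]} but lived #{result[2]}")
```

Re-running the same function on a later window and checking whether the gap has closed is one way the re-convergence check in the Drift v1 row could work.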
| Command | What it does |
|---|---|
| basalt init | Interactive first-run wizard: vault, Ollama URL, embed model. Writes ~/.config/basalt/config.toml. Renders a sample Brief on first interactive run |
| basalt config show | Print resolved config with (env) / (file) / (default) source indicators |
| basalt doctor | 5-row health check: config, vault, ollama, model, index. Exits 1 if anything red |
| basalt demo | Index the bundled sample vault and run a brief; no setup, no vault needed |
| basalt index --vault PATH | Walk vault, parse frontmatter, build link graph, embed every note |
| basalt brief | Surface the strongest buried insight (default top 1) |
| basalt brief --section connection --top 3 | Surface 3 connections: same idea across folders, no wikilink |
| basalt brief --section contradiction --top 3 | Surface 3 contradiction candidates (v0) |
| basalt brief --section implicit-thesis --top 2 | Surface 2 implicit theses (v0 cluster heuristic) |
| basalt brief --section drift | Surface stated-vs-lived priority drift over the last 30 days |
| basalt brief --section all | Run every shipped verb in one pass: 4 unlocks + Buried Insight |
| basalt thesis / basalt drift / basalt connection / basalt contradiction | Convenience subcommands per verb |
| basalt audit | Re-evaluate pending findings against the current vault; see your track record |
| basalt about | Wordmark + the geological metaphor + schema version |
| basalt brief --strict-defaults | Buried Insight only: fixed 180/90/180 thresholds vs vault-age-aware |
Run basalt --help for everything.
THE BURIED INSIGHT
─────────────────────
vault age: 244d · thresholds: age≥122d dormant≥40d recent≤122d
On 2025-09-12 you wrote, in 02-Projects/SignalBot/HYPOTHESIS.md:
The sustainable edge isn't speed alone — it's speed + intelligence.
(callout body)
Since then, 4 notes link back.
You haven't returned to this claim since you wrote it.
→ 02-Projects/SignalBot/PHASE2.md (2026-03-14)
→ 02-Projects/SignalBot/BACKTEST.md (2026-03-21)
→ 02-Projects/SignalBot/CALIBRATION.md (2026-04-02)
→ 02-Projects/SignalBot/PRODUCTION-NOTES.md (2026-04-18)
▸ Promote to thesis ▸ Open all ▸ Snooze
CONNECTIONS (2)
─────────────────────
two ideas in different folders that turn out to be the same idea
01.
similarity 0.90 · no wikilink between them
A 09-AI-Context/whale-monitoring-infrastructure.md
1,330 orderflow tests, 2,808 derivatives tests — zero Bonferroni
survivors.
(blockquote summary)
B 02-Projects/Gem Hunter/_archived/WHALE_MOVEMENT_TRACKING.md
Whale activity analysis is automatically included in token scoring.
(first prose sentence)
▸ Link A ↔ B ▸ Open both ▸ Dismiss
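The thresholds line in the Buried Insight sample (vault age 244d, age≥122d, dormant≥40d, recent≤122d) differs from the fixed 180/90/180 values that --strict-defaults uses. A minimal sketch of how vault-age-aware windows could be derived follows. The ratios (half and one sixth of the vault age) are guesses chosen only because they reproduce that sample line. The floors, and the use of the fixed values as ceilings, are also assumptions, not Basalt's actual rules.

```python
from datetime import date, timedelta

# Fixed values from --strict-defaults, assumed here to be age/dormant/recent.
FIXED = {"age": 180, "dormant": 90, "recent": 180}
# Hypothetical floors so a very young vault still gets usable windows.
FLOORS = {"age": 30, "dormant": 14, "recent": 30}


def thresholds(oldest_note, today):
    """Return (vault_age_days, windows) with each window clamped."""
    vault_age = (today - oldest_note).days
    # Hypothetical ratios, picked only to reproduce the sample output.
    raw = {
        "age": vault_age // 2,
        "dormant": vault_age // 6,
        "recent": vault_age // 2,
    }
    windows = {k: max(FLOORS[k], min(FIXED[k], v)) for k, v in raw.items()}
    return vault_age, windows


oldest = date(2025, 9, 12)  # illustrative date only
vault_age, windows = thresholds(oldest, oldest + timedelta(days=244))
print(vault_age, windows)
```

With a 244-day vault this prints the same 122/40/122 windows as the sample. An older vault hits the fixed ceilings and a very young one hits the floors.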
Real numbers from a working 2,022-note Obsidian vault:
| Step | Cost | Notes |
|---|---|---|
| Parse + link graph | 1.1s for 2,022 notes / 9,284 links | Pure Python; deterministic |
| Embed (first run) | 143.5s for 636 notes uncached | Ollama nomic-embed-text, 6-way concurrent. ~4.4 notes/sec |
| Embed (cache hit) | 0s | Content-hash cache; unchanged notes skip the network call entirely |
| Brief — single section | ~1-2s on a built index | All five verbs read from SQLite + numpy; no embedding at brief-time |
| Brief — all five sections | ~5-10s | Connection + Implicit Thesis dominate (pairwise similarity scan) |
| Audit (re-evaluate pending findings) | <1s | SQL-only — no model calls |
| Wheel size | ~350 KB | Includes the wheel-shipped sample DB for the wizard preview |
On a 1,500-note vault the full Brief comfortably comes in under 90 seconds end-to-end (the Phase-0 acceptance target). The cache means daily Briefs after the first index are near-instant.
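The cache-hit row is why repeat runs are cheap. Below is a minimal sketch of a content-hash embedding cache, assuming a SHA-256 digest of the note text as the key and a one-table SQLite schema. Both are illustrative choices, not Basalt's actual schema, and `embed` is a stand-in for the call to Ollama.

```python
import hashlib
import json
import sqlite3


def content_hash(text):
    """Stable digest of a note's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_with_cache(db, notes, embed):
    """Embed only notes whose content hash is not cached yet.

    db: sqlite3 connection
    notes: dict of path -> note text
    embed: callable text -> list[float] (stand-in for the model call)
    Returns the number of embed() calls actually made.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS cache (hash TEXT PRIMARY KEY, vector TEXT)"
    )
    calls = 0
    for text in notes.values():
        digest = content_hash(text)
        hit = db.execute(
            "SELECT 1 FROM cache WHERE hash = ?", (digest,)
        ).fetchone()
        if hit:
            continue  # unchanged content: skip the model call entirely
        vector = embed(text)
        calls += 1
        db.execute(
            "INSERT INTO cache VALUES (?, ?)", (digest, json.dumps(vector))
        )
    db.commit()
    return calls


def fake_embed(text):
    """Toy embedding so the example runs without Ollama."""
    return [float(len(text))]


db = sqlite3.connect(":memory:")
vault = {"a.md": "alpha", "b.md": "beta"}
first = embed_with_cache(db, vault, fake_embed)   # both notes are new
second = embed_with_cache(db, vault, fake_embed)  # nothing changed
vault["a.md"] = "alpha, revised"
third = embed_with_cache(db, vault, fake_embed)   # one note edited
```

Keying on content rather than path or modification time means a renamed or touched-but-unchanged note still counts as a hit.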
Basalt v0.0.15 is honest about its rough edges. None of these are blockers for the wedge ("surface what you wrote and forgot"), but you should know them before relying on the output:
- Contradiction is v0 heuristic. It flags pairs with asymmetric negation, reversal markers, and polarity pairs. Real conflicts will show up; so will false positives where two unrelated notes both contain the word "actually." The CLI labels every contradiction as "v0 — verify before acting."
- Implicit Thesis is a cluster, not a sentence. The v0 verb finds dense neighborhoods of notes converging on something, but it doesn't synthesize the through-line — it returns the centroid note's load-bearing sentence as a proxy. Naming the thesis is Phase 1 (LLM-assisted, Pro tier, BYO-key).
- Drift tangles related project names. The word-boundary regex matches Virtuoso in Virtuoso Platform, Virtuoso Vault, etc. Longest-first matching helps but isn't perfect. If your projects share root names, expect some collapse.
- Stale-knowledge detection is not yet shipped. Basalt finds what you wrote and forgot; it doesn't yet flag what's labeled status: active but hasn't been touched in 90 days. That's the Stale verb (planned, see docs/superpowers/plans/2026-05-12-stale-verb.md).
- Embedding model is fixed in the Open tier. nomic-embed-text only. Better-but-bigger models (bge-m3, Qwen3-Embedding-8B) are gated to Pro for now to keep the no-network promise verifiable.
- Falsification rules are time-bounded. Many rules say "wrong if X within 30/60/90 days." Until that window elapses, the verdict stays pending. Your track record fills in over weeks, not minutes.
- Desktop-only for the plugin. The companion Obsidian plugin spawns a Python subprocess, so there is no mobile support.
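The Drift caveat above is easier to see in code. This is a minimal sketch of word-boundary matching with and without longest-first ordering, using the standard `re` module. `count_mentions` and its `longest_first` flag are hypothetical names, and the pattern is an illustration rather than Basalt's actual regex.

```python
import re
from collections import Counter


def count_mentions(text, projects, longest_first=True):
    """Count project-name mentions using one alternation pattern."""
    names = list(projects)
    if longest_first:
        # Alternation tries branches left to right, so longer names must
        # come first or "Virtuoso" swallows "Virtuoso Platform".
        names.sort(key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    return Counter(pattern.findall(text))


projects = ["Virtuoso", "Virtuoso Platform", "Virtuoso Vault"]
text = (
    "Shipped Virtuoso Platform today. "
    "Virtuoso Vault next. Virtuoso is growing."
)

collapsed = count_mentions(text, projects, longest_first=False)
separated = count_mentions(text, projects)

# One way the fix can still fall short: a plural breaks the word boundary
# after "Platform", so the match falls back to the shorter root name.
leaky = count_mentions("Virtuoso Platforms roadmap", projects)
```

Without the sort, all three mentions collapse into the root name; with it, each project gets its own count. The last call shows one kind of residual collapse to expect when projects share a root.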
| Layer | Tech |
|---|---|
| Substrate | Your existing Markdown vault — never moved, never modified without consent |
| Compiler | SQLite (notes, links, embeddings) + numpy similarity + content-hash incremental |
| Embeddings | Ollama nomic-embed-text by default; bge-m3 and Qwen3-Embedding-8B coming for the Pro tier |
| Verbs | Cognitive operations exposed as CLI commands and, through basalt-mcp, as MCP tools |
Every verb reuses the same primitives:
- Sentence-aware quote extraction: picks the punchline (em-dash, negation, conclusion-opener) over the setup line; strips Markdown noise; refuses cliffhangers (no quote ending in : or ,).
- Hub-note penalty: outgoing-link density per 100 words. Hard-excludes MOCs above 1.5; soft-penalizes the 0.5–1.5 gray zone.
- Vault-age-aware thresholds: Buried Insight derives age/dormancy windows from the oldest note's date, clamped to sensible floors and ceilings.
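As an illustration of the hub-note penalty, here is a minimal sketch that computes outgoing wikilinks per 100 words and turns the density into a score multiplier. The 0.5 and 1.5 cut-offs come from the list above. The linear ramp across the gray zone, the wikilink regex, and counting each link as one word are assumptions.

```python
import re

WIKILINK = re.compile(r"\[\[[^\]]+\]\]")


def link_density(markdown):
    """Outgoing wikilinks per 100 words (each link counts as one word)."""
    links = len(WIKILINK.findall(markdown))
    words = len(WIKILINK.sub(" link ", markdown).split())
    return 100.0 * links / words if words else 0.0


def hub_weight(density, soft=0.5, hard=1.5):
    """Score multiplier: 1.0 below soft, 0.0 above hard.

    The linear ramp from 1.0 down to 0.5 across the gray zone is an
    assumed shape; the source only says the zone is soft-penalized.
    """
    if density > hard:
        return 0.0  # hard-exclude: treated as a map-of-content note
    if density < soft:
        return 1.0  # ordinary note: no penalty
    return 1.0 - 0.5 * (density - soft) / (hard - soft)


# 98 plain words plus two wikilinks gives a density of 2.0 per 100 words.
moc_like = "word " * 98 + "[[Index]] [[Project hub]]"
density = link_density(moc_like)
weight = hub_weight(density)
```

Multiplying a finding's score by `hub_weight` keeps index-style notes from dominating results just because they link to everything.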
Basalt exposes its verb library as an MCP server, so any MCP-compatible client (Claude Desktop, Cursor, Cline, Zed, VS Code Copilot) can call Basalt's verbs as tools.
pip install 'basalt-vault[mcp]'
basalt-mcp --help
Wire into Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "basalt": {
      "command": "basalt-mcp"
    }
  }
}
Or with explicit paths:
{
  "mcpServers": {
    "basalt": {
      "command": "basalt-mcp",
      "args": ["--vault", "/path/to/vault", "--db", "/path/to/basalt.db"]
    }
  }
}
Then ask Claude: "run a Basalt brief on my vault, all sections, top 2."
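If you already have other MCP servers configured, hand-editing the JSON risks clobbering them. A minimal sketch of merging the basalt entry into an existing config file follows. `add_basalt_server` is a hypothetical helper, not part of Basalt, and the example writes to a temporary directory rather than your real Claude Desktop config.

```python
import json
import tempfile
from pathlib import Path


def add_basalt_server(config_path, vault=None, db=None):
    """Insert or replace the "basalt" entry, keeping other servers intact."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    entry = {"command": "basalt-mcp"}
    args = []
    if vault:
        args += ["--vault", str(vault)]
    if db:
        args += ["--db", str(db)]
    if args:
        entry["args"] = args  # the explicit-paths variant shown above

    config.setdefault("mcpServers", {})["basalt"] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already lists another server.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "claude_desktop_config.json"
    cfg.write_text(
        json.dumps({"mcpServers": {"other": {"command": "other-mcp"}}})
    )
    merged = add_basalt_server(
        cfg, vault="/path/to/vault", db="/path/to/basalt.db"
    )
    on_disk = json.loads(cfg.read_text())
```

Point `config_path` at the Claude Desktop path above only after backing the file up, and restart the client so it picks up the new entry.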
The MCP server exposes 4 tools:
| Tool | Maps to | Notes |
|---|---|---|
| basalt_brief | basalt brief | Buried Insight / Connection / Contradiction; returns finding objects with falsification rules |
| basalt_connection | basalt connection | Just connections, with min_sim knob |
| basalt_contradiction | basalt contradiction | v0 heuristic candidates |
| basalt_audit | basalt audit | Re-evaluates pending findings, returns track record |
The server is read-only on the vault — it never writes to your .md files.
Run basalt index from the CLI before pointing the MCP server at a fresh vault.
Local-first by default. Your vault is read from disk; embeddings are computed
by your local Ollama; the SQLite index lives at ~/.basalt/basalt.db. No
network calls leave your machine in the Open tier. See PRIVACY.md
for the full posture, and SECURITY.md for the threat model.
Open an issue first for anything non-trivial. Follow the existing module shape — small, named, single-purpose.
The wizard's first-run preview reads from src/basalt/data/demo.db, which is
committed to the repo. If you edit examples/sample-vault/ or change the
embedding model, regenerate it:
python scripts/build_demo_db.py
Requires Ollama running with nomic-embed-text pulled.
MIT — Fernando Villar / Virtuoso Crypto, 2026.

