Fast, schema-enforced wiki search & write — exposed as MCP tools. Drop-in for Claude Code, Cursor, Copilot, Windsurf, Zed.
markdown vault ──► SQLite FTS5 index ──► MCP tools ──► any AI agent
(~10ms BM25 search) (12 tools)
LLM agents waste tokens reading whole markdown files when they just need a snippet. wiki-mcp exposes a tiny set of tools so the agent can:
- Search with BM25 ranking → ~250 tokens vs ~15K from grep+cat
- Write with enforced schema → no tag drift, no orphan notes, no >150-line ramble files
- Look up valid tags, frontmatter templates, backlinks, stats
Same engine that powers the Hermes wiki search — now portable as an MCP server.
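As a rough illustration of the search path (not the code in wiki_index.py), here is a minimal sketch of BM25-ranked snippet search using Python's built-in `sqlite3` module. It assumes your Python's SQLite build includes FTS5; the table name `notes` and the helper `build_index` are hypothetical.

```python
import sqlite3

def build_index(notes):
    """Create an in-memory FTS5 index from a {path: body} mapping."""
    db = sqlite3.connect(":memory:")
    # path is stored but not tokenized; body is the searchable column.
    db.execute("CREATE VIRTUAL TABLE notes USING fts5(path UNINDEXED, body)")
    db.executemany("INSERT INTO notes(path, body) VALUES (?, ?)", notes.items())
    return db

def search(db, query, limit=5):
    """Return (path, snippet) pairs, best BM25 match first."""
    sql = """
        SELECT path, snippet(notes, 1, '[', ']', '...', 24)
        FROM notes
        WHERE notes MATCH ?
        ORDER BY bm25(notes)  -- lower score means a better match in FTS5
        LIMIT ?
    """
    return db.execute(sql, (query, limit)).fetchall()

if __name__ == "__main__":
    db = build_index({
        "runbooks/jenkins.md": "Jenkins args4j auth bypass lets attackers read files",
        "concepts/waf.md": "A WAF filters HTTP traffic",
    })
    for path, snippet in search(db, "auth bypass"):
        print(path, "->", snippet)
```

The agent only ever sees the short snippet rows, which is where the token savings over reading whole files come from.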
git clone https://github.com/patchmyday/wiki-mcp.git
cd wiki-mcp
pip install mcp --break-system-packages # if not already installed
# Point at any folder of markdown files
export WIKI_DIR=$HOME/Documents/notes
export WIKI_INDEX_DB=$HOME/.wiki-mcp/wiki.db
python3 server.py # stdio MCP server, ready
Add to Claude Code:
claude mcp add wiki \
-e WIKI_DIR=$HOME/Documents/notes \
-e WIKI_INDEX_DB=$HOME/.wiki-mcp/wiki.db \
-- python3 $(pwd)/server.py
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"wiki": {
"command": "python3",
"args": ["/absolute/path/to/wiki-mcp/server.py"],
"env": {
"WIKI_DIR": "/path/to/your/vault",
"WIKI_INDEX_DB": "/path/to/wiki.db"
}
}
}
}
Then ask your agent: "Search the wiki for auth bypass." It will call `search()` automatically.
wiki-mcp/
├── server.py # MCP server — 12 tools, FastMCP wrapper
├── wiki_index.py # SQLite FTS5 + BM25 indexer
├── wiki_writer.py # Schema enforcement + write tools
├── ARCHITECTURE.md # Data flow, design rationale
├── USAGE.md # Per-tool examples + LLM workflows
├── examples/
│ └── SCHEMA.md # Sample taxonomy file for your vault
└── README.md # You are here
| Tool | Description |
|---|---|
| `search(query, limit=5)` | BM25-ranked snippets, 24-word context |
| `get_note(path)` | Full markdown body |
| `backlinks(path)` | Notes linking to this one |
| `list_tags()` | All #tags in vault with counts |
| `taxonomy()` | Valid tags + types from SCHEMA.md |
| `stats()` | Note count, db size, index health |
| `stubs(limit=20)` | Knowledge gaps: wikilinks to non-existent notes |
| `recent(days=7, limit=20)` | Recently modified notes |
| `orphans(limit=30)` | Notes with zero incoming links |
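The three link-based tools (`backlinks`, `stubs`, `orphans`) can all be derived from one pass over the `[[wikilink]]` targets. The sketch below is an assumption about how that might look, not the server's code: it keys notes by title rather than path, and the `link_graph` helper is hypothetical.

```python
import re
from collections import defaultdict

# Capture the target of [[Target]], [[Target|alias]] or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def link_graph(notes):
    """Map each link target to the set of note titles that link to it."""
    incoming = defaultdict(set)
    for title, body in notes.items():
        for target in WIKILINK.findall(body):
            incoming[target.strip()].add(title)
    return incoming

def backlinks(notes, title):
    """Notes linking to the given title."""
    return sorted(link_graph(notes).get(title, ()))

def stubs(notes):
    """Link targets that have no note yet (knowledge gaps)."""
    return sorted(t for t in link_graph(notes) if t not in notes)

def orphans(notes):
    """Existing notes with zero incoming links."""
    incoming = link_graph(notes)
    return sorted(t for t in notes if not incoming.get(t))

if __name__ == "__main__":
    vault = {
        "Jenkins args4j auth bypass": "See [[F5 BIG-IP WAF]] and [[CVE Hunting]].",
        "F5 BIG-IP WAF": "Related: [[CVE Hunting|hunting notes]]",
    }
    print(backlinks(vault, "F5 BIG-IP WAF"))
    print(stubs(vault))
    print(orphans(vault))
```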
| Tool | Description |
|---|---|
| `frontmatter_template(type)` | Starter skeleton per note type |
| `lint_note(body)` | Validate against schema, no write |
| `write_note(folder, title, body, type, tags, ...)` | Create new note (auto-sets author from `WIKI_AUTHOR`) |
| `update_note(path, body?, add_tags?)` | Patch existing |
| `append_section(path, section_title, content)` | Append `##` section |
| Tool | Description |
|---|---|
| `format_note(path, dry_run=true)` | Auto-fix frontmatter (title, type, dates, H1, wikilinks) |
| `format_vault(dry_run=true)` | Bulk scan + fix all notes |
| `suggest_split(path)` | Propose split points for oversized (>150 line) notes |
| `health()` | Team dashboard: compliance %, type distribution, author coverage, tag drift |
| `reindex(full=false)` | Rebuild FTS index (incremental by default) |
Full per-tool reference w/ examples → see USAGE.md.
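To make the idea behind `suggest_split` concrete, here is one plausible heuristic, sketched under the assumption that split points are the `## ` headings of a note longer than 150 lines. The real tool takes a path and may use different rules; this version takes the note text directly.

```python
def suggest_split(body, max_lines=150):
    """Return (line_number, heading) pairs where an oversized note could be split."""
    lines = body.splitlines()
    if len(lines) <= max_lines:
        return []  # small enough, nothing to propose
    points = []
    in_fence = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("```"):
            in_fence = not in_fence  # ignore headings inside code fences
        elif not in_fence and line.startswith("## "):
            points.append((number, line[3:].strip()))
    return points

if __name__ == "__main__":
    big = "# Big note\n## Part A\n" + "text\n" * 100 + "## Part B\n" + "text\n" * 100
    for line_number, heading in suggest_split(big):
        print(f"line {line_number}: split off '{heading}'")
```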
Markdown files w/ YAML frontmatter:
---
title: Jenkins args4j auth bypass
created: 2026-04-27
updated: 2026-04-27
type: runbook
tags: [waf, runbook, vulnerability]
sources: [https://...]
---
# Jenkins args4j auth bypass
CVE-2024-23897 lets `@filename` syntax read any file.
## Steps
1. ...
## Related
[[F5 BIG-IP WAF]] · [[CVE Hunting]]
A SCHEMA.md at vault root defines your tag taxonomy. Edit it once; new tags are accepted on the next call (mtime-cached). See examples/SCHEMA.md.
- ✅ Required frontmatter: `title`, `created`, `updated`, `type`, `tags`
- ✅ `type` ∈ {entity, concept, comparison, query, runbook, decision, journal}
- ✅ Tags must exist in your `SCHEMA.md` taxonomy
- ✅ Dates: `YYYY-MM-DD`
- ✅ ≥1 outbound `[[wikilink]]`
- ✅ Body ≤150 lines (forces split)
- ✅ H1 present at top
Lint catches all of these before write — agent self-corrects without you babysitting.
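The rules above translate fairly directly into checks. The following is a simplified sketch, not the code in wiki_writer.py: it assumes the note starts with a flat `key: value` frontmatter block, parses it by hand instead of using a YAML library, and takes the taxonomy as a plain set. The `parse_note` helper and the error strings are hypothetical.

```python
import re

REQUIRED = ("title", "created", "updated", "type", "tags")
TYPES = {"entity", "concept", "comparison", "query", "runbook", "decision", "journal"}
DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
WIKILINK = re.compile(r"\[\[[^\]]+\]\]")

def parse_note(text):
    """Split a note into a flat frontmatter dict and the markdown body."""
    _, header, body = text.split("---\n", 2)
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body

def lint_note(text, taxonomy, max_lines=150):
    """Return a list of rule violations; an empty list means the note passes."""
    meta, body = parse_note(text)
    errors = [f"missing field: {k}" for k in REQUIRED if not meta.get(k)]
    if meta.get("type") and meta["type"] not in TYPES:
        errors.append(f"unknown type: {meta['type']}")
    tags = [t.strip() for t in meta.get("tags", "").strip("[]").split(",") if t.strip()]
    errors += [f"tag not in taxonomy: {t}" for t in tags if t not in taxonomy]
    for field in ("created", "updated"):
        if meta.get(field) and not DATE.fullmatch(meta[field]):
            errors.append(f"{field} is not YYYY-MM-DD")
    lines = body.strip().splitlines()
    if not lines or not lines[0].startswith("# "):
        errors.append("missing H1 at top")
    if len(lines) > max_lines:
        errors.append(f"body exceeds {max_lines} lines")
    if not WIKILINK.search(body):
        errors.append("no outbound [[wikilink]]")
    return errors
```

Returning every violation at once, rather than failing on the first, gives the agent enough information to fix the note in a single retry.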
wiki-mcp is designed to scale from a personal vault to a shared team knowledge base.
| Variable | Default | Purpose |
|---|---|---|
| `WIKI_DIR` | `/tmp/wiki` | Path to markdown vault |
| `WIKI_INDEX_DB` | `./wiki.db` | SQLite FTS5 index location |
| `WIKI_AUTHOR` | (empty) | Auto-set `author:` field on new notes (e.g. your username) |
| `WIKI_TRANSPORT` | `stdio` | `stdio` for local, `http` for team server |
| `WIKI_PORT` | `8787` | HTTP port when `WIKI_TRANSPORT=http` |
# Start a team wiki server
WIKI_DIR=/shared/team-wiki \
WIKI_INDEX_DB=/shared/wiki.db \
WIKI_TRANSPORT=http \
WIKI_PORT=8787 \
python3 server.py
Then each team member connects via their client's MCP config:
{
"mcpServers": {
"wiki": {
"type": "http",
"url": "http://wiki-server:8787/mcp"
}
}
}
Run `health()` in your agent to get a team dashboard:
- Schema compliance % across all notes
- Author contribution breakdown
- Tag drift (used tags not in taxonomy)
- Oversized notes needing splits
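For a sense of what such a dashboard aggregates, here is a hedged sketch that computes the four metrics from a list of per-note metadata dicts. The dict keys (`path`, `author`, `tags`, `lines`, `compliant`) and the report field names are assumptions made for this example; the real `health()` output may be shaped differently.

```python
from collections import Counter

def health(notes, taxonomy, max_lines=150):
    """Summarise a vault from a list of per-note metadata dicts."""
    total = len(notes)
    compliant = sum(1 for n in notes if n["compliant"])
    used = Counter(tag for n in notes for tag in n["tags"])
    return {
        "compliance_pct": round(100 * compliant / total, 1) if total else 0.0,
        "by_author": dict(Counter(n.get("author") or "(none)" for n in notes)),
        # Tag drift: tags used in notes but absent from the taxonomy.
        "tag_drift": sorted(t for t in used if t not in taxonomy),
        "oversized": sorted(n["path"] for n in notes if n["lines"] > max_lines),
    }

if __name__ == "__main__":
    sample = [
        {"path": "a.md", "author": "jason", "tags": ["waf"], "lines": 40, "compliant": True},
        {"path": "b.md", "author": "", "tags": ["waf", "wafs"], "lines": 200, "compliant": False},
    ]
    print(health(sample, taxonomy={"waf"}))
```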
Tested on a 258-note / 30 MB vault:
| Metric | Value | Comparison |
|---|---|---|
| Search P50 | ~12 ms | 38× faster than grep + difflib |
| Search P95 | ~18 ms | 40× faster |
| Phrase search | ~6 ms | 118× faster |
| Initial index build | ~240 ms | one-time |
| Incremental reindex | ~5 ms | mtime-based delta |
| DB size | 2.2 MB | ~7% of vault size |
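The "mtime-based delta" in the incremental reindex row can be pictured as follows: compare each file's modification time with the value recorded at the last index run, and touch only what differs. This is a sketch under that assumption; the function name `changed_files` and the `known_mtimes` mapping are hypothetical, and the real indexer presumably keeps these timestamps in wiki.db.

```python
import os

def changed_files(vault_dir, known_mtimes):
    """Compare the vault on disk with stored mtimes.

    Returns (stale, removed): paths to (re)index and paths to drop from the index.
    """
    seen = {}
    for root, _dirs, files in os.walk(vault_dir):
        for name in files:
            if name.endswith(".md"):
                path = os.path.join(root, name)
                seen[path] = os.stat(path).st_mtime
    # New files are missing from known_mtimes, so .get() returns None and they count as stale.
    stale = sorted(p for p, m in seen.items() if known_mtimes.get(p) != m)
    removed = sorted(p for p in known_mtimes if p not in seen)
    return stale, removed
```

Because unchanged files are skipped after a single `stat` call, the cost of a reindex scales with the number of edits rather than the size of the vault.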
See ARCHITECTURE.md for full diagrams + design rationale.
Quick mental model:
┌─────────┐ "find auth bypass" ┌──────────────┐
│ YOU │ ───────────────────────► │ Agent │
└─────────┘ │ (Claude/ │
▲ │ Cursor/…) │
│ └──────┬───────┘
│ │ MCP stdio
│ ▼
│ ┌──────────────┐
│ │ wiki-mcp │
│ ranked snippets │ server.py │
└───────────────────────────────│ (Python) │
└──────┬───────┘
│
▼
┌──────────────┐
│ wiki.db │
│ SQLite FTS5 │
└──────────────┘
No cloud. No daemon. No keys. Just a local subprocess your AI talks to.
- `recent(days=7)` tool: surface fresh notes
- HTTP transport variant for team deployments
- `stubs()`: knowledge gap detection via orphan wikilinks
- `health()`: team dashboard with compliance metrics
- `format_note` / `format_vault`: auto-fix frontmatter at scale
- `suggest_split()`: oversized note split proposals
- `WIKI_AUTHOR`: team attribution on writes
- `setup.sh`: cross-platform auto-installer
- Tag/folder filter for `search`
- Optional vector reranking (Qwen3-0.6B local)
- `mcp-atlassian` composition example (JIRA + Confluence)
- Token-budgeted result trimming
- Multi-vault federation (shared taxonomy, per-team vaults)
- Activity feed SSE endpoint for team dashboards
- Git-backed audit log (who changed what, when)
Part of the PatchMyDay toolset by Jason Zhang. WAFs by day, AI agents by night.