wiki-mcp

Fast, schema-enforced wiki search & write — exposed as MCP tools. Drop-in for Claude Code, Cursor, Copilot, Windsurf, Zed.

markdown vault  ──►  SQLite FTS5 index  ──►  MCP tools  ──►  any AI agent
                     (~10ms BM25 search)     (19 tools)


Why

LLM agents waste tokens reading whole markdown files when they just need a snippet. wiki-mcp exposes a tiny set of tools so the agent can:

  • Search with BM25 ranking → ~250 tokens vs ~15K from grep + cat
  • Write with enforced schema → no tag drift, no orphan notes, no >150-line ramble files
  • Look up valid tags, frontmatter templates, backlinks, stats

Same engine that powers the Hermes wiki search — now portable as an MCP server.
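The search path described above can be sketched with the standard-library `sqlite3` module. This is a minimal illustration, not the code in wiki_index.py: the table name `notes`, its two columns, and the function names are assumptions made for the example, and it needs an SQLite build with FTS5 enabled.

```python
import sqlite3


def build_index(notes: dict[str, str]) -> sqlite3.Connection:
    """Index {path: body} into an in-memory FTS5 table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # `path` is stored but not tokenized; only `body` is searchable.
    conn.execute("CREATE VIRTUAL TABLE notes USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO notes (path, body) VALUES (?, ?)", notes.items())
    return conn


def search_snippets(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Return (path, snippet) pairs, best BM25 match first."""
    sql = """
        SELECT path, snippet(notes, 1, '[', ']', '...', 24)
        FROM notes
        WHERE notes MATCH ?
        ORDER BY bm25(notes)   -- FTS5 returns lower scores for better matches
        LIMIT ?
    """
    return conn.execute(sql, (query, limit)).fetchall()


if __name__ == "__main__":
    conn = build_index({
        "runbooks/jenkins.md": "Jenkins args4j auth bypass lets attackers read files",
        "journal/garden.md": "Notes about tomatoes",
    })
    for path, snippet in search_snippets(conn, "auth bypass"):
        print(path, "->", snippet)
```

The `snippet()` call caps each result at 24 tokens, which is what keeps the response small compared with returning whole files. Raw user input containing FTS5 operators or punctuation may need quoting before it is passed to `MATCH`.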

Quickstart (60 seconds)

git clone https://github.com/patchmyday/wiki-mcp.git
cd wiki-mcp
pip install mcp --break-system-packages   # if not already installed

# Point at any folder of markdown files
export WIKI_DIR=$HOME/Documents/notes
export WIKI_INDEX_DB=$HOME/.wiki-mcp/wiki.db

python3 server.py   # stdio MCP server, ready

Wire into Claude Code

# Options go before the `--` separator; everything after it is the server command
claude mcp add wiki \
  -e WIKI_DIR=$HOME/Documents/notes \
  -e WIKI_INDEX_DB=$HOME/.wiki-mcp/wiki.db \
  -- python3 $(pwd)/server.py

Wire into Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "wiki": {
      "command": "python3",
      "args": ["/absolute/path/to/wiki-mcp/server.py"],
      "env": {
        "WIKI_DIR": "/path/to/your/vault",
        "WIKI_INDEX_DB": "/path/to/wiki.db"
      }
    }
  }
}

Then ask your agent: "Search the wiki for auth bypass." It'll call search() automatically.


What's inside

wiki-mcp/
├── server.py            # MCP server: 19 tools, FastMCP wrapper
├── wiki_index.py        # SQLite FTS5 + BM25 indexer
├── wiki_writer.py       # Schema enforcement + write tools
├── ARCHITECTURE.md      # Data flow, design rationale
├── USAGE.md             # Per-tool examples + LLM workflows
├── examples/
│   └── SCHEMA.md        # Sample taxonomy file for your vault
└── README.md            # You are here

19 tools at a glance

🔍 Read (9)

| Tool | Description |
|---|---|
| search(query, limit=5) | BM25-ranked snippets, 24-word context |
| get_note(path) | Full markdown body |
| backlinks(path) | Notes linking to this one |
| list_tags() | All #tags in vault w/ counts |
| taxonomy() | Valid tags + types from SCHEMA.md |
| stats() | Note count, db size, index health |
| stubs(limit=20) | Knowledge gaps: wikilinks to non-existent notes |
| recent(days=7, limit=20) | Recently modified notes |
| orphans(limit=30) | Notes with zero incoming links |
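The three link-based tools (backlinks, stubs, orphans) all follow from one wikilink graph. The sketch below shows the idea on an in-memory dict. It assumes notes are keyed by title and that `[[Target]]`, `[[Target|alias]]` and `[[Target#heading]]` all resolve to `Target`; the function names are hypothetical stand-ins, not the server's implementation.

```python
import re

# Captures the target of [[Target]], [[Target|alias]] or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def link_graph(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each note title to the set of titles it links to."""
    return {
        title: {target.strip() for target in WIKILINK.findall(body)}
        for title, body in notes.items()
    }


def find_backlinks(graph: dict[str, set[str]], title: str) -> list[str]:
    """Notes whose outbound links include `title`."""
    return sorted(src for src, targets in graph.items() if title in targets)


def find_stubs(graph: dict[str, set[str]]) -> list[str]:
    """Link targets for which no note exists yet."""
    linked = set().union(*graph.values())
    return sorted(linked - set(graph))


def find_orphans(graph: dict[str, set[str]]) -> list[str]:
    """Existing notes that nothing links to."""
    linked = set().union(*graph.values())
    return sorted(set(graph) - linked)
```

A stub is a link with no note behind it, while an orphan is a note with no link pointing at it, so the two lists are set differences taken in opposite directions.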

✍️ Write (5) — schema-enforced

| Tool | Description |
|---|---|
| frontmatter_template(type) | Starter skeleton per note type |
| lint_note(body) | Validate against schema, no write |
| write_note(folder, title, body, type, tags, ...) | Create new note (auto-sets author from WIKI_AUTHOR) |
| update_note(path, body?, add_tags?) | Patch existing |
| append_section(path, section_title, content) | Append ## section |

🔧 Maintenance (5)

| Tool | Description |
|---|---|
| format_note(path, dry_run=true) | Auto-fix frontmatter (title, type, dates, H1, wikilinks) |
| format_vault(dry_run=true) | Bulk scan + fix all notes |
| suggest_split(path) | Propose split points for oversized (>150 line) notes |
| health() | Team dashboard: compliance %, type distribution, author coverage, tag drift |
| reindex(full=false) | Rebuild FTS index; incremental by default |

Full per-tool reference w/ examples → see USAGE.md.
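As an illustration of what a split proposal for an oversized note could look like, here is a minimal sketch that lists each `## ` heading as a candidate split point once a body exceeds 150 lines. The function name `split_points` and the heading-only heuristic are assumptions for the example; the real suggest_split(path) may use different rules.

```python
MAX_LINES = 150  # body limit taken from the schema rules


def split_points(body: str, max_lines: int = MAX_LINES) -> list[tuple[int, str]]:
    """Return (line_number, heading) for every '## ' heading, or [] if the note fits.

    Line numbers are 1-based. Headings inside fenced code blocks are not
    filtered out in this sketch.
    """
    lines = body.splitlines()
    if len(lines) <= max_lines:
        return []
    return [
        (number, line[3:].strip())
        for number, line in enumerate(lines, start=1)
        if line.startswith("## ")
    ]
```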


Vault format

Markdown files w/ YAML frontmatter:

---
title: Jenkins args4j auth bypass
created: 2026-04-27
updated: 2026-04-27
type: runbook
tags: [waf, runbook, vulnerability]
sources: [https://...]
---

# Jenkins args4j auth bypass

CVE-2024-23897 lets `@filename` syntax read any file.

## Steps
1. ...

## Related
[[F5 BIG-IP WAF]] · [[CVE Hunting]]

A SCHEMA.md at vault root defines your tag taxonomy. The taxonomy is cached by file mtime, so after you edit SCHEMA.md, new tags are accepted on the next tool call. See examples/SCHEMA.md.
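The mtime caching mentioned above can be sketched as follows. The helper name `load_cached` and the caller-supplied `parse` callback are assumptions for illustration; the format of SCHEMA.md is deliberately left to the callback.

```python
import os
from typing import Callable

# path -> (mtime in nanoseconds, parsed value)
_cache: dict[str, tuple[int, object]] = {}


def load_cached(path: str, parse: Callable[[str], object]) -> object:
    """Re-read and re-parse `path` only when its modification time has changed."""
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]
    with open(path, encoding="utf-8") as fh:
        value = parse(fh.read())
    _cache[path] = (mtime, value)
    return value
```

Each call costs one `stat()`, so an unchanged file is never re-parsed while an edited one is picked up on the next call.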

Schema rules (auto-enforced on write)

  • ✅ Required frontmatter: title, created, updated, type, tags
  • ✅ type ∈ {entity, concept, comparison, query, runbook, decision, journal}
  • ✅ Tags must exist in your SCHEMA.md taxonomy
  • ✅ Dates: YYYY-MM-DD
  • ✅ ≥1 outbound [[wikilink]]
  • ✅ Body ≤150 lines (forces split)
  • ✅ H1 present at top

Lint catches all of these before write — agent self-corrects without you babysitting.


Team Deployment

wiki-mcp is designed to scale from personal vault to shared team knowledge base.

Environment variables

| Variable | Default | Purpose |
|---|---|---|
| WIKI_DIR | /tmp/wiki | Path to markdown vault |
| WIKI_INDEX_DB | ./wiki.db | SQLite FTS5 index location |
| WIKI_AUTHOR | (empty) | Auto-set author: field on new notes (e.g. your username) |
| WIKI_TRANSPORT | stdio | stdio for local, http for team server |
| WIKI_PORT | 8787 | HTTP port when WIKI_TRANSPORT=http |
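For a wrapper script or test harness, the variables and defaults in the table can be read like this. The `Settings` class and `load_settings` function are hypothetical helpers written for this example; server.py may read its configuration differently.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    wiki_dir: str
    index_db: str
    author: str
    transport: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the WIKI_* variables, falling back to the defaults in the table."""
    transport = env.get("WIKI_TRANSPORT", "stdio")
    if transport not in ("stdio", "http"):
        raise ValueError(f"WIKI_TRANSPORT must be 'stdio' or 'http', got {transport!r}")
    return Settings(
        wiki_dir=env.get("WIKI_DIR", "/tmp/wiki"),
        index_db=env.get("WIKI_INDEX_DB", "./wiki.db"),
        author=env.get("WIKI_AUTHOR", ""),
        transport=transport,
        port=int(env.get("WIKI_PORT", "8787")),
    )
```

Because the default index path `./wiki.db` is relative, it resolves against the directory the server was started from; an absolute WIKI_INDEX_DB avoids ending up with one index per working directory.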

Shared wiki server (HTTP transport)

# Start a team wiki server
WIKI_DIR=/shared/team-wiki \
WIKI_INDEX_DB=/shared/wiki.db \
WIKI_TRANSPORT=http \
WIKI_PORT=8787 \
python3 server.py

Then each team member connects via their client's MCP config:

{
  "mcpServers": {
    "wiki": {
      "type": "http",
      "url": "http://wiki-server:8787/mcp"
    }
  }
}

Team quality monitoring

Run health() in your agent to get a team dashboard:

  • Schema compliance % across all notes
  • Author contribution breakdown
  • Tag drift (used tags not in taxonomy)
  • Oversized notes needing splits

Performance

Tested on a 258-note / 30 MB vault:

| Metric | Value | Comparison |
|---|---|---|
| Search P50 | ~12 ms | 38× faster than grep + difflib |
| Search P95 | ~18 ms | 40× faster |
| Phrase search | ~6 ms | 118× faster |
| Initial index build | ~240 ms | one-time |
| Incremental reindex | ~5 ms | mtime-based delta |
| DB size | 2.2 MB | ~7% of vault size |
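The "mtime-based delta" behind the incremental reindex figure can be sketched as a comparison between the vault on disk and the modification times recorded at the last run. The function name `changed_files` and the `{path: mtime}` bookkeeping format are assumptions for illustration, not the layout used by wiki_index.py.

```python
import os


def changed_files(vault: str, indexed: dict[str, float]) -> tuple[list[str], list[str]]:
    """Compare the vault against {path: mtime} saved by the previous run.

    Returns (paths to re-index, paths to drop from the index).
    """
    current: dict[str, float] = {}
    for root, _dirs, files in os.walk(vault):
        for name in files:
            if name.endswith(".md"):
                path = os.path.join(root, name)
                current[path] = os.path.getmtime(path)
    # New files have no recorded mtime; edited files have a different one.
    stale = sorted(p for p, mtime in current.items() if indexed.get(p) != mtime)
    removed = sorted(p for p in indexed if p not in current)
    return stale, removed
```

Only the files in `stale` need to be re-read and re-tokenized, which is why an incremental pass over a mostly unchanged vault costs far less than the initial build.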

Architecture

See ARCHITECTURE.md for full diagrams + design rationale.

Quick mental model:

┌─────────┐    "find auth bypass"    ┌──────────────┐
│  YOU    │ ───────────────────────► │  Agent       │
└─────────┘                          │  (Claude/    │
     ▲                               │   Cursor/…)  │
     │                               └──────┬───────┘
     │                                      │ MCP stdio
     │                                      ▼
     │                               ┌──────────────┐
     │                               │  wiki-mcp    │
     │  ranked snippets              │  server.py   │
     └───────────────────────────────│  (Python)    │
                                     └──────┬───────┘
                                            │
                                            ▼
                                     ┌──────────────┐
                                     │  wiki.db     │
                                     │  SQLite FTS5 │
                                     └──────────────┘

No cloud. No daemon. No keys. Just a local subprocess your AI talks to.


Roadmap

  • recent(days=7) tool — surface fresh notes
  • HTTP transport variant for team deployments
  • stubs() — knowledge gap detection via orphan wikilinks
  • health() — team dashboard with compliance metrics
  • format_note/vault — auto-fix frontmatter at scale
  • suggest_split() — oversized note split proposals
  • WIKI_AUTHOR — team attribution on writes
  • setup.sh — cross-platform auto-installer
  • Tag/folder filter for search
  • Optional vector reranking (Qwen3-0.6B local)
  • mcp-atlassian composition example (JIRA + Confluence)
  • Token-budgeted result trimming
  • Multi-vault federation (shared taxonomy, per-team vaults)
  • Activity feed SSE endpoint for team dashboards
  • Git-backed audit log (who changed what, when)

Built by

Part of the PatchMyDay toolset by Jason Zhang. WAFs by day, AI agents by night.

License

MIT
