A from-scratch MCP (Model Context Protocol) server that gives an LLM agent (e.g. Claude Code) structured tools over a folder of markdown notes — a temporal knowledge graph plus hybrid retrieval — implemented in pure-stdlib Python, zero external dependencies.
- MCP server built from scratch: JSON-RPC 2.0 over stdio (newline-delimited), with `initialize` / `tools/list` / `tools/call` / `ping`, and 10 tools advertised via full JSON Schema `inputSchema`.
- Hybrid retrieval: a hand-written TF-IDF engine (sublinear TF, smoothed IDF, sparse vectors, cosine similarity) fused with SQLite FTS5 BM25, with tunable semantic/keyword weights. Section-level markdown chunking, query expansion, and bilingual tokenization. (Classical IR: no embeddings or vector DB.)
- Temporal knowledge graph: subject-predicate-object triples with `valid_from` / `valid_to` windows, automatic invalidation on change, and `as_of` / timeline queries (SQLite + FTS5 with sync triggers).
- No external dependencies: the entire core is Python stdlib (`json`, `sqlite3`, `re`, `math`, `hashlib`).
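To make the transport concrete, here is a minimal sketch of a newline-delimited JSON-RPC 2.0 loop handling `initialize`, `ping`, `tools/list`, and `tools/call`. It is not the code in `mcp_server.py`: the `echo` tool, the `TOOLS` registry, and the `handle` / `serve` function names are hypothetical, and echoing back the client's `protocolVersion` is a simplification rather than real version negotiation.

```python
import json

# Hypothetical registry: tool name -> description, JSON Schema, handler.
TOOLS = {
    "echo": {
        "description": "Return the given text unchanged.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "handler": lambda args: args["text"],
    }
}


def error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": code, "message": message}}


def handle(request):
    """Dispatch one request dict; return a response dict, or None."""
    req_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}
    if req_id is None:
        return None  # notifications carry no id and get no reply
    if method == "initialize":
        result = {
            # Simplification: echo the client's version instead of negotiating.
            "protocolVersion": params.get("protocolVersion", ""),
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "sketch-server", "version": "0.0.0"},
        }
    elif method == "ping":
        result = {}
    elif method == "tools/list":
        result = {"tools": [
            {"name": name, "description": tool["description"],
             "inputSchema": tool["inputSchema"]}
            for name, tool in TOOLS.items()
        ]}
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return error(req_id, -32602, "Unknown tool")
        text = tool["handler"](params.get("arguments") or {})
        result = {"content": [{"type": "text", "text": str(text)}],
                  "isError": False}
    else:
        return error(req_id, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": req_id, "result": result}


def serve(stdin, stdout):
    """Read one JSON message per line, write one JSON response per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            response = error(None, -32700, "Parse error")
        else:
            if isinstance(request, dict):
                response = handle(request)
            else:
                response = error(None, -32600, "Invalid Request")
        if response is not None:
            stdout.write(json.dumps(response) + "\n")
            stdout.flush()
```

In a real process this would be started with `serve(sys.stdin, sys.stdout)`. Passing the streams as arguments keeps the loop testable with in-memory file objects.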
The 10 tools: `kg_query`, `kg_add`, `kg_invalidate`, `kg_timeline`, `kg_search`, `kg_stats`, `vault_search` (BM25), `semantic_search` (TF-IDF + cosine / hybrid), `semantic_stats`, `semantic_rebuild`.
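The TF-IDF side of `semantic_search` can be illustrated with a small sketch using sublinear TF, smoothed IDF, sparse dict vectors, and cosine similarity. This is an assumption-laden illustration, not the project's engine: the tokenizer is a plain ASCII regex (no bilingual handling or query expansion), the exact IDF formula is one common smoothing variant, and `hybrid_score` with its default weights is a hypothetical stand-in for the tunable semantic/keyword fusion, assuming both scores are already normalized to [0, 1].

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase ASCII word tokenizer (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Sparse TF-IDF vector with sublinear TF: (1 + ln(count)) * idf."""
    counts = Counter(tokens)
    return {t: (1.0 + math.log(c)) * idf[t]
            for t, c in counts.items() if t in idf}


def build_index(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency, not term frequency
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1, never zero or negative.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [vectorize(tokens, idf) for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(semantic, keyword, w_semantic=0.6, w_keyword=0.4):
    """Hypothetical weighted fusion of two scores already scaled to [0, 1]."""
    return w_semantic * semantic + w_keyword * keyword


def search(query, docs, top_k=3):
    """Rank docs by cosine similarity; return [(doc_index, score), ...]."""
    idf, vectors = build_index(docs)
    q = vectorize(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:top_k]]
```

Rebuilding the index on every `search` call keeps the sketch short. A persistent index, as implied by the `--rebuild` flag, would store `idf` and the vectors instead.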
export VAULT_DIR=/path/to/your/markdown/notes # defaults to ./sample_vault
python3 semantic_search.py --rebuild # build the index
python3 semantic_search.py "your query" # try a search
# register with Claude Code:
claude mcp add --scope user knowledge-vault -- python3 /ABSOLUTE/PATH/TO/mcp_server.py

- `mcp_server.py`: the MCP server (JSON-RPC 2.0 / stdio, tool router)
- `semantic_search.py`: TF-IDF + cosine + FTS5 hybrid retrieval
- `search_wiki.py`: BM25 full-text search
- `trading_kg.py`: temporal knowledge graph
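As an illustration of the temporal knowledge graph idea, the sketch below stores triples with `valid_from` / `valid_to` windows in SQLite, closes the old window when a fact changes, and answers `as_of` and timeline queries. The table layout and the function names (`open_kg`, `add_fact`, `query_as_of`, `timeline`) are hypothetical rather than taken from `trading_kg.py`. It assumes each subject/predicate pair holds one value at a time, uses ISO date strings so that text comparison matches date order, and leaves out the FTS5 index and sync triggers.

```python
import sqlite3


def open_kg(path=":memory:"):
    """Open (or create) the triple store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS triples (
               id INTEGER PRIMARY KEY,
               subject TEXT NOT NULL,
               predicate TEXT NOT NULL,
               object TEXT NOT NULL,
               valid_from TEXT NOT NULL,  -- ISO date, inclusive
               valid_to TEXT              -- ISO date, exclusive; NULL = current
           )"""
    )
    return conn


def add_fact(conn, subject, predicate, obj, valid_from):
    """Insert a fact, closing any current fact it replaces."""
    with conn:  # one transaction: invalidate, then insert
        conn.execute(
            """UPDATE triples SET valid_to = ?
               WHERE subject = ? AND predicate = ?
                 AND object != ? AND valid_to IS NULL""",
            (valid_from, subject, predicate, obj),
        )
        unchanged = conn.execute(
            """SELECT 1 FROM triples
               WHERE subject = ? AND predicate = ?
                 AND object = ? AND valid_to IS NULL""",
            (subject, predicate, obj),
        ).fetchone()
        if unchanged is None:
            conn.execute(
                """INSERT INTO triples
                   (subject, predicate, object, valid_from)
                   VALUES (?, ?, ?, ?)""",
                (subject, predicate, obj, valid_from),
            )


def query_as_of(conn, subject, as_of):
    """Return (predicate, object) pairs that were valid on the given date."""
    return conn.execute(
        """SELECT predicate, object FROM triples
           WHERE subject = ? AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to > ?)
           ORDER BY predicate""",
        (subject, as_of, as_of),
    ).fetchall()


def timeline(conn, subject):
    """Return every fact about a subject, oldest first."""
    return conn.execute(
        """SELECT predicate, object, valid_from, valid_to FROM triples
           WHERE subject = ? ORDER BY valid_from, id""",
        (subject,),
    ).fetchall()
```

Rows are never deleted in this design: a changed fact only gets its `valid_to` set, which is what lets `query_as_of` reconstruct past states.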
Extracted from a personal project; databases and private notes are excluded, so point `VAULT_DIR` at your own markdown folder.