A knowledge graph built from plain markdown files. No database required.
⚠️ Work in progress. markedup is under active development — the API surface, CLI flags, and on-disk schema may still change between commits. The library is usable today (Clarit-AI's Plexium consumes it as a dependency), but a tagged stable release is still pending. Check back soon.
Knowledge lives in markdown files -- notes, documentation, research, wikis. But finding connections between documents means either manually linking everything, building a database, or surrendering your files to a proprietary tool.
Existing solutions force a choice: human-readable files or structured data. You can have a wiki that's easy to browse, or a database that's easy to query, but not both.
markedup turns your markdown files into a queryable knowledge graph by reading structured YAML frontmatter -- entities, relationships, confidence scores, temporal metadata -- and building an in-memory index directly from the filesystem.
Every file is simultaneously:
- A readable document you can open in any editor or Obsidian
- A graph node with typed relationships to other nodes
- A search target with keyword, semantic, and cross-encoder scoring
- A self-contained unit -- no sidecar database, no sync process, no lock-in
There is no external database. The filesystem is the database. git diff is your changelog. cp is your backup. Your files never leave your machine unless you push them.
A markedup file is standard markdown with YAML frontmatter:
```markdown
---
id: distributed-consensus
title: Distributed Consensus Protocols
entity-type: concept
confidence: 0.92
tags: [distributed-systems, algorithms]
entities:
  - name: Raft
    role: subject
    aliases: [raft-protocol]
relationships:
  - target: paxos
    type: derived-from
    strength: 0.8
  - target: etcd
    type: implemented-by
    strength: 0.9
temporal:
  valid-from: "2014-01-01"
  last-verified: "2024-06-15"
  decay-rate: 0.05
semantic-hints:
  - leader election
  - log replication
  - fault tolerance
---

Raft is a consensus algorithm designed to be more understandable than Paxos...
```

markedup parses this frontmatter, builds a graph of relationships between files, and exposes it through CLI commands, a TUI, an MCP server for AI agents, and a Go library API. Obsidian users get compatibility out of the box -- `[[wikilinks]]` in the body and `tags` arrays in frontmatter work as expected.
See docs/schema-reference.md for the complete field specification.
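Conceptually, loading a file means splitting the frontmatter block from the markdown body before the YAML is parsed. The function below is an illustrative, stdlib-only sketch of that split -- not markedup's actual parser. A real implementation would hand the extracted YAML to a library such as gopkg.in/yaml.v3.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates the YAML frontmatter block from the
// markdown body. Illustrative sketch only: it assumes the file starts
// with a "---" line and ends the block at the next "---" line.
func splitFrontmatter(doc string) (frontmatter, body string) {
	const delim = "---\n"
	if !strings.HasPrefix(doc, delim) {
		return "", doc // no frontmatter; whole file is body
	}
	rest := doc[len(delim):]
	end := strings.Index(rest, "\n---")
	if end < 0 {
		return "", doc // unterminated block; treat as plain markdown
	}
	return rest[:end], strings.TrimLeft(rest[end+len("\n---"):], "\n")
}

func main() {
	doc := "---\nid: distributed-consensus\nconfidence: 0.92\n---\n\nRaft is a consensus algorithm...\n"
	fm, body := splitFrontmatter(doc)
	fmt.Println("frontmatter:", fm)
	fmt.Println("body:", body)
}
```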
markedup files are standard markdown. Any tool that reads .md files works normally -- the YAML frontmatter is either rendered (Obsidian, Hugo, Jekyll) or ignored (GitHub, VS Code, plain text editors). This means your knowledge base is not locked into markedup:
- Obsidian -- `[[wikilinks]]` in the body and `tags` arrays in frontmatter are fully compatible. markedup auto-generates a `## Related` section for Obsidian's graph view.
- GitHub Wikis and READMEs -- GitHub renders markdown natively and displays YAML frontmatter in a table. Your knowledge base doubles as browsable documentation.
- Static site generators -- Hugo, Jekyll, Zola, and others already consume YAML frontmatter. markedup files can serve as content sources without modification.
- Plain text -- Every file is readable with `cat`, `less`, `grep`, or any editor. No binary formats, no proprietary encoding.
You can adopt markedup incrementally -- add frontmatter to existing files one at a time, and they become graph nodes without breaking anything that already reads them.
markedup works with your existing markdown files out of the box. Files without frontmatter are automatically enriched when loaded -- id, title, tags, and relationships are extracted from the document structure and written back as YAML frontmatter.
```shell
# Preview what markedup would extract from your files
markedup enrich ./my-notes --dry-run

# Enrich all files (writes frontmatter, non-destructive)
markedup enrich ./my-notes

# Or just use any command -- auto-enrichment happens on load
markedup search ./my-notes "knowledge graph"
```

For richer extraction, use a local model like Triplex (a Phi3-3.8B knowledge-graph extraction model) via Ollama to classify entities, infer relationship types, and generate semantic hints:
```shell
ollama run triplex
markedup enrich . --model triplex --endpoint http://localhost:11434
```

See docs/cli-reference.md for all options.
markedup's search pipeline combines multiple signals to rank results:
- Keyword matching -- title, tags, entity names, body text
- Graph signals -- relationship density, link structure
- Temporal decay -- confidence scores degrade over time based on `last-verified` and `decay-rate`
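To illustrate the temporal-decay signal, a simple exponential model captures the idea: effective confidence shrinks with the years elapsed since `last-verified`, scaled by `decay-rate`. The formula below is an assumption for illustration; markedup's actual decay function may differ.

```go
package main

import (
	"fmt"
	"math"
)

// effectiveConfidence sketches one plausible decay model: the stored
// confidence decays exponentially with the years elapsed since the
// file was last verified. This is an assumed formula, not markedup's
// documented implementation.
func effectiveConfidence(confidence, decayRate, yearsSinceVerified float64) float64 {
	return confidence * math.Exp(-decayRate*yearsSinceVerified)
}

func main() {
	// A file with confidence 0.92 and decay-rate 0.05, two years
	// after its last-verified date.
	fmt.Printf("%.3f\n", effectiveConfidence(0.92, 0.05, 2))
}
```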
For deeper recall, markedup can generate vector embeddings for your files and blend cosine similarity into the scoring pipeline. It works with any embedding model served via the OpenAI-compatible /v1/embeddings API:
- Local models -- Ollama, llama.cpp, Synapse, or any local inference server
- Cloud providers -- OpenRouter, OpenAI, or any OpenAI-compatible endpoint
```shell
# Embed using a local Ollama model
markedup embed --endpoint http://localhost:11434 --model nomic-embed-text

# Embed using OpenRouter
markedup embed --endpoint https://openrouter.ai/api --model openai/text-embedding-3-small --api-key $OPENROUTER_KEY
```

Embeddings are cached in `.knowledge/vectors/` and recomputed only when file content changes. Switching models automatically invalidates the cache.
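Conceptually, blending embedding similarity into the scoring pipeline looks like the sketch below. The `cosine` and `blend` helpers and the 0.6/0.4 weighting are illustrative assumptions, not markedup's actual implementation or tuning.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity between two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// blend mixes a keyword score with semantic similarity. The weights
// here are an assumption for illustration.
func blend(keywordScore, semanticScore float64) float64 {
	return 0.6*keywordScore + 0.4*semanticScore
}

func main() {
	queryVec := []float64{0.1, 0.9, 0.2}
	docVec := []float64{0.2, 0.8, 0.1}
	fmt.Printf("blended score: %.3f\n", blend(0.5, cosine(queryVec, docVec)))
}
```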
For highest precision, results can be re-scored with a cross-encoder model after initial retrieval. Cross-encoders evaluate each (query, document) pair directly -- slower but significantly more accurate than embedding similarity alone.
```shell
# Combine keyword scoring, semantic similarity, and cross-encoder reranking
markedup search . --semantic --rerank "consensus algorithms"
```

Reranking supports the same provider model -- local via Ollama or remote via API (Jina, Cohere, OpenAI-compatible endpoints).
```shell
go install github.com/Clarit-AI/markedup/cmd/markedup@latest
```

```shell
# Scaffold a sample knowledge base
markedup init my-kb
cd my-kb

# Validate frontmatter across all files
markedup check .

# Search by keyword
markedup search . "knowledge graph"

# Traverse the graph from a node
markedup explore . knowledge-graph --depth 3

# Launch the interactive TUI (runs setup wizard on first run)
markedup tui
```

On first run, `markedup tui` launches an interactive setup wizard to configure embedding, LLM, and reranker endpoints. You can also run `markedup setup` directly from the CLI at any time.
```shell
# Point markedup at your existing notes -- auto-enrichment handles the rest
markedup search ./my-notes "topic"

# Or explicitly enrich first to review what gets generated
markedup enrich ./my-notes --dry-run
markedup enrich ./my-notes
```

markedup exposes an MCP server (JSON-RPC 2.0 over stdio) so AI agents and LLMs can search, traverse, and query your knowledge graph as a tool:
```shell
markedup serve ./my-kb
```

This gives agents access to seven tools: `markedup_search`, `markedup_get_page`, `markedup_traverse`, `markedup_get_structure`, `markedup_reason`, `embed_status`, and `embed_file`. See docs/mcp-tools.md for the full tool catalog and integration configs for Claude Desktop, Cursor, and Claude Code.
markedup is also a Go library. You can import it as a dependency to load, search, and traverse knowledge graphs programmatically:
```go
import (
	"github.com/Clarit-AI/markedup/index"
	"github.com/Clarit-AI/markedup/embed"
)

result, _ := index.Load(ctx, "./my-kb")
results := index.Search(result.Index, "consensus", index.WithLimit(10))
```

See docs/go-library.md for the full API guide.
| Document | Contents |
|---|---|
| docs/cli-reference.md | All commands, flags, and output formats |
| docs/schema-reference.md | Frontmatter fields, validation rules, Obsidian compatibility |
| docs/mcp-tools.md | MCP tool names, parameters, and example payloads |
| docs/go-library.md | Using markedup as a Go library (including enrich package) |
| docs/architecture.md | Tech-stack decisions, module layout, and design rationale |
Licensed under the Apache License, Version 2.0. You may not use this project except in compliance with that license.