A knowledge management system built on SQLite FTS5 with multi-model coordination, epistemic decision capture, and a memory recall/confirm protocol. Designed for self-hosted environments where multiple AI models (Claude, GPT, Codex, Qwen) collaborate through a shared knowledge graph.
- Full-text search over a document corpus using SQLite FTS5 with Porter stemming and Unicode tokenization
- Document tiering (hot/warm/cold) with automatic promotion, decay, and archival based on access patterns
- Multi-model devloop coordination -- multiple AI models log structured runs, artifacts, and decisions to a shared database, exposed over SSE for cross-tool interop
- Epistemic decision capture -- logs reasoning decisions with TAR (Tool-Assisted Rate) and TMR (Tool-Missing Rate) metrics to measure when AI tools help vs. when they are absent
- Memory recall/confirm protocol -- retrieval with reinforcement: query documents, then confirm which results were useful, closing a feedback loop that drives tier promotion
```
tools/                        # MCP tool modules (17 tools)
  knowledge_tools.py          # FTS5 search, status, reindex, OCR queue, context marking
  memory_tools.py             # recall + confirm (retrieval with reinforcement)
  devloop_tools.py            # multi-model run logging, artifact storage, search
  decision_capture_tools.py   # epistemic decision logging + metrics
server/
  devloop_sse_server.py       # FastMCP SSE bridge (3 tools: memory, devloop_write, devloop_propose)
scripts/
  index_knowledge.py          # Document indexer: walks directories, extracts text, chunks, FTS5 upsert
  normalize_tiers.py          # Tier rebalancing: promotes hot, decays cold, archives noise
  knowledge_cron.sh           # Cron wrapper for scheduled reindexing
  run_ocr.py                  # OCR pipeline for scanned documents (Tesseract)
ui/
  decision_capture/           # Flask blueprint for decision capture UI
  ayala_sigil/                # Flask blueprint for knowledge sigil visualization
```
Documents are chunked (2000 chars, 200 overlap) and stored in an FTS5 virtual table with Porter stemming. Each document has metadata: category, entity, year, file path, and a computed priority score based on:
- Category weight (legal/tax/hr = high priority)
- Recency (current year boost)
- Entity relevance (configurable per deployment)
- Access frequency (reinforcement from recall/confirm)
- File path keyword matches (operating agreements, tax forms, etc.)
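A minimal sketch of the chunking and priority scoring described above. The function names, weights, and keyword list are illustrative assumptions, not the actual identifiers in `scripts/index_knowledge.py`; only the chunk size (2000), overlap (200), and scoring inputs come from the text.

```python
# Sketch of document chunking and priority scoring (illustrative names).
from datetime import date

CHUNK_SIZE = 2000
OVERLAP = 200

def chunk_text(text: str) -> list[str]:
    """Split text into 2000-char chunks with 200-char overlap."""
    step = CHUNK_SIZE - OVERLAP
    return [text[i:i + CHUNK_SIZE] for i in range(0, max(len(text), 1), step)]

# Assumed weights: legal/tax/hr are high priority per the list above.
CATEGORY_WEIGHTS = {"legal": 3.0, "tax": 3.0, "hr": 3.0}
PATH_KEYWORDS = ("operating_agreement", "tax_form")

def priority_score(category: str, year: int, access_count: int, path: str) -> float:
    score = CATEGORY_WEIGHTS.get(category, 1.0)   # category weight
    if year == date.today().year:                 # recency boost
        score += 1.0
    score += min(access_count, 10) * 0.2          # reinforcement from recall/confirm
    if any(k in path for k in PATH_KEYWORDS):     # file path keyword match
        score += 0.5
    return score
```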
Documents move between tiers based on usage:
| Tier | Behavior |
|---|---|
| Hot | Pinned in memory, instant recall, quality_score >= 80 |
| Warm | Standard FTS5 retrieval, moderate access patterns |
| Cold | Archived after sustained low access, retrievable on demand |
Continuous decay runs on each search: documents that are not accessed lose quality score over time. High-quality documents that receive memory.confirm calls get promoted upward.
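A minimal sketch of such a decay-and-promotion pass, assuming a `documents` table with `quality_score` and `accessed_since_last_pass` columns (illustrative names; only the hot threshold of 80 comes from the table above, and the decay amount and cold threshold are placeholders):

```python
# Sketch of a tier normalization pass (column names are assumptions).
import sqlite3

DECAY = 1            # points lost per pass without access (assumed)
HOT_THRESHOLD = 80   # quality_score >= 80 pins a document in the hot tier
COLD_THRESHOLD = 20  # assumed cutoff for archival

def normalize_tiers(conn: sqlite3.Connection) -> None:
    # Decay documents that were not accessed since the last pass.
    conn.execute(
        "UPDATE documents SET quality_score = MAX(quality_score - ?, 0) "
        "WHERE accessed_since_last_pass = 0",
        (DECAY,),
    )
    # Re-derive tiers from the decayed scores.
    conn.execute(
        "UPDATE documents SET tier = CASE "
        "WHEN quality_score >= ? THEN 'hot' "
        "WHEN quality_score <= ? THEN 'cold' "
        "ELSE 'warm' END",
        (HOT_THRESHOLD, COLD_THRESHOLD),
    )
    conn.execute("UPDATE documents SET accessed_since_last_pass = 0")
    conn.commit()
```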
The devloop system coordinates multiple AI models through a shared SQLite database:
- Runs -- each model session creates a run with origin tracking (claude/chatgpt/codex/qwen)
- Artifacts -- structured outputs attached to runs (code, analysis, decisions)
- Search -- full-text search across all model outputs
- SSE Bridge -- FastMCP server exposes devloop tools over Server-Sent Events for cross-tool access
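A minimal sketch of how such a shared run/artifact store could be laid out in SQLite; the table and column names here are illustrative assumptions, not the real devloop schema:

```python
# Illustrative runs/artifacts layout with an FTS5 index over artifact bodies.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY,
    origin TEXT CHECK (origin IN ('claude', 'chatgpt', 'codex', 'qwen')),
    started_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS artifacts (
    id INTEGER PRIMARY KEY,
    run_id INTEGER REFERENCES runs(id),
    kind TEXT,   -- code / analysis / decision
    body TEXT
);
CREATE VIRTUAL TABLE IF NOT EXISTS artifacts_fts
    USING fts5(body, content='artifacts', content_rowid='id');
"""

def open_devloop(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```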
Captures reasoning decisions with structured metadata:
- Decision type: tool-assisted, reconstruction-only, hybrid
- Confidence scores: pre/post decision confidence
- TAR/TMR metrics: measure the tool-assisted rate vs. the reconstruction (tool-missing) rate
- Sigil corrections: links decisions to epistemic correction documents
| Tool | Description |
|---|---|
| `knowledge.status` | Index stats: doc count, tier distribution, staleness |
| `knowledge.search` | FTS5 search with priority scoring and stochastic recall |
| `knowledge.bootstrap_context` | Load hot-tier context for session start (deprecated, use `memory.recall`) |
| `knowledge.reindex` | Re-index documents from source directories |
| `knowledge.ocr_queue` | Queue scanned documents for OCR processing |
| `knowledge.context_mark` | Record which documents were consulted (deprecated, use `memory.confirm`) |

| Tool | Description |
|---|---|
| `memory.recall` | Query knowledge base with reinforcement tracking |
| `memory.confirm` | Confirm which recalled documents were useful (closes feedback loop) |

| Tool | Description |
|---|---|
| `devloop.run_start` | Start a new multi-model coordination run |
| `devloop.log` | Log structured entry to current run |
| `devloop.add_artifact` | Attach artifact (code/analysis/decision) to a run |
| `devloop.latest` | Get latest runs and artifacts |
| `devloop.search` | Full-text search across all devloop entries |
| `devloop.get_artifact` | Retrieve a specific artifact by ID |

| Tool | Description |
|---|---|
| `decision_capture.log` | Log a reasoning decision with epistemic metadata |
| `decision_capture.list` | List captured decisions with filtering |
| `decision_capture.metrics` | Compute TAR/TMR rates and epistemic health metrics |

| Tool | Description |
|---|---|
| `memory` | Recall/confirm via SSE (action: recall, latest, artifact, confirm) |
| `devloop_write` | Log or dispatch tasks to specific models |
| `devloop_propose` | Propose entries (always-available variant of `devloop_write`) |

The core retrieval pattern is two-step:
- Recall -- `memory.recall(query="...")` searches the knowledge base and returns ranked results
- Confirm -- `memory.confirm(recall_id="...")` marks which results were actually useful
This closes a reinforcement loop: confirmed documents get quality score boosts and tier promotion, while unconfirmed results decay over time. The system learns which documents are genuinely useful.
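The loop can be sketched server-side as follows; `pending_recalls`, `quality`, and the boost/decay amounts are illustrative assumptions, not the actual implementation:

```python
# Sketch of the recall/confirm reinforcement loop (names and constants assumed).
import uuid

pending_recalls: dict[str, list[int]] = {}   # recall_id -> doc ids returned
quality: dict[int, float] = {}               # doc id -> quality score

def recall(query: str, results: list[int]) -> str:
    """Record which documents a query returned; hand back a recall_id."""
    recall_id = str(uuid.uuid4())
    pending_recalls[recall_id] = results
    return recall_id

def confirm(recall_id: str, useful: list[int]) -> None:
    """Boost confirmed documents; decay the unconfirmed rest of the recall set."""
    for doc_id in pending_recalls.pop(recall_id):
        delta = 5.0 if doc_id in useful else -1.0
        quality[doc_id] = quality.get(doc_id, 50.0) + delta
```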
Decision capture tracks how AI models reason, not just what they produce:
- Tool-Assisted Rate (TAR): percentage of decisions where MCP tools provided the answer
- Tool-Missing Rate (TMR): percentage of decisions where tools were unavailable and the model had to reconstruct from memory
- These metrics identify gaps in the knowledge base and tool coverage
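Both rates reduce to simple ratios over the captured decisions; the dictionary shape below is an assumption, reusing the decision types listed earlier:

```python
# Sketch of the TAR/TMR computation over captured decisions.
def epistemic_metrics(decisions: list[dict]) -> dict[str, float]:
    total = len(decisions) or 1  # avoid division by zero on an empty log
    tar = sum(d["type"] == "tool-assisted" for d in decisions) / total
    tmr = sum(d["type"] == "reconstruction-only" for d in decisions) / total
    return {"TAR": tar, "TMR": tmr}
```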
Epistemic corrections are stored as structured documents ("sigils") that capture:
- What was wrong (the incorrect assumption or reconstruction)
- What is correct (the verified ground truth)
- Why it matters (impact on downstream reasoning)
Sigils are automatically promoted to hot tier for instant recall.
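An illustrative sigil document, assuming field names that mirror the three questions above (the real schema may differ):

```python
# Hypothetical shape of a sigil (epistemic correction) document.
sigil = {
    "kind": "sigil",
    "wrong": "Assumed the 2023 operating agreement was current.",      # incorrect reconstruction
    "correct": "The agreement was amended in 2024; that version governs.",  # verified ground truth
    "impact": "Downstream ownership reasoning used stale terms.",      # why it matters
    "tier": "hot",   # sigils are pinned for instant recall
}
```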
- Python 3.11+
- SQLite FTS5 -- full-text search with Porter stemming, Unicode tokenization
- FastMCP -- SSE transport for cross-tool MCP access
- Flask -- web UI for decision capture and sigil visualization
- Tesseract OCR -- document scanning pipeline
- 24 MB knowledge index
- 17 MCP tools across 4 modules + 3 SSE bridge tools
- Multi-model coordination: Claude, GPT, Codex, Qwen
- Hot/warm/cold document tiering with continuous decay
- FTS5 with stochastic recall (serendipitous document surfacing)
```
# Install dependencies
pip install fastmcp flask

# Index documents
python scripts/index_knowledge.py --dry-run   # preview
python scripts/index_knowledge.py --apply     # commit

# Run SSE bridge (for multi-model access)
python server/devloop_sse_server.py

# Tier normalization (run periodically)
python scripts/normalize_tiers.py
```

All database paths and host configuration are resolved through a config object. Set environment variables or modify the config module for your deployment:
- `KNOWLEDGE_DB` -- path to the FTS5 knowledge database
- `DEVLOOP_MCP_BEARER_TOKEN` -- optional bearer token for SSE bridge authentication
- Document source directories are configured in `scripts/index_knowledge.py`
MIT