mq-mcp Roadmap

mq-mcp is a local deterministic AI execution runtime for engineering workflows.

The real strategic category is a local engineering cognition runtime, not an "MCP server" or a "tool collection". The distinction matters: a tool collection grows by adding tools, whereas an engineering cognition runtime grows by improving the quality of context, structure, and symbolic understanding it brings to every tool call.

It is not a chatbot, agent framework, or autonomous system. It exposes a controlled, documented, and testable MCP surface where every tool has a declared safety class, path boundary, and predictable output shape.

The goal is not to create an unrestricted local automation server.

The goal is to create a system that is:

self-describing — the runtime can explain its own boundaries
verifiable — contracts are enforced and drift is detected
self-reflective — the runtime can review and diagnose itself
deterministic — same inputs, same output structure, always
symbolically aware — the runtime understands structure, not just content

The highest-leverage improvement at any phase is context quality, not more features. All real quality gains come from giving the model better structured knowledge of the system it is reasoning about.

Authoritative identity contract: docs/RUNTIME_CONTRACT.md

Current status

Current project phase:

v1.9.0 — semantic memory hygiene + orchestration boundary (done)
Next:    Phase 8 — Release Gate v2

Completed foundation:

local MCP server with 66 tools across safety classes A–D and review engine tools
OpenAI/MCP bridge
repo-scoped file tools with path boundary enforcement
system resource tools
git tools
shell/subprocess safety boundaries
explicit filesystem allowlist
Bridget identity asset
validation and release-check scripts
docs and GitHub Pages
tool safety documentation and tool inventory sync
tool contract JSON and safety metadata (docs/tool_contracts.json)
mq-agent and mqlaunch integration docs
packaged mq-mcp CLI
install, upgrade, and uninstall scripts
health, info, report, and troubleshooting bundle commands
redacted observability endpoints
validated MCP profile templates for common clients and workflows
review engine: contracts, skills, severity engine, review memory, multi-pass reviewer, drift detector, review_diff, review_repo
architecture memory: ADRs, boundaries, rejected patterns, philosophy, coding convention extraction
orchestration boundary contract with mq-agent/profile validation
docs/RUNTIME_CONTRACT.md — authoritative identity and execution contract
docs/ORCHESTRATION_CONTRACT.md — authoritative orchestration boundary contract

Ecosystem position

mq-agent          — orchestration layer (consumes mq-mcp as runtime)
mq-mcp            — deterministic execution runtime  ← this repo
repo-signal       — repository intelligence (read-only analysis)
mq-hal            — operator interface layer
mq-image-analyze  — vision layer (invoked as a tool)
atlas-one         — prompt and interaction layer
macos-scripts     — human terminal UX and launch surface

This layering is not aspirational — it is enforced by the architecture. mq-mcp executes. mq-agent orchestrates. The boundary must not blur.

Cross-repo responsibility contract

mq-mcp owns the central cognition runtime:

review engine and review contracts
semantic retrieval and review memory
repo context selection for reviews
architecture memory and architecture drift detection
MCP runtime and safety metadata
multi-pass review and risk analysis

mq-mcp must not absorb heavy UI, duplicated repository indexing, repo metrics dashboards, or workflow automation logic. Those belong to mq-agent, repo-signal, macos-scripts and mq-hal.

Next phase — Ollama-backed learn extraction hardening

Goal:

Use Ollama only as an optional local provider for deterministic learn pattern extraction. mq-mcp remains the source of truth for contracts, validation, safety classes, review logic, and memory storage.

Planned scope:

validate learn extraction records before storage
default extraction to dry-run/read-only behavior
require explicit approval for storage
treat prompt-injection text inside reviewed content as data, never as instructions
handle missing Ollama or missing mq-learn model as an optional-provider error

Non-goals:

no autonomous learning
no repo mutation from Ollama output
no command execution from Ollama output
no final risk scoring by Ollama
no replacement of mq-mcp review logic
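
The planned scope and non-goals above could be approached roughly as follows. This is a minimal sketch, not the mq-mcp implementation: the record fields (pattern, evidence, source_file), the function names, and the phrase list are all hypothetical, since the learn record schema is not defined in this roadmap.

```python
from dataclasses import dataclass

# Hypothetical field names; the real learn record schema is not defined here.
REQUIRED_FIELDS = ("pattern", "evidence", "source_file")

# Illustrative phrases only; a real check would need a maintained list.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "run the following command")


@dataclass(frozen=True)
class ExtractionResult:
    stored: bool
    errors: tuple


def validate_record(record: dict) -> list[str]:
    """Return validation errors; an empty list means the record is acceptable."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    text = " ".join(str(record.get(name, "")) for name in REQUIRED_FIELDS).lower()
    for phrase in SUSPICIOUS_PHRASES:
        if phrase in text:
            errors.append(f"instruction-like text rejected: {phrase!r}")
    return errors


def store_extraction(record: dict, store: list, approved: bool = False) -> ExtractionResult:
    """Validate first; write only when the caller explicitly approves (dry-run by default)."""
    errors = validate_record(record)
    if errors or not approved:
        return ExtractionResult(stored=False, errors=tuple(errors))
    store.append(dict(record))
    return ExtractionResult(stored=True, errors=())
```

Note that nothing in this sketch executes or interprets the extracted text: provider output is only ever validated and, with approval, stored.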

System feedback loops

The most valuable loop in the system:

Better contracts
→ better review
→ better self-understanding
→ better runtime stability
→ better contracts

The self-review loop:

mq-mcp reviews itself
→ finds drift
→ improves contracts
→ improves next review

When this loop is closed, the runtime acquires:

self-diagnostics
architectural immune response
adaptive stabilization

The permanent design tension:

More AI flexibility → more emergent behavior → more contracts needed → less flexibility

This is not a problem to solve. It is a tension to design for.

Release map

Version	Theme	Status
v0.1.0	Public baseline	Done
v0.1.1	Documentation cleanup	Done
v0.1.2	Local validation flow	Done
v0.2.0	Safer MCP server structure	Done
v0.2.1	Bridget identity + repo metadata sync	Done
v0.2.2	Docs sync + tool inventory + CI credibility	Done
v0.2.3	AI tooling integration	Done
v0.3.0	Usable macOS MCP toolkit	Done / verify
v0.3.1	CI, release and validation hardening	Done
v0.4.0	Tool contract and safety map v2	Done
v0.5.0	mq-agent and mqlaunch integration hardening	Done
v0.6.0	Packaged local install flow	Done
v0.7.0	Local bridge observability	Done
v0.8.0	Profile templates and client setup polish	Done
v1.0.0	Stable local MCP platform	Done
v1.1.0	Runtime self-inspection	Done
v1.2.0	Architecture memory	Done
v1.3.0	Orchestration boundary formalization	Done
v1.4.0	Semantic memory layer	Done
v1.5.0	Risk analysis layer (shipped as v1.7.0)	Done
v1.6.0	Generated artifacts + repo-signal merge	Done
v1.9.0	Semantic hygiene + orchestration boundary	Done
v1.10.0	Learning Contract Layer	Planned
v1.11.0	Ollama-backed learn extraction hardening	Planned

Completed

v0.1.0 — Public baseline

v0.1.1 — Documentation cleanup

Goal:

Make the project understandable from the GitHub front page.

v0.1.2 — Local validation flow

Goal:

Make it easy to verify that the local MCP setup works.

Add a simple validation command
Add expected output examples
Add troubleshooting notes for missing uv
Add troubleshooting notes for Python version mismatch
Add troubleshooting notes for missing OpenAI credentials
Add troubleshooting notes for MCP server startup failures
Add a smoke-test script
Add a release-readiness checklist

v0.2.0 — Safer MCP server structure

Goal:

Make the local MCP server safer and easier to extend.

Replace hardcoded local paths with config or environment variables
Add an explicit filesystem allowlist
Document every exposed MCP tool
Separate system tools from repo/file tools
Add safer error handling
Add tests for path safety
Add tests for tool output shape
Add a minimal example config file
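
An explicit filesystem allowlist of the kind listed above might look like the following. This is a sketch under assumptions: the function name resolve_allowed and the way roots are passed in are hypothetical, and the real resolvers in server.py may differ. It relies on pathlib's is_relative_to, which requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_allowed(requested: str, allowed_roots: list[str]) -> Path:
    """Resolve a path and confirm it sits inside one of the allowlisted roots."""
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = Path(requested).expanduser().resolve()
    for root in allowed_roots:
        root_resolved = Path(root).expanduser().resolve()
        # is_relative_to compares whole path components,
        # so /srv/repo-evil does not match the root /srv/repo.
        if candidate.is_relative_to(root_resolved):
            return candidate
    raise PermissionError(f"path outside allowlist: {candidate}")
```

Comparing resolved path components rather than string prefixes is the design choice worth testing: both ".." traversal and sibling directories with a shared prefix should be rejected.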

v0.2.1 — Bridget identity + repo metadata sync

Goal:

Give the project a recognizable identity and improve repo metadata quality.

Add Python syntax check workflow
Add basic test workflow
Add status badge when CI exists
Add Bridget face identity asset
Add Bridget face trigger to bridge.py
Add Bridget smoke-check to scripts/validate.sh
Sync pyproject.toml version with VERSION
Migrate unsafe os.path.normpath paths in server.py
Update GitHub Pages landing page

v0.2.2 — Docs sync + tool inventory + CI credibility

Goal:

Make the repository trustworthy by removing stale documentation and tool count drift.

Sync tool count across README, demo docs and safety docs
Fix Python version requirement in docs
Update stale tool list
Add proof section to README
Add scripts/release-check.sh
Add docs consistency workflow
Add tool inventory docs
Improve CI credibility

v0.2.3 — AI tooling integration

Goal:

Wire in mq-image-analyze and Claude Code subagents for richer local intelligence.

Bridget face lines dynamically generated via mq-image-analyze
Parallel mq-image analysis with chafa rendering — lower latency
Fix Bridget face output routing to /dev/tty (survives piped contexts)
Add Claude Code subagents: mq-project-context, mcp-tool-safety-reviewer, mcp-release-validator

v0.3.0 — Usable macOS MCP toolkit

Goal:

Make mq-mcp useful beyond a one-off local experiment.

Add a stable launcher command
Add documented MCP server profiles
Add setup examples for common MCP clients
Add screenshots for installation and usage
Add a complete troubleshooting page
Add example workflows
Add clear upgrade instructions
Make tool documentation easier to follow
Make validation flow repeatable

v0.3.1 — CI, release and validation hardening

Goal:

Make mq-mcp safe to depend on as the local MCP tool layer for mq-agent, mqlaunch and future HAL-style workflows.

This release should fix the trust layer before adding more features.

Scope

Validation commands

uv run python -m py_compile server.py bridge.py
uv run pytest -v
./scripts/validate.sh
./scripts/release-check.sh

Definition of done

Latest commit on main is green
GitHub Actions are green
Local validation passes
Release check passes
Tool count is documented once and referenced consistently
README proof section is current
CHANGELOG includes v0.3.1
GitHub release v0.3.1 exists (shipped as v0.4.0 — merged directly)
GitHub Pages deployment is successful

v0.4.0 — Tool contract and safety map v2

Goal:

Make every exposed MCP tool self-describing, safe to reason about and easy for mq-agent to consume.

Planned scope

Proposed safety classes

read-only
repo-read
repo-write
local-file-read
local-file-write
subprocess
external-app
dangerous
unknown

Definition of done

Every tool has a declared safety class
Every tool has a stable metadata entry
Tool docs are generated or verified from metadata
CI fails when a tool is undocumented
mq-agent can consume the tool metadata safely
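
The "CI fails when a tool is undocumented" requirement could be checked along these lines. This is a sketch only: the metadata layout (a mapping from tool name to an object with a safety_class key) is an assumption, not the actual shape of docs/tool_contracts.json, and find_contract_gaps is a hypothetical name.

```python
# Safety classes as proposed for v0.4.0 in this roadmap.
SAFETY_CLASSES = frozenset({
    "read-only", "repo-read", "repo-write", "local-file-read", "local-file-write",
    "subprocess", "external-app", "dangerous", "unknown",
})


def find_contract_gaps(registered_tools: set, contracts: dict) -> list[str]:
    """Return human-readable problems; a non-empty list should fail CI."""
    contract_names = set(contracts)
    problems = []
    for name in sorted(registered_tools - contract_names):
        problems.append(f"{name}: no metadata entry")
    for name in sorted(contract_names - registered_tools):
        problems.append(f"{name}: metadata entry without a registered tool")
    for name in sorted(registered_tools & contract_names):
        safety = contracts[name].get("safety_class")  # assumed key name
        if safety not in SAFETY_CLASSES:
            problems.append(f"{name}: invalid safety class {safety!r}")
    return problems
```

A CI step could call this with the tool names registered in the server and exit non-zero when the returned list is not empty.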

v0.5.0 — mq-agent and mqlaunch integration hardening

Goal:

Make mq-mcp a reliable backend for mq-agent and mqlaunch workflows.

Planned scope

Example target flow

mqlaunch
  ↓
mq-agent
  ↓
mq-mcp
  ↓
safe local tool execution

Possible commands

mq-agent mcp status
mq-agent mcp tools
mq-agent run-tool read_repo_file --arg path=README.md --dry-run
mqlaunch agent mcp-status
mqlaunch agent mcp-tools

v0.6.0 — Packaged local install flow

Goal:

Make mq-mcp easy to install, update and run on a new macOS machine.

Planned scope

Possible commands

mq-mcp doctor
mq-mcp serve
mq-mcp validate
mq-mcp config path
mq-mcp tools

Non-goals

No hidden daemon by default
No automatic startup without explicit user choice
No silent credentials handling

v0.7.0 — Local bridge observability

Goal:

Make the MCP server and OpenAI bridge easier to inspect while running.

Planned scope

Possible commands

mq-mcp doctor --json
mq-mcp health
mq-mcp info --json
mq-mcp report --json
mq-mcp report --validate
mq-mcp bundle --validate

Safety requirements

Logs must not print secrets
Debug output must redact tokens and keys
Local paths should be shown only when useful
Dangerous tools must remain explicit
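
One way to meet the redaction requirements above is to mask values by variable name before anything is logged. The sketch below is illustrative: redact_env and its name pattern are hypothetical and are not the runtime's own redaction helper.

```python
import re

# Assumed heuristic: any variable whose name contains one of these words is secret.
SECRET_NAME_PATTERN = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)", re.IGNORECASE)


def redact_env(env: dict) -> dict:
    """Return a copy of env that is safe to print: secret-looking values are masked."""
    return {
        name: "[REDACTED]" if SECRET_NAME_PATTERN.search(name) else value
        for name, value in env.items()
    }
```

Matching on names over-redacts harmless variables that happen to contain "KEY", which is the safer failure mode for debug output.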

v0.8.0 — Profile templates and client setup polish

Goal:

Make mq-mcp easy to connect to different local MCP clients and mq ecosystem tools.

Planned scope

Example profiles

profiles/read-only.json
profiles/repo-dev.json
profiles/local-macos.json
profiles/mq-agent.json
profiles/openai-bridge.json

v1.0.0 — Stable local MCP platform

Goal:

Make mq-mcp stable enough to be the default MCP tool layer for the mq ecosystem.

v1.0.0 requirements

Review Engine — AI Engineering Runtime

mq-mcp is evolving beyond a local MCP tool layer into a repo-aware engineering cognition system. The review engine adds structured, contract-driven AI review directly into the MCP surface.

Strategic principle: better context architecture, not more AI.

Phase 1 — Review Foundation (done)

Goal: make review output consistent, stable, and contract-driven.

reviews/contracts/comment-review.md — hard rules: severity labels, output format, scope, max findings, uncertainty handling
reviews/skills/python-comment-review.md — Python-specific guidance: docstrings, type hints, naming, module-level side effects
reviews/skills/shell-review.md — shell-specific guidance: headers, unquoted vars, silent errors, set -e
reviews/skills/mcp-tool-review.md — MCP tool guidance: Args blocks, safety notes, path boundary docs, naming conventions
server.py: review_file, build_repo_context, list_review_contracts MCP tools — review engine exposed on the MCP surface (53 tools total)
Tool docs synced: TOOL_SAFETY.md, TOOL_INDEX.md, README.md, tool_contracts.json — all updated to 53 tools
reviews/golden/bridge-py-comment-review.md — 12-finding reference review with reasoning notes and excluded-findings section
reviews/contracts/architecture-review.md — ARCHITECTURE and RISK severity labels; scoped to boundaries, coupling, doc vs runtime
reviews/contracts/security-review.md — NOTE/WARNING/RISK labels; scoped to subprocess injection, path traversal, prompt injection, secret leakage, env forwarding, osascript injection

Phase 2 — Repo-Aware Intelligence (done)

Goal: give the review engine real system understanding.

review_engine/repo_context_builder.py — generates architecture_map.json (role of each file) and file_summary_index.json (public symbols, docstrings, line counts) from file heuristics + Python AST
review_engine/review_router.py — routes files to the correct skill by extension and path; wired into review_file — skill injected automatically
review_engine/severity_engine.py — parse_findings(), format_summary(), has_blocking_findings(), severity_counts(); sorts by severity then line number
docs/architecture/SYSTEM_OVERVIEW.md — ground-truth reference: runtime layers, file responsibilities, review pipeline, path safety, env vars, tool classes; used for drift detection
docs/architecture/REVIEW_PIPELINE.md — full pipeline reference: stages, prompt structure, severity parsing, memory persistence, MCP tools
review_engine/callgraph_builder.py — cross-file import graph and hub file detection. Outputs review_engine/context/callgraph.json with imports, importers, hub_files, symbols, and edges. Wired into build_repo_context (regenerated alongside architecture_map.json) and review_file / MultiPassReviewer.review_pass — cross-file context injected for every file, with hub files and their importers named explicitly.
callgraph_builder._try_merge_repo_signal_packs() — hook that merges repo_signal_callgraph.json, repo_signal_symbols.json, and repo_signal_summary.json from review_engine/context/ when present. Activates automatically when repo-signal starts writing intelligence packs to disk; no-op until then. Status surfaced in build_repo_context output.
review_engine/context_selector.py — ContextSelector enforces a 12,000-char budget (~3,000 tokens) on injected context. Priority order: past findings (2) before cross-file context (3). High-priority pieces are truncated rather than dropped when budget is tight. Wired into review_file after loading past_context and cross_file_ctx, before either review branch.
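
The budget behaviour described for ContextSelector can be illustrated with a simplified sketch. The function name select_context and the tuple layout are hypothetical, and this version truncates whichever piece overflows the budget, which is a simplification of the rule that high-priority pieces are truncated rather than dropped.

```python
def select_context(pieces: list, budget: int = 12_000) -> list:
    """pieces are (priority, label, text) tuples; a lower number means higher priority."""
    selected = []
    remaining = budget
    for _priority, label, text in sorted(pieces, key=lambda piece: piece[0]):
        if remaining <= 0:
            break  # budget exhausted: lower-priority pieces are left out
        if len(text) > remaining:
            # Truncate rather than drop, so the piece still contributes.
            text = text[:remaining]
        selected.append((label, text))
        remaining -= len(text)
    return selected
```

With a budget of 12 characters, an 8-character "past findings" piece at priority 2 is kept whole and a 10-character cross-file piece at priority 3 is cut to its first 4 characters.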

Phase 3 — Semantic Review Memory (done)

Goal: intelligent long-term memory for the review engine.

review_engine/review_memory.py — local persistent review history; ReviewMemory saves/retrieves findings per file, formats past context for injection into future reviews (max 5 findings, capped 10 entries/file)
review_file wired to memory: loads past context before model call, saves structured findings after; past findings shown as ## Previous review context
list_review_history MCP tool — summary of all reviewed files
get_last_review MCP tool — full last review for a specific file
reviews/skills/markdown-review.md + reviews/skills/json-review.md — review skills for .md and .json file types; wired into review_router
Cross-file reasoning: _build_rich_cross_file_context() injects arch role, top public symbols, and last review summary (finding count + severity distribution) for every file that imports or is imported by the file under review. Backed by callgraph.json (Phase 2) + ContextSelector (Phase 2). Files are no longer reviewed in isolation.
Persist coding conventions extracted from reviews into architecture memory — deferred to v1.2.0 (Architecture memory), where it belongs structurally.
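
The two caps on review memory (10 entries per file, 5 findings injected) might be enforced as in the sketch below. The class name, the in-memory storage, and the choice to inject the first five findings of the most recent review are assumptions; the actual ReviewMemory persists to disk and may select findings differently.

```python
from collections import defaultdict

MAX_ENTRIES_PER_FILE = 10
MAX_CONTEXT_FINDINGS = 5


class ReviewMemorySketch:
    """In-memory stand-in for a per-file review history."""

    def __init__(self):
        self._history = defaultdict(list)

    def save(self, path: str, findings: list) -> None:
        entries = self._history[path]
        entries.append(list(findings))
        # Keep only the newest entries; a no-op while under the cap.
        del entries[:-MAX_ENTRIES_PER_FILE]

    def past_context(self, path: str) -> str:
        entries = self._history.get(path)
        if not entries:
            return ""
        latest = entries[-1][:MAX_CONTEXT_FINDINGS]
        return "\n".join(["## Previous review context", *latest])
```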

Phase 4 — Multi-Pass Review Engine (done)

Goal: higher quality through structured pipeline.

review_engine/multi_pass_reviewer.py — MultiPassReviewer class:
- Pass 1: structural analysis (responsibility, patterns, hotspots, ≤400 tokens)
- Pass 2: contract-driven review enriched with structure context (≤2048 tokens)
- Pass 3: consistency pass — doc vs runtime divergence (docstrings, names, type hints vs actual behavior; ≤1024 tokens)
- Pass 4: deduplication — merges Pass 2 + Pass 3 findings, keeps highest severity per location, drops near-duplicate bodies (pure Python, no API call)
review_file(deep=True) — single-pass stays default; deep=True runs all 4 passes, returns formatted + deduplicated findings, ~3x API calls
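
The deduplication pass can be pictured as a pure-Python merge keyed on location. In this sketch the severity ranking, the dictionary keys, and the function name are assumptions, and the near-duplicate body comparison mentioned for Pass 4 is left out for brevity.

```python
# Assumed ordering; the roadmap names these labels but not their exact ranking.
SEVERITY_RANK = {"RISK": 3, "ARCHITECTURE": 2, "WARNING": 1, "NOTE": 0}


def dedupe_findings(findings: list) -> list:
    """Keep the highest-severity finding per (file, line), then sort for output."""
    best = {}
    for finding in findings:
        key = (finding["file"], finding["line"])
        current = best.get(key)
        if current is None or SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[current["severity"]]:
            best[key] = finding
    # Most severe first, then by line number.
    return sorted(best.values(), key=lambda f: (-SEVERITY_RANK[f["severity"]], f["line"]))
```

Because the merge needs no model call, it adds no API cost to deep mode.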

Phase 5 — Advanced Engineering Review (done)

Risk mode: review_file(mode="security") applies reviews/contracts/security-review.md, which covers subprocess injection, path traversal, prompt injection, secret leakage, env forwarding, and osascript injection
review_engine/drift_detector.py — DriftDetector checks: tool count vs README/TOOL_SAFETY.md/tool_contracts.json, contract coverage (all tools in JSON), phantom contracts (JSON tools not in server), safety doc coverage, arch map freshness
detect_architecture_drift MCP tool — exposes DriftDetector on the MCP surface
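
The tool-count check is the simplest of the drift checks to illustrate. The sketch below assumes documents state counts in the form "N tools"; the function names and that phrasing rule are hypothetical, and the real DriftDetector may parse documents differently.

```python
import re

COUNT_CLAIM = re.compile(r"\b(\d+)\s+tools\b")


def tool_count_drift(actual: int, docs: dict) -> dict:
    """Map each document name to the stale tool counts it still claims."""
    drift = {}
    for name, text in docs.items():
        claimed = {int(number) for number in COUNT_CLAIM.findall(text)}
        stale = sorted(claimed - {actual})
        if stale:
            drift[name] = stale
    return drift
```

A check like this also flags historical mentions (for example a changelog line about an older count), so it is best pointed only at documents that should describe the current state.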

Phase 6 — Autonomous Review Runtime (done)

review_diff MCP tool — continuous review triggered by git diff: reviews all .py/.sh/.md/.json files changed in the working tree or staging area, capped at 10
review_repo MCP tool — agentic review: prioritizes the least-recently-reviewed Python files in the repo (uses review memory to order by staleness), max 20 files
[~] Review TUI: severity history, semantic context display (deferred — out of scope for CLI)

Runtime Consolidation

mq-mcp has reached a transition point. The system no longer lacks capability.

The functional capacity already exceeds many established AI runtime projects. What was missing was a central model for how the system understands itself.

docs/RUNTIME_CONTRACT.md is the first output of this phase — the authoritative identity contract. The remaining interventions make the runtime self-inspecting, self-documenting, and structurally resistant to architectural drift.

Strategic principle: no new features until the existing runtime is self-describing, verifiable, and self-reflective.

v1.1.0 — Runtime self-inspection

Goal: the runtime can analyze its own architecture, verify its own contracts, and surface drift between documentation and implementation.

review_runtime_contract MCP tool — reviews docs/RUNTIME_CONTRACT.md against actual server state: structural checks (path resolvers, no-auto-commit, _redacted_env) + AI architecture pass with injected tool count and safety class breakdown
Extend detect_architecture_drift — checks 8-10: RUNTIME_CONTRACT.md existence (RISK), freshness relative to server.py (NOTE/WARNING), and reference document existence for all docs listed in the contract's reference table
list_architecture_docs MCP tool — inventory of all docs in docs/architecture/, with last-modified timestamps and freshness status relative to server.py mtime
review_architecture_doc MCP tool — applies the architecture review contract to a named architecture document, injecting current runtime state (tool count, safety classes, actual server mtime) so the model can detect stale counts, incorrect classifications, and undocumented behaviors
Cross-file semantic similarity: _build_rich_cross_file_context() pulls architecture role, top public symbols, and last review summary for every file that imports or is imported by the file under review. Injected into both single-pass and deep-mode review_file. Removes the file-isolation barrier.
Golden reviews for .md and .json file types — reviews/golden/system-overview-md-markdown-review.md (5 findings: stale tool count, incomplete router table, stale pipeline diagram, missing file responsibilities, static date pattern) and reviews/golden/tool-contracts-json-review.md (5 findings: version drift, Swedish descriptions, free-text resolver field, empty examples, undeclared side_effects vocabulary).

v1.2.0 — Architecture memory

Goal: durable, structured memory for design decisions and architectural intent — not just review findings.

The current review_engine/memory/review_history.json stores what the review engine found. Architecture memory stores why the system is designed as it is.

architecture_memory/ directory — structured ADR-style entries: decisions/ (ADR-001–005), rejected/ (REJ-001), boundaries/ (BND-001–002), philosophy/ (PHI-001–002). 8 seed entries covering path resolvers, no-auto-commit, safety classes, review contracts, secret handling, cognition ownership, execution vs orchestration, determinism, and context quality.
review_engine/architecture_memory.py — ArchitectureMemory class: list_all(), list_by_category(), get(id), relevant_for(file_path), format_context_block(), record(). Relevance matching by area keyword against file path; philosophy entries match all files.
list_architecture_decisions MCP tool — lists all entries with ID, status, category, title (Class A, read-only)
get_architecture_decision MCP tool — returns full text by ID (Class A)
record_architecture_decision MCP tool — writes a new ADR to architecture_memory/{category}/ (Class C, does not commit)
ADR injection in review_file — format_context_block() injects up to 3 relevant ADRs (decision body, capped at 300 chars each) at priority 1 in ContextSelector — highest priority, before past findings and cross-file context. Deep mode prepends ADRs to cross_file_ctx.
review_engine/convention_extractor.py — ConventionExtractor runs a single model call to extract generalizable coding conventions from review findings. Output format: CONVENTION / AREA / RATIONALE blocks, parsed into structured entries. Deduplicates against existing convention titles before writing.
extract_coding_conventions MCP tool — loads last review from ReviewMemory, runs ConventionExtractor, saves each convention to architecture_memory/decisions/ with status: convention. Conventions are immediately injected into future reviews of matching files via the existing ADR context mechanism (Class C).
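
The relevance rule described for ArchitectureMemory (area keyword matched against the file path, philosophy entries matching every file, up to 3 entries capped at 300 characters each) could be sketched as follows. The Entry fields and the output format are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    id: str
    category: str
    area: str  # keyword compared against the file path
    body: str


def relevant_for(entries: list, file_path: str, limit: int = 3) -> list:
    """Philosophy entries always match; others match when their area appears in the path."""
    path = file_path.lower()
    matches = [
        entry for entry in entries
        if entry.category == "philosophy" or (entry.area and entry.area.lower() in path)
    ]
    return matches[:limit]


def format_context_block(entries: list, max_chars: int = 300) -> str:
    """One line per entry, with the body capped at max_chars."""
    return "\n".join(f"[{entry.id}] {entry.body[:max_chars]}" for entry in entries)
```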

v1.3.0 — Orchestration boundary formalization

Goal: make the mq-agent / mq-mcp boundary explicit, machine-readable, and verifiable — not just documented in prose.

docs/ORCHESTRATION_CONTRACT.md — formal contract defining:
- what mq-agent is allowed to invoke
- what return shapes it can rely on
- what side effects it must never assume
- how context flows from mq-agent into mq-mcp and back
Document cross-repo input/output contracts:
- repo-signal exports repo intelligence packs
- mq-image-analyze exports visual analysis JSON
- mq-hal exports runtime and model health summaries
- mq-agent routes review/orchestration requests to mq-mcp
validate_orchestration_contract MCP tool — verifies that the current tool set satisfies the orchestration contract: all caller-visible tools are documented, no undeclared side effects, no missing error prefixes
Profile validation: verify that each profile in profiles/ restricts tool access to the minimum required for its declared use case
Semantic coupling audit: error prefix consistency checked; profile max-class violations found and corrected across 5 profiles
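
Profile validation against a maximum safety class might be expressed like this. The profile keys (max_class, tools), the example tool run_shell, and the assumption that classes are ordered from A (least privileged) to D (most privileged) are illustrative; the real files under profiles/ may be shaped differently.

```python
CLASS_ORDER = "ABCD"  # assumed ordering: A is least privileged, D is most


def profile_violations(profile: dict, tool_classes: dict) -> list:
    """Return the tools in a profile whose class exceeds the profile's declared maximum."""
    max_rank = CLASS_ORDER.index(profile["max_class"])
    return sorted(
        tool for tool in profile["tools"]
        if CLASS_ORDER.index(tool_classes[tool]) > max_rank
    )
```

An empty result means the profile stays within its declared ceiling; any returned tool name is a max-class violation to correct.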

v1.4.0 — Semantic memory layer

Goal: give the runtime a proper long-term knowledge store that is separate from architecture decisions (ADRs) and review history. Blueprint §8.

The distinction matters:

architecture_memory/  — decisions, boundaries, philosophy (structural)
review_engine/memory/ — per-file review history (operational)
semantic_memory/      — long-term reusable knowledge (semantic)

What semantic memory stores:

Summaries of README, ROADMAP, and key architecture docs
Contracts and review examples (indexed, not raw text)
Extracted conventions (already done via extract_coding_conventions)
Tool docs and safety notes
Cross-repo facts (e.g. "repo-signal outputs callgraph.json to disk")

What it does NOT store:

Entire raw repos
Generated build artifacts
Large binaries or noisy logs

Items:

semantic_memory/ directory + SemanticMemory class with store(key, content, tags), search(query, max=5), get(key), list()
store_semantic_memory MCP tool — writes a named knowledge item with tags for retrieval (Class C, writes to semantic_memory/)
search_semantic_memory MCP tool — keyword/tag search over stored items, returns ranked matches (Class A)
get_semantic_memory MCP tool — retrieves a specific item by key (Class A)
Bootstrap ingestion: index README, ROADMAP, RUNTIME_CONTRACT.md, ORCHESTRATION_CONTRACT.md, TOOL_SAFETY.md into semantic_memory lazily, on the first search rather than eagerly at server startup
Integration with review_file context: semantic memory injected at priority 0 (above ADRs) when a match is found for the file being reviewed
list_semantic_memory MCP tool — inventory of stored items (Class A)
Docs: update ORCHESTRATION_CONTRACT.md §3 declared side effects table
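
A keyword and tag search of the kind described for SemanticMemory can be sketched in a few lines. This version is in-memory only and uses an assumed scoring rule (tag hits count double); the real class writes to semantic_memory/ and may rank differently. The limit parameter stands in for the max argument named above, to avoid shadowing the Python builtin.

```python
class SemanticMemorySketch:
    """In-memory illustration of store / search / get / list."""

    def __init__(self):
        self._items = {}

    def store(self, key: str, content: str, tags: list) -> None:
        self._items[key] = {"content": content, "tags": {tag.lower() for tag in tags}}

    def get(self, key: str):
        item = self._items.get(key)
        return item["content"] if item else None

    def list(self) -> list:
        return sorted(self._items)

    def search(self, query: str, limit: int = 5) -> list:
        terms = set(query.lower().split())
        scored = []
        for key, item in self._items.items():
            words = set(item["content"].lower().split())
            # Assumed ranking: a tag match is worth two content-word matches.
            score = 2 * len(terms & item["tags"]) + len(terms & words)
            if score:
                scored.append((score, key))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [key for _, key in scored[:limit]]
```

Sorting ties by key keeps the result order deterministic for identical inputs, in line with the runtime's determinism goal.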

v1.5.0 — Risk analysis layer

Goal: go beyond doc review — give the runtime explicit risk and security reasoning modes. Blueprint §10.

The review engine already has severity levels (RISK, ARCHITECTURE, WARNING). This phase adds structured risk modes so callers can request targeted security or architecture analysis without running a full review.

Risk modes:

security  — subprocess safety, shell injection, env leakage, unsafe fs access,
            secret exposure, path traversal, MCP exposure surface
risk      — class D tools invoked without approval gates, missing contracts,
            undeclared side effects, stale safety docs
architecture — boundary violations, coupling, responsibility drift,
               cross-repo contract gaps

Items:

risk_review_file MCP tool — targeted risk pass on a single file with declared mode (security, risk, architecture). Returns findings using the fixed severity vocabulary (CRITICAL/RISK/WARNING). Class A.
risk_review_diff MCP tool — risk pass over current git diff. Same modes. Class A.
Risk contract in reviews/contracts/risk-review.md — defines what the security/risk/architecture passes look for and how to format findings
Security skill in reviews/skills/security-review.md — file-type-aware security patterns (Python subprocess, shell, env, path)
Severity engine update: add CRITICAL level above RISK for findings that represent immediate exploitable vulnerabilities
detect_security_patterns helper — grep-based pre-scan for known dangerous patterns (os.system, eval, exec, shell=True, hardcoded secrets) before API call; injects findings as context
Integration: review_file(mode="risk") routes through the risk contract rather than the standard comment contract
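
The grep-based pre-scan could look roughly like the sketch below. The function name prescan_dangerous_patterns and the regular expressions are assumptions for illustration, and hardcoded-secret detection is omitted because it needs a more careful rule set.

```python
import re

# Illustrative patterns only. The lookbehind skips method calls such as model.eval().
DANGEROUS_PATTERNS = {
    "os.system": re.compile(r"\bos\.system\s*\("),
    "eval": re.compile(r"(?<![\w.])eval\s*\("),
    "exec": re.compile(r"(?<![\w.])exec\s*\("),
    "shell=True": re.compile(r"\bshell\s*=\s*True\b"),
}


def prescan_dangerous_patterns(source: str) -> list:
    """Return (line_number, label) pairs for every pattern hit, in source order."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in DANGEROUS_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

Hits from a scan like this are cheap hints for the model pass, not findings in their own right: a line-based regex cannot tell a comment or string literal from live code.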

v1.6.0 — Generated artifacts + repo-signal merge

Goal: close the loop between repo-signal's intelligence output and mq-mcp's context builder. Blueprint §6.1, §3.3.

mq-mcp already has a merge hook for repo-signal packs in callgraph_builder._try_merge_repo_signal_packs(). This phase activates it fully by defining the on-disk format and adding the generated artifacts directory structure.

Expected repo-signal output files (when repo-signal writes them):

review_engine/context/repo_signal_callgraph.json  — merged into callgraph.json
review_engine/context/repo_signal_symbols.json    — merged into symbol index
review_engine/context/repo_signal_summary.json    — repo-level health summary

Generated artifacts directory:

generated/
├── symbols/          — symbol_index.json, per-file symbol exports
├── callgraphs/       — callgraph snapshots with timestamps
└── architecture/     — architecture_map.json, ownership_map.json

Items:

generated/ directory with .gitkeep and generated/.gitignore (exclude snapshots from version control)
build_repo_context extended: write architecture_map.json to generated/architecture/ in addition to callgraph.json
architecture_map.json schema: maps file path → role label, public symbols, last review timestamp, hub score
ownership_map.json schema: maps file path → author (from git blame), change frequency, last modified
export_symbol_index MCP tool — writes current callgraph symbols to generated/symbols/symbol_index.json in a format repo-signal can consume (Class C)
Activate _try_merge_repo_signal_packs(): once repo-signal publishes its packs, the merge hook auto-activates; document the expected file paths and schema in docs/ORCHESTRATION_CONTRACT.md §5
repo_signal_status MCP tool — reports whether repo-signal packs are present, their age, and whether they have been merged into the callgraph (Class A)
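
The presence and age part of a repo-signal status report can be sketched with the three pack file names listed above. The function name pack_status and the returned structure are hypothetical, and the "merged into the callgraph" flag is left out because its source is not specified in this roadmap.

```python
import time
from pathlib import Path

PACK_NAMES = (
    "repo_signal_callgraph.json",
    "repo_signal_symbols.json",
    "repo_signal_summary.json",
)


def pack_status(context_dir: str, now: float = None) -> dict:
    """Report, per expected pack file, whether it exists and how old it is."""
    now = time.time() if now is None else now
    status = {}
    for name in PACK_NAMES:
        path = Path(context_dir) / name
        if path.is_file():
            age = max(0.0, now - path.stat().st_mtime)
            status[name] = {"present": True, "age_seconds": age}
        else:
            status[name] = {"present": False, "age_seconds": None}
    return status
```

The function only reads file metadata, which is consistent with a Class A (read-only) tool.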

v1.10.0 — Learning Contract Layer

Goal: add a deterministic learning layer that captures verified engineering lessons from Codex, Claude, mq-agent, mq-hal and manual operator sessions. Learning should improve review context, semantic memory, runbooks and agent guidance without weakening mq-mcp safety boundaries.

This is a controlled memory/runtime layer, not a self-learning agent. It should capture:

What worked?
Why did it work?
How was it verified?
When should the same pattern be used again?

Planned structure

learn_engine/
├── __init__.py
├── models.py
├── store.py
├── redaction.py
├── summarize.py
├── promote.py
└── validators.py

schemas/
└── learning.schema.json

docs/
├── LEARNING_CONTRACT.md
└── LEARNING_MODEL.md

MCP tools

Class A — read-only:

Class C — controlled write:

Class C tools may write only within the learning, semantic memory, runbook, architecture memory, AGENTS.md, or CLAUDE.md promotion scope. They must not commit, push, mutate router policy, mutate safety classes, or approve tool calls.

Storage model

learn_engine/memory/learning_events.jsonl  — raw learning events
learn_engine/memory/lessons.json           — normalized lessons
semantic_memory/store.json                 — searchable promoted lessons

Learning must reuse the existing semantic memory layer for searchable knowledge instead of creating a competing memory system.

Safety contract

Learning may influence future review context, runbooks, summaries, and recommendations.

Learning must not:

execute commands
mutate router policy
mutate safety classes
mutate allowlists
approve tool calls
write AGENTS.md or CLAUDE.md without explicit confirmation
store secrets
store chain-of-thought

Promotion model

Promotion must default to dry-run and require explicit confirmation for writes:

mq-mcp learn promote <id> --to runbook --dry-run
mq-mcp learn promote <id> --to agents-md --dry-run
mq-mcp learn promote <id> --to claude-md --dry-run
mq-mcp learn promote <id> --to architecture-memory --dry-run

Allowed promotion targets:

docs/RUNBOOK.md
AGENTS.md
CLAUDE.md
architecture_memory/
semantic_memory/store.json
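
The dry-run default and the fixed target list could be combined in a small planning step that never writes by itself. This is a sketch under assumptions: plan_promotion, its return shape, and the "semantic-memory" target key are hypothetical (the commands above show only four --to values).

```python
# Target keys mirror the --to values above; "semantic-memory" is an assumed key.
ALLOWED_TARGETS = {
    "runbook": "docs/RUNBOOK.md",
    "agents-md": "AGENTS.md",
    "claude-md": "CLAUDE.md",
    "architecture-memory": "architecture_memory/",
    "semantic-memory": "semantic_memory/store.json",
}


def plan_promotion(lesson_id: str, target: str, confirm: bool = False) -> dict:
    """Describe a promotion without performing it; writes require confirm=True."""
    if target not in ALLOWED_TARGETS:
        raise ValueError(f"promotion target not allowed: {target!r}")
    return {
        "lesson_id": lesson_id,
        "path": ALLOWED_TARGETS[target],
        "dry_run": not confirm,
    }
```

Separating the plan from the write makes the default behaviour inspectable: a caller can show the plan to the operator and only then repeat the call with explicit confirmation.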

Non-goals

No self-training
No chain-of-thought storage
No hidden uploads
No autonomous tool loops
No safety policy mutation
No router or allowlist changes based on learned content

Definition of done

Long-term ideas

These are intentionally not scheduled yet.

Model routing strategy — three tiers matched to task depth:

Mode	Model	Use case
Fast	Local small (qwen3:4b, llama3)	Single-file comment review, quick checks
Deep	Local large (qwen3:14b, deepseek-coder)	Multi-pass review, architecture analysis
Architecture	Cloud (GPT-4, Claude Opus)	Cross-repo reasoning, design decisions

The fast tier enables offline-first review with no API cost. The routing decision should be made automatically based on file size, review mode, and available local models.
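
An automatic routing decision based on review mode, file size, and available local models might be sketched as below. The size threshold, the mode names, the function name, and the "unavailable" result are all assumptions; only the model names come from the table above.

```python
FAST_MODELS = ("qwen3:4b", "llama3")
DEEP_MODELS = ("qwen3:14b", "deepseek-coder")
FAST_MAX_CHARS = 20_000  # hypothetical cut-off between the fast and deep tiers


def choose_tier(mode: str, file_chars: int, local_models: set) -> tuple:
    """Return (tier, model); model is None when no local model is selected."""
    if mode == "architecture":
        return ("architecture", None)  # cloud tier; the provider is chosen elsewhere
    wants_deep = mode == "deep" or file_chars > FAST_MAX_CHARS
    preferred = DEEP_MODELS if wants_deep else FAST_MODELS
    for model in preferred:
        if model in local_models:
            return ("deep" if wants_deep else "fast", model)
    # No suitable local model: report it instead of silently using the cloud.
    return ("unavailable", None)
```

Returning "unavailable" rather than falling back to a cloud model keeps the offline-first property explicit: escalation becomes a caller decision, not a hidden default.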

Review TUI — terminal-native review surface showing severity history, cross-file graph, semantic context panel, and architecture role for the current file. Leverages callgraph.json and architecture_map.json as data sources.

Other ideas:

Bridget voice mode
Bridget terminal avatar mode
local event history
repo health history
MCP tool marketplace
integration with mq-ums
cross-repo tool inventory
visual safety map — runtime dependency graph, orchestration topology
generated architecture diagrams from architecture_memory
drift visualization: doc vs implementation divergence over time
demo videos or GIFs

Design principles

mq-mcp should remain:

local-first
explicit
safe by default
repo-aware
path-bounded
testable
observable
easy to validate
easy to disable
useful without hidden automation

The server should expose tools.

It should not become an unrestricted remote-control layer.

Safety principles

mq-mcp must never:

expose arbitrary filesystem access by default
run subprocess tools silently
ignore path boundaries
leak API keys
print secrets in logs
mutate repositories without explicit tool intent
hide dangerous behavior behind friendly names
treat AI-generated requests as automatically trusted

Every powerful tool must have:

a safety class
documented inputs
documented outputs
tests
error handling
explicit approval behavior when used by higher-level agents

Current recommended next step

Work on:

v1.4.0 — Semantic memory layer

The runtime is now stable, self-inspecting, architecture-memory-aware, and orchestration-boundary-aware.

The next leverage point is semantic memory: giving the runtime a durable knowledge layer separate from ADRs and per-file review history.

Immediate priorities:

Add semantic_memory/ and SemanticMemory class
Add store_semantic_memory, search_semantic_memory, get_semantic_memory, and list_semantic_memory MCP tools
Bootstrap README, ROADMAP, RUNTIME_CONTRACT.md, ORCHESTRATION_CONTRACT.md, and TOOL_SAFETY.md into semantic memory
Inject semantic memory into review_file at priority 0 when relevant
Update docs/ORCHESTRATION_CONTRACT.md side-effect table for the new tools

Keep validating releases with ./scripts/release-check.sh and only add new tool surface when safety metadata, tests, profiles, and docs move with it.

Runtime Truth + Safety Governance

Goal: evolve mq-mcp from a local MCP lab into a stable, verifiable, and safe local control plane for the MQ ecosystem.

The goal is not more tools first. The goal is better feedback between:

runtime → tools → safety metadata → docs → validation → release

When that chain is stable, the system can carry more integrations without creating drift between code, documentation, and actual behavior.

Why this matters now

The repo has grown quickly and now contains 76 MCP tools, a review engine, semantic memory, safety classes, an OpenAI bridge, profiles for multiple clients, and integration with mq-hal and repo-signal. The following signals can start to drift apart independently:

README status and version badge
VERSION, CHANGELOG, GitHub release
docs/stability.json
runtime tool count
docs/TOOL_SAFETY.md, TOOL_INDEX.md, actual MCP tool discovery

This is a system problem, not a series of isolated documentation errors.

Guiding principles

1.  Runtime is the truth.
2.  Documentation must be verified against runtime.
3.  Safety metadata must be machine-readable.
4.  New tools may not be added without a contract.
5.  Release may not happen if VERSION, README, CHANGELOG, and docs are out of sync.
6.  Semantic memory should be curated, not just accumulated.
7.  The review engine should audit system contracts, not just code style.
8.  Class C/D tools must always have explicit boundaries.
9.  Generated docs must be separated from handwritten analysis.
10. mq-mcp should be local-first, explicit, and verifiable.

Phase 1 — Stop version and documentation drift

Goal: get all public signals to say the same thing.

Tasks

Verify that VERSION matches the intended current release.
Update the README version badge.
Update the README status line.
Verify that CHANGELOG.md has an entry for the current version.
Verify that docs/stability.json matches the current version.
Verify that the GitHub release/tag matches the current version.
Fix any CI failure before the next release.
Remove or ignore cache directories that should not be version-controlled, e.g. .mypy_cache.

Definition of done

git status is clean after changes.
./scripts/validate.sh passes.
README, VERSION, CHANGELOG, and release status are in sync.
The repo shows a consistent version externally and internally.

Phase 2 — Runtime Truth Gate

Goal: build a check that blocks release when the repo describes itself incorrectly.

New files

scripts/check-runtime-truth.sh
tests/test_runtime_truth.py

Checks

VERSION exists and is semver-compatible.
README and README badge contain the same version as VERSION.
CHANGELOG.md and docs/stability.json contain the same version.
README tool count matches actual runtime discovery.
All runtime tools are present in docs/TOOL_SAFETY.md.
All tools in docs/TOOL_SAFETY.md exist in runtime.
All Class C/D tools have explicit safety metadata.

The script must emit clear error messages, for example:

MQ_MCP_RUNTIME_TRUTH_ERROR: VERSION mismatch between VERSION and README
MQ_MCP_RUNTIME_TRUTH_ERROR: tool count mismatch between README and runtime
MQ_MCP_RUNTIME_TRUTH_ERROR: tool missing from docs/TOOL_SAFETY.md
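
The core of such a check could look like the sketch below, which might back tests/test_runtime_truth.py. It is an illustration under assumptions: `check_runtime_truth` is a hypothetical function, the semver pattern is simplified, the README check is a plain substring match, and the "missing from runtime" message is invented to mirror the documented ones.

```python
import re

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")  # simplified: no pre-release or build tags
PREFIX = "MQ_MCP_RUNTIME_TRUTH_ERROR"


def check_runtime_truth(version: str, readme: str,
                        runtime_tools: set[str],
                        documented_tools: set[str]) -> list[str]:
    """Return one error line per detected drift; an empty list means the gate passes."""
    errors = []
    version = version.strip()
    if not SEMVER.match(version):
        errors.append(f"{PREFIX}: VERSION is not semver-compatible: {version!r}")
    elif version not in readme:
        errors.append(f"{PREFIX}: VERSION mismatch between VERSION and README")
    # Drift is checked in both directions: runtime vs docs and docs vs runtime.
    for name in sorted(runtime_tools - documented_tools):
        errors.append(f"{PREFIX}: tool missing from docs/TOOL_SAFETY.md: {name}")
    for name in sorted(documented_tools - runtime_tools):
        errors.append(f"{PREFIX}: documented tool missing from runtime: {name}")
    return errors
```

A caller would read VERSION and README.md from disk, collect tool names from runtime discovery and from docs/TOOL_SAFETY.md, print each returned line, and exit non-zero when the list is not empty.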

Definition of done

scripts/check-runtime-truth.sh is called by scripts/validate.sh.
CI fails if version, tool count, or safety docs drift apart.
Error messages are clear enough to locate and fix drift quickly.

Phase 3 — Tool Registry

Goal: make tool metadata a first-class part of the runtime.

New file: mq-mcp/tool_registry.py

Each tool must declare:

{
    "name": "read_repo_file",
    "category": "repo",
    "safety_class": "A",
    "read_only": True,
    "writes_files": False,
    "uses_subprocess": False,
    "uses_network": False,
    "requires_api_key": False,
    "resolver": "resolve_repo_file",
    "description": "Reads a file inside the repository root",
}
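
A registry could reject incomplete declarations with a check like the following. This is a sketch under assumptions: the field names and types are taken from the example declaration above, while `validate_declaration` and `export_index` are hypothetical helpers, not the contents of tool_registry.py.

```python
import json

# Field names and types follow the example declaration above.
REQUIRED_FIELDS = {
    "name": str, "category": str, "safety_class": str,
    "read_only": bool, "writes_files": bool, "uses_subprocess": bool,
    "uses_network": bool, "requires_api_key": bool,
    "resolver": str, "description": str,
}


def validate_declaration(tool: dict) -> list[str]:
    """Return a list of problems; an empty list means the declaration is complete."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in tool:
            problems.append(f"missing field: {field}")
        elif not isinstance(tool[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    if "safety_class" in tool and tool["safety_class"] not in {"A", "B", "C", "D"}:
        problems.append("safety_class must be one of A, B, C, D")
    return problems


def export_index(tools: list[dict]) -> str:
    """Serialize declarations as stable JSON (sorted by name, sorted keys)."""
    return json.dumps(sorted(tools, key=lambda t: t["name"]), indent=2, sort_keys=True)
```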

New outputs

generated/tool-index.json
generated/tool-safety.json
generated/runtime-contract.json

New commands

mq-mcp tools --json
mq-mcp tools --safety
mq-mcp tools --markdown

Definition of done

Tool metadata can be exported as JSON.
Tool safety can be exported in machine-readable form.
TOOL_INDEX.md can be generated or validated from runtime.
README no longer needs to be the sole source of tool count.

Phase 4 — Safety Contract Enforcement

Goal: make the safety model stricter and testable.

New files

scripts/check-tool-contracts.sh
tests/test_tool_contracts.py
tests/test_safety_classes.py

Class A — repo-scoped read-only

may only read repo-scoped files/data
may not write, run subprocess, or open apps
does not require an API key

Class B — external/system read-only

may read system status or external read-only data
may not write files or change system state
external access must be documented

Class C — controlled write

may write only within a clearly defined scope
may not commit automatically
must return the modified path and document rollback or limitation
must have a test for path safety

Class D — subprocess/open-app/system effect

must be explicit and document the system effect
must have a clear command boundary
should be avoided in automated workflows
must be identifiable in tool metadata
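
Some of these class rules can be checked mechanically against the metadata flags from the Phase 3 declaration. The sketch below covers only Class A and Class B, because the Class C and Class D rules (path-safety tests, documented system effects) need information beyond those flags. `check_safety_class` is a hypothetical helper.

```python
def check_safety_class(tool: dict) -> list[str]:
    """Compare a tool declaration's flags with the rules of its safety class."""
    cls = tool["safety_class"]
    violations = []
    if cls in ("A", "B"):
        # Both read-only classes forbid writing files.
        if tool["writes_files"]:
            violations.append(f"Class {cls} tool may not write files")
        if not tool["read_only"]:
            violations.append(f"Class {cls} tool must be read-only")
    if cls == "A":
        if tool["uses_subprocess"]:
            violations.append("Class A tool may not run subprocesses")
        if tool["requires_api_key"]:
            violations.append("Class A tool may not require an API key")
    return violations
```

A test in tests/test_safety_classes.py could run this over every registered tool and fail when any list is non-empty.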

Definition of done

A new tool without complete metadata causes validation to fail.
Class C/D tools are easy to locate.
docs/TOOL_SAFETY.md can be checked against runtime.

Phase 5 — Review Engine Contracts

Goal: make the review engine a system auditor, not just a code reviewer.

New contract files

review_engine/contracts/runtime_truth.md
review_engine/contracts/safety_contract.md
review_engine/contracts/release_readiness.md
review_engine/contracts/memory_hygiene.md
review_engine/contracts/orchestration_boundary.md

New review modes: review_runtime_truth, review_safety_contract, review_release_readiness, review_memory_hygiene, review_orchestration_boundary

The review engine must detect

version drift, tool count drift, missing safety class
docs/runtime mismatch, stale architecture docs, stale semantic memory
unclear Class C/D boundaries, release blockers
skill/docs mismatch, orchestration boundary violations

Definition of done

review_repo can flag system drift.
review_diff can detect when a new tool is missing safety metadata.
Review results can be fed into semantic memory without creating noise.

Phase 6 — Semantic Memory Hygiene

Goal: semantic memory should be curated knowledge, not just accumulated text.

New files

semantic_memory/POLICY.md
semantic_memory/schema.json
scripts/check-semantic-memory.sh
tests/test_semantic_memory_policy.py

Memory item schema

{
  "key": "mq-mcp.tool-safety-model",
  "type": "fact | decision | convention | summary | warning",
  "source": "README.md",
  "version": "1.9.0",
  "tags": ["safety", "tools"],
  "created_at": "2026-05-29",
  "updated_at": "2026-05-29",
  "confidence": "high",
  "content": "..."
}

Policy must define: what may/may not be stored, how old entries are marked, how conflicts and replacements are handled, how sources are cited, how facts are distinguished from interpretation, how bootstrap may be used, and how stale memory is detected.

New command: mq-mcp memory audit — shows stale, duplicate, conflicting, and sourceless items.
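
Part of such an audit can be sketched over items that follow the schema above. The staleness rule here (an `updated_at` older than a fixed number of days) is an assumption, since the policy that defines staleness is not written yet; conflict detection is left out because it needs semantic comparison. `audit_memory` is a hypothetical name.

```python
from collections import Counter
from datetime import date


def audit_memory(items: list[dict], today: date,
                 max_age_days: int = 180) -> dict[str, list[str]]:
    """Group item keys by finding: sourceless, duplicate, or stale."""
    report = {"sourceless": [], "duplicate": [], "stale": []}
    # A key that appears more than once is reported once as a duplicate.
    counts = Counter(item["key"] for item in items)
    report["duplicate"] = sorted(k for k, n in counts.items() if n > 1)
    for item in items:
        if not item.get("source"):
            report["sourceless"].append(item["key"])
        age_days = (today - date.fromisoformat(item["updated_at"])).days
        if age_days > max_age_days:
            report["stale"].append(item["key"])
    return report
```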

Definition of done

Semantic memory can be audited.
Bootstrap does not overwrite valuable ADRs without a policy rule.
The review engine can use memory without mixing old and new truth.

Phase 7 — Orchestration Boundary

Goal: clarify exactly what mq-mcp does compared to other MQ repos.

Role division

Repo	Role
mq-mcp	local tool surface, safety, bridge, memory, review
mq-agent	planner, orchestrator, routing, and agent flows
mq-hal	system status, reports, and environment analysis
repo-signal	repo health, publish readiness, and scoring
mq-image-analyze	visual perception, screenshots, diagrams, and image reasoning

Files to update: docs/orchestration-boundary.md, docs/integration.md, README.md, profiles/

README must answer: when is each repo used, which tools may run automatically, which require an explicit human decision, and where the boundary between orchestration and execution lies.

Definition of done

A new user understands what mq-mcp is.
An agent can decide when to use mq-mcp.
Class C/D tools are clearly separated from read-only flows.

Phase 8 — Release Gate v2

Goal: make release a system test, not just a version bump.

Files to update: release.sh, scripts/release-check.sh, scripts/validate.sh

The release gate must run check-runtime-truth.sh, check-tool-contracts.sh, check-semantic-memory.sh, and validate.sh.

Release must be blocked if any of the following are true:

version drift
wrong README badge
CHANGELOG missing the version
stale docs/stability.json
wrong tool count
safety docs missing a tool
runtime missing a documented tool
absent Class C/D metadata
corrupt semantic memory
out-of-sync generated artifacts
red CI
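
The gate's control flow can be sketched as follows. The script paths are the ones named in this roadmap; `release_gate` and `run_script` are hypothetical names, and the runner is injectable so the logic can be tested without executing any script.

```python
import subprocess
from typing import Callable

GATE_CHECKS = [
    "scripts/check-runtime-truth.sh",
    "scripts/check-tool-contracts.sh",
    "scripts/check-semantic-memory.sh",
    "scripts/validate.sh",
]


def run_script(path: str) -> bool:
    """Run one check script; True means it exited with code 0."""
    return subprocess.run(["bash", path], check=False).returncode == 0


def release_gate(run: Callable[[str], bool] = run_script) -> tuple[bool, list[str]]:
    """Run every check (no early exit) and return (release_allowed, failed_checks)."""
    failed = [check for check in GATE_CHECKS if not run(check)]
    return (not failed, failed)
```

Running all checks instead of stopping at the first failure lets the release output list every blocker in one pass, which fits the requirement that the output clearly shows what was verified.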

Definition of done

The release process catches system drift before tagging.
Release output clearly shows what was verified.
Release can be run with --dry-run.

Phase 9 — Generated Docs Discipline

Goal: reduce manual documentation drift by separating what is generated from what is handwritten.

generated docs = what the system actually exposes
handwritten docs = why the system is designed that way

Generated: generated/tool-index.json, generated/tool-safety.json, generated/runtime-contract.json, generated/release-state.json, generated/profile-index.json

Handwritten: README.md, ROADMAP.md, SAFETY_MODEL.md, docs/security.md, docs/integration.md, docs/orchestration-boundary.md

Definition of done

Generated artifacts can be reproduced deterministically.
Validation fails if generated artifacts are out of sync.
README uses summaries rather than duplicating tool truth.
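
Deterministic generation and the sync check can be illustrated with a small sketch. The helper names are hypothetical; the approach assumed here is to render JSON with sorted keys and fixed indentation, then compare the file on disk with a fresh render.

```python
import json
from pathlib import Path


def render_artifact(data: dict) -> str:
    """Serialize with sorted keys and fixed indentation so output is stable."""
    return json.dumps(data, indent=2, sort_keys=True) + "\n"


def is_in_sync(path: Path, data: dict) -> bool:
    """True when the file on disk equals a fresh render of the runtime data."""
    return path.exists() and path.read_text(encoding="utf-8") == render_artifact(data)
```

Validation would call `is_in_sync` for each file under generated/ and fail when any call returns False. Timestamps and other run-dependent values would have to stay out of the rendered data for the comparison to hold.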

Priorities

Do first: fix CI failure → sync VERSION/README/CHANGELOG/GitHub release → add scripts/check-runtime-truth.sh → wire it into scripts/validate.sh → verify tool count and docs/TOOL_SAFETY.md against runtime.

Do next: introduce tool_registry.py → generate tool-index from registry → add check-tool-contracts.sh → add semantic_memory/POLICY.md → add review contracts for runtime, safety, and release → clarify the orchestration boundary.

Defer: more macOS automation tools, more write-capable tools, daemonization, auto-execution, more external integrations, more voice/persona layers, more Class D tools.

This repo does not primarily need more power right now. It needs better feedback. The most important chain is:

runtime → registry → generated docs → safety validation → release gate
