Runtime Contract
This document defines what mq-mcp is, what it guarantees, and what it explicitly does not do. It is the authoritative contract for tool implementers, MCP clients, and upstream orchestrators such as mq-agent.
mq-mcp is a local deterministic AI execution runtime for engineering workflows.
It is not a chatbot, agent framework, or autonomous system. It is not a cloud service or persistent daemon. It does not take actions without explicit tool invocation.
It exposes a controlled, documented, and testable MCP surface where every tool has a declared safety class, path boundary, and predictable output shape.
- mq-agent: orchestration layer (consumes mq-mcp as runtime)
- mq-mcp: execution runtime (this repo)
- repo-signal: repo intelligence (read-only analysis)
- mq-hal: operational UX
- mq-image-analyze: vision layer (invoked as a tool)
mq-mcp is the execution layer. It does not orchestrate across repos or across sessions. That is mq-agent's responsibility.
Every capability is exposed as a named MCP tool with declared inputs and outputs. Tools are invoked explicitly — never triggered automatically or in the background.
Tool outputs are strings. They are deterministic given the same inputs and environment state. Tools that call external APIs (OpenAI) are not deterministic in content but are deterministic in structure (format contract is enforced by the review engine).
Tools return error strings rather than raising exceptions. Error strings begin
with the tool name, for example `review_file blocked: ...` or `get_last_review failed: ...`.
Callers must check for error prefixes, not exception types.
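A caller-side check for the error prefix might look like the sketch below. The helper name `tool_error` is hypothetical, and the pattern assumes that `blocked` and `failed` are the only status words, which is all the two examples above show.

```python
import re
from typing import Optional

# Assumption: error strings look like "<tool_name> blocked: ..." or
# "<tool_name> failed: ...", as in the two examples above. The real
# runtime may use other status words.
_ERROR_RE = re.compile(r"^(?P<tool>[A-Za-z_][A-Za-z0-9_]*) (?P<kind>blocked|failed): ")


def tool_error(tool_name: str, result: str) -> Optional[str]:
    """Return "blocked" or "failed" if result is an error from tool_name, else None."""
    match = _ERROR_RE.match(result)
    if match and match.group("tool") == tool_name:
        return match.group("kind")
    return None


result = "review_file blocked: path is outside REPO_ROOT"
kind = tool_error("review_file", result)
if kind is not None:
    print(f"review_file returned an error ({kind}): {result}")
```

Matching on the tool name as well as the status word keeps a review body that happens to quote another tool's error from being misread as a failure.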
Every tool declares its side effects in docs/TOOL_SAFETY.md and
docs/tool_contracts.json. A tool with "write": false and "subprocess": false
has no observable side effects outside its return value.
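As a rough illustration, a client could filter for side-effect-free tools as follows. The JSON layout and the flag values are hypothetical: only the `"write"` and `"subprocess"` keys come from the text above, and the real schema of `docs/tool_contracts.json` is not shown in this document.

```python
import json

# Hypothetical excerpt. Only the "write" and "subprocess" keys are taken from
# the text above; the layout and the per-tool values are illustrative.
CONTRACTS_JSON = """
{
  "read_repo_file": {"write": false, "subprocess": false},
  "update_repo_file": {"write": true, "subprocess": false},
  "run_tests": {"write": false, "subprocess": true}
}
"""


def is_side_effect_free(contract: dict) -> bool:
    """True only when both flags are present and explicitly false."""
    # A missing flag is treated as unsafe, so an incomplete entry never passes.
    return contract.get("write") is False and contract.get("subprocess") is False


contracts = json.loads(CONTRACTS_JSON)
safe = sorted(name for name, c in contracts.items() if is_side_effect_free(c))
print(safe)
```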
| Class | Capability | Examples |
|---|---|---|
| A | Read-only, repo-scoped | `read_repo_file`, `review_file`, `git_status` |
| B | Read-only, system or allowed external paths | `get_system_resources`, `get_clipboard` |
| C | Write-capable, controlled scope | `update_repo_file`, `edit_image`, `build_repo_context`, `export_symbol_index` |
| D | Subprocess / open-app | `open_in_app`, `run_tests`, `hal_repo_report` |
Two resolvers enforce all filesystem access:
| Resolver | Scope |
|---|---|
| `resolve_repo_file` | Files inside `REPO_ROOT` only |
| `resolve_allowed_local_file` | `REPO_ROOT` + `MQ_MCP_ALLOWED_PATHS` roots |
No tool may access a path outside these boundaries. Path traversal (../),
absolute paths to unlisted roots, and symlink escapes are all rejected.
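The containment rule can be sketched with `pathlib` as below. This is not the actual `resolve_repo_file` implementation: `resolve_within` is a hypothetical helper, and it raises internally on the assumption that the calling tool converts the failure into an error string.

```python
import tempfile
from pathlib import Path


def resolve_within(root: Path, user_path: str) -> Path:
    """Resolve user_path against root and reject anything that lands outside it."""
    root = root.resolve()
    # Joining with an absolute user_path discards root, and resolve() collapses
    # ".." and follows symlinks, so one containment check covers all three cases.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes root: {user_path}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "repo"
    (repo / "src").mkdir(parents=True)
    print("accepted:", resolve_within(repo, "src/../src").name)
    for bad in ("../outside.txt", "/etc/passwd"):
        try:
            resolve_within(repo, bad)
        except ValueError:
            print("rejected:", bad)
```

Checking the fully resolved path, instead of scanning the input for `..`, is what makes the symlink case fall out of the same test.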
Write-capable tools (Class C) enforce additional constraints:
- `update_repo_file` requires an exact string match before writing, never commits, and rejects `.env`, `uv.lock`, `.git`, `.venv`, and non-text suffixes.
- `edit_image` writes only within `resolve_allowed_local_file` scope.
- No write tool auto-commits, auto-pushes, or chains into further writes.
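The write guard for `update_repo_file` could be approximated as follows. Several details are assumptions and not taken from the source: the text-suffix allow-list, the helper names, and the reading of "exact string match" as "the old text occurs exactly once".

```python
from pathlib import PurePosixPath
from typing import Optional

# Names taken from the rule above; how the real tool matches them is an assumption.
PROTECTED_NAMES = {".env", "uv.lock", ".git", ".venv"}
# Hypothetical allow-list: the document does not say which suffixes count as text.
TEXT_SUFFIXES = {".py", ".md", ".json", ".toml", ".txt"}


def check_writable(rel_path: str) -> Optional[str]:
    """Return a reason string if the path must not be written, else None."""
    path = PurePosixPath(rel_path)
    if any(part in PROTECTED_NAMES for part in path.parts):
        return "protected path"
    if path.suffix not in TEXT_SUFFIXES:
        return "non-text suffix"
    return None


def apply_exact_edit(text: str, old: str, new: str) -> str:
    """Replace old with new only when old occurs exactly once in text."""
    # Assumption: an ambiguous or missing match is refused, never guessed at.
    if not old or text.count(old) != 1:
        raise ValueError("old text must match exactly once")
    return text.replace(old, new)


print(check_writable(".git/config"))
print(check_writable("src/server.py"))
print(apply_exact_edit("a = 1\nb = 2\n", "b = 2", "b = 3"))
```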
Class D tools invoke a fixed, declared command. They do not accept arbitrary shell input. Timeouts are enforced (60 seconds for validation tools).
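A minimal sketch of a fixed-command invocation with an enforced timeout is shown below. The command itself is a stand-in (it runs the current Python interpreter), and the function name is hypothetical; the real Class D tools declare their own commands.

```python
import subprocess
import sys

# Hypothetical fixed command. It is a constant argument list and is never
# built from caller input.
FIXED_COMMAND = [sys.executable, "-c", "print('checks passed')"]
TIMEOUT_SECONDS = 60  # the validation-tool timeout stated above


def run_fixed_command() -> str:
    """Run the declared command and return a string, including on failure."""
    try:
        completed = subprocess.run(
            FIXED_COMMAND,
            capture_output=True,
            text=True,
            timeout=TIMEOUT_SECONDS,
            shell=False,  # no shell, so no string is interpreted as shell input
        )
    except subprocess.TimeoutExpired:
        return f"run_fixed_command failed: timed out after {TIMEOUT_SECONDS}s"
    if completed.returncode != 0:
        return f"run_fixed_command failed: exit code {completed.returncode}"
    return completed.stdout.strip()


print(run_fixed_command())
```

Passing an argument list with `shell=False` means there is no shell in the path at all, which is a stronger property than escaping caller input.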
The review engine provides four layers of context to every review:
| Layer | Source | Purpose |
|---|---|---|
| Architecture role | `review_engine/context/architecture_map.json` | What the file is in the system |
| File symbol index | `review_engine/context/file_summary_index.json` | Public API surface of the file |
| Review skill | `reviews/skills/` routed by extension and path | File-type-specific guidance |
| Past review context | `review_engine/memory/review_history.json` | Incremental quality, avoid re-flagging |
Context artifacts are generated by build_repo_context() and cached locally.
They are read-only inputs to the review pipeline — the pipeline never writes to them.
Rebuild context when: server.py is modified, new files are added, or
detect_architecture_drift() reports the map is stale.
build_repo_context() also writes two richer files to generated/architecture/:
| Artifact | Schema | Contents |
|---|---|---|
| `generated/architecture/architecture_map.json` | `architecture_map.v1` | role, public_symbols, last_review_timestamp, hub_score per file |
| `generated/architecture/ownership_map.json` | `ownership_map.v1` | author, change_frequency, last_modified per file (git log) |
These files are excluded from version control by generated/.gitignore.
They are consumed by downstream tools and callers — not by the review pipeline directly.
A review contract is a markdown file in reviews/contracts/ that defines:
- allowed severity labels
- required output format (`[SEVERITY] file:line\nbody`)
- max findings per review
- scope rules (what to flag, what to skip)
- uncertainty handling
The model is bound by the contract. The contract is not optional guidance — it is injected as the system prompt and enforced by the severity engine parser.
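To make the enforcement step concrete, here is a small parser for the `[SEVERITY] file:line\nbody` format that filters on a contract's allowed labels. It is a sketch, not the actual severity engine: the names, the `max_findings` value, and the choice to drop findings with disallowed labels are assumptions. The label set used in the example is the one listed for `comment` mode.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    severity: str
    file: str
    line: int
    body: str


# Header line of the documented format: "[SEVERITY] file:line".
HEADER_RE = re.compile(
    r"^\[(?P<severity>[A-Z]+)\] (?P<file>.+):(?P<line>\d+)[ \t]*$", re.MULTILINE
)


def parse_findings(text: str, allowed: set, max_findings: int) -> list:
    """Parse model output, keeping only findings the contract allows."""
    headers = list(HEADER_RE.finditer(text))
    findings = []
    for i, header in enumerate(headers):
        # The body runs from the end of this header to the start of the next one.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[header.end():end].strip()
        # Assumption: a label outside the contract is dropped, not remapped.
        if header["severity"] in allowed:
            findings.append(
                Finding(header["severity"], header["file"], int(header["line"]), body)
            )
    return findings[:max_findings]


COMMENT_LABELS = {"NOTE", "SUGGESTION", "WARNING", "MISSING"}
raw = (
    "[WARNING] server.py:42\nDocstring omits the timeout.\n"
    "[CRITICAL] server.py:7\nNot a comment-mode label.\n"
)
findings = parse_findings(raw, COMMENT_LABELS, max_findings=10)
print(findings)
```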
Available contracts:
| Mode | Contract | Severity labels |
|---|---|---|
| `comment` | `comment-review.md` | NOTE, SUGGESTION, WARNING, MISSING |
| `architecture` | `architecture-review.md` | NOTE, SUGGESTION, WARNING, ARCHITECTURE, RISK |
| `security` | `security-review.md` | NOTE, WARNING, RISK |
| `risk` | `risk-review.md` | NOTE, WARNING, RISK, CRITICAL |
risk_review_file and risk_review_diff also run a grep-based pre-scan
(_detect_security_patterns) before the API call and inject the results as
context. The security skill (reviews/skills/security-review.md) is injected
for security and risk modes regardless of file type.
CRITICAL > RISK > ARCHITECTURE > WARNING > MISSING > SUGGESTION > NOTE
Findings are sorted by severity then line number. Blocking findings
(CRITICAL, WARNING, RISK, ARCHITECTURE) are identified by has_blocking_findings().
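The ordering and the blocking check can be expressed in a few lines. This sketch assumes findings are `(severity, line, body)` tuples and that the most severe findings sort first; `has_blocking` is a stand-in for `has_blocking_findings()`, whose real signature is not given here.

```python
# Severity order and blocking set as stated above.
SEVERITY_ORDER = ["CRITICAL", "RISK", "ARCHITECTURE", "WARNING", "MISSING", "SUGGESTION", "NOTE"]
RANK = {label: index for index, label in enumerate(SEVERITY_ORDER)}
BLOCKING = {"CRITICAL", "WARNING", "RISK", "ARCHITECTURE"}


def sort_findings(findings: list) -> list:
    """Sort by severity (most severe first, an assumption), then by line number."""
    return sorted(findings, key=lambda f: (RANK[f[0]], f[1]))


def has_blocking(findings: list) -> bool:
    """True if any finding carries a blocking severity."""
    return any(severity in BLOCKING for severity, _line, _body in findings)


findings = [
    ("NOTE", 3, "Consider a module docstring."),
    ("WARNING", 40, "Timeout is not documented."),
    ("CRITICAL", 12, "Path is not resolved before use."),
    ("WARNING", 8, "Return shape differs from the contract."),
]
ordered = sort_findings(findings)
print([(severity, line) for severity, line, _ in ordered])
print(has_blocking(findings))
```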
review_file(deep=True) runs three sequential passes:
| Pass | Purpose | Tokens |
|---|---|---|
| 1 — Structure | Compact structural analysis (responsibility, patterns, hotspots) | ≤ 400 |
| 2 — Review | Contract-driven review enriched with structure context | ≤ 2048 |
| 3 — Consistency | Doc vs runtime divergence check | ≤ 1024 |
Pass 4 runs locally after the three model passes (no API call): it deduplicates findings, keeping the highest severity per location and dropping near-duplicate bodies.
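One way to implement such a deduplication pass is sketched below. The similarity measure (`difflib.SequenceMatcher`) and the 0.9 threshold are assumptions chosen for illustration; the document does not say how "near-duplicate" is defined.

```python
from difflib import SequenceMatcher

SEVERITY_ORDER = ["CRITICAL", "RISK", "ARCHITECTURE", "WARNING", "MISSING", "SUGGESTION", "NOTE"]
RANK = {label: index for index, label in enumerate(SEVERITY_ORDER)}
SIMILARITY_THRESHOLD = 0.9  # hypothetical cut-off for "near-duplicate"


def dedupe(findings: list) -> list:
    """Keep the highest severity per (file, line), then drop near-duplicate bodies."""
    best = {}
    for finding in findings:
        key = (finding["file"], finding["line"])
        current = best.get(key)
        if current is None or RANK[finding["severity"]] < RANK[current["severity"]]:
            best[key] = finding
    kept = []
    for finding in best.values():
        is_duplicate = any(
            SequenceMatcher(None, finding["body"], other["body"]).ratio() >= SIMILARITY_THRESHOLD
            for other in kept
        )
        if not is_duplicate:
            kept.append(finding)
    return kept


findings = [
    {"severity": "NOTE", "file": "a.py", "line": 3, "body": "Unused import os."},
    {"severity": "WARNING", "file": "a.py", "line": 3, "body": "Import os is never used."},
    {"severity": "NOTE", "file": "a.py", "line": 9, "body": "Import os is never used."},
]
deduped = dedupe(findings)
print(deduped)
```

Because the pass uses only the findings themselves and a fixed threshold, the same input always yields the same output.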
Three separate memory stores serve different purposes:
Review memory is stored in `review_engine/memory/review_history.json`.
| Property | Value |
|---|---|
| Storage format | JSON, local file |
| Max entries per file | 10 (oldest discarded) |
| Context injected into future reviews | Top 5 findings from last review (priority 2) |
| Indexed by | Repo-relative file path |
Append-only up to the per-file cap; entries are never purged automatically beyond that cap. Never sent outside the local machine. Stores findings, severity distribution, mode, model, skill, and timestamp, but not file content.
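The retention rule and the context selection can be sketched as follows. The entry shape and function names are hypothetical, and the sketch assumes each entry's findings are already stored in priority order so that "top 5" is a simple slice.

```python
MAX_ENTRIES_PER_FILE = 10  # from the table above
TOP_FINDINGS = 5           # from the table above


def record_review(history: dict, rel_path: str, entry: dict) -> None:
    """Append an entry for rel_path and discard the oldest beyond the cap."""
    entries = history.setdefault(rel_path, [])
    entries.append(entry)
    # Deletes everything except the newest MAX_ENTRIES_PER_FILE entries;
    # the slice is empty while the list is still within the cap.
    del entries[:-MAX_ENTRIES_PER_FILE]


def latest_context(history: dict, rel_path: str) -> list:
    """Top findings from the most recent review of rel_path, if any."""
    entries = history.get(rel_path, [])
    if not entries:
        return []
    return entries[-1]["findings"][:TOP_FINDINGS]


history = {}
for run in range(12):
    record_review(
        history,
        "server.py",
        {
            "timestamp": f"run-{run}",  # hypothetical entry shape
            "mode": "comment",
            "findings": [f"finding {run}.{n}" for n in range(7)],
        },
    )
context = latest_context(history, "server.py")
print(len(history["server.py"]), history["server.py"][0]["timestamp"], context)
```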
Architecture memory is stored in `architecture_memory/` (`decisions/`, `boundaries/`, `philosophy/`, `rejected/`).
| Property | Value |
|---|---|
| Format | Markdown files with YAML frontmatter |
| Context injected | Up to 3 relevant ADRs per review (priority 1) |
| Managed by | `record_architecture_decision`, `extract_coding_conventions` |
Semantic memory is stored in `semantic_memory/store.json`.
| Property | Value |
|---|---|
| Format | JSON key-value store with tags and timestamps |
| Context injected | Up to 2 matches per review (priority 0, highest) |
| Managed by | `store_semantic_memory`, `bootstrap_semantic_memory` |
| Bootstrapped from | README, ROADMAP, RUNTIME_CONTRACT, ORCHESTRATION_CONTRACT, TOOL_SAFETY |
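Taken together, the three stores contribute context with per-store caps and a priority order. The sketch below assumes that priority decides the order in which snippets are placed, and that each store has already ranked its own candidates by relevance; the store keys and function name are hypothetical.

```python
# (priority, store, cap) as given in the three tables; lower number is higher priority.
SOURCES = [
    (2, "review_history", 5),
    (1, "architecture_decisions", 3),
    (0, "semantic_memory", 2),
]


def assemble_context(candidates: dict) -> list:
    """Concatenate capped snippets from each store in priority order."""
    ordered = []
    for _priority, store, cap in sorted(SOURCES):
        # Assumption: each candidate list is already ranked by relevance.
        ordered.extend(candidates.get(store, [])[:cap])
    return ordered


candidates = {
    "review_history": [f"finding-{n}" for n in range(8)],
    "architecture_decisions": [f"adr-{n}" for n in range(4)],
    "semantic_memory": [f"memory-{n}" for n in range(3)],
}
context = assemble_context(candidates)
print(context)
```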
What mq-mcp does:
- Exposes deterministic tools over the MCP protocol
- Enforces path boundaries on every file access
- Loads and applies review contracts and skills
- Persists review history locally
- Detects documentation drift via `detect_architecture_drift()`
- Reviews changed files via `review_diff()`
- Prioritizes stale files for review via `review_repo()`
What mq-mcp does not do:
- Orchestrate multi-repo workflows → mq-agent
- Decide what to review or when → caller's responsibility
- Chain tool calls autonomously → no internal tool-calling loops
- Store data in the cloud → all state is local
- Authenticate or authorize callers → MCP client is responsible
- Take actions after a session ends → no background processes
When mq-agent invokes an mq-mcp tool it can assume:
- The tool either succeeds or returns an error string with the tool name as prefix.
- Path access is bounded — no tool can escape its declared resolver.
- Write tools do not commit or push.
- Class A and B tools have no persistent side effects.
- Tool contracts in `docs/tool_contracts.json` are accurate (verified by CI).
The following guarantees are enforced by the implementation, not by convention:
- No path escape: `resolve_repo_file` and `resolve_allowed_local_file` reject traversal, absolute paths outside declared roots, and symlink escapes.
- No secret leakage: `_redacted_env()` is used for all diagnostic output. API keys are never returned in tool output.
- No auto-commit: `update_repo_file` explicitly does not commit. No tool calls `git commit` or `git push`.
- No arbitrary subprocess: Class D tools invoke a fixed command. No tool accepts shell strings for execution.
- No write outside declared scope: Class A and B tools are verified write-free by `docs/TOOL_SAFETY.md` classification and CI checks.
- Drift detection: `detect_architecture_drift()` verifies that tool counts, safety doc entries, and contract coverage are consistent with the actual server state.
| Component | Determinism |
|---|---|
| Path resolvers | Fully deterministic |
| Tool safety classification | Static — does not change at runtime |
| Severity parsing | Fully deterministic (regex-based) |
| Deduplication (Pass 4) | Fully deterministic |
| Structure/review/consistency passes | Non-deterministic in content, deterministic in format |
| Review memory reads | Deterministic given same history file |
| Architecture map reads | Deterministic given same generated artifacts |
The review pipeline is format-deterministic: regardless of model output
variation, the output always conforms to [SEVERITY] file:line\nbody
or degrades gracefully to the raw model response.
The following are verified by scripts/release-check.sh on every release:
- Every `@mcp.tool()` in `server.py` is listed in `docs/TOOL_SAFETY.md`
- Every `@mcp.tool()` in `server.py` is listed in `README.md`
- `docs/tool_contracts.json` tool count matches `server.py`
- `VERSION`, `pyproject.toml`, `README.md`, and `CHANGELOG.md` are in sync
- All Python files compile
- All tests pass
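The first two checks amount to comparing decorated function names against a document. The actual check lives in `scripts/release-check.sh`; the Python sketch below only illustrates the idea, using inline strings in place of `server.py` and `docs/TOOL_SAFETY.md`, and its function names are hypothetical.

```python
import ast
import re


def mcp_tool_names(server_source: str) -> list:
    """Names of functions decorated with @mcp.tool or @mcp.tool(...)."""
    names = []
    for node in ast.walk(ast.parse(server_source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for decorator in node.decorator_list:
            target = decorator.func if isinstance(decorator, ast.Call) else decorator
            if (
                isinstance(target, ast.Attribute)
                and target.attr == "tool"
                and isinstance(target.value, ast.Name)
                and target.value.id == "mcp"
            ):
                names.append(node.name)
    return sorted(names)


def missing_from_doc(server_source: str, doc_text: str) -> list:
    """Tool names that do not appear as whole words in doc_text."""
    return [
        name
        for name in mcp_tool_names(server_source)
        if not re.search(rf"\b{re.escape(name)}\b", doc_text)
    ]


SERVER_SOURCE = '''
@mcp.tool()
def review_file(path: str) -> str: ...

@mcp.tool()
def git_status() -> str: ...

def _helper() -> None: ...
'''
SAFETY_DOC = "| risk_review_file | A |\n| git_status | A |\n"
missing = missing_from_doc(SERVER_SOURCE, SAFETY_DOC)
print(missing)
```

The whole-word match matters here: a plain substring test would accept `review_file` because it occurs inside `risk_review_file`.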
Runtime contract drift (tool count, safety coverage, arch map staleness)
is detected by detect_architecture_drift().
| Document | Role |
|---|---|
| `docs/RUNTIME_CONTRACT.md` | This document: authoritative runtime contract |
| `docs/architecture/SYSTEM_OVERVIEW.md` | Ground-truth architecture reference |
| `docs/architecture/REVIEW_PIPELINE.md` | Full review pipeline specification |
| `docs/TOOL_SAFETY.md` | Tool-by-tool safety classification |
| `docs/tool_contracts.json` | Machine-readable tool metadata |
| `TOOL_INDEX.md` | Human-readable tool index |
| `ROADMAP.md` | Planned work and phase history |
| `SAFETY_MODEL.md` | Safety principles summary |