Spec-first. Behavior-tested. Agent-safe.
Spec Guard is a methodology, workflow runner, and MCP server for agent-driven software development. The model is simple: humans write specs, agents write all code. Six mechanical gates enforce the boundary between human intent and agent execution.
Specs guide implementation, but tests validate running behavior and durable contracts — not prose.
The human's interface is a conversation. Describe what you want, answer a few questions, approve the spec. The agent handles gates, contracts, tests, and code. There are no files to fill out, no commands to learn, no dashboard to check — just a description of what you need and a chat window to deliver it in.
| Layer | What it is | Who uses it |
|---|---|---|
| WORKFLOW.md + AGENTS.md | Process flow document + compact agent instructions | Agents that load project files as context, or humans reviewing the process |
| spec-guard run | Interactive CLI that walks a spec through all 6 gates | Agents executing CLI commands |
| MCP server | Structured tool calls for all Spec Guard operations | MCP-compatible agents (Claude Code, Cursor, etc.) |
Each layer enforces the same 6 gates. Pick the one that fits your agent's capabilities — or use all three.
DISCOVER → [Gate 1] → CLASSIFY & CONTRACT → [Gate 2] → IMPLEMENTATION PLANNING → [Gate 3] → TEST FIRST → [Gate 4] → IMPLEMENT → [Gate 5] → REVIEW → [Gate 6]
| Gate | Check | How it's confirmed |
|---|---|---|
| 1 | Spec valid (required headings, content, classification) | spec-guard check — must exit 0 |
| 2 | Contracts present (API/UI inputs exist and are referenced) | spec-guard check --warnings |
| 3 | Implementation planning confirmed (required stack/layer decisions are recorded) | Agent suggests a context-appropriate stack/layer, human accepts or overrides it, then agent calls spec_guard_confirm_gate |
| 4 | Failure-first confirmed (test runs and fails for expected reason) | Agent runs tests, records failure, calls spec_guard_confirm_gate with evidence |
| 5 | Tests pass (no scope silently absorbed) | Agent runs tests until passing, calls spec_guard_confirm_gate |
| 6 | Review complete + cross-artifact analysis clean | spec-guard review then spec-guard analyze |
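The flow and gate table above can be read as an ordered list of phases, each ending at its gate. The following is a minimal sketch of that reading, not Spec Guard's own code: `PHASES` and `currentPhase` are hypothetical names, and the sketch assumes gates are cleared strictly in order.

```javascript
// Phases in the order given by the flow above; each phase ends at its gate.
const PHASES = [
  { phase: "DISCOVER", gate: 1 },
  { phase: "CLASSIFY & CONTRACT", gate: 2 },
  { phase: "IMPLEMENTATION PLANNING", gate: 3 },
  { phase: "TEST FIRST", gate: 4 },
  { phase: "IMPLEMENT", gate: 5 },
  { phase: "REVIEW", gate: 6 },
];

// Returns the first phase whose gate has not passed yet,
// or null once all six gates have passed.
function currentPhase(passedGates) {
  const passed = new Set(passedGates);
  return PHASES.find(({ gate }) => !passed.has(gate)) ?? null;
}
```

For example, `currentPhase([1, 2, 3])` returns the TEST FIRST entry, because Gate 4 is the first gate still open.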
npm install --save-dev @jpstone/spec-guard
# Initialize project
npx spec-guard init
# Author a spec (guided wizard)
npx spec-guard draft my-feature
# Orchestrated workflow (recommended)
npx spec-guard run my-feature
# Just validate
npx spec-guard check my-feature
# See all specs
npx spec-guard status
# Browse all .md artifacts locally (no GitHub push needed)
npx spec-guard serve
See the CLI Reference for all commands and flags.
spec-guard serve starts a local HTTP server that renders all .md files in your repo as styled HTML. Navigate spec artifacts, contracts, and reviews in the browser with live reload — no GitHub push required.
npx spec-guard serve # opens http://localhost:7777
npx spec-guard serve --port 8080
The root page is your README.md (or .spec-guard/README.md as fallback). A sidebar lists every .md file in the repo. Edits to any .md file reload the browser automatically.
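The root-page fallback described above amounts to a first-match lookup over two candidate paths. This is a rough sketch of that rule only: `resolveRootPage` is a hypothetical helper, not part of the CLI, and the `exists` check is passed in so the sketch runs without touching the disk.

```javascript
// Hypothetical helper: picks the root page in the order the text describes.
// `exists` is any function that reports whether a relative path is present.
function resolveRootPage(exists) {
  const candidates = ["README.md", ".spec-guard/README.md"];
  // First candidate that exists wins; null if neither file is present.
  return candidates.find((path) => exists(path)) ?? null;
}
```

In a real project the check could be `require("node:fs").existsSync`, called from the repository root.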
Agents don't inherently know the Spec Guard workflow. Without it, they'll implement normally — no gates, no classification, no halt conditions. There are three ways to give an agent the operating contract:
MCP server (preferred) — Connect the MCP server and the agent receives structured guidance at each step via tool calls. No upfront loading required. spec_guard_workflow_next_step always tells the agent what to do next. Works with any MCP-compatible agent (Claude Code, Cursor, Copilot, etc.).
Project file context — spec-guard init puts AGENTS.md in the project root. Agents that automatically ingest project-level instruction files (Claude Code's CLAUDE.md pattern, Cursor rules, etc.) will pick it up without any manual step. Point the agent at the project and it reads the contract on its own.
Manual paste — For agents without MCP support or automatic file ingestion, paste the contents of AGENTS.md at the start of a session. The agent then operates under the full Spec Guard contract for that session.
WORKFLOW.md has the full phase-by-phase process flow. AGENTS.md is the compact operating contract an agent needs to execute it.
Exposes all Spec Guard operations as structured tools for MCP-compatible agents.
{
"mcpServers": {
"spec-guard": {
"command": "node",
"args": ["/path/to/spec-guard/mcp/server.js"]
}
}
}
| Tool | What it does |
|---|---|
| spec_guard_analyze | Cross-artifact consistency check (spec ↔ contract ↔ review) |
| spec_guard_check | Validate a spec; returns diagnostics |
| spec_guard_classify | Get classification + test guidance |
| spec_guard_confirm_gate | Record gate 3/4/5/6 confirmation, with evidence required for Gate 4 |
| spec_guard_create_artifact | Create any artifact from a template |
| spec_guard_draft_spec | Turn interview answers into a valid spec (passes Gate 1) |
| spec_guard_gate_status | Status of all 6 gates for a spec |
| spec_guard_initiative_questions | Get question list for decomposing a broad app into feature slices |
| spec_guard_interview_questions | Get structured question list for AI-assisted spec authoring |
| spec_guard_save_initiative | Save initiative decomposition artifact; returns slice names for drafting |
| spec_guard_status | Overview of all specs |
| spec_guard_suggest | Check + return each diagnostic with a concrete fix instruction |
| spec_guard_test_guidance | Get test type and Gate 2 checklist for a classification |
| spec_guard_validate_directory | Check all specs in a directory |
| spec_guard_workflow_next_step | Given gates passed → what to do next |
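The table notes that spec_guard_confirm_gate records Gates 3 to 6 and requires evidence for Gate 4. The sketch below illustrates that rule as a plain validation function. The `{ gate, evidence }` request shape and the `validateConfirmation` name are assumptions made for illustration; the MCP Tool Reference has the actual inputs.

```javascript
// Gates the table says are recorded through spec_guard_confirm_gate.
const CONFIRMABLE_GATES = new Set([3, 4, 5, 6]);

// Returns a list of problems with a confirmation request; empty means acceptable.
// The { gate, evidence } shape is hypothetical, not the documented schema.
function validateConfirmation({ gate, evidence }) {
  const problems = [];
  if (!CONFIRMABLE_GATES.has(gate)) {
    problems.push(`gate ${gate} is not confirmed through this tool`);
  }
  // Gate 4 is the failure-first gate, so a non-empty failure record is required.
  if (gate === 4 && !(typeof evidence === "string" && evidence.trim())) {
    problems.push("gate 4 requires evidence of the expected test failure");
  }
  return problems;
}
```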
spec_guard_workflow_next_step is the key tool for agents: call it after each action and it returns a structured next_action + instruction so the agent always knows what step comes next without reading docs.
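The call-after-each-action pattern described above could be driven by a loop like the one below. This is a sketch under stated assumptions: `callTool(name, args)` stands in for whatever MCP client the agent uses, the `{ spec }` argument and the `"done"` sentinel are hypothetical, and only the `next_action` and `instruction` fields come from the description above.

```javascript
// Hypothetical driver loop around spec_guard_workflow_next_step.
// `callTool` and `act` are supplied by the caller; `maxSteps` guards
// against a loop that never reaches the (assumed) "done" sentinel.
async function driveWorkflow(callTool, spec, act, maxSteps = 50) {
  const history = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = await callTool("spec_guard_workflow_next_step", { spec });
    if (step.next_action === "done") break;
    history.push(step.next_action);
    await act(step); // carry out step.instruction before asking again
  }
  return history;
}
```

The loop keeps no workflow knowledge of its own: it asks, acts, and asks again, which is the point of the tool.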
See the MCP Setup guide for configuration and the MCP Tool Reference for full tool inputs, outputs, and examples.
.spec-guard/
  specs/
  contracts/
  blockers/
  scope-discoveries/
  reviews/
  deviations/
  discoveries/
  runs/
AGENTS.md
WORKFLOW.md
.github/
  workflows/
    spec-guard.yml
Agents must not:
- Implement before Gates 1, 2, 3, and 4 pass: the spec must be valid, contracts must be present, required implementation planning must be confirmed, and a failing test must exist before any implementation begins
- Skip work classification
- Create documentation by default
- Test whether documentation files (specs, contracts, reviews, READMEs, help files, changelogs) exist or contain expected content, unless the document is explicitly the deliverable of an operational/document deliverable classification
- Invent UI — do not implement UI work until both a mockup/design direction and a component library reference are in the spec, or the human has explicitly confirmed each is not needed
- Assume a component library — if none is referenced, ask the human before proceeding
- Test private/undocumented internals instead of contract surfaces
- Silently absorb out-of-scope work
- Add unrequested features, optional enhancements, or opportunistic refactors
- Upgrade dependencies or change architecture unless the spec requires it
- Redesign UI beyond provided direction
- Implement nearby TODOs unless the spec requires them
- Propose unsolicited feature roadmaps after completing a task
- Treat "what's next?" as permission to invent features
- Perform discovery unless the human explicitly asks
- Implement discovery findings without separate authorization
- Skip Gate 4 (failure-first) without recording a concrete reason
- Close Gate 6 without running spec-guard analyze
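The first rule in the list above, no implementation before Gates 1 to 4 pass, is easy to express as a precondition check. The function below is an illustrative sketch with hypothetical names, not something Spec Guard exports.

```javascript
// Gates that must have passed before any implementation begins.
const REQUIRED_BEFORE_IMPLEMENTATION = [1, 2, 3, 4];

// Returns the gates still blocking implementation; an empty array means
// the agent may move on to the IMPLEMENT phase.
function gatesBlockingImplementation(passedGates) {
  const passed = new Set(passedGates);
  return REQUIRED_BEFORE_IMPLEMENTATION.filter((gate) => !passed.has(gate));
}
```

An agent holding only Gates 1 and 2 would get `[3, 4]` back and should return to implementation planning and the failing test rather than write feature code.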
End-to-end walkthroughs showing Spec Guard in use with natural-language requests.
| Example | What it covers |
|---|---|
| Todo App | Building a new app with initiative decomposition, then adding a single feature with the standard spec flow |
| Doc | What it covers |
|---|---|
| CLI Reference | All commands, flags, exit codes, diagnostic format |
| Glossary | Term definitions |
| MCP Setup | MCP server configuration for Claude Code, Cursor, and Windsurf |
| MCP Tool Reference | All MCP tools — inputs, outputs, and examples |
| Philosophy | Design philosophy, innovations, and problems Spec Guard solves |
| Quality Gates | Gate-by-gate breakdown and pass conditions |
| Quickstart | Minimum workflow, live validation, CI setup |
| Validation Rules | Every rule ID with severity and description |
| Work Classification | How to choose the right classification |
npm test # 231 tests across check, run, MCP, CLI, discover, analyze, suggest, initiative, and implementation planning
npm run check:example # gate 1 smoke check
npm run run:example # gate 1+2 non-interactive check
MIT.
This project uses Spec Guard