Spec-Driven Dev

A universal skill package for AI coding assistants that enforces a 6-phase development pipeline:

Explore → [APPROVE] → Requirements → [APPROVE] → Design → [APPROVE] → Task Plan → [APPROVE] → Implementation → [APPROVE] → Review → [APPROVE] → Done

No API keys. No dependencies. Works in any IDE with AI agent support.

Designed for a single project or monorepo. Not intended for features spanning multiple independent repositories.

Why

AI coding assistants are great at writing code, but they often skip the thinking phase. They jump straight to implementation without understanding what needs to be built, why, or what the constraints are.

Spec-Driven Dev forces a structured workflow:

Explore — investigate the problem space, compare 2–4 approaches, recommend direction. Output: exploration document with options, risks, and scope boundaries.
Requirements — structured 4-layer interview (context → scope → constraints → verification), then generate formal requirements. Output: requirements document with WHEN/SHALL grammar.
Design — architect the solution with Mermaid diagrams, ADRs, correctness properties, and testing strategy. Output: design document with traceability to requirements.
Task Plan — decompose design into TDD tasks (RED/GREEN/CODE/VERIFY/GATE) with coverage matrix. Output: implementation plan with full requirements traceability (no code yet).
Implementation — execute the task plan: write tests, write code, run suite after each task. Output: implementation report with task completion checklist.
Review — review written code: change set, requirements traceability, design conformance, code quality, security scan. Output: review document with PASS/NEEDS_CHANGES/BLOCK verdict.

Each phase produces a document. Each transition requires explicit human approval. No skipping.
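
The gating rule above can be pictured as a small state machine. The following is a minimal sketch, not the code in scripts/pipeline.sh: the phase identifiers are assumed from the names used elsewhere in this README, and `next_phase` and `advance_if_approved` are hypothetical helper names.

```sh
# Illustrative only: phase identifiers are assumptions, and the real
# state machine lives in scripts/pipeline.sh.
next_phase() {
    case "$1" in
        explore)        echo requirements ;;
        requirements)   echo design ;;
        design)         echo task-plan ;;
        task-plan)      echo implementation ;;
        implementation) echo review ;;
        review)         echo done ;;
        *)              echo "unknown phase: $1" >&2; return 1 ;;
    esac
}

# Advance only on explicit approval; any other answer keeps the current phase.
advance_if_approved() {
    phase=$1
    approved=$2   # "yes" means the human approved
    if [ "$approved" = "yes" ]; then
        next_phase "$phase"
    else
        echo "$phase"
    fi
}
```

Because every phase maps to exactly one successor, there is no path that skips a phase, which is the property the pipeline is meant to enforce.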

Installation

Install via skills.sh:

npx skills add sipki-tech/sdd

The CLI auto-detects your installed agents (GitHub Copilot, Claude Code, Cursor, Codex, Windsurf, Cline, and 40+ others) and symlinks the skill into their config directories.

Install Options

# Install globally (available across all projects)
npx skills add sipki-tech/sdd -g

# Install to a specific agent
npx skills add sipki-tech/sdd -a github-copilot
npx skills add sipki-tech/sdd -a claude-code
npx skills add sipki-tech/sdd -a cursor

# CI-friendly (no prompts)
npx skills add sipki-tech/sdd --all -y

# Full GitHub URL also works
npx skills add https://github.com/sipki-tech/sdd

Manual Installation

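If you prefer not to use npx, clone the repo to a temporary directory and copy the skill into your project:

# Create the parent directory first; cp fails if skills/ does not exist
mkdir -p skills
git clone https://github.com/sipki-tech/sdd.git /tmp/sdd
cp -r /tmp/sdd/skills/sdd skills/sdd
rm -rf /tmp/sdd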

Verify

npx skills list
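
After a manual installation, a quick file check may be more useful than the CLI listing. This is a rough sketch: the paths are taken from the File Structure section, while the function name `check_sdd_install` and the choice of which files to probe are assumptions.

```sh
# Sketch: confirm the files a manual install is expected to provide.
# check_sdd_install is a hypothetical helper, not part of the skill.
check_sdd_install() {
    root=${1:-skills/sdd}
    missing=0
    for f in SKILL.md scripts/pipeline.sh templates/explore.md; do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $root/$f" >&2
            missing=1
        fi
    done
    # Exit status 0 when every probed file exists, 1 otherwise.
    return "$missing"
}
```

Calling `check_sdd_install` with no argument probes skills/sdd relative to the current directory.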

Quick Start

Tell your AI assistant:

"I want to add user authentication with OAuth2"

The agent will automatically pick up the pipeline and start with the exploration phase.

How It Works

Project Configuration

Customize AI behavior for your project by creating .spec/config.yaml:

Note: The parser supports flat key: value pairs only — nested YAML, multi-line blocks (|, >), and arrays (- item) are NOT supported. See .spec/config.yaml.example for a complete template.

# .spec/config.yaml

# Project context (single line)
context: Go 1.23 monorepo, PostgreSQL, gRPC, go test + testify, make build

# Phase-specific rules (flat key: value)
rules.explore: Focus on existing API surface before proposing new endpoints
rules.requirements: All REQ should use gRPC error codes, not HTTP statuses
rules.design: ADR must consider protobuf backward compatibility
rules.task-plan: Test command: go test ./...; Build command: make build
rules.implementation: Run tests after each task before marking it done
rules.review: Always check protobuf backward compatibility
rules.docs: Always include Mermaid diagrams in ARCHITECTURE.md

# Optional: test style cascade overrides
test_skill: my-test-skill        # Tier 1: delegate test generation to this skill
test_reference: "**/*_test.go"   # Tier 2: use these files as test style reference

# Optional: project documentation directory
docs_dir: .spec                  # Default. Change to customize (e.g., .docs, docs/)
doc_freshness_days: 30           # Days before a generated doc is considered stale (default: 30)

context is injected into ALL phases — the agent knows your stack before asking questions.
rules.<phase> adds phase-specific rules on top of the template defaults.
test_skill (optional) — name of an installed skill for test generation. When set, Design and Implementation phases delegate test specification to this skill instead of writing test tasks directly.
test_reference (optional) — glob or file paths pointing to representative test files. When set, the agent uses these as the style reference for all generated tests. When omitted, the agent auto-discovers adjacent tests.
docs_dir (optional) — directory for project documentation, default: .spec. The agent reads and writes project docs here.
doc_freshness_days (optional) — number of days after which a generated doc is considered stale, default: 30. Used by docs-check to flag outdated files.
rules.docs (optional) — rules for documentation generation (e.g., skip irrelevant docs, require diagrams).

Phase-specific rule keys: rules.explore, rules.requirements, rules.design, rules.task-plan, rules.implementation, rules.review, rules.docs.
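
To make the flat key: value restriction concrete, here is a minimal sketch of how such a file could be read with sed alone. It is not the parser shipped in pipeline.sh. The helper name `cfg_get` is hypothetical, and the handling of trailing `# comments` and surrounding double quotes is an assumption based on the example config above.

```sh
# Hypothetical helper: print the value of one flat "key: value" entry.
# Assumes inline comments start with whitespace followed by '#'.
cfg_get() {
    key=$1
    file=$2
    # Escape dots so a key like rules.design is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    sed -n "s/^${pattern}:[[:space:]]*//p" "$file" |
        sed -e 's/[[:space:]][[:space:]]*#.*$//' \
            -e 's/[[:space:]]*$//' \
            -e 's/^"\(.*\)"$/\1/' |
        head -n 1
}
```

With the example file, `cfg_get rules.design .spec/config.yaml` would print the design rule text. A line-oriented reader like this has no way to represent nesting, lists, or multi-line blocks, which is consistent with the limits stated in the note.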

File Structure

skills/sdd/              ← skill package (installed by skills.sh)
├── SKILL.md                         ← orchestrator (skills.sh entry point)
├── templates/
│   ├── explore.md                   ← phase 1 prompt
│   ├── requirements.md              ← phase 2 prompt
│   ├── design.md                    ← phase 3 prompt
│   ├── task-plan.md                 ← phase 4 prompt
│   ├── implementation.md            ← phase 5 prompt
│   ├── review.md                    ← phase 6 prompt
│   └── docs/                        ← documentation generation templates
│       ├── README.md                ← manifest (lists all doc templates)
│       ├── bootstrap.md             ← generates README.md, agent-rules.md
│       ├── agents-index.md          ← generates AGENTS.md
│       ├── core.md                  ← generates ARCHITECTURE, PACKAGES, DOMAIN, CODE_STYLE
│       ├── development.md           ← generates TOOLS, TESTING, FILES
│       ├── errors.md                ← generates ERRORS.md
│       ├── auth.md                  ← generates AUTH.md / OAUTH.md
│       ├── database.md              ← generates DATABASE.md
│       ├── api.md                   ← generates API.md
│       ├── deployment.md            ← generates DEPLOYMENT.md
│       ├── infrastructure.md        ← generates per-component infra docs
│       ├── clients.md               ← generates CLIENTS.md + per-client docs
│       ├── security.md              ← generates SECURITY.md
│       ├── feature-flags.md         ← generates FEATURE_FLAGS.md
│       └── background-jobs.md       ← generates BACKGROUND_JOBS.md
└── scripts/
    └── pipeline.sh                  ← state machine (POSIX sh, zero deps)

.spec/                               ← project-local (committed to git)
├── config.yaml                      ← project context & rules (opt-in)
├── features/                        ← per-feature pipeline artifacts
│   ├── grpc-streaming/              ← example completed feature
│   │   ├── pipeline.json            ← pipeline state (for agents)
│   │   ├── pipeline.kv              ← internal KV store
│   │   ├── explore.md               ← phase 1 artifact
│   │   ├── requirements.md          ← phase 2 artifact
│   │   ├── design.md                ← phase 3 artifact
│   │   ├── task-plan.md             ← phase 4 artifact
│   │   ├── implementation.md        ← phase 5 artifact
│   │   ├── review.md                ← phase 6 artifact
│   │   ├── revisions/               ← artifact revision history
│   │   └── approved/                ← approved snapshots
│   └── oauth-login/                 ← another feature (in progress or done)
│       └── ...
├── README.md                        ← documentation index
├── ARCHITECTURE.md                  ← architecture overview
├── PACKAGES.md                      ← package reference
├── DOMAIN.md                        ← domain model
├── CODE_STYLE.md                    ← code conventions
├── ERRORS.md                        ← error handling & error catalog
├── TOOLS.md                         ← commands & tooling
├── TESTING.md                       ← testing conventions
├── DATABASE.md                      ← database schema & migrations
├── API.md                           ← API endpoints & conventions
├── DEPLOYMENT.md                    ← deployment pipeline & environments
├── SECURITY.md                      ← security audit & OWASP mapping
└── ...                              ← domain-specific docs (AUTH, CLIENTS, etc.)
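
Given this layout, the progress of a feature can be read off the file system. The sketch below reports which phase artifacts exist in a feature directory. The file names come from the tree above; `feature_artifacts` is a hypothetical helper, and `pipeline.sh status` remains the supported way to inspect state.

```sh
# Sketch: report which phase artifacts exist in a feature directory.
# feature_artifacts is a hypothetical name, not a pipeline.sh command.
feature_artifacts() {
    dir=$1
    for phase in explore requirements design task-plan implementation review; do
        if [ -f "$dir/$phase.md" ]; then
            echo "$phase: present"
        else
            echo "$phase: missing"
        fi
    done
}
```

For example, `feature_artifacts .spec/features/grpc-streaming` prints one line per phase in pipeline order.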

Pipeline Commands

P="skills/sdd/scripts/pipeline.sh"

sh $P help                        # Show usage
sh $P init my-feature             # Start a new pipeline
sh $P status                      # Show current phase, artifacts, progress
sh $P artifact [path]             # Register output artifact for current phase
sh $P approve                     # Advance to next phase (after user approval)
sh $P history                     # Show all features and their status
sh $P revisions [ph]              # Show revision history (current or specified phase)
sh $P docs-check                  # Check project documentation status (JSON)
sh $P version                     # Show version

# When multiple pipelines are active simultaneously:
sh $P --feature auth-flow status  # Operate on a specific feature
sh $P --feature payment approve
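
If you run these commands often, thin wrappers can save typing. This is a sketch under assumptions: `SDD_PIPELINE`, `sdd`, and `sdd_feature` are hypothetical names introduced here, and the wrappers only forward the arguments documented above.

```sh
# Path to the pipeline script; override SDD_PIPELINE if the skill lives elsewhere.
SDD_PIPELINE=${SDD_PIPELINE:-skills/sdd/scripts/pipeline.sh}

# Forward all arguments to pipeline.sh unchanged.
sdd() {
    sh "$SDD_PIPELINE" "$@"
}

# Run a command against a specific feature, using the documented --feature flag.
sdd_feature() {
    feature=$1
    shift
    sh "$SDD_PIPELINE" --feature "$feature" "$@"
}
```

With these defined, `sdd status` and `sdd_feature auth-flow status` correspond to the two invocation styles shown above.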

Typical Session

You: "I want to add gRPC streaming to my service"

Agent: [reads SKILL.md → runs pipeline.sh init grpc-streaming]
Agent: [reads templates/explore.md → investigates codebase]
Agent: "Let me explore the problem space first. I see you currently use
        unary gRPC calls in service X. Here are 3 approaches:
        1. Server streaming  2. Bidirectional  3. SSE fallback
        I recommend #1 because..."
Agent: "Exploration document ready. Approve?"

You: "Approve"

Agent: [runs pipeline.sh approve → advances to requirements]
Agent: [reads templates/requirements.md → starts interview]
Agent: "Now let me capture the formal requirements.
        1. Which service is this for?
        2. What's the current communication pattern?
        3. Who are the consumers of this API?"

You: [answers questions]

Agent: [generates requirements document with WHEN/SHALL grammar]
Agent: "Requirements document is ready. Approve?"

You: "Approve"

Agent: [→ design → task plan, same pattern]
Agent: "Task plan approved. Now I'll implement the code."

Agent: [reads templates/implementation.md → executes the TDD plan]
Agent: [writes tests, writes code, marks tasks done with [x]]

Agent: "All tasks completed. Implementation report ready. Approve?"

You: "Approve"

Agent: [→ review phase]
Agent: "Code review complete. Verdict: NEEDS_CHANGES.
        Found 2 issues: F-1 (major) missing test for REQ-1.2,
        F-2 (minor) naming inconsistency.
        What would you like me to do?"

You: "Fix both"

Agent: [fixes code, re-reviews]
Agent: "All findings resolved. Verdict: PASS. Approve?"

You: "Approve"

Agent: "Pipeline complete!"
Agent: "This feature added new packages. Update .spec/PACKAGES.md?"

You: "Update docs"

Agent: [regenerates affected documentation]

Phase Details

Phase 1: Explore

The agent investigates the problem space before committing to requirements:

Reads existing codebase to understand current state
Identifies constraints, risks, and dependencies
Compares 2–4 realistic approaches with trade-offs
Recommends a direction with suggested scope boundaries

Output: exploration document with Intent, Investigation, Build Tooling, Options Considered, Constraints & Risks, Recommended Direction, Scope Boundaries, and Assumptions & Open Questions.

Phase 2: Requirements

The agent conducts a structured interview through 4 layers:

Context & Motivation — what, why, who's affected
Scope Boundaries — what changes, what must NOT change
Constraints & Edge Cases — errors, defaults, conflicts
Verification — how to prove it works

Output: formal requirements document with overview, glossary, requirements using WHEN/SHALL grammar, verification commands, and open design questions.
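
A simple lint can catch statements that drift from the WHEN/SHALL grammar. The sketch below assumes one requirement statement per line, which this README does not confirm; `shall_without_when` is a hypothetical helper.

```sh
# Sketch: list lines that contain SHALL but no WHEN clause.
# Assumes each requirement statement sits on a single line.
shall_without_when() {
    # grep -n prefixes each hit with its line number for easy lookup.
    grep -n 'SHALL' "$1" | grep -v 'WHEN'
}
```

An empty result means every SHALL statement in the file also carries a WHEN trigger. The check is purely textual and says nothing about whether a requirement is well formed.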

Phase 3: Design

Takes the requirements document and produces:

Architecture with Mermaid diagrams (color-coded: new/modified/existing)
Components & Interfaces — affected files + files NOT requiring changes
Key Decisions (ADR) — choices between alternatives with rationale
Correctness Properties — formal "For all X, Y must hold" statements
Testing Strategy — test style source, project commands, unit and property-based test specifications

Phase 4: Task Plan

Takes the requirements and design documents and produces a TDD implementation plan:

Exploration tests (RED) — prove the problem exists
Preservation tests (GREEN) — lock in correct behavior
Implementation tasks — atomic, one file per subtask
Re-tests — confirm fix, no regressions
Checkpoints — integration verification

Every task is traceable: Requirements X.Y → Task N → Correctness Property K.
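
The traceability rule lends itself to a mechanical check. The following sketch lists requirement IDs that never appear in the task plan. It assumes IDs of the form REQ-X.Y, as in the sample session, and `uncovered_reqs` is a hypothetical helper. Note that `grep -o` and `grep -w` are widely available extensions rather than strict POSIX options.

```sh
# Sketch: print requirement IDs found in the requirements file
# that are not referenced anywhere in the task plan.
uncovered_reqs() {
    req_file=$1
    plan_file=$2
    grep -oE 'REQ-[0-9]+\.[0-9]+' "$req_file" | sort -u |
    while read -r id; do
        # -F: literal match; -w: whole word, so REQ-1.1 does not match REQ-1.10.
        grep -qwF "$id" "$plan_file" || echo "$id"
    done
}
```

For example, `uncovered_reqs requirements.md task-plan.md` run inside a feature directory prints nothing when every requirement is referenced by the plan.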

Phase 5: Implementation

The agent executes the approved task plan:

Writes real tests and real production code for every task
Runs the test suite after each task; iterates until green
Marks each completed task with [x] in the implementation report
Does NOT create new tasks or modify the plan — only executes
Final verification: all tests pass, build succeeds, lint is clean

Output: implementation report with task checklist showing what was done.
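
Progress through the checklist can be summarized with two grep counts. This sketch assumes the report uses Markdown task-list items (`- [x]` for done, `- [ ]` for pending); only the `[x]` marker is confirmed by this README, and `task_progress` is a hypothetical helper.

```sh
# Sketch: print "completed/total" for a Markdown task checklist.
task_progress() {
    file=$1
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    completed=$(grep -c '^[[:space:]]*- \[x\]' "$file" || true)
    pending=$(grep -c '^[[:space:]]*- \[ \]' "$file" || true)
    echo "$completed/$((completed + pending))"
}
```

Running `task_progress implementation.md` inside a feature directory gives a quick view of how far the agent has gotten.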

Phase 6: Review

After the agent implements the TDD plan, it reviews the written code:

Change Set Discovery — git diff from the base commit, cross-reference with the plan
Requirements Traceability — every requirement mapped to test and code
Design Conformance — architectural boundaries, data models, API contracts, correctness properties
Code Quality — naming, dead code, scope creep, test quality
Security Scan — input validation, auth, injection, secrets (scoped to changed files)

Verdict: PASS (no critical/major findings), NEEDS_CHANGES (major findings), or BLOCK (critical findings). If not PASS, the agent presents findings and recommendations to the user. The user decides which findings to fix. This follows the same human-in-the-loop model as all other phases.
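
The verdict rule reduces to two counters. A minimal sketch of that mapping follows; `review_verdict` is a hypothetical helper, and minor findings are deliberately not an input because the rule above does not let them change the verdict.

```sh
# Sketch: map finding counts to a review verdict.
# Usage: review_verdict <critical_count> <major_count>
review_verdict() {
    critical=$1
    major=$2
    if [ "$critical" -gt 0 ]; then
        echo BLOCK            # any critical finding blocks
    elif [ "$major" -gt 0 ]; then
        echo NEEDS_CHANGES    # major findings, no critical ones
    else
        echo PASS             # no critical or major findings
    fi
}
```

In the sample session, one major and one minor finding correspond to `review_verdict 0 1`, which yields NEEDS_CHANGES.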

Self-Documenting Mechanic

The skill includes a self-documenting mechanic that keeps project documentation (.spec/) in sync with code changes.

How it works

Before the pipeline (soft gate):

When starting a new feature, the agent checks if .spec/ exists
If missing, it suggests generating documentation first: "Project docs not found. Say 'generate docs' or 'skip'."
If present, it reads the docs as context for all phases, reducing codebase scan time
This is a suggestion, not a blocker — the pipeline works without .spec/

After the pipeline (targeted update):

When the feature pipeline completes, the agent analyzes which files were changed
It suggests updating only the affected documentation (e.g., new package → update PACKAGES.md)
You can accept or skip the update
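
Staleness, as configured by doc_freshness_days, can be approximated from file modification times. This sketch assumes freshness is judged by mtime, which may differ from what docs-check actually does; `stale_docs` is a hypothetical helper, and the features/ directory is skipped because it holds pipeline artifacts rather than project docs.

```sh
# Sketch: list Markdown docs not modified within the freshness window.
# Usage: stale_docs [docs_dir] [days]
stale_docs() {
    dir=${1:-.spec}
    days=${2:-30}
    # -mtime +N matches files whose age in whole days is greater than N.
    find "$dir" -path "$dir/features" -prune -o \
        -type f -name '*.md' -mtime +"$days" -print
}
```

Calling `stale_docs` with no arguments uses the documented defaults of .spec and 30 days.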

Documentation templates

Templates for generating docs live in skills/sdd/templates/docs/. Each template generates specific doc files:

| Template | Stage | Generates |
| --- | --- | --- |
| `bootstrap.md` | Bootstrap | `README.md`, `agent-rules.md`: project index and agent rules |
| `agents-index.md` | Bootstrap | `AGENTS.md`: entry point for agents |
| `core.md` | Core | `ARCHITECTURE.md`, `PACKAGES.md`, `DOMAIN.md`, `CODE_STYLE.md` |
| `development.md` | Core | `TOOLS.md`, `TESTING.md`, `FILES.md` |
| `errors.md` | Core | `ERRORS.md`: error architecture & business error catalog |
| `auth.md` | Domain | `AUTH.md` / `OAUTH.md`: authentication & authorization |
| `database.md` | Domain | `DATABASE.md`: schema, migrations, query patterns |
| `api.md` | Domain | `API.md`: endpoint reference, middleware, error format |
| `deployment.md` | Domain | `DEPLOYMENT.md`: environments, CI/CD, rollout, health checks |
| `infrastructure.md` | Domain | Per-component infra docs (`OBSERVABILITY.md`, `REDIS.md`, etc.) |
| `clients.md` | Domain | `CLIENTS.md` + per-client docs (`FRONTEND.md`, `TELEGRAM.md`, etc.) |
| `security.md` | Domain | `SECURITY.md`: security audit, OWASP mapping, secrets management |
| `feature-flags.md` | Domain | `FEATURE_FLAGS.md`: flag inventory, lifecycle, rollout, cleanup |
| `background-jobs.md` | Domain | `BACKGROUND_JOBS.md`: job inventory, retry/DLQ, concurrency, scaling |
To add a new documentation type, create a template file in templates/docs/ and add it to the manifest (templates/docs/README.md).

Architecture Decisions

POSIX sh over Python — guaranteed to be available everywhere, even in minimal containers
KV file + JSON mirror — KV for simple shell manipulation, JSON for agents to parse
skills.sh distribution — standard skill packaging, no custom installer needed
No auto-approve — human in the loop is the whole point
Persistent artifacts in .spec/features/ — committed to git, creating a permanent record of decisions
Per-feature directories — each feature gets its own directory with all artifacts, revisions, and state
Code review as a phase — code is verified against specs before the pipeline completes
Review as information — review presents findings and verdict to the user for decision, consistent with the human-in-the-loop philosophy across all 6 phases
Task Plan / Implementation split — planning (Phase 4) and execution (Phase 5) are separate phases with separate approval gates, ensuring the plan is reviewed before any code is written
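
To illustrate the KV file plus JSON mirror decision, here is a rough sketch of turning a flat KV file into a JSON object using only sed and paste. The `key=value` line format is an assumption (the actual layout of pipeline.kv is not described here), `kv_to_json` is a hypothetical helper, and all values are emitted as JSON strings.

```sh
# Sketch: convert "key=value" lines into a single-line JSON object.
# Assumes no blank lines and no '=' inside keys.
kv_to_json() {
    body=$(sed -e 's/\\/\\\\/g' \
               -e 's/"/\\"/g' \
               -e 's/^\([^=]*\)=\(.*\)$/"\1":"\2"/' "$1" |
           paste -s -d ',' -)
    printf '{%s}\n' "$body"
}
```

The shell side stays trivial to edit with grep and sed, while agents get a format they can parse without custom logic, which is the trade-off the decision describes.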

Requirements

POSIX-compatible shell (sh, bash, zsh, dash)
git, date, grep, sed, mkdir — standard Unix utilities
No Python, Node.js, or other runtime required

License

MIT
