Natural-language GWT scenarios as first-class citizens. Multiple LLM agents propose, critique, and reach consensus on what to build, then implement, verify, and auto-regress across rounds — under a per-scope autonomy policy the user controls without staying tethered to the prompt.
SDI is the LLM-era successor to TDD (1990s) and BDD (2000s). The lineage:
| | spec form | verifier | who reads it |
|---|---|---|---|
| TDD | test code | the test runner | humans + the runner |
| BDD | Gherkin DSL | step definitions + runner | humans, with step glue maintained by humans |
| SDI | natural language Given/When/Then | LLM agent | the LLM directly — no compilation step |
The unit of work is the scenario. A plan locks in a set of scenarios. Specialist agents decompose runtime tasks, propose implementations, critique each other, and only then converge on consensus that is gated by an autonomy policy. The next round auto-replays prior scenarios as regression.
Identity & full spec: docs/PRD.md — the canonical PRD lives in this repository, decisions D1–D29 in §2.
| Entity | Role |
|---|---|
| Plan | Approved intent. Approval gate = ≥ 1 scenario with valid GWT (Task count irrelevant). |
| Requirement | SNAPSHOT-ONLY natural-language ask. Latest snapshot is the only truth; change history lives in Decision. |
| Decision | Append-only ADR with kind ∈ {proposal, critique, consensus, dissensus}. Consensus is the gate-passing form. Carries reversal_plan + blast_radius_score (D28). |
| Scenario | Strict Given/When/Then. Carries tags, depends_on DAG, produced_by/verified_by agents (M4 contract), plus claimed_resources_json + claim_status for multi-session safety (D29). |
| Round | R1 is new development; R2+ is regression. Default mode strict-regression replays every prior passing scenario. |
| AutonomyPolicy | Per-scope (plan / decision_kind / pattern_kind / global) mode ∈ {L3, L4, L5} + l5_threshold + pattern_depth_cap + plan_single_session_lock. Decides where the human gate sits. |
| CollaborationPattern (D22, v0.5) | Kind ∈ {workflow, graph, swarm, agents-as-tools, direct}, applies_to ∈ {plan, requirement, scenario, task, decision, round}. Persisted manifest with per-kind shape (steps/reviewers/fan_out/peer_registration). Every work entity records produced_via_pattern_id. |
Two non-first-class persistent entities back the multi-agent substrate: AgentNote (M1 blackboard, append-only journal) and AgentSpec (M5 runtime specialist registration, with stance ∈ {proposer, devil_advocate, schema_guardian, performance_reviewer, security_reviewer, neutral} for D26 sybil-fix).
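The Plan approval gate in the table above (at least one scenario with valid GWT, Task count irrelevant) can be sketched in a few lines of Rust. This is an illustrative sketch only: the `Scenario` struct, its field names, and the rule that "valid GWT" means three non-blank clauses are assumptions, not the actual types in `crates/core`.

```rust
/// Hypothetical in-memory shape of a Scenario. Field names are illustrative
/// and do not claim to match the real schema in crates/core.
#[derive(Debug, Clone)]
pub struct Scenario {
    pub given: String,
    pub when: String,
    pub then: String,
}

impl Scenario {
    /// Assumption: "valid GWT" means all three clauses are non-blank.
    pub fn has_valid_gwt(&self) -> bool {
        [&self.given, &self.when, &self.then]
            .iter()
            .all(|clause| !clause.trim().is_empty())
    }
}

/// Approval gate from the Plan row: at least one scenario with valid GWT.
/// Task count is deliberately not an input.
pub fn plan_can_be_approved(scenarios: &[Scenario]) -> bool {
    scenarios.iter().any(Scenario::has_valid_gwt)
}
```

Under these assumptions, a plan with no scenarios, or only scenarios with a blank clause, stays unapproved no matter how many tasks it carries.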
- D13: multi-agent is the body. Single-`@main` solo execution is anti-pattern. Every flow assumes specialist agents communicating.
- D14: AutonomyPolicy is a first-class entity. Per-scope mode is persisted in SQLite and gates Decision application.
- D15: four built-in patterns. Workflow, Graph, Swarm, Agents-as-Tools live inside SDI. External A2A protocol is out of v1.
- D16: default = act with policy. Not "ask every time": the user toggles intervention windows, not per-decision prompts.
- D17: mode defaults. New plan defaults to L5; plans with external surface (publish/deploy/external API) default to L4; `decision_kind ∈ {architecture, schema, naming-canonical}` is forced to L4 regardless of plan mode.
- D18: circuit breaker always on. One UI action demotes every policy row to L3 immediately; in-flight decisions apply at the next gate.
- D19: substrate runs mode-independent. M1 blackboard, M2 hand-off, M3 negotiation, M4 scenario-as-contract, M5 self-organization keep working in any mode; mode only positions the user gate on consensus.
- D20: consensus is the gate unit. Single-agent decision = L3 max. Multi-agent consensus unlocks L4/L5. Dissensus always escalates regardless of mode.
- D21: mandatory delegation gate. Orchestrator (main session) is forbidden from calling execution tools (`Edit`/`Write`/`NotebookEdit`/mutating `Bash`). PreToolUse hook detects `hookInput.agent_id` absence and blocks the call; the only legitimate path is `Agent`-spawned specialist sub-agents. This is D13's mechanical enforcement face: the anti-pattern is structurally impossible, not just documented.
- D22: CollaborationPattern as the seventh entity. AWS's four patterns (Workflow / Graph / Swarm / Agents-as-Tools) become persistent DB rows with lifecycle (pending → active → converged | dissensus | aborted). `direct` is the anti-pattern marker, not an escape.
- D23: pattern provenance is NOT NULL. Every new work entity carries `produced_via_pattern_id`; main sessions that omit it get an auto `direct` row with a red dashboard badge + L3 cap + audit log.
- D24: pattern recursion via DAG. `parent_pattern_id` self-FK, depth ≤ `AutonomyPolicy.pattern_depth_cap` (default 3). A pattern's step can spawn sub-patterns; cycles are blocked.
- D25: pattern-scoped autonomy. Defaults: workflow=L5, graph=L5, swarm=L4, agents-as-tools=L4, direct=L3. The strictest of (plan-mode, pattern-mode) wins.
- D26: four-pattern integrity gates with sybil fix. Graph consensus requires `≥ 2 distinct (AgentSpec.name, AgentSpec.stance) tuples`; two `impl-coder` instances with identical stance no longer fake diversity. Workflow needs sequential evidence and `steps ≥ 2`; swarm needs `fan_out ≥ 2` plus spawn-depth and self-spawn-loop blocks; agents-as-tools needs peer registration and `peer ≥ 1`.
- D27: pattern shape & selection gate. Shape validation runs at `pending → active`. Fake patterns (1-step workflow, single-instance swarm, registry-empty agents-as-tools) cannot bypass `direct`'s L3 cap.
- D28: reversibility as a first-class constraint on L5. Decision.reversal_plan (inverse migration / git revert SHA / fs snapshot / compensating action) + Decision.blast_radius_score gate L5 auto-apply: shape valid AND reversal_plan present AND blast_radius_score ≤ `AutonomyPolicy.l5_threshold` (default 5). The reversal-runner specialist executes rollbacks as append-only Decisions.
- D29: multi-session resource claims. Scenario.claimed_resources_json (path globs) + claim_status give the daemon a decision-router role: cross-session overlap is blocked at PreToolUse with a merge-or-wait prompt. Optional `plan_single_session_lock` for high-conflict plans.
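The D28 condition for L5 auto-apply is a three-way conjunction, which a short Rust sketch makes concrete. The struct shapes, the `u32` score type, and the choice to treat a blank `reversal_plan` as absent are assumptions for illustration; only the field names, the default threshold of 5, and the rule itself come from the text above.

```rust
/// Hypothetical view of the Decision fields that D28 names.
pub struct Decision {
    pub reversal_plan: Option<String>,
    pub blast_radius_score: u32, // assumed integer score; the real type is not stated
}

/// Hypothetical view of the AutonomyPolicy field used by the gate.
pub struct AutonomyPolicy {
    pub l5_threshold: u32,
}

impl Default for AutonomyPolicy {
    fn default() -> Self {
        Self { l5_threshold: 5 } // default stated in D28
    }
}

/// D28: shape valid AND reversal_plan present AND score <= l5_threshold.
/// Assumption: a blank reversal plan counts as missing.
pub fn l5_auto_apply_allowed(
    shape_valid: bool,
    decision: &Decision,
    policy: &AutonomyPolicy,
) -> bool {
    let has_reversal = decision
        .reversal_plan
        .as_deref()
        .map_or(false, |plan| !plan.trim().is_empty());
    shape_valid && has_reversal && decision.blast_radius_score <= policy.l5_threshold
}
```

Any single failing condition is enough to keep the Decision out of L5 auto-apply; what happens next (for example, falling back to a human gate) is outside this sketch.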
The full L3/L4/L5 semantics, scope matrix, circuit-breaker triggers, delegation-gate tool classification, and pattern integrity rules are spelled out in docs/PRD.md §3.7, §3.9, §5 Layer 0 / 1.5 / 2.6 / 2.7 / 2.8 / 3.
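As a rough illustration of how D17 and D25 could combine, the sketch below resolves an effective mode by taking the strictest of the plan mode and the pattern default, then applying the forced-L4 decision kinds. The type and function names are hypothetical, and reading "forced to L4" as a cap (it never raises an L3 back to L4) is an assumption; the PRD sections cited above remain the authority.

```rust
/// Autonomy modes ordered from strictest (L3) to most autonomous (L5),
/// so the derived `Ord` makes `min` pick the stricter of two modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Mode {
    L3,
    L4,
    L5,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PatternKind {
    Workflow,
    Graph,
    Swarm,
    AgentsAsTools,
    Direct,
}

/// D25 defaults per pattern kind.
pub fn pattern_default(kind: PatternKind) -> Mode {
    match kind {
        PatternKind::Workflow | PatternKind::Graph => Mode::L5,
        PatternKind::Swarm | PatternKind::AgentsAsTools => Mode::L4,
        PatternKind::Direct => Mode::L3,
    }
}

/// D17: these decision kinds are forced to L4 regardless of plan mode.
fn forced_l4(decision_kind: &str) -> bool {
    matches!(decision_kind, "architecture" | "schema" | "naming-canonical")
}

/// Strictest of (plan mode, pattern default), then the D17 cap.
/// Assumption: "forced to L4" acts as an upper bound, not a floor.
pub fn effective_mode(plan_mode: Mode, kind: PatternKind, decision_kind: &str) -> Mode {
    let mut mode = plan_mode.min(pattern_default(kind));
    if forced_l4(decision_kind) {
        mode = mode.min(Mode::L4);
    }
    mode
}
```

Ordering the enum from strictest to loosest keeps "strictest wins" a plain `min`, so adding another scope (for example a global policy row) would only add one more `min` call.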
| Command | Owner | Purpose |
|---|---|---|
| `/scenario` | this plugin | Create / list / retire scenarios (strict GWT). |
| `/round` | this plugin | Start R1 or R2+ with regression auto-replay. |
| `/plan` | this plugin | Create plan, manage Requirements, approve gate. |
| `/req` | this plugin | Snapshot requirements (SNAPSHOT-ONLY). |
| `/decide` | this plugin | Append Decision with kind (proposal → critique → consensus / dissensus). Carries reversal_plan + blast_radius_score (D28). |
| `/consensus` | this plugin | Drive a multi-agent consensus round (proposal / critique / convergence), gated by the active CollaborationPattern's shape (D20, D26). |
| `/autonomy` | this plugin | Inspect / change AutonomyPolicy per scope; surface circuit breaker. Includes pattern_kind, l5_threshold, pattern_depth_cap, plan_single_session_lock. |
| `/agent-note` | this plugin | Append AgentNote (M1 blackboard): hypothesis / observation / question / handoff / dissent / evidence. |
| `/pattern` (D22, v0.5) | this plugin | Create / list / advance CollaborationPattern. Sub-commands for workflow / graph / swarm / agents-as-tools manifests. |
| `/sdi-status` | this plugin | Snapshot the daemon's resolved state: active plan, scenarios, autonomy mode, active patterns, claim ledger. |
| `/goal` | Claude Code built-in | Orthogonal. SDI does not intercept it. |
This is a Claude Code plugin whose body is a Rust workspace. The plugin shell, the cli, and the daemon are all surfaces of the same repository.
sdi-plugin/
├── Cargo.toml # workspace root (resolver = 2)
├── crates/
│ ├── cli/ # `sdi` binary — user/LLM entry point. Hosts `sdi mcp` subcommand.
│ ├── daemon/ # `sdid` binary — background daemon (HTTP + unix socket).
│ ├── mcp/ # stdio MCP server library, embedded into cli.
│ ├── core/ # Domain model: Plan / Requirement / Decision / Scenario / Round / AutonomyPolicy / CollaborationPattern + AgentNote / AgentSpec.
│ └── db/ # SQLite storage adapter (rusqlite + r2d2; FTS5 keyword search, vector search deferred).
├── plugin/ # Claude Code plugin shell
│ ├── .claude-plugin/plugin.json
│ ├── .mcp.json
│ ├── hooks/hooks.json
│ ├── web/ # dashboard SPA (Vite/React 19/Tailwind 4); `sdid` serves dist/.
│ └── README.md
├── assets/ # logo + brand SVGs
├── docs/
│ ├── PRD.md # canonical product spec (D1–D29)
│ ├── ARCHITECTURE.md # this repo's architecture + multi-agent layers
│ └── …
├── README.md # this file
├── CLAUDE.md # AI context for contributors / agents
├── LICENSE # MIT
└── .gitignore
The dashboard SPA lives in this repository at plugin/web/ and is served directly by sdid over tower-http ServeDir. The autonomy panel, decision timeline, agent-notes blackboard, and pattern views all render from the daemon's HTTP API + /events SSE.
Two separate org repositories accompany this one:
- `sdi-desktop`: Tauri 2 shell. Bundles `plugin/web/dist` and spawns `sdid` as a sidecar. Mirrors the resolved autonomy mode + active-pattern badge into the window title and tray, and exposes the circuit breaker as a global shortcut (Cmd+Shift+L / Ctrl+Shift+L).
- `sdi-docs`: Astro/Starlight landing + bilingual (ko / en) guide site. Presentation layer mirroring this repo's `docs/PRD.md`.
Pre-built sdi + sdid binaries (macOS + Linux × x86_64 + aarch64) ship via the Claude Code plugin marketplace — no Rust toolchain required.
/plugin marketplace add scenario-driven/sdi-plugin
/plugin install sdi@scenario-driven-sdi-plugin
The plugin shell lives under plugin/; the marketplace pulls it from the dist branch (binaries attached to each GitHub Release).
cargo build
Builds two binaries: sdi (cli) and sdid (daemon). To rebuild the dashboard SPA:
pnpm --dir plugin/web install
pnpm --dir plugin/web build
SDI is the direct successor to Clawket v3.0 (operated for roughly one month). Clawket validated that LLMs can carry long-running work state through a local SQLite + daemon + MCP architecture, but its task-centric Jira-lineage model did not enable LLM-driven verification, automatic regression, or multi-agent governance. SDI re-centers on scenarios and adds the multi-agent substrate to close those gaps.
Migration mapping is in docs/PRD.md §9. SDI is a new tool in a new org, not a Clawket version bump.
MIT. See LICENSE.