feat(decisions): Plumb-like architectural decision tracking, provider-agnostic #2
Closed
laurentftech wants to merge 120 commits into main
Adds `spec-gen test` command that transforms OpenSpec scenarios into
executable test scaffolding with real assertions — not just TODO placeholders.
Key features:
**THEN clause pattern engine** (no LLM required)
- Matches common spec patterns: HTTP status codes, property presence,
error messages, field value assertions
- Emits typed assertions for all 5 frameworks without any API call
- Falls back to structured TODO placeholder when no pattern matches
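As a rough illustration of the pattern-engine idea above, the sketch below matches two THEN-clause shapes with regular expressions and falls back to a TODO placeholder. The function name `matchThenClause`, the regexes, and the emitted vitest-style assertions are assumptions made for this example; they are not taken from `then-matchers.ts`.

```typescript
// Hypothetical sketch: names, regexes and output strings are illustrative only.
type Assertion = { code: string; fromPattern: boolean };

const matchers: Array<{ pattern: RegExp; render: (m: RegExpMatchArray) => string }> = [
  // e.g. "the response status is 401"
  {
    pattern: /\bstatus(?: code)? (?:is|of) (\d{3})\b/i,
    render: (m) => `expect(response.status).toBe(${m[1]});`,
  },
  // e.g. "the response contains a token property"
  {
    pattern: /\bcontains an? (\w+) (?:property|field)\b/i,
    render: (m) => `expect(body).toHaveProperty("${m[1]}");`,
  },
];

function matchThenClause(clause: string): Assertion {
  for (const { pattern, render } of matchers) {
    const m = clause.match(pattern);
    if (m) return { code: render(m), fromPattern: true };
  }
  // No pattern matched: emit a structured placeholder instead of guessing.
  return { code: `// TODO: assert "${clause}"`, fromPattern: false };
}
```

Keeping the matchers in a plain ordered array means the first matching pattern wins, which keeps the behaviour deterministic without any API call.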
**Multi-framework support**
- vitest, playwright (TypeScript/JavaScript)
- pytest (Python)
- gtest, catch2 (C++)
- Auto-detects framework from package.json / pyproject.toml / CMakeLists.txt
**Spec coverage dashboard** (`--coverage`)
- Scans test files for `// spec-gen: {JSON}` metadata tags
- Reports coverage % by domain with drift warnings
- `--discover` uses LLM semantic matching to attribute existing tests
- `--min-coverage <n>` CI gate: exits 1 when coverage drops below threshold
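The coverage scan and CI gate described above could look roughly like the following. The tag fields (`domain`, `scenario`) and all function names are assumptions for illustration, since the PR does not show the real metadata shape; treating an empty spec as 100% covered is also an arbitrary choice made here.

```typescript
// Hypothetical tag shape: the real metadata fields are not shown in this PR.
interface SpecTag {
  domain: string;
  scenario: string;
}

// Collect every `// spec-gen: {JSON}` comment found in a test file's source.
function extractSpecTags(source: string): SpecTag[] {
  const tags: SpecTag[] = [];
  for (const line of source.split("\n")) {
    const m = line.match(/^\s*\/\/ spec-gen: (\{.*\})\s*$/);
    if (!m) continue;
    try {
      tags.push(JSON.parse(m[1]) as SpecTag);
    } catch {
      // Ignore malformed tags rather than failing the whole scan.
    }
  }
  return tags;
}

// Percentage of scenarios that have at least one tagged test.
function coveragePercent(covered: number, total: number): number {
  return total === 0 ? 100 : Math.round((covered / total) * 100);
}

// Mirrors the documented gate: exit 1 only when coverage is below the threshold.
function gateExitCode(percent: number, minCoverage: number): number {
  return percent < minCoverage ? 1 : 0;
}
```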
**LLM enrichment** (`--use-llm`)
- Pattern engine runs first; LLM fills unmatched THEN clauses only
- Reads mapped function source from mapping.json to ground assertions
**MCP tools**: `generate_tests` + `get_test_coverage`
- AI agents can generate tests and query coverage without the CLI
New files:
- src/types/test-generator.ts
- src/core/test-generator/{scenario-parser,then-matchers,test-generator,
test-writer,coverage-analyzer,framework-detector,index}.ts
- src/core/test-generator/renderers/{vitest,playwright,pytest,gtest,catch2,index}.ts
- src/cli/commands/test.ts
- 39 new unit tests across 4 test files, all passing
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
…eration
Scenario-level business logic controls via HTML comment annotations:
<!-- spec-gen-test: priority=high tags=smoke,regression -->
<!-- spec-gen-test: skip reason="deprecated in v3, tracked in #1234" -->
Changes:
- ParsedScenario gains skip, skipReason, tags, priority fields
- scenario-parser reads <!-- spec-gen-test: ... --> comments inline
- Skipped scenarios filtered by default; opt-in via includeSkipped=true
- Tag filtering: --tags smoke,regression requires all listed tags
- Domain exclusion: --exclude-domains database
- test-generator sorts scenarios by priority (high → normal → low) before grouping so priority groups appear first and consume LLM budget first
- 6 new unit tests covering all annotation cases
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
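A minimal sketch of how such an annotation comment might be parsed into the fields listed above. The interface and function names are hypothetical, and the handling of the bare `skip` flag is a simplification: it would also fire if the word appeared on its own inside a quoted reason.

```typescript
// Hypothetical sketch; not the actual scenario-parser implementation.
interface ScenarioAnnotation {
  skip: boolean;
  skipReason?: string;
  tags: string[];
  priority: "high" | "normal" | "low";
}

function parseAnnotation(comment: string): ScenarioAnnotation | null {
  const m = comment.match(/<!--\s*spec-gen-test:\s*(.*?)\s*-->/);
  if (!m) return null;
  const body = m[1];
  const result: ScenarioAnnotation = { skip: false, tags: [], priority: "normal" };

  // Bare flag: `skip` with no value.
  if (/(^|\s)skip(\s|$)/.test(body)) result.skip = true;

  // key=value pairs, where the value is either "quoted" or a run of non-spaces.
  const pairRe = /(\w+)=(?:"([^"]*)"|(\S+))/g;
  let kv: RegExpExecArray | null;
  while ((kv = pairRe.exec(body)) !== null) {
    const key = kv[1];
    const value = kv[2] ?? kv[3] ?? "";
    if (key === "reason") {
      result.skipReason = value;
    } else if (key === "tags") {
      result.tags = value.split(",").filter(Boolean);
    } else if (key === "priority" && (value === "high" || value === "normal" || value === "low")) {
      result.priority = value;
    }
  }
  return result;
}
```

Unknown tokens are ignored in this sketch, so extra markers inside the comment do not break parsing.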
OpenSpecFormatGenerator.addScenario() now infers and emits
<!-- spec-gen-test: ... --> annotations automatically based on the
scenario name and THEN clause — zero extra LLM cost.
Rules (heuristic, override-able by editing the annotation):
tags=smoke → scenario name/THEN contains: success, valid, happy,
creat, register, accept
tags=regression → contains: invalid, error, fail, missing, reject,
unauthori, forbidden, duplicate, conflict, expired, wrong
priority=high → contains: auth, login, jwt, token, password, payment,
billing, permission, role, security, access
priority=low → contains: legacy, deprecated, backcompat, backward
Annotations are inserted between the #### Scenario: heading and the
first - **GIVEN** bullet, making them invisible in rendered Markdown.
Only emitted when at least one rule fires; unannotated scenarios stay clean.
Developers can override or extend these annotations after generation.
The spec-gen test command reads them to filter/prioritize test generation.
6 new unit tests added to openspec-format-generator.test.ts.
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
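The keyword rules above can be sketched as a plain substring check over the scenario name and THEN clause. The keyword lists are copied from the commit message; the function name, the output ordering, and the choice to let `high` win over `low` when both fire are assumptions made for this example.

```typescript
// Hypothetical sketch of the heuristic; not the actual addScenario() code.
const SMOKE = ["success", "valid", "happy", "creat", "register", "accept"];
const REGRESSION = [
  "invalid", "error", "fail", "missing", "reject", "unauthori",
  "forbidden", "duplicate", "conflict", "expired", "wrong",
];
const HIGH = [
  "auth", "login", "jwt", "token", "password", "payment",
  "billing", "permission", "role", "security", "access",
];
const LOW = ["legacy", "deprecated", "backcompat", "backward"];

// Returns the annotation comment, or null when no rule fires.
function inferAnnotation(name: string, thenClause: string): string | null {
  const text = `${name} ${thenClause}`.toLowerCase();
  const has = (words: string[]) => words.some((w) => text.includes(w));

  const tags: string[] = [];
  if (has(SMOKE)) tags.push("smoke");
  if (has(REGRESSION)) tags.push("regression");
  const priority = has(HIGH) ? "high" : has(LOW) ? "low" : null;

  const parts: string[] = [];
  if (priority) parts.push(`priority=${priority}`);
  if (tags.length > 0) parts.push(`tags=${tags.join(",")}`);
  return parts.length > 0 ? `<!-- spec-gen-test: ${parts.join(" ")} -->` : null;
}
```

One caveat of naive substring matching: "invalid" contains "valid", so in this sketch a scenario such as InvalidCredentials would receive both tags. Whether the real implementation guards against that is not stated in the commit message.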
The previous README opened with implementation mechanics ("reverse-engineer
specs") and buried the most compelling capabilities. Key changes:
- New hero tagline: leads with outcomes, not mechanism
- Expanded problem statement explaining what "closing the loop" means
- Capabilities at a Glance table: all major features in one view, with
API key requirements and speed, including spec-gen test
- Languages supported listed prominently up front (TS/JS/Python/Go/Rust/...)
- What It Does: added spec-driven test generation as item 4 with bullet
detail on pattern engine, coverage, CI gate, annotation controls
- Commands table: added spec-gen test and spec-gen test --coverage rows
- New "Spec-Driven Tests" section (140 lines) covering:
framework support table (vitest/playwright/pytest/gtest/catch2)
generated output example with real assertions
THEN clause pattern engine with pattern → assertion table
spec coverage report example output
priority/tags annotation controls + auto-generation during generate
full options reference
- CI/CD intro updated to mention test is also deterministic/no API key
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
The AI agent value proposition was undersold: agents spend significant context budget on discovery (reading files, grepping, inferring architecture) before any useful work begins. spec-gen eliminates this.
Changes:
- Tagline: "Eliminate codebase discovery overhead for AI agents"
- Problem section: added dedicated paragraph on agent discovery cost, explaining why every session starting from zero is expensive
- Solution paragraph: explicit mention of passive context delivery and MCP active tools as two complementary layers
- Capabilities table: split agent capability into two rows: "pre-loaded architectural context via CODEBASE.md" and "graph-based navigation via MCP"
- Agent Setup section: new opening that frames the section around the problem being solved (discovery cost), not the setup mechanics; list of what agents "arrive knowing" before starting a task
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
Token costs are a concrete, daily concern for developers using AI agents. Framing spec-gen's agent value around token savings makes the benefit immediate and quantifiable.
- Problem: "burn thousands of tokens just answering what does this code do"
- "Token budgets are real costs, and discovery is pure waste"
- Passive context: "low token cost" vs active tools "per-call token cost"
- CODEBASE.md: "costs a fraction of what reading the equivalent source files would, and it's already pre-digested into what the agent needs"
- orient tool: "one round-trip instead of a dozen Read calls"
- CODEBASE.md section: "~100 lines instead of reading dozens of source files"
- Closing: "at the cost of two small file reads instead of an unbounded exploration loop"
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
Swift call graph support was merged upstream (clay-good#40, commit 2bd6bf6) but laurentftech/spec-gen main is not yet synced. The feature is confirmed shipped — updating all three language lists in the README accordingly. https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
…d annotations
- All renderers now emit a warning comment before pattern-matched assertions: "⚠️ AUTO-GENERATED ASSERTIONS — verify correctness manually". It is only shown when at least one assertion came from the heuristic pattern engine (fromPattern: true); LLM-written or placeholder-only tests are unaffected.
- Auto-inferred spec-gen-test annotations emitted during spec-gen generate now carry an explicit (auto) marker, e.g.: <!-- spec-gen-test: priority=high tags=smoke (auto) -->. The parser ignores the marker, so existing parsing logic is unchanged.
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
Adds a read-only digest layer on top of existing OpenSpec files. Source specs and all tooling are completely unaffected.
spec-gen digest                  Print plain-English summary to stdout
spec-gen digest --save           Write to openspec/digest.md
spec-gen digest --output <f>     Write to a custom path
spec-gen digest --domains <>     Filter to specific domains
Each domain section lists requirements with one line per scenario:
**Login**
- **ValidCredentials**: User submits valid credentials → session token returned.
- **InvalidCredentials**: Invalid credentials submitted → 401 with error message.
Template-based (no LLM, no API key required). Fast and deterministic.
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
After detecting drift, spec-gen drift --suggest-tests scans test files
for spec-gen: metadata tags and lists the files that cover each drifted
domain, plus a ready-to-run command:
spec-gen drift --suggest-tests
Affected domains with test suggestions:
auth (2 files)
→ spec-tests/auth/Login.test.ts
→ spec-tests/auth/Session.test.ts
payment (1 file)
→ spec-tests/payment/Checkout.test.ts
Run: npx vitest spec-tests/auth/Login.test.ts ...
No LLM required. Falls back gracefully when no spec-gen test files
exist yet ("Run spec-gen test to generate them").
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
- Capabilities table: add digest and drift --suggest-tests rows
- New ## Spec Digest section with example output and all options
- Drift Detection: new ### Drift → Tests subsection with full example
- Commands table: add digest and drift --suggest-tests entries
- Drift Options: add --suggest-tests flag
- Intro tagline: mention plain-English review layer
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
- test-writer.ts: remove unused `header` variable in buildMergedContent() - test.ts: remove unused SPEC_GEN_CONFIG_REL_PATH import https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
- test.ts: use GeneratedTestFile[] in displayGenerationSummary signature instead of inline structural type (was missing ParsedScenario fields) - test-generator.test.ts: add skip/tags/priority to mock ParsedScenario objects to satisfy updated interface - openspec-format-generator.test.ts: add metadata field to makeResult() PipelineResult fixture (added to interface in upstream v1.2.8) https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
formatRequirementName preserves PascalCase when input has no separators, so getScenarioBlock must search for the original PascalCase name rather than a lowercased variant. https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
…d passive model behavior
Three failure modes addressed:
- User-specified target caused model to skip coverage/impact steps (Steps 3c–6b) by treating them as discovery-only; added explicit shortcut-path rule
- "Run tests after each step" was only in the plan template, not a model instruction; added explicit mandate in Step 7 of plan skill
- Execute skill allowed model to ask for confirmation between every step and to offer "proceed at your own risk" on red baselines; both are now hard-blocked
https://claude.ai/code/session_01Gw7jciiyYMzxPt6UrtB1ro
…ctor skills These skills provide structured workflows for planning and executing refactoring operations on codebases using spec-gen's static analysis capabilities. - spec-gen-plan-refactor: Identifies high-priority refactoring targets, analyzes impact, and generates a detailed refactoring plan - spec-gen-execute-refactor: Applies the refactoring plan step-by-step with test verification at each stage Both skills integrate with spec-gen's MCP server tools for codebase analysis, impact assessment, and drift detection.
…dates Bumps the npm_and_yarn group with 2 updates in the / directory: [@hono/node-server](https://github.com/honojs/node-server) and [hono](https://github.com/honojs/hono). Updates `@hono/node-server` from 1.19.10 to 1.19.13 - [Release notes](https://github.com/honojs/node-server/releases) - [Commits](honojs/node-server@v1.19.10...v1.19.13) Updates `hono` from 4.12.7 to 4.12.12 - [Release notes](https://github.com/honojs/hono/releases) - [Commits](honojs/hono@v4.12.7...v4.12.12) --- updated-dependencies: - dependency-name: "@hono/node-server" dependency-version: 1.19.13 dependency-type: indirect dependency-group: npm_and_yarn - dependency-name: hono dependency-version: 4.12.12 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com>
…n/npm_and_yarn-84176cb2e3 chore(deps): bump the npm_and_yarn group across 1 directory with 2 updates
Use resolveLLMProvider() function to support all LLM providers including no-key providers like mistral-vibe, claude-code, copilot, gemini-cli, and cursor-agent. This makes spec-gen test consistent with other commands that use LLM functionality.
The test command now properly detects and uses:
- ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENAI_COMPAT_API_KEY, OPENAI_API_KEY
- Config-based providers that don't require API keys
- Project-level LLM configuration from .spec-gen/config.json
Language-agnostic (TS/Python/C++) skill that reads the implementation and spec contract before writing any assertion. Includes run→fix loop and get_test_coverage report. No stubs, no placeholders. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…skill - spec-gen test is now coverage-only (--coverage mode promoted to default) - removed generation mode: produced stubs even with --use-llm + Mistral - spec-gen-write-tests skill/workflow now emits spec-gen annotation tags so get_test_coverage can track written tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…test writing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add 8 OpenCode/Claude-compatible skills in examples/opencode-skills/ (converted from Mistral Vibe: plain MCP tool call prose, no XML blocks)
- Extend setup command with 'claude' and 'opencode' install targets
- Claude Code skills install to .claude/skills/<name>/SKILL.md
- OpenCode skills install to .opencode/skills/<name>/SKILL.md
- Both targets checked by default in interactive mode (opencode off)
- Non-TTY default includes claude alongside vibe/cline/gsd
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove all pre-checked defaults from the checkbox — user must consciously pick their agent tool(s) - Replace the silent non-TTY fallback with a clear error message directing users to use --tools instead Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without --force, existing files are skipped (install-only behaviour). With --force, all files are overwritten — useful after upgrading spec-gen to pull in updated skill content. Report now shows 'created / updated / already up-to-date' counts and hints at --force when nothing changed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tree-sitter-swift pulls tree-sitter-cli as a dependency; its postinstall script downloads a Rust binary from GitHub, breaking installs in corporate networks without external internet access.
- Add stubs/tree-sitter-cli-stub/ (empty package, no postinstall)
- Override tree-sitter-cli → stub via npm overrides in package.json
- Remove tree-sitter-cli from optionalDependencies
- Include stubs/ in published files so the override resolves correctly
Language parsers ship precompiled .node prebuilds for all platforms; the CLI is never needed at runtime.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
npm ci was failing in CI because the lock file still referenced tree-sitter-cli from optionalDependencies before the overrides switch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-stub fix: stub tree-sitter-cli to prevent GitHub download on npm install
…file detection - Use git diff --cached --name-only directly instead of getChangedFiles (which only looks at committed changes, missing staged-but-not-committed files) - When verified=0 but drafts exist, prompt to run --consolidate (TTY) or print a message (non-TTY) instead of passing silently - Import execFile/promisify for the direct git call Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instead of passing silently, all gate-blocked states now output JSON with a 'reason' field (exit 1) so the agent reads it, stops, and presents the situation to the user before acting:
- reason: drafts_pending_consolidation (drafts recorded but consolidation never ran)
- reason: no_decisions_recorded (source files staged with empty store)
CLAUDE.md updated with handling instructions for each reason.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lock Without the return, process.exitCode=1 was immediately overwritten by the process.exitCode=0 at the end of the empty-store branch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
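The bug described above is the classic missing early return. The sketch below reproduces it with a stand-in object instead of Node's real `process`; the function names and the `storeIsEmpty` condition are hypothetical and only serve to show the control flow.

```typescript
// Hypothetical stand-in for the gate; `proc` mimics Node's process.exitCode.
type Proc = { exitCode: number };

function gateWithoutReturn(proc: Proc, storeIsEmpty: boolean): void {
  if (storeIsEmpty) {
    proc.exitCode = 1; // intended to block the commit
    // Missing `return`: execution falls through to the line below.
  }
  proc.exitCode = 0; // overwrites the 1 that was just set
}

function gateWithReturn(proc: Proc, storeIsEmpty: boolean): void {
  if (storeIsEmpty) {
    proc.exitCode = 1;
    return; // stop before the success path resets the exit code
  }
  proc.exitCode = 0;
}
```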
When --consolidate runs as fallback (no drafts), it was using baseRef:'auto' which resolves to the merge-base with main — showing the entire branch diff. This caused 25 spurious decisions to be extracted on every commit. Fix: default baseRef to 'HEAD' so the fallback only sees staged changes for the current commit. Also adds lastConsolidatedAt to DecisionStore so the gate skips the no_decisions_recorded warning for 1 hour after consolidation ran and found nothing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
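A small sketch of the one-hour suppression window mentioned above. It assumes `lastConsolidatedAt` is stored as an ISO-8601 timestamp string, which the commit message does not confirm; the function name is also hypothetical.

```typescript
// Hypothetical sketch: assumes lastConsolidatedAt is an ISO-8601 string.
const ONE_HOUR_MS = 60 * 60 * 1000;

// True when consolidation ran less than one hour before `now`.
function recentlyConsolidated(lastConsolidatedAt: string | undefined, now: Date): boolean {
  if (!lastConsolidatedAt) return false;
  const ranAt = Date.parse(lastConsolidatedAt);
  if (Number.isNaN(ranAt)) return false; // unreadable timestamp: do not suppress the warning
  return now.getTime() - ranAt < ONE_HOUR_MS;
}
```

Passing `now` as a parameter instead of calling `new Date()` inside keeps the check easy to test with fixed timestamps.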
Use the canonical fileExists from utils/command-helpers.js instead of local duplicate. This reduces maintenance burden and ensures single source of truth.
Move duplicate parseJSON helper from three decisions/ files to shared src/utils/misc.ts. Handles both JSON arrays and objects from LLM output with markdown code fence stripping.
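For illustration, a helper of this kind might look like the sketch below. This is not the code from src/utils/misc.ts: the signature, the null-on-failure behaviour, and the fallback that searches for the first bracket are assumptions.

```typescript
// Hypothetical sketch of a tolerant JSON parser for LLM output.
const FENCE = "`".repeat(3); // built at runtime to keep this example fence-safe
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

function parseJSON<T = unknown>(raw: string): T | null {
  // 1. Prefer the content of a markdown code fence when one is present.
  const fenced = raw.match(FENCED_BLOCK);
  const text = (fenced ? fenced[1] : raw).trim();

  // 2. Try the text as-is.
  try {
    return JSON.parse(text) as T;
  } catch {
    // fall through to the bracket search below
  }

  // 3. Otherwise extract the outermost array or object from surrounding prose.
  const start = text.search(/[\[{]/);
  if (start === -1) return null;
  const close = text[start] === "[" ? "]" : "}";
  const end = text.lastIndexOf(close);
  if (end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1)) as T;
  } catch {
    return null;
  }
}
```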
Prevents git diff --cached from showing an empty diff when HEAD is used as base (HEAD..HEAD is always empty in pre-commit context). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces baseRef-based diff with stagedOnly mode that calls git diff --cached directly. Previously baseRef='HEAD' produced an empty diff in pre-commit context (HEAD..HEAD is always empty). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Agents must present decisions to the user before approving them — never auto-approve. Mirrors the CLAUDE.md gate flow with all three reason values (verified, drafts_pending_consolidation, no_decisions_recorded). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Agents (especially autonomous ones like GLM-4.6/Big Pickle via Crush) were calling approve_decision without asking the user. The tool description now carries a hard stop instruction that all models see at tool-selection time, regardless of their instruction-following level. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codestral and other Mistral models stop prematurely with "Task completed" despite the agent-guard plugin nudge. Adding the rule directly to AGENTS.md ensures it is read via Crush's <memory> section regardless of whether experimental.chat.system.transform is honoured by the provider. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codestral stops after completing one change in the plan instead of continuing to the next. The loop contract makes the termination condition unambiguous: only stop when all items are ✅. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add three OpenCode plugins enforcing the spec-gen SDD workflow when oh-my-openagent is detected:
- anti-laziness: re-prompts agent on premature idle using todo.updated and session.idle events (replaces unreliable tool.execute.after on todowrite)
- spec-gen-enforcer: nudges before structural file writes, gates on idle, preserves decisions across context compaction
- spec-gen-decision-extractor: spawns a Librarian sub-session to detect architectural changes after file writes; falls back to direct OpenAI-compat call if oh-my-openagent is not available
setup command detects OMOA via config file presence and opencode.json plugin list, pre-checks the new omoa option in the interactive checkbox, and prints a config snippet after install showing how to wire sisyphus-sdd.md via prompt_append.
Also export validateDirectoryImpl from mcp-handlers/utils for direct testing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
spec-gen-context-injector.ts closes the SDD triangle by injecting OpenSpec contracts into Sisyphus context before every coding turn:
- experimental.chat.system.transform: compact domain index on every turn (domain name + one-line purpose, ~5 lines per domain)
- tool.execute.after: tracks which spec domains are touched per session via mapping.json + path heuristics
- experimental.session.compacting: injects full spec content of active domains (capped at 4 specs × 3000 chars) so contracts survive compaction
spec-gen-decision-extractor.ts pre-scores files via dep-graph before spawning a Librarian session:
- reads .spec-gen/analysis/dependency-graph.json (inDegree, pageRank, fileScore) to classify files as hub vs low-centrality
- skips LLM call for files with inDegree=0, pageRank<0.1, fileScore<0.3
- injects structural context (inDegree, pageRank, fileScore) into the Librarian prompt so it can weight the decision accordingly
setup.ts: adds spec-gen-context-injector.ts to the omoa manifest.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
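The pre-scoring rule above can be sketched as a simple predicate. The thresholds come from the commit message; the constant names, the reading that all three conditions must hold together, and the decision to never skip files missing from the graph are assumptions for this example.

```typescript
// Hypothetical sketch of the low-centrality check; not the plugin's actual code.
interface FileScore {
  inDegree: number;
  pageRank: number;
  fileScore: number;
}

const PAGE_RANK_THRESHOLD = 0.1;
const FILE_SCORE_THRESHOLD = 0.3;

// True when the file looks structurally unimportant enough to skip the LLM call.
function shouldSkipLLM(score: FileScore | null): boolean {
  // Assumption: a file absent from the dependency graph is still classified.
  if (score === null) return false;
  return (
    score.inDegree === 0 &&
    score.pageRank < PAGE_RANK_THRESHOLD &&
    score.fileScore < FILE_SCORE_THRESHOLD
  );
}
```

Requiring all three signals to agree errs on the side of calling the LLM, so a hub file is unlikely to be skipped because of one low metric.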
Export scoreFromDepGraph, loadSpecDomains, readSpec, fileToSpecDomain with optional rootDir parameter so tests can use tmpdir fixtures without touching the real project tree.
spec-gen-decision-extractor.test.ts (8 tests):
- scoreFromDepGraph: missing file, invalid JSON, node not found, exact path match, suffix match, hub detection (inDegree/pageRank/fileScore), low-centrality node, missing metrics fields
spec-gen-context-injector.test.ts (20 tests):
- loadSpecDomains: missing dir, empty dir, single/multi domain, missing spec.md, PARTIAL SPEC prefix, missing Purpose section
- readSpec: missing file, full content, section-boundary truncation, flat-content truncation
- fileToSpecDomain: no match, path heuristic, parent dir heuristic, mapping.json array/object form, fallback, malformed JSON
vitest.config.ts: include examples/**/*.test.ts in the default test run.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… crash OpenCode tried to load every exported symbol as a plugin; scoreFromDepGraph is a plain helper function (not Plugin-typed), causing a null-deref on S.auth during TUI bootstrap. Removing the export fixes the startup crash. Also fix duplicate logger import and missing afterEach in utils.test.ts, and refactor decision-extractor to use direct HTTP calls instead of the Librarian agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. args.filePath (camelCase): opencode edit tool passes filePath not path
2. event.properties.sessionID: session.idle event wraps sessionID under properties, not at the top level, so the filter was always empty
3. export on scoreFromDepGraph: opencode loads all exports as plugins, causing S.auth null crash at TUI bootstrap
Loop verified end-to-end: tool.execute.after captures writes, session.idle triggers extractAndRecord, Mistral API returns 200, NOT_ARCHITECTURAL correctly rejected.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
output.output from the edit tool returns "Edit applied successfully." — not the file content — so the LLM had no code to classify. Fix reads the file with readFileSync after the edit so the prompt contains real content. Also fix agent-guard nudge: tool name is spec-gen_record_decision, not record_decision (the agent was calling an unavailable tool). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oduce an interactive dependency graph viewer and update analyzer logic/tests to improve import parsing, duplicate detection, and refactor analysis.
Made-with: Cursor
The caching layer was unnecessary complexity that could lead to stale data issues. The mapping.json file is small enough that direct reads are more maintainable and reliable.
de1ee3b to f24aac3
…import OpenCode loads every export from a plugin file as a Plugin — scoreFromDepGraph returning FileScore|null caused a null-deref on S.auth at TUI bootstrap. The previous fix removed the export, breaking the test import. This extracts scoreFromDepGraph, FileScore, and the three threshold constants into a companion helpers file that is safe to export from (not a plugin file). The plugin imports from helpers internally; the test imports from helpers directly. setup --tools omoa now copies the helpers file alongside the plugin. Fixes: 10 failing tests in spec-gen-decision-extractor.test.ts
- Move spec-gen-decision-extractor-helpers.ts → lib/ - Create spec-gen-context-injector-helpers.ts in lib/ - Update imports and setup.ts manifest accordingly - Minor formatting cleanup in enforcer plugin and store.ts
…lassification - Remove directive "should be in schemaFiles/apiFiles" from hints - Add class/function count per file to inform LLM classification - Pass llmContext to buildStructuredHints for density calculation - Update test calls to include llmContext parameter Helps Stage 1 distinguish between pure schema files (many classes, few functions) and orchestrator/god-function files (few classes, many functions).
- Change helper imports from .js to .ts in plugin files and tests - Ensure Vitest resolves TypeScript sources in examples/ directory - Add !examples/opencode/plugins/lib/ to .gitignore to track helper files - Fixes CI: Cannot find module './lib/spec-gen-*helpers.js'
- Add !examples/opencode/plugins/lib/ exception - Ensures helper .ts files are not ignored by the global lib/ rule
Closes clay-good#69
What this PR adds
Plumb (Drew Breunig) intercepts git commits, extracts architectural decisions from Claude conversations via LLM, and syncs them to spec files. Its limitation: Claude only.
This PR implements the equivalent directly in spec-gen, reusing the existing LLM service (9 providers). Two improvements over Plumb:
- … record_decision MCP, not only at commit time
This closes the Breunig triangle for spec-gen:
- check_spec_drift: code → spec (already existed)
- generate_tests: spec → tests (already existed)
Decision lifecycle
New files
- src/core/decisions/store.ts (.spec-gen/decisions/pending.json)
- src/core/decisions/consolidator.ts
- src/core/decisions/extractor.ts
- src/core/decisions/verifier.ts
- src/core/decisions/syncer.ts
- src/core/decisions/*.test.ts
- src/cli/commands/decisions.ts (spec-gen decisions CLI command)
- src/api/decisions.ts
- src/core/services/mcp-handlers/decisions.ts
Modified files
- src/types/index.ts: PendingDecision, DecisionStore, DecisionStatus
- src/constants.ts: SPEC_GEN_DECISIONS_SUBDIR, DECISIONS_* constants
- src/cli/commands/mcp.ts
- src/cli/commands/generate.ts: --force clears generation cache + removes --reanalyze
- src/core/generator/openspec-writer.ts: --force removes stale domain dirs
- src/core/analyzer/ai-config-generator.ts: AGENTS.md target + record_decision in MCP instructions
- CLAUDE.md: record_decision in MCP tools table
- README.md
- examples/cline-workflows/ + examples/mistral-vibe/skills/: record_decision step in all agent skills
MCP tools
- record_decision
- list_decisions
- approve_decision
- reject_decision
- sync_decisions (dryRun supported)
CLI
Follow-up ideas
- list_decisions in orient results so the agent sees in-progress decisions at task start
- decisions --export to push approved decisions to openspec/decisions/ for team sharing
- --commit <hash> to replay extraction on a past commit