Spec Kit v0.1 — agentic SDLC workflow foundation#1
Conversation
Establish a solution-agnostic, spec-driven workflow template covering the full SDLC from idea to retrospective. Built on the foundation in the original v0.1 draft and enhanced with patterns from GitHub Spec Kit, Amazon Kiro, BMAD-METHOD, EARS notation, and Michael Nygard ADRs. Adds: - Root governance: README, AGENTS.md (cross-tool), CLAUDE.md, constitution - Workflow docs: spec-kit.md, workflow-overview, quality-framework, traceability, ears-notation - Steering files: product, tech, ux, quality, operations - 16 stage and meta templates with YAML frontmatter and quality gates - 3 seed ADRs (record decisions, adopt SDD, adopt EARS) - 13 Claude Code subagents with narrow scope and tool restrictions (orchestrator, analyst, pm, ux/ui designer, architect, planner, dev, qa, reviewer, release-manager, sre, retrospective) - 16 slash commands (/spec:* per stage + /spec:clarify, /spec:analyze, /adr:new) - Examples and specs placeholders, .gitignore
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f3c3cd350a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| --- | ||
| description: Stage 9 — Review. Invokes reviewer to produce review.md and validate traceability.md. Read-only on artifacts. | ||
| argument-hint: [feature-slug] | ||
| allowed-tools: [Read, Grep, Bash] |
There was a problem hiding this comment.
Allow /spec:review to write its required artifacts
The command is restricted to allowed-tools: [Read, Grep, Bash], but its own procedure requires writing traceability.md, producing review.md, and updating workflow-state.md (steps 2–4). With this permission set, stage 9 cannot complete its documented outputs, so the review phase is effectively blocked.
Useful? React with 👍 / 👎.
| --- | ||
| description: Optional gate. Cross-checks consistency between requirements, design, spec, and tasks. Reports gaps and contradictions. | ||
| argument-hint: [feature-slug] | ||
| allowed-tools: [Read, Grep] |
There was a problem hiding this comment.
Allow /spec:analyze to persist clarification reports
This command also omits Edit/Write in allowed-tools, yet step 4 says to add findings to workflow-state.md under clarifications. Without write capability, /spec:analyze cannot persist the gate output in the artifact that later stages rely on, so the documented consistency-check workflow cannot be executed as written.
Useful? React with 👍 / 👎.
…heir artifacts Both slash commands document writing review.md, traceability.md, and workflow-state.md, but their allowed-tools omitted Edit/Write — making the review and analyze stages unable to produce their documented outputs. - /spec:review: add Edit, Write (creates review.md, traceability.md; updates workflow-state.md) - /spec:analyze: add Edit (appends findings to workflow-state.md) - reviewer agent: add Edit, Write (mirrors the slash command); tighten scope wording to "only write your own artifacts" - agents README: correct reviewer's tool list Caught by Codex review on PR #1.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 53a21967ad
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| | [`memory/constitution.md`](memory/constitution.md) | Project principles loaded ahead of every workflow command | | ||
| | [`templates/`](templates/) | Blank artifacts for each stage (idea, prd, design, spec, tasks, …) | | ||
| | [`.claude/agents/`](.claude/agents/) | One subagent per SDLC role (PM, UX, architect, dev, QA, SRE, …) | | ||
| | [`.claude/commands/`](.claude/commands/) | Slash commands per workflow phase (`/spec:specify`, `/spec:plan`, …) | |
There was a problem hiding this comment.
Replace invalid
/spec:plan command reference
The README advertises /spec:plan, but this command does not exist in .claude/commands/spec/ (stage 6 is /spec:tasks). New users following the quick reference will hit a dead command at the planning phase and cannot continue the documented workflow without hunting for the correct name elsewhere, so this should be corrected to keep the main entrypoint accurate.
Useful? React with 👍 / 👎.
The README's file map advertised /spec:plan as an example slash command, but stage 6 is /spec:tasks — /spec:plan does not exist. Replaced with the real command so new users following the README don't hit a dead reference. Caught by Codex review on PR #1.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cdf82746f4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| ## Don't | ||
|
|
||
| - Don't edit existing ADRs. If you need to change a decision, supersede it (status: `Superseded by ADR-NNNN` on the old one; status: `Accepted` on the new one). |
There was a problem hiding this comment.
Preserve ADR immutability when superseding
The /adr:new procedure tells the agent to update the old ADR's status to Superseded by ..., but this conflicts with the repository rule that decided ADRs are immutable and should be superseded without edits. Following this instruction forces history rewrites of accepted ADR files, which undermines the append-only decision log and can break auditability for teams relying on ADR history.
Useful? React with 👍 / 👎.
| --- | ||
| feature: <feature-slug> | ||
| area: <AREA> # short uppercase code; used in IDs (REQ-<AREA>-NNN) | ||
| current_stage: idea # idea | research | requirements | design | specification | tasks | implementation | testing | review | release | learning | ||
| status: active # active | blocked | paused | done |
There was a problem hiding this comment.
Add machine-readable artifact statuses to workflow state
This template frontmatter defines current_stage and global status but omits an artifacts: status map, even though the workflow docs and stage commands rely on entries like idea.md: complete for gating and handoff. Starting from this template leaves no canonical machine-readable place to track per-artifact completion, so orchestration must parse free-form tables/notes and can make incorrect stage-transition decisions.
Useful? React with 👍 / 👎.
…rify ADR immutability Two P2 issues caught by Codex review on PR #1. 1. workflow-state-template.md was missing the artifacts: status map in frontmatter that docs/spec-kit.md §5.1 and the orchestrator agent depend on for stage-transition decisions. Added the canonical map covering all 13 stage artifacts; the human-readable table in the body stays as a view of the same state. 2. /adr:new said "Don't edit existing ADRs" then immediately said to update the predecessor's status to "Superseded by ADR-NNNN" — self-contradictory. Resolved by clarifying across the kit that ADR *bodies* are immutable; only the status and superseded-by pointer fields may be updated when a successor is filed. This matches what templates/adr-template.md already implies via its status enum and superseded-by frontmatter field, and aligns with industry practice (Nygard, MADR, adr-tools). Updated AGENTS.md, docs/adr/README.md, templates/adr-template.md, and the /adr:new command. ADR-0001 is left untouched (its prose is still accurate under the clarified rule and, per its own rule, its body is immutable).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 54ade3539e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ## Conventions | ||
|
|
||
| - Commands accept `$ARGUMENTS` (full string) and `$1`, `$2`, … (positional, shell-quoted). | ||
| - The first positional, when relevant, is the **feature slug**. |
There was a problem hiding this comment.
Clarify /spec:implement positional argument order
The global convention says the first positional argument is the feature slug, but /spec:implement is defined as [task-id] [feature-slug] (.claude/commands/spec/implement.md:3,12). Following the documented convention here will pass the slug as $1, which can be misinterpreted as a task ID and lead to selecting the wrong work item or failing to resolve the task at all during stage 7.
Useful? React with 👍 / 👎.
| Run **stage 9 — Review**. | ||
|
|
||
| 1. Resolve slug; verify the test report is `complete` and there are no S1/S2 open. | ||
| 2. Build / refresh `specs/<slug>/traceability.md` from artifact frontmatter (mechanical pass). |
There was a problem hiding this comment.
Stop requiring RTM generation from frontmatter alone
Step 2 requires traceability.md to be built mechanically from artifact frontmatter, but the required REQ→SPEC→TASK links are primarily modeled in body content (for example REQ headings in templates/prd-template.md and Satisfies: fields in templates/spec-template.md/templates/tasks-template.md), not frontmatter fields. Implementing this instruction literally will produce incomplete RTM rows and can incorrectly block or pass stage 9 reviews.
Useful? React with 👍 / 👎.
Two P1 inconsistencies caught by Codex review on PR #1. 1. /spec:implement takes [task-id] [feature-slug], which contradicts the global "first positional is the feature slug" convention in .claude/commands/README.md. The order is intentional — slug auto-resolves from active workflow-state and users mid-implementation type task IDs far more often than slugs — so I documented the exception in the convention rather than reorder a frequently-used command. The argument-hint remains authoritative. 2. /spec:review, docs/traceability.md, templates/traceability-template.md, and the constitution all said the RTM is "generated mechanically from artifact frontmatter". This is misleading: per-item IDs and Satisfies/Depends-on links live in body content (REQ/SPEC/T headings and inline fields) for multi-item artifacts, not per-item frontmatter. Reworded to "parsed from frontmatter + marked-up body entries" and enumerated the surfaces concretely. RTM regeneration is still mechanical, just from a structured parser, not just YAML.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 16b35ab902
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| stage: review | ||
| feature: <feature-slug> | ||
| status: draft # draft | complete | ||
| owner: qa |
There was a problem hiding this comment.
Set RTM owner to reviewer for stage-9 generation
The RTM template assigns owner: qa, but /spec:review explicitly has the reviewer generate/refresh traceability.md (.claude/commands/spec/review.md), and the agent conventions state that agents should not edit another agent’s outputs (.claude/agents/README.md). In workflows that enforce artifact ownership metadata, this creates a direct stage-9 conflict where the reviewer is instructed to modify a QA-owned artifact.
Useful? React with 👍 / 👎.
| | REQ-<AREA>-001 | SPEC-<AREA>-001 | T-001, T-002, T-003 | `src/...:42-98` | TEST-001, TEST-002 | ✅ | | ||
| | REQ-<AREA>-002 | SPEC-<AREA>-002 | T-004, T-005 | `src/...:12-60` | TEST-003 | ⚠️ 1 failing | |
There was a problem hiding this comment.
These example rows use T-001 and TEST-001 formats, which conflict with the repository’s required ID scheme (T-<AREA>-NNN, TEST-<AREA>-NNN in docs/traceability.md). If users follow this template shape, the mechanical traceability checks run during /spec:review can fail to match task/test links correctly, producing false orphan/missing-chain defects.
Useful? React with 👍 / 👎.
Two P2 issues caught by Codex review on PR #1. 1. templates/traceability-template.md set owner: qa, but /spec:review has the reviewer generate/refresh traceability.md (the reviewer agent prompt already claims this responsibility). Per the kit's "no agent edits another agent's outputs" rule, owner must match the producer. Changed to owner: reviewer. 2. The example RTM rows used T-001 / TEST-001 instead of T-<AREA>-NNN / TEST-<AREA>-NNN, contradicting the ID scheme in docs/traceability.md and breaking mechanical link-matching during /spec:review. Prefixed all example task and test IDs with <AREA>. Also caught one straggler in docs/spec-kit.md from the previous "RTM generated from frontmatter" sweep that I missed; brought it into line.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e91a62f729
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| | REQ ID | Satisfied? | Evidence | | ||
| |---|---|---| | ||
| | REQ-<AREA>-001 | ✅ | TEST-001, TEST-002, RTM row | |
There was a problem hiding this comment.
Normalize TEST IDs in review template examples
docs/traceability.md defines test IDs as TEST-<AREA>-NNN, but this example row uses TEST-001/TEST-002. Teams often copy these rows verbatim into early drafts, and the review/RTM checks described in this repo rely on area-qualified IDs; using unqualified IDs here can produce false orphan-test/missing-link findings during stage 9.
Useful? React with 👍 / 👎.
| | REQ-<AREA>-001 | TEST-001, TEST-002 | 2 | 0 | ✅ | | ||
| | REQ-<AREA>-002 | TEST-003 | 0 | 1 | ❌ | |
There was a problem hiding this comment.
Use area-qualified test IDs in test report examples
The per-requirement results examples use TEST-001/TEST-002/TEST-003, which conflicts with the repository ID contract (TEST-<AREA>-NNN in docs/traceability.md). If users follow this template literally, downstream traceability parsing and consistency checks can misclassify valid tests as unlinked because the <AREA> segment is missing.
Useful? React with 👍 / 👎.
|
|
||
| ## Tool restrictions are deliberate | ||
|
|
||
| `qa` doesn't have `Edit` on production code — it changes tests, not source. `reviewer` doesn't have `Edit` at all — its output is `review.md`, not patches. `analyst` doesn't have `Bash` — research doesn't need a shell. Don't broaden these without an ADR. |
There was a problem hiding this comment.
Fix reviewer permission claim in agents README
This line says the reviewer has no Edit capability, but the same file’s tool table and .claude/agents/reviewer.md grant Edit/Write. That mismatch creates a false least-privilege guarantee and can lead maintainers to rely on a restriction that is not actually enforced when auditing or adjusting agent safety boundaries.
Useful? React with 👍 / 👎.
Three P2 issues caught by Codex review on PR #1, plus a sweep for the same defect class to stop drip-fed follow-ups. ID sweep (TEST-NNN / T-NNN → TEST-<AREA>-NNN / T-<AREA>-NNN): - templates/test-report-template.md: per-requirement results table - templates/review-template.md: requirements compliance table - templates/tasks-template.md: parallelisable batches list Verified repo-wide that no other unqualified IDs remain. Reviewer agent permission claim (.claude/agents/README.md): The "Tool restrictions" section said "reviewer doesn't have Edit at all", but the same file's tool table and reviewer.md grant Edit/Write (which the reviewer needs to produce review.md and traceability.md). Fixed the text to describe the actual restriction: reviewer has Edit/Write only to produce its own artifacts; the prompt forbids editing specs, code, or other agents' outputs.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eae7c54160
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| ```yaml | ||
| artifacts: | ||
| idea.md: skipped (trivial: typo fix) |
There was a problem hiding this comment.
Keep skip status values machine-readable
The skip example stores the reason inside the status value (skipped (trivial: typo fix)), but templates/workflow-state-template.md defines artifacts statuses as a fixed enum (pending | in-progress | complete | skipped | blocked). If users copy this example, parsers that key off exact status values will treat the entry as invalid and can miscompute stage gates or next-step recommendations.
Useful? React with 👍 / 👎.
| - **Frontmatter** uses Claude Code's standard fields: `name`, `description`, `tools`, `model`, optionally `color`. | ||
| - **`description`** is written in the imperative — Claude reads it to decide *when* to spawn this agent. | ||
| - **Body** is the system prompt. Keep it focused; rely on linked docs (`docs/spec-kit.md`, `docs/quality-framework.md`, the relevant template) for detail rather than restating. | ||
| - **No agent edits another agent's outputs.** If you need a sibling artifact updated, raise a clarification or hand back to the orchestrator. |
There was a problem hiding this comment.
Reconcile no-cross-edit rule with stage-4 workflow
This global rule conflicts with the documented stage-4 flow, where /spec:design explicitly has ux-designer, then ui-designer, then architect collaborate in the same design.md artifact. Taken literally, the rule blocks the second and third agent from completing stage 4, so the convention should either scope to cross-stage artifacts or explicitly carve out same-stage collaboration.
Useful? React with 👍 / 👎.
| | Agent | Owns stage(s) | Tool surface | | ||
| |---|---|---| | ||
| | [`orchestrator`](orchestrator.md) | cross-cutting hand-off | Read | | ||
| | [`analyst`](analyst.md) | Idea, Research | Read, WebSearch, WebFetch | |
There was a problem hiding this comment.
Align analyst tool surface row with agent config
The table says analyst has Read, WebSearch, WebFetch, but .claude/agents/analyst.md grants Edit and Write too. This creates a false permission model in the primary agent index, which can mislead maintainers doing capability audits or copying the table into policy docs.
Useful? React with 👍 / 👎.
Three P2 issues caught by Codex review on PR #1, plus preemptive sweeps for the same defect classes. Agents README table sync — the table claimed several agents had narrower tool lists than their .claude/agents/*.md files actually grant. Aligned six rows with reality: orchestrator Read → Read, Grep analyst Read, Web* → Read, Edit, Write, Web* dev …Bash → …Bash, Grep qa …Bash → …Bash, Grep release-manager …Edit, Write → …Edit, Write, Bash sre …Bash → …Bash, Grep retrospective Read, Edit, Write → Read, Edit, Write, Grep Cross-edit rule (.claude/agents/README.md) — the global "no agent edits another agent's outputs" conflicted with stage 4 by construction, where ux-designer / ui-designer / architect all extend the same design.md. Carved out same-stage explicit collaboration: cross-edit is forbidden across stages but allowed within a stage when the stage's slash command sequences the agents (stage 4 is the canonical case). Skip status syntax — docs/quality-framework.md and the workflow-state template had examples like "idea.md: skipped (trivial: typo fix)" or "skipped — trivial copy fix" that smuggle the reason into what templates/workflow-state-template.md defines as a fixed enum (pending | in-progress | complete | skipped | blocked). Status values are now the bare enum; reasons live in the body's Skips / Blocks sections, keeping frontmatter machine-parseable. Updated three call sites.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6c255b181
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| | What's the input? | The previous stage's artifact in `specs/<feature>/` | | ||
| | What's the output? | The matching `templates/<stage>-template.md` | | ||
| | When am I done? | The quality gate in [`docs/quality-framework.md`](quality-framework.md) | | ||
| | How do I trigger it? | `/spec:<stage>` | |
There was a problem hiding this comment.
Replace
/spec:<stage> shorthand with real command names
This cheat-sheet rule is incorrect for several stages (specification, testing, learning), where the actual commands are /spec:specify, /spec:test, and /spec:retro. Following this guidance literally leads users to invoke non-existent commands, which breaks the happy-path flow described in the same document.
Useful? React with 👍 / 👎.
| | 6 | Tasks | Planner | `tasks.md` | [`templates/tasks-template.md`](../templates/tasks-template.md) | | ||
| | 7 | Implementation | Dev | code + `implementation-log.md` | [`templates/implementation-log-template.md`](../templates/implementation-log-template.md) | | ||
| | 8 | Testing | QA | `test-plan.md`, `test-report.md` | [`templates/test-plan-template.md`](../templates/test-plan-template.md), [`templates/test-report-template.md`](../templates/test-report-template.md) | | ||
| | 9 | Review | Reviewer | `review.md` | [`templates/review-template.md`](../templates/review-template.md) | |
There was a problem hiding this comment.
Include traceability artifact in stage-9 output contract
The stage table is the primary contract for stage outputs, but stage 9 lists only review.md even though the workflow requires traceability.md to be generated/validated before review exits. Teams or tooling that rely on this table can incorrectly treat review as complete without an RTM, then fail the stage-9 gate later.
Useful? React with 👍 / 👎.
| | Requirement | `REQ-<AREA>-NNN` | `REQ-AUTH-001` | | ||
| | Spec item | `SPEC-<AREA>-NNN` | `SPEC-AUTH-001` | | ||
| | Task | `T-<AREA>-NNN` | `T-AUTH-014` | | ||
| | Test | `TEST-<AREA>-NNN` | `TEST-AUTH-007` | | ||
| | ADR | `ADR-NNNN` | `ADR-0007` | |
There was a problem hiding this comment.
Add missing ID prefixes used by shipped templates
The documented ID scheme omits several prefixes that the templates use as first-class IDs (IDEA-, RESEARCH-, PRD-, DESIGN-, TASKS-, TESTPLAN-, TESTREPORT-, RELEASE-, RETRO-). Any validator or parser built from this spec will flag valid template artifacts as non-conformant, undermining the “mechanical” traceability checks.
Useful? React with 👍 / 👎.
…e ID scheme Three P2 contract gaps caught by Codex review on PR #1. 1. docs/spec-kit.md stage table listed only review.md as stage-9 output, even though the workflow requires traceability.md (the RTM) to be generated/validated before /spec:review exits. Added traceability.md to the row alongside review.md and linked the RTM template. 2. docs/workflow-overview.md cheat sheet's "How do I trigger it?" row said "/spec:<stage>", which doesn't resolve for stages whose command names differ from their stage names (specification → /spec:specify, testing → /spec:test, learning → /spec:retro). Replaced with the concrete list of all 11 commands. (The same shorthand in ADR-0002 describes the abstract property and is left in place; ADR bodies are immutable.) 3. docs/traceability.md ID scheme table omitted several prefixes that the shipped templates use as first-class IDs: IDEA-, RESEARCH-, PRD-, NFR-, DESIGN-, TASKS-, IMPL-LOG-, TESTPLAN-, TESTREPORT-, REVIEW-, RELEASE-, RETRO-, RTM-, CHECKLIST-, CLAR-. A validator built from this table would have flagged valid template artifacts as non-conformant. Expanded the table to cover every prefix actually used in templates/ and split document-level vs. per-item rows where both exist (e.g. SPEC document vs. per-item spec entries).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bb10ef869a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| Run **stage 9 — Review**. | ||
|
|
||
| 1. Resolve slug; verify the test report is `complete` and there are no S1/S2 open. | ||
| 2. Build / refresh `specs/<slug>/traceability.md` by parsing the artifacts' structured content: document-level YAML frontmatter, plus the marked-up per-item entries in body — `### REQ-<AREA>-NNN` headings in `requirements.md`, `### SPEC-<AREA>-NNN` blocks and `Satisfies:` lines in `spec.md`, `### T-<AREA>-NNN` blocks and `Satisfies:` lines in `tasks.md`, test IDs and REQ references in the test report. The pass is mechanical but reads body content, not just frontmatter. |
There was a problem hiding this comment.
Parse implementation-log when building traceability matrix
Step 2 defines RTM generation inputs as requirements/spec/tasks/test-report markup only, but that source set does not contain code location mappings. The RTM template and traceability chain require a Code column (file:line), and those references are captured in implementation-log.md; without parsing it (or an equivalent source), review either produces incomplete RTM rows or invents code links manually, which undermines the stage-9 mechanical validation gate.
Useful? React with 👍 / 👎.
|
|
||
| ## Tool restrictions are deliberate | ||
|
|
||
| `qa` doesn't have `Edit` on production code — it changes tests, not source. `reviewer` has `Edit` / `Write` only to produce its own artifacts (`review.md`, `traceability.md`); its prompt forbids editing specs, code, or other agents' outputs. `analyst` doesn't have `Bash` — research doesn't need a shell. Don't broaden these without an ADR. |
There was a problem hiding this comment.
Remove incorrect QA permission guarantee
This statement claims QA lacks Edit capability on production code, but the same file’s tool-surface table grants QA unrestricted Read, Edit, Write, Bash, Grep. That creates a false least-privilege guarantee for maintainers and auditors, who may assume enforcement exists when it is only prompt guidance.
Useful? React with 👍 / 👎.
| | NFR (per-item, in PRD) | `NFR-<AREA>-NNN` | `NFR-AUTH-002` | | ||
| | Design (document) | `DESIGN-<AREA>-NNN` | `DESIGN-AUTH-001` | | ||
| | Spec (document) | `SPEC-<AREA>-NNN` | `SPEC-AUTH-001` | | ||
| | Spec item (per-item, in spec) | `SPEC-<AREA>-NNN` | `SPEC-AUTH-001` | |
There was a problem hiding this comment.
Use distinct IDs for spec document and spec items
The ID scheme assigns SPEC-<AREA>-NNN to both the spec document and per-item spec entries, which makes IDs ambiguous when templates are followed (for example, a document id: SPEC-...-001 and a ### SPEC-...-001 item). This breaks the “stable/mechanical” linking model because task Satisfies: links and RTM generation cannot reliably tell whether a reference points to the whole document or a specific spec item.
Useful? React with 👍 / 👎.
…t QA scope claim Three issues caught by Codex review on PR #1. 1. SPEC document and SPEC items both used SPEC-<AREA>-NNN, the only ID collision in the kit (PRD/REQ, TASKS/T, TESTPLAN/TEST, REVIEW/R etc. were already distinct). A `Satisfies: SPEC-AUTH-001` link was genuinely ambiguous between "the whole spec doc" and "spec item 001". Renamed the document ID to SPECDOC-<AREA>-NNN (matching the verbose TESTPLAN-/TESTREPORT- style); per-item SPEC- keeps its meaning everywhere it appears (Satisfies: lines, ### SPEC- headings, /spec:review parse list, implementation-log "Spec reference" citations to specific items). Touched: spec/tasks/review/traceability/implementation-log/test-plan templates and docs/traceability.md scheme table. 2. /spec:review step 2 listed requirements/spec/tasks/test-report as parse sources for the RTM but omitted implementation-log.md, where the file:line references that fill the RTM Code column actually live. Added implementation-log.md (Files changed: + Spec reference: lines) to the parse list. 3. .claude/agents/README.md "Tool restrictions" paragraph said qa "doesn't have Edit on production code", but qa has unrestricted Edit/Write at the tool layer — same defect class as the earlier reviewer fix that I missed for qa. Rewrote the section honestly: tool grants are coarse and Edit/Write can't be path-restricted at the tool layer; narrow scope is enforced by each agent's body prompt. Listed the actual hard tool-layer omissions (Bash on analyst/pm/designers/architect/planner/release/retro). Bonus: /spec:review's frontmatter description still claimed "Read-only on artifacts", which has been false since I added Edit/Write. Brought it into line with the actual scope.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 305b450dff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - tags the release / cuts the artifact per `docs/steering/operations.md`. | ||
| 3. **Stop and ask the human** before any irreversible deployment action. Authorisation in the past does not authorise the present. |
There was a problem hiding this comment.
Require approval before tagging or publishing artifacts
This step tells the release manager to tag/cut the release artifact before the human-approval checkpoint, which violates the repo's own reversibility rule for shared-state actions. In environments where this command has git/registry access, /spec:release can publish a tag or artifact before explicit authorization, creating irreversible side effects that the next step is supposed to prevent.
Useful? React with 👍 / 👎.
|
|
||
| - **Description:** … | ||
| - **Satisfies:** REQ-<AREA>-NNN, SPEC-<AREA>-NNN | ||
| - **Owner:** dev | qa | architect | human |
There was a problem hiding this comment.
Align task owner options with implement dispatch rules
The task template advertises architect and human as valid owners, but /spec:implement only routes owner=dev and owner=qa. If planners follow this template literally, stage 7 has no execution path for those tasks and the workflow can stall or require undocumented manual handling; either the template should restrict owners or the implement command should define routing for all advertised values.
Useful? React with 👍 / 👎.
|
|
||
| ```yaml | ||
| --- | ||
| id: REQ-AUTH-001 |
There was a problem hiding this comment.
Use document ID pattern in traceability frontmatter example
The document-level frontmatter example uses id: REQ-AUTH-001 for a requirements artifact, but the same file defines requirements documents as PRD-<AREA>-NNN and reserves REQ-* for per-item requirement entries. If users copy this example, they introduce ambiguous/duplicate REQ identifiers, which undermines the “mechanical” RTM generation model described in this guide.
Useful? React with 👍 / 👎.
…tmatter example, /spec:start Triggered by user request for a thorough self-audit. Folds in three concurrent Codex P1/P2 findings (release tag-before-approval, task owner enum mismatch, traceability frontmatter example) plus issues the audit surfaced independently. Defects fixed in this commit: 1. Release safety (P1, .claude/commands/spec/release.md AND .claude/agents/release-manager.md): both put "tag/cut the artifact" BEFORE "stop and ask the human", which violates the constitution's reversibility rule by construction. Restructured as prepare → ask → publish, where prepare has no irreversible side effects, the authorisation gate is explicit and per-release, and only the publish phase performs tagging / pushing / deploying. 2. Task owner enum mismatch (P2, templates/tasks-template.md + .claude/commands/spec/implement.md): template advertised `dev | qa | architect | human`, but /spec:implement only routed `dev` and `qa` — and the architect agent's prompt explicitly forbids writing code, so listing `architect` as a task owner was internally contradictory. Restricted enum to `dev | qa | sre | human`. Expanded /spec:implement routing: dev → dev agent, qa → qa agent, sre → sre agent (ops/observability/runbook tasks), human → stop, surface to user, append a hand-off note, exit; any other value escalates as a clarification. 3. Traceability frontmatter example (P2, docs/traceability.md): the doc-level example used `id: REQ-AUTH-001` (per-item prefix used for what was supposed to be a document) and a self-referential `satisfies: REQ-AUTH-001`. Replaced with a coherent PRD-document example: `id: PRD-AUTH-001`, document-level `inputs:` only, no `satisfies` (documents derive from inputs; per-item Satisfies links live inline in body). Added a sentence noting that per-item cross-links are body fields, not frontmatter. 4. /spec:start was missing `Edit` in allowed-tools. The procedure copies workflow-state-template.md then fills `feature:` and `area:` placeholders — the second step is editing. Added `Edit`. Audit also re-verified clean state for: unqualified IDs, /spec:<stage> shorthand in user-facing docs, RTM-from-frontmatter language, ADR immutability prose, skip-syntax, SPEC vs SPECDOC, no-cross-edit rule, agents README table sync, slash command tools coverage. Only one pre-existing /spec:<stage> reference remains (in ADR-0002, abstract property statement; ADR bodies are immutable).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 05fd725067
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - `specs/<feature>/tasks.md` | ||
| - `specs/<feature>/implementation-log.md` | ||
| - `specs/<feature>/test-plan.md` and `test-report.md` | ||
| - The diff: `git diff <base>...HEAD` (Bash, read-only). |
There was a problem hiding this comment.
Specify review diff base before running git diff
The reviewer procedure requires running git diff <base>...HEAD, but <base> is never defined in this agent prompt or in the stage command inputs, so following the instruction literally produces a bad-revision error and can cause the review to skip code-diff validation. This weakens stage-9 defect detection whenever the reviewer relies on this step; define a deterministic base resolution rule (for example, merge-base with the target branch) in the prompt.
Useful? React with 👍 / 👎.
| ## Read first | ||
|
|
||
| - All artifacts in `specs/<feature>/`. | ||
| - The diff: `git log <base>..HEAD` (via Bash if available, otherwise via the artifacts). |
There was a problem hiding this comment.
Define retrospective log base instead of placeholder
The retrospective read checklist asks for git log <base>..HEAD, but the placeholder <base> is not resolved anywhere, so the command is non-executable as written and the mandatory retrospective may miss the intended change-history review. This affects retros that try to follow the documented procedure exactly; the prompt should state how to compute or select the base revision.
Useful? React with 👍 / 👎.
… git commands Two P2 findings caught by Codex review on PR #1. reviewer.md and retrospective.md both invoked git with a bare <base> placeholder ("git diff <base>...HEAD", "git log <base>..HEAD") that was never defined anywhere, so following the instruction literally produced a bad-revision error and skipped code-diff / change-history review. Replaced with a deterministic resolution: the merge-base of HEAD with the remote's default branch, computed via git symbolic-ref on refs/remotes/origin/HEAD with a fallback to main. Projects on a different integration branch override per docs/steering/operations.md.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5bf4cde952
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - `specs/<feature>/tasks.md` | ||
| - `specs/<feature>/implementation-log.md` | ||
| - `specs/<feature>/test-plan.md` and `test-report.md` | ||
| - The diff: `git diff "$(git merge-base HEAD "$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null | sed 's|^origin/||' || echo main)")"...HEAD` (Bash, read-only). The base is the merge-base of `HEAD` with the project's default branch (resolves the remote's default; falls back to `main`). If the project uses a different integration branch, override per `docs/steering/operations.md`. |
There was a problem hiding this comment.
Fix default-branch fallback in reviewer diff command
The git diff recipe here will fail in repositories that do not have refs/remotes/origin/HEAD (common in local-only or CI checkouts): git symbolic-ref ... | sed ... || echo main does not actually fall back, because sed exits successfully on empty input, so git merge-base receives an empty branch name and errors. That means the reviewer can skip or lose the required code-diff check for stage 9 in exactly the environments where this fallback is supposed to help.
Useful? React with 👍 / 👎.
| ## Read first | ||
|
|
||
| - All artifacts in `specs/<feature>/`. | ||
| - The diff: `git log "$(git merge-base HEAD "$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null | sed 's|^origin/||' || echo main)")"..HEAD` via Bash if available, otherwise reconstruct from the artifacts. The base is the merge-base of `HEAD` with the project's default branch (resolves the remote's default; falls back to `main`). Override per `docs/steering/operations.md` if the project uses a different integration branch. |
There was a problem hiding this comment.
Fix broken branch fallback in retrospective git-log command
This git log instruction repeats the same fallback pattern (... | sed ... || echo main), so when origin/HEAD is unset the fallback never runs and git merge-base is called with an empty ref. In that scenario, the retrospective cannot compute the intended commit range and loses a key required input for the learning stage.
Useful? React with 👍 / 👎.
…ssing origin/HEAD Two P1/P2 findings caught by Codex review on PR #1. The previous one-liner used "git symbolic-ref ... | sed 's|^origin/||' || echo main" to fall back to main when origin/HEAD is unset (common in local-only clones or fresh CI checkouts). That fallback is broken: sed exits 0 on empty input, so the || branch never fires and git merge-base ends up called with an empty ref. Replaced both call sites with an explicit three-step recipe that forces the fallback: DEFAULT_BRANCH="$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null | sed 's|^origin/||')" : "${DEFAULT_BRANCH:=main}" BASE="$(git merge-base HEAD "$DEFAULT_BRANCH")" || BASE="$DEFAULT_BRANCH" The bash parameter-expansion default (:= ) handles the empty-string case directly, and a separate || on git merge-base handles the case where the default branch is itself missing locally. Verified both states (origin/HEAD set, origin/HEAD missing) behave correctly. Same recipe in both reviewer.md and retrospective.md so future tweaks stay consistent.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 811ac52934
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| --- | ||
| description: Stage 1 — Idea. Invokes the analyst to produce idea.md from a brief. | ||
| argument-hint: [feature-slug] | ||
| allowed-tools: [Read, Edit, Write, WebSearch] |
There was a problem hiding this comment.
Add Agent tool to commands that spawn subagents
This command explicitly requires delegating to the analyst subagent, but allowed-tools omits the Agent tool, so the delegation step cannot run in Claude Code and stage 1 cannot execute as documented. The same omission pattern appears in other stage commands that instruct spawning role agents, which blocks the core orchestrated workflow instead of using the scoped subagent prompts.
Useful? React with 👍 / 👎.
|
|
||
| ## Tool restrictions are deliberate | ||
|
|
||
| **Tool grants are coarse; scope is enforced by the prompt.** Most agents need `Edit`/`Write` to produce their artifacts, so `Edit`/`Write` cannot be path-restricted at the tool layer. The narrow scope is enforced in the agent's body prompt — `qa` may edit tests but not production source; `reviewer` may edit `review.md` and `traceability.md` only; `dev` may not change test assertions; `release-manager` may not deploy without explicit authorisation. The hard tool-layer restrictions are the *omissions*: `analyst` has no `Bash` (research doesn't need a shell); `pm` / `ux-designer` / `ui-designer` / `architect` / `planner` / `release-manager` / `retrospective` have no `Bash` (they don't run code). Don't broaden tool grants without an ADR; don't loosen prompt scopes without one either. |
There was a problem hiding this comment.
Correct release-manager Bash capability statement
This sentence claims release-manager is in the “no Bash” omission set, but the same file’s tool-surface table grants release-manager Bash. That inconsistency gives maintainers an inaccurate permission model when auditing risk boundaries for release operations, so the prose should be aligned with the actual tool configuration.
Useful? React with 👍 / 👎.
…Bash list Two findings caught by Codex review on PR #1, both with sweeps for the defect class. 1. P1: 12 slash commands (idea, research, requirements, design, specify, tasks, implement, test, review, release, retro, clarify) instructed the runtime to "Spawn the <role> subagent" but their allowed-tools omitted the Agent tool. Without Agent in the allow-list, the delegation step can't run in Claude Code, so the core orchestrated workflow couldn't execute as documented. Swept all 12 to add Agent. Verified that non-spawning commands (start, analyze, adr/new) intentionally omit it. 2. P2: agents/README.md "Tool grants are coarse" paragraph listed release-manager in the no-Bash group, but the table grants it Bash (it needs it for tagging, git inspection, deploy tooling). Corrected: explicitly enumerate which agents have Bash (dev, qa, reviewer, release-manager, sre) and which don't (analyst, pm, ux-designer, ui-designer, architect, planner, retrospective), matching the actual tool table.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 09918d5554
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| DEFAULT_BRANCH="$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null | sed 's|^origin/||')" | ||
| : "${DEFAULT_BRANCH:=main}" # fallback when origin/HEAD is unset (local-only / fresh CI checkouts) | ||
| BASE="$(git merge-base HEAD "$DEFAULT_BRANCH")" || BASE="$DEFAULT_BRANCH" |
There was a problem hiding this comment.
Use a resolvable ref for reviewer BASE computation
Stripping origin/ here can turn a valid remote-tracking ref (origin/main) into a missing local ref (main) in detached/CI checkouts, so git merge-base fails, BASE becomes main, and the required git diff "$BASE"...HEAD step then errors with ambiguous argument. git diff -h documents the A...B form as commit-vs-commit, so both sides must resolve. Fresh evidence: running this exact snippet in a detached clone with only origin/main and no local main reproduces the failure and prevents stage-9 diff validation.
Useful? React with 👍 / 👎.
| DEFAULT_BRANCH="$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null | sed 's|^origin/||')" | ||
| : "${DEFAULT_BRANCH:=main}" # fallback when origin/HEAD is unset (local-only / fresh CI checkouts) | ||
| BASE="$(git merge-base HEAD "$DEFAULT_BRANCH")" || BASE="$DEFAULT_BRANCH" |
There was a problem hiding this comment.
Keep retrospective BASE on an existing branch ref
This BASE recipe has the same ref-resolution problem as reviewer: converting origin/<default> to <default> can leave DEFAULT_BRANCH unresolved when no local branch exists, so the fallback sets BASE to a non-existent ref and git log "$BASE"..HEAD cannot run. That drops the documented commit-history input for the mandatory retrospective in common detached checkout environments. Fresh evidence: the snippet fails in a clone that has origin/main but no local main branch.
Useful? React with 👍 / 👎.
…ached checkouts Codex P1 finding on PR #1, re-fixing the previous "fix" of the <base> placeholder. The previous recipe stripped origin/ from the symbolic-ref output ("origin/main" → "main"), then asked git merge-base to resolve "main". That fails in detached / shallow / fresh-CI checkouts where only the remote-tracking ref origin/main exists and there's no local main branch — git diff "$BASE"...HEAD then errors with "ambiguous argument". Replaced with a recipe that keeps the full remote-tracking ref: DEFAULT_REF="$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null)" : "${DEFAULT_REF:=origin/main}" BASE="$(git merge-base HEAD "$DEFAULT_REF")" || BASE="$DEFAULT_REF" origin/<branch> resolves whether a local branch exists or not (it's always a ref under refs/remotes/). The fallback also uses origin/main for the same reason. Verified all three failure modes (full clone, no local branch, no origin/HEAD) resolve correctly. Same recipe in reviewer.md and retrospective.md.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 61e2c10332
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| Resolve `$BASE` like this — keep the full remote-tracking ref so it resolves in detached / shallow / CI checkouts that have `origin/<default>` but no local branch: | ||
| ```bash | ||
| DEFAULT_REF="$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null)" # e.g. "origin/main" | ||
| : "${DEFAULT_REF:=origin/main}" # fallback when origin/HEAD is unset |
There was a problem hiding this comment.
Derive a valid default ref when origin/HEAD is missing
When refs/remotes/origin/HEAD is unset, this fallback hard-codes origin/main; in repositories whose integration branch is master (or another name), BASE becomes a non-existent ref and the subsequent git diff "$BASE"...HEAD fails, which can skip the required code-diff check in stage 9. Fresh evidence: running this snippet in a checkout with only origin/master reproduced fatal: bad revision 'origin/main...HEAD'.
Useful? React with 👍 / 👎.
| feat(<area>): <task-id> <short title> | ||
| ``` | ||
| 6. Update `workflow-state.md` (and the task's checkbox in `tasks.md`). | ||
| 7. Recommend the next task or, if all `dev` tasks are done, recommend `/spec:test`. |
There was a problem hiding this comment.
Gate /spec:test handoff on all unfinished tasks
This transition condition only checks whether dev tasks are complete, but the same command explicitly supports qa, sre, and human task owners; if those tasks remain open, recommending /spec:test here can advance the workflow before required work is actually finished. The handoff should be based on all remaining non-skipped tasks rather than just dev ownership.
Useful? React with 👍 / 👎.
Triggered by user request for a deliberate self-review using subagents.
Five parallel audit agents (templates / slash-commands / agents /
cross-doc / safety) returned ~80 findings across the kit; this commit
applies all P1s, all systemic P2s, and the P3s that were trivially
fixable in the same edit. Also folds in concurrent Codex P1/P2 findings
on the BASE recipe and /spec:implement task gating.
## Safety / reversibility (P1)
- sre incident procedure had "Mitigate first" with no per-action gate
(Article IX violation). Restructured: diagnose with read-only tools →
propose mitigation → obtain explicit per-action authorisation → execute.
Each subsequent action requires a fresh ask.
- dev / qa / sre / reviewer all had Bash but no enumerated forbidden ops.
Added per-agent forbidden lists (push, force, reset --hard, publish,
kubectl/terraform mutations, etc.); reviewer is now read-only-Bash.
- /spec:implement and dev agent: explicit local-only commit semantics
(reversible via reset --soft, never push), with auto_commit: false
opt-out via docs/steering/tech.md.
- release-manager: split publish into authorised sub-steps; require
per-action announce; explicit "stop on failure, do not auto-cleanup".
- release-manager rollback verification now checks every Rollback-plan
field is non-empty (drafted-but-empty placeholders are blockers).
## Tools / capabilities consistency (P1/P2)
- retrospective.md uses Bash (git log) but tools list omitted Bash.
Added Bash; updated agents/README "Tool restrictions" paragraph and
table accordingly.
- /spec:idea allowed-tools missing WebFetch (research has it; analyst
needs both for source-fetching).
- commands/README.md claimed /spec:analyze "Spawns reviewer in
lightweight mode" but the body never spawns. Fixed the cell.
- Reviewer Boundaries now mirror agents-README: edit only review.md and
traceability.md.
- Orchestrator wording: "user explicitly runs the next command" (drops
"or Claude in the main loop" auto-fire ambiguity); "propose marking
skipped" clarified as orchestrator-cannot-edit.
- Retrospective Boundaries: drafts amendments inside retrospective.md;
human (not orchestrator) sequences PRs (orchestrator is read-only).
## Spec-kit / docs cross-doc (P1/P2)
- spec-kit.md §4 agent table headed "Stage" listed "Operations" and
"Retrospective" as if they were stages. Renamed column to
"Stage / role"; numbered the 11 real stages; sre and orchestrator
marked as cross-cutting roles.
- spec-kit.md §5.1 workflow-state YAML example: added required `area:`
field, full 13-artifact map, and notes-section pointer. Same fix on
the workflow-overview.md cheat-sheet (added `area:` and `blocked` to
the status enum).
- /spec:retro frontmatter description ("Stage 11 — Retrospective") vs
body ("stage 11 — Learning") misaligned. Description now says
"Stage 11 — Learning (Retrospective, mandatory)".
- /spec:tasks body "stage 6 — Task planning" → "stage 6 — Tasks" to
match canonical stage label.
- spec-kit "automated artifact validation" wording now matches README's
v0.2 list.
## Slash-command-from-procedure (P2)
Slash commands cannot be invoked from inside another command's
procedure. Replaced "Run /spec:foo" instructions with "Recommend the
user run /spec:foo" plus explicit workflow-state surfacing in:
- /spec:requirements (re: /spec:clarify)
- /spec:specify (re: /spec:analyze)
- /spec:design (re: /adr:new — architect now writes ADRs directly using
the template since it has Edit/Write)
## Sequencing / re-spawn (P2)
- /spec:design steps 3–5: explicit "wait for the agent to return" so
the three designers are sequenced, not racing.
- /spec:release publish phase: explicit "re-spawn the release-manager
in publish mode" (subagents are stateless across the human pause).
- /spec:retro: explicit closing summary step (feature complete; suggest
next-feature; print open action items).
- /spec:implement: gate /spec:test recommendation on ALL non-skipped
tasks (every owner: dev/qa/sre/human), not just dev.
## Templates (P2)
- idea-template.md owner: was placeholder "<human-or-agent>". Set to
`analyst` so orchestrator can route on it.
- design-template.md used `owners:` (plural list), which orchestrator
can't route on. Added singular `owner: architect` (the lead role for
hand-off) and renamed the list `collaborators:`.
- prd-template.md status enum was missing `proposed`; now matches
docs/traceability.md example.
- prd-template.md per-REQ field: renamed `Links:` → `Satisfies:` so the
RTM generator's documented `Satisfies:` parser matches.
- prd-template.md quality gate added "Jobs to be done captured" and
"counter-metric" — both were in the body but missing from the gate.
- research-template.md quality gate added "Technical considerations
noted" (in DoD; was missing from gate).
- test-plan-template.md status enum: removed orphan `executing`; gate
wording aligned with DoD ("EARS clause").
- test-report-template.md quality gate added coverage-threshold and
non-functional-checks lines (both in the body, not in the gate).
- implementation-log-template.md gate added spec-adherence and
no-scope-creep lines (in DoD; missing).
- traceability-template.md inputs added IMPL-LOG-<AREA>-NNN (the Code
column comes from there).
- review-template.md inputs added TASKS-<AREA>-NNN.
- tasks-template.md inputs added PRD-<AREA>-NNN.
- design-template.md inputs added RESEARCH-<AREA>-NNN.
- checklist-template.md frontmatter expanded to include stage / feature
/ status (was missing required fields).
- adr-template.md "Superseded by" pointer no longer renders as a live
link with Unicode ellipsis (was a copy-paste hazard).
## Agents — Read first / hand-off discipline (P2)
Most agents were missing memory/constitution.md from their Read first
list (load-bearing for their procedures). Added to: analyst (renamed
section from "Steering" to "Read first" while in there), planner, qa,
ui-designer, release-manager, sre.
Most agents were missing an explicit "update workflow-state.md on
hand-off" step. Added a closing Procedure step to: planner, architect,
qa, reviewer, release-manager, retrospective, sre. dev procedure
reordered so the commit happens before the log entry (so the SHA is
real when written).
## Tightened agent descriptions
- dev: "Does not modify test assertions or scenarios; does not push or
run destructive shell commands" (was looser).
- qa: "Does not modify production source" (description didn't say so;
Boundaries did).
- /spec:implement description updated for sre/human routing (was
stale: only mentioned dev/qa).
- /spec:review description: stale "Read-only on artifacts" already
fixed; consistent now.
## New stub directories
- docs/postmortems/ with README — sre's incident procedure points here.
- docs/playbooks/ with README — operations steering points here.
## Steering tweaks
- docs/steering/README.md: qualified `@import` as Claude-Code-only
(cross-tool consumers see the literal text).
- docs/steering/operations.md / tech.md: softened "(when present)"
references to "create when first needed" / "link your project's …".
## Quality-framework
- Added explicit S1–S4 severity definitions (used by /spec:test and
/spec:review without ever being defined before).
## Audit categories that came up clean
- Cross-role agent name references (every name resolves).
- Template / steering file references (every named file exists).
- ID prefix usage across cross-references (no SPEC vs SPECDOC
collisions remain; per-item vs document IDs distinct).
- Stage values across all stage-bearing templates match the
workflow-state enum.
- ADR mutability prose consistent across all five locations.
- No agent edits another agent's outputs across stages.
- No external-side-effect triggers without authorisation.
- Workflow-state mutations don't conflict (each agent touches its own
fields/sections).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 107c6c8789
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| : "${DEFAULT_REF:=HEAD}" # last-resort: degrade to no-diff | ||
| BASE="$(git merge-base HEAD "$DEFAULT_REF")" || BASE="$DEFAULT_REF" |
There was a problem hiding this comment.
Require a real diff base instead of silently using HEAD
When no remote-tracking default branch exists (common in local-only or minimal CI checkouts), this fallback sets DEFAULT_REF to HEAD, which makes BASE equal to HEAD and turns git diff "$BASE"...HEAD into an empty diff. That can let stage-9 review proceed without inspecting any code changes, undermining the core review gate; the command should fail closed (or ask for an explicit base) rather than degrading to no-diff.
Useful? React with 👍 / 👎.
| 3. Specify **data model** changes (new entities, schema changes, migration impact). | ||
| 4. Specify **data flow** for the primary scenarios end-to-end. | ||
| 5. Sketch **API / interaction contracts** (full contracts go in spec.md). | ||
| 6. Record **key decisions** in a table. **For any decision that constrains future implementation, file an ADR via `/adr:new`.** Link from the table. |
There was a problem hiding this comment.
Remove impossible
/adr:new instruction from architect flow
This tells the architect subagent to file ADRs via /adr:new, but subagents do not invoke slash commands in this workflow (the design command explicitly states that), so the instruction is non-executable as written. In designs with irreversible decisions, this can block or skip required ADR creation during stage 4 unless the operator notices and works around it manually.
Useful? React with 👍 / 👎.
…s directly Two findings caught by Codex review on PR #1. reviewer.md: when no default branch could be resolved (no origin/HEAD, no origin/{main,master,trunk,develop}), the previous fallback set DEFAULT_REF=HEAD, which made BASE=HEAD and turned git diff "$BASE"...HEAD into an empty diff — letting stage-9 review proceed without inspecting any code. Now fails closed: prints a fatal message asking the user to set DEFAULT_REF explicitly per docs/steering/operations.md and exits with code 1. retrospective.md: same fallback issue but different appropriate behaviour — the retrospective is mandatory and shouldn't fail just because git history is missing. Now degrades cleanly: warns, sets BASE=empty as a signal to skip git log, and reconstructs the change history from implementation-log.md + tasks.md instead. architect.md: stage-4 procedure step 6 still said "file an ADR via /adr:new" — but subagents can't invoke slash commands (the /spec:design fix already noted this for the orchestration side; the architect agent body itself was missed in the previous pass). Architect now writes the ADR file directly: list docs/adr/ for the next NNNN, copy the template, fill it in (it has Edit/Write), and add a row to docs/adr/README.md. The user can still run /adr:new for any additional ADRs after stage 4.
…lls + bots Adds the operational layer that complements the lifecycle workflow from PR #1, distilled from patterns observed in the agentonomous repo. - .claude/memory/ — operational memory split from durable principles in memory/constitution.md. README + MEMORY.md index + 8 feedback_*.md rules (pr_hygiene, verify_gate, worktrees_required, pr_workflow, docs_with_pr, parallel_pr_conflicts, autonomous_merge, memory_edits). - .claude/skills/ — reusable how-tos (verify, new-adr, review-fix) any agent can invoke. Skills don't override agent tool restrictions. - .claude/settings.json — permission baseline: allow git ops on topic branches, deny push to main/develop, deny --no-verify, deny rm -rf on protected paths. SessionStart hook surfaces the memory index. - agents/operational/ — second class of agent: always-on, scheduled routines that act against the live repo. PROMPT.md + README.md per bot. Five seeded: review-bot, docs-review-bot, plan-recon-bot, dep-triage-bot, actions-bump-bot. - .gitignore — add .worktrees/ and .claude/cache/.
…ify/worktrees guides (#3) Adds the operational layer that complements PR #1's lifecycle workflow, distilled and generalised from agentonomous patterns. Includes: - .claude/memory/ — operational memory (8 feedback rules + index) - .claude/skills/ — verify, new-adr, review-fix - .claude/settings.json — permission baseline (hardened deny list including --no-verify aliases, env bypass, force-push to topic + protected branches) - agents/operational/ — five always-on bots (review, docs-review, plan-recon, dep-triage, actions-bump) with the eight-section common shape - docs/branching.md, docs/worktrees.md, docs/verify-gate.md - docs/plans/, docs/archive/, docs/daily-reviews/ — plans + archive convention - docs/adr/0004 — adopt operational agents alongside lifecycle agents - CONTRIBUTING.md + README/CLAUDE/AGENTS pointers updated Resolves a substantial set of Codex review findings across multiple iterations: invalid HEAD~24h fallback, broken starter pointer, branch-prefix consistency, contradictory digest-PR output, silent major-bump runs, docs-review exclusion claim, plan-recon recovery path / link integrity / docs-only verify, dep-triage spoofing risk (mandatory TRUSTED_DEP_BOT_LOGINS allowlist), and structural eight-section shape gaps.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 05951cd552
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| feat(auth): T-AUTH-014 add password reset confirmation | ||
| Implements REQ-AUTH-001 step 3. SPEC-AUTH-001 §4. | ||
| ``` | ||
| 6. Append an **entry** to `implementation-log.md`, *after* the commit so the SHA is real: |
There was a problem hiding this comment.
Commit implementation metadata in same task transaction
This procedure commits in step 5 and only then appends implementation-log.md (and later updates task/workflow state), so in normal auto_commit mode those traceability/state edits are left outside the per-task commit unless the agent performs an extra amend/commit that is not specified. In practice, users can push the generated task commit without the corresponding log/state updates, which breaks the stage-9 expectation that RTM/review evidence is present and versioned with the task work.
Useful? React with 👍 / 👎.
| 3. **Spec compliance.** Are deviations logged in the implementation log? Are material ones ADR-tracked? | ||
| 4. **Constitution check.** Any violations of the principles? | ||
| 5. **Risks.** Status of each risk in `research.md` / `design.md`. New risks? | ||
| 6. **Findings.** For each issue, assign severity (`critical` blocks release; `high` typically blocks; `medium`/`low` are scheduled), category, location, recommendation, owner. |
There was a problem hiding this comment.
Standardize reviewer severities to S1–S4 labels
The reviewer prompt uses critical/high/medium/low, but the documented framework defines S1–S4 as the canonical scale and states /spec:review uses those labels. This mismatch makes severity-based gating brittle (for example, automation or handoff checks looking for S1/S2 blockers will not reliably interpret reviewer findings), so the reviewer instructions should use the same severity taxonomy as the framework.
Useful? React with 👍 / 👎.
Adds a mattpocock-style skills layer integrated cleanly with PR #1's foundation and PR #3's operational learnings. Skills (.claude/skills/): - orchestrate — natural-language entry to the 11-stage workflow. Reads workflow-state.md, gates with AskUserQuestion, dispatches /spec:* commands per stage. Lean depth writes idea/research stubs marked complete; Spike marks stages 3–10 skipped (none dispatched); per-stage Skip allowed only for stage 10. Crash-safe resume protocol against the canonical YAML artifacts: map. - grill — one-question-at-a-time interview primitive (after Pocock). - design-twice — parallel divergent design via Task tool (Ousterhout). /spec:design now reads the design-comparison.md synthesis as input. - tracer-bullet — vertical-slice planning for /spec:tasks. - tdd-cycle — strict red/green/refactor scoped to one /spec:implement task; canonical task-ID commit format for traceability. - record-decision — wraps /adr:new with ADR-worthiness criteria. ADR bodies immutable from creation; only YAML status and supersession pointers may change. - domain-context — lazy maintenance of docs/CONTEXT.md (or multi-context layout); subagent-safe via [CONTEXT-CONFIRM] flag pattern. - ubiquitous-language — lazy maintenance of docs/UBIQUITOUS_LANGUAGE.md. Sink catalog (docs/sink.md): one source-of-truth document for every markdown artifact — workflow-scoped (specs/<slug>/*.md), cross-cutting (docs/adr/, docs/CONTEXT.md, docs/UBIQUITOUS_LANGUAGE.md), governance, and tooling. Ownership, eager vs. lazy, immutability, naming, read order. v0.1 command updates: /spec:design now reads design-comparison.md from design-twice; /spec:release verdict check accepts Approved | Approved with conditions; /spec:* preconditions reaffirm `complete`-only upstream and cite the agents that depend on the content. Skills README, root README, and CLAUDE.md updated to surface the combined catalog (workflow conductor + practice skills + cross-cutting sink + operational skills from PR #3) and bump version to v0.2. Iterated through ~10 rounds of Codex review (~26 comments addressed) plus parallel-subagent self-reviews (schema, slash-command preconditions, subagent contracts, cross-skill consistency, scenario walkthrough).
PR #143 review feedback (Codex P2 round 3): item #2 only covered **new** CSS/JSX/HTML files. The severity model treats token literals in any changed code as blocking, so adding `#fff` to an existing file like `sites/foo.css` (or inline `style="…"` in modified JSX) bypassed the gate. - .claude/agents/brand-reviewer.md item #2: scope widened to "changed (new *or* modified)" files reported by `git diff --name-only "$BASE"...HEAD`. Same :root-stripping awk + hex grep recipe as item #1. Inline `style="…"` attributes in changed JSX/HTML are explicitly in scope. - templates/brand-review-checklist.md: same widening. Refs PR #143, Codex review round 3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(P1 #1) Workflow permissions check now scans `jobs.<job>.permissions` too, not only the workflow-level block. GitHub Actions applies job-level permissions after workflow-level, so a compliant top-level can be widened by a job override (e.g. `actions: write`). Each declared block — top-level or per-job — must be subset+match against the least-privilege set. Top-level still requires presence; job-level blocks are optional but if present must not widen scope. (P1 #2) `package.json#files` is now enforced against the contract include list. Added `EXPECTED_PACKAGE_FILES` derived from `package-contract.md` §3 and plumbed it through the readiness call as the default expectation. Required shipping paths can no longer be removed without triggering a PkgFiles diagnostic. (P1 #3) `realGit().resolveRef` now peels annotated tags via `^{commit}` so `git rev-parse` returns the commit SHA on both sides of the tag-at-main comparison. Without peeling, an annotated `vX.Y.Z` tag (created via `git tag -a`) returned the tag-object SHA while `main` returned a commit SHA, falsely reporting TAG_NOT_AT_MAIN. The `^{commit}` suffix is a no-op for refs already pointing at a commit (lightweight tags, branches, raw SHAs). Documented the invariant on the `GitInterface` contract. (P2) Release-notes shape check now validates `changelog.exclude.labels` and `changelog.exclude.authors` are arrays. Previously `exclude: {}` passed because only the block's existence was checked, missing the T-V05-003 contract requirement to filter maintenance and bot entries. Tests cover all four fixes (Scenarios 4d, 5d, 5e, 3c plus the existing regression suite). 26 tests pass; full suite green; `npm run verify` ok. Tightened `ExpectedPackage.files` and `expectedPackageFiles` types to `readonly string[]` so the const list satisfies the typecheck. Findings: PR #158 inline comments by chatgpt-codex-connector on commit 13a657d. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #161 chunk plan said the operator guide documents "the five workflow_dispatch inputs" but listed six immediately after. Same defect class as round-6 P3 #1 (operator guide §2) and round-7 P3 (workflow-state.md hand-off note) — missed on the earlier sweeps. Update count to six so all four planning surfaces (chunk plan, operator guide, workflow-state, docs/specorator.md §3.10) agree. Refs SPEC-V05-006 (REQ-V05-008), NFR-V05-002. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two open round-5 P1 findings on dbfca99 closed in this commit (maintainer-directed continuation past the 3-round soft cap). P1 #1 — Parser too strict for legacy tasks.md format ===================================================== Every pre-template `tasks.md: complete` feature in the repo (e.g. `specs/version-0-6-plan/tasks.md`) uses the legacy `### T-V06-001 - Decide steering profile location` shape: no emoji, ASCII hyphen separator, and no `## Task list` / `## Parallelisable batches` / `## Quality gate` subheaders. The previous parser would have hard-stopped on every existing feature. `.claude/agents/issue-breakdown.md` Step 5 + spec "Slicing input" table now require **only** a file with at least one `### T-<AREA>-NNN …` heading and a `**Description:**` bullet underneath it. Everything else is optional with a documented synthesised default: - `## Task list` absent → every `### T-…` heading in the file is a task. - `## Parallelisable batches` absent → synthesise a single batch containing every task in document order (one PR). Surface the synthesis to the conductor so the user confirms / aborts. - `## Quality gate` absent → use the default DoD shipped in `templates/issue-breakdown-pr-body-template.md`. - Per-task `**Definition of done:**` absent → skip the per-task aggregation; the slice DoD falls back to the (possibly default) `## Quality gate`. - Per-task `**Depends on:**` absent → no cross-check. - Per-task `**Satisfies:**` absent → omit from spec lineage. Heading regex relaxed to `^### (T-[A-Z0-9]+-\d{3})(?:\s+([🧪🔨📐📚🚀🪓\s]+?))?\s+[—-]\s+(.+)$` — `<emoji-block>` capture optional, em-dash OR hyphen accepted as separator. May-slice flag still keys off `🪓` presence anywhere in the captured block. Conductor's Step 6 confirm prompt updated to surface whether the slice list came from a real `## Parallelisable batches` section or was synthesised from a legacy `tasks.md`. P1 #2 — Audit / hand-off edits not persisted before cleanup ============================================================ Steps 9 + 10 appended to `specs/<slug>/issue-breakdown-log.md` and `specs/<slug>/workflow-state.md`, then Step 11 deleted the staging dir and exited — leaving the working tree dirty (which conflicts with Step 1's `git status --porcelain` clean-tree gate on the next run) and losing the audit trail unless the operator committed manually. New Step 10.5 cuts a `chore/issue-breakdown-audit-issue-<n>-<runid>` branch from the integration branch, commits the two appends, pushes, and opens a non-draft `chore(issue-breakdown): record run for issue #<n>` PR. The push-deny rules on `main` / `develop` are honoured — the housekeeping branch uses the `chore/` prefix, which the default `.claude/settings.json` allows. If the push is denied (e.g. operator runs against a repo with stricter `chore/*` permissions) the agent surfaces the failure with the local commit SHA so the audit trail can be rescued by hand. Conductor skill, methodology doc, and design spec all updated to reference the new Step 10.5 in the flow diagram + outputs. Refs #183, #184. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… truth Three new Codex P1s on commit 05d84a9, all real internal-consistency drift between agents/skills/commands and the shared goal-state schema. Fix together so the documented happy paths run end-to-end. current_phase drift (P1 #1): `set-goal` initialises `current_phase: scope` per the shared schema (`scope | observe | orient | decide | act | review | closed`), but several files referenced a non-existent `goal` value or a stale list that excluded `scope`: - /goal:observe accepted `goal | review | observe` — would fail preflight on a freshly signed goal. Now: `scope | review | observe`. - /goal:start said new goals get `current_phase: goal`. Now: `scope`. - /goal:close said amend re-entry sets `current_phase: goal`. Now: `scope` (matching the post-`set-goal` entry state). act_approval drift (P1 #2): The actor, decider, orchestrator, and the Codex iteration workflow all referenced an `act_approval` entry in `goal-state.md`. The shared schema does not define that field. The canonical record of approval lives in the `## Human Approval` section of the decision set in `decisions.md`, which is filled by the `decide` skill at the act gate. - agents/actor.md — pre-flight now reads the Human Approval section of decisions.md as the canonical source, with the orchestrator's `## History` row as the secondary signal that the gate cleared. - agents/goal-orchestrator.md — clarified that Decide owns the Human Approval section; orchestrator appends an act-gate row to `## History` citing the approved decision-set id. - agents/decider.md — clarified the decider does not write the Human Approval section itself; the `decide` skill captures the user's verdict there, then the orchestrator records the `## History` row. - .codex/workflows/goal-iteration.md — same alignment for the Codex side; the Safety rail now references the Human Approval section. Review preflight drift (P1 #3): /goal:review required an `actions/` artifact for every iteration, but the `decide` skill explicitly advances `current_phase: review` with no actions when the act gate produces nothing executable (all rejected, deferred, or `act_gate: never`). That left no-action iterations unable to close. Now /goal:review accepts both: - the post-Act case (actions/ artifact present, the normal path) - the no-action case (`## History` records an `act: skipped` row with reason — `all rejected` / `deferred` / `policy forbids`) `npm run verify` is green locally. https://claude.ai/code/session_01CxFjqxucdBwScjAkMxU83s
Codex P1 #1 (commands/goal/start.md:4) — `allowed-tools` omitted `AskUserQuestion` even though Step 1 (missing slug), Step 2 (existing-goal branch), and Step 3 (session-goal compose check) all need it. Wider sweep: 7 commands use AskUserQuestion across their procedures; all 7 were missing it from `allowed-tools`. Fixed: - commands/set-goal.md - commands/create-goal.md - commands/goal/start.md - commands/goal/amend.md - commands/goal/demo.md - commands/goal/welcome.md - commands/goal/trace.md Codex P1 #2 (.claude-plugin/plugin.json:52) — `capabilities.commands` only registered `commands/goal`, omitting the two top-level session-goal entrypoints. Adopters installing the plugin would not see `/create-goal` or `/set-goal` even though every doc surfaces them. Added explicit paths: - commands/create-goal.md - commands/set-goal.md `bash scripts/check-all.sh`: all 7 checks pass. `check:links` ok. `check-roster` ok (the new manifest entries match disk). https://claude.ai/code/session_01CxFjqxucdBwScjAkMxU83s
…dit + per-command frontmatter check Three P1 items from the wave-9 punch list land together: P1 #1 — README install recipe (README.md §Install): Four concrete install paths replace the prior "copy the folder" wording: (a) Claude Code marketplace (canonical when published), (b) `git subtree` for upstream-tracking adopters, (c) `git submodule` for pinned vendoring, (d) one-shot `cp -r` for fork-and-own. Decision rule appended below the recipes to help adopters pick. The existing `Adopt the entry points` step is preserved. P1 #2 — Required-field enforcement audit (skills/decide/SKILL.md Step 3a): The H-1 research recommendation made `rationale`, `confidence`, `reversible: Y|N` required on every decision proposal. The decider agent §Validation enforces this at first-pass (drops malformed proposals to `## Alternatives Considered`). The skill now enforces at second-pass before opening the act gate: rejects the whole set if any proposal still lacks a required field, surfaces an explicit AskUserQuestion with three options (re-dispatch / drop the malformed / accept-with-acknowledgement). No duplicated enforcement; clean first-pass / second-pass division. P1 #3 — Per-command frontmatter ↔ table consistency check (scripts/check-roster.sh check_8 + automation-spec.md §B2 + scripts/README.md): The §B2 deferral noted in earlier waves is now closed. check_8_command_descriptions extracts each command file's frontmatter `description:` and the corresponding row from commands/goal/README.md's `## Commands` table, then confirms they share at least one substantive keyword (with a stop-word filter for Goal-Loop boilerplate). All 15 /goal:* commands report OK on the current tree — no drift surfaced. automation-spec §B2 status annotation drops the deferral note ("implemented — all surfaces covered"); scripts/README table row enumerates all 8 covered surfaces. `bash scripts/check-all.sh`: all 7 checks pass; check_8 inside check-roster confirms 15/15 commands match their table descriptions. `npm run check:links` and typos clean. https://claude.ai/code/session_01CxFjqxucdBwScjAkMxU83s
…impl-rigor / educator / OODA fidelity Four parallel review subagents (educator / impl-rigor / security / OODA) ran read-only against d63766d. Seven P1 findings consolidated into single-file edits below. Each is load-bearing for either a load-bearing security guarantee, a peer-implementer contract, or the documented audit chain. THREAT-03 (security; data egress) Added `## Export safety` section to `templates/observe-source-datasheet-template.md` with `safe_for_export: false` default-deny. Tightened `skills/export-trace/SKILL.md` boundary so default-deny applies to every source type, not just `inbox | person | dataset`. Pre-wave-9 datasheets (which omit the section) are treated as `safe_for_export: false` — failing open is forbidden. A `feed`, `api`, `repo`, or `document` source can carry attacker-controlled or proprietary content just as easily; the export-safety contract does not vary by source type. THREAT-01 (security; audit-trail keystone) `/goal:demo` now records a demo-seed origin row in `## History` before any phase runs (canonical `outcome: amend`, `artifact: memory/amendments/_demo-seed-origin.md`) and sets `demo_seed: true` in frontmatter. Article III §1 (human goal-gate) is not bypassed silently — the audit trail names the bypass for what it is. `/goal:amend` refuses to clear `demo_seed` silently; promoting a demo to a real goal now requires a recorded amendment with the full goal-gate interview re-run. THREAT-04 + impl-rigor P2-12 (security + spec; criteria laundering) In revive-amend mode, `set-goal` now surfaces an explicit acceptance-criteria diff via `AskUserQuestion` and records the user's `equivalent | weaker | stronger` attestation in the amendment record file. Goal that was `close-abandon`ed cannot be silently re-opened with weaker criteria and re-closed as `close-met` — Article X §3 is preserved through recorded attestation. Fixed iteration-bump semantics: `iteration` is NOT bumped at amend time (the next Observe bumps it, per the canonical rule in `skills/_shared/goal-state.md`). Impl-rigor P1-5 (spec; act-gate parsing contract) Pinned canonical, byte-fixed parsing rule for `## Human Approval` in `templates/decision-template.md`. Act and Actor parse only the `**Approved action ids:**` line — comma-and-space separator, `[a-z0-9-]+` ids, placeholder `<…>` or empty value = no ids, multi-checkbox state forces re-prompt, zero-checkbox + non-empty ids = draft. Closes the act-gate semantic gap two implementers could otherwise interpret differently. Impl-rigor P1-7 (spec; cap-counter contract) Added `## Status enum — canonical boundaries` table to `templates/action-log-template.md` defining `done | partial | stopped | blocked | failed` operationally. Pinned cap-counter rule: `max_failed_acts` counts only `failed`; `stopped` and `blocked` are upstream-fix signals, not act-quality death spirals. Educator #1 + #2 (doc learnability) Promoted `docs/glossary.md` to item #1 in AGENTS.md "Read these first" so the canonical enums are defined before the constitution and method.md reference them. Added `## Canonical enums` consolidator section to `glossary.md` that lists every enum's values + defining doc + users. Replaced Article II §4's forward-reference to undefined enums with a cite to the glossary section. OODA D1 + D8 (Boyd fidelity) Added §"What Boyd's actual diagram looks like" + §"Implicit guidance & control" to `docs/ooda-foundations.md`. Names the seven feedback arrows, the implicit-guidance channel Boyd's late-1996 diagram contains, why the Goal Loop's diagrams in method.md are a deliberate simplification, and how the plugin's conventions function as an implicit-guidance surface. Checks: bash plugin-v2/scripts/check-all.sh → all green.
…DA / doc-learnability findings Four parallel review subagents flagged ~30 P2/P3 findings; this wave consolidates their recommendations into focused single-file edits across the same areas the P1 wave covered. Also addresses two Codex P2 findings on fdd24ca (AGENTS roster tool list + .codex iteration-entry precondition). Security hardening (threat-model refresh subagent): - `docs/threat-model.md` — documented THREAT-01 (demo-signed audit-trail bypass, mitigated by `demo_seed: true` + `_demo-seed-origin.md`), THREAT-03 (OTel exporter raw-content leak through pre-wave-9 datasheets, mitigated by default-deny on every source type), THREAT-04 (revive-via-amend criteria laundering, mitigated by recorded `equivalent|weaker|stronger` attestation). - `agents/goal-orchestrator.md` — added §7a click-through erosion guard (`consecutive_acknowledged_caps` counter; ≥2 escalates to remove `Continue`), cap-raise sub-rule (`cap_raises_lifetime` + escalation_contact signing), and History integrity check (refuses to dispatch on non-contiguous iteration numbers; writes `history-integrity-gap.md`). - `agents/observer.md` — static Bash allow-list (`ls cat head tail wc find git* grep rg awk sed-n` only); explicit forbid list; Bash lockout when source's datasheet has `adversarial_content_risk: high`. Impl-rigor (peer-implementer spec review subagent): - `skills/_shared/goal-state.md` — added optional frontmatter fields `lock`, `consecutive_acknowledged_caps`, `cap_raises_lifetime`, `approver_anonymisation`, `decision_mode`, `tempo`; byte-fixed `## History` row format (six columns, leading/trailing pipes, seconds-precision UTC, escape `|` as `\|`). - `agents/decider.md` — pinned tie-breaker (reversible Y > N, lower cost, lex ascending id); RPD decision-mode section reading `decision_mode` from goal. - `agents/loop-reviewer.md` — externality-dominates-progress rule for `partial`-vs-`blocked` per-criterion verdict. - `agents/signal-deduper.md` — pinned algorithm: ±15-min window, Jaccard 0.7 merge threshold, Jaccard < 0.3 contradiction routing. - `skills/decide/SKILL.md` — "missing field" definition (absent, whitespace, template placeholder, off-enum); idempotence pre-flight when prior decision set already has empty approved-id list with skip reason. - `skills/observe/SKILL.md` — restructured `current_phase: review` admit-vs- refuse rules positionally (admit only on `continue`; refuse with route per other outcomes); kept the force-advance ban from a prior Codex fix. - `skills/export-trace/SKILL.md` — deterministic `traceId` (SHA-256 of `<slug>|<iteration>` truncated 32 hex) and `spanId` (SHA-256 of `<traceId>|<phase>|<sub-id>` truncated 16 hex); approver-anonymisation rendering honoring goal's `approver_anonymisation` policy (`role-only` default with `approver_hash`); History-integrity fail-loud (refuse to emit on non-contiguous History); dangling-evidence rule (emit event + adjacent `goal_loop.gap` event with kind `dangling-evidence`). - `templates/decision-template.md` — pinned H=3, M=2, L=1 numeric basis and default score formula `impact × confidence ÷ cost`. OODA fidelity (Boyd practitioner review subagent): - `docs/method.md` — replaced four-phase table + pseudo-phase paragraph with one seven-row `current_phase` enum table (closes the four-vs-five-phases bend); updated ASCII diagram with explicit back-edges for Decide → Orient, Act → Orient, and Review → Observe (with legend distinguishing solid forward chain from dashed back-edges). - `templates/orientation-template.md` — added `## Re-orientation triggers` (populated by orienter reading previous iteration's `decisions.md` rejections), `## Centre of Gravity` (orienter names one binding constraint each iteration; Goldratt-via-Boyd Schwerpunkt), and `## Action results` placeholder for mid-iteration Act → Orient feedback. - `agents/orienter.md` — added `decisions.md` (previous iteration) to "Read first" so the Decide → Orient back-edge has a write path. - `docs/ooda-foundations.md` — Richards bridge paragraph between Boyd's military origin and the plugin's domain-neutral instantiation. - `skills/set-goal/SKILL.md` — Klein pre-mortem per acceptance criterion ("name evidence that, if absent at close, would force `unmet`"); answers land in `## Open Questions` tagged as Review checkpoints. Doc learnability (educator review subagent): - `CLAUDE.md` — deferred the nine-subagent/nine-skill/twelve-command inventory dump to a single-line pointer at AGENTS.md. - `README.md` — fixed subagent count off-by-one (1 orchestrator + 5 phase + 3 consult-only = 9). - `docs/glossary.md` — added EARS five-pattern sub-entry under §"Acceptance criterion"; added §"Frontmatter" top-level entry. - `docs/getting-started.md` — stripped redundant "(see glossary)" parentheticals now that the P1 wave promoted glossary to Read-These-First item #1. - `docs/domain-model.md` — added "How to read this doc" preamble splitting core (sections 1-8) from extended-reference (9-24). - `docs/examples/issue-resolution.md` — annotated the four-observer fan-out with the `parallel_observe` defaults rule (no silent contract violation). - `examples/issue-resolution/decisions.md` — added a form-note explaining the summary table is for narrative readability; the canonical form is the definition-list per `templates/decision-template.md`. - `docs/announcement.md` — added a self-aware header pointing forward readers at `getting-started.md` / `method.md` / `domain-model.md`. Codex P2 fixes on fdd24ca: - `AGENTS.md` — synced subagent tool roster with actual agent contracts (`goal-orchestrator` adds `AskUserQuestion`; `observer` adds `Write`). - `.codex/workflows/goal-iteration.md` — removed `review` from iteration-entry precondition; cites Article VI §1 and routes to `/goal:review` first. Checks: bash plugin-v2/scripts/check-all.sh → all green.
Summary
Establishes the v0.1 foundation for a solution-agnostic, spec-driven, agentic SDLC workflow template. Built on your original v0.1 draft and enhanced with patterns researched from GitHub Spec Kit, Amazon Kiro, BMAD-METHOD, EARS notation, and Michael Nygard ADRs.
The workflow is the deliverable — this repo defines the process, artifacts, agents, and quality gates used to build other products.
What's in here (69 files, ~4,000 lines)
Root governance
README.md— entry point and file mapAGENTS.md— cross-tool root context (works with Codex, Cursor, Aider, Copilot, Claude Code)CLAUDE.md— Claude Code entry, importsAGENTS.mdmemory/constitution.md— 10 articles of governing principles, ADR-amendable.gitignoreWorkflow documentation (
docs/)spec-kit.md— full workflow definition (11 stages × owners × gates)workflow-overview.md— one-page cheat sheetquality-framework.md— 6 quality dimensions and per-stage Definition of Donetraceability.md— ID schemeREQ → SPEC → T → code → TESTears-notation.md— reference for the 5 EARS patternssteering/— 5 scoped context files (product, tech, ux, quality, operations)adr/— 3 seed ADRs (record decisions, adopt SDD, adopt EARS)Templates (
templates/) — 16 templates with YAML frontmatter and built-in quality gates: idea, research, prd, design, spec, tasks, implementation log, test plan, test report, review, release notes, retrospective, ADR, RTM, workflow state, checklist.Claude Code (
.claude/)/spec:start,/spec:idea,/spec:research,/spec:requirements,/spec:design,/spec:specify,/spec:tasks,/spec:implement,/spec:test,/spec:review,/spec:release,/spec:retro,/spec:clarify,/spec:analyze,/adr:new.Placeholders
examples/— for v0.2 worked end-to-end examplesspecs/— for per-feature work productKey design choices (each ADR-tracked)
qacan'tEditsource,reviewercan'tEditat all,analysthas noBash. Scope is a permission, not a default.Notes
mainfor now with a small footprint. The harness is wired to develop on this branch and create a PR. Easiest path: merge this PR and continue onmaingoing forward — the branch protection / PR overhead can stay off.Test plan
README.md— does the file map match the actual layout?docs/spec-kit.md— does the workflow make sense end-to-end?templates/prd-template.md) — does the structure feel usable?.claude/agents/dev.md) — is its scope tight enough?/spec:start <slug>(Claude Code) and walk a tiny imaginary feature through a few stages to feel the seams.https://claude.ai/code/session_01DvaENnL3uxU7f7eUNJTENv
Generated by Claude Code