Conversation
…rkflow Add severity-scaled Regression Risk Analysis section to bugfix template with Blast Radius, Behavior Inventory, Test Coverage Assessment, Risk Tier, and Scope Escalation Check subsections. Enhance Phase 2 workflow guidance with concrete discovery steps using abstract operations. Add REGRESSION_MARKERS validation to ensure regression analysis content survives platform generation.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the review settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough

Adds a Bugfix Regression Risk Analysis feature: new spec/docs and tasks, a Regression Risk Analysis section added to the bugfix template, workflow updated to require severity-scaled analysis before Proposed Fix, validator updated to check regression markers and timestamp ordering, and a Timestamp Protocol plus tests added.
Sequence Diagram

sequenceDiagram
participant Analyst as Analyst
participant Workflow as WorkflowEngine
participant Template as BugfixTemplate
participant Validator as Validator
participant Platform as PlatformOutput
Analyst->>Workflow: start bugfix spec (provide severity)
Workflow->>Analyst: require Regression Risk Analysis step
alt Critical/High
Analyst->>Template: fill Blast Radius, Behavior Inventory, Test Coverage, Risk Tier, Scope Escalation
else Medium
Analyst->>Template: fill Blast Radius, Behavior Inventory, brief Risk Tier
else Low
Analyst->>Template: minimal Blast Radius note
end
Template->>Template: derive Must-Test items -> populate Unchanged Behavior & Testing Plan
Analyst->>Template: write Proposed Fix
Template->>Validator: submit generated outputs
Validator->>Validator: check REGRESSION_MARKERS and timestamp ordering
Validator->>Platform: verify cross-platform marker consistency
Validator-->>Analyst: return validation results
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Pull request overview
This PR adds a Regression Risk Analysis methodology to the SpecOps bugfix workflow. Previously, the bugfix template provided a section for documenting preserved behaviors but no systematic process for discovering which behaviors need protecting. This change fills that gap by adding severity-scaled discovery steps, integrating them into the Phase 2 workflow, and adding validation markers to ensure the content survives platform generation.
Changes:
- Adds a Regression Risk Analysis section (Blast Radius, Behavior Inventory, Test Coverage Assessment, Risk Tier, Scope Escalation Check) to the bugfix template with severity-scaled guidance, and updates related sections (Unchanged Behavior, Testing Plan, Acceptance Criteria)
- Enhances Phase 2 workflow guidance in `core/workflow.md` with concrete severity-tiered discovery steps using abstract operations (LIST_DIR, READ_FILE, RUN_COMMAND)
- Adds `REGRESSION_MARKERS` validation in `generator/validate.py` for per-platform and cross-platform consistency checks
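A minimal sketch of what such a regression-marker check could look like. The marker strings follow the PR description; the helper names and structure are assumptions, not the actual `generator/validate.py` code:

```python
# Hedged sketch of per-platform and cross-platform marker validation.
# Marker strings come from the PR description; the helper names are
# hypothetical, not the repo's real API.
REGRESSION_MARKERS = [
    "## Regression Risk Analysis",
    "### Blast Radius",
    "### Behavior Inventory",
    "### Test Coverage Assessment",
    "### Risk Tier",
    "### Scope Escalation Check",
]

def missing_markers(generated_text: str) -> list[str]:
    """Return the regression markers absent from one generated platform file."""
    return [m for m in REGRESSION_MARKERS if m not in generated_text]

def consistent_across_platforms(outputs: dict[str, str]) -> bool:
    """True only when every platform output contains every marker."""
    return all(not missing_markers(text) for text in outputs.values())
```

The cross-platform check is what ensures the section survives regeneration on every target, not just one.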
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `core/templates/bugfix.md` | New Regression Risk Analysis section with five subsections, updated Unchanged Behavior/Testing Plan comments, two new acceptance criteria |
| `core/workflow.md` | Replaced single-paragraph bugfix guidance with severity-tiered discovery procedure |
| `generator/validate.py` | Added 6 regression markers, per-platform check, cross-platform consistency |
| `skills/specops/SKILL.md` | Regenerated Claude skill output with new content |
| `platforms/claude/SKILL.md` | Regenerated Claude platform output |
| `platforms/cursor/specops.mdc` | Regenerated Cursor platform output |
| `platforms/codex/SKILL.md` | Regenerated Codex platform output |
| `platforms/copilot/specops.instructions.md` | Regenerated Copilot platform output |
| `CHECKSUMS.sha256` | Updated checksums for all regenerated files |
| `.specops/index.json` | Added completed Spec 2 entry |
| `.specops/bugfix-regression-analysis/*` | Spec 2 dogfood artifacts (requirements, design, tasks, implementation journal, spec.json) |
`core/templates/bugfix.md` (Outdated)
> Use LIST_DIR and READ_FILE to find callers, importers, and dependents.
> Use RUN_COMMAND to search for usages if the platform supports code execution.
The HTML comment text prepends "Use" before the abstract tool operations (LIST_DIR, READ_FILE, RUN_COMMAND), but the tool substitution mechanism replaces these with platform-specific phrases that may already include "Use". For example, the Claude platform maps LIST_DIR → `Use the Glob tool to list`, so `Use LIST_DIR` becomes `Use Use the Glob tool to list` in the generated output (visible in skills/specops/SKILL.md line 1645 and platforms/claude/SKILL.md line 1645).
The fix is to drop the leading "Use" and write the abstract operations directly as verbs, matching the existing convention in core/workflow.md (e.g., line 56: "LIST_DIR and READ_FILE the affected component's directory"). For example:
- Line 36: `LIST_DIR and READ_FILE to find callers, importers, and dependents.`
- Line 37: `RUN_COMMAND to search for usages if the platform supports code execution.`
Suggested change:

```diff
-Use LIST_DIR and READ_FILE to find callers, importers, and dependents.
-Use RUN_COMMAND to search for usages if the platform supports code execution.
+LIST_DIR and READ_FILE to find callers, importers, and dependents.
+RUN_COMMAND to search for usages if the platform supports code execution.
```
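The failure mode described above can be reproduced with a toy substitution pass. The `Use the Glob tool to list` phrase is quoted from the review, while `substitute` itself is a hypothetical sketch, not the generator's real code:

```python
# Toy model of the tool-substitution step: naive string replacement of
# abstract operations. TOOL_MAP mirrors the mapping quoted in the review;
# substitute() is a hypothetical sketch, not the repo's implementation.
TOOL_MAP = {"LIST_DIR": "Use the Glob tool to list"}

def substitute(text: str, mapping: dict[str, str]) -> str:
    for op, phrase in mapping.items():
        text = text.replace(op, phrase)
    return text

# A template line that already says "Use" doubles the verb:
broken = substitute("Use LIST_DIR to find callers.", TOOL_MAP)
# → "Use Use the Glob tool to list to find callers."

# Writing the abstract operation as the verb avoids the duplication:
fixed = substitute("LIST_DIR to find callers.", TOOL_MAP)
# → "Use the Glob tool to list to find callers."
```

Because the replacement is purely textual, the only safe convention is for the template to treat each abstract operation as the verb itself.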
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/templates/bugfix.md (1)
79-104: ⚠️ Potential issue | 🟠 Major

Keep low-severity bugfixes completable.
Lines 80-104 still require Must-Test items and coverage-gap closure for every bugfix, but Lines 29-32 explicitly allow low-severity fixes to stop after a brief blast-radius scan. As written, low-severity specs either become impossible to satisfy or get padded with boilerplate just to clear the checklist.
♻️ Suggested adjustment

```diff
 ## Unchanged Behavior
-<!-- Drawn from Must-Test behaviors in the Regression Risk Analysis above.
-     Each item here is a formal commitment backed by discovery, not a guess.
-     Use EARS notation: WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior] -->
-- WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior that must be preserved]
-- WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior that must be preserved]
+<!-- For Critical/High severity, list Must-Test behaviors from the Regression Risk Analysis.
+     For Medium severity, include documented Must-Test items if a Risk Tier was produced.
+     For Low severity, omit this section when the blast-radius scan confirms no meaningful regression surface. -->
+- WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior that must be preserved]
@@
 ### Unchanged Behavior (verify no regressions)
-<!-- Must-Test behaviors from Regression Risk Analysis. Nice-To-Test items are optional.
-     Gap behaviors (no existing test) from Coverage Assessment must have new tests here. -->
+<!-- Must-Test behaviors from Regression Risk Analysis, if any. Nice-To-Test items are optional.
+     Gap behaviors from Coverage Assessment must have new tests here when a Coverage Assessment was required. -->
 - WHEN [related condition] THE SYSTEM SHALL CONTINUE TO [preserved behavior]
 ## Acceptance Criteria
-- [ ] Regression Risk Analysis completed (blast radius surveyed, behaviors inventoried)
+- [ ] Regression Risk Analysis completed at a depth proportional to Severity
 - [ ] Bug reproduction confirmed (Current Behavior verified)
 - [ ] Fix verified (Expected Behavior tests pass)
-- [ ] No regressions (all Must-Test Unchanged Behavior tests pass)
-- [ ] Test coverage gaps from Coverage Assessment addressed
+- [ ] No regressions (all documented Must-Test Unchanged Behavior tests pass, if any)
+- [ ] Test coverage gaps from Coverage Assessment addressed, if a Coverage Assessment was required
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/templates/bugfix.md` around lines 79 - 104, Update the bugfix.md template to allow low-severity fixes to follow the lighter path referenced in the doc: change the strict "Must-Test" requirement under the "Unchanged Behavior" and "Testing Plan" sections so that for low-severity bugs the template permits a minimal blast-radius scan and a short verification (e.g., add a conditional note or an optional checkbox that Low-Severity fixes may provide a brief summary instead of full Must-Test items). Specifically, modify the headings/blocks "Unchanged Behavior", "Testing Plan", and "Acceptance Criteria" to include an explicit optional branch or checkbox labeled "Low-Severity: brief blast-radius scan and summary permitted" and update the checklist items so they can be marked satisfied by that minimal verification path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.specops/bugfix-regression-analysis/spec.json:
- Around line 6-7: The spec metadata contains future UTC timestamps for the
"created" and "updated" fields; replace those values in
.specops/bugfix-regression-analysis/spec.json with the actual UTC times when the
PR was opened/modified (use the PR event time, e.g., 2026-03-07T19:29:32Z or the
correct commit timestamps) ensuring "created" is the original creation time and
"updated" reflects the real last-modified time, then regenerate/rebuild the
derived .specops/index.json so the index reflects the corrected timestamps.
In @.specops/bugfix-regression-analysis/tasks.md:
- Line 77: Task 3 in the markdown has been left as "Completed" while its "Tests
Required" checkboxes remain unchecked; update the task state to be consistent by
either (a) checking the two "Tests Required" checkboxes under Task 3 (the "[ ]
Tests Required" items) and any similar unchecked boxes for tasks 101–103, or (b)
if tests were not actually completed, change Task 3 (and tasks 101–103) out of
the "Completed" section back to an appropriate state. Locate the "Task 3"
heading and the inline "- [ ] Tests Required" items and either toggle them to "-
[x] Tests Required" or move the entire task out of the "Completed" list
accordingly.
In `@core/workflow.md`:
- Around line 62-66: The low-severity path allows stopping after "Blast Radius"
but still requires populating "Unchanged Behavior" from "Must-Test behaviors",
which may not exist; update the workflow so that when the "Low severity" branch
is taken (the text referencing "Low severity: Brief step 1 only" and the later
"Unchanged Behavior" requirement) the author must either (a) provide a minimal
"Must-Test behaviors" list derived from the brief blast-radius scan (e.g., key
callers or state changes) or (b) explicitly state that "Unchanged Behavior" will
be populated from the blast-radius findings and include a short
template/example; make this change by editing the "Low severity" sentence to
mandate a minimal Must-Test list or a default derivation method and add one-line
guidance in the "Unchanged Behavior" paragraph to accept that minimal input.
In `@generator/validate.py`:
- Around line 121-129: REGRESSION_MARKERS currently contains headings that also
appear elsewhere in the workflow prose, weakening the validator; update the
REGRESSION_MARKERS list in generator.validate.py so each marker is unique to the
bugfix template (for example, rename overlapping entries like "Blast Radius",
"Behavior Inventory", "Risk Tier", and "Scope Escalation" to distinct names such
as "Bugfix - Blast Radius", "Bugfix - Behavior Inventory", "Bugfix - Risk Tier",
"Bugfix - Scope Escalation" or other unique variants) so the checks that
reference REGRESSION_MARKERS (and functions that read it) will only pass when
the bugfix template actually includes those headings.
In `@platforms/claude/SKILL.md`:
- Around line 69-73: Make the SKILL.md generation conditional so the
regression-gate checks (Risk Tier, Coverage Assessment, "all Must-Test" and
scope-escalation) are applied according to severity instead of unconditionally;
modify the generator template logic in the generator templates (the SKILL.md
Jinja2 template under generator/templates) to: 1) render
Risk/Coverage/Escalation checkboxes only when severity != Low and include a
minimal Risk/Coverage/Escalation pass for Low/Medium branches when required, and
2) only require "all Must-Test" and "coverage gaps ... addressed" when those
checks are enabled for that severity; make the change in the template or core
module that produces SKILL.md (not the generated SKILL.md file itself) and then
regenerate all outputs with the generator entrypoint (generate.py --all).
---
Outside diff comments:
In `@core/templates/bugfix.md`:
- Around line 79-104: Update the bugfix.md template to allow low-severity fixes
to follow the lighter path referenced in the doc: change the strict "Must-Test"
requirement under the "Unchanged Behavior" and "Testing Plan" sections so that
for low-severity bugs the template permits a minimal blast-radius scan and a
short verification (e.g., add a conditional note or an optional checkbox that
Low-Severity fixes may provide a brief summary instead of full Must-Test items).
Specifically, modify the headings/blocks "Unchanged Behavior", "Testing Plan",
and "Acceptance Criteria" to include an explicit optional branch or checkbox
labeled "Low-Severity: brief blast-radius scan and summary permitted" and update
the checklist items so they can be marked satisfied by that minimal verification
path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 979b36a5-bf62-4e3c-a7c9-e3e7fb438ef7
📒 Files selected for processing (15)
- .specops/bugfix-regression-analysis/design.md
- .specops/bugfix-regression-analysis/implementation.md
- .specops/bugfix-regression-analysis/requirements.md
- .specops/bugfix-regression-analysis/spec.json
- .specops/bugfix-regression-analysis/tasks.md
- .specops/index.json
- CHECKSUMS.sha256
- core/templates/bugfix.md
- core/workflow.md
- generator/validate.py
- platforms/claude/SKILL.md
- platforms/codex/SKILL.md
- platforms/copilot/specops.instructions.md
- platforms/cursor/specops.mdc
- skills/specops/SKILL.md
- Remove double "Use" prefix on abstract tool operations in bugfix template
- Fix future timestamps in spec metadata to match actual PR creation time
- Check unchecked test gates on completed Task 3 in tasks.md
- Clarify low-severity bugfix path to produce minimal artifacts for downstream gates
- Make regression markers heading-prefixed for unique validation against bugfix template
Actionable comments posted: 2
♻️ Duplicate comments (4)
generator/validate.py (1)
121-129: ⚠️ Potential issue | 🟠 Major

Validate the new acceptance-gate checkboxes too.
`REGRESSION_MARKERS` only covers the section headings. If a generated output drops the two new acceptance criteria lines, `validate_platform()` and the cross-platform consistency check still pass, so this validator misses part of the regression this PR is meant to catch.

♻️ Proposed fix
```diff
 REGRESSION_MARKERS = [
     "## Regression Risk Analysis",
     "### Blast Radius",
     "### Behavior Inventory",
     "### Test Coverage Assessment",
     "### Risk Tier",
     "### Scope Escalation Check",
+    "- [ ] No regressions (all Must-Test Unchanged Behavior tests pass)",
+    "- [ ] Test coverage gaps from Coverage Assessment addressed",
 ]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `generator/validate.py` around lines 121 - 129, REGRESSION_MARKERS currently only lists section headings and doesn't include the two new acceptance-gate checkbox lines, so update the constant REGRESSION_MARKERS to include the exact acceptance criteria strings (the two new checkbox lines) and then ensure validate_platform() and any cross-platform consistency check use REGRESSION_MARKERS (or the updated list) for their existence checks; locate REGRESSION_MARKERS in generator/validate.py and add the two checkbox strings to the list so the validator fails when those lines are missing.

platforms/codex/SKILL.md (1)
1396-1400: ⚠️ Potential issue | 🟠 Major

Low-severity template guidance is still behind the workflow.
The comment still says low severity needs only a brief Blast Radius scan, but the workflow now requires either a lightweight Risk Tier entry or an explicit "No caller-visible unchanged behavior" note. That mismatch will keep generated bugfix specs under-specified on the low-severity path.
Run this to confirm the stale wording is coming from the source template and not just this generated artifact:
```shell
#!/bin/bash
rg -n -C2 'brief Blast Radius scan only|caller-visible unchanged behavior|lightweight Risk Tier' core platforms
```

Expected result: the low-severity "brief Blast Radius scan only" wording should show up in the source template as well as generated outputs, so the fix should land in the source template and then be regenerated.
As per coding guidelines: "Never edit generated platform output files directly (SKILL.md, specops.mdc, specops.instructions.md). Edit core/ modules or generator/templates/*.j2 instead, then regenerate with python3 generator/generate.py --all."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `platforms/codex/SKILL.md` around lines 1396 - 1400, the "Regression Risk Analysis" low-severity guidance in the generated SKILL.md still says "brief Blast Radius scan only" but the workflow now requires either a lightweight Risk Tier entry or an explicit "No caller-visible unchanged behavior" note; update the source template (not the generated SKILL.md) to replace the stale low-severity wording with the new requirement language so that low-severity paths require a "lightweight Risk Tier" or an explicit "No caller-visible unchanged behavior" statement, and then regenerate via the generator (generator/templates/*.j2 or the relevant core module that supplies the Regression Risk Analysis block) so the change propagates to SKILL.md and other generated outputs.

platforms/claude/SKILL.md (1)
60-73: ⚠️ Potential issue | 🟠 Major

Severity-scaled workflow contradicts unconditional acceptance criteria.
Lines 60-73 allow Low/Medium severity bugs to skip regression analysis steps (Low: "brief step 1 only"; Medium: "skip detailed coverage check"), and line 73 instructs to "mark the regression/coverage acceptance criteria as checked with a note" for Low severity with no Must-Test behaviors. However, the template acceptance criteria at lines 1708-1714 remain unconditional — every bugfix must check off "Regression Risk Analysis completed (blast radius surveyed, behaviors inventoried)", "all Must-Test Unchanged Behavior tests pass", and "Test coverage gaps from Coverage Assessment addressed", regardless of severity. This makes Low/Medium bugs impossible to complete (all boxes checked) without performing the supposedly optional work, or creates silent non-compliance when boxes are checked despite skipped steps.
The template checkboxes need conditional rendering or inline notes matching the severity branches. For example: line 1709 could read "Regression Risk Analysis completed (blast radius surveyed, behaviors inventoried; for Low severity: minimal blast radius noted)", line 1712 "No regressions (all Must-Test Unchanged Behavior tests pass; N/A if Low severity with no Must-Test behaviors)", and line 1713 "Test coverage gaps from Coverage Assessment addressed (N/A if Medium severity skipped detailed check)".
As per coding guidelines, "Never edit generated platform output files directly (SKILL.md, specops.mdc, specops.instructions.md). Edit core/ modules or generator/templates/*.j2 instead, then regenerate with python3 generator/generate.py --all." Fix this in
`core/templates/bugfix.md` or the generator template that produces the Acceptance Criteria section, then regenerate.

Also applies to: 1708-1714
skills/specops/SKILL.md (1)
60-73: ⚠️ Potential issue | 🟠 Major

Severity-scaled workflow contradicts unconditional acceptance criteria.
Lines 60-73 allow Low/Medium severity bugs to skip regression analysis steps (Low: "brief step 1 only"; Medium: "skip detailed coverage check"), and line 73 instructs to "mark the regression/coverage acceptance criteria as checked with a note" for Low severity with no Must-Test behaviors. However, the template acceptance criteria at lines 1708-1714 remain unconditional — every bugfix must check off "Regression Risk Analysis completed (blast radius surveyed, behaviors inventoried)", "all Must-Test Unchanged Behavior tests pass", and "Test coverage gaps from Coverage Assessment addressed", regardless of severity. This makes Low/Medium bugs impossible to complete (all boxes checked) without performing the supposedly optional work, or creates silent non-compliance when boxes are checked despite skipped steps.
The template checkboxes need conditional rendering or inline notes matching the severity branches. For example: line 1709 could read "Regression Risk Analysis completed (blast radius surveyed, behaviors inventoried; for Low severity: minimal blast radius noted)", line 1712 "No regressions (all Must-Test Unchanged Behavior tests pass; N/A if Low severity with no Must-Test behaviors)", and line 1713 "Test coverage gaps from Coverage Assessment addressed (N/A if Medium severity skipped detailed check)".
As per coding guidelines, "Never edit generated platform output files directly (SKILL.md, specops.mdc, specops.instructions.md). Edit core/ modules or generator/templates/*.j2 instead, then regenerate with python3 generator/generate.py --all." Fix this in
`core/templates/bugfix.md` or the generator template that produces the Acceptance Criteria section, then regenerate.

Also applies to: 1708-1714
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/specops/SKILL.md` around lines 60 - 73, The Acceptance Criteria in the generated bugfix template is unconditional and conflicts with the severity-scaled Regression Risk Analysis; update the generator template (core/templates/bugfix.md) to render the three acceptance checklist items conditionally using the template severity variable (e.g., severity, must_test_items) so Low/Medium severities show N/A or alternate wording where steps are skipped (change the "Regression Risk Analysis completed", "All Must-Test Unchanged Behavior tests pass", and "Test coverage gaps from Coverage Assessment addressed" lines to include conditional phrases like "N/A if Low severity with no Must-Test behaviors" or "skipped detailed coverage check for Medium severity"), ensure the template checks reference must_test_items when deciding whether to require the Must-Test checkbox, and then regenerate all outputs with python3 generator/generate.py --all rather than editing SKILL.md directly.
🧹 Nitpick comments (1)
platforms/cursor/specops.mdc (1)
60-73: Well-designed severity-scaled analysis workflow.

The Regression Risk Analysis integration is thoughtful and follows the Simplicity Principle effectively. The three-tier approach (Critical/High with full 5-step analysis, Medium with abbreviated steps, Low with minimal scan) ensures proportional effort without over-engineering small fixes.
Optional clarity improvement for Line 73:
The phrase "mark the regression/coverage acceptance criteria as checked with a note" might be ambiguous. Consider rephrasing to make it clearer that the user should check the box AND add an explanatory note at completion time, not pre-check it in the template:
✨ Suggested clarification
```diff
-For Low severity with no Must-Test behaviors identified, note "N/A — isolated change with no caller-visible behavior to preserve" in the Unchanged Behavior section and mark the regression/coverage acceptance criteria as checked with a note.
+For Low severity with no Must-Test behaviors identified, note "N/A — isolated change with no caller-visible behavior to preserve" in the Unchanged Behavior section. At completion (Phase 4), the regression-related acceptance criteria may be checked with an explanatory note (e.g., "N/A — isolated change").
```

This clarifies that the checking happens at Phase 4 verification time, not during spec creation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platforms/cursor/specops.mdc` around lines 60 - 73, The sentence "mark the regression/coverage acceptance criteria as checked with a note" is ambiguous; update the wording in specops.mdc (the Regression Risk Analysis section) to explicitly instruct the reviewer to both check the acceptance-criteria checkbox and add an explanatory note at verification time (Phase 4), e.g., replace that phrase with "when completing Phase 4 verification, check the regression/coverage acceptance-criteria box and add a brief explanatory note describing what was verified" so it's clear the check happens at completion, not during spec creation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/workflow.md`:
- Line 66: The workflow currently instructs agents to "mark the
regression/coverage acceptance criteria as checked" during Phase 2 (in the
"Unchanged Behavior" / "Testing Plan" text); change this so acceptance criteria
checkboxes are NOT pre-checked in Phase 2 and remain unchecked until Phase 4.
Update the sentences that mention marking criteria as checked (and the
Low-severity special-case language) to instead state that the Unchanged Behavior
section may be empty for Low severity and that regression/coverage checkboxes
must be left unchecked until Phase 4 (docs check and checkbox verification).
Ensure the terms "Unchanged Behavior", "Testing Plan", "Phase 2", "Phase 4", and
"Must-Test" are preserved so readers can locate the revised guidance.
- Around line 55-58: Update the "Blast Radius Survey" step to split directory
listing from file reads: change the instruction that currently says to
"READ_FILE the affected component's directory" so it first uses LIST_DIR to
enumerate files in the component directory (keep the "Identify every module..."
guidance), then instruct that for each file returned you should READ_FILE each
file to inspect imports/usages; reference the "Blast Radius Survey" heading and
the current step 1 wording so you replace the single READ_FILE-on-directory
action with a two-phase LIST_DIR then per-file READ_FILE sequence.
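The two-phase sequence that prompt asks for can be sketched concretely. The directory layout, symbol name, and `blast_radius` helper are illustrative assumptions, not project code:

```python
import os

def blast_radius(component_dir: str, symbol: str) -> list[str]:
    """Two-phase survey: enumerate files first (LIST_DIR), then read each
    returned file to look for usages of the changed symbol (READ_FILE)."""
    hits = []
    for name in sorted(os.listdir(component_dir)):  # phase 1: LIST_DIR
        path = os.path.join(component_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as fh:    # phase 2: READ_FILE per file
            if symbol in fh.read():
                hits.append(name)
    return hits
```

Splitting the listing from the reads matters because the abstract operations map to distinct platform tools, so a single "read the directory" instruction cannot be substituted cleanly.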
---
Duplicate comments:
In `@generator/validate.py`:
- Around line 121-129: REGRESSION_MARKERS currently only lists section headings
and doesn't include the two new acceptance-gate checkbox lines, so update the
constant REGRESSION_MARKERS to include the exact acceptance criteria strings
(the two new checkbox lines) and then ensure validate_platform() and any
cross-platform consistency check use REGRESSION_MARKERS (or the updated list)
for their existence checks; locate REGRESSION_MARKERS in generator/validate.py
and add the two checkbox strings to the list so the validator fails when those
lines are missing.
In `@platforms/codex/SKILL.md`:
- Around line 1396-1400: The "Regression Risk Analysis" low-severity guidance in
the generated SKILL.md still says "brief Blast Radius scan only" but the
workflow now requires either a lightweight Risk Tier entry or an explicit "No
caller-visible unchanged behavior" note; update the source template (not the
generated SKILL.md) to replace the stale low-severity wording with the new
requirement language so that low-severity paths require a "lightweight Risk
Tier" or an explicit "No caller-visible unchanged behavior" statement, and then
regenerate via the generator (generator/templates/*.j2 or the relevant core
module that supplies the Regression Risk Analysis block) so the change
propagates to SKILL.md and other generated outputs.
In `@skills/specops/SKILL.md`:
- Around line 60-73: The Acceptance Criteria in the generated bugfix template is
unconditional and conflicts with the severity-scaled Regression Risk Analysis;
update the generator template (core/templates/bugfix.md) to render the three
acceptance checklist items conditionally using the template severity variable
(e.g., severity, must_test_items) so Low/Medium severities show N/A or alternate
wording where steps are skipped (change the "Regression Risk Analysis
completed", "All Must-Test Unchanged Behavior tests pass", and "Test coverage
gaps from Coverage Assessment addressed" lines to include conditional phrases
like "N/A if Low severity with no Must-Test behaviors" or "skipped detailed
coverage check for Medium severity"), ensure the template checks reference
must_test_items when deciding whether to require the Must-Test checkbox, and
then regenerate all outputs with python3 generator/generate.py --all rather than
editing SKILL.md directly.
---
Nitpick comments:
In `@platforms/cursor/specops.mdc`:
- Around line 60-73: The sentence "mark the regression/coverage acceptance
criteria as checked with a note" is ambiguous; update the wording in specops.mdc
(the Regression Risk Analysis section) to explicitly instruct the reviewer to
both check the acceptance-criteria checkbox and add an explanatory note at
verification time (Phase 4), e.g., replace that phrase with "when completing
Phase 4 verification, check the regression/coverage acceptance-criteria box and
add a brief explanatory note describing what was verified" so it's clear the
check happens at completion, not during spec creation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 786ca4f0-1ddd-4c80-bffd-03bf639316dc
📒 Files selected for processing (12)
- .specops/bugfix-regression-analysis/spec.json
- .specops/bugfix-regression-analysis/tasks.md
- .specops/index.json
- CHECKSUMS.sha256
- core/templates/bugfix.md
- core/workflow.md
- generator/validate.py
- platforms/claude/SKILL.md
- platforms/codex/SKILL.md
- platforms/copilot/specops.instructions.md
- platforms/cursor/specops.mdc
- skills/specops/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (5)
- .specops/bugfix-regression-analysis/tasks.md
- .specops/bugfix-regression-analysis/spec.json
- CHECKSUMS.sha256
- .specops/index.json
- core/templates/bugfix.md
Add Timestamp Protocol to review-workflow requiring all timestamps come from RUN_COMMAND(date -u) instead of LLM estimation. Reinforce at every timestamp update site. Add data integrity rule and timestamp ordering tests. Fix existing dogfood timestamps to actual commit times.
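A minimal sketch of the ordering check such a protocol implies. The field semantics follow spec.json as described above, but the helpers are assumptions rather than the repo's actual tests:

```python
from datetime import datetime, timezone

def parse_utc(ts: str) -> datetime:
    """Parse the format emitted by RUN_COMMAND(date -u +"%Y-%m-%dT%H:%M:%SZ")."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def timestamps_valid(created: str, updated: str) -> bool:
    """created must not follow updated, and neither may lie in the future."""
    c, u = parse_utc(created), parse_utc(updated)
    return c <= u <= datetime.now(timezone.utc)
```

The future-timestamp guard is what catches LLM-estimated values like the 2026 dates flagged earlier in this review.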
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
`core/workflow.md` (Outdated)
```diff
 **Phase 3: Implement**
-1. Check the implementation gate: if spec review is enabled, verify `spec.json` status is `approved` before proceeding. Update status to `implementing`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp, and regenerate `index.json`.
+1. Check the implementation gate: if spec review is enabled, verify `spec.json` status is `approved` before proceeding. Update status to `implementing`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp (use RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time), and regenerate `index.json`.
```
The use RUN_COMMAND(...) phrasing here produces broken output in generated platform files. Since the tool mapping replaces RUN_COMMAND with a phrase that already starts with a verb (e.g., Claude: Use the Bash tool to run, Copilot: Run the terminal command), the result becomes use Use the Bash tool to run(...) or use Run the terminal command(...).
The fix is to drop the leading "use" and write RUN_COMMAND(...) directly, matching the convention used in core/review-workflow.md (line 15: RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) to get the current UTC timestamp) and throughout the rest of core/workflow.md (e.g., line 56: LIST_DIR and READ_FILE the affected component's directory). For example: update `updated` timestamp (RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time).
- 1. Check the implementation gate: if spec review is enabled, verify `spec.json` status is `approved` before proceeding. Update status to `implementing`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp (use RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time), and regenerate `index.json`.
+ 1. Check the implementation gate: if spec review is enabled, verify `spec.json` status is `approved` before proceeding. Update status to `implementing`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp (RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time), and regenerate `index.json`.
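The failure mode described here is mechanical string substitution. A minimal sketch (the mapping value is illustrative, taken from the Claude example in the comment above):

```python
# Tool mapping: each abstract operation token is replaced by a platform verb phrase.
# The value below is the Claude phrasing quoted in the review comment; other
# platforms substitute their own verb phrases.
TOOL_MAP = {
    "RUN_COMMAND": "Use the Bash tool to run",
}


def render(text: str) -> str:
    """Render core instruction text into platform-specific guidance."""
    for token, phrase in TOOL_MAP.items():
        text = text.replace(token, phrase)
    return text


# A leading lowercase "use" doubles the verb after substitution.
broken = render("use RUN_COMMAND(`date -u`) for the current time")
fixed = render("RUN_COMMAND(`date -u`) for the current time")
print(broken)  # use Use the Bash tool to run(`date -u`) for the current time
print(fixed)   # Use the Bash tool to run(`date -u`) for the current time
```

Because the replacement phrase already starts with a verb, any wording that treats the token as a bare noun ("use RUN_COMMAND", "if RUN_COMMAND is available") renders badly on every platform.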
core/workflow.md
Outdated

- If stale documentation is detected, update the affected sections
- If unsure whether a doc needs updating, flag it to the user rather than skipping silently
- 4. Set `spec.json` status to `completed`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp, and regenerate `index.json`
+ 4. Set `spec.json` status to `completed`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp (use RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time), and regenerate `index.json`
Same "use RUN_COMMAND" issue as in line 88 — the lowercase use before the abstract operation produces use Use the Bash tool to run(...) in generated platform output. Drop the leading "use" to match the convention in core/review-workflow.md line 15.
- 4. Set `spec.json` status to `completed`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp (use RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time), and regenerate `index.json`
+ 4. Set `spec.json` status to `completed`, set `specopsUpdatedWith` to the current SpecOps version (from this instruction file's frontmatter `version:` field), update `updated` timestamp (RUN_COMMAND(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) for the current time), and regenerate `index.json`
Actionable comments posted: 2
♻️ Duplicate comments (2)
core/workflow.md (2)
55-57: ⚠️ Potential issue | 🟠 Major — Split the directory scan from the file reads.

READ_FILE on a directory is not a valid operation sequence, so this still renders into broken generated guidance. List the directory first, then read the concrete files you discover.

♻️ Proposed fix
- 1. **Blast Radius Survey** — LIST_DIR and READ_FILE the affected component's directory. Identify every module, function, or API that imports or calls the affected code. If RUN_COMMAND is available, search for usages across the codebase. Record each entry point in the Blast Radius subsection.
+ 1. **Blast Radius Survey** — LIST_DIR the affected component's directory. Then READ_FILE the specific source files, callers, and entry points discovered in that scan. Identify every module, function, or API that imports or calls the affected code. If RUN_COMMAND is available, search for usages across the codebase. Record each entry point in the Blast Radius subsection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/workflow.md` around lines 55 - 57, The workflow text currently suggests using READ_FILE on a directory which is invalid; update the "Blast Radius Survey" and "Behavior Inventory" guidance to explicitly separate directory listing from file reads by using LIST_DIR to enumerate the affected component's directory (record the directory and discovered file paths) and then use READ_FILE only on each concrete file path found; mention LIST_DIR and READ_FILE by name and clarify that the Blast Radius subsection should list directories and discovered files, while the Behavior Inventory step reads each discovered file to enumerate behaviors.
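The list-then-read sequence the fix mandates can be sketched in Python (the directory and file names are hypothetical stand-ins for the affected component):

```python
import tempfile
from pathlib import Path

# Hypothetical component directory standing in for the affected code.
root = Path(tempfile.mkdtemp())
(root / "handler.py").write_text("def handle(): ...\n")
(root / "utils.py").write_text("def helper(): ...\n")

# LIST_DIR step: enumerate the directory, recording concrete file paths.
discovered = sorted(p for p in root.iterdir() if p.is_file())

# READ_FILE step: read only the concrete files found above,
# never the directory itself.
contents = {p.name: p.read_text() for p in discovered}
print(sorted(contents))  # ['handler.py', 'utils.py']
```

Separating the two operations is what keeps the generated platform guidance valid: each READ_FILE call receives a file path produced by the preceding LIST_DIR.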
64-66: ⚠️ Potential issue | 🟠 Major — Leave regression checkboxes unchecked until Phase 4.

This still tells the agent to mark acceptance criteria as checked while the spec is being written, which bypasses the completion gate and can make a draft bugfix look verified before implementation or testing.

Based on learnings: "Phase 4 includes docs check and checkbox verification."

♻️ Proposed fix
- After the Regression Risk Analysis, populate the "Unchanged Behavior" section from the Must-Test behaviors. For Low severity with no Must-Test behaviors identified, note "N/A — isolated change with no caller-visible behavior to preserve" in the Unchanged Behavior section and mark the regression/coverage acceptance criteria as checked with a note. Structure the Testing Plan into three categories: Current Behavior (verify bug exists), Expected Behavior (verify fix works), Unchanged Behavior (verify no regressions using Must-Test items from the analysis; for Low severity with no Must-Test items, this section may be empty).
+ After the Regression Risk Analysis, populate the "Unchanged Behavior" section from the Must-Test behaviors. For Low severity with no Must-Test behaviors identified, note "N/A — isolated change with no caller-visible behavior to preserve" in the Unchanged Behavior section and record why the regression/coverage criteria are expected to be trivially satisfied at verification time. Leave the acceptance checkboxes unchecked until Phase 4. Structure the Testing Plan into three categories: Current Behavior (verify bug exists), Expected Behavior (verify fix works), Unchanged Behavior (verify no regressions using Must-Test items from the analysis; for Low severity with no Must-Test items, this section may be empty).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/workflow.md` around lines 64 - 66, The spec currently instructs authors to "mark the regression/coverage acceptance criteria as checked with a note" during Low severity drafting; change the language in the "Regression Risk Analysis", "Unchanged Behavior", and "Testing Plan" paragraphs so that acceptance criteria checkboxes are explicitly left unchecked until Phase 4 (docs check and checkbox verification). Concretely, replace any phrase that tells the author to mark checkboxes as checked (e.g., "mark the regression/coverage acceptance criteria as checked with a note") with a directive to "defer checking acceptance criteria until Phase 4" and ensure the Testing Plan still records Current Behavior, Expected Behavior, and Unchanged Behavior but does not mark any verification boxes until Phase 4.
🧹 Nitpick comments (3)
skills/specops/SKILL.md (1)
1646-1726: Same minor inconsistency as Copilot version

The template correctly uses Claude-specific tool abstractions in the HTML comments ("Use the Glob tool to list", "Use the Read tool to read", "Use the Bash tool to run"). However, if there are references to conceptual operation names like "LIST_DIR, READ_FILE, RUN_COMMAND" elsewhere in comments, the same clarification suggested for Copilot applies here.
Otherwise, the template is consistent with the Copilot version and comprehensive.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/specops/SKILL.md` around lines 1646 - 1726, Update the comments under the "Regression Risk Analysis" and "Blast Radius" headings to consistently reference the Claude-specific tool names already used (e.g., "Use the Glob tool", "Use the Read tool", "Use the Bash tool") instead of any conceptual operation names like "LIST_DIR, READ_FILE, RUN_COMMAND"; search the SKILL.md sections (headings "Regression Risk Analysis", "Blast Radius", "Behavior Inventory", and comments shown) and replace any occurrences of the conceptual operation names with the matching Claude tool phrases to keep the template consistent.

platforms/copilot/specops.instructions.md (1)
1404-1484: Minor inconsistency: Abstract operation names in comments

The HTML comments reference abstract operations like "LIST_DIR, READ_FILE, RUN_COMMAND" but the actual instructions use the Copilot-specific syntax: "List the contents of and Read the file at" and "Run the terminal command". Consider removing the abstract operation names from the comments or clarifying that they're conceptual labels.
Example at line 1412:
<!-- Survey what code paths are touched by the affected component(s).
     - LIST_DIR, READ_FILE, RUN_COMMAND (conceptual operations)
     List the contents of and Read the file at to find callers, importers, and dependents.
     Run the terminal command to search for usages if the platform supports code execution. -->

Otherwise, the template is comprehensive and well-structured. The Must-Test-derived Unchanged Behavior approach (lines 1456-1461) provides a clear methodology for regression prevention.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platforms/copilot/specops.instructions.md` around lines 1404 - 1484, The HTML comment inside the "Regression Risk Analysis" section that currently mixes abstract operation names ("LIST_DIR, READ_FILE, RUN_COMMAND") with the Copilot-specific phrasing ("List the contents of and Read the file at", "Run the terminal command") is inconsistent; update the comment block that contains the token LIST_DIR, READ_FILE, RUN_COMMAND so it either removes those abstract labels or clarifies them as conceptual aliases (e.g., add "(conceptual labels corresponding to the Copilot actions below)"), and ensure the surrounding lines that mention "List the contents of and Read the file at" and "Run the terminal command" remain unchanged so readers see the concrete instructions.platforms/codex/SKILL.md (1)
1457-1461: Template comment could acknowledge Low severity escape hatch.

The comment "Drawn from Must-Test behaviors in the Regression Risk Analysis above" implies Must-Test behaviors will always exist. However, Phase 2 instructions (line 70-72) allow Low severity bugs with no caller-visible behavior to note "N/A — isolated change with no caller-visible behavior to preserve" and skip Must-Test gates.
Suggested improvement for the template source (core/templates/bugfix.md or generator template):
<!-- Drawn from Must-Test behaviors in the Regression Risk Analysis above.
     Each item here is a formal commitment backed by discovery, not a guess.
     For Low severity bugs with no Must-Test behaviors identified, note:
     "N/A — isolated change with no caller-visible behavior to preserve"
     Use EARS notation: WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior] -->

Note: Since this is a generated file, any fix should be made to the source template, not this file directly. As per coding guidelines, edit core/ modules or generator/templates/*.j2 instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platforms/codex/SKILL.md` around lines 1457 - 1461, The template comment block starting "<!-- Drawn from Must-Test behaviors in the Regression Risk Analysis above." in the generated SKILL.md is missing guidance for Low severity/escape-hatch cases; update the source template (e.g., core/templates/bugfix.md or the generator template under generator/templates/*.j2) to replace that comment with the suggested paragraph that includes the N/A note: "For Low severity bugs with no Must-Test behaviors identified, note: 'N/A — isolated change with no caller-visible behavior to preserve' Use EARS notation: WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior]". Ensure the change is made in the template source (core/ modules or generator/templates/*.j2) not the generated SKILL.md so future generations include the clarification.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/workflow.md`:
- Around line 87-88: The Phase 3 gate currently only allows specs with
spec.json.status == "approved" to proceed, which blocks solo workflows; update
the check in the Phase 3: Implement logic to treat both "approved" and
"self-approved" as valid statuses, then when advancing set spec.json.status to
"implementing", set specopsUpdatedWith to the instruction file frontmatter
version, update the updated timestamp using RUN_COMMAND(`date -u
+"%Y-%m-%dT%H:%M:%SZ"`), and regenerate index.json as before; locate the
conditional that inspects spec.json.status and expand it to accept
"self-approved" alongside "approved" (preserving existing updates to
specopsUpdatedWith, updated, and index.json).
In `@tests/test_spec_schema.py`:
- Around line 417-434: The check_timestamp_ordering function currently verifies
updated >= created and reviewedAt >= created but misses ensuring updated >=
reviewedAt; update check_timestamp_ordering to, for each reviewer in
spec.get("reviewers", []), when both reviewed_at and updated exist, compare
parse_ts(updated) and parse_ts(reviewed_at) and fail (print and return False) if
updated < reviewed_at so the function enforces updated >= reviewedAt in addition
to the existing checks.
---
Duplicate comments:
In `@core/workflow.md`:
- Around line 55-57: The workflow text currently suggests using READ_FILE on a
directory which is invalid; update the "Blast Radius Survey" and "Behavior
Inventory" guidance to explicitly separate directory listing from file reads by
using LIST_DIR to enumerate the affected component's directory (record the
directory and discovered file paths) and then use READ_FILE only on each
concrete file path found; mention LIST_DIR and READ_FILE by name and clarify
that the Blast Radius subsection should list directories and discovered files,
while the Behavior Inventory step reads each discovered file to enumerate
behaviors.
- Around line 64-66: The spec currently instructs authors to "mark the
regression/coverage acceptance criteria as checked with a note" during Low
severity drafting; change the language in the "Regression Risk Analysis",
"Unchanged Behavior", and "Testing Plan" paragraphs so that acceptance criteria
checkboxes are explicitly left unchecked until Phase 4 (docs check and checkbox
verification). Concretely, replace any phrase that tells the author to mark
checkboxes as checked (e.g., "mark the regression/coverage acceptance criteria
as checked with a note") with a directive to "defer checking acceptance criteria
until Phase 4" and ensure the Testing Plan still records Current Behavior,
Expected Behavior, and Unchanged Behavior but does not mark any verification
boxes until Phase 4.
---
Nitpick comments:
In `@platforms/codex/SKILL.md`:
- Around line 1457-1461: The template comment block starting "<!-- Drawn from
Must-Test behaviors in the Regression Risk Analysis above." in the generated
SKILL.md is missing guidance for Low severity/escape-hatch cases; update the
source template (e.g., core/templates/bugfix.md or the generator template under
generator/templates/*.j2) to replace that comment with the suggested paragraph
that includes the N/A note: "For Low severity bugs with no Must-Test behaviors
identified, note: 'N/A — isolated change with no caller-visible behavior to
preserve' Use EARS notation: WHEN [condition] THE SYSTEM SHALL CONTINUE TO
[existing behavior]". Ensure the change is made in the template source (core/
modules or generator/templates/*.j2) not the generated SKILL.md so future
generations include the clarification.
In `@platforms/copilot/specops.instructions.md`:
- Around line 1404-1484: The HTML comment inside the "Regression Risk Analysis"
section that currently mixes abstract operation names ("LIST_DIR, READ_FILE,
RUN_COMMAND") with the Copilot-specific phrasing ("List the contents of and Read
the file at", "Run the terminal command") is inconsistent; update the comment
block that contains the token LIST_DIR, READ_FILE, RUN_COMMAND so it either
removes those abstract labels or clarifies them as conceptual aliases (e.g., add
"(conceptual labels corresponding to the Copilot actions below)"), and ensure
the surrounding lines that mention "List the contents of and Read the file at"
and "Run the terminal command" remain unchanged so readers see the concrete
instructions.
In `@skills/specops/SKILL.md`:
- Around line 1646-1726: Update the comments under the "Regression Risk
Analysis" and "Blast Radius" headings to consistently reference the
Claude-specific tool names already used (e.g., "Use the Glob tool", "Use the
Read tool", "Use the Bash tool") instead of any conceptual operation names like
"LIST_DIR, READ_FILE, RUN_COMMAND"; search the SKILL.md sections (headings
"Regression Risk Analysis", "Blast Radius", "Behavior Inventory", and comments
shown) and replace any occurrences of the conceptual operation names with the
matching Claude tool phrases to keep the template consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 11e5a481-5b0e-4e1f-97de-e5ffd2b837ea
📒 Files selected for processing (12)
- .specops/bugfix-regression-analysis/spec.json
- .specops/index.json
- CHECKSUMS.sha256
- core/data-handling.md
- core/review-workflow.md
- core/workflow.md
- platforms/claude/SKILL.md
- platforms/codex/SKILL.md
- platforms/copilot/specops.instructions.md
- platforms/cursor/specops.mdc
- skills/specops/SKILL.md
- tests/test_spec_schema.py
🚧 Files skipped from review as they are similar to previous changes (1)
- .specops/bugfix-regression-analysis/spec.json
…rage
- Drop leading "use" before abstract operations (RUN_COMMAND) to prevent "use Use the Bash tool..." in generated output
- Split LIST_DIR and READ_FILE in Blast Radius Survey for correct rendering
- Stop pre-checking acceptance criteria during Phase 2 (defer to Phase 4)
- Allow self-approved specs through Phase 3 implementation gate
- Add severity-conditional guidance to bugfix template acceptance criteria
- Add updated >= reviewedAt assertion to timestamp ordering tests
Actionable comments posted: 2
♻️ Duplicate comments (2)
core/workflow.md (1)
55-64: ⚠️ Potential issue | 🟠 Major — Keep Scope Escalation in the medium/low paths.
Step 5 only exists in the Critical/High branch right now, so Medium and Low bugs can skip the one check that stops a disguised feature request from flowing through the bugfix path.
♻️ Suggested change
- **Medium severity:** Complete steps 1 (Blast Radius) and 2 (Behavior Inventory). Brief Risk Tier table. Skip detailed coverage check unless the codebase has obvious test gaps.
+ **Medium severity:** Complete steps 1 (Blast Radius) and 2 (Behavior Inventory). Brief Risk Tier table. Perform a brief Scope Escalation Check. Skip detailed coverage check unless the codebase has obvious test gaps.
- **Low severity:** Brief step 1 only. If the blast radius is clearly one isolated function with no callers in critical paths, note "minimal regression risk — isolated change". Also record at least one caller-visible behavior to preserve and classify it in a lightweight Risk Tier entry, or note "No caller-visible unchanged behavior — isolated internal fix" which explicitly skips Must-Test-derived unchanged-behavior gates for this spec.
+ **Low severity:** Brief step 1 only. If the blast radius is clearly one isolated function with no callers in critical paths, note "minimal regression risk — isolated change". Also record at least one caller-visible behavior to preserve and classify it in a lightweight Risk Tier entry, or note "No caller-visible unchanged behavior — isolated internal fix" which explicitly skips Must-Test-derived unchanged-behavior gates for this spec. Still perform a brief Scope Escalation Check: if the issue is actually a missing feature or needs shared new abstractions, stop and create a Feature Spec instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/workflow.md` around lines 55 - 64, The Medium and Low severity branches currently omit the "Scope Escalation" check (Step 5); update the workflow text to include a scaled Scope Escalation evaluation for Medium and Low paths: add a short Step 5 under both "Medium severity" and "Low severity" that instructs reviewers to flag "Scope escalation needed" and create a Feature Spec if the bugfix requires new abstractions or API changes, while allowing the bugfix spec to proceed for contained fixes; reference the existing "Scope Escalation" wording from the Critical/High branch and reuse its key phrase "Scope escalation needed" so reviewers use consistent signals.

core/templates/bugfix.md (1)
79-108: ⚠️ Potential issue | 🟠 Major — Allow an explicit N/A path when no Must-Test behaviors exist.
The workflow now allows low-severity isolated fixes to end with no caller-visible unchanged behavior, but this template still only offers concrete Unchanged Behavior bullets plus default regression/coverage checkboxes. That pushes authors to invent preservation requirements instead of documenting the intended no-op path.
♻️ Suggested change
  ## Unchanged Behavior
  <!-- Drawn from Must-Test behaviors in the Regression Risk Analysis above.
       Each item here is a formal commitment backed by discovery, not a guess.
-      Use EARS notation: WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior] -->
+      Use EARS notation: WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior]
+      If no Must-Test behaviors were identified for a low-severity isolated fix, replace this section with:
+      N/A — isolated change with no caller-visible behavior to preserve. -->
  - WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior that must be preserved]
  - WHEN [condition] THE SYSTEM SHALL CONTINUE TO [existing behavior that must be preserved]
@@
  ### Unchanged Behavior (verify no regressions)
  <!-- Must-Test behaviors from Regression Risk Analysis. Nice-To-Test items are optional.
-      Gap behaviors (no existing test) from Coverage Assessment must have new tests here. -->
+      Gap behaviors (no existing test) from Coverage Assessment must have new tests here.
+      If no Must-Test behaviors were identified, this subsection may be omitted or replaced with the same N/A note. -->
  - WHEN [related condition] THE SYSTEM SHALL CONTINUE TO [preserved behavior]
@@
  - [ ] No regressions (all Must-Test Unchanged Behavior tests pass)
  - [ ] Test coverage gaps from Coverage Assessment addressed
+ <!-- Omit the regression checkbox when no Must-Test behaviors were identified.
+      Omit the coverage-gap checkbox when Coverage Assessment was intentionally skipped. -->

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/templates/bugfix.md` around lines 79 - 108, The template's "Unchanged Behavior" section and Acceptance Criteria force authors to invent preservation requirements; update the "Unchanged Behavior" subsection under Testing Plan (and the related bullet under the top-level Unchanged Behavior heading) to include an explicit "N/A — no Must-Test behaviors" option and add guidance text so authors can select that when there are none, and update the Acceptance Criteria checklist to allow skipping the "No regressions" checkbox (or mark it as conditional) when "N/A" is chosen; look for the headings "## Unchanged Behavior", "### Unchanged Behavior (verify no regressions)" and the "Acceptance Criteria" checklist in bugfix.md to implement this change.
🧹 Nitpick comments (1)
platforms/copilot/specops.instructions.md (1)
412-420: Timestamp Protocol correctly implemented.

The protocol mandates system clock timestamps via `date -u +"%Y-%m-%dT%H:%M:%SZ"` and is consistently applied across all spec.json mutation sites (Phase 3, Phase 4, reviews, revisions, status changes). This prevents timestamp fabrication and ensures ordering correctness.

💡 Optional: Document the timestamp ordering invariant

The Data Handling section (line 2061) mentions the invariant `updated` >= `created`, but the Timestamp Protocol section could also document this explicitly:

  ### Timestamp Protocol
  All timestamps in `spec.json` (`created`, `updated`, `reviewedAt`) must come from the system clock. Never estimate or fabricate timestamps. To get the current UTC timestamp: Run the terminal command(`date -u +"%Y-%m-%dT%H:%M:%SZ"`) Use this command's output wherever a timestamp is needed.
+
+ **Invariant**: `updated` must always be >= `created`. This is enforced by tests and must be preserved across all spec.json updates.

This would make the invariant more discoverable for maintainers working in this section.

Based on learnings: "spec.json metadata timestamps (created/updated) must always be set using system-clock timestamps via `date -u`. A Timestamp Protocol is enforced at all update sites, and ordering tests (created <= updated) validate correctness."
Verify each finding against the current code and only fix it if needed. In `@platforms/copilot/specops.instructions.md` around lines 412 - 420, Add an explicit statement to the "Timestamp Protocol" section that enforces the ordering invariant by requiring updated >= created (and reviewedAt >= created when present); update the "Timestamp Protocol" text near the existing guidance (the Timestamp Protocol heading) to state this invariant and reference the same system-clock command (`date -u +"%Y-%m-%dT%H:%M:%SZ"`) as the canonical source for both created and updated timestamps, and optionally add a short cross-reference to the "Data Handling" section where the invariant is already mentioned to make it more discoverable.
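The protocol and its invariant can be exercised together in a short Python sketch (assuming a POSIX environment where `date` is available, as the protocol's command implies):

```python
import subprocess
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"


def now_utc() -> str:
    # Read the system clock via `date -u`, exactly as the Timestamp Protocol
    # requires, rather than using an estimated or fabricated value.
    result = subprocess.run(
        ["date", "-u", "+%Y-%m-%dT%H:%M:%SZ"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


created = now_utc()
updated = now_utc()

# The ordering invariant the review asks to document: updated >= created.
assert datetime.strptime(updated, FMT) >= datetime.strptime(created, FMT)
print(updated.endswith("Z"))  # True
```

Because both values come from the same monotonically advancing system clock, the invariant holds by construction; fabricated timestamps are exactly what breaks it.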
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@platforms/codex/SKILL.md`:
- Around line 1406-1409: Update the template header comment that currently reads
"Low → brief Blast Radius scan only" (the "Depth scales with Severity..." block)
to require that low-severity bugfixes still include a Risk Tier outcome or an
explicit "no caller-visible unchanged behavior" note; specifically, change the
low-severity line to something like "Low → brief Blast Radius scan only; must
record Risk Tier or explicit 'no caller-visible unchanged behavior' note" so
later sections that reference caller-visible behavior or Risk Tier always have
the required data.
In `@tests/test_spec_schema.py`:
- Around line 443-444: The loop in tests/test_spec_schema.py currently uses
sorted(glob.glob(".specops/*/spec.json")) which only matches a single directory
level; update the glob to recursively find all nested dogfood specs so the
timestamp check runs on every .specops/**/spec.json. Modify the expression
inside the sorted(...) call (or replace with
pathlib.Path(".specops").rglob("spec.json")) used in the for spec_file in ...
loop that calls check(check_timestamp_ordering(spec_file)) so it uses a
recursive pattern (e.g., glob(..., recursive=True) with ".specops/**/spec.json")
ensuring check_timestamp_ordering runs for all nested spec.json files.
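The shallow-versus-recursive difference the prompt describes can be demonstrated with a throwaway tree (directory names here are hypothetical):

```python
import tempfile
from pathlib import Path

# Hypothetical .specops tree with one top-level and one nested spec.json.
root = Path(tempfile.mkdtemp()) / ".specops"
(root / "feature-a").mkdir(parents=True)
(root / "feature-a" / "spec.json").write_text("{}")
(root / "archive" / "old" / "feature-b").mkdir(parents=True)
(root / "archive" / "old" / "feature-b" / "spec.json").write_text("{}")

# Single-level glob (the pattern flagged in the review) misses the nested spec.
shallow = sorted(root.glob("*/spec.json"))

# Recursive discovery picks up every spec.json, however deeply nested.
deep = sorted(root.rglob("spec.json"))

print(len(shallow), len(deep))  # 1 2
```

`Path.rglob("spec.json")` is equivalent to `glob.glob(".specops/**/spec.json", recursive=True)`; either form ensures the timestamp check visits every dogfood spec.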
---
Duplicate comments:
In `@core/templates/bugfix.md`:
- Around line 79-108: The template's "Unchanged Behavior" section and Acceptance
Criteria force authors to invent preservation requirements; update the
"Unchanged Behavior" subsection under Testing Plan (and the related bullet under
the top-level Unchanged Behavior heading) to include an explicit "N/A — no
Must-Test behaviors" option and add guidance text so authors can select that
when there are none, and update the Acceptance Criteria checklist to allow
skipping the "No regressions" checkbox (or mark it as conditional) when "N/A" is
chosen; look for the headings "## Unchanged Behavior", "### Unchanged Behavior
(verify no regressions)" and the "Acceptance Criteria" checklist in bugfix.md to
implement this change.
In `@core/workflow.md`:
- Around line 55-64: The Medium and Low severity branches currently omit the
"Scope Escalation" check (Step 5); update the workflow text to include a scaled
Scope Escalation evaluation for Medium and Low paths: add a short Step 5 under
both "Medium severity" and "Low severity" that instructs reviewers to flag
"Scope escalation needed" and create a Feature Spec if the bugfix requires new
abstractions or API changes, while allowing the bugfix spec to proceed for
contained fixes; reference the existing "Scope Escalation" wording from the
Critical/High branch and reuse its key phrase "Scope escalation needed" so
reviewers use consistent signals.
---
Nitpick comments:
In `@platforms/copilot/specops.instructions.md`:
- Around line 412-420: Add an explicit statement to the "Timestamp Protocol"
section that enforces the ordering invariant by requiring updated >= created
(and reviewedAt >= created when present); update the "Timestamp Protocol" text
near the existing guidance (the Timestamp Protocol heading) to state this
invariant and reference the same system-clock command (`date -u
+"%Y-%m-%dT%H:%M:%SZ"`) as the canonical source for both created and updated
timestamps, and optionally add a short cross-reference to the "Data Handling"
section where the invariant is already mentioned to make it more discoverable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6bd60491-7feb-481a-9719-1a30e13bff3d
📒 Files selected for processing (9)
- CHECKSUMS.sha256
- core/templates/bugfix.md
- core/workflow.md
- platforms/claude/SKILL.md
- platforms/codex/SKILL.md
- platforms/copilot/specops.instructions.md
- platforms/cursor/specops.mdc
- skills/specops/SKILL.md
- tests/test_spec_schema.py
…rsive glob for dogfood specs
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
core/workflow.md
Outdated

**For bugfix specs:** After completing Root Cause Analysis and Impact Assessment, conduct Regression Risk Analysis before writing the Proposed Fix. The analysis depth scales with the Severity field from Impact Assessment:

**Critical or High severity:**
1. **Blast Radius Survey** — LIST_DIR the affected component's directory. Then READ_FILE the specific source files, callers, and entry points discovered in that scan. Identify every module, function, or API that imports or calls the affected code. If RUN_COMMAND is available, search for usages across the codebase. Record each entry point in the Blast Radius subsection.
The phrase "If RUN_COMMAND is available" uses the abstract operation token as a noun (capability reference), but the tool substitution system replaces it with a verb phrase. After generation, this produces broken text across all platforms:
- Claude: "If Use the Bash tool to run is available"
- Cursor/Copilot: "If Run the terminal command is available"
- Codex: "If Execute the command is available"
The fix is to rephrase this to avoid embedding the abstract operation in a noun position. For example, use "If the platform supports code execution" (which is already the phrasing used in the bugfix template's Blast Radius comment at core/templates/bugfix.md:37) or reference the capability flag directly, e.g., "If code execution is available (canExecuteCode)".
Suggested change:

- 1. **Blast Radius Survey** — LIST_DIR the affected component's directory. Then READ_FILE the specific source files, callers, and entry points discovered in that scan. Identify every module, function, or API that imports or calls the affected code. If RUN_COMMAND is available, search for usages across the codebase. Record each entry point in the Blast Radius subsection.
+ 1. **Blast Radius Survey** — LIST_DIR the affected component's directory. Then READ_FILE the specific source files, callers, and entry points discovered in that scan. Identify every module, function, or API that imports or calls the affected code. If the platform supports code execution, search for usages across the codebase. Record each entry point in the Blast Radius subsection.
Actionable comments posted: 4
🧹 Nitpick comments (2)
platforms/codex/SKILL.md (1)

1477-1480: Same minor clarity issue as the skills/specops version. The Acceptance Criteria comment uses the same conditional phrasing as the skills/specops file. Recommend the same fix at the source level (likely core/templates/bugfix.md or generator/templates/codex.j2):

Low → keep criteria 1-3; keep 4 only if Must-Test items identified; omit 5 (no Coverage Assessment)

As per coding guidelines: Edit core/ modules or generator/templates/*.j2 instead of platform output files, then regenerate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@platforms/codex/SKILL.md` around lines 1477-1480: The Acceptance Criteria wording in the generated codex output uses ambiguous conditional phrasing; update the source template (likely core/templates/bugfix.md or generator/templates/codex.j2) to replace the "Low → keep criteria 1-3; omit 4 if no Must-Test items; omit 5 (no Coverage Assessment)" line with the clarified phrasing: "Low → keep criteria 1-3; keep 4 only if Must-Test items identified; omit 5 (no Coverage Assessment)". Edit the template that defines the Acceptance Criteria block, run the generator to regenerate platform outputs (including platforms/codex/SKILL.md and skills/specops), and verify the updated sentence appears in the produced files.

skills/specops/SKILL.md (1)
1718-1721: Minor: Acceptance Criteria comment could be more explicit for Low severity. The comment says "omit 4 if no Must-Test items" for Low severity, but doesn't explicitly state the inverse: "keep 4 if Must-Test items exist." The logic is correct, but the conditional phrasing could be clearer.

Since this is a generated file, recommend updating the source template (likely core/templates/bugfix.md) to:

Low → keep criteria 1-3; keep 4 only if Must-Test items identified; omit 5 (no Coverage Assessment)

As per coding guidelines: Never edit generated platform output files directly (SKILL.md, specops.mdc, specops.instructions.md). Edit core/ modules or generator/templates/*.j2 instead, then regenerate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@skills/specops/SKILL.md` around lines 1718 - 1721, Update the source template that generates SKILL.md (the bugfix template, e.g., bugfix.md in your templates) to change the Low-severity acceptance criteria wording to: "Low → keep criteria 1-3; keep 4 only if Must-Test items identified; omit 5 (no Coverage Assessment)". Do not edit the generated SKILL.md directly—modify the generator/template file (core/templates/bugfix.md or generator/templates/*.j2), apply the new sentence in place of the existing Low severity line, and then regenerate the output so the change propagates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/templates/bugfix.md`:
- Around line 100-105: The checklist line "Regression Risk Analysis completed
(blast radius surveyed, behaviors inventoried)" is inconsistent with the
Low-severity flow; update the checklist text in core/templates/bugfix.md by
removing or making optional the "behaviors inventoried" fragment so the
Regression Risk Analysis checkbox reads something like "Regression Risk Analysis
completed (blast radius surveyed)" and ensure the overall checklist follows the
Low-severity rules (keep criteria 1-3, omit criterion 4 if no Must-Test items
and omit criterion 5 entirely). Target the exact string "Regression Risk
Analysis completed (blast radius surveyed, behaviors inventoried)" when editing
so the template no longer forces the Behavior Inventory for Low-severity bugs.
In `@platforms/claude/SKILL.md`:
- Line 95: Update the Phase 3 gate text to match the Implementation Gate
override behavior: when checking spec.json status in the Phase 3 paragraph (the
implementation gate), explicitly state that implementation normally proceeds
only if spec.json.status is "approved" or "self-approved" but that an
interactive override is allowed for unapproved specs via the Implementation Gate
flow; also call out security-sensitive files require extra scrutiny; ensure you
reference updating spec.json.status to "implementing", setting
specopsUpdatedWith to the instruction file frontmatter version, updating updated
timestamp (use Bash date -u +"%Y-%m-%dT%H:%M:%SZ"), and regenerating index.json
so the Phase 3 description is consistent with the later Implementation Gate
module.
- Around line 1646-1693: The SKILL.md output still includes full "Test Coverage
Assessment" and "Scope Escalation Check" sections for every bugfix; update the
bugfix template used by the generator (edit generator/templates/bugfix.md.j2 or
the core module that produces bugfix content) to conditionally omit or collapse
those two sections when severity is Medium or Low (i.e., add severity checks in
the Jinja template to render those sections only for Critical/High, or render a
brief note for Medium/Low), then run python3 generator/generate.py --all to
regenerate SKILL.md so the changes propagate (do not edit SKILL.md directly).
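The severity gate suggested above could be sketched in the Jinja source roughly like this (a sketch only: the `severity` variable name, section contents, and Medium-severity wording are assumptions, not the repo's actual template):

```jinja
{% if severity in ("Critical", "High") %}
### Test Coverage Assessment
<!-- full coverage table rendered here -->

### Scope Escalation Check
<!-- full escalation checklist rendered here -->
{% elif severity == "Medium" %}
> Test Coverage Assessment and Scope Escalation Check are omitted for Medium
> severity; note any Must-Test gaps briefly under Risk Tier instead.
{% endif %}
```

After editing the template, `python3 generator/generate.py --all` would propagate the change to every platform output.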
In `@tests/test_spec_schema.py`:
- Around line 417-438: The check_timestamp_ordering function currently lets
parse_ts raise ValueError and abort the run; wrap calls to parse_ts for created,
updated, and reviewedAt in try/except ValueError blocks (or a small helper) so
malformed timestamps result in printing a FAIL for that spec_path with the
offending field/value (e.g., "FAIL: {spec_path} - invalid timestamp in created:
{created}") and returning False instead of propagating the exception; ensure
every parse_ts usage inside check_timestamp_ordering (for created, updated, and
reviewed_at) is guarded so the function records failures and continues rather
than crashing the test suite.
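The guarded pattern the prompt describes can be sketched as follows (a minimal sketch: `parse_ts`, the field names, and the spec dict shape are assumptions based on the review comment, not the repo's actual test code):

```python
from datetime import datetime

def parse_ts(value):
    # Strict ISO-8601 UTC, e.g. "2025-01-02T03:04:05Z"; raises ValueError if malformed.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")

def check_timestamp_ordering(spec_path, spec):
    """Return False on any malformed or out-of-order timestamp instead of raising."""
    parsed = {}
    for field in ("created", "updated", "reviewedAt"):
        value = spec.get(field)
        if value is None:
            continue  # optional field absent; nothing to check
        try:
            parsed[field] = parse_ts(value)
        except ValueError:
            # Record the failure and keep the test suite running.
            print(f"FAIL: {spec_path} - invalid timestamp in {field}: {value}")
            return False
    if "created" in parsed and "updated" in parsed and parsed["updated"] < parsed["created"]:
        print(f"FAIL: {spec_path} - updated precedes created")
        return False
    return True
```

With this shape, a malformed `created` prints a FAIL line for that spec and returns False, so the harness moves on to the next spec rather than aborting.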
---
Nitpick comments:
In `@platforms/codex/SKILL.md`:
- Around line 1477-1480: The Acceptance Criteria wording in the generated codex
output is using ambiguous conditional phrasing; update the source template
(likely core/templates/bugfix.md or generator/templates/codex.j2) to replace the
"Low → keep criteria 1-3; omit 4 if no Must-Test items; omit 5 (no Coverage
Assessment)" line with the clarified phrasing: "Low → keep criteria 1-3; keep 4
only if Must-Test items identified; omit 5 (no Coverage Assessment)". Edit the
template that defines the Acceptance Criteria block, run the generator to
regenerate platform outputs (including platforms/codex/SKILL.md and
skills/specops) and verify the updated sentence appears in the produced files.
In `@skills/specops/SKILL.md`:
- Around line 1718-1721: Update the source template that generates SKILL.md (the
bugfix template, e.g., bugfix.md in your templates) to change the Low-severity
acceptance criteria wording to: "Low → keep criteria 1-3; keep 4 only if
Must-Test items identified; omit 5 (no Coverage Assessment)". Do not edit the
generated SKILL.md directly—modify the generator/template file
(core/templates/bugfix.md or generator/templates/*.j2), apply the new sentence
in place of the existing Low severity line, and then regenerate the output so
the change propagates.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 61e57e09-bcb0-4ba4-8a7f-851791070102
📒 Files selected for processing (8)

- CHECKSUMS.sha256
- core/templates/bugfix.md
- platforms/claude/SKILL.md
- platforms/codex/SKILL.md
- platforms/copilot/specops.instructions.md
- platforms/cursor/specops.mdc
- skills/specops/SKILL.md
- tests/test_spec_schema.py
… Phase 3 gate cross-reference, and handle malformed timestamps
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
- Adds a Regression Risk Analysis section to the bugfix template (core/templates/bugfix.md) with five severity-scaled subsections: Blast Radius, Behavior Inventory, Test Coverage Assessment, Risk Tier, and Scope Escalation Check
- Enhances Phase 2 workflow guidance (core/workflow.md) with concrete discovery steps using abstract operations (LIST_DIR, READ_FILE, RUN_COMMAND), tiered by bug severity (Critical/High, Medium, Low)
- Adds `REGRESSION_MARKERS` validation to generator/validate.py, ensuring regression analysis content survives platform generation across all 4 platforms
- Adds dogfood spec at .specops/bugfix-regression-analysis/ (requirements with EARS notation, design, tasks, implementation journal)

Changes
- core/templates/bugfix.md
- core/workflow.md
- generator/validate.py
- platforms/*/, skills/, .claude-plugin/
- CHECKSUMS.sha256
- .specops/bugfix-regression-analysis/*
- .specops/index.json
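The `REGRESSION_MARKERS` check could look roughly like this (a sketch under assumptions: the marker strings and function shape are illustrative, not the actual generator/validate.py code):

```python
from pathlib import Path

# Phrases that must survive generation into every platform output (assumed list).
REGRESSION_MARKERS = [
    "Regression Risk Analysis",
    "Blast Radius",
    "Behavior Inventory",
    "Risk Tier",
    "Scope Escalation Check",
]

def check_regression_markers(platform_files):
    """Return (file, marker) pairs for any marker missing from any generated output."""
    missing = []
    for path in platform_files:
        text = Path(path).read_text(encoding="utf-8")
        for marker in REGRESSION_MARKERS:
            if marker not in text:
                missing.append((str(path), marker))
    return missing
```

Run against all four platform outputs, an empty result means the regression-analysis content made it through generation intact.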
python3 generator/validate.pypasses (all platforms, including new regression markers)shasum -a 256 -c CHECKSUMS.sha256passespython3 generator/generate.py --all && git diff --exit-code platforms/ skills/ .claude-plugin/— generated files are freshpython3 tests/check_schema_sync.pypassesbash scripts/run-tests.sh— 7/7 tests passSummary by CodeRabbit
New Features
Validation
Documentation
Tests