feat: add refining-plan skill for iterative plan pressure-testing #622

Closed
404pilo wants to merge 16 commits into obra:main from 404pilo:feat/refine-plan-skill

Conversation

@404pilo 404pilo commented Mar 5, 2026

Summary

  • Adds refining-plan skill that iteratively pressure-tests implementation plans before execution (simulate → fix → converge loop)
  • Dispatches plan-simulator and plan-fixer subagents with domain-detected role profiles
  • Integrates into writing-plans execution handoff as a recommended first step
  • Includes convergence detection, escalation logic, and structured reporting

New Files

  • skills/refining-plan/SKILL.md — Main skill with checklist, flowcharts, and convergence rules
  • skills/refining-plan/plan-simulator-prompt.md — Prompt template for simulation subagent
  • skills/refining-plan/plan-fixer-prompt.md — Prompt template for fixer subagent
  • commands/refining-plan.md — /refining-plan command entry point

Modified Files

  • skills/writing-plans/SKILL.md — Updated plan header template and execution handoff to offer refinement as option 1

Design

The skill follows a simulate → evaluate → fix → check convergence loop (max 5 iterations). It detects the plan's domain to generate contextual role profiles for subagents. Convergence is checked via diminishing returns, recurring criticals, and drift detection. The controller provides full plan text to subagents (no file reading overhead).
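
The loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the skill's implementation (the skill itself is a Markdown prompt): `simulate` and `fix` are hypothetical stand-ins for the plan-simulator and plan-fixer subagent dispatches, and the `Finding` type and the `MAX_ITERATIONS` label are invented for the example. Only the 5-iteration cap and the CONVERGED signal come from the PR text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Finding:
    severity: str  # "critical", "important", or "minor"
    concern: str


# Hypothetical stand-ins for the two subagent dispatches.
Simulate = Callable[[str], List[Finding]]
Fix = Callable[[str, List[Finding]], str]


def refine_plan(plan_text: str, simulate: Simulate, fix: Fix,
                max_iterations: int = 5) -> Tuple[str, str, int]:
    """Return (final_plan_text, signal, rounds_run)."""
    for round_no in range(1, max_iterations + 1):
        findings = simulate(plan_text)
        # Only critical and important findings are handed to the fixer.
        actionable = [f for f in findings
                      if f.severity in ("critical", "important")]
        if not actionable:
            return plan_text, "CONVERGED", round_no
        plan_text = fix(plan_text, actionable)
    # "MAX_ITERATIONS" is an assumed label for running out of rounds.
    return plan_text, "MAX_ITERATIONS", max_iterations


# Toy subagents: any remaining "TODO" counts as a critical finding,
# and each fix pass resolves one of them.
def toy_simulate(plan: str) -> List[Finding]:
    return [Finding("critical", "unresolved TODO")] if "TODO" in plan else []


def toy_fix(plan: str, findings: List[Finding]) -> str:
    return plan.replace("TODO", "done", 1)


result = refine_plan("step 1: TODO\nstep 2: TODO", toy_simulate, toy_fix)
```

With the toy subagents, two fix passes are needed, so the third simulation finds nothing left and the loop stops there. The escalation checks (recurring criticals, drift) are left out of this sketch.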

Test plan

  • /refining-plan loads skill and announces usage
  • Skill detects domain from plan content correctly
  • Plan-simulator subagent receives full plan text and returns structured findings
  • Plan-fixer subagent applies minimal edits for critical/important findings only
  • Loop converges when no critical/important findings remain
  • Loop escalates when same critical persists 3+ rounds
  • Report shows round-by-round table with signal column
  • Execution handoff offers 3 options (subagent-driven, parallel, refine again)
  • /writing-plans now shows refine-plan as option 1 in handoff
  • Plan header template includes refining-plan reference

🤖 Generated with Claude Code

404pilo and others added 10 commits March 5, 2026 07:24
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Plan file is now auto-detected from conversation context or most recent
file in docs/plans/ — only asks user as last resort. Max iterations
defaults to 5 without prompting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
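
The fallback to the most recent file in docs/plans/ might look like the sketch below. It assumes "most recent" means latest modification time and that plans are Markdown files; the commit message states neither, and `latest_plan` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def latest_plan(plans_dir: Path = Path("docs/plans")) -> Optional[Path]:
    """Return the most recently modified .md file in plans_dir, or None."""
    if not plans_dir.is_dir():
        return None
    candidates = [p for p in plans_dir.glob("*.md") if p.is_file()]
    if not candidates:
        # Nothing to auto-detect; the caller would fall back to asking the user.
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

A `None` result corresponds to the "asks user as last resort" case in the commit message.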
- Skill: refine-plan → refining-plan (matches gerund convention)
- Directory: skills/refine-plan/ → skills/refining-plan/
- Command: commands/refine-plan.md → commands/refining-plan.md
- Roles: simulator → plan-simulator, fixer → plan-fixer
- Prompt files: simulator-prompt.md → plan-simulator-prompt.md, etc.
- Remove novel "Context Detection" section (no other skill has this)
- Remove novel "never ask user" red flags (no other skill has these)
- Update all cross-references in writing-plans

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Mar 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new refining-plans skill with simulator and fixer subagent prompts, a refine-plans command, and updates the writing-plans skill to include a “Refine first” execution option plus clarified post-save execution workflows.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Writing-Plans Skill: skills/writing-plans/SKILL.md | Reworked post-save execution options from two to three: added “Refine first” (requires refining-sub-skill), renamed/clarified existing options to explicitly skip refinement and state required sub-skills. |
| Refining-Plans Skill Doc: skills/refining-plans/SKILL.md | New skill doc describing a three-phase refining workflow (Domain Detection, Iteration Loop: simulate+fix, Report & Handoff), iteration/convergence rules, telemetry, signals (CONVERGED/ESCALATE), post-refinement choices, and red-flag rules. |
| Subagent Prompts: skills/refining-plans/plan-simulator-prompt.md, skills/refining-plans/plan-fixer-prompt.md | New plan-simulator and plan-fixer subagent prompt templates: simulator defines lenses, DO/DO NOT, severity levels and findings report; fixer defines minimal, non-restructuring patch workflow and FIXED/skipped report format. |
| Command Fragment: commands/refine-plans.md | New command fragment (YAML front matter) to invoke the refining-plans skill for pressure-testing plans; includes metadata (disable-model-invocation: true) and imperative to call the refining-plans skill. |

Sequence Diagram

sequenceDiagram
    actor User
    participant WritingPlan as Writing-Plans Skill
    participant RefineCmd as Refine-Plans Command
    participant RefineSkill as Refining-Plans Skill
    participant Simulator as Plan Simulator
    participant Fixer as Plan Fixer
    participant Report as Refinement Report

    User->>WritingPlan: Request plan execution
    WritingPlan->>WritingPlan: Select execution option
    alt Refine first
        WritingPlan->>RefineCmd: Invoke refine-plans
        RefineCmd->>RefineSkill: Start refining workflow
        RefineSkill->>Simulator: Phase 1: detect domain & simulate
        Simulator->>Simulator: Phase 2: simulate across lenses, produce findings
        Simulator-->>Report: Emit findings (Critical/Important/Minor)
        RefineSkill->>Fixer: Dispatch targeted fixes with findings
        Fixer->>Fixer: Apply minimal fixes, document changes
        Fixer-->>Report: Return FIXED/Skipped entries
        RefineSkill->>RefineSkill: Evaluate convergence / escalate if needed
        RefineSkill-->>Report: Phase 3: final report & handoff
        Report-->>WritingPlan: Deliver refined plan
        WritingPlan->>User: Proceed with execution (refined)
    else Subagent-Driven (skip refinement)
        WritingPlan->>User: Spawn fresh subagent per task (no refinement)
    else Parallel Session (skip refinement)
        WritingPlan->>User: Batch execution in separate session with checkpoints
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I nudge the plan with twitching nose,
I simulate where trouble grows,
I stitch the gaps with careful hops,
Mark fixes, skips, and tidy stops,
Then hand it off — we’re ready to go.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main feature: adding a new refining-plan skill for iterative plan pressure-testing, which is the primary focus of this pull request. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The pull request description clearly outlines the new refining-plan skill, its integration into writing-plans, specific file changes, design approach, and a detailed test plan with checkboxes. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
skills/refining-plan/plan-fixer-prompt.md (1)

46-52: Preserve severity + requirement in fixer report for round-to-round traceability.

Current changes/skipped entries only keep concern text. Adding severity and requirement fields makes audits and escalation checks more reliable.

Proposed report-format refinement
     changes:
-      - finding: [original concern]
+      - severity: [critical|important]
+        requirement: [exact text from plan]
+        finding: [original concern]
         action: [what was changed]
         location: [which section]
     skipped:
-      - finding: [concern]
+      - severity: [critical|important|minor]
+        requirement: [exact text from plan]
+        finding: [concern]
         reason: [why skipped — e.g., "minor severity", "conflicts with plan intent"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/refining-plan/plan-fixer-prompt.md` around lines 46 - 52, Update the
report template under the YAML blocks "changes" and "skipped" to include
severity and requirement fields for each entry; specifically augment each
"finding" object (used in the changes list with keys finding/action/location) to
also include "severity" and "requirement", and do the same for skipped entries
(keys finding/reason) so they become finding/reason/severity/requirement —
modify the plan-fixer prompt template where the "changes:" and "skipped:"
examples are defined to add these fields and ensure any code that renders or
validates these entries expects and preserves "severity" and "requirement".
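
A check for the extended report format could be sketched as below. It assumes the fixer's YAML report has already been parsed into a plain dict (parsing is not shown), and `missing_fields` and `REQUIRED_KEYS` are hypothetical names; the key sets are taken from the proposed format above.

```python
from typing import Dict, List

# Keys each entry should carry once severity and requirement are added.
REQUIRED_KEYS = {
    "changes": {"severity", "requirement", "finding", "action", "location"},
    "skipped": {"severity", "requirement", "finding", "reason"},
}


def missing_fields(report: Dict[str, List[dict]]) -> List[str]:
    """List every report entry that lacks one of the required keys."""
    problems = []
    for section, required in REQUIRED_KEYS.items():
        for index, entry in enumerate(report.get(section, [])):
            for key in sorted(required - set(entry)):
                problems.append(f"{section}[{index}]: missing {key}")
    return problems


# An old-style "changes" entry next to a complete "skipped" entry.
report = {
    "changes": [{"finding": "no rollback step",
                 "action": "added rollback",
                 "location": "Task 3"}],
    "skipped": [{"severity": "minor",
                 "requirement": "log output",
                 "finding": "log wording unclear",
                 "reason": "minor severity"}],
}
problems = missing_fields(report)
```

Here `problems` flags the first entry for lacking `requirement` and `severity`, which is the traceability gap the comment describes.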
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/refining-plan/plan-simulator-prompt.md`:
- Around line 5-72: The opening fenced code block that contains "Task tool
(general-purpose):" is missing a language identifier (MD040); update the opening
``` to ```text (and keep the closing ``` unchanged) so the block is explicitly
marked as text—locate the triple-backtick fence that wraps the plan prompt and
add the language tag to satisfy markdownlint.

In `@skills/refining-plan/SKILL.md`:
- Around line 148-158: The listed execution options in the refining-plan SKILL
("Subagent-Driven (this session)", "Parallel Session (separate)", "Refine
again") are numbered differently than the mapping used in the writing-plans
skill where option 1 is “Refine first”; update the numbering/order in this
SKILL.md so it matches the sequence in skills/writing-plans/SKILL.md (make
“Refine again” the first option, then the batch/parallel option, then
subagent-driven), and ensure the REQUIRED SUB-SKILL lines
(superpowers:refining/plans vs superpowers:executing-plans vs
superpowers:subagent-driven-development) are reordered to match each option
label so the option numbers and required sub-skill mappings are consistent
across both skills.
- Around line 133-144: The fenced code block starting at the "## Plan Refinement
Complete" header lacks a language tag (MD040); update the opening fence from ```
to ```markdown so the block is explicitly marked as Markdown (i.e., change the
fenced block that contains "**Plan:** {plan_path}" etc. to start with
```markdown) and keep the closing ``` as-is to satisfy the linter.
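
The MD040 condition in the two fence comments above can be approximated with a short script. This is a simplified sketch, not how markdownlint implements the rule: it only tracks backtick fences, treats every fence line as toggling open/closed, and ignores tilde fences and nesting. `untagged_fences` is a hypothetical helper name.

```python
from typing import List

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def untagged_fences(markdown: str) -> List[int]:
    """Return 1-based line numbers of opening fences with no language tag."""
    hits = []
    inside = False
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if inside:
            inside = False  # this fence line closes the current block
            continue
        inside = True
        info = stripped.lstrip("`").strip()  # text after the backticks
        if not info:
            hits.append(lineno)
    return hits


sample = "\n".join([
    "# Prompt",
    FENCE,
    "Task tool (general-purpose):",
    FENCE,
    "",
    FENCE + "markdown",
    "## Plan Refinement Complete",
    FENCE,
])
flagged = untagged_fences(sample)
```

In the sample, only the first block is flagged; the second already carries a `markdown` tag.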

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0c028e54-01a2-4c68-93e0-843877baca19

📥 Commits

Reviewing files that changed from the base of the PR and between e4a2375 and cef42ad.

📒 Files selected for processing (5)
  • commands/refining-plan.md
  • skills/refining-plan/SKILL.md
  • skills/refining-plan/plan-fixer-prompt.md
  • skills/refining-plan/plan-simulator-prompt.md
  • skills/writing-plans/SKILL.md

404pilo and others added 2 commits March 5, 2026 11:15
- Add language tags to fenced code blocks (MD040)
- Add severity/requirement fields to fixer report format
- Rename command to refine-plan.md (matches brainstorm/write-plan/execute-plan convention)
- Trim SKILL.md from 807 to 570 words (merge checklist into process, remove redundant sections)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TDD REFACTOR: Both baseline and skill-guided tests identified that
"critical count unchanged → CONVERGED" is wrong — if criticals persist
because fixer failed, that's ESCALATE, not CONVERGED. Fixed convergence
rules: unchanged criticals now escalate to user.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/refining-plan/plan-fixer-prompt.md`:
- Around line 13-21: The fixer prompt in
skills/refining-plan/plan-fixer-prompt.md is missing the original snapshot
required by Phase 2 in skills/refining-plan/SKILL.md; add a clearly labeled
"Original Snapshot" section (e.g., "## Original Snapshot" with a placeholder
like {original_snapshot} or {snapshot}) above or near the "Read the plan file
now." block so the fixer receives the original snapshot input as specified by
Phase 2, ensuring the placeholder name matches the workflow's expected variable.
- Around line 46-56: The fixer report schema in
skills/refining-plan/plan-fixer-prompt.md uses fields "finding" and "action"
that must be renamed to match the simulator's finding schema ("concern" and
"recommendation") to avoid brittle round-trip parsing; update the "changes:" and
"skipped:" entries to replace finding→concern and action→recommendation
(preserve other keys: severity, requirement, location, reason) and ensure any
consumers/serializers that reference those keys are updated accordingly to use
"concern" and "recommendation" consistently to align with
skills/refining-plan/plan-simulator-prompt.md.

In `@skills/refining-plan/SKILL.md`:
- Around line 81-83: Update the escalation rule text so it requires the same
critical issue to persist rather than just an unchanged critical count: replace
the current line "Critical count unchanged (round 2+) or same concern persists
(round 3+) → ESCALATE" with a conditional that explicitly checks for the same
critical persisting across rounds (e.g., "Same critical(s) persist across rounds
(round 2+) → ESCALATE"); keep the "Drift detection: plan changed direction →
ESCALATE" rule unchanged. Ensure the wording mentions that unchanged counts
alone do not trigger escalation unless at least one identical critical (by
description/id) remains.
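
The distinction the reviewer draws, identity of the critical rather than the count, can be sketched as follows. It assumes each critical finding has a stable identifier (here just its description string) and uses a made-up `CONTINUE` label for "run another round"; CONVERGED and ESCALATE are the signals named in the skill, and drift detection is omitted.

```python
from typing import Set


def classify_round(previous_criticals: Set[str],
                   current_criticals: Set[str],
                   current_important_count: int) -> str:
    """Pick a signal for one round from the critical findings of two rounds."""
    # Escalate only if at least one identical critical survived the fixer.
    if previous_criticals & current_criticals:
        return "ESCALATE"
    if not current_criticals and current_important_count == 0:
        return "CONVERGED"
    # Assumed label: new or different findings, so keep iterating.
    return "CONTINUE"


same_issue = classify_round({"no rollback step"}, {"no rollback step"}, 0)
same_count_new_issue = classify_round({"no rollback step"}, {"missing auth check"}, 0)
clean = classify_round({"no rollback step"}, set(), 0)
```

In round 1 there is no previous round, so passing an empty set for `previous_criticals` means escalation can only start from round 2, matching the proposed wording.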

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cac45af0-431d-458e-a5a0-bd6bffaa6b2b

📥 Commits

Reviewing files that changed from the base of the PR and between cef42ad and 105f4a8.

📒 Files selected for processing (4)
  • commands/refine-plan.md
  • skills/refining-plan/SKILL.md
  • skills/refining-plan/plan-fixer-prompt.md
  • skills/refining-plan/plan-simulator-prompt.md
✅ Files skipped from review due to trivial changes (1)
  • commands/refine-plan.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • skills/refining-plan/plan-simulator-prompt.md

404pilo and others added 3 commits March 5, 2026 13:41
…ention

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d alignment, escalation rule

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
skills/refining-plans/SKILL.md (1)

104-113: Consider softening “stable” wording in the execution prompt.

“Plan refined and stable” conflicts slightly with offering “Refine again.” A neutral phrase like “Plan refinement pass complete” would reduce ambiguity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/refining-plans/SKILL.md` around lines 104 - 113, Replace the header
phrase "Plan refined and stable. Three execution options:" with a neutral
wording such as "Plan refinement pass complete. Three execution options:" (or
similar) in the SKILL.md section that lists the three execution options (the
heading text and any identical occurrences), so it no longer implies finality
while keeping the three options and required sub-skill lines
(superpowers:subagent-driven-development and superpowers:executing-plans)
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2820accc-8212-46f1-89e3-2e63416d5b27

📥 Commits

Reviewing files that changed from the base of the PR and between 105f4a8 and 3b5c0b5.

📒 Files selected for processing (5)
  • commands/refine-plans.md
  • skills/refining-plans/SKILL.md
  • skills/refining-plans/plan-fixer-prompt.md
  • skills/refining-plans/plan-simulator-prompt.md
  • skills/writing-plans/SKILL.md

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/refining-plans/plan-fixer-prompt.md`:
- Around line 13-17: The prompt currently instructs the fixer to read/write a
filesystem path ({plan_path}) which contradicts the subagent contract; update
the "Plan File" section in plan-fixer-prompt.md to accept the full plan text as
an injected variable (e.g., {plan_text}) instead of a file path and remove any
wording that implies file I/O or direct filesystem access; also change any
instructions that tell the fixer to write back to {plan_path} so the fixer
instead returns the modified plan text as its output (or a clearly labeled
{fixed_plan} variable) for the controller to persist.
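
The text-in, text-out contract suggested here could look like the sketch below, where only the controller touches the filesystem. The template wording, `run_fixer_round`, and the `dispatch` callable are hypothetical; `dispatch` stands in for whatever mechanism sends a prompt to the fixer subagent and returns its reply.

```python
from pathlib import Path
from typing import Callable

# Assumed template shape: the plan is injected as text, not as a path.
FIXER_TEMPLATE = "## Plan\n\n{plan_text}\n\n## Findings\n\n{findings}\n"


def run_fixer_round(plan_path: Path, findings: str,
                    dispatch: Callable[[str], str]) -> str:
    """Controller-side round: read the plan, dispatch the fixer, persist the result."""
    plan_text = plan_path.read_text(encoding="utf-8")
    prompt = FIXER_TEMPLATE.format(plan_text=plan_text, findings=findings)
    fixed_plan = dispatch(prompt)  # the fixer returns the full modified plan text
    plan_path.write_text(fixed_plan, encoding="utf-8")
    return fixed_plan
```

Because the fixer never sees `plan_path`, it cannot read or write files on its own, which keeps it consistent with the "controller provides full plan text" design stated in the PR description.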

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1ba44f04-fed2-41da-9789-218ca83f2c48

📥 Commits

Reviewing files that changed from the base of the PR and between 3b5c0b5 and 929059d.

📒 Files selected for processing (1)
  • skills/refining-plans/plan-fixer-prompt.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
IgorTavcar added a commit to IgorTavcar/superpowers that referenced this pull request Mar 9, 2026
…ra#622)

Simulate-evaluate-fix-converge loop with max 5 iterations for plans.
Fills the gap between writing and executing plans.

Upstream PR: obra#622

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
obra (Owner) commented Mar 10, 2026

Thanks for this! The plan pressure-testing concept is solid. v5.0.0 addressed this with built-in plan review loops — after each chunk of a plan is written, a reviewer subagent checks it for completeness, spec alignment, and task decomposition quality. Closing as the idea landed in a different form.

— Claude (claude-opus-4-6, Claude Code 2.1.71, session 64908a66-5c98-4c79-b0f7-1aa7eac2dcb0)

@obra obra closed this Mar 10, 2026
404pilo (Author) commented Mar 10, 2026

Awesome! Looking forward to v5.0.0.
Thanks for taking the time to reply!

@404pilo 404pilo deleted the feat/refine-plan-skill branch March 10, 2026 00:39
@404pilo 404pilo restored the feat/refine-plan-skill branch March 13, 2026 14:13
@paseka10jaroslav-coder

Summary

  • Adds refining-plan skill that iteratively pressure-tests implementation plans before execution (simulate → fix → converge loop)
  • Dispatches plan-simulator and plan-fixer subagents with domain-detected role profiles
  • Integrates into writing-plans execution handoff as a recommended first step
  • Includes convergence detection, escalation logic, and structured reporting

New Files

  • skills/refining-plan/SKILL.md — Main skill with checklist, flowcharts, and convergence rules
  • skills/refining-plan/plan-simulator-prompt.md — Prompt template for simulation subagent
  • skills/refining-plan/plan-fixer-prompt.md — Prompt template for fixer subagent
  • commands/refining-plan.md — /refining-plan command entry point

Modified Files

  • skills/writing-plans/SKILL.md — Updated plan header template and execution handoff to offer refinement as option 1

Design

The skill follows a simulate → evaluate → fix → check convergence loop (max 5 iterations). It detects the plan's domain to generate contextual role profiles for subagents. Convergence is checked via diminishing returns, recurring criticals, and drift detection. The controller provides full plan text to subagents (no file reading overhead).

Test plan

  • /refining-plan loads skill and announces usage
  • Skill detects domain from plan content correctly
  • Plan-simulator subagent receives full plan text and returns structured findings
  • Plan-fixer subagent applies minimal edits for critical/important findings only
  • Loop converges when no critical/important findings remain
  • Loop escalates when same critical persists 3+ rounds
  • Report shows round-by-round table with signal column
  • Execution handoff offers 3 options (subagent-driven, parallel, refine again)
  • /writing-plans now shows refine-plan as option 1 in handoff
  • Plan header template includes refining-plan reference

🤖 Generated with Claude Code
