feat: improve reliability of generated agent skills #1284

Closed
clay-good wants to merge 11 commits into Fission-AI:main from clay-good:improve-skill-instructions

@clay-good clay-good commented Jun 30, 2026

Collaborator

What this does

Upgrades the 11 agent skills OpenSpec generates — the SKILL.md files that tell coding agents how to run the OpenSpec workflow — so an agent can reliably pick the right skill, know when it's done, recover when it gets stuck, and write specs that follow OpenSpec's conventions. It also adds allowed-tools frontmatter so agents stop asking permission on every openspec call, and a validation gate so a malformed skill can't ship.

Nothing an agent actually does changes — same commands, same prompts, same artifacts. Only the instructions get clearer. (Verified by diffing every skill's executed CLI commands against main — identical.)

What it fixes

Measured across all 11 generated skills today:

  • Ambiguous triggers — several skills answer the same request ("I want to build X") with nothing telling the agent which to choose.
  • No "done" signal — not one skill stated a success condition, so an agent can't tell "finished" from "stalled."
  • No recovery — skills named blocked/error states but not how to get unstuck.
  • Permission friction — no skill pre-approved the openspec CLI, so agents prompt on every call.
  • Spec-writing rules stranded in the docs (issue #1289, "Concepts from docs not included in the skills"): the guidance for what belongs in a spec lived only in docs/, so agents drafting specs never followed it.

Before → after (if merged)

| | Before | After |
| --- | --- | --- |
| Skills that say when to use them vs. their siblings | 2 / 11 | 11 / 11 |
| Skills that state a success / "done" condition | 0 / 11 | 11 / 11 |
| Skills with named failure + concrete recovery | 0 / 11 | 11 / 11 |
| Skills that hand off to the next skill | 0 / 11 | 11 / 11 |
| openspec CLI pre-approved (no permission prompts) | no | yes |
| Spec-writing conventions reach the drafting agent | no | yes |
| A malformed skill can be generated | yes | no (blocked by a gate) |

An objective conformance scorecard (printed on every test run) goes from 33/81 → 81/81 checks passing.
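
As a rough illustration of what a single scorecard check could look like, the sketch below tests whether a skill body contains the seven canonical sections in order. The section names are the ones quoted in the review thread (Use when / Inputs / Steps / Success / Failure & recovery / Guardrails / Related). The helper name `checkSectionOrder`, the `SectionCheck` shape, and the assumption that each label starts a line are hypothetical and not taken from the PR's test suite.

```typescript
// Canonical section labels, in the order quoted in the review thread.
const CANONICAL_SECTIONS = [
  "Use when",
  "Inputs",
  "Steps",
  "Success",
  "Failure & recovery",
  "Guardrails",
  "Related",
] as const;

interface SectionCheck {
  ok: boolean;
  missing: string[];    // labels not found at the start of any line
  outOfOrder: string[]; // labels found, but above an earlier canonical section
}

// Hypothetical helper: assumes each section label begins a line,
// optionally prefixed by markdown heading or bold markers.
function checkSectionOrder(body: string): SectionCheck {
  const lines = body.split("\n").map((l) => l.replace(/^[#*\s]+/, ""));
  const missing: string[] = [];
  const outOfOrder: string[] = [];
  let lastIndex = -1;
  for (const label of CANONICAL_SECTIONS) {
    const index = lines.findIndex((l) => l.startsWith(label));
    if (index === -1) {
      missing.push(label);
    } else if (index < lastIndex) {
      outOfOrder.push(label);
    } else {
      lastIndex = index;
    }
  }
  return { ok: missing.length === 0 && outOfOrder.length === 0, missing, outOfOrder };
}
```

A real gate would presumably run a check like this per skill and count passes, which is one plausible way to arrive at a total such as 81 checks.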

Why it's safe to merge

  • Behavior preserved — commands, prompts, and artifacts are unchanged; proven by a command-level diff of all 11 skills against main.
  • Tests green — full suite passes except one pre-existing zsh-installer failure that also fails on main (unrelated shell-completion test).
  • allowed-tools is pure upside — agents that honor it stop prompting; agents that ignore it are unaffected.

Notes for review

  • One planned piece — AGENTS.md guidance for agents that don't load skills — was dropped, because OpenSpec no longer generates that file (legacy-cleanup deletes it as obsolete), so there's nowhere to put it.
  • Command (slash-command) templates are intentionally left unchanged; no spec requirement covers them and they're independently rewritable.
  • tasks.md has the full requirement-by-requirement coverage matrix if you want the details.

🤖 Generated with Claude Code

… skills

Add an OpenSpec change proposal (proposal/design/tasks + spec delta) that
establishes a quality contract for the 11 generated agent skills: trigger
disambiguation, canonical structure, explicit success criteria, named failure
recovery, single-source skill/command generation, shared-snippet reuse, lean
always-on body, and cross-skill navigation. Proposal only — no skill code or
CLI behavior changes in this PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clay-good clay-good requested a review from TabishB as a code owner June 30, 2026 16:33

coderabbitai Bot commented Jun 30, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds planning docs for a unified skill-authoring-conventions contract, plus related bundle, validation, and agent-guidance specs. The proposal, design, capability spec, and task list define shared instruction structure, deduplication rules, allowed-tools, publishable bundle requirements, and validation gates.

Changes

skill-authoring-conventions planning documents

| Layer / File(s) | Summary |
| --- | --- |
| **Proposal: gaps and planned changes**<br>`openspec/changes/improve-skill-instructions/proposal.md` | Describes instruction-quality gaps, the unified contract, canonical section structure, shared generation, lean bodies, related navigation, validation, and non-goals. |
| **Design: canonical structure and deduplication**<br>`openspec/changes/improve-skill-instructions/design.md` | Defines the procedural layout, preserves explore and onboard, and specifies shared snippets, deduplication, worked examples, allowed-tools, and cross-platform notes. |
| **Bundle validation and AGENTS guidance**<br>`openspec/changes/improve-skill-instructions/specs/skill-distribution/spec.md`, `openspec/changes/improve-skill-instructions/specs/docs-agent-instructions/spec.md` | Adds publishable-bundle requirements, validation-before-publish gating, listing metadata, and regenerated openspec/AGENTS.md guidance. |
| **Capability spec: authoring requirements**<br>`openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md` | States the authoring rules for trigger disambiguation, section ordering, success criteria, recovery behavior, shared sources, references, navigation, and conformance validation. |
| **Task checklist for implementation**<br>`openspec/changes/improve-skill-instructions/tasks.md` | Lists the implementation work for shared constants, workflow refactoring, canonical rewrites, variants, validation gates, bundle metadata, and test coverage. |

Estimated code review effort: 2 (Simple) | ~10 minutes

Possibly related PRs

  • Fission-AI/OpenSpec#564: Updates the workflow skill generation path that these new skill-authoring and validation rules target.
  • Fission-AI/OpenSpec#719: Also touches propose skill/command generation, which is standardized here via shared instruction guidance.

Suggested reviewers: TabishB

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the PR’s main goal of improving generated agent skills, even if it is broader than the specific authoring-spec changes. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)

73-79: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Reconcile unconditional requirement with conditional scenario.

The requirement states "Each skill SHALL reference the related or next skill" unconditionally, but the scenario's WHEN clause only triggers "when a natural next or sibling skill exists." This leaves terminal skills (e.g., feedback) without a defined behavior. Either:

  • Add a scenario covering the absence of a related skill, or
  • Soften the requirement to "SHALL where a natural next or sibling skill exists."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
around lines 73 - 79, The Cross-Skill Navigation requirement is unconditional
while the scenario in the skill-authoring-conventions spec only applies when a
natural next or sibling skill exists. Update the Requirement: Cross-Skill
Navigation text and/or add a complementary scenario in the same spec so terminal
skills like feedback have explicit behavior, using the existing requirement and
scenario wording as the anchor. Make the policy consistent by either qualifying
the requirement with “where a natural next or sibling skill exists” or adding an
absence case that defines what terminal skills should do.
🧹 Nitpick comments (2)
openspec/changes/improve-skill-instructions/design.md (1)

31-39: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add language specifier to fenced code block.

Satisfy markdownlint MD040 by tagging the structural diagram as text (or markdown). No semantic change.

-```
+```text
 Use when     — one line; includes the sibling boundary
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/improve-skill-instructions/design.md` around lines 31 - 39,
The fenced structural diagram in the skill instructions is missing a language
tag and needs to be annotated to satisfy markdownlint MD040. Update the fenced
block in the design document so the diagram is explicitly marked as text (or
markdown) while keeping the content unchanged; use the existing fenced section
containing the “Use when”, “Inputs”, “Steps”, and “Guardrails” headings as the
target.

Source: Linters/SAST tools

openspec/changes/improve-skill-instructions/proposal.md (1)

3-3: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Repetitive use of "right" weakens prose.

Three instances in one sentence dilute impact. Vary the wording: e.g., "correct skill," "proper steps," "intended place."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/improve-skill-instructions/proposal.md` at line 3, The
sentence in the OpenSpec proposal repeats “right” three times, making the prose
feel repetitive and weak. Revise that sentence in the proposal text to vary the
wording while preserving meaning, using distinct phrasing such as “correct
skill,” “proper steps,” and “intended place” so the opening reads more cleanly.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openspec/changes/improve-skill-instructions/proposal.md`:
- Line 59: The proposal text has a wording typo in the new capability spec
reference: update the phrase in the skill-authoring-conventions entry from “on
archive” to “on disk” or “in the repository.” Locate the bullet mentioning
openspec/specs/skill-authoring-conventions and correct the description so it
matches the repository context.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`:
- Line 25: The skill-authoring convention text currently uses a lowercase
section name that conflicts with the proposal’s title-case naming. Update the
instructions in the spec so the required sections are named consistently as
title-case labels, matching the proposal’s “Use when / Inputs / Steps / Success
/ Failure & recovery / Guardrails / Related” ordering, and keep this wording
aligned wherever the section list is referenced so generators and tests can
match it deterministically.

In `@openspec/changes/improve-skill-instructions/tasks.md`:
- Line 31: The task item overstates that every skill must have a Related line,
but terminal or isolated skills may not have a natural successor. Update the
wording in the task list entry to explicitly scope it to skills that have a
natural workflow successor, or add a short exception list for terminal cases
such as feedback; keep the change aligned with the related skill-instruction
spec language.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2afb5da9-8590-4a2f-95a4-fe390e1ad158

📥 Commits

Reviewing files that changed from the base of the PR and between 546224e and 96c3e97.

📒 Files selected for processing (4)
  • openspec/changes/improve-skill-instructions/design.md
  • openspec/changes/improve-skill-instructions/proposal.md
  • openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md
  • openspec/changes/improve-skill-instructions/tasks.md

- `src/core/templates/workflows/*.ts` — rewrite the 11 workflow instruction strings (and feedback) to the new conventions; collapse each skill/command pair onto one instruction source.
- `src/core/templates/workflows/store-selection.ts` (and likely new sibling snippet modules) — house the shared change-selection, artifact-loop, and context/rules guardrail blocks.
- `src/core/shared/skill-generation.ts` / `src/core/templates/skill-templates.ts` — adjust the assembly so skill and command derive from one source.
- `openspec/specs/skill-authoring-conventions/` — new capability spec created on archive.

Contributor

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Fix typo: "on archive" → "on disk" (or "in the repository").

"On archive" does not fit the context of creating a new spec directory.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/improve-skill-instructions/proposal.md` at line 59, The
proposal text has a wording typo in the new capability spec reference: update
the phrase in the skill-authoring-conventions entry from “on archive” to “on
disk” or “in the repository.” Locate the bullet mentioning
openspec/specs/skill-authoring-conventions and correct the description so it
matches the repository context.


## 4. Cross-skill navigation

- [ ] 4.1 Add a Related line to every skill pointing to its natural next/sibling (e.g. `propose` → `apply`, `verify` → `archive`, `new-change` → `continue`)

Contributor

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Clarify terminal skills without a natural next/sibling.

"Every skill" includes terminal or isolated skills that may not have a meaningful next step. Either enumerate exceptions (e.g., feedback) or change to "every skill that has a natural workflow successor," matching the spec's conditional scenario.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/improve-skill-instructions/tasks.md` at line 31, The task
item overstates that every skill must have a Related line, but terminal or
isolated skills may not have a natural successor. Update the wording in the task
list entry to explicitly scope it to skills that have a natural workflow
successor, or add a short exception list for terminal cases such as feedback;
keep the change aligned with the related skill-instruction spec language.

@clay-good clay-good self-assigned this Jun 30, 2026
…eservation contract

- Correct duplication/size figures to measured values (onboard 543, bulk-archive
  237, verify 160, explore 278 instruction lines; skill/command overlap 89-100%
  for 9 of 11 pairs; propose body 87% identical to ff-change).
- Add an audit-evidence table and worked before/after examples (trigger
  disambiguation, explicit success, failure recovery) to design.md.
- Add a Behavior Preservation requirement and tighten the single-source and
  lean-body scenarios to be normalizable/testable.
- Add behavior-preservation and single-source identity validation tasks.

Strict-validated with the repo CLI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@alfred-openspec alfred-openspec left a comment

Collaborator

This proposal is solid. The measured audit evidence plus behavior-preservation contract makes this safe to take forward, and the single-source plan matches the existing template drift risk.

…ribution, and AGENTS.md guidance

Broaden the proposal from instruction quality to making OpenSpec's skills
first-class Agent Skills packages and getting them listable in a public
directory:

- skill-authoring-conventions: add standard-conformance and a generation/CI
  validation gate; anchor the lean-body rule to the standard's <500-line /
  ~5000-token budget with references/ split (onboard is the one over-budget body).
- skill-distribution (new capability): a validated, publishable bundle and a
  documented listing checklist.
- docs-agent-instructions (modified): openspec/AGENTS.md advertises the skills
  and the deterministic CLI so non-skill-loading agents follow the same workflow.

Notes: agents.sh is a voice product, not the registry — the target is the Agent
Skills standard (agentskills.io) and the skills.sh directory. Verified all 11
skill names already satisfy name==folder and the charset rules; deliberately
orthogonal to add-tool-command-surface-capabilities (no layout/delivery change).
Strict-validated; 3 spec deltas.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)

73-77: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Clarify or relax the "one level deep" path constraint.

Requiring reference links to be references/ files at "a relative path one level deep" is brittle if skill folders are nested or reorganized. Either explain why the depth matters (e.g., standard-mandated layout), or rephrase to require a stable relative path from SKILL.md without prescribing depth.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
around lines 73 - 77, Update the “Reference material in on-demand files”
scenario in skill-authoring-conventions so the link requirement is less brittle:
either justify the “one level deep” constraint or change it to require a stable
relative link from SKILL.md without mandating directory depth. Keep the existing
references/ guidance and adjust the scenario text so authors can place linked
material using the relevant relative path while preserving the rule that the
body remains readable without opening the reference file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b5555378-acd1-4754-84df-87178d86198c

📥 Commits

Reviewing files that changed from the base of the PR and between 3561e72 and 77ca885.

📒 Files selected for processing (6)
  • openspec/changes/improve-skill-instructions/design.md
  • openspec/changes/improve-skill-instructions/proposal.md
  • openspec/changes/improve-skill-instructions/specs/docs-agent-instructions/spec.md
  • openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md
  • openspec/changes/improve-skill-instructions/specs/skill-distribution/spec.md
  • openspec/changes/improve-skill-instructions/tasks.md
✅ Files skipped from review due to trivial changes (4)
  • openspec/changes/improve-skill-instructions/specs/skill-distribution/spec.md
  • openspec/changes/improve-skill-instructions/design.md
  • openspec/changes/improve-skill-instructions/tasks.md
  • openspec/changes/improve-skill-instructions/proposal.md

- allowed-tools: each skill declares its toolset and emits the standard's
  allowed-tools frontmatter; Bash scoped to Bash(openspec:*) for CLI-only skills,
  unrestricted Bash only for apply-change/onboard (arbitrary commands). Declared
  set is a validated superset of body usage, so strict-allowlist agents never
  block a needed tool and ignoring agents are unaffected — pure upside.
- New requirement + scenarios in skill-authoring-conventions; design rationale
  for the asymmetric-risk decision; tasks; validation-gate covers tool coverage.
- Coherence pass: clarify that conformance/distribution/allowed-tools target the
  11 generated SKILL.md skills; feedback is held to the authoring bar only.

3 deltas, 14 reqs / 30 scenarios, strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
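
The superset rule in the commit above (the declared set must cover every tool the body actually uses) can be sketched as a small check. This is a minimal sketch under stated assumptions: `ToolUse`, `isCovered`, and `findUncoveredTools` are hypothetical names, and the matching semantics (`Bash` allows any command, `Bash(<prefix>:*)` allows commands whose first word is `<prefix>`) are an assumed reading of the scoped entry, not something this PR specifies.

```typescript
interface ToolUse {
  tool: string;     // e.g. "Read", "Bash"
  command?: string; // shell command line, only meaningful for Bash
}

// Assumed semantics: a bare entry matches the tool by name (so "Bash" is
// unrestricted); "Bash(<prefix>:*)" matches Bash commands starting with <prefix>.
function isCovered(use: ToolUse, declared: string[]): boolean {
  return declared.some((entry) => {
    if (entry === use.tool) return true;
    const scoped = /^Bash\((.+):\*\)$/.exec(entry);
    if (scoped && use.tool === "Bash" && use.command !== undefined) {
      return use.command.trim().split(/\s+/)[0] === scoped[1];
    }
    return false;
  });
}

// Returns every use the declared allowed-tools list would not pre-approve.
// An empty result means the declared set is a superset of body usage.
function findUncoveredTools(uses: ToolUse[], declared: string[]): ToolUse[] {
  return uses.filter((use) => !isCovered(use, declared));
}
```

Under these assumptions, a skill that declares `Bash(openspec:*)` but whose body also runs `npm test` would be flagged, which is the case a strict-allowlist agent would otherwise block at run time.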

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)

18-18: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Use title-case for section names.

The spec still uses lowercase "use when" here. Per the design's canonical structure (line 53), section names are title-case: Use when / Inputs / Steps / Success / Failure & recovery / Guardrails / Related. Use consistent title-case section names so generators and validators can match them deterministically. This applies to line 25 as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
at line 18, Update the spec text in the relevant clause and any matching
references so section names use title-case consistently; specifically, replace
the lowercase “use when” wording in the affected requirement with “Use when,”
and align the other section heading mention near the same area to the canonical
title-case names used by the design structure. Keep the wording deterministic so
generators and validators can match section names like Use when, Inputs, Steps,
Success, Failure & recovery, Guardrails, and Related.
♻️ Duplicate comments (1)
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)

25-25: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Use title-case for section names.

The scenario still lists 'a "use when" line' in lowercase. Align with the design's canonical structure (line 53) using title-case Use when, and ensure Inputs is also capitalized for consistency within the same list.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
at line 25, The canonical instruction sequence in the scenario still uses
lowercase section labels; update the wording in the specification so the listed
sections match the design’s title-case convention. In the relevant requirement
text, change the “use when” entry to “Use when” and ensure “Inputs” remains
capitalized, keeping the rest of the ordered list aligned with the same
title-case style.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5f8d4c0f-014a-4cd5-b90a-fb65bfacdb75

📥 Commits

Reviewing files that changed from the base of the PR and between 77ca885 and 6b362a8.

📒 Files selected for processing (4)
  • openspec/changes/improve-skill-instructions/design.md
  • openspec/changes/improve-skill-instructions/proposal.md
  • openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md
  • openspec/changes/improve-skill-instructions/tasks.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • openspec/changes/improve-skill-instructions/tasks.md
  • openspec/changes/improve-skill-instructions/proposal.md

#### Scenario: CLI bash pre-approved and narrowly scoped
- **WHEN** a skill invokes the OpenSpec CLI through a shell tool
- **THEN** its `allowed-tools` SHALL pre-approve the OpenSpec CLI invocation scoped to that binary (for example `Bash(openspec:*)`)
- **AND** unrestricted shell access SHALL be declared only for skills that run arbitrary build or test commands (for example the implementation skill)
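
The scoping rule in this scenario could be expressed in the generator roughly as follows. The skill names `apply-change` and `onboard` and the `Bash(openspec:*)` entry come from the PR's commit message. The helper names, the `extraTools` parameter, and the space-separated rendering of the frontmatter line are assumptions made for illustration.

```typescript
// Skills the commit message says need unrestricted shell access.
const UNRESTRICTED_BASH_SKILLS = new Set(["apply-change", "onboard"]);

// Hypothetical helper: returns the allowed-tools entries for a skill.
// `extraTools` stands in for whatever non-shell tools a skill declares.
function allowedToolsFor(skillName: string, extraTools: string[] = []): string[] {
  const bash = UNRESTRICTED_BASH_SKILLS.has(skillName) ? "Bash" : "Bash(openspec:*)";
  return [bash, ...extraTools];
}

// Renders the entries as one frontmatter line (space-separated format assumed).
function renderAllowedTools(skillName: string, extraTools: string[] = []): string {
  return `allowed-tools: ${allowedToolsFor(skillName, extraTools).join(" ")}`;
}
```

Keeping the unrestricted set as an explicit list of skill names, rather than a phrase like "the implementation skill", gives a validator one unambiguous target to check against.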

Contributor

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Clarify which skill is "the implementation skill."

The spec uses "the implementation skill" as the example for unrestricted Bash, but the design explicitly names apply-change and onboard. Use the actual skill name (apply-change) or clarify that "implementation skill" refers to it, so the generator and validation have an unambiguous target.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
at line 127, The unrestricted shell access rule is ambiguous because it refers
to “the implementation skill” instead of the explicitly named skill. Update the
wording in the skill-authoring conventions spec to use apply-change directly, or
clearly define that “implementation skill” means apply-change, so generators and
validators have a single unambiguous target.

@clay-good clay-good changed the title from "Proposal: skill-authoring-conventions — update all agent skills" to "[Docs] Propose skill-authoring-conventions — quality bar, distribution, and allowed-tools for all 11 agent skills" Jul 1, 2026
clay-good and others added 5 commits July 1, 2026 12:31
…ission-AI#1289)

Address issue Fission-AI#1289: docs/concepts.md's "What a Spec Is (and Is Not)"
guidance (what belongs in a spec vs. what to keep out) never reaches the
skills that draft specs, so agents write implementation-laden specs unless
separately instructed.

Add a SPEC_CONTENT_GUIDANCE shared snippet, sourced from concepts.md and
embedded by the spec-authoring skills (propose, ff-change, continue-change,
sync-specs), plus a new "Embedded Spec-Content Guidance" requirement in
skill-authoring-conventions and a test asserting the snippet stays aligned
with the docs. Proposal, design, and tasks updated to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…guidance

A doc-vs-skill audit found Fission-AI#1289 (spec-content guidance stranded in the
docs) is one instance of a class: rules that shape artifact quality live
only in docs/ and never reach the skills that draft artifacts, so agents
don't follow them unless told. Confirmed absent from the templates by grep:
right-sized rigor (Lite/Full), RFC-2119 keyword meanings, scenario quality
(edge cases), and delta conventions (MODIFIED shows prior value, REMOVED
says why).

Generalize the requirement "Embedded Spec-Content Guidance" into "Embedded
Authoring Guidance" (5 scenarios) covering the whole class, add a
SPEC_CONVENTIONS_GUIDANCE shared snippet alongside SPEC_CONTENT_GUIDANCE,
and require AGENTS.md (docs-agent-instructions) to carry the same
conventions so non-skill agents get them too. Design gains an audit table
plus two deliberately out-of-scope divergences (enabler-graph vs. gate
wording; update-vs-fresh heuristics, owned by add-update-workflow).

Now 15 requirements / 36 scenarios; strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Incorporate maintainer direction on the skills architecture:
- Duplication across skill files is intentional (self-contained skills →
  independent rewrites), so drop the single-source/DRY pillar. Item 4
  becomes "self-contained skills, shared conventions by reference"; the
  spec requirement, design principles/decisions/alternatives, and tasks
  (no single-source refactor, no extracted procedure constants) follow.
- Favor design/behavior guidance over procedure-heavy "if this then that"
  skills. Item 2 becomes guidance-first; the canonical structure's Steps
  section becomes Guidance, with deep/exact procedure moved to references/.
- Deliver the Fission-AI#1289-class authoring guidance as a proposal-writing
  reference the artifact-drafting skills link to (item 12), not inline
  shared snippets. AGENTS.md carries the same reference for non-skill
  agents. Tests assert the reference matches concepts.md and that skills
  link to it.

Three architecture principles now stated up front in What Changes and
design. Still 15 requirements / 36 scenarios; strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rewrite

Start implementing skill-authoring-conventions in this PR, with a measured
before/after as proof it earns its place.

- Add a conformance scorer (src/core/shared/skill-conformance.ts) that scores
  every skill against the conventions on objective signals (trigger boundary,
  success criteria, failure & recovery, guardrails, related-skills, body
  budget, authoring-reference link) and prints a scorecard.
- Add the authoring-conventions reference (the proposal-writing reference,
  src/core/templates/workflows/authoring-conventions.ts) — compact form of
  docs/concepts.md (belongs/avoid, rigor, RFC-2119 meanings, scenario quality,
  delta conventions). Closes the Fission-AI#1289 class.
- Emit it on disk: getSkillReferenceFiles + init/update write references/ for
  exactly the skills that link it (verified e2e: openspec init emits
  openspec-propose/references/authoring-conventions.md; new-change gets none).
- Rewrite the create-a-change family (new/propose/ff/continue) and sync-specs
  skills to the conventions — trigger boundaries, Use when/Inputs/Success/
  Failure & recovery/Related, and the reference link for the spec-authoring
  ones. Behavior preserved (same commands/prompts/artifacts); command
  templates unchanged (self-contained, independently rewritable).
- Regenerate golden template hashes; add skill-conformance test.

Measured efficacy: convention checks passing rose 33/81 -> 57/81 across the 11
skills; the five rewritten skills now score full marks (7/7 or 8/8). Full
suite green except a pre-existing, environment-specific zsh-installer failure
(fails identically on baseline).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
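
As a rough illustration of what scoring on objective signals could look like, the sketch below counts which sections a skill body contains. It assumes, purely for illustration, that each signal is detected by a literal section marker (the marker strings are the ones used in the rewritten skill text quoted in this PR). `SIGNAL_MARKERS` and `scoreBody` are hypothetical names, and this is not the PR's `scoreSkillConformance`.

```typescript
// Hypothetical mapping from a convention signal to the literal marker that indicates it.
const SIGNAL_MARKERS: Record<string, string> = {
  'trigger boundary': '**Use when:**',
  'success criteria': '**Success:**',
  'failure & recovery': '**Failure & recovery**',
};

interface Score {
  passed: number;
  total: number;
  missing: string[];
}

// Counts how many signals are present in the body and lists the ones that are missing.
function scoreBody(body: string): Score {
  const signals = Object.keys(SIGNAL_MARKERS);
  const missing = signals.filter((signal) => body.indexOf(SIGNAL_MARKERS[signal]) === -1);
  return { passed: signals.length - missing.length, total: signals.length, missing };
}
```

A marker-based check like this measures the presence of a section, not whether the section helps an agent, which is worth keeping in mind when reading a pass count.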
Complete the implementation across all 11 skills and the cross-cutting infra.

Skills (behavior-preserving, command templates unchanged):
- Rewrote apply, archive, bulk-archive, verify (procedural), explore (stance),
  onboard (tutorial), and feedback to the conventions — trigger boundaries,
  Use when/Inputs/Success/Failure & recovery/Related. Combined with the earlier
  create-a-change family + sync-specs, all 11 skills now conform.

allowed-tools (item 11):
- Declared per-skill toolsets (skill-tools.ts); generateSkillContent emits
  allowed-tools frontmatter. Bash scoped to Bash(openspec:*) for CLI-only
  skills; unrestricted Bash only for apply and onboard.

Conformance gate (item 8) + distribution (item 9/skill-distribution):
- validateSkillConformance enforces frontmatter validity, name==folder,
  resolvable references, and declared tools as hard errors; body budget is a
  warning. Wired into init/update (fail rather than write a bad skill) and
  covered in CI. Bundle-validation test + docs/skill-distribution.md checklist.

Efficacy: convention checks 33/81 -> 80/81 (the one miss is onboard's
over-budget body, a documented warning). Full suite green except the
pre-existing env-specific zsh-installer failure. Regenerated all golden hashes.

BLOCKED / flagged for maintainer: docs-agent-instructions (AGENTS.md, item 10)
is left unbuilt because the codebase removed openspec/AGENTS.md generation and
legacy-cleanup deletes it as obsolete; re-introducing it would contradict that
direction. See tasks.md §9.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
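
The split between hard errors and warnings described in this commit can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's `validateSkillConformance`: the `SkillDraft` shape, `checkSkill`, and the specific checks are hypothetical, and the 500-line budget is the figure mentioned in a later commit message in this PR.

```typescript
// Hypothetical, simplified view of a generated skill.
interface SkillDraft {
  name: string;           // frontmatter `name`
  description: string;    // frontmatter `description`
  body: string;           // instruction text
  allowedTools: string[]; // declared tools
}

interface GateResult {
  errors: string[];   // block generation
  warnings: string[]; // reported, but do not block
}

const BODY_LINE_BUDGET = 500; // assumed budget, in lines

function checkSkill(skill: SkillDraft, dirName: string): GateResult {
  const errors: string[] = [];
  const warnings: string[] = [];

  if (skill.name.trim() === '' || skill.description.trim() === '') {
    errors.push(`${dirName}: frontmatter needs a non-empty name and description`);
  }
  if (skill.name !== dirName) {
    errors.push(`${dirName}: frontmatter name "${skill.name}" does not match the folder name`);
  }
  if (skill.allowedTools.length === 0) {
    errors.push(`${dirName}: no allowed-tools declared`);
  }

  // Body length is advisory only, so it goes into warnings.
  const lineCount = skill.body.split('\n').length;
  if (lineCount > BODY_LINE_BUDGET) {
    warnings.push(`${dirName}: body is ${lineCount} lines, over the ${BODY_LINE_BUDGET}-line budget`);
  }
  return { errors, warnings };
}
```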
@clay-good clay-good changed the title [Docs] Propose skill-authoring-conventions — quality bar, distribution, and allowed-tools for all 11 agent skills [Feature] Build out skill-authoring-conventions — rewrite all 11 agent skills + allowed-tools + conformance gate (33/81 → 80/81) Jul 1, 2026
…er-body, drop AGENTS.md

Methodical pass to fully satisfy the spec deltas:

- Lean body: moved deep reference material out of the skill bodies into
  emitted references/ files — onboard artifact skeletons
  (references/onboarding-artifact-templates.md, body now 456 lines, under the
  500 budget), sync-specs delta-format (references/delta-format.md),
  bulk-archive conflict examples (references/conflict-resolution.md), verify
  dimension detail (references/verification-dimensions.md). Generalized
  getSkillReferenceFiles to a REFERENCE_REGISTRY; onboard's *command* keeps the
  skeletons inline (self-contained, no references/ dir).
- Declared tools cover body usage (6.3): the gate now fails if a body uses an
  unambiguous tool token (AskUserQuestion/TodoWrite/Grep/Glob/WebFetch/
  WebSearch) not in the declared allowed-tools.
- Reference/docs drift (10.7): a test asserts the authoring-conventions
  reference and docs/concepts.md share the same anchor items.
- Dropped the docs-agent-instructions capability: OpenSpec removed AGENTS.md
  generation (legacy-cleanup deletes openspec/AGENTS.md as obsolete), so there
  is no always-on surface to target. Spec delta deleted; proposal/design/tasks
  updated. Always-on guidance can return in a separate change once a surface
  exists.

Result: conformance scorecard 33/81 -> 81/81 (all skills fully conformant);
2 spec deltas, 14 requirements / 32 scenarios, strict-valid; full suite green
except the pre-existing env-specific zsh-installer failure. Golden hashes
regenerated (only the changed skills + onboard command; other command
templates byte-identical).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
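
The "declared tools cover body usage" check in this commit can be pictured with a short sketch. The token list is taken from the commit message above; the function name `findUndeclaredTools` and the whole-word matching rule are assumptions made for illustration.

```typescript
// Tool tokens the commit message calls unambiguous.
const UNAMBIGUOUS_TOOL_TOKENS = ['AskUserQuestion', 'TodoWrite', 'Grep', 'Glob', 'WebFetch', 'WebSearch'];

// Returns the tool tokens that appear in the body as whole words but are not declared.
function findUndeclaredTools(body: string, declared: string[]): string[] {
  return UNAMBIGUOUS_TOOL_TOKENS.filter((token) => {
    const usedInBody = new RegExp(`\\b${token}\\b`).test(body);
    const isDeclared = declared.indexOf(token) !== -1;
    return usedInBody && !isDeclared;
  });
}
```

For example, a body that says "prompt with the AskUserQuestion tool, then use Grep" with only `AskUserQuestion` declared would report `Grep` as undeclared.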
@clay-good clay-good changed the title [Feature] Build out skill-authoring-conventions — rewrite all 11 agent skills + allowed-tools + conformance gate (33/81 → 80/81) [Feature] Build out skill-authoring-conventions — all 11 agent skills to full conformance + allowed-tools + gate + distribution (33/81 → 81/81) Jul 1, 2026
…f-scope

Methodical verification that the spec is fully built out:
- Added a requirement-by-requirement coverage matrix (14/14 requirements,
  all scenarios) mapping each to concrete code/test evidence.
- Behavior Preservation proven: diffed the executed CLI command set of all 11
  skills against the pre-rewrite baseline — identical for every skill (the
  apparent verify deltas are prose in the new recovery section, not executed
  commands); user-facing prompts preserved; per-skill behavioral specs hold.
- Clarified that slash-command templates are out of the spec's scope (no
  scenario governs them) — not an open task; the 10 unchanged command
  templates stay byte-identical by design.

Scorecard 81/81; 2 deltas, 14 req / 32 scenarios; strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@clay-good clay-good changed the title [Feature] Build out skill-authoring-conventions — all 11 agent skills to full conformance + allowed-tools + gate + distribution (33/81 → 81/81) [Feature] Make the generated agent skills more reliable — clear triggers, success & recovery steps, pre-approved CLI Jul 1, 2026

@alfred-openspec alfred-openspec left a comment

Collaborator

Reviewed the latest skill-authoring updates through e46c3e4, including the bda1816 spec-content guidance change. The direction looks right: artifact-drafting skills now link to a shared authoring-conventions reference instead of duplicating docs guidance, and the proposal/spec/tasks are aligned with the self-contained-skill architecture.

Verified locally: targeted authoring/conformance/parity tests pass, build passes, and `openspec validate improve-skill-instructions --strict` passes.

@clay-good clay-good changed the title [Feature] Make the generated agent skills more reliable — clear triggers, success & recovery steps, pre-approved CLI feat: improve reliability of generated agent skills Jul 2, 2026

@TabishB TabishB left a comment

Contributor

Ok this PR is super rough. I think we need to start over here and make more intentional, deliberate changes, a bit at a time. A lot of it just makes the skill bulkier and I can't really see it adding much value.

The main part I like is the tools piece and auto-approving OpenSpec commands.

/**
 * Scores one skill's description + instructions against the conventions.
 */
export function scoreSkillConformance(input: ConformanceInput): ConformanceResult {

Contributor

I'm not sure I agree that these are the key ingredients of every single skill. Not every single skill is implicit or explicit in nature. Different types of skills serve different purposes.

i.e., some skills are implicitly triggered vs some are explicit.

These seem focused on implicit skills (which we don't really have in OpenSpec to begin with). Even if these were implicit, I don't think they would follow the same pattern.


${STORE_SELECTION_GUIDANCE}

**Use when:** the user wants to write code and check off a change's tasks. To confirm the work is correct without modifying tasks, use \`openspec-verify-change\`; to create missing artifacts (proposal, design, tasks) rather than implement them, use \`openspec-continue-change\`.

Contributor

As mentioned in the call yesterday, "Use when" as part of the instruction makes no sense. By the time the instruction is loaded, the agent has already chosen to invoke the skill.

In general, a lot of the skills at the moment are expected to be explicitly triggered rather than implicitly triggered.

Contributor

not to mention this just doubles up on the description above anyways


**Use when:** the user wants to write code and check off a change's tasks. To confirm the work is correct without modifying tasks, use \`openspec-verify-change\`; to create missing artifacts (proposal, design, tasks) rather than implement them, use \`openspec-continue-change\`.

**Inputs:** optionally a change name. If omitted, infer it from conversation context; auto-select when only one active change exists; if vague or ambiguous you MUST run \`openspec list --json\` and prompt for available changes.

Contributor

This just seems to double up on the Input section below?


**Failure & recovery**
- **Ambiguous or missing change name:** run \`openspec list --json\` and prompt with the AskUserQuestion tool; never guess.
- **\`state: "blocked"\` (missing artifacts):** stop implementing and invoke \`openspec-continue-change\` to create the missing artifacts, then re-run the apply instructions.

Contributor

A user might not have this skill installed. I'm not sure if continue would be the right thing to do here either?

I would expect this to be a soft warning with a prompt that asks to proceed.

- **Allows artifact updates**: If implementation reveals design issues, suggest updating artifacts - not phase-locked, work fluidly`,
- **Allows artifact updates**: If implementation reveals design issues, suggest updating artifacts - not phase-locked, work fluidly

**Success:** every task in the tasks file is checked \`- [x]\`, and \`openspec instructions apply --change "<name>" --json\` reports \`state: "all_done"\` with 0 remaining tasks.

Contributor

I'm not sure this is the real success criterion. Ideally, success means the change is implemented as expected, the tasks are ticked off, the result matches the specs, etc.


${STORE_SELECTION_GUIDANCE}

**Use when:** the user wants to finalize a single completed change - sync its delta specs and move it to the archive. To sync main specs without archiving (keeping the change active), use \`openspec-sync-specs\`; to archive several changes in one run, use \`openspec-bulk-archive-change\`.

Contributor

OK, sensing a theme here: I think the assumptions the agent/model made going into this are just not right. It has repeated the same mistake here as above. We also have no clue whether this is actually what's missing in the skills.

There's no empirical or anecdotal evidence for the need for these additional sections. It doesn't feel tied to anything in particular, and we can't really prove this makes it better or worse.

Comment on lines +18 to +30
export const SKILL_TOOLS: Record<string, string[]> = {
  'openspec-explore': [CLI, 'Read', 'Grep', 'Glob', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
  'openspec-new-change': [CLI, 'Read', 'AskUserQuestion', 'TodoWrite'],
  'openspec-continue-change': [CLI, 'Read', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
  'openspec-apply-change': [FULL_BASH, 'Read', 'Write', 'Edit', 'Grep', 'Glob', 'AskUserQuestion', 'TodoWrite', 'Skill'],
  'openspec-ff-change': [CLI, 'Read', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
  'openspec-sync-specs': [CLI, 'Read', 'Edit', 'AskUserQuestion', 'TodoWrite'],
  'openspec-archive-change': [CLI, 'Read', 'AskUserQuestion', 'TodoWrite', 'Skill'],
  'openspec-bulk-archive-change': [CLI, 'Read', 'Edit', 'AskUserQuestion', 'TodoWrite'],
  'openspec-verify-change': [CLI, 'Read', 'Grep', 'Glob', 'AskUserQuestion', 'TodoWrite'],
  'openspec-onboard': [FULL_BASH, 'Read', 'Grep', 'Glob', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
  'openspec-propose': [CLI, 'Read', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
};

@TabishB TabishB Jul 3, 2026

Contributor

Are some of these tools coding-agent agnostic? For example, are TodoWrite and AskUserQuestion agent-agnostic? I don't think so.

FULL_BASH seems tricky. How does it work with user-level safeguards?

Comment thread src/core/init.ts
Comment on lines +556 to +565
if (shouldGenerateSkills) {
  const conformanceErrors: string[] = [];
  for (const { template, dirName } of skillTemplates) {
    conformanceErrors.push(...validateSkillConformance(template, dirName).errors);
  }
  if (conformanceErrors.length > 0) {
    throw new Error(`Skill conformance check failed:\n- ${conformanceErrors.join('\n- ')}`);
  }
}

Contributor

It makes no sense for this to be gated during init. If this did get added, I feel it should be a linting rule that runs when people create skills.

Imagine if we made an update to a skill that was non-conformant. This would just cause an error for users initializing OpenSpec in their project.
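
One way to picture the lint-rule alternative is a helper that collects conformance errors and returns a report instead of throwing, so it can run at authoring time or in CI while `init` stays unaffected. This is a hedged sketch: `lintSkills`, `SkillEntry`, and `Validator` are hypothetical names, and the validator is passed in as a parameter with a shape inferred from the snippet above (a function returning an object with an `errors` array).

```typescript
interface SkillEntry<T> {
  template: T;
  dirName: string;
}

// Assumed validator shape, inferred from `validateSkillConformance(template, dirName).errors`.
type Validator<T> = (template: T, dirName: string) => { errors: string[] };

interface LintReport {
  ok: boolean;
  report: string;
}

// Collects errors for every skill without throwing; the caller decides how to surface them
// (for example, fail a CI job but never fail a user's project initialization).
function lintSkills<T>(skills: SkillEntry<T>[], validate: Validator<T>): LintReport {
  const errors: string[] = [];
  for (const { template, dirName } of skills) {
    errors.push(...validate(template, dirName).errors);
  }
  const report =
    errors.length === 0
      ? 'All skills conform.'
      : `Skill conformance check failed:\n- ${errors.join('\n- ')}`;
  return { ok: errors.length === 0, report };
}
```

A CI script could exit non-zero when `ok` is false, which keeps a non-conformant template from shipping without turning the check into a runtime failure for end users.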

@clay-good

clay-good commented Jul 3, 2026

Collaborator Author

I opened #1300 to auto-approve the openspec CLI in generated skills and I am closing this PR now.

@clay-good clay-good closed this Jul 3, 2026

3 participants