feat(skills): give end-to-end tests a home in the workflow #7
Merged
End-to-end tests had no home in the workflow. This adds a standalone skill that decides when a whole-flow test is warranted and what it asserts, distinct from testing-a-feature's per-surface assertion shape. Core principle: an end-to-end test asserts that a whole flow keeps the promise the user or consumer was made, exercising the real seams unit tests stub out. That fixes both timing (write only after structural completion, when the seam exists) and selection (one test per promised journey; edge cases stay at the unit level).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Route the choreography to the new skill at the points where the structure has settled enough for a whole-flow test to be safe: the single-PR checkpoint (developing-a-feature Step 5) and the multi-PR integration checkpoint (reviewing-feature-progress Step 6). Add a boundary pointer in testing-a-feature so edge cases stay at the unit level, and list the skill in the README and project-CLAUDE tables.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
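To make the selection rule in the commit messages concrete, here is a minimal sketch of how one promised journey and one edge case might be split. Everything in it (the `Checkout`, `Inventory`, and `Ledger` classes, the prices, and the test names) is hypothetical and invented for illustration; none of it comes from the skill files in this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Hypothetical stock store; a unit test of Checkout would stub this."""
    stock: dict = field(default_factory=dict)

    def reserve(self, sku: str, qty: int) -> None:
        if qty <= 0:
            raise ValueError("quantity must be positive")
        if self.stock.get(sku, 0) < qty:
            raise LookupError(f"insufficient stock for {sku}")
        self.stock[sku] -= qty


@dataclass
class Ledger:
    """Hypothetical payment record; also stubbed at the unit level."""
    charges: list = field(default_factory=list)

    def charge(self, customer: str, cents: int) -> None:
        self.charges.append((customer, cents))


@dataclass
class Checkout:
    inventory: Inventory
    ledger: Ledger
    prices: dict

    def place_order(self, customer: str, sku: str, qty: int) -> dict:
        self.inventory.reserve(sku, qty)      # seam: checkout -> inventory
        total = self.prices[sku] * qty
        self.ledger.charge(customer, total)   # seam: checkout -> ledger
        return {"customer": customer, "sku": sku, "qty": qty, "total_cents": total}


def test_e2e_customer_can_buy_in_stock_item():
    """The single end-to-end test for the promised journey, with real seams."""
    inventory = Inventory(stock={"mug": 3})
    ledger = Ledger()
    checkout = Checkout(inventory, ledger, prices={"mug": 1200})

    receipt = checkout.place_order("ada", "mug", 2)

    # Assert the promise the customer was made, across every component.
    assert receipt["total_cents"] == 2400
    assert inventory.stock["mug"] == 1
    assert ledger.charges == [("ada", 2400)]


def test_unit_rejects_non_positive_quantity():
    """Edge case: exercised against Inventory alone, not the whole flow."""
    inventory = Inventory(stock={"mug": 3})
    try:
        inventory.reserve("mug", 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

The end-to-end test can only be written once `Checkout`, `Inventory`, and `Ledger` all exist, which is the timing half of the principle. The quantity check never needs the full stack, so it stays a unit test.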
Pull request overview
This PR adds a dedicated testing-end-to-end skill to the feature-dev-workflow plugin and wires it into the workflow at the “structurally complete” checkpoints, clarifying when end-to-end coverage is warranted and what it should (and should not) assert.
Changes:
- Added a new testing-end-to-end skill defining selection/timing principles for system-level tests (golden paths and consumer-visible branches, not edge cases).
- Wired testing-end-to-end into the two structural-completion checkpoints (developing-a-feature Step 5 and reviewing-feature-progress Step 6) and added a pointer from testing-a-feature.
- Documented the new skill in the README and the project CLAUDE template tables.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| templates/project-CLAUDE.md | Adds the new skill to the workflow “which skill owns which part” table. |
| skills/testing-end-to-end/SKILL.md | Introduces the new end-to-end testing guidance skill. |
| skills/testing-a-feature/SKILL.md | Adds a boundary pointer to the new end-to-end skill for whole-flow coverage. |
| skills/reviewing-feature-progress/SKILL.md | Invokes testing-end-to-end at the integrated-branch verification checkpoint. |
| skills/developing-a-feature/SKILL.md | Invokes testing-end-to-end at the single-PR structural completion checkpoint. |
| README.md | Adds testing-end-to-end to the top-level skills list. |
Problem
End-to-end tests had no home in the workflow. testing-a-feature covers the assertion shape for a single surface, but nothing said when a whole-flow test is warranted, what it should assert, or, critically, what it should not. Left unaddressed, agents over-cover: a baseline run asked to scope e2e for a complete feature produced 8 e2e tests, promoting edge cases into the suite on the rationalization "it's only real when the full stack runs."
Change
A standalone testing-end-to-end skill, plus the wiring to invoke it at the right moment.
Wired into both structural-completion points (developing-a-feature Step 5, reviewing-feature-progress Step 6), with a boundary pointer from testing-a-feature and rows in the README and project-CLAUDE tables.
Developed with superpowers:writing-skills (RED → GREEN)
🤖 Generated with Claude Code