Merged
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "propulsion",
"version": "0.9.2",
"version": "0.10.0",
"description": "Compact workflow skills for agentic coding in OpenCode.",
"homepage": "https://github.com/moonpixels/propulsion#readme",
"bugs": {
33 changes: 15 additions & 18 deletions skills/tdd/SKILL.md
@@ -6,13 +6,13 @@ description: Build observable behaviour with one failing test at a time through

# TDD

Prove observable behaviour with a failing test before writing production code. Use red-green-refactor to build confidence in the change and keep the code clean.
Default to red-green-refactor when a valuable behavioural test exists. Do not fabricate brittle tests when work cannot be proven through a public interface or stable seam.

## Prerequisites

ALL prerequisites MUST be true before following this skill.

- The work includes a change to observable user-facing behaviour, a public contract, or durable business logic that can be proven through testing.
- The work includes a change to observable user-facing behaviour, a public contract, or durable business logic.
- The codebase has a test framework installed, and tests can be run locally.
- The work is not solely for CI-only changes, linting, formatting, dependency maintenance, build or development script changes, repo hygiene, or internal refactors with no behaviour change.

@@ -23,41 +23,38 @@ If the work mixes behaviour change with tooling or maintenance updates, use `tdd
Follow these steps IN ORDER. Do NOT skip steps.

1. Choose the smallest thin vertical slice that delivers one observable behaviour end-to-end.
2. Write one test for that single behaviour through a public interface or stable seam. Refer to [references/testing-patterns.md](references/testing-patterns.md) for guidance.
3. Run the test and verify it fails for the expected reason.
4. Write the smallest amount of production code to pass the test.
5. Run the test again and verify it now passes.
6. Repeat steps 1-5 for the next behaviour, building on the previous code, until the work is complete.
7. Refactor only while all tests are green, and verify tests remain green after refactor. Refer to [references/refactor-candidates.md](references/refactor-candidates.md) for guidance.
8. For bug fixes, write a regression test that reproduces the bug before fixing it, then verify the test passes after the fix.
2. Apply the test-quality/applicability gate in [references/testing-patterns.md](references/testing-patterns.md) before writing or keeping a test.
3. If a valuable behavioural test exists, write one failing test for that behaviour through a public interface or stable seam, verify it fails for the expected reason, implement the smallest passing code, then verify it passes.
4. If no valuable behavioural test exists, document the no-test rationale, run the strongest appropriate fallback verification, implement the smallest change, then rerun fallback verification.
5. Repeat for the next behaviour until complete; refactor only while tests or fallback checks are green. Refer to [references/refactor-candidates.md](references/refactor-candidates.md).
6. For bug fixes, prefer a regression test that reproduces the bug; if none is valuable, document why and use the strongest fallback verification.
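
A minimal sketch of one red-green pass through step 3, assuming a hypothetical `priceFor` function and a tiny hand-rolled `check` helper in place of a real test framework. None of these names come from the skill; they only illustrate verifying the failure reason before writing the smallest passing code.

```typescript
// Hypothetical slice: "orders of 100 or more get a 10% discount".
// All names here are illustrative, not part of the skill.
type Check = { passed: boolean; message: string };

// Tiny stand-in for a test runner so the sketch has no dependencies.
function check(run: () => void): Check {
  try {
    run();
    return { passed: true, message: "" };
  } catch (error) {
    return {
      passed: false,
      message: error instanceof Error ? error.message : String(error),
    };
  }
}

function expectEqual(actual: number, expected: number): void {
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

// Red: the public function exists but does not implement the behaviour yet.
let priceFor = (subtotal: number): number => subtotal;

// One test for one observable behaviour, written against the public function.
const discountTest = (): void => expectEqual(priceFor(200), 180);

const red = check(discountTest);

// Green: the smallest change that makes the same test pass.
priceFor = (subtotal: number): number =>
  subtotal >= 100 ? subtotal - subtotal / 10 : subtotal;

const green = check(discountTest);

console.log(red.passed, red.message); // false expected 180, got 200
console.log(green.passed); // true
```

The failure message is what confirms the test fails for the expected reason: the behaviour is missing, not the test setup.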

## Rules

These rules are MANDATORY.

- ONLY use `tdd` on observable user-visible behaviour or business logic changes.
- NO PRODUCTION CODE BEFORE A FAILING TEST.
- NO production code before a failing test WHEN a valuable behavioural test exists.
- ALWAYS write ONE test at a time for ONE observable behaviour.
- ENSURE the test initially fails for the EXPECTED reason before writing production code.
- ONLY write the minimal amount of code to make the test pass.
- ALWAYS use the public interface for testing, and test through stable seams if necessary.
- NEVER write speculative tests or code for behaviour that is not yet required.
- NEVER write speculative, brittle, implementation-detail, or private-structure tests.
- ALWAYS document no-test rationale plus fallback verification when no valuable behavioural test exists.
- ALWAYS look for refactor opportunities AFTER the test is green.

## Completion Gate

Do NOT leave this skill until ALL items are complete.

- [ ] Work was implemented in thin vertical slices.
- [ ] Each slice started with a failing test.
- [ ] Each failing test was verified to fail for the expected reason.
- [ ] Each slice was completed with passing tests.
- [ ] Refactors only happened from green and remained green.
- [ ] All tests for the work are now passing.
- [ ] Each slice passed the test-quality/applicability gate.
- [ ] Each testable slice started with a failing test that failed for the expected reason.
- [ ] Untestable slices documented no-test rationale and strongest appropriate fallback verification.
- [ ] Each slice was completed with passing tests or fallback checks.
- [ ] Where possible, refactors were applied after the tests were green.

## References

Use these references when you need detail.

- [references/testing-patterns.md](references/testing-patterns.md) - Testing patterns for guidance on how to write effective tests.
- [references/refactor-candidates.md](references/refactor-candidates.md) - Refactor candidates to identify good opportunities for refactor after the tests are green.
29 changes: 29 additions & 0 deletions skills/tdd/references/testing-patterns.md
@@ -2,6 +2,8 @@

Use this reference during red-green. Choose tests that prove behaviour, not today's implementation.

**Refactor-safe tests are the standard:** good tests keep passing when internals are rewritten but observable behaviour stays the same. Test through public interfaces or stable seams with domain meaning, and avoid assertions about private helpers, call order, source shape, or other implementation details.
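
As a rough illustration of a refactor-safe test, the sketch below runs one behavioural check against two interchangeable implementations. The `slugify` functions are hypothetical and exist only to show that the check does not care how the result is produced.

```typescript
// Hypothetical public interface: turn a title into a URL slug.
type Slugify = (title: string) => string;

// First implementation: split on whitespace and join with hyphens.
const slugifyV1: Slugify = (title) =>
  title.trim().toLowerCase().split(/\s+/).join("-");

// Rewritten internals: a single regex replace. Observable behaviour is unchanged.
const slugifyV2: Slugify = (title) =>
  title.trim().toLowerCase().replace(/\s+/g, "-");

// The behavioural check only knows about inputs and outputs.
function slugBehaviourHolds(slugify: Slugify): boolean {
  return slugify("  Hello Big World ") === "hello-big-world";
}

console.log(slugBehaviourHolds(slugifyV1), slugBehaviourHolds(slugifyV2)); // true true
```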

## Default Move

Start with the highest-level public interface that proves the behaviour cheaply.
@@ -95,6 +97,8 @@ The seam should still represent behaviour another part of the system could reaso

## Anti-Patterns

Reject tests that only inspect source strings, private structure, implementation details, brittle snapshots, or speculative behaviour. These are not acceptable substitutes for behavioural coverage; use fallback verification instead when no valuable behavioural test exists.

### Implementation-detail tests

These tests fail when refactoring changes structure without changing behaviour.
@@ -119,6 +123,20 @@ test('sends audit event after saving', async () => {

Prefer a result that matters to a caller, such as the user being created and an audit entry being visible through a supported query.
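
One possible shape for that kind of check, assuming a hypothetical in-memory `UserService` with supported queries for users and audit entries. The class and method names are invented for illustration and are not taken from this repository.

```typescript
// Hypothetical in-memory service; class and method names are illustrative.
type AuditEntry = { action: string; userId: string };

class UserService {
  private users: Record<string, string> = {};
  private audit: AuditEntry[] = [];

  createUser(id: string, name: string): void {
    this.users[id] = name;
    this.audit.push({ action: "user.created", userId: id });
  }

  // Supported queries: the only things the check is allowed to look at.
  getUserName(id: string): string | undefined {
    return this.users[id];
  }

  auditEntriesFor(userId: string): AuditEntry[] {
    return this.audit.filter((entry) => entry.userId === userId);
  }
}

const service = new UserService();
service.createUser("u1", "Ada");

// Assert on outcomes a caller can observe, not on which collaborator was called.
const createdName = service.getUserName("u1");
const auditActions = service.auditEntriesFor("u1").map((entry) => entry.action);

console.log(createdName, JSON.stringify(auditActions)); // Ada ["user.created"]
```

Because nothing here inspects call order or mock invocations, the audit mechanism could move to a queue or a different module without breaking the check.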

### Source-string and private-structure tests

Do not read source files as strings or inspect private modules, hidden fields, AST shape, CSS class names, hook order, folder layout, or helper presence to prove behaviour.

- Bad: asserting a file contains `aria-label` or calls `useMemo()`.
- Better: render the UI and query the accessible control, or verify the public API result.

### Brittle snapshots

Do not use broad snapshots for behaviour changes. Snapshots that mostly capture markup, class churn, generated IDs, timestamps, or component structure fail on harmless refactors.

- Bad: snapshotting an entire page to prove a button opens a menu.
- Better: interact as a user and assert the menu content is visible.
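
The difference can be sketched without any UI library. The `renderMenu` view model below is hypothetical; it stands in for rendered output that carries a generated ID and a class name alongside the behaviour under test.

```typescript
// Hypothetical view model for a menu; not tied to any UI library.
type MenuView = { id: string; className: string; menuItems: string[] };

let nextId = 0;

function renderMenu(open: boolean, className: string): MenuView {
  nextId += 1;
  return {
    id: `menu-${nextId}`, // generated ID changes on every render
    className,
    menuItems: open ? ["Profile", "Sign out"] : [],
  };
}

const firstRender = renderMenu(true, "menu menu--v1");
const secondRender = renderMenu(true, "menu menu--v2"); // harmless styling refactor

// Broad snapshot: compares everything, so ID and class churn break it.
const snapshotMatches =
  JSON.stringify(firstRender) === JSON.stringify(secondRender);

// Targeted assertion: only the outcome a user would notice.
const menuShowsSignOut = secondRender.menuItems.indexOf("Sign out") !== -1;

console.log(snapshotMatches, menuShowsSignOut); // false true
```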

### Over-mocking

If most of the test is mock setup, the test is probably proving that the mocks agree with each other.
@@ -195,6 +213,17 @@ Ask these questions before keeping a test:

If any answer is "no" or "I am not sure", simplify the test before proceeding.

If no valuable behavioural test remains, do not keep a weak test. Document why no new test was written and run the strongest appropriate fallback verification, such as an existing related test suite, typecheck, lint, build, CLI smoke check, manual reproduction, or browser check.

## Frontend Guidance

For UI behaviour changes, prefer user-level tests that render the UI, interact through accessible controls, and assert visible or announced outcomes.

- Good: click "Save" and assert the success toast appears.
- Bad: assert a component state setter was called or a specific class name exists.

For visual-only changes or UI states that are hard to cover with valuable automated tests, prefer Playwright or browser verification when available. Capture the no-test rationale and the browser checks performed.

## Red-Green Heuristics

When choosing the next test: