docs: add unit testing guidance to swamp-extension-model skill#926
Merged
Add testing guidance to the extension model skill now that @systeminit/swamp-testing is published on JSR.

- Add "Unit Testing" section to SKILL.md with concise example and progressive disclosure link to reference
- Add references/testing.md with createModelTestContext options, inspection helpers, CRUD lifecycle testing, injectable client pattern, and extracted function pattern
- Add reference link in the References table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
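The "injectable client pattern" named in this commit can be illustrated in general terms. The following is a minimal sketch of the pattern only; it does not use `@systeminit/swamp-testing`, and every name in it (`BucketClient`, `createBucket`, `FakeBucketClient`) is hypothetical and chosen for illustration.

```typescript
// Hypothetical names throughout: BucketClient, createBucket, and
// FakeBucketClient are illustrative and not part of @systeminit/swamp-testing.

// The narrow interface the logic depends on, instead of a concrete SDK client.
interface BucketClient {
  create(name: string): Promise<{ id: string }>;
}

// The logic under test receives its client as an argument,
// so a test can inject a fake in place of the real service.
async function createBucket(
  client: BucketClient,
  name: string,
): Promise<{ id: string; name: string }> {
  if (name.trim() === "") {
    throw new Error("bucket name must not be empty");
  }
  const { id } = await client.create(name);
  return { id, name };
}

// An in-memory fake that records each call for later inspection.
class FakeBucketClient implements BucketClient {
  calls: string[] = [];

  create(name: string): Promise<{ id: string }> {
    this.calls.push(name);
    return Promise.resolve({ id: `fake-${this.calls.length}` });
  }
}
```

Because `createBucket` only sees the interface, a test can pass `new FakeBucketClient()` and assert on both the returned value and the recorded `calls`, with no network access.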
73f5db4 to 9f87be0
Remove the standard Zod types table — Claude already knows Zod's type
system. Keep only the swamp-specific modifier (.meta({ sensitive: true })).
Saves ~16 lines of context window.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- swamp-vault (86%→higher): Trim security best practices Claude already knows, condense interactive prompt explanation, narrow generic trigger terms (remove 'password', 'encrypt', etc.)
- swamp-issue (86%→higher): Trim Requirements/Troubleshooting sections for gh CLI (Claude knows this), add explicit numbered workflow
- swamp-report (89%→higher): Trim redundant "When to Create" section, add end-to-end workflow with validation checkpoints
- swamp-data (89%→higher): Move Data Concepts section (lifetime types, tags, version GC) to references/concepts.md for progressive disclosure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- swamp-issue (90%→100%): Move formatting guidelines to references/formatting.md, remove troubleshooting for gh CLI
- swamp-data (90%→100%): Move JSON output shapes to references/output-shapes.md, trim search examples from 14 to 3, clarify platform context in description, remove generic trigger terms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fa01cac to 4beaa5e
With EVAL_RUNS=1, every query is all-or-nothing: a single instance of LLM variance causes a failure even when the trigger terms are correct. Bumping to 3 runs with the default 0.5 threshold means a query passes if it triggers at least 2/3 times, dramatically reducing flakiness.

Doubled workers from 25 to 50 to keep wall-clock time similar despite tripling the total tasks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
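The pass rule described in this commit can be sketched as follows. The eval harness itself is not shown in this PR, so `queryPasses` is a hypothetical helper, and the "greater than or equal to" comparison against the threshold is an assumption (with 1 or 3 runs and a 0.5 threshold, strict and non-strict comparison give the same result).

```typescript
// Hypothetical helper: the real eval harness is not shown in this PR.
// Assumes a query passes when its trigger rate is at least the threshold.
function queryPasses(triggered: boolean[], threshold = 0.5): boolean {
  if (triggered.length === 0) return false; // no runs, nothing to pass
  const hits = triggered.filter(Boolean).length;
  return hits / triggered.length >= threshold;
}

// One run: a single miss fails the query outright.
const singleRun = queryPasses([false]);

// Three runs: one miss is tolerated (2/3 is above 0.5),
// while two misses still fail (1/3 is below 0.5).
const oneMissOfThree = queryPasses([true, false, true]);
const twoMissesOfThree = queryPasses([true, false, false]);
```

Here `singleRun` is `false`, `oneMissOfThree` is `true`, and `twoMissesOfThree` is `false`, which matches the "at least 2/3 times" behaviour the commit message describes.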
CI Security Review
The only workflow change in this PR is in .github/workflows/ci.yml (lines 163-164), modifying two hardcoded environment variable values:
- EVAL_RUNS: "1" → "3"
- EVAL_WORKERS: "25" → "50"
All other changed files are .claude/skills/ documentation (SKILL.md and reference markdown files) — no workflow or security impact.
Checklist Results
- Prompt Injection: No changes to LLM prompts or tool scoping. N/A.
- Expression Injection: No new expression interpolation introduced. N/A.
- Dangerous Triggers: No trigger changes. N/A.
- Supply Chain: No action additions or version changes. N/A.
- Permissions: No permission changes. N/A.
- Secret Exposure: No new secret usage. N/A.
- Auto-merge & Trust Boundaries: No changes to merge logic. N/A.
Verdict
PASS — Security-neutral change. Only modifies eval concurrency parameters (runs and workers) with hardcoded integer values.
Code Review
Blocking Issues
None.
Suggestions
- The `swamp-issue/SKILL.md` refactoring removed troubleshooting tips (gh CLI auth, editor wait issues) that weren't moved to `references/formatting.md`. This is minor since it's generic gh CLI knowledge, but worth noting in case users relied on those hints. Consider adding a `references/troubleshooting.md` if the skill had frequent questions about these.
- The CI eval config change (EVAL_RUNS 1→3, EVAL_WORKERS 25→50) is bundled with the docs PR. This is fine for now, but it could be its own commit for clearer git history.
Summary
- Add a "Unit Testing" section to the `swamp-extension-model` skill with a concise `createModelTestContext` example
- Add `references/testing.md` with the full testing guide (options, inspection helpers, CRUD lifecycle, injectable client pattern)

This follows the skill-creator progressive disclosure pattern: SKILL.md has the essential example and links to the reference file for details.

Depends on #925 (merged), which published `@systeminit/swamp-testing` to JSR.

Test plan
🤖 Generated with Claude Code