A Claude Code skill for front-end QA and pre-deploy hardening. It runs real tools against a production build of a website or landing page and proves every finding with tool output — no vibes, no guessed line numbers.
Tool-gated. Every issue ShipShape reports is something a tool measured or a probe confirmed. It never invents findings, and it names the tool + rule for each one so you can re-run it to verify the fix.
| Area | Tool(s) |
|---|---|
| Performance / Core Web Vitals | Lighthouse |
| Accessibility | axe-core (via Playwright) + Lighthouse |
| SEO / best practices | Lighthouse |
| Broken links | linkinator / lychee |
| HTML validity | html-validate |
| Security headers | MDN HTTP Observatory |
| Secrets in source | gitleaks + trufflehog + semgrep p/secrets + a grep backstop |
| XSS / dangerous sinks | semgrep |
| Responsive layout | Playwright multi-breakpoint capture (375 / 768 / 1440) with a hard horizontal-overflow gate |
Front-end scope only — not backend/API or smart-contract testing.
Clone it straight into your Claude Code skills directory:
git clone https://github.com/sasonov/shipshape ~/.claude/skills/shipshapeClaude loads the skill from SKILL.md automatically. Ask it to "QA this landing
page before launch", "is my site production-ready?", "run Lighthouse and tell
me what to fix", etc., and the skill triggers.
The individual tools are invoked on demand (via npx / Docker) — see
references/tools.md for the exact pinned versions and
commands. Nothing is installed globally by the skill itself.
SKILL.md the skill itself (workflow + triggering description)
references/
tools.md every tool: pinned version, command, how to read output
ai-failure-modes.md failure modes this skill is built to avoid
scripts/
detect-stack.sh detect the project's framework / build command
verdict.sh aggregate all tool JSON into one machine-readable verdict
assets/
responsive-shots.spec.js Playwright multi-breakpoint capture + overflow gate
report-template.md the worklist/report format the skill fills in
evals/
trigger-eval.json 20-query set measuring description trigger accuracy
README.md how to re-run the description optimizer
After a run, scripts/verdict.sh . aggregates each tool's JSON output into
shipshape-results.json — one gate per check (pass / fail / skipped) plus
an overall PASS / FAIL / INCOMPLETE. It exits non-zero on FAIL, so it can
gate CI. A missing tool input reads as skipped, never as green — a partial run
can't masquerade as a pass.
MIT — see LICENSE.