[STG-NEW] Add ui-test skill for adversarial UI testing#56
Builds on #52 with three key additions:

1. Local/remote mode selection: localhost uses local browser (no API key), deployed sites use Browserbase via cookie-sync for authenticated testing
2. Diff-driven testing: analyze git diff, generate targeted tests for what changed, execute with before/after snapshot comparison
3. Structured assertion protocol: STEP_PASS/STEP_FAIL markers with evidence, deterministic checks (axe-core, console errors, overflow detection), and adversarial testing patterns (XSS, empty submit, rapid click, keyboard-only)

Smoke-tested against a local Next.js app: found real bugs (Escape not closing modals, undersized mobile touch targets) that confirmed the adversarial patterns work.

Fixed browse eval recipes (no top-level await, console capture on-page not about:blank).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
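To make the assertion protocol concrete, here is a minimal sketch of how a harness might collect the markers from agent output. It assumes the pipe-delimited shapes quoted in the PR summary (`STEP_PASS|id|evidence` and `STEP_FAIL|id|expected → actual`); the function names `parseStepMarkers` and `summarize` are hypothetical and not part of the skill.

```javascript
// Parse assertion markers of the form
//   STEP_PASS|id|evidence
//   STEP_FAIL|id|expected → actual
// Lines that are not markers are ignored.
function parseStepMarkers(output) {
  const results = [];
  for (const rawLine of output.split("\n")) {
    const line = rawLine.trim();
    const match = /^(STEP_PASS|STEP_FAIL)\|([^|]+)\|(.*)$/.exec(line);
    if (!match) continue;
    const [, marker, id, detail] = match;
    if (marker === "STEP_PASS") {
      results.push({ id, passed: true, evidence: detail });
    } else {
      // Assumption: expected and actual are separated by the first " → ".
      const arrow = detail.indexOf(" → ");
      const expected = arrow === -1 ? detail : detail.slice(0, arrow);
      const actual = arrow === -1 ? "" : detail.slice(arrow + 3);
      results.push({ id, passed: false, expected, actual });
    }
  }
  return results;
}

// Count passes and failures across all parsed steps.
function summarize(results) {
  const failed = results.filter((r) => !r.passed).length;
  return { total: results.length, passed: results.length - failed, failed };
}
```

Keeping the evidence as the last field means it can itself contain `|` characters without breaking the parse, which may matter when evidence is a snippet of page text.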
…ssions

Enables concurrent test execution by leveraging browse CLI's --session flag to spin up independent Browserbase browsers per test group, with fan-out via Agent tool and merged result reporting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Documents how to add Bash(browse:*) to project or user settings so users don't get prompted on every browse snapshot/click/eval.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t figure it out

- Remove .ui-tests/suite.yml format and generation pipeline
- Replace Workflow B (8-step codebase analysis) with lightweight exploratory testing
- Simplify references/codebase-analysis.md to quick hints (framework detection, route finding)
- Remove example YAML suite file
- Update README to reflect no-artifacts philosophy
- Drop Write tool from allowed-tools (no files to generate)

The codegen/suite approach can ship as v2 later.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- XSS check: replace false-positive inline script count with input value check
- Console capture: preserve original console.error in Example 6 snippets
- Form labels: use native i.labels API in browser-recipes.md (matches SKILL.md)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 3 potential issues.
### Form structure

```bash
browse eval "JSON.stringify(Array.from(document.querySelectorAll('form')).map(f => ({ action: f.action, inputs: Array.from(f.querySelectorAll('input,select,textarea')).map(i => ({ name: i.name, type: i.type, required: i.required, hasLabel: !!i.labels?.length })) })))"
```
Inconsistent hasLabel check misses aria-label inputs
Medium Severity
The form structure recipe in SKILL.md computes hasLabel as !!i.labels?.length, which only checks for associated <label> elements. The same recipe in browser-recipes.md correctly uses !!i.labels?.length || !!i.getAttribute('aria-label'), also covering aria-label attributes. Since SKILL.md is the primary instruction file and tells the agent "any false = accessibility FAIL," inputs that use aria-label instead of <label> will produce false accessibility failures.
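One way to keep the two recipes from drifting is to express the label check once and reuse it. The sketch below mirrors the `browser-recipes.md` expression quoted above; the helper names `hasLabel` and `describeForm` are hypothetical, and the check still ignores other naming mechanisms such as `aria-labelledby`, which neither recipe covers.

```javascript
// True when a control has an associated <label> or a non-empty aria-label.
// `input` is a form control in the page; any object exposing `labels` and
// `getAttribute` works, which keeps the helper testable outside a browser.
function hasLabel(input) {
  return Boolean(input.labels?.length) || Boolean(input.getAttribute("aria-label"));
}

// Summarize one form in the same shape as the recipe's output.
function describeForm(form) {
  return {
    action: form.action,
    inputs: Array.from(form.querySelectorAll("input,select,textarea")).map((i) => ({
      name: i.name,
      type: i.type,
      required: i.required,
      hasLabel: hasLabel(i),
    })),
  };
}
```

In the page this would be called as `JSON.stringify(Array.from(document.querySelectorAll('form')).map(describeForm))`, so an input named only by `aria-label` no longer reports `hasLabel: false`.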
# Step 2: Wait for script to load, then run audit
# (wait 2-3 seconds for the script to load)
browse eval "axe.run().then(r => JSON.stringify({ violations: r.violations.map(v => ({ id: v.id, impact: v.impact, description: v.description, nodes: v.nodes.length, help: v.helpUrl })), passes: r.passes.length, incomplete: r.incomplete.length }))"
axe-core recipe lacks wait between load and run
Low Severity
The axe-core recipes across all three files inject the script via eval and immediately call axe.run() on the next line. There's only a comment ("wait 2-3 seconds") but no actual browse wait timeout command between them. Since browser-recipes.md is framed as "copy-paste recipes," the missing wait can cause a ReferenceError on axe if the script hasn't finished loading. A browse wait timeout 3000 between the two evals would match the pattern used elsewhere (e.g., the responsive screenshot sweep).
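As an alternative to a fixed `browse wait timeout 3000`, the audit expression could poll until `axe` is defined. This is a different technique from the one suggested above, sketched here under the assumption that `browse eval` resolves a returned promise (as the existing `.then()` recipe implies); `waitFor` is a hypothetical helper, not something the skill ships.

```javascript
// Resolve with the first truthy value returned by `check()`,
// or reject once `timeoutMs` has elapsed.
function waitFor(check, timeoutMs = 3000, intervalMs = 50) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const tick = () => {
      const value = check();
      if (value) return resolve(value);
      if (Date.now() - startedAt >= timeoutMs) {
        return reject(new Error("timed out waiting for condition"));
      }
      setTimeout(tick, intervalMs);
    };
    tick();
  });
}

// In the page, the audit could then be chained without top-level await:
//   waitFor(() => window.axe)
//     .then((axe) => axe.run())
//     .then((r) => JSON.stringify({ violations: r.violations.length }))
```

Polling fails loudly with a timeout error instead of a `ReferenceError`, and it returns as soon as the script is ready rather than always waiting the full interval.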
browse screenshot /tmp/explore-home.png

# Console health check
browse eval "JSON.stringify({errors: (window.__capturedErrors || []).length})"
Console error variable name mismatch with injection recipe
Medium Severity
Example 8's console health check reads from window.__capturedErrors, but every console capture injection recipe across all files (SKILL.md, EXAMPLES.md Example 6, browser-recipes.md, exploratory-testing.md) stores errors in window.__logs. Since window.__capturedErrors is never defined anywhere, the fallback || [] ensures it always reports {errors: 0} — silently hiding any console errors the agent was supposed to detect.
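A rough sketch of keeping the injection and the health check on the same variable follows. It stores entries in `__logs` as the injection recipes do and preserves the original `console.error`, as an earlier commit in this PR describes; the entry shape (`level`, `message`) and the function names are assumptions for illustration, not the skill's actual recipe.

```javascript
// Wrap console.error so each call is recorded in win.__logs
// and still forwarded to the original console.error.
function installConsoleCapture(win) {
  if (win.__logs) return win.__logs; // already installed, avoid double-wrapping
  win.__logs = [];
  const originalError = win.console.error;
  win.console.error = function (...args) {
    // Assumed entry shape: { level, message }.
    win.__logs.push({ level: "error", message: args.map(String).join(" ") });
    return originalError.apply(win.console, args);
  };
  return win.__logs;
}

// Health check that reads the same variable the capture writes to.
function consoleHealth(win) {
  const errors = (win.__logs || []).filter((entry) => entry.level === "error");
  return JSON.stringify({ errors: errors.length });
}
```

In the page, `win` would be `window`. Because both functions name `__logs` in one place, a mismatch like the `__capturedErrors` one above cannot silently report zero errors.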
Also strengthens auto-select rule: localhost → browse env local, deployed URLs → browse env remote, applied consistently across all workflows including parallel sessions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
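The auto-select rule can be sketched as a small pure function. The commit only names localhost versus deployed URLs; treating `127.0.0.1` and `[::1]` as local too is an assumption here, and `selectBrowseEnv` and `envCommand` are hypothetical names.

```javascript
// Assumption: these hostnames count as "local"; the PR itself only names localhost.
const LOCAL_HOSTNAMES = new Set(["localhost", "127.0.0.1", "[::1]"]);

// Map a target URL to the browse environment it should run in.
function selectBrowseEnv(targetUrl) {
  const { hostname } = new URL(targetUrl);
  return LOCAL_HOSTNAMES.has(hostname) ? "local" : "remote";
}

// Build the corresponding CLI invocation (browse env local / browse env remote).
function envCommand(targetUrl) {
  return `browse env ${selectBrowseEnv(targetUrl)}`;
}
```

Deriving the environment from the URL in one place is what lets the same rule apply across every workflow, including each parallel session.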
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


Summary
`ui-test` skill for AI-powered adversarial UI testing via the `browse` CLI

What's new vs #52

- `browse env local`, no API key needed
- `git diff` → targeted tests for what changed
- `STEP_PASS|id|evidence` / `STEP_FAIL|id|expected → actual`
- `about:blank` injection (broken)
- `browse eval` await fix: `await` (broken) → `.then()` (working)

Files
Test plan
🤖 Generated with Claude Code
Note
Low Risk
Low risk: this PR adds new Markdown-based skill documentation and examples without changing runtime application code or existing behaviors.
Overview
Adds a new `ui-test` skill under `skills/ui-test/` that defines a structured, evidence-based UI testing workflow using the `browse` CLI, including diff-driven, exploratory, and parallel (multi-session) testing modes.

Includes extensive worked examples (`EXAMPLES.md`), deterministic check recipes (axe-core, console/resource errors, responsive overflow/touch targets), and supporting reference/heuristics docs, plus an MIT `LICENSE.txt` and top-level `README.md` for installation and usage.

Written by Cursor Bugbot for commit 4b04ef0.