[STG-NEW] Add ui-test skill for adversarial UI testing by shrey150 · Pull Request #56 · browserbase/skills

shrey150 · 2026-03-26T06:52:08Z

Summary

Adds ui-test skill for AI-powered adversarial UI testing via the browse CLI
Builds on #52 (Add ui-test skill for agentic UI testing): keeps the UX heuristics, browser recipes, codebase analysis, and exploratory testing references; adds local/remote mode selection, diff-driven testing, structured assertions, and adversarial patterns
Smoke-tested against a local Next.js app — found real bugs (Escape not closing modals, undersized mobile touch targets)

What's new vs #52

| Feature | #52 | This PR |
|---|---|---|
| Local browser for localhost | No (always Browserbase) | Yes, via `browse env local`; no API key needed |
| Cookie-sync for remote auth | Mentioned but not wired up | Full workflow with examples |
| Diff-driven testing | No (full suite only) | `git diff` → targeted tests for what changed |
| Assertion protocol | Freeform | `STEP_PASS\|id\|evidence` / `STEP_FAIL\|id\|expected → actual` |
| Before/after comparison | No | Snapshot before act, snapshot after, compare trees |
| Adversarial patterns | No | XSS, empty submit, rapid click, keyboard-only, focus trap |
| EXAMPLES.md | No | 8 examples with exact commands and expected output |
| Console capture fix | `about:blank` injection (broken) | On-page injection (working) |
| `browse eval` await fix | Uses `await` (broken) | Uses `.then()` (working) |
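
To make the assertion protocol row concrete, here is a minimal sketch of how a report step might parse those markers. It assumes the markers appear one per line with unescaped `|` separators (the backslashes in the table are only Markdown escaping) and that a failure detail has the shape `expected → actual`. The names `parseStepLine` and `summarize` are hypothetical and not part of the skill.

```javascript
// Hypothetical helper: parse one line of the STEP_PASS / STEP_FAIL protocol.
// Returns null for any line that is not a protocol marker.
function parseStepLine(line) {
  const [status, id, ...rest] = line.trim().split("|");
  if (status !== "STEP_PASS" && status !== "STEP_FAIL") return null;
  if (!id) return null;

  // Evidence text may itself contain "|", so rejoin the remainder.
  const detail = rest.join("|");
  if (status === "STEP_PASS") {
    return { status: "pass", id, evidence: detail.trim() };
  }

  // Assumed failure shape: "expected → actual".
  const arrow = detail.indexOf("→");
  if (arrow === -1) {
    return { status: "fail", id, expected: detail.trim(), actual: "" };
  }
  return {
    status: "fail",
    id,
    expected: detail.slice(0, arrow).trim(),
    actual: detail.slice(arrow + 1).trim(),
  };
}

// Hypothetical helper: count passes and failures in a block of agent output.
function summarize(output) {
  const steps = output.split("\n").map(parseStepLine).filter(Boolean);
  return {
    passed: steps.filter((s) => s.status === "pass").length,
    failed: steps.filter((s) => s.status === "fail").length,
    steps,
  };
}
```

Lines that do not start with a marker are ignored, so the markers can be mixed with freeform agent commentary.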

Files

skills/ui-test/
├── SKILL.md                              # Skill definition (478 lines)
├── EXAMPLES.md                           # 8 worked examples with assertions
├── LICENSE.txt                           # MIT
├── README.md                             # Overview (from #52)
├── rules/ux-heuristics.md               # 6 evaluation frameworks (from #52)
├── references/
│   ├── browser-recipes.md               # Deterministic check recipes (fixed)
│   ├── codebase-analysis.md             # 8-step suite generation (from #52)
│   └── exploratory-testing.md           # Agent-driven QA guide (from #52)
└── examples/
    └── browserbase-dashboard-suite.yml   # Example suite (from #52)

Test plan

Smoke tested component rendering (before/after snapshot comparison)
Smoke tested form validation (happy path + adversarial: empty, XSS, long input, keyboard-only)
Smoke tested modal lifecycle (open, cancel, escape, confirm, focus trap)
Smoke tested axe-core accessibility audit (deterministic violation count)
Smoke tested responsive screenshots + deterministic overflow/touch-target checks
Smoke tested console error capture (on-page injection pattern)
Smoke tested remote Browserbase mode with API key
Found 2 real bugs in test app confirming adversarial patterns work

🤖 Generated with Claude Code

Note

Low Risk
Low risk: this PR adds new Markdown-based skill documentation and examples without changing runtime application code or existing behaviors.

Overview
Adds a new ui-test skill under skills/ui-test/ that defines a structured, evidence-based UI testing workflow using the browse CLI, including diff-driven, exploratory, and parallel (multi-session) testing modes.

Includes extensive worked examples (EXAMPLES.md), deterministic check recipes (axe-core, console/resource errors, responsive overflow/touch targets), and supporting reference/heuristics docs, plus an MIT LICENSE.txt and top-level README.md for installation and usage.

Written by Cursor Bugbot for commit 4b04ef0.

Builds on #52 with three key additions:

1. Local/remote mode selection: localhost uses local browser (no API key), deployed sites use Browserbase via cookie-sync for authenticated testing
2. Diff-driven testing: analyze git diff, generate targeted tests for what changed, execute with before/after snapshot comparison
3. Structured assertion protocol: STEP_PASS/STEP_FAIL markers with evidence, deterministic checks (axe-core, console errors, overflow detection), and adversarial testing patterns (XSS, empty submit, rapid click, keyboard-only)

Smoke-tested against a local Next.js app: found real bugs (Escape not closing modals, undersized mobile touch targets) that confirmed the adversarial patterns work. Fixed browse eval recipes (no top-level await, console capture on-page not about:blank).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

skills/ui-test/examples/browserbase-dashboard-suite.yml

skills/ui-test/EXAMPLES.md

skills/ui-test/references/browser-recipes.md

skills/ui-test/EXAMPLES.md

skills/ui-test/references/codebase-analysis.md

…ssions Enables concurrent test execution by leveraging browse CLI's --session flag to spin up independent Browserbase browsers per test group, with fan-out via Agent tool and merged result reporting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Documents how to add Bash(browse:*) to project or user settings so users don't get prompted on every browse snapshot/click/eval. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…t figure it out

- Remove .ui-tests/suite.yml format and generation pipeline
- Replace Workflow B (8-step codebase analysis) with lightweight exploratory testing
- Simplify references/codebase-analysis.md to quick hints (framework detection, route finding)
- Remove example YAML suite file
- Update README to reflect no-artifacts philosophy
- Drop Write tool from allowed-tools (no files to generate)

The codegen/suite approach can ship as v2 later.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- XSS check: replace false-positive inline script count with input value check
- Console capture: preserve original console.error in Example 6 snippets
- Form labels: use native i.labels API in browser-recipes.md (matches SKILL.md)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.


cursor · 2026-03-26T20:00:48Z

skills/ui-test/SKILL.md

+### Form structure
+
+```bash
+browse eval "JSON.stringify(Array.from(document.querySelectorAll('form')).map(f => ({ action: f.action, inputs: Array.from(f.querySelectorAll('input,select,textarea')).map(i => ({ name: i.name, type: i.type, required: i.required, hasLabel: !!i.labels?.length })) })))"


Inconsistent hasLabel check misses aria-label inputs

Medium Severity

The form structure recipe in SKILL.md computes hasLabel as !!i.labels?.length, which only checks for associated <label> elements. The same recipe in browser-recipes.md correctly uses !!i.labels?.length || !!i.getAttribute('aria-label'), also covering aria-label attributes. Since SKILL.md is the primary instruction file and tells the agent "any false = accessibility FAIL," inputs that use aria-label instead of <label> will produce false accessibility failures.
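
The combined check that the comment attributes to browser-recipes.md can be sketched as a small predicate. This is an illustration rather than the skill's actual code: `hasAccessibleLabel` is a hypothetical name, and it covers only the two signals named above (an associated `<label>` and an `aria-label` attribute), not other naming mechanisms such as `aria-labelledby`.

```javascript
// Hypothetical predicate mirroring the expression quoted in the review:
//   !!i.labels?.length || !!i.getAttribute('aria-label')
// `input` is expected to behave like a form control element.
function hasAccessibleLabel(input) {
  // True when at least one <label> element is associated with the control.
  const hasLabelElement = !!(input.labels && input.labels.length);
  // True when a non-empty aria-label attribute is present.
  const hasAriaLabel = !!input.getAttribute("aria-label");
  return hasLabelElement || hasAriaLabel;
}
```

Inside a `browse eval` expression the same logic would run against real DOM nodes, for example by mapping it over `document.querySelectorAll('input,select,textarea')`.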

Additional Locations (1)

skills/ui-test/references/browser-recipes.md#L169-L170

cursor · 2026-03-26T20:00:48Z

skills/ui-test/references/browser-recipes.md

+
+# Step 2: Wait for script to load, then run audit
+# (wait 2-3 seconds for the script to load)
+browse eval "axe.run().then(r => JSON.stringify({ violations: r.violations.map(v => ({ id: v.id, impact: v.impact, description: v.description, nodes: v.nodes.length, help: v.helpUrl })), passes: r.passes.length, incomplete: r.incomplete.length }))"


axe-core recipe lacks wait between load and run

Low Severity

The axe-core recipes across all three files inject the script via eval and immediately call axe.run() on the next line. There's only a comment ("wait 2-3 seconds") but no actual browse wait timeout command between them. Since browser-recipes.md is framed as "copy-paste recipes," the missing wait can cause a ReferenceError on axe if the script hasn't finished loading. A browse wait timeout 3000 between the two evals would match the pattern used elsewhere (e.g., the responsive screenshot sweep).
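
A fixed delay is the fix suggested above. As an alternative sketch, the page-side code could poll until the `axe` global exists before calling `axe.run()`. The helper name `waitFor` and its default timings are assumptions made for illustration, and whether `browse eval` waits on a promise built this way should be verified against the CLI (the recipes in this PR do rely on `.then()` chains).

```javascript
// Hypothetical polling helper: resolves with the first truthy value returned
// by `check`, or rejects once `timeoutMs` has elapsed.
function waitFor(check, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const tick = () => {
      let value;
      try {
        value = check();
      } catch (err) {
        reject(err);
        return;
      }
      if (value) {
        resolve(value);
        return;
      }
      if (Date.now() >= deadline) {
        reject(new Error("waitFor: timed out"));
        return;
      }
      setTimeout(tick, intervalMs);
    };
    tick();
  });
}
```

In the page this might be used as `waitFor(() => window.axe).then(() => axe.run())`, which replaces the guessed 2-3 second pause with an explicit readiness check and a clear timeout error.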

Additional Locations (2)

skills/ui-test/SKILL.md#L392-L395

skills/ui-test/EXAMPLES.md#L273-L276

cursor · 2026-03-26T20:00:48Z

skills/ui-test/EXAMPLES.md

+browse screenshot /tmp/explore-home.png
+
+# Console health check
+browse eval "JSON.stringify({errors: (window.__capturedErrors || []).length})"


Console error variable name mismatch with injection recipe

Medium Severity

Example 8's console health check reads from window.__capturedErrors, but every console capture injection recipe across all files (SKILL.md, EXAMPLES.md Example 6, browser-recipes.md, exploratory-testing.md) stores errors in window.__logs. Since window.__capturedErrors is never defined anywhere, the fallback || [] ensures it always reports {errors: 0} — silently hiding any console errors the agent was supposed to detect.
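
The mismatch is easier to see with the writer and the reader side by side. The sketch below assumes the injection recipe wraps `console.error` and appends to `window.__logs`, as the comment describes; the entry shape and the function names `installConsoleCapture` and `consoleHealth` are hypothetical. Taking the window object as a parameter keeps the sketch testable outside a browser.

```javascript
// Hypothetical injection step: record console.error calls in win.__logs while
// still forwarding them to the original console.error.
function installConsoleCapture(win) {
  if (win.__logs) return win.__logs; // already installed; avoid double-wrapping
  win.__logs = [];
  const originalError = win.console.error;
  win.console.error = function (...args) {
    win.__logs.push({ level: "error", message: args.map(String).join(" ") });
    return originalError.apply(win.console, args);
  };
  return win.__logs;
}

// Hypothetical health check: reads the SAME variable the injection writes to.
// Reading any other name (such as __capturedErrors) would always report 0.
function consoleHealth(win) {
  return JSON.stringify({ errors: (win.__logs || []).length });
}
```

Because the `|| []` fallback hides an undefined variable, keeping the variable name in a single shared snippet is one way to prevent this class of silent zero.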

Also strengthens auto-select rule: localhost → browse env local, deployed URLs → browse env remote, applied consistently across all workflows including parallel sessions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
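
The auto-select rule as worded in this commit message could be sketched as a small URL check. The function name `browseEnvFor` is hypothetical, and treating `127.0.0.1` and `[::1]` as local alongside `localhost` is an assumption that the commit message does not state.

```javascript
// Assumed set of loopback hostnames; the source only names "localhost".
const LOCAL_HOSTS = new Set(["localhost", "127.0.0.1", "[::1]"]);

// Hypothetical helper: choose the browse env command for a target URL,
// following the rule "localhost → local, deployed URLs → remote".
function browseEnvFor(targetUrl) {
  const { hostname } = new URL(targetUrl);
  return LOCAL_HOSTS.has(hostname) ? "browse env local" : "browse env remote";
}
```

For example, `browseEnvFor("http://localhost:3000/settings")` selects the local browser, while a deployed URL falls through to the remote branch.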

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor bot reviewed Mar 26, 2026


shubh24 and others added 4 commits March 26, 2026 12:41

Add permission setup docs to avoid approval fatigue on browse commands

3a51067


cursor bot reviewed Mar 26, 2026


shubh24 and others added 2 commits March 26, 2026 13:32

Simplify env rule: just say localhost → local, don't prescribe remote

4b04ef0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

shrey150 wants to merge 7 commits into main from shrey/ui-test-skill


2 participants
