CLI-first QA testing tool for deterministic web testing with Playwright.
Write tests in YAML. Run them from the terminal. Get screenshots, traces, and reports automatically.
# .prowl/hunts/login-flow.yml
name: login-flow
steps:
- navigate: "/login"
- fill:
"Email": "{{TEST_EMAIL}}"
- fill:
"Password": "{{TEST_PASSWORD}}"
- click: "Sign In"
- assert:
visible: "Dashboard" ● Running hunt: login-flow
✓ navigate "/login" (120ms)
✓ fill "Email" (85ms)
✓ fill "Password" (62ms)
✓ click "Sign In" (340ms)
✓ assert visible "Dashboard" (15ms)
PASS login-flow (622ms) 5/5 steps
Artifacts: .prowl/runs/2026-02-09_10-30-45
npm install -g prowl-toolsOr with Homebrew:
brew tap prowl-tools/tap
brew install prowlProwl uses Playwright under the hood. Install the browser:
npx playwright install chromiumcd your-project
prowl initThis creates a .prowl/ directory with a config file and 8 example hunts:
.prowl/
├── config.yml # Target URL, browser settings, guardrails
└── hunts/
├── homepage.yml # Basic page load smoke test
├── login-flow.yml # Email/password authentication
├── signup-flow.yml # Registration with validation
├── form-submit.yml # Form fill and submit
├── form-validation.yml # Validation errors and resubmit
├── crud-cycle.yml # Create, read, update, delete lifecycle
├── checkout-flow.yml # E-commerce checkout
└── onboarding-wizard.yml # Multi-step SaaS onboarding
Edit .prowl/config.yml to point at your app:
target:
url: "http://localhost:3000"Edit .prowl/hunts/homepage.yml or create a new file:
name: smoke-test
steps:
- navigate: "/"
- wait: "Welcome"
- assert:
visible: "Sign In"
assertions:
- noConsoleErrors: trueprowl run smoke-testThat's it. You're testing.
Prowl supports both shorthand and explicit syntax for most step types. Shorthand is concise and readable. Explicit gives you full control over selectors.
Navigate to a URL (relative to your target.url or absolute).
- navigate: "/"
- navigate: "/login"
- navigate: "https://example.com/page"Click an element. Shorthand finds buttons by text, then falls back to any matching text.
# Shorthand — finds by button role, then text
- click: "Sign In"
# Explicit — use any Playwright selector
- click:
selector: "[data-testid='submit-btn']"Fill an input field. Shorthand finds inputs by label or placeholder text.
# Shorthand — finds by label, then placeholder
- fill:
"Email": "user@example.com"
# Explicit — use any selector
- fill:
selector: "input[name='email']"
value: "user@example.com"Type into the currently focused element. Useful after clicking into a field.
- click: "Message"
- type: "Hello, I have a question."Press a keyboard key on a specific element.
- press:
selector: "input[name='search']"
key: "Enter"Select a dropdown value. Shorthand finds by label, explicit uses a selector.
# Shorthand — finds <select> by label, aria-label, or placeholder
- select:
"State": "FL"
# Explicit
- selectOption:
selector: "select[name='state']"
value: "FL"Mid-flow assertions. Fails the hunt immediately if the assertion fails.
- assert:
visible: "Welcome back"
- assert:
notVisible: "Error"
- assert:
urlIncludes: "/dashboard"
- assert:
urlEquals: "https://example.com/dashboard"Conditionally execute steps based on whether a selector is visible or not visible. This step is explicit-only.
Key fields:
visibleornotVisible(exactly one)then(required array of steps)else(optional array of steps)
- if:
visible: ".cookie-banner"
then:
- click: ".accept"
else:
- wait: "Welcome back"Repeat a block of steps either a fixed number of times or while a selector condition is true. This step is explicit-only.
Key fields:
times(fixed count) orwhile(condition), exactly onewhile.visibleorwhile.notVisible(when usingwhile)maxIterations(required withwhile)steps(required array of steps to execute each iteration)
# Fixed count
- repeat:
times: 3
steps:
- click: ".load-more"
# Condition-based loop
- repeat:
while:
visible: ".load-more"
maxIterations: 10
steps:
- click: ".load-more"Mock network responses for a URL pattern. This step is explicit-only.
Key fields:
url(Playwright route pattern, e.g.**/api/users)response.status- exactly one of
response.bodyorresponse.file - optional
response.contentType(defaults toapplication/json)
- mockRoute:
url: "**/api/users"
response:
status: 200
body: '{"users":[{"id":1}]}'Remove a previously registered route mock. This step is explicit-only.
Key fields:
url(must match the mocked route URL pattern)
- unmockRoute:
url: "**/api/users"Wait for text to appear on the page. Shorthand for waitForSelector with text matching.
# Simple — wait for text with default timeout
- wait: "Loading complete"
# With custom timeout
- wait:
for: "Loading complete"
timeout: 10000Wait for any Playwright selector to appear.
- waitForSelector:
selector: "[data-testid='results-table']"
timeout: 5000Wait for the URL to contain a substring.
- waitForUrl:
value: "/dashboard"
timeout: 10000Wait for all network requests to complete.
- waitForNetworkIdle:
timeout: 5000Handle browser-native dialogs (alert, confirm, prompt). Register the handler before the action that triggers the dialog.
- onDialog:
action: accept # or "dismiss"
- click: "Delete" # this triggers the confirm dialogSet files on <input type="file"> elements. Paths are relative to .prowl/.
# Single file
- setInputFiles:
selector: "[data-testid='avatar-upload']"
files: "fixtures/avatar.png"
# Multiple files
- setInputFiles:
selector: "[data-testid='attachments']"
files:
- "fixtures/doc1.pdf"
- "fixtures/doc2.pdf"Execute another hunt file inline. Enables reusable sub-flows like login.
# Simple — run the hunt as-is
- runHunt: "login-flow"
# With variable overrides
- runHunt:
name: "login-flow"
vars:
EMAIL: "admin@test.com"
PASSWORD: "{{ADMIN_PASSWORD}}"Circular dependencies are detected automatically (A → B → A will error).
Capture a screenshot at any point.
- screenshot:
name: "after-login"Use assert steps anywhere in your hunt for mid-flow checks:
- assert:
visible: "Welcome" # Text must be visible on page
- assert:
notVisible: "Error" # Text must NOT be visible
- assert:
urlIncludes: "/dashboard" # Current URL must contain string
- assert:
urlEquals: "https://..." # Current URL must match exactlyRun after all steps complete:
assertions:
- selectorExists: "h1" # Element must exist
- selectorNotExists: ".error-banner" # Element must NOT exist
- urlIncludes: "/dashboard"
- urlEquals: "https://example.com/"
- noConsoleErrors: true # No console.error messages
- noNetworkErrors: true # No HTTP responses >= 400Config lives at .prowl/config.yml. All options with defaults:
# The base URL for all hunt navigation
target:
url: "http://localhost:3000" # Required
# Browser settings
browser:
headless: true # false = show the browser window
slowMo: 0 # ms delay between actions (debugging)
timeout: 30000 # default page operation timeout
# What gets saved per run
artifacts:
screenshots: "on-failure" # "on-failure" or "all"
networkHar: false # save network activity as HAR
console: true # save browser console output
# Hunt-level assertions (applied to every hunt)
assertions:
noConsoleErrors: true # fail on console.error
noNetworkErrors: true # fail on HTTP >= 400
maxTotalTimeMs: 30000 # max total time for all steps
networkIgnorePatterns: [] # URL substrings to ignore
# Safety guardrails
guardrails:
maxSteps: 50 # max steps per hunt
allowedDomains: # only navigate to these domains
- "localhost"
- "127.0.0.1"
forbiddenSelectors: # selectors that steps cannot use
- "[data-danger]"
- ".delete-btn"
# Auth state from `prowl login`
auth:
storageStatePath: ".prowl/auth-state.json"
# Run history retention
history:
maxRuns: 100 # keep last N runs per huntforbiddenSelectorsandassertions.networkIgnorePatternsboth use JavaScriptincludes()for case-sensitive substring matching. A pattern of"Delete"matches"Delete History", but"delete"does not. Specific selectors like".delete-btn"also match".undelete-btn"because the substring is present, so prefer exact-enough patterns instead of broad fragments.allowedDomainsis enforced only forhttp:andhttps:navigations. Theabout:anddata:protocols (for example,about:blank) bypass the allowlist by design so hunts can interact with browser-internal pages.- Migration note: If an older config relied on lowercase patterns like
"delete"matching uppercase text such as"Delete History", update the pattern to the exact case present in the selector or URL. Apply the same review toforbiddenSelectors,assertions.networkIgnorePatterns, and anyallowedDomainsassumptions aboutabout:ordata:URLs.
Set guardrails.selfHealing: true (default false) to let Prowl recover when an
explicit selector stops matching — for example after a markup change renames
#sign-in-btn. When such a selector matches nothing, Prowl derives the intent from
the selector text and tries, in order:
- Fuzzy text — an element containing the selector's words (e.g. "sign in")
- ARIA label — an element whose
aria-labelcontains those words - Structural — an interactive element (
button,a,input, …) containing the text
It heals only to a candidate that matches exactly one element — it never guesses among multiple. A heal is logged as a warning and recorded in the run report:
result.json: the step gains ahealedFromfieldsummary.md: a Self-Healed Selectors section listsoriginal → healed
Healing applies to action steps (click, fill, selectOption, setInputFiles,
press, hover, scrollTo) and is meant as a safety net — update your hunt to a stable
selector (ideally a data-testid) when you see a heal. waitForSelector is excluded,
since a not-yet-present element is its normal state.
Use {{VAR_NAME}} to inject dynamic values into your hunts.
- Hunt vars — defined in the hunt's
vars:block (highest priority) - Environment variables — from
process.env .envfile — from.prowl/.env(loaded automatically)
# .prowl/hunts/login-flow.yml
vars:
EMAIL: "{{TEST_EMAIL}}" # References env var TEST_EMAIL
TIMEOUT: "5000" # Static value
steps:
- fill:
"Email": "{{EMAIL}}" # Resolves to the value of TEST_EMAILCreate .prowl/.env for secrets:
TEST_EMAIL=user@example.com
TEST_PASSWORD=secret123Any fill or type step whose value came from a {{VAR}} interpolation is automatically redacted in reports:
# In summary.md and result.json:
fill "[data-testid='email']" → [REDACTED]
This prevents credentials from leaking into artifacts, CI logs, or screenshots.
Every shorthand has an explicit equivalent. Use shorthand for readability, explicit for precision.
| Shorthand | Explicit Equivalent |
|---|---|
click: "Sign In" |
click: { selector: 'button:has-text("Sign In")' } |
fill: { "Email": "val" } |
fill: { selector: 'input[placeholder="Email"]', value: "val" } |
type: "text" |
fill: { selector: ':focus', value: "text" } |
select: { "State": "FL" } |
selectOption: { selector: 'select[name="state"]', value: "FL" } |
wait: "Welcome" |
waitForSelector: { selector: 'text="Welcome"' } |
runHunt: "login" |
runHunt: { name: "login" } |
Prowl uses Playwright's selector engine. For stable, maintainable selectors:
-
data-testid(best) — explicit test hooks that don't change with UI refactors- click: { selector: "[data-testid='submit']" }
-
Accessible roles — semantic and resilient to styling changes
- click: { selector: "role=button[name='Submit']" }
-
Labels/placeholders — via shorthand, Prowl resolves these automatically
- fill: { "Email": "user@test.com" }
-
Text content — via shorthand click, good for buttons and links
- click: "Sign In"
-
CSS selectors (last resort) — fragile, avoid class names that change
- click: { selector: ".btn-primary" } # Avoid if possible
For hunts that require authentication, use prowl login to capture browser state:
prowl loginThis opens a headed Chromium window. Log in manually, then close the browser. Prowl saves cookies, localStorage, and sessionStorage to .prowl/auth-state.json.
All subsequent prowl run commands will load this auth state, so your hunts start already logged in.
No changes needed — auth state is loaded automatically from the path in config.yml:
auth:
storageStatePath: ".prowl/auth-state.json"If your session expires, run prowl login again to re-capture.
Every hunt run generates artifacts in .prowl/runs/<timestamp>/:
.prowl/runs/2026-02-09_10-30-45/
├── summary.md # Human-readable report
├── result.json # Machine-readable results
├── console.log # Browser console output
├── screenshots/
│ ├── final.png # Final page state
│ └── failure_step_3.png # Screenshot on failure (if any)
├── trace.zip # Playwright trace (if --trace)
└── network.har # Network activity (if networkHar: true)
npx playwright show-trace .prowl/runs/2026-02-09_10-30-45/trace.zipWhen a hunt hits a failing request (HTTP status ≥ 400), Prowl reads the response's
traceparent header, extracts the W3C trace ID, and records it. This lets you pivot
straight from a hunt failure to the matching distributed trace in your own
observability stack (Datadog, Grafana/Tempo, Jaeger, etc.).
The trace IDs appear in:
result.jsonunder atraceCorrelationsarray (url,status,traceId,header)summary.mdunder a Trace Correlations section
If your app uses a non-standard header, configure it in .prowl/config.yml:
tracing:
header: "x-request-id" # default: "traceparent"This is a correlation bridge only — Prowl does not generate or propagate its own spans. When the app emits no trace headers, nothing is recorded (no noise).
# Run a hunt
prowl run <hunt-name>
prowl run <hunt-name> --headed # Show browser window
prowl run <hunt-name> --trace # Capture Playwright trace
prowl run <hunt-name> --slow-mo 500 # Slow down actions (ms)
prowl run <hunt-name> --url <override> # Override target URL
prowl run <hunt-name> --config <path> # Custom config path
# Watch mode — re-runs on file changes
prowl watch <hunt-name>
# Auth — capture login state interactively
prowl login
# Initialize — create .prowl directory with examples
prowl init
prowl init --force # Overwrite existing
# List available hunts
prowl list
# CI mode — run all hunts with aggregate status
prowl ci
prowl ci --json # Machine-readable CI output
prowl ci --parallel 4 # Run hunts with 4 workers
# History — show past runs of a hunt
prowl history <hunt-name>
prowl history <hunt-name> --limit 50 # Show the last 50 runs (default: 20)
prowl history <hunt-name> --json # Machine-readable history output
# MCP server — expose Prowl to AI agents over stdio
prowl mcp
prowl mcp --projects ~/.prowl/projects.yml # Drive multiple repos via a registry--parallel <count> details:
- Runs hunts in parallel with
countworkers. - Must be a positive integer (
>= 1). - Invalid values (for example
0or1.5) fail fast with an argument error.
Every prowl run and prowl ci appends an entry to .prowl/history.json
with the hunt name, status, start time, duration, and run directory. Retention
is capped per hunt by history.maxRuns (default 100) — once a hunt exceeds the
cap, its oldest entries are dropped on the next write. Other hunts are not
affected.
# In .prowl/config.yml
history:
maxRuns: 50 # keep the last 50 runs per hunt (default: 100)Use prowl history <hunt-name> for a quick status/duration table, or
--json to feed the entries into dashboards, flake detectors, or agents.
When a prowl ci run has multiple failures that share a common cause — the same
step type, selector, and (normalized) error — Prowl groups them into a single
failure cluster. Instead of triaging five separate failures, you see one root
cause (for example, a renamed #submit selector that broke five hunts).
Clusters appear in:
- the Failure clusters section of the CI summary, with the cause and affected hunts
- a
clustersarray inci-result.json(andprowl ci --json), each entry withcause,stepType,selector,error,count, andhunts
Only causes shared by more than one hunt are reported as clusters. This pairs well with self-healing selectors and flake detection to cut triage time on large suites.
Prowl can run as an MCP server, exposing QA as a small set of named tools that any MCP-capable agent can call over stdio. The agent triggers runs and reads structured results through these tools — it never needs shell access to your repo.
Prerequisites:
prowlmust be on yourPATH. Install it globally withnpm install -g prowl-tools, or launch it throughnpx(use"command": "npx", "args": ["prowl-tools", "mcp"]in the client config below). If the binary can't be found, the MCP client fails to start the server with no hunt-specific error.- The target project must be initialized — a
.prowl/directory with a valid config and hunts. Runprowl initand author hunts first. Pointed at an uninitialized repo, MCP tool calls fail with a missing.prowl/config.ymlerror.
prowl mcpThis starts a stdio server for the current project (it discovers .prowl/ from
the working directory, exactly like the other commands). Point your MCP client at
it — for example:
{
"mcpServers": {
"prowl": {
"command": "prowl",
"args": ["mcp"],
"cwd": "/path/to/your/project"
}
}
}| Tool | Arguments | Returns |
|---|---|---|
list_hunts |
project? |
Hunt names in run order |
run_hunt |
hunt, project? |
The full RunResult for a single hunt |
run_suite |
includeTags?, excludeTags?, parallel?, logBugs?, project? |
Pass/fail/skip counts, the ci-result.json path, and the bug tickets created |
list_projects |
— | Registered projects (empty unless a registry is configured) |
Existing guardrails (allowedDomains, forbiddenSelectors, maxSteps,
maxTotalTimeMs) apply to every run the server triggers.
Controlling what the agent can do: Prowl exposes only these four tools and
never runs arbitrary shell. To restrict the agent further, allow-list tool names
in your MCP client (e.g. OpenClaw) config — for example, allow list_hunts and
run_suite but withhold run_hunt. That allow-listing is configured on the
agent/client side, not in Prowl.
run_suite runs every hunt and, by default, logs each failure as a deduplicated
bug ticket in the project's docs/backlog.md, under a ## QA Findings (automated)
section that stays separate from your hand-written items. A bug is identified by
hunt + failing step + normalized error, so:
- a brand-new failure creates a
QA-NNNticket with the hunt, failing step, error, and a link to the run artifacts; - a failure that already has an open ticket is left alone (no duplicates);
- a failure matching something already in
docs/resolved.mdis logged as a regression that references the old ticket id.
Pass logBugs: false to run without touching the backlog. A run_suite response
looks like this:
{
"status": "fail",
"totalHunts": 8,
"passed": 6,
"failed": 2,
"skipped": 0,
"resultPath": "/path/to/project/.prowl/runs/ci-2026-05-26_09-12-03-456/ci-result.json",
"bugs": {
"created": ["QA-014"],
"regressions": ["QA-015"],
"alreadyOpen": ["QA-009"],
"backlogPath": "/path/to/project/docs/backlog.md"
}
}status is one of pass, fail, no-hunts, or all-skipped. When
logBugs is false, bugs arrays are empty and backlogPath is null.
By default the server acts on the current directory. To drive several repos from a single server, give it a project registry — one YAML file that maps project names to repo roots. This file lives outside any repo (it spans many), not inside a target project:
# ~/.prowl/projects.yml
projects:
coupe:
root: /Users/you/projects/coupe
storefront:
root: /Users/you/projects/storefront
configPath: /custom/.prowl/config.yml # optional; defaults to <root>/.prowl/config.ymlThe registry is resolved in priority order:
prowl mcp --projects <path>- the
PROWL_PROJECTSenvironment variable ~/.prowl/projects.yml
With a registry loaded, every tool accepts an optional project argument that
selects which repo to act on, and list_projects enumerates what's available.
For example, these run_suite arguments run the smoke suite for coupe and log
any failures to coupe/docs/backlog.md:
Omit project and the tool falls back to the current directory. Naming a project
that isn't registered — or naming one when no registry is configured — returns a
clear error.
CLI Commands
│
├── Config Loader (YAML → Zod validation → merged defaults)
│ │
│ ├── Hunt Loader (YAML → schema validation → interpolation)
│ │
│ └── .env Loader (dotenv)
│
├── Runner
│ │
│ ├── Step Executor (16 step types, guardrail checks)
│ │
│ ├── Assertion Evaluator (6 assertion types)
│ │
│ └── Browser Controller (Playwright launch/close)
│
└── Reporter
│
├── summary.md (human-readable, redacted)
│
└── result.json (machine-readable)
Browse and contribute hunt templates through the internal community registry (contact ops for access).
Templates cover auth flows (OAuth, 2FA), e-commerce (Stripe), admin panels, SaaS patterns, and more. Each template is heavily commented and ready to customize.
Run prowl init in your project root to create the .prowl/ directory.
Add the domain to guardrails.allowedDomains in your config:
guardrails:
allowedDomains:
- "localhost"
- "your-domain.com"The selector matches a pattern in guardrails.forbiddenSelectors. Either change the selector or update the guardrails config.
The {{VAR_NAME}} in your hunt couldn't be resolved. Check:
- Is it defined in the hunt's
vars:block? - Is it set in your
.prowl/.envfile? - Is it set as an environment variable?
- Use
--headedand--slow-mo 1000to watch the browser in real time - Check if the element is inside an iframe
- Check if the element appears after a network request (add
waitForNetworkIdlebefore) - Use
--traceand view withnpx playwright show-tracefor detailed diagnostics
- Check
browser.timeoutin your config — lower it for faster failures - Add
waitForNetworkIdleonly where needed (it waits for ALL requests) - Use
waitForSelectorwith a specific element instead ofwaitForNetworkIdle
Apache 2.0 — see LICENSE
{ "project": "coupe", "includeTags": ["smoke"] }