test(regression): 100% E2E coverage for CLI configs (PER-8250) by AkashBrowserStack · Pull Request #2312 · percy/cli

AkashBrowserStack · 2026-06-23T10:19:19Z

Summary

Adds 100% E2E coverage for the percy snapshot CLI config surface
(PER-8250). The existing
test/regression/ harness covered only a subset of asset-discovery options;
this extends it to every non-excluded option across the percy, snapshot,
discovery, and project namespaces plus the cli-snapshot static/sitemap
and per-snapshot capture options.

Three complementary tracks (full per-option matrix in
test/regression/COVERAGE.md):

Config validation (config-validation.test.js) — token-free
percy snapshot --dry-run that loads fixtures setting every option and
asserts the CLI accepts them all (no Invalid config:). This is the
literal-100% backbone and gates every PR (including forks) without a token.
Includes a negative self-test proving the validator actually fires.
Visual (regression.test.js, new pages/config-*.html) — render-affecting
options reviewed as visual diffs: scope/scopeOptions, per-snapshot percyCSS
override, execute + additionalSnapshots, domTransformation,
discovery.scrollToBottom, disableShadowDOM, forceShadowAsLightDOM,
responsiveSnapshotCapture.
Functional (functional.test.js, gated server.js routes) — discovery
options asserted on what the test servers observed (robust to log-format
changes): requestHeaders, authorization, cookies, userAgent, captureSrcset,
disallowedHostnames.

CI: the regression job in test.yml now runs config validation token-free
always, plus visual + functional with PERCY_REGRESSION_TOKEN.

Scope — what "100% coverage" means here

"100%" = every CLI config option reachable via percy snapshot is exercised
end-to-end through the real CLI and validated — ~54 options across the
percy / snapshot / discovery / project namespaces plus the cli-snapshot
static/sitemap and per-snapshot capture options. Coverage is layered:

All ~54 options → loaded & schema-validated end-to-end (Track C, token-free;
asserts the CLI accepts each with no Invalid config:).
Render-affecting options → additionally verified by visual diff (Track V).
Discovery-behavior options → additionally verified by server-observed
assertions (Track F).

Non-visual options (e.g. concurrency, retry, sync, deferUploads,
project.*, readiness.* sub-options) are covered at the config-validity
level — the CLI accepts and loads them — not with bespoke runtime-behavior
tests, by design.

Explicitly excluded: the 6 onlyAutomate snapshot options — fullPage,
freezeAnimation, freezeAnimatedImage, freezeAnimatedImageOptions,
ignoreRegions, considerRegions — are unreachable via the percy snapshot
web flow and belong to the SDK/Automate path (PER-8249), so they are not
covered here. CLI flags (--allowed-hostname, --disable-cache, …) are 1:1
wrappers over these config values and are covered at the value level.

Test-only changes; no production code touched.

Test plan

yarn test:regression:config — passes locally, token-free (12 assertions
incl. negative self-test; 25 + 8 snapshots enumerated, no Invalid config:)
Gated server routes verified token-free (headers/auth/cookies/UA/srcset
recorded; auth route 401s without creds)
eslint test/regression/ clean
Visual track — verified by CI regression job (Percy dashboard review)
Functional track — verified by CI regression job (PERCY_REGRESSION_TOKEN)

Draft until CI confirms the two token-gated tracks and the visual build is
reviewed in the Percy dashboard.

🤖 Generated with Claude Code

Covers every non-excluded CLI config option (percy/snapshot/discovery/ project + cli-snapshot static/sitemap + per-snapshot capture options) via `percy snapshot --dry-run`, asserting the CLI loads & validates each with no "Invalid config:" output. Runs token-free and discovery-free, so it can gate every PR in CI. Includes a negative self-test fixture proving the validator detector fires. PER-8250

…s (Track V) New pages + snapshots.yml entries exercise scope/scopeOptions, per-snapshot percyCSS override, execute + additionalSnapshots, domTransformation, discovery.scrollToBottom, disableShadowDOM, forceShadowAsLightDOM, and responsiveSnapshotCapture. Visual diffs are reviewed via Percy's dashboard on token-gated runs; the suite skips cleanly without PERCY_TOKEN. Bumped Track C expected snapshot count 16 -> 25 to match. PER-8250

Adds gated server routes that record the headers/auth/cookies/user-agent Percy sends during discovery plus a cross-origin disallowed-hostname probe. functional.test.js runs one real snapshot and asserts on those server observations (robust to log-format changes) for discovery.requestHeaders, authorization, cookies, userAgent, captureSrcset and disallowedHostnames. Token-gated; skips cleanly without PERCY_TOKEN. Server-route behavior is verified token-free; the end-to-end Percy assertions run when a token is set. PER-8250

…cks in CI Adds COVERAGE.md (per-option E2E coverage matrix with explicit onlyAutomate exclusions), rewrites the regression README around the three tracks, and adds config-validation (token-free) + functional (token-gated) steps to the regression job in test.yml. PER-8250

- lib/percy-cli.js: justify spawn shell:true with a nosemgrep comment (static, test-controlled args — no injection surface; mirrors regression.test.js). - functional-discovery.html: load the cross-origin disallowed-host probe via @import instead of a <link> so it needs no Subresource Integrity hash (meaningless for a dynamic localhost test resource); discovery still attempts the request. PER-8250

…p OSS alert The nosemgrep comment suppressed spawn-shell-true for the workflow exit code, but GitHub code-scanning (Semgrep OSS) still surfaced the suppressed finding as an alert and failed the check. Eliminate it instead: npx resolves via PATH without a shell (this suite is Linux-only) and args are passed as an array, so shell:true was unnecessary and there is no command-injection surface. PER-8250

AkashBrowserStack · 2026-06-25T06:57:05Z

Claude Code PR Review

PR: #2312 • Head: fb9ffce • Reviewers: stack:code-reviewer

Summary

Test-only PR extending the test/regression/ E2E harness with three tracks — a token-free config-validation track (percy snapshot --dry-run over fixtures exercising every non-onlyAutomate CLI config option, with an intentional-invalid negative guard), a visual track (new render-affecting snapshots + pages), and a functional discovery track (a real run whose headers/auth/cookies/UA/srcset/disallowed-host behavior is asserted against what gated local test servers observed). Adds COVERAGE.md, two yarn scripts, and wires all three into the regression job in test.yml.

Review Table

Priority	Category	Check	Status	Notes
High	Security	No hardcoded secrets or credentials	Pass	`percy:secret` / `abc123` are local-server fixture stubs, not real creds
High	Security	Authentication/authorization checks present	N/A	Test harness; no production auth code
High	Security	Input validation and sanitization	Pass	New `/gated/*` + `/disallowed-probe.css` routes are exact-match static responders; existing path-traversal guard retained
High	Security	No IDOR — resource ownership validated	N/A	No resource ownership in test fixtures
High	Security	No SQL injection (parameterized queries)	N/A	No SQL
High	Correctness	Logic is correct, handles edge cases	Pass	Assertions map to verified core behavior (schema, dry-run skipDiscovery, Fetch.failRequest abort); minor vacuous-pass risks noted below
High	Correctness	Error handling is explicit, no swallowed exceptions	Pass	Servers stopped in `finally`; reviewer/runner errors captured
High	Correctness	No race conditions or concurrency issues	Pass	Observation state reset per run; servers bound/closed deterministically
Medium	Testing	New code has corresponding tests	Pass	This is test code; strong coverage incl. an explicit negative-guard for the validator
Medium	Testing	Error paths and edge cases tested	Pass	Negative-guard fixture; a couple of soft assertions (see Low findings)
Medium	Testing	Existing tests still pass (no regressions)	Pass	Test-only change; CI 46/46 green
Medium	Performance	No N+1 queries or unbounded data fetching	Pass	Config track is fast dry-runs; functional is a single capture
Medium	Performance	Long-running tasks use background jobs	N/A	Not applicable to a test harness
Medium	Quality	Follows existing codebase patterns	Pass	Matches harness style; one duplicate-spawn consistency gap (see Findings)
Medium	Quality	Changes are focused (single concern)	Pass	Focused regression-coverage PR
Low	Quality	Meaningful names, no dead code	Pass	Clean, well-commented
Low	Quality	Comments explain why, not what	Pass	Good rationale comments throughout
Low	Quality	No unnecessary dependencies added	Pass	No new dependencies

Findings

File: test/regression/regression.test.js:36
Severity: Medium
Reviewer: stack:code-reviewer
Issue: The pre-existing visual-track runner still uses its own inline spawn('npx', …, { shell: true }) and was not refactored onto the new shared runPercy helper. Result: two spawn implementations in the same dir with opposite shell settings; the "no command-injection surface" rationale for dropping shell:true doesn't hold for the most prominent file. Not exploitable (args are static arrays), hence Medium.
Suggestion: Make regression.test.js import and use runPercy from lib/percy-cli.js, deleting its inline spawn and shell:true.
File: test/regression/config-validation.test.js (the expectCount: 25 cases)
Severity: Medium
Reviewer: stack:code-reviewer
Issue: Cases 1/2/4 hard-code expectCount: 25 against snapshots.yml, which is also the visual track's file. Adding any future visual snapshot there (the documented way to add visual coverage) silently breaks the config track with a confusing cross-track count mismatch.
Suggestion: Point the config cases at a dedicated fixture not shared with the visual track, or assert count >= expected for the shared file while keeping the exact-count assertion only on the focused per-snapshot-options.yml. The real signal (absence of Invalid config:) should remain.
File: test/regression/lib/percy-cli.js:33 (snapshotCount regex)
Severity: Low
Reviewer: stack:code-reviewer
Issue: Core logs Found 1 snapshot (singular) for a single snapshot, but the regex /Found (\d+) snapshots/ only matches the plural → returns null for a 1-snapshot fixture. Not triggered today (counts are 25/8) but a latent bug for any future single-snapshot fixture.
Suggestion: Relax to /Found (\d+) snapshot/.
File: test/regression/functional.test.js:54 (captureSrcset assertion)
Severity: Low
Reviewer: stack:code-reviewer
Issue: The <img src="/gated/srcset-1x.png"> means the 1x route is hit by src regardless of captureSrcset, so obs.srcset.includes('1x') isn't evidence of the feature. Only '2x' is the real signal.
Suggestion: Assert specifically on '2x' (or point src at a non-candidate URL) so the assertion can't pass on the src alone.
File: test/regression/functional.test.js:61 (disallowed-probe assertion)
Severity: Low
Reviewer: stack:code-reviewer
Issue: disallowedProbeRequested === false would also hold if the cross-origin @import were never attempted for an unrelated reason; there's no positive control proving the probe is hit when the host is allowed, so a future regression that stops discovering @import resources could pass vacuously.
Suggestion: Acceptable tradeoff for E2E; optionally add a one-off allowed-host control or a comment noting the limitation.
File: .github/workflows/test.yml:138-187 (regression job)
Severity: Low
Reviewer: stack:code-reviewer
Issue: The job is timeout-minutes: 15 and now runs config + visual + functional (the latter two do real captures/uploads sequentially). If visual already approaches the budget, adding functional risks intermittent timeouts.
Suggestion: Monitor wall-clock and bump the timeout if needed.

Informational (no action required): per-snapshot scope/waitForSelector/readySelectors in per-snapshot-options.yml are schema-validated only (Track C is --dry-run, which skips discovery/capture) — correct by design, just not proof the options work. New harness code uses rmSync/??/?., all available on the pinned Node 14.14+.

Schema & behavioral claims were cross-checked against packages/core/src/config.js, packages/cli-snapshot/src/config.js, and core network.js/percy.js: all fixture options are valid and in-range; the onlyAutomate exclusion list matches the schema exactly; snapshot counts (25, 8) are correct; the disallowed-host request is genuinely aborted before reaching the server; dry-run is token-free.

Verdict: PASS — No High/Critical issues. Two Medium findings (adopt the shared runPercy helper in regression.test.js; decouple the config-track count from the shared snapshots.yml) are worth addressing but do not block merge.

pranavz28 · 2026-06-25T11:53:35Z

Coverage audit: ~3% of reachable options not yet exercised

Verified the fixtures against the schema source of truth (packages/core/src/config.js, packages/cli-snapshot/src/config.js) — ~145 of ~150 reachable options are covered, and the onlyAutomate exclusions are correctly justified. Three small gaps remain before "100% of every option" is literally true. All are test-only, non-blocking follow-ups:

1. regions[].elementSelector.elementXpath is never set — config.js:290
The schema accepts three selector forms (boundingBox, elementCSS, elementXpath), but the fixtures only use elementCSS (all-config.yml:80, per-snapshot-options.yml:86) and boundingBox (all-config.yml:90). The elementXpath form is unexercised.
Fix: add one region using elementSelector.elementXpath to all-config.yml (no snapshot-count change → Track C still asserts 25).

2. algorithm: layout enum value is never set — config.js:303 (snapshot) / config.js:321 (region)
The enum is ['standard','layout','ignore','intelliignore'], but fixtures only use standard, intelliignore, and ignore. The layout value is never validated.
Fix: set one algorithm: layout (e.g. a region or the top-level snapshot.algorithm) in all-config.yml.

3. Server mode (serve / port) is not exercised — config.js:831-832
The suite runs list-mode and --base-url mode; the core /snapshot/server schema reached via percy snapshot <dir> isn't covered by any fixture. (Borderline — arguably out of scope for "percy snapshot config," but it is reachable.)
Fix (optional): a minimal serve/port fixture, or document it as an explicit non-goal in COVERAGE.md alongside the onlyAutomate exclusions.

Items 1–2 are two-line additions that make the literal-100% claim accurate; item 3 is optional. Great PR overall — the Track C additionalProperties: false validation genuinely proves every fixture option is schema-recognized.

…free) Reviewer feedback: the regression project's builds showed only 1 snapshot and read as baseline-only. Root cause: the functional track was a second `percy snapshot` invocation that uploaded its own 1-snapshot build on the same commit, superseding the 25-snapshot visual build. Fix: run the functional track with `--debug` (skipUploads). Discovery still runs — the browser fetches every gated resource so the servers verify the headers/auth/cookies/user-agent/srcset/disallowed-host behavior — but no Percy build is created. This makes the track token-free (now runs on every PR like the config track) and leaves the visual track as the sole build creator, so the PR's single build carries all visual snapshots and compares against the master baseline as a proper head build. Also tightened the captureSrcset assertion to the discriminating 2x candidate. PER-8250

…ayout, serve) Addresses reviewer's coverage audit: 1. regions[].elementSelector.elementXpath — add a region using the xpath selector form (all-config.yml); previously only elementCSS + boundingBox. 2. algorithm: layout — the same new region uses the layout enum value, which no fixture/region exercised before. 3. server mode (serve) — add a static-site/ fixture covered by a new Track C case (percy snapshot <dir> --dry-run --clean-urls). The /snapshot/server 'port' option is not reachable via the CLI (no --port flag; not set for directories), so it's documented as a non-goal in COVERAGE.md. Track C now 5 cases; all-config.yml stays at 25 snapshots. PR-2312 review follow-up. PER-8250

AkashBrowserStack · 2026-06-25T14:00:27Z

@pranavz28 thanks for the careful audit — all three gaps are now closed in b04f42f (test-only):

regions[].elementSelector.elementXpath — added a region using the elementXpath selector form in all-config.yml (alongside the existing elementCSS and boundingBox regions). No snapshot-count change → Track C still asserts 25.
algorithm: layout — that same new region uses algorithm: layout, so the enum value is now validated (the suite already covered standard, intelliignore, and ignore).
Server mode (serve) — added a minimal static-site/ fixture plus a Track C case percy snapshot static-site --dry-run --clean-urls (2 snapshots), which exercises the /snapshot/server path. port is not reachable via percy snapshot — there's no --port flag on the command and it isn't in the static config namespace (the command only sets serve/cleanUrls/baseUrl for a directory; see packages/cli-snapshot/src/snapshot.js), so it's a programmatic/SDK-only option. I've documented it as an explicit non-goal in COVERAGE.md alongside the onlyAutomate exclusions rather than fake-exercise it.

COVERAGE.md updated to reflect all three region selector forms, the full algorithm enum (incl. layout), and the serve vs port distinction. That makes the literal-100% claim accurate for every CLI-reachable option. Appreciate the thorough review! 🙏

AkashBrowserStack · 2026-06-29T07:05:54Z

Claude Code PR Review

PR: #2312 • Head: b04f42f • Reviewers: stack:code-reviewer

Summary

Re-review of the test-only PR adding three regression tracks (config-validation, visual, functional discovery) to test/regression/. The harnesses are well-constructed, assertions are non-vacuous and verify genuine CLI behavior, fixtures are schema-faithful, and all prior review findings are genuinely resolved (verified against the actual core source, not just claimed). No High/Critical issues.

Review Table

Priority	Category	Check	Status	Notes
High	Security	No hardcoded secrets or credentials	Pass	`password: secret` / `abc123` are local-server fixture stubs
High	Security	Authentication/authorization checks present	N/A	Test harness; no production auth code
High	Security	Input validation and sanitization	Pass	Gated routes are exact-match static responders; `serveStaticFile` traversal guard intact
High	Security	No IDOR — resource ownership validated	N/A	No resource ownership in fixtures
High	Security	No SQL injection (parameterized queries)	N/A	No SQL
High	Correctness	Logic is correct, handles edge cases	Pass	Assertions match verified core behavior (skipUploads≠skipDiscovery, disallowed-host abort path, CDP-applied headers)
High	Correctness	Error handling is explicit, no swallowed exceptions	Pass	Servers stopped in try/finally; runner errors captured
High	Correctness	No race conditions or concurrency issues	Pass	Observation state reset per run; deterministic server bind/close
Medium	Testing	New code has corresponding tests	Pass	This is test code; 3 tracks + negative-guard self-test
Medium	Testing	Error paths and edge cases tested	Pass	`invalid-example.yml` proves the validator fires; auth route 401s without creds
Medium	Testing	Existing tests still pass (no regressions)	Pass	Test-only; CI green
Medium	Performance	No N+1 queries or unbounded data fetching	Pass	dry-runs + `--debug` discovery; servers preload assets
Medium	Performance	Long-running tasks use background jobs	N/A	Not applicable to a test harness
Medium	Quality	Follows existing codebase patterns	Pass	Shared `runPercy` helper; matches harness style
Medium	Quality	Changes are focused (single concern)	Pass	Focused regression-coverage PR
Low	Quality	Meaningful names, no dead code	Pass	Clear naming, well-commented
Low	Quality	Comments explain why, not what	Pass	Good rationale comments + COVERAGE.md/README
Low	Quality	No unnecessary dependencies added	Pass	No new dependencies

Findings

Prior findings — all verified resolved:

lib/percy-cli.js — spawn-shell-true gone (array-arg spawn, no shell:true).
pages/functional-discovery.html — missing-integrity gone (cross-origin probe via @import, not <link>).
Functional track --debug/skipUploads — verified correct in core/percy.js (skipUploads does NOT imply skipDiscovery; uploads gated on !skipUploads): discovery runs, servers observe, no build created, token-free.
Coverage gaps — elementXpath + algorithm: layout region added; static-site/ + Track C case exercise serve; port documented as not-CLI-reachable.

Remaining (all Low / nits, none blocking):

File: test/regression/regression.test.js:36 · Low · Pre-existing inline spawn(..., { shell: true }) not refactored onto the shared runPercy helper. Out of scope — this file is not in the PR's diff — but a follow-up could move it onto lib/percy-cli.js to clear the latent semgrep pattern.
File: test/regression/lib/percy-cli.js (snapshotCount) · Low · Regex /Found (\d+) snapshots/ misses the singular Found 1 snapshot form. No impact today (all counts > 1); tighten to /Found (\d+) snapshots?/ for future single-snapshot fixtures.
File: test/regression/config-validation.test.js (expectCount) · Low · Counts are coupled to the shared snapshots.yml/per-snapshot-options.yml; verified accurate (25 / 8). Intentional (catches dropped snapshots) and documented in COVERAGE.md.
File: test/regression/functional.test.js:36 · Low (nit) · --verbose is redundant alongside --debug (the debug flag already raises log level). Harmless.

Verdict: PASS — Clean, well-engineered test-only PR; all prior findings genuinely resolved (verified against core source), only Low/nit items remain, none blocking merge.

AkashBrowserStack · 2026-06-29T18:27:33Z

Verification — the suite was mutation-tested (it catches regressions, not just passes)

To prove the specs actually fail when an option changes (and only the affected spec moves), each track was mutation-tested. The real percy-cli-regression project was not touched — the visual checks ran in a throwaway project, and the branch was restored after each run.

Track C (config-validation, token-free): set widths: [99999] (out of range) → the "no Invalid config:" assertion fails. ✅

Track F (functional, token-free): each mutation fails its own server-observation assertion → ✅

remove discovery.disallowedHostnames → "blocked the 9101 probe" fails (probe now reaches the server)
captureSrcset: false → "2x candidate" fails (only 1x fetched)
wrong authorization.password → auth route 401s, "authorization sent" fails

Track V (visual): baseline build on main, then mutated builds on feature branches targeting main. Percy flagged a diff on exactly the mutated snapshot(s), everything else unchanged → ✅

scope removed → Config - Scope 91.27% diff (1 changed / 24 unchanged)
percyCSS recolor → Config - percyCSS Override 77.56% diff
disableShadowDOM: true → false → Config - Disable Shadow DOM 40.70% diff
domTransformation banner text changed → Config - DOM Transformation 0.15% diff (3 changed / 22 unchanged)

The methodology is documented in test/regression/COVERAGE.md ("Verifying the suite actually catches regressions") so it can be re-run before trusting future changes.

AkashBrowserStack added 4 commits June 23, 2026 14:18

github-advanced-security AI found potential problems Jun 23, 2026

View reviewed changes

Comment thread test/regression/lib/percy-cli.js Fixed

Comment thread test/regression/pages/functional-discovery.html Fixed

AkashBrowserStack added the 🧹 maintenance General maintenance label Jun 24, 2026

AkashBrowserStack marked this pull request as ready for review June 24, 2026 05:03

AkashBrowserStack requested a review from a team as a code owner June 24, 2026 05:03

AkashBrowserStack requested review from amandeepsingh333 and ninadbstack June 24, 2026 05:03

AkashBrowserStack added 2 commits June 25, 2026 17:39

pranavz28 approved these changes Jun 26, 2026

View reviewed changes

AkashBrowserStack merged commit c1e8904 into master Jun 29, 2026
47 checks passed

AkashBrowserStack deleted the PER-8250_cli-config-e2e-coverage branch June 29, 2026 09:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test(regression): 100% E2E coverage for CLI configs (PER-8250)#2312

test(regression): 100% E2E coverage for CLI configs (PER-8250)#2312
AkashBrowserStack merged 8 commits into
masterfrom
PER-8250_cli-config-e2e-coverage

AkashBrowserStack commented Jun 23, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

AkashBrowserStack commented Jun 25, 2026

Uh oh!

pranavz28 commented Jun 25, 2026

Uh oh!

AkashBrowserStack commented Jun 25, 2026

Uh oh!

AkashBrowserStack commented Jun 29, 2026

Uh oh!

Uh oh!

AkashBrowserStack commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

AkashBrowserStack commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scope — what "100% coverage" means here

Test plan

Uh oh!

Uh oh!

Uh oh!

AkashBrowserStack commented Jun 25, 2026

Claude Code PR Review

Summary

Review Table

Findings

Uh oh!

pranavz28 commented Jun 25, 2026

Coverage audit: ~3% of reachable options not yet exercised

Uh oh!

AkashBrowserStack commented Jun 25, 2026

Uh oh!

AkashBrowserStack commented Jun 29, 2026

Claude Code PR Review

Summary

Review Table

Findings

Uh oh!

Uh oh!

AkashBrowserStack commented Jun 29, 2026

Verification — the suite was mutation-tested (it catches regressions, not just passes)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AkashBrowserStack commented Jun 23, 2026 •

edited

Loading