Skip to content

feat(doctor): enhanced diagnostics — auth, config, CI, env audit, quick mode, auto-doctor#2153

Merged
rishigupta1599 merged 13 commits intomasterfrom
feat/percy-doctor-enhanced-diagnostics
Mar 18, 2026
Merged

feat(doctor): enhanced diagnostics — auth, config, CI, env audit, quick mode, auto-doctor#2153
rishigupta1599 merged 13 commits intomasterfrom
feat/percy-doctor-enhanced-diagnostics

Conversation

@akshayminocha5
Copy link
Copy Markdown
Contributor

@akshayminocha5 akshayminocha5 commented Mar 16, 2026

Summary

Adds six new diagnostic check modules to @percy/cli-doctor and two new runtime modes, significantly expanding Percy's self-service troubleshooting capabilities:

  • Auth check (checkAuth) — validates PERCY_TOKEN presence, format, project-type detection (web/app/automate/generic/visual_scanner/responsive_scanner), and live API authentication (GET /api/v1/tokens)
  • Config check (checkConfig) — loads Percy config via cosmiconfig, validates version field, detects project-type/config key mismatches (automate-only keys with web token and vice versa)
  • CI check (checkCI) — detects 10+ CI providers, validates git availability, checks parallel build configuration (PERCY_PARALLEL_TOTAL + PERCY_PARALLEL_NONCE)
  • Env audit (checkEnvVars) — inventories all Percy env vars, validates PERCY_PARALLEL_TOTAL format, flags manual overrides (PERCY_COMMIT/BRANCH/PULL_REQUEST), detects NODE_TLS_REJECT_UNAUTHORIZED=0
  • Quick mode (--quick) — runs only connectivity + SSL + token auth for fast triage
  • Auto-doctor — on build failure, @percy/core calls runDoctorOnFailure() to automatically run diagnostics (gated by PERCY_AUTO_DOCTOR=true)

Architecture

  • Pure functions with dependency injection for testability (searchFn, apiBaseUrl, execFn patterns)
  • Inter-check context (ctx) flows proxy/connectivity results to downstream checks
  • runSection() factory pattern reduces check boilerplate
  • Promise.allSettled for failure isolation in parallel checks

Changes

File What
src/checks/auth.js Token validation + live API auth with sanitizeError defense
src/checks/config.js Config loading, version validation, project-type mismatch detection
src/checks/ci.js CI provider detection, git validation, parallel config check
src/checks/env-audit.js Env var inventory, format validation, override detection, TLS warning
src/doctor.js Quick mode flag, timeout upper bound (300s), updated section orchestration
src/utils/helpers.js runDiagnostics() programmatic entry point, runSection() factory, timeout validation
src/utils/http.js Minor: added method and headers support to probeUrl
test/auth.test.js 15 specs — mock HTTP server, all response codes, security assertions
test/ci.test.js 21 specs — CI detection matrix, git validation, parallel config
test/config.test.js 22 specs — all config scenarios, 6 token prefix types, mismatch matrix
test/env-audit.test.js 14 specs — all env var scenarios, float validation, security
test/utils/helpers.test.js Updated timeout assertions for new upper bound
packages/core/src/snapshot.js runDoctorOnFailure() integration on build failure paths

Stats: 15 files changed, +1815 / -69 lines

Testing

  • 510 specs, 0 new failures (1 pre-existing connectivity flake)
  • All new checks have deterministic tests using dependency injection (no network calls)
  • Auth tests use in-process mock HTTP server via createHttpServer helper
  • CI tests use spyOn(cp, 'execSync') for git simulation
  • Config tests use injected searchFn for cosmiconfig simulation
  • Security tests verify no token values leak into findings output

QA Cycles

Cycle Functional Bugs Security Findings Result
Cycle 1 7 (3 Critical, 2 High, 2 Medium) 14 (3 Critical, 6 High, 5 Medium) All Critical/High/Medium fixed
Cycle 2 7 (0 Critical, 0 High, 4 Medium, 3 Low) 2 (0 Critical, 0 High, 2 Medium) All Medium fixed, Low deferred

Final: zero Critical/High bugs remaining.

Key fixes from QA

  • Auth tests now use mock HTTP server (were hitting real percy.io API)
  • sanitizeError() strips token patterns from error messages
  • DR-206 no longer exposes PERCY_PARALLEL_NONCE value
  • parseIntNumber() + Number.isInteger() for stricter validation
  • Timeout upper bound (300s) prevents DoS-scale waits
  • stdio: 'pipe' replaces 2>/dev/null for Windows compatibility

Deferred items (Low severity)

  • Summary banner excludes info/skip from section count
  • sectionStatus() doesn't handle 'skip' status
  • Duplicated KNOWN_PREFIXES constant (auth.js + config.js) — extract to shared module
  • PERCY_PARALLEL_TOTAL=-1 special value (used by percy build:finalize) flagged as invalid

Post-Deploy Monitoring & Validation

No additional operational monitoring required: this is a diagnostic-only CLI tool that runs on-demand. It does not affect Percy build execution, snapshot capture, or API behavior. The auto-doctor feature is gated behind PERCY_AUTO_DOCTOR=true (opt-in) and runs in a try/catch that swallows all errors.

@akshayminocha5 akshayminocha5 requested a review from a team as a code owner March 16, 2026 13:25
akshayminocha5 and others added 5 commits March 16, 2026 20:23
Add three new diagnostic checks to percy-doctor:
- Token auth check (PERCY-DR-001 through -006): validates presence,
  format, prefix-based project type detection, and API authentication
- Config validation (PERCY-DR-100 through -106): detects config file,
  validates version, warns on project-type config mismatches
- CI environment (PERCY-DR-200 through -209): detects CI system,
  validates commit SHA, branch, parallel config, and git availability

Also adds:
- Inter-check context (ctx) with bestProxy getter for downstream checks
- Report version field and summary banner rendering
- Headers support in HttpProber for auth API calls
- Comprehensive Jasmine test coverage for all new checks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 6: Environment variable audit check (PERCY-DR-300 through -305)
- Lists set Percy vars, validates PERCY_PARALLEL_TOTAL format
- Detects manual overrides (PERCY_COMMIT/BRANCH/PULL_REQUEST)
- Warns on NODE_TLS_REJECT_UNAUTHORIZED=0
- Never exposes env var VALUES in findings (security)

Phase 7: --quick mode flag
- Runs only connectivity + SSL + token auth (~4s)
- Skips config, CI, env audit, proxy, PAC, and browser checks
- Skips token auth with PERCY-DR-007 when connectivity fails

Phase 9: Orchestration updates
- Wired env audit into Phase 1 parallel allSettled
- Updated runDiagnostics() programmatic API with mode parameter
- Added env-audit.js export to package.json

Phase 10: Build failure auto-doctor integration
- Added runDoctorOnFailure() to snapshot.js (packages/core)
- Triggers from all 3 failure branches in createSnapshotsQueue
- Guarded by PERCY_AUTO_DOCTOR env var
- Dynamic import with graceful fallback when doctor not installed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dation alignment

Two QA cycles with functional + security agents found and resolved:

Security fixes:
- Sanitize HTTP error messages in auth.js to prevent token leakage (defense-in-depth)
- Remove PERCY_PARALLEL_NONCE value from ci.js findings (was leaking env var values)
- Add 300s upper bound on --timeout to prevent DoS via indefinite hangs
- Align timeout validation (Number.isInteger) between doctor.js and runDiagnostics()

Bug fixes:
- Fix config.js project type for ss_/vmw_/res_ tokens (was showing "web" for all)
- Fix DR-106 web-only key check to cover all non-web token types (was only auto/app)
- Fix DR-103 to show actual version number instead of hardcoding "version 1"
- Fix PERCY_PARALLEL_TOTAL validation to reject floats like "4.5"
- Fix ci.js git check to use stdio:'pipe' instead of shell redirect (Windows compat)

Test coverage:
- Wire auth test mock server via apiBaseUrl DI param (DR-003/004/005/006 now deterministic)
- Add tests for DR-209 (git unavailable), DR-106 with ss_ token, config version 0
- Add tests for float PERCY_PARALLEL_TOTAL, auth 401/403/200/500/unreachable
- Add config tests for ss_/vmw_/res_ token prefix type labels

510 specs pass (1 pre-existing connectivity flake).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d code

- Strengthen sanitizeError() to also strip raw PERCY_TOKEN value from
  error messages (defense-in-depth, not just Authorization header format)
- Extract KNOWN_PREFIXES to shared constants.js (was duplicated in
  auth.js and config.js)
- Remove unused `isApp` variable in config.js
- Change cross-package import in snapshot.js from deep path
  (@percy/cli-doctor/src/utils/helpers.js) to package root
- Add cross-reference comments between doctor.js and runDiagnostics()
  orchestration paths
- Add targeted test for raw token sanitization in error messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rishigupta1599 rishigupta1599 force-pushed the feat/percy-doctor-enhanced-diagnostics branch from 8dc6a43 to c3b5f3d Compare March 16, 2026 14:54
Copy link
Copy Markdown
Contributor

@ninadbstack ninadbstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check inline comments below.

Copy link
Copy Markdown
Contributor

@ninadbstack ninadbstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exhaustive Code Review — Multi-Agent Analysis

Reviewed by: security-sentinel, performance-oracle, architecture-strategist, code-simplicity-reviewer, agent-native-reviewer

14 findings total: 3 🔴 P1 (blocks merge) · 6 🟡 P2 (should fix) · 5 🔵 P3 (nice-to-have)

See inline comments for details on each finding. Two findings that couldn't be placed inline:

🔵 P3 — Redundant isMachineLevel() in Monitoring constructor (packages/monitoring/src/index.js:44): isMachineLevel() calls isContainerLevel() again instead of using this.isContainer. Fix: this.isMachine = !this.isContainer;

🔵 P3 — JSON report schema undocumented: The Finding type (code, status, message, suggestions, metadata) is referenced in JSDoc but never defined as a schema or typedef. Add typedefs for programmatic consumers.


What's working well

  • Strong security awareness — sanitizeError(), redactProxyUrl(), token filtering, stdio: 'pipe'
  • Excellent test coverage — 72 new specs with deterministic DI (mock HTTP server, injected searchFn/monitoring/percyEnv)
  • Well-structured findings model — error codes (PERCY-DR-001 through PERCY-DR-305), severity levels, actionable suggestions
  • Promise.allSettled for failure isolation in Phase 1
  • Quick mode is a smart addition for fast triage
  • Auto-doctor gating behind PERCY_AUTO_DOCTOR=true with swallowed errors is appropriate

@rishigupta1599 rishigupta1599 merged commit a2bd2c5 into master Mar 18, 2026
45 checks passed
@rishigupta1599 rishigupta1599 deleted the feat/percy-doctor-enhanced-diagnostics branch March 18, 2026 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants