feat(doctor): enhanced diagnostics — auth, config, CI, env audit, quick mode, auto-doctor#2153
Conversation
Add three new diagnostic checks to percy-doctor: - Token auth check (PERCY-DR-001 through -006): validates presence, format, prefix-based project type detection, and API authentication - Config validation (PERCY-DR-100 through -106): detects config file, validates version, warns on project-type config mismatches - CI environment (PERCY-DR-200 through -209): detects CI system, validates commit SHA, branch, parallel config, and git availability Also adds: - Inter-check context (ctx) with bestProxy getter for downstream checks - Report version field and summary banner rendering - Headers support in HttpProber for auth API calls - Comprehensive Jasmine test coverage for all new checks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 6: Environment variable audit check (PERCY-DR-300 through -305) - Lists set Percy vars, validates PERCY_PARALLEL_TOTAL format - Detects manual overrides (PERCY_COMMIT/BRANCH/PULL_REQUEST) - Warns on NODE_TLS_REJECT_UNAUTHORIZED=0 - Never exposes env var VALUES in findings (security) Phase 7: --quick mode flag - Runs only connectivity + SSL + token auth (~4s) - Skips config, CI, env audit, proxy, PAC, and browser checks - Skips token auth with PERCY-DR-007 when connectivity fails Phase 9: Orchestration updates - Wired env audit into Phase 1 parallel allSettled - Updated runDiagnostics() programmatic API with mode parameter - Added env-audit.js export to package.json Phase 10: Build failure auto-doctor integration - Added runDoctorOnFailure() to snapshot.js (packages/core) - Triggers from all 3 failure branches in createSnapshotsQueue - Guarded by PERCY_AUTO_DOCTOR env var - Dynamic import with graceful fallback when doctor not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dation alignment Two QA cycles with functional + security agents found and resolved: Security fixes: - Sanitize HTTP error messages in auth.js to prevent token leakage (defense-in-depth) - Remove PERCY_PARALLEL_NONCE value from ci.js findings (was leaking env var values) - Add 300s upper bound on --timeout to prevent DoS via indefinite hangs - Align timeout validation (Number.isInteger) between doctor.js and runDiagnostics() Bug fixes: - Fix config.js project type for ss_/vmw_/res_ tokens (was showing "web" for all) - Fix DR-106 web-only key check to cover all non-web token types (was only auto/app) - Fix DR-103 to show actual version number instead of hardcoding "version 1" - Fix PERCY_PARALLEL_TOTAL validation to reject floats like "4.5" - Fix ci.js git check to use stdio:'pipe' instead of shell redirect (Windows compat) Test coverage: - Wire auth test mock server via apiBaseUrl DI param (DR-003/004/005/006 now deterministic) - Add tests for DR-209 (git unavailable), DR-106 with ss_ token, config version 0 - Add tests for float PERCY_PARALLEL_TOTAL, auth 401/403/200/500/unreachable - Add config tests for ss_/vmw_/res_ token prefix type labels 510 specs pass (1 pre-existing connectivity flake). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d code - Strengthen sanitizeError() to also strip raw PERCY_TOKEN value from error messages (defense-in-depth, not just Authorization header format) - Extract KNOWN_PREFIXES to shared constants.js (was duplicated in auth.js and config.js) - Remove unused `isApp` variable in config.js - Change cross-package import in snapshot.js from deep path (@percy/cli-doctor/src/utils/helpers.js) to package root - Add cross-reference comments between doctor.js and runDiagnostics() orchestration paths - Add targeted test for raw token sanitization in error messages Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8dc6a43 to
c3b5f3d
Compare
ninadbstack
left a comment
There was a problem hiding this comment.
Exhaustive Code Review — Multi-Agent Analysis
Reviewed by: security-sentinel, performance-oracle, architecture-strategist, code-simplicity-reviewer, agent-native-reviewer
14 findings total: 3 🔴 P1 (blocks merge) · 6 🟡 P2 (should fix) · 5 🔵 P3 (nice-to-have)
See inline comments for details on each finding. Two findings that couldn't be placed inline:
🔵 P3 — Redundant isMachineLevel() in Monitoring constructor (packages/monitoring/src/index.js:44): isMachineLevel() calls isContainerLevel() again instead of using this.isContainer. Fix: this.isMachine = !this.isContainer;
🔵 P3 — JSON report schema undocumented: The Finding type (code, status, message, suggestions, metadata) is referenced in JSDoc but never defined as a schema or typedef. Add typedefs for programmatic consumers.
What's working well
- Strong security awareness —
sanitizeError(),redactProxyUrl(), token filtering,stdio: 'pipe' - Excellent test coverage — 72 new specs with deterministic DI (mock HTTP server, injected searchFn/monitoring/percyEnv)
- Well-structured findings model — error codes (
PERCY-DR-001throughPERCY-DR-305), severity levels, actionable suggestions Promise.allSettledfor failure isolation in Phase 1- Quick mode is a smart addition for fast triage
- Auto-doctor gating behind
PERCY_AUTO_DOCTOR=truewith swallowed errors is appropriate
Summary
Adds six new diagnostic check modules to
@percy/cli-doctorand two new runtime modes, significantly expanding Percy's self-service troubleshooting capabilities:checkAuth) — validates PERCY_TOKEN presence, format, project-type detection (web/app/automate/generic/visual_scanner/responsive_scanner), and live API authentication (GET/api/v1/tokens)checkConfig) — loads Percy config via cosmiconfig, validates version field, detects project-type/config key mismatches (automate-only keys with web token and vice versa)checkCI) — detects 10+ CI providers, validates git availability, checks parallel build configuration (PERCY_PARALLEL_TOTAL + PERCY_PARALLEL_NONCE)checkEnvVars) — inventories all Percy env vars, validates PERCY_PARALLEL_TOTAL format, flags manual overrides (PERCY_COMMIT/BRANCH/PULL_REQUEST), detects NODE_TLS_REJECT_UNAUTHORIZED=0--quick) — runs only connectivity + SSL + token auth for fast triage@percy/corecallsrunDoctorOnFailure()to automatically run diagnostics (gated byPERCY_AUTO_DOCTOR=true)Architecture
ctx) flows proxy/connectivity results to downstream checksrunSection()factory pattern reduces check boilerplatePromise.allSettledfor failure isolation in parallel checksChanges
src/checks/auth.jssrc/checks/config.jssrc/checks/ci.jssrc/checks/env-audit.jssrc/doctor.jssrc/utils/helpers.jsrunDiagnostics()programmatic entry point,runSection()factory, timeout validationsrc/utils/http.jsmethodandheaderssupport to probeUrltest/auth.test.jstest/ci.test.jstest/config.test.jstest/env-audit.test.jstest/utils/helpers.test.jspackages/core/src/snapshot.jsrunDoctorOnFailure()integration on build failure pathsStats: 15 files changed, +1815 / -69 lines
Testing
createHttpServerhelperspyOn(cp, 'execSync')for git simulationsearchFnfor cosmiconfig simulationQA Cycles
Final: zero Critical/High bugs remaining.
Key fixes from QA
sanitizeError()strips token patterns from error messagesparseInt→Number() + Number.isInteger()for stricter validationstdio: 'pipe'replaces2>/dev/nullfor Windows compatibilityDeferred items (Low severity)
sectionStatus()doesn't handle 'skip' statusKNOWN_PREFIXESconstant (auth.js + config.js) — extract to shared modulePERCY_PARALLEL_TOTAL=-1special value (used bypercy build:finalize) flagged as invalidPost-Deploy Monitoring & Validation
No additional operational monitoring required: this is a diagnostic-only CLI tool that runs on-demand. It does not affect Percy build execution, snapshot capture, or API behavior. The auto-doctor feature is gated behind
PERCY_AUTO_DOCTOR=true(opt-in) and runs in a try/catch that swallows all errors.