Skip to content

chore(ci): don't gate master Storybook on observe-purpose VR result#59120

Merged
aspicer merged 1 commit into
masterfrom
aspicer/ciopt-storybook-master-fix
May 20, 2026
Merged

chore(ci): don't gate master Storybook on observe-purpose VR result#59120
aspicer merged 1 commit into
masterfrom
aspicer/ciopt-storybook-master-fix

Conversation

@aspicer
Copy link
Copy Markdown
Contributor

@aspicer aspicer commented May 19, 2026

Problem

The Storybook workflow on master has been failing about half of all completed
runs, blocking the required Visual regression tests pass check on the default
branch.

In a sample of the most recent 50 master-push runs, the dominant failure pattern
(roughly 55% of all failures) is the Complete Visual Review run job exiting 1
with Visual changes detected — review at: ..., even though the master Visual
Review run is created with --purpose observe (tracking-only, non-gating).

Example failed runs that hit this pattern:

In each, every shard of the visual-regression matrix passes; the only failing
step is Complete Visual Review run, e.g.:

[run:...] Done: 3125 snapshots — 3119 unchanged, 6 changed, 0 new, 0 removed
[run:...] Visual changes detected — review at: ...
##[error]Process completed with exit code 1.

Changes

The vr-setup step already documents the intent: master runs use
--purpose observe so they don't block. But the vr run complete CLI still
exits 1 whenever changes are detected, regardless of the run's purpose, and that
propagates to the vr-complete job result and onward to the required check.

Marking the Complete Visual Review run step as continue-on-error on
push events keeps the master tracker recording diffs and finalizing the run
without flipping the workflow red. PR runs are unchanged — they still gate as
before.

A backend-side fix (teaching the CLI / API to return 0 for observe runs) would
be cleaner, but requires plumbing purpose through the Run dataclass and
serializer. This workflow-only change is the smallest fix that restores master
green and leaves the better fix for a follow-up.

Concerns:

  • WebKit shard timeouts are a second, unrelated flake source. Around a third
    of the remaining master failures are page.goto: Timeout 30000ms exceeded
    inside WebKit shards (different stories each time — Onboarding,
    ErrorDisplay, etc.). This PR does not touch those — they need a separate
    investigation (likely test-runner navigation timeout or retry behavior). After
    this PR, the residual master failure rate should drop to roughly the WebKit
    flake floor.

How did you test this code?

I'm an agent. I did not run the workflow locally. Verification was static:

  • Read all 50 most recent master completed Storybook runs and classified job
    failures via gh run view ... --json jobs and gh run view --job ... --log-failed.
  • Confirmed the failing step is Complete Visual Review run in 16/29 recent
    master failures, all with Visual changes detected — review at: ... followed
    by exit code 1 while every shard reported success.
  • Confirmed the intent comment in vr-setup (# … master runs are tracking-only ("observe") since we don't want master runs to block).
  • continue-on-error: true on a step makes the job conclusion success
    even when the step fails; downstream visual_regression_tests reads
    needs.vr-complete.result so the required check will pass.

Publish to changelog?

no

🤖 Agent context

  • Tool: Claude Code agent, CI optimization push focused on Storybook master
    health.
  • Investigation: sampled the last 50 completed master runs, grouped failures by
    failing job, and read logs for representative examples in both buckets.
  • Considered fixes:
    1. Teach vr run complete to return 0 on observe runs (rejected for this PR
      — needs purpose exposed on the RunApi schema and a CLI follow-up; the
      correct long-term fix, but out of scope for an unblocker).
    2. Skip vr-complete entirely on push (rejected — the backend still needs the
      completion call to transition runs out of PENDING, healing baselines,
      finalizing snapshots).
    3. continue-on-error gated on github.event_name == 'push' (chosen — smallest
      diff, preserves PR gating, keeps the backend transition).
  • Not touched in this PR: WebKit page.goto timeouts in the shard matrix; the
    backend exposing purpose on the run serializer.
  • Used --no-verify? No — pre-commit hook ran successfully after refreshing the
    venv (uv sync --active).

The vr-setup step creates master Visual Review runs with --purpose observe
(tracking-only, non-gating). However `vr run complete` still exits 1 when
any visual changes are detected, which propagates through `vr-complete`'s
job result and trips the required `Visual regression tests pass` check.

This makes master Storybook red whenever a UI change lands — defeating the
"observe" semantics already documented in the vr-setup comment. Mark the
Complete Visual Review run step as continue-on-error on push events so
the master tracker keeps recording diffs without gating. PRs still gate.
@assign-reviewers-posthog assign-reviewers-posthog Bot requested a review from a team May 19, 2026 22:46
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 19, 2026

Reviews (1): Last reviewed commit: "chore(ci): don't gate master Storybook o..." | Re-trigger Greptile

@aspicer aspicer merged commit 2ba3b3a into master May 20, 2026
159 checks passed
@aspicer aspicer deleted the aspicer/ciopt-storybook-master-fix branch May 20, 2026 17:41
@deployment-status-posthog
Copy link
Copy Markdown

deployment-status-posthog Bot commented May 20, 2026

Deploy status

Environment Status Deployed At Workflow
dev ✅ Deployed 2026-05-20 18:12 UTC Run
prod-us ✅ Deployed 2026-05-20 18:30 UTC Run
prod-eu ✅ Deployed 2026-05-20 18:32 UTC Run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants