chore(ci): don't gate master Storybook on observe-purpose VR result #59120
Merged
The vr-setup step creates master Visual Review runs with --purpose observe (tracking-only, non-gating). However `vr run complete` still exits 1 when any visual changes are detected, which propagates through `vr-complete`'s job result and trips the required `Visual regression tests pass` check. This makes master Storybook red whenever a UI change lands — defeating the "observe" semantics already documented in the vr-setup comment. Mark the Complete Visual Review run step as continue-on-error on push events so the master tracker keeps recording diffs without gating. PRs still gate.
gantoine
approved these changes
May 20, 2026
Problem
The `Storybook` workflow on `master` has been failing about half of all completed runs, blocking the required `Visual regression tests pass` check on the default branch.
In a sample of the most recent 50 master-push runs, the dominant failure pattern (roughly 55% of all failures) is the `Complete Visual Review run` job exiting 1 with `Visual changes detected — review at: ...`, even though the master Visual Review run is created with `--purpose observe` (tracking-only, non-gating).
Example failed runs that hit this pattern:
In each, every shard of the visual-regression matrix passes; the only failing step is `Complete Visual Review run`, e.g.:
Changes
The `vr-setup` step already documents the intent: master runs use `--purpose observe` so they don't block. But the `vr run complete` CLI still exits 1 whenever changes are detected, regardless of the run's purpose, and that propagates to the `vr-complete` job result and onward to the required check.
Marking the `Complete Visual Review run` step as `continue-on-error` on `push` events keeps the master tracker recording diffs and finalizing the run without flipping the workflow red. PR runs are unchanged; they still gate as before.
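As a rough illustration of the change described above, the workflow fragment below shows a step-level `continue-on-error` that is only true on `push` events, plus a downstream job that reads `needs.vr-complete.result`. This is a sketch, not the actual diff: the job layout, `runs-on` values, `if: always()`, and the gate script are assumptions, and the real `vr run complete` arguments are not shown in this PR.

```yaml
# Sketch only: job layout, `needs` wiring, and the gate script are assumptions.
jobs:
  vr-complete:
    runs-on: ubuntu-latest
    steps:
      - name: Complete Visual Review run
        # On push (master) a non-zero exit is tolerated, so the job still
        # concludes with `success`; on pull_request the step gates as before.
        continue-on-error: ${{ github.event_name == 'push' }}
        # Real arguments are not shown in this PR and are omitted here.
        run: vr run complete

  visual_regression_tests:
    name: Visual regression tests pass
    needs: [vr-complete]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Gate on vr-complete result
        # Hypothetical gate: fail unless the upstream job concluded successfully.
        run: |
          if [ "${{ needs.vr-complete.result }}" != "success" ]; then
            exit 1
          fi
```

Because the expression evaluates to `false` on `pull_request` events, a failing completion step still fails the job there, which is what preserves PR gating.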
A backend-side fix (teaching the CLI / API to return 0 for observe runs) would be cleaner, but requires plumbing `purpose` through the `Run` dataclass and serializer. This workflow-only change is the smallest fix that restores master green and leaves the better fix for a follow-up.
Concerns:
Among the remaining master failures are `page.goto: Timeout 30000ms exceeded` errors inside WebKit shards (different stories each time: `Onboarding`, `ErrorDisplay`, etc.). This PR does not touch those; they need a separate investigation (likely test-runner navigation timeout or retry behavior). After this PR, the residual master failure rate should drop to roughly the WebKit flake floor.
How did you test this code?
I'm an agent. I did not run the workflow locally. Verification was static:
Inspected failures via `gh run view ... --json jobs` and `gh run view --job ... --log-failed`.
Found the failing step to be `Complete Visual Review run` in 16/29 recent master failures, all with `Visual changes detected — review at: ...` followed by `exit code 1` while every shard reported success.
Read the intent comment in `vr-setup` (`# … master runs are tracking-only ("observe") since we don't want master runs to block`).
`continue-on-error: true` on a step makes the job conclusion `success` even when the step fails; downstream `visual_regression_tests` reads `needs.vr-complete.result`, so the required check will pass.
Publish to changelog?
no
🤖 Agent context
health.
failing job, and read logs for representative examples in both buckets.
Teaching `vr run complete` to return 0 on observe runs (rejected for this PR: needs `purpose` exposed on the `RunApi` schema and a CLI follow-up; the correct long-term fix, but out of scope for an unblocker).
Skipping `vr-complete` entirely on push (rejected: the backend still needs the completion call to transition runs out of `PENDING`, healing baselines, finalizing snapshots).
`continue-on-error` gated on `github.event_name == 'push'` (chosen: smallest diff, preserves PR gating, keeps the backend transition).
backend exposing `purpose` on the run serializer.
`--no-verify`? No; the pre-commit hook ran successfully after refreshing the venv (`uv sync --active`).
uv sync --active).