feat(ci): on-demand golden regenerate workflow by TaprootFreak · Pull Request #577 · RealUnitCH/app

TaprootFreak · 2026-05-25T18:23:43Z

What

Permanent `workflow_dispatch`-only workflow `.github/workflows/golden-regenerate.yaml` that regenerates visual-regression baselines on the dfx01 self-hosted runner and commits them back to the dispatched branch as `github-actions[bot]`.

Replaces the previous pattern of introducing and removing a temporary `golden-bootstrap.yaml` per regen cycle (documented in `docs/visual-regression-tests.md` until this PR).

Trigger

```bash
gh workflow run golden-regenerate.yaml --ref <feature-branch>
```

No inputs needed — the workflow uses `github.ref` as the checkout ref.
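The dispatch can be wrapped so it defaults to the current branch and refuses the protected ones up front. This is only a sketch and not part of this PR: the function name `dispatch_golden_regen` is hypothetical, and the protected-branch list is taken from the "Failure mode on protected branches" section.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): dispatch the regenerate workflow
# for a branch, defaulting to the currently checked-out one.
dispatch_golden_regen() {
  local branch="${1:-$(git rev-parse --abbrev-ref HEAD)}"

  # develop and main are protected; the workflow's push step would fail there.
  case "$branch" in
    develop|main)
      echo "refusing to dispatch on protected branch: $branch" >&2
      return 1
      ;;
  esac

  gh workflow run golden-regenerate.yaml --ref "$branch"
}
```

Called as `dispatch_golden_regen feat/my-screen` (an example branch name), it runs the same one-line command shown above. `gh run watch` can then follow the run.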

Behaviour

Runs on `[self-hosted, macOS, ARM64, m3-ultra, realunit-app]` — same labels as the `golden-tests` job in `pull-request.yaml`.
Setup steps mirror `golden-tests` 1:1 (Flutter 3.41.6 via subosito, `flutter pub get`, generators, build_runner).
Regen: `flutter test test/goldens --update-goldens`.
Auto-commit: git as `github-actions[bot]` (`41898282+github-actions[bot]@users.noreply.github.com`), `git add test/goldens/`, clean-exit on no-diff, otherwise commit `test(goldens): regenerate baselines on dfx01` and push.
Concurrency group `golden-regenerate-<ref>` with cancel-in-progress so two parallel dispatches on the same ref don't race.
30 min timeout (matches `golden-tests`).
`permissions: contents: write` and `token: ${{ secrets.GITHUB_TOKEN }}` on checkout — load-bearing for the push.
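The auto-commit behaviour described above could look roughly like the following shell function. It is a sketch reconstructed from the description, not the actual step in `golden-regenerate.yaml`. The function name `commit_goldens` and its branch argument are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the auto-commit step; the real step in golden-regenerate.yaml
# may differ. commit_goldens and its argument are hypothetical names.
commit_goldens() {
  local branch="$1"   # branch the workflow was dispatched on

  git config user.name  'github-actions[bot]'
  git config user.email '41898282+github-actions[bot]@users.noreply.github.com'

  # Pathspec as given in the PR description; later commits in this PR narrow it.
  git add test/goldens/

  # Clean exit when regeneration produced no staged changes.
  if git diff --cached --quiet; then
    echo "no golden changes to commit"
    return 0
  fi

  git commit -q -m 'test(goldens): regenerate baselines on dfx01'
  # Fully qualified destination so the push does not depend on ref guessing.
  git push -q origin "HEAD:refs/heads/${branch}"
}
```

Checking `git diff --cached --quiet` after staging is what makes a run with no changes exit cleanly instead of failing on an empty commit.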

Failure mode on protected branches

`develop` and `main` are protected by ruleset. A dispatch against either fails cleanly on the `git push` step — no force-push, no bypass. To still recover the regen output, the workflow uploads the regenerated PNGs as a `golden-baselines` artifact whenever the push step fails; rsync them onto a feature branch and commit there.
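A recovery along those lines might be scripted as below. This is a hedged sketch: `recover_goldens` is a hypothetical name, the run ID has to be looked up by hand (for example with `gh run list`), and the default destination assumes the artifact root corresponds to `test/goldens/screens/`, which this PR does not state.

```shell
#!/usr/bin/env bash
# Hypothetical recovery helper: fetch the fallback artifact from a failed run
# and copy the PNGs into the current feature-branch checkout.
recover_goldens() {
  local run_id="$1"                          # ID of the failed workflow run
  local dest="${2:-test/goldens/screens/}"   # ASSUMPTION: artifact root maps here
  local tmp
  tmp="$(mktemp -d)"

  # Download only the golden-baselines artifact into a scratch directory.
  gh run download "$run_id" --name golden-baselines --dir "$tmp"

  # Trailing slash on the source copies its contents, not the directory itself.
  rsync -a "$tmp/" "$dest"
  rm -rf "$tmp"
}
```

After running it on a feature branch, review `git status` before committing, since the directory mapping above is a guess.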

Doc updates

`docs/visual-regression-tests.md`: replaced the "Initial bootstrap" + "Adding a new golden test" bootstrap-pattern sections (steps 1-4 of bootstrap) with the one-command `gh workflow run golden-regenerate.yaml --ref <feature-branch>` flow. Updated the "Reacting to a CI drift", "Flutter SDK bumps", and "dfx01 outage fallback" sections to reference the new workflow.
`docs/handbook/README.md` — handbook-screenshot regen now points at the same workflow instead of asking for a local `flutter test --update-goldens` run.

Out of scope

`DFXServer/server/infrastructure/dfx01/actions-runners/golden-tests-recipe.md` (different repo) still references the old bootstrap pattern. Separate PR there.

Verification

`python3 -c "import yaml; yaml.safe_load(open('.github/workflows/golden-regenerate.yaml'))"` passes.
`actionlint` clean except for the two expected "unknown label" warnings on `m3-ultra` and `realunit-app` (same as the existing `golden-tests` job — custom self-hosted runner labels are not in actionlint's default known-labels list).

Permanent workflow_dispatch workflow that regenerates the visual-regression baselines on the dfx01 self-hosted runner and commits them back to the dispatched branch as github-actions[bot]. Replaces the previous pattern of introducing and removing a temporary golden-bootstrap.yaml per regen cycle.

Usage:

    gh workflow run golden-regenerate.yaml --ref <feature-branch>

Setup steps mirror the golden-tests job in pull-request.yaml so the regenerated baselines render under the exact toolchain that validates them. Concurrency group golden-regenerate-<ref> with cancel-in-progress keeps two parallel dispatches on the same ref from racing.

On a protected ref (develop, main) the push fails by design: no force-push, no bypass. The regenerated PNGs are still uploaded as a golden-baselines artifact so they can be rsynced onto a feature branch.

docs/visual-regression-tests.md replaces the bootstrap/download/rsync section with the one-command flow. docs/handbook/README.md picks up the same change for handbook screenshot regeneration.

- golden-regenerate.yaml: narrow `git add` to `test/goldens/screens/*/goldens/` so alchemist's transient `failures/` dirs never accidentally land in a bot-pushed commit (MINOR 3 from review).
- golden-regenerate.yaml: tighten fallback artifact `if-no-files-found` from `warn` to `error`. An empty fallback artifact is worse than no artifact at all: the user expects the artifact precisely because the push failed, so a silent empty is a footgun (MINOR 5).
- docs/visual-regression-tests.md: rewrite the stale intro that still claimed "5 screens, 8 baseline PNGs" (left over from PR #541's pilot description) → 57 page files / 68 Golden PNGs, validated by the required `Visual Regression` check (MINOR 4).
- docs/visual-regression-tests.md: extend the "On a protected ref the push fails by design" paragraph to also cover the parallel-human-push / non-fast-forward race, which uses the same artifact-fallback path and the same recovery (MAJOR 1).

Round-2 review caught that `test/goldens/screens/*/goldens/` (the narrow pattern from round 1) silently matches zero files: git pathspecs are not shell globs, the trailing `/goldens/` anchors a directory exactly and `*` does not expand through it. Result: every workflow_dispatch run would have aborted at `git add` with `fatal: pathspec ... did not match any files` before reaching the no-diff early-exit. Regression of the round-1 narrow-add fix.

Switch to `'test/goldens/screens/**/goldens/**'`. Verified locally: it matches the same files as the artifact path on line 92 and excludes alchemist's transient `failures/` directories (alchemist writes them one level up at `<feature>/failures/...`, not inside `goldens/`).

Also: rewrite stale "Pilot scope" section in visual-regression-tests.md that still listed only the 5 pilot screens. It is replaced with a generic "Layout" section describing the per-feature directory convention, which is now accurate at 57 page files / 68 PNGs.
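The behaviour of the `**` pathspec can be checked in a throwaway repository. The sketch below is illustrative only: the function name `demo_golden_pathspec`, the feature name `foo`, and the file names are made up, and the directory layout follows the `<feature>/goldens/` and `<feature>/failures/` convention described above.

```shell
#!/usr/bin/env bash
# Shows, in a throwaway repo, which files the '**' pathspec stages.
# The feature name "foo" and the file names are made up for illustration.
demo_golden_pathspec() {
  local repo
  repo="$(mktemp -d)"
  (
    cd "$repo" || exit 1
    git init -q .
    mkdir -p test/goldens/screens/foo/goldens test/goldens/screens/foo/failures
    echo baseline > test/goldens/screens/foo/goldens/a.png
    echo diff     > test/goldens/screens/foo/failures/a_diff.png

    # Quoted so the shell hands the pattern to git unexpanded.
    git add 'test/goldens/screens/**/goldens/**'

    # List what was staged; only paths containing /goldens/ should appear.
    git diff --cached --name-only
  )
  rm -rf "$repo"
}
```

Running `demo_golden_pathspec` lists only the file under `goldens/`. The file under `failures/` stays unstaged.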

Final stale-pilot reference on docs/visual-regression-tests.md:80 caught by subagent round-3 review. Was inside the "Adding a new golden test" how-to bullet, not a scope claim, but the term "pilot" is no longer accurate now that every page has a Golden.

TaprootFreak added 4 commits May 25, 2026 20:23

TaprootFreak marked this pull request as ready for review May 25, 2026 18:34

TaprootFreak merged commit 62b966d into develop May 25, 2026
12 checks passed

TaprootFreak deleted the feat/golden-regenerate-workflow branch May 25, 2026 20:06

TaprootFreak mentioned this pull request May 25, 2026

Review BitBox all-initiatives audit findings #578

Draft

5 tasks
