feat(tests): scale visual regression to all 57 screens (stacked on #541) #552
Merged
TaprootFreak merged 3 commits on May 23, 2026
Adds 52 new golden tests on top of the pilot 5, covering every page file under lib/screens/**:

- Bucket 1: onboarding + wallet lifecycle (11 tests: create, restore, verify_seed, home, onboarding_completed, pin x2, hw bitbox, legal x2, debug_auth)
- Bucket 2: settings subpages (17 tests: languages, currencies, network, seed, tax_report, contact, wallet_address, legal_docs x3, user_data x7)
- Bucket 3: KYC (15 tests: 2fa, email x2, financial_data x4, ident, nationality, registration, subpages x5)
- Bucket 4: trading + support + misc (9 tests: receive, tx_history, sell_bitbox, sell_bank_account_selection, support x4, web_view skipped)

One page → one golden in default/initial state. State-variant goldens follow as a separate PR per feature when the team finds them necessary.

web_view_page.dart is the only skipped target: InAppWebView is a platform view that has no headless rendering in flutter_test. The test file is committed with skip: true so it activates the moment a stub is introduced.

The bootstrap workflow is restored temporarily so the dfx01 runner can regenerate the baseline set on push; it is removed in a follow-up commit once the artifact is committed.
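As an illustration of the skipped target described above, a golden test committed with `skip: true` might look roughly like this. It is a minimal sketch, not the repository's actual test file: the description, the `fileName`, and the `Placeholder` stand-in widget are assumptions, and only alchemist's `goldenTest` with its `skip` parameter is taken from the text.

```dart
import 'package:alchemist/alchemist.dart';
import 'package:flutter/material.dart';

void main() {
  goldenTest(
    'web view page renders its default state',
    // Hypothetical file name, chosen for illustration only.
    fileName: 'web_view_page_default',
    // Skipped because InAppWebView is a platform view that cannot be
    // rendered headlessly in flutter_test.
    skip: true,
    // Hypothetical stand-in: the real test would build the page here
    // once a stub for the platform view exists.
    builder: () => const Placeholder(),
  );
}
```

Keeping the file in place with `skip: true` means the test only needs its flag flipped once a stub is available, rather than being written from scratch.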
Two failure classes turned up on the first dfx01 bootstrap run:

1. MissingPluginException for no_screenshot's com.flutterplaza.no_screenshot_methods channel (create_wallet, settings_seed). Stubbed via TestDefaultBinaryMessengerBinding in the per-file setUpAll so the call returns true and the test continues.
2. pumpAndSettle hangs on CircularProgressIndicator/CupertinoActivityIndicator in loading-state pages (kyc_loading, kyc_financial_data, kyc_financial_data_loading, settings_edit_loading, support_chat, sell_bitbox). Switched pumpBeforeTest to alchemist's pumpOnce so the first frame is captured instead of waiting for animation completion.
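The two fixes above can be sketched together in one test file. This is an assumption-laden illustration rather than the committed code: the test description, the `fileName`, and the bare `CircularProgressIndicator` widget are hypothetical, while the channel name, the `setUpAll` stub returning `true`, and `pumpBeforeTest: pumpOnce` come from the description.

```dart
import 'package:alchemist/alchemist.dart';
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  // Channel name as quoted in the commit message.
  const channel = MethodChannel('com.flutterplaza.no_screenshot_methods');

  setUpAll(() {
    TestWidgetsFlutterBinding.ensureInitialized();
    // Answer every call on the plugin channel with `true` so the page
    // under test does not throw MissingPluginException.
    TestDefaultBinaryMessengerBinding.instance.defaultBinaryMessenger
        .setMockMethodCallHandler(channel, (MethodCall call) async => true);
  });

  tearDownAll(() {
    // Remove the stub so it does not leak into other test files.
    TestDefaultBinaryMessengerBinding.instance.defaultBinaryMessenger
        .setMockMethodCallHandler(channel, null);
  });

  goldenTest(
    'loading page captures its first frame',
    // Hypothetical file name, for illustration only.
    fileName: 'loading_page_first_frame',
    // Pump a single frame instead of pumpAndSettle, which never
    // returns while an indeterminate spinner keeps animating.
    pumpBeforeTest: pumpOnce,
    // Hypothetical stand-in for a loading-state page.
    builder: () => const Center(child: CircularProgressIndicator()),
  );
}
```

Capturing the first frame trades animation coverage for determinism: the spinner is always snapshotted at the same point of its animation.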
Generated by golden-bootstrap run 26341918780 on dfx01. The authoritative baseline set now stands at 59 PNGs covering 56 of 57 page files (web_view stays skipped). The bootstrap workflow is removed: the standard golden-tests CI job is the permanent validation entry point from here.
TaprootFreak added a commit that referenced this pull request on May 23, 2026
…) (#552)

## Summary

- Stacked on top of #541 (visual-regression pilot). **Do not merge until #541 is merged**, then this PR rebases against develop.
- Adds 52 new golden tests covering every `lib/screens/**/*_page.dart` (1 per page, default/initial state) on top of the pilot 5, for 57 tests total
- Bootstrap workflow is reintroduced temporarily to regenerate all baselines on dfx01; removed in a follow-up commit before ready-for-review
- 1 page (`web_view_page.dart`) is intentionally `skip: true`: InAppWebView is a platform view that has no headless render

## Bucket breakdown

- **Onboarding + wallet** (11): create, restore, verify_seed, home, onboarding_completed, pin x2, hw bitbox, legal x2, debug_auth
- **Settings subpages** (17): languages, currencies, network, seed, tax_report, contact, wallet_address, legal_docs x3, user_data x7
- **KYC** (15): 2fa, email x2, financial_data x4, ident, nationality, registration, subpages x5
- **Trading + support + misc** (9): receive, tx_history, sell_bitbox, sell_bank_account_selection, support x4, web_view (skipped)

## Test plan

- [ ] Bootstrap workflow runs on push, generates 56 baseline PNGs (57 minus web_view)
- [ ] Baselines reviewed visually, committed to test/goldens/screens/**/goldens/macos/
- [ ] golden-tests CI job green on the same commit
- [ ] Bootstrap workflow removed before ready-for-review

State-variant goldens (loaded/error/loading per screen) are out of scope here; they follow as separate per-feature PRs.
TaprootFreak added a commit that referenced this pull request on May 25, 2026
…tacked on #541) (#562)

## Summary

Stacked on #541 (which is itself the merge bus for #552). Two changes that together unblock unifying the Maestro handbook screenshots with the Golden baselines (see plan at `~/Documents/Claude/realunit-handbook-unification-plan.md`):

1. **Locale switch en → de**: `wrapForGolden` defaulted to `Locale('en')`, so all 59 current Goldens render in English. The Maestro handbook pins the simulator to `de_CH` and captures German UI, so the two pipelines cannot share images while they speak different languages.
2. **Handbook gap coverage**: three new Goldens for handbook pages that had no Golden equivalent:
   - `create_wallet_page_revealed`: handbook 05-seed-revealed (state variant of `state.hideSeed=false`)
   - `settings_seed_page_revealed`: handbook 19-settings-seed-revealed (`showSeed=true`)
   - `settings_confirm_logout_wallet_sheet_default`: handbook 24-settings-delete-wallet (modal in initial unchecked state)

## Mapping audit (Phase 0)

Verified against `.maestro/handbook/*.yaml`:

| Handbook page | Golden | Status |
|---|---|---|
| 01–09, 11–16, 18, 20–23, 25 | existing | ✅ |
| 05-seed-revealed | new | ✅ this PR |
| 17-settings-backup-pin | none | ⚠️ deferred (state variant of `verify_pin_page`, needs context-aware test setup) |
| 19-settings-seed-revealed | new | ✅ this PR |
| 24-settings-delete-wallet | new | ✅ this PR |
| 26-terms | `legal_document_page_default` | ⚠️ to verify visually whether the bound content matches |
| **10-biometric-prompt** | none | ❌ **out of scope**: iOS system bottom sheet from `LocalAuthentication`, not rendered by Flutter, so Skia cannot reproduce it. Will be discussed before Phase 1 (Dockerfile.handbook switch). |

## BackdropFilter validation

The existing `settings_seed_page_default` Golden already proves that Flutter's headless Skia renders `BackdropFilter` correctly (the blur is visible, not the historic XCUITest-black-PNG issue). The same applies to the new revealed/hidden state variants and the `create_wallet_view`'s `SeedBlurCard`.

## Bootstrap workflow

`.github/workflows/golden-bootstrap.yaml` is re-introduced temporarily, triggered by push to this branch. It runs `flutter test test/goldens --update-goldens` on the `realunit-app` self-hosted dfx01 runner and uploads the regenerated PNGs as `golden-baselines`. I download the artifact, commit the baselines into `test/goldens/screens/**/goldens/macos/`, then delete the bootstrap workflow file in a follow-up commit, the same pattern as the pilot PR.

## Test plan

- [ ] `golden-bootstrap` workflow run completes green on dfx01
- [ ] Baselines downloaded + committed
- [ ] `golden-bootstrap.yaml` removed
- [ ] `Visual Regression` job in pull-request.yaml green on final commit
- [ ] Spot-check sample DE Goldens visually match the handbook screenshots
- [ ] Decide on `10-biometric-prompt` and `17-settings-backup-pin` before promoting to ready-for-review

## Out of scope

- `Dockerfile.handbook` switch from `docs/handbook/screenshots/` to `test/goldens/` (Phase 1 of the unification plan)
- Maestro pipeline retirement / nightly-only mode (Phase 2)
TaprootFreak added a commit that referenced this pull request on May 25, 2026
…unner (#541)

## Summary

- Introduces visual-regression goldens for **every `lib/screens/**/*_page.dart`** in the repo (56 of 57 rendered, 1 explicit `skip: true`)
- Render host is the dfx01 self-hosted runner (Mac Studio M3 Ultra, labels `self-hosted, macOS, ARM64, m3-ultra, realunit-app`); pinning the hardware keeps Skia/CoreText state identical between baseline generation and validation
- Stack: [alchemist](https://pub.dev/packages/alchemist) 0.14.0, Open Sans (SIL OFL 1.1) committed as an asset (the previous system-font fallback wasn't deterministic across hosts)
- New CI job `golden-tests` in `.github/workflows/pull-request.yaml` runs in parallel to `build`; `build` passes `--exclude-tags golden` so the visual-regression tests stay confined to the dfx01 runner

## Coverage

- **Onboarding + wallet lifecycle (11):** create_wallet, restore_wallet, verify_seed, home, onboarding_completed, setup_pin, verify_pin, hw bitbox, legal x2, debug_auth
- **Settings + subpages (17):** languages, currencies, network, seed, tax_report, contact, wallet_address, legal_docs x3, user_data x7
- **KYC (15):** 2fa, email x2, financial_data x4, ident, nationality, registration, subpages x5 (account_merge, completed, failure, loading, pending)
- **Trading + support + misc (9):** receive, transaction_history, sell_bitbox, sell_bank_account_selection, support x4, web_view (skipped: InAppWebView is a platform view with no headless render)
- **Pilot 5:** welcome (iOS + Android theme variants), dashboard, settings, buy (initial + payment-info-loaded), sell (no-account + with-balance)

**Total: 57 test files, 59 baseline PNGs.**

## Verified

- Baselines generated on dfx01 via the (now removed) `golden-bootstrap.yaml` workflow, downloaded and committed
- `golden-tests` CI ran green on the stacked PR #552 ([run 26342855405](https://github.com/DFXswiss/realunit-app/actions/runs/26342855405)), proving the committed baselines match a fresh render on dfx01
- Drift detection verified during the pilot phase: a probe pixel change in `realUnitBlue` flipped CI red and uploaded master/test/maskedDiff/isolatedDiff PNGs as a `golden-diffs` artifact
- Local `flutter analyze` clean, `flutter test --exclude-tags golden` passes (2148/2148)

## Documentation

- [`docs/visual-regression-tests.md`](https://github.com/DFXswiss/realunit-app/blob/feat/visual-regression-pilot/docs/visual-regression-tests.md): bootstrap pattern, drift workflow, Flutter-bump regeneration, dfx01-outage fallback
- Runner setup + tooling docs in [DFXServer/server@develop](https://github.com/DFXServer/server/blob/develop/infrastructure/dfx01/actions-runners/realunit-app-tooling.md)

State-variant goldens (loaded/error/loading per screen) are out of scope here; they follow as separate per-feature PRs.

## Follow-ups after merge

- Set `golden-tests` as a required status check on develop branch protection