fix(doc-tests): closes #244 — auto-downsample retina 2× screenshots before SSIM #298
Merged
proggeramlug merged 1 commit into main on Apr 29, 2026
The screenshot comparison in `image_diff::diff()` hard-errored on any
size mismatch before SSIM ever ran. On macOS retina hardware the
gallery binary captures a 2× PNG (e.g. 1800×1940) while the blessed
baseline is 1× (900×970); the strict `actual.width() != baseline.width()`
check returned `Err("size mismatch")` immediately, producing a recurring
SCREENSHOT_DIFF failure for 30+ commits.
Fix: when the actual screenshot is exactly 2× the baseline in both
dimensions, downsample with a 2×2 box-filter (`halve()`) before SSIM.
Any other size mismatch still returns an error (so genuine resize
regressions are still caught). A downsampled 1× image compared with
the existing 1× baseline falls well within the 0.05 SSIM threshold.
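The 2×2 box filter described above can be sketched as follows. This is a minimal illustration, not the code from `image_diff.rs`: the `Gray` struct, the single-channel 8-bit layout, and the round-to-nearest choice are all assumptions made for the example, and only the name `halve` comes from this PR.

```rust
/// Hypothetical 8-bit grayscale image (assumption for this sketch;
/// the real screenshots are presumably multi-channel).
#[derive(Debug, Clone, PartialEq)]
struct Gray {
    width: usize,
    height: usize,
    /// Row-major pixel data, `width * height` entries.
    pixels: Vec<u8>,
}

/// Downsample by a factor of 2 in each dimension by averaging every
/// 2×2 block of source pixels. Assumes even width and height.
fn halve(src: &Gray) -> Gray {
    let (w, h) = (src.width / 2, src.height / 2);
    let mut pixels = Vec::with_capacity(w * h);
    for y in 0..h {
        for x in 0..w {
            // Sum the four source pixels covered by output pixel (x, y).
            let mut sum = 0u32;
            for dy in 0..2 {
                for dx in 0..2 {
                    let idx = (2 * y + dy) * src.width + (2 * x + dx);
                    sum += u32::from(src.pixels[idx]);
                }
            }
            // Adding 2 before dividing by 4 rounds to nearest
            // instead of truncating (an assumption of this sketch).
            pixels.push(((sum + 2) / 4) as u8);
        }
    }
    Gray { width: w, height: h, pixels }
}
```

Under these assumptions, a 2×2 block holding 0, 4, 8 and 12 collapses to a single pixel of value 6, and a uniform block keeps its value, so a clean 2× capture of the same content should land very close to the 1× baseline.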
Added `#[derive(Debug)]` to `DiffOutcome` (needed for `unwrap_err()` in
tests) and 3 unit tests pinning the new behaviour:
- `halve_averages_2x2_blocks` — verifies box-filter arithmetic
- `diff_retina_2x_against_1x_baseline_passes` — the exact retina repro
- `diff_arbitrary_size_mismatch_errors` — non-2× mismatches still fail
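The size rule these tests pin (equal sizes compare directly, an exact 2× capture is downsampled first, anything else is an error) might look roughly like the sketch below. `SizeCheck` and `check_sizes` are hypothetical names for illustration, and the error text is only modelled on the "size mismatch" message quoted above. The actual `diff()` may structure this differently.

```rust
/// How the actual capture relates to the baseline
/// (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum SizeCheck {
    /// Same dimensions: compare directly.
    Same,
    /// Actual is exactly 2× the baseline in both dimensions:
    /// downsample before comparing.
    Retina2x,
}

/// Classify an (actual, baseline) pair of `(width, height)` sizes.
/// Any mismatch other than an exact 2× in both dimensions is an error,
/// so genuine resize regressions are still reported.
fn check_sizes(actual: (u32, u32), baseline: (u32, u32)) -> Result<SizeCheck, String> {
    if actual == baseline {
        Ok(SizeCheck::Same)
    } else if baseline.0.checked_mul(2) == Some(actual.0)
        && baseline.1.checked_mul(2) == Some(actual.1)
    {
        Ok(SizeCheck::Retina2x)
    } else {
        Err(format!(
            "size mismatch: actual {}x{}, baseline {}x{}",
            actual.0, actual.1, baseline.0, baseline.1
        ))
    }
}
```

With the sizes from this PR, `check_sizes((1800, 1940), (900, 970))` takes the 2× path, while a capture that is doubled in only one dimension, or scaled by 1.5×, is still rejected.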
https://claude.ai/code/session_01FVjeQDnyymLz7yTJbnsyKG
proggeramlug added a commit that referenced this pull request on Apr 29, 2026
… merges

The lint job on c61296b caught rustfmt drift from the recent batch of squash-merges:
- crates/perry-doc-tests/src/image_diff.rs (PR #298)
- crates/perry-hir/src/destructuring.rs (PR #301)
- crates/perry-hir/src/lower/expr_new.rs (PR #301)
- crates/perry-hir/src/lower_decl.rs (PR #301)
- crates/perry-stdlib/src/fetch.rs (PR #301)
- crates/perry-stdlib/src/streams.rs (PR #301)
- crates/perry-stdlib/src/ethers.rs (PR #299, my conflict resolution inserted a long line that needed wrapping)
- crates/perry-codegen/src/lower_call.rs + crates/perry-codegen/src/lower_call/builtin.rs + crates/perry-codegen/src/runtime_decls.rs (PR #301)

Plus Cargo.lock sync from 0.5.384 → 0.5.385 (my v0.5.385 commit landed Cargo.toml's version bump but I forgot to stage Cargo.lock).

Pure `cargo fmt --all` output, no hand edits. Verified `cargo build --release -p perry -p perry-runtime -p perry-stdlib` clean post-fmt in 1m 31s. No version bump; same precedent as ea95e85 (rustfmt baseline as chore companion to #294).
Summary
- `crates/perry-doc-tests/src/image_diff.rs:35-43`: the SSIM comparison hard-errored on any size mismatch before SSIM ran. On retina macOS the gallery binary captures at 2× backing scale (e.g. 1800×1940), but the blessed baseline is 1× (900×970). The strict equality check returned `Err("size mismatch")` immediately → `SCREENSHOT_DIFF` failure for 30+ consecutive commits.
- Fix: when the actual screenshot is exactly 2× the baseline in both dimensions, downsample with a 2×2 box filter (`halve()`) before SSIM. After downsampling, the 1× pixels match the 1× baseline well within the existing `0.05` SSIM threshold. Any other size mismatch still returns `Err` (genuine resize regressions are caught).
- 3 unit tests (`cargo test -p perry-doc-tests` → 10/10):
  - `halve_averages_2x2_blocks`: verifies box-filter arithmetic
  - `diff_retina_2x_against_1x_baseline_passes`: the exact retina repro shape (4×4 actual vs 2×2 baseline)
  - `diff_arbitrary_size_mismatch_errors`: non-2× mismatches still produce a size-mismatch error

Files changed
- `crates/perry-doc-tests/src/image_diff.rs`: `halve()` downsampler; update `diff()` to use it on 2× captures; add `#[derive(Debug)]` to `DiffOutcome`; 3 unit tests
- `crates/perry-doc-tests/Cargo.toml`: `tempfile = "3"` as dev-dependency (for unit tests only)

Test plan
- `cargo test --release -p perry-doc-tests` → 10/10 passed (was 7/10)
- `./target/release/doc-tests --skip-xcompile` → 80/80 passed (was 79/80)

Closes #244
https://claude.ai/code/session_01FVjeQDnyymLz7yTJbnsyKG
Generated by Claude Code