Improve RMSE math and lower thresholds in Canvas tests #4011

HalfWhitt · 2025-12-24T07:15:51Z

Fixes #4005; I've been nerd-sniped. ~~Uploading this with threshold values that work on macOS; presumably at least a couple will need bumping up a little depending on test failures.~~

Also removes the unused assert_pixel.

PR Checklist:

All new features have been tested
All new features have been documented
I have read the CONTRIBUTING.md file
I will abide by the code of conduct

HalfWhitt · 2025-12-24T08:51:04Z

First of all, I know most of the core team's on holiday leave, so I'm aware this might not get reviewed for a while.

@corranwebster, I'd appreciate a double-check on my math. I've changed how it's written out Python-wise, but math-wise the only differences should be, per your suggestions:

- Dividing by the four bands
- Converting the images to RGBa to test premultiplied alpha

Evidently the premultiplying helps a lot in some cases. After all, dividing by bands should, alone, only halve the error amounts... but the threshold for test_multiline_text was able to drop all the way from 0.09 to 0.01, and the one for test_transparency from 0.1 to 0.01.

test_write_text (which uses a variety of fonts, not just system as in test_multiline_text) only dropped from 0.07 to the expected 0.035, but it's interesting that Gtk on Wayland was the only one to go above 0.02.
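As a rough illustration of why dividing by the bands alone should only halve the error, here is a pure-Python sketch. It is not the PR's code: `rmse` and its `per_band` flag are hypothetical names, and pixels are plain RGBA tuples rather than Pillow images.

```python
from math import sqrt

BANDS = 4  # R, G, B, A


def rmse(actual, expected, per_band=True):
    """RMSE between two equal-length lists of RGBA tuples (values 0-255).

    Hypothetical helper for illustration; not the testbed's implementation.
    """
    if len(actual) != len(expected):
        raise ValueError("images must have the same number of pixels")
    total = sum(
        ((a - e) / 255) ** 2
        for pixel_a, pixel_e in zip(actual, expected)
        for a, e in zip(pixel_a, pixel_e)
    )
    # Averaging over pixels * bands instead of pixels alone divides the
    # mean by 4, and therefore the root by 2.
    count = len(actual) * (BANDS if per_band else 1)
    return sqrt(total / count)


rendered = [(255, 0, 0, 255), (0, 0, 0, 255)]
reference = [(0, 0, 0, 255), (0, 0, 0, 255)]

per_pixel = rmse(rendered, reference, per_band=False)
per_band = rmse(rendered, reference, per_band=True)
```

Here `per_pixel` comes out at exactly twice `per_band`, so a threshold of 0.07 maps to 0.035 from this change alone, as seen with test_write_text. Any larger drop has to come from the premultiplied alpha.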

testbed/tests/widgets/test_canvas.py

corranwebster

Looks good, as a non-maintainer.

HalfWhitt · 2025-12-24T09:12:05Z

> Looks good, as a non-maintainer.

Maintainer or not, I very much appreciate your canvas-and-image-related expertise so far!

johnzhou721

Nice use of ImageChops.difference!

HalfWhitt · 2025-12-26T19:23:26Z

> Nice use of ImageChops.difference!

Credit to corranwebster, they're the one who suggested it 👍

kattni · 2025-12-27T03:42:29Z

@HalfWhitt Please ping me when this is ready for review. Thanks!

HalfWhitt · 2025-12-27T03:45:17Z

@kattni Ping : )

HalfWhitt · 2025-12-29T03:37:47Z

(This is still ready to go, I just decided to arrange it slightly differently.)

freakboy3742

Can't argue with much here. ImageChops.difference() is a new one for me, but I'm very much in favor of adopting a pre-existing difference measure rather than inventing one. And, if it means our error thresholds are lower, all the better.

HalfWhitt · 2025-12-29T06:53:15Z

> ImageChops.difference() is a new one for me, but I'm very much in favor of adopting a pre-existing difference measure rather than inventing one. And, if it means our error thresholds are lower, all the better.

Just for the record, ImageChops.difference is only a tidier way of doing the same math as what I initially submitted:

        total = sum(
            ((actual - expected) / 255) ** 2
            for actual, expected in zip(
                chain(*scaled_image.convert("RGBa").getdata()),
                chain(*reference_image.convert("RGBa").getdata()),
                strict=True,
            )
        )

It's not relevant to lowering the error; that comes entirely from (1) dividing by the number of bands, and (2) premultiplying the alpha, so we're not testing invisible differences. (Both were corranwebster's ideas on the original ticket.)
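To make the "invisible differences" point concrete, here is a small pure-Python sketch. `premultiply` and `band_rmse` are hypothetical helpers working on plain RGBA tuples, and the rounding only approximates what an RGBA to RGBa conversion does.

```python
from math import sqrt


def premultiply(pixel):
    """Scale R, G and B by alpha (all values 0-255), keeping alpha as-is."""
    r, g, b, a = pixel
    return tuple(round(c * a / 255) for c in (r, g, b)) + (a,)


def band_rmse(actual, expected):
    """RMSE over every band of every pixel, with values normalized to 0-1."""
    squares = [
        ((a - e) / 255) ** 2
        for pixel_a, pixel_e in zip(actual, expected)
        for a, e in zip(pixel_a, pixel_e)
    ]
    return sqrt(sum(squares) / len(squares))


# One fully transparent pixel whose hidden color disagrees between the two
# images, plus one opaque pixel that matches exactly.
rendered = [(255, 0, 0, 0), (10, 20, 30, 255)]
reference = [(0, 0, 255, 0), (10, 20, 30, 255)]

straight = band_rmse(rendered, reference)
premultiplied = band_rmse(
    [premultiply(p) for p in rendered],
    [premultiply(p) for p in reference],
)
```

With straight alpha the hidden red-versus-blue mismatch contributes an error of 0.5, even though both images would look identical on screen. After premultiplying, both transparent pixels collapse to (0, 0, 0, 0) and the error is 0.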

HalfWhitt added 3 commits December 24, 2025 02:14

Change RMSE math and thresholds

fcc6e29

Relax some thresholds

965e7b1

Remove or update outdated comments

a708028

HalfWhitt marked this pull request as ready for review December 24, 2025 08:50

corranwebster reviewed Dec 24, 2025

corranwebster reviewed Dec 24, 2025

corranwebster reviewed Dec 24, 2025

Clean up iteration, fix silly math error

419af44

corranwebster approved these changes Dec 24, 2025

johnzhou721 approved these changes Dec 24, 2025

HalfWhitt mentioned this pull request Dec 26, 2025

Don't clear path after fill() or stroke(). #4008

Merged

7 tasks

HalfWhitt added 4 commits December 26, 2025 17:34

Bake-in 200x200 image size

4323868

Merge branch 'main' into rmse

8a2e5bb

Try tighter thresholds on new tests

d292ceb

Bump up text_and_path threshold

890a9fa

Obsessive formatting tweak

074c6b6

freakboy3742 approved these changes Dec 29, 2025

freakboy3742 merged commit 88c6a48 into beeware:main Dec 29, 2025
56 checks passed

HalfWhitt deleted the rmse branch December 29, 2025 06:51
