Tried an approach for the text case in pixelmatch #7552

Open
wants to merge 10 commits into dev-2.0
Conversation

Vaivaswat2244

Resolves #7496

Changes:

Modified the checkMatch function in visualtest.js to be more tolerant of slight text position shifts while still catching meaningful changes. Key changes (a rough sketch of both ideas follows the list below):

Added text detection using pixel pattern analysis:

  • Looks for moderate density and high contrast transitions typical of text
  • Helps identify regions where small shifts should be allowed

Implemented shift tolerance for text regions:

  • Allows up to 2-pixel shifts for detected text areas
  • Still fails on actual content changes
  • Configurable via shiftThreshold parameter
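For reference, a minimal sketch of the two ideas, assuming RGBA pixel data in a Uint8ClampedArray. isLikelyText and shiftThreshold are the names used in this PR; the matchesWithinShift helper and all specific thresholds here are illustrative only, not the exact visualtest.js code:

// Heuristic text detection: text regions tend to have moderate ink
// density plus frequent high-contrast horizontal transitions.
function isLikelyText(pixels, width, height) {
  let darkPixels = 0;
  let transitions = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 1; x < width; x++) {
      const i = 4 * (y * width + x);
      const lum = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
      const prev = 0.299 * pixels[i - 4] + 0.587 * pixels[i - 3] + 0.114 * pixels[i - 2];
      if (lum < 128) darkPixels++;
      if (Math.abs(lum - prev) > 100) transitions++; // sharp edge
    }
  }
  const density = darkPixels / (width * height);
  // Moderate density plus many sharp edges is typical of rendered text
  return density > 0.05 && density < 0.5 && transitions / (width * height) > 0.02;
}

// Shift tolerance: treat a pixel as matching if an identical pixel
// exists within shiftThreshold (up to 2) pixels in any direction.
function matchesWithinShift(expected, actual, x, y, width, height, shiftThreshold = 2) {
  const at = (img, px, py) => {
    const i = 4 * (py * width + px);
    return `${img[i]},${img[i + 1]},${img[i + 2]},${img[i + 3]}`;
  };
  const target = at(actual, x, y);
  for (let dy = -shiftThreshold; dy <= shiftThreshold; dy++) {
    for (let dx = -shiftThreshold; dx <= shiftThreshold; dx++) {
      const nx = x + dx, ny = y + dy;
      if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
      if (at(expected, nx, ny) === target) return true;
    }
  }
  return false;
}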

Screenshots of the change:

PR Checklist

@davepagurek
Contributor

Thanks @Vaivaswat2244, nice approach! I have a few questions that might help us determine how to get this across the finish line.

  • Do you have a sense of how often isLikelyText returns true, and if this gets called on anything non-text? I wonder if maybe rather than trying to detect whether we think there is text, we can directly record when text is called. Since we pass in a p5 instance to the test case anyway, maybe we can do something like this to patch text() to record that it got called:

    // Wrap text() so we can record whether this test ever draws text
    let hasText = false
    const prevText = myp5.text
    myp5.text = function(...args) {
      hasText = true
      prevText.apply(this, args)
    }
  • It looks like some tests are failing on CI now. Do you know if these look like (1) false positives that the checker needs to address, (2) cases where we need to regenerate the screenshots because the old ones were out of date but still passed on CI under the old pixel checker, or (3) actual bugs in p5 that are surfaced now by using a better pixel checker?

@Vaivaswat2244
Author

Vaivaswat2244 commented Feb 16, 2025

Yeah sure, patching text() makes much more sense. I can implement that.
Regarding the failing tests, I think these are the 2nd case you described: they failed the CI tests after regenerating on my machine, but were not passed by the old checker (although the previous ones seemed fine). I'm not so sure how to move ahead from here, so I guess for now I'll make the text() change and see how that works out.

@davepagurek
Contributor

Are you able to see the test output from CI on your PR by clicking the Details button here?
[image]

For a next step, I'd look at the test output, and try regenerating the failed screenshots locally and pushing those. We can see if that fixes it on CI, and also by looking at the new images in the Files Changed tab, whether we think the changed images are expected.

@Vaivaswat2244
Author

Hey, so I just noticed that the files I added in the earlier commit aren't the ones failing the tests on CI; the failures are in other files that weren't touched.

I then changed the files that were failing tests on CI and ran the tests locally. The tests now fail for different files in the typography and webgl cases, so I guess there is a problem with my changes in the visualtest file.
Maybe I'll revert these changes and think of something else.

@davepagurek
Contributor

Just looking at the last failure in the test file:

Expected:
[image]

Received:
[image]

Diff:
[image]

...it looks like a few isolated pixels are off. I wonder if we can do something to augment pixelmatch and check if the difference exists just on a single pixel (so no surrounding pixels are also flagged) and then ignore those?
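A sketch of what that check could look like, assuming a 0/1 diff mask built from pixelmatch's output; the isIsolatedDiff helper name is hypothetical:

// Returns true when a flagged pixel has no flagged pixels among its
// 8 neighbours, i.e. the difference is a single isolated pixel.
function isIsolatedDiff(diffMask, width, height, x, y) {
  for (let dy = -1; dy <= 1; dy++) {
    for (let dx = -1; dx <= 1; dx++) {
      if (dx === 0 && dy === 0) continue;
      const nx = x + dx, ny = y + dy;
      if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
      if (diffMask[ny * width + nx]) return false; // a neighbour differs too
    }
  }
  return true; // no flagged neighbours, so this pixel can be ignored
}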

@Vaivaswat2244
Author

Sure, I can try that

@Vaivaswat2244
Author

Hey @davepagurek, so I tried to customize pixelmatch for our needs. To handle the isolated pixels, and also some small clusters which I saw in some failed screenshots, I tried a clustering approach.
The clustering algorithm uses a queue-based breadth-first search (BFS) to identify and analyze connected pixel differences (see the sketch after this list):

  1. Initial Difference Detection:

    • Pixelmatch identifies different pixels between images
    • Different pixels are marked in red in the diff canvas
  2. Cluster Identification Process:

    • Start from each different pixel
    • Use a queue-based breadth-first search to find connected pixels
    • Explore neighboring pixels within a defined radius
    • Track:
      • Total number of connected different pixels
      • Cluster size
      • Cluster center coordinates
  3. Cluster Filtering Criteria:

    • Ignore small, isolated pixel differences
    • Focus on meaningful, connected changes
    • Parameters control sensitivity:
      • minClusterSize: Minimum pixels to consider a significant difference
      • maxTotalDiffPixels: Maximum total different pixels allowed
  4. Key Benefits:

    • Distinguishes between minor rendering variations
    • Catches substantial visual changes
    • Provides detailed diff analysis
    • Adaptable to different testing requirements
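Here is a rough sketch of the steps above, assuming a 0/1 diff mask from pixelmatch. minClusterSize and maxTotalDiffPixels are the parameters named in the list; the findClusters and isSignificantDiff helpers are illustrative, not the exact PR code:

// BFS clustering over the diff mask: each connected group of flagged
// pixels becomes one cluster with a size and a center.
function findClusters(diffMask, width, height) {
  const visited = new Uint8Array(width * height);
  const clusters = [];
  for (let start = 0; start < width * height; start++) {
    if (!diffMask[start] || visited[start]) continue;
    const queue = [start]; // queue-based BFS from each unvisited diff pixel
    visited[start] = 1;
    let size = 0, sumX = 0, sumY = 0;
    while (queue.length > 0) {
      const idx = queue.shift();
      const x = idx % width;
      const y = (idx - x) / width;
      size++; sumX += x; sumY += y;
      // Explore the neighbourhood (radius 1 here; the "defined radius"
      // above could be larger)
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx, ny = y + dy;
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          const n = ny * width + nx;
          if (diffMask[n] && !visited[n]) {
            visited[n] = 1;
            queue.push(n);
          }
        }
      }
    }
    clusters.push({ size, centerX: sumX / size, centerY: sumY / size });
  }
  return clusters;
}

// Filtering: drop clusters below minClusterSize, then fail only if the
// remaining flagged pixels exceed maxTotalDiffPixels.
function isSignificantDiff(clusters, minClusterSize, maxTotalDiffPixels) {
  const kept = clusters.filter(c => c.size >= minClusterSize);
  const total = kept.reduce((sum, c) => sum + c.size, 0);
  return total > maxTotalDiffPixels;
}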

Now, the thing is that I have committed two variations of the approach: the earlier one is less tolerant (it failed the same cases that are failing on CI), and the later one is more tolerant (it passed all the cases locally).

So I need some help with this. In one of the failed cases:
Expected:
[image]

Received:
[image]

Diff:
[image]

Now, this is a scenario that fails in both variations (less and more tolerant).
Can you suggest some changes for this? I think this scheme can solve our problem, since the difference between the failed CI tests and my local results is minimal, which I'm assuming means the CI is working as expected.

@davepagurek
Contributor

Nice, the clustering idea is good and gives us some options for dealing with issues like this! I suspect if a shape is consistently 1px thick, then it could be the result of a shift due to environment differences.

A way to check that might be to see if every pixel in the cluster has at most two neighbours?

@Vaivaswat2244
Author

Vaivaswat2244 commented Mar 7, 2025

I have made some changes to this, so now, in addition to the clustering logic, our algorithm does the following (a sketch follows the list):

  • Iterates through all 8 neighboring pixels (excluding itself) to determine how many are part of the difference.
  • If more than 80% of the pixels in the cluster have ≤2 neighbors, the difference is classified as a line shift (isLineShift = true).
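A minimal sketch of that heuristic; isLineShift is the flag named above, while the cluster representation (a list of {x, y} pixels) and the helper signature are assumptions:

// Count flagged neighbours for each pixel in a cluster, then classify
// the cluster as a 1px line shift when at least 80% of its pixels have
// two or fewer flagged neighbours.
function isLineShift(clusterPixels, diffMask, width, height) {
  let thin = 0;
  for (const { x, y } of clusterPixels) {
    let neighbours = 0;
    // Check all 8 neighbouring pixels, excluding the pixel itself
    for (let dy = -1; dy <= 1; dy++) {
      for (let dx = -1; dx <= 1; dx++) {
        if (dx === 0 && dy === 0) continue;
        const nx = x + dx, ny = y + dy;
        if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
        if (diffMask[ny * width + nx]) neighbours++;
      }
    }
    if (neighbours <= 2) thin++;
  }
  return thin / clusterPixels.length >= 0.8;
}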

Using this, and keeping the thresholds the same as in the more tolerant commit, the tests are now passing.
After some trial and error with the thresholds, I found that 0.5 for pixelmatch is optimal for us, as it passes the tests in the default cases. And in PR #7495, I reverted the changes and added pervVertex, and the test which you added was the only one that failed, so I think these are the expected results.

Expected:
[image]

Received:
[image]

Diff:
[image]

So, is there anything else I should change?

Contributor

@davepagurek davepagurek left a comment


Awesome, this is looking really promising, great work! I think the algorithm is in a spot where it's good to go. The last thing might be to add a comment somewhere explaining the general approach of the diff algorithm, similar to what you've described in comments to me here, and maybe additionally why it's necessary at all (because contributors running tests just on their own system might not realize that there will be differences on CI.)

const ratio = expected.width / expected.height;
const narrow = ratio !== 1;
if (narrow) {
  scale *= 2;
}

Contributor


Can we take out the whitespace here?

Author


Right!

// Define significance thresholds
const MIN_CLUSTER_SIZE = 4; // Minimum pixels in a significant cluster
const MAX_TOTAL_DIFF_PIXELS = 40; // Maximum total different pixels
const MAX_LINE_SHIFT_PIXELS = 200;
Contributor


Does this get used?

Author

@Vaivaswat2244 Vaivaswat2244 Mar 7, 2025


The first two are used in the checkMatch function itself for the final comparison.
Varying these thresholds changes the test results as well: reducing MIN_CLUSTER_SIZE from this value makes our check less tolerant and can cause false CI failures like we saw in the isolated pixel case.
MAX_LINE_SHIFT_PIXELS is not needed; I'll remove it.
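For illustration, a sketch of how those two constants could factor into the final verdict, assuming the cluster objects produced by the BFS step; this is not the exact checkMatch code:

// Clusters below MIN_CLUSTER_SIZE are ignored; the match fails only if
// the remaining different pixels exceed MAX_TOTAL_DIFF_PIXELS.
const significant = clusters.filter(c => c.size >= MIN_CLUSTER_SIZE);
const totalDiffPixels = significant.reduce((sum, c) => sum + c.size, 0);
const ok = totalDiffPixels <= MAX_TOTAL_DIFF_PIXELS;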
