Snapshot taken on CI has slight color differences compared to local #313

edulpn · 2020-03-17T13:29:19Z

Hi everybody, I'm having an issue here with color discrepancies in an image view. I'm using KIF framework for UI Tests and SnapshotTesting to take screenshots between interactions (asserts made on the app key window). Everything works fine, except that I get color differences from snapshots taken on my local machine vs snapshots taken on CI machine (both running the same stack, Mojave + Xcode 11.3).

Some examples below:

This one is taken by CI machine

This one is taken by my local computer

This is the comparison of both images made by imagemagick

And these are the color discrepancies for the same pixel, measured with the macOS Digital Color Meter

I know there are lots of variables and also using KIF, but do you guys have any idea of what might be causing this difference?

Thanks!

iandundas · 2020-04-05T08:17:34Z

Just to chime in that we're seeing the same issue. When we run our snapshot tests from fastlane, assets (PNGs etc.) seem to be rendered slightly differently compared to when we launch the same tests on the same machine, same simulator, from Xcode. Very strange.

gobetti · 2020-04-23T03:56:51Z

I'm not seeing any difference between fastlane and Xcode on my machine so far, but I do also experience tiny differences between my machine's and CI's snapshots, specifically on iOS 13. Having a non-zero corner radius in a native view is enough to produce a difference; assets don't need to be involved.

edit: including screenshots now. What scares me is that this tiny difference is consistent on both machines.

[Screenshots: mine vs. CI]

If there's no workaround for that other than lowering the precision, I wonder if it would make sense to have a tolerance specifically for cases like this, where the R, G, and/or B channels differ by only ±1...

edit 2: on iOS 12, tests pass on both 2x and 3x screens; on iOS 13, both fail.
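The ±1 per-channel tolerance floated above can be sketched in a few lines of plain Swift. This is only an illustration of the idea, not SnapshotTesting's comparison code: the `Pixel` type and the `channelsMatch` and `imagesMatch` helpers are hypothetical names, and real snapshots would first need to be decoded into RGBA pixels.

```swift
/// One RGBA pixel with 8 bits per channel (hypothetical helper type).
struct Pixel: Equatable {
    var r: UInt8, g: UInt8, b: UInt8, a: UInt8
}

/// Returns true when every channel of `lhs` is within `tolerance`
/// of the matching channel of `rhs`.
func channelsMatch(_ lhs: Pixel, _ rhs: Pixel, tolerance: Int = 1) -> Bool {
    func close(_ x: UInt8, _ y: UInt8) -> Bool {
        // Widen to Int first so the subtraction cannot overflow.
        abs(Int(x) - Int(y)) <= tolerance
    }
    return close(lhs.r, rhs.r) && close(lhs.g, rhs.g)
        && close(lhs.b, rhs.b) && close(lhs.a, rhs.a)
}

/// Compares two equally sized pixel buffers using the per-channel tolerance.
func imagesMatch(_ lhs: [Pixel], _ rhs: [Pixel], tolerance: Int = 1) -> Bool {
    guard lhs.count == rhs.count else { return false }
    return zip(lhs, rhs).allSatisfy { pair in
        channelsMatch(pair.0, pair.1, tolerance: tolerance)
    }
}
```

With the default tolerance, a pixel of (200, 100, 50) matches (201, 99, 50) but not (203, 100, 50). Passing `tolerance: 0` falls back to exact equality.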

aegzorz · 2020-04-23T20:09:49Z

We're seeing similar results between different machines. As soon as we add "real life" images, we get small, imperceptible differences that cause the test to fail.

gobetti · 2020-04-23T20:20:57Z

@aegzorz could you please check if your differences are only on iOS 13? And how much is the difference? (i.e. what's the RGB for the different pixel in the two different runs)

aegzorz · 2020-04-23T20:57:48Z

We're only testing on iOS 13 currently. Seems using shadows on views throws it off sometimes as well. Haven't done any RGB measurements yet.

gobetti · 2020-04-23T21:47:09Z

Sadly, neither lowering the precision nor treating pixels with an invisible difference as "equal" is a performant solution. Tests that don't pass the memcmp comparison are orders of magnitude slower, so having to go through the byte-by-byte comparison loop is probably very undesirable.
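To make the performance concern concrete, here is a rough sketch of the two-stage comparison described above: a cheap whole-buffer equality check first, and a per-byte loop only when that check fails. The function name `buffersMatch` and the way `precision` is turned into a byte budget are assumptions for illustration; the library's actual implementation may differ.

```swift
/// Compares two raw byte buffers. `precision` is the fraction of bytes
/// that must be identical (1.0 means an exact match is required).
func buffersMatch(_ reference: [UInt8], _ candidate: [UInt8], precision: Double) -> Bool {
    guard reference.count == candidate.count else { return false }

    // Fast path: stands in for the memcmp-style comparison.
    if reference == candidate { return true }

    // An exact match was required and the fast path already failed.
    if precision >= 1 { return false }

    // Slow path: count differing bytes, bailing out as soon as the
    // allowed budget is exceeded.
    let allowedDifferences = Int(Double(reference.count) * (1 - precision))
    var differences = 0
    for (expected, actual) in zip(reference, candidate) where expected != actual {
        differences += 1
        if differences > allowedDifferences { return false }
    }
    return true
}
```

Identical buffers never reach the loop, which is why any snapshot that differs by even one byte pays the full cost of the slow path.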

stephencelis · 2020-05-06T19:50:20Z

We have this in our README:

⚠️ Warning: Snapshots must be compared using a simulator with the same OS, device gamut, and scale as the simulator that originally took the reference to avoid discrepancies between images.

Is it possible to ensure that CI and developer machines test in a consistent environment?

iandundas · 2020-05-06T20:00:27Z

Could be something there, but we were seeing it on the exact same simulator (model, OS etc) on the same machine, with the difference being only between running the tests from Xcode vs running them (locally) from fastlane

stephencelis · 2020-05-06T20:03:26Z

Huh. Perhaps someone with fastlane experience can explain? Maybe it's worth opening an issue with them to ask?

iandundas · 2020-05-06T20:04:11Z

Good idea, will do 👍🏻

iandundas · 2020-05-15T11:07:33Z

I was just about to, when I realised above that some people are saying it is not fastlane which is causing this issue for them (but simply running the tests on a different machine). So a bit hesitant now to open that fastlane issue 🤔.

gscalzo · 2020-05-16T14:59:38Z

@iandundas it could be related to #288

Nikoloutsos · 2021-06-28T21:33:45Z

For future readers: I faced this issue with different colours on different machines.
It happened when I tried to solve another problem with masked corner radius. To solve that, I had to add a host application to the test module and enable the drawHierarchyInKeyWindow parameter of the .image strategy.
I removed both and then recorded the snapshot tests on the iPhone 8 simulator (iOS 14.4).
Then I forced fastlane to run the tests on this simulator on CI, and everything works great with 100% precision!

To me the problem was that since Xcode 11, simulators use the GPU to render things and not the CPU.

AlexApriamashvili · 2022-01-04T01:06:15Z

Hi guys,

I'm curious if there are any other solutions to this problem rather than hosting the test bundle inside the app executable? Unfortunately, this doesn't work in my case.

I've been looking into this problem for some time and here are my thoughts:

My understanding might not be completely correct, but I think that this problem might occur due to the fact that some CI (mostly cloud) runners would not have any graphic modules at all, so the entire rendering would have to happen on the CPU, versus GPU that is used when we are running tests via Xcode (>= v11.0). I also know that there is a difference in how rendering works for the simulators that run in the headless mode vs the ones that we see on screen. So even if the simulated device (and the machine) is the same, there is still a high chance of running into this issue, while simply launching tests in different ways.

I've confirmed this on my local machine by recording snapshots via xcodebuild and then verifying them by running the tests from Xcode, and vice versa. In both cases, there were insignificant differences between the reference and the result images, which caused the tests to fail, given that the precision was left unchanged.

I have a suspicion that reference images recorded via fastlane (the same xcodebuild) would have a better chance of matching the results received on CI. However, the developer experience of having to use xcodebuild to record the reference images would be worse than just using Xcode. I also reckon that there is still quite a high chance for those tests to be flaky.

Another observation that I've made is that in most of the cases the difference was in how assets (icons, images, etc) are being rendered, as the rest of the components were quite identical. I'm curious if anyone has explored how the use of different UIImage.RenderingMode and UIGraphicsRendererFormat could affect the result of running snapshot tests?

I would be keen on having a discussion on this matter here

bill-florio · 2022-05-11T14:34:21Z

It is still an issue with Xcode 13.3.1 (iOS 15.4 simulator). In my case the reference image is created on an M1 Mac, and GitHub Actions generates slightly different images.

ejensen · 2022-09-12T14:22:25Z

#628 should resolve this issue with a new perceptual difference calculation. The tiny differences in rendering are generally under the 2% DeltaE value, which is nearly imperceptible to the human eye. Using a perceptualPrecision value of >=98% will prevent imperceptible differences from failing assertions while noticeable differences are still caught.
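For readers unfamiliar with DeltaE, the following is a minimal sketch of a perceptual comparison between two 8-bit sRGB pixels. It makes several assumptions that are not taken from #628: it uses the simple CIE76 formula (Euclidean distance in Lab), it maps `perceptualPrecision` to a threshold of `(1 - perceptualPrecision) * 100`, and the names `Lab`, `lab(r:g:b:)`, `deltaE76`, and `perceptuallyEqual` are hypothetical. The library's actual calculation may use a different DeltaE variant.

```swift
import Foundation

/// A color in CIE L*a*b* space.
struct Lab { var l: Double, a: Double, b: Double }

/// Converts an 8-bit sRGB color to Lab, using the D65 reference white.
func lab(r: UInt8, g: UInt8, b: UInt8) -> Lab {
    // Undo the sRGB transfer curve to get linear light in 0...1.
    func linear(_ c: UInt8) -> Double {
        let v = Double(c) / 255
        return v <= 0.04045 ? v / 12.92 : pow((v + 0.055) / 1.055, 2.4)
    }
    let (lr, lg, lb) = (linear(r), linear(g), linear(b))

    // Linear sRGB to XYZ, each component normalized by the white point.
    let x = (0.4124564 * lr + 0.3575761 * lg + 0.1804375 * lb) / 0.95047
    let y = (0.2126729 * lr + 0.7151522 * lg + 0.0721750 * lb) / 1.0
    let z = (0.0193339 * lr + 0.1191920 * lg + 0.9503041 * lb) / 1.08883

    // Standard CIE companding function.
    func f(_ t: Double) -> Double {
        t > 216.0 / 24389.0 ? cbrt(t) : (24389.0 / 27.0 * t + 16) / 116
    }
    return Lab(l: 116 * f(y) - 16, a: 500 * (f(x) - f(y)), b: 200 * (f(y) - f(z)))
}

/// CIE76 color difference: Euclidean distance in Lab space.
func deltaE76(_ p: Lab, _ q: Lab) -> Double {
    sqrt(pow(p.l - q.l, 2) + pow(p.a - q.a, 2) + pow(p.b - q.b, 2))
}

/// Assumed mapping: a perceptualPrecision of 0.98 tolerates a DeltaE of up to 2.
func perceptuallyEqual(_ p: Lab, _ q: Lab, perceptualPrecision: Double) -> Bool {
    deltaE76(p, q) <= (1 - perceptualPrecision) * 100
}
```

Under these assumptions, a ±1 change in one or two channels, such as (200, 100, 50) versus (201, 99, 50), stays below a DeltaE of 2, while a 20-level shift in the red channel does not.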

KingOfBrian · 2022-09-26T19:46:23Z

We have this same issue where one of our 6 CI machines fails 1-6 snapshots of our 500ish snapshots. We just changed everything to pass in perceptualPrecision: 0.98 and we're still seeing the issue.

Here's the difference that shows up in Xcode:

The wild thing is that all these machines are set up via script, and I don't understand how they could be different. The color profile angle sounded nice, but we don't have any ~/Library/ColorSync directory. Looking at the color profiles via ImageMagick shows 8-bit sRGB for everything, so it seems like it's the same.

ejensen · 2022-10-22T01:19:42Z

> We have this same issue where one of our 6 CI machines fails 1-6 snapshots of our 500ish snapshots. We just changed everything to pass in perceptualPrecision: 0.98 and we're still seeing the issue.
>
> The wild thing is that all these machines are set up via script, and I don't understand how they could be different. The color profile angle sounded nice, but we don't have any ~/Library/ColorSync directory. Looking at the color profiles via ImageMagick shows 8-bit sRGB for everything, so it seems like it's the same.

I found some cases where certain machines, particularly virtualized environments, take the snapshot images in a different color space than when the same test is run on a different machine with the exact same OS/simulator.
I put together #665, which attempts to normalize the color space of both the reference and the new snapshot image before comparison.

This adds variants of the `Snapshotting` extensions that allow for imprecise comparisons, i.e. using the `precision` and `perceptualPrecision` parameters.

## Why is this necessary?

Adding precision parameters has been a highly requested feature (see #63) to work around some simulator changes introduced in iOS 13. Historically the simulator has supported CPU-based rendering, giving us very stable image representations of views that we can compare pixel-by-pixel. Unfortunately, with iOS 13, Apple changed the simulator to use exclusively GPU-based rendering, which means that the resulting snapshots may differ slightly across machines (see pointfreeco/swift-snapshot-testing#313).

The negative effects of this were mitigated in SnapshotTesting by adding two precision controls to snapshot comparisons: a **perceptual precision** that controls how close in color two pixels need to be to count as unchanged (using the Lab ΔE distance between colors) and an overall **precision** that controls what portion of pixels between two images need to be the same (based on the per-pixel calculation) for the images to be considered unchanged. Setting these precisions to non-one values enables engineers to record tests on one machine and run them on another (e.g. record new reference images on their laptop and then run tests on CI) without worrying about the tests failing due to differences in GPU rendering.

This is great in theory, but from our testing we've found even the lowest tolerances (near-one precision values) to consistently handle GPU differences between machine types let through a significant number of visual regressions. In other words, there is no magic set of precision values that avoids false negatives based on GPU rendering and also avoids false positives based on minor visual regressions. This is especially true for accessibility snapshots.

To start, tolerances seem to be more reliable when applied to relatively small snapshot images, but accessibility snapshots tend to be fairly large since they include both the view and the legend. Additionally, the text in the legend can change meaningfully and reflect only a small number of pixel changes. For example, I ran a test of full screen snapshot on an iPhone 12 Pro with two columns of legend. Even a precision of `0.9999` (99.99%) was enough to let through a regression where one of the elements lost its `.link` trait (represented by the text "Link." appended to the element's description in the snapshot). But this high a precision _wasn't_ enough to handle the GPU rendering differences between a MacBook Pro and a Mac Mini. This is a simplified example since it only uses `precision`, not `perceptualPrecision`, but we've found many similar situations arise even with the combination.

Some teams have developed infrastructure to allow snapshots to run on the same hardware consistently and have built a developer process around that infrastructure, but many others have accepted lowering precision as a necessity today.

## Why create separate "imprecise" variants?

The simplest approach to adding tolerances would be adding the `precision` and `perceptualPrecision` parameters to the existing snapshot methods, however I feel adding separate methods with an "imprecise" prefix is better in the long run. The naming is motivated by the idea that **it needs to be very obvious when what you're doing might result in unexpected/undesirable behavior**. In other words, when using one of the core snapshot variants, you should have extremely high confidence that a test passing means there's no regressions. When you use an "imprecise" variant, it's up to you to set your confidence levels according to your chosen precision values. This is similar to the "unsafe" terminology around memory in the Swift API. You should generally feel very confident in the memory safety of your code, but any time you see "unsafe" it's a sign to be extra careful and not gather unwarranted confidence from the compiler.

Longer term, I'm hopeful we can find alternative comparison algorithms that allow for GPU rendering differences without opening the door to regressions. We can integrate these into the core snapshot variants as long as they do not introduce opportunities for regressions, or add additional comparison variants to iterate on different approaches.
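A quick back-of-the-envelope calculation shows why a `precision` of `0.9999` can still hide a small text change on a large snapshot. The sketch below assumes the pixel budget is simply the pixel count times `1 - precision`, rounded down; the helper name `allowedDifferingPixels` is hypothetical, and the bare iPhone 12 Pro screen size is used as a lower bound, since an accessibility snapshot with a legend would be larger.

```swift
/// Number of pixels that may differ before a comparison with the given
/// overall `precision` fails, for an image of `width` x `height` pixels.
/// Assumes the budget is pixelCount * (1 - precision), rounded down.
func allowedDifferingPixels(width: Int, height: Int, precision: Double) -> Int {
    Int((Double(width * height) * (1 - precision)).rounded(.down))
}

// iPhone 12 Pro screen: 1170 x 2532 pixels (390 x 844 points at 3x).
let budget = allowedDifferingPixels(width: 1170, height: 2532, precision: 0.9999)
print(budget) // 296
```

Close to 300 pixels is plausibly enough to cover a short word such as "Link." in a legend, which fits the regression described above slipping through.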

gobetti mentioned this issue May 16, 2020

Fix macOS Catalina issue #288

Closed

NickEntin mentioned this issue Nov 10, 2020

Improve support for snapshotting views with text field cursors to prevent tests from flaking cashapp/AccessibilitySnapshot#15

Closed

lukeredpath mentioned this issue Dec 17, 2020

Failing tests on CircleCI (color issues?) #419

Open

gobetti mentioned this issue Feb 4, 2021

Snapshots on Apple Silicon devices #424

Closed

tillhainbach mentioned this issue Nov 6, 2021

feat(macOS): deterministically snapshot NSViews on any device #533

Open

pimms mentioned this issue Feb 9, 2022

Allow for subpixel deviations + improved unoptimized loop execution in image comparison #571

Closed

bstien mentioned this issue Feb 11, 2022

Fix: Support for snapshot tests on Intel and M1 finn-no/FinniversKit#1094

Merged

lukepistrol mentioned this issue Apr 19, 2022

CodeEditUI: Added snapshot tests & Keep CircleCI? CodeEditApp/CodeEdit#482

Merged

Deco354 mentioned this issue May 12, 2022

M1 Snapshot inconsistency fix tumblr/swift-snapshot-testing#1

Merged

paulz mentioned this issue May 22, 2022

ignore negligible color difference paulz/SwiftUI-snapshot-testing#3

Closed

ejensen mentioned this issue Sep 12, 2022

Perceptual image precision + 90% speed improvement #628

Merged

stephencelis closed this as completed in #628 Sep 21, 2022

ejensen mentioned this issue Oct 22, 2022

Normalize image color spaces before comparison #665

Open

NickEntin mentioned this issue Aug 16, 2023

Add imprecise comparison variants to SnapshotTesting extension cashapp/AccessibilitySnapshot#143

Draft
