Don't use fast math #16

Open
wants to merge 1 commit into
base: unity

blaztinn

Without fast math we get the same output for the same input on all the
platforms and architectures.

Otherwise the shasum of output was different on different machines
that compressed the same input.

@illwieckz

illwieckz commented Jun 26, 2024

If I'm right, some “fast math” options not only disable some checks but also make it possible to use dedicated hardware that may even compute more precisely. That is not necessarily wrong, but it may produce images with slightly different colors and therefore a different checksum.

-ffast-math is more about selectively bypassing IEEE/ISO standard restrictions than it is to not care about precision.
-- https://discourse.llvm.org/t/rfc-deprecate-ofast/78687/26

IEEE talks about digital precision. For example, mul + add may not have the same binary answer as mla, so IEEE assumes precision is lost. But it’s often gained.
-- https://discourse.llvm.org/t/rfc-deprecate-ofast/78687/30
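To make the mul + add versus fused multiply-add point concrete, here is a minimal C++ sketch. The function names and the demo inputs are made up for illustration, and `volatile` is only there to stop the compiler from fusing the two operations by itself. It assumes the default round-to-nearest mode and no fast math flags.

```cpp
#include <cassert>
#include <cmath>

// Computes a*b + c with two roundings: the product is rounded to double
// before the addition (the volatile store prevents fusing).
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Computes a*b + c with a single rounding, like a hardware FMA instruction.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

struct FmaDemo {
    double separate;  // result of mul_then_add
    double fused;     // result of fused_mul_add
};

// Hypothetical inputs: a*a is exactly 1 + 2^-29 + 2^-60, and the 2^-60
// term is lost when the product is rounded to double on its own.
FmaDemo fma_demo() {
    const double a = 1.0 + std::ldexp(1.0, -30);
    const double c = -(1.0 + std::ldexp(1.0, -29));
    return { mul_then_add(a, a, c), fused_mul_add(a, a, c) };
}

// Both answers are legitimate, but they are not the same bits.
void check_fma_differs() {
    const FmaDemo d = fma_demo();
    assert(d.separate == 0.0);
    assert(d.fused == std::ldexp(1.0, -60));
}
```

In this case the fused version keeps the small term that the separate version rounds away, which matches the "precision is often gained" remark, yet a checksum over the two results would differ.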

So a differing checksum may not be the symptom of a bug. It is probably expected that fast math breaks reproducibility, because the math functions, or even the hardware, then no longer have to conform to the IEEE floating-point standard, and things like the level of precision may differ across software and hardware implementations.
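Another way this shows up, sketched here in C++ as an illustration: floating-point addition is not associative, and fast math lets the compiler reorder sums, so two builds may pick different orders. The helper names and inputs are made up, and the `volatile` temporaries only force each partial sum to be rounded to `float`.

```cpp
#include <cassert>

// Sums as (a + b) + c.
float sum_left(float a, float b, float c) {
    volatile float partial = a + b;
    return partial + c;
}

// Sums as a + (b + c).
float sum_right(float a, float b, float c) {
    volatile float partial = b + c;
    return a + partial;
}

// With these inputs, b + c rounds back to -1e8f because floats near 1e8
// are spaced 8 apart, so the 1.0f is lost in one order and kept in the other.
void check_order_matters() {
    assert(sum_left(1e8f, -1e8f, 1.0f) == 1.0f);
    assert(sum_right(1e8f, -1e8f, 1.0f) == 0.0f);
}
```

Without fast math the compiler has to keep the order written in the source, which is one reason the output becomes stable across machines.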

For example, in the Dæmon game engine we had to update some of our tests when we added an option to disable SSE, because the x87 code path then produced a slightly different result. It was not wrong; only the precision differed. SSE actually had higher precision than x87 there, so the result was 0.4261826 with SSE but 0.426183 with x87.

Since the tool is meant to produce distributable files, it looks like a good idea to have build options that guarantee the reproducibility of the result.

If someone implements a game engine that embeds libcrn to automatically convert PNG and JPG images to DDS/CRN and to store the generated DDS/CRN in a cache, it's probably fine to not care about reproducibility.

But someone implementing a toolchain like Urcheon to produce a distributable game with pre-computed DDS/CRN may want a knob to enable reproducibility, even at the expense of spending more time producing the released game.

I think I'll add a CMake option to Dæmon's crunch as a knob to favor reproducibility (and thus disable fast math). This option will likely be enabled by default (fast math disabled by default).
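On the C++ side, such a knob could be backed by a compile-time guard, roughly like the sketch below. `DEMO_REPRODUCIBLE_BUILD` and `DEMO_FAST_MATH` are hypothetical names, not something crunch or Dæmon defines today. The sketch relies on GCC and Clang defining `__FAST_MATH__` under `-ffast-math` and MSVC defining `_M_FP_FAST` under `/fp:fast`.

```cpp
#include <cassert>
#include <cstring>

// Detect fast math from the compiler's predefined macros.
#if defined(__FAST_MATH__) || defined(_M_FP_FAST)
#  define DEMO_FAST_MATH 1
#else
#  define DEMO_FAST_MATH 0
#endif

// Hypothetical macro the build system would define when the
// reproducibility knob is on; refuse to build if fast math slipped in.
#if defined(DEMO_REPRODUCIBLE_BUILD) && DEMO_FAST_MATH
#  error "DEMO_REPRODUCIBLE_BUILD requires fast math to be disabled"
#endif

constexpr bool fast_math_enabled() { return DEMO_FAST_MATH != 0; }

// Short label a tool could print next to its version string.
const char* math_mode_name() {
    return fast_math_enabled() ? "fast" : "strict";
}

// The label always agrees with the detected mode.
void check_mode_label() {
    assert(std::strcmp(math_mode_name(),
                       fast_math_enabled() ? "fast" : "strict") == 0);
}
```

The point of the `#error` is that a reproducible build fails loudly if a stray flag re-enables fast math, instead of silently producing files with a different checksum.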

@blaztinn
Author

@illwieckz Yes, that is also the conclusion I came to (with regard to fast math optimizations).

We use this lib to produce artifacts at build time, and we cache them by checksum on a server. For this use case, disabling fast math is the appropriate setting.
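A rough C++ sketch of why this matters for a checksum-keyed cache: a single differing byte in the output gives a different key, so the same input built on another machine would miss the cache. `ArtifactCache` is a made-up stand-in, the key is assumed to be computed from the artifact bytes, and FNV-1a replaces the real shasum only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// 64-bit FNV-1a, used here as a simple stand-in for a real checksum.
std::uint64_t fnv1a64(const std::vector<unsigned char>& bytes) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (unsigned char b : bytes) {
        hash ^= b;
        hash *= 0x100000001b3ULL;                // FNV prime
    }
    return hash;
}

// Hypothetical cache that only remembers which checksums it has seen.
class ArtifactCache {
public:
    // Returns true on a cache hit; otherwise records the artifact.
    bool lookup_or_store(const std::vector<unsigned char>& artifact) {
        return !seen_.insert(fnv1a64(artifact)).second;
    }

private:
    std::set<std::uint64_t> seen_;
};

// Two outputs that differ in one low bit are two different cache entries.
void check_cache_miss_on_one_bit() {
    const std::vector<unsigned char> machine_a = {0x10, 0x20, 0x30, 0x40};
    const std::vector<unsigned char> machine_b = {0x10, 0x20, 0x31, 0x40};
    ArtifactCache cache;
    assert(!cache.lookup_or_store(machine_a));  // first build: miss
    assert(cache.lookup_or_store(machine_a));   // same bytes: hit
    assert(!cache.lookup_or_store(machine_b));  // one bit off: miss
}
```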

But I see how turning fast math on can be beneficial when the lib is used in an app/game at runtime. I like your approach of using a build flag for it, so the user of the lib can decide what to use.
