Don't use fast math #16

blaztinn · 2021-10-25T10:14:18Z

Without fast math we get the same output for the same input on all the
platforms and architectures.

Otherwise the shasum of output was different on different machines
that compressed the same input.

illwieckz · 2024-06-26T10:47:44Z

If I'm right, some “fast math” options may not only disable some checks but also allow the compiler to use dedicated hardware instructions that may even compute more precisely. That may not be wrong, but it may produce images with slightly different colors and therefore a different checksum.

-ffast-math is more about selectively bypassing IEEE/ISO standard restrictions than it is to not care about precision.
-- https://discourse.llvm.org/t/rfc-deprecate-ofast/78687/26

IEEE talks about digital precision. For example, mul + add may not have the same binary answer as mla, so IEEE assumes precision is lost. But it’s often gained.
-- https://discourse.llvm.org/t/rfc-deprecate-ofast/78687/30
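The mul + add versus mla point quoted above can be sketched in Python. This is only an illustration, not code from crunch: the function names are hypothetical, and the fused operation is emulated with exact rational arithmetic and a single final rounding, which is an assumption about how a fused multiply-add behaves rather than a call to real hardware.

```python
from fractions import Fraction


def mul_add_two_roundings(a, b, c):
    # The product is rounded to a double, then the sum is rounded again.
    return a * b + c


def mul_add_one_rounding(a, b, c):
    # Emulated fused multiply-add: exact arithmetic, rounded once by float().
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product 1 - 2**-54 is not representable.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

separate = mul_add_two_roundings(a, b, c)
fused = mul_add_one_rounding(a, b, c)

print("mul then add:", separate)
print("fused       :", fused)
print("bit-identical:", separate == fused)
```

With these inputs the separate multiply and add lose the small term entirely, while the single-rounding version keeps it. Both results are defensible, but they are not bit-identical, so any checksum over the output bytes would differ.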

So a differing checksum may not be the symptom of a bug. It is probably expected that fast math breaks reproducibility, because the math functions, or even the hardware, then no longer have to conform to the IEEE floating-point standard, and things like the level of precision may differ across software and hardware implementations.

For example, in the Dæmon game engine we had to update some of our tests when we added an option to disable SSE, because the x87 code path produced a slightly different result. It was not wrong; only the precision differed. In that case SSE had higher precision than x87, so the result was 0.4261826 with SSE but 0.426183 with x87:

Fix trace tests with x87 floating point DaemonEngine/Daemon#1153
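As a rough illustration of how intermediate precision alone can change a result, here is a small Python sketch. It does not model x87 or SSE, and it is not taken from the Dæmon tests; the helper names and input values are made up. It only compares rounding to 32-bit floats after every addition with accumulating in 64-bit and rounding once at the end.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_narrow(values):
    # Round to binary32 after every addition.
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def sum_wide(values):
    # Accumulate in binary64, round to binary32 once at the end.
    total = 0.0
    for v in values:
        total += v
    return to_f32(total)


# Hypothetical inputs: two terms that are each half a binary32 ulp at 1.0.
values = [1.0, 2.0 ** -24, 2.0 ** -24]
narrow = sum_narrow(values)
wide = sum_wide(values)

print("rounded at every step:", repr(narrow))
print("rounded once at end  :", repr(wide))
```

Neither path is buggy; they simply round at different points, which is enough to make a test that compares exact values fail.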

Since the tool is meant to produce distributable files, it seems a good idea to have build options that guarantee reproducible output.

If someone implements a game engine that embeds libcrn to automatically convert PNG and JPG images to DDS/CRN and to store the generated DDS/CRN in a cache, it's probably fine to not care about reproducibility.

But someone implementing a toolchain like Urcheon, which produces a distributable game with precomputed DDS/CRN, may want a knob to enable reproducibility, even at the expense of spending more time producing the released game.

I think I'll add a CMake option to Dæmon's crunch as a knob to favor reproducibility (and thus disable fast math). This option will likely be enabled by default (fast math disabled by default).

blaztinn · 2024-06-26T11:07:59Z

@illwieckz Yes, that is also the conclusion I came to (with regard to fast math optimizations).

We use this lib to produce artifacts at build time, and we cache them by checksum on a server. For this use case, disabling fast math is the appropriate setting.
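To illustrate why a single differing bit matters for a checksum-keyed cache, here is a minimal Python sketch. The cache layout, the `checksum` helper and the values are hypothetical and unrelated to the actual server setup; the reordered sum stands in for the kind of reassociation that fast math permits.

```python
import hashlib
import struct


def checksum(values):
    # Serialize the doubles as little-endian bytes and hash them.
    payload = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(payload).hexdigest()


# The same three numbers summed in two different orders.
strict = (0.1 + 0.2) + 0.3
reordered = 0.1 + (0.2 + 0.3)

# Hypothetical cache keyed by the checksum of the produced artifact.
cache = {checksum([strict]): "artifact built on machine A"}

hit = checksum([reordered]) in cache
print("values equal:", strict == reordered)
print("cache hit   :", hit)
```

The two sums differ only in the last bit, yet the hashes are unrelated, so the second build misses the cache.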

But I see how it can be beneficial to turn fast math on when the lib is used in an app or game at runtime. I like your approach of using a build flag for it, so the user of the lib can decide what to use.

Don't use fast math

cb500d5

illwieckz mentioned this pull request Jul 28, 2022

Make possible to not use fast math DaemonEngine/crunch#29

Closed
