Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use emulated fma on Cygwin64 #2177

Merged
merged 1 commit into from Mar 1, 2019

Conversation

Projects
None yet
5 participants
@dra27
Copy link
Contributor

commented Nov 30, 2018

Cygwin is based on newlib, not glibc, and the fma function is not implemented (well, it is just return x * y + z;). Split off HAS_WORKING_FMA from HAS_C99_FLOAT_OPS. Disable fma for Cygwin64 so that the fma test passes.

Cygwin32 has the same limitation, but because the intermediate result is placed in an 80 bit fp reg (unless -ffloat-store is given), then the fma tests pass, which is why this went unnoticed in #1354.

I have two questions:

  • Should the fma test include a case which fails with an 80 bit intermediate? This would force a failure on Cygwin32?
  • Should Cygwin32 also be using the emulation for FMA, regardless of the answer to the previous?

@dra27 dra27 added this to the 4.08 milestone Nov 30, 2018

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Nov 30, 2018

This has been tested along with #2160 in precheck#163 giving the first blue board in quite a while...

@shindere

This comment has been minimized.

Copy link
Contributor

commented Dec 6, 2018

@thvnx
Copy link
Contributor

left a comment

It could be a great idea to use the emulation provided here when the one provided by the underlying libc is known to be inaccurate.

@dra27, yes Cygwin32 should use the emulated FMA if the hardware instruction or IEEE compliant FMA emulation is not provided by the libc. IIUC, your patch cover this case (is HAS_WORKING_FMA not defined for both cygwin32 and cygwin64)?

Show resolved Hide resolved configure Outdated
@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Dec 6, 2018

@thvnx - thanks for looking at this. In this patch, FMA emulation is enabled for amd64 Cygwin but not for x86 Cygwin. This was based solely on the fact that the FMA tests were failing on Cygwin64 but not failing on Cygwin32.

I'm not familiar with any of the standards governing how accurate the emulation is required to be, but if emulation is to be turned on for Cygwin32, shouldn't there also be a test which fails if that emulation is not turned on?

@thvnx

This comment has been minimized.

Copy link
Contributor

commented Dec 6, 2018

Can you share the results of the tests for Cygwin64?

Just to be clear, are the tests passing for Cygwin32 with FMA emulation turned on?
I think that actual tests are enough.

@damiendoligez damiendoligez added the bug label Dec 18, 2018

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Jan 14, 2019

Sorry for the extreme lag. Cygwin32 is working with emulated FMA, yes - but it is also passing all the tests with return x * y + z;, which is a lot less code if it's technically OK 😉 More seriously, I would prefer to have a test which fails with that incorrect implementation before insisting that the emulated version should be being used.

For Cygwin64, the failing tests (without turning on emulated FMA):

258 FAIL!      fma (0x1.deadbeef2feedp+1023, 0x0.deadbeef2feedp-1022, -0x1.a05f8c01a4bfbp+1) returned 0x0p+0 instead of 0x1.0989687bc9da4p-53.
259 FAIL!      fma (0x1.deadbeef2feedp+900, 0x0.deadbeef2feedp-1022, -0x1.a05f8c01a4bfbp-122) returned 0x0p+0 instead of 0x1.0989687bc9da4p-176.
260 FAIL!      fma (0x1.fffffffffffffp+1023, 0x1.001p+0, -0x1.fffffffffffffp+1023) returned infinity instead of 0x1.fffffffffffffp+1011.
261 FAIL!      fma (-0x1.fffffffffffffp+1023, 0x1.fffffffffffffp+0, 0x1.fffffffffffffp+1023) returned -infinity instead of -0x1.ffffffffffffdp+1023.
262 FAIL!      fma (0x1.fffffffffffffp+1023, 0x1p+1, -0x1.fffffffffffffp+1023) returned infinity instead of 0x1.fffffffffffffp+1023.
264 FAIL!      fma (0x1.deadbeef2feedp-495, 0x1.deadbeef2feedp-495, -0x1.bf86a5786a574p-989) returned 0x0p+0 instead of 0x0.0000042625a1fp-1022.
265 FAIL!      fma (0x1.deadbeef2feedp-503, 0x1.deadbeef2feedp-503, -0x1.bf86a5786a574p-1005) returned 0x0p+0 instead of 0x0.0000000004262p-1022.
266 FAIL!      fma (0x1p-537, 0x1p-538, 0x0.0000000000001p-1022) returned 0x0.0000000000001p-1022 instead of 0x0.0000000000002p-1022.
278 FAIL!      fma (0x0.0000000000001p-1022, 0x1p-1, 0x0.fffffffffffffp-1022) returned 0x0.fffffffffffffp-1022 instead of 0x1p-1022.
279 FAIL!      fma (-0x0.0000000000001p-1022, 0x1p-1, -0x0.fffffffffffffp-1022) returned -0x0.fffffffffffffp-1022 instead of -0x1p-1022.
258 FAIL!      fma (0x1.deadbeef2feedp+1023, 0x0.deadbeef2feedp-1022, -0x1.a05f8c01a4bfbp+1) returned 0x0p+0 instead of 0x1.0989687bc9da4p-53.
259 FAIL!      fma (0x1.deadbeef2feedp+900, 0x0.deadbeef2feedp-1022, -0x1.a05f8c01a4bfbp-122) returned 0x0p+0 instead of 0x1.0989687bc9da4p-176.
260 FAIL!      fma (0x1.fffffffffffffp+1023, 0x1.001p+0, -0x1.fffffffffffffp+1023) returned infinity instead of 0x1.fffffffffffffp+1011.
261 FAIL!      fma (-0x1.fffffffffffffp+1023, 0x1.fffffffffffffp+0, 0x1.fffffffffffffp+1023) returned -infinity instead of -0x1.ffffffffffffdp+1023.
262 FAIL!      fma (0x1.fffffffffffffp+1023, 0x1p+1, -0x1.fffffffffffffp+1023) returned infinity instead of 0x1.fffffffffffffp+1023.
264 FAIL!      fma (0x1.deadbeef2feedp-495, 0x1.deadbeef2feedp-495, -0x1.bf86a5786a574p-989) returned 0x0p+0 instead of 0x0.0000042625a1fp-1022.
265 FAIL!      fma (0x1.deadbeef2feedp-503, 0x1.deadbeef2feedp-503, -0x1.bf86a5786a574p-1005) returned 0x0p+0 instead of 0x0.0000000004262p-1022.
266 FAIL!      fma (0x1p-537, 0x1p-538, 0x0.0000000000001p-1022) returned 0x0.0000000000001p-1022 instead of 0x0.0000000000002p-1022.
278 FAIL!      fma (0x0.0000000000001p-1022, 0x1p-1, 0x0.fffffffffffffp-1022) returned 0x0.fffffffffffffp-1022 instead of 0x1p-1022.
279 FAIL!      fma (-0x0.0000000000001p-1022, 0x1p-1, -0x0.fffffffffffffp-1022) returned -0x0.fffffffffffffp-1022 instead of -0x1p-1022.

@dra27 dra27 force-pushed the dra27:fma-cygwin64 branch from 2ecfa5d to ebe5838 Jan 15, 2019

@thvnx

This comment has been minimized.

Copy link
Contributor

commented Jan 15, 2019

This seems safe to me.

Thanks for the test results, I will consider to add them to the list of accepted ones!

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Jan 15, 2019

@thvnx - these can't be added to the list of accepted results... they're the results of simply having fma be return x * y + z;?!

@thvnx

This comment has been minimized.

Copy link
Contributor

commented Jan 15, 2019

I was thinking of improving the tests in order to accept some degraded/inaccurate results for some identified platform (by emitting a warning for example). But this will ask some work which I just consider for now.

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Jan 15, 2019

Ah, I see. Warnings in the testsuite aren't terribly useful because in practice they never really get noticed - but an adapted version of that idea which could be very useful in future might be a test for configure which identifies that the fma function is useless newlib-ish.

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Jan 15, 2019

AFAICT this is ready to merge...

Show resolved Hide resolved runtime/floats.c
@gasche

gasche approved these changes Jan 16, 2019

@shindere

This comment has been minimized.

Copy link
Contributor

commented Jan 22, 2019

@dra27 dra27 referenced this pull request Mar 1, 2019

Merged

Support FMA operation #1354

@gasche

This comment has been minimized.

Copy link
Member

commented Mar 1, 2019

Coming back to this PR from #1354 (comment) : I would be in favor of merging as soon as possible in order to unblock the CI. @dra27, could you rebase, merge and cherry-pick into 4.08? Further iterations can be done as separate PRs.

@shindere

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2019

Use emulated fma on Cygwin64
Cygwin is based on newlib, not glibc, and the fma function is not
implemented. Split off HAS_WORKING_FMA from HAS_C99_FLOAT_OPS. Disable
fma for Cygwin64 so that the fma test passes.

@dra27 dra27 force-pushed the dra27:fma-cygwin64 branch from ebe5838 to 2df7119 Mar 1, 2019

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Mar 1, 2019

@shindere - the comment above seems to refer to #2160, which is already merged. I direct pushed an update to its Changes entry, and I'll merge this one once CI repeats.

@shindere

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2019

@shindere shindere merged commit f0f7458 into ocaml:trunk Mar 1, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

shindere added a commit that referenced this pull request Mar 1, 2019

Use emulated fma on Cygwin64 (#2177)
Cygwin is based on newlib, not glibc, and the fma function is not
implemented. Split off HAS_WORKING_FMA from HAS_C99_FLOAT_OPS. Disable
fma for Cygwin64 so that the fma test passes.
@shindere

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2019

Merged into trunk and cherry-picked into 4.08 as commit
0ffbe2d, not in time for the beta2,
unfortunately :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.