[APFloat] Correct semantics of minimum/maximum for signaling NaN arguments #109976

asb · 2024-09-25T13:15:18Z

The minimum and maximum operations were introduced in
https://reviews.llvm.org/D52764 alongside the intrinsics. The question
of NaN propagation was discussed at the time, but the resulting
semantics don't seem to match what was ultimately agreed in IEEE754-2019
or the description we now have in the LangRef at
https://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation.

Essentially, the APFloat implementation doesn't quiet a signaling NaN
input when I believe it should in order to match the LangRef and IEEE
spec.

Per the langref and IEEE-754 2019 spec, a signaling input should produce a quiet nan output. This test just captures the current behaviour, prior to a fix.

…ments The minimum and maximum operations were introduced in https://reviews.llvm.org/D52764 alongside the intrinsics. The question of NaN propagation was discussed at the time, but the resulting semantics don't seem to match what was ultimately agreed in IEEE754-2019 or the description we now have in the LangRef at <https://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>. Essentially, the APFloat implementation doesn't quiet a signaling NaN input when I believe it should in order to match the LangRef and IEEE spec.

llvmbot · 2024-09-25T13:15:55Z

@llvm/pr-subscribers-llvm-adt

Author: Alex Bradbury (asb)

Changes

The minimum and maximum operations were introduced in
https://reviews.llvm.org/D52764 alongside the intrinsics. The question
of NaN propagation was discussed at the time, but the resulting
semantics don't seem to match what was ultimately agreed in IEEE754-2019
or the description we now have in the LangRef at
<https://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>.

Essentially, the APFloat implementation doesn't quiet a signaling NaN
input when I believe it should in order to match the LangRef and IEEE
spec.

Full diff: https://github.com/llvm/llvm-project/pull/109976.diff

2 Files Affected:

(modified) llvm/include/llvm/ADT/APFloat.h (+8-6)
(modified) llvm/unittests/ADT/APFloatTest.cpp (+6)

diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 9cc8369a0bf52b..acb3b2e2103009 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -1483,13 +1483,14 @@ inline APFloat maxnum(const APFloat &A, const APFloat &B) {
 }
 
 /// Implements IEEE 754-2019 minimum semantics. Returns the smaller of 2
-/// arguments, propagating NaNs and treating -0 as less than +0.
+/// arguments, returning a quiet NaN if an argument is a NaN and treating -0
+/// as less than +0.
 LLVM_READONLY
 inline APFloat minimum(const APFloat &A, const APFloat &B) {
   if (A.isNaN())
-    return A;
+    return A.makeQuiet();
   if (B.isNaN())
-    return B;
+    return B.makeQuiet();
   if (A.isZero() && B.isZero() && (A.isNegative() != B.isNegative()))
     return A.isNegative() ? A : B;
   return B < A ? B : A;
@@ -1509,13 +1510,14 @@ inline APFloat minimumnum(const APFloat &A, const APFloat &B) {
 }
 
 /// Implements IEEE 754-2019 maximum semantics. Returns the larger of 2
-/// arguments, propagating NaNs and treating -0 as less than +0.
+/// arguments, returning a quiet NaN if an argument is a NaN and treating -0
+/// as less than +0.
 LLVM_READONLY
 inline APFloat maximum(const APFloat &A, const APFloat &B) {
   if (A.isNaN())
-    return A;
+    return A.makeQuiet();
   if (B.isNaN())
-    return B;
+    return B.makeQuiet();
   if (A.isZero() && B.isZero() && (A.isNegative() != B.isNegative()))
     return A.isNegative() ? B : A;
   return A < B ? B : A;
diff --git a/llvm/unittests/ADT/APFloatTest.cpp b/llvm/unittests/ADT/APFloatTest.cpp
index 6c49d78e5c8ea9..29149ffadfb6a7 100644
--- a/llvm/unittests/ADT/APFloatTest.cpp
+++ b/llvm/unittests/ADT/APFloatTest.cpp
@@ -607,6 +607,7 @@ TEST(APFloatTest, Minimum) {
   APFloat zp(0.0);
   APFloat zn(-0.0);
   APFloat nan = APFloat::getNaN(APFloat::IEEEdouble());
+  APFloat snan = APFloat::getSNaN(APFloat::IEEEdouble());
 
   EXPECT_EQ(1.0, minimum(f1, f2).convertToDouble());
   EXPECT_EQ(1.0, minimum(f2, f1).convertToDouble());
@@ -614,6 +615,8 @@ TEST(APFloatTest, Minimum) {
   EXPECT_EQ(-0.0, minimum(zn, zp).convertToDouble());
   EXPECT_TRUE(std::isnan(minimum(f1, nan).convertToDouble()));
   EXPECT_TRUE(std::isnan(minimum(nan, f1).convertToDouble()));
+  EXPECT_TRUE(maximum(snan, f1).isNaN());
+  EXPECT_FALSE(maximum(snan, f1).isSignaling());
 }
 
 TEST(APFloatTest, Maximum) {
@@ -622,6 +625,7 @@ TEST(APFloatTest, Maximum) {
   APFloat zp(0.0);
   APFloat zn(-0.0);
   APFloat nan = APFloat::getNaN(APFloat::IEEEdouble());
+  APFloat snan = APFloat::getSNaN(APFloat::IEEEdouble());
 
   EXPECT_EQ(2.0, maximum(f1, f2).convertToDouble());
   EXPECT_EQ(2.0, maximum(f2, f1).convertToDouble());
@@ -629,6 +633,8 @@ TEST(APFloatTest, Maximum) {
   EXPECT_EQ(0.0, maximum(zn, zp).convertToDouble());
   EXPECT_TRUE(std::isnan(maximum(f1, nan).convertToDouble()));
   EXPECT_TRUE(std::isnan(maximum(nan, f1).convertToDouble()));
+  EXPECT_TRUE(maximum(snan, f1).isNaN());
+  EXPECT_FALSE(maximum(snan, f1).isSignaling());
 }
 
 TEST(APFloatTest, MinimumNumber) {

arsenm · 2024-09-25T13:17:21Z

llvm/unittests/ADT/APFloatTest.cpp

  EXPECT_TRUE(std::isnan(minimum(f1, nan).convertToDouble()));
  EXPECT_TRUE(std::isnan(minimum(nan, f1).convertToDouble()));
+  EXPECT_TRUE(maximum(snan, f1).isNaN());
+  EXPECT_FALSE(maximum(snan, f1).isSignaling());


Maybe could use a test that the payload and signbit are preserved, but it's not important

I'm not sure to what degree I should try to closely match the current APFloat implementation vs what the IEEE spec requires. To my cursory reading it looks like it only specifies "a" quiet NaN - but perhaps there's more detail on preserving payload and signbit for NaN operands elsewhere?

It is not required, but suggested you should try to preserve the payload of one of the input operands for nan producing operations

arsenm · 2024-09-25T13:17:47Z

llvm/unittests/ADT/APFloatTest.cpp

  EXPECT_TRUE(std::isnan(minimum(nan, f1).convertToDouble()));
+  EXPECT_TRUE(maximum(snan, f1).isNaN());
+  EXPECT_FALSE(maximum(snan, f1).isSignaling());
 }


Also test with snan in both positions

llvm-ci · 2024-10-01T12:45:14Z

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building llvm at step 10 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/6426

Here is the relevant piece of the build log for the reference

Step 10 (Add check check-offload) failure: 1200 seconds without output running [b'ninja', b'-j 32', b'check-offload'], attempting to kill
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: sanitizer/ptr_outside_alloc_2.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/sanitizer/ptr_outside_alloc_2.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/ptr_outside_alloc_2.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/sanitizer/ptr_outside_alloc_2.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/ptr_outside_alloc_2.c.tmp -Xoffload-linker -lc -Xoffload-linker -lm /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# RUN: at line 3
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_ALLOCATION_TRACES=1 /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/ptr_outside_alloc_2.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/sanitizer/ptr_outside_alloc_2.c --check-prefixes=CHECK
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_ALLOCATION_TRACES=1 /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/ptr_outside_alloc_2.c.tmp
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/sanitizer/ptr_outside_alloc_2.c --check-prefixes=CHECK
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/sanitizer/ptr_outside_alloc_2.c:21:11: error: CHECK: expected string not found in input
# | // CHECK: OFFLOAD ERROR: Memory access fault by GPU {{.*}} (agent 0x{{.*}}) at virtual address [[PTR:0x[0-9a-z]*]]. Reasons: {{.*}}
# |           ^
# | <stdin>:1:1: note: scanning from here
# | AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events.
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/sanitizer/ptr_outside_alloc_2.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |           1: AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events. 
# | check:21     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |           2: AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events. 
# | check:21     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           3: "PluginInterface" error: Failure to allocate device memory: Failed to allocate from memory manager 
# | check:21     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           4: omptarget error: Call to getTargetPointer returned null pointer (device failure or illegal mapping). 
# | check:21     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           5: omptarget error: Call to targetDataBegin failed, abort target. 
# | check:21     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           6: omptarget error: Failed to process data before launching the kernel. 
# | check:21     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           .
# |           .
# |           .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

********************

…ments (llvm#109976) The minimum and maximum operations were introduced in https://reviews.llvm.org/D52764 alongside the intrinsics. The question of NaN propagation was discussed at the time, but the resulting semantics don't seem to match what was ultimately agreed in IEEE754-2019 or the description we now have in the LangRef at <https://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>. Essentially, the APFloat implementation doesn't quiet a signaling NaN input when it should in order to match the LangRef and IEEE spec.

asb added 2 commits September 25, 2024 13:53

[test][ADT] Pre-commit test for fix in APFloat minimum/maximum behaviour

2db60be

Per the langref and IEEE-754 2019 spec, a signaling input should produce a quiet nan output. This test just captures the current behaviour, prior to a fix.

asb requested review from arsenm, wzssyqa, preames, tlively, topperc and andykaylor September 25, 2024 13:15

llvmbot added the llvm:adt label Sep 25, 2024

arsenm added the floating-point Floating-point math label Sep 25, 2024

arsenm approved these changes Sep 25, 2024

View reviewed changes

arsenm reviewed Sep 25, 2024

View reviewed changes

Test snan in other position as well

6bd8a30

arsenm approved these changes Oct 1, 2024

View reviewed changes

asb merged commit a889018 into llvm:main Oct 1, 2024
5 of 8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[APFloat] Correct semantics of minimum/maximum for signaling NaN arguments #109976

[APFloat] Correct semantics of minimum/maximum for signaling NaN arguments #109976

Uh oh!

asb commented Sep 25, 2024

Uh oh!

llvmbot commented Sep 25, 2024

Uh oh!

arsenm Sep 25, 2024

Uh oh!

asb Sep 25, 2024

Uh oh!

arsenm Sep 25, 2024

Uh oh!

arsenm Sep 25, 2024

Uh oh!

Uh oh!

llvm-ci commented Oct 1, 2024

Uh oh!

Uh oh!

[APFloat] Correct semantics of minimum/maximum for signaling NaN arguments #109976

[APFloat] Correct semantics of minimum/maximum for signaling NaN arguments #109976

Uh oh!

Conversation

asb commented Sep 25, 2024

Uh oh!

llvmbot commented Sep 25, 2024

Uh oh!

arsenm Sep 25, 2024

Choose a reason for hiding this comment

Uh oh!

asb Sep 25, 2024

Choose a reason for hiding this comment

Uh oh!

arsenm Sep 25, 2024

Choose a reason for hiding this comment

Uh oh!

arsenm Sep 25, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

llvm-ci commented Oct 1, 2024

Uh oh!

Uh oh!