Avoid maxnum(sNaN, x) optimizations / folds #170181
The behaviour of constant-folding maxnum(sNaN, x) and minnum(sNaN, x) has become controversial, and there are ongoing discussions about which behaviour we want to specify in the LLVM IR LangRef. See:

- llvm#170082
- llvm#168838
- llvm#138451
- llvm#170067
- https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006

This patch removes optimization and constant-folding support for maxnum(sNaN, x) and minnum(sNaN, x), but keeps the folds for qNaN operands. This leaves enough flexibility for the implementation to conform to either the old or the new version of the specified semantics without further changes.

As far as I am aware, optimizations involving a constant sNaN are edge cases that rarely occur, so there should hopefully be very little real-world performance impact from disabling them.
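To make the disagreement concrete, here is a hedged Python sketch (not LLVM code) of the two candidate semantics the description alludes to: the IEEE 754-2008 maxNum reading, where a signaling NaN input produces a quiet NaN, versus an "ignore all NaNs" reading in the style of 754-2019 maximumNumber. The function names are illustrative, and values are kept as raw bit patterns so the host FPU can never silently quiet a sNaN. The two readings agree when the NaN operand is quiet, and disagree exactly when it is signaling, which is why the patch only blocks the sNaN folds.

```python
import struct

# Bit-level view of IEEE-754 binary64 values, so signaling NaNs are
# never accidentally quieted by host floating-point operations.
EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000

def to_bits(f):
    return struct.unpack("<Q", struct.pack("<d", f))[0]

def to_float(b):
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def is_nan(b):
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0

def is_snan(b):
    return is_nan(b) and not (b & QUIET_BIT)

def maxnum_754_2008(a, b):
    """IEEE 754-2008 maxNum: a signaling NaN input yields a quiet NaN;
    a single quiet NaN input is ignored in favour of the other operand."""
    if is_snan(a) or is_snan(b):
        return (a if is_snan(a) else b) | QUIET_BIT
    if is_nan(a):
        return b
    if is_nan(b):
        return a
    return a if to_float(a) >= to_float(b) else b

def maxnum_ignore_all_nans(a, b):
    """Alternative reading (like 754-2019 maximumNumber): any NaN,
    signaling or quiet, is ignored in favour of the number operand."""
    if is_nan(a):
        return b if not is_nan(b) else (a | QUIET_BIT)
    if is_nan(b):
        return a
    return a if to_float(a) >= to_float(b) else b

SNAN = 0x7FF4_0000_0000_0000  # quiet bit clear -> signaling NaN
QNAN = 0x7FF8_0000_0000_0000  # quiet bit set   -> quiet NaN
ONE = to_bits(1.0)

# Both readings agree for a qNaN operand...
assert maxnum_754_2008(QNAN, ONE) == ONE
assert maxnum_ignore_all_nans(QNAN, ONE) == ONE
# ...but disagree for an sNaN operand, the case this patch stops folding.
assert is_nan(maxnum_754_2008(SNAN, ONE)) and not is_snan(maxnum_754_2008(SNAN, ONE))
assert maxnum_ignore_all_nans(SNAN, ONE) == ONE
```

A constant fold baked into the compiler must pick one of these answers; returning nullptr / SDValue() for sNaN operands instead, as the patch does, defers the choice until the LangRef discussion settles.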
@llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-backend-x86

Author: Lewis Crawford (LewisCrawford)

Changes: The behaviour of constant-folding maxnum(sNaN, x) and minnum(sNaN, x) has become controversial, and there are ongoing discussions about which behaviour to specify in the LLVM IR LangRef. See the issues and RFC linked in the description above.

This patch removes optimization and constant-folding support for maxnum(sNaN, x) and minnum(sNaN, x) but keeps the folds for qNaN operands. As far as I am aware, optimizations involving a constant sNaN are edge cases that rarely occur, so there should be very little real-world performance impact.

Patch is 24.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/170181.diff

11 Files Affected:
diff --git a/llvm/lib/Analysis/ConstantFolding.cpp b/llvm/lib/Analysis/ConstantFolding.cpp
index 916154b465af4..63d12ee585e64 100644
--- a/llvm/lib/Analysis/ConstantFolding.cpp
+++ b/llvm/lib/Analysis/ConstantFolding.cpp
@@ -3348,8 +3348,12 @@ static Constant *ConstantFoldIntrinsicCall2(Intrinsic::ID IntrinsicID, Type *Ty,
case Intrinsic::copysign:
return ConstantFP::get(Ty->getContext(), APFloat::copySign(Op1V, Op2V));
case Intrinsic::minnum:
+ if (Op1V.isSignaling() || Op2V.isSignaling())
+ return nullptr;
return ConstantFP::get(Ty->getContext(), minnum(Op1V, Op2V));
case Intrinsic::maxnum:
+ if (Op1V.isSignaling() || Op2V.isSignaling())
+ return nullptr;
return ConstantFP::get(Ty->getContext(), maxnum(Op1V, Op2V));
case Intrinsic::minimum:
return ConstantFP::get(Ty->getContext(), minimum(Op1V, Op2V));
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 59a213b47825a..bd85444d7d2b0 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -6620,7 +6620,8 @@ static MinMaxOptResult OptimizeConstMinMax(const Constant *RHSConst,
assert(OutNewConstVal != nullptr);
bool PropagateNaN = IID == Intrinsic::minimum || IID == Intrinsic::maximum;
- bool PropagateSNaN = IID == Intrinsic::minnum || IID == Intrinsic::maxnum;
+ bool ReturnsOtherForAllNaNs =
+ IID == Intrinsic::minimumnum || IID == Intrinsic::maximumnum;
bool IsMin = IID == Intrinsic::minimum || IID == Intrinsic::minnum ||
IID == Intrinsic::minimumnum;
@@ -6637,29 +6638,27 @@ static MinMaxOptResult OptimizeConstMinMax(const Constant *RHSConst,
// minnum(x, qnan) -> x
// maxnum(x, qnan) -> x
- // minnum(x, snan) -> qnan
- // maxnum(x, snan) -> qnan
// minimum(X, nan) -> qnan
// maximum(X, nan) -> qnan
// minimumnum(X, nan) -> x
// maximumnum(X, nan) -> x
if (CAPF.isNaN()) {
- if (PropagateNaN || (PropagateSNaN && CAPF.isSignaling())) {
+ if (PropagateNaN) {
*OutNewConstVal = ConstantFP::get(CFP->getType(), CAPF.makeQuiet());
return MinMaxOptResult::UseNewConstVal;
+ } else if (ReturnsOtherForAllNaNs || !CAPF.isSignaling()) {
+ return MinMaxOptResult::UseOtherVal;
}
- return MinMaxOptResult::UseOtherVal;
+ return MinMaxOptResult::CannotOptimize;
}
if (CAPF.isInfinity() || (Call && Call->hasNoInfs() && CAPF.isLargest())) {
- // minnum(X, -inf) -> -inf (ignoring sNaN -> qNaN propagation)
- // maxnum(X, +inf) -> +inf (ignoring sNaN -> qNaN propagation)
// minimum(X, -inf) -> -inf if nnan
// maximum(X, +inf) -> +inf if nnan
// minimumnum(X, -inf) -> -inf
// maximumnum(X, +inf) -> +inf
if (CAPF.isNegative() == IsMin &&
- (!PropagateNaN || (Call && Call->hasNoNaNs()))) {
+ (ReturnsOtherForAllNaNs || (Call && Call->hasNoNaNs()))) {
*OutNewConstVal = const_cast<Constant *>(RHSConst);
return MinMaxOptResult::UseNewConstVal;
}
@@ -7004,12 +7003,10 @@ Value *llvm::simplifyBinaryIntrinsic(Intrinsic::ID IID, Type *ReturnType,
case Intrinsic::minimum:
case Intrinsic::maximumnum:
case Intrinsic::minimumnum: {
- // In several cases here, we deviate from exact IEEE 754 semantics
- // to enable optimizations (as allowed by the LLVM IR spec).
- //
- // For instance, we may return one of the arguments unmodified instead of
- // inserting an llvm.canonicalize to transform input sNaNs into qNaNs,
- // or may assume all NaN inputs are qNaNs.
+ // In some cases here, we deviate from exact IEEE-754 semantics to enable
+ // optimizations (as allowed by the LLVM IR spec) by returning one of the
+ // arguments unmodified instead of inserting an llvm.canonicalize to
+ // transform input sNaNs into qNaNs,
// If the arguments are the same, this is a no-op (ignoring NaN quieting)
if (Op0 == Op1)
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index e8954a3d9899b..bc01cb65c4a69 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -768,8 +768,12 @@ llvm::ConstantFoldFPBinOp(unsigned Opcode, const Register Op1,
C1.copySign(C2);
return C1;
case TargetOpcode::G_FMINNUM:
+ if (C1.isSignaling() || C2.isSignaling())
+ return std::nullopt;
return minnum(C1, C2);
case TargetOpcode::G_FMAXNUM:
+ if (C1.isSignaling() || C2.isSignaling())
+ return std::nullopt;
return maxnum(C1, C2);
case TargetOpcode::G_FMINIMUM:
return minimum(C1, C2);
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 0f3a207cc6414..70950084ee6b7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -19505,7 +19505,8 @@ SDValue DAGCombiner::visitFMinMax(SDNode *N) {
const SDNodeFlags Flags = N->getFlags();
unsigned Opc = N->getOpcode();
bool PropAllNaNsToQNaNs = Opc == ISD::FMINIMUM || Opc == ISD::FMAXIMUM;
- bool PropOnlySNaNsToQNaNs = Opc == ISD::FMINNUM || Opc == ISD::FMAXNUM;
+ bool ReturnsOtherForAllNaNs =
+ Opc == ISD::FMINIMUMNUM || Opc == ISD::FMAXIMUMNUM;
bool IsMin =
Opc == ISD::FMINNUM || Opc == ISD::FMINIMUM || Opc == ISD::FMINIMUMNUM;
SelectionDAG::FlagInserter FlagsInserter(DAG, N);
@@ -19524,32 +19525,30 @@ SDValue DAGCombiner::visitFMinMax(SDNode *N) {
// minnum(X, qnan) -> X
// maxnum(X, qnan) -> X
- // minnum(X, snan) -> qnan
- // maxnum(X, snan) -> qnan
// minimum(X, nan) -> qnan
// maximum(X, nan) -> qnan
// minimumnum(X, nan) -> X
// maximumnum(X, nan) -> X
if (AF.isNaN()) {
- if (PropAllNaNsToQNaNs || (AF.isSignaling() && PropOnlySNaNsToQNaNs)) {
+ if (PropAllNaNsToQNaNs) {
if (AF.isSignaling())
return DAG.getConstantFP(AF.makeQuiet(), SDLoc(N), VT);
return N->getOperand(1);
+ } else if (ReturnsOtherForAllNaNs || !AF.isSignaling()) {
+ return N->getOperand(0);
}
- return N->getOperand(0);
+ return SDValue();
}
// In the following folds, inf can be replaced with the largest finite
// float, if the ninf flag is set.
if (AF.isInfinity() || (Flags.hasNoInfs() && AF.isLargest())) {
- // minnum(X, -inf) -> -inf (ignoring sNaN -> qNaN propagation)
- // maxnum(X, +inf) -> +inf (ignoring sNaN -> qNaN propagation)
// minimum(X, -inf) -> -inf if nnan
// maximum(X, +inf) -> +inf if nnan
// minimumnum(X, -inf) -> -inf
// maximumnum(X, +inf) -> +inf
if (IsMin == AF.isNegative() &&
- (!PropAllNaNsToQNaNs || Flags.hasNoNaNs()))
+ (ReturnsOtherForAllNaNs || Flags.hasNoNaNs()))
return N->getOperand(1);
// minnum(X, +inf) -> X if nnan
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index c9519ce1610b2..7c2bd74b84fe4 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -7382,8 +7382,12 @@ SDValue SelectionDAG::foldConstantFPMath(unsigned Opcode, const SDLoc &DL,
C1.copySign(C2);
return getConstantFP(C1, DL, VT);
case ISD::FMINNUM:
+ if (C1.isSignaling() || C2.isSignaling())
+ return SDValue();
return getConstantFP(minnum(C1, C2), DL, VT);
case ISD::FMAXNUM:
+ if (C1.isSignaling() || C2.isSignaling())
+ return SDValue();
return getConstantFP(maxnum(C1, C2), DL, VT);
case ISD::FMINIMUM:
return getConstantFP(minimum(C1, C2), DL, VT);
diff --git a/llvm/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll b/llvm/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
index 05d3e9c381910..e7685d53b2d10 100644
--- a/llvm/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
+++ b/llvm/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll
@@ -497,10 +497,12 @@ define amdgpu_kernel void @test_fold_canonicalize_minnum_value_f32(ptr addrspace
ret void
}
-; FIXME: Should there be more checks here? minnum with sNaN operand is simplified to qNaN.
+; FIXME: Should there be more checks here? minnum with sNaN operand might get simplified away.
; GCN-LABEL: test_fold_canonicalize_sNaN_value_f32:
-; GCN: v_mov_b32_e32 v{{.+}}, 0x7fc00000
+; GCN: {{flat|global}}_load_dword [[LOAD:v[0-9]+]]
+; VI: v_mul_f32_e32 v{{[0-9]+}}, 1.0, [[LOAD]]
+; GFX9: v_max_f32_e32 v{{[0-9]+}}, [[LOAD]], [[LOAD]]
define amdgpu_kernel void @test_fold_canonicalize_sNaN_value_f32(ptr addrspace(1) %arg) {
%id = tail call i32 @llvm.amdgcn.workitem.id.x()
%gep = getelementptr inbounds float, ptr addrspace(1) %arg, i32 %id
diff --git a/llvm/test/CodeGen/X86/fmaxnum.ll b/llvm/test/CodeGen/X86/fmaxnum.ll
index 150bef01bdbe0..6a03628d9f078 100644
--- a/llvm/test/CodeGen/X86/fmaxnum.ll
+++ b/llvm/test/CodeGen/X86/fmaxnum.ll
@@ -676,15 +676,44 @@ define float @test_maxnum_neg_inf_nnan(float %x, float %y) nounwind {
; Test SNaN quieting
define float @test_maxnum_snan(float %x) {
-; SSE-LABEL: test_maxnum_snan:
-; SSE: # %bb.0:
-; SSE-NEXT: movss {{.*#+}} xmm0 = [NaN,0.0E+0,0.0E+0,0.0E+0]
-; SSE-NEXT: retq
+; SSE2-LABEL: test_maxnum_snan:
+; SSE2: # %bb.0:
+; SSE2-NEXT: movss {{.*#+}} xmm2 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; SSE2-NEXT: movaps %xmm0, %xmm1
+; SSE2-NEXT: cmpunordss %xmm0, %xmm1
+; SSE2-NEXT: movaps %xmm1, %xmm3
+; SSE2-NEXT: andps %xmm2, %xmm3
+; SSE2-NEXT: maxss %xmm0, %xmm2
+; SSE2-NEXT: andnps %xmm2, %xmm1
+; SSE2-NEXT: orps %xmm3, %xmm1
+; SSE2-NEXT: movaps %xmm1, %xmm0
+; SSE2-NEXT: retq
;
-; AVX-LABEL: test_maxnum_snan:
-; AVX: # %bb.0:
-; AVX-NEXT: vmovss {{.*#+}} xmm0 = [NaN,0.0E+0,0.0E+0,0.0E+0]
-; AVX-NEXT: retq
+; SSE4-LABEL: test_maxnum_snan:
+; SSE4: # %bb.0:
+; SSE4-NEXT: movss {{.*#+}} xmm1 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; SSE4-NEXT: maxss %xmm0, %xmm1
+; SSE4-NEXT: cmpunordss %xmm0, %xmm0
+; SSE4-NEXT: blendvps %xmm0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1
+; SSE4-NEXT: movaps %xmm1, %xmm0
+; SSE4-NEXT: retq
+;
+; AVX1-LABEL: test_maxnum_snan:
+; AVX1: # %bb.0:
+; AVX1-NEXT: vmovss {{.*#+}} xmm1 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; AVX1-NEXT: vmaxss %xmm0, %xmm1, %xmm1
+; AVX1-NEXT: vcmpunordss %xmm0, %xmm0, %xmm0
+; AVX1-NEXT: vblendvps %xmm0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm0
+; AVX1-NEXT: retq
+;
+; AVX512-LABEL: test_maxnum_snan:
+; AVX512: # %bb.0:
+; AVX512-NEXT: vmovss {{.*#+}} xmm2 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; AVX512-NEXT: vmaxss %xmm0, %xmm2, %xmm1
+; AVX512-NEXT: vcmpunordss %xmm0, %xmm0, %k1
+; AVX512-NEXT: vmovss %xmm2, %xmm1, %xmm1 {%k1}
+; AVX512-NEXT: vmovaps %xmm1, %xmm0
+; AVX512-NEXT: retq
%r = call float @llvm.maxnum.f32(float 0x7ff4000000000000, float %x)
ret float %r
}
diff --git a/llvm/test/CodeGen/X86/fminnum.ll b/llvm/test/CodeGen/X86/fminnum.ll
index 4aa1a618be758..5c882c99d4f14 100644
--- a/llvm/test/CodeGen/X86/fminnum.ll
+++ b/llvm/test/CodeGen/X86/fminnum.ll
@@ -676,15 +676,44 @@ define float @test_minnum_inf_nnan(float %x, float %y) nounwind {
; Test SNaN quieting
define float @test_minnum_snan(float %x) {
-; SSE-LABEL: test_minnum_snan:
-; SSE: # %bb.0:
-; SSE-NEXT: movss {{.*#+}} xmm0 = [NaN,0.0E+0,0.0E+0,0.0E+0]
-; SSE-NEXT: retq
+; SSE2-LABEL: test_minnum_snan:
+; SSE2: # %bb.0:
+; SSE2-NEXT: movss {{.*#+}} xmm2 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; SSE2-NEXT: movaps %xmm0, %xmm1
+; SSE2-NEXT: cmpunordss %xmm0, %xmm1
+; SSE2-NEXT: movaps %xmm1, %xmm3
+; SSE2-NEXT: andps %xmm2, %xmm3
+; SSE2-NEXT: minss %xmm0, %xmm2
+; SSE2-NEXT: andnps %xmm2, %xmm1
+; SSE2-NEXT: orps %xmm3, %xmm1
+; SSE2-NEXT: movaps %xmm1, %xmm0
+; SSE2-NEXT: retq
;
-; AVX-LABEL: test_minnum_snan:
-; AVX: # %bb.0:
-; AVX-NEXT: vmovss {{.*#+}} xmm0 = [NaN,0.0E+0,0.0E+0,0.0E+0]
-; AVX-NEXT: retq
+; SSE4-LABEL: test_minnum_snan:
+; SSE4: # %bb.0:
+; SSE4-NEXT: movss {{.*#+}} xmm1 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; SSE4-NEXT: minss %xmm0, %xmm1
+; SSE4-NEXT: cmpunordss %xmm0, %xmm0
+; SSE4-NEXT: blendvps %xmm0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1
+; SSE4-NEXT: movaps %xmm1, %xmm0
+; SSE4-NEXT: retq
+;
+; AVX1-LABEL: test_minnum_snan:
+; AVX1: # %bb.0:
+; AVX1-NEXT: vmovss {{.*#+}} xmm1 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; AVX1-NEXT: vminss %xmm0, %xmm1, %xmm1
+; AVX1-NEXT: vcmpunordss %xmm0, %xmm0, %xmm0
+; AVX1-NEXT: vblendvps %xmm0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm0
+; AVX1-NEXT: retq
+;
+; AVX512-LABEL: test_minnum_snan:
+; AVX512: # %bb.0:
+; AVX512-NEXT: vmovss {{.*#+}} xmm2 = [NaN,0.0E+0,0.0E+0,0.0E+0]
+; AVX512-NEXT: vminss %xmm0, %xmm2, %xmm1
+; AVX512-NEXT: vcmpunordss %xmm0, %xmm0, %k1
+; AVX512-NEXT: vmovss %xmm2, %xmm1, %xmm1 {%k1}
+; AVX512-NEXT: vmovaps %xmm1, %xmm0
+; AVX512-NEXT: retq
%r = call float @llvm.minnum.f32(float 0x7ff4000000000000, float %x)
ret float %r
}
diff --git a/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll b/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
index df60078dbf452..a7ff967d3123b 100644
--- a/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+++ b/llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
@@ -10,6 +10,8 @@ declare float @llvm.trunc.f32(float)
declare float @llvm.arithmetic.fence.f32(float)
declare float @llvm.minnum.f32(float, float)
declare float @llvm.maxnum.f32(float, float)
+declare float @llvm.minimumnum.f32(float, float)
+declare float @llvm.maximumnum.f32(float, float)
define float @ninf_user_select_inf(i1 %cond, float %x, float %y) {
@@ -1314,7 +1316,7 @@ define nofpclass(pinf) float @ret_nofpclass_pinf__minnum_ninf(i1 %cond, float %x
; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
; CHECK-NEXT: ret float 0xFFF0000000000000
;
- %min = call float @llvm.minnum.f32(float %x, float 0xFFF0000000000000)
+ %min = call float @llvm.minimumnum.f32(float %x, float 0xFFF0000000000000)
ret float %min
}
@@ -1335,6 +1337,6 @@ define nofpclass(ninf) float @ret_nofpclass_ninf__maxnum_pinf(i1 %cond, float %x
; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
; CHECK-NEXT: ret float 0x7FF0000000000000
;
- %max = call float @llvm.maxnum.f32(float %x, float 0x7FF0000000000000)
+ %max = call float @llvm.maximumnum.f32(float %x, float 0x7FF0000000000000)
ret float %max
}
diff --git a/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll b/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll
index a633d29179896..84bec15d6ed32 100644
--- a/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll
+++ b/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll
@@ -97,7 +97,8 @@ define float @minnum_float_qnan_p0() {
define float @minnum_float_p0_snan() {
; CHECK-LABEL: @minnum_float_p0_snan(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MIN:%.*]] = call float @llvm.minnum.f32(float 0.000000e+00, float 0x7FF4000000000000)
+; CHECK-NEXT: ret float [[MIN]]
;
%min = call float @llvm.minnum.f32(float 0.0, float 0x7FF4000000000000)
ret float %min
@@ -105,7 +106,8 @@ define float @minnum_float_p0_snan() {
define float @minnum_float_snan_p0() {
; CHECK-LABEL: @minnum_float_snan_p0(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MIN:%.*]] = call float @llvm.minnum.f32(float 0x7FF4000000000000, float 0.000000e+00)
+; CHECK-NEXT: ret float [[MIN]]
;
%min = call float @llvm.minnum.f32(float 0x7FF4000000000000, float 0.0)
ret float %min
@@ -205,7 +207,8 @@ define float @maxnum_float_qnan_p0() {
define float @maxnum_float_p0_snan() {
; CHECK-LABEL: @maxnum_float_p0_snan(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MAX:%.*]] = call float @llvm.maxnum.f32(float 0.000000e+00, float 0x7FF4000000000000)
+; CHECK-NEXT: ret float [[MAX]]
;
%max = call float @llvm.maxnum.f32(float 0.0, float 0x7FF4000000000000)
ret float %max
@@ -213,7 +216,8 @@ define float @maxnum_float_p0_snan() {
define float @maxnum_float_snan_p0() {
; CHECK-LABEL: @maxnum_float_snan_p0(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MAX:%.*]] = call float @llvm.maxnum.f32(float 0x7FF4000000000000, float 0.000000e+00)
+; CHECK-NEXT: ret float [[MAX]]
;
%max = call float @llvm.maxnum.f32(float 0x7FF4000000000000, float 0.0)
ret float %max
diff --git a/llvm/test/Transforms/InstSimplify/fminmax-folds.ll b/llvm/test/Transforms/InstSimplify/fminmax-folds.ll
index 091e85920c0df..7544f7190df89 100644
--- a/llvm/test/Transforms/InstSimplify/fminmax-folds.ll
+++ b/llvm/test/Transforms/InstSimplify/fminmax-folds.ll
@@ -43,11 +43,13 @@ define void @minmax_qnan_f32(float %x, ptr %minnum_res, ptr %maxnum_res, ptr %mi
; Note that maxnum/minnum return qnan here for snan inputs, unlike maximumnum/minimumnum
define void @minmax_snan_f32(float %x, ptr %minnum_res, ptr %maxnum_res, ptr %minimum_res, ptr %maximum_res, ptr %minimumnum_res, ptr %maximumnum_res) {
; CHECK-LABEL: @minmax_snan_f32(
-; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MINNUM_RES:%.*]], align 4
-; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MAXNUM_RES:%.*]], align 4
+; CHECK-NEXT: [[MINNUM:%.*]] = call float @llvm.minnum.f32(float [[X:%.*]], float 0x7FF4000000000000)
+; CHECK-NEXT: store float [[MINNUM]], ptr [[MINNUM_RES:%.*]], align 4
+; CHECK-NEXT: [[MAXNUM:%.*]] = call float @llvm.maxnum.f32(float [[X]], float 0x7FF4000000000000)
+; CHECK-NEXT: store float [[MAXNUM]], ptr [[MAXNUM_RES:%.*]], align 4
; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MINIMUM_RES:%.*]], align 4
; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MAXIMUM_RES:%.*]], align 4
-; CHECK-NEXT: store float [[X:%.*]], ptr [[MINIMUMNUM_RES:%.*]], align 4
+; CHECK-NEXT: store float [[X]], ptr [[MINIMUMNUM_RES:%.*]], align 4
; CHECK-NEXT: store float [[X]], ptr [[MAXIMUMNUM_RES:%.*]], align 4
; CHECK-NEXT: ret void
;
@@ -98,11 +100,13 @@ define void @minmax_qnan_nxv2f64_op0(<vscale x 2 x double> %x, ptr %minnum_res,
; Note that maxnum/minnum return qnan here for snan inputs, unlike maximumnum/minimumnum
define void @minmax_snan_nxv2f64_op1(<vscale x 2 x double> %x, ptr %minnum_res, ptr %maxnum_res, ptr %minimum_res, ptr %maximum_res, ptr %minimumnum_res, ptr %maximumnum_res) {
; CHECK-LABEL: @minmax_snan_nxv2f64_op1(
-; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MINNUM_RES:%.*]], align 16
-; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MAXNUM_RES:%.*]], align 16
+; CHECK-NEXT: [[MINNUM:%.*]] = call <vscale x 2 x double> @llvm.minnum.nxv2f64(<vscale x 2 x double> splat (double 0x7FF400DEAD00DEAD), <vscale x 2 x double> [[X:%.*]])
+; CHECK-NEXT: store <vscale x 2 x double> [[MINNUM]], ptr [[MINNUM_RES:%.*]], align 16
+; CHECK-NEXT: [[MAXNUM:%.*]] = call <vscale x 2 x double> @llvm.maxnum.nxv2f64(<vscale x 2 x double> splat (double 0x7FF400DEAD00DEAD), <vscale x 2 x double> [[X]])
+; CHECK-NEXT: store <vscale x 2 x double> [[MAXNUM]], ptr [[MAXNUM_RES:%.*]], align 16
; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MINIMUM_RES:%.*]], align 16
; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MAXIMUM_RES:%.*]], align 16
-; CHECK-NEXT: store <vscale x 2 x double> [[X:%.*]], ptr [[MINIMUMNUM_RES:%.*]], align 16
+; CHECK-NEXT: store <vscale x 2 x double> [[X]], ptr [[MINIMUMNUM_RES:%.*]], align 16
; CHECK-NEXT: store <vscale x 2 x double> [[X]], ptr [[MAXIMUMNUM_RES:%.*]], align 16
; CHECK-NEXT: ret void
;
@@ -255,7 +259,8 @@ define void @minmax_pos_inf_f32(float %x, ptr %minnum_res, ptr %maxnum_res, ptr
; CHECK-LABEL: @minmax_pos_inf_f32(
; CHECK-NEXT: [[MINNUM:%.*]] = call float @llvm.minnum.f32(float [[X:%.*]], float 0x7FF0000000000000)
; CHECK-NEXT: store float [[MINNUM]], ptr [[MINNUM_RES:%.*]], align 4
-; CHECK-NEXT: store float 0x7FF0000000000000, ptr [[MAXNUM_RES:%.*]], align 4
+; CHECK-NEXT: [[MAXNUM:%.*]] = call float @llvm.maxnum.f32(float [[X]], float 0x7FF0000000000000)
+; CHECK-NEXT: store float [[MAXNUM]], ptr [[MAXNUM_RES:%.*]], align 4
; CHECK-NEXT: store float ...
[truncated]
@llvm/pr-subscribers-llvm-selectiondag Author: Lewis Crawford (LewisCrawford)
declare float @llvm.maxnum.f32(float, float)
+declare float @llvm.minimumnum.f32(float, float)
+declare float @llvm.maximumnum.f32(float, float)
define float @ninf_user_select_inf(i1 %cond, float %x, float %y) {
@@ -1314,7 +1316,7 @@ define nofpclass(pinf) float @ret_nofpclass_pinf__minnum_ninf(i1 %cond, float %x
; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
; CHECK-NEXT: ret float 0xFFF0000000000000
;
- %min = call float @llvm.minnum.f32(float %x, float 0xFFF0000000000000)
+ %min = call float @llvm.minimumnum.f32(float %x, float 0xFFF0000000000000)
ret float %min
}
@@ -1335,6 +1337,6 @@ define nofpclass(ninf) float @ret_nofpclass_ninf__maxnum_pinf(i1 %cond, float %x
; CHECK-SAME: (i1 [[COND:%.*]], float [[X:%.*]]) {
; CHECK-NEXT: ret float 0x7FF0000000000000
;
- %max = call float @llvm.maxnum.f32(float %x, float 0x7FF0000000000000)
+ %max = call float @llvm.maximumnum.f32(float %x, float 0x7FF0000000000000)
ret float %max
}
diff --git a/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll b/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll
index a633d29179896..84bec15d6ed32 100644
--- a/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll
+++ b/llvm/test/Transforms/InstSimplify/ConstProp/min-max.ll
@@ -97,7 +97,8 @@ define float @minnum_float_qnan_p0() {
define float @minnum_float_p0_snan() {
; CHECK-LABEL: @minnum_float_p0_snan(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MIN:%.*]] = call float @llvm.minnum.f32(float 0.000000e+00, float 0x7FF4000000000000)
+; CHECK-NEXT: ret float [[MIN]]
;
%min = call float @llvm.minnum.f32(float 0.0, float 0x7FF4000000000000)
ret float %min
@@ -105,7 +106,8 @@ define float @minnum_float_p0_snan() {
define float @minnum_float_snan_p0() {
; CHECK-LABEL: @minnum_float_snan_p0(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MIN:%.*]] = call float @llvm.minnum.f32(float 0x7FF4000000000000, float 0.000000e+00)
+; CHECK-NEXT: ret float [[MIN]]
;
%min = call float @llvm.minnum.f32(float 0x7FF4000000000000, float 0.0)
ret float %min
@@ -205,7 +207,8 @@ define float @maxnum_float_qnan_p0() {
define float @maxnum_float_p0_snan() {
; CHECK-LABEL: @maxnum_float_p0_snan(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MAX:%.*]] = call float @llvm.maxnum.f32(float 0.000000e+00, float 0x7FF4000000000000)
+; CHECK-NEXT: ret float [[MAX]]
;
%max = call float @llvm.maxnum.f32(float 0.0, float 0x7FF4000000000000)
ret float %max
@@ -213,7 +216,8 @@ define float @maxnum_float_p0_snan() {
define float @maxnum_float_snan_p0() {
; CHECK-LABEL: @maxnum_float_snan_p0(
-; CHECK-NEXT: ret float 0x7FFC000000000000
+; CHECK-NEXT: [[MAX:%.*]] = call float @llvm.maxnum.f32(float 0x7FF4000000000000, float 0.000000e+00)
+; CHECK-NEXT: ret float [[MAX]]
;
%max = call float @llvm.maxnum.f32(float 0x7FF4000000000000, float 0.0)
ret float %max
diff --git a/llvm/test/Transforms/InstSimplify/fminmax-folds.ll b/llvm/test/Transforms/InstSimplify/fminmax-folds.ll
index 091e85920c0df..7544f7190df89 100644
--- a/llvm/test/Transforms/InstSimplify/fminmax-folds.ll
+++ b/llvm/test/Transforms/InstSimplify/fminmax-folds.ll
@@ -43,11 +43,13 @@ define void @minmax_qnan_f32(float %x, ptr %minnum_res, ptr %maxnum_res, ptr %mi
; Note that maxnum/minnum return qnan here for snan inputs, unlike maximumnum/minimumnum
define void @minmax_snan_f32(float %x, ptr %minnum_res, ptr %maxnum_res, ptr %minimum_res, ptr %maximum_res, ptr %minimumnum_res, ptr %maximumnum_res) {
; CHECK-LABEL: @minmax_snan_f32(
-; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MINNUM_RES:%.*]], align 4
-; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MAXNUM_RES:%.*]], align 4
+; CHECK-NEXT: [[MINNUM:%.*]] = call float @llvm.minnum.f32(float [[X:%.*]], float 0x7FF4000000000000)
+; CHECK-NEXT: store float [[MINNUM]], ptr [[MINNUM_RES:%.*]], align 4
+; CHECK-NEXT: [[MAXNUM:%.*]] = call float @llvm.maxnum.f32(float [[X]], float 0x7FF4000000000000)
+; CHECK-NEXT: store float [[MAXNUM]], ptr [[MAXNUM_RES:%.*]], align 4
; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MINIMUM_RES:%.*]], align 4
; CHECK-NEXT: store float 0x7FFC000000000000, ptr [[MAXIMUM_RES:%.*]], align 4
-; CHECK-NEXT: store float [[X:%.*]], ptr [[MINIMUMNUM_RES:%.*]], align 4
+; CHECK-NEXT: store float [[X]], ptr [[MINIMUMNUM_RES:%.*]], align 4
; CHECK-NEXT: store float [[X]], ptr [[MAXIMUMNUM_RES:%.*]], align 4
; CHECK-NEXT: ret void
;
@@ -98,11 +100,13 @@ define void @minmax_qnan_nxv2f64_op0(<vscale x 2 x double> %x, ptr %minnum_res,
; Note that maxnum/minnum return qnan here for snan inputs, unlike maximumnum/minimumnum
define void @minmax_snan_nxv2f64_op1(<vscale x 2 x double> %x, ptr %minnum_res, ptr %maxnum_res, ptr %minimum_res, ptr %maximum_res, ptr %minimumnum_res, ptr %maximumnum_res) {
; CHECK-LABEL: @minmax_snan_nxv2f64_op1(
-; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MINNUM_RES:%.*]], align 16
-; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MAXNUM_RES:%.*]], align 16
+; CHECK-NEXT: [[MINNUM:%.*]] = call <vscale x 2 x double> @llvm.minnum.nxv2f64(<vscale x 2 x double> splat (double 0x7FF400DEAD00DEAD), <vscale x 2 x double> [[X:%.*]])
+; CHECK-NEXT: store <vscale x 2 x double> [[MINNUM]], ptr [[MINNUM_RES:%.*]], align 16
+; CHECK-NEXT: [[MAXNUM:%.*]] = call <vscale x 2 x double> @llvm.maxnum.nxv2f64(<vscale x 2 x double> splat (double 0x7FF400DEAD00DEAD), <vscale x 2 x double> [[X]])
+; CHECK-NEXT: store <vscale x 2 x double> [[MAXNUM]], ptr [[MAXNUM_RES:%.*]], align 16
; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MINIMUM_RES:%.*]], align 16
; CHECK-NEXT: store <vscale x 2 x double> splat (double 0x7FFC00DEAD00DEAD), ptr [[MAXIMUM_RES:%.*]], align 16
-; CHECK-NEXT: store <vscale x 2 x double> [[X:%.*]], ptr [[MINIMUMNUM_RES:%.*]], align 16
+; CHECK-NEXT: store <vscale x 2 x double> [[X]], ptr [[MINIMUMNUM_RES:%.*]], align 16
; CHECK-NEXT: store <vscale x 2 x double> [[X]], ptr [[MAXIMUMNUM_RES:%.*]], align 16
; CHECK-NEXT: ret void
;
@@ -255,7 +259,8 @@ define void @minmax_pos_inf_f32(float %x, ptr %minnum_res, ptr %maxnum_res, ptr
; CHECK-LABEL: @minmax_pos_inf_f32(
; CHECK-NEXT: [[MINNUM:%.*]] = call float @llvm.minnum.f32(float [[X:%.*]], float 0x7FF0000000000000)
; CHECK-NEXT: store float [[MINNUM]], ptr [[MINNUM_RES:%.*]], align 4
-; CHECK-NEXT: store float 0x7FF0000000000000, ptr [[MAXNUM_RES:%.*]], align 4
+; CHECK-NEXT: [[MAXNUM:%.*]] = call float @llvm.maxnum.f32(float [[X]], float 0x7FF0000000000000)
+; CHECK-NEXT: store float [[MAXNUM]], ptr [[MAXNUM_RES:%.*]], align 4
; CHECK-NEXT: store float ...
[truncated]
🐧 Linux x64 Test Results
I'm curious, do we want to keep these optimizations around if the target makes no distinction between signalling and non-signalling NaNs? AFAICT, sNaN makes no difference on NVIDIA GPUs, and never causes any traps or updates any flags.
Artem-B
left a comment
LGTM in general.
phoebewang
left a comment
I'm good with the change assuming sNaN is rare in reality. I still want it back once we made consensus in LangRef.
X86 doesn't distinguish it either. We need to either change the LangRef or change the targets' implementations to match it. But either way, the optimizations are worthwhile. We should bring them back once the controversy is settled.
; AVX512-NEXT: vcmpunordss %xmm0, %xmm0, %k1
; AVX512-NEXT: vmovss %xmm2, %xmm1, %xmm1 {%k1}
; AVX512-NEXT: vmovaps %xmm1, %xmm0
probably something for a different PR,
but this really looks like a use case for VFIXUPIMMP[SD] with [QS]NaN to src1
A MOVSS is enough if the NaN is known. But we don't need to bother doing this in each backend; just wait for the code in DAGCombiner.cpp to be brought back :)
Even on backends like NVPTX that don't really care about sNaNs, the specification discussions around what the correct behaviour of `maxnum` should be are still relevant. For the NVPTX backend, we are hoping to transition
Most of these optimizations will probably be ok to bring back if the result is that the LangRef clearly defines the exact expected result here for all targets. If it instead goes for something like "return value is non-deterministic if one or both inputs are sNaN" as mentioned in #170082 (comment), then maybe it's better to keep them disabled so that constant and non-constant sNaNs produce consistent behaviour. Maybe we also want to add a target-specific check for whether the backend cares about sNaN trap/flag setting too (but this is probably already covered by
The behaviour of constant-folding `maxnum(sNaN, x)` and `minnum(sNaN, x)` has become controversial, and there are ongoing discussions about which behaviour we want to specify in the LLVM IR LangRef. See:

This patch removes optimizations and constant-folding support for `maxnum(sNaN, x)` but keeps it folded/optimized for `qNaN`. This should allow for some more flexibility, so the implementation can conform to either the old or new version of the specified semantics without any changes.

As far as I am aware, optimizations involving constant `sNaN` should generally be edge-cases that rarely occur, so there should hopefully be very little real-world performance impact from disabling these optimizations.
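As a minimal sketch of the kind of fold this patch disables (the constants mirror those in the `min-max.ll` tests above):

```llvm
; Before this patch, InstSimplify constant-folded the sNaN operand away:
;
;   %min = call float @llvm.minnum.f32(float 0x7FF4000000000000, float 0.0)
;     ==> ret float 0x7FFC000000000000   ; unconditionally quieted to a qNaN
;
; After this patch, calls with a constant sNaN argument are left intact,
; so the final behaviour is deferred until the LangRef semantics are settled:
define float @minnum_float_snan_p0() {
  %min = call float @llvm.minnum.f32(float 0x7FF4000000000000, float 0.000000e+00)
  ret float %min
}
; Note: 0x7FF4... encodes an sNaN and 0x7FFC... a qNaN; the two bit
; patterns differ only in bit 51, the IEEE-754 quiet bit.
```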