[flang][OpenMP] Use maxnum/minnum for lowering of max/min reduction operators #89258

jsjodin · 2024-04-18T16:14:43Z

This patch changes the lowering of max and min to be lowered to arith::MaxNumFop and arith::MinNumFOp instead of using arith::MaximumFOp and arith::MinimumFOp. The arith::MaximumFOp and arith::MinimumFOp map to the corresponding intrinsics llvm.maximum.* and llvm.minimum.* intrinsics which conform to the semantics specified in the draft of IEEE 754-2019, which is not supported by all hardware. Instead using arith::MaximumFOp and arith::MinimumFOp will allow code generation for more targets and match the code generated by clang OpenMP.

fixes #87955

…perators This patch changes the lowering of max and min to be lowered to arith::MaxNumFop and arith::MinNumFOp instead of using arith::MaximumFOp and arith::MinimumFOp. The arith::MaximumFOp and arith::MinimumFOp map to the corresponding intrinsics llvm.maximum.* and llvm.minimum.* intrinsics which conform to the semantics specified in the draft of IEEE 754-2019, which is not supported by all hardware. Instead using arith::MaximumFOp and arith::MinimumFOp will allow code generation for more targets and match the code generated by clang OpenMP.

llvmbot · 2024-04-18T16:15:17Z

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: Jan Leyonberg (jsjodin)

Changes

This patch changes the lowering of max and min to be lowered to arith::MaxNumFop and arith::MinNumFOp instead of using arith::MaximumFOp and arith::MinimumFOp. The arith::MaximumFOp and arith::MinimumFOp map to the corresponding intrinsics llvm.maximum.* and llvm.minimum.* intrinsics which conform to the semantics specified in the draft of IEEE 754-2019, which is not supported by all hardware. Instead using arith::MaximumFOp and arith::MinimumFOp will allow code generation for more targets and match the code generated by clang OpenMP.

Full diff: https://github.com/llvm/llvm-project/pull/89258.diff

9 Files Affected:

(modified) flang/lib/Lower/OpenMP/ReductionProcessor.cpp (+2-2)
(modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-max-byref.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-max.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-min-byref.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-min.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-max-byref.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-max.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-min-byref.f90 (+1-1)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-min.f90 (+1-1)

diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index f42386fe2736dd..36ea9aba5e9856 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -220,12 +220,12 @@ mlir::Value ReductionProcessor::createScalarCombiner(
   switch (redId) {
   case ReductionIdentifier::MAX:
     reductionOp =
-        getReductionOperation<mlir::arith::MaximumFOp, mlir::arith::MaxSIOp>(
+        getReductionOperation<mlir::arith::MaxNumFOp, mlir::arith::MaxSIOp>(
             builder, type, loc, op1, op2);
     break;
   case ReductionIdentifier::MIN:
     reductionOp =
-        getReductionOperation<mlir::arith::MinimumFOp, mlir::arith::MinSIOp>(
+        getReductionOperation<mlir::arith::MinNumFOp, mlir::arith::MinSIOp>(
             builder, type, loc, op1, op2);
     break;
   case ReductionIdentifier::IOR:
diff --git a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max-byref.f90 b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max-byref.f90
index f0979ab95f568a..80b720e3aac1d1 100644
--- a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max-byref.f90
+++ b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max-byref.f90
@@ -11,7 +11,7 @@
 !CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref<f32>, %[[ARG1:.*]]: !fir.ref<f32>):
 !CHECK:   %[[LD0:.*]] = fir.load %[[ARG0]] : !fir.ref<f32>
 !CHECK:   %[[LD1:.*]] = fir.load %[[ARG1]] : !fir.ref<f32>
-!CHECK:   %[[RES:.*]] = arith.maximumf %[[LD0]], %[[LD1]] {{.*}}: f32
+!CHECK:   %[[RES:.*]] = arith.maxnumf %[[LD0]], %[[LD1]] {{.*}}: f32
 !CHECK:   fir.store %[[RES]] to %[[ARG0]] : !fir.ref<f32>
 !CHECK:   omp.yield(%[[ARG0]] : !fir.ref<f32>)
 
diff --git a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max.f90 b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max.f90
index 996296c2adc2b5..c3b821ea591246 100644
--- a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max.f90
+++ b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-max.f90
@@ -6,7 +6,7 @@
 !CHECK:   omp.yield(%[[MINIMUM_VAL_F]] : f32)
 !CHECK: combiner
 !CHECK: ^bb0(%[[ARG0_F:.*]]: f32, %[[ARG1_F:.*]]: f32):
-!CHECK:   %[[COMB_VAL_F:.*]] = arith.maximumf %[[ARG0_F]], %[[ARG1_F]] {{.*}}: f32
+!CHECK:   %[[COMB_VAL_F:.*]] = arith.maxnumf %[[ARG0_F]], %[[ARG1_F]] {{.*}}: f32
 !CHECK:   omp.yield(%[[COMB_VAL_F]] : f32)
 
 !CHECK: omp.declare_reduction @[[MAX_DECLARE_I:.*]] : i32 init {
diff --git a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min-byref.f90 b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min-byref.f90
index 24aa8e46e5bbbd..b284f8e5d96721 100644
--- a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min-byref.f90
+++ b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min-byref.f90
@@ -11,7 +11,7 @@
 !CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref<f32>, %[[ARG1:.*]]: !fir.ref<f32>):
 !CHECK:   %[[LD0:.*]] = fir.load %[[ARG0]] : !fir.ref<f32>
 !CHECK:   %[[LD1:.*]] = fir.load %[[ARG1]] : !fir.ref<f32>
-!CHECK:   %[[RES:.*]] = arith.minimumf %[[LD0]], %[[LD1]] {{.*}}: f32
+!CHECK:   %[[RES:.*]] = arith.minnumf %[[LD0]], %[[LD1]] {{.*}}: f32
 !CHECK:   fir.store %[[RES]] to %[[ARG0]] : !fir.ref<f32>
 !CHECK:   omp.yield(%[[ARG0]] : !fir.ref<f32>)
 
diff --git a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min.f90 b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min.f90
index 268f51c9dc9330..ab33e180ed883d 100644
--- a/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min.f90
+++ b/flang/test/Lower/OpenMP/FIR/wsloop-reduction-min.f90
@@ -6,7 +6,7 @@
 !CHECK:   omp.yield(%[[MAXIMUM_VAL_F]] : f32)
 !CHECK: combiner
 !CHECK: ^bb0(%[[ARG0_F:.*]]: f32, %[[ARG1_F:.*]]: f32):
-!CHECK:   %[[COMB_VAL_F:.*]] = arith.minimumf %[[ARG0_F]], %[[ARG1_F]] {{.*}}: f32
+!CHECK:   %[[COMB_VAL_F:.*]] = arith.minnumf %[[ARG0_F]], %[[ARG1_F]] {{.*}}: f32
 !CHECK:   omp.yield(%[[COMB_VAL_F]] : f32)
 
 !CHECK: omp.declare_reduction @[[MIN_DECLARE_I:.*]] : i32 init {
diff --git a/flang/test/Lower/OpenMP/wsloop-reduction-max-byref.f90 b/flang/test/Lower/OpenMP/wsloop-reduction-max-byref.f90
index ee562bbe15863e..2f6921edcb42a5 100644
--- a/flang/test/Lower/OpenMP/wsloop-reduction-max-byref.f90
+++ b/flang/test/Lower/OpenMP/wsloop-reduction-max-byref.f90
@@ -13,7 +13,7 @@
 !CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref<f32>, %[[ARG1:.*]]: !fir.ref<f32>):
 !CHECK:   %[[LD0:.*]] = fir.load %[[ARG0]] : !fir.ref<f32>
 !CHECK:   %[[LD1:.*]] = fir.load %[[ARG1]] : !fir.ref<f32>
-!CHECK:   %[[RES:.*]] = arith.maximumf %[[LD0]], %[[LD1]] {{.*}}: f32
+!CHECK:   %[[RES:.*]] = arith.maxnumf %[[LD0]], %[[LD1]] {{.*}}: f32
 !CHECK:   fir.store %[[RES]] to %[[ARG0]] : !fir.ref<f32>
 !CHECK:   omp.yield(%[[ARG0]] : !fir.ref<f32>)
 
diff --git a/flang/test/Lower/OpenMP/wsloop-reduction-max.f90 b/flang/test/Lower/OpenMP/wsloop-reduction-max.f90
index 6f11f0ec96a7d3..c9cf5cbf4f8c02 100644
--- a/flang/test/Lower/OpenMP/wsloop-reduction-max.f90
+++ b/flang/test/Lower/OpenMP/wsloop-reduction-max.f90
@@ -10,7 +10,7 @@
 
 ! CHECK-LABEL:   } combiner {
 ! CHECK:         ^bb0(%[[VAL_0:.*]]: f32, %[[VAL_1:.*]]: f32):
-! CHECK:           %[[VAL_2:.*]] = arith.maximumf %[[VAL_0]], %[[VAL_1]] fastmath<contract> : f32
+! CHECK:           %[[VAL_2:.*]] = arith.maxnumf %[[VAL_0]], %[[VAL_1]] fastmath<contract> : f32
 ! CHECK:           omp.yield(%[[VAL_2]] : f32)
 ! CHECK:         }
 
diff --git a/flang/test/Lower/OpenMP/wsloop-reduction-min-byref.f90 b/flang/test/Lower/OpenMP/wsloop-reduction-min-byref.f90
index c0372117a03b9d..84a376b46b8fbe 100644
--- a/flang/test/Lower/OpenMP/wsloop-reduction-min-byref.f90
+++ b/flang/test/Lower/OpenMP/wsloop-reduction-min-byref.f90
@@ -13,7 +13,7 @@
 !CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref<f32>, %[[ARG1:.*]]: !fir.ref<f32>):
 !CHECK:   %[[LD0:.*]] = fir.load %[[ARG0]] : !fir.ref<f32>
 !CHECK:   %[[LD1:.*]] = fir.load %[[ARG1]] : !fir.ref<f32>
-!CHECK:   %[[RES:.*]] = arith.minimumf %[[LD0]], %[[LD1]] {{.*}}: f32
+!CHECK:   %[[RES:.*]] = arith.minnumf %[[LD0]], %[[LD1]] {{.*}}: f32
 !CHECK:   fir.store %[[RES]] to %[[ARG0]] : !fir.ref<f32>
 !CHECK:   omp.yield(%[[ARG0]] : !fir.ref<f32>)
 
diff --git a/flang/test/Lower/OpenMP/wsloop-reduction-min.f90 b/flang/test/Lower/OpenMP/wsloop-reduction-min.f90
index 2c694f82e279a4..3ba279acd14c41 100644
--- a/flang/test/Lower/OpenMP/wsloop-reduction-min.f90
+++ b/flang/test/Lower/OpenMP/wsloop-reduction-min.f90
@@ -10,7 +10,7 @@
 
 ! CHECK-LABEL:   } combiner {
 ! CHECK:         ^bb0(%[[VAL_0:.*]]: f32, %[[VAL_1:.*]]: f32):
-! CHECK:           %[[VAL_2:.*]] = arith.minimumf %[[VAL_0]], %[[VAL_1]] fastmath<contract> : f32
+! CHECK:           %[[VAL_2:.*]] = arith.minnumf %[[VAL_0]], %[[VAL_1]] fastmath<contract> : f32
 ! CHECK:           omp.yield(%[[VAL_2]] : f32)
 ! CHECK:         }

kiranchandramohan · 2024-04-18T16:19:09Z

If this fixes #87955 then can you add a fixes tag?

kiranchandramohan · 2024-04-18T16:56:13Z

If this fixes #87955 then can you add a fixes tag?

This works for me on AArch64. I think it lowers the 128 version early to compare and select. Thanks for the fix.

kiranchandramohan

LG.

tblah

LGTM thanks

kiranchandramohan · 2024-04-18T17:14:11Z

If this fixes #87955 then can you add a fixes tag?

This works for me on AArch64. I think it lowers the 128 version early to compare and select. Thanks for the fix.

Sorry, I got that wrong. For #87955 it is a min reduction and the backend can convert minnum to a fminl call hence it works.

jsjodin · 2024-04-18T17:16:24Z

If this fixes #87955 then can you add a fixes tag?

This works for me on AArch64. I think it lowers the 128 version early to compare and select. Thanks for the fix.

Thank you for the review, I will add a fixes tag.

kparzysz · 2024-04-18T17:17:51Z

I have a concern with making a change this early in the compilation process only to avoid a selection error in the backend. First of all, we seem to replace an IEEE-compliant operation with one that isn't. Such a replacement should only be done if the compiler is allowed to do that (via relevant FP flags). Second, this is a legal operation in MLIR, and presumably in LLVM IR. If the backend can't generate valid code for it, then it's a missing functionality in the backend. Lastly, the replacing of an IEEE-compliant operation with a similar one should be done by the backend, since it's the backend that knows (1) if the operation needs to be replaced, and (2) what to replace it with.

jsjodin · 2024-04-18T17:18:04Z

If this fixes #87955 then can you add a fixes tag?

This works for me on AArch64. I think it lowers the 128 version early to compare and select. Thanks for the fix.

Sorry, I got that wrong. For #87955 it is a min reduction and the backend can convert minnum to a fminl call hence it works.

Okay, so does this PR fix anything?

kiranchandramohan · 2024-04-18T17:20:34Z

If this fixes #87955 then can you add a fixes tag?

This works for me on AArch64. I think it lowers the 128 version early to compare and select. Thanks for the fix.

Sorry, I got that wrong. For #87955 it is a min reduction and the backend can convert minnum to a fminl call hence it works.

Okay, so does this PR fix anything?

Yes it fixes #87955

jsjodin · 2024-04-18T17:21:41Z

I have a concern with making a change this early in the compilation process only to avoid a selection error in the backend. First of all, we seem to replace an IEEE-compliant operation with one that isn't. Such a replacement should only be done if the compiler is allowed to do that (via relevant FP flags). Second, this is a legal operation in MLIR, and presumably in LLVM IR. If the backend can't generate valid code for it, then it's a missing functionality in the backend. Lastly, the replacing of an IEEE-compliant operation with a similar one should be done by the backend, since it's the backend that knows (1) if the operation needs to be replaced, and (2) what to replace it with.

I think this change is probably okay since OpenMP 5.0 with min/max reductions was released 2018 and the IEEE-compliance relates to a 2019 draft. So we're not changing the semantics of OpenMP in this case.

kparzysz · 2024-04-18T17:24:05Z

Feel free to commit it as it is, but we should get the backend(s) to fix this.

kparzysz · 2024-04-18T17:41:26Z

Opened #89278 to track it.

…perators (llvm#89258) This patch changes the lowering of max and min to be lowered to arith::MaxNumFop and arith::MinNumFOp instead of using arith::MaximumFOp and arith::MinimumFOp. The arith::MaximumFOp and arith::MinimumFOp map to the corresponding intrinsics llvm.maximum.* and llvm.minimum.* intrinsics which conform to the semantics specified in the draft of IEEE 754-2019, which is not supported by all hardware. Instead using arith::MaximumFOp and arith::MinimumFOp will allow code generation for more targets and match the code generated by clang OpenMP. fixes llvm#87955

jsjodin requested review from DavidTruby, ergawy, skatrak, tblah, kiranchandramohan, agozillon, DominikAdamski, unterumarmung and TIFitis April 18, 2024 16:14

llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Apr 18, 2024

kiranchandramohan approved these changes Apr 18, 2024

View reviewed changes

tblah approved these changes Apr 18, 2024

View reviewed changes

jsjodin merged commit 0c455ee into llvm:main Apr 19, 2024
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang][OpenMP] Use maxnum/minnum for lowering of max/min reduction operators #89258

[flang][OpenMP] Use maxnum/minnum for lowering of max/min reduction operators #89258

jsjodin commented Apr 18, 2024 •

edited

llvmbot commented Apr 18, 2024 •

edited

kiranchandramohan commented Apr 18, 2024

kiranchandramohan commented Apr 18, 2024

kiranchandramohan left a comment

tblah left a comment

kiranchandramohan commented Apr 18, 2024

jsjodin commented Apr 18, 2024

kparzysz commented Apr 18, 2024

jsjodin commented Apr 18, 2024

kiranchandramohan commented Apr 18, 2024

jsjodin commented Apr 18, 2024

kparzysz commented Apr 18, 2024

kparzysz commented Apr 18, 2024

[flang][OpenMP] Use maxnum/minnum for lowering of max/min reduction operators #89258

[flang][OpenMP] Use maxnum/minnum for lowering of max/min reduction operators #89258

Conversation

jsjodin commented Apr 18, 2024 • edited

llvmbot commented Apr 18, 2024 • edited

kiranchandramohan commented Apr 18, 2024

kiranchandramohan commented Apr 18, 2024

kiranchandramohan left a comment

Choose a reason for hiding this comment

tblah left a comment

Choose a reason for hiding this comment

kiranchandramohan commented Apr 18, 2024

jsjodin commented Apr 18, 2024

kparzysz commented Apr 18, 2024

jsjodin commented Apr 18, 2024

kiranchandramohan commented Apr 18, 2024

jsjodin commented Apr 18, 2024

kparzysz commented Apr 18, 2024

kparzysz commented Apr 18, 2024

jsjodin commented Apr 18, 2024 •

edited

llvmbot commented Apr 18, 2024 •

edited