Skip to content

Conversation

LewisCrawford
Copy link
Contributor

NVPTX has ZeroOrNegativeOneBooleanContent, therefore we need to use -1 as the constant for i1 sint_to_fp operations in instruction selection.

NVPTX has ZeroOrNegativeOneBooleanContent, therefore we need to use -1
as the constant for i1 sint_to_fp operations in instruction selection.
@llvmbot
Copy link
Member

llvmbot commented Oct 2, 2024

@llvm/pr-subscribers-backend-nvptx

Author: Lewis Crawford (LewisCrawford)

Changes

NVPTX has ZeroOrNegativeOneBooleanContent, therefore we need to use -1 as the constant for i1 sint_to_fp operations in instruction selection.


Full diff: https://github.com/llvm/llvm-project/pull/110866.diff

2 Files Affected:

  • (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (+5-4)
  • (modified) llvm/test/CodeGen/NVPTX/i1-int-to-fp.ll (+30-12)
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index b82826089d3fe3..8f4eddb5142740 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -3126,11 +3126,12 @@ foreach ta = [v2f16, v2bf16, v2i16, v4i8, i32] in {
 
 // NOTE: pred->fp are currently sub-optimal due to an issue in TableGen where
 // we cannot specify floating-point literals in isel patterns.  Therefore, we
-// use an integer selp to select either 1 or 0 and then cvt to floating-point.
+// use an integer selp to select either 1 (or -1 in case of signed) or 0
+// and then cvt to floating-point.
 
 // sint -> f16
 def : Pat<(f16 (sint_to_fp Int1Regs:$a)),
-          (CVT_f16_s32 (SELP_u32ii 1, 0, Int1Regs:$a), CvtRN)>;
+          (CVT_f16_s32 (SELP_s32ii -1, 0, Int1Regs:$a), CvtRN)>;
 def : Pat<(f16 (sint_to_fp Int16Regs:$a)),
           (CVT_f16_s16 Int16Regs:$a, CvtRN)>;
 def : Pat<(f16 (sint_to_fp Int32Regs:$a)),
@@ -3170,7 +3171,7 @@ def : Pat<(bf16 (uint_to_fp Int64Regs:$a)),
 
 // sint -> f32
 def : Pat<(f32 (sint_to_fp Int1Regs:$a)),
-          (CVT_f32_s32 (SELP_u32ii 1, 0, Int1Regs:$a), CvtRN)>;
+          (CVT_f32_s32 (SELP_s32ii -1, 0, Int1Regs:$a), CvtRN)>;
 def : Pat<(f32 (sint_to_fp Int16Regs:$a)),
           (CVT_f32_s16 Int16Regs:$a, CvtRN)>;
 def : Pat<(f32 (sint_to_fp Int32Regs:$a)),
@@ -3190,7 +3191,7 @@ def : Pat<(f32 (uint_to_fp Int64Regs:$a)),
 
 // sint -> f64
 def : Pat<(f64 (sint_to_fp Int1Regs:$a)),
-          (CVT_f64_s32 (SELP_u32ii 1, 0, Int1Regs:$a), CvtRN)>;
+          (CVT_f64_s32 (SELP_s32ii -1, 0, Int1Regs:$a), CvtRN)>;
 def : Pat<(f64 (sint_to_fp Int16Regs:$a)),
           (CVT_f64_s16 Int16Regs:$a, CvtRN)>;
 def : Pat<(f64 (sint_to_fp Int32Regs:$a)),
diff --git a/llvm/test/CodeGen/NVPTX/i1-int-to-fp.ll b/llvm/test/CodeGen/NVPTX/i1-int-to-fp.ll
index 6920be5cc4e9e3..a0f07afafa4593 100644
--- a/llvm/test/CodeGen/NVPTX/i1-int-to-fp.ll
+++ b/llvm/test/CodeGen/NVPTX/i1-int-to-fp.ll
@@ -2,37 +2,55 @@
 ; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 | %ptxas-verify %}
 
 ; CHECK-LABEL: foo
-; CHECK: setp
-; CHECK: selp
-; CHECK: cvt.rn.f32.u32
+; CHECK: setp.eq.b16 %[[P:p[0-9]+]], %{{.*}}, 1;
+; CHECK: selp.u32 %[[R:r[0-9]+]], 1, 0, %[[P]];
+; CHECK: cvt.rn.f32.u32 %f{{.*}}, %[[R]]
 define float @foo(i1 %a) {
   %ret = uitofp i1 %a to float
   ret float %ret
 }
 
 ; CHECK-LABEL: foo2
-; CHECK: setp
-; CHECK: selp
-; CHECK: cvt.rn.f32.s32
+; CHECK: setp.eq.b16 %[[P:p[0-9]+]], %{{.*}}, 1;
+; CHECK: selp.s32 %[[R:r[0-9]+]], -1, 0, %[[P]];
+; CHECK: cvt.rn.f32.s32 %f{{.*}}, %[[R]]
 define float @foo2(i1 %a) {
   %ret = sitofp i1 %a to float
   ret float %ret
 }
 
 ; CHECK-LABEL: foo3
-; CHECK: setp
-; CHECK: selp
-; CHECK: cvt.rn.f64.u32
+; CHECK: setp.eq.b16 %[[P:p[0-9]+]], %{{.*}}, 1;
+; CHECK: selp.u32 %[[R:r[0-9]+]], 1, 0, %[[P]];
+; CHECK: cvt.rn.f64.u32 %fd{{.*}}, %[[R]]
 define double @foo3(i1 %a) {
   %ret = uitofp i1 %a to double
   ret double %ret
 }
 
 ; CHECK-LABEL: foo4
-; CHECK: setp
-; CHECK: selp
-; CHECK: cvt.rn.f64.s32
+; CHECK: setp.eq.b16 %[[P:p[0-9]+]], %{{.*}}, 1;
+; CHECK: selp.s32 %[[R:r[0-9]+]], -1, 0, %[[P]];
+; CHECK: cvt.rn.f64.s32 %fd{{.*}}, %[[R]]
 define double @foo4(i1 %a) {
   %ret = sitofp i1 %a to double
   ret double %ret
 }
+
+; CHECK-LABEL: foo5
+; CHECK: setp.eq.b16 %[[P:p[0-9]+]], %{{.*}}, 1;
+; CHECK: selp.u32 %[[R:r[0-9]+]], 1, 0, %[[P]];
+; CHECK: cvt.rn.f16.u32 %{{.*}}, %[[R]]
+define half @foo5(i1 %a) {
+  %ret = uitofp i1 %a to half
+  ret half %ret
+}
+
+; CHECK-LABEL: foo6
+; CHECK: setp.eq.b16 %[[P:p[0-9]+]], %{{.*}}, 1;
+; CHECK: selp.s32 %[[R:r[0-9]+]], -1, 0, %[[P]];
+; CHECK: cvt.rn.f16.s32 %{{.*}}, %[[R]]
+define half @foo6(i1 %a) {
+  %ret = sitofp i1 %a to half
+  ret half %ret
+}

@AlexMaclean AlexMaclean requested a review from Artem-B October 2, 2024 16:25
; CHECK: selp.s32 %[[R:r[0-9]+]], -1, 0, %[[P]];
; CHECK: cvt.rn.f16.s32 %{{.*}}, %[[R]]
define half @foo6(i1 %a) {
%ret = sitofp i1 %a to half
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinating. My intuition says that (float)(int)(true) == (float)(true), yet in this case signed conversion from bool will produce -1.

A bit of digging around brings up #57181 which gives some context. TL;DR; version is that signext i1 in IR indeed sign extends to -1 and works differently from the bool in C++ which is treated as an unsigned int.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! :)

Would you or @AlexMaclean be able to merge this PR for me? (I don't have commit access to the llvm repo yet)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NVM, I've been granted commit access now.

@LewisCrawford LewisCrawford merged commit cc5ddae into llvm:main Oct 8, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants