[NVPTX] expand trunc/ext on v2i32 #161715

Artem-B · 2025-10-02T18:19:25Z

#153478 made v2i32 legal on newer GPUs, but we cannot lower all operations yet. Expand the trunc operation until we implement efficient lowering.

llvmbot · 2025-10-02T18:20:04Z

@llvm/pr-subscribers-backend-nvptx

Author: Artem Belevich (Artem-B)

Changes

#153478 made v2i32 legal on newer GPUs, but we cannot lower all operations yet. Expand the trunc operation until we implement efficient lowering.

Full diff: https://github.com/llvm/llvm-project/pull/161715.diff

2 Files Affected:

(modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp (+3)
(modified) llvm/test/CodeGen/NVPTX/f32x2-convert-i32x2.ll (+37)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 3ac7c2874408b..48e539037dcc7 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -638,6 +638,9 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
   // No support for these operations with v2f32/v2i32
   setOperationAction(ISD::INSERT_VECTOR_ELT, {MVT::v2f32, MVT::v2i32}, Expand);
   setOperationAction(ISD::VECTOR_SHUFFLE, {MVT::v2f32, MVT::v2i32}, Expand);
+
+  setOperationAction(ISD::TRUNCATE, MVT::v2i16, Expand);
+
   // Need custom lowering in case the index is dynamic.
   if (STI.hasF32x2Instructions())
     setOperationAction(ISD::EXTRACT_VECTOR_ELT, {MVT::v2f32, MVT::v2i32},
diff --git a/llvm/test/CodeGen/NVPTX/f32x2-convert-i32x2.ll b/llvm/test/CodeGen/NVPTX/f32x2-convert-i32x2.ll
index 18fb87935d17d..4bfae5c437b85 100644
--- a/llvm/test/CodeGen/NVPTX/f32x2-convert-i32x2.ll
+++ b/llvm/test/CodeGen/NVPTX/f32x2-convert-i32x2.ll
@@ -115,5 +115,42 @@ define ptx_kernel void @inlineasm(ptr %p) {
   store <2 x float> %mul, ptr %p, align 8
   ret void
 }
+
+define ptx_kernel void @trunc_v2i32(<2 x i32> %0) {
+; CHECK-SM90A-LABEL: trunc_v2i32(
+; CHECK-SM90A:       {
+; CHECK-SM90A-NEXT:    .reg .b32 %r<7>;
+; CHECK-SM90A-NEXT:    .reg .b64 %rd<2>;
+; CHECK-SM90A-EMPTY:
+; CHECK-SM90A-NEXT:  // %bb.0:
+; CHECK-SM90A-NEXT:    ld.param.v2.b32 {%r1, %r2}, [trunc_v2i32_param_0];
+; CHECK-SM90A-NEXT:    prmt.b32 %r3, %r1, %r2, 0x3340U;
+; CHECK-SM90A-NEXT:    mov.b32 %r4, 0;
+; CHECK-SM90A-NEXT:    prmt.b32 %r5, %r4, 0, 0x3340U;
+; CHECK-SM90A-NEXT:    prmt.b32 %r6, %r5, %r3, 0x5410U;
+; CHECK-SM90A-NEXT:    mov.b64 %rd1, 0;
+; CHECK-SM90A-NEXT:    st.b32 [%rd1], %r6;
+; CHECK-SM90A-NEXT:    ret;
+;
+; CHECK-SM100-LABEL: trunc_v2i32(
+; CHECK-SM100:       {
+; CHECK-SM100-NEXT:    .reg .b32 %r<7>;
+; CHECK-SM100-NEXT:    .reg .b64 %rd<3>;
+; CHECK-SM100-EMPTY:
+; CHECK-SM100-NEXT:  // %bb.0:
+; CHECK-SM100-NEXT:    ld.param.b64 %rd1, [trunc_v2i32_param_0];
+; CHECK-SM100-NEXT:    mov.b64 {%r1, %r2}, %rd1;
+; CHECK-SM100-NEXT:    mov.b32 %r3, 0;
+; CHECK-SM100-NEXT:    prmt.b32 %r4, %r3, 0, 0x3340U;
+; CHECK-SM100-NEXT:    prmt.b32 %r5, %r1, %r2, 0x3340U;
+; CHECK-SM100-NEXT:    prmt.b32 %r6, %r4, %r5, 0x5410U;
+; CHECK-SM100-NEXT:    mov.b64 %rd2, 0;
+; CHECK-SM100-NEXT:    st.b32 [%rd2], %r6;
+; CHECK-SM100-NEXT:    ret;
+  %2 = trunc <2 x i32> %0 to <2 x i8>
+  %3 = shufflevector <2 x i8> zeroinitializer, <2 x i8> %2, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  store <4 x i8> %3, ptr null, align 4
+  ret void
+}
 ;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
 ; CHECK: {{.*}}
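
To help read the CHECK lines in the new test, here is a minimal C++ sketch of how the three `prmt.b32` instructions in the SM90A output assemble the stored `<4 x i8>`. It models `prmt.b32` in its default mode as described in the PTX ISA. The function names `prmt` and `trunc_v2i32_model` are illustrative assumptions and are not part of LLVM or of this patch.

```cpp
#include <cstdint>

// Software model of PTX `prmt.b32 d, a, b, c` in its default mode.
// Source bytes 0-3 come from `a`, bytes 4-7 from `b`. Each 4-bit field of
// the selector `c` picks the source byte for one byte of the result.
constexpr std::uint32_t prmt(std::uint32_t a, std::uint32_t b,
                             std::uint32_t c) {
  const std::uint64_t src = (static_cast<std::uint64_t>(b) << 32) | a;
  std::uint32_t d = 0;
  for (int i = 0; i < 4; ++i) {
    const std::uint32_t sel = (c >> (4 * i)) & 0xF;
    std::uint32_t byte =
        static_cast<std::uint32_t>(src >> (8 * (sel & 0x7))) & 0xFF;
    if (sel & 0x8) // msb of the field set: replicate the byte's sign bit
      byte = (byte & 0x80) ? 0xFF : 0x00;
    d |= byte << (8 * i);
  }
  return d;
}

// Hypothetical helper mirroring the CHECK-SM90A sequence for @trunc_v2i32.
// r1 and r2 are the two i32 elements of the <2 x i32> argument.
constexpr std::uint32_t trunc_v2i32_model(std::uint32_t r1, std::uint32_t r2) {
  // Bytes {r1.b0, r2.b0, r1.b3, r1.b3}: the low two bytes are the <2 x i8>.
  const std::uint32_t r3 = prmt(r1, r2, 0x3340);
  // The zeroinitializer operand of the shufflevector.
  const std::uint32_t r5 = prmt(0, 0, 0x3340);
  // Bytes {r5.b0, r5.b1, r3.b0, r3.b1} = {0, 0, trunc(r1), trunc(r2)}.
  return prmt(r5, r3, 0x5410);
}
```

With `r1 = 0x11223344` and `r2 = 0xAABBCCDD`, the model returns `0xDD440000`. Stored little-endian, that is the byte sequence `00 00 44 DD`, which matches a `<4 x i8>` whose first two lanes are zero and whose last two lanes are the truncated elements.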

Prince781

We also need to expand any_extend and zero_extend: https://godbolt.org/z/dfeeGEWqr

Artem-B · 2025-10-02T19:21:04Z

@Prince781 Done. PTAL.

Prince781

LGTM. Thanks!

llvm#153478 made v2i32 legal on newer GPUs, but we cannot lower all operations yet. Expand the `trunc/ext` operations until we implement efficient lowering.

…162391) Follow-up on #153478 and #161715. v2i32 register class exists mostly to facilitate v2f32's use of integer registers. There are no actual instructions that can apply to v2i32 directly (except bitwise logical ops). Everything else must be done elementwise.
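
As a rough illustration of what "elementwise" means for the operations discussed in this PR, the following C++ sketch models truncation and extension one lane at a time. It assumes a v2i32 value occupies one 64-bit register with element 0 in the low half (as the `mov.b64 {%r1, %r2}, %rd1` line in the SM100 test output suggests) and that a v2i16 value is packed the same way into 32 bits. All function names are hypothetical. This is not the legalizer's actual code.

```cpp
#include <cstdint>

// Assumed layout: element 0 in the low 32 bits, element 1 in the high 32 bits.
constexpr std::uint64_t pack_v2i32(std::uint32_t e0, std::uint32_t e1) {
  return (static_cast<std::uint64_t>(e1) << 32) | e0;
}

// zero_extend <2 x i16> to <2 x i32>: unpack, widen each lane, repack.
// Since any_extend leaves the high bits unspecified, this is also one valid
// any_extend.
constexpr std::uint64_t zext_v2i16_to_v2i32(std::uint32_t v) {
  const std::uint32_t e0 = v & 0xFFFF;
  const std::uint32_t e1 = v >> 16;
  return pack_v2i32(e0, e1);
}

// sign_extend <2 x i16> to <2 x i32>, using the xor/subtract idiom so the
// arithmetic stays in unsigned types.
constexpr std::uint64_t sext_v2i16_to_v2i32(std::uint32_t v) {
  const std::uint32_t e0 = ((v & 0xFFFF) ^ 0x8000u) - 0x8000u;
  const std::uint32_t e1 = ((v >> 16) ^ 0x8000u) - 0x8000u;
  return pack_v2i32(e0, e1);
}

// trunc <2 x i32> to <2 x i16>: keep the low 16 bits of each lane.
constexpr std::uint32_t trunc_v2i32_to_v2i16(std::uint64_t v) {
  const std::uint32_t e0 = static_cast<std::uint32_t>(v) & 0xFFFF;
  const std::uint32_t e1 = static_cast<std::uint32_t>(v >> 32) & 0xFFFF;
  return (e1 << 16) | e0;
}
```

Each helper touches the two lanes independently, which is the shape of code an `Expand` action is expected to fall back to when no packed instruction exists.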

llvmbot added the backend:NVPTX label Oct 2, 2025

Artem-B force-pushed the trunc-v2i32 branch from e0395ee to 23da2a5 Compare October 2, 2025 18:20

Artem-B changed the title ~~[NVPTX] expand trunk v2i32->v2i16~~ [NVPTX] expand trunc v2i32->v2i16 Oct 2, 2025

Artem-B requested a review from Prince781 October 2, 2025 18:21

Prince781 requested changes Oct 2, 2025

Artem-B changed the title ~~[NVPTX] expand trunc v2i32->v2i16~~ [NVPTX] expand trunc/ext on v2i32 Oct 2, 2025

Prince781 approved these changes Oct 2, 2025

Artem-B force-pushed the trunc-v2i32 branch from f52e708 to 49f2fd8 Compare October 2, 2025 19:34

Artem-B added 3 commits October 2, 2025 12:36

[NVPTX] expand trunc v2i32->v2i16

0c20751

llvm#153478 made v2i32 legal on newer GPUs, but we cannot lower all operations yet. Expand the `trunc` operation until we implement efficient lowering.

added a test case

488ee01

expand *ext operations, too.

0953fa1

Artem-B force-pushed the trunc-v2i32 branch from 49f2fd8 to 0953fa1 Compare October 2, 2025 19:37

Artem-B merged commit a2b6602 into llvm:main Oct 2, 2025
7 of 8 checks passed

Artem-B mentioned this pull request Oct 7, 2025

[NVPTX] properly expand operations that we do not support on v2i32 #162391

Merged
