[AMDGPU] Shrink to SOPK with 32-bit signed literals #70263

rampitec · 2023-10-25T22:07:53Z

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK instruction, but at the moment our checks expect it to be fully sign extended to a 64-bit signed integer. This is not required since all cases which are being shrunk only accept 32-bit operands.

We need to sign extend the operand to 64-bit though so it passes the verifier and properly printed.

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK instruction, but at the moment our checks expect it to be fully sign extended to a 64-bit signed integer. This is not required since all cases which are being shrunk only accept 32-bit operands. We need to sign extend the operand to 64-bit though so it passes the verifier and properly printed.

llvmbot · 2023-10-25T22:09:06Z

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK instruction, but at the moment our checks expect it to be fully sign extended to a 64-bit signed integer. This is not required since all cases which are being shrunk only accept 32-bit operands.

We need to sign extend the operand to 64-bit though so it passes the verifier and properly printed.

Full diff: https://github.com/llvm/llvm-project/pull/70263.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp (+10-5)
(added) llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir (+57)

diff --git a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
index 88c75a0f86a6c13..856121be78031d0 100644
--- a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+++ b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
@@ -159,7 +159,7 @@ bool SIShrinkInstructions::shouldShrinkTrue16(MachineInstr &MI) const {
 }
 
 bool SIShrinkInstructions::isKImmOperand(const MachineOperand &Src) const {
-  return isInt<16>(Src.getImm()) &&
+  return isInt<16>(SignExtend64(Src.getImm(), 32)) &&
          !TII->isInlineConstant(*Src.getParent(), Src.getOperandNo());
 }
 
@@ -170,7 +170,7 @@ bool SIShrinkInstructions::isKUImmOperand(const MachineOperand &Src) const {
 
 bool SIShrinkInstructions::isKImmOrKUImmOperand(const MachineOperand &Src,
                                                 bool &IsUnsigned) const {
-  if (isInt<16>(Src.getImm())) {
+  if (isInt<16>(SignExtend64(Src.getImm(), 32))) {
     IsUnsigned = false;
     return !TII->isInlineConstant(Src);
   }
@@ -221,7 +221,7 @@ void SIShrinkInstructions::shrinkScalarCompare(MachineInstr &MI) const {
   if (!Src0.isReg())
     return;
 
-  const MachineOperand &Src1 = MI.getOperand(1);
+  MachineOperand &Src1 = MI.getOperand(1);
   if (!Src1.isImm())
     return;
 
@@ -237,6 +237,7 @@ void SIShrinkInstructions::shrinkScalarCompare(MachineInstr &MI) const {
       if (!HasUImm) {
         SOPKOpc = (SOPKOpc == AMDGPU::S_CMPK_EQ_U32) ?
           AMDGPU::S_CMPK_EQ_I32 : AMDGPU::S_CMPK_LG_I32;
+        Src1.setImm(SignExtend32(Src1.getImm(), 32));
       }
 
       MI.setDesc(TII->get(SOPKOpc));
@@ -249,6 +250,8 @@ void SIShrinkInstructions::shrinkScalarCompare(MachineInstr &MI) const {
 
   if ((TII->sopkIsZext(SOPKOpc) && isKUImmOperand(Src1)) ||
       (!TII->sopkIsZext(SOPKOpc) && isKImmOperand(Src1))) {
+    if (!TII->sopkIsZext(SOPKOpc))
+      Src1.setImm(SignExtend64(Src1.getImm(), 32));
     MI.setDesc(NewDesc);
   }
 }
@@ -838,6 +841,7 @@ bool SIShrinkInstructions::runOnMachineFunction(MachineFunction &MF) {
             unsigned Opc = (MI.getOpcode() == AMDGPU::S_ADD_I32) ?
               AMDGPU::S_ADDK_I32 : AMDGPU::S_MULK_I32;
 
+            Src1->setImm(SignExtend64(Src1->getImm(), 32));
             MI.setDesc(TII->get(Opc));
             MI.tieOperands(0, 1);
           }
@@ -857,9 +861,10 @@ bool SIShrinkInstructions::runOnMachineFunction(MachineFunction &MF) {
 
         if (Src.isImm() && Dst.getReg().isPhysical()) {
           int32_t ReverseImm;
-          if (isKImmOperand(Src))
+          if (isKImmOperand(Src)) {
             MI.setDesc(TII->get(AMDGPU::S_MOVK_I32));
-          else if (isReverseInlineImm(Src, ReverseImm)) {
+            Src.setImm(SignExtend64(Src.getImm(), 32));
+          } else if (isReverseInlineImm(Src, ReverseImm)) {
             MI.setDesc(TII->get(AMDGPU::S_BREV_B32));
             Src.setImm(ReverseImm);
           }
diff --git a/llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir b/llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir
new file mode 100644
index 000000000000000..e2198faf13f71c8
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir
@@ -0,0 +1,57 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -run-pass=si-shrink-instructions -o - %s | FileCheck -check-prefix=GCN %s
+
+---
+name:            shrink_kimm32_mov_b32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_mov_b32
+    ; GCN: $sgpr0 = S_MOVK_I32 -2048
+    $sgpr0 = S_MOV_B32 4294965248
+...
+
+---
+name:            shrink_kimm32_cmp_eq_u32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_cmp_eq_u32
+    ; GCN: S_CMPK_EQ_I32 undef $sgpr0, -2048, implicit-def $scc
+    S_CMP_EQ_U32 undef $sgpr0, 4294965248, implicit-def $scc
+...
+
+---
+name:            shrink_kimm32_cmp_gt_i32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_cmp_gt_i32
+    ; GCN: S_CMPK_GT_I32 undef $sgpr0, -2048, implicit-def $scc
+    S_CMP_GT_I32 undef $sgpr0, 4294965248, implicit-def $scc
+...
+
+---
+name:            shrink_kimm32_add_i32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_add_i32
+    ; GCN: $sgpr0 = S_ADDK_I32 undef $sgpr0, -2048, implicit-def $scc
+    $sgpr0 = S_ADD_I32 undef $sgpr0, 4294965248, implicit-def $scc
+...
+
+---
+name:            shrink_kimm32_mul_i32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_mul_i32
+    ; GCN: $sgpr0 = S_MULK_I32 undef $sgpr0, -2048, implicit-def $scc
+    $sgpr0 = S_MUL_I32 undef $sgpr0, 4294965248, implicit-def $scc
+...

jayfoad

I'm a bit surprised it doesn't affect any codegen tests.

Anyway makes sense to me since the src operand of S_MOVK_I32 is signed but the src operand of S_MOV_B32 is not (B32 is neither signed nor unsigned, but I guess we generally treat it like unsigned).

jayfoad · 2023-10-26T07:17:44Z

llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp

@@ -159,7 +159,7 @@ bool SIShrinkInstructions::shouldShrinkTrue16(MachineInstr &MI) const {
 }

 bool SIShrinkInstructions::isKImmOperand(const MachineOperand &Src) const {
-  return isInt<16>(Src.getImm()) &&
+  return isInt<16>(SignExtend64(Src.getImm(), 32)) &&


Could write this as isInt<16>((int32_t)Src.getImm()) but it's a matter of taste I guess.

Could write this as isInt<16>((int32_t)Src.getImm()) but it's a matter of taste I guess.

Same thing in principle.

rampitec · 2023-10-26T07:24:55Z

I'm a bit surprised it doesn't affect any codegen tests.

Anyway makes sense to me since the src operand of S_MOVK_I32 is signed but the src operand of S_MOV_B32 is not (B32 is neither signed nor unsigned, but I guess we generally treat it like unsigned).

I was surprised too, but this is all because we do not select 64-bit moves. All of that is a step to select these.

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK instruction, but at the moment our checks expect it to be fully sign extended to a 64-bit signed integer. This is not required since all cases which are being shrunk only accept 32-bit operands. We need to sign extend the operand to 64-bit though so it passes the verifier and properly printed.

llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir

rampitec requested review from jayfoad and Sisyph October 25, 2023 22:07

llvmbot added the backend:AMDGPU label Oct 25, 2023

jayfoad approved these changes Oct 26, 2023

View reviewed changes

rampitec merged commit 4602802 into llvm:main Oct 26, 2023
4 checks passed

rampitec deleted the shrink-kimm-32 branch October 26, 2023 07:27

DadSchoorse reviewed Oct 27, 2023

View reviewed changes

llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMDGPU] Shrink to SOPK with 32-bit signed literals #70263

[AMDGPU] Shrink to SOPK with 32-bit signed literals #70263

rampitec commented Oct 25, 2023

llvmbot commented Oct 25, 2023

jayfoad left a comment

jayfoad Oct 26, 2023

rampitec Oct 26, 2023

rampitec commented Oct 26, 2023

[AMDGPU] Shrink to SOPK with 32-bit signed literals #70263

[AMDGPU] Shrink to SOPK with 32-bit signed literals #70263

Conversation

rampitec commented Oct 25, 2023

llvmbot commented Oct 25, 2023

jayfoad left a comment

Choose a reason for hiding this comment

jayfoad Oct 26, 2023

Choose a reason for hiding this comment

rampitec Oct 26, 2023

Choose a reason for hiding this comment

rampitec commented Oct 26, 2023