Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Shrink to SOPK with 32-bit signed literals #70263

Merged
merged 1 commit into from
Oct 26, 2023

Conversation

rampitec
Copy link
Collaborator

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK instruction, but at the moment our checks expect it to be fully sign extended to a 64-bit signed integer. This is not required since all cases which are being shrunk only accept 32-bit operands.

We need to sign extend the operand to 64-bit though so it passes the verifier and properly printed.

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK
instruction, but at the moment our checks expect it to be fully
sign extended to a 64-bit signed integer. This is not required
since all cases which are being shrunk only accept 32-bit operands.

We need to sign extend the operand to 64-bit though so it passes
the verifier and properly printed.
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 25, 2023

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

A literal like 0xffff8000 is valid to be used as KIMM in a SOPK instruction, but at the moment our checks expect it to be fully sign extended to a 64-bit signed integer. This is not required since all cases which are being shrunk only accept 32-bit operands.

We need to sign extend the operand to 64-bit though so it passes the verifier and properly printed.


Full diff: https://github.com/llvm/llvm-project/pull/70263.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp (+10-5)
  • (added) llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir (+57)
diff --git a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
index 88c75a0f86a6c13..856121be78031d0 100644
--- a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+++ b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
@@ -159,7 +159,7 @@ bool SIShrinkInstructions::shouldShrinkTrue16(MachineInstr &MI) const {
 }
 
 bool SIShrinkInstructions::isKImmOperand(const MachineOperand &Src) const {
-  return isInt<16>(Src.getImm()) &&
+  return isInt<16>(SignExtend64(Src.getImm(), 32)) &&
          !TII->isInlineConstant(*Src.getParent(), Src.getOperandNo());
 }
 
@@ -170,7 +170,7 @@ bool SIShrinkInstructions::isKUImmOperand(const MachineOperand &Src) const {
 
 bool SIShrinkInstructions::isKImmOrKUImmOperand(const MachineOperand &Src,
                                                 bool &IsUnsigned) const {
-  if (isInt<16>(Src.getImm())) {
+  if (isInt<16>(SignExtend64(Src.getImm(), 32))) {
     IsUnsigned = false;
     return !TII->isInlineConstant(Src);
   }
@@ -221,7 +221,7 @@ void SIShrinkInstructions::shrinkScalarCompare(MachineInstr &MI) const {
   if (!Src0.isReg())
     return;
 
-  const MachineOperand &Src1 = MI.getOperand(1);
+  MachineOperand &Src1 = MI.getOperand(1);
   if (!Src1.isImm())
     return;
 
@@ -237,6 +237,7 @@ void SIShrinkInstructions::shrinkScalarCompare(MachineInstr &MI) const {
       if (!HasUImm) {
         SOPKOpc = (SOPKOpc == AMDGPU::S_CMPK_EQ_U32) ?
           AMDGPU::S_CMPK_EQ_I32 : AMDGPU::S_CMPK_LG_I32;
+        Src1.setImm(SignExtend32(Src1.getImm(), 32));
       }
 
       MI.setDesc(TII->get(SOPKOpc));
@@ -249,6 +250,8 @@ void SIShrinkInstructions::shrinkScalarCompare(MachineInstr &MI) const {
 
   if ((TII->sopkIsZext(SOPKOpc) && isKUImmOperand(Src1)) ||
       (!TII->sopkIsZext(SOPKOpc) && isKImmOperand(Src1))) {
+    if (!TII->sopkIsZext(SOPKOpc))
+      Src1.setImm(SignExtend64(Src1.getImm(), 32));
     MI.setDesc(NewDesc);
   }
 }
@@ -838,6 +841,7 @@ bool SIShrinkInstructions::runOnMachineFunction(MachineFunction &MF) {
             unsigned Opc = (MI.getOpcode() == AMDGPU::S_ADD_I32) ?
               AMDGPU::S_ADDK_I32 : AMDGPU::S_MULK_I32;
 
+            Src1->setImm(SignExtend64(Src1->getImm(), 32));
             MI.setDesc(TII->get(Opc));
             MI.tieOperands(0, 1);
           }
@@ -857,9 +861,10 @@ bool SIShrinkInstructions::runOnMachineFunction(MachineFunction &MF) {
 
         if (Src.isImm() && Dst.getReg().isPhysical()) {
           int32_t ReverseImm;
-          if (isKImmOperand(Src))
+          if (isKImmOperand(Src)) {
             MI.setDesc(TII->get(AMDGPU::S_MOVK_I32));
-          else if (isReverseInlineImm(Src, ReverseImm)) {
+            Src.setImm(SignExtend64(Src.getImm(), 32));
+          } else if (isReverseInlineImm(Src, ReverseImm)) {
             MI.setDesc(TII->get(AMDGPU::S_BREV_B32));
             Src.setImm(ReverseImm);
           }
diff --git a/llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir b/llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir
new file mode 100644
index 000000000000000..e2198faf13f71c8
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/shrink-i32-kimm.mir
@@ -0,0 +1,57 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -run-pass=si-shrink-instructions -o - %s | FileCheck -check-prefix=GCN %s
+
+---
+name:            shrink_kimm32_mov_b32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_mov_b32
+    ; GCN: $sgpr0 = S_MOVK_I32 -2048
+    $sgpr0 = S_MOV_B32 4294965248
+...
+
+---
+name:            shrink_kimm32_cmp_eq_u32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_cmp_eq_u32
+    ; GCN: S_CMPK_EQ_I32 undef $sgpr0, -2048, implicit-def $scc
+    S_CMP_EQ_U32 undef $sgpr0, 4294965248, implicit-def $scc
+...
+
+---
+name:            shrink_kimm32_cmp_gt_i32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_cmp_gt_i32
+    ; GCN: S_CMPK_GT_I32 undef $sgpr0, -2048, implicit-def $scc
+    S_CMP_GT_I32 undef $sgpr0, 4294965248, implicit-def $scc
+...
+
+---
+name:            shrink_kimm32_add_i32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_add_i32
+    ; GCN: $sgpr0 = S_ADDK_I32 undef $sgpr0, -2048, implicit-def $scc
+    $sgpr0 = S_ADD_I32 undef $sgpr0, 4294965248, implicit-def $scc
+...
+
+---
+name:            shrink_kimm32_mul_i32
+tracksRegLiveness: true
+body:             |
+  bb.0:
+
+    ; GCN-LABEL: name: shrink_kimm32_mul_i32
+    ; GCN: $sgpr0 = S_MULK_I32 undef $sgpr0, -2048, implicit-def $scc
+    $sgpr0 = S_MUL_I32 undef $sgpr0, 4294965248, implicit-def $scc
+...

Copy link
Contributor

@jayfoad jayfoad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit surprised it doesn't affect any codegen tests.

Anyway makes sense to me since the src operand of S_MOVK_I32 is signed but the src operand of S_MOV_B32 is not (B32 is neither signed nor unsigned, but I guess we generally treat it like unsigned).

@@ -159,7 +159,7 @@ bool SIShrinkInstructions::shouldShrinkTrue16(MachineInstr &MI) const {
}

bool SIShrinkInstructions::isKImmOperand(const MachineOperand &Src) const {
return isInt<16>(Src.getImm()) &&
return isInt<16>(SignExtend64(Src.getImm(), 32)) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could write this as isInt<16>((int32_t)Src.getImm()) but it's a matter of taste I guess.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could write this as isInt<16>((int32_t)Src.getImm()) but it's a matter of taste I guess.

Same thing in principle.

@rampitec
Copy link
Collaborator Author

I'm a bit surprised it doesn't affect any codegen tests.

Anyway makes sense to me since the src operand of S_MOVK_I32 is signed but the src operand of S_MOV_B32 is not (B32 is neither signed nor unsigned, but I guess we generally treat it like unsigned).

I was surprised too, but this is all because we do not select 64-bit moves. All of that is a step to select these.

@rampitec rampitec merged commit 4602802 into llvm:main Oct 26, 2023
4 checks passed
@rampitec rampitec deleted the shrink-kimm-32 branch October 26, 2023 07:27
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Oct 26, 2023
A literal like 0xffff8000 is valid to be used as KIMM in a SOPK
instruction, but at the moment our checks expect it to be fully sign
extended to a 64-bit signed integer. This is not required since all
cases which are being shrunk only accept 32-bit operands.

We need to sign extend the operand to 64-bit though so it passes the
verifier and properly printed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants