
[AMDGPU] Prevent folding of the negative i32 literals as i64 #70274

Merged: 5 commits into llvm:main, Oct 30, 2023

Conversation

rampitec (Collaborator)

We can use sign-extended 64-bit literals, but only for signed operands. At the moment we do not know whether an operand is signed, so such an operand will be encoded as its low 32 bits and then either correctly sign-extended or incorrectly zero-extended by the hardware.

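As a quick standalone illustration (not part of the patch; all names below are mine), the sketch shows why only a signed operand can safely take such a literal: the hardware sees only the low 32 bits, and the final 64-bit value depends on how it widens them.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
  // The immediate used in the MIR test further down: 0xffffffff80000000.
  int64_t Imm = -2147483648LL;

  // Only the low 32 bits of the literal reach the hardware.
  uint32_t Lo = static_cast<uint32_t>(Imm); // 0x80000000

  // A signed operand would be sign-extended back to the original value...
  int64_t SignExt = static_cast<int32_t>(Lo);
  // ...while an unsigned operand is zero-extended to a different value.
  int64_t ZeroExt = Lo;

  std::printf("original:      0x%016" PRIx64 "\n", static_cast<uint64_t>(Imm));     // ffffffff80000000
  std::printf("sign-extended: 0x%016" PRIx64 "\n", static_cast<uint64_t>(SignExt)); // ffffffff80000000
  std::printf("zero-extended: 0x%016" PRIx64 "\n", static_cast<uint64_t>(ZeroExt)); // 0000000080000000
  return 0;
}
```

Since the operand's signedness is unknown, the patch conservatively assumes the zero-extending case and refuses to fold any negative 32-bit value into a 64-bit integer operand.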
@llvmbot (Collaborator) commented Oct 26, 2023

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

We can use sign-extended 64-bit literals, but only for signed operands. At the moment we do not know whether an operand is signed, so such an operand will be encoded as its low 32 bits and then either correctly sign-extended or incorrectly zero-extended by the hardware.


Full diff: https://github.com/llvm/llvm-project/pull/70274.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+9)
  • (added) llvm/test/CodeGen/AMDGPU/folding-of-i32-as-i64.mir (+20)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 827c2c156638468..355805e053f38df 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -5500,6 +5500,15 @@ bool SIInstrInfo::isOperandLegal(const MachineInstr &MI, unsigned OpIdx,
     if (Is64BitOp && !AMDGPU::isValid32BitLiteral(Imm, Is64BitFPOp) &&
         !AMDGPU::isInlinableLiteral64(Imm, ST.hasInv2PiInlineImm()))
       return false;
+
+    // FIXME: We can use sign extended 64-bit literals, but only for signed
+    //        operands. At the moment we do not know if an operand is signed.
+    //        Such operand will be encoded as its low 32 bits and then either
+    //        correctly sign extended or incorrectly zero extended by HW.
+    if (Is64BitOp && !Is64BitFPOp && isInt<32>(Imm) &&
+        (int32_t)Lo_32(Imm) < 0 &&
+        !AMDGPU::isInlinableLiteral64(Imm, ST.hasInv2PiInlineImm()))
+      return false;
   }
 
   // Handle non-register types that are treated like immediates.
diff --git a/llvm/test/CodeGen/AMDGPU/folding-of-i32-as-i64.mir b/llvm/test/CodeGen/AMDGPU/folding-of-i32-as-i64.mir
new file mode 100644
index 000000000000000..7cfa67d86fbd94e
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/folding-of-i32-as-i64.mir
@@ -0,0 +1,20 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -run-pass=si-fold-operands -o - %s | FileCheck -check-prefix=GCN %s
+
+# The constant is 0xffffffff80000000. It is 64-bit negative constant, but it passes the test
+# isInt<32>(). Nonetheless it is not a legal literal for a binary or unsigned operand and
+# cannot be used right in the shift as HW will zero extend it.
+
+---
+name:            imm64_shift_int32_const
+body: |
+  bb.0:
+    ; GCN-LABEL: name: imm64_shift_int32_const
+    ; GCN: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -2147483648
+    ; GCN-NEXT: [[S_LSHL_B64_:%[0-9]+]]:sreg_64 = S_LSHL_B64 [[S_MOV_B]], 1, implicit-def $scc
+    ; GCN-NEXT: S_ENDPGM 0, implicit [[S_LSHL_B64_]]
+    %0:sreg_64 = S_MOV_B64_IMM_PSEUDO 18446744071562067968
+    %1:sreg_64 = S_LSHL_B64 %0, 1, implicit-def $scc
+    S_ENDPGM 0, implicit %1
+
+...

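For reference, below is a minimal standalone sketch of the added legality condition. The function name is hypothetical, the LLVM helpers isInt<32> and Lo_32 are spelled out in plain C++, and the escape hatch for 64-bit inline constants (AMDGPU::isInlinableLiteral64) is only noted in a comment.

```cpp
#include <cstdint>

// Hypothetical standalone mirror of the condition added to
// SIInstrInfo::isOperandLegal(): a negative immediate that fits in 32 bits
// must not feed a 64-bit integer (non-FP) operand, since the hardware may
// zero-extend the low 32 bits instead of sign-extending them.
// The real code additionally lets values accepted by
// AMDGPU::isInlinableLiteral64 through.
bool foldWouldMisencode(int64_t Imm, bool Is64BitOp, bool Is64BitFPOp) {
  bool FitsInSigned32 = Imm >= INT32_MIN && Imm <= INT32_MAX;  // isInt<32>(Imm)
  auto Lo = static_cast<int32_t>(static_cast<uint32_t>(Imm));  // (int32_t)Lo_32(Imm)
  return Is64BitOp && !Is64BitFPOp && FitsInSigned32 && Lo < 0;
}
```

With the test's immediate 0xffffffff80000000 this returns true, so the fold is rejected and the S_MOV_B64_IMM_PSEUDO in the MIR test above is kept rather than folded into the S_LSHL_B64.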
@rampitec (Collaborator, Author)

In the long term, I think we need to replace the INT64 operands with SINT64 and UINT64. That said, there are no MVT::s64 and MVT::u64 types to use in an instruction profile.

@arsenm (Contributor) commented Oct 26, 2023

> In the long term, I think we need to replace the INT64 operands with SINT64 and UINT64. That said, there are no MVT::s64 and MVT::u64 types to use in an instruction profile.

You would have to bundle that somehow in the profile.

@rampitec (Collaborator, Author)

> > In the long term, I think we need to replace the INT64 operands with SINT64 and UINT64. That said, there are no MVT::s64 and MVT::u64 types to use in an instruction profile.
>
> You would have to bundle that somehow in the profile.

Right, a VT to use in the profile is also missing.

rampitec requested a review from arsenm on October 26, 2023 at 18:41
@jayfoad (Contributor) left a comment

LGTM if Matt is happy with the tests.

Review comment on llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (outdated, resolved)
rampitec merged commit ee6d62d into llvm:main on Oct 30, 2023
3 checks passed
rampitec deleted the folding-of-i32-as-i64 branch on October 30, 2023 at 15:07