[AMDGPU] Change the representation of double literals in operands #68740

rampitec · 2023-10-10T20:12:53Z

A 64-bit literal can be used as a 32-bit zero or sign extended operand. In case of double zeroes are added to the low 32 bits. Currently asm parser stores only high 32 bits of a double into an operand. To support codegen as requested by the #67781 we need to change the representation to store a full 64-bit value so that codegen can simply add immediates to an instruction.

There is some code to support compatibility with existing tests and asm kernels. We allow to use short hex strings to represent only a high 32 bit of a double value as a valid literal.

A 64-bit literal can be used as a 32-bit zero or sign extended operand. In case of double zeroes are added to the low 32 bits. Currently asm parser stores only high 32 bits of a double into an operand. To support codegen as requested by the llvm#67781 we need to change the representation to store a full 64-bit value so that codegen can simply add immediates to an instruction. There is some code to support compatibility with existing tests and asm kernels. We allow to use short hex strings to represent only a high 32 bit of a double value as a valid literal.

llvmbot · 2023-10-10T20:14:05Z

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

A 64-bit literal can be used as a 32-bit zero or sign extended operand. In case of double zeroes are added to the low 32 bits. Currently asm parser stores only high 32 bits of a double into an operand. To support codegen as requested by the #67781 we need to change the representation to store a full 64-bit value so that codegen can simply add immediates to an instruction.

There is some code to support compatibility with existing tests and asm kernels. We allow to use short hex strings to represent only a high 32 bit of a double value as a valid literal.

Full diff: https://github.com/llvm/llvm-project/pull/68740.diff

9 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+18-3)
(modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+21-7)
(modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h (+6-3)
(modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+8-4)
(modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+1-1)
(modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCCodeEmitter.cpp (+3)
(modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+3-1)
(modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+7)
(modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+3)

diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 35656bcaea1af7f..0553d3f20b21c56 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -2140,9 +2140,10 @@ void AMDGPUOperand::addLiteralImmOperand(MCInst &Inst, int64_t Val, bool ApplyMo
           const_cast<AMDGPUAsmParser *>(AsmParser)->Warning(Inst.getLoc(),
           "Can't encode literal as exact 64-bit floating-point operand. "
           "Low 32-bits will be set to zero");
+          Val &= 0xffffffff00000000u;
         }
 
-        Inst.addOperand(MCOperand::createImm(Literal.lshr(32).getZExtValue()));
+        Inst.addOperand(MCOperand::createImm(Val));
         setImmKindLiteral();
         return;
       }
@@ -2241,7 +2242,10 @@ void AMDGPUOperand::addLiteralImmOperand(MCInst &Inst, int64_t Val, bool ApplyMo
       return;
     }
 
-    Inst.addOperand(MCOperand::createImm(Lo_32(Val)));
+    if (isInt<32>(Val) || isUInt<32>(Val))
+      Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);
+
+    Inst.addOperand(MCOperand::createImm(Val));
     setImmKindLiteral();
     return;
 
@@ -4297,7 +4301,18 @@ bool AMDGPUAsmParser::validateVOPLiteral(const MCInst &Inst,
       continue;
 
     if (MO.isImm() && !isInlineConstant(Inst, OpIdx)) {
-      uint32_t Value = static_cast<uint32_t>(MO.getImm());
+      uint64_t Value = static_cast<uint64_t>(MO.getImm());
+      bool IsFP = AMDGPU::isSISrcFPOperand(Desc, OpIdx);
+      bool IsValid32Op = AMDGPU::isValid32BitLiteral(Value, IsFP);
+
+      if (!IsValid32Op && !isInt<32>(Value) && !isUInt<32>(Value)) {
+        Error(getLitLoc(Operands), "invalid operand for instruction");
+        return false;
+      }
+
+      if (IsFP && IsValid32Op)
+        Value = Hi_32(Value);
+
       if (NumLiterals == 0 || LiteralValue != Value) {
         LiteralValue = Value;
         ++NumLiterals;
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index 439762bc6caf786..8c49c9a9c87772e 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -378,6 +378,15 @@ static DecodeStatus decodeOperand_AVLdSt_Any(MCInst &Inst, unsigned Imm,
   return addOperand(Inst, DAsm->decodeSrcOp(Opw, Imm | 256));
 }
 
+static DecodeStatus
+decodeOperand_VSrc_f64(MCInst &Inst, unsigned Imm, uint64_t Addr,
+                       const MCDisassembler *Decoder) {
+  assert(Imm < (1 << 9) && "9-bit encoding");
+  auto DAsm = static_cast<const AMDGPUDisassembler *>(Decoder);
+  return addOperand(Inst, DAsm->decodeSrcOp(AMDGPUDisassembler::OPW64, Imm,
+                                            false, 64, true));
+}
+
 static DecodeStatus
 DecodeAVLdSt_32RegisterClass(MCInst &Inst, unsigned Imm, uint64_t Addr,
                              const MCDisassembler *Decoder) {
@@ -1218,7 +1227,7 @@ AMDGPUDisassembler::decodeMandatoryLiteralConstant(unsigned Val) const {
   return MCOperand::createImm(Literal);
 }
 
-MCOperand AMDGPUDisassembler::decodeLiteralConstant() const {
+MCOperand AMDGPUDisassembler::decodeLiteralConstant(bool ExtendFP64) const {
   // For now all literal constants are supposed to be unsigned integer
   // ToDo: deal with signed/unsigned 64-bit integer constants
   // ToDo: deal with float/double constants
@@ -1228,9 +1237,11 @@ MCOperand AMDGPUDisassembler::decodeLiteralConstant() const {
                         Twine(Bytes.size()));
     }
     HasLiteral = true;
-    Literal = eatBytes<uint32_t>(Bytes);
+    Literal = Literal64 = eatBytes<uint32_t>(Bytes);
+    if (ExtendFP64)
+      Literal64 <<= 32;
   }
-  return MCOperand::createImm(Literal);
+  return MCOperand::createImm(ExtendFP64 ? Literal64 : Literal);
 }
 
 MCOperand AMDGPUDisassembler::decodeIntImmed(unsigned Imm) {
@@ -1447,7 +1458,8 @@ int AMDGPUDisassembler::getTTmpIdx(unsigned Val) const {
 
 MCOperand AMDGPUDisassembler::decodeSrcOp(const OpWidthTy Width, unsigned Val,
                                           bool MandatoryLiteral,
-                                          unsigned ImmWidth) const {
+                                          unsigned ImmWidth,
+                                          bool IsFP) const {
   using namespace AMDGPU::EncValues;
 
   assert(Val < 1024); // enum10
@@ -1459,13 +1471,15 @@ MCOperand AMDGPUDisassembler::decodeSrcOp(const OpWidthTy Width, unsigned Val,
     return createRegOperand(IsAGPR ? getAgprClassId(Width)
                                    : getVgprClassId(Width), Val - VGPR_MIN);
   }
-  return decodeNonVGPRSrcOp(Width, Val & 0xFF, MandatoryLiteral, ImmWidth);
+  return decodeNonVGPRSrcOp(Width, Val & 0xFF, MandatoryLiteral, ImmWidth,
+                            IsFP);
 }
 
 MCOperand AMDGPUDisassembler::decodeNonVGPRSrcOp(const OpWidthTy Width,
                                                  unsigned Val,
                                                  bool MandatoryLiteral,
-                                                 unsigned ImmWidth) const {
+                                                 unsigned ImmWidth,
+                                                 bool IsFP) const {
   // Cases when Val{8} is 1 (vgpr, agpr or true 16 vgpr) should have been
   // decoded earlier.
   assert(Val < (1 << 8) && "9-bit Src encoding when Val{8} is 0");
@@ -1493,7 +1507,7 @@ MCOperand AMDGPUDisassembler::decodeNonVGPRSrcOp(const OpWidthTy Width,
       // Keep a sentinel value for deferred setting
       return MCOperand::createImm(LITERAL_CONST);
     else
-      return decodeLiteralConstant();
+      return decodeLiteralConstant(IsFP && ImmWidth == 64);
   }
 
   switch (Width) {
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
index 5f3b277d577ff7c..865db2b26307b43 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
@@ -97,6 +97,7 @@ class AMDGPUDisassembler : public MCDisassembler {
   const unsigned TargetMaxInstBytes;
   mutable ArrayRef<uint8_t> Bytes;
   mutable uint32_t Literal;
+  mutable uint64_t Literal64;
   mutable bool HasLiteral;
   mutable std::optional<bool> EnableWavefrontSize32;
 
@@ -229,15 +230,17 @@ class AMDGPUDisassembler : public MCDisassembler {
   static MCOperand decodeFPImmed(unsigned ImmWidth, unsigned Imm);
 
   MCOperand decodeMandatoryLiteralConstant(unsigned Imm) const;
-  MCOperand decodeLiteralConstant() const;
+  MCOperand decodeLiteralConstant(bool ExtendFP64) const;
 
   MCOperand decodeSrcOp(const OpWidthTy Width, unsigned Val,
                         bool MandatoryLiteral = false,
-                        unsigned ImmWidth = 0) const;
+                        unsigned ImmWidth = 0,
+                        bool IsFP = false) const;
 
   MCOperand decodeNonVGPRSrcOp(const OpWidthTy Width, unsigned Val,
                                bool MandatoryLiteral = false,
-                               unsigned ImmWidth = 0) const;
+                               unsigned ImmWidth = 0,
+                               bool IsFP = false) const;
 
   MCOperand decodeVOPDDstYOp(MCInst &Inst, unsigned Val) const;
   MCOperand decodeSpecialReg32(unsigned Val) const;
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index ad4c48a8d65581a..40e92f00a9e52a6 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -426,7 +426,7 @@ void AMDGPUInstPrinter::printImmediate32(uint32_t Imm,
 
 void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
                                          const MCSubtargetInfo &STI,
-                                         raw_ostream &O) {
+                                         raw_ostream &O, bool IsFP) {
   int64_t SImm = static_cast<int64_t>(Imm);
   if (SImm >= -16 && SImm <= 64) {
     O << SImm;
@@ -454,6 +454,8 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
   else if (Imm == 0x3fc45f306dc9c882 &&
            STI.hasFeature(AMDGPU::FeatureInv2PiInlineImm))
     O << "0.15915494309189532";
+  else if (IsFP && AMDGPU::isValid32BitLiteral(Imm, true))
+    O << formatHex(static_cast<uint64_t>(Hi_32(Imm)));
   else {
     assert(isUInt<32>(Imm) || isInt<32>(Imm));
 
@@ -605,11 +607,13 @@ void AMDGPUInstPrinter::printRegularOperand(const MCInst *MI, unsigned OpNo,
       printImmediate32(Op.getImm(), STI, O);
       break;
     case AMDGPU::OPERAND_REG_IMM_INT64:
-    case AMDGPU::OPERAND_REG_IMM_FP64:
     case AMDGPU::OPERAND_REG_INLINE_C_INT64:
+      printImmediate64(Op.getImm(), STI, O, false);
+      break;
+    case AMDGPU::OPERAND_REG_IMM_FP64:
     case AMDGPU::OPERAND_REG_INLINE_C_FP64:
     case AMDGPU::OPERAND_REG_INLINE_AC_FP64:
-      printImmediate64(Op.getImm(), STI, O);
+      printImmediate64(Op.getImm(), STI, O, true);
       break;
     case AMDGPU::OPERAND_REG_INLINE_C_INT16:
     case AMDGPU::OPERAND_REG_INLINE_AC_INT16:
@@ -671,7 +675,7 @@ void AMDGPUInstPrinter::printRegularOperand(const MCInst *MI, unsigned OpNo,
       if (RCBits == 32)
         printImmediate32(llvm::bit_cast<uint32_t>((float)Value), STI, O);
       else if (RCBits == 64)
-        printImmediate64(llvm::bit_cast<uint64_t>(Value), STI, O);
+        printImmediate64(llvm::bit_cast<uint64_t>(Value), STI, O, true);
       else
         llvm_unreachable("Invalid register class size");
     }
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
index 3b14faab136b35a..dc83547a4afe049 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
@@ -91,7 +91,7 @@ class AMDGPUInstPrinter : public MCInstPrinter {
   void printImmediate32(uint32_t Imm, const MCSubtargetInfo &STI,
                         raw_ostream &O);
   void printImmediate64(uint64_t Imm, const MCSubtargetInfo &STI,
-                        raw_ostream &O);
+                        raw_ostream &O, bool IsFP);
   void printOperand(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
                     raw_ostream &O);
   void printRegularOperand(const MCInst *MI, unsigned OpNo,
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCCodeEmitter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCCodeEmitter.cpp
index 57ccb523c70eee6..d93f747bf6f0a64 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCCodeEmitter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCCodeEmitter.cpp
@@ -411,6 +411,9 @@ void AMDGPUMCCodeEmitter::encodeInstruction(const MCInst &MI,
     } else if (!Op.isExpr()) // Exprs will be replaced with a fixup value.
       llvm_unreachable("Must be immediate or expr");
 
+    if (Desc.operands()[i].OperandType == AMDGPU::OPERAND_REG_IMM_FP64)
+      Imm = Hi_32(Imm);
+
     support::endian::write<uint32_t>(CB, Imm, support::endianness::little);
 
     // Only one literal value allowed
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
index c3c5bfae405aa45..ea06e85fb400c1b 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
@@ -1263,7 +1263,9 @@ def VSrc_f32 : RegOrF32 <"VS_32", "OPERAND_REG_IMM">;
 def VSrc_v2b16 : RegOrV2B16 <"VS_32", "OPERAND_REG_IMM">;
 def VSrc_v2f16 : RegOrV2F16 <"VS_32", "OPERAND_REG_IMM">;
 def VSrc_b64 : RegOrB64 <"VS_64", "OPERAND_REG_IMM">;
-def VSrc_f64 : RegOrF64 <"VS_64", "OPERAND_REG_IMM">;
+def VSrc_f64 : RegOrF64 <"VS_64", "OPERAND_REG_IMM"> {
+  let DecoderMethod = "decodeOperand_VSrc_f64";
+}
 def VSrc_v2b32 : RegOrV2B32 <"VS_64", "OPERAND_REG_IMM">;
 def VSrc_v2f32 : RegOrV2F32 <"VS_64", "OPERAND_REG_IMM">;
 
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index 6d0ad763d9e6cc1..e7907b28abedf9d 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -2519,6 +2519,13 @@ bool isFoldableLiteralV216(int32_t Literal, bool HasInv2Pi) {
   return Lo16 == Hi16;
 }
 
+bool isValid32BitLiteral(uint64_t Val, bool IsFP) {
+  if (IsFP)
+    return !(Val & 0xffffffffu);
+
+  return isUInt<32>(Val) || isInt<32>(Val);
+}
+
 bool isArgPassedInSGPR(const Argument *A) {
   const Function *F = A->getParent();
 
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
index 297a69f54d63721..fbe9adfd74fa9c6 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
@@ -1290,6 +1290,9 @@ bool isInlinableIntLiteralV216(int32_t Literal);
 LLVM_READNONE
 bool isFoldableLiteralV216(int32_t Literal, bool HasInv2Pi);
 
+LLVM_READNONE
+bool isValid32BitLiteral(uint64_t Val, bool IsFP);
+
 bool isArgPassedInSGPR(const Argument *Arg);
 
 bool isArgPassedInSGPR(const CallBase *CB, unsigned ArgNo);

github-actions · 2023-10-10T20:14:43Z

✅ With the latest revision this PR passed the C/C++ code formatter.

jayfoad · 2023-10-11T09:07:09Z

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp

+    if (isInt<32>(Val) || isUInt<32>(Val))
+      Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);


This doesn't fully make sense to me. Ignoring the FP case, it simplifies to:

if (isInt<32>(Val) || isUInt<32>(Val)) Val = Lo_32(Val);

Why would we only do the Lo_32 if Val is already a (signed or unsigned) 32 bit value???

To mask potential 0xffffffff in the high32 bits.

Aren't there three interesting cases?

if (!AMDGPU::isSISrcFPOperand(InstDesc, OpNum)) // integer operand Val = Lo_32(Val); else if (isInt<32>(Val) || isUInt<32>(Val))) // small hex literal for double operand Val = Val << 32; else // large hex literal for double operand do nothing;

Lo_32 for integers is used because inst printer would just use O << formatHex(static_cast<uint64_t>(Imm)); and since we have full 64-bits there the output will look like this if the value is unmasked:

llvm-mc -arch=amdgcn -mcpu=tonga -show-encoding <<< 's_mov_b64_e32 s[0:1], -54321' s_mov_b64 s[0:1], 0xffffffffffff2bcf ; encoding: [0xff,0x01,0x80,0xbe,0xcf,0x2b,0xff,0xff]

This is technically correct but inconvenient and not what we are expecting in our tests too:

// SICI: s_mov_b64 s[0:1], 0xffff2bcf ; encoding: [0xff,0x04,0x80,0xbe,0xcf,0x2b,0xff,0xff] // GFX89: s_mov_b64 s[0:1], 0xffff2bcf ; encoding: [0xff,0x01,0x80,0xbe,0xcf,0x2b,0xff,0xff] s_mov_b64_e32 s[0:1], -54321

For the large FP literals we do the validation:

llvm-mc -arch=amdgcn -mcpu=gfx1030 -show-encoding <<< 'v_add_f64 v[0:1], 100.1, v[0:1]' <unknown>:0: warning: Can't encode literal as exact 64-bit floating-point operand. Low 32-bits will be set to zero v_add_f64 v[0:1], 0x40590666, v[0:1] ; encoding: [0x00,0x00,0x64,0xd5,0xff,0x00,0x02,0x00,0x66,0x06,0x59,0x40]

But we need to truncate the the value as in the warning. If we don't there will be an error following the warning and this would change current behavior:

<unknown>:0: warning: Can't encode literal as exact 64-bit floating-point operand. Low 32-bits will be set to zero <stdin>:1:19: error: invalid operand for instruction v_add_f64 v[0:1], 100.1, v[0:1]

This is for the FP literal. For the hex string the situation just does not happen and we are getting the error instead because the constant does not pass isLiteralImm() check:

llvm-mc -arch=amdgcn -mcpu=gfx1030 -show-encoding <<< 'v_add_f64 v[0:1], 0xa12345678, v[0:1]' <stdin>:1:19: error: invalid operand for instruction v_add_f64 v[0:1], 0xa12345678, v[0:1] ^

Note that I have no intent to introduce any visible changes, just to change the internal operand representation and keep all existing tests passing.

I still find the Int/UInt check confusing here. How about:

Suggested change

if (isInt<32>(Val) || isUInt<32>(Val))

Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);

if (AMDGPU::isSISrcFPOperand(InstDesc, OpNum))

Val <<= 32;

else

Val = Lo_32(Val);

change

I literally hate gitbub. I cannot even get a clue what file it is.

I still find the Int/UInt check confusing here. How about:

Done.

jayfoad · 2023-10-11T09:10:12Z

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp

@@ -4297,7 +4301,18 @@ bool AMDGPUAsmParser::validateVOPLiteral(const MCInst &Inst,
      continue;

    if (MO.isImm() && !isInlineConstant(Inst, OpIdx)) {
-      uint32_t Value = static_cast<uint32_t>(MO.getImm());
+      uint64_t Value = static_cast<uint64_t>(MO.getImm());
+      bool IsFP = AMDGPU::isSISrcFPOperand(Desc, OpIdx);


I don't fully understand this either. Are you trying to allow cases where the same 32-bit literal is used both as the high 32 bits of a double, and as a normal 32 bit integer, in the same instruction?

Yes, this is allowed. Encoded literal is 32-bit and can be used as both integer and double. For example in the gfx11_asm_vop3.s:

v_trig_preop_f64 v[254:255], 0xaf123456, 0xaf123456 clamp div:2 // GFX11: encoding: [0xfe,0x80,0x2f,0xd7,0xff,0xfe,0x01,0x18,0x56,0x34,0x12,0xaf]

Without this piece of code it would give error:

llvm-mc -arch=amdgcn -mcpu=gfx1100 -show-encoding <<< 'v_trig_preop_f64 v[254:255], 0xaf123456, 0xaf123456 clamp div:2' <stdin>:1:42: error: only one unique literal operand is allowed v_trig_preop_f64 v[254:255], 0xaf123456, 0xaf123456 clamp div:2 ^

jayfoad · 2023-10-11T09:17:32Z

we need to change the representation to store a full 64-bit value

I think this is clearly a Good Thing. There are just some details of the implementation that I don't understand.

A 64-bit literal can be used as a 32-bit zero or sign extended operand. In case of double zeroes are added to the low 32 bits. Currently asm parser stores only high 32 bits of a double into an operand. To support codegen as requested by the llvm#67781 we need to change the representation to store a full 64-bit value so that codegen can simply add immediates to an instruction. There is some code to support compatibility with existing tests and asm kernels. We allow to use short hex strings to represent only a high 32 bit of a double value as a valid literal.

…lit64-mc-enc

rampitec · 2023-10-12T07:54:39Z

Sorry for the noise, rebase and merge seems to add everyone committed in between to the reviewers.

…lit64-mc-enc

jayfoad · 2023-10-12T09:16:28Z

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp

@@ -454,6 +454,8 @@ void AMDGPUInstPrinter::printImmediate64(uint64_t Imm,
  else if (Imm == 0x3fc45f306dc9c882 &&
           STI.hasFeature(AMDGPU::FeatureInv2PiInlineImm))
    O << "0.15915494309189532";
+  else if (IsFP && AMDGPU::isValid32BitLiteral(Imm, true))


Can this be:

Suggested change

else if (IsFP && AMDGPU::isValid32BitLiteral(Imm, true))

else if (IsFP) {

assert(AMDGPU::isValid32BitLiteral(Imm, true));

Yes, it can, given that lo32 are cleared by the parser. Changed.

…lit64-mc-enc

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp

jayfoad

LGTM modulo one remaining comment about validateVOPLiteral.

A 64-bit literal can be used as a 32-bit zero or sign extended operand. In case of double zeroes are added to the low 32 bits. Currently asm parser stores only high 32 bits of a double into an operand. To support codegen as requested by the llvm#67781 we need to change the representation to store a full 64-bit value so that codegen can simply add immediates to an instruction. There is some code to support compatibility with existing tests and asm kernels. We allow to use short hex strings to represent only a high 32 bit of a double value as a valid literal.

…lit64-mc-enc

rampitec · 2023-10-12T19:27:45Z

LGTM modulo one remaining comment about validateVOPLiteral.

Done.

mikaelholmen · 2023-10-13T06:17:09Z

Hi @rampitec

With UBSan built binaries the
MC/AMDGPU/literals.s
testcase fails and triggers UB like

07:33:04 ../lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:2246:59: runtime error: left shift of negative value -54321
07:33:04 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:2246:59

It's this new shift that it complains on:

    Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);

rampitec · 2023-10-13T06:31:47Z

Hi @rampitec

With UBSan built binaries the MC/AMDGPU/literals.s testcase fails and triggers UB like

07:33:04 ../lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:2246:59: runtime error: left shift of negative value -54321
07:33:04 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:2246:59

It's this new shift that it complains on:

    Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);

I am happy to fix it, but I do not understand how a LEFT shift can be a UB? Low bits will be zero, right? High bits will be disposed. I really do not understand.

rampitec · 2023-10-13T06:41:39Z

Hi @rampitec
With UBSan built binaries the MC/AMDGPU/literals.s testcase fails and triggers UB like
07:33:04 ../lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:2246:59: runtime error: left shift of negative value -54321
07:33:04 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:2246:59 
It's this new shift that it complains on:
    Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);
I am happy to fix it, but I do not understand how a LEFT shift can be a UB? Low bits will be zero, right? High bits will be disposed. I really do not understand.

Building ubsan now...

rampitec · 2023-10-13T06:49:16Z

I have a small little problem that I cannot build tip now:

FAILED: tools/llvm-remarkutil/CMakeFiles/llvm-remarkutil.dir/RemarkCounter.cpp.o
CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/stas/work/llvm/ubsan/tools/llvm-remarkutil -I/home/stas/work/llvm/llvm/tools/llvm-remarkutil -I/home/stas/work/llvm/ubsan/include -I/home/stas/work/llvm/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g  -fno-exceptions -funwind-tables -fno-rtti -std=c++17 -MD -MT tools/llvm-remarkutil/CMakeFiles/llvm-remarkutil.dir/RemarkCounter.cpp.o -MF tools/llvm-remarkutil/CMakeFiles/llvm-remarkutil.dir/RemarkCounter.cpp.o.d -o tools/llvm-remarkutil/CMakeFiles/llvm-remarkutil.dir/RemarkCounter.cpp.o -c /home/stas/work/llvm/llvm/tools/llvm-remarkutil/RemarkCounter.cpp
In file included from /home/stas/work/llvm/llvm/tools/llvm-remarkutil/RemarkCounter.cpp:13:
/home/stas/work/llvm/llvm/tools/llvm-remarkutil/RemarkCounter.h:90:14: error: call to deleted constructor of 'llvm::Error'
      return E;
             ^

mikaelholmen · 2023-10-13T06:49:29Z

I suppose left shift of negative values is undefined because if you shift out the sign bit you can overflow and get a positive value.

rampitec · 2023-10-13T06:51:58Z

I suppose left shift of negative values is undefined because if you shift out the sign bit you can overflow and get a positive value.

Sounds like BS. It is defined. Unexpected maybe.

rampitec · 2023-10-13T07:14:37Z

I suppose left shift of negative values is undefined because if you shift out the sign bit you can overflow and get a positive value.

#68959

rampitec requested a review from jayfoad October 10, 2023 20:12

llvmbot added the backend:AMDGPU label Oct 10, 2023

jayfoad reviewed Oct 11, 2023

View reviewed changes

rampitec added 2 commits October 12, 2023 00:44

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

c6cef06

…lit64-mc-enc

rampitec requested review from JDevlieghere, nikic and a team as code owners October 12, 2023 07:51

Merge branch 'llvm:main' into lit64-mc-enc

5974041

nikic removed request for a team and nikic October 12, 2023 07:58

rampitec removed the request for review from JDevlieghere October 12, 2023 08:00

rampitec added 2 commits October 12, 2023 01:16

[AMDGPU] Make clang-format happy with disasm cganges

6e7dffa

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

b430ae7

…lit64-mc-enc

jayfoad reviewed Oct 12, 2023

View reviewed changes

rampitec and others added 5 commits October 12, 2023 02:25

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

a84e1bc

…lit64-mc-enc

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

cf4d55e

…lit64-mc-enc

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

32ba230

…lit64-mc-enc

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

f6bd002

…lit64-mc-enc

Merge branch 'llvm:main' into lit64-mc-enc

4e64a2f

jayfoad reviewed Oct 12, 2023

View reviewed changes

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp Outdated Show resolved Hide resolved

jayfoad approved these changes Oct 12, 2023

View reviewed changes

rampitec added 4 commits October 12, 2023 03:53

[AMDGPU] Make clang-format happy with disasm cganges

0bb5d06

Change argument of AMDGPU::isValid32BitLiteral to IsFP64

e2e1efa

Merge branch 'lit64-mc-enc' of github.com:rampitec/llvm-project into …

95f2f2b

…lit64-mc-enc

rampitec added 2 commits October 12, 2023 12:28

Merge branch 'llvm:main' into lit64-mc-enc

68331b9

Merge branch 'llvm:main' into lit64-mc-enc

fca5229

rampitec merged commit ab6c3d5 into llvm:main Oct 12, 2023
3 checks passed

rampitec deleted the lit64-mc-enc branch October 12, 2023 21:47

madhur13490 mentioned this pull request Oct 20, 2023

Revert commit ba8565fbcb975e2d067ce3ae5a7dbaae4953edd3 madhur13490/llvm-project#3

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMDGPU] Change the representation of double literals in operands #68740

[AMDGPU] Change the representation of double literals in operands #68740

rampitec commented Oct 10, 2023

llvmbot commented Oct 10, 2023

github-actions bot commented Oct 10, 2023 •

edited

Loading

jayfoad Oct 11, 2023

rampitec Oct 11, 2023

jayfoad Oct 11, 2023

rampitec Oct 11, 2023

jayfoad Oct 12, 2023

rampitec Oct 12, 2023

rampitec Oct 12, 2023

jayfoad Oct 11, 2023

rampitec Oct 11, 2023

jayfoad commented Oct 11, 2023

rampitec commented Oct 12, 2023

jayfoad Oct 12, 2023

rampitec Oct 12, 2023

jayfoad left a comment

rampitec commented Oct 12, 2023

mikaelholmen commented Oct 13, 2023

rampitec commented Oct 13, 2023

rampitec commented Oct 13, 2023

rampitec commented Oct 13, 2023

mikaelholmen commented Oct 13, 2023

rampitec commented Oct 13, 2023

rampitec commented Oct 13, 2023

		if (isInt<32>(Val) \|\| isUInt<32>(Val))
		Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);

-    if (isInt<32>(Val) || isUInt<32>(Val))
-      Val = AMDGPU::isSISrcFPOperand(InstDesc, OpNum) ? Val << 32 : Lo_32(Val);
+    if (AMDGPU::isSISrcFPOperand(InstDesc, OpNum))
+      Val <<= 32;
+    else
+      Val = Lo_32(Val);

	else if (IsFP && AMDGPU::isValid32BitLiteral(Imm, true))
	else if (IsFP) {
	assert(AMDGPU::isValid32BitLiteral(Imm, true));

[AMDGPU] Change the representation of double literals in operands #68740

[AMDGPU] Change the representation of double literals in operands #68740

Conversation

rampitec commented Oct 10, 2023

llvmbot commented Oct 10, 2023

github-actions bot commented Oct 10, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jayfoad commented Oct 11, 2023

rampitec commented Oct 12, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jayfoad left a comment

Choose a reason for hiding this comment

rampitec commented Oct 12, 2023

mikaelholmen commented Oct 13, 2023

rampitec commented Oct 13, 2023

rampitec commented Oct 13, 2023

rampitec commented Oct 13, 2023

mikaelholmen commented Oct 13, 2023

rampitec commented Oct 13, 2023

rampitec commented Oct 13, 2023

github-actions bot commented Oct 10, 2023 •

edited

Loading