Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Add an asm directive to track code_object_version #76267

Merged
merged 1 commit into from
Jan 21, 2024

Conversation

epilk
Copy link
Member

@epilk epilk commented Dec 22, 2023

Named '.amdgcn_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.

--amdhsa-code-object-version should be removed in the future, but for the moment it is still necessary to control whether AMDGPUDisassembler should print ".amdhsa_uses_dynamic_stack" or not. I'm planning on updating that to check against the ELF header in a follow-up, after that we should be able to remove the flag (and getDefaultCodeObjectVersion()).

Thanks for taking a look!

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 22, 2023

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-support

Author: Emma Pilkington (epilk)

Changes

Named '.amdgcn_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.

--amdhsa-code-object-version should be removed in the future, but for the moment it is still necessary to control whether AMDGPUDisassembler should print ".amdhsa_uses_dynamic_stack" or not. I'm planning on updating that to check against the ELF header in a follow-up, after that we should be able to remove the flag (and getDefaultCodeObjectVersion()).

Thanks for taking a look!


Patch is 68.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76267.diff

37 Files Affected:

  • (modified) llvm/docs/AMDGPUUsage.rst (+8)
  • (modified) llvm/include/llvm/MC/MCAssembler.h (+6)
  • (modified) llvm/include/llvm/MC/MCELFObjectWriter.h (+1-3)
  • (modified) llvm/include/llvm/Support/AMDGPUMetadata.h (+2-8)
  • (modified) llvm/lib/MC/ELFObjectWriter.cpp (+1-1)
  • (modified) llvm/lib/MC/MCELFObjectTargetWriter.cpp (+2-3)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp (+5-4)
  • (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+13-78)
  • (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp (+4-8)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp (+5-9)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp (+9-124)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h (+16-43)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+34-53)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+12-19)
  • (modified) llvm/test/CodeGen/AMDGPU/code-object-v3.ll (+2)
  • (modified) llvm/test/CodeGen/AMDGPU/codegen-internal-only-func.ll (+2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-any.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-not-supported.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-off.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-on.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-off-1.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-off-2.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-on-1.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-on-2.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-any.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-not-supported.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-off.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-on.ll (+2-2)
  • (added) llvm/test/MC/AMDGPU/elf-header-cov.s (+31)
  • (modified) llvm/test/MC/AMDGPU/hsa-diag-v4.s (+7-5)
  • (modified) llvm/test/MC/AMDGPU/hsa-exp.s (+5-2)
  • (modified) llvm/test/MC/AMDGPU/hsa-gfx12-v4.s (+5-2)
  • (modified) llvm/test/MC/AMDGPU/hsa-v4.s (+5-2)
  • (modified) llvm/test/MC/AMDGPU/hsa-v5-uses-dynamic-stack.s (+5-2)
  • (removed) llvm/test/MC/AMDGPU/hsa_isa_version_attrs.s (-6)
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index f0c81bf878f7ac..438659b94eb31d 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -15379,6 +15379,14 @@ command-line options such as ``-triple``, ``-mcpu``, and
   The target ID syntax used for code object V2 to V3 for this directive differs
   from that used elsewhere. See :ref:`amdgpu-target-id-v2-v3`.
 
+.. _amdgpu-assembler-directive-amdgcn-code-object-version:
+
+.amdgcn_code_object_version <version>
++++++++++++++++++++++++++++++++++++++
+
+Optional directive which declares the code object version to be generated by the
+assembler. If not present, a default value will be used.
+
 .amdhsa_kernel <name>
 +++++++++++++++++++++
 
diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h
index 5ae5f6d7093858..a70dafe0284f9e 100644
--- a/llvm/include/llvm/MC/MCAssembler.h
+++ b/llvm/include/llvm/MC/MCAssembler.h
@@ -161,6 +161,8 @@ class MCAssembler {
   // which flags to be set.
   unsigned ELFHeaderEFlags;
 
+  unsigned char ELFHeaderABIVersion = 0;
+
   /// Used to communicate Linker Optimization Hint information between
   /// the Streamer and the .o writer
   MCLOHContainer LOHContainer;
@@ -279,6 +281,10 @@ class MCAssembler {
   unsigned getELFHeaderEFlags() const { return ELFHeaderEFlags; }
   void setELFHeaderEFlags(unsigned Flags) { ELFHeaderEFlags = Flags; }
 
+  /// ELF e_ident[EI_ABIVERSION] value
+  unsigned char getELFHeaderABIVersion() const { return ELFHeaderABIVersion; }
+  void setELFHeaderABIVersion(unsigned char V) { ELFHeaderABIVersion = V; }
+
   /// MachO deployment target version information.
   const VersionInfoType &getVersionInfo() const { return VersionInfo; }
   void setVersionMin(MCVersionMinType Type, unsigned Major, unsigned Minor,
diff --git a/llvm/include/llvm/MC/MCELFObjectWriter.h b/llvm/include/llvm/MC/MCELFObjectWriter.h
index d7c223cdcc07f8..d561a334a7e277 100644
--- a/llvm/include/llvm/MC/MCELFObjectWriter.h
+++ b/llvm/include/llvm/MC/MCELFObjectWriter.h
@@ -52,14 +52,13 @@ struct ELFRelocationEntry {
 
 class MCELFObjectTargetWriter : public MCObjectTargetWriter {
   const uint8_t OSABI;
-  const uint8_t ABIVersion;
   const uint16_t EMachine;
   const unsigned HasRelocationAddend : 1;
   const unsigned Is64Bit : 1;
 
 protected:
   MCELFObjectTargetWriter(bool Is64Bit_, uint8_t OSABI_, uint16_t EMachine_,
-                          bool HasRelocationAddend_, uint8_t ABIVersion_ = 0);
+                          bool HasRelocationAddend_);
 
 public:
   virtual ~MCELFObjectTargetWriter() = default;
@@ -97,7 +96,6 @@ class MCELFObjectTargetWriter : public MCObjectTargetWriter {
   /// \name Accessors
   /// @{
   uint8_t getOSABI() const { return OSABI; }
-  uint8_t getABIVersion() const { return ABIVersion; }
   uint16_t getEMachine() const { return EMachine; }
   bool hasRelocationAddend() const { return HasRelocationAddend; }
   bool is64Bit() const { return Is64Bit; }
diff --git a/llvm/include/llvm/Support/AMDGPUMetadata.h b/llvm/include/llvm/Support/AMDGPUMetadata.h
index e0838a1f425ea5..2dae6feac0889a 100644
--- a/llvm/include/llvm/Support/AMDGPUMetadata.h
+++ b/llvm/include/llvm/Support/AMDGPUMetadata.h
@@ -29,11 +29,6 @@ namespace AMDGPU {
 //===----------------------------------------------------------------------===//
 namespace HSAMD {
 
-/// HSA metadata major version for code object V2.
-constexpr uint32_t VersionMajorV2 = 1;
-/// HSA metadata minor version for code object V2.
-constexpr uint32_t VersionMinorV2 = 0;
-
 /// HSA metadata major version for code object V3.
 constexpr uint32_t VersionMajorV3 = 1;
 /// HSA metadata minor version for code object V3.
@@ -49,10 +44,9 @@ constexpr uint32_t VersionMajorV5 = 1;
 /// HSA metadata minor version for code object V5.
 constexpr uint32_t VersionMinorV5 = 2;
 
-/// HSA metadata beginning assembler directive.
+/// Old HSA metadata beginning assembler directive for V2. This is only used for
+/// diagnostics now.
 constexpr char AssemblerDirectiveBegin[] = ".amd_amdgpu_hsa_metadata";
-/// HSA metadata ending assembler directive.
-constexpr char AssemblerDirectiveEnd[] = ".end_amd_amdgpu_hsa_metadata";
 
 /// Access qualifiers.
 enum class AccessQualifier : uint8_t {
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index cb8af1aa99551a..e6d370a82e93ce 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -417,7 +417,7 @@ void ELFWriter::writeHeader(const MCAssembler &Asm) {
                    ? int(ELF::ELFOSABI_GNU)
                    : OSABI);
   // e_ident[EI_ABIVERSION]
-  W.OS << char(OWriter.TargetObjectWriter->getABIVersion());
+  W.OS << char(Asm.getELFHeaderABIVersion());
 
   W.OS.write_zeros(ELF::EI_NIDENT - ELF::EI_PAD);
 
diff --git a/llvm/lib/MC/MCELFObjectTargetWriter.cpp b/llvm/lib/MC/MCELFObjectTargetWriter.cpp
index c35e1f26dc1efa..ff8de8e12c33a3 100644
--- a/llvm/lib/MC/MCELFObjectTargetWriter.cpp
+++ b/llvm/lib/MC/MCELFObjectTargetWriter.cpp
@@ -12,9 +12,8 @@ using namespace llvm;
 
 MCELFObjectTargetWriter::MCELFObjectTargetWriter(bool Is64Bit_, uint8_t OSABI_,
                                                  uint16_t EMachine_,
-                                                 bool HasRelocationAddend_,
-                                                 uint8_t ABIVersion_)
-    : OSABI(OSABI_), ABIVersion(ABIVersion_), EMachine(EMachine_),
+                                                 bool HasRelocationAddend_)
+    : OSABI(OSABI_), EMachine(EMachine_),
       HasRelocationAddend(HasRelocationAddend_), Is64Bit(Is64Bit_) {}
 
 bool MCELFObjectTargetWriter::needsRelocateWithSymbol(const MCValue &,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index d317a733d4331c..0411a6e6f8828f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -117,6 +117,8 @@ void AMDGPUAsmPrinter::initTargetStreamer(Module &M) {
   if (getTargetStreamer() && !getTargetStreamer()->getTargetID())
     initializeTargetID(M);
 
+  getTargetStreamer()->EmitDirectiveAMDGCNCodeObjectVersion(CodeObjectVersion);
+
   if (TM.getTargetTriple().getOS() != Triple::AMDHSA &&
       TM.getTargetTriple().getOS() != Triple::AMDPAL)
     return;
@@ -230,8 +232,7 @@ void AMDGPUAsmPrinter::emitFunctionBodyEnd() {
           IsaInfo::getNumExtraSGPRs(
               &STM, CurrentProgramInfo.VCCUsed, CurrentProgramInfo.FlatUsed,
               getTargetStreamer()->getTargetID()->isXnackOnOrAny()),
-      CurrentProgramInfo.VCCUsed, CurrentProgramInfo.FlatUsed,
-      CodeObjectVersion);
+      CurrentProgramInfo.VCCUsed, CurrentProgramInfo.FlatUsed);
 
   Streamer.popSection();
 }
@@ -631,8 +632,8 @@ bool AMDGPUAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
 void AMDGPUAsmPrinter::initializeTargetID(const Module &M) {
   // In the beginning all features are either 'Any' or 'NotSupported',
   // depending on global target features. This will cover empty modules.
-  getTargetStreamer()->initializeTargetID(
-      *getGlobalSTI(), getGlobalSTI()->getFeatureString(), CodeObjectVersion);
+  getTargetStreamer()->initializeTargetID(*getGlobalSTI(),
+                                          getGlobalSTI()->getFeatureString());
 
   // If module is empty, we are done.
   if (M.empty())
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 3b69a37728ea1c..50059665ff1627 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -1295,10 +1295,8 @@ class AMDGPUAsmParser : public MCTargetAsmParser {
                           unsigned NextFreeSGPR, SMRange SGPRRange,
                           unsigned &VGPRBlocks, unsigned &SGPRBlocks);
   bool ParseDirectiveAMDGCNTarget();
+  bool ParseDirectiveAMDGCNCodeObjectVersion();
   bool ParseDirectiveAMDHSAKernel();
-  bool ParseDirectiveMajorMinor(uint32_t &Major, uint32_t &Minor);
-  bool ParseDirectiveHSACodeObjectVersion();
-  bool ParseDirectiveHSACodeObjectISA();
   bool ParseAMDKernelCodeTValue(StringRef ID, amd_kernel_code_t &Header);
   bool ParseDirectiveAMDKernelCodeT();
   // TODO: Possibly make subtargetHasRegister const.
@@ -5057,20 +5055,6 @@ bool AMDGPUAsmParser::ParseAsAbsoluteExpression(uint32_t &Ret) {
   return false;
 }
 
-bool AMDGPUAsmParser::ParseDirectiveMajorMinor(uint32_t &Major,
-                                               uint32_t &Minor) {
-  if (ParseAsAbsoluteExpression(Major))
-    return TokError("invalid major version");
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("minor version number required, comma expected");
-
-  if (ParseAsAbsoluteExpression(Minor))
-    return TokError("invalid minor version");
-
-  return false;
-}
-
 bool AMDGPUAsmParser::ParseDirectiveAMDGCNTarget() {
   if (getSTI().getTargetTriple().getArch() != Triple::amdgcn)
     return TokError("directive only supported for amdgcn architecture");
@@ -5535,63 +5519,18 @@ bool AMDGPUAsmParser::ParseDirectiveAMDHSAKernel() {
     }
   }
 
-  getTargetStreamer().EmitAmdhsaKernelDescriptor(
-      getSTI(), KernelName, KD, NextFreeVGPR, NextFreeSGPR, ReserveVCC,
-      ReserveFlatScr, AMDGPU::getAmdhsaCodeObjectVersion());
-  return false;
-}
-
-bool AMDGPUAsmParser::ParseDirectiveHSACodeObjectVersion() {
-  uint32_t Major;
-  uint32_t Minor;
-
-  if (ParseDirectiveMajorMinor(Major, Minor))
-    return true;
-
-  getTargetStreamer().EmitDirectiveHSACodeObjectVersion(Major, Minor);
+  getTargetStreamer().EmitAmdhsaKernelDescriptor(getSTI(), KernelName, KD,
+                                                 NextFreeVGPR, NextFreeSGPR,
+                                                 ReserveVCC, ReserveFlatScr);
   return false;
 }
 
-bool AMDGPUAsmParser::ParseDirectiveHSACodeObjectISA() {
-  uint32_t Major;
-  uint32_t Minor;
-  uint32_t Stepping;
-  StringRef VendorName;
-  StringRef ArchName;
-
-  // If this directive has no arguments, then use the ISA version for the
-  // targeted GPU.
-  if (isToken(AsmToken::EndOfStatement)) {
-    AMDGPU::IsaVersion ISA = AMDGPU::getIsaVersion(getSTI().getCPU());
-    getTargetStreamer().EmitDirectiveHSACodeObjectISAV2(ISA.Major, ISA.Minor,
-                                                        ISA.Stepping,
-                                                        "AMD", "AMDGPU");
-    return false;
-  }
-
-  if (ParseDirectiveMajorMinor(Major, Minor))
-    return true;
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("stepping version number required, comma expected");
-
-  if (ParseAsAbsoluteExpression(Stepping))
-    return TokError("invalid stepping version");
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("vendor name required, comma expected");
-
-  if (!parseString(VendorName, "invalid vendor name"))
-    return true;
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("arch name required, comma expected");
-
-  if (!parseString(ArchName, "invalid arch name"))
+bool AMDGPUAsmParser::ParseDirectiveAMDGCNCodeObjectVersion() {
+  uint32_t Version;
+  if (ParseAsAbsoluteExpression(Version))
     return true;
 
-  getTargetStreamer().EmitDirectiveHSACodeObjectISAV2(Major, Minor, Stepping,
-                                                      VendorName, ArchName);
+  getTargetStreamer().EmitDirectiveAMDGCNCodeObjectVersion(Version);
   return false;
 }
 
@@ -5882,12 +5821,6 @@ bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
     if (IDVal == AMDGPU::HSAMD::V3::AssemblerDirectiveBegin)
       return ParseDirectiveHSAMetadata();
   } else {
-    if (IDVal == ".hsa_code_object_version")
-      return ParseDirectiveHSACodeObjectVersion();
-
-    if (IDVal == ".hsa_code_object_isa")
-      return ParseDirectiveHSACodeObjectISA();
-
     if (IDVal == ".amd_kernel_code_t")
       return ParseDirectiveAMDKernelCodeT();
 
@@ -5917,6 +5850,9 @@ bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
   if (IDVal == PALMD::AssemblerDirective)
     return ParseDirectivePALMetadata();
 
+  if (IDVal == ".amdgcn_code_object_version")
+    return ParseDirectiveAMDGCNCodeObjectVersion();
+
   return true;
 }
 
@@ -8060,9 +7996,8 @@ void AMDGPUAsmParser::onBeginOfFile() {
     return;
 
   if (!getTargetStreamer().getTargetID())
-    getTargetStreamer().initializeTargetID(getSTI(), getSTI().getFeatureString(),
-        // TODO: Should try to check code object version from directive???
-        AMDGPU::getAmdhsaCodeObjectVersion());
+    getTargetStreamer().initializeTargetID(getSTI(),
+                                           getSTI().getFeatureString());
 
   if (isHsaAbi(getSTI()))
     getTargetStreamer().EmitDirectiveAMDGCNTarget();
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index 7939d0036568d4..8f41c43b181ae7 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -2149,7 +2149,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
                       KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32);
     }
 
-    if (AMDGPU::getAmdhsaCodeObjectVersion() >= AMDGPU::AMDHSA_COV5)
+    if (AMDGPU::getDefaultCodeObjectVersion() >= AMDGPU::AMDHSA_COV5)
       PRINT_DIRECTIVE(".amdhsa_uses_dynamic_stack",
                       KERNEL_CODE_PROPERTY_USES_DYNAMIC_STACK);
 
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
index f91f36ed851b7f..76610bef48df07 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
@@ -232,13 +232,11 @@ class ELFAMDGPUAsmBackend : public AMDGPUAsmBackend {
   bool Is64Bit;
   bool HasRelocationAddend;
   uint8_t OSABI = ELF::ELFOSABI_NONE;
-  uint8_t ABIVersion = 0;
 
 public:
-  ELFAMDGPUAsmBackend(const Target &T, const Triple &TT, uint8_t ABIVersion) :
+  ELFAMDGPUAsmBackend(const Target &T, const Triple &TT) :
       AMDGPUAsmBackend(T), Is64Bit(TT.getArch() == Triple::amdgcn),
-      HasRelocationAddend(TT.getOS() == Triple::AMDHSA),
-      ABIVersion(ABIVersion) {
+      HasRelocationAddend(TT.getOS() == Triple::AMDHSA) {
     switch (TT.getOS()) {
     case Triple::AMDHSA:
       OSABI = ELF::ELFOSABI_AMDGPU_HSA;
@@ -256,8 +254,7 @@ class ELFAMDGPUAsmBackend : public AMDGPUAsmBackend {
 
   std::unique_ptr<MCObjectTargetWriter>
   createObjectTargetWriter() const override {
-    return createAMDGPUELFObjectWriter(Is64Bit, OSABI, HasRelocationAddend,
-                                       ABIVersion);
+    return createAMDGPUELFObjectWriter(Is64Bit, OSABI, HasRelocationAddend);
   }
 };
 
@@ -267,6 +264,5 @@ MCAsmBackend *llvm::createAMDGPUAsmBackend(const Target &T,
                                            const MCSubtargetInfo &STI,
                                            const MCRegisterInfo &MRI,
                                            const MCTargetOptions &Options) {
-  return new ELFAMDGPUAsmBackend(T, STI.getTargetTriple(),
-                                 getHsaAbiVersion(&STI).value_or(0));
+  return new ELFAMDGPUAsmBackend(T, STI.getTargetTriple());
 }
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
index 58eed81e075560..c05a53120a8090 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
@@ -18,8 +18,7 @@ namespace {
 
 class AMDGPUELFObjectWriter : public MCELFObjectTargetWriter {
 public:
-  AMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI, bool HasRelocationAddend,
-                        uint8_t ABIVersion);
+  AMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI, bool HasRelocationAddend);
 
 protected:
   unsigned getRelocType(MCContext &Ctx, const MCValue &Target,
@@ -31,10 +30,9 @@ class AMDGPUELFObjectWriter : public MCELFObjectTargetWriter {
 
 AMDGPUELFObjectWriter::AMDGPUELFObjectWriter(bool Is64Bit,
                                              uint8_t OSABI,
-                                             bool HasRelocationAddend,
-                                             uint8_t ABIVersion)
+                                             bool HasRelocationAddend)
   : MCELFObjectTargetWriter(Is64Bit, OSABI, ELF::EM_AMDGPU,
-                            HasRelocationAddend, ABIVersion) {}
+                            HasRelocationAddend) {}
 
 unsigned AMDGPUELFObjectWriter::getRelocType(MCContext &Ctx,
                                              const MCValue &Target,
@@ -100,9 +98,7 @@ unsigned AMDGPUELFObjectWriter::getRelocType(MCContext &Ctx,
 
 std::unique_ptr<MCObjectTargetWriter>
 llvm::createAMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI,
-                                  bool HasRelocationAddend,
-                                  uint8_t ABIVersion) {
+                                  bool HasRelocationAddend) {
   return std::make_unique<AMDGPUELFObjectWriter>(Is64Bit, OSABI,
-                                                  HasRelocationAddend,
-                                                  ABIVersion);
+                                                 HasRelocationAddend);
 }
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
index 006115ba14fc1c..3ef00f75735b0d 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
@@ -42,8 +42,8 @@ MCAsmBackend *createAMDGPUAsmBackend(const Target &T,
 
 std::unique_ptr<MCObjectTargetWriter>
 createAMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI,
-                            bool HasRelocationAddend, uint8_t ABIVersion);
-} // End llvm namespace
+                            bool HasRelocationAddend);
+} // namespace llvm
 
 #define GET_REGINFO_ENUM
 #include "AMDGPUGenRegisterInfo.inc"
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
index a855cf585205bc..6829aef91e459c 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
@@ -35,27 +35,6 @@ using namespace llvm::AMDGPU;
 // AMDGPUTargetStreamer
 //===----------------------------------------------------------------------===//
 
-static void convertIsaVersionV2(uint32_t &Major, uint32_t &Minor,
-                                uint32_t &Stepping, bool Sramecc, bool Xnack) {
-  if (Major == 9 && Minor == 0) {
-    switch (Stepping) {
-      case 0:
-      case 2:
-      case 4:
-      case 6:
-        if (Xnack)
-          Stepping++;
-    }
-  }
-}
-
-bool AMDGPUTargetStreamer::EmitHSAMetadataV2(StringRef HSAMetadataString) {
-  HSAMD::Metadata HSAMetadata;
-  if (HSAMD::fromString(HSAMetadataString, HSAMetadata))
-    return false;
-  return EmitHSAMetadata(HSAMetadata);
-}
-
 bool AMDGPUTargetStreamer::EmitHSAMetadataV3(StringRef HSAMetadataString) {
   msgpack::Document HSAMetadataDoc;
   if (!HSAMetadataDoc.fromYAML(HSAMetadataString))
@@ -238,21 +217,10 @@ void AMDGPUTargetAsmStreamer::EmitDirectiveAMDGCNTarget() {
   OS << "\t.amdgcn_target \"" << getTargetID()->toString() << "\"\n";
 }
 
-void AMDGPUTargetAsmStreamer::EmitDirectiveHSACodeObjectVersion(
-    uint32_t Major, uint32_t Minor) {
-  OS << "\t.hsa_code_object_version " <<
-        Twine(Major) << "," << Twine(Minor) << '\n';
-}
-
-void
-AMDGPUTargetAsmStreamer::EmitDirectiveHSACodeObjectISAV2(uint32_t Major,
-                                                         uint32_t Minor,
-                                                         uint32_t Stepping,
-                                                         StringRef VendorName,
-                                                         StringRef ArchName) {
-  convertIsaVersionV2(Major, Minor, Stepping, TargetID->isSramEccOnOrAny(), TargetID->isXnackOnOrAny());
-  OS << "\t.hsa_code_object_isa " << Twine(Major) << "," << Twine(Minor) << ","
-     << Twine(Stepping) << ",\"" << VendorName << "\",\"" << ArchName << "\"\n";
+void AMDGPUTargetAsmStreamer::EmitDirectiveAMDGCNCodeObjectVersion(
+    unsigne...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 22, 2023

@llvm/pr-subscribers-mc

Author: Emma Pilkington (epilk)

Changes

Named '.amdgcn_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.

--amdhsa-code-object-version should be removed in the future, but for the moment it is still necessary to control whether AMDGPUDisassembler should print ".amdhsa_uses_dynamic_stack" or not. I'm planning on updating that to check against the ELF header in a follow-up, after that we should be able to remove the flag (and getDefaultCodeObjectVersion()).

Thanks for taking a look!


Patch is 68.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76267.diff

37 Files Affected:

  • (modified) llvm/docs/AMDGPUUsage.rst (+8)
  • (modified) llvm/include/llvm/MC/MCAssembler.h (+6)
  • (modified) llvm/include/llvm/MC/MCELFObjectWriter.h (+1-3)
  • (modified) llvm/include/llvm/Support/AMDGPUMetadata.h (+2-8)
  • (modified) llvm/lib/MC/ELFObjectWriter.cpp (+1-1)
  • (modified) llvm/lib/MC/MCELFObjectTargetWriter.cpp (+2-3)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp (+5-4)
  • (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+13-78)
  • (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp (+4-8)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp (+5-9)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp (+9-124)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h (+16-43)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+34-53)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+12-19)
  • (modified) llvm/test/CodeGen/AMDGPU/code-object-v3.ll (+2)
  • (modified) llvm/test/CodeGen/AMDGPU/codegen-internal-only-func.ll (+2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-any.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-not-supported.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-off.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-all-on.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-off-1.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-off-2.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-on-1.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-mul-func-xnack-any-on-2.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-any.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-not-supported.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-off.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/tid-one-func-xnack-on.ll (+2-2)
  • (added) llvm/test/MC/AMDGPU/elf-header-cov.s (+31)
  • (modified) llvm/test/MC/AMDGPU/hsa-diag-v4.s (+7-5)
  • (modified) llvm/test/MC/AMDGPU/hsa-exp.s (+5-2)
  • (modified) llvm/test/MC/AMDGPU/hsa-gfx12-v4.s (+5-2)
  • (modified) llvm/test/MC/AMDGPU/hsa-v4.s (+5-2)
  • (modified) llvm/test/MC/AMDGPU/hsa-v5-uses-dynamic-stack.s (+5-2)
  • (removed) llvm/test/MC/AMDGPU/hsa_isa_version_attrs.s (-6)
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index f0c81bf878f7ac..438659b94eb31d 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -15379,6 +15379,14 @@ command-line options such as ``-triple``, ``-mcpu``, and
   The target ID syntax used for code object V2 to V3 for this directive differs
   from that used elsewhere. See :ref:`amdgpu-target-id-v2-v3`.
 
+.. _amdgpu-assembler-directive-amdgcn-code-object-version:
+
+.amdgcn_code_object_version <version>
++++++++++++++++++++++++++++++++++++++
+
+Optional directive which declares the code object version to be generated by the
+assembler. If not present, a default value will be used.
+
 .amdhsa_kernel <name>
 +++++++++++++++++++++
 
diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h
index 5ae5f6d7093858..a70dafe0284f9e 100644
--- a/llvm/include/llvm/MC/MCAssembler.h
+++ b/llvm/include/llvm/MC/MCAssembler.h
@@ -161,6 +161,8 @@ class MCAssembler {
   // which flags to be set.
   unsigned ELFHeaderEFlags;
 
+  unsigned char ELFHeaderABIVersion = 0;
+
   /// Used to communicate Linker Optimization Hint information between
   /// the Streamer and the .o writer
   MCLOHContainer LOHContainer;
@@ -279,6 +281,10 @@ class MCAssembler {
   unsigned getELFHeaderEFlags() const { return ELFHeaderEFlags; }
   void setELFHeaderEFlags(unsigned Flags) { ELFHeaderEFlags = Flags; }
 
+  /// ELF e_ident[EI_ABIVERSION] value
+  unsigned char getELFHeaderABIVersion() const { return ELFHeaderABIVersion; }
+  void setELFHeaderABIVersion(unsigned char V) { ELFHeaderABIVersion = V; }
+
   /// MachO deployment target version information.
   const VersionInfoType &getVersionInfo() const { return VersionInfo; }
   void setVersionMin(MCVersionMinType Type, unsigned Major, unsigned Minor,
diff --git a/llvm/include/llvm/MC/MCELFObjectWriter.h b/llvm/include/llvm/MC/MCELFObjectWriter.h
index d7c223cdcc07f8..d561a334a7e277 100644
--- a/llvm/include/llvm/MC/MCELFObjectWriter.h
+++ b/llvm/include/llvm/MC/MCELFObjectWriter.h
@@ -52,14 +52,13 @@ struct ELFRelocationEntry {
 
 class MCELFObjectTargetWriter : public MCObjectTargetWriter {
   const uint8_t OSABI;
-  const uint8_t ABIVersion;
   const uint16_t EMachine;
   const unsigned HasRelocationAddend : 1;
   const unsigned Is64Bit : 1;
 
 protected:
   MCELFObjectTargetWriter(bool Is64Bit_, uint8_t OSABI_, uint16_t EMachine_,
-                          bool HasRelocationAddend_, uint8_t ABIVersion_ = 0);
+                          bool HasRelocationAddend_);
 
 public:
   virtual ~MCELFObjectTargetWriter() = default;
@@ -97,7 +96,6 @@ class MCELFObjectTargetWriter : public MCObjectTargetWriter {
   /// \name Accessors
   /// @{
   uint8_t getOSABI() const { return OSABI; }
-  uint8_t getABIVersion() const { return ABIVersion; }
   uint16_t getEMachine() const { return EMachine; }
   bool hasRelocationAddend() const { return HasRelocationAddend; }
   bool is64Bit() const { return Is64Bit; }
diff --git a/llvm/include/llvm/Support/AMDGPUMetadata.h b/llvm/include/llvm/Support/AMDGPUMetadata.h
index e0838a1f425ea5..2dae6feac0889a 100644
--- a/llvm/include/llvm/Support/AMDGPUMetadata.h
+++ b/llvm/include/llvm/Support/AMDGPUMetadata.h
@@ -29,11 +29,6 @@ namespace AMDGPU {
 //===----------------------------------------------------------------------===//
 namespace HSAMD {
 
-/// HSA metadata major version for code object V2.
-constexpr uint32_t VersionMajorV2 = 1;
-/// HSA metadata minor version for code object V2.
-constexpr uint32_t VersionMinorV2 = 0;
-
 /// HSA metadata major version for code object V3.
 constexpr uint32_t VersionMajorV3 = 1;
 /// HSA metadata minor version for code object V3.
@@ -49,10 +44,9 @@ constexpr uint32_t VersionMajorV5 = 1;
 /// HSA metadata minor version for code object V5.
 constexpr uint32_t VersionMinorV5 = 2;
 
-/// HSA metadata beginning assembler directive.
+/// Old HSA metadata beginning assembler directive for V2. This is only used for
+/// diagnostics now.
 constexpr char AssemblerDirectiveBegin[] = ".amd_amdgpu_hsa_metadata";
-/// HSA metadata ending assembler directive.
-constexpr char AssemblerDirectiveEnd[] = ".end_amd_amdgpu_hsa_metadata";
 
 /// Access qualifiers.
 enum class AccessQualifier : uint8_t {
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index cb8af1aa99551a..e6d370a82e93ce 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -417,7 +417,7 @@ void ELFWriter::writeHeader(const MCAssembler &Asm) {
                    ? int(ELF::ELFOSABI_GNU)
                    : OSABI);
   // e_ident[EI_ABIVERSION]
-  W.OS << char(OWriter.TargetObjectWriter->getABIVersion());
+  W.OS << char(Asm.getELFHeaderABIVersion());
 
   W.OS.write_zeros(ELF::EI_NIDENT - ELF::EI_PAD);
 
diff --git a/llvm/lib/MC/MCELFObjectTargetWriter.cpp b/llvm/lib/MC/MCELFObjectTargetWriter.cpp
index c35e1f26dc1efa..ff8de8e12c33a3 100644
--- a/llvm/lib/MC/MCELFObjectTargetWriter.cpp
+++ b/llvm/lib/MC/MCELFObjectTargetWriter.cpp
@@ -12,9 +12,8 @@ using namespace llvm;
 
 MCELFObjectTargetWriter::MCELFObjectTargetWriter(bool Is64Bit_, uint8_t OSABI_,
                                                  uint16_t EMachine_,
-                                                 bool HasRelocationAddend_,
-                                                 uint8_t ABIVersion_)
-    : OSABI(OSABI_), ABIVersion(ABIVersion_), EMachine(EMachine_),
+                                                 bool HasRelocationAddend_)
+    : OSABI(OSABI_), EMachine(EMachine_),
       HasRelocationAddend(HasRelocationAddend_), Is64Bit(Is64Bit_) {}
 
 bool MCELFObjectTargetWriter::needsRelocateWithSymbol(const MCValue &,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index d317a733d4331c..0411a6e6f8828f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -117,6 +117,8 @@ void AMDGPUAsmPrinter::initTargetStreamer(Module &M) {
   if (getTargetStreamer() && !getTargetStreamer()->getTargetID())
     initializeTargetID(M);
 
+  getTargetStreamer()->EmitDirectiveAMDGCNCodeObjectVersion(CodeObjectVersion);
+
   if (TM.getTargetTriple().getOS() != Triple::AMDHSA &&
       TM.getTargetTriple().getOS() != Triple::AMDPAL)
     return;
@@ -230,8 +232,7 @@ void AMDGPUAsmPrinter::emitFunctionBodyEnd() {
           IsaInfo::getNumExtraSGPRs(
               &STM, CurrentProgramInfo.VCCUsed, CurrentProgramInfo.FlatUsed,
               getTargetStreamer()->getTargetID()->isXnackOnOrAny()),
-      CurrentProgramInfo.VCCUsed, CurrentProgramInfo.FlatUsed,
-      CodeObjectVersion);
+      CurrentProgramInfo.VCCUsed, CurrentProgramInfo.FlatUsed);
 
   Streamer.popSection();
 }
@@ -631,8 +632,8 @@ bool AMDGPUAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
 void AMDGPUAsmPrinter::initializeTargetID(const Module &M) {
   // In the beginning all features are either 'Any' or 'NotSupported',
   // depending on global target features. This will cover empty modules.
-  getTargetStreamer()->initializeTargetID(
-      *getGlobalSTI(), getGlobalSTI()->getFeatureString(), CodeObjectVersion);
+  getTargetStreamer()->initializeTargetID(*getGlobalSTI(),
+                                          getGlobalSTI()->getFeatureString());
 
   // If module is empty, we are done.
   if (M.empty())
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 3b69a37728ea1c..50059665ff1627 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -1295,10 +1295,8 @@ class AMDGPUAsmParser : public MCTargetAsmParser {
                           unsigned NextFreeSGPR, SMRange SGPRRange,
                           unsigned &VGPRBlocks, unsigned &SGPRBlocks);
   bool ParseDirectiveAMDGCNTarget();
+  bool ParseDirectiveAMDGCNCodeObjectVersion();
   bool ParseDirectiveAMDHSAKernel();
-  bool ParseDirectiveMajorMinor(uint32_t &Major, uint32_t &Minor);
-  bool ParseDirectiveHSACodeObjectVersion();
-  bool ParseDirectiveHSACodeObjectISA();
   bool ParseAMDKernelCodeTValue(StringRef ID, amd_kernel_code_t &Header);
   bool ParseDirectiveAMDKernelCodeT();
   // TODO: Possibly make subtargetHasRegister const.
@@ -5057,20 +5055,6 @@ bool AMDGPUAsmParser::ParseAsAbsoluteExpression(uint32_t &Ret) {
   return false;
 }
 
-bool AMDGPUAsmParser::ParseDirectiveMajorMinor(uint32_t &Major,
-                                               uint32_t &Minor) {
-  if (ParseAsAbsoluteExpression(Major))
-    return TokError("invalid major version");
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("minor version number required, comma expected");
-
-  if (ParseAsAbsoluteExpression(Minor))
-    return TokError("invalid minor version");
-
-  return false;
-}
-
 bool AMDGPUAsmParser::ParseDirectiveAMDGCNTarget() {
   if (getSTI().getTargetTriple().getArch() != Triple::amdgcn)
     return TokError("directive only supported for amdgcn architecture");
@@ -5535,63 +5519,18 @@ bool AMDGPUAsmParser::ParseDirectiveAMDHSAKernel() {
     }
   }
 
-  getTargetStreamer().EmitAmdhsaKernelDescriptor(
-      getSTI(), KernelName, KD, NextFreeVGPR, NextFreeSGPR, ReserveVCC,
-      ReserveFlatScr, AMDGPU::getAmdhsaCodeObjectVersion());
-  return false;
-}
-
-bool AMDGPUAsmParser::ParseDirectiveHSACodeObjectVersion() {
-  uint32_t Major;
-  uint32_t Minor;
-
-  if (ParseDirectiveMajorMinor(Major, Minor))
-    return true;
-
-  getTargetStreamer().EmitDirectiveHSACodeObjectVersion(Major, Minor);
+  getTargetStreamer().EmitAmdhsaKernelDescriptor(getSTI(), KernelName, KD,
+                                                 NextFreeVGPR, NextFreeSGPR,
+                                                 ReserveVCC, ReserveFlatScr);
   return false;
 }
 
-bool AMDGPUAsmParser::ParseDirectiveHSACodeObjectISA() {
-  uint32_t Major;
-  uint32_t Minor;
-  uint32_t Stepping;
-  StringRef VendorName;
-  StringRef ArchName;
-
-  // If this directive has no arguments, then use the ISA version for the
-  // targeted GPU.
-  if (isToken(AsmToken::EndOfStatement)) {
-    AMDGPU::IsaVersion ISA = AMDGPU::getIsaVersion(getSTI().getCPU());
-    getTargetStreamer().EmitDirectiveHSACodeObjectISAV2(ISA.Major, ISA.Minor,
-                                                        ISA.Stepping,
-                                                        "AMD", "AMDGPU");
-    return false;
-  }
-
-  if (ParseDirectiveMajorMinor(Major, Minor))
-    return true;
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("stepping version number required, comma expected");
-
-  if (ParseAsAbsoluteExpression(Stepping))
-    return TokError("invalid stepping version");
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("vendor name required, comma expected");
-
-  if (!parseString(VendorName, "invalid vendor name"))
-    return true;
-
-  if (!trySkipToken(AsmToken::Comma))
-    return TokError("arch name required, comma expected");
-
-  if (!parseString(ArchName, "invalid arch name"))
+bool AMDGPUAsmParser::ParseDirectiveAMDGCNCodeObjectVersion() {
+  uint32_t Version;
+  if (ParseAsAbsoluteExpression(Version))
     return true;
 
-  getTargetStreamer().EmitDirectiveHSACodeObjectISAV2(Major, Minor, Stepping,
-                                                      VendorName, ArchName);
+  getTargetStreamer().EmitDirectiveAMDGCNCodeObjectVersion(Version);
   return false;
 }
 
@@ -5882,12 +5821,6 @@ bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
     if (IDVal == AMDGPU::HSAMD::V3::AssemblerDirectiveBegin)
       return ParseDirectiveHSAMetadata();
   } else {
-    if (IDVal == ".hsa_code_object_version")
-      return ParseDirectiveHSACodeObjectVersion();
-
-    if (IDVal == ".hsa_code_object_isa")
-      return ParseDirectiveHSACodeObjectISA();
-
     if (IDVal == ".amd_kernel_code_t")
       return ParseDirectiveAMDKernelCodeT();
 
@@ -5917,6 +5850,9 @@ bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
   if (IDVal == PALMD::AssemblerDirective)
     return ParseDirectivePALMetadata();
 
+  if (IDVal == ".amdgcn_code_object_version")
+    return ParseDirectiveAMDGCNCodeObjectVersion();
+
   return true;
 }
 
@@ -8060,9 +7996,8 @@ void AMDGPUAsmParser::onBeginOfFile() {
     return;
 
   if (!getTargetStreamer().getTargetID())
-    getTargetStreamer().initializeTargetID(getSTI(), getSTI().getFeatureString(),
-        // TODO: Should try to check code object version from directive???
-        AMDGPU::getAmdhsaCodeObjectVersion());
+    getTargetStreamer().initializeTargetID(getSTI(),
+                                           getSTI().getFeatureString());
 
   if (isHsaAbi(getSTI()))
     getTargetStreamer().EmitDirectiveAMDGCNTarget();
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index 7939d0036568d4..8f41c43b181ae7 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -2149,7 +2149,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
                       KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32);
     }
 
-    if (AMDGPU::getAmdhsaCodeObjectVersion() >= AMDGPU::AMDHSA_COV5)
+    if (AMDGPU::getDefaultCodeObjectVersion() >= AMDGPU::AMDHSA_COV5)
       PRINT_DIRECTIVE(".amdhsa_uses_dynamic_stack",
                       KERNEL_CODE_PROPERTY_USES_DYNAMIC_STACK);
 
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
index f91f36ed851b7f..76610bef48df07 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUAsmBackend.cpp
@@ -232,13 +232,11 @@ class ELFAMDGPUAsmBackend : public AMDGPUAsmBackend {
   bool Is64Bit;
   bool HasRelocationAddend;
   uint8_t OSABI = ELF::ELFOSABI_NONE;
-  uint8_t ABIVersion = 0;
 
 public:
-  ELFAMDGPUAsmBackend(const Target &T, const Triple &TT, uint8_t ABIVersion) :
+  ELFAMDGPUAsmBackend(const Target &T, const Triple &TT) :
       AMDGPUAsmBackend(T), Is64Bit(TT.getArch() == Triple::amdgcn),
-      HasRelocationAddend(TT.getOS() == Triple::AMDHSA),
-      ABIVersion(ABIVersion) {
+      HasRelocationAddend(TT.getOS() == Triple::AMDHSA) {
     switch (TT.getOS()) {
     case Triple::AMDHSA:
       OSABI = ELF::ELFOSABI_AMDGPU_HSA;
@@ -256,8 +254,7 @@ class ELFAMDGPUAsmBackend : public AMDGPUAsmBackend {
 
   std::unique_ptr<MCObjectTargetWriter>
   createObjectTargetWriter() const override {
-    return createAMDGPUELFObjectWriter(Is64Bit, OSABI, HasRelocationAddend,
-                                       ABIVersion);
+    return createAMDGPUELFObjectWriter(Is64Bit, OSABI, HasRelocationAddend);
   }
 };
 
@@ -267,6 +264,5 @@ MCAsmBackend *llvm::createAMDGPUAsmBackend(const Target &T,
                                            const MCSubtargetInfo &STI,
                                            const MCRegisterInfo &MRI,
                                            const MCTargetOptions &Options) {
-  return new ELFAMDGPUAsmBackend(T, STI.getTargetTriple(),
-                                 getHsaAbiVersion(&STI).value_or(0));
+  return new ELFAMDGPUAsmBackend(T, STI.getTargetTriple());
 }
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
index 58eed81e075560..c05a53120a8090 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
@@ -18,8 +18,7 @@ namespace {
 
 class AMDGPUELFObjectWriter : public MCELFObjectTargetWriter {
 public:
-  AMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI, bool HasRelocationAddend,
-                        uint8_t ABIVersion);
+  AMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI, bool HasRelocationAddend);
 
 protected:
   unsigned getRelocType(MCContext &Ctx, const MCValue &Target,
@@ -31,10 +30,9 @@ class AMDGPUELFObjectWriter : public MCELFObjectTargetWriter {
 
 AMDGPUELFObjectWriter::AMDGPUELFObjectWriter(bool Is64Bit,
                                              uint8_t OSABI,
-                                             bool HasRelocationAddend,
-                                             uint8_t ABIVersion)
+                                             bool HasRelocationAddend)
   : MCELFObjectTargetWriter(Is64Bit, OSABI, ELF::EM_AMDGPU,
-                            HasRelocationAddend, ABIVersion) {}
+                            HasRelocationAddend) {}
 
 unsigned AMDGPUELFObjectWriter::getRelocType(MCContext &Ctx,
                                              const MCValue &Target,
@@ -100,9 +98,7 @@ unsigned AMDGPUELFObjectWriter::getRelocType(MCContext &Ctx,
 
 std::unique_ptr<MCObjectTargetWriter>
 llvm::createAMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI,
-                                  bool HasRelocationAddend,
-                                  uint8_t ABIVersion) {
+                                  bool HasRelocationAddend) {
   return std::make_unique<AMDGPUELFObjectWriter>(Is64Bit, OSABI,
-                                                  HasRelocationAddend,
-                                                  ABIVersion);
+                                                 HasRelocationAddend);
 }
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
index 006115ba14fc1c..3ef00f75735b0d 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
@@ -42,8 +42,8 @@ MCAsmBackend *createAMDGPUAsmBackend(const Target &T,
 
 std::unique_ptr<MCObjectTargetWriter>
 createAMDGPUELFObjectWriter(bool Is64Bit, uint8_t OSABI,
-                            bool HasRelocationAddend, uint8_t ABIVersion);
-} // End llvm namespace
+                            bool HasRelocationAddend);
+} // namespace llvm
 
 #define GET_REGINFO_ENUM
 #include "AMDGPUGenRegisterInfo.inc"
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
index a855cf585205bc..6829aef91e459c 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
@@ -35,27 +35,6 @@ using namespace llvm::AMDGPU;
 // AMDGPUTargetStreamer
 //===----------------------------------------------------------------------===//
 
-static void convertIsaVersionV2(uint32_t &Major, uint32_t &Minor,
-                                uint32_t &Stepping, bool Sramecc, bool Xnack) {
-  if (Major == 9 && Minor == 0) {
-    switch (Stepping) {
-      case 0:
-      case 2:
-      case 4:
-      case 6:
-        if (Xnack)
-          Stepping++;
-    }
-  }
-}
-
-bool AMDGPUTargetStreamer::EmitHSAMetadataV2(StringRef HSAMetadataString) {
-  HSAMD::Metadata HSAMetadata;
-  if (HSAMD::fromString(HSAMetadataString, HSAMetadata))
-    return false;
-  return EmitHSAMetadata(HSAMetadata);
-}
-
 bool AMDGPUTargetStreamer::EmitHSAMetadataV3(StringRef HSAMetadataString) {
   msgpack::Document HSAMetadataDoc;
   if (!HSAMetadataDoc.fromYAML(HSAMetadataString))
@@ -238,21 +217,10 @@ void AMDGPUTargetAsmStreamer::EmitDirectiveAMDGCNTarget() {
   OS << "\t.amdgcn_target \"" << getTargetID()->toString() << "\"\n";
 }
 
-void AMDGPUTargetAsmStreamer::EmitDirectiveHSACodeObjectVersion(
-    uint32_t Major, uint32_t Minor) {
-  OS << "\t.hsa_code_object_version " <<
-        Twine(Major) << "," << Twine(Minor) << '\n';
-}
-
-void
-AMDGPUTargetAsmStreamer::EmitDirectiveHSACodeObjectISAV2(uint32_t Major,
-                                                         uint32_t Minor,
-                                                         uint32_t Stepping,
-                                                         StringRef VendorName,
-                                                         StringRef ArchName) {
-  convertIsaVersionV2(Major, Minor, Stepping, TargetID->isSramEccOnOrAny(), TargetID->isXnackOnOrAny());
-  OS << "\t.hsa_code_object_isa " << Twine(Major) << "," << Twine(Minor) << ","
-     << Twine(Stepping) << ",\"" << VendorName << "\",\"" << ArchName << "\"\n";
+void AMDGPUTargetAsmStreamer::EmitDirectiveAMDGCNCodeObjectVersion(
+    unsigne...
[truncated]

Copy link

github-actions bot commented Dec 22, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@slinder1 slinder1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits, and a broader question about what we call this version across IR, Asm, and in documentation.

llvm/include/llvm/MC/MCAssembler.h Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp Outdated Show resolved Hide resolved
Twine(AmdhsaCodeObjectVersion));
unsigned getCodeObjectVersion(const Module &M) {
if (auto Ver = mdconst::extract_or_null<ConstantInt>(
M.getModuleFlag("amdgpu_code_object_version"))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the naming issue is even more difficult to resolve since the module flag was added with amdgpu. It is unfortunate how many names we have grown for this, between amdgpu, amdgcn, and amdhsa.

For some things the distinctions are meaningful, but here we are always referring to the same thing, just with confusingly distinct names.

@kzhuravl do you have thoughts on how to resolve the prefix proliferation? It seems like we really want amdhsa everywhere as the ABI is specific to the HSA OS, right? Can we rename the module flag with an auto-upgrade or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is HSA-specific version only used with ELFOSABI_AMDGPU_HSA. See e_ident[EI_ABIVERSION] row in https://llvm.org/docs/AMDGPUUsage.html#amdgpu-elf-header-table. So I think it would be good to clean this up and have consistent naming, with amdhsa prefix in LLVM IR and ASM directive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll plan to do the auto-upgrade rename in a follow up.

@slinder1 slinder1 requested a review from MaskRay January 3, 2024 00:31
@slinder1
Copy link
Contributor

slinder1 commented Jan 3, 2024

Requesting review from @MaskRay as someone outside AMD to look, in particular because of the generic change needed in MCAssembler.

@@ -5918,6 +5851,9 @@ bool AMDGPUAsmParser::ParseDirective(AsmToken DirectiveID) {
if (IDVal == PALMD::AssemblerDirective)
return ParseDirectivePALMetadata();

if (IDVal == ".amdgcn_code_object_version")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why .amdgcn and not .amdhsa? Don't the other directives use an amdhsa prefix?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, should be amdhsa_code_object_version (see my previous comment). It would be good to fix the LLVM IR module flag to be amdhsa_code_object_version as well for consistency.

@epilk
Copy link
Member Author

epilk commented Jan 3, 2024

@slinder1 @arsenm Ah, so code_object_version is only meaningful on HSA OSes? I originally chose 'amdgcn' since I thought it was also meaningful on PAL/Mesa. The update renames the directive to amdhsa_code_object_version and only emits/parses it on HSA triples.

I noticed that clang emits the IR module flag, amdgpu_code_object_verison on PAL/mesa. Should that be updated to just emit on HSA? Thanks!

@arsenm
Copy link
Contributor

arsenm commented Jan 4, 2024

@slinder1 @arsenm Ah, so code_object_version is only meaningful on HSA OSes? I originally chose 'amdgcn' since I thought it was also meaningful on PAL/Mesa. The update renames the directive to amdhsa_code_object_version and only emits/parses it on HSA triples.

I noticed that clang emits the IR module flag, amdgpu_code_object_verison on PAL/mesa. Should that be updated to just emit on HSA? Thanks!

It's not exactly well defined enough to make a definitive statement. The hope was mesa would follow HSA code objects, but that maintenance work never happened. Less sure about PAL

@MaskRay
Copy link
Member

MaskRay commented Jan 7, 2024

Requesting review from @MaskRay as someone outside AMD to look, in particular because of the generic change needed in MCAssembler.

Thanks. I've left some comments.

@@ -279,6 +280,10 @@ class MCAssembler {
unsigned getELFHeaderEFlags() const { return ELFHeaderEFlags; }
void setELFHeaderEFlags(unsigned Flags) { ELFHeaderEFlags = Flags; }

/// ELF e_ident[EI_ABIVERSION] value
unsigned char getELFHeaderABIVersion() const { return ELFHeaderABIVersion; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See llvm/lib/MC/MCELFStreamer.cpp:Asm.getWriter().markGnuAbi(). This can be moved to ELFObjectWriter and revert the MCELFObjectTargetWriter change below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can do it like that if you prefer. Would you mind verifying the update is what you had in mind?

llvm/test/MC/AMDGPU/hsa-cov-invalid.s Outdated Show resolved Hide resolved
@epilk
Copy link
Member Author

epilk commented Jan 9, 2024

@slinder1 @arsenm Ah, so code_object_version is only meaningful on HSA OSes? I originally chose 'amdgcn' since I thought it was also meaningful on PAL/Mesa. The update renames the directive to amdhsa_code_object_version and only emits/parses it on HSA triples.
I noticed that clang emits the IR module flag, amdgpu_code_object_verison on PAL/mesa. Should that be updated to just emit on HSA? Thanks!

It's not exactly well defined enough to make a definitive statement. The hope was mesa would follow HSA code objects, but that maintenance work never happened. Less sure about PAL

Hmm, okay. So would you be okay with the the new approach of using the amdhsa spelling and requiring a HSA triple? I suppose we could always add support for PAL/mesa if we need that in the future.

Copy link
Contributor

@kzhuravl kzhuravl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well, thanks!

Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
@epilk epilk merged commit bc82cfb into llvm:main Jan 21, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants