Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-objdump][AMDGPU] Pass ELF ABIVersion through disassembler #78907

Merged
merged 1 commit into from
Feb 1, 2024

Conversation

epilk
Copy link
Member

@epilk epilk commented Jan 21, 2024

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart but I'm not sure what a better place for it would be.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 21, 2024

@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-backend-webassembly

Author: Emma Pilkington (epilk)

Changes

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart but I'm not sure what a better place for it would be.


Patch is 35.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78907.diff

18 Files Affected:

  • (modified) llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h (+5-2)
  • (modified) llvm/include/llvm/Object/ELFObjectFile.h (+7)
  • (modified) llvm/lib/MC/MCDisassembler/MCDisassembler.cpp (+3-3)
  • (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+12-10)
  • (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h (+8-7)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+11)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+3)
  • (modified) llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp (+5-4)
  • (added) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-cov5.s (+23)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s (+16-4)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s (+16-4)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx12.s (+8-2)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx90a.s (+12-3)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-sgpr.s (+12-3)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-vgpr.s (+12-3)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-zeroed-gfx10.s (+2)
  • (modified) llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-zeroed-gfx9.s (+4-2)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+6-2)
diff --git a/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h b/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
index 2553a086cd53bb..6171baad86e261 100644
--- a/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
+++ b/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
@@ -143,6 +143,8 @@ class MCDisassembler {
   /// to treat symbols separately.
   ///
   /// \param Symbol   - The symbol.
+  /// \param Version  - Target disassembler-specific ABI version number to used
+  ///                   to interpret the symbol.
   /// \param Size     - The number of bytes consumed.
   /// \param Address  - The address, in the memory space of region, of the first
   ///                   byte of the symbol.
@@ -159,8 +161,9 @@ class MCDisassembler {
   ///                   symbol separately. Value of Size is ignored in this
   ///                   case.
   virtual std::optional<DecodeStatus>
-  onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes,
-                uint64_t Address, raw_ostream &CStream) const;
+  onSymbolStart(SymbolInfoTy &Symbol, unsigned Version, uint64_t &Size,
+                ArrayRef<uint8_t> Bytes, uint64_t Address,
+                raw_ostream &CStream) const;
   // TODO:
   // Implement similar hooks that can be used at other points during
   // disassembly. Something along the following lines:
diff --git a/llvm/include/llvm/Object/ELFObjectFile.h b/llvm/include/llvm/Object/ELFObjectFile.h
index 7124df50b561db..c9227da65708cc 100644
--- a/llvm/include/llvm/Object/ELFObjectFile.h
+++ b/llvm/include/llvm/Object/ELFObjectFile.h
@@ -103,6 +103,8 @@ class ELFObjectFileBase : public ObjectFile {
 
   virtual uint16_t getEMachine() const = 0;
 
+  virtual uint8_t getEIdentABIVersion() const = 0;
+
   std::vector<ELFPltEntry> getPltEntries() const;
 
   /// Returns a vector containing a symbol version for each dynamic symbol.
@@ -251,6 +253,7 @@ ELFObjectFileBase::symbols() const {
 template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {
   uint16_t getEMachine() const override;
   uint16_t getEType() const override;
+  uint8_t getEIdentABIVersion() const override;
   uint64_t getSymbolSize(DataRefImpl Sym) const override;
 
 public:
@@ -645,6 +648,10 @@ template <class ELFT> uint16_t ELFObjectFile<ELFT>::getEType() const {
   return EF.getHeader().e_type;
 }
 
+template <class ELFT> uint8_t ELFObjectFile<ELFT>::getEIdentABIVersion() const {
+  return EF.getHeader().e_ident[ELF::EI_ABIVERSION];
+}
+
 template <class ELFT>
 uint64_t ELFObjectFile<ELFT>::getSymbolSize(DataRefImpl Sym) const {
   Expected<const Elf_Sym *> SymOrErr = getSymbol(Sym);
diff --git a/llvm/lib/MC/MCDisassembler/MCDisassembler.cpp b/llvm/lib/MC/MCDisassembler/MCDisassembler.cpp
index 80c32ac5608220..51396a6f9cdcd7 100644
--- a/llvm/lib/MC/MCDisassembler/MCDisassembler.cpp
+++ b/llvm/lib/MC/MCDisassembler/MCDisassembler.cpp
@@ -14,9 +14,9 @@ using namespace llvm;
 MCDisassembler::~MCDisassembler() = default;
 
 std::optional<MCDisassembler::DecodeStatus>
-MCDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
-                              ArrayRef<uint8_t> Bytes, uint64_t Address,
-                              raw_ostream &CStream) const {
+MCDisassembler::onSymbolStart(SymbolInfoTy &Symbol, unsigned Version,
+                              uint64_t &Size, ArrayRef<uint8_t> Bytes,
+                              uint64_t Address, raw_ostream &CStream) const {
   return std::nullopt;
 }
 
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index 86096b0d80b424..30fcee1a264553 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -2077,7 +2077,7 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
 MCDisassembler::DecodeStatus
 AMDGPUDisassembler::decodeKernelDescriptorDirective(
     DataExtractor::Cursor &Cursor, ArrayRef<uint8_t> Bytes,
-    raw_string_ostream &KdStream) const {
+    raw_string_ostream &KdStream, unsigned CodeObjectVersion) const {
 #define PRINT_DIRECTIVE(DIRECTIVE, MASK)                                       \
   do {                                                                         \
     KdStream << Indent << DIRECTIVE " "                                        \
@@ -2184,8 +2184,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
                       KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32);
     }
 
-    // FIXME: We should be looking at the ELF header ABI version for this.
-    if (AMDGPU::getDefaultAMDHSACodeObjectVersion() >= AMDGPU::AMDHSA_COV5)
+    if (CodeObjectVersion >= AMDGPU::AMDHSA_COV5)
       PRINT_DIRECTIVE(".amdhsa_uses_dynamic_stack",
                       KERNEL_CODE_PROPERTY_USES_DYNAMIC_STACK);
 
@@ -2225,7 +2224,8 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
 }
 
 MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeKernelDescriptor(
-    StringRef KdName, ArrayRef<uint8_t> Bytes, uint64_t KdAddress) const {
+    StringRef KdName, ArrayRef<uint8_t> Bytes, uint64_t KdAddress,
+    unsigned CodeObjectVersion) const {
   // CP microcode requires the kernel descriptor to be 64 aligned.
   if (Bytes.size() != 64 || KdAddress % 64 != 0)
     return MCDisassembler::Fail;
@@ -2251,7 +2251,7 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeKernelDescriptor(
   DataExtractor::Cursor C(0);
   while (C && C.tell() < Bytes.size()) {
     MCDisassembler::DecodeStatus Status =
-        decodeKernelDescriptorDirective(C, Bytes, KdStream);
+        decodeKernelDescriptorDirective(C, Bytes, KdStream, CodeObjectVersion);
 
     cantFail(C.takeError());
 
@@ -2263,10 +2263,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeKernelDescriptor(
   return MCDisassembler::Success;
 }
 
-std::optional<MCDisassembler::DecodeStatus>
-AMDGPUDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
-                                  ArrayRef<uint8_t> Bytes, uint64_t Address,
-                                  raw_ostream &CStream) const {
+std::optional<MCDisassembler::DecodeStatus> AMDGPUDisassembler::onSymbolStart(
+    SymbolInfoTy &Symbol, unsigned ABIVersion, uint64_t &Size,
+    ArrayRef<uint8_t> Bytes, uint64_t Address, raw_ostream &CStream) const {
   // Right now only kernel descriptor needs to be handled.
   // We ignore all other symbols for target specific handling.
   // TODO:
@@ -2279,11 +2278,14 @@ AMDGPUDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
     return MCDisassembler::Fail;
   }
 
+  unsigned CodeObjectVersion = AMDGPU::getAMDHSACodeObjectVersion(ABIVersion);
+
   // Code Object V3 kernel descriptors.
   StringRef Name = Symbol.Name;
   if (Symbol.Type == ELF::STT_OBJECT && Name.ends_with(StringRef(".kd"))) {
     Size = 64; // Size = 64 regardless of success or failure.
-    return decodeKernelDescriptor(Name.drop_back(3), Bytes, Address);
+    return decodeKernelDescriptor(Name.drop_back(3), Bytes, Address,
+                                  CodeObjectVersion);
   }
   return std::nullopt;
 }
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
index 233581949d7124..288863565e5f1b 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h
@@ -158,16 +158,17 @@ class AMDGPUDisassembler : public MCDisassembler {
   }
 
   std::optional<DecodeStatus>
-  onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes,
-                uint64_t Address, raw_ostream &CStream) const override;
+  onSymbolStart(SymbolInfoTy &Symbol, unsigned Version, uint64_t &Size,
+                ArrayRef<uint8_t> Bytes, uint64_t Address,
+                raw_ostream &CStream) const override;
 
   DecodeStatus decodeKernelDescriptor(StringRef KdName, ArrayRef<uint8_t> Bytes,
-                                      uint64_t KdAddress) const;
+                                      uint64_t KdAddress,
+                                      unsigned CodeObjectVersion) const;
 
-  DecodeStatus
-  decodeKernelDescriptorDirective(DataExtractor::Cursor &Cursor,
-                                  ArrayRef<uint8_t> Bytes,
-                                  raw_string_ostream &KdStream) const;
+  DecodeStatus decodeKernelDescriptorDirective(
+      DataExtractor::Cursor &Cursor, ArrayRef<uint8_t> Bytes,
+      raw_string_ostream &KdStream, unsigned CodeObjectVersion) const;
 
   /// Decode as directives that handle COMPUTE_PGM_RSRC1.
   /// \param FourByteBuffer - Bytes holding contents of COMPUTE_PGM_RSRC1.
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index f1c05446bf6069..2a89969bdebd06 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -175,6 +175,17 @@ unsigned getDefaultAMDHSACodeObjectVersion() {
   return DefaultAMDHSACodeObjectVersion;
 }
 
+unsigned getAMDHSACodeObjectVersion(unsigned ABIVersion) {
+  switch (ABIVersion) {
+  case ELF::ELFABIVERSION_AMDGPU_HSA_V4:
+    return 4;
+  case ELF::ELFABIVERSION_AMDGPU_HSA_V5:
+    return 5;
+  default:
+    return getDefaultAMDHSACodeObjectVersion();
+  }
+}
+
 uint8_t getELFABIVersion(const Triple &T, unsigned CodeObjectVersion) {
   if (T.getOS() != Triple::AMDHSA)
     return 0;
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
index d3f55c79201747..849f0f8b200fae 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
@@ -50,6 +50,9 @@ bool isHsaAbi(const MCSubtargetInfo &STI);
 /// \returns Code object version from the IR module flag.
 unsigned getAMDHSACodeObjectVersion(const Module &M);
 
+/// \returns Code object version from ELF's e_ident[ABIVERSION].
+unsigned getAMDHSACodeObjectVersion(unsigned ABIVersion);
+
 /// \returns The default HSA code object version. This should only be used when
 /// we lack a more accurate CodeObjectVersion value (e.g. from the IR module
 /// flag or a .amdhsa_code_object_version directive)
diff --git a/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp b/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
index ed7757be661582..0f9b55bfac1f3b 100644
--- a/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
+++ b/llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
@@ -47,8 +47,9 @@ class WebAssemblyDisassembler final : public MCDisassembler {
                               ArrayRef<uint8_t> Bytes, uint64_t Address,
                               raw_ostream &CStream) const override;
   std::optional<DecodeStatus>
-  onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes,
-                uint64_t Address, raw_ostream &CStream) const override;
+  onSymbolStart(SymbolInfoTy &Symbol, unsigned Version, uint64_t &Size,
+                ArrayRef<uint8_t> Bytes, uint64_t Address,
+                raw_ostream &CStream) const override;
 
 public:
   WebAssemblyDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx,
@@ -122,8 +123,8 @@ bool parseImmediate(MCInst &MI, uint64_t &Size, ArrayRef<uint8_t> Bytes) {
 }
 
 std::optional<MCDisassembler::DecodeStatus>
-WebAssemblyDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
-                                       ArrayRef<uint8_t> Bytes,
+WebAssemblyDisassembler::onSymbolStart(SymbolInfoTy &Symbol, unsigned Version,
+                                       uint64_t &Size, ArrayRef<uint8_t> Bytes,
                                        uint64_t Address,
                                        raw_ostream &CStream) const {
   Size = 0;
diff --git a/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-cov5.s b/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-cov5.s
new file mode 100644
index 00000000000000..1190964b9d4632
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-cov5.s
@@ -0,0 +1,23 @@
+; RUN: sed 's/CODE_OBJECT_VERSION/5/g' %s \
+; RUN:   | llvm-mc --triple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=-xnack,+wavefrontsize32,-wavefrontsize64 -filetype=obj > %t.o
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd %t.o | FileCheck %s --check-prefixes=COV5,CHECK
+
+; RUN: sed 's/CODE_OBJECT_VERSION/4/g' %s \
+; RUN:   | llvm-mc --triple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=-xnack,+wavefrontsize32,-wavefrontsize64 -filetype=obj > %t.o
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd %t.o | FileCheck %s --check-prefixes=COV4,CHECK
+
+; Verify that .amdhsa_uses_dynamic_stack is only printed on COV5+.
+
+; CHECK: .amdhsa_kernel kernel
+; ...
+; COV5: .amdhsa_uses_dynamic_stack 0
+; COV4-NOT: .amdhsa_uses_dynamic_stack
+; ...
+; CHECK: .end_amdhsa_kernel
+
+.amdhsa_code_object_version CODE_OBJECT_VERSION
+
+.amdhsa_kernel kernel
+  .amdhsa_next_free_vgpr 32
+  .amdhsa_next_free_sgpr 32
+.end_amdhsa_kernel
diff --git a/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s b/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s
index 58b01031afe383..81d0d868ab9184 100644
--- a/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s
+++ b/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s
@@ -4,7 +4,8 @@
 
 ;--- 1.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize32,-wavefrontsize64 -filetype=obj -mcpu=gfx1010 < 1.s > 1.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 1.o | tail -n +7 | tee 1-disasm.s | FileCheck 1.s
+; RUN: echo '.amdhsa_code_object_version 5' > 1-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 1.o | tail -n +7 | tee -a 1-disasm.s | FileCheck 1.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize32,-wavefrontsize64 -filetype=obj -mcpu=gfx1010 < 1-disasm.s > 1-disasm.o
 ; RUN: cmp 1.o 1-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -48,7 +49,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_flat_scratch_init 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 1
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
   .amdhsa_next_free_vgpr 32
   .amdhsa_next_free_sgpr 32
@@ -57,7 +60,8 @@
 
 ;--- 2.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1010 < 2.s > 2.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 2.o | tail -n +7 | tee 2-disasm.s | FileCheck 2.s
+; RUN: echo '.amdhsa_code_object_version 5' > 2-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 2.o | tail -n +7 | tee -a 2-disasm.s | FileCheck 2.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1010 < 2-disasm.s > 2-disasm.o
 ; RUN: cmp 2.o 2-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -101,7 +105,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_flat_scratch_init 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 0
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
   .amdhsa_next_free_vgpr 32
   .amdhsa_next_free_sgpr 32
@@ -110,7 +116,8 @@
 
 ;--- 3.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1010 < 3.s > 3.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 3.o | tail -n +7 | tee 3-disasm.s | FileCheck 3.s
+; RUN: echo '.amdhsa_code_object_version 5' > 3-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 3.o | tail -n +7 | tee -a 3-disasm.s | FileCheck 3.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1010 < 3-disasm.s > 3-disasm.o
 ; RUN: cmp 3.o 3-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -154,7 +161,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_flat_scratch_init 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 0
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
   .amdhsa_next_free_vgpr 32
   .amdhsa_next_free_sgpr 32
@@ -163,7 +172,8 @@
 
 ;--- 4.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1010 < 4.s > 4.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 4.o | tail -n +7 | tee 4-disasm.s | FileCheck 4.s
+; RUN: echo '.amdhsa_code_object_version 5' > 4-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 4.o | tail -n +7 | tee -a 4-disasm.s | FileCheck 4.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=-xnack,+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1010 < 4-disasm.s > 4-disasm.o
 ; RUN: cmp 4.o 4-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -207,7 +217,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_flat_scratch_init 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 0
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
   .amdhsa_next_free_vgpr 32
   .amdhsa_next_free_sgpr 32
diff --git a/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s b/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s
index 2133002908d9fc..750809128189f1 100644
--- a/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s
+++ b/llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s
@@ -4,7 +4,8 @@
 
 ;--- 1.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=+wavefrontsize32,-wavefrontsize64 -filetype=obj -mcpu=gfx1100 < 1.s > 1.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 1.o | tail -n +7 | tee 1-disasm.s | FileCheck 1.s
+; RUN: echo '.amdhsa_code_object_version 5' > 1-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 1.o | tail -n +7 | tee -a 1-disasm.s | FileCheck 1.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=+wavefrontsize32,-wavefrontsize64 -filetype=obj -mcpu=gfx1100 < 1-disasm.s > 1-disasm.o
 ; RUN: cmp 1.o 1-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -49,7 +50,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_dispatch_id 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 1
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
   .amdhsa_next_free_vgpr 32
   .amdhsa_next_free_sgpr 32
@@ -58,7 +61,8 @@
 
 ;--- 2.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1100 < 2.s > 2.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 2.o | tail -n +7 | tee 2-disasm.s | FileCheck 2.s
+; RUN: echo '.amdhsa_code_object_version 5' > 2-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 2.o | tail -n +7 | tee -a 2-disasm.s | FileCheck 2.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1100 < 2-disasm.s > 2-disasm.o
 ; RUN: cmp 2.o 2-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -103,7 +107,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_dispatch_id 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 0
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
   .amdhsa_next_free_vgpr 32
   .amdhsa_next_free_sgpr 32
@@ -112,7 +118,8 @@
 
 ;--- 3.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1100 < 3.s > 3.o
-; RUN: llvm-objdump --disassemble-symbols=kernel.kd 3.o | tail -n +7 | tee 3-disasm.s | FileCheck 3.s
+; RUN: echo '.amdhsa_code_object_version 5' > 3-disasm.s
+; RUN: llvm-objdump --disassemble-symbols=kernel.kd 3.o | tail -n +7 | tee -a 3-disasm.s | FileCheck 3.s
 ; RUN: llvm-mc --triple=amdgcn-amd-amdhsa -mattr=+wavefrontsize64,-wavefrontsize32 -filetype=obj -mcpu=gfx1100 < 3-disasm.s > 3-disasm.o
 ; RUN: cmp 3.o 3-disasm.o
 ; CHECK: .amdhsa_kernel kernel
@@ -157,7 +164,9 @@
 ; CHECK-NEXT: .amdhsa_user_sgpr_dispatch_id 0
 ; CHECK-NEXT: .amdhsa_user_sgpr_private_segment_size 0
 ; CHECK-NEXT: .amdhsa_wavefront_size32 0
+; CHECK-NEXT: .amdhsa_uses_dynamic_stack 0
 ; CHECK-NEXT: .end_amdhsa_kernel
+.amdhsa_code_object_version 5
 .amdhsa_kernel kernel
...
[truncated]

@MaskRay MaskRay requested a review from jh7370 January 22, 2024 08:37
@MaskRay
Copy link
Member

MaskRay commented Jan 22, 2024

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart but I'm not sure what a better place for it would be.

I am indeed concerned about this. Can you place the version inside AMDGPUDisassembler instead?

@jh7370
Copy link
Collaborator

jh7370 commented Jan 22, 2024

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart but I'm not sure what a better place for it would be.

I am indeed concerned about this. Can you place the version inside AMDGPUDisassembler instead?

+1. The ABIVersion should be constant for a given file, so it feels like it should be configured up front when configuring the disassembler (and therefore presumably stored somewhere inside the disassembler class).

@epilk
Copy link
Member Author

epilk commented Jan 22, 2024

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart but I'm not sure what a better place for it would be.

I am indeed concerned about this. Can you place the version inside AMDGPUDisassembler instead?

+1. The ABIVersion should be constant for a given file, so it feels like it should be configured up front when configuring the disassembler (and therefore presumably stored somewhere inside the disassembler class).

Okay, I added a virtual setABIVersion() function to MCDisassembler in the update, so we can just store it in the disassembler. WDYT?

@@ -50,6 +50,9 @@ bool isHsaAbi(const MCSubtargetInfo &STI);
/// \returns Code object version from the IR module flag.
unsigned getAMDHSACodeObjectVersion(const Module &M);

/// \returns Code object version from ELF's e_ident[ABIVERSION].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e_ident[EI_ABIVERSION]

; RUN: | llvm-mc --triple=amdgcn-amd-amdhsa -mcpu=gfx1010 -mattr=-xnack,+wavefrontsize32,-wavefrontsize64 -filetype=obj > %t.o
; RUN: llvm-objdump --disassemble-symbols=kernel.kd %t.o | FileCheck %s --check-prefixes=COV4,CHECK

; Verify that .amdhsa_uses_dynamic_stack is only printed on COV5+.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

;; for a non-RUN-non-CHECK comment

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ELFObjectFile and llvm-obdjump parts LGTM, I don't know anything really to be able to review the other aspects.

Copy link
Contributor

@kzhuravl kzhuravl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMDGPU-related parts LGTM, thanks!

This is needed to determine the code object version, which affects the
disassembler output.
@epilk epilk merged commit 4eb0810 into llvm:main Feb 1, 2024
4 checks passed
smithp35 pushed a commit to smithp35/llvm-project that referenced this pull request Feb 1, 2024
…#78907)

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart
but I'm not sure what a better place for it would be.
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this pull request Feb 1, 2024
…#78907)

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart
but I'm not sure what a better place for it would be.
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
…#78907)

Admittedly, its a bit ugly to pass the ABIVersion through onSymbolStart
but I'm not sure what a better place for it would be.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants