[AMDGPU] Add disassembler diagnostics for invalid kernel descriptors #87400

epilk · 2024-04-02T19:44:44Z

These mostly check for reserved bits being set. The diagnostics for GPU-dependent reserved bits carry a bit more context, since those seem the most likely to be observed in practice.

This does repurpose the stated use of the CStream parameter, which was documented as a stream for any comment. Output written to it was previously ignored, and no disassembler actually used it, so this is effectively NFC. The AMDGPU and WebAssembly disassemblers both just emit comments to outs(), so a dedicated parameter for that seems unnecessary.

llvmbot · 2024-04-03T16:25:41Z

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-llvm-binary-utilities
@llvm/pr-subscribers-backend-webassembly

@llvm/pr-subscribers-backend-amdgpu

Author: Emma Pilkington (epilk)

Changes

These mostly check for reserved bits being set. The diagnostics for GPU-dependent reserved bits carry a bit more context, since those seem the most likely to be observed in practice.

Patch is 31.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87400.diff

6 Files Affected:

(modified) llvm/docs/AMDGPUUsage.rst (+3-3)
(modified) llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h (+3-2)
(modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+131-49)
(added) llvm/test/MC/Disassembler/AMDGPU/kernel-descriptor-errors.test (+73)
(added) llvm/test/MC/Disassembler/AMDGPU/kernel-descriptor-rsrc-errors.test (+92)
(modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+21-4)

diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 22c1d1f186ea54..0055b72160bfef 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -4652,7 +4652,7 @@ The fields used by CP for code objects before V3 also match those specified in
                                                      entry point instruction
                                                      which must be 256 byte
                                                      aligned.
-     351:272 20                                      Reserved, must be 0.
+     351:192 20                                      Reserved, must be 0.
              bytes
      383:352 4 bytes COMPUTE_PGM_RSRC3               GFX6-GFX9
                                                        Reserved, must be 0.
@@ -5248,14 +5248,14 @@ The fields used by CP for code objects before V3 also match those specified in
      5:0     6 bits  ACCUM_OFFSET                    Offset of a first AccVGPR in the unified register file. Granularity 4.
                                                      Value 0-63. 0 - accum-offset = 4, 1 - accum-offset = 8, ...,
                                                      63 - accum-offset = 256.
-     6:15    10                                      Reserved, must be 0.
+     15:6    10                                      Reserved, must be 0.
              bits
      16      1 bit   TG_SPLIT                        - If 0 the waves of a work-group are
                                                        launched in the same CU.
                                                      - If 1 the waves of a work-group can be
                                                        launched in different CUs. The waves
                                                        cannot use S_BARRIER or LDS.
-     17:31   15                                      Reserved, must be 0.
+     31:17   15                                      Reserved, must be 0.
              bits
      32      **Total size 4 bytes.**
      ======= ===================================================================================================================
diff --git a/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h b/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
index 7dd8b0b7d3778d..21a6e92a2ab7c3 100644
--- a/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
+++ b/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
@@ -147,7 +147,8 @@ class MCDisassembler {
   /// \param Address  - The address, in the memory space of region, of the first
   ///                   byte of the symbol.
   /// \param Bytes    - A reference to the actual bytes at the symbol location.
-  /// \param CStream  - The stream to print comments and annotations on.
+  /// \param ErrS     - A stream to print newline separated error comments that
+  ///                   will be emitted if Fail is returned.
   /// \return         - MCDisassembler::Success if bytes are decoded
   ///                   successfully. Size must hold the number of bytes that
   ///                   were decoded.
@@ -160,7 +161,7 @@ class MCDisassembler {
   ///                   case.
   virtual std::optional<DecodeStatus>
   onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes,
-                uint64_t Address, raw_ostream &CStream) const;
+                uint64_t Address, raw_ostream &ErrS) const;
   // TODO:
   // Implement similar hooks that can be used at other points during
   // disassembly. Something along the following lines:
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index 8c42304ce0bee5..95278783952d92 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -33,6 +33,7 @@
 #include "llvm/MC/MCSubtargetInfo.h"
 #include "llvm/MC/TargetRegistry.h"
 #include "llvm/Support/AMDHSAKernelDescriptor.h"
+#include "llvm/Support/SaveAndRestore.h"
 
 using namespace llvm;
 
@@ -1780,6 +1781,29 @@ bool AMDGPUDisassembler::hasKernargPreload() const {
 //===----------------------------------------------------------------------===//
 // AMDGPU specific symbol handling
 //===----------------------------------------------------------------------===//
+
+/// Print a string describing the reserved bit range specified by Mask with
+/// offset BaseBytes for use in error comments. Mask is a single continuous
+/// range of 1s surrounded by zeros. The format here is meant to align with the
+/// tables that describe these bits in llvm.org/docs/AMDGPUUsage.html.
+static SmallString<20> getBitRangeFromMask(uint32_t Mask, unsigned BaseBytes) {
+  SmallString<20> Result;
+  raw_svector_ostream S(Result);
+
+  int TrailingZeros = llvm::countr_zero(Mask);
+  int PopCount = llvm::popcount(Mask);
+
+  if (PopCount == 1) {
+    S << "bit (" << (TrailingZeros + BaseBytes * CHAR_BIT) << ")";
+  } else {
+    S << "bits in range ("
+      << (TrailingZeros + PopCount - 1 + BaseBytes * CHAR_BIT) << ":"
+      << (TrailingZeros + BaseBytes * CHAR_BIT) << ")";
+  }
+
+  return Result;
+}
+
 #define GET_FIELD(MASK) (AMDHSA_BITS_GET(FourByteBuffer, MASK))
 #define PRINT_DIRECTIVE(DIRECTIVE, MASK)                                       \
   do {                                                                         \
@@ -1791,6 +1815,23 @@ bool AMDGPUDisassembler::hasKernargPreload() const {
              << GET_FIELD(MASK) << '\n';                                       \
   } while (0)
 
+#define CHECK_RESERVED_BITS_IMPL(MASK, DESC, MSG)                              \
+  do {                                                                         \
+    if (FourByteBuffer & (MASK)) {                                             \
+      *CommentStream << "kernel descriptor " DESC " reserved "                 \
+                     << getBitRangeFromMask((MASK), 0) << " set" MSG "\n";     \
+      return MCDisassembler::Fail;                                             \
+    }                                                                          \
+  } while (0)
+
+#define CHECK_RESERVED_BITS(MASK) CHECK_RESERVED_BITS_IMPL(MASK, #MASK, "")
+#define CHECK_RESERVED_BITS_MSG(MASK, MSG)                                     \
+  CHECK_RESERVED_BITS_IMPL(MASK, #MASK, ", " MSG)
+#define CHECK_RESERVED_BITS_DESC(MASK, DESC)                                   \
+  CHECK_RESERVED_BITS_IMPL(MASK, DESC, "")
+#define CHECK_RESERVED_BITS_DESC_MSG(MASK, DESC, MSG)                          \
+  CHECK_RESERVED_BITS_IMPL(MASK, DESC, ", " MSG)
+
 // NOLINTNEXTLINE(readability-identifier-naming)
 MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
     uint32_t FourByteBuffer, raw_string_ostream &KdStream) const {
@@ -1833,8 +1874,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
   uint32_t GranulatedWavefrontSGPRCount =
       GET_FIELD(COMPUTE_PGM_RSRC1_GRANULATED_WAVEFRONT_SGPR_COUNT);
 
-  if (isGFX10Plus() && GranulatedWavefrontSGPRCount)
-    return MCDisassembler::Fail;
+  if (isGFX10Plus())
+    CHECK_RESERVED_BITS_MSG(COMPUTE_PGM_RSRC1_GRANULATED_WAVEFRONT_SGPR_COUNT,
+                            "must be zero on gfx10+");
 
   uint32_t NextFreeSGPR = (GranulatedWavefrontSGPRCount + 1) *
                           AMDGPU::IsaInfo::getSGPREncodingGranule(&STI);
@@ -1845,8 +1887,7 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
   KdStream << Indent << ".amdhsa_reserve_xnack_mask " << 0 << '\n';
   KdStream << Indent << ".amdhsa_next_free_sgpr " << NextFreeSGPR << "\n";
 
-  if (FourByteBuffer & COMPUTE_PGM_RSRC1_PRIORITY)
-    return MCDisassembler::Fail;
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_PRIORITY);
 
   PRINT_DIRECTIVE(".amdhsa_float_round_mode_32",
                   COMPUTE_PGM_RSRC1_FLOAT_ROUND_MODE_32);
@@ -1857,37 +1898,33 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
   PRINT_DIRECTIVE(".amdhsa_float_denorm_mode_16_64",
                   COMPUTE_PGM_RSRC1_FLOAT_DENORM_MODE_16_64);
 
-  if (FourByteBuffer & COMPUTE_PGM_RSRC1_PRIV)
-    return MCDisassembler::Fail;
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_PRIV);
 
   if (!isGFX12Plus())
     PRINT_DIRECTIVE(".amdhsa_dx10_clamp",
                     COMPUTE_PGM_RSRC1_GFX6_GFX11_ENABLE_DX10_CLAMP);
 
-  if (FourByteBuffer & COMPUTE_PGM_RSRC1_DEBUG_MODE)
-    return MCDisassembler::Fail;
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_DEBUG_MODE);
 
   if (!isGFX12Plus())
     PRINT_DIRECTIVE(".amdhsa_ieee_mode",
                     COMPUTE_PGM_RSRC1_GFX6_GFX11_ENABLE_IEEE_MODE);
 
-  if (FourByteBuffer & COMPUTE_PGM_RSRC1_BULKY)
-    return MCDisassembler::Fail;
-
-  if (FourByteBuffer & COMPUTE_PGM_RSRC1_CDBG_USER)
-    return MCDisassembler::Fail;
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_BULKY);
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_CDBG_USER);
 
   if (isGFX9Plus())
     PRINT_DIRECTIVE(".amdhsa_fp16_overflow", COMPUTE_PGM_RSRC1_GFX9_PLUS_FP16_OVFL);
 
   if (!isGFX9Plus())
-    if (FourByteBuffer & COMPUTE_PGM_RSRC1_GFX6_GFX8_RESERVED0)
-      return MCDisassembler::Fail;
-  if (FourByteBuffer & COMPUTE_PGM_RSRC1_RESERVED1)
-    return MCDisassembler::Fail;
+    CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC1_GFX6_GFX8_RESERVED0,
+                                 "COMPUTE_PGM_RSRC1", "must be zero pre-gfx9");
+
+  CHECK_RESERVED_BITS_DESC(COMPUTE_PGM_RSRC1_RESERVED1, "COMPUTE_PGM_RSRC1");
+
   if (!isGFX10Plus())
-    if (FourByteBuffer & COMPUTE_PGM_RSRC1_GFX6_GFX9_RESERVED2)
-      return MCDisassembler::Fail;
+    CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC1_GFX6_GFX9_RESERVED2,
+                                 "COMPUTE_PGM_RSRC1", "must be zero pre-gfx10");
 
   if (isGFX10Plus()) {
     PRINT_DIRECTIVE(".amdhsa_workgroup_processor_mode",
@@ -1925,14 +1962,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC2(
   PRINT_DIRECTIVE(".amdhsa_system_vgpr_workitem_id",
                   COMPUTE_PGM_RSRC2_ENABLE_VGPR_WORKITEM_ID);
 
-  if (FourByteBuffer & COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_ADDRESS_WATCH)
-    return MCDisassembler::Fail;
-
-  if (FourByteBuffer & COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_MEMORY)
-    return MCDisassembler::Fail;
-
-  if (FourByteBuffer & COMPUTE_PGM_RSRC2_GRANULATED_LDS_SIZE)
-    return MCDisassembler::Fail;
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_ADDRESS_WATCH);
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_MEMORY);
+  CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC2_GRANULATED_LDS_SIZE);
 
   PRINT_DIRECTIVE(
       ".amdhsa_exception_fp_ieee_invalid_op",
@@ -1951,8 +1983,7 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC2(
   PRINT_DIRECTIVE(".amdhsa_exception_int_div_zero",
                   COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_INT_DIVIDE_BY_ZERO);
 
-  if (FourByteBuffer & COMPUTE_PGM_RSRC2_RESERVED0)
-    return MCDisassembler::Fail;
+  CHECK_RESERVED_BITS_DESC(COMPUTE_PGM_RSRC2_RESERVED0, "COMPUTE_PGM_RSRC2");
 
   return MCDisassembler::Success;
 }
@@ -1966,11 +1997,13 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
     KdStream << Indent << ".amdhsa_accum_offset "
              << (GET_FIELD(COMPUTE_PGM_RSRC3_GFX90A_ACCUM_OFFSET) + 1) * 4
              << '\n';
-    if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX90A_RESERVED0)
-      return MCDisassembler::Fail;
+
     PRINT_DIRECTIVE(".amdhsa_tg_split", COMPUTE_PGM_RSRC3_GFX90A_TG_SPLIT);
-    if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX90A_RESERVED1)
-      return MCDisassembler::Fail;
+
+    CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX90A_RESERVED0,
+                                 "COMPUTE_PGM_RSRC3", "must be zero on gfx90a");
+    CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX90A_RESERVED1,
+                                 "COMPUTE_PGM_RSRC3", "must be zero on gfx90a");
   } else if (isGFX10Plus()) {
     // Bits [0-3].
     if (!isGFX12Plus()) {
@@ -1983,8 +2016,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
             COMPUTE_PGM_RSRC3_GFX10_GFX11_SHARED_VGPR_COUNT);
       }
     } else {
-      if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX12_PLUS_RESERVED0)
-        return MCDisassembler::Fail;
+      CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX12_PLUS_RESERVED0,
+                                   "COMPUTE_PGM_RSRC3",
+                                   "must be zero on gfx12+");
     }
 
     // Bits [4-11].
@@ -1999,36 +2033,41 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
       PRINT_PSEUDO_DIRECTIVE_COMMENT(
           "INST_PREF_SIZE", COMPUTE_PGM_RSRC3_GFX12_PLUS_INST_PREF_SIZE);
     } else {
-      if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_RESERVED1)
-        return MCDisassembler::Fail;
+      CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_RESERVED1,
+                                   "COMPUTE_PGM_RSRC3",
+                                   "must be zero on gfx10");
     }
 
     // Bits [12].
-    if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED2)
-      return MCDisassembler::Fail;
+    CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED2,
+                                 "COMPUTE_PGM_RSRC3", "must be zero on gfx10+");
 
     // Bits [13].
     if (isGFX12Plus()) {
       PRINT_PSEUDO_DIRECTIVE_COMMENT("GLG_EN",
                                      COMPUTE_PGM_RSRC3_GFX12_PLUS_GLG_EN);
     } else {
-      if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_GFX11_RESERVED3)
-        return MCDisassembler::Fail;
+      CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_GFX11_RESERVED3,
+                                   "COMPUTE_PGM_RSRC3",
+                                   "must be zero on gfx10 or gfx11");
     }
 
     // Bits [14-30].
-    if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED4)
-      return MCDisassembler::Fail;
+    CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED4,
+                                 "COMPUTE_PGM_RSRC3", "must be zero on gfx10+");
 
     // Bits [31].
     if (isGFX11Plus()) {
       PRINT_PSEUDO_DIRECTIVE_COMMENT("IMAGE_OP",
                                      COMPUTE_PGM_RSRC3_GFX11_PLUS_IMAGE_OP);
     } else {
-      if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_RESERVED5)
-        return MCDisassembler::Fail;
+      CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_RESERVED5,
+                                   "COMPUTE_PGM_RSRC3",
+                                   "must be zero on gfx10");
     }
   } else if (FourByteBuffer) {
+    *CommentStream
+        << "kernel descriptor COMPUTE_PGM_RSRC3 must be all zero before gfx9\n";
     return MCDisassembler::Fail;
   }
   return MCDisassembler::Success;
@@ -2036,6 +2075,30 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
 #undef PRINT_PSEUDO_DIRECTIVE_COMMENT
 #undef PRINT_DIRECTIVE
 #undef GET_FIELD
+#undef CHECK_RESERVED_BITS_IMPL
+#undef CHECK_RESERVED_BITS
+#undef CHECK_RESERVED_BITS_MSG
+#undef CHECK_RESERVED_BITS_DESC
+#undef CHECK_RESERVED_BITS_DESC_MSG
+
+static void printReservedKDBitsError(raw_ostream *CommentStream, uint32_t Mask,
+                                     unsigned BaseInBytes, StringRef Msg = "") {
+  *CommentStream << "kernel descriptor reserved "
+                 << getBitRangeFromMask(Mask, BaseInBytes) << " set";
+  if (!Msg.empty())
+    *CommentStream << ", " << Msg;
+  *CommentStream << '\n';
+}
+
+static void printReservedKDBytesError(raw_ostream *CommentStream,
+                                      unsigned BaseInBytes,
+                                      unsigned WidthInBytes) {
+  // Print an error comment in the same format as the "Kernel Descriptor"
+  // table here: https://llvm.org/docs/AMDGPUUsage.html#kernel-descriptor .
+  *CommentStream << "kernel descriptor reserved bits set in range "
+                 << ((BaseInBytes + WidthInBytes) * CHAR_BIT - 1) << ":"
+                 << (BaseInBytes * CHAR_BIT) << '\n';
+}
 
 MCDisassembler::DecodeStatus
 AMDGPUDisassembler::decodeKernelDescriptorDirective(
@@ -2080,6 +2143,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
     ReservedBytes = DE.getBytes(Cursor, 4);
     for (int I = 0; I < 4; ++I) {
       if (ReservedBytes[I] != 0) {
+        printReservedKDBytesError(CommentStream, amdhsa::RESERVED0_OFFSET, 4);
         return MCDisassembler::Fail;
       }
     }
@@ -2097,6 +2161,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
     ReservedBytes = DE.getBytes(Cursor, 20);
     for (int I = 0; I < 20; ++I) {
       if (ReservedBytes[I] != 0) {
+        printReservedKDBytesError(CommentStream, amdhsa::RESERVED1_OFFSET, 20);
         return MCDisassembler::Fail;
       }
     }
@@ -2135,12 +2200,18 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
     PRINT_DIRECTIVE(".amdhsa_user_sgpr_private_segment_size",
                     KERNEL_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE);
 
-    if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED0)
+    if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED0) {
+      printReservedKDBitsError(CommentStream, KERNEL_CODE_PROPERTY_RESERVED0,
+                               amdhsa::KERNEL_CODE_PROPERTIES_OFFSET);
       return MCDisassembler::Fail;
+    }
 
     // Reserved for GFX9
     if (isGFX9() &&
         (TwoByteBuffer & KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32)) {
+      printReservedKDBitsError(
+          CommentStream, KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32,
+          amdhsa::KERNEL_CODE_PROPERTIES_OFFSET, "must be zero on gfx9");
       return MCDisassembler::Fail;
     } else if (isGFX10Plus()) {
       PRINT_DIRECTIVE(".amdhsa_wavefront_size32",
@@ -2151,8 +2222,11 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
       PRINT_DIRECTIVE(".amdhsa_uses_dynamic_stack",
                       KERNEL_CODE_PROPERTY_USES_DYNAMIC_STACK);
 
-    if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED1)
+    if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED1) {
+      printReservedKDBitsError(CommentStream, KERNEL_CODE_PROPERTY_RESERVED1,
+                               amdhsa::KERNEL_CODE_PROPERTIES_OFFSET);
       return MCDisassembler::Fail;
+    }
 
     return MCDisassembler::Success;
 
@@ -2174,8 +2248,10 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
     // 4 bytes from here are reserved, must be 0.
     ReservedBytes = DE.getBytes(Cursor, 4);
     for (int I = 0; I < 4; ++I) {
-      if (ReservedBytes[I] != 0)
+      if (ReservedBytes[I] != 0) {
+        printReservedKDBytesError(CommentStream, amdhsa::RESERVED3_OFFSET, 4);
         return MCDisassembler::Fail;
+      }
     }
     return MCDisassembler::Success;
 
@@ -2189,8 +2265,10 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
 MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeKernelDescriptor(
     StringRef KdName, ArrayRef<uint8_t> Bytes, uint64_t KdAddress) const {
   // CP microcode requires the kernel descriptor to be 64 aligned.
-  if (Bytes.size() != 64 || KdAddress % 64 != 0)
+  if (Bytes.size() != 64 || KdAddress % 64 != 0) {
+    *CommentStream << "kernel descriptor must be 64-byte aligned\n";
     return MCDisassembler::Fail;
+  }
 
   // FIXME: We can't actually decode "in order" as is done below, as e.g. GFX10
   // requires us to know the setting of .amdhsa_wavefront_size32 in order to
@@ -2235,9 +2313,12 @@ AMDGPUDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
   // Fix the spurious symbol issue for AMDGPU kernels. Exists for both Code
   // Object V2 and V3 when symbols are marked protected.
 
+  SaveAndRestore CommentStreamRAII(CommentStream, &CStream);
+
   // amd_kernel_code_t for Code Object V2.
   if (Symbol.Type == ELF::STT_AMDGPU_HSA_KERNEL) {
     Size = 256;
+    *CommentStream << "code object v2 is not supported";
     return MCDisassembler::Fail;
   }
 
@@ -2247,6 +2328,7 @@ AMDGPUDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
     Size = 64; // Size = 64 regardless of success or failure.
     return decodeKernelDescriptor(Name.drop_back(3), Bytes, Address);
   }
+
   return std::nullopt;
 }
 
diff --git a/llvm/test/MC/Disassembler/AMDGPU/kernel-descriptor-errors.test b/llvm/test/MC/Disassembler/AMDGPU/kernel-descriptor-errors.test
new file mode 100644
index 00000000000000..bdfd70f8a022d1
--- /dev/null
+++ b/llvm/test/MC/Disassembler/AMD...
[truncated]
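
For readers who want to check the bit ranges these diagnostics print, the formatting done by getBitRangeFromMask in the diff above can be reproduced outside LLVM. The following is a minimal standalone sketch, not the patch's code: it replaces llvm::countr_zero and llvm::popcount with hand-written loops, returns a std::string instead of a SmallString, and the name bitRangeFromMask is hypothetical. It assumes, as the patch's comment does, that the mask is a single contiguous run of 1s.

```cpp
#include <climits>
#include <cstdint>
#include <string>

// Hypothetical standalone counterpart of getBitRangeFromMask from the patch.
// Precondition (assumed): Mask is non-zero and holds one contiguous run of
// set bits. BaseBytes is the byte offset of the field in the descriptor.
std::string bitRangeFromMask(uint32_t Mask, unsigned BaseBytes) {
  // Index of the lowest set bit.
  unsigned TrailingZeros = 0;
  while (TrailingZeros < 32 && !((Mask >> TrailingZeros) & 1u))
    ++TrailingZeros;

  // Number of set bits, i.e. the width of the reserved range.
  unsigned PopCount = 0;
  for (uint32_t M = Mask; M != 0; M >>= 1)
    PopCount += M & 1u;

  unsigned Low = TrailingZeros + BaseBytes * CHAR_BIT;
  if (PopCount == 1)
    return "bit (" + std::to_string(Low) + ")";

  // Printed high:low, matching the tables in AMDGPUUsage.
  unsigned High = Low + PopCount - 1;
  return "bits in range (" + std::to_string(High) + ":" +
         std::to_string(Low) + ")";
}
```

With a mask of 0xFFFE0000 and a base of 0 this yields "bits in range (31:17)", the same high:low order that the AMDGPUUsage.rst hunk corrects the table to. A non-zero BaseBytes shifts both ends by eight bits per byte, which is how the patch reports bits relative to the whole kernel descriptor rather than to a single field.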

arsenm · 2024-04-03T16:26:09Z

llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h

@@ -160,7 +161,7 @@ class MCDisassembler {
  ///                   case.
  virtual std::optional<DecodeStatus>
  onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes,
-                uint64_t Address, raw_ostream &CStream) const;
+                uint64_t Address, raw_ostream &ErrS) const;


If we have an error string stream, we should probably convert this to return an Error instead of an optional?

epilk · 2024-04-09

Using an Error to represent errors seems like a nice refactor, but an Error return alone isn't sufficient for this function. It has three cases: (a) nullopt, meaning the disassembler didn't have any handling for this symbol; (b) Fail, meaning an error was encountered; and (c) Success. Both (b) and (c) indicate to objdump that some bytes were consumed, so Size must be set. In the update I changed the return type to bool, where true means the target disassembler had some handling for this symbol, with an Error out-parameter to indicate whether an error occurred.
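
The three cases listed in this reply, and the bool-plus-error shape it settles on, can be sketched without LLVM types. This is only an illustration under stated assumptions: SymbolResult and onSymbolStartSketch are hypothetical names, a plain std::string stands in for llvm::Error, and the ".kd" suffix test and the error text are placeholders rather than the patch's actual logic. A later commit in this PR mentions switching to Expected<bool>, which this sketch does not model.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for "bool return + Error out-parameter".
struct SymbolResult {
  bool Handled;      // true: the target handled the symbol and Size is set
  std::string Error; // non-empty: handled, but decoding failed
};

// Hypothetical hook. It claims only symbols whose name ends in ".kd"
// (an assumption for illustration) and expects exactly 64 bytes.
SymbolResult onSymbolStartSketch(const std::string &Name, uint64_t NumBytes,
                                 uint64_t &Size) {
  const std::string Suffix = ".kd";
  bool IsKd = Name.size() > Suffix.size() &&
              Name.compare(Name.size() - Suffix.size(), Suffix.size(),
                           Suffix) == 0;
  if (!IsKd)
    return {false, ""}; // case (a): no special handling, Size untouched

  Size = 64; // cases (b) and (c): bytes are consumed either way
  if (NumBytes != 64)
    return {true, "kernel descriptor must be 64 bytes"}; // case (b)
  return {true, ""};                                      // case (c)
}
```

The point of the split is that the caller can tell "not handled" apart from "handled but invalid": only in the latter two cases is Size meaningful, which a bare Error return could not express.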

slinder1

Two small nits, otherwise LGTM! Thank you for making the errors so descriptive; it is a massive improvement for the exceptional cases where the tool is the most useful.

jh7370

I've only taken a look at the llvm-objdump part.

jh7370

llvm-objdump parts LGTM.
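
The header comment in the patch describes the error stream as carrying newline-separated error comments, and the commit list for this PR mentions looping over the lines of the error message. The llvm-objdump hunk is truncated here, so the following is only a sketch of that idea, not the tool's code: formatErrorComments is a hypothetical helper, and the "// error: " prefix is an assumption chosen for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split newline-separated error text into one comment
// line per error. The prefix is an assumption, not llvm-objdump's format.
std::vector<std::string> formatErrorComments(const std::string &ErrText) {
  std::vector<std::string> Lines;
  std::istringstream In(ErrText);
  std::string Line;
  while (std::getline(In, Line))
    if (!Line.empty()) // skip blank lines, e.g. after a trailing newline
      Lines.push_back("// error: " + Line);
  return Lines;
}
```

Splitting per line keeps each diagnostic on its own comment line even when a target reports several errors for one symbol.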

epilk added 2 commits April 2, 2024 14:01

[AMDGPU] Add disassembler diagnostics for invalid kernel descriptors

352b5e3

epilk requested review from jayfoad, kzhuravl and slinder1 April 2, 2024 19:44

jayfoad reviewed Apr 3, 2024

Fix a typo

a8524c0

arsenm added the backend:AMDGPU label Apr 3, 2024

arsenm reviewed Apr 3, 2024

slinder1 requested changes Apr 3, 2024

Refactor onSymbolStart to report errors using Error, change return value to bool.

408d35d

llvmbot added the backend:WebAssembly, mc (Machine (object) code), and llvm:binary-utilities labels Apr 9, 2024

epilk requested a review from MaskRay April 9, 2024 15:43

jh7370 reviewed Apr 10, 2024

Use Expected<bool>, remove period in error comment

6114654

jh7370 reviewed Apr 11, 2024

arsenm reviewed Apr 12, 2024

Address review comments

ff51c76

arsenm approved these changes Apr 15, 2024

Go back to looping over lines of the error message

1c31d52

jh7370 approved these changes Apr 17, 2024

epilk merged commit 68e814d into llvm:main Apr 18, 2024
5 checks passed
