-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Add disassembler diagnostics for invalid kernel descriptors #87400
Conversation
Previously these were being ignored, but no disassembler actually used them, so this is really NFC. This does repurpose the stated use of the CStream parameter, which was for any comment. The AMDGPU and WebAssembly disassemblers both just emit comments to outs(), so having a dedicated parameter for this seems unnecessary.
These mostly are checking for various reserved bits being set. The diagnostics for gpu-dependent reserved bits have a bit more context since they seem like the most likely ones to be observed in practice.
@llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-amdgpu Author: Emma Pilkington (epilk) ChangesThese mostly are checking for various reserved bits being set. The diagnostics for gpu-dependent reserved bits have a bit more context since they seem like the most likely ones to be observed in practice. Patch is 31.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87400.diff 6 Files Affected:
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 22c1d1f186ea54..0055b72160bfef 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -4652,7 +4652,7 @@ The fields used by CP for code objects before V3 also match those specified in
entry point instruction
which must be 256 byte
aligned.
- 351:272 20 Reserved, must be 0.
+ 351:192 20 Reserved, must be 0.
bytes
383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-GFX9
Reserved, must be 0.
@@ -5248,14 +5248,14 @@ The fields used by CP for code objects before V3 also match those specified in
5:0 6 bits ACCUM_OFFSET Offset of a first AccVGPR in the unified register file. Granularity 4.
Value 0-63. 0 - accum-offset = 4, 1 - accum-offset = 8, ...,
63 - accum-offset = 256.
- 6:15 10 Reserved, must be 0.
+ 15:6 10 Reserved, must be 0.
bits
16 1 bit TG_SPLIT - If 0 the waves of a work-group are
launched in the same CU.
- If 1 the waves of a work-group can be
launched in different CUs. The waves
cannot use S_BARRIER or LDS.
- 17:31 15 Reserved, must be 0.
+ 31:17 15 Reserved, must be 0.
bits
32 **Total size 4 bytes.**
======= ===================================================================================================================
diff --git a/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h b/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
index 7dd8b0b7d3778d..21a6e92a2ab7c3 100644
--- a/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
+++ b/llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h
@@ -147,7 +147,8 @@ class MCDisassembler {
/// \param Address - The address, in the memory space of region, of the first
/// byte of the symbol.
/// \param Bytes - A reference to the actual bytes at the symbol location.
- /// \param CStream - The stream to print comments and annotations on.
+ /// \param ErrS - A stream to print newline separated error comments that
+ /// will be emitted if Fail is returned.
/// \return - MCDisassembler::Success if bytes are decoded
/// successfully. Size must hold the number of bytes that
/// were decoded.
@@ -160,7 +161,7 @@ class MCDisassembler {
/// case.
virtual std::optional<DecodeStatus>
onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes,
- uint64_t Address, raw_ostream &CStream) const;
+ uint64_t Address, raw_ostream &ErrS) const;
// TODO:
// Implement similar hooks that can be used at other points during
// disassembly. Something along the following lines:
diff --git a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
index 8c42304ce0bee5..95278783952d92 100644
--- a/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+++ b/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
@@ -33,6 +33,7 @@
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Support/AMDHSAKernelDescriptor.h"
+#include "llvm/Support/SaveAndRestore.h"
using namespace llvm;
@@ -1780,6 +1781,29 @@ bool AMDGPUDisassembler::hasKernargPreload() const {
//===----------------------------------------------------------------------===//
// AMDGPU specific symbol handling
//===----------------------------------------------------------------------===//
+
+/// Print a string describing the reserved bit range specified by Mask with
+/// offset BaseBytes for use in error comments. Mask is a single continuous
+/// range of 1s surrounded by zeros. The format here is meant to align with the
+/// tables that describe these bits in llvm.org/docs/AMDGPUUsage.html.
+static SmallString<20> getBitRangeFromMask(uint32_t Mask, unsigned BaseBytes) {
+ SmallString<20> Result;
+ raw_svector_ostream S(Result);
+
+ int TrailingZeros = llvm::countr_zero(Mask);
+ int PopCount = llvm::popcount(Mask);
+
+ if (PopCount == 1) {
+ S << "bit (" << (TrailingZeros + BaseBytes * CHAR_BIT) << ")";
+ } else {
+ S << "bits in range ("
+ << (TrailingZeros + PopCount - 1 + BaseBytes * CHAR_BIT) << ":"
+ << (TrailingZeros + BaseBytes * CHAR_BIT) << ")";
+ }
+
+ return Result;
+}
+
#define GET_FIELD(MASK) (AMDHSA_BITS_GET(FourByteBuffer, MASK))
#define PRINT_DIRECTIVE(DIRECTIVE, MASK) \
do { \
@@ -1791,6 +1815,23 @@ bool AMDGPUDisassembler::hasKernargPreload() const {
<< GET_FIELD(MASK) << '\n'; \
} while (0)
+#define CHECK_RESERVED_BITS_IMPL(MASK, DESC, MSG) \
+ do { \
+ if (FourByteBuffer & (MASK)) { \
+ *CommentStream << "kernel descriptor " DESC " reserved " \
+ << getBitRangeFromMask((MASK), 0) << " set" MSG "\n"; \
+ return MCDisassembler::Fail; \
+ } \
+ } while (0)
+
+#define CHECK_RESERVED_BITS(MASK) CHECK_RESERVED_BITS_IMPL(MASK, #MASK, "")
+#define CHECK_RESERVED_BITS_MSG(MASK, MSG) \
+ CHECK_RESERVED_BITS_IMPL(MASK, #MASK, ", " MSG)
+#define CHECK_RESERVED_BITS_DESC(MASK, DESC) \
+ CHECK_RESERVED_BITS_IMPL(MASK, DESC, "")
+#define CHECK_RESERVED_BITS_DESC_MSG(MASK, DESC, MSG) \
+ CHECK_RESERVED_BITS_IMPL(MASK, DESC, ", " MSG)
+
// NOLINTNEXTLINE(readability-identifier-naming)
MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
uint32_t FourByteBuffer, raw_string_ostream &KdStream) const {
@@ -1833,8 +1874,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
uint32_t GranulatedWavefrontSGPRCount =
GET_FIELD(COMPUTE_PGM_RSRC1_GRANULATED_WAVEFRONT_SGPR_COUNT);
- if (isGFX10Plus() && GranulatedWavefrontSGPRCount)
- return MCDisassembler::Fail;
+ if (isGFX10Plus())
+ CHECK_RESERVED_BITS_MSG(COMPUTE_PGM_RSRC1_GRANULATED_WAVEFRONT_SGPR_COUNT,
+ "must be zero on gfx10+");
uint32_t NextFreeSGPR = (GranulatedWavefrontSGPRCount + 1) *
AMDGPU::IsaInfo::getSGPREncodingGranule(&STI);
@@ -1845,8 +1887,7 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
KdStream << Indent << ".amdhsa_reserve_xnack_mask " << 0 << '\n';
KdStream << Indent << ".amdhsa_next_free_sgpr " << NextFreeSGPR << "\n";
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_PRIORITY)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_PRIORITY);
PRINT_DIRECTIVE(".amdhsa_float_round_mode_32",
COMPUTE_PGM_RSRC1_FLOAT_ROUND_MODE_32);
@@ -1857,37 +1898,33 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC1(
PRINT_DIRECTIVE(".amdhsa_float_denorm_mode_16_64",
COMPUTE_PGM_RSRC1_FLOAT_DENORM_MODE_16_64);
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_PRIV)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_PRIV);
if (!isGFX12Plus())
PRINT_DIRECTIVE(".amdhsa_dx10_clamp",
COMPUTE_PGM_RSRC1_GFX6_GFX11_ENABLE_DX10_CLAMP);
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_DEBUG_MODE)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_DEBUG_MODE);
if (!isGFX12Plus())
PRINT_DIRECTIVE(".amdhsa_ieee_mode",
COMPUTE_PGM_RSRC1_GFX6_GFX11_ENABLE_IEEE_MODE);
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_BULKY)
- return MCDisassembler::Fail;
-
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_CDBG_USER)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_BULKY);
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC1_CDBG_USER);
if (isGFX9Plus())
PRINT_DIRECTIVE(".amdhsa_fp16_overflow", COMPUTE_PGM_RSRC1_GFX9_PLUS_FP16_OVFL);
if (!isGFX9Plus())
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_GFX6_GFX8_RESERVED0)
- return MCDisassembler::Fail;
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_RESERVED1)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC1_GFX6_GFX8_RESERVED0,
+ "COMPUTE_PGM_RSRC1", "must be zero pre-gfx9");
+
+ CHECK_RESERVED_BITS_DESC(COMPUTE_PGM_RSRC1_RESERVED1, "COMPUTE_PGM_RSRC1");
+
if (!isGFX10Plus())
- if (FourByteBuffer & COMPUTE_PGM_RSRC1_GFX6_GFX9_RESERVED2)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC1_GFX6_GFX9_RESERVED2,
+ "COMPUTE_PGM_RSRC1", "must be zero pre-gfx10");
if (isGFX10Plus()) {
PRINT_DIRECTIVE(".amdhsa_workgroup_processor_mode",
@@ -1925,14 +1962,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC2(
PRINT_DIRECTIVE(".amdhsa_system_vgpr_workitem_id",
COMPUTE_PGM_RSRC2_ENABLE_VGPR_WORKITEM_ID);
- if (FourByteBuffer & COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_ADDRESS_WATCH)
- return MCDisassembler::Fail;
-
- if (FourByteBuffer & COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_MEMORY)
- return MCDisassembler::Fail;
-
- if (FourByteBuffer & COMPUTE_PGM_RSRC2_GRANULATED_LDS_SIZE)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_ADDRESS_WATCH);
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_MEMORY);
+ CHECK_RESERVED_BITS(COMPUTE_PGM_RSRC2_GRANULATED_LDS_SIZE);
PRINT_DIRECTIVE(
".amdhsa_exception_fp_ieee_invalid_op",
@@ -1951,8 +1983,7 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC2(
PRINT_DIRECTIVE(".amdhsa_exception_int_div_zero",
COMPUTE_PGM_RSRC2_ENABLE_EXCEPTION_INT_DIVIDE_BY_ZERO);
- if (FourByteBuffer & COMPUTE_PGM_RSRC2_RESERVED0)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC(COMPUTE_PGM_RSRC2_RESERVED0, "COMPUTE_PGM_RSRC2");
return MCDisassembler::Success;
}
@@ -1966,11 +1997,13 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
KdStream << Indent << ".amdhsa_accum_offset "
<< (GET_FIELD(COMPUTE_PGM_RSRC3_GFX90A_ACCUM_OFFSET) + 1) * 4
<< '\n';
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX90A_RESERVED0)
- return MCDisassembler::Fail;
+
PRINT_DIRECTIVE(".amdhsa_tg_split", COMPUTE_PGM_RSRC3_GFX90A_TG_SPLIT);
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX90A_RESERVED1)
- return MCDisassembler::Fail;
+
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX90A_RESERVED0,
+ "COMPUTE_PGM_RSRC3", "must be zero on gfx90a");
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX90A_RESERVED1,
+ "COMPUTE_PGM_RSRC3", "must be zero on gfx90a");
} else if (isGFX10Plus()) {
// Bits [0-3].
if (!isGFX12Plus()) {
@@ -1983,8 +2016,9 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
COMPUTE_PGM_RSRC3_GFX10_GFX11_SHARED_VGPR_COUNT);
}
} else {
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX12_PLUS_RESERVED0)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX12_PLUS_RESERVED0,
+ "COMPUTE_PGM_RSRC3",
+ "must be zero on gfx12+");
}
// Bits [4-11].
@@ -1999,36 +2033,41 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
PRINT_PSEUDO_DIRECTIVE_COMMENT(
"INST_PREF_SIZE", COMPUTE_PGM_RSRC3_GFX12_PLUS_INST_PREF_SIZE);
} else {
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_RESERVED1)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_RESERVED1,
+ "COMPUTE_PGM_RSRC3",
+ "must be zero on gfx10");
}
// Bits [12].
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED2)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED2,
+ "COMPUTE_PGM_RSRC3", "must be zero on gfx10+");
// Bits [13].
if (isGFX12Plus()) {
PRINT_PSEUDO_DIRECTIVE_COMMENT("GLG_EN",
COMPUTE_PGM_RSRC3_GFX12_PLUS_GLG_EN);
} else {
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_GFX11_RESERVED3)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_GFX11_RESERVED3,
+ "COMPUTE_PGM_RSRC3",
+ "must be zero on gfx10 or gfx11");
}
// Bits [14-30].
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED4)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_PLUS_RESERVED4,
+ "COMPUTE_PGM_RSRC3", "must be zero on gfx10+");
// Bits [31].
if (isGFX11Plus()) {
PRINT_PSEUDO_DIRECTIVE_COMMENT("IMAGE_OP",
COMPUTE_PGM_RSRC3_GFX11_PLUS_IMAGE_OP);
} else {
- if (FourByteBuffer & COMPUTE_PGM_RSRC3_GFX10_RESERVED5)
- return MCDisassembler::Fail;
+ CHECK_RESERVED_BITS_DESC_MSG(COMPUTE_PGM_RSRC3_GFX10_RESERVED5,
+ "COMPUTE_PGM_RSRC3",
+ "must be zero on gfx10");
}
} else if (FourByteBuffer) {
+ *CommentStream
+ << "kernel descriptor COMPUTE_PGM_RSRC3 must be all zero before gfx9\n";
return MCDisassembler::Fail;
}
return MCDisassembler::Success;
@@ -2036,6 +2075,30 @@ MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeCOMPUTE_PGM_RSRC3(
#undef PRINT_PSEUDO_DIRECTIVE_COMMENT
#undef PRINT_DIRECTIVE
#undef GET_FIELD
+#undef CHECK_RESERVED_BITS_IMPL
+#undef CHECK_RESERVED_BITS
+#undef CHECK_RESERVED_BITS_MSG
+#undef CHECK_RESERVED_BITS_DESC
+#undef CHECK_RESERVED_BITS_DESC_MSG
+
+static void printReservedKDBitsError(raw_ostream *CommentStream, uint32_t Mask,
+ unsigned BaseInBytes, StringRef Msg = "") {
+ *CommentStream << "kernel descriptor reserved "
+ << getBitRangeFromMask(Mask, BaseInBytes) << " set";
+ if (!Msg.empty())
+ *CommentStream << ", " << Msg;
+ *CommentStream << '\n';
+}
+
+static void printReservedKDBytesError(raw_ostream *CommentStream,
+ unsigned BaseInBytes,
+ unsigned WidthInBytes) {
+ // Print an error comment in the same format as the "Kernel Descriptor"
+ // table here: https://llvm.org/docs/AMDGPUUsage.html#kernel-descriptor .
+ *CommentStream << "kernel descriptor reserved bits set in range "
+ << ((BaseInBytes + WidthInBytes) * CHAR_BIT - 1) << ":"
+ << (BaseInBytes * CHAR_BIT) << '\n';
+}
MCDisassembler::DecodeStatus
AMDGPUDisassembler::decodeKernelDescriptorDirective(
@@ -2080,6 +2143,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
ReservedBytes = DE.getBytes(Cursor, 4);
for (int I = 0; I < 4; ++I) {
if (ReservedBytes[I] != 0) {
+ printReservedKDBytesError(CommentStream, amdhsa::RESERVED0_OFFSET, 4);
return MCDisassembler::Fail;
}
}
@@ -2097,6 +2161,7 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
ReservedBytes = DE.getBytes(Cursor, 20);
for (int I = 0; I < 20; ++I) {
if (ReservedBytes[I] != 0) {
+ printReservedKDBytesError(CommentStream, amdhsa::RESERVED1_OFFSET, 20);
return MCDisassembler::Fail;
}
}
@@ -2135,12 +2200,18 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
PRINT_DIRECTIVE(".amdhsa_user_sgpr_private_segment_size",
KERNEL_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE);
- if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED0)
+ if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED0) {
+ printReservedKDBitsError(CommentStream, KERNEL_CODE_PROPERTY_RESERVED0,
+ amdhsa::KERNEL_CODE_PROPERTIES_OFFSET);
return MCDisassembler::Fail;
+ }
// Reserved for GFX9
if (isGFX9() &&
(TwoByteBuffer & KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32)) {
+ printReservedKDBitsError(
+ CommentStream, KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32,
+ amdhsa::KERNEL_CODE_PROPERTIES_OFFSET, "must be zero on gfx9");
return MCDisassembler::Fail;
} else if (isGFX10Plus()) {
PRINT_DIRECTIVE(".amdhsa_wavefront_size32",
@@ -2151,8 +2222,11 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
PRINT_DIRECTIVE(".amdhsa_uses_dynamic_stack",
KERNEL_CODE_PROPERTY_USES_DYNAMIC_STACK);
- if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED1)
+ if (TwoByteBuffer & KERNEL_CODE_PROPERTY_RESERVED1) {
+ printReservedKDBitsError(CommentStream, KERNEL_CODE_PROPERTY_RESERVED1,
+ amdhsa::KERNEL_CODE_PROPERTIES_OFFSET);
return MCDisassembler::Fail;
+ }
return MCDisassembler::Success;
@@ -2174,8 +2248,10 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
// 4 bytes from here are reserved, must be 0.
ReservedBytes = DE.getBytes(Cursor, 4);
for (int I = 0; I < 4; ++I) {
- if (ReservedBytes[I] != 0)
+ if (ReservedBytes[I] != 0) {
+ printReservedKDBytesError(CommentStream, amdhsa::RESERVED3_OFFSET, 4);
return MCDisassembler::Fail;
+ }
}
return MCDisassembler::Success;
@@ -2189,8 +2265,10 @@ AMDGPUDisassembler::decodeKernelDescriptorDirective(
MCDisassembler::DecodeStatus AMDGPUDisassembler::decodeKernelDescriptor(
StringRef KdName, ArrayRef<uint8_t> Bytes, uint64_t KdAddress) const {
// CP microcode requires the kernel descriptor to be 64 aligned.
- if (Bytes.size() != 64 || KdAddress % 64 != 0)
+ if (Bytes.size() != 64 || KdAddress % 64 != 0) {
+ *CommentStream << "kernel descriptor must be 64-byte aligned\n";
return MCDisassembler::Fail;
+ }
// FIXME: We can't actually decode "in order" as is done below, as e.g. GFX10
// requires us to know the setting of .amdhsa_wavefront_size32 in order to
@@ -2235,9 +2313,12 @@ AMDGPUDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
// Fix the spurious symbol issue for AMDGPU kernels. Exists for both Code
// Object V2 and V3 when symbols are marked protected.
+ SaveAndRestore CommentStreamRAII(CommentStream, &CStream);
+
// amd_kernel_code_t for Code Object V2.
if (Symbol.Type == ELF::STT_AMDGPU_HSA_KERNEL) {
Size = 256;
+ *CommentStream << "code object v2 is not supported";
return MCDisassembler::Fail;
}
@@ -2247,6 +2328,7 @@ AMDGPUDisassembler::onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size,
Size = 64; // Size = 64 regardless of success or failure.
return decodeKernelDescriptor(Name.drop_back(3), Bytes, Address);
}
+
return std::nullopt;
}
diff --git a/llvm/test/MC/Disassembler/AMDGPU/kernel-descriptor-errors.test b/llvm/test/MC/Disassembler/AMDGPU/kernel-descriptor-errors.test
new file mode 100644
index 00000000000000..bdfd70f8a022d1
--- /dev/null
+++ b/llvm/test/MC/Disassembler/AMD...
[truncated]
|
@@ -160,7 +161,7 @@ class MCDisassembler { | |||
/// case. | |||
virtual std::optional<DecodeStatus> | |||
onSymbolStart(SymbolInfoTy &Symbol, uint64_t &Size, ArrayRef<uint8_t> Bytes, | |||
uint64_t Address, raw_ostream &CStream) const; | |||
uint64_t Address, raw_ostream &ErrS) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have an error string stream, probably should convert this to use Error return instead of optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an Error
to represent errors seems like a nice refactor, but an Error return isn't sufficient for this function. It has three cases: a) nullopt, meaning the disassembler didn't have any handling for this symbol, b) Fail, meaning an error was encountered, and c) Success. b & c both indicate to objdump that some bytes were consumed, so Size
must be set. In the update I changed the return type to bool, where true means the target disassembler had some handling for this symbol, with an Error out param to indicate whether it contained errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two small nits, otherwise LGTM! Thank you for making the errors so descriptive, it is a massive improvement for the exceptional cases where the tool is the most useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only taken a look at the llvm-objdump part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm-objdump parts LGTM.
These mostly are checking for various reserved bits being set. The diagnostics for gpu-dependent reserved bits have a bit more context since they seem like the most likely ones to be observed in practice.