
[lldb] Fix printing of unsigned enum bitfields when they contain the max value #96202

Merged
merged 1 commit into from
Jul 3, 2024

Conversation

DavidSpickett
Collaborator

While testing register fields I found that if you put the max value into a bitfield whose underlying type is an unsigned enum, lldb would not print the enum name.

This is because the code that matches values to names wasn't checking whether the enum's type was signed; it just assumed it was.

So, for example, a 2-bit field with value 3 got sign-extended to -1, which didn't match the enumerator value of 3, so lldb printed the number instead of the name.

For a value of 1, the top bit was 0, so the sign extension became a zero extension and lldb did print the name of the enumerator.

I added a new test because I needed to use C++ to get typed enums. It checks the min, max, and an in-between value for signed and unsigned enums applied to a bitfield.
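To make the mismatch concrete, here is a minimal standalone sketch (not part of the patch; the helper below just mirrors what llvm::SignExtend64 does): the raw bits 0b11 read from a 2-bit bitfield of an unsigned enum only match the enumerator value 3 when they are zero-extended rather than sign-extended.

```
// Minimal sketch, not part of the patch: how a 2-bit bitfield holding the
// value 3 compares against its enumerators under the two interpretations.
#include <cstdint>
#include <cstdio>

// Mirrors llvm::SignExtend64(x, bits): sign-extend the low `bits` bits of x.
static int64_t sign_extend64(uint64_t x, unsigned bits) {
  const uint64_t sign_bit = 1ULL << (bits - 1);
  return static_cast<int64_t>((x ^ sign_bit) - sign_bit);
}

int main() {
  const uint64_t raw = 3; // bits read out of a 2-bit unsigned enum bitfield

  // Old behaviour: always sign-extend, so the value becomes -1 and never
  // equals the enumerator value 3, so lldb printed the raw number.
  std::printf("as signed:   %lld\n", (long long)sign_extend64(raw, 2));

  // Fixed behaviour: zero-extend for unsigned enums, so the value stays 3
  // and matches the enumerator, so lldb prints the name.
  std::printf("as unsigned: %llu\n", (unsigned long long)raw);
  return 0;
}
```

This is the distinction the patch introduces by checking isSignedIntegerOrEnumerationType() before choosing between GetMaxS64Bitfield and GetMaxU64Bitfield.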

@llvmbot llvmbot added the lldb label Jun 20, 2024
@DavidSpickett DavidSpickett changed the title [lldb] Fix printing of unsigned enums bitfields when they contain the max value [lldb] Fix printing of unsigned enum bitfields when they contain the max value Jun 20, 2024
@llvmbot
Collaborator

llvmbot commented Jun 20, 2024

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Full diff: https://github.com/llvm/llvm-project/pull/96202.diff

4 Files Affected:

  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+13-5)
  • (added) lldb/test/API/commands/expression/bitfield_enums/Makefile (+3)
  • (added) lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py (+31)
  • (added) lldb/test/API/commands/expression/bitfield_enums/main.cpp (+24)
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index 0270c67366d1d..0fdc0ff150d30 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -8639,8 +8639,13 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
   const clang::EnumDecl *enum_decl = enutype->getDecl();
   assert(enum_decl);
   lldb::offset_t offset = byte_offset;
-  const uint64_t enum_svalue = data.GetMaxS64Bitfield(
-      &offset, byte_size, bitfield_bit_size, bitfield_bit_offset);
+  bool qual_type_is_signed = qual_type->isSignedIntegerOrEnumerationType();
+  const uint64_t enum_svalue =
+      qual_type_is_signed
+          ? data.GetMaxS64Bitfield(&offset, byte_size, bitfield_bit_size,
+                                   bitfield_bit_offset)
+          : data.GetMaxU64Bitfield(&offset, byte_size, bitfield_bit_size,
+                                   bitfield_bit_offset);
   bool can_be_bitfield = true;
   uint64_t covered_bits = 0;
   int num_enumerators = 0;
@@ -8652,8 +8657,11 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
   // enumerators. Also 0 doesn't make sense when the enumerators are used as
   // flags.
   for (auto *enumerator : enum_decl->enumerators()) {
-    uint64_t val = enumerator->getInitVal().getSExtValue();
-    val = llvm::SignExtend64(val, 8*byte_size);
+    llvm::APSInt init_val = enumerator->getInitVal();
+    uint64_t val =
+        qual_type_is_signed ? init_val.getSExtValue() : init_val.getZExtValue();
+    if (qual_type_is_signed)
+      val = llvm::SignExtend64(val, 8 * byte_size);
     if (llvm::popcount(val) != 1 && (val & ~covered_bits) != 0)
       can_be_bitfield = false;
     covered_bits |= val;
@@ -8673,7 +8681,7 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
   // No exact match, but we don't think this is a bitfield. Print the value as
   // decimal.
   if (!can_be_bitfield) {
-    if (qual_type->isSignedIntegerOrEnumerationType())
+    if (qual_type_is_signed)
       s.Printf("%" PRIi64, enum_svalue);
     else
       s.Printf("%" PRIu64, enum_uvalue);
diff --git a/lldb/test/API/commands/expression/bitfield_enums/Makefile b/lldb/test/API/commands/expression/bitfield_enums/Makefile
new file mode 100644
index 0000000000000..99998b20bcb05
--- /dev/null
+++ b/lldb/test/API/commands/expression/bitfield_enums/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py b/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py
new file mode 100644
index 0000000000000..a484b69300e7b
--- /dev/null
+++ b/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py
@@ -0,0 +1,31 @@
+"""
+Test that the expression parser accounts for the underlying type of bitfield
+enums when looking for matching values.
+"""
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class TestBitfieldEnum(TestBase):
+    def test_bitfield_enums(self):
+        self.build()
+
+        lldbutil.run_to_source_breakpoint(
+            self, "// break here", lldb.SBFileSpec("main.cpp", False)
+        )
+
+        self.expect_expr(
+            "bfs",
+            result_type="BitfieldStruct",
+            result_children=[
+                ValueCheck(name="signed_min", value="min"),
+                ValueCheck(name="signed_other", value="-1"),
+                ValueCheck(name="signed_max", value="max"),
+                ValueCheck(name="unsigned_min", value="min"),
+                ValueCheck(name="unsigned_other", value="1"),
+                ValueCheck(name="unsigned_max", value="max"),
+            ],
+        )
diff --git a/lldb/test/API/commands/expression/bitfield_enums/main.cpp b/lldb/test/API/commands/expression/bitfield_enums/main.cpp
new file mode 100644
index 0000000000000..f6c53b3100b93
--- /dev/null
+++ b/lldb/test/API/commands/expression/bitfield_enums/main.cpp
@@ -0,0 +1,24 @@
+enum class SignedEnum : int { min = -2, max = 1 };
+enum class UnsignedEnum : unsigned { min = 0, max = 3 };
+
+struct BitfieldStruct {
+  SignedEnum signed_min : 2;
+  SignedEnum signed_other : 2;
+  SignedEnum signed_max : 2;
+  UnsignedEnum unsigned_min : 2;
+  UnsignedEnum unsigned_other : 2;
+  UnsignedEnum unsigned_max : 2;
+};
+
+int main() {
+  BitfieldStruct bfs;
+  bfs.signed_min = SignedEnum::min;
+  bfs.signed_other = static_cast<SignedEnum>(-1);
+  bfs.signed_max = SignedEnum::max;
+
+  bfs.unsigned_min = UnsignedEnum::min;
+  bfs.unsigned_other = static_cast<UnsignedEnum>(1);
+  bfs.unsigned_max = UnsignedEnum::max;
+
+  return 0; // break here
+}

ValueCheck(name="signed_max", value="max"),
ValueCheck(name="unsigned_min", value="min"),
ValueCheck(name="unsigned_other", value="1"),
ValueCheck(name="unsigned_max", value="max"),
Collaborator Author

@DavidSpickett DavidSpickett Jun 20, 2024


Prior to this change, this printed 3 instead of max.

Member

@Michael137 Michael137 left a comment


good catch, LGTM

@DavidSpickett
Collaborator Author

Thanks for all the reviews!

@DavidSpickett DavidSpickett merged commit dde3f17 into llvm:main Jul 3, 2024
8 checks passed
@DavidSpickett DavidSpickett deleted the lldb-bfenums branch July 3, 2024 13:30
kirillpyasecky pushed a commit to kirillpyasecky/llvm-project that referenced this pull request Jul 3, 2024
…max value (llvm#96202)

kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
…max value (llvm#96202)

@Michael137
Member

FYI, the new test seems to be failing on the matrix LLDB bot that's testing DWARFv2:

======================================================================
FAIL: test_bitfield_enums_dsym (TestBitfieldEnums.TestBitfieldEnum)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1756, in test_method
    return attrvalue(self)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py", line 20, in test_bitfield_enums
    self.expect_expr(
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2512, in expect_expr
    value_check.check_value(self, eval_result, str(eval_result))
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 326, in check_value
    self.check_value_children(test_base, val, error_msg)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 349, in check_value_children
    expected_child.check_value(test_base, actual_child, child_error)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 311, in check_value
    test_base.assertEqual(self.expect_value, val.GetValue(), this_error_msg)
AssertionError: 'max' != '-1'
- max
+ -1
 : Checking child with index 5:
(BitfieldStruct) $0 = {
  signed_min = min
  signed_other = -1
  signed_max = max
  unsigned_min = min
  unsigned_other = 1
  unsigned_max = -1
}
Checking SBValue: (UnsignedEnum:2) unsigned_max = -1

https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake-matrix/459/execution/node/59/log/?consoleFull

(both the dSYM and non-dSYM variants of the test are failing)

AFAICT, this seems to just be an existing bug in the DWARFv2 support that this test uncovered.

With my system LLDB (compiling the test with -gdwarf-2):

(lldb) v bfs
(BitfieldStruct) bfs = {
  signed_min = min
  signed_other = -1
  signed_max = max
  unsigned_min = min
  unsigned_other = 1
  unsigned_max = -1
}

@DavidSpickett
Collaborator Author

DavidSpickett commented Jul 9, 2024

Clang's DWARF v2 output is missing DW_AT_type. With g++ I see that attribute (and more) with both v2 and v3.

v2:

0x00000043:   DW_TAG_enumeration_type
                DW_AT_name      ("UnsignedEnum")
                DW_AT_byte_size (0x04)
                DW_AT_decl_file ("/tmp/test.cpp")
                DW_AT_decl_line (2)

v3:

0x0000004e:   DW_TAG_enumeration_type
                DW_AT_type      (0x00000067 "unsigned int")
                DW_AT_name      ("UnsignedEnum")
                DW_AT_byte_size (0x04)
                DW_AT_decl_file ("/tmp/test.cpp")
                DW_AT_decl_line (2)

Will skip it while I figure that out.

DavidSpickett added a commit that referenced this pull request Jul 9, 2024
Clang's v2 output appears to be missing a key DW_AT_type attribute,
and this causes the "max" of the unsigned enum to appear as -1 instead
of "max" aka 3.

```
(BitfieldStruct) $0 = {
  signed_min = min
  signed_other = -1
  signed_max = max
  unsigned_min = min
  unsigned_other = 1
  unsigned_max = -1
}
```

Test added by #96202.
@DavidSpickett
Collaborator Author

Clang's decision to emit DW_AT_type or not goes back to https://reviews.llvm.org/D42734 and in particular this code:

void DwarfUnit::constructEnumTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
  const DIType *DTy = resolve(CTy->getBaseType());
  bool IsUnsigned = DTy && isUnsignedDIType(DD, DTy);
  if (DTy && DD->getDwarfVersion() >= 3)
    addType(Buffer, DTy);

DWARF v2 (https://dwarfstd.org/doc/dwarf-2.0.0.pdf) does have the concept of DW_AT_type but it does not mention that it may be found in enumeration types.

"5.6 Enumeration Type Entries
<...>
Each enumerator entry has a DW_AT_name attribute, whose value is a null-terminated string
containing the name of the enumeration literal as it appears in the source program. Each
enumerator entry also has a DW_AT_const_value attribute, whose value is the actual numeric
value of the enumerator as represented on the target system."

So perhaps you could infer the type from the enumerators, but that is ambiguous if all the enumerators are positive and within a signed int's range. If the debugger is in C/C++ mode, it's reasonable for it to assume signed int most of the time.
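For illustration only (these two enums are not from the test or the DWARF dumps above): with only enumerator names and values to go on, the two declarations below are indistinguishable, which is why assuming signed int is the pragmatic default.

```
// Illustration only: under DWARF v2, where the enumeration DIE carries no
// DW_AT_type, both of these emit the same enumerator list {a = 0, b = 1} and
// the same DW_AT_byte_size, so a consumer cannot recover the signedness of
// the underlying type from the debug info alone.
enum class FitsInSignedInt : int { a = 0, b = 1 };
enum class ActuallyUnsigned : unsigned { a = 0, b = 1 };
```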

In DWARF v3 (https://dwarfstd.org/doc/Dwarf3.pdf) "5.8 Enumeration Type Entries" we have an explicit mention of DW_AT_type.

"The enumeration type entry may also have a DW_AT_type attribute which refers to the
underlying data type used to implement the enumeration.

In C or C++, the underlying type will be the appropriate integral type determined by the
compiler from the properties of the enumeration literal values."

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16063 for details on what g++ did for this.

	* dwarf2out.c (gen_enumeration_type_die): Add DW_AT_type if DWARF
	version >= 3 or not strict DWARF.

If you pass -gstrict-dwarf to g++ then DW_AT_type is not emitted.

Clang does have that option, so we could do the same; let me see what -gstrict-dwarf handling we have currently.

DavidSpickett added a commit to DavidSpickett/llvm-project that referenced this pull request Jul 10, 2024
…rict DWARF v2 mode

DavidSpickett added a commit that referenced this pull request Jul 12, 2024
…rict DWARF v2 mode (#98335)

During testing of #96202 we
found that when clang, set to DWARF v2, was used to build the test file,
lldb could not tell that the unsigned enum type was in fact unsigned, so
it defaulted to signed and printed the wrong value.

The reason for this is that DWARFv2 does not include DW_AT_type in
DW_TAG_enumeration_type. This was added in DWARF v3:
"The enumeration type entry may also have a DW_AT_type attribute which
refers to the underlying data type used to implement the enumeration.

In C or C++, the underlying type will be the appropriate integral type
determined by the compiler from the properties of the enumeration
literal values."

I noticed that gcc does emit this attribute for DWARF v2 but not when
strict DWARF is requested (more details in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16063#c7).

This patch changes clang to do the same. This will improve the
experience of anyone using tools that understand the attribute but
who, for whatever reason, is stuck building binaries containing v2 only.

You can see a current clang/gcc comparison here:
https://godbolt.org/z/eG9Kc9WGf

https://reviews.llvm.org/D42734 added the original code that emitted
this for >= v3 only.
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
…rict DWARF v2 mode (llvm#98335)
