
[lldb] Fix printing of unsigned enum bitfields when they contain the max value #96202

Merged
merged 1 commit into from
Jul 3, 2024

Conversation

DavidSpickett
Collaborator

While testing register fields I found that if you put the max value into a bitfield whose underlying type is an unsigned enum, lldb would not print the enum name.

This is because the code that matches values to names wasn't checking whether the enum's type was signed; it just assumed it was.

So, for example, a 2-bit field with value 3 got sign-extended to -1, which didn't match the enumerator value of 3, so lldb printed the number instead of the name.

For a value of 1, the top bit was 0, so the sign extension became a zero extension and lldb did print the name of the enumerator.

I added a new test because I needed to use C++ to get typed enums. It checks the min, max, and an in-between value for signed and unsigned enums applied to a bitfield.
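To make the mismatch concrete, here is a minimal standalone sketch (not part of the patch; the helper below just mirrors what llvm::SignExtend64 does): the raw bits 0b11 read from a 2-bit bitfield of an unsigned enum only match the enumerator value 3 when they are zero-extended rather than sign-extended.

```
// Minimal sketch, not part of the patch: how a 2-bit bitfield holding the
// value 3 compares against its enumerators under the two interpretations.
#include <cstdint>
#include <cstdio>

// Mirrors llvm::SignExtend64(x, bits): sign-extend the low `bits` bits of x.
static int64_t sign_extend64(uint64_t x, unsigned bits) {
  const uint64_t sign_bit = 1ULL << (bits - 1);
  return static_cast<int64_t>((x ^ sign_bit) - sign_bit);
}

int main() {
  const uint64_t raw = 3; // bits read out of a 2-bit unsigned enum bitfield

  // Old behaviour: always sign-extend, so the value becomes -1 and never
  // equals the enumerator value 3, so lldb printed the raw number.
  std::printf("as signed:   %lld\n", (long long)sign_extend64(raw, 2));

  // Fixed behaviour: zero-extend for unsigned enums, so the value stays 3
  // and matches the enumerator, so lldb prints the name.
  std::printf("as unsigned: %llu\n", (unsigned long long)raw);
  return 0;
}
```

This is the distinction the patch introduces by checking isSignedIntegerOrEnumerationType() before choosing between GetMaxS64Bitfield and GetMaxU64Bitfield.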

@llvmbot llvmbot added the lldb label Jun 20, 2024
@DavidSpickett DavidSpickett changed the title [lldb] Fix printing of unsigned enums bitfields when they contain the max value [lldb] Fix printing of unsigned enum bitfields when they contain the max value Jun 20, 2024
@llvmbot
Collaborator

llvmbot commented Jun 20, 2024

@llvm/pr-subscribers-lldb

Author: David Spickett (DavidSpickett)

Full diff: https://github.com/llvm/llvm-project/pull/96202.diff

4 Files Affected:

  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+13-5)
  • (added) lldb/test/API/commands/expression/bitfield_enums/Makefile (+3)
  • (added) lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py (+31)
  • (added) lldb/test/API/commands/expression/bitfield_enums/main.cpp (+24)
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index 0270c67366d1d..0fdc0ff150d30 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -8639,8 +8639,13 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
   const clang::EnumDecl *enum_decl = enutype->getDecl();
   assert(enum_decl);
   lldb::offset_t offset = byte_offset;
-  const uint64_t enum_svalue = data.GetMaxS64Bitfield(
-      &offset, byte_size, bitfield_bit_size, bitfield_bit_offset);
+  bool qual_type_is_signed = qual_type->isSignedIntegerOrEnumerationType();
+  const uint64_t enum_svalue =
+      qual_type_is_signed
+          ? data.GetMaxS64Bitfield(&offset, byte_size, bitfield_bit_size,
+                                   bitfield_bit_offset)
+          : data.GetMaxU64Bitfield(&offset, byte_size, bitfield_bit_size,
+                                   bitfield_bit_offset);
   bool can_be_bitfield = true;
   uint64_t covered_bits = 0;
   int num_enumerators = 0;
@@ -8652,8 +8657,11 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
   // enumerators. Also 0 doesn't make sense when the enumerators are used as
   // flags.
   for (auto *enumerator : enum_decl->enumerators()) {
-    uint64_t val = enumerator->getInitVal().getSExtValue();
-    val = llvm::SignExtend64(val, 8*byte_size);
+    llvm::APSInt init_val = enumerator->getInitVal();
+    uint64_t val =
+        qual_type_is_signed ? init_val.getSExtValue() : init_val.getZExtValue();
+    if (qual_type_is_signed)
+      val = llvm::SignExtend64(val, 8 * byte_size);
     if (llvm::popcount(val) != 1 && (val & ~covered_bits) != 0)
       can_be_bitfield = false;
     covered_bits |= val;
@@ -8673,7 +8681,7 @@ static bool DumpEnumValue(const clang::QualType &qual_type, Stream &s,
   // No exact match, but we don't think this is a bitfield. Print the value as
   // decimal.
   if (!can_be_bitfield) {
-    if (qual_type->isSignedIntegerOrEnumerationType())
+    if (qual_type_is_signed)
       s.Printf("%" PRIi64, enum_svalue);
     else
       s.Printf("%" PRIu64, enum_uvalue);
diff --git a/lldb/test/API/commands/expression/bitfield_enums/Makefile b/lldb/test/API/commands/expression/bitfield_enums/Makefile
new file mode 100644
index 0000000000000..99998b20bcb05
--- /dev/null
+++ b/lldb/test/API/commands/expression/bitfield_enums/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py b/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py
new file mode 100644
index 0000000000000..a484b69300e7b
--- /dev/null
+++ b/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py
@@ -0,0 +1,31 @@
+"""
+Test that the expression parser accounts for the underlying type of bitfield
+enums when looking for matching values.
+"""
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class TestBitfieldEnum(TestBase):
+    def test_bitfield_enums(self):
+        self.build()
+
+        lldbutil.run_to_source_breakpoint(
+            self, "// break here", lldb.SBFileSpec("main.cpp", False)
+        )
+
+        self.expect_expr(
+            "bfs",
+            result_type="BitfieldStruct",
+            result_children=[
+                ValueCheck(name="signed_min", value="min"),
+                ValueCheck(name="signed_other", value="-1"),
+                ValueCheck(name="signed_max", value="max"),
+                ValueCheck(name="unsigned_min", value="min"),
+                ValueCheck(name="unsigned_other", value="1"),
+                ValueCheck(name="unsigned_max", value="max"),
+            ],
+        )
diff --git a/lldb/test/API/commands/expression/bitfield_enums/main.cpp b/lldb/test/API/commands/expression/bitfield_enums/main.cpp
new file mode 100644
index 0000000000000..f6c53b3100b93
--- /dev/null
+++ b/lldb/test/API/commands/expression/bitfield_enums/main.cpp
@@ -0,0 +1,24 @@
+enum class SignedEnum : int { min = -2, max = 1 };
+enum class UnsignedEnum : unsigned { min = 0, max = 3 };
+
+struct BitfieldStruct {
+  SignedEnum signed_min : 2;
+  SignedEnum signed_other : 2;
+  SignedEnum signed_max : 2;
+  UnsignedEnum unsigned_min : 2;
+  UnsignedEnum unsigned_other : 2;
+  UnsignedEnum unsigned_max : 2;
+};
+
+int main() {
+  BitfieldStruct bfs;
+  bfs.signed_min = SignedEnum::min;
+  bfs.signed_other = static_cast<SignedEnum>(-1);
+  bfs.signed_max = SignedEnum::max;
+
+  bfs.unsigned_min = UnsignedEnum::min;
+  bfs.unsigned_other = static_cast<UnsignedEnum>(1);
+  bfs.unsigned_max = UnsignedEnum::max;
+
+  return 0; // break here
+}

ValueCheck(name="signed_max", value="max"),
ValueCheck(name="unsigned_min", value="min"),
ValueCheck(name="unsigned_other", value="1"),
ValueCheck(name="unsigned_max", value="max"),
Collaborator Author

@DavidSpickett DavidSpickett Jun 20, 2024


Prior to this change, this printed 3 instead of max.

Member

@Michael137 Michael137 left a comment


good catch, LGTM

@DavidSpickett
Collaborator Author

Thanks for all the reviews!

@DavidSpickett DavidSpickett merged commit dde3f17 into llvm:main Jul 3, 2024
8 checks passed
@DavidSpickett DavidSpickett deleted the lldb-bfenums branch July 3, 2024 13:30
kirillpyasecky pushed a commit to kirillpyasecky/llvm-project that referenced this pull request Jul 3, 2024
…max value (llvm#96202)

kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
…max value (llvm#96202)

@Michael137
Member

FYI, the new test seems to be failing on the matrix LLDB bot that's testing DWARFv2:

======================================================================
FAIL: test_bitfield_enums_dsym (TestBitfieldEnums.TestBitfieldEnum)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1756, in test_method
    return attrvalue(self)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/test/API/commands/expression/bitfield_enums/TestBitfieldEnums.py", line 20, in test_bitfield_enums
    self.expect_expr(
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2512, in expect_expr
    value_check.check_value(self, eval_result, str(eval_result))
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 326, in check_value
    self.check_value_children(test_base, val, error_msg)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 349, in check_value_children
    expected_child.check_value(test_base, actual_child, child_error)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 311, in check_value
    test_base.assertEqual(self.expect_value, val.GetValue(), this_error_msg)
AssertionError: 'max' != '-1'
- max
+ -1
 : Checking child with index 5:
(BitfieldStruct) $0 = {
  signed_min = min
  signed_other = -1
  signed_max = max
  unsigned_min = min
  unsigned_other = 1
  unsigned_max = -1
}
Checking SBValue: (UnsignedEnum:2) unsigned_max = -1

https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake-matrix/459/execution/node/59/log/?consoleFull

(both the dSYM and non-dSYM variants of the test are failing)

AFAICT, this seems to just be an existing bug in the DWARFv2 support that this test uncovered.

With my system LLDB (compiling the test with -gdwarf-2):

(lldb) v bfs
(BitfieldStruct) bfs = {
  signed_min = min
  signed_other = -1
  signed_max = max
  unsigned_min = min
  unsigned_other = 1
  unsigned_max = -1
}

@DavidSpickett
Collaborator Author

DavidSpickett commented Jul 9, 2024

Clang's DWARF v2 output is missing DW_AT_type. With g++ I see that attribute (and more) with both v2 and v3.

v2:

0x00000043:   DW_TAG_enumeration_type
                DW_AT_name      ("UnsignedEnum")
                DW_AT_byte_size (0x04)
                DW_AT_decl_file ("/tmp/test.cpp")
                DW_AT_decl_line (2)

v3:

0x0000004e:   DW_TAG_enumeration_type
                DW_AT_type      (0x00000067 "unsigned int")
                DW_AT_name      ("UnsignedEnum")
                DW_AT_byte_size (0x04)
                DW_AT_decl_file ("/tmp/test.cpp")
                DW_AT_decl_line (2)

Will skip it while I figure that out.

DavidSpickett added a commit that referenced this pull request Jul 9, 2024
Clang's v2 output appears to be missing a key DW_AT_type attribute,
and this causes the "max" of the unsigned enum to appear as -1 instead
of "max" aka 3.

```
(BitfieldStruct) $0 = {
  signed_min = min
  signed_other = -1
  signed_max = max
  unsigned_min = min
  unsigned_other = 1
  unsigned_max = -1
}
```

Test added by #96202.
@DavidSpickett
Collaborator Author

Clang's decision to emit DW_AT_type or not goes back to https://reviews.llvm.org/D42734 and in particular this code:

void DwarfUnit::constructEnumTypeDIE(DIE &Buffer, const DICompositeType *CTy) {
  const DIType *DTy = resolve(CTy->getBaseType());
  bool IsUnsigned = DTy && isUnsignedDIType(DD, DTy);
  if (DTy && DD->getDwarfVersion() >= 3)
    addType(Buffer, DTy);

DWARF v2 (https://dwarfstd.org/doc/dwarf-2.0.0.pdf) does have the concept of DW_AT_type but it does not mention that it may be found in enumeration types.

"5.6 Enumeration Type Entries
<...>
Each enumerator entry has a DW_AT_name attribute, whose value is a null-terminated string
containing the name of the enumeration literal as it appears in the source program. Each
enumerator entry also has a DW_AT_const_value attribute, whose value is the actual numeric
value of the enumerator as represented on the target system."

So perhaps you could infer the type from the enumerators, but that is ambiguous if all the enumerators are positive and within a signed int's range. If the debugger is in C/C++ mode, it's reasonable for it to assume signed int most of the time.
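For illustration only (these two enums are not from the test or the DWARF dumps above): with only enumerator names and values to go on, the two declarations below are indistinguishable, which is why assuming signed int is the pragmatic default.

```
// Illustration only: under DWARF v2, where the enumeration DIE carries no
// DW_AT_type, both of these emit the same enumerator list {a = 0, b = 1} and
// the same DW_AT_byte_size, so a consumer cannot recover the signedness of
// the underlying type from the debug info alone.
enum class FitsInSignedInt : int { a = 0, b = 1 };
enum class ActuallyUnsigned : unsigned { a = 0, b = 1 };
```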

In DWARF v3 (https://dwarfstd.org/doc/Dwarf3.pdf) "5.8 Enumeration Type Entries" we have an explicit mention of DW_AT_type.

"The enumeration type entry may also have a DW_AT_type attribute which refers to the
underlying data type used to implement the enumeration.

In C or C++, the underlying type will be the appropriate integral type determined by the
compiler from the properties of the enumeration literal values."

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16063 for details on what g++ did for this.

	* dwarf2out.c (gen_enumeration_type_die): Add DW_AT_type if DWARF
	version >= 3 or not strict DWARF.

If you pass -gstrict-dwarf to g++ then DW_AT_type is not emitted.

Clang does have that option, so we could do the same; let me see what -gstrict-dwarf handling we have currently.

DavidSpickett added a commit to DavidSpickett/llvm-project that referenced this pull request Jul 10, 2024
…rict DWARF v2 mode

DavidSpickett added a commit that referenced this pull request Jul 12, 2024
…rict DWARF v2 mode (#98335)

During testing of #96202 we
found that when clang, set to DWARF v2, was used to build the test file,
lldb could not tell that the unsigned enum type was in fact unsigned, so
it defaulted to signed and printed the wrong value.

The reason for this is that DWARFv2 does not include DW_AT_type in
DW_TAG_enumeration_type. This was added in DWARF v3:
"The enumeration type entry may also have a DW_AT_type attribute which
refers to the underlying data type used to implement the enumeration.

In C or C++, the underlying type will be the appropriate integral type
determined by the compiler from the properties of the enumeration
literal values."

I noticed that gcc does emit this attribute for DWARF v2 but not when
strict DWARF is requested (more details in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16063#c7).

This patch changes clang to do the same. This will improve the
experience of anyone using tools that understand the attribute but
who, for whatever reason, is stuck building binaries containing v2 only.

You can see a current clang/gcc comparison here:
https://godbolt.org/z/eG9Kc9WGf

https://reviews.llvm.org/D42734 added the original code that emitted
this for >= v3 only.
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
…rict DWARF v2 mode (llvm#98335)
