[lldb] Add support for using integral const static data members in th…

…e expression evaluator This adds support for using const static integral data members as described by C++11 [class.static.data]p3 to LLDB's expression evaluator. So far LLDB treated these data members are normal static variables. They already work as intended when they are declared in the class definition and then defined in a namespace scope. However, if they are declared and initialised in the class definition but never defined in a namespace scope, all LLDB expressions that use them will fail to link when LLDB can't find the respective symbol for the variable. The reason for this is that the data members which are only declared in the class are not emitted into any object file so LLDB can never resolve them. Expressions that use these variables are expected to directly use their constant value if possible. Clang can do this for us during codegen, but it requires that we add the constant value to the VarDecl we generate for these data members. This patch implements this by: * parsing the constant values from the debug info and adding it to variable declarations we encounter. * ensuring that LLDB doesn't implicitly try to take the address of expressions that might be an lvalue that points to such a special data member. The second change is caused by LLDB's way of storing lvalues in the expression parser. When LLDB parses an expression, it tries to keep the result around via two mechanisms: 1. For lvalues, LLDB generates a static pointer variable and stores the address of the last expression in it: `T *$__lldb_expr_result_ptr = &LastExpression` 2. For everything else, LLDB generates a static variable of the same type as the last expression and then direct initialises that variable: `T $__lldb_expr_result(LastExpression)` If we try to print a special const static data member via something like `expr Class::Member`, then LLDB will try to take the address of this expression as it's an lvalue. This means LLDB will try to take the address of the variable which causes that Clang can't replace the use with the constant value. There isn't any good way to detect this case (as there a lot of different expressions that could yield an lvalue that points to such a data member), so this patch also changes that we only use the first way of capturing the result if the last expression does not have a type that could potentially indicate it's coming from such a special data member. This change shouldn't break most workflows for users. The only observable side effect I could find is that the implicit persistent result variables for const int's now have their own memory address: Before this change: ``` (lldb) p i (const int) $0 = 123 (lldb) p &$0 (const int *) $1 = 0x00007ffeefbff8e8 (lldb) p &i (const int *) $2 = 0x00007ffeefbff8e8 ``` After this change we capture `i` by value so it has its own value. ``` (lldb) p i (const int) $0 = 123 (lldb) p &$0 (const int *) $1 = 0x0000000100155320 (lldb) p &i (const int *) $2 = 0x00007ffeefbff8e8 ``` Reviewed By: Michael137 Differential Revision: https://reviews.llvm.org/D81471
llvm · Jul 14, 2022 · 4867872 · 4867872
1 parent 2619ce8
commit 4867872
Show file tree

Hide file tree

Showing 11 changed files with 497 additions and 4 deletions.
diff --git a/lldb/source/Plugins/ExpressionParser/Clang/ASTResultSynthesizer.cpp b/lldb/source/Plugins/ExpressionParser/Clang/ASTResultSynthesizer.cpp
@@ -197,6 +197,27 @@ bool ASTResultSynthesizer::SynthesizeObjCMethodResult(
   return ret;
 }
 
+/// Returns true if LLDB can take the address of the given lvalue for the sake
+/// of capturing the expression result. Returns false if LLDB should instead
+/// store the expression result in a result variable.
+static bool CanTakeAddressOfLValue(const Expr *lvalue_expr) {
+  assert(lvalue_expr->getValueKind() == VK_LValue &&
+         "lvalue_expr not a lvalue");
+
+  QualType qt = lvalue_expr->getType();
+  // If the lvalue has const-qualified non-volatile integral or enum type, then
+  // the underlying value might come from a const static data member as
+  // described in C++11 [class.static.data]p3. If that's the case, then the
+  // value might not have an address if the user didn't also define the member
+  // in a namespace scope. Taking the address would cause that LLDB later fails
+  // to link the expression, so those lvalues should be stored in a result
+  // variable.
+  if (qt->isIntegralOrEnumerationType() && qt.isConstQualified() &&
+      !qt.isVolatileQualified())
+    return false;
+  return true;
+}
+
 bool ASTResultSynthesizer::SynthesizeBodyResult(CompoundStmt *Body,
                                                 DeclContext *DC) {
   Log *log = GetLog(LLDBLog::Expressions);
@@ -265,6 +286,10 @@ bool ASTResultSynthesizer::SynthesizeBodyResult(CompoundStmt *Body,
   //   - During dematerialization, $0 is marked up as a load address with value
   //     equal to the contents of the structure entry.
   //
+  //   - Note: if we cannot take an address of the resulting Lvalue (e.g. it's
+  //     a static const member without an out-of-class definition), then we
+  //     follow the Rvalue route.
+  //
   // For Rvalues
   //
   //   - In AST result synthesis the expression E is transformed into an
@@ -304,7 +329,7 @@ bool ASTResultSynthesizer::SynthesizeBodyResult(CompoundStmt *Body,
 
   clang::VarDecl *result_decl = nullptr;
 
-  if (is_lvalue) {
+  if (is_lvalue && CanTakeAddressOfLValue(last_expr)) {
     IdentifierInfo *result_ptr_id;
 
     if (expr_type->isFunctionType())

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2390,6 +2390,7 @@ struct MemberAttributes {
   uint64_t data_bit_offset = UINT64_MAX;
   AccessType accessibility = eAccessNone;
   llvm::Optional<uint64_t> byte_size;
+  llvm::Optional<DWARFFormValue> const_value_form;
   DWARFFormValue encoding_form;
   /// Indicates the byte offset of the word from the base address of the
   /// structure.
@@ -2436,6 +2437,9 @@ MemberAttributes::MemberAttributes(const DWARFDIE &die,
       case DW_AT_byte_size:
         byte_size = form_value.Unsigned();
         break;
+      case DW_AT_const_value:
+        const_value_form = form_value;
+        break;
       case DW_AT_data_bit_offset:
         data_bit_offset = form_value.Unsigned();
         break;
@@ -2587,12 +2591,65 @@ void DWARFASTParserClang::ParseObjCProperty(
       propAttrs.prop_getter_name, propAttrs.prop_attributes, &metadata));
 }
 
+llvm::Expected<llvm::APInt> DWARFASTParserClang::ExtractIntFromFormValue(
+    const CompilerType &int_type, const DWARFFormValue &form_value) const {
+  clang::QualType qt = ClangUtil::GetQualType(int_type);
+  assert(qt->isIntegralOrEnumerationType());
+  TypeSystemClang &ts = *llvm::cast<TypeSystemClang>(int_type.GetTypeSystem());
+  clang::ASTContext &ast = ts.getASTContext();
+
+  const unsigned type_bits = ast.getIntWidth(qt);
+  const bool is_unsigned = qt->isUnsignedIntegerType();
+
+  // The maximum int size supported at the moment by this function. Limited
+  // by the uint64_t return type of DWARFFormValue::Signed/Unsigned.
+  constexpr std::size_t max_bit_size = 64;
+
+  // For values bigger than 64 bit (e.g. __int128_t values),
+  // DWARFFormValue's Signed/Unsigned functions will return wrong results so
+  // emit an error for now.
+  if (type_bits > max_bit_size) {
+    auto msg = llvm::formatv("Can only parse integers with up to {0} bits, but "
+                             "given integer has {1} bits.",
+                             max_bit_size, type_bits);
+    return llvm::createStringError(llvm::inconvertibleErrorCode(), msg.str());
+  }
+
+  // Construct an APInt with the maximum bit size and the given integer.
+  llvm::APInt result(max_bit_size, form_value.Unsigned(), !is_unsigned);
+
+  // Calculate how many bits are required to represent the input value.
+  // For unsigned types, take the number of active bits in the APInt.
+  // For signed types, ask APInt how many bits are required to represent the
+  // signed integer.
+  const unsigned required_bits =
+      is_unsigned ? result.getActiveBits() : result.getMinSignedBits();
+
+  // If the input value doesn't fit into the integer type, return an error.
+  if (required_bits > type_bits) {
+    std::string value_as_str = is_unsigned
+                                   ? std::to_string(form_value.Unsigned())
+                                   : std::to_string(form_value.Signed());
+    auto msg = llvm::formatv("Can't store {0} value {1} in integer with {2} "
+                             "bits.",
+                             (is_unsigned ? "unsigned" : "signed"),
+                             value_as_str, type_bits);
+    return llvm::createStringError(llvm::inconvertibleErrorCode(), msg.str());
+  }
+
+  // Trim the result to the bit width our the int type.
+  if (result.getBitWidth() > type_bits)
+    result = result.trunc(type_bits);
+  return result;
+}
+
 void DWARFASTParserClang::ParseSingleMember(
     const DWARFDIE &die, const DWARFDIE &parent_die,
     const lldb_private::CompilerType &class_clang_type,
     lldb::AccessType default_accessibility,
     lldb_private::ClangASTImporter::LayoutInfo &layout_info,
     FieldInfo &last_field_info) {
+  Log *log = GetLog(DWARFLog::TypeCompletion | DWARFLog::Lookups);
   // This function can only parse DW_TAG_member.
   assert(die.Tag() == DW_TAG_member);
 
@@ -2623,9 +2680,27 @@ void DWARFASTParserClang::ParseSingleMember(
     if (var_type) {
       if (attrs.accessibility == eAccessNone)
         attrs.accessibility = eAccessPublic;
-      TypeSystemClang::AddVariableToRecordType(
-          class_clang_type, attrs.name, var_type->GetForwardCompilerType(),
-          attrs.accessibility);
+      CompilerType ct = var_type->GetForwardCompilerType();
+      clang::VarDecl *v = TypeSystemClang::AddVariableToRecordType(
+          class_clang_type, attrs.name, ct, attrs.accessibility);
+      if (!v) {
+        LLDB_LOG(log, "Failed to add variable to the record type");
+        return;
+      }
+
+      bool unused;
+      // TODO: Support float/double static members as well.
+      if (!attrs.const_value_form || !ct.IsIntegerOrEnumerationType(unused))
+        return;
+      llvm::Expected<llvm::APInt> const_value_or_err =
+          ExtractIntFromFormValue(ct, *attrs.const_value_form);
+      if (!const_value_or_err) {
+        LLDB_LOG_ERROR(log, const_value_or_err.takeError(),
+                       "Failed to add const value to variable {1}: {0}",
+                       v->getQualifiedNameAsString());
+        return;
+      }
+      TypeSystemClang::SetIntegerInitializerForVariable(v, *const_value_or_err);
     }
     return;
   }

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
@@ -68,6 +68,22 @@ class DWARFASTParserClang : public DWARFASTParser {
 
   lldb_private::ClangASTImporter &GetClangASTImporter();
 
+  /// Extracts an value for a given Clang integer type from a DWARFFormValue.
+  ///
+  /// \param int_type The Clang type that defines the bit size and signedness
+  ///                 of the integer that should be extracted. Has to be either
+  ///                 an integer type or an enum type. For enum types the
+  ///                 underlying integer type will be considered as the
+  ///                 expected integer type that should be extracted.
+  /// \param form_value The DWARFFormValue that contains the integer value.
+  /// \return An APInt containing the same integer value as the given
+  ///         DWARFFormValue with the bit width of the given integer type.
+  ///         Returns an error if the value in the DWARFFormValue does not fit
+  ///         into the given integer type or the integer type isn't supported.
+  llvm::Expected<llvm::APInt>
+  ExtractIntFromFormValue(const lldb_private::CompilerType &int_type,
+                          const DWARFFormValue &form_value) const;
+
 protected:
   /// Protected typedefs and members.
   /// @{

diff --git a/lldb/test/API/lang/cpp/const_static_integral_member/Makefile b/lldb/test/API/lang/cpp/const_static_integral_member/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/lang/cpp/const_static_integral_member/TestConstStaticIntegralMember.py b/lldb/test/API/lang/cpp/const_static_integral_member/TestConstStaticIntegralMember.py
@@ -0,0 +1,94 @@
+"""
+Tests const static data members as specified by C++11 [class.static.data]p3.
+"""
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+class TestCase(TestBase):
+
+    mydir = TestBase.compute_mydir(__file__)
+
+    def test(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(self, "// break here",
+                                          lldb.SBFileSpec("main.cpp"))
+
+        # Test using a simple const static integer member.
+        self.expect_expr("A::int_val", result_value="1")
+
+        # Try accessing the int member via some expressions that still produce
+        # an lvalue.
+        self.expect_expr("a.int_val", result_value="1")
+        self.expect_expr("(A::int_val)", result_value="1")
+        self.expect_expr("+A::int_val", result_value="1")
+        self.expect_expr("1,A::int_val", result_value="1")
+        self.expect_expr("true ? A::int_val : A::int_val", result_value="1")
+
+        # Test a simple integer member that was also defined in a namespace
+        # scope and has an address.
+        self.expect_expr("A::int_val_with_address", result_value="2")
+
+        # Try to take the address of the data member. Should produce a linker
+        # error only if the member doesn't have an address.
+        self.expect("expr const int *i = &A::int_val; *i", error=True,
+                    substrs=["Couldn't lookup symbols:"])
+        self.expect_expr("const int *i = &A::int_val_with_address; *i",
+                         result_value="2")
+
+        # Test a bool member.
+        self.expect_expr("A::bool_val", result_value="true")
+
+        # Test that minimum and maximum values for each data type are right.
+        self.expect_expr("A::char_max == char_max", result_value="true")
+        self.expect_expr("A::uchar_max == uchar_max", result_value="true")
+        self.expect_expr("A::int_max == int_max", result_value="true")
+        self.expect_expr("A::uint_max == uint_max", result_value="true")
+        self.expect_expr("A::long_max == long_max", result_value="true")
+        self.expect_expr("A::ulong_max == ulong_max", result_value="true")
+        self.expect_expr("A::longlong_max == longlong_max", result_value="true")
+        self.expect_expr("A::ulonglong_max == ulonglong_max", result_value="true")
+
+        self.expect_expr("A::char_min == char_min", result_value="true")
+        self.expect_expr("A::uchar_min == uchar_min", result_value="true")
+        self.expect_expr("A::int_min == int_min", result_value="true")
+        self.expect_expr("A::uint_min == uint_min", result_value="true")
+        self.expect_expr("A::long_min == long_min", result_value="true")
+        self.expect_expr("A::ulong_min == ulong_min", result_value="true")
+        self.expect_expr("A::longlong_min == longlong_min", result_value="true")
+        self.expect_expr("A::ulonglong_min == ulonglong_min", result_value="true")
+
+        # Test an unscoped enum.
+        self.expect_expr("A::enum_val", result_value="enum_case2")
+        # Test an unscoped enum with an invalid enum case.
+        self.expect_expr("A::invalid_enum_val", result_value="enum_case1 | enum_case2 | 0x4")
+
+        # Test a scoped enum.
+        self.expect_expr("A::scoped_enum_val", result_value="scoped_enum_case2")
+        # Test an scoped enum with an invalid enum case.
+        self.expect_expr("A::invalid_scoped_enum_val", result_value="scoped_enum_case1 | 0x4")
+
+        # Test an enum with fixed underlying type.
+        self.expect_expr("A::scoped_char_enum_val", result_value="case2")
+        self.expect_expr("A::scoped_ll_enum_val_neg", result_value="case0")
+        self.expect_expr("A::scoped_ll_enum_val", result_value="case2")
+
+        # Try to take the address of a member that doesn't have one.
+        self.expect("expr const int *i = &A::int_val; *i", error=True,
+                    substrs=["Couldn't lookup symbols:"])
+
+    # dsymutil strips the debug info for classes that only have const static
+    # data members without a definition namespace scope.
+    @expectedFailureAll(debug_info=["dsym"])
+    def test_class_with_only_const_static(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(self, "// break here", lldb.SBFileSpec("main.cpp"))
+
+        self.expect_expr("ClassWithOnlyConstStatic::member", result_value="3")
+
+        # Test `constexpr static`.
+        self.expect_expr("ClassWithConstexprs::member", result_value="2")
+        self.expect_expr("ClassWithConstexprs::enum_val", result_value="enum_case2")
+        self.expect_expr("ClassWithConstexprs::scoped_enum_val", result_value="scoped_enum_case2")
diff --git a/lldb/test/API/lang/cpp/const_static_integral_member/main.cpp b/lldb/test/API/lang/cpp/const_static_integral_member/main.cpp
@@ -0,0 +1,102 @@
+#include <limits>
+
+enum Enum {
+  enum_case1 = 1,
+  enum_case2 = 2,
+};
+
+enum class ScopedEnum {
+  scoped_enum_case1 = 1,
+  scoped_enum_case2 = 2,
+};
+
+enum class ScopedCharEnum : char {
+  case1 = 1,
+  case2 = 2,
+};
+
+enum class ScopedLongLongEnum : long long {
+  case0 = std::numeric_limits<long long>::min(),
+  case1 = 1,
+  case2 = std::numeric_limits<long long>::max(),
+};
+
+struct A {
+  const static int int_val = 1;
+  const static int int_val_with_address = 2;
+  const static bool bool_val = true;
+
+  const static auto char_max = std::numeric_limits<signed char>::max();
+  const static auto uchar_max = std::numeric_limits<unsigned char>::max();
+  const static auto int_max = std::numeric_limits<int>::max();
+  const static auto uint_max = std::numeric_limits<unsigned>::max();
+  const static auto long_max = std::numeric_limits<long>::max();
+  const static auto ulong_max = std::numeric_limits<unsigned long>::max();
+  const static auto longlong_max = std::numeric_limits<long long>::max();
+  const static auto ulonglong_max =
+      std::numeric_limits<unsigned long long>::max();
+
+  const static auto char_min = std::numeric_limits<char>::min();
+  const static auto uchar_min = std::numeric_limits<unsigned char>::min();
+  const static auto int_min = std::numeric_limits<int>::min();
+  const static auto uint_min = std::numeric_limits<unsigned>::min();
+  const static auto long_min = std::numeric_limits<long>::min();
+  const static auto ulong_min = std::numeric_limits<unsigned long>::min();
+  const static auto longlong_min = std::numeric_limits<long long>::min();
+  const static auto ulonglong_min =
+      std::numeric_limits<unsigned long long>::min();
+
+  const static Enum enum_val = enum_case2;
+  const static Enum invalid_enum_val = static_cast<Enum>(enum_case2 + 5);
+  const static ScopedEnum scoped_enum_val = ScopedEnum::scoped_enum_case2;
+  const static ScopedEnum invalid_scoped_enum_val = static_cast<ScopedEnum>(5);
+  const static ScopedCharEnum scoped_char_enum_val = ScopedCharEnum::case2;
+  const static ScopedLongLongEnum scoped_ll_enum_val_neg =
+      ScopedLongLongEnum::case0;
+  const static ScopedLongLongEnum scoped_ll_enum_val =
+      ScopedLongLongEnum::case2;
+};
+
+const int A::int_val_with_address;
+
+struct ClassWithOnlyConstStatic {
+  const static int member = 3;
+};
+
+struct ClassWithConstexprs {
+  constexpr static int member = 2;
+  constexpr static Enum enum_val = enum_case2;
+  constexpr static ScopedEnum scoped_enum_val = ScopedEnum::scoped_enum_case2;
+} cwc;
+
+int main() {
+  A a;
+
+  auto char_max = A::char_max;
+  auto uchar_max = A::uchar_max;
+  auto int_max = A::int_max;
+  auto uint_max = A::uint_max;
+  auto long_max = A::long_max;
+  auto ulong_max = A::ulong_max;
+  auto longlong_max = A::longlong_max;
+  auto ulonglong_max = A::ulonglong_max;
+
+  auto char_min = A::char_min;
+  auto uchar_min = A::uchar_min;
+  auto int_min = A::int_min;
+  auto uint_min = A::uint_min;
+  auto long_min = A::long_min;
+  auto ulong_min = A::ulong_min;
+  auto longlong_min = A::longlong_min;
+  auto ulonglong_min = A::ulonglong_min;
+
+  int member_copy = ClassWithOnlyConstStatic::member;
+
+  Enum e = A::enum_val;
+  e = A::invalid_enum_val;
+  ScopedEnum se = A::scoped_enum_val;
+  se = A::invalid_scoped_enum_val;
+  ScopedCharEnum sce = A::scoped_char_enum_val;
+  ScopedLongLongEnum sle = A::scoped_ll_enum_val;
+  return 0; // break here
+}
diff --git a/lldb/test/API/lang/cpp/const_static_integral_member_int128/Makefile b/lldb/test/API/lang/cpp/const_static_integral_member_int128/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules