Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NFC] Add DIExpression::extractLeadingOffset #97719

Merged
merged 4 commits into from
Jul 8, 2024

Conversation

OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Jul 4, 2024

Patch [2/x] to fix structured bindings debug info in SROA.

It extracts a constant offset from the DIExpression if there is one and fills
RemainingOps with the ops that come after it.

This function will be used in a subsequent patch.


Based on top of #97705. Ignore the first commit, which is under review over there.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 4, 2024

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

Patch [2/x] to fix structured bindings debug info in SROA.

It extracts a constant offset from the DIExpression if there is one and fills
RemainingOps with the ops that come after it.

This function will be used in a subsequent patch.


Based on top of #97705. Ignore the first commit, which is under review over there.


Full diff: https://github.com/llvm/llvm-project/pull/97719.diff

6 Files Affected:

  • (added) llvm/include/llvm/IR/DbgVariableFragmentInfo.h (+45)
  • (modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+12-23)
  • (modified) llvm/include/llvm/IR/DebugProgramInstruction.h (+14)
  • (modified) llvm/lib/IR/DebugInfoMetadata.cpp (+39)
  • (modified) llvm/lib/IR/DebugProgramInstruction.cpp (+4)
  • (modified) llvm/unittests/IR/MetadataTest.cpp (+78)
diff --git a/llvm/include/llvm/IR/DbgVariableFragmentInfo.h b/llvm/include/llvm/IR/DbgVariableFragmentInfo.h
new file mode 100644
index 0000000000000..40326d5792f9f
--- /dev/null
+++ b/llvm/include/llvm/IR/DbgVariableFragmentInfo.h
@@ -0,0 +1,45 @@
+//===- llvm/IR/DbgVariableFragmentInfo.h ------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Helper struct to describe a fragment of a debug variable.
+//
+//===----------------------------------------------------------------------===//
+#ifndef LLVM_IR_DBGVARIABLEFRAGMENTINFO_H
+#define LLVM_IR_DBGVARIABLEFRAGMENTINFO_H
+
+#include <cstdint>
+
+namespace llvm {
+struct DbgVariableFragmentInfo {
+  DbgVariableFragmentInfo() = default;
+  DbgVariableFragmentInfo(uint64_t SizeInBits, uint64_t OffsetInBits)
+      : SizeInBits(SizeInBits), OffsetInBits(OffsetInBits) {}
+  uint64_t SizeInBits;
+  uint64_t OffsetInBits;
+  /// Return the index of the first bit of the fragment.
+  uint64_t startInBits() const { return OffsetInBits; }
+  /// Return the index of the bit after the end of the fragment, e.g. for
+  /// fragment offset=16 and size=32 return their sum, 48.
+  uint64_t endInBits() const { return OffsetInBits + SizeInBits; }
+
+  /// Returns a zero-sized fragment if A and B don't intersect.
+  static DbgVariableFragmentInfo intersect(DbgVariableFragmentInfo A,
+                                           DbgVariableFragmentInfo B) {
+    // Don't use std::max or min to avoid including <algorithm>.
+    uint64_t StartInBits =
+        A.OffsetInBits > B.OffsetInBits ? A.OffsetInBits : B.OffsetInBits;
+    uint64_t EndInBits =
+        A.endInBits() < B.endInBits() ? A.endInBits() : B.endInBits();
+    if (EndInBits <= StartInBits)
+      return {0, 0};
+    return DbgVariableFragmentInfo(EndInBits - StartInBits, StartInBits);
+  }
+};
+} // end namespace llvm
+
+#endif // LLVM_IR_DBGVARIABLEFRAGMENTINFO_H
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index 524945862e8d4..a316c7bde64b0 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -21,6 +21,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/IR/Constants.h"
+#include "llvm/IR/DbgVariableFragmentInfo.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/PseudoProbe.h"
 #include "llvm/Support/Casting.h"
@@ -2886,29 +2887,7 @@ class DIExpression : public MDNode {
   /// Return whether there is exactly one operator and it is a DW_OP_deref;
   bool isDeref() const;
 
-  /// Holds the characteristics of one fragment of a larger variable.
-  struct FragmentInfo {
-    FragmentInfo() = default;
-    FragmentInfo(uint64_t SizeInBits, uint64_t OffsetInBits)
-        : SizeInBits(SizeInBits), OffsetInBits(OffsetInBits) {}
-    uint64_t SizeInBits;
-    uint64_t OffsetInBits;
-    /// Return the index of the first bit of the fragment.
-    uint64_t startInBits() const { return OffsetInBits; }
-    /// Return the index of the bit after the end of the fragment, e.g. for
-    /// fragment offset=16 and size=32 return their sum, 48.
-    uint64_t endInBits() const { return OffsetInBits + SizeInBits; }
-
-    /// Returns a zero-sized fragment if A and B don't intersect.
-    static DIExpression::FragmentInfo intersect(DIExpression::FragmentInfo A,
-                                                DIExpression::FragmentInfo B) {
-      uint64_t StartInBits = std::max(A.OffsetInBits, B.OffsetInBits);
-      uint64_t EndInBits = std::min(A.endInBits(), B.endInBits());
-      if (EndInBits <= StartInBits)
-        return {0, 0};
-      return DIExpression::FragmentInfo(EndInBits - StartInBits, StartInBits);
-    }
-  };
+  using FragmentInfo = DbgVariableFragmentInfo;
 
   /// Return the number of bits that have an active value, i.e. those that
   /// aren't known to be zero/sign (depending on the type of Var) and which
@@ -3003,6 +2982,16 @@ class DIExpression : public MDNode {
   /// return true with an offset of zero.
   bool extractIfOffset(int64_t &Offset) const;
 
+  /// Assuming that the expression operates on an address, extract a constant
+  /// offset and the succsessive ops. Return false if the expression contains
+  /// any incompatible ops (including non-zero DW_OP_LLVM_args - only a single
+  /// address operand to the expression is permitted).
+  ///
+  /// We don't try very hard to interpret the expression because we assume that
+  /// foldConstantMath has canonicalized the expression.
+  bool extractLeadingOffset(int64_t &Offset,
+                            SmallVectorImpl<uint64_t> &RemainingOps) const;
+
   /// Returns true iff this DIExpression contains at least one instance of
   /// `DW_OP_LLVM_arg, n` for all n in [0, N).
   bool hasAllLocationOps(unsigned N) const;
diff --git a/llvm/include/llvm/IR/DebugProgramInstruction.h b/llvm/include/llvm/IR/DebugProgramInstruction.h
index ed8081a3cad19..fd93dd9b89f7c 100644
--- a/llvm/include/llvm/IR/DebugProgramInstruction.h
+++ b/llvm/include/llvm/IR/DebugProgramInstruction.h
@@ -50,6 +50,7 @@
 #include "llvm/ADT/ilist.h"
 #include "llvm/ADT/ilist_node.h"
 #include "llvm/ADT/iterator.h"
+#include "llvm/IR/DbgVariableFragmentInfo.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/SymbolTableListTraits.h"
@@ -460,6 +461,19 @@ class DbgVariableRecord : public DbgRecord, protected DebugValueUser {
     resetDebugValue(0, NewLocation);
   }
 
+  std::optional<DbgVariableFragmentInfo> getFragment() const;
+  /// Get the FragmentInfo for the variable if it exists, otherwise return a
+  /// FragmentInfo that covers the entire variable if the variable size is
+  /// known, otherwise return a zero-sized fragment.
+  DbgVariableFragmentInfo getFragmentOrEntireVariable() const {
+    DbgVariableFragmentInfo VariableSlice(0, 0);
+    // Get the fragment or variable size, or zero.
+    if (auto Sz = getFragmentSizeInBits())
+      VariableSlice.SizeInBits = *Sz;
+    if (auto Frag = getFragment())
+      VariableSlice.OffsetInBits = Frag->OffsetInBits;
+    return VariableSlice;
+  }
   /// Get the size (in bits) of the variable, or fragment of the variable that
   /// is described.
   std::optional<uint64_t> getFragmentSizeInBits() const;
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index 161a30dfb3828..f800f324d05ac 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -1760,6 +1760,45 @@ bool DIExpression::extractIfOffset(int64_t &Offset) const {
   return false;
 }
 
+bool DIExpression::extractLeadingOffset(
+    int64_t &Offset, SmallVectorImpl<uint64_t> &RemainingOps) const {
+  Offset = 0;
+  RemainingOps.clear();
+
+  auto SingleLocEltsOpt = getSingleLocationExpressionElements();
+  if (!SingleLocEltsOpt)
+    return false;
+
+  auto ExprOpEnd = expr_op_iterator(SingleLocEltsOpt->end());
+  auto ExprOpIt = expr_op_iterator(SingleLocEltsOpt->begin());
+  while (ExprOpIt != ExprOpEnd) {
+    uint64_t Op = ExprOpIt->getOp();
+    if (Op == dwarf::DW_OP_deref || Op == dwarf::DW_OP_deref_size ||
+        Op == dwarf::DW_OP_deref_type || Op == dwarf::DW_OP_LLVM_fragment ||
+        Op == dwarf::DW_OP_LLVM_extract_bits_zext ||
+        Op == dwarf::DW_OP_LLVM_extract_bits_sext) {
+      break;
+    } else if (Op == dwarf::DW_OP_plus_uconst) {
+      Offset += ExprOpIt->getArg(0);
+    } else if (Op == dwarf::DW_OP_constu) {
+      uint64_t Value = ExprOpIt->getArg(0);
+      ++ExprOpIt;
+      if (ExprOpIt->getOp() == dwarf::DW_OP_plus)
+        Offset += Value;
+      else if (ExprOpIt->getOp() == dwarf::DW_OP_minus)
+        Offset -= Value;
+      else
+        return false;
+    } else {
+      // Not a const plus/minus operation or deref.
+      return false;
+    }
+    ++ExprOpIt;
+  }
+  RemainingOps.append(ExprOpIt.getBase(), ExprOpEnd.getBase());
+  return true;
+}
+
 bool DIExpression::hasAllLocationOps(unsigned N) const {
   SmallDenseSet<uint64_t, 4> SeenOps;
   for (auto ExprOp : expr_ops())
diff --git a/llvm/lib/IR/DebugProgramInstruction.cpp b/llvm/lib/IR/DebugProgramInstruction.cpp
index 9a4926c81dca2..1711fd6943517 100644
--- a/llvm/lib/IR/DebugProgramInstruction.cpp
+++ b/llvm/lib/IR/DebugProgramInstruction.cpp
@@ -371,6 +371,10 @@ bool DbgVariableRecord::isKillLocation() const {
          any_of(location_ops(), [](Value *V) { return isa<UndefValue>(V); });
 }
 
+std::optional<DbgVariableFragmentInfo> DbgVariableRecord::getFragment() const {
+  return getExpression()->getFragmentInfo();
+}
+
 std::optional<uint64_t> DbgVariableRecord::getFragmentSizeInBits() const {
   if (auto Fragment = getExpression()->getFragmentInfo())
     return Fragment->SizeInBits;
diff --git a/llvm/unittests/IR/MetadataTest.cpp b/llvm/unittests/IR/MetadataTest.cpp
index 3f766a414f08f..29c9bd0a4cd9a 100644
--- a/llvm/unittests/IR/MetadataTest.cpp
+++ b/llvm/unittests/IR/MetadataTest.cpp
@@ -3870,6 +3870,84 @@ TEST_F(DIExpressionTest, createFragmentExpression) {
 #undef EXPECT_INVALID_FRAGMENT
 }
 
+TEST_F(DIExpressionTest, extractLeadingOffset) {
+  int64_t Offset;
+  SmallVector<uint64_t> Remaining;
+  using namespace dwarf;
+#define OPS(...) SmallVector<uint64_t>(ArrayRef<uint64_t>{__VA_ARGS__})
+#define EXTRACT_FROM(...)                                                      \
+  DIExpression::get(Context, {__VA_ARGS__})                                    \
+      ->extractLeadingOffset(Offset, Remaining)
+  // Test the number of expression inputs
+  // ------------------------------------
+  //
+  // Single location expressions are permitted.
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_plus_uconst, 2));
+  EXPECT_EQ(Offset, 2);
+  EXPECT_EQ(Remaining.size(), 0);
+  // This is also a single-location.
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_LLVM_arg, 0, DW_OP_plus_uconst, 2));
+  EXPECT_EQ(Offset, 2);
+  EXPECT_EQ(Remaining.size(), 0);
+  // Variadic locations are not permitted. A non-zero arg is assumed to
+  // indicate multiple inputs.
+  EXPECT_FALSE(EXTRACT_FROM(DW_OP_LLVM_arg, 1));
+  EXPECT_FALSE(EXTRACT_FROM(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus));
+
+  // Test offsets expressions
+  // ------------------------
+  EXPECT_TRUE(EXTRACT_FROM());
+  EXPECT_EQ(Offset, 0);
+  EXPECT_EQ(Remaining.size(), 0u);
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_constu, 4, DW_OP_plus));
+  EXPECT_EQ(Offset, 4);
+  EXPECT_EQ(Remaining.size(), 0u);
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_constu, 2, DW_OP_minus));
+  EXPECT_EQ(Offset, -2);
+  EXPECT_EQ(Remaining.size(), 0u);
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_plus_uconst, 8));
+  EXPECT_EQ(Offset, 8);
+  EXPECT_EQ(Remaining.size(), 0u);
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_plus_uconst, 4, DW_OP_constu, 2, DW_OP_minus));
+  EXPECT_EQ(Offset, 2);
+  EXPECT_EQ(Remaining.size(), 0u);
+
+  // Not all operations are permitted for simplicity. Can be added
+  // if needed in future.
+  EXPECT_FALSE(EXTRACT_FROM(DW_OP_constu, 2, DW_OP_mul));
+
+  // Test "remaining ops"
+  // --------------------
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_plus_uconst, 4, DW_OP_constu, 8, DW_OP_minus,
+                           DW_OP_LLVM_fragment, 0, 32));
+  EXPECT_EQ(Remaining, OPS(DW_OP_LLVM_fragment, 0, 32));
+  EXPECT_EQ(Offset, -4);
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_deref));
+  EXPECT_EQ(Remaining, OPS(DW_OP_deref));
+  EXPECT_EQ(Offset, 0);
+
+  // Check things after the non-offset ops are added too.
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_plus_uconst, 2, DW_OP_deref_size, 4,
+                           DW_OP_stack_value));
+  EXPECT_EQ(Remaining, OPS(DW_OP_deref_size, 4, DW_OP_stack_value));
+  EXPECT_EQ(Offset, 2);
+
+  // DW_OP_deref_type isn't supported in LLVM so this currently fails.
+  EXPECT_FALSE(EXTRACT_FROM(DW_OP_deref_type, 0));
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_LLVM_extract_bits_zext, 0, 8));
+  EXPECT_EQ(Remaining, OPS(DW_OP_LLVM_extract_bits_zext, 0, 8));
+
+  EXPECT_TRUE(EXTRACT_FROM(DW_OP_LLVM_extract_bits_sext, 4, 4));
+  EXPECT_EQ(Remaining, OPS(DW_OP_LLVM_extract_bits_sext, 4, 4));
+#undef EXTRACT_FROM
+}
+
 TEST_F(DIExpressionTest, convertToUndefExpression) {
 #define EXPECT_UNDEF_OPS_EQUAL(TestExpr, Expected)                             \
   do {                                                                         \

/// foldConstantMath has canonicalized the expression.
bool extractLeadingOffset(int64_t &Offset,
SmallVectorImpl<uint64_t> &RemainingOps) const;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self: rename Offset to OffsetInBytes

OCHyams and others added 4 commits July 7, 2024 23:21
Patch [2/x] to fix structured bindings debug info in SROA.

It extracts a constant offset from the DIExpression if there is one and fills
RemainingOps with the ops that come after it.

This function will be used in a subsequent patch.
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
@OCHyams OCHyams force-pushed the structured-bindings-fix-2 branch 2 times, most recently from 16d4155 to 0bb27ac Compare July 8, 2024 09:08
@OCHyams OCHyams merged commit f50f7a7 into llvm:main Jul 8, 2024
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants