Introduce DIExpressionOptimizer #69769

rastogishubham · 2023-10-20T20:28:06Z

DIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding and removing appends of unnecessary operations.

For example:

If we have a DIExpression: {dwarf::DW_OP_LLVM_arg, 0, dwarf::DW_OP_constu, 2, dwarf::DW_OP_plus} and want to append {dwarf::DW_OP_constu, 3, dwarf::DW_OP_plus}

Instead of having the final DIExpression be {dwarf::DW_OP_LLVM_arg, 0, dwarf::DW_OP_constu, 2, dwarf::DW_OP_plus, dwarf::DW_OP_constu, 3, dwarf::DW_OP_plus}.

It will be reduced to {dwarf::DW_OP_LLVM_arg, 0, dwarf::DW_OP_constu, 5, dwarf::DW_OP_plus}

There is an implementation for DW_OP_mul and DW_OP_plus_uconst as well

It also just flat out does not append {DW_OP_constu, 1, DW_OP_mul}, {DW_OP_constu, 1, DW_OP_div}, {DW_OP_constu, 0, DW_OP_plus}, {DW_OP_constu, 0, DW_OP_minus}, {DW_OP_constu, 0, DW_OP_shl}, {DW_OP_constu, 0, DW_OP_shr}

Hopefully this can be improved upon by the larger community to add more optimizations in the future.

This patch is part of a stacked diff, I was not sure how to do this any better because github squashes multiple commits in one PR, but the PR this depends on is:

#69768

DIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding, and removing appends of unecessary operations.

llvmbot · 2023-10-20T20:29:47Z

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Shubham Sandeep Rastogi (rastogishubham)

Changes

DIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding and removing appends of unnecessary operations.

Full diff: https://github.com/llvm/llvm-project/pull/69769.diff

3 Files Affected:

(modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+103)
(modified) llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h (-61)
(modified) llvm/lib/IR/DebugInfoMetadata.cpp (+137)

diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index b347664883fd9f2..b9df644e44ccf36 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -3073,6 +3073,109 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> {
   static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; }
 };
 
+class DIExpressionOptimizer {
+
+  // When looking at a DIExpression such as {DW_OP_constu, 1, DW_OP_constu, 2,
+  // DW_OP_plus} and trying to append {DW_OP_consts, 3, DW_OP_minus}
+  // NewMathOperator = DW_OP_minus
+  // OperandRight = 3
+  // OperatorRight = DW_OP_consts
+  // CurrMathOperator = DW_OP_plus
+  // OperandLeft = 2
+  // OperandLeft = DW_OP_constu
+
+  /// The math operator of the new subexpression being appended to the
+  /// DIExpression.
+  uint64_t NewMathOperator = 0;
+
+  /// The operand of the new subexpression being appended to the DIExpression.
+  uint64_t OperandRight;
+
+  /// The operator attached to OperandRight.
+  uint64_t OperatorRight = 0;
+
+  /// The math operator at the end of the current DIExpression.
+  uint64_t CurrMathOperator = 0;
+
+  /// The operand at the end of the current DIExpression at the top of the
+  /// stack.
+  uint64_t OperandLeft;
+
+  /// The operator attached to OperandLeft.
+  uint64_t OperatorLeft = 0;
+
+  bool operatorIsCommutative(uint64_t Operator);
+
+  SmallVector<uint64_t> getOperatorLocations(ArrayRef<uint64_t> Ops);
+
+  void reset();
+
+public:
+  bool operatorCanBeOptimized(uint64_t Operator);
+  void optimize(SmallVectorImpl<uint64_t> &NewOps, ArrayRef<uint64_t> Ops);
+};
+
+/// Holds a DIExpression and keeps track of how many operands have been consumed
+/// so far.
+class DIExpressionCursor {
+  DIExpression::expr_op_iterator Start, End;
+
+public:
+  DIExpressionCursor(const DIExpression *Expr) {
+    if (!Expr) {
+      assert(Start == End);
+      return;
+    }
+    Start = Expr->expr_op_begin();
+    End = Expr->expr_op_end();
+  }
+
+  DIExpressionCursor(ArrayRef<uint64_t> Expr)
+      : Start(Expr.begin()), End(Expr.end()) {}
+
+  DIExpressionCursor(const DIExpressionCursor &) = default;
+
+  /// Consume one operation.
+  std::optional<DIExpression::ExprOperand> take() {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start++);
+  }
+
+  /// Consume N operations.
+  void consume(unsigned N) { std::advance(Start, N); }
+
+  /// Return the current operation.
+  std::optional<DIExpression::ExprOperand> peek() const {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start);
+  }
+
+  /// Return the next operation.
+  std::optional<DIExpression::ExprOperand> peekNext() const {
+    if (Start == End)
+      return std::nullopt;
+
+    auto Next = Start.getNext();
+    if (Next == End)
+      return std::nullopt;
+
+    return *Next;
+  }
+
+  /// Determine whether there are any operations left in this expression.
+  operator bool() const { return Start != End; }
+
+  DIExpression::expr_op_iterator begin() const { return Start; }
+  DIExpression::expr_op_iterator end() const { return End; }
+
+  /// Retrieve the fragment information, if any.
+  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
+    return DIExpression::getFragmentInfo(Start, End);
+  }
+};
+
 /// Global variables.
 ///
 /// TODO: Remove DisplayName.  It's always equal to Name.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
index 667a9efc6f6c04f..4daa78b15b8e298 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
@@ -31,67 +31,6 @@ class DIELoc;
 class TargetRegisterInfo;
 class MachineLocation;
 
-/// Holds a DIExpression and keeps track of how many operands have been consumed
-/// so far.
-class DIExpressionCursor {
-  DIExpression::expr_op_iterator Start, End;
-
-public:
-  DIExpressionCursor(const DIExpression *Expr) {
-    if (!Expr) {
-      assert(Start == End);
-      return;
-    }
-    Start = Expr->expr_op_begin();
-    End = Expr->expr_op_end();
-  }
-
-  DIExpressionCursor(ArrayRef<uint64_t> Expr)
-      : Start(Expr.begin()), End(Expr.end()) {}
-
-  DIExpressionCursor(const DIExpressionCursor &) = default;
-
-  /// Consume one operation.
-  std::optional<DIExpression::ExprOperand> take() {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start++);
-  }
-
-  /// Consume N operations.
-  void consume(unsigned N) { std::advance(Start, N); }
-
-  /// Return the current operation.
-  std::optional<DIExpression::ExprOperand> peek() const {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start);
-  }
-
-  /// Return the next operation.
-  std::optional<DIExpression::ExprOperand> peekNext() const {
-    if (Start == End)
-      return std::nullopt;
-
-    auto Next = Start.getNext();
-    if (Next == End)
-      return std::nullopt;
-
-    return *Next;
-  }
-
-  /// Determine whether there are any operations left in this expression.
-  operator bool() const { return Start != End; }
-
-  DIExpression::expr_op_iterator begin() const { return Start; }
-  DIExpression::expr_op_iterator end() const { return End; }
-
-  /// Retrieve the fragment information, if any.
-  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
-    return DIExpression::getFragmentInfo(Start, End);
-  }
-};
-
 /// Base class containing the logic for constructing DWARF expressions
 /// independently of whether they are emitted into a DIE or into a .debug_loc
 /// entry.
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index f7f36129ec8557c..dfda851b77777ad 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -1734,6 +1734,143 @@ const DIExpression *DIExpression::extractAddressClass(const DIExpression *Expr,
   return Expr;
 }
 
+SmallVector<uint64_t>
+DIExpressionOptimizer::getOperatorLocations(ArrayRef<uint64_t> Ops) {
+  SmallVector<uint64_t> OpLocs;
+  uint64_t Loc = 0;
+  DIExpressionCursor Cursor(Ops);
+
+  while (Cursor.peek()) {
+    OpLocs.push_back(Loc);
+    auto Size = Cursor.peek()->getSize();
+    Loc += Size;
+    Cursor.consume(1);
+  }
+
+  return OpLocs;
+}
+
+bool DIExpressionOptimizer::operatorIsCommutative(uint64_t Operator) {
+  return Operator == dwarf::DW_OP_mul || Operator == dwarf::DW_OP_plus;
+}
+
+bool DIExpressionOptimizer::operatorCanBeOptimized(uint64_t Operator) {
+  switch (Operator) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_plus_uconst:
+  case dwarf::DW_OP_minus:
+  case dwarf::DW_OP_mul:
+  case dwarf::DW_OP_div:
+  case dwarf::DW_OP_shl:
+  case dwarf::DW_OP_shr:
+    return true;
+  default:
+    return false;
+  }
+}
+
+void DIExpressionOptimizer::reset() {
+  NewMathOperator = 0;
+  CurrMathOperator = 0;
+  OperatorLeft = 0;
+  OperatorRight = 0;
+}
+
+void DIExpressionOptimizer::optimize(SmallVectorImpl<uint64_t> &NewOps,
+                                     ArrayRef<uint64_t> Ops) {
+
+  if (Ops.size() == 2 && Ops[0] == dwarf::DW_OP_plus_uconst) {
+    // Convert a {DW_OP_plus_uconst, <constant value>} expression to
+    // {DW_OP_constu, <constant value>, DW_OP_plus}
+    NewMathOperator = dwarf::DW_OP_plus;
+    OperandRight = Ops[1];
+    OperatorRight = dwarf::DW_OP_constu;
+  } else if (Ops.size() == 3) {
+    NewMathOperator = Ops[2];
+    OperandRight = Ops[1];
+    OperatorRight = Ops[0];
+  } else {
+    // Currently, DIExpressionOptimizer only supports arithmetic operations
+    // such as Ops = {DW_OP_constu, 1, DW_OP_mul}
+    NewOps.append(Ops.begin(), Ops.end());
+    return;
+  }
+
+  if (OperatorRight != dwarf::DW_OP_constu &&
+      OperatorRight != dwarf::DW_OP_consts) {
+    NewOps.append(Ops.begin(), Ops.end());
+    return;
+  }
+
+  uint64_t CurrOperator;
+
+  if ((NewMathOperator == dwarf::DW_OP_mul ||
+       NewMathOperator == dwarf::DW_OP_div) &&
+      OperandRight == 1) {
+    // It is a multiply or divide by 1
+    reset();
+    return;
+  }
+  if ((NewMathOperator == dwarf::DW_OP_plus ||
+       NewMathOperator == dwarf::DW_OP_minus ||
+       NewMathOperator == dwarf::DW_OP_shl ||
+       NewMathOperator == dwarf::DW_OP_shr) &&
+      OperandRight == 0) {
+    // It is a add, subtract, shift left, or shift right with 0
+    reset();
+    return;
+  }
+
+  // Get the locations of the operators in the DIExpression
+  auto OperatorLocs = getOperatorLocations(NewOps);
+
+  for (auto It = OperatorLocs.rbegin(); It != OperatorLocs.rend(); It++) {
+    // Only look at the last two operators of the DIExpression to see if the new
+    // operators can be constant folded with them.
+    if (std::distance(OperatorLocs.rbegin(), It) > 1)
+      break;
+    CurrOperator = NewOps[*It];
+    if (CurrOperator == dwarf::DW_OP_mul || CurrOperator == dwarf::DW_OP_div ||
+        CurrOperator == dwarf::DW_OP_plus ||
+        CurrOperator == dwarf::DW_OP_minus) {
+      CurrMathOperator = CurrOperator;
+      continue;
+    }
+    if (CurrOperator == dwarf::DW_OP_plus_uconst &&
+        NewMathOperator == dwarf::DW_OP_plus) {
+      NewOps[*It + 1] = OperandRight + NewOps[*It + 1];
+      reset();
+      return;
+    }
+    if (CurrOperator == dwarf::DW_OP_constu ||
+        CurrOperator == dwarf::DW_OP_consts) {
+      OperatorLeft = CurrOperator;
+      OperandLeft = NewOps[*It + 1];
+
+      if (NewMathOperator == CurrMathOperator &&
+          operatorIsCommutative(NewMathOperator)) {
+        switch (NewMathOperator) {
+        case dwarf::DW_OP_plus:
+          NewOps[*It + 1] = OperandRight + OperandLeft;
+          break;
+        case dwarf::DW_OP_mul:
+          NewOps[*It + 1] = OperandRight * OperandLeft;
+          break;
+        default:
+          llvm_unreachable("Operator is not multiplication or addition!");
+        }
+        reset();
+        return;
+      }
+      break;
+    }
+    break;
+  }
+  NewOps.append(Ops.begin(), Ops.end());
+  reset();
+  return;
+}
+
 DIExpression *DIExpression::prepend(const DIExpression *Expr, uint8_t Flags,
                                     int64_t Offset) {
   SmallVector<uint64_t, 8> Ops;

pogo59 · 2023-10-20T20:57:36Z

how does this relate to (or supersede) DIExpression::constantFold() ?

rastogishubham · 2023-10-23T16:53:13Z

Hi @pogo59

It seems like DIExpression::constantFold only works on DW_OP_LLVM_convert operations. While this is folding mathematical expressions such as two DW_OP_plus in succession

pogo59 · 2023-10-23T17:05:05Z

I suspect we would prefer not to have two separate constant folders. Can you combine them?

adrian-prantl

I think in general this looks very useful.
Can you explain why DIExpressionOptimizer needs to keep state?

adrian-prantl · 2023-10-23T20:31:17Z

llvm/include/llvm/IR/DebugInfoMetadata.h

@@ -3073,6 +3073,109 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> {
  static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; }
 };

+class DIExpressionOptimizer {


Can you add a Doxygen comment for the entire class with just a sentence on what this does?

adrian-prantl · 2023-10-23T20:32:43Z

llvm/lib/IR/DebugInfoMetadata.cpp

+}
+
+bool DIExpressionOptimizer::operatorIsCommutative(uint64_t Operator) {
+  return Operator == dwarf::DW_OP_mul || Operator == dwarf::DW_OP_plus;


How about DW_OP_and, or, xor?

adrian-prantl · 2023-10-23T20:34:00Z

llvm/lib/IR/DebugInfoMetadata.cpp

+  }
+}
+
+void DIExpressionOptimizer::reset() {


Is this needed for performance reasons? If not, it is easier to reason about the code if a fresh DIExpressionOptimizer is created for each use.

adrian-prantl · 2023-10-23T20:35:04Z

llvm/lib/IR/DebugInfoMetadata.cpp

+void DIExpressionOptimizer::optimize(SmallVectorImpl<uint64_t> &NewOps,
+                                     ArrayRef<uint64_t> Ops) {
+
+  if (Ops.size() == 2 && Ops[0] == dwarf::DW_OP_plus_uconst) {


Why == 2 and not >= 2?

adrian-prantl · 2023-10-23T20:36:41Z

llvm/include/llvm/IR/DebugInfoMetadata.h

+
+  /// The math operator of the new subexpression being appended to the
+  /// DIExpression.
+  uint64_t NewMathOperator = 0;


What do you think about calling MathOperator more generically BinOp, or BinaryOperator, so all operations that take two arguments fit the description?

adrian-prantl · 2023-10-23T20:41:35Z

llvm/lib/IR/DebugInfoMetadata.cpp

+  if (OperatorRight != dwarf::DW_OP_constu &&
+      OperatorRight != dwarf::DW_OP_consts) {
+    NewOps.append(Ops.begin(), Ops.end());
+    return;


Why is the state preserved in this case but not in all the others?

adrian-prantl · 2023-10-23T20:43:56Z

I suspect we would prefer not to have two separate constant folders. Can you combine them?

+1, that sounds like a good idea!

rastogishubham added 2 commits October 20, 2023 13:16

[NFC] Move DIExpressionCursor to DebugInfoMetadata.h

d16393b

Introduce DIExpressionOptimizer

412d655

DIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding, and removing appends of unecessary operations.

rastogishubham marked this pull request as draft October 20, 2023 20:28

llvmbot added debuginfo llvm:ir labels Oct 20, 2023

rastogishubham mentioned this pull request Oct 20, 2023

[NFC] Move DIExpressionCursor to DebugInfoMetadata.h #69768

Merged

rastogishubham marked this pull request as ready for review October 20, 2023 20:30

This was referenced Oct 20, 2023

Introduce DIExpressionOptimizer in DIExpression::append #69770

Closed

Introduce DIExpressionOptimizer to DIExpression::appendOpsToArg #69771

Closed

rastogishubham requested review from MaskRay, jmorse, adrian-prantl and SLTozer October 20, 2023 20:32

jmorse mentioned this pull request Oct 23, 2023

[SROA] Fix incorrect offsets for structured binding variables #69007

Draft

adrian-prantl reviewed Oct 23, 2023

View reviewed changes

rastogishubham closed this Nov 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce DIExpressionOptimizer #69769

Introduce DIExpressionOptimizer #69769

rastogishubham commented Oct 20, 2023 •

edited

llvmbot commented Oct 20, 2023 •

edited

pogo59 commented Oct 20, 2023

rastogishubham commented Oct 23, 2023

pogo59 commented Oct 23, 2023

adrian-prantl left a comment

adrian-prantl Oct 23, 2023

adrian-prantl Oct 23, 2023

adrian-prantl Oct 23, 2023

adrian-prantl Oct 23, 2023

adrian-prantl Oct 23, 2023

adrian-prantl Oct 23, 2023

adrian-prantl commented Oct 23, 2023

Introduce DIExpressionOptimizer #69769

Introduce DIExpressionOptimizer #69769

Conversation

rastogishubham commented Oct 20, 2023 • edited

llvmbot commented Oct 20, 2023 • edited

pogo59 commented Oct 20, 2023

rastogishubham commented Oct 23, 2023

pogo59 commented Oct 23, 2023

adrian-prantl left a comment

Choose a reason for hiding this comment

adrian-prantl Oct 23, 2023

Choose a reason for hiding this comment

adrian-prantl Oct 23, 2023

Choose a reason for hiding this comment

adrian-prantl Oct 23, 2023

Choose a reason for hiding this comment

adrian-prantl Oct 23, 2023

Choose a reason for hiding this comment

adrian-prantl Oct 23, 2023

Choose a reason for hiding this comment

adrian-prantl Oct 23, 2023

Choose a reason for hiding this comment

adrian-prantl commented Oct 23, 2023

rastogishubham commented Oct 20, 2023 •

edited

llvmbot commented Oct 20, 2023 •

edited