Use DIExpression::foldConstantMath at the result of a Salvaged expression #71721

rastogishubham · 2023-11-08T18:40:28Z

This patch uses DIExpression::foldConstantMath() at the result of a Salvaged expression, that is, it runs the folding optimizations after an expression has been salvaged completely, to reduce how many times the fold optimization function is called. Which should help in reducing the size of DIExpressions that grow because of salvaging debug info

After checking the size of the dSYM with and without this change, I saw a decrease of about 300KB, where the debug_loc section is about 1.6 GB in size.

Where the debug loc section reduced in size by 212KB and it is 193MB in size, the rest comes from the debug_info section

This is part of a stack of patches and comes after:
#69768
#71717
#71718
#71719

llvmbot · 2023-11-08T18:40:57Z

@llvm/pr-subscribers-coroutines
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Shubham Sandeep Rastogi (rastogishubham)

Changes

This patch uses DIExpression::foldConstantMath() at the end of a DIExpression::appendOpsToArg(). Which should help in reducing the size of DIExpressions that grow because of salvaging debug info

After checking the size of the debug_loc section with and without this change, I saw a decrease of about 300KB, where the debug_loc section is about 211 MB in size

This is part of a stack of patches and comes after:
#69768
#71717
#71718
#71719

Patch is 32.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71721.diff

6 Files Affected:

(modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+85)
(modified) llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h (-61)
(modified) llvm/lib/IR/DebugInfoMetadata.cpp (+284-3)
(modified) llvm/test/Bitcode/upgrade-dbg-addr.ll (+1-1)
(modified) llvm/test/DebugInfo/MIR/AArch64/dbgcall-site-expr-chain.mir (+2-2)
(modified) llvm/unittests/IR/MetadataTest.cpp (+382)

diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index b347664883fd9f2..354ae48c3d676c1 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -3044,6 +3044,11 @@ class DIExpression : public MDNode {
   /// expression and constant on failure.
   std::pair<DIExpression *, const ConstantInt *>
   constantFold(const ConstantInt *CI);
+
+  /// Try to shorten an expression with constand math operations that can be
+  /// evaulated at compile time. Returns a new expression on success, or the old
+  /// expression if there is nothing to be reduced.
+  DIExpression *foldConstantMath();
 };
 
 inline bool operator==(const DIExpression::FragmentInfo &A,
@@ -3073,6 +3078,86 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> {
   static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; }
 };
 
+/// Holds a DIExpression and keeps track of how many operands have been consumed
+/// so far.
+class DIExpressionCursor {
+  DIExpression::expr_op_iterator Start, End;
+
+public:
+  DIExpressionCursor(const DIExpression *Expr) {
+    if (!Expr) {
+      assert(Start == End);
+      return;
+    }
+    Start = Expr->expr_op_begin();
+    End = Expr->expr_op_end();
+  }
+
+  DIExpressionCursor(ArrayRef<uint64_t> Expr)
+      : Start(Expr.begin()), End(Expr.end()) {}
+
+  DIExpressionCursor(const DIExpressionCursor &) = default;
+
+  /// Consume one operation.
+  std::optional<DIExpression::ExprOperand> take() {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start++);
+  }
+
+  /// Consume N operations.
+  void consume(unsigned N) { std::advance(Start, N); }
+
+  /// Return the current operation.
+  std::optional<DIExpression::ExprOperand> peek() const {
+    if (Start == End)
+      return std::nullopt;
+    return *(Start);
+  }
+
+  /// Return the next operation.
+  std::optional<DIExpression::ExprOperand> peekNext() const {
+    if (Start == End)
+      return std::nullopt;
+
+    auto Next = Start.getNext();
+    if (Next == End)
+      return std::nullopt;
+
+    return *Next;
+  }
+
+  std::optional<DIExpression::ExprOperand> peekNextN(unsigned N) const {
+    if (Start == End)
+      return std::nullopt;
+    DIExpression::expr_op_iterator Nth = Start;
+    for (unsigned I = 0; I < N; I++) {
+      Nth = Nth.getNext();
+      if (Nth == End)
+        return std::nullopt;
+    }
+    return *Nth;
+  }
+
+  void assignNewExpr(ArrayRef<uint64_t> Expr) {
+    DIExpression::expr_op_iterator NewStart(Expr.begin());
+    DIExpression::expr_op_iterator NewEnd(Expr.end());
+    this->Start = NewStart;
+    this->End = NewEnd;
+  }
+
+  /// Determine whether there are any operations left in this expression.
+  operator bool() const { return Start != End; }
+
+  DIExpression::expr_op_iterator begin() const { return Start; }
+  DIExpression::expr_op_iterator end() const { return End; }
+
+  /// Retrieve the fragment information, if any.
+  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
+    return DIExpression::getFragmentInfo(Start, End);
+  }
+};
+
 /// Global variables.
 ///
 /// TODO: Remove DisplayName.  It's always equal to Name.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
index 667a9efc6f6c04f..4daa78b15b8e298 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
@@ -31,67 +31,6 @@ class DIELoc;
 class TargetRegisterInfo;
 class MachineLocation;
 
-/// Holds a DIExpression and keeps track of how many operands have been consumed
-/// so far.
-class DIExpressionCursor {
-  DIExpression::expr_op_iterator Start, End;
-
-public:
-  DIExpressionCursor(const DIExpression *Expr) {
-    if (!Expr) {
-      assert(Start == End);
-      return;
-    }
-    Start = Expr->expr_op_begin();
-    End = Expr->expr_op_end();
-  }
-
-  DIExpressionCursor(ArrayRef<uint64_t> Expr)
-      : Start(Expr.begin()), End(Expr.end()) {}
-
-  DIExpressionCursor(const DIExpressionCursor &) = default;
-
-  /// Consume one operation.
-  std::optional<DIExpression::ExprOperand> take() {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start++);
-  }
-
-  /// Consume N operations.
-  void consume(unsigned N) { std::advance(Start, N); }
-
-  /// Return the current operation.
-  std::optional<DIExpression::ExprOperand> peek() const {
-    if (Start == End)
-      return std::nullopt;
-    return *(Start);
-  }
-
-  /// Return the next operation.
-  std::optional<DIExpression::ExprOperand> peekNext() const {
-    if (Start == End)
-      return std::nullopt;
-
-    auto Next = Start.getNext();
-    if (Next == End)
-      return std::nullopt;
-
-    return *Next;
-  }
-
-  /// Determine whether there are any operations left in this expression.
-  operator bool() const { return Start != End; }
-
-  DIExpression::expr_op_iterator begin() const { return Start; }
-  DIExpression::expr_op_iterator end() const { return End; }
-
-  /// Retrieve the fragment information, if any.
-  std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
-    return DIExpression::getFragmentInfo(Start, End);
-  }
-};
-
 /// Base class containing the logic for constructing DWARF expressions
 /// independently of whether they are emitted into a DIE or into a .debug_loc
 /// entry.
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index f7f36129ec8557c..4e4b243a1f0a50a 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -1782,7 +1782,7 @@ DIExpression *DIExpression::appendOpsToArg(const DIExpression *Expr,
   if (StackValue)
     NewOps.push_back(dwarf::DW_OP_stack_value);
 
-  return DIExpression::get(Expr->getContext(), NewOps);
+  return DIExpression::get(Expr->getContext(), NewOps)->foldConstantMath();
 }
 
 DIExpression *DIExpression::replaceArg(const DIExpression *Expr,
@@ -1857,11 +1857,10 @@ DIExpression *DIExpression::append(const DIExpression *Expr,
     }
     Op.appendToVector(NewOps);
   }
-
   NewOps.append(Ops.begin(), Ops.end());
   auto *result = DIExpression::get(Expr->getContext(), NewOps);
   assert(result->isValid() && "concatenated expression is not valid");
-  return result;
+  return result->foldConstantMath();
 }
 
 DIExpression *DIExpression::appendToStack(const DIExpression *Expr,
@@ -1998,6 +1997,288 @@ DIExpression::constantFold(const ConstantInt *CI) {
           ConstantInt::get(getContext(), NewInt)};
 }
 
+static bool isConstOperation(uint64_t Op) {
+  return Op == dwarf::DW_OP_constu || Op == dwarf::DW_OP_consts;
+}
+
+static bool operationIsNoOp(uint64_t Op, uint64_t Val) {
+  switch (Op) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_minus:
+  case dwarf::DW_OP_shl:
+  case dwarf::DW_OP_shr:
+    return Val == 0;
+  case dwarf::DW_OP_mul:
+  case dwarf::DW_OP_div:
+    return Val == 1;
+  default:
+    return false;
+  }
+}
+
+static std::optional<uint64_t>
+foldOperationIfPossible(uint64_t Op, uint64_t Operand1, uint64_t Operand2) {
+  switch (Op) {
+  case dwarf::DW_OP_plus:
+    return Operand1 + Operand2;
+  case dwarf::DW_OP_minus:
+    return Operand1 - Operand2;
+  case dwarf::DW_OP_shl:
+    return Operand1 << Operand2;
+  case dwarf::DW_OP_shr:
+    return Operand1 >> Operand2;
+  case dwarf::DW_OP_mul:
+    return Operand1 * Operand2;
+  case dwarf::DW_OP_div:
+    return Operand1 / Operand2;
+  default:
+    return std::nullopt;
+  }
+}
+
+static bool operationsAreFoldableAndCommutative(uint64_t Op1, uint64_t Op2) {
+  if (Op1 != Op2)
+    return false;
+  switch (Op1) {
+  case dwarf::DW_OP_plus:
+  case dwarf::DW_OP_mul:
+    return true;
+  default:
+    return false;
+  }
+}
+
+static void moveCursorAndLocPos(DIExpressionCursor &Cursor, uint64_t &Loc,
+                                const DIExpression::ExprOperand &Op) {
+  Cursor.consume(1);
+  Loc = Loc + Op.getSize();
+}
+
+DIExpression *DIExpression::foldConstantMath() {
+
+  SmallVector<uint64_t, 8> WorkingOps;
+  WorkingOps.append(Elements.begin(), Elements.end());
+  uint64_t Loc = 0;
+  DIExpressionCursor Cursor(WorkingOps);
+
+  while (Loc < WorkingOps.size()) {
+
+    auto Op1 = Cursor.peek();
+    // Expression has no operations, exit
+    if (!Op1)
+      break;
+    auto Op1Raw = Op1->getOp();
+    auto Op1Arg = Op1->getArg(0);
+
+    // {DW_OP_plus_uconst, 0} -> {}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && Op1Arg == 0) {
+      WorkingOps.erase(WorkingOps.begin() + Loc, WorkingOps.begin() + Loc + 2);
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+
+    if (!isConstOperation(Op1Raw) && Op1Raw != dwarf::DW_OP_plus_uconst) {
+      // Early exit, all of the following patterns start with a constant value
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+
+    auto Op2 = Cursor.peekNext();
+    // All following patterns require at least 2 Operations, exit
+    if (!Op2)
+      break;
+    auto Op2Raw = Op2->getOp();
+
+    // {DW_OP_const[u, s], 0, DW_OP_[plus, minus, shl, shr]} -> {}
+    // {DW_OP_const[u, s], 1, DW_OP_[mul, div]} -> {}
+    if (isConstOperation(Op1Raw) && operationIsNoOp(Op2Raw, Op1Arg)) {
+      WorkingOps.erase(WorkingOps.begin() + Loc, WorkingOps.begin() + Loc + 3);
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op2Arg = Op2->getArg(0);
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_plus_uconst, Const2} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst &&
+        Op2Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 4);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus_uconst Const2} -> {DW_OP_constu,
+    // Const1 + Const2}
+    if (isConstOperation(Op1Raw) && Op2Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 4);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op3 = Cursor.peekNextN(2);
+    // Op2 could still match a pattern, skip iteration
+    if (!Op3) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op3Raw = Op3->getOp();
+
+    // {DW_OP_const[u, s], Const1, DW_OP_const[u, s], Const2, DW_OP_[plus,
+    // minus, mul, div, shl, shr] -> {DW_OP_constu, Const1 [+, -, *, /, <<, >>]
+    // Const2}
+    if (isConstOperation(Op1Raw) && isConstOperation(Op2Raw)) {
+      auto Result = foldOperationIfPossible(Op3Raw, Op1Arg, Op2Arg);
+      if (!Result) {
+        moveCursorAndLocPos(Cursor, Loc, *Op1);
+        continue;
+      }
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 5);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_const[u, s], Const1, DW_OP_plus} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && isConstOperation(Op2Raw) &&
+        Op3Raw == dwarf::DW_OP_plus) {
+      auto Result = Op1Arg + Op2Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 5);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op3Arg = Op3->getArg(0);
+    // {DW_OP_const[u, s], Const1, DW_OP_plus, DW_OP_plus_uconst, Const2} ->
+    // {DW_OP_plus_uconst, Const1 + Const2}
+    if (isConstOperation(Op1Raw) && Op2Raw == dwarf::DW_OP_plus &&
+        Op3Raw == dwarf::DW_OP_plus_uconst) {
+      auto Result = Op1Arg + Op3Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 2,
+                       WorkingOps.begin() + Loc + 5);
+      WorkingOps[Loc] = dwarf::DW_OP_plus_uconst;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op4 = Cursor.peekNextN(3);
+    // Op2 and Op3 could still match a pattern, skip iteration
+    if (!Op4) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op4Raw = Op4->getOp();
+
+    // {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], Const2, DW_OP_[plus, mul]}
+    // -> {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul]}
+    if (isConstOperation(Op1Raw) && isConstOperation(Op3Raw) &&
+        operationsAreFoldableAndCommutative(Op2Raw, Op4Raw)) {
+      auto Result = foldOperationIfPossible(Op2Raw, Op1Arg, Op3Arg);
+      if (!Result)
+        llvm_unreachable(
+            "Something went very wrong here! This should always work");
+      WorkingOps.erase(WorkingOps.begin() + Loc + 3,
+                       WorkingOps.begin() + Loc + 6);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+    auto Op5 = Cursor.peekNextN(4);
+    if (!Op5) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op5Raw = Op5->getOp();
+    auto Op5Arg = Op5->getArg(0);
+
+    // {DW_OP_const[u, s], Const1, DW_OP_plus, DW_OP_LLVM_arg, Arg1, DW_OP_plus,
+    // DW_OP_plus_uconst, Const2} -> {DW_OP_constu, Const1 + Const2, DW_OP_plus,
+    // DW_OP_LLVM_arg, Arg1, DW_OP_plus}
+    if (isConstOperation(Op1Raw) && Op3Raw == dwarf::DW_OP_LLVM_arg &&
+        Op5Raw == dwarf::DW_OP_plus_uconst && Op2Raw == Op4Raw &&
+        Op4Raw == dwarf::DW_OP_plus) {
+      auto Result = Op1Arg + Op5Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 6,
+                       WorkingOps.begin() + Loc + 8);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op4Arg = Op4->getArg(0);
+
+    // {DW_OP_plus_uconst, Const1, DW_OP_LLVM_arg, Arg1, DW_OP_plus,
+    // DW_OP_const[u, s], Const2, DW_OP_plus} -> {DW_OP_plus_uconst, Const1 +
+    // Const2, DW_OP_LLVM_arg, Arg1, DW_OP_plus}
+    if (Op1Raw == dwarf::DW_OP_plus_uconst && Op2Raw == dwarf::DW_OP_LLVM_arg &&
+        Op3Raw == Op5Raw && Op3Raw == dwarf::DW_OP_plus &&
+        isConstOperation(Op4Raw)) {
+      auto Result = Op1Arg + Op4Arg;
+      WorkingOps.erase(WorkingOps.begin() + Loc + 5,
+                       WorkingOps.begin() + Loc + 8);
+      WorkingOps[Loc + 1] = Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    auto Op6 = Cursor.peekNextN(5);
+    if (!Op6) {
+      moveCursorAndLocPos(Cursor, Loc, *Op1);
+      continue;
+    }
+    auto Op6Raw = Op6->getOp();
+    // {DW_OP_const[u, s], Const1, DW_OP_[plus, mul], DW_OP_LLVM_arg, Arg1,
+    // DW_OP_[plus, mul], DW_OP_const[u, s], Const2, DW_OP_[plus, mul]} ->
+    // {DW_OP_constu, Const1 [+, *] Const2, DW_OP_[plus, mul], DW_OP_LLVM_arg,
+    // Arg1, DW_OP_[plus, mul]}
+    if (isConstOperation(Op1Raw) && Op3Raw == dwarf::DW_OP_LLVM_arg &&
+        isConstOperation(Op5Raw) &&
+        operationsAreFoldableAndCommutative(Op2Raw, Op4Raw) &&
+        operationsAreFoldableAndCommutative(Op4Raw, Op6Raw)) {
+      auto Result = foldOperationIfPossible(Op2Raw, Op1Arg, Op5Arg);
+      if (!Result)
+        llvm_unreachable(
+            "Something went very wrong here! This should always work");
+      WorkingOps.erase(WorkingOps.begin() + Loc + 6,
+                       WorkingOps.begin() + Loc + 9);
+      WorkingOps[Loc] = dwarf::DW_OP_constu;
+      WorkingOps[Loc + 1] = *Result;
+      Cursor.assignNewExpr(WorkingOps);
+      Loc = 0;
+      continue;
+    }
+
+    moveCursorAndLocPos(Cursor, Loc, *Op1);
+  }
+  auto *Result = DIExpression::get(getContext(), WorkingOps);
+  assert(Result->isValid() && "concatenated expression is not valid");
+  return Result;
+}
+
 uint64_t DIExpression::getNumLocationOperands() const {
   uint64_t Result = 0;
   for (auto ExprOp : expr_ops())
diff --git a/llvm/test/Bitcode/upgrade-dbg-addr.ll b/llvm/test/Bitcode/upgrade-dbg-addr.ll
index 40fd7db18948b57..1943c35b0059fb1 100644
--- a/llvm/test/Bitcode/upgrade-dbg-addr.ll
+++ b/llvm/test/Bitcode/upgrade-dbg-addr.ll
@@ -8,7 +8,7 @@ entry:
   %num.addr = alloca i32, align 4
   store i32 %num, ptr %num.addr, align 4
   ; CHECK-NOT: call void @llvm.dbg.addr
-  ; CHECK: call void @llvm.dbg.value(metadata ptr %num.addr, metadata ![[#]], metadata !DIExpression(DW_OP_plus_uconst, 0, DW_OP_deref))
+  ; CHECK: call void @llvm.dbg.value(metadata ptr %num.addr, metadata ![[#]], metadata !DIExpression(DW_OP_deref))
   call void @llvm.dbg.addr(metadata ptr %num.addr, metadata !16, metadata !DIExpression(DW_OP_plus_uconst, 0)), !dbg !17
   %0 = load i32, ptr %num.addr, align 4
   ret i32 %0
diff --git a/llvm/test/DebugInfo/MIR/AArch64/dbgcall-site-expr-chain.mir b/llvm/test/DebugInfo/MIR/AArch64/dbgcall-site-expr-chain.mir
index 3e2244f76c905e4..b32f7310537daf4 100644
--- a/llvm/test/DebugInfo/MIR/AArch64/dbgcall-site-expr-chain.mir
+++ b/llvm/test/DebugInfo/MIR/AArch64/dbgcall-site-expr-chain.mir
@@ -105,7 +105,7 @@ body:             |
 
 # CHECK: DW_TAG_GNU_call_site_parameter
 # CHECK-NEXT: DW_AT_location (DW_OP_reg2 W2)
-# CHECK-NEXT: DW_AT_GNU_call_site_value (DW_OP_breg19 W19+700, DW_OP_plus_uconst 0x9, DW_OP_plus_uconst 0x50)
+# CHECK-NEXT: DW_AT_GNU_call_site_value (DW_OP_breg19 W19+789)
 
 # CHECK: DW_TAG_GNU_call_site_parameter
 # CHECK-NEXT: DW_AT_location (DW_OP_reg1 W1)
@@ -113,4 +113,4 @@ body:             |
 
 # CHECK: DW_TAG_GNU_call_site_parameter
 # CHECK-NEXT: DW_AT_location (DW_OP_reg0 W0)
-# CHECK-NEXT: DW_AT_GNU_call_site_value (DW_OP_breg19 W19+100, DW_OP_plus_uconst 0x17)
+# CHECK-NEXT: DW_AT_GNU_call_site_value (DW_OP_breg19 W19+123)
diff --git a/llvm/unittests/IR/MetadataTest.cpp b/llvm/unittests/IR/MetadataTest.cpp
index 1570631287b2a78..2221efc73d9f468 100644
--- a/llvm/unittests/IR/MetadataTest.cpp
+++ b/llvm/unittests/IR/MetadataTest.cpp
@@ -3120,6 +3120,388 @@ TEST_F(DIExpressionTest, get) {
   EXPECT_EQ(N0WithPrependedOps, N2);
 }
 
+TEST_F(DIExpressionTest, Fold) {
+
+  // Remove a No-op DW_OP_plus_uconst from an expression.
+  SmallVector<uint64_t, 8> Ops = {dwarf::DW_OP_plus_uconst, 0};
+  auto *Expr = DIExpression::get(Context, Ops);
+  auto *E = Expr->foldConstantMath();
+  SmallVector<uint64_t, 8> ResOps;
+  auto *ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op add from an expression.
+  Ops[0] = dwarf::DW_OP_constu;
+  Ops[1] = 0;
+  Ops.push_back(dwarf::DW_OP_plus);
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op subtract from an expression.
+  Ops[2] = dwarf::DW_OP_minus;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op shift left from an expression.
+  Ops[2] = dwarf::DW_OP_shl;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op shift right from an expression.
+  Ops[2] = dwarf::DW_OP_shr;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op multiply from an expression.
+  Ops[2] = dwarf::DW_OP_mul;
+  Ops[1] = 1;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Remove a No-op divide from an expression.
+  Ops[2] = dwarf::DW_OP_div;
+  Expr = DIExpression::get(Context, Ops);
+  E = Expr->foldConstantMath();
+  ResExpr = DIExpression::get(Context, ResOps);
+  EXPECT_EQ(E, ResExpr);
+
+  // Test fold {DW_OP_plus_uconst, Const1, DW_OP_plus_uconst, Const2} ->
+  // {DW_OP_plus_uconst, Const1 + Const2}
+  Ops[0] = dwarf::DW_OP_plus_uconst;
+ ...
[truncated]

github-actions · 2023-11-08T18:51:38Z

✅ With the latest revision this PR passed the C/C++ code formatter.

jmorse

LGTM with performance consideration mentioned on the parent patch

rastogishubham · 2024-03-09T01:12:14Z

@jmorse @adrian-prantl patch has been updated, I am unsure about what to do for DebugInfo/salvage-limit-expr-size.ll, because of this patch, there is no longer a limit of 128 salvageable instructions, because they get folded down to one instruction

adrian-prantl · 2024-03-15T15:47:39Z

llvm/lib/IR/DebugInfoMetadata.cpp

@@ -1774,7 +1774,8 @@ DIExpression *DIExpression::appendOpsToArg(const DIExpression *Expr,
    assert(ArgNo == 0 &&
           "Location Index must be 0 for a non-variadic expression.");
    SmallVector<uint64_t, 8> NewOps(Ops.begin(), Ops.end());
-    return DIExpression::prependOpcodes(Expr, NewOps, StackValue);
+    return DIExpression::prependOpcodes(Expr, NewOps, StackValue)


Can prependOpcodes return nullptr?

SLTozer · 2024-03-22T15:26:35Z

I am unsure about what to do for DebugInfo/salvage-limit-expr-size.ll, because of this patch, there is no longer a limit of 128 salvageable instructions, because they get folded down to one instruction

In some ways, this is quite a good problem to have! I'm not well-acquainted with this patch, but if there's any kind of legal DIExpression that won't get constant-folded then you could substitute that; it looks like you've not covered all the logical operators (not, xor, etc), there are some operators that might be effectively repeatable no-ops (DW_OP_rot), and if all else fails then it should be possible to repeat DW_OP_LLVM_arg, 0 in the expression, so that you're effectively performing x + x + x + x + x + x +... until you have a DIExpression of the right length.

SLTozer · 2024-04-22T13:59:26Z

Looking at some compile time stats as-of the most recent rebase: this patch stack has a geomean +0.32% to compile time at ReleaseLTO-g, in return for -0.01% file size. Looking at the specific projects, it looks as though the best cost-benefit ratio comes from tramp3d-v4 (which probably generates a higher proportion of constant-foldable expressions), which has +0.53% compile time for -0.05% file size; I think that's still not a great ratio. I'd be interested to see if there are any heuristics that could be applied to reduce the compile time increase.

A few suggestions that might improve performance:

Move this from appendOpsToArg to salvageDebugInfoImpl - that's where I'd guess all of the foldable expressions are coming from, and in addition to being called less often, there's contextual information there that could help simplify further, such as...
Only check for a constant foldable expression if we have just salvaged a foldable operation with a constant operand - any other salvage operation can't have turned a non-foldable expression to a foldable one.
Instead of performing this on every DIExpression that gets updated, which in many cases will actually be performing this on the same DIExpression repeatedly and returning the same result each time, you could manually track the users of each updated DIExpression, then for each DIExpression try to constant-fold it once and update each user to point to the new DIExpression.

I think that simplifying DIExpressions is something that should be pursued, so I apologize for adding to the workload of actually implementing it!

rastogishubham · 2024-04-22T18:28:05Z

@SLTozer Thanks for the feedback! I really appreciate it, I will try to update the patches with the suggestions

rastogishubham · 2024-05-07T18:00:24Z

@SLTozer After taking your feedback into account, I believe I have a solution that is more appropriate. The compile time is not significanlty higher, and the file size reduction is the same.

I would like to note that the size of the file will go down, because currently DIExpression::foldConstantMath doesn't fold signed operations, and doesn't fold the logical operations either. Please let me know what you think

llvm/lib/IR/DebugInfoMetadata.cpp

llvm/include/llvm/IR/DIExpressionOptimizer.h

llvm/lib/IR/DIExpressionOptimizer.cpp

llvm/include/llvm/IR/DIExpressionOptimizer.h

rastogishubham · 2024-05-15T17:54:01Z

@SLTozer Just pinging again in case you missed the updates,
I think I have addressed the compile time issues here: #71721 (comment)

SLTozer · 2024-05-21T13:41:10Z

@SLTozer Just pinging again in case you missed the updates,

Hi - just letting you know that I've been preoccupied by other work, but I'm looking at this now; the improvements in compile time LGTM. Outside of that my specific approval to land isn't necessary as long as the patches receive general approval, but otherwise I'll work my way through the patches that haven't yet got the green tick.

adrian-prantl · 2024-05-23T00:40:17Z

To summarize an offline conversation I had with @rastogishubham — I think hardcoding the rules like this is a great starting point.

Going forward (i.e., after this patch) I would like to see us at least explore if we can write the rules in a declarative form where they are compiled into function (similar to the rule bodies right now) and some sort of type signature that conceptually consists of the parameters each rules consumes (e.g.: uconst, uconst, operator) and a rewrite system that abstracts the the input DIExpression into its type signature and then matches the rules accordingly, so we don't need to hardcode the application in foldConstantMath(). So for example, the rules themselves could be stored in a tree where the edges are types and the rules are stored in the nodes:

          []
  const  / \ op
        []  ...
 const / 
      []
       \ op
       []  all rules that consume const,const,op

Maybe that tree can be stored in linear fashion and even produced at compile time by tablegen? Maybe this is also over-engineering the problem ;-)

jryans · 2024-05-23T00:51:07Z

I agree with @adrian-prantl, a rewrite rule design should work quite well here. It should make this system easier to maintain and understand long term.

Of course, that’s future work for after this PR, but I am happy to hear people are thinking along these lines. 😄

SLTozer

Maybe that tree can be stored in linear fashion and even produced at compile time by tablegen? Maybe this is also over-engineering the problem ;-)

Maybe just the tiniest bit 😄! But I do agree that we should have a more extensible approach to this, which could be tablegen, or any other simple declarative style.
With regards to the current design, it seems suitable for a first pass. In fact, if I were to suggest a further (non-arithmetic) pattern, I think that replacing patterns of consecutive DW_OP_LLVM_convert would be a very effective way to reduce expression sizes during compilation - off the top of my head I don't believe the "types" mean anything, we just emit them as truncations (e.g. DW_OP_constu 0x0000ffff, DW_OP_and) at the end of compilation, and we sometimes end up with a lot of converts after salvaging, which results in slower expression processing and some debug values being dropped. That pattern doesn't need to/probably shouldn't appear in this patch, though!

llvm/lib/IR/DIExpressionOptimizer.cpp

llvm/lib/IR/DebugInfoMetadata.cpp

llvm/lib/Transforms/Utils/Local.cpp

rastogishubham · 2024-05-23T22:30:56Z

@SLTozer Thanks for the review

@adrian-prantl thanks for summarizing the discussion we had!

…sion The functions salvageDbgAssignAddress, and salvageDebugInfoForDbgValues call into DIExpression::AppendOpsToArg, many times, use DIExpression::foldConstantMath at the end of the salvaging loop to fold any constant math to reduce the size of the salvaged expression.

llvmbot added debuginfo llvm:ir labels Nov 8, 2023

rastogishubham requested review from MaskRay, adrian-prantl, SLTozer, dwblaikie and jmorse November 8, 2023 18:42

rastogishubham force-pushed the DIExpressionAppendOps branch from 4ca3532 to 352a2b6 Compare November 13, 2023 21:26

jmorse approved these changes Nov 14, 2023

View reviewed changes

rastogishubham force-pushed the DIExpressionAppendOps branch from 352a2b6 to 2abd30c Compare March 9, 2024 01:03

llvmbot added coroutines C++20 coroutines llvm:transforms labels Mar 9, 2024

rastogishubham force-pushed the DIExpressionAppendOps branch 4 times, most recently from 8523e93 to 9a6a692 Compare March 14, 2024 20:36

adrian-prantl reviewed Mar 15, 2024

View reviewed changes

rastogishubham force-pushed the DIExpressionAppendOps branch from 9a6a692 to 6e3b0ee Compare March 20, 2024 19:06

rastogishubham force-pushed the DIExpressionAppendOps branch 3 times, most recently from 2e55fc0 to 6f8bd6c Compare April 4, 2024 02:02

rastogishubham force-pushed the DIExpressionAppendOps branch from 6f8bd6c to b4d7e1c Compare April 18, 2024 03:11

rastogishubham force-pushed the DIExpressionAppendOps branch from b4d7e1c to aae995a Compare April 24, 2024 21:48

rastogishubham force-pushed the DIExpressionAppendOps branch from aae995a to 9e6b2e4 Compare May 7, 2024 16:30

rastogishubham requested a review from jmorse May 7, 2024 18:00

rastogishubham changed the title ~~Use DIExpression::foldConstantMath() at the result of an appendOpsToArg()~~ Use DIExpression::foldConstantMath at the result of a Salvaged expression May 7, 2024

adrian-prantl reviewed May 8, 2024

View reviewed changes

llvm/lib/IR/DebugInfoMetadata.cpp Outdated Show resolved Hide resolved

adrian-prantl reviewed May 8, 2024

View reviewed changes

llvm/lib/IR/DebugInfoMetadata.cpp Outdated Show resolved Hide resolved

rastogishubham force-pushed the DIExpressionAppendOps branch 2 times, most recently from e22cecc to ce1b5f2 Compare May 10, 2024 23:42

rastogishubham requested a review from adrian-prantl May 10, 2024 23:43

rastogishubham force-pushed the DIExpressionAppendOps branch from ce1b5f2 to f8f2706 Compare May 11, 2024 01:13

MaskRay reviewed May 15, 2024

View reviewed changes

llvm/include/llvm/IR/DIExpressionOptimizer.h Outdated Show resolved Hide resolved

MaskRay reviewed May 15, 2024

View reviewed changes

llvm/include/llvm/IR/DIExpressionOptimizer.h Outdated Show resolved Hide resolved

llvm/lib/IR/DIExpressionOptimizer.cpp Outdated Show resolved Hide resolved

llvm/include/llvm/IR/DIExpressionOptimizer.h Outdated Show resolved Hide resolved

rastogishubham force-pushed the DIExpressionAppendOps branch from f8f2706 to 47bb523 Compare May 15, 2024 17:51

rastogishubham requested a review from MaskRay May 15, 2024 17:52

rastogishubham force-pushed the DIExpressionAppendOps branch from 47bb523 to 944f712 Compare May 16, 2024 20:07

rastogishubham force-pushed the DIExpressionAppendOps branch from 944f712 to b2586bb Compare May 22, 2024 19:18

SLTozer reviewed May 23, 2024

View reviewed changes

rastogishubham force-pushed the DIExpressionAppendOps branch from b2586bb to 7eb1745 Compare May 23, 2024 22:28

rastogishubham requested a review from SLTozer May 23, 2024 22:30

adrian-prantl approved these changes May 29, 2024

View reviewed changes

SLTozer approved these changes May 29, 2024

View reviewed changes

rastogishubham force-pushed the DIExpressionAppendOps branch from 7eb1745 to 3a33227 Compare May 29, 2024 23:21

rastogishubham merged commit f4681be into llvm:main May 29, 2024
4 of 6 checks passed

rastogishubham deleted the DIExpressionAppendOps branch May 29, 2024 23:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use DIExpression::foldConstantMath at the result of a Salvaged expression #71721

Use DIExpression::foldConstantMath at the result of a Salvaged expression #71721

rastogishubham commented Nov 8, 2023 •

edited

llvmbot commented Nov 8, 2023 •

edited

github-actions bot commented Nov 8, 2023 •

edited

jmorse left a comment

rastogishubham commented Mar 9, 2024

adrian-prantl Mar 15, 2024

SLTozer commented Mar 22, 2024

SLTozer commented Apr 22, 2024

rastogishubham commented Apr 22, 2024

rastogishubham commented May 7, 2024

rastogishubham commented May 15, 2024

SLTozer commented May 21, 2024

adrian-prantl commented May 23, 2024

jryans commented May 23, 2024 •

edited

SLTozer left a comment •

edited

rastogishubham commented May 23, 2024

Use DIExpression::foldConstantMath at the result of a Salvaged expression #71721

Use DIExpression::foldConstantMath at the result of a Salvaged expression #71721

Conversation

rastogishubham commented Nov 8, 2023 • edited

llvmbot commented Nov 8, 2023 • edited

github-actions bot commented Nov 8, 2023 • edited

jmorse left a comment

Choose a reason for hiding this comment

rastogishubham commented Mar 9, 2024

adrian-prantl Mar 15, 2024

Choose a reason for hiding this comment

SLTozer commented Mar 22, 2024

SLTozer commented Apr 22, 2024

rastogishubham commented Apr 22, 2024

rastogishubham commented May 7, 2024

rastogishubham commented May 15, 2024

SLTozer commented May 21, 2024

adrian-prantl commented May 23, 2024

jryans commented May 23, 2024 • edited

SLTozer left a comment • edited

Choose a reason for hiding this comment

rastogishubham commented May 23, 2024

rastogishubham commented Nov 8, 2023 •

edited

llvmbot commented Nov 8, 2023 •

edited

github-actions bot commented Nov 8, 2023 •

edited

jryans commented May 23, 2024 •

edited

SLTozer left a comment •

edited