-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce DIExpressionOptimizer #69769
Conversation
DIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding, and removing appends of unecessary operations.
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-debuginfo Author: Shubham Sandeep Rastogi (rastogishubham) ChangesDIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding and removing appends of unnecessary operations. Full diff: https://github.com/llvm/llvm-project/pull/69769.diff 3 Files Affected:
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index b347664883fd9f2..b9df644e44ccf36 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -3073,6 +3073,109 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> {
static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; }
};
+class DIExpressionOptimizer {
+
+ // When looking at a DIExpression such as {DW_OP_constu, 1, DW_OP_constu, 2,
+ // DW_OP_plus} and trying to append {DW_OP_consts, 3, DW_OP_minus}
+ // NewMathOperator = DW_OP_minus
+ // OperandRight = 3
+ // OperatorRight = DW_OP_consts
+ // CurrMathOperator = DW_OP_plus
+ // OperandLeft = 2
+ // OperandLeft = DW_OP_constu
+
+ /// The math operator of the new subexpression being appended to the
+ /// DIExpression.
+ uint64_t NewMathOperator = 0;
+
+ /// The operand of the new subexpression being appended to the DIExpression.
+ uint64_t OperandRight;
+
+ /// The operator attached to OperandRight.
+ uint64_t OperatorRight = 0;
+
+ /// The math operator at the end of the current DIExpression.
+ uint64_t CurrMathOperator = 0;
+
+ /// The operand at the end of the current DIExpression at the top of the
+ /// stack.
+ uint64_t OperandLeft;
+
+ /// The operator attached to OperandLeft.
+ uint64_t OperatorLeft = 0;
+
+ bool operatorIsCommutative(uint64_t Operator);
+
+ SmallVector<uint64_t> getOperatorLocations(ArrayRef<uint64_t> Ops);
+
+ void reset();
+
+public:
+ bool operatorCanBeOptimized(uint64_t Operator);
+ void optimize(SmallVectorImpl<uint64_t> &NewOps, ArrayRef<uint64_t> Ops);
+};
+
+/// Holds a DIExpression and keeps track of how many operands have been consumed
+/// so far.
+class DIExpressionCursor {
+ DIExpression::expr_op_iterator Start, End;
+
+public:
+ DIExpressionCursor(const DIExpression *Expr) {
+ if (!Expr) {
+ assert(Start == End);
+ return;
+ }
+ Start = Expr->expr_op_begin();
+ End = Expr->expr_op_end();
+ }
+
+ DIExpressionCursor(ArrayRef<uint64_t> Expr)
+ : Start(Expr.begin()), End(Expr.end()) {}
+
+ DIExpressionCursor(const DIExpressionCursor &) = default;
+
+ /// Consume one operation.
+ std::optional<DIExpression::ExprOperand> take() {
+ if (Start == End)
+ return std::nullopt;
+ return *(Start++);
+ }
+
+ /// Consume N operations.
+ void consume(unsigned N) { std::advance(Start, N); }
+
+ /// Return the current operation.
+ std::optional<DIExpression::ExprOperand> peek() const {
+ if (Start == End)
+ return std::nullopt;
+ return *(Start);
+ }
+
+ /// Return the next operation.
+ std::optional<DIExpression::ExprOperand> peekNext() const {
+ if (Start == End)
+ return std::nullopt;
+
+ auto Next = Start.getNext();
+ if (Next == End)
+ return std::nullopt;
+
+ return *Next;
+ }
+
+ /// Determine whether there are any operations left in this expression.
+ operator bool() const { return Start != End; }
+
+ DIExpression::expr_op_iterator begin() const { return Start; }
+ DIExpression::expr_op_iterator end() const { return End; }
+
+ /// Retrieve the fragment information, if any.
+ std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
+ return DIExpression::getFragmentInfo(Start, End);
+ }
+};
+
/// Global variables.
///
/// TODO: Remove DisplayName. It's always equal to Name.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
index 667a9efc6f6c04f..4daa78b15b8e298 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfExpression.h
@@ -31,67 +31,6 @@ class DIELoc;
class TargetRegisterInfo;
class MachineLocation;
-/// Holds a DIExpression and keeps track of how many operands have been consumed
-/// so far.
-class DIExpressionCursor {
- DIExpression::expr_op_iterator Start, End;
-
-public:
- DIExpressionCursor(const DIExpression *Expr) {
- if (!Expr) {
- assert(Start == End);
- return;
- }
- Start = Expr->expr_op_begin();
- End = Expr->expr_op_end();
- }
-
- DIExpressionCursor(ArrayRef<uint64_t> Expr)
- : Start(Expr.begin()), End(Expr.end()) {}
-
- DIExpressionCursor(const DIExpressionCursor &) = default;
-
- /// Consume one operation.
- std::optional<DIExpression::ExprOperand> take() {
- if (Start == End)
- return std::nullopt;
- return *(Start++);
- }
-
- /// Consume N operations.
- void consume(unsigned N) { std::advance(Start, N); }
-
- /// Return the current operation.
- std::optional<DIExpression::ExprOperand> peek() const {
- if (Start == End)
- return std::nullopt;
- return *(Start);
- }
-
- /// Return the next operation.
- std::optional<DIExpression::ExprOperand> peekNext() const {
- if (Start == End)
- return std::nullopt;
-
- auto Next = Start.getNext();
- if (Next == End)
- return std::nullopt;
-
- return *Next;
- }
-
- /// Determine whether there are any operations left in this expression.
- operator bool() const { return Start != End; }
-
- DIExpression::expr_op_iterator begin() const { return Start; }
- DIExpression::expr_op_iterator end() const { return End; }
-
- /// Retrieve the fragment information, if any.
- std::optional<DIExpression::FragmentInfo> getFragmentInfo() const {
- return DIExpression::getFragmentInfo(Start, End);
- }
-};
-
/// Base class containing the logic for constructing DWARF expressions
/// independently of whether they are emitted into a DIE or into a .debug_loc
/// entry.
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index f7f36129ec8557c..dfda851b77777ad 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -1734,6 +1734,143 @@ const DIExpression *DIExpression::extractAddressClass(const DIExpression *Expr,
return Expr;
}
+SmallVector<uint64_t>
+DIExpressionOptimizer::getOperatorLocations(ArrayRef<uint64_t> Ops) {
+ SmallVector<uint64_t> OpLocs;
+ uint64_t Loc = 0;
+ DIExpressionCursor Cursor(Ops);
+
+ while (Cursor.peek()) {
+ OpLocs.push_back(Loc);
+ auto Size = Cursor.peek()->getSize();
+ Loc += Size;
+ Cursor.consume(1);
+ }
+
+ return OpLocs;
+}
+
+bool DIExpressionOptimizer::operatorIsCommutative(uint64_t Operator) {
+ return Operator == dwarf::DW_OP_mul || Operator == dwarf::DW_OP_plus;
+}
+
+bool DIExpressionOptimizer::operatorCanBeOptimized(uint64_t Operator) {
+ switch (Operator) {
+ case dwarf::DW_OP_plus:
+ case dwarf::DW_OP_plus_uconst:
+ case dwarf::DW_OP_minus:
+ case dwarf::DW_OP_mul:
+ case dwarf::DW_OP_div:
+ case dwarf::DW_OP_shl:
+ case dwarf::DW_OP_shr:
+ return true;
+ default:
+ return false;
+ }
+}
+
+void DIExpressionOptimizer::reset() {
+ NewMathOperator = 0;
+ CurrMathOperator = 0;
+ OperatorLeft = 0;
+ OperatorRight = 0;
+}
+
+void DIExpressionOptimizer::optimize(SmallVectorImpl<uint64_t> &NewOps,
+ ArrayRef<uint64_t> Ops) {
+
+ if (Ops.size() == 2 && Ops[0] == dwarf::DW_OP_plus_uconst) {
+ // Convert a {DW_OP_plus_uconst, <constant value>} expression to
+ // {DW_OP_constu, <constant value>, DW_OP_plus}
+ NewMathOperator = dwarf::DW_OP_plus;
+ OperandRight = Ops[1];
+ OperatorRight = dwarf::DW_OP_constu;
+ } else if (Ops.size() == 3) {
+ NewMathOperator = Ops[2];
+ OperandRight = Ops[1];
+ OperatorRight = Ops[0];
+ } else {
+ // Currently, DIExpressionOptimizer only supports arithmetic operations
+ // such as Ops = {DW_OP_constu, 1, DW_OP_mul}
+ NewOps.append(Ops.begin(), Ops.end());
+ return;
+ }
+
+ if (OperatorRight != dwarf::DW_OP_constu &&
+ OperatorRight != dwarf::DW_OP_consts) {
+ NewOps.append(Ops.begin(), Ops.end());
+ return;
+ }
+
+ uint64_t CurrOperator;
+
+ if ((NewMathOperator == dwarf::DW_OP_mul ||
+ NewMathOperator == dwarf::DW_OP_div) &&
+ OperandRight == 1) {
+ // It is a multiply or divide by 1
+ reset();
+ return;
+ }
+ if ((NewMathOperator == dwarf::DW_OP_plus ||
+ NewMathOperator == dwarf::DW_OP_minus ||
+ NewMathOperator == dwarf::DW_OP_shl ||
+ NewMathOperator == dwarf::DW_OP_shr) &&
+ OperandRight == 0) {
+ // It is a add, subtract, shift left, or shift right with 0
+ reset();
+ return;
+ }
+
+ // Get the locations of the operators in the DIExpression
+ auto OperatorLocs = getOperatorLocations(NewOps);
+
+ for (auto It = OperatorLocs.rbegin(); It != OperatorLocs.rend(); It++) {
+ // Only look at the last two operators of the DIExpression to see if the new
+ // operators can be constant folded with them.
+ if (std::distance(OperatorLocs.rbegin(), It) > 1)
+ break;
+ CurrOperator = NewOps[*It];
+ if (CurrOperator == dwarf::DW_OP_mul || CurrOperator == dwarf::DW_OP_div ||
+ CurrOperator == dwarf::DW_OP_plus ||
+ CurrOperator == dwarf::DW_OP_minus) {
+ CurrMathOperator = CurrOperator;
+ continue;
+ }
+ if (CurrOperator == dwarf::DW_OP_plus_uconst &&
+ NewMathOperator == dwarf::DW_OP_plus) {
+ NewOps[*It + 1] = OperandRight + NewOps[*It + 1];
+ reset();
+ return;
+ }
+ if (CurrOperator == dwarf::DW_OP_constu ||
+ CurrOperator == dwarf::DW_OP_consts) {
+ OperatorLeft = CurrOperator;
+ OperandLeft = NewOps[*It + 1];
+
+ if (NewMathOperator == CurrMathOperator &&
+ operatorIsCommutative(NewMathOperator)) {
+ switch (NewMathOperator) {
+ case dwarf::DW_OP_plus:
+ NewOps[*It + 1] = OperandRight + OperandLeft;
+ break;
+ case dwarf::DW_OP_mul:
+ NewOps[*It + 1] = OperandRight * OperandLeft;
+ break;
+ default:
+ llvm_unreachable("Operator is not multiplication or addition!");
+ }
+ reset();
+ return;
+ }
+ break;
+ }
+ break;
+ }
+ NewOps.append(Ops.begin(), Ops.end());
+ reset();
+ return;
+}
+
DIExpression *DIExpression::prepend(const DIExpression *Expr, uint8_t Flags,
int64_t Offset) {
SmallVector<uint64_t, 8> Ops;
|
how does this relate to (or supersede) DIExpression::constantFold() ? |
Hi @pogo59 It seems like |
I suspect we would prefer not to have two separate constant folders. Can you combine them? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general this looks very useful.
Can you explain why DIExpressionOptimizer needs to keep state?
@@ -3073,6 +3073,109 @@ template <> struct DenseMapInfo<DIExpression::FragmentInfo> { | |||
static bool isEqual(const FragInfo &A, const FragInfo &B) { return A == B; } | |||
}; | |||
|
|||
class DIExpressionOptimizer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a Doxygen comment for the entire class with just a sentence on what this does?
} | ||
|
||
bool DIExpressionOptimizer::operatorIsCommutative(uint64_t Operator) { | ||
return Operator == dwarf::DW_OP_mul || Operator == dwarf::DW_OP_plus; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about DW_OP_and, or, xor?
} | ||
} | ||
|
||
void DIExpressionOptimizer::reset() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed for performance reasons? If not, it is easier to reason about the code if a fresh DIExpressionOptimizer is created for each use.
void DIExpressionOptimizer::optimize(SmallVectorImpl<uint64_t> &NewOps, | ||
ArrayRef<uint64_t> Ops) { | ||
|
||
if (Ops.size() == 2 && Ops[0] == dwarf::DW_OP_plus_uconst) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why == 2 and not >= 2?
|
||
/// The math operator of the new subexpression being appended to the | ||
/// DIExpression. | ||
uint64_t NewMathOperator = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about calling MathOperator
more generically BinOp
, or BinaryOperator
, so all operations that take two arguments fit the description?
if (OperatorRight != dwarf::DW_OP_constu && | ||
OperatorRight != dwarf::DW_OP_consts) { | ||
NewOps.append(Ops.begin(), Ops.end()); | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the state preserved in this case but not in all the others?
+1, that sounds like a good idea! |
DIExpressionOptimizer is a lightweight class that can be used to optimize DIExpressions that can get very large, by involving some simple constant folding and removing appends of unnecessary operations.
For example:
If we have a
DIExpression: {dwarf::DW_OP_LLVM_arg, 0, dwarf::DW_OP_constu, 2, dwarf::DW_OP_plus}
and want to append{dwarf::DW_OP_constu, 3, dwarf::DW_OP_plus}
Instead of having the final
DIExpression be {dwarf::DW_OP_LLVM_arg, 0, dwarf::DW_OP_constu, 2, dwarf::DW_OP_plus, dwarf::DW_OP_constu, 3, dwarf::DW_OP_plus}.
It will be reduced to
{dwarf::DW_OP_LLVM_arg, 0, dwarf::DW_OP_constu, 5, dwarf::DW_OP_plus}
There is an implementation for
DW_OP_mul
andDW_OP_plus_uconst
as wellIt also just flat out does not append
{DW_OP_constu, 1, DW_OP_mul}, {DW_OP_constu, 1, DW_OP_div}, {DW_OP_constu, 0, DW_OP_plus}, {DW_OP_constu, 0, DW_OP_minus}, {DW_OP_constu, 0, DW_OP_shl}, {DW_OP_constu, 0, DW_OP_shr}
Hopefully this can be improved upon by the larger community to add more optimizations in the future.
This patch is part of a stacked diff, I was not sure how to do this any better because github squashes multiple commits in one PR, but the PR this depends on is:
#69768