-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DebugNames] Compare TableEntry names more efficiently #79759
[DebugNames] Compare TableEntry names more efficiently #79759
Conversation
@llvm/pr-subscribers-debuginfo Author: Felipe de Azevedo Piovezan (felipepiovezan) ChangesTableEntry names are pointers into the string table section, and accessing their length requires a search for This commit adds a method to the interface of TableEntry so that such a comparison can be done without extracting the full name. It saves 10% in the time (1250ms -> 1100 ms) to evaluate the following expression.
Why not use strcmp? This function requires both operands to be null terminated. This is true for the strp entry -- and LLDB seems to always assume this -- but could lead to buffer overruns in corrupt data. This is also true for the "Target" string in the two uses of this function introduced here, but may not necessarily be true in the future; we would need to change the API from Why not use strncmp? We can't use the "N" argument of strnmp to be hwstd::min(<debug_str_starting_at_name>.size(), Target.size()) Full diff: https://github.com/llvm/llvm-project/pull/79759.diff 2 Files Affected:
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
index 44a19c7b13f9a7b..748f6481c1925e3 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
@@ -543,6 +543,26 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
return StrData.getCStr(&Off);
}
+ // Compares the name of this entry against Target, returning true if they
+ // are equal. This is helpful is hot code paths that do not need the length
+ // of the name.
+ bool compareNameAgainst(StringRef Target) const {
+ // Note: this is not the name, but the rest of debug_names starting from
+ // name. This handles corrupt data (non-null terminated) without
+ // overruning the buffer.
+ auto Data = StrData.getData().substr(StringOffset);
+ auto DataSize = Data.size();
+ auto TargetSize = Target.size();
+
+ // Invariant: at the start of the loop, we have non-null characters to read from Data.
+ size_t Idx = 0;
+ for (; Idx < DataSize && Data[Idx]; Idx++) {
+ if (Idx >= TargetSize || Data[Idx] != Target[Idx])
+ return false;
+ }
+ return Idx == TargetSize;
+ }
+
/// Returns the offset of the first Entry in the list.
uint64_t getEntryOffset() const { return EntryOffset; }
};
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
index 03ad5d133caddf4..30ac0c0d1507aba 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
@@ -921,7 +921,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
if (Hdr.BucketCount == 0) {
// No Hash Table, We need to search through all names in the Name Index.
for (const NameTableEntry &NTE : *CurrentIndex) {
- if (NTE.getString() == Key)
+ if (NTE.compareNameAgainst(Key))
return NTE.getEntryOffset();
}
return std::nullopt;
@@ -942,7 +942,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
return std::nullopt; // End of bucket
NameTableEntry NTE = CurrentIndex->getNameTableEntry(Index);
- if (NTE.getString() == Key)
+ if (NTE.compareNameAgainst(Key))
return NTE.getEntryOffset();
}
return std::nullopt;
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
996ccb6
to
abfb6f2
Compare
@@ -543,6 +543,27 @@ class DWARFDebugNames : public DWARFAcceleratorTable { | |||
return StrData.getCStr(&Off); | |||
} | |||
|
|||
// Compares the name of this entry against Target, returning true if they | |||
// are equal. This is helpful is hot code paths that do not need the length | |||
// of the name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
///
bool compareNameAgainst(StringRef Target) const { | ||
// Note: this is not the name, but the rest of debug_names starting from | ||
// name. This handles corrupt data (non-null terminated) without | ||
// overruning the buffer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// overruning the buffer. | |
// overrunning the buffer. |
// Invariant: at the start of the loop, we have non-null characters to | ||
// read from Data. | ||
size_t Idx = 0; | ||
for (; Idx < DataSize && Data[Idx]; Idx++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the code become shorter / more readable / efficient if we used StringRef.equals() here? https://llvm.org/doxygen/StringRef_8h_source.html#l00164
Or is this also not possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can anwser my own question: because it would ignore a NUL in Data
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the first thing StringRef.equals
does is to compare the lengths of the two StringRefs.
But Data.size()
is the length of the debug_str section (starting at the strp offset), not the length of the string we are interested in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, it might not hurt to include the (shortened) rationale about str(n)cmp in the code as a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we don't know the length of the string we are interested in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the primary reason why strcmp
isn't faster is because in 99% of the cases this look will fail the comparison very early?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with cosmetic changes addressed!
Rather than requiring So the function could be written, I think, as
Could you elaborate on this further - did strncmp provide equivalent benefits to this proposed patch? But was avoided because of the extra requirement (null termination of |
I'd expect that to at least in theory be able to beat the loop in this patch because strncmp could be some clever vectorized/unrolled/whatever implementation, but my guess is that since most comparisons fail early, that advantage practically doesn't matter. I would still mildly prefer the strncmp variant over the loop... |
I hadn't thought of that! To be clear, I also would have written a loop-less version if I had been able to make it work. |
TableEntry names are pointers into the string table section, and accessing their length requires a search for `\0`. However, 99% of the time we only need to compare the name against some other other, and such a comparison will fail as early as the first character. This commit adds a method to the interface of TableEntry so that such a comparison can be done without extracting the full name. It saves 10% in the time (1250ms -> 1100 ms) to evaluate the following expression. ``` lldb \ --batch \ -o "b CodeGenFunction::GenerateCode" \ -o run \ -o "expr Fn" \ -- \ clang++ -c -g test.cpp -o /dev/null &> output ```
I ended up amending the original commit to change its message, since the implementation is now vastly different.
|
abfb6f2
to
d5ca597
Compare
@dwblaikie I had to tweak this slightly: we want |
@@ -543,6 +543,19 @@ class DWARFDebugNames : public DWARFAcceleratorTable { | |||
return StrData.getCStr(&Off); | |||
} | |||
|
|||
/// Compares the name of this entry against Target, returning true if they | |||
/// are equal. This is helpful is hot code paths that do not need the length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// are equal. This is helpful is hot code paths that do not need the length | |
/// are equal. This is more efficient in hot code paths that do not need the length |
@@ -543,6 +543,19 @@ class DWARFDebugNames : public DWARFAcceleratorTable { | |||
return StrData.getCStr(&Off); | |||
} | |||
|
|||
/// Compares the name of this entry against Target, returning true if they | |||
/// are equal. This is helpful is hot code paths that do not need the length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean "This is helpful in" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup! I updated the comment after Adrian's other suggestion
// name. This handles corrupt data (non-null terminated) without | ||
// overrunning the buffer. | ||
auto Data = StrData.getData().substr(StringOffset); | ||
auto TargetSize = Target.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the code style guide perspective I think this needs explicit type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I guess you are right, I will change it. It just makes me uncomfortable how easy it is to trigger implicit conversions the moment I write any types here :(
Oh, yeah, that totally makes sense! 👍 |
TableEntry names are pointers into the string table section, and accessing their length requires a search for `\0`. However, 99% of the time we only need to compare the name against some other other, and such a comparison will fail as early as the first character. This commit adds a method to the interface of TableEntry so that such a comparison can be done without extracting the full name. It saves 10% in the time (1250ms -> 1100 ms) to evaluate the following expression. ``` lldb \ --batch \ -o "b CodeGenFunction::GenerateCode" \ -o run \ -o "expr Fn" \ -- \ clang++ -c -g test.cpp -o /dev/null &> output ``` (cherry picked from commit 75ea78a)
TableEntry names are pointers into the string table section, and accessing their
length requires a search for
\0
. However, 99% of the time we only need tocompare the name against some other other, and such a comparison will fail as
early as the first character.
This commit adds a method to the interface of TableEntry so that such a
comparison can be done without extracting the full name. It saves 10% in the
time (1250ms -> 1100 ms) to evaluate the following expression.