[DebugNames] Compare TableEntry names more efficiently #79759

felipepiovezan · 2024-01-28T16:52:47Z

TableEntry names are pointers into the string table section, and accessing their
length requires a search for \0. However, 99% of the time we only need to
compare the name against some other name, and such a comparison will fail as
early as the first character.

This commit adds a method to the interface of TableEntry so that such a
comparison can be done without extracting the full name. It saves 10% of the
time (1250 ms -> 1100 ms) needed to evaluate the following expression.

lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output

llvmbot · 2024-01-28T16:53:15Z

@llvm/pr-subscribers-debuginfo

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

TableEntry names are pointers into the string table section, and accessing their length requires a search for \0. However, 99% of the time we only need to compare the name against some other name, and such a comparison will fail as early as the first character.

This commit adds a method to the interface of TableEntry so that such a comparison can be done without extracting the full name. It saves 10% of the time (1250 ms -> 1100 ms) needed to evaluate the following expression.

lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output

Why not use strcmp? This function requires both operands to be null terminated. This is true for the strp entry -- and LLDB seems to always assume this -- but could lead to buffer overruns in corrupt data. This is also true for the "Target" string in the two uses of this function introduced here, but may not necessarily be true in the future; we would need to change the API from StringRef to std::string to emphasize this point.

Why not use strncmp? We can't set the "N" argument of strncmp to std::min(<debug_str_starting_at_name>.size(), Target.size()), as this would return "equal" when Target is a prefix of Name. To work around this, we would need to require Target to be null-terminated.
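To make the prefix problem concrete, here is a minimal sketch of the rejected approach. It assumes std::string_view as a stand-in for llvm::StringRef; the helper name naiveStrncmpEquals is hypothetical and does not appear in the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper, not part of the patch. Data stands for the rest of the
// string section starting at the entry's name, so Data.size() is NOT the
// length of the name.
bool naiveStrncmpEquals(std::string_view Data, std::string_view Target) {
  // N is bounded by Target.size(), so the comparison stops before it can
  // notice that the name in Data continues past the end of Target.
  std::size_t N = std::min(Data.size(), Target.size());
  return std::strncmp(Data.data(), Target.data(), N) == 0;
}
```

With a section containing the names foobar and baz, comparing against the target foo reports a match even though no entry is named foo, which is the false positive described above.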

Full diff: https://github.com/llvm/llvm-project/pull/79759.diff

2 Files Affected:

(modified) llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h (+20)
(modified) llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp (+2-2)

diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
index 44a19c7b13f9a7b..748f6481c1925e3 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
@@ -543,6 +543,26 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
       return StrData.getCStr(&Off);
     }
 
+    // Compares the name of this entry against Target, returning true if they
+    // are equal. This is helpful is hot code paths that do not need the length
+    // of the name.
+    bool compareNameAgainst(StringRef Target) const {
+      // Note: this is not the name, but the rest of debug_names starting from
+      // name. This handles corrupt data (non-null terminated) without
+      // overruning the buffer.
+      auto Data = StrData.getData().substr(StringOffset);
+      auto DataSize = Data.size();
+      auto TargetSize = Target.size();
+
+      // Invariant: at the start of the loop, we have non-null characters to read from Data.
+      size_t Idx = 0;
+      for (; Idx < DataSize && Data[Idx]; Idx++) {
+        if (Idx >= TargetSize || Data[Idx] != Target[Idx])
+          return false;
+      }
+      return Idx == TargetSize;
+    }
+
     /// Returns the offset of the first Entry in the list.
     uint64_t getEntryOffset() const { return EntryOffset; }
   };
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
index 03ad5d133caddf4..30ac0c0d1507aba 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
@@ -921,7 +921,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
   if (Hdr.BucketCount == 0) {
     // No Hash Table, We need to search through all names in the Name Index.
     for (const NameTableEntry &NTE : *CurrentIndex) {
-      if (NTE.getString() == Key)
+      if (NTE.compareNameAgainst(Key))
         return NTE.getEntryOffset();
     }
     return std::nullopt;
@@ -942,7 +942,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
       return std::nullopt; // End of bucket
 
     NameTableEntry NTE = CurrentIndex->getNameTableEntry(Index);
-    if (NTE.getString() == Key)
+    if (NTE.compareNameAgainst(Key))
       return NTE.getEntryOffset();
   }
   return std::nullopt;

github-actions · 2024-01-28T16:55:02Z

✅ With the latest revision this PR passed the C/C++ code formatter.

adrian-prantl · 2024-01-29T22:02:59Z

llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h

@@ -543,6 +543,27 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
      return StrData.getCStr(&Off);
    }

+    // Compares the name of this entry against Target, returning true if they
+    // are equal. This is helpful is hot code paths that do not need the length
+    // of the name.


adrian-prantl · 2024-01-29T22:03:28Z

llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h

+    bool compareNameAgainst(StringRef Target) const {
+      // Note: this is not the name, but the rest of debug_names starting from
+      // name. This handles corrupt data (non-null terminated) without
+      // overruning the buffer.


Suggested change

-      // overruning the buffer.
+      // overrunning the buffer.

adrian-prantl · 2024-01-29T22:07:01Z

llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h

+      // Invariant: at the start of the loop, we have non-null characters to
+      // read from Data.
+      size_t Idx = 0;
+      for (; Idx < DataSize && Data[Idx]; Idx++) {


Would the code become shorter / more readable / efficient if we used StringRef.equals() here? https://llvm.org/doxygen/StringRef_8h_source.html#l00164
Or is this also not possible?

I can answer my own question: because it would ignore a NUL in Data.

Note that the first thing StringRef.equals does is to compare the lengths of the two StringRefs.
But Data.size() is the length of the debug_str section (starting at the strp offset), not the length of the string we are interested in

FWIW, it might not hurt to include the (shortened) rationale about str(n)cmp in the code as a comment.

In fact we don't know the length of the string we are interested in

I suppose the primary reason why strcmp isn't faster is that in 99% of the cases this loop will fail the comparison very early?
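The point about whole-string equality in this thread can be illustrated with a small sketch. It assumes std::string_view in place of llvm::StringRef (its operator== also requires equal sizes), and the helper name wholeViewEquals is hypothetical.

```cpp
#include <string_view>

// Hypothetical helper, not part of the patch. Data is the rest of the string
// section starting at the name, so its size covers every following name too.
bool wholeViewEquals(std::string_view Data, std::string_view Target) {
  // Equality requires equal sizes, and Data.size() is the remaining section
  // size rather than the name length, so this fails for a matching name.
  return Data == Target;
}
```

Even when the section holds only the single name foo, Data still includes the terminating NUL, so the sizes differ (4 versus 3) and the comparison fails.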

adrian-prantl

LGTM with cosmetic changes addressed!

dwblaikie · 2024-01-29T23:37:51Z

Why not use strncmp? We can't use the "N" argument of strncmp to be std::min(<debug_str_starting_at_name>.size(), Target.size()), as this would return "equal" when Target is a prefix of Name. To work around this, we would need to require Target to be null-terminated.

Rather than requiring Target to be null terminated, an extra check after strncmp could be done, yeah? (checking that <debug_str_starting_at_name>[Target.size()] == \0)

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

The final argument for not using strcmp/strncmp is that they did not provide any measurable speed benefits when compared to the proposed patch.

Could you elaborate on this further - did strncmp provide equivalent benefits to this proposed patch? But was avoided because of the extra requirement (null termination of Target, currently satisfied, but not necessarily true in the future?)

adrian-prantl · 2024-01-30T00:49:37Z

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

I'd expect that to at least in theory be able to beat the loop in this patch because strncmp could be some clever vectorized/unrolled/whatever implementation, but my guess is that since most comparisons fail early, that advantage practically doesn't matter. I would still mildly prefer the strncmp variant over the loop...

felipepiovezan · 2024-01-30T17:49:30Z

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

I hadn't thought of that! To be clear, I also would have written a loop-less version if I had been able to make it work.
But I like David's idea, let me give this a try.

felipepiovezan · 2024-01-30T20:06:16Z

I ended up amending the original commit to change its message, since the implementation is now vastly different.
For historical purposes, here is the original implementation:

     // Compares the name of this entry against Target, returning true if they
     // are equal. This is helpful is hot code paths that do not need the length
     // of the name.
     bool compareNameAgainst(StringRef Target) const {
       // Note: this is not the name, but the rest of debug_names starting from
       // name. This handles corrupt data (non-null terminated) without
       // overruning the buffer.
       auto Data = StrData.getData().substr(StringOffset);
       auto DataSize = Data.size();
       auto TargetSize = Target.size();

       // Invariant: at the start of the loop, we have non-null characters to
       // read from Data.
       size_t Idx = 0;
       for (; Idx < DataSize && Data[Idx]; Idx++) {
         if (Idx >= TargetSize || Data[Idx] != Target[Idx])
           return false;
       }
       return Idx == TargetSize;
     }
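For readers who want to try the loop outside of LLVM, the following is a rough standalone sketch. It assumes std::string_view instead of StringRef and passes the section and offset as parameters; the name compareNameLoop and the explicit offset guard are additions for this sketch (std::string_view::substr throws on an out-of-range start, whereas StringRef::substr clamps it).

```cpp
#include <cstddef>
#include <string_view>

// Standalone approximation of the original loop; names are hypothetical.
bool compareNameLoop(std::string_view Section, std::size_t Offset,
                     std::string_view Target) {
  // Guard added for this sketch only: avoid std::out_of_range from substr.
  if (Offset > Section.size())
    return false;
  // Rest of the section starting at the name, not the name itself.
  std::string_view Data = Section.substr(Offset);

  std::size_t Idx = 0;
  // Walk the name until its NUL terminator or the end of the section.
  for (; Idx < Data.size() && Data[Idx] != '\0'; ++Idx) {
    // The name is longer than Target, or the characters differ.
    if (Idx >= Target.size() || Data[Idx] != Target[Idx])
      return false;
  }
  // Equal only if the name ended exactly where Target ends.
  return Idx == Target.size();
}
```

Note that if the section ends without a NUL terminator, this loop stops at the end of the buffer and can still report a match for a target equal to the trailing bytes.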

felipepiovezan · 2024-01-30T20:07:52Z

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

@dwblaikie I had to tweak this slightly: we want Data.size() > Target.size() instead of >=, because a (non-corrupt) string table name must be null terminated and Data.size() includes the \0.
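A sketch of the strncmp-based variant with the `>` tweak might look as follows. It assumes std::string_view in place of StringRef and takes the section and offset as parameters; the name compareNameStrncmp and the offset guard are hypothetical, and the merged code may differ in detail.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical standalone version of the strncmp-based comparison.
bool compareNameStrncmp(std::string_view Section, std::size_t Offset,
                        std::string_view Target) {
  // Guard added for this sketch only: std::string_view::substr throws on an
  // out-of-range start.
  if (Offset > Section.size())
    return false;
  // Rest of the section starting at the name.
  std::string_view Data = Section.substr(Offset);
  std::size_t TargetSize = Target.size();
  // Strictly greater: a well-formed name needs room for its NUL terminator,
  // which also makes the Data[TargetSize] access below safe.
  return Data.size() > TargetSize &&
         std::strncmp(Data.data(), Target.data(), TargetSize) == 0 &&
         Data[TargetSize] == '\0';
}
```

Because of the strict `>`, a section that ends without a terminator never compares equal, so corrupt data is rejected without reading past the buffer.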

adrian-prantl · 2024-01-30T20:22:43Z

llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h

@@ -543,6 +543,19 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
      return StrData.getCStr(&Off);
    }

+    /// Compares the name of this entry against Target, returning true if they
+    /// are equal. This is helpful is hot code paths that do not need the length


Suggested change

-    /// are equal. This is helpful is hot code paths that do not need the length
+    /// are equal. This is more efficient in hot code paths that do not need the length

ayermolo · 2024-01-30T20:22:57Z

llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h

@@ -543,6 +543,19 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
      return StrData.getCStr(&Off);
    }

+    /// Compares the name of this entry against Target, returning true if they
+    /// are equal. This is helpful is hot code paths that do not need the length


Did you mean "This is helpful in"?

Yup! I updated the comment after Adrian's other suggestion

ayermolo · 2024-01-30T20:23:56Z

llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h

+      // name. This handles corrupt data (non-null terminated) without
+      // overrunning the buffer.
+      auto Data = StrData.getData().substr(StringOffset);
+      auto TargetSize = Target.size();


From the code style guide perspective I think this needs explicit type?

Yeah... I guess you are right, I will change it. It just makes me uncomfortable how easy it is to trigger implicit conversions the moment I write any types here :(

dwblaikie · 2024-01-30T20:54:32Z

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

@dwblaikie I had to tweak this slightly: we want Data.size() > Target.size() instead of >=, because a (non-corrupt) string table name must be null terminated and Data.size() includes the \0.

Oh, yeah, that totally makes sense! 👍

TableEntry names are pointers into the string table section, and accessing their length requires a search for `\0`. However, 99% of the time we only need to compare the name against some other name, and such a comparison will fail as early as the first character.

This commit adds a method to the interface of TableEntry so that such a comparison can be done without extracting the full name. It saves 10% of the time (1250 ms -> 1100 ms) needed to evaluate the following expression.

```
lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output
```

(cherry picked from commit 75ea78a)

llvmbot added the debuginfo label Jan 28, 2024

felipepiovezan force-pushed the felipe/faster_name_cmp branch 2 times, most recently from 996ccb6 to abfb6f2 Compare January 29, 2024 15:01

felipepiovezan requested review from JDevlieghere, adrian-prantl, dwblaikie and ayermolo January 29, 2024 15:01

adrian-prantl reviewed Jan 29, 2024

adrian-prantl approved these changes Jan 29, 2024

felipepiovezan force-pushed the felipe/faster_name_cmp branch from abfb6f2 to d5ca597 Compare January 30, 2024 20:06

fixup! Fix typo in comment

e73146e

adrian-prantl reviewed Jan 30, 2024

ayermolo reviewed Jan 30, 2024

felipepiovezan added 2 commits January 30, 2024 12:25

fixup! Updat comment based on review

01ef075

fixup! Update types based on review

98e0bfa

ayermolo approved these changes Jan 30, 2024

dwblaikie approved these changes Jan 30, 2024

felipepiovezan merged commit 75ea78a into llvm:main Jan 30, 2024
4 checks passed

felipepiovezan deleted the felipe/faster_name_cmp branch January 30, 2024 22:28
