Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugNames] Compare TableEntry names more efficiently #79759

Merged
merged 4 commits into from
Jan 30, 2024

Conversation

felipepiovezan
Copy link
Contributor

@felipepiovezan felipepiovezan commented Jan 28, 2024

TableEntry names are pointers into the string table section, and accessing their
length requires a search for \0. However, 99% of the time we only need to
compare the name against some other other, and such a comparison will fail as
early as the first character.

This commit adds a method to the interface of TableEntry so that such a
comparison can be done without extracting the full name. It saves 10% in the
time (1250ms -> 1100 ms) to evaluate the following expression.

lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 28, 2024

@llvm/pr-subscribers-debuginfo

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

TableEntry names are pointers into the string table section, and accessing their length requires a search for \0. However, 99% of the time we only need to compare the name against some other other, and such a comparison will fail as early as the first character.

This commit adds a method to the interface of TableEntry so that such a comparison can be done without extracting the full name. It saves 10% in the time (1250ms -> 1100 ms) to evaluate the following expression.

lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output

Why not use strcmp? This function requires both operands to be null terminated. This is true for the strp entry -- and LLDB seems to always assume this -- but could lead to buffer overruns in corrupt data. This is also true for the "Target" string in the two uses of this function introduced here, but may not necessarily be true in the future; we would need to change the API from StringRef to std::string to emphasize this point.

Why not use strncmp? We can't use the "N" argument of strnmp to be hwstd::min(<debug_str_starting_at_name>.size(), Target.size()), as this would return "equal" when Target is a prefix of Name. To work around this, we would need to require Target` to be null-terminated.


Full diff: https://github.com/llvm/llvm-project/pull/79759.diff

2 Files Affected:

  • (modified) llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h (+20)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp (+2-2)
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
index 44a19c7b13f9a7b..748f6481c1925e3 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
@@ -543,6 +543,26 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
       return StrData.getCStr(&Off);
     }
 
+    // Compares the name of this entry against Target, returning true if they
+    // are equal. This is helpful is hot code paths that do not need the length
+    // of the name.
+    bool compareNameAgainst(StringRef Target) const {
+      // Note: this is not the name, but the rest of debug_names starting from
+      // name. This handles corrupt data (non-null terminated) without
+      // overruning the buffer.
+      auto Data = StrData.getData().substr(StringOffset);
+      auto DataSize = Data.size();
+      auto TargetSize = Target.size();
+
+      // Invariant: at the start of the loop, we have non-null characters to read from Data.
+      size_t Idx = 0;
+      for (; Idx < DataSize && Data[Idx]; Idx++) {
+        if (Idx >= TargetSize || Data[Idx] != Target[Idx])
+          return false;
+      }
+      return Idx == TargetSize;
+    }
+
     /// Returns the offset of the first Entry in the list.
     uint64_t getEntryOffset() const { return EntryOffset; }
   };
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
index 03ad5d133caddf4..30ac0c0d1507aba 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
@@ -921,7 +921,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
   if (Hdr.BucketCount == 0) {
     // No Hash Table, We need to search through all names in the Name Index.
     for (const NameTableEntry &NTE : *CurrentIndex) {
-      if (NTE.getString() == Key)
+      if (NTE.compareNameAgainst(Key))
         return NTE.getEntryOffset();
     }
     return std::nullopt;
@@ -942,7 +942,7 @@ DWARFDebugNames::ValueIterator::findEntryOffsetInCurrentIndex() {
       return std::nullopt; // End of bucket
 
     NameTableEntry NTE = CurrentIndex->getNameTableEntry(Index);
-    if (NTE.getString() == Key)
+    if (NTE.compareNameAgainst(Key))
       return NTE.getEntryOffset();
   }
   return std::nullopt;

Copy link

github-actions bot commented Jan 28, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -543,6 +543,27 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
return StrData.getCStr(&Off);
}

// Compares the name of this entry against Target, returning true if they
// are equal. This is helpful is hot code paths that do not need the length
// of the name.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

///

bool compareNameAgainst(StringRef Target) const {
// Note: this is not the name, but the rest of debug_names starting from
// name. This handles corrupt data (non-null terminated) without
// overruning the buffer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// overruning the buffer.
// overrunning the buffer.

// Invariant: at the start of the loop, we have non-null characters to
// read from Data.
size_t Idx = 0;
for (; Idx < DataSize && Data[Idx]; Idx++) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the code become shorter / more readable / efficient if we used StringRef.equals() here? https://llvm.org/doxygen/StringRef_8h_source.html#l00164
Or is this also not possible?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can anwser my own question: because it would ignore a NUL in Data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the first thing StringRef.equals does is to compare the lengths of the two StringRefs.
But Data.size() is the length of the debug_str section (starting at the strp offset), not the length of the string we are interested in

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it might not hurt to include the (shortened) rationale about str(n)cmp in the code as a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact we don't know the length of the string we are interested in

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the primary reason why strcmp isn't faster is because in 99% of the cases this look will fail the comparison very early?

Copy link
Collaborator

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with cosmetic changes addressed!

@dwblaikie
Copy link
Collaborator

Why not use strncmp? We can't use the "N" argument of strncmp to be std::min(<debug_str_starting_at_name>.size(), Target.size()), as this would return "equal" when Target is a prefix of Name. To work around this, we would need to require Target to be null-terminated.

Rather than requiring Target to be null terminated, an extra check after strncmp could be done, yeah? (checking that <debug_str_starting_at_name>[Target.size()] == \0`)

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

The final argument for not using strcmp/strncmp is that they did provide any measurable speed benefits when compared to the proposed patch.

Could you elaborate on this further - did strncmp provide equivalent benefits to this proposed patch? But was avoided because of the extra requirement (null termination of Target, currently satisfied, but not necessarily true in the future?)

@adrian-prantl
Copy link
Collaborator

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

I'd expect that to at least in theory be able to beat the loop in this patch because strncmp could be some clever vectorized/unrolled/whatever implementation, but my guess is that since most comparisons fail early, that advantage practically doesn't matter. I would still mildly prefer the strncmp variant over the loop...

@felipepiovezan
Copy link
Contributor Author

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

I hadn't thought of that! To be clear, I also would have written a loop-less version if I had been able to make it work.
But I like David's idea, let me give this a try.

TableEntry names are pointers into the string table section, and accessing their
length requires a search for `\0`. However, 99% of the time we only need to
compare the name against some other other, and such a comparison will fail as
early as the first character.

This commit adds a method to the interface of TableEntry so that such a
comparison can be done without extracting the full name. It saves 10% in the
time (1250ms -> 1100 ms) to evaluate the following expression.

```
lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output
```
@felipepiovezan
Copy link
Contributor Author

I ended up amending the original commit to change its message, since the implementation is now vastly different.
For historical purposes, here is the original implementation:

    // Compares the name of this entry against Target, returning true if they
     // are equal. This is helpful is hot code paths that do not need the length
     // of the name.
     bool compareNameAgainst(StringRef Target) const {
       // Note: this is not the name, but the rest of debug_names starting from
       // name. This handles corrupt data (non-null terminated) without
       // overruning the buffer.
       auto Data = StrData.getData().substr(StringOffset);
       auto DataSize = Data.size();
       auto TargetSize = Target.size();

       // Invariant: at the start of the loop, we have non-null characters to
       // read from Data.
       size_t Idx = 0;
       for (; Idx < DataSize && Data[Idx]; Idx++) {
         if (Idx >= TargetSize || Data[Idx] != Target[Idx])
           return false;
       }
       return Idx == TargetSize;
     }

@felipepiovezan
Copy link
Contributor Author

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

@dwblaikie I had to tweak this slightly: we want Data.size() > Target.size() instead of >=, because a (non-corrupt) string table name must be null terminated and Data.size() includes the \0.

@@ -543,6 +543,19 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
return StrData.getCStr(&Off);
}

/// Compares the name of this entry against Target, returning true if they
/// are equal. This is helpful is hot code paths that do not need the length
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// are equal. This is helpful is hot code paths that do not need the length
/// are equal. This is more efficient in hot code paths that do not need the length

@@ -543,6 +543,19 @@ class DWARFDebugNames : public DWARFAcceleratorTable {
return StrData.getCStr(&Off);
}

/// Compares the name of this entry against Target, returning true if they
/// are equal. This is helpful is hot code paths that do not need the length
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean "This is helpful in" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup! I updated the comment after Adrian's other suggestion

// name. This handles corrupt data (non-null terminated) without
// overrunning the buffer.
auto Data = StrData.getData().substr(StringOffset);
auto TargetSize = Target.size();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the code style guide perspective I think this needs explicit type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... I guess you are right, I will change it. It just makes me uncomfortable how easy it is to trigger implicit conversions the moment I write any types here :(

@dwblaikie
Copy link
Collaborator

So the function could be written, I think, as return Data.size() >= Target.size() && strncmp(Data.data(), Target.data(), Target.size()) == 0 && !Data[Target.size()]; I think? Something like that

@dwblaikie I had to tweak this slightly: we want Data.size() > Target.size() instead of >=, because a (non-corrupt) string table name must be null terminated and Data.size() includes the \0.

Oh, yeah, that totally makes sense! 👍

@felipepiovezan felipepiovezan merged commit 75ea78a into llvm:main Jan 30, 2024
4 checks passed
@felipepiovezan felipepiovezan deleted the felipe/faster_name_cmp branch January 30, 2024 22:28
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Feb 2, 2024
TableEntry names are pointers into the string table section, and
accessing their
length requires a search for `\0`. However, 99% of the time we only need
to
compare the name against some other other, and such a comparison will
fail as
early as the first character.

This commit adds a method to the interface of TableEntry so that such a
comparison can be done without extracting the full name. It saves 10% in
the
time (1250ms -> 1100 ms) to evaluate the following expression.

```
lldb \
  --batch \
  -o "b CodeGenFunction::GenerateCode" \
  -o run \
  -o "expr Fn" \
  -- \
  clang++ -c -g test.cpp -o /dev/null &> output
```

(cherry picked from commit 75ea78a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants