Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up / speed up ULEB128 decoding #73585

Closed
wants to merge 4 commits into from

Conversation

adrian-prantl
Copy link
Collaborator

This series of patches simplifies the two [U]LEB128 decoder functions in LLVM and makes them ever so slightly faster in the process.

As a quick performance test decoding DWARF I instructed dwarfdump to print all DIEs with the name "end" in clang.dSYM without using the accelerator tables:

_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.34s user 0.51s system 98% cpu 21.151 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.15s user 0.50s system 98% cpu 20.950 total
_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.33s user 0.50s system 98% cpu 21.178 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.21s user 0.50s system 98% cpu 21.027 total
_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.35s user 0.53s system 98% cpu 21.224 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.27s user 0.49s system 98% cpu 21.057 total

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 27, 2023

@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-coff
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-binary-utilities

Author: Adrian Prantl (adrian-prantl)

Changes

This series of patches simplifies the two [U]LEB128 decoder functions in LLVM and makes them ever so slightly faster in the process.

As a quick performance test decoding DWARF I instructed dwarfdump to print all DIEs with the name "end" in clang.dSYM without using the accelerator tables:

_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.34s user 0.51s system 98% cpu 21.151 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.15s user 0.50s system 98% cpu 20.950 total
_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.33s user 0.50s system 98% cpu 21.178 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.21s user 0.50s system 98% cpu 21.027 total
_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.35s user 0.53s system 98% cpu 21.224 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.27s user 0.49s system 98% cpu 21.057 total

Full diff: https://github.com/llvm/llvm-project/pull/73585.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Support/LEB128.h (+18-15)
  • (modified) llvm/lib/Object/MachOObjectFile.cpp (+1-1)
diff --git a/llvm/include/llvm/Support/LEB128.h b/llvm/include/llvm/Support/LEB128.h
index a5d367279aefe64..3d5e98c4b2cddee 100644
--- a/llvm/include/llvm/Support/LEB128.h
+++ b/llvm/include/llvm/Support/LEB128.h
@@ -125,29 +125,30 @@ inline unsigned encodeULEB128(uint64_t Value, uint8_t *p,
 }
 
 /// Utility function to decode a ULEB128 value.
+///
+/// If \p error is non-null, it will point to a static error message,
+/// if an error occured. It will not be modified on success.
 inline uint64_t decodeULEB128(const uint8_t *p, unsigned *n = nullptr,
                               const uint8_t *end = nullptr,
                               const char **error = nullptr) {
   const uint8_t *orig_p = p;
   uint64_t Value = 0;
   unsigned Shift = 0;
-  if (error)
-    *error = nullptr;
   do {
-    if (p == end) {
+    if (LLVM_UNLIKELY(p == end)) {
       if (error)
         *error = "malformed uleb128, extends past end";
-      if (n)
-        *n = (unsigned)(p - orig_p);
-      return 0;
+      Value = 0;
+      break;
     }
     uint64_t Slice = *p & 0x7f;
-    if ((Shift >= 64 && Slice != 0) || Slice << Shift >> Shift != Slice) {
+    if (LLVM_UNLIKELY(Shift >= 63) &&
+        ((Shift == 63 && ((Slice << Shift) >> Shift) != Slice) ||
+         (Shift > 63 && Slice != 0))) {
       if (error)
         *error = "uleb128 too big for uint64";
-      if (n)
-        *n = (unsigned)(p - orig_p);
-      return 0;
+      Value = 0;
+      break;
     }
     Value += Slice << Shift;
     Shift += 7;
@@ -158,6 +159,9 @@ inline uint64_t decodeULEB128(const uint8_t *p, unsigned *n = nullptr,
 }
 
 /// Utility function to decode a SLEB128 value.
+///
+/// If \p error is non-null, it will point to a static error message,
+/// if an error occured. It will not be modified on success.
 inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr,
                              const uint8_t *end = nullptr,
                              const char **error = nullptr) {
@@ -165,10 +169,8 @@ inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr,
   int64_t Value = 0;
   unsigned Shift = 0;
   uint8_t Byte;
-  if (error)
-    *error = nullptr;
   do {
-    if (p == end) {
+    if (LLVM_UNLIKELY(p == end)) {
       if (error)
         *error = "malformed sleb128, extends past end";
       if (n)
@@ -177,8 +179,9 @@ inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr,
     }
     Byte = *p;
     uint64_t Slice = Byte & 0x7f;
-    if ((Shift >= 64 && Slice != (Value < 0 ? 0x7f : 0x00)) ||
-        (Shift == 63 && Slice != 0 && Slice != 0x7f)) {
+    if (LLVM_UNLIKELY(Shift >= 63) &&
+        ((Shift == 63 && Slice != 0 && Slice != 0x7f) ||
+         (Shift > 63 && Slice != (Value < 0 ? 0x7f : 0x00)))) {
       if (error)
         *error = "sleb128 too big for int64";
       if (n)
diff --git a/llvm/lib/Object/MachOObjectFile.cpp b/llvm/lib/Object/MachOObjectFile.cpp
index aa57de16ed18f44..11ad8aeae65da5d 100644
--- a/llvm/lib/Object/MachOObjectFile.cpp
+++ b/llvm/lib/Object/MachOObjectFile.cpp
@@ -2996,7 +2996,7 @@ void ExportEntry::pushNode(uint64_t offset) {
   ErrorAsOutParameter ErrAsOutParam(E);
   const uint8_t *Ptr = Trie.begin() + offset;
   NodeState State(Ptr);
-  const char *error;
+  const char *error = nullptr;
   uint64_t ExportInfoSize = readULEB128(State.Current, &error);
   if (error) {
     *E = malformedError("export info size " + Twine(error) +

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 27, 2023

@llvm/pr-subscribers-llvm-support

Author: Adrian Prantl (adrian-prantl)

Changes

This series of patches simplifies the two [U]LEB128 decoder functions in LLVM and makes them ever so slightly faster in the process.

As a quick performance test decoding DWARF I instructed dwarfdump to print all DIEs with the name "end" in clang.dSYM without using the accelerator tables:

_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.34s user 0.51s system 98% cpu 21.151 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.15s user 0.50s system 98% cpu 20.950 total
_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.33s user 0.50s system 98% cpu 21.178 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.21s user 0.50s system 98% cpu 21.027 total
_build.ninja.noassert$ time bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM ; time bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM
bin/llvm-dwarfdump-old -n end -o /dev/null bin/clang-18.dSYM  20.35s user 0.53s system 98% cpu 21.224 total
bin/llvm-dwarfdump-new -n end -o /dev/null bin/clang-18.dSYM  20.27s user 0.49s system 98% cpu 21.057 total

Full diff: https://github.com/llvm/llvm-project/pull/73585.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Support/LEB128.h (+18-15)
  • (modified) llvm/lib/Object/MachOObjectFile.cpp (+1-1)
diff --git a/llvm/include/llvm/Support/LEB128.h b/llvm/include/llvm/Support/LEB128.h
index a5d367279aefe64..3d5e98c4b2cddee 100644
--- a/llvm/include/llvm/Support/LEB128.h
+++ b/llvm/include/llvm/Support/LEB128.h
@@ -125,29 +125,30 @@ inline unsigned encodeULEB128(uint64_t Value, uint8_t *p,
 }
 
 /// Utility function to decode a ULEB128 value.
+///
+/// If \p error is non-null, it will point to a static error message,
+/// if an error occured. It will not be modified on success.
 inline uint64_t decodeULEB128(const uint8_t *p, unsigned *n = nullptr,
                               const uint8_t *end = nullptr,
                               const char **error = nullptr) {
   const uint8_t *orig_p = p;
   uint64_t Value = 0;
   unsigned Shift = 0;
-  if (error)
-    *error = nullptr;
   do {
-    if (p == end) {
+    if (LLVM_UNLIKELY(p == end)) {
       if (error)
         *error = "malformed uleb128, extends past end";
-      if (n)
-        *n = (unsigned)(p - orig_p);
-      return 0;
+      Value = 0;
+      break;
     }
     uint64_t Slice = *p & 0x7f;
-    if ((Shift >= 64 && Slice != 0) || Slice << Shift >> Shift != Slice) {
+    if (LLVM_UNLIKELY(Shift >= 63) &&
+        ((Shift == 63 && ((Slice << Shift) >> Shift) != Slice) ||
+         (Shift > 63 && Slice != 0))) {
       if (error)
         *error = "uleb128 too big for uint64";
-      if (n)
-        *n = (unsigned)(p - orig_p);
-      return 0;
+      Value = 0;
+      break;
     }
     Value += Slice << Shift;
     Shift += 7;
@@ -158,6 +159,9 @@ inline uint64_t decodeULEB128(const uint8_t *p, unsigned *n = nullptr,
 }
 
 /// Utility function to decode a SLEB128 value.
+///
+/// If \p error is non-null, it will point to a static error message,
+/// if an error occured. It will not be modified on success.
 inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr,
                              const uint8_t *end = nullptr,
                              const char **error = nullptr) {
@@ -165,10 +169,8 @@ inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr,
   int64_t Value = 0;
   unsigned Shift = 0;
   uint8_t Byte;
-  if (error)
-    *error = nullptr;
   do {
-    if (p == end) {
+    if (LLVM_UNLIKELY(p == end)) {
       if (error)
         *error = "malformed sleb128, extends past end";
       if (n)
@@ -177,8 +179,9 @@ inline int64_t decodeSLEB128(const uint8_t *p, unsigned *n = nullptr,
     }
     Byte = *p;
     uint64_t Slice = Byte & 0x7f;
-    if ((Shift >= 64 && Slice != (Value < 0 ? 0x7f : 0x00)) ||
-        (Shift == 63 && Slice != 0 && Slice != 0x7f)) {
+    if (LLVM_UNLIKELY(Shift >= 63) &&
+        ((Shift == 63 && Slice != 0 && Slice != 0x7f) ||
+         (Shift > 63 && Slice != (Value < 0 ? 0x7f : 0x00)))) {
       if (error)
         *error = "sleb128 too big for int64";
       if (n)
diff --git a/llvm/lib/Object/MachOObjectFile.cpp b/llvm/lib/Object/MachOObjectFile.cpp
index aa57de16ed18f44..11ad8aeae65da5d 100644
--- a/llvm/lib/Object/MachOObjectFile.cpp
+++ b/llvm/lib/Object/MachOObjectFile.cpp
@@ -2996,7 +2996,7 @@ void ExportEntry::pushNode(uint64_t offset) {
   ErrorAsOutParameter ErrAsOutParam(E);
   const uint8_t *Ptr = Trie.begin() + offset;
   NodeState State(Ptr);
-  const char *error;
+  const char *error = nullptr;
   uint64_t ExportInfoSize = readULEB128(State.Current, &error);
   if (error) {
     *E = malformedError("export info size " + Twine(error) +

}
uint64_t Slice = *p & 0x7f;
if ((Shift >= 64 && Slice != 0) || Slice << Shift >> Shift != Slice) {
if (LLVM_UNLIKELY(Shift >= 63) &&
((Shift == 63 && ((Slice << Shift) >> Shift) != Slice) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inner paren in (Slice << Shift) >> Shift can be omitted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only added this for better readability. At first glance x << Shift >> Shift looks like a noop otherwise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it though, the parenthesis don't make this any clearer.

This change removes an unnecessary branch from a hot path. It's also
questionable API to override any previous error unconditonally.
Previously the overflow check was done for every byte even though it
is only needed for the case where Shift == 63.
Copy link
Contributor

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Nice side-effect of exposing uninitialized variables.

@adrian-prantl
Copy link
Collaborator Author

I manually pushed this in

commit 80fc872a24c4dca4820d2e7885b5ee9195bec42a
Author: Adrian Prantl <aprantl@apple.com>
Date:   Mon Nov 27 14:46:57 2023 -0800

    [LEB128] Mark error condition with LLVM_UNLIKELY

commit 0cc2acc30b3d2f4e914fd49c599cfde8a17f26a6
Author: Adrian Prantl <aprantl@apple.com>
Date:   Mon Nov 27 14:15:12 2023 -0800

    [LEB128] Don't handle edge cases in every loop iteration
    
    Previously the overflow check was done for every byte even though it
    is only needed for the case where Shift == 63.

commit b96121c2e7de66154a70db5f202c9adce515aa45
Author: Adrian Prantl <aprantl@apple.com>
Date:   Mon Nov 27 14:15:06 2023 -0800

    [LEB128] Factor out redundant code

commit 545c8e009e2b649ef38f7e432ffbc06ba8a9b813
Author: Adrian Prantl <aprantl@apple.com>
Date:   Mon Nov 27 10:42:57 2023 -0800

    [LEB128] Don't initialize error on success
    
    This change removes an unnecessary branch from a hot path. It's also
    questionable API to override any previous error unconditonally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants