Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweak BumpPtrAllocator to benefit the hot path #90571

Merged
merged 1 commit into from
May 1, 2024

Conversation

resistor
Copy link
Collaborator

This takes the form of three consecutive but related changes:

  • Mark the fast path of BumpPtrAllocator as likely-taken.
  • Move the slow path of BumpPtrAllocator to a separate function.
  • Mark the slow path of BumpPtrAllocator as noinline.

Overall, this saves geomean 0.4% userspace instructions on CTMark -O3,
and 0.98% on CTMark -O0 -g.

http://llvm-compile-time-tracker.com/compare.php?from=e1622e189e8c0ef457bfac528f90a7a930d9aad2&to=9eb53a4ed3af4a55e769ae1dd22d034b63d046e3&stat=instructions%3Au

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 30, 2024

@llvm/pr-subscribers-llvm-support

Author: Owen Anderson (resistor)

Changes

This takes the form of three consecutive but related changes:

  • Mark the fast path of BumpPtrAllocator as likely-taken.
  • Move the slow path of BumpPtrAllocator to a separate function.
  • Mark the slow path of BumpPtrAllocator as noinline.

Overall, this saves geomean 0.4% userspace instructions on CTMark -O3,
and 0.98% on CTMark -O0 -g.

http://llvm-compile-time-tracker.com/compare.php?from=e1622e189e8c0ef457bfac528f90a7a930d9aad2&to=9eb53a4ed3af4a55e769ae1dd22d034b63d046e3&stat=instructions%3Au


Full diff: https://github.com/llvm/llvm-project/pull/90571.diff

1 Files Affected:

  • (modified) llvm/include/llvm/Support/Allocator.h (+8-3)
diff --git a/llvm/include/llvm/Support/Allocator.h b/llvm/include/llvm/Support/Allocator.h
index c1e5c6d2853bd5..873d453650187e 100644
--- a/llvm/include/llvm/Support/Allocator.h
+++ b/llvm/include/llvm/Support/Allocator.h
@@ -159,9 +159,9 @@ class BumpPtrAllocatorImpl
 #endif
 
     // Check if we have enough space.
-    if (Adjustment + SizeToAllocate <= size_t(End - CurPtr)
-        // We can't return nullptr even for a zero-sized allocation!
-        && CurPtr != nullptr) {
+    if (LLVM_LIKELY(Adjustment + SizeToAllocate <= size_t(End - CurPtr)
+                    // We can't return nullptr even for a zero-sized allocation!
+                    && CurPtr != nullptr)) {
       char *AlignedPtr = CurPtr + Adjustment;
       CurPtr = AlignedPtr + SizeToAllocate;
       // Update the allocation point of this memory block in MemorySanitizer.
@@ -173,6 +173,11 @@ class BumpPtrAllocatorImpl
       return AlignedPtr;
     }
 
+    return AllocateSlow(SizeToAllocate, Alignment);
+  }
+
+  LLVM_ATTRIBUTE_RETURNS_NONNULL LLVM_ATTRIBUTE_NOINLINE void *
+  AllocateSlow(size_t SizeToAllocate, Align Alignment) {
     // If Size is really big, allocate a separate slab for it.
     size_t PaddedSize = SizeToAllocate + Alignment.value() - 1;
     if (PaddedSize > SizeThreshold) {

@resistor resistor requested a review from nikic April 30, 2024 08:03
This takes the form of three consecutive but related changes:
- Mark the fast path of BumpPtrAllocator as likely-taken.
- Move the slow path of BumpPtrAllocator to a separate function.
- Mark the slow path of BumpPtrAllocator as noinline.

Overall, this saves geomean 0.4% userspace instructions on CTMark -O3,
and 0.98% on CTMark -O0 -g.
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@resistor resistor force-pushed the perf/bumpptr branch 2 times, most recently from ab3ff66 to 0b8de15 Compare May 1, 2024 04:53
@resistor resistor merged commit cd46c2c into llvm:main May 1, 2024
3 of 4 checks passed
@resistor resistor deleted the perf/bumpptr branch May 7, 2024 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants