Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Log2_64_Ceil to compute PowerOf2Ceil #67580

Merged
merged 1 commit into from
Jan 16, 2024
Merged

Conversation

AtariDreams
Copy link
Contributor

@AtariDreams AtariDreams commented Sep 27, 2023

Instead of calling NextPowerOf2, which is only useful for constants,
we should call Log2_64_Ceil, which is faster because it uses compiler
intrinsics where supported.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 27, 2023

@llvm/pr-subscribers-llvm-support

Changes

Instead of calling NextPowerOf2, we can use compiler intrinsics to do this faster.


Full diff: https://github.com/llvm/llvm-project/pull/67580.diff

1 Files Affected:

  • (modified) llvm/include/llvm/Support/MathExtras.h (+4)
diff --git a/llvm/include/llvm/Support/MathExtras.h b/llvm/include/llvm/Support/MathExtras.h
index dc095941fdc8a9f..f592beee5ccffc2 100644
--- a/llvm/include/llvm/Support/MathExtras.h
+++ b/llvm/include/llvm/Support/MathExtras.h
@@ -361,7 +361,11 @@ constexpr inline uint64_t NextPowerOf2(uint64_t A) {
 inline uint64_t PowerOf2Ceil(uint64_t A) {
   if (!A)
     return 0;
+#if __has_builtin(__builtin_clzl)
+  return 1ULL << (64 - __builtin_clzl(A - 1));
+#else
   return NextPowerOf2(A - 1);
+#endif
 }
 
 /// Returns the next integer (mod 2**64) that is greater than or equal to

@AtariDreams AtariDreams changed the title Use compiler intrinsics to compute PowerOf2Ceil Use countl_zero to compute PowerOf2Ceil Sep 27, 2023
@AtariDreams AtariDreams changed the title Use countl_zero to compute PowerOf2Ceil Use Log2_64_Ceil to compute PowerOf2Ceil Sep 27, 2023
@AtariDreams
Copy link
Contributor Author

@gysit thoughts on this? @MaskRay

@gysit
Copy link
Contributor

gysit commented Sep 29, 2023

@AtariDreams I am not really familiar with the reasoning behind the current solution. I would probably ping the original author of the function?

@AtariDreams
Copy link
Contributor Author

@dcci Thoughts on this?

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is manual strength reduction. NextPowerOf2 is a complex operation whose original purpose isn't clear to me.

@AtariDreams
Copy link
Contributor Author

This is manual strength reduction. NextPowerOf2 is a complex operation whose original purpose isn't clear to me.

The only thing NextPowerOf2 has doing for it is that it is constexpr but also like it's used in non constexpr places too but that is something for another PR to address

@AtariDreams
Copy link
Contributor Author

@MaskRay Can we please merge this?

@AtariDreams
Copy link
Contributor Author

@MaskRay Any updates?

@rilysh
Copy link
Contributor

rilysh commented Oct 18, 2023

The current implemention of NextPowerOf2 seems to be the most portable way to achieve the power of 2, which is equivalent to int(pow(2, ceil(log2(x)))). Only noticeable difference to me is that in the PR implementation, if x > (UINT32_MAX + 1), the current NextPowerOf2 and PR changes of that function will yield different results.

@MaskRay
Copy link
Member

MaskRay commented Oct 20, 2023

The current implemention of NextPowerOf2 seems to be the most portable way to achieve the power of 2, which is equivalent to int(pow(2, ceil(log2(x)))). Only noticeable difference to me is that in the PR implementation, if x > (UINT32_MAX + 1), the current NextPowerOf2 and PR changes of that function will yield different results.

Thanks for catching this.

Minor correction, when the argument is larger than 2**63, NextPowerOf2(A - 1) returns 0 (reasonable) while UINT64_C(1) << 64-std::countl_zero(A-1) is undefined (the right operand is 0)

@AtariDreams
Copy link
Contributor Author

Fixed!

@github-actions
Copy link

github-actions bot commented Oct 28, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@AtariDreams
Copy link
Contributor Author

@MaskRay All done and taken care of!

Instead of calling NextPowerOf2, which is only useful for constants,
we should call Log2_64_Ceil, which is faster because it uses compiler
intrinsics where supported.
@MaskRay MaskRay merged commit eca2529 into llvm:main Jan 16, 2024
4 checks passed
@AtariDreams AtariDreams deleted the fputc branch January 16, 2024 02:56
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Instead of calling NextPowerOf2, which is only useful for constants,
we should call Log2_64_Ceil, which is faster because it uses compiler
intrinsics where supported.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants