@philnik777 philnik777 commented Sep 25, 2025

This has multiple benefits:

  1. The compiler has to do far less work to figure out that the code folds into a simple popcount, which improves compile times quite a bit.
  2. The compiler inlines better, since it doesn't have to do complicated optimizations to get to the same point. Looking at the pipeline, it seems that without this change LLVM has to go all the way to GVN to reach the same code that is present after the first InstCombine pass with this change.

Currently this applies only to bitsets with at most 64 bits, but that is by far the most common case.
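
For readers following along, here is a minimal sketch of the single-word idea outside of libc++. It is not the library's implementation: `popcount_low_bits` and `SKETCH_USE_BITINT_POPCOUNT` are hypothetical names, the fallback branch (mask plus `__builtin_popcountll`) is a substitute for compilers without `__builtin_popcountg`, and a 64-bit `unsigned long long` is assumed.

```cpp
#include <cstddef>

// Detect whether the Clang builtin used in the PR is available.
// SKETCH_USE_BITINT_POPCOUNT is a hypothetical macro for this sketch only.
#if defined(__clang__) && defined(__has_builtin)
#if __has_builtin(__builtin_popcountg)
#define SKETCH_USE_BITINT_POPCOUNT 1
#endif
#endif

// Hypothetical helper: counts the set bits among the low N bits of `word`.
// Only the single-word case (1 <= N <= 64) is covered.
template <std::size_t N>
std::size_t popcount_low_bits(unsigned long long word) {
  static_assert(N >= 1 && N <= 64, "sketch covers the single-word case only");
#ifdef SKETCH_USE_BITINT_POPCOUNT
  // Same shape as the PR: truncate to exactly N bits, then one popcount.
  return static_cast<std::size_t>(
      __builtin_popcountg(static_cast<unsigned _BitInt(N)>(word)));
#else
  // Assumed fallback (not from the PR): mask to N bits, then popcount.
  // The shift amount is in [0, 63] because N is in [1, 64].
  const unsigned long long mask = ~0ull >> (64 - N);
  return static_cast<std::size_t>(__builtin_popcountll(word & mask));
#endif
}
```

Under these assumptions, `popcount_low_bits<8>(0xA5)` should agree with `std::bitset<8>(0xA5).count()`, and any bits above position N-1 are ignored in both branches.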

github-actions bot commented Sep 25, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@ldionne ldionne marked this pull request as ready for review September 25, 2025 12:21
@ldionne ldionne requested a review from a team as a code owner September 25, 2025 12:21
@llvmbot llvmbot added the libc++ label (libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.) Sep 25, 2025
llvmbot commented Sep 25, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Full diff: https://github.com/llvm/llvm-project/pull/160679.diff

1 Files Affected:

  • (modified) libcxx/include/bitset (+8-1)
diff --git a/libcxx/include/bitset b/libcxx/include/bitset
index e2b46154ae730..d8bb938456104 100644
--- a/libcxx/include/bitset
+++ b/libcxx/include/bitset
@@ -867,7 +867,14 @@ bitset<_Size>::to_string(char __zero, char __one) const {
 
 template <size_t _Size>
 inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 size_t bitset<_Size>::count() const _NOEXCEPT {
-  return static_cast<size_t>(std::count(__base::__make_iter(0), __base::__make_iter(_Size), true));
+#ifdef _LIBCPP_COMPILER_CLANG_BASED
+  if constexpr (_Size <= __base::__bits_per_word) {
+    return __builtin_popcountg(static_cast<unsigned _BitInt(_Size)>(__base::__first_));
+  } else
+#endif
+  {
+    return static_cast<size_t>(std::count(__base::__make_iter(0), __base::__make_iter(_Size), true));
+  }
 }
 
 template <size_t _Size>

@philnik777 philnik777 force-pushed the bitset_use_bitint_popcount branch from 28df28b to d820b2c Compare September 25, 2025 12:44
    return 0;
  } else if constexpr (_Size <= __base::__bits_per_word) {
    return __builtin_popcountg(static_cast<unsigned _BitInt(_Size)>(__base::__first_));
  } else
Member

Do we really need this else? Can't we have

#if defined(CLANG)
if constexpr (...) {
  // A
} else if constexpr (...) {
  // B
}
#endif

return std::count(...);

Seems a bit simpler and equivalent. Are you worried about the mere presence of return std::count(...) preventing inlining?

Contributor Author

We don't need the else, but it improves compile times quite a bit, since we avoid instantiating std::count altogether with it. IMO the tiny simplification isn't worth the compile time increase.
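
The instantiation point can be illustrated with a small sketch. The names `expensive_fallback` and `with_else` are hypothetical stand-ins, not libc++ code: the fallback plays the role of `std::count`, and its `static_assert` makes any instantiation for a small size a hard error, so the effect is observable at compile time.

```cpp
#include <cstddef>

// Hypothetical stand-in for a costly template such as std::count.
// Instantiating it with N <= 64 fails to compile, which makes it easy
// to see whether a given code shape instantiates it.
template <std::size_t N>
std::size_t expensive_fallback() {
  static_assert(N > 64, "fallback instantiated for a single-word size");
  return N;
}

// Shape used in the PR: the fallback sits in the else branch of
// `if constexpr`, so it is a discarded statement when N <= 64 and
// expensive_fallback<N> is never instantiated for those sizes.
template <std::size_t N>
std::size_t with_else() {
  if constexpr (N <= 64) {
    return 1; // fast path placeholder
  } else {
    return expensive_fallback<N>();
  }
}

// Shape suggested in review, shown only as a comment because it would not
// compile with this stand-in for N <= 64: the trailing return is not a
// discarded statement, so expensive_fallback<N> is instantiated regardless.
//
//   template <std::size_t N>
//   std::size_t without_else() {
//     if constexpr (N <= 64) { return 1; }
//     return expensive_fallback<N>();
//   }
```

With this sketch, `with_else<8>()` compiles and takes the fast path, while `with_else<128>()` instantiates and calls the fallback. Both shapes return the same values at run time; the difference is only in what the compiler has to instantiate.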

@philnik777 philnik777 force-pushed the bitset_use_bitint_popcount branch from d820b2c to c5c359b Compare September 25, 2025 13:32
@philnik777 philnik777 merged commit 1b0553c into llvm:main Sep 25, 2025
71 of 73 checks passed
@philnik777 philnik777 deleted the bitset_use_bitint_popcount branch September 25, 2025 18:16
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
…#160679)
