@c8ef c8ef commented Oct 11, 2025

Part of #102817.

This patch attempts to optimize the performance of std::generate for segmented iterators. Below are the benchmark numbers from libcxx\test\benchmarks\algorithms\modifying\generate.bench.cpp. Test cases that use segmented iterators have also been added.

  • before
std::generate(deque<int>)/32           194 ns          193 ns      3733333
std::generate(deque<int>)/50           276 ns          276 ns      2488889
std::generate(deque<int>)/1024        5096 ns         5022 ns       112000
std::generate(deque<int>)/8192       40806 ns        40806 ns        17231
  • after
std::generate(deque<int>)/32           106 ns          105 ns      6400000
std::generate(deque<int>)/50           139 ns          138 ns      4977778
std::generate(deque<int>)/1024        2713 ns         2699 ns       248889
std::generate(deque<int>)/8192       18983 ns        19252 ns        37333
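
For readers unfamiliar with segmented iterators, the idea behind the speedup can be sketched roughly as follows. This is only an illustration under stated assumptions: `Blocks` and `generate_segmented` are hypothetical names, and a vector of vectors stands in for the block layout of `std::deque`. It is not the libc++ implementation, which goes through the internal `__for_each_segment` helper shown in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a segmented container: a list of contiguous blocks.
// std::deque is laid out in a similar way, but its block size is
// implementation-defined.
using Blocks = std::vector<std::vector<int> >;

// Fill every element block by block. The inner loop runs over raw pointers
// into one contiguous block, so it needs no per-element "end of block?" check.
template <class Generator>
void generate_segmented(Blocks& blocks, Generator gen) {
  for (std::size_t i = 0; i != blocks.size(); ++i) {
    int* first = blocks[i].data();
    int* last  = first + blocks[i].size();
    for (; first != last; ++first)
      *first = gen();
  }
}

// Small generator that counts upward from its starting value.
struct Counter {
  int next;
  int operator()() { return next++; }
};
```

Calling `generate_segmented(blocks, Counter{0})` on blocks of sizes 4, 4 and 2 writes 0 through 9 in traversal order, the same result a plain element-by-element loop would give.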

@c8ef c8ef requested a review from a team as a code owner October 11, 2025 15:23
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Oct 11, 2025

llvmbot commented Oct 11, 2025

@llvm/pr-subscribers-libcxx

Author: Connector Switch (c8ef)

Changes

Part of #102817.


Full diff: https://github.com/llvm/llvm-project/pull/163006.diff

2 Files Affected:

  • (modified) libcxx/include/__algorithm/generate.h (+25-2)
  • (modified) libcxx/test/std/algorithms/alg.modifying.operations/alg.generate/generate.pass.cpp (+11)
diff --git a/libcxx/include/__algorithm/generate.h b/libcxx/include/__algorithm/generate.h
index c95b527402f5d..91e2ada7daf77 100644
--- a/libcxx/include/__algorithm/generate.h
+++ b/libcxx/include/__algorithm/generate.h
@@ -9,7 +9,10 @@
 #ifndef _LIBCPP___ALGORITHM_GENERATE_H
 #define _LIBCPP___ALGORITHM_GENERATE_H
 
+#include <__algorithm/for_each_segment.h>
 #include <__config>
+#include <__iterator/segmented_iterator.h>
+#include <__type_traits/enable_if.h>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
@@ -17,13 +20,33 @@
 
 _LIBCPP_BEGIN_NAMESPACE_STD
 
-template <class _ForwardIterator, class _Generator>
+template <class _ForwardIterator, class _Sent, class _Generator>
 inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void
-generate(_ForwardIterator __first, _ForwardIterator __last, _Generator __gen) {
+__generate(_ForwardIterator __first, _Sent __last, _Generator __gen) {
   for (; __first != __last; ++__first)
     *__first = __gen();
 }
 
+#ifndef _LIBCPP_CXX03_LANG
+template <class _SegmentedIterator,
+          class _Generator,
+          __enable_if_t<__is_segmented_iterator_v<_SegmentedIterator>, int> = 0>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
+_SegmentedIterator __generate(_SegmentedIterator __first, _SegmentedIterator __last, _Generator& __gen) {
+  using __local_iterator_t = typename __segmented_iterator_traits<_SegmentedIterator>::__local_iterator;
+  std::__for_each_segment(__first, __last, [&](__local_iterator_t __lfirst, __local_iterator_t __llast) {
+    std::__generate(__lfirst, __llast, __gen);
+  });
+  return __last;
+}
+#endif // !_LIBCPP_CXX03_LANG
+
+template <class _ForwardIterator, class _Generator>
+inline _LIBCPP_HIDE_FROM_ABI
+_LIBCPP_CONSTEXPR_SINCE_CXX20 void generate(_ForwardIterator __first, _ForwardIterator __last, _Generator __gen) {
+  std::__generate(__first, __last, __gen);
+}
+
 _LIBCPP_END_NAMESPACE_STD
 
 #endif // _LIBCPP___ALGORITHM_GENERATE_H
diff --git a/libcxx/test/std/algorithms/alg.modifying.operations/alg.generate/generate.pass.cpp b/libcxx/test/std/algorithms/alg.modifying.operations/alg.generate/generate.pass.cpp
index 29d32d7156742..4591d7ece4645 100644
--- a/libcxx/test/std/algorithms/alg.modifying.operations/alg.generate/generate.pass.cpp
+++ b/libcxx/test/std/algorithms/alg.modifying.operations/alg.generate/generate.pass.cpp
@@ -16,6 +16,7 @@
 
 #include <algorithm>
 #include <cassert>
+#include <deque>
 
 #include "test_macros.h"
 #include "test_iterators.h"
@@ -51,12 +52,22 @@ test()
     assert(ia[3] == 1);
 }
 
+void deque_test() {
+  int sizes[] = {0, 1, 2, 1023, 1024, 1025, 2047, 2048, 2049};
+  for (const int size : sizes) {
+    std::deque<int> d(size);
+    std::generate(d.begin(), d.end(), gen_test());
+    assert(std::all_of(d.begin(), d.end(), [](int x) { return x == 1; }));
+  }
+}
+
 int main(int, char**)
 {
     test<forward_iterator<int*> >();
     test<bidirectional_iterator<int*> >();
     test<random_access_iterator<int*> >();
     test<int*>();
+    deque_test();
 
 #if TEST_STD_VER > 17
     static_assert(test_constexpr());

@philnik777 philnik777 left a comment

Can we instead just forward to std::for_each?

c8ef commented Oct 13, 2025

> Can we instead just forward to std::for_each?

Do you mean something like the following?

template<class ForwardIt, class Generator>
void generate(ForwardIt first, ForwardIt last, Generator gen) {
    std::for_each(first, last, [&gen](auto& element) {
        element = gen();
    });
}

Will test this tonight.

c8ef commented Oct 13, 2025

To some extent, I think the current implementation is also acceptable since it uses the for_each_segment utility.

c8ef commented Oct 13, 2025

std::for_each(first, last, [&gen](auto& element) {
        element = gen();
    });
std::generate(deque<int>)/32           220 ns          220 ns      3200000
std::generate(deque<int>)/50           321 ns          322 ns      2133333
std::generate(deque<int>)/1024        5808 ns         5720 ns       112000
std::generate(deque<int>)/8192       46257 ns        46527 ns        15448

Forwarding this to std::for_each seems to make it even slower than the current implementation.

template <class _ForwardIterator, class _Generator>
inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void
generate(_ForwardIterator __first, _ForwardIterator __last, _Generator __gen) {
  std::for_each(__first, __last, [&](auto& __element) { __element = __gen(); });
}
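
As a self-contained illustration of the forwarding idea discussed here, the sketch below wraps `std::for_each` in a user-level helper. The names `generate_via_for_each` and `make_iota_deque` are hypothetical and not part of libc++. The generic lambda requires C++14, and the `auto&` parameter assumes that dereferencing the iterator yields a real lvalue reference, as it does for `std::deque<int>`.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// Hypothetical user-level helper, not the libc++ implementation.
// Requires C++14 because of the generic lambda.
template <class ForwardIt, class Generator>
void generate_via_for_each(ForwardIt first, ForwardIt last, Generator gen) {
  // `auto&` binds only to lvalues, so this assumes *first is a true reference.
  std::for_each(first, last, [&gen](auto& element) { element = gen(); });
}

// Build a deque holding 0, 1, ..., n-1 using the helper above.
inline std::deque<int> make_iota_deque(int n) {
  std::deque<int> d(n);
  int next = 0;
  generate_via_for_each(d.begin(), d.end(), [&next] { return next++; });
  return d;
}
```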

@philnik777

Have you enabled optimizations?

c8ef commented Oct 13, 2025

std::generate(deque<int>)/32          16.4 ns         16.4 ns     44800000
std::generate(deque<int>)/50          24.9 ns         25.1 ns     28000000
std::generate(deque<int>)/1024         288 ns          289 ns      2488889
std::generate(deque<int>)/8192        2284 ns         2295 ns       320000
std::generate(deque<int>)/32          16.5 ns         16.1 ns     40727273
std::generate(deque<int>)/50          24.9 ns         25.1 ns     28000000
std::generate(deque<int>)/1024         288 ns          289 ns      2488889
std::generate(deque<int>)/8192        2192 ns         2197 ns       320000

> Have you enabled optimizations?

It seems that running ./bin/llvm-lit generate.bench.cpp with the default settings does not enable optimizations (or my configuration is incorrect). With optimizations enabled, both implementations have the expected performance.

c8ef commented Oct 13, 2025

This looks weird...

  | -- Performing Test HAVE_STD_REGEX -- failed to compile
  | -- Compiling and running to test HAVE_GNU_POSIX_REGEX
  | -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
  | -- Compiling and running to test HAVE_POSIX_REGEX
  | -- Performing Test HAVE_POSIX_REGEX -- failed to compile
  | CMake Error at CMakeLists.txt:315 (message):
  | Failed to determine the source files for the regular expression backend

c8ef commented Oct 13, 2025

While both implementations offer the same performance, the benchmark fails to build with the for_each forwarding approach, specifically at the regex configuration check. This is odd because locally, after building with ninja cxx, compiling a file that contains #include <regex> doesn't produce an error.

@philnik777 Could you please help take a look?

@philnik777

This usually indicates that your code doesn't work in C++14 mode. You should try to run lit with e.g. --param=std=c++03 and see whether that fails.

c8ef commented Oct 14, 2025

> This usually indicates that your code doesn't work in C++14 mode. You should try to run lit with e.g. --param=std=c++03 and see whether that fails.

# .---command stderr------------
# | In file included from C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:14:
# | In file included from C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/algorithm:1861:
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/for_each.h:34:5: error: no matching function for call to '__invoke'
# |    34 |     std::__invoke(__f, std::__invoke(__proj, *__first));
# |       |     ^~~~~~~~~~~~~
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/for_each.h:57:8: note: in instantiation of function template specialization 'std::__for_each<LifetimeIterator, LifetimeIterator, (lambda at C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34), std::__identity>' requested here
# |    57 |   std::__for_each(__first, __last, __f, __proj);
# |       |        ^
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:8: note: in instantiation of function template specialization 'std::for_each<LifetimeIterator, (lambda at C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34)>' requested here
# |    24 |   std::for_each(__first, __last, [&](auto& __element) { __element = __gen(); });
# |       |        ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:710:47: note: in instantiation of function template specialization 'std::generate<LifetimeIterator, (lambda at C://llvm-project/libcxx/test/std/algorithms/robust_against_proxy_iterators_lifetime_bugs.pass.cpp:651:14)>' requested here
# |   710 |   test(simple_in, [&](I b, I e) { (void) std::generate(b, e, gen); });
# |       |                                               ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:710:33: note: while substituting into a lambda expression here
# |   710 |   test(simple_in, [&](I b, I e) { (void) std::generate(b, e, gen); });
# |       |                                 ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:763:3: note: in instantiation of function template specialization 'test<LifetimeIterator>' requested here
# |   763 |   test<LifetimeIterator>();
# |       |   ^
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:88:69: note: candidate template ignored: substitution failure [with _Args = <(lambda at C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34) &, LifetimeIterator::Reference>]: no type named 'type' in 'std::__invoke_result_impl<void, (lambda at C:/Users/Mario/Documents/llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34) &, LifetimeIterator::Reference>'
# |    85 | using __invoke_result_t _LIBCPP_NODEBUG = typename __invoke_result<_Args...>::type;
# |       | ~~~~~
# |    86 |
# |    87 | template <class... _Args>
# |    88 | _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR __invoke_result_t<_Args...> __invoke(_Args&&... __args)
# |       |                                                                     ^
# | In file included from C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:14:
# | In file included from C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/algorithm:1861:
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/for_each.h:34:5: error: no matching function for call to '__invoke'
# |    34 |     std::__invoke(__f, std::__invoke(__proj, *__first));
# |       |     ^~~~~~~~~~~~~
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/for_each.h:57:8: note: in instantiation of function template specialization 'std::__for_each<ConstexprIterator, ConstexprIterator, (lambda at C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34), std::__identity>' requested here
# |    57 |   std::__for_each(__first, __last, __f, __proj);
# |       |        ^
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:8: note: in instantiation of function template specialization 'std::for_each<ConstexprIterator, (lambda at C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34)>' requested here
# |    24 |   std::for_each(__first, __last, [&](auto& __element) { __element = __gen(); });
# |       |        ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:710:47: note: in instantiation of function template specialization 'std::generate<ConstexprIterator, (lambda at C://llvm-project/libcxx/test/std/algorithms/robust_against_proxy_iterators_lifetime_bugs.pass.cpp:651:14)>' requested here
# |   710 |   test(simple_in, [&](I b, I e) { (void) std::generate(b, e, gen); });
# |       |                                               ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:710:33: note: while substituting into a lambda expression here
# |   710 |   test(simple_in, [&](I b, I e) { (void) std::generate(b, e, gen); });
# |       |                                 ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:765:17: note: in instantiation of function template specialization 'test<ConstexprIterator>' requested here
# |   765 |   static_assert(test<ConstexprIterator>());
# |       |                 ^
# | C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:88:69: note: candidate template ignored: substitution failure [with _Args = <(lambda at C://llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34) &, ConstexprIterator::Reference>]: no type named 'type' in 'std::__invoke_result_impl<void, (lambda at C:/Users/Mario/Documents/llvm-project/build/libcxx/test-suite-install/include/c++/v1/__algorithm/generate.h:24:34) &, ConstexprIterator::Reference>'
# |    85 | using __invoke_result_t _LIBCPP_NODEBUG = typename __invoke_result<_Args...>::type;
# |       | ~~~~~
# |    86 |
# |    87 | template <class... _Args>
# |    88 | _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR __invoke_result_t<_Args...> __invoke(_Args&&... __args)
# |       |                                                                     ^
# | C:\\llvm-project\libcxx\test\std\algorithms\robust_against_proxy_iterators_lifetime_bugs.pass.cpp:765:17: error: static assertion expression is not an integral constant expression
# |   765 |   static_assert(test<ConstexprIterator>());
# |       |                 ^~~~~~~~~~~~~~~~~~~~~~~~~
# | 3 errors generated.
# `-----------------------------
# error: command failed with exit status: 1

I still haven't figured out why the regex check failed, but one of the test error messages is above. I suspect it relates to std::__invoke. Is it possible to revert to the original version without forwarding to std::for_each?
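
A likely reading of the log above is that the lambda's `auto&` parameter cannot bind to the temporary proxy object produced by dereferencing `LifetimeIterator`, so `std::__invoke` has no viable call. The sketch below reproduces the same situation with `std::vector<bool>`, whose iterators also dereference to a proxy; it is used here only as a readily available stand-in for the test suite's proxy iterators, and `fill_true` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// std::vector<bool>::iterator dereferences to a proxy object returned by value.
inline std::vector<bool> fill_true(std::size_t n) {
  std::vector<bool> v(n, false);
  // A parameter of type `auto&` would not compile here: a non-const lvalue
  // reference cannot bind to the temporary proxy returned by *it.
  // A forwarding reference `auto&&` (C++14) binds to both proxies and
  // ordinary lvalue references.
  std::for_each(v.begin(), v.end(), [](auto&& element) { element = true; });
  return v;
}
```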

@c8ef c8ef requested a review from philnik777 October 14, 2025 18:30

c8ef commented Oct 15, 2025

It's really weird that the regex check still isn't working. Even worse, it isn't producing a CMakeError log that I can use to reproduce the failed compilation. I ran the full test suite and found that the only failures are related to fs (symlink), iostream, and locale, which seem to have little relevance to std::generate.

********************
Failed Tests (17):
  llvm-libc++-mingw.cfg.in :: std/input.output/filesystems/fs.op.funcs/fs.op.rename/rename.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/input.output/iostream.format/ext.manip/get_money.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/input.output/iostream.format/ext.manip/put_money.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/input.output/iostream.format/output.streams/ostream.formatted/ostream.inserters.arithmetic/long_double.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_en_US.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_overlong.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_zh_CN.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_en_US.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_zh_CN.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/localization/locale.categories/category.numeric/locale.nm.put/facet.num.put.members/put_long_double.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/strings/string.conversions/to_string.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/strings/string.conversions/to_wstring.pass.cpp
  llvm-libc++-mingw.cfg.in :: std/time/time.duration/time.duration.nonmember/ostream.pass.cpp

c8ef commented Oct 15, 2025

I also searched the GitHub Actions CMakeError log for REGEX, but found no hits.

@c8ef c8ef requested a review from frederick-vs-ja October 15, 2025 17:12

c8ef commented Oct 15, 2025

The final issue is that auto&& is unavailable in C++ standards prior to C++14. : (

c8ef commented Oct 16, 2025

Since the CI is almost passing now, I'm wondering:

  1. Does the release note entry look OK?
  2. I'm not sure if __fn is optimal here. Does the implementation detail need to be refined?
  3. Are there any other issues that need to be addressed?

Looking forward to your advice! @philnik777 @frederick-vs-ja

@philnik777

> The final issue is that auto&& is unavailable in C++ standards prior to C++14. : (

[]<class T>(T&& v) { ... } should work.

c8ef commented Oct 16, 2025

> > The final issue is that auto&& is unavailable in C++ standards prior to C++14. : (
>
> []<class T>(T&& v) { ... } should work.

<source>:2:17: warning: explicit template parameter list for lambdas is a C++20 extension [-Wc++20-extensions]
    2 |     auto fn = []<class T>(T&& v) { return v + 1; };

I get a Clang ICE in C++11 mode: https://godbolt.org/z/P4j6r5cra 😆

c8ef commented Oct 16, 2025

I think I prefer the functor approach to make sure it works.
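
The function object actually used in the patch is not shown in this thread, so the following is only a sketch of what a functor-based version could look like. The names `AssignGenerated`, `generate_with_functor` and `Counter` are hypothetical. The templated `operator()` takes the place of the generic lambda and needs only C++11 forwarding references, so it works without `auto` parameters and accepts both ordinary references and proxy objects.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// Hypothetical function object: assigns the next generated value to whatever
// the iterator dereferences to.
template <class Generator>
struct AssignGenerated {
  Generator& gen;
  explicit AssignGenerated(Generator& g) : gen(g) {}

  // T&& is a forwarding reference, so it binds to int& as well as to
  // temporary proxy objects.
  template <class T>
  void operator()(T&& element) const {
    element = gen();
  }
};

// Hypothetical user-level helper that forwards to std::for_each.
template <class ForwardIt, class Generator>
void generate_with_functor(ForwardIt first, ForwardIt last, Generator gen) {
  std::for_each(first, last, AssignGenerated<Generator>(gen));
}

// Small generator that counts upward from its starting value.
struct Counter {
  int next;
  int operator()() { return next++; }
};
```

Because the functor holds the generator by reference, the generator's state advances across elements exactly as it would in a hand-written loop.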

@c8ef c8ef requested a review from frederick-vs-ja October 17, 2025 02:25

c8ef commented Oct 17, 2025

The Windows ARM MinGW CI failure seems unrelated.

Co-authored-by: A. Jiang <de34@live.cn>
@c8ef c8ef requested a review from frederick-vs-ja October 18, 2025 12:42