Skip to content

Conversation

philnik777
Copy link
Contributor

Since __resize_default_init is only ever used inside the dylib we can remove the libc++-internal API and switch to the public one. This patch inlines a bunch of functions that aren't required anymore and simplifies the code that way.

@philnik777 philnik777 force-pushed the string_simplify branch 5 times, most recently from d61aeb4 to 4ca3c38 Compare September 9, 2025 08:36
@ldionne ldionne marked this pull request as ready for review September 10, 2025 16:36
@ldionne ldionne requested a review from a team as a code owner September 10, 2025 16:36
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Sep 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

Since __resize_default_init is only ever used inside the dylib we can remove the libc++-internal API and switch to the public one. This patch inlines a bunch of functions that aren't required anymore and simplifies the code that way.


Full diff: https://github.com/llvm/llvm-project/pull/157121.diff

7 Files Affected:

  • (modified) libcxx/include/__utility/scope_guard.h (+2)
  • (modified) libcxx/include/string (+8-32)
  • (modified) libcxx/src/filesystem/format_string.h (+5-7)
  • (removed) libcxx/test/libcxx-03/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp (-75)
  • (modified) libcxx/test/libcxx/input.output/filesystems/class.directory_entry/directory_entry.mods/last_write_time.pass.cpp (+1-1)
  • (modified) libcxx/test/libcxx/input.output/filesystems/convert_file_time.pass.cpp (+1-1)
  • (removed) libcxx/test/libcxx/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp (-75)
diff --git a/libcxx/include/__utility/scope_guard.h b/libcxx/include/__utility/scope_guard.h
index 3972102eee891..db4f0e4c73a2a 100644
--- a/libcxx/include/__utility/scope_guard.h
+++ b/libcxx/include/__utility/scope_guard.h
@@ -43,6 +43,8 @@ class __scope_guard {
 #endif
 };
 
+_LIBCPP_CTAD_SUPPORTED_FOR_TYPE(__scope_guard);
+
 template <class _Func>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 __scope_guard<_Func> __make_scope_guard(_Func __func) {
   return __scope_guard<_Func>(std::move(__func));
diff --git a/libcxx/include/string b/libcxx/include/string
index 0abdfebcb863f..6a8523ee1e04a 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1312,13 +1312,17 @@ public:
 #  if _LIBCPP_STD_VER >= 23
   template <class _Op>
   _LIBCPP_HIDE_FROM_ABI constexpr void resize_and_overwrite(size_type __n, _Op __op) {
-    __resize_default_init(__n);
-    __erase_to_end(std::move(__op)(data(), _LIBCPP_AUTO_CAST(__n)));
+    size_type __sz  = size();
+    size_type __cap = capacity();
+    if (__n > __cap)
+      __grow_by_without_replace(__cap, __n - __cap, __sz, __sz, 0);
+    __annotate_delete();
+    __set_size(__n);
+    __annotate_new(__n);
+    __erase_to_end(std::move(__op)(data(), auto(__n)));
   }
 #  endif
 
-  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __resize_default_init(size_type __n);
-
 #  if _LIBCPP_STD_VER < 26 || defined(_LIBCPP_ENABLE_CXX26_REMOVED_STRING_RESERVE)
   _LIBCPP_DEPRECATED_IN_CXX20 _LIBCPP_HIDE_FROM_ABI void reserve() _NOEXCEPT { shrink_to_fit(); }
 #  endif
@@ -1410,8 +1414,6 @@ public:
   _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& append(const value_type* _LIBCPP_DIAGNOSE_NULLPTR __s);
   _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& append(size_type __n, value_type __c);
 
-  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __append_default_init(size_type __n);
-
   template <class _InputIterator, __enable_if_t<__has_exactly_input_iterator_category<_InputIterator>::value, int> = 0>
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string&
   append(_InputIterator __first, _InputIterator __last) {
@@ -3079,22 +3081,6 @@ basic_string<_CharT, _Traits, _Allocator>::append(size_type __n, value_type __c)
   return *this;
 }
 
-template <class _CharT, class _Traits, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 inline void
-basic_string<_CharT, _Traits, _Allocator>::__append_default_init(size_type __n) {
-  if (__n == 0)
-    return;
-  size_type __cap = capacity();
-  size_type __sz  = size();
-  if (__cap - __sz < __n)
-    __grow_by_without_replace(__cap, __sz + __n - __cap, __sz, __sz, 0);
-  __annotate_increase(__n);
-  pointer __p = __get_pointer();
-  __sz += __n;
-  __set_size(__sz);
-  traits_type::assign(__p[__sz], value_type());
-}
-
 template <class _CharT, class _Traits, class _Allocator>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::push_back(value_type __c) {
   bool __is_short = !__is_long();
@@ -3433,16 +3419,6 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::re
     __erase_to_end(__n);
 }
 
-template <class _CharT, class _Traits, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 inline void
-basic_string<_CharT, _Traits, _Allocator>::__resize_default_init(size_type __n) {
-  size_type __sz = size();
-  if (__n > __sz) {
-    __append_default_init(__n - __sz);
-  } else
-    __erase_to_end(__n);
-}
-
 template <class _CharT, class _Traits, class _Allocator>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::reserve(size_type __requested_capacity) {
   if (__requested_capacity > max_size())
diff --git a/libcxx/src/filesystem/format_string.h b/libcxx/src/filesystem/format_string.h
index ad6c57579a0a6..d0208f2b39ec2 100644
--- a/libcxx/src/filesystem/format_string.h
+++ b/libcxx/src/filesystem/format_string.h
@@ -34,20 +34,18 @@ inline _LIBCPP_ATTRIBUTE_FORMAT(__printf__, 1, 0) string vformat_string(const ch
 
   va_list apcopy;
   va_copy(apcopy, ap);
-  int ret = ::vsnprintf(buf.data(), buf.size(), msg, apcopy);
+  int size = ::vsnprintf(buf.data(), buf.size(), msg, apcopy);
   va_end(apcopy);
 
   string result;
-  if (static_cast<size_t>(ret) < buf.size()) {
-    result.assign(buf.data(), static_cast<size_t>(ret));
+  if (static_cast<size_t>(size) < buf.size()) {
+    result.assign(buf.data(), static_cast<size_t>(size));
   } else {
     // we did not provide a long enough buffer on our first attempt. The
     // return value is the number of bytes (excluding the null byte) that are
     // needed for formatting.
-    size_t size_with_null = static_cast<size_t>(ret) + 1;
-    result.__resize_default_init(size_with_null - 1);
-    ret = ::vsnprintf(&result[0], size_with_null, msg, ap);
-    _LIBCPP_ASSERT_INTERNAL(static_cast<size_t>(ret) == (size_with_null - 1), "TODO");
+    result.resize_and_overwrite(size, [&](char* res, size_t n) { return ::vsnprintf(res, n, msg, ap); });
+    _LIBCPP_ASSERT_INTERNAL(static_cast<size_t>(size) == result.size(), "TODO");
   }
   return result;
 }
diff --git a/libcxx/test/libcxx-03/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp b/libcxx/test/libcxx-03/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp
deleted file mode 100644
index 8e6e07d659c1a..0000000000000
--- a/libcxx/test/libcxx-03/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp
+++ /dev/null
@@ -1,75 +0,0 @@
-//===----------------------------------------------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-// <string>
-
-// __resize_default_init(size_type)
-
-#include <string>
-#include <cassert>
-
-#include "test_macros.h"
-
-TEST_CONSTEXPR_CXX20 void write_c_str(char* buf, int size) {
-  for (int i = 0; i < size; ++i) {
-    buf[i] = 'a';
-  }
-  buf[size] = '\0';
-}
-
-template <class S>
-TEST_CONSTEXPR_CXX20 void test_buffer_usage() {
-  {
-    unsigned buff_size = 125;
-    unsigned used_size = buff_size - 16;
-    S s;
-    s.__resize_default_init(buff_size);
-    write_c_str(&s[0], used_size);
-    assert(s.size() == buff_size);
-    assert(std::char_traits<char>().length(s.data()) == used_size);
-    s.__resize_default_init(used_size);
-    assert(s.size() == used_size);
-    assert(s.data()[used_size] == '\0');
-    for (unsigned i = 0; i < used_size; ++i) {
-      assert(s[i] == 'a');
-    }
-  }
-}
-
-template <class S>
-TEST_CONSTEXPR_CXX20 void test_basic() {
-  {
-    S s;
-    s.__resize_default_init(3);
-    assert(s.size() == 3);
-    assert(s.data()[3] == '\0');
-    for (int i = 0; i < 3; ++i)
-      s[i] = 'a' + i;
-    s.__resize_default_init(1);
-    assert(s[0] == 'a');
-    assert(s.data()[1] == '\0');
-    assert(s.size() == 1);
-  }
-}
-
-template <class S>
-TEST_CONSTEXPR_CXX20 bool test() {
-  test_basic<S>();
-  test_buffer_usage<S>();
-
-  return true;
-}
-
-int main(int, char**) {
-  test<std::string>();
-#if TEST_STD_VER > 17
-  static_assert(test<std::string>());
-#endif
-
-  return 0;
-}
diff --git a/libcxx/test/libcxx/input.output/filesystems/class.directory_entry/directory_entry.mods/last_write_time.pass.cpp b/libcxx/test/libcxx/input.output/filesystems/class.directory_entry/directory_entry.mods/last_write_time.pass.cpp
index 1acbed55d2b51..a25601698e4ef 100644
--- a/libcxx/test/libcxx/input.output/filesystems/class.directory_entry/directory_entry.mods/last_write_time.pass.cpp
+++ b/libcxx/test/libcxx/input.output/filesystems/class.directory_entry/directory_entry.mods/last_write_time.pass.cpp
@@ -6,7 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
-// UNSUPPORTED: c++03, c++11, c++14
+// UNSUPPORTED: c++03, c++11, c++14, c++17, c++20
 // UNSUPPORTED: availability-filesystem-missing
 // UNSUPPORTED: no-filesystem
 // ADDITIONAL_COMPILE_FLAGS: -I %{libcxx-dir}/src
diff --git a/libcxx/test/libcxx/input.output/filesystems/convert_file_time.pass.cpp b/libcxx/test/libcxx/input.output/filesystems/convert_file_time.pass.cpp
index c501969c31167..6e5c5aa52674c 100644
--- a/libcxx/test/libcxx/input.output/filesystems/convert_file_time.pass.cpp
+++ b/libcxx/test/libcxx/input.output/filesystems/convert_file_time.pass.cpp
@@ -6,7 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
-// UNSUPPORTED: c++03, c++11, c++14
+// UNSUPPORTED: c++03, c++11, c++14, c++17, c++20
 // UNSUPPORTED: availability-filesystem-missing
 
 // <filesystem>
diff --git a/libcxx/test/libcxx/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp b/libcxx/test/libcxx/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp
deleted file mode 100644
index 8e6e07d659c1a..0000000000000
--- a/libcxx/test/libcxx/strings/basic.string/string.modifiers/resize_default_initialized.pass.cpp
+++ /dev/null
@@ -1,75 +0,0 @@
-//===----------------------------------------------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-// <string>
-
-// __resize_default_init(size_type)
-
-#include <string>
-#include <cassert>
-
-#include "test_macros.h"
-
-TEST_CONSTEXPR_CXX20 void write_c_str(char* buf, int size) {
-  for (int i = 0; i < size; ++i) {
-    buf[i] = 'a';
-  }
-  buf[size] = '\0';
-}
-
-template <class S>
-TEST_CONSTEXPR_CXX20 void test_buffer_usage() {
-  {
-    unsigned buff_size = 125;
-    unsigned used_size = buff_size - 16;
-    S s;
-    s.__resize_default_init(buff_size);
-    write_c_str(&s[0], used_size);
-    assert(s.size() == buff_size);
-    assert(std::char_traits<char>().length(s.data()) == used_size);
-    s.__resize_default_init(used_size);
-    assert(s.size() == used_size);
-    assert(s.data()[used_size] == '\0');
-    for (unsigned i = 0; i < used_size; ++i) {
-      assert(s[i] == 'a');
-    }
-  }
-}
-
-template <class S>
-TEST_CONSTEXPR_CXX20 void test_basic() {
-  {
-    S s;
-    s.__resize_default_init(3);
-    assert(s.size() == 3);
-    assert(s.data()[3] == '\0');
-    for (int i = 0; i < 3; ++i)
-      s[i] = 'a' + i;
-    s.__resize_default_init(1);
-    assert(s[0] == 'a');
-    assert(s.data()[1] == '\0');
-    assert(s.size() == 1);
-  }
-}
-
-template <class S>
-TEST_CONSTEXPR_CXX20 bool test() {
-  test_basic<S>();
-  test_buffer_usage<S>();
-
-  return true;
-}
-
-int main(int, char**) {
-  test<std::string>();
-#if TEST_STD_VER > 17
-  static_assert(test<std::string>());
-#endif
-
-  return 0;
-}

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this refactoring! Have some comments though. Please run this through the benchmarks, too.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but let's make sure the benchmarks are not negatively impacted.

@philnik777
Copy link
Contributor Author

LGTM but let's make sure the benchmarks are not negatively impacted.

Since we didn't have any benchmarks yet, I've added one, and the numbers seem pretty good:

-----------------------------------------------------------
Benchmark                               old             new
-----------------------------------------------------------
BM_StringResizeAndOverwrite         8.39 ns         5.66 ns

@philnik777 philnik777 merged commit 92f5d8d into llvm:main Sep 16, 2025
67 of 77 checks passed
@philnik777 philnik777 deleted the string_simplify branch September 16, 2025 11:38
result.__resize_default_init(size_with_null - 1);
ret = ::vsnprintf(&result[0], size_with_null, msg, ap);
_LIBCPP_ASSERT_INTERNAL(static_cast<size_t>(ret) == (size_with_null - 1), "TODO");
result.resize_and_overwrite(size, [&](char* res, size_t n) { return ::vsnprintf(res, n, msg, ap); });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resize_and_overwrite is c++23-only but this here isn't, is it?

We build in c++20 mode and are now getting:

[4057/177031] CXX android_clang_arm/obj/buildtools/third_party/libc++/libc++/filesystem_error.o
../../third_party/llvm-build/Release+Asserts/bin/clang++ -MMD -MF android_clang_arm/obj/buildtools/third_party/libc++/libc++/filesystem_error....(too long)
In file included from ../../third_party/libc++/src/src/filesystem/filesystem_error.cpp:15:
../../third_party/libc++/src/src/filesystem/format_string.h:47:12: error: no member named 'resize_and_overwrite' in 'std::string'
   47 |     result.resize_and_overwrite(size, [&](char* res, size_t n) { return ::vsnprintf(res, n, msg, ap); });
      |     ~~~~~~ ^
1 error generated.

Are we holding something wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to build the library with C++23, so yes, this is C++23 only code. I guess you aren't building with CMake? Maybe something needs to be updated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, and sorry for the noise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries!

@jyknight
Copy link
Member

The removal of __resize_default_init and __append_default_init (originally added in 01a87ef back in 2018), causes a performance regression for Google, as we use those widely to avoid unnecessary writes for overwrite.

The uses are guarded by a presence-check, so it's not a functional breakage, but it's critical for performance. Happily, C++23 has standardized string::resize_and_overwrite, but that's not available under C++20. That makes the timing of the removal of these nonstandard APIs unfortunate, as we're left with no workable option at the moment.

You can see the current usage in Google's ABSL library

Could we add back these APIs? Or else maybe make a version of resize_and_overwrite available (under a non-standard name?) to code which is built before C++23? If not, I'll need to carry a local patch to libc++ to re-add this functionality, or maybe use a similar horrible-hack that facebook's Folly library uses to achieve the same goal, https://github.com/facebook/folly/blob/c82ea832857119a27f4a56d37f28086fac06e63a/folly/memory/UninitializedMemoryHacks.h#L171

@ldionne
Copy link
Member

ldionne commented Sep 19, 2025

My preference here would be not to re-introduce this upstream, since it's dead code as far as upstream is concerned. I actually think that Folly's approach is quite neat for a hack. It uses low-level functions that are unlikely to change in tricky ways to achieve its goal, and in addition it would turn a "merge conflict" of the kind we had here into a compilation error instead of a potentially silent performance regression. All in all, I believe that this is probably a superior solution to just relying on the internal name.

WDYT?

@jyknight
Copy link
Member

__resize_default_init was a tested, nonstandard public API (that is, intentionally in the "public:" access section) on libc++'s std::string -- so, I'd say, not really dead code even if the internal implementation no longer needs to call it. I don't think calling internal private methods through a "bug" in the language is a better option than that!

And as it happens, I just had to fix some other code doing a trick similar to the evil hack in Folly, in order to access the private methods __set_long_cap, __set_long_size, __set_long_pointer, which were removed yesterday by 8dae17b -- an entirely reasonable code cleanup. Those were actually private methods of the string class, of course nobody external could be calling them...except for that weird hole in class member access control from explicit template instantiation. (I wish that spec hole could be fixed...)

And I'm not worried about a silent regression -- we do catch it -- it's only "silent" because we intentionally implemented presence-detection for the nonstandard method, instead of unconditionally calling it.

@philnik777
Copy link
Contributor Author

__resize_default_init was a tested, nonstandard public API (that is, intentionally in the "public:" access section) on libc++'s std::string -- so, I'd say, not really dead code even if the internal implementation no longer needs to call it.

Where did you get that information from? I can't find anything in the git history or our documentation suggesting that is was intended as an extension.
I don't think the mere fact that it was public can be argumentation here, since we have some other functions which are clearly internal (__move_assign and __invariants AFAICT) but public as well as any free functions starting with __.

FWIW I also don't see much of a reason not to carry an internal patch (e.g. making resize_and_overwrite available for the language modes you care about). That seems to me much lower risk than using internal functions we could rename or entirely remove at any point.

itzexpoexpo pushed a commit to itzexpoexpo/llvm-project that referenced this pull request Sep 21, 2025
…m#157121)

Since `__resize_default_init` is only ever used inside the dylib we can
remove the libc++-internal API and switch to the public one. This patch
inlines a bunch of functions that aren't required anymore and simplifies
the code that way.
@jyknight
Copy link
Member

Where did you get that information from? I can't find anything in the git history or our documentation suggesting that is was intended as an extension.

Extremely vague recollections of discussions when it was added in 2018, combined with it being public.

I don't think the mere fact that it was public can be argumentation here [...]

Point taken. Let's just say that I thought it was intentionally public...but you're making me doubt my conviction.

FWIW I also don't see much of a reason not to carry an internal patch (e.g. making resize_and_overwrite available for the language modes you care about).

This is what I'm currently pursuing.

It's not ideal to not have this available upstream, since it means the performance properties running on an unpatched libc++ may be drastically worse, and sometimes we build code against standard libc++. It's probably fine to miss this optimization in those cases, but my ideal situation would be to have a supported solution that works even with unpatched libc++ upstream.

SeongjaeP pushed a commit to SeongjaeP/llvm-project that referenced this pull request Sep 23, 2025
…m#157121)

Since `__resize_default_init` is only ever used inside the dylib we can
remove the libc++-internal API and switch to the public one. This patch
inlines a bunch of functions that aren't required anymore and simplifies
the code that way.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants