Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark some std::string functions noinline. #72869

Merged
merged 2 commits into from
Nov 28, 2023

Conversation

jyknight
Copy link
Member

The intent of these particular functions, since their introduction, was to NOT be inlinable.

However, the mechanism by which this was accomplished was non-obvious, and stopped working when string is compiled for C++20.

A longstanding behavior specified by the C++ standard is that instantiation of the body of a template function is suppressed by an extern template declaration -- unless the function is explicitly marked either constexpr or inline. Of course, if the body is not instantiated, then it cannot possibly be inlined, and thus all the functions listed in libcxx/include/__string/extern_template_lists.h were uninlineable.

But, in C++20 mode, string functions were annotated constexpr, which means they are instantiated, and do become inlineable. And, in fact, they do get inlined, which has caused noticeable binary-size growth for users.

For example, in C++17,
std::string f(std::string *in) { return *in; }
does not inline the copy-constructor call, and instead generates a call to the exported function defined in the libc++ shared library.

I think we probably don't want to mark all functions that are currently in the extern template list as noinline, as many of them really are reasonable inlining candidates. Thus, I've restricted this change to only the few functions that were clearly intended to be outlined.

See commits like b019c5c (and some others like it) for background, in which functions were removed from the extern template list in the unstable ABI in order to allow the short-string case to be inlined, while moving the long-string case to a separate function, added to the extern template list.

The intent of these particular functions, since their introduction,
was to NOT be inlinable.

However, the mechanism by which this was accomplished was non-obvious,
and stopped working when string is compiled for C++20.

A longstanding behavior specified by the C++ standard is that
instantiation of the body of a template function is suppressed by an
extern template declaration -- unless the function is explicitly
marked either constexpr or inline. Of course, if the body is not
instantiated, then it cannot possibly be inlined, and thus all the
functions listed in libcxx/include/__string/extern_template_lists.h
were uninlineable.

But, in C++20 mode, string functions were annotated constexpr, which
means they _are_ instantiated, and do become inlineable. And, in fact,
they do get inlined, which has caused noticeable binary-size growth
for users.

For example, in C++17,
`std::string f(std::string *in) { return *in; }`
does not inline the copy-constructor call, and instead generates a
call to the exported function defined in the libc++ shared library.

I think we probably don't want to mark all functions that are
currently in the extern template list as noinline, as many of them
really are reasonable inlining candidates. Thus, I've restricted this
change to only the few functions that were clearly intended to be
outlined.

See commits like b019c5c (and some
others like it) for background, in which functions were removed from
the extern template list in the unstable ABI in order to allow the
short-string case to be inlined, while moving the long-string case to
a separate function, added to the extern template list.
@jyknight jyknight requested a review from a team as a code owner November 20, 2023 13:46
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 20, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 20, 2023

@llvm/pr-subscribers-libcxx

Author: James Y Knight (jyknight)

Changes

The intent of these particular functions, since their introduction, was to NOT be inlinable.

However, the mechanism by which this was accomplished was non-obvious, and stopped working when string is compiled for C++20.

A longstanding behavior specified by the C++ standard is that instantiation of the body of a template function is suppressed by an extern template declaration -- unless the function is explicitly marked either constexpr or inline. Of course, if the body is not instantiated, then it cannot possibly be inlined, and thus all the functions listed in libcxx/include/__string/extern_template_lists.h were uninlineable.

But, in C++20 mode, string functions were annotated constexpr, which means they are instantiated, and do become inlineable. And, in fact, they do get inlined, which has caused noticeable binary-size growth for users.

For example, in C++17,
std::string f(std::string *in) { return *in; }
does not inline the copy-constructor call, and instead generates a call to the exported function defined in the libc++ shared library.

I think we probably don't want to mark all functions that are currently in the extern template list as noinline, as many of them really are reasonable inlining candidates. Thus, I've restricted this change to only the few functions that were clearly intended to be outlined.

See commits like b019c5c (and some others like it) for background, in which functions were removed from the extern template list in the unstable ABI in order to allow the short-string case to be inlined, while moving the long-string case to a separate function, added to the extern template list.


Full diff: https://github.com/llvm/llvm-project/pull/72869.diff

2 Files Affected:

  • (modified) libcxx/include/__config (+6)
  • (modified) libcxx/include/string (+10-10)
diff --git a/libcxx/include/__config b/libcxx/include/__config
index e8da358bb8d7cd5..267d58735e8aab3 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -1279,6 +1279,12 @@ __sanitizer_verify_double_ended_contiguous_container(const void*, const void*, c
 #    define _LIBCPP_NO_SANITIZE(...)
 #  endif
 
+#  if __has_attribute(__noinline__)
+#    define _LIBCPP_NOINLINE __attribute__((__noinline__))
+#  else
+#    define _LIBCPP_NOINLINE
+#  endif
+
 // We often repeat things just for handling wide characters in the library.
 // When wide characters are disabled, it can be useful to have a quick way of
 // disabling it without having to resort to #if-#endif, which has a larger
diff --git a/libcxx/include/string b/libcxx/include/string
index 9c2efac8006bd72..fa06ac316b2d16c 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1899,7 +1899,7 @@ private:
     // to call the __init() functions as those are marked as inline which may
     // result in over-aggressive inlining by the compiler, where our aim is
     // to only inline the fast path code directly in the ctor.
-    _LIBCPP_CONSTEXPR_SINCE_CXX20 void __init_copy_ctor_external(const value_type* __s, size_type __sz);
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE void __init_copy_ctor_external(const value_type* __s, size_type __sz);
 
     template <class _InputIterator, __enable_if_t<__has_exactly_input_iterator_category<_InputIterator>::value, int> = 0>
     inline _LIBCPP_CONSTEXPR_SINCE_CXX20 void __init(_InputIterator __first, _InputIterator __last);
@@ -1933,7 +1933,7 @@ private:
     // have proof that the input does not alias the current instance.
     // For example, operator=(basic_string) performs a 'self' check.
     template <bool __is_short>
-    _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& __assign_no_alias(const value_type* __s, size_type __n);
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE basic_string& __assign_no_alias(const value_type* __s, size_type __n);
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void __erase_to_end(size_type __pos) {
     __null_terminate_at(std::__to_address(__get_pointer()), __pos);
@@ -1941,7 +1941,7 @@ private:
 
     // __erase_external_with_move is invoked for erase() invocations where
     // `n ~= npos`, likely requiring memory moves on the string data.
-    _LIBCPP_CONSTEXPR_SINCE_CXX20 void __erase_external_with_move(size_type __pos, size_type __n);
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE void __erase_external_with_move(size_type __pos, size_type __n);
 
     _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
     void __copy_assign_alloc(const basic_string& __str)
@@ -2013,8 +2013,8 @@ private:
         _NOEXCEPT
         {}
 
-    _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& __assign_external(const value_type* __s);
-    _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& __assign_external(const value_type* __s, size_type __n);
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE basic_string& __assign_external(const value_type* __s);
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE basic_string& __assign_external(const value_type* __s, size_type __n);
 
     // Assigns the value in __s, guaranteed to be __n < __min_cap in length.
     inline _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& __assign_short(const value_type* __s, size_type __n) {
@@ -2167,7 +2167,7 @@ basic_string<_CharT, _Traits, _Allocator>::__init(const value_type* __s, size_ty
 }
 
 template <class _CharT, class _Traits, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20
+_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE
 void basic_string<_CharT, _Traits, _Allocator>::__init_copy_ctor_external(
     const value_type* __s, size_type __sz) {
   if (__libcpp_is_constant_evaluated())
@@ -2396,7 +2396,7 @@ basic_string<_CharT, _Traits, _Allocator>::__grow_by_without_replace(
 
 template <class _CharT, class _Traits, class _Allocator>
 template <bool __is_short>
-_LIBCPP_CONSTEXPR_SINCE_CXX20
+_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE
 basic_string<_CharT, _Traits, _Allocator>&
 basic_string<_CharT, _Traits, _Allocator>::__assign_no_alias(
     const value_type* __s, size_type __n) {
@@ -2414,7 +2414,7 @@ basic_string<_CharT, _Traits, _Allocator>::__assign_no_alias(
 }
 
 template <class _CharT, class _Traits, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20
+_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE
 basic_string<_CharT, _Traits, _Allocator>&
 basic_string<_CharT, _Traits, _Allocator>::__assign_external(
     const value_type* __s, size_type __n) {
@@ -2625,7 +2625,7 @@ basic_string<_CharT, _Traits, _Allocator>::assign(const _Tp& __t, size_type __po
 }
 
 template <class _CharT, class _Traits, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20
+_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE
 basic_string<_CharT, _Traits, _Allocator>&
 basic_string<_CharT, _Traits, _Allocator>::__assign_external(const value_type* __s) {
   return __assign_external(__s, traits_type::length(__s));
@@ -3110,7 +3110,7 @@ basic_string<_CharT, _Traits, _Allocator>::replace(size_type __pos, size_type __
 // 'externally instantiated' erase() implementation, called when __n != npos.
 // Does not check __pos against size()
 template <class _CharT, class _Traits, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20
+_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE
 void
 basic_string<_CharT, _Traits, _Allocator>::__erase_external_with_move(
     size_type __pos, size_type __n)

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. I hate introducing _LIBCPP_NOINLINE, but in this case the intent of these functions is pretty clear and allowing them to be inlined was an unintended side effect of adding constexpr.

@EricWF
Copy link
Member

EricWF commented Nov 20, 2023

The project has had a checked past with _LIBCPP_ALWAYS_INLINE and _LIBCPP_NOINLINE coming and going. I'm afraid this is the start to another one of those cycles.

For future readers of the code, I would like to leave more of a trace exactly why we're doing this, and when it should and shouldn't be used to solve other problems.

@jyknight
Copy link
Member Author

Can you give a suggestion as to where you'd like to see it, or what else you'd like to see? I tried to make the commit message extra-descriptive already, for just that reason.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is reasonable, especially when coupled with the commit message. I would like to avoid holding this patch off without requiring actionable changes.

@@ -1899,7 +1899,7 @@ private:
// to call the __init() functions as those are marked as inline which may
// result in over-aggressive inlining by the compiler, where our aim is
// to only inline the fast path code directly in the ctor.
_LIBCPP_CONSTEXPR_SINCE_CXX20 void __init_copy_ctor_external(const value_type* __s, size_type __sz);
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_NOINLINE void __init_copy_ctor_external(const value_type* __s, size_type __sz);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO comments like this one above explain why we have _LIBCPP_NOINLINE on this method:

// Slow path for the (inlined) copy constructor for 'long' strings.
// Always externally instantiated and not inlined.

Furthermore, these methods all have _external in their names which kind of explains why they're not inlined. I'm not sure we need more documentation.

@jyknight
Copy link
Member Author

Seeing no further feedback, going ahead and submitting.

@jyknight jyknight merged commit e1f59ad into llvm:main Nov 28, 2023
36 of 39 checks passed
@jyknight jyknight deleted the libcxx-string-noinline branch December 12, 2023 03:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants