Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ASan][libc++] Update string ASan annotations to zero-overhead #76165

Closed

Conversation

AdvenamTacet
Copy link
Member

@AdvenamTacet AdvenamTacet commented Dec 21, 2023

This commit introduces zero-overhead string ASan annotations by wrapping every call to ASan helper functions inside a constexpr if statement. This ensures that ASan checks are only performed at runtime when the compiler determines that the string is potentially vulnerable to memory safety issues. This approach significantly reduces the overhead of ASan while maintaining its effectiveness in detecting memory safety bugs.

While the intended execution path should not incur any overhead due to compiler optimizations, a few local variables solely accessed by ASan helper functions are not currently removed and we rely on optimizing them away. A future commit may reduce number of these variables.

This commit introduces Zero-overhead string ASan annotations by wrapping every call to ASan helper functions inside a constexpr if statement. This ensures that ASan checks are only performed at runtime when the compiler determines that the string is potentially vulnerable to memory safety issues. This approach significantly reduces the overhead of ASan while maintaining its effectiveness in detecting memory safety bugs.

While the intended execution path should not incur any overhead due to compiler optimizations, a few local variables solely accessed by ASan helper functions are not currently removed and we rely on optimizing them away. A future commit may reduce number of these variables.
@AdvenamTacet AdvenamTacet added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance labels Dec 21, 2023
@AdvenamTacet AdvenamTacet requested a review from a team as a code owner December 21, 2023 16:00
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 21, 2023

@llvm/pr-subscribers-libcxx

Author: Tacet (AdvenamTacet)

Changes

This commit introduces Zero-overhead string ASan annotations by wrapping every call to ASan helper functions inside a constexpr if statement. This ensures that ASan checks are only performed at runtime when the compiler determines that the string is potentially vulnerable to memory safety issues. This approach significantly reduces the overhead of ASan while maintaining its effectiveness in detecting memory safety bugs.

While the intended execution path should not incur any overhead due to compiler optimizations, a few local variables solely accessed by ASan helper functions are not currently removed and we rely on optimizing them away. A future commit may reduce number of these variables.


Patch is 24.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76165.diff

1 Files Affected:

  • (modified) libcxx/include/string (+101-85)
diff --git a/libcxx/include/string b/libcxx/include/string
index fdffca5aed18be..fcfee0e6e2f9df 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -659,6 +659,13 @@ _LIBCPP_PUSH_MACROS
 #else
 #  define _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS
 #endif
+#ifndef _IF_ASAN
+#  ifndef _LIBCPP_HAS_NO_ASAN
+#    define _IF_ASAN() if _LIBCPP_CONSTEXPR (true)
+#  else
+#    define _IF_ASAN() if _LIBCPP_CONSTEXPR (false)
+#  endif
+#endif
 #define _LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED false
 
 _LIBCPP_BEGIN_NAMESPACE_STD
@@ -858,7 +865,7 @@ private:
       __set_long_pointer(__allocation);
       __set_long_size(__size);
     }
-    __annotate_new(__size);
+    _IF_ASAN() __annotate_new(__size);
   }
 
   template <class _Iter, class _Sent>
@@ -880,7 +887,7 @@ public:
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string()
       _NOEXCEPT_(is_nothrow_default_constructible<allocator_type>::value)
       : __r_(__value_init_tag(), __default_init_tag()) {
-    __annotate_new(0);
+    _IF_ASAN() __annotate_new(0);
   }
 
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 explicit basic_string(const allocator_type& __a)
@@ -890,14 +897,14 @@ public:
       _NOEXCEPT
 #endif
       : __r_(__value_init_tag(), __a) {
-    __annotate_new(0);
+    _IF_ASAN() __annotate_new(0);
   }
 
   _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS basic_string(const basic_string& __str)
       : __r_(__default_init_tag(), __alloc_traits::select_on_container_copy_construction(__str.__alloc())) {
     if (!__str.__is_long()) {
       __r_.first() = __str.__r_.first();
-      __annotate_new(__get_short_size());
+      _IF_ASAN() __annotate_new(__get_short_size());
     } else
       __init_copy_ctor_external(std::__to_address(__str.__get_long_pointer()), __str.__get_long_size());
   }
@@ -907,7 +914,7 @@ public:
       : __r_(__default_init_tag(), __a) {
     if (!__str.__is_long()) {
       __r_.first() = __str.__r_.first();
-      __annotate_new(__get_short_size());
+      _IF_ASAN() __annotate_new(__get_short_size());
     } else
       __init_copy_ctor_external(std::__to_address(__str.__get_long_pointer()), __str.__get_long_size());
   }
@@ -919,13 +926,19 @@ public:
 #  else
       _NOEXCEPT
 #  endif
+#ifndef _LIBCPP_HAS_NO_ASAN
       // Turning off ASan instrumentation for variable initialization with _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS
       // does not work consistently during initialization of __r_, so we instead unpoison __str's memory manually first.
       // __str's memory needs to be unpoisoned only in the case where it's a short string.
       : __r_(((__str.__is_long() ? 0 : (__str.__annotate_delete(), 0)), std::move(__str.__r_))) {
+      // ASan TODO: ^ This line results in stack frame growth and isn't correctly optimized by the compiler.
+      //              It should be refactored to improve performance.
+#else
+      : __r_(std::move(__str.__r_)) {
+#endif
     __str.__r_.first() = __rep();
-    __str.__annotate_new(0);
-    if (!__is_long())
+    _IF_ASAN() __str.__annotate_new(0);
+    _IF_ASAN() if (!__is_long())
       __annotate_new(size());
   }
 
@@ -936,12 +949,12 @@ public:
     else {
       if (__libcpp_is_constant_evaluated())
         __r_.first() = __rep();
-      if (!__str.__is_long())
+      _IF_ASAN() if (!__str.__is_long())
         __str.__annotate_delete();
       __r_.first()       = __str.__r_.first();
       __str.__r_.first() = __rep();
-      __str.__annotate_new(0);
-      if (!__is_long() && this != &__str)
+      _IF_ASAN() __str.__annotate_new(0);
+      _IF_ASAN() if (!__is_long() && this != &__str)
         __annotate_new(size());
     }
   }
@@ -1100,7 +1113,7 @@ public:
 #endif // _LIBCPP_CXX03_LANG
 
   inline _LIBCPP_CONSTEXPR_SINCE_CXX20 ~basic_string() {
-    __annotate_delete();
+    _IF_ASAN() __annotate_delete();
     if (__is_long())
       __alloc_traits::deallocate(__alloc(), __get_long_pointer(), __get_long_cap());
   }
@@ -1357,20 +1370,22 @@ public:
     // Pilfer the allocation from __str.
     _LIBCPP_ASSERT_INTERNAL(__alloc() == __str.__alloc(), "__move_assign called with wrong allocator");
     size_type __old_sz = __str.size();
-    if (!__str.__is_long())
+    _IF_ASAN() if (!__str.__is_long())
       __str.__annotate_delete();
     __r_.first()       = __str.__r_.first();
     __str.__r_.first() = __rep();
-    __str.__annotate_new(0);
+    _IF_ASAN() __str.__annotate_new(0);
 
     _Traits::move(data(), data() + __pos, __len);
     __set_size(__len);
     _Traits::assign(data()[__len], value_type());
 
-    if (!__is_long()) {
-      __annotate_new(__len);
-    } else if (__old_sz > __len) {
-      __annotate_shrink(__old_sz);
+    _IF_ASAN() {
+      if (!__is_long()) {
+        __annotate_new(__len);
+      } else if (__old_sz > __len) {
+        __annotate_shrink(__old_sz);
+      }
     }
   }
 #endif
@@ -1801,7 +1816,7 @@ private:
     size_type __cap = capacity();
     value_type* __p;
     if (__cap - __sz >= __n) {
-      __annotate_increase(__n);
+      _IF_ASAN() __annotate_increase(__n);
       __p                = std::__to_address(__get_pointer());
       size_type __n_move = __sz - __ip;
       if (__n_move != 0)
@@ -2026,7 +2041,7 @@ private:
         __clear_and_shrink();
         __alloc() = __str.__alloc();
       } else {
-        __annotate_delete();
+        _IF_ASAN() __annotate_delete();
         allocator_type __a = __str.__alloc();
         auto __allocation  = std::__allocate_at_least(__a, __str.__get_long_cap());
         __begin_lifetime(__allocation.ptr, __allocation.count);
@@ -2036,7 +2051,7 @@ private:
         __set_long_pointer(__allocation.ptr);
         __set_long_cap(__allocation.count);
         __set_long_size(__str.size());
-        __annotate_new(__get_long_size());
+        _IF_ASAN() __annotate_new(__get_long_size());
       }
     }
   }
@@ -2076,13 +2091,13 @@ private:
   // Assigns the value in __s, guaranteed to be __n < __min_cap in length.
   inline _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string& __assign_short(const value_type* __s, size_type __n) {
     size_type __old_size = size();
-    if (__n > __old_size)
+    _IF_ASAN() if (__n > __old_size)
       __annotate_increase(__n - __old_size);
     pointer __p =
         __is_long() ? (__set_long_size(__n), __get_long_pointer()) : (__set_short_size(__n), __get_short_pointer());
     traits_type::move(std::__to_address(__p), __s, __n);
     traits_type::assign(__p[__n], value_type());
-    if (__old_size > __n)
+    _IF_ASAN() if (__old_size > __n)
       __annotate_shrink(__old_size);
     return *this;
   }
@@ -2090,11 +2105,11 @@ private:
   _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string&
   __null_terminate_at(value_type* __p, size_type __newsz) {
     size_type __old_size = size();
-    if (__newsz > __old_size)
+    _IF_ASAN() if (__newsz > __old_size)
       __annotate_increase(__newsz - __old_size);
     __set_size(__newsz);
     traits_type::assign(__p[__newsz], value_type());
-    if (__old_size > __newsz)
+    _IF_ASAN() if (__old_size > __newsz)
       __annotate_shrink(__old_size);
     return *this;
   }
@@ -2189,7 +2204,7 @@ basic_string<_CharT, _Traits, _Allocator>::__init(const value_type* __s, size_ty
   }
   traits_type::copy(std::__to_address(__p), __s, __sz);
   traits_type::assign(__p[__sz], value_type());
-  __annotate_new(__sz);
+  _IF_ASAN() __annotate_new(__sz);
 }
 
 template <class _CharT, class _Traits, class _Allocator>
@@ -2213,7 +2228,7 @@ basic_string<_CharT, _Traits, _Allocator>::__init(const value_type* __s, size_ty
   }
   traits_type::copy(std::__to_address(__p), __s, __sz);
   traits_type::assign(__p[__sz], value_type());
-  __annotate_new(__sz);
+  _IF_ASAN() __annotate_new(__sz);
 }
 
 template <class _CharT, class _Traits, class _Allocator>
@@ -2237,7 +2252,7 @@ basic_string<_CharT, _Traits, _Allocator>::__init_copy_ctor_external(const value
     __set_long_size(__sz);
   }
   traits_type::copy(std::__to_address(__p), __s, __sz + 1);
-  __annotate_new(__sz);
+  _IF_ASAN() __annotate_new(__sz);
 }
 
 template <class _CharT, class _Traits, class _Allocator>
@@ -2261,7 +2276,7 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::__
   }
   traits_type::assign(std::__to_address(__p), __n, __c);
   traits_type::assign(__p[__n], value_type());
-  __annotate_new(__n);
+  _IF_ASAN() __annotate_new(__n);
 }
 
 template <class _CharT, class _Traits, class _Allocator>
@@ -2276,7 +2291,7 @@ template <class _InputIterator, class _Sentinel>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void
 basic_string<_CharT, _Traits, _Allocator>::__init_with_sentinel(_InputIterator __first, _Sentinel __last) {
   __r_.first() = __rep();
-  __annotate_new(0);
+  _IF_ASAN() __annotate_new(0);
 
 #ifndef _LIBCPP_HAS_NO_EXCEPTIONS
   try {
@@ -2285,7 +2300,7 @@ basic_string<_CharT, _Traits, _Allocator>::__init_with_sentinel(_InputIterator _
       push_back(*__first);
 #ifndef _LIBCPP_HAS_NO_EXCEPTIONS
   } catch (...) {
-    __annotate_delete();
+    _IF_ASAN() __annotate_delete();
     if (__is_long())
       __alloc_traits::deallocate(__alloc(), __get_long_pointer(), __get_long_cap());
     throw;
@@ -2338,7 +2353,7 @@ basic_string<_CharT, _Traits, _Allocator>::__init_with_size(_InputIterator __fir
     throw;
   }
 #endif // _LIBCPP_HAS_NO_EXCEPTIONS
-  __annotate_new(__sz);
+  _IF_ASAN() __annotate_new(__sz);
 }
 
 template <class _CharT, class _Traits, class _Allocator>
@@ -2356,7 +2371,7 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::__
   pointer __old_p = __get_pointer();
   size_type __cap =
       __old_cap < __ms / 2 - __alignment ? __recommend(std::max(__old_cap + __delta_cap, 2 * __old_cap)) : __ms - 1;
-  __annotate_delete();
+  _IF_ASAN() __annotate_delete();
   auto __allocation = std::__allocate_at_least(__alloc(), __cap + 1);
   pointer __p       = __allocation.ptr;
   __begin_lifetime(__p, __allocation.count);
@@ -2375,7 +2390,7 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::__
   __old_sz = __n_copy + __n_add + __sec_cp_sz;
   __set_long_size(__old_sz);
   traits_type::assign(__p[__old_sz], value_type());
-  __annotate_new(__old_cap + __delta_cap);
+  _IF_ASAN() __annotate_new(__old_cap + __delta_cap);
 }
 
 // __grow_by is deprecated because it does not set the size. It may not update the size when the size is changed, and it
@@ -2399,7 +2414,7 @@ void _LIBCPP_CONSTEXPR_SINCE_CXX20
   pointer __old_p = __get_pointer();
   size_type __cap =
       __old_cap < __ms / 2 - __alignment ? __recommend(std::max(__old_cap + __delta_cap, 2 * __old_cap)) : __ms - 1;
-  __annotate_delete();
+  _IF_ASAN() __annotate_delete();
   auto __allocation = std::__allocate_at_least(__alloc(), __cap + 1);
   pointer __p       = __allocation.ptr;
   __begin_lifetime(__p, __allocation.count);
@@ -2428,7 +2443,7 @@ basic_string<_CharT, _Traits, _Allocator>::__grow_by_without_replace(
   __grow_by(__old_cap, __delta_cap, __old_sz, __n_copy, __n_del, __n_add);
   _LIBCPP_SUPPRESS_DEPRECATED_POP
   __set_long_size(__old_sz - __n_del + __n_add);
-  __annotate_new(__old_sz - __n_del + __n_add);
+  _IF_ASAN() __annotate_new(__old_sz - __n_del + __n_add);
 }
 
 // assign
@@ -2440,13 +2455,13 @@ basic_string<_CharT, _Traits, _Allocator>::__assign_no_alias(const value_type* _
   size_type __cap = __is_short ? static_cast<size_type>(__min_cap) : __get_long_cap();
   if (__n < __cap) {
     size_type __old_size = __is_short ? __get_short_size() : __get_long_size();
-    if (__n > __old_size)
+    _IF_ASAN() if (__n > __old_size)
       __annotate_increase(__n - __old_size);
     pointer __p = __is_short ? __get_short_pointer() : __get_long_pointer();
     __is_short ? __set_short_size(__n) : __set_long_size(__n);
     traits_type::copy(std::__to_address(__p), __s, __n);
     traits_type::assign(__p[__n], value_type());
-    if (__old_size > __n)
+    _IF_ASAN() if (__old_size > __n)
       __annotate_shrink(__old_size);
   } else {
     size_type __sz = __is_short ? __get_short_size() : __get_long_size();
@@ -2461,7 +2476,7 @@ basic_string<_CharT, _Traits, _Allocator>::__assign_external(const value_type* _
   size_type __cap = capacity();
   if (__cap >= __n) {
     size_type __old_size = size();
-    if (__n > __old_size)
+    _IF_ASAN() if (__n > __old_size)
       __annotate_increase(__n - __old_size);
     value_type* __p = std::__to_address(__get_pointer());
     traits_type::move(__p, __s, __n);
@@ -2488,8 +2503,8 @@ basic_string<_CharT, _Traits, _Allocator>::assign(size_type __n, value_type __c)
   if (__cap < __n) {
     size_type __sz = size();
     __grow_by_without_replace(__cap, __n - __cap, __sz, 0, __sz);
-    __annotate_increase(__n);
-  } else if (__n > __old_size)
+    _IF_ASAN() __annotate_increase(__n);
+  } else _IF_ASAN() if (__n > __old_size)
     __annotate_increase(__n - __old_size);
   value_type* __p = std::__to_address(__get_pointer());
   traits_type::assign(__p, __n, __c);
@@ -2501,7 +2516,7 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string<_CharT, _Traits, _Allocator>&
 basic_string<_CharT, _Traits, _Allocator>::operator=(value_type __c) {
   pointer __p;
   size_type __old_size = size();
-  if (__old_size == 0)
+  _IF_ASAN() if (__old_size == 0)
     __annotate_increase(1);
   if (__is_long()) {
     __p = __get_long_pointer();
@@ -2512,7 +2527,7 @@ basic_string<_CharT, _Traits, _Allocator>::operator=(value_type __c) {
   }
   traits_type::assign(*__p, __c);
   traits_type::assign(*++__p, value_type());
-  if (__old_size > 1)
+  _IF_ASAN() if (__old_size > 1)
     __annotate_shrink(__old_size);
   return *this;
 }
@@ -2525,10 +2540,10 @@ basic_string<_CharT, _Traits, _Allocator>::operator=(const basic_string& __str)
     if (!__is_long()) {
       if (!__str.__is_long()) {
         size_type __old_size = __get_short_size();
-        if (__get_short_size() < __str.__get_short_size())
+        _IF_ASAN() if (__get_short_size() < __str.__get_short_size())
           __annotate_increase(__str.__get_short_size() - __get_short_size());
         __r_.first() = __str.__r_.first();
-        if (__old_size > __get_short_size())
+        _IF_ASAN() if (__old_size > __get_short_size())
           __annotate_shrink(__old_size);
       } else {
         return __assign_no_alias<true>(__str.data(), __str.size());
@@ -2561,14 +2576,14 @@ basic_string<_CharT, _Traits, _Allocator>::__move_assign(basic_string& __str, tr
     _NOEXCEPT_(is_nothrow_move_assignable<allocator_type>::value)
 #  endif
 {
-  __annotate_delete();
+  _IF_ASAN() __annotate_delete();
   if (__is_long()) {
     __alloc_traits::deallocate(__alloc(), __get_long_pointer(), __get_long_cap());
 #  if _LIBCPP_STD_VER <= 14
     if (!is_nothrow_move_assignable<allocator_type>::value) {
       __set_short_size(0);
       traits_type::assign(__get_short_pointer()[0], value_type());
-      __annotate_new(0);
+      _IF_ASAN() __annotate_new(0);
     }
 #  endif
   }
@@ -2579,24 +2594,25 @@ basic_string<_CharT, _Traits, _Allocator>::__move_assign(basic_string& __str, tr
   __r_.first() = __str.__r_.first();
   __str.__set_short_size(0);
   traits_type::assign(__str.__get_short_pointer()[0], value_type());
+  _IF_ASAN() {
+    if (__str_was_short && this != &__str)
+      __str.__annotate_shrink(__str_old_size);
+    else
+      // ASan annotations: was long, so object memory is unpoisoned as new.
+      // Or is same as *this, and __annotate_delete() was called.
+      __str.__annotate_new(0);
 
-  if (__str_was_short && this != &__str)
-    __str.__annotate_shrink(__str_old_size);
-  else
-    // ASan annotations: was long, so object memory is unpoisoned as new.
-    // Or is same as *this, and __annotate_delete() was called.
-    __str.__annotate_new(0);
-
-  // ASan annotations: Guard against `std::string s; s = std::move(s);`
-  // You can find more here: https://en.cppreference.com/w/cpp/utility/move
-  // Quote: "Unless otherwise specified, all standard library objects that have been moved
-  // from are placed in a "valid but unspecified state", meaning the object's class
-  // invariants hold (so functions without preconditions, such as the assignment operator,
-  // can be safely used on the object after it was moved from):"
-  // Quote: "v = std::move(v); // the value of v is unspecified"
-  if (!__is_long() && &__str != this)
-    // If it is long string, delete was never called on original __str's buffer.
-    __annotate_new(__get_short_size());
+    // ASan annotations: Guard against `std::string s; s = std::move(s);`
+    // You can find more here: https://en.cppreference.com/w/cpp/utility/move
+    // Quote: "Unless otherwise specified, all standard library objects that have been moved
+    // from are placed in a "valid but unspecified state", meaning the object's class
+    // invariants hold (so functions without preconditions, such as the assignment operator,
+    // can be safely used on the object after it was moved from):"
+    // Quote: "v = std::move(v); // the value of v is unspecified"
+    if (!__is_long() && &__str != this)
+      // If it is long string, delete was never called on original __str's buffer.
+      __annotate_new(__get_short_size());
+  }
 }
 
 #endif
@@ -2649,15 +2665,15 @@ basic_string<_CharT, _Traits, _Allocator>::__assign_trivial(_Iterator __first, _
     //    object itself stays valid even if reallocation happens.
     size_type __sz = size();
     __grow_by_without_replace(__cap, __n - __cap, __sz, 0, __sz);
-    __annotate_increase(__n);
-  } else if (__n > __old_size)
+    _IF_ASAN() __annotate_increase(__n);
+  } else _IF_ASAN() if (__n > __old_size)
     __annotate_increase(__n - __old_size);
   pointer __p = __get_pointer();
   for (; __first != __last; ++__p, (void)++__first)
     traits_type::assign(*__p, *__first);
   traits_type::assign(*__p, value_type());
   __set_size(__n);
-  if (__n < __old_size)
+  _IF_ASAN() if (__n < __old_size)
     __annotate_shrink(__old_size);
 }
 
@@ -2709,7 +2725,7 @@ basic_string<_CharT, _Traits, _Allocator>::append(const value_type* __s, size_ty
   size_type __sz  = size();
   if (__cap - __sz >= __n) {
     if (__n) {
-      __annotate_increase(__n);
+      _IF_ASAN() __annotate_increase(__n);
       value_type* __p = std::__to_address(__get_pointer());
       traits_type::copy(__p + __sz, __s, __n);
       __sz += __n;
@@ -2729,7 +2745,7 @@ basic_string<_CharT, _Traits, _Allocator>::append(size_type __n, value_type __c)
     size_type __sz  = size();
     if (__cap - __sz < __n)
       __grow_by_without_replace(__cap, __sz + __n - __cap, __sz, __sz, 0);
-    __annotate_increase(__n);
+    _IF_ASAN() __annotate_increase(__n);
     pointer __p = __get_pointer();
     traits_type::assign(std::__to_address(__p) + __sz, __n, __c);
     __sz += __n;
@@ -2747,7 +2763,7 @@ basic_string<_CharT, _Traits, _Allocator>::__append_default_init(size_type __n)
     size_type __sz  = size();
     if (__cap - __sz < __n)
       __grow_by_without_replace(__cap, __sz + __n - __cap, __sz, __sz, 0);
-    __annotate_increase(__n);
+    _IF_ASAN() __annotate_increase(__n);
     pointer __p = __get_pointer();
     __sz += __n;
     __set_size(__sz);
@@ -2769,10 +2785,10 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocator>::pu
   }
   if (__sz == __cap) {
     __grow_by_without_replace(__cap, 1, __sz, __sz, 0);
-    __annotate_increase(1);
+    _IF_ASAN() __annotate_increase(1);
     __is_short = false; // the string is always long after __grow_by
   } else
-    __annotate_increase(1);
+    _IF_ASAN() __annotate_increase(1);
   pointer __p = __get_pointer();
   if (__is_short) {
     __p = __get_short_pointer() + __sz;
@@ -2796,7 +2812,7 @@ basic_string<_CharT, _Traits, _Allocator>::append(_ForwardIterator __first, _For
     if (__string_is_trivial_iterator<_ForwardIterator>::value && !__addr_in_range(*__first)) {
       if (__cap - __sz < __n)
         __grow_by_without_replace(__cap, __sz + __n - __cap, __sz, __sz, 0);
-      __annotate_increase(__n);
+      _IF_ASAN() __annotate_increase(__n);
       pointer __p = __get_pointer() + __sz;
       for (; __first != __last; ++__p, (void)++__first)
         traits_type::assign(*__p, *__first);
@@ -2852,7 +2868,7 @@ basic_string<_CharT, _Traits, _Allocator>::insert(size_type __pos, const value_t
   size_type __cap = capacity();
   if (__cap - __sz >= __n) {
     if (__n...
[truncated]

Copy link
Member

@EricWF EricWF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works great. I don't love the way it looks but we can iterate on it.

The changes I would like before landing are:

(1) Rename to _LIBCPP_IF_ASAN.
(2) Put it in __config.
(3) Always use { ... } after _LIBCPP_IF_ASAN().

I think (3) is important because it's not clear exactly what _LIBCPP_IF_ASAN() expands to. We don't know if it actually captures the next statement as part of an if or not.

libcxx/include/string Outdated Show resolved Hide resolved
libcxx/include/string Outdated Show resolved Hide resolved
libcxx/include/string Outdated Show resolved Hide resolved
@EricWF
Copy link
Member

EricWF commented Dec 21, 2023

Also we need to do this to everything. I think everything needs to compile away fully without the optimizer before I feel comfortable fixing forward with this.

@EricWF
Copy link
Member

EricWF commented Dec 21, 2023

I'm putting up a change shortly to disable -Wc++17-extensions.

(1) Renamed to `_LIBCPP_IF_ASAN`.
(2) Moved to `__config`.
(3) Added `{ ... }` after `_LIBCPP_IF_ASAN()`.

Missing: `#ifdef` around ASan related variables.
Variables used only by ASan helper functions are conditionalized (put behind `#ifndef _LIBCPP_HAS_NO_ASAN`) after that commit.
That should remove any impact on performance.
libcxx/include/__config Outdated Show resolved Hide resolved
libcxx/include/__config Outdated Show resolved Hide resolved
@EricWF
Copy link
Member

EricWF commented Dec 21, 2023

The tests should start passing for this once #76178 lands.

- Macro moved to from `__config` to `string` file.
- `_LIBCPP_IF_ASAN` -> `_LIBCPP_STRING_IF_ASAN`.
- No more check `#ifndef _LIBCPP_STRING_IF_ASAN` before defining macro.
@AdvenamTacet AdvenamTacet changed the title [ASan][libc++] Update string ASan annotations to Zero-overhead [ASan][libc++] Update string ASan annotations to zero-overhead Dec 21, 2023
Copy link
Member

@EricWF EricWF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the bots pass I'm good with this. That's contingent on #76178 being accepted.

Thank you again for working on this.

return *this;
}

_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 basic_string&
__null_terminate_at(value_type* __p, size_type __newsz) {
#ifndef __LIBCPP_HAS_NO_ASAN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sucks, but it works for now i think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely should consider how to refactor it. I can reduce number of lines like that by removing a few not necessary variables, in another patch soon, but not completely remove all cases.

@ldionne
Copy link
Member

ldionne commented Dec 22, 2023

I would prefer that we understand whether #76082 is sufficient first, and if it's not, then we can go ahead with this change. I'm worried about this change because it's really reducing the readability of our string implementation, which at this point is already borderline unmanageable. So I would only do this if we absolutely must, and that would need to include the justification for this change, i.e. what concrete problem we are solving.

@AdvenamTacet
Copy link
Member Author

We upstreamed #76082 and #76200 - @EricWF is there still a need for that PR or can we close it without merge?

@EricWF
Copy link
Member

EricWF commented Jan 8, 2024

We can close it without merging. Thanks for your work on this.

@AdvenamTacet AdvenamTacet deleted the string-annotations-zero-cost branch January 22, 2024 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants