vonosmas
Contributor

Fast strlen implementations (naive wide reads, SIMD-based, and x86_64/aarch64-optimized versions) may all perform reads that are technically out of bounds, which leads to reports under ASan, HWASan (on ARM machines), and TSan (which can also detect heap out-of-bounds reads). So we need to explicitly disable instrumentation in all three cases.

Tragically, Clang didn't support the `[[gnu::no_sanitize]]` syntax until recently, and since we support both GCC and Clang, we have to fall back to the `__attribute__` syntax.
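
To illustrate why such reads happen, here is a minimal sketch of a word-at-a-time strlen. It is not the libc implementation: `wide_strlen`, `has_zero_byte`, and `Word` are hypothetical names chosen for this example, and only the attribute spelling is taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Word type that is allowed to alias char storage, so the wide load below
// does not violate strict-aliasing assumptions on GCC and Clang.
using Word __attribute__((may_alias)) = std::uint64_t;

// True if any byte of w is zero (the classic "haszero" bit trick).
static inline bool has_zero_byte(std::uint64_t w) {
  return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

// Hypothetical example function. The aligned word loads may touch bytes
// past the terminator, so sanitizer instrumentation is disabled here.
__attribute__((no_sanitize("address", "hwaddress", "thread")))
static std::size_t wide_strlen(const char *src) {
  assert(src != nullptr);
  const char *p = src;
  // Scan byte by byte until p is aligned to the word size.
  while (reinterpret_cast<std::uintptr_t>(p) % sizeof(Word) != 0) {
    if (*p == '\0')
      return static_cast<std::size_t>(p - src);
    ++p;
  }
  // Read one aligned word at a time until a word contains a zero byte.
  while (!has_zero_byte(*reinterpret_cast<const Word *>(p)))
    p += sizeof(Word);
  // Locate the exact terminator inside that last word.
  while (*p != '\0')
    ++p;
  return static_cast<std::size_t>(p - src);
}
```

An aligned word load never straddles a page boundary, so the extra bytes cannot fault even though they may lie outside the string's allocation. That is why the reads are only "technically" out of bounds, and why the sanitizers still flag them unless instrumentation is disabled.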

@llvmbot
Member

llvmbot commented Sep 30, 2025

@llvm/pr-subscribers-libc

Author: Alexey Samsonov (vonosmas)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/161316.diff

4 Files Affected:

  • (modified) libc/src/string/memory_utils/aarch64/inline_strlen.h (+2-1)
  • (modified) libc/src/string/memory_utils/generic/inline_strlen.h (+2-1)
  • (modified) libc/src/string/memory_utils/x86_64/inline_strlen.h (+4-2)
  • (modified) libc/src/string/string_utils.h (+2-1)
diff --git a/libc/src/string/memory_utils/aarch64/inline_strlen.h b/libc/src/string/memory_utils/aarch64/inline_strlen.h
index 36fd1aa636b54..9e5320afe987f 100644
--- a/libc/src/string/memory_utils/aarch64/inline_strlen.h
+++ b/libc/src/string/memory_utils/aarch64/inline_strlen.h
@@ -17,7 +17,8 @@
 namespace LIBC_NAMESPACE_DECL {
 
 namespace neon {
-[[gnu::no_sanitize_address]] [[maybe_unused]] LIBC_INLINE static size_t
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+[[maybe_unused]] LIBC_INLINE static size_t
 string_length(const char *src) {
   using Vector __attribute__((may_alias)) = uint8x8_t;
 
diff --git a/libc/src/string/memory_utils/generic/inline_strlen.h b/libc/src/string/memory_utils/generic/inline_strlen.h
index d7435afb03719..0c13209d106d4 100644
--- a/libc/src/string/memory_utils/generic/inline_strlen.h
+++ b/libc/src/string/memory_utils/generic/inline_strlen.h
@@ -24,7 +24,8 @@ LIBC_INLINE constexpr cpp::simd_mask<char> shift_mask(cpp::simd_mask<char> m,
   return cpp::bit_cast<cpp::simd_mask<char>>(r);
 }
 
-[[clang::no_sanitize("address")]] LIBC_INLINE size_t
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE size_t
 string_length(const char *src) {
   constexpr cpp::simd<char> null_byte = cpp::splat('\0');
 
diff --git a/libc/src/string/memory_utils/x86_64/inline_strlen.h b/libc/src/string/memory_utils/x86_64/inline_strlen.h
index 739f8c1aaddbc..047f10d8b2bad 100644
--- a/libc/src/string/memory_utils/x86_64/inline_strlen.h
+++ b/libc/src/string/memory_utils/x86_64/inline_strlen.h
@@ -18,12 +18,14 @@ namespace LIBC_NAMESPACE_DECL {
 namespace string_length_internal {
 // Return a bit-mask with the nth bit set if the nth-byte in block_ptr is zero.
 template <typename Vector, typename Mask>
-[[gnu::no_sanitize_address]] LIBC_INLINE static Mask
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE static Mask
 compare_and_mask(const Vector *block_ptr);
 
 template <typename Vector, typename Mask,
           decltype(compare_and_mask<Vector, Mask>)>
-[[gnu::no_sanitize_address]] LIBC_INLINE static size_t
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE static size_t
 string_length_vector(const char *src) {
   uintptr_t misalign_bytes = reinterpret_cast<uintptr_t>(src) % sizeof(Vector);
 
diff --git a/libc/src/string/string_utils.h b/libc/src/string/string_utils.h
index 9d636d02f4756..6ee94c244034b 100644
--- a/libc/src/string/string_utils.h
+++ b/libc/src/string/string_utils.h
@@ -119,7 +119,8 @@ template <typename T> LIBC_INLINE size_t string_length(const T *src) {
 }
 
 template <typename Word>
-[[gnu::no_sanitize_address]] LIBC_INLINE void *
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE void *
 find_first_character_wide_read(const unsigned char *src, unsigned char ch,
                                size_t n) {
   const unsigned char *char_ptr = src;

Contributor

@jhuber6 jhuber6 left a comment

Do we really need to disable tsan as well? Likely this should be a macro in a header somewhere.

@llvmbot llvmbot added the bazel "Peripheral" support tier build system: utils/bazel label Sep 30, 2025
@vonosmas
Contributor Author

> Do we really need to disable tsan as well? Likely this should be a macro in a header somewhere.

Yes, TSan detects (and complains about) OOB accesses to heap memory. I've created a macro for the attribute, PTAL!
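
As a rough sketch of what such a macro could look like (the name `EXAMPLE_NO_SANITIZE_OOB`, the empty fallback, and the helper `find_first_index` are assumptions made for illustration, not the names or code used in the PR):

```cpp
#include <cstddef>

// Hypothetical macro wrapping the attribute from the diff. The fallback to
// an empty expansion is an assumption for compilers without no_sanitize.
#if defined(__has_attribute)
#  if __has_attribute(no_sanitize)
#    define EXAMPLE_NO_SANITIZE_OOB \
       __attribute__((no_sanitize("address", "hwaddress", "thread")))
#  endif
#endif
#ifndef EXAMPLE_NO_SANITIZE_OOB
#  define EXAMPLE_NO_SANITIZE_OOB
#endif

// Hypothetical helper showing where the macro goes on a function template:
// after the template header and before the other specifiers, as in the diff.
// Returns the index of the first element equal to value, or n if none.
template <typename T>
EXAMPLE_NO_SANITIZE_OOB static inline constexpr std::size_t
find_first_index(const T *src, T value, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    if (src[i] == value)
      return i;
  return n;
}
```

Keeping the sanitizer list in one macro means each call site stays a single token, and the list can be changed in one place if another sanitizer starts reporting these reads.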

@vonosmas vonosmas requested a review from jhuber6 September 30, 2025 17:27
@vonosmas vonosmas requested a review from jhuber6 September 30, 2025 18:15
@vonosmas
Contributor Author

vonosmas commented Oct 1, 2025

@michaelrj-google - could you PTAL if this change looks reasonable? I'd like to check it in to fix downstream sanitizer reports.

Contributor

@michaelrj-google michaelrj-google left a comment

LGTM

@vonosmas vonosmas merged commit 5f0f497 into llvm:main Oct 2, 2025
20 checks passed
@vonosmas vonosmas deleted the llvm-libc-attr-fix branch October 2, 2025 00:26
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025