-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[libc] Unify and extend no_sanitize attributes for strlen. #161316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fast strlen implementations (naive wide-reads, SIMD-based, and x86_64/aarch64-optimized versions) all may perform technically-out-of-bound reads, which leads to reports under ASan, HWASan (on ARM machines), and also TSan (which also has the capability to detect heap out-of-bound reads). So, we need to explicitly disable instrumentation in all three cases. Tragically, Clang didn't support `[[gnu::no_sanitize]]` syntax until recently, and since we're supporting both GCC and Clang, we have to revert to `__attribute__` syntax.
@llvm/pr-subscribers-libc Author: Alexey Samsonov (vonosmas) ChangesFast strlen implementations (naive wide-reads, SIMD-based, and x86_64/aarch64-optimized versions) all may perform technically-out-of-bound reads, which leads to reports under ASan, HWASan (on ARM machines), and also TSan (which also has the capability to detect heap out-of-bound reads). So, we need to explicitly disable instrumentation in all three cases. Tragically, Clang didn't support Full diff: https://github.com/llvm/llvm-project/pull/161316.diff 4 Files Affected:
diff --git a/libc/src/string/memory_utils/aarch64/inline_strlen.h b/libc/src/string/memory_utils/aarch64/inline_strlen.h
index 36fd1aa636b54..9e5320afe987f 100644
--- a/libc/src/string/memory_utils/aarch64/inline_strlen.h
+++ b/libc/src/string/memory_utils/aarch64/inline_strlen.h
@@ -17,7 +17,8 @@
namespace LIBC_NAMESPACE_DECL {
namespace neon {
-[[gnu::no_sanitize_address]] [[maybe_unused]] LIBC_INLINE static size_t
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+[[maybe_unused]] LIBC_INLINE static size_t
string_length(const char *src) {
using Vector __attribute__((may_alias)) = uint8x8_t;
diff --git a/libc/src/string/memory_utils/generic/inline_strlen.h b/libc/src/string/memory_utils/generic/inline_strlen.h
index d7435afb03719..0c13209d106d4 100644
--- a/libc/src/string/memory_utils/generic/inline_strlen.h
+++ b/libc/src/string/memory_utils/generic/inline_strlen.h
@@ -24,7 +24,8 @@ LIBC_INLINE constexpr cpp::simd_mask<char> shift_mask(cpp::simd_mask<char> m,
return cpp::bit_cast<cpp::simd_mask<char>>(r);
}
-[[clang::no_sanitize("address")]] LIBC_INLINE size_t
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE size_t
string_length(const char *src) {
constexpr cpp::simd<char> null_byte = cpp::splat('\0');
diff --git a/libc/src/string/memory_utils/x86_64/inline_strlen.h b/libc/src/string/memory_utils/x86_64/inline_strlen.h
index 739f8c1aaddbc..047f10d8b2bad 100644
--- a/libc/src/string/memory_utils/x86_64/inline_strlen.h
+++ b/libc/src/string/memory_utils/x86_64/inline_strlen.h
@@ -18,12 +18,14 @@ namespace LIBC_NAMESPACE_DECL {
namespace string_length_internal {
// Return a bit-mask with the nth bit set if the nth-byte in block_ptr is zero.
template <typename Vector, typename Mask>
-[[gnu::no_sanitize_address]] LIBC_INLINE static Mask
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE static Mask
compare_and_mask(const Vector *block_ptr);
template <typename Vector, typename Mask,
decltype(compare_and_mask<Vector, Mask>)>
-[[gnu::no_sanitize_address]] LIBC_INLINE static size_t
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE static size_t
string_length_vector(const char *src) {
uintptr_t misalign_bytes = reinterpret_cast<uintptr_t>(src) % sizeof(Vector);
diff --git a/libc/src/string/string_utils.h b/libc/src/string/string_utils.h
index 9d636d02f4756..6ee94c244034b 100644
--- a/libc/src/string/string_utils.h
+++ b/libc/src/string/string_utils.h
@@ -119,7 +119,8 @@ template <typename T> LIBC_INLINE size_t string_length(const T *src) {
}
template <typename Word>
-[[gnu::no_sanitize_address]] LIBC_INLINE void *
+__attribute__((no_sanitize("address", "hwaddress", "thread")))
+LIBC_INLINE void *
find_first_character_wide_read(const unsigned char *src, unsigned char ch,
size_t n) {
const unsigned char *char_ptr = src;
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to disable tsan as well? Likely this should be a macro in a header somewhere.
Yes, TSan detects (and complains about) OOB accesses to heap memory. I've created a macro for the attribute, PTAL! |
@michaelrj-google - could you PTAL if this change looks reasonable? I'd like to check it in to fix downstream sanitizer reports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fast strlen implementations (naive wide-reads, SIMD-based, and x86_64/aarch64-optimized versions) all may perform technically-out-of-bound reads, which leads to reports under ASan, HWASan (on ARM machines), and also TSan (which also has the capability to detect heap out-of-bound reads). So, we need to explicitly disable instrumentation in all three cases. Tragically, Clang didn't support `[[gnu::no_sanitize]]` syntax until recently, and since we're supporting both GCC and Clang, we have to revert to `__attribute__` syntax.
Fast strlen implementations (naive wide-reads, SIMD-based, and x86_64/aarch64-optimized versions) all may perform technically-out-of-bound reads, which leads to reports under ASan, HWASan (on ARM machines), and also TSan (which also has the capability to detect heap out-of-bound reads). So, we need to explicitly disable instrumentation in all three cases.
Tragically, Clang didn't support
[[gnu::no_sanitize]]
syntax until recently, and since we're supporting both GCC and Clang, we have to revert to__attribute__
syntax.