-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[libc] Fix undefined behavior for nan functions. #106468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-libc Author: None (lntue) ChangesCurrently the nan* functions use nullptr dereferencing to crash with SIGSEGV if the input is nullptr. Both See for instance: https://godbolt.org/z/fd8KcM9bx This PR uses volatile load to prevent the undefined behavior if libc is built without sanitizers, and leave the current undefined behavior if libc is built with sanitizers, so that the undefined behavior can be caught for users' codes. Full diff: https://github.com/llvm/llvm-project/pull/106468.diff 11 Files Affected:
diff --git a/libc/src/__support/CMakeLists.txt b/libc/src/__support/CMakeLists.txt
index 9bd1e29081a801..1d5c2b585ea32f 100644
--- a/libc/src/__support/CMakeLists.txt
+++ b/libc/src/__support/CMakeLists.txt
@@ -192,6 +192,9 @@ add_header_library(
libc.src.__support.CPP.optional
libc.src.__support.FPUtil.fp_bits
libc.src.__support.FPUtil.rounding_mode
+ libc.src.__support.macros.config
+ libc.src.__support.macros.optimization
+ libc.src.__support.macros.sanitizer
libc.src.errno.errno
)
diff --git a/libc/src/__support/macros/sanitizer.h b/libc/src/__support/macros/sanitizer.h
index c4f8b5bce39755..c20412e0f8b69f 100644
--- a/libc/src/__support/macros/sanitizer.h
+++ b/libc/src/__support/macros/sanitizer.h
@@ -15,7 +15,25 @@
// Functions to unpoison memory
//-----------------------------------------------------------------------------
+#if LIBC_HAS_FEATURE(address_sanitizer) || defined(__SANITIZE_ADDRESS__)
+#define LIBC_HAS_ADDRESS_SANITIZER
+#endif
+
#if LIBC_HAS_FEATURE(memory_sanitizer)
+#define LIBC_HAS_MEMORY_SANITIZER
+#endif
+
+#if LIBC_HAS_FEATURE(undefined_behavior_sanitizer)
+#define LIBC_HAS_UNDEFINED_BEHAVIOR_SANITIZER
+#endif
+
+#if defined(LIBC_HAS_ADDRESS_SANITIZER) || \
+ defined(LIBC_HAS_MEMORY_SANITIZER) || \
+ defined(LIBC_HAS_UNDEFINED_BEHAVIOR_SANITIZER)
+#define LIBC_HAS_SANITIZER
+#endif
+
+#ifdef LIBC_HAS_MEMORY_SANITIZER
// Only perform MSAN unpoison in non-constexpr context.
#include <sanitizer/msan_interface.h>
#define MSAN_UNPOISON(addr, size) \
@@ -27,8 +45,7 @@
#define MSAN_UNPOISON(ptr, size)
#endif
-#if LIBC_HAS_FEATURE(address_sanitizer)
-#define LIBC_HAVE_ADDRESS_SANITIZER
+#ifdef LIBC_HAS_ADDRESS_SANITIZER
#include <sanitizer/asan_interface.h>
#define ASAN_POISON_MEMORY_REGION(addr, size) \
__asan_poison_memory_region((addr), (size))
diff --git a/libc/src/__support/str_to_float.h b/libc/src/__support/str_to_float.h
index ffd6ebf27c7726..4cae02d591078a 100644
--- a/libc/src/__support/str_to_float.h
+++ b/libc/src/__support/str_to_float.h
@@ -20,6 +20,8 @@
#include "src/__support/detailed_powers_of_ten.h"
#include "src/__support/high_precision_decimal.h"
#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h"
+#include "src/__support/macros/sanitizer.h"
#include "src/__support/str_to_integer.h"
#include "src/__support/str_to_num_result.h"
#include "src/__support/uint128.h"
@@ -1208,6 +1210,15 @@ template <class T> LIBC_INLINE StrToNumResult<T> strtonan(const char *arg) {
using FPBits = typename fputil::FPBits<T>;
using StorageType = typename FPBits::StorageType;
+#ifndef LIBC_HAS_SANITIZER
+ if (LIBC_UNLIKELY(arg == nullptr)) {
+ // Use volatile to prevent undefined behavior of dereferencing nullptr.
+ volatile const char *crashing = arg;
+ // Intentionally crashing with SIGSEGV.
+ return {static_cast<T>(crashing[0]), 0, 0};
+ }
+#endif
+
FPBits result;
int error = 0;
StorageType nan_mantissa = 0;
diff --git a/libc/test/src/compiler/CMakeLists.txt b/libc/test/src/compiler/CMakeLists.txt
index 65a9acceb6f7f1..a45fa8c55e5128 100644
--- a/libc/test/src/compiler/CMakeLists.txt
+++ b/libc/test/src/compiler/CMakeLists.txt
@@ -7,6 +7,7 @@ add_libc_unittest(
SRCS
stack_chk_guard_test.cpp
DEPENDS
+ libc.hdr.signal_macros
libc.src.__support.macros.sanitizer
libc.src.compiler.__stack_chk_fail
libc.src.string.memset
diff --git a/libc/test/src/compiler/stack_chk_guard_test.cpp b/libc/test/src/compiler/stack_chk_guard_test.cpp
index 6b71e155fa3e4d..4ec8398c9fc95d 100644
--- a/libc/test/src/compiler/stack_chk_guard_test.cpp
+++ b/libc/test/src/compiler/stack_chk_guard_test.cpp
@@ -6,7 +6,7 @@
//
//===----------------------------------------------------------------------===//
-#include "include/llvm-libc-macros/signal-macros.h"
+#include "hdr/signal_macros.h"
#include "src/__support/macros/sanitizer.h"
#include "src/compiler/__stack_chk_fail.h"
#include "src/string/memset.h"
@@ -18,7 +18,7 @@ TEST(LlvmLibcStackChkFail, Death) {
// Disable the test when asan is enabled so that it doesn't immediately fail
// after the memset, but before the stack canary is re-checked.
-#ifndef LIBC_HAVE_ADDRESS_SANITIZER
+#ifndef LIBC_HAS_ADDRESS_SANITIZER
TEST(LlvmLibcStackChkFail, Smash) {
EXPECT_DEATH(
[] {
@@ -27,4 +27,4 @@ TEST(LlvmLibcStackChkFail, Smash) {
},
WITH_SIGNAL(SIGABRT));
}
-#endif // LIBC_HAVE_ADDRESS_SANITIZER
+#endif // LIBC_HAS_ADDRESS_SANITIZER
diff --git a/libc/test/src/math/smoke/CMakeLists.txt b/libc/test/src/math/smoke/CMakeLists.txt
index 7271e933b9311d..e943d98256a97b 100644
--- a/libc/test/src/math/smoke/CMakeLists.txt
+++ b/libc/test/src/math/smoke/CMakeLists.txt
@@ -2895,9 +2895,10 @@ add_fp_unittest(
SRCS
nanf_test.cpp
DEPENDS
- libc.include.signal
+ libc.hdr.signal_macros
libc.src.math.nanf
libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.macros.sanitizer
# FIXME: The nan tests currently have death tests, which aren't supported for
# hermetic tests.
UNIT_TEST_ONLY
@@ -2910,9 +2911,10 @@ add_fp_unittest(
SRCS
nan_test.cpp
DEPENDS
- libc.include.signal
+ libc.hdr.signal_macros
libc.src.math.nan
libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.macros.sanitizer
# FIXME: The nan tests currently have death tests, which aren't supported for
# hermetic tests.
UNIT_TEST_ONLY
@@ -2925,9 +2927,10 @@ add_fp_unittest(
SRCS
nanl_test.cpp
DEPENDS
- libc.include.signal
+ libc.hdr.signal_macros
libc.src.math.nanl
libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.macros.sanitizer
# FIXME: The nan tests currently have death tests, which aren't supported for
# hermetic tests.
UNIT_TEST_ONLY
@@ -2940,7 +2943,7 @@ add_fp_unittest(
SRCS
nanf16_test.cpp
DEPENDS
- libc.include.signal
+ libc.hdr.signal_macros
libc.src.math.nanf16
libc.src.__support.FPUtil.fp_bits
libc.src.__support.macros.sanitizer
@@ -2956,9 +2959,10 @@ add_fp_unittest(
SRCS
nanf128_test.cpp
DEPENDS
- libc.include.signal
+ libc.hdr.signal_macros
libc.src.math.nanf128
libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.macros.sanitizer
# FIXME: The nan tests currently have death tests, which aren't supported for
# hermetic tests.
UNIT_TEST_ONLY
diff --git a/libc/test/src/math/smoke/nan_test.cpp b/libc/test/src/math/smoke/nan_test.cpp
index 68c844181a1946..46b9e9aa9563ab 100644
--- a/libc/test/src/math/smoke/nan_test.cpp
+++ b/libc/test/src/math/smoke/nan_test.cpp
@@ -6,12 +6,13 @@
//
//===----------------------------------------------------------------------===//
+#include "hdr/signal_macros.h"
#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/macros/sanitizer.h"
#include "src/math/nan.h"
#include "test/UnitTest/FEnvSafeTest.h"
#include "test/UnitTest/FPMatcher.h"
#include "test/UnitTest/Test.h"
-#include <signal.h>
class LlvmLibcNanTest : public LIBC_NAMESPACE::testing::FEnvSafeTest {
public:
@@ -43,8 +44,8 @@ TEST_F(LlvmLibcNanTest, RandomString) {
run_test("123 ", 0x7ff8000000000000);
}
-#if !defined(LIBC_HAVE_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
+#if !defined(LIBC_HAS_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
TEST_F(LlvmLibcNanTest, InvalidInput) {
EXPECT_DEATH([] { LIBC_NAMESPACE::nan(nullptr); }, WITH_SIGNAL(SIGSEGV));
}
-#endif // LIBC_HAVE_ADDRESS_SANITIZER
+#endif // LIBC_HAS_ADDRESS_SANITIZER
diff --git a/libc/test/src/math/smoke/nanf128_test.cpp b/libc/test/src/math/smoke/nanf128_test.cpp
index 015cc31e4be237..25dd2ef1d5b1ca 100644
--- a/libc/test/src/math/smoke/nanf128_test.cpp
+++ b/libc/test/src/math/smoke/nanf128_test.cpp
@@ -6,7 +6,9 @@
//
//===----------------------------------------------------------------------===//
+#include "hdr/signal_macros.h"
#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/macros/sanitizer.h"
#include "src/__support/uint128.h"
#include "src/math/nanf128.h"
#include "test/UnitTest/FEnvSafeTest.h"
@@ -53,9 +55,8 @@ TEST_F(LlvmLibcNanf128Test, RandomString) {
QUIET_NAN);
}
-#if !defined(LIBC_HAVE_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
-#include <signal.h>
+#if !defined(LIBC_HAS_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
TEST_F(LlvmLibcNanf128Test, InvalidInput) {
EXPECT_DEATH([] { LIBC_NAMESPACE::nanf128(nullptr); }, WITH_SIGNAL(SIGSEGV));
}
-#endif // LIBC_HAVE_ADDRESS_SANITIZER
+#endif // LIBC_HAS_ADDRESS_SANITIZER
diff --git a/libc/test/src/math/smoke/nanf16_test.cpp b/libc/test/src/math/smoke/nanf16_test.cpp
index 81b844bf6bb59c..ec640a3b9eef92 100644
--- a/libc/test/src/math/smoke/nanf16_test.cpp
+++ b/libc/test/src/math/smoke/nanf16_test.cpp
@@ -6,6 +6,7 @@
//
//===----------------------------------------------------------------------===//
+#include "hdr/signal_macros.h"
#include "src/__support/FPUtil/FPBits.h"
#include "src/__support/macros/sanitizer.h"
#include "src/math/nanf16.h"
@@ -13,8 +14,6 @@
#include "test/UnitTest/FPMatcher.h"
#include "test/UnitTest/Test.h"
-#include <signal.h>
-
class LlvmLibcNanf16Test : public LIBC_NAMESPACE::testing::FEnvSafeTest {
public:
using StorageType = LIBC_NAMESPACE::fputil::FPBits<float16>::StorageType;
@@ -44,8 +43,8 @@ TEST_F(LlvmLibcNanf16Test, RandomString) {
run_test("123 ", 0x7e00);
}
-#if !defined(LIBC_HAVE_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
+#if !defined(LIBC_HAS_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
TEST_F(LlvmLibcNanf16Test, InvalidInput) {
EXPECT_DEATH([] { LIBC_NAMESPACE::nanf16(nullptr); }, WITH_SIGNAL(SIGSEGV));
}
-#endif // LIBC_HAVE_ADDRESS_SANITIZER
+#endif // LIBC_HAS_ADDRESS_SANITIZER
diff --git a/libc/test/src/math/smoke/nanf_test.cpp b/libc/test/src/math/smoke/nanf_test.cpp
index ff5823685225ce..dd3124ee9c5112 100644
--- a/libc/test/src/math/smoke/nanf_test.cpp
+++ b/libc/test/src/math/smoke/nanf_test.cpp
@@ -6,12 +6,13 @@
//
//===----------------------------------------------------------------------===//
+#include "hdr/signal_macros.h"
#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/macros/sanitizer.h"
#include "src/math/nanf.h"
#include "test/UnitTest/FEnvSafeTest.h"
#include "test/UnitTest/FPMatcher.h"
#include "test/UnitTest/Test.h"
-#include <signal.h>
class LlvmLibcNanfTest : public LIBC_NAMESPACE::testing::FEnvSafeTest {
public:
@@ -42,8 +43,8 @@ TEST_F(LlvmLibcNanfTest, RandomString) {
run_test("123 ", 0x7fc00000);
}
-#if !defined(LIBC_HAVE_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
+#if !defined(LIBC_HAS_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
TEST_F(LlvmLibcNanfTest, InvalidInput) {
EXPECT_DEATH([] { LIBC_NAMESPACE::nanf(nullptr); }, WITH_SIGNAL(SIGSEGV));
}
-#endif // LIBC_HAVE_ADDRESS_SANITIZER
+#endif // LIBC_HAS_ADDRESS_SANITIZER
diff --git a/libc/test/src/math/smoke/nanl_test.cpp b/libc/test/src/math/smoke/nanl_test.cpp
index de9af05100c10a..ef3f9c15dafd9f 100644
--- a/libc/test/src/math/smoke/nanl_test.cpp
+++ b/libc/test/src/math/smoke/nanl_test.cpp
@@ -6,12 +6,13 @@
//
//===----------------------------------------------------------------------===//
+#include "hdr/signal_macros.h"
#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/macros/sanitizer.h"
#include "src/math/nanl.h"
#include "test/UnitTest/FEnvSafeTest.h"
#include "test/UnitTest/FPMatcher.h"
#include "test/UnitTest/Test.h"
-#include <signal.h>
#if defined(LIBC_TYPES_LONG_DOUBLE_IS_FLOAT64)
#define SELECT_LONG_DOUBLE(val, _, __) val
@@ -70,8 +71,8 @@ TEST_F(LlvmLibcNanlTest, RandomString) {
run_test("123 ", expected);
}
-#if !defined(LIBC_HAVE_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
+#if !defined(LIBC_HAS_ADDRESS_SANITIZER) && defined(LIBC_TARGET_OS_IS_LINUX)
TEST_F(LlvmLibcNanlTest, InvalidInput) {
EXPECT_DEATH([] { LIBC_NAMESPACE::nanl(nullptr); }, WITH_SIGNAL(SIGSEGV));
}
-#endif // LIBC_HAVE_ADDRESS_SANITIZER
+#endif // LIBC_HAS_ADDRESS_SANITIZER
|
libc/src/__support/str_to_float.h
Outdated
|
||
#ifndef LIBC_HAS_SANITIZER | ||
if (LIBC_UNLIKELY(arg == nullptr)) { | ||
// Use volatile to prevent undefined behavior of dereferencing nullptr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that this problem is only likely to occur in the tests, does it make more sense to change the tests instead of adding this extra-explicit check here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the main purpose is not for the test, but hardening user code when sanitizers are not in used. This is especially important when this header function is included and built from source, as the compiler can detect undefined behavior and optimize the code away completely. The explicit check is for performance purpose, since you will not want everything to always be round-tripped through the volatile variable.
And we leave the undefined-behavior back if the code is built under sanitizers, so that if in user codes, nullptr
does reach here, then the sanitizers will be able to catch it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, the only time this will matter is when all of the following conditions are met:
a) User code is built alongside libc code as source
b) The code is built without sanitizers
c) The code is built with optimizations
d) The code passes a nullptr to the nan
function in a constant way.
In that case, the user code is incorrect (it should not pass nullptr to nan
) and the compiler may do unexpected things. I agree that this check would help find the bug in this specific case, but it will have a penalty for every other case, since this check can't be optimized out due to the volatile
. Additionally, if the user code does have this undefined behavior, then they can find it by changing any of the conditions (e.g. building with sanitizers) without us needing this check all the time.
My main point is: If the user wants hardening to find this sort of undefined behavior, they have access to it. If they didn't choose to use hardening then they may not want it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's triggered even without a & d. This got my attention when the nan tests didn't crash and trigger SIGSEGV on some riscv64 configs.
About your worry regarding the performance, if the inputs (runtime or compile-time) have nullptr, our original intention is to crash with SIGSEGV anyway, and the added nullptr
check and volatile load is negligible compared to crashing part.
If the inputs do not have nullptr, the only extra cost is the nullptr check arg == nullptr
, which always returns false under the condition / assumption, cost at most 1 CPU clock, and will be hidden away completely by the CPU's branch predictors.
So for this check arg == nullptr
to incur observable performance penalty, you'll need to call strtonan
for a large array with inputs alternating between very short string and nullptr
in order to beat the CPU's branch predictor. But in that case, we already crash at the first nullptr
encountered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point. In that case we should create a generic way to do nullptr checks so it can be used in other places, and also guard it behind a hardening macro so it can be turned off on space-sensitive systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a config option to enable/disable null checks. At the current form it's not straightforward to refactor it to a generic macro. I'll leave adding a generic macro for later, after seeing how to do null checks in other functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest replacing the volatile
stuff with just __builtin_trap()
. It makes this more generic and makes it less dependent on optimization behavior. If you really want it to segfault, then you could do this:
volatile char *crashing = reinterpret_cast<char*>(arg); //might also need a const_cast to be properly generic
char a = *crashing;
__builtin_trap(); //avoid the return while also making it clear to the compiler that this is a terminal path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactor it to a generic macro in __support/macros/null_check.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Currently the nan* functions use nullptr dereferencing to crash with SIGSEGV if the input is nullptr. Both
nan(nullptr)
andnullptr
dereferencing are undefined behaviors according to the C standard. Employingnullptr
dereference in thenan
function implementation is ok if users only linked against the pre-built library, but it might be completely removed by the compilers' optimizations if it is built from source together with the users' code.See for instance: https://godbolt.org/z/fd8KcM9bx
This PR uses volatile load to prevent the undefined behavior if libc is built without sanitizers, and leave the current undefined behavior if libc is built with sanitizers, so that the undefined behavior can be caught for users' codes.