Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sanitizer] Fix asserts in asan and tsan in pthread interceptors. #75394

Merged
merged 1 commit into from Jan 11, 2024

Conversation

goussepi
Copy link
Collaborator

Calling one of pthread join/detach interceptor on an already joined/detached thread causes asserts such as:

AddressSanitizer: CHECK failed: sanitizer_thread_arg_retval.cpp:56 "((t)) != (0)" (0x0, 0x0) (tid=1236094)
#0 0x555555634f8b in __asan::CheckUnwind() compiler-rt/lib/asan/asan_rtl.cpp:69:3
#1 0x55555564e06e in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:86:24
#2 0x5555556491df in __sanitizer::ThreadArgRetval::BeforeJoin(unsigned long) const compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp:56:3
#3 0x5555556198ed in Join<___interceptor_pthread_tryjoin_np(void*, void**)::<lambda()> > compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_arg_retval.h:74:26
#4 0x5555556198ed in pthread_tryjoin_np compiler-rt/lib/asan/asan_interceptors.cpp:311:29

The assert are replaced by error codes.

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 13, 2023

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (goussepi)

Changes

Calling one of pthread join/detach interceptor on an already joined/detached thread causes asserts such as:

AddressSanitizer: CHECK failed: sanitizer_thread_arg_retval.cpp:56 "((t)) != (0)" (0x0, 0x0) (tid=1236094)
#0 0x555555634f8b in __asan::CheckUnwind() compiler-rt/lib/asan/asan_rtl.cpp:69:3
#1 0x55555564e06e in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:86:24
#2 0x5555556491df in __sanitizer::ThreadArgRetval::BeforeJoin(unsigned long) const compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp:56:3
#3 0x5555556198ed in Join<___interceptor_pthread_tryjoin_np(void*, void**)::<lambda()> > compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_arg_retval.h:74:26
#4 0x5555556198ed in pthread_tryjoin_np compiler-rt/lib/asan/asan_interceptors.cpp:311:29

The assert are replaced by error codes.


Full diff: https://github.com/llvm/llvm-project/pull/75394.diff

4 Files Affected:

  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp (+2-1)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp (+2-1)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (+2-2)
  • (modified) compiler-rt/test/sanitizer_common/TestCases/Linux/pthread_join.cpp (+9)
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp
index bddb2852140854..5edf6f1764f966 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp
@@ -53,7 +53,8 @@ void ThreadArgRetval::Finish(uptr thread, void* retval) {
 u32 ThreadArgRetval::BeforeJoin(uptr thread) const {
   __sanitizer::Lock lock(&mtx_);
   auto t = data_.find(thread);
-  CHECK(t);
+  if (!t)
+    return 0;
   CHECK(!t->second.detached);
   return t->second.gen;
 }
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp
index 741e0731c41559..12d36277589626 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp
@@ -345,7 +345,8 @@ u32 ThreadRegistry::ConsumeThreadUserId(uptr user_id) {
   ThreadRegistryLock l(this);
   u32 tid;
   auto *t = live_.find(user_id);
-  CHECK(t);
+  if (!t)
+    return kInvalidTid;
   tid = t->second;
   live_.erase(t);
   auto *tctx = threads_[tid];
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index 80f86ca98ed9cd..bef0aa373119fa 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -1118,7 +1118,7 @@ TSAN_INTERCEPTOR(int, pthread_tryjoin_np, void *th, void **ret) {
   ThreadIgnoreEnd(thr);
   if (res == 0)
     ThreadJoin(thr, pc, tid);
-  else
+  else if (tid != kInvalidTid)
     ThreadNotJoined(thr, pc, tid, (uptr)th);
   return res;
 }
@@ -1132,7 +1132,7 @@ TSAN_INTERCEPTOR(int, pthread_timedjoin_np, void *th, void **ret,
   ThreadIgnoreEnd(thr);
   if (res == 0)
     ThreadJoin(thr, pc, tid);
-  else
+  else if (tid != kInvalidTid)
     ThreadNotJoined(thr, pc, tid, (uptr)th);
   return res;
 }
diff --git a/compiler-rt/test/sanitizer_common/TestCases/Linux/pthread_join.cpp b/compiler-rt/test/sanitizer_common/TestCases/Linux/pthread_join.cpp
index 212a28dd3985bb..dd70f12fbbef6c 100644
--- a/compiler-rt/test/sanitizer_common/TestCases/Linux/pthread_join.cpp
+++ b/compiler-rt/test/sanitizer_common/TestCases/Linux/pthread_join.cpp
@@ -4,6 +4,10 @@
 // FIXME: Crashes on some bots in pthread_exit.
 // RUN: %run %t %if tsan %{ 0 %} %else %{ 1 %}
 
+// Check interceptors' return codes are the same without sanitizers.
+// RUN: %clangxx -pthread -fno-sanitize=all %s -o %t
+// RUN: %run %t 0
+
 // REQUIRES: glibc
 
 #include <assert.h>
@@ -12,6 +16,7 @@
 #include <stdint.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <errno.h>
 
 bool use_exit;
 static void *fn(void *args) {
@@ -36,12 +41,14 @@ int main(int argc, char **argv) {
   assert(!pthread_attr_destroy(&attr));
 
   assert(!pthread_detach(thread[0]));
+  assert(pthread_detach(thread[0]) == EINVAL);
 
   {
     void *res;
     while (pthread_tryjoin_np(thread[1], &res))
       sleep(1);
     assert(~(uintptr_t)res == 1001);
+    assert(pthread_tryjoin_np(thread[1], &res) == EBUSY);
   }
 
   {
@@ -50,12 +57,14 @@ int main(int argc, char **argv) {
     while (pthread_timedjoin_np(thread[2], &res, &tm))
       sleep(1);
     assert(~(uintptr_t)res == 1002);
+    assert(pthread_timedjoin_np(thread[2], &res, &tm) == ESRCH);
   }
 
   {
     void *res;
     assert(!pthread_join(thread[3], &res));
     assert(~(uintptr_t)res == 1003);
+    assert(pthread_join(thread[3], &res) == ESRCH);
   }
 
   return 0;

Copy link

github-actions bot commented Dec 13, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@vitalybuka
Copy link
Collaborator

@dvyukov had strong opinion on that.
I would probably prefer sanitizer crash with report with explanation.
But I don't like silently return error here as well.

From pthread_detach manual

       Attempting to detach an already detached thread results in
       unspecified behavior.

There is no promise to return particular error on the second detach. User code should not do that.

However sanitizers can get into inconsistent state, which will be hard to debug.

@dvyukov
Copy link
Collaborator

dvyukov commented Dec 14, 2023

Yes, I wonder what's the pattern in user code that triggers this and is not a bug.
Error codes are frequently ignored.
Perhaps print a better explanatory message before crashing.

@goussepi
Copy link
Collaborator Author

Thanks @vitalybuka @dvyukov for the comments.
Yes, I agree no need to check for invalid pthread_detach calls’ return value, I have removed it from the test.
Regarding the pthread join apis, what about not asserting unless running Tsan (cf test update) ?
Many thanks!
Pierre

@@ -4,10 +4,15 @@
// FIXME: Crashes on some bots in pthread_exit.
// RUN: %run %t %if tsan %{ 0 %} %else %{ 1 %}

// Check interceptors' return codes are the same without sanitizers.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see value in this test
-fno-sanitize=all removes runtime, we don't have interceptors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant it as a sanity check to double check the interceptors returns the same error code as the intercepted functions.
Not strictly required so happy to remove.

@@ -53,7 +53,8 @@ void ThreadArgRetval::Finish(uptr thread, void* retval) {
u32 ThreadArgRetval::BeforeJoin(uptr thread) const {
__sanitizer::Lock lock(&mtx_);
auto t = data_.find(thread);
CHECK(t);
if (!t)
return 0;
CHECK(!t->second.detached);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still have the case when joined thread is reused by another 'detached' thread and we will hit the CHECK below

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A yes of course, will fix thanks!

@vitalybuka
Copy link
Collaborator

It's not critical for this component to mismatch join

However, it could be nice to see example why keeping the bug in the program is needed?
Obvious danger of this is thread reuse, and you are joining alive but wrong thread.
Sanitizer does not detect that, but if you hit this CHECK, it's possible.

Maybe rather introduce detect_invalid_join into sanitizer_flags.inc, set it true by default, and for let users disable it?
Would you like to do that? If not, please add TODO near the current CHECK

@vitalybuka vitalybuka closed this Dec 18, 2023
@vitalybuka vitalybuka reopened this Dec 18, 2023
@goussepi
Copy link
Collaborator Author

It's not critical for this component to mismatch join

However, it could be nice to see example why keeping the bug in the program is needed? Obvious danger of this is thread reuse, and you are joining alive but wrong thread. Sanitizer does not detect that, but if you hit this CHECK, it's possible.

Maybe rather introduce detect_invalid_join into sanitizer_flags.inc, set it true by default, and for let users disable it? Would you like to do that? If not, please add TODO near the current CHECK

The error report I got was for pthread_join being called unconditionally inside a destructor.
I agree they are bugs but users see a crash and it can end up being reported as a sanitizer issue.
Adding a flag such as detect_invalid_join sounds good, and using Report() + Die() rather than a CHECK ?

Many thanks!

Pierre

@vitalybuka
Copy link
Collaborator

It's not critical for this component to mismatch join
However, it could be nice to see example why keeping the bug in the program is needed? Obvious danger of this is thread reuse, and you are joining alive but wrong thread. Sanitizer does not detect that, but if you hit this CHECK, it's possible.
Maybe rather introduce detect_invalid_join into sanitizer_flags.inc, set it true by default, and for let users disable it? Would you like to do that? If not, please add TODO near the current CHECK

The error report I got was for pthread_join being called unconditionally inside a destructor. I agree they are bugs but users see a crash and it can end up being reported as a sanitizer issue. Adding a flag such as detect_invalid_join sounds good, and using Report() + Die() rather than a CHECK ?

Many thanks!

Pierre

Yes, we need Report() + Die().

@goussepi
Copy link
Collaborator Author

goussepi commented Jan 9, 2024

ping!

@vitalybuka
Copy link
Collaborator

ping!

There is "re-request" review, otherwise it does not show up in searches like this https://github.com/llvm/llvm-project/pulls/review-requested/@me

CHECK(t);
CHECK(!t->second.detached);
return t->second.gen;
bool detect_invalid_join = common_flags()->detect_invalid_join;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  if (t && !t->second.detached) {
    return t->second.gen;
  }
  if (!common_flags()->detect_invalid_join)
     return kInvalidGen;
  const char* const reason = "unknown";
  if (!t) {
    reason = "already joined";
  } else if (t->second.detached) {
    reason = "detached";
  } 
  Report("ERROR: %s: Joining %s thread, aborting.\n",
               SanitizerToolName, reason);
  Die();

}

void ThreadArgRetval::AfterJoin(uptr thread, u32 gen) {
__sanitizer::Lock lock(&mtx_);
auto t = data_.find(thread);
if (!t || gen != t->second.gen) {
// Thread was reused and erased by any other event.
if (!t || gen != t->second.gen || gen == kInvalidGen) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not have t->second.gen = kInvalidGen,

to make sure skip invalid here .gen = gen_++;

Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thank you!

@@ -23,6 +23,9 @@ void ThreadArgRetval::CreateLocked(uptr thread, bool detached,
Data& t = data_[thread];
t = {};
t.gen = gen_++;
static_assert(sizeof(gen_) == sizeof(u32) && kInvalidGen == UINT32_MAX);
if (gen_ == kInvalidGen)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I missed if (gen_ == kInvalidGen) = 0,
so we don't need the one introduced with b96deb8

@vitalybuka
Copy link
Collaborator

I believe "member" can merge?
If not, please re-request, and I'll merge

Calling one of pthread join/detach interceptor on an already joined/detached
thread causes asserts such as:

AddressSanitizer: CHECK failed: sanitizer_thread_arg_retval.cpp:56 "((t)) != (0)" (0x0, 0x0) (tid=1236094)
    #0 0x555555634f8b in __asan::CheckUnwind() compiler-rt/lib/asan/asan_rtl.cpp:69:3
    llvm#1 0x55555564e06e in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:86:24
    llvm#2 0x5555556491df in __sanitizer::ThreadArgRetval::BeforeJoin(unsigned long) const compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp:56:3
    llvm#3 0x5555556198ed in Join<___interceptor_pthread_tryjoin_np(void*, void**)::<lambda()> > compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_arg_retval.h:74:26
    llvm#4 0x5555556198ed in pthread_tryjoin_np compiler-rt/lib/asan/asan_interceptors.cpp:311:29

This change does 2 things:
- Introduce a detect_invalid_join common flags
- Replace CHECK assert by Report() and Die()
@goussepi goussepi force-pushed the pthread_join-interceptors-crash-fix branch from 5f3b82b to 4e031a9 Compare January 11, 2024 10:43
@goussepi goussepi merged commit 5e40661 into llvm:main Jan 11, 2024
4 checks passed
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…vm#75394)

Calling one of pthread join/detach interceptor on an already
joined/detached thread causes asserts such as:

AddressSanitizer: CHECK failed: sanitizer_thread_arg_retval.cpp:56
"((t)) != (0)" (0x0, 0x0) (tid=1236094)
#0 0x555555634f8b in __asan::CheckUnwind()
compiler-rt/lib/asan/asan_rtl.cpp:69:3
llvm#1 0x55555564e06e in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:86:24
llvm#2 0x5555556491df in __sanitizer::ThreadArgRetval::BeforeJoin(unsigned
long) const
compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp:56:3
llvm#3 0x5555556198ed in Join<___interceptor_pthread_tryjoin_np(void*,
void**)::<lambda()> >
compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_arg_retval.h:74:26
llvm#4 0x5555556198ed in pthread_tryjoin_np
compiler-rt/lib/asan/asan_interceptors.cpp:311:29

The assert are replaced by error codes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants