[tsan] Mark pthread_*_lock functions as blocking #84162

Merged (1 commit) on Mar 26, 2024

@canova canova commented Mar 6, 2024

Fixes #83561.

When a thread is blocked on a mutex and we send an async signal to that thread, the signal never arrives because tsan does not treat pthread_mutex_lock as a blocking function. This patch marks the pthread_*_lock functions as blocking so that async signals such as SIGPROF are delivered while the thread is blocked on them.

See the issue for more details. I also added a test, which is a simplified version of the Compiler Explorer example I posted in the issue.

Please let me know if you have any other ideas or things to improve! Happy to work on them.

I also filed #83844, which is trickier because we don't have a libc wrapper for SYS_futex. I'm not sure how to intercept this yet. Please let me know if you have ideas on that as well. Thanks!
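
To illustrate the behavior described in the summary above, here is a minimal sketch of signal deferral. It assumes, as the description states, that the runtime only delivers an async signal right away when the thread is inside a call it considers blocking, and otherwise holds the signal until a later safe point. Every name in it (`ThreadModel`, `OnAsyncSignal`, `FlushPending`, `BlockingScope`, `DeliveredWhileWaiting`) is hypothetical and chosen for illustration; this is a toy model, not the tsan runtime's code.

```cpp
// Toy model of async-signal deferral. All names are hypothetical and
// illustrative; this is NOT the ThreadSanitizer runtime implementation.
#include <cassert>
#include <vector>

struct ThreadModel {
  bool in_blocking_call = false;  // true while a call marked "blocking" runs
  std::vector<int> pending;       // async signals held for a later safe point
  std::vector<int> delivered;     // signals whose handler has been run
};

// Models the runtime's signal entry point for async signals.
inline void OnAsyncSignal(ThreadModel &t, int sig) {
  if (t.in_blocking_call)
    t.delivered.push_back(sig);   // thread is parked: run the handler now
  else
    t.pending.push_back(sig);     // otherwise defer the handler
}

// Runs every deferred handler; models reaching a safe point.
inline void FlushPending(ThreadModel &t) {
  for (int sig : t.pending) t.delivered.push_back(sig);
  t.pending.clear();
}

// RAII guard standing in for the role BLOCK_REAL plays in the patch:
// the wrapped call is treated as blocking for the guard's lifetime.
class BlockingScope {
 public:
  explicit BlockingScope(ThreadModel &t) : t_(t) {
    FlushPending(t_);             // deliver anything queued before blocking
    t_.in_blocking_call = true;
  }
  ~BlockingScope() { t_.in_blocking_call = false; }
  BlockingScope(const BlockingScope &) = delete;
  BlockingScope &operator=(const BlockingScope &) = delete;

 private:
  ThreadModel &t_;
};

// Returns true if a signal that arrives while the thread is "waiting for a
// lock" is delivered during the wait rather than being deferred.
inline bool DeliveredWhileWaiting(bool lock_marked_blocking) {
  ThreadModel t;
  const int kSig = 1;             // arbitrary stand-in for a signal number
  if (lock_marked_blocking) {
    BlockingScope scope(t);       // lock call is marked as blocking
    OnAsyncSignal(t, kSig);       // signal arrives during the wait
    return !t.delivered.empty();
  }
  OnAsyncSignal(t, kSig);         // lock call is not marked: signal deferred
  return !t.delivered.empty();
}
```

In this model, `DeliveredWhileWaiting(false)` corresponds to the situation before the patch: the signal sits in `pending` for as long as the thread stays parked on the lock. `DeliveredWhileWaiting(true)` corresponds to the patched interceptors, where the handler runs while the thread is still waiting.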


github-actions bot commented Mar 6, 2024

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@canova canova changed the title from "Mark pthread_*_lock functions as blocking" to "[tsan] Mark pthread_*_lock functions as blocking" Mar 6, 2024

llvmbot commented Mar 6, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Nazım Can Altınova (canova)

Full diff: https://github.com/llvm/llvm-project/pull/84162.diff

2 Files Affected:

  • (modified) compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (+5-5)
  • (added) compiler-rt/test/tsan/signal_in_mutex_lock.cpp (+71)
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index a9f6673ac44e90..4e92c2dbba11cf 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -1340,7 +1340,7 @@ TSAN_INTERCEPTOR(int, pthread_mutex_destroy, void *m) {
 TSAN_INTERCEPTOR(int, pthread_mutex_lock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_mutex_lock, m);
   MutexPreLock(thr, pc, (uptr)m);
-  int res = REAL(pthread_mutex_lock)(m);
+  int res = BLOCK_REAL(pthread_mutex_lock)(m);
   if (res == errno_EOWNERDEAD)
     MutexRepair(thr, pc, (uptr)m);
   if (res == 0 || res == errno_EOWNERDEAD)
@@ -1385,7 +1385,7 @@ TSAN_INTERCEPTOR(int, pthread_mutex_clocklock, void *m,
                  __sanitizer_clockid_t clock, void *abstime) {
   SCOPED_TSAN_INTERCEPTOR(pthread_mutex_clocklock, m, clock, abstime);
   MutexPreLock(thr, pc, (uptr)m);
-  int res = REAL(pthread_mutex_clocklock)(m, clock, abstime);
+  int res = BLOCK_REAL(pthread_mutex_clocklock)(m, clock, abstime);
   if (res == errno_EOWNERDEAD)
     MutexRepair(thr, pc, (uptr)m);
   if (res == 0 || res == errno_EOWNERDEAD)
@@ -1403,7 +1403,7 @@ TSAN_INTERCEPTOR(int, pthread_mutex_clocklock, void *m,
 TSAN_INTERCEPTOR(int, __pthread_mutex_lock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(__pthread_mutex_lock, m);
   MutexPreLock(thr, pc, (uptr)m);
-  int res = REAL(__pthread_mutex_lock)(m);
+  int res = BLOCK_REAL(__pthread_mutex_lock)(m);
   if (res == errno_EOWNERDEAD)
     MutexRepair(thr, pc, (uptr)m);
   if (res == 0 || res == errno_EOWNERDEAD)
@@ -1446,7 +1446,7 @@ TSAN_INTERCEPTOR(int, pthread_spin_destroy, void *m) {
 TSAN_INTERCEPTOR(int, pthread_spin_lock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_spin_lock, m);
   MutexPreLock(thr, pc, (uptr)m);
-  int res = REAL(pthread_spin_lock)(m);
+  int res = BLOCK_REAL(pthread_spin_lock)(m);
   if (res == 0) {
     MutexPostLock(thr, pc, (uptr)m);
   }
@@ -1521,7 +1521,7 @@ TSAN_INTERCEPTOR(int, pthread_rwlock_timedrdlock, void *m, void *abstime) {
 TSAN_INTERCEPTOR(int, pthread_rwlock_wrlock, void *m) {
   SCOPED_TSAN_INTERCEPTOR(pthread_rwlock_wrlock, m);
   MutexPreLock(thr, pc, (uptr)m);
-  int res = REAL(pthread_rwlock_wrlock)(m);
+  int res = BLOCK_REAL(pthread_rwlock_wrlock)(m);
   if (res == 0) {
     MutexPostLock(thr, pc, (uptr)m);
   }
diff --git a/compiler-rt/test/tsan/signal_in_mutex_lock.cpp b/compiler-rt/test/tsan/signal_in_mutex_lock.cpp
new file mode 100644
index 00000000000000..ec99e23198400f
--- /dev/null
+++ b/compiler-rt/test/tsan/signal_in_mutex_lock.cpp
@@ -0,0 +1,71 @@
+// RUN: %clang_tsan %s -lstdc++ -o %t && %run %t 2>&1 | FileCheck %s
+
+#include "test.h"
+#include <pthread.h>
+#include <signal.h>
+#include <stdio.h>
+
+#include <cassert>
+#include <condition_variable>
+#include <mutex>
+
+std::mutex sampler_mutex; //dummy mutex to lock in the thread we spawn.
+std::mutex done_mutex;    // guards the cv and done variables.
+std::condition_variable cv;
+bool done = false;
+
+void *ThreadFunc(void *x) {
+  while (true) {
+    // Lock the mutex
+    std::lock_guard<std::mutex> guard(sampler_mutex);
+    // Mutex is released at the end
+  }
+
+  return nullptr;
+}
+
+static void SigprofHandler(int signal, siginfo_t *info, void *context) {
+  // Assuming we did some work, change the variable to let the main thread
+  // know that we are done.
+  {
+    std::unique_lock<std::mutex> lck(done_mutex);
+    done = true;
+    cv.notify_one();
+  }
+}
+
+int main() {
+  alarm(60); // Kill the test if it hangs.
+
+  // Install the signal handler
+  struct sigaction sa;
+  sa.sa_sigaction = SigprofHandler;
+  sigemptyset(&sa.sa_mask);
+  sa.sa_flags = SA_RESTART | SA_SIGINFO;
+  if (sigaction(SIGPROF, &sa, 0) != 0) {
+    fprintf(stderr, "failed to install signal handler\n");
+    abort();
+  }
+
+  // Spawn a thread that will just loop and get the mutex lock:
+  pthread_t thread;
+  pthread_create(&thread, NULL, ThreadFunc, NULL);
+
+  // Lock the mutex before sending the signal
+  std::lock_guard<std::mutex> guard(sampler_mutex);
+  // From now on thread 1 will be waiting for the lock
+
+  // Send the SIGPROF signal to thread.
+  int r = pthread_kill(thread, SIGPROF);
+  assert(r == 0);
+
+  // Wait until signal handler sends the data.
+  std::unique_lock lk(done_mutex);
+  cv.wait(lk, [] { return done; });
+
+  // We got the done variable from the signal handler. Exiting successfully.
+  fprintf(stderr, "PASS\n");
+}
+
+// CHECK-NOT: WARNING: ThreadSanitizer:
+// CHECK: PASS

canova commented Mar 26, 2024

@dvyukov Thanks a lot for the review! I don't have permissions to merge PRs, would you mind merging this and #86537 also?

@dvyukov dvyukov merged commit f01377d into llvm:main Mar 26, 2024
8 checks passed

@canova Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

dvyukov pushed a commit that referenced this pull request Mar 26, 2024
…tsan (#86537)

Fixes #83844.

This PR adds callbacks to mark futex syscalls as blocking. Unfortunately
we didn't have a mechanism to mark syscalls as blocking calls before,
so I had to implement one, but it mostly reuses the `BlockingCall`
implementation
[here](https://github.com/llvm/llvm-project/blob/96819daa3d095cf9f662e0229dc82eaaa25480e8/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp#L362-L380).

The issue has more information. The problem was discovered because Rust
uses futexes directly, so most likely we need to update Rust as well to
use these callbacks.

Also see the latest comments in #85188 for some context.
I also sent another PR #84162 to mark `pthread_*_lock` calls as
blocking.
@canova canova deleted the tsan-signal-mutex branch March 28, 2024 15:25
Successfully merging this pull request may close these issues.

TSan: async signals are never being delivered when the target thread is blocked waiting for a mutex lock