Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiler tries to acquire mutex in signal handler #51124

Closed
mkustermann opened this issue Jan 25, 2023 · 6 comments
Closed

Profiler tries to acquire mutex in signal handler #51124

mkustermann opened this issue Jan 25, 2023 · 6 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. P1 A high priority bug; for example, a single project is unusable or has many test failures type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)

Comments

@mkustermann
Copy link
Member

From an internal crash:

F0124 20:17:02.002853  527579 mutex.cc:2080] RAW: Check waitp->thread->waitp == nullptr || waitp->thread->suppress_fatal_errors failed: detected illegal recursion into Mutex code
    @     0x556446335439  absl::raw_log_internal::RawLog()
    @     0x5564462ef943  absl::Mutex::LockSlowLoop()
    @     0x5564462eeb8b  absl::Mutex::LockSlowWithDeadline()
    @     0x5564462f177e  absl::Mutex::LockSlow()
    @     0x5564462edf7e  absl::Mutex::Lock()
    @     0x55644626b629  dart::SampleBlockBuffer::ReserveSampleImpl()
    @     0x55644626c3df  dart::Profiler::SampleThread()
    @     0x5564462d6c80  dart::ThreadInterrupterLinux::ThreadInterruptSignalHandler()
    @     0x7fa2ffbb61c0  __restore_rt
    @     0x556446023e6c  AbslInternalPerThreadSemWait
    @     0x5564462f0e6c  absl::CondVar::WaitCommon()
    @     0x556446267aac  dart::Monitor::Wait()
    @     0x556446147d9e  dart::SafepointHandler::ExitSafepointUsingLock()
    @     0x5564461b7ffd  dart::MessageHandler::PauseAndHandleAllMessages()
    @     0x556446112e39  Dart_WaitForEvent
    @     0x5564460f0e49  dart::bin::Builtin_CLI_WaitForEvent()
    @     0x5564461cb4b8  dart::NativeEntry::AutoScopeNativeCallWrapperNoStackCheck()
    @     0x7fa2feb72ed3  _kDartVmSnapshotInstructions
    @     0x7fa2fecd4c69  _kDartIsolateSnapshotInstructions
    @     0x7fa2fecd4c2e  _kDartIsolateSnapshotInstructions
    @     0x7fa2fecd4b45  _kDartIsolateSnapshotInstructions
    @     0x7fa2fecd47a0  _kDartIsolateSnapshotInstructions

Signal handlers should only call functions that are async signal safe, see manpage

/cc @rmacnak-google @bkonyi

@mkustermann mkustermann added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) labels Jan 25, 2023
@rmacnak-google
Copy link
Contributor

We have a longer list of restrictions documented on SampleThread:

  // SampleThread is called from inside the signal handler and hence it is very
  // critical that the implementation of SampleThread does not do any of the
  // following:
  //   * Accessing TLS -- Because on Windows and Fuchsia the callback will be
  //                      running in a different thread.
  //   * Allocating memory -- Because this takes locks which may already be
  //                          held, resulting in a dead lock.
  //   * Taking a lock -- See above.
  static void SampleThread(Thread* thread, const InterruptedThreadState& state);

@mkustermann
Copy link
Member Author

Wasn't really involved in this, but I thought to recall that VM used to setup a preallocated ring-buffer where signal handlers can readily write into. Is this no longer the case?

@bkonyi would this be something you could look at?

@bkonyi
Copy link
Contributor

bkonyi commented Jan 25, 2023

I don't really have any cycles right now. Maybe @rmacnak-google can take a look?

@a-siva
Copy link
Contributor

a-siva commented Jan 30, 2023

This locking code seems to have been added here https://dart-review.googlesource.com/c/sdk/+/206920

@jacob314 jacob314 added the P1 A high priority bug; for example, a single project is unusable or has many test failures label Feb 15, 2023
@jacob314
Copy link
Member

This bug is slowing us down debugging a P0 issue in g3.

@jacob314 jacob314 removed the P1 A high priority bug; for example, a single project is unusable or has many test failures label Feb 15, 2023
@rmacnak-google rmacnak-google self-assigned this Feb 15, 2023
@a-siva a-siva added the P1 A high priority bug; for example, a single project is unusable or has many test failures label Feb 15, 2023
@rmacnak-google
Copy link
Contributor

copybara-service bot pushed a commit that referenced this issue Feb 21, 2023
Add asserts against using mutexes or monitors during the signal handler or suspended thread scopes. Poison use of Thread::Current during these scopes.

TEST=ci
Bug: #51124
Change-Id: If1df06520114105b2b4d8c81b4650bdb4efeaf50
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/283703
Reviewed-by: Ben Konyi <bkonyi@google.com>
Commit-Queue: Ryan Macnak <rmacnak@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. P1 A high priority bug; for example, a single project is unusable or has many test failures type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)
Projects
None yet
Development

No branches or pull requests

5 participants