Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Non-deterministic hanging for setTimeout and other async operations when using --prof (v0.12.x, OS X 10.9.5) #14576

Closed
esbena opened this issue Apr 9, 2015 · 9 comments

Comments

@esbena
Copy link

esbena commented Apr 9, 2015

setTimeout, fs.readFile and other async functions sometimes makes the entire node process hang permanently when the --prof flag is used on OS X.

Environments where issue occurs:

  • OS X 10.9.5 with node v0.12.0
  • OS X 10.9.5 with node v0.12.1
  • OS X 10.9.5 with node v0.12.2

Environments where issue does not occur:

  • OS X 10.9.5 with node v0.10.38
  • Ubuntu 14.04.2 with node v0.12.2

To reproduce, run this script with node v.0.12.x on the path:

#!/usr/bin/env bash

REPEAT=100;

echo "Running without --prof ${REPEAT} times"
for i in `seq ${REPEAT}`; do
    echo -n "$i "
    node -e "setTimeout('', 50)";
done
echo
echo "Running with --prof ${REPEAT} times"
for i in `seq ${REPEAT}`; do
    echo -n "$i "
    node --prof -e "setTimeout('', 50)";
done

Sample output, hanging after 9 iterations:

$ ./run.sh
Running without --prof 100 times
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 
Running with --prof 100 times
1 2 3 4 5 6 7 8 9   C-c
@misterdjules
Copy link

Thank you for reporting this issue!

This is the stack trace that I get for all threads:

lldb) thread backtrace all
  thread #1: tid = 0x135bb5, 0x00007fff9367b746 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread'
    frame #0: 0x00007fff9367b746 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff89651779 libsystem_pthread.dylib`_pthread_mutex_lock + 372
    frame #2: 0x0000000100372f4e node`v8::internal::Isolate::EnsureInitialized() [inlined] v8::base::LockGuard<v8::base::Mutex>::LockGuard(mutex=<unavailable>) + 62 at mutex.h:204
    frame #3: 0x0000000100372f42 node`v8::internal::Isolate::EnsureInitialized() [inlined] v8::base::LazyInstanceImpl<v8::base::Mutex, v8::base::StaticallyAllocatedInstanceTrait<v8::base::Mutex>, v8::base::DefaultConstructTrait<v8::base::Mutex>, v8::base::ThreadSafeInitOnceTrait, v8::base::LeakyInstanceTrait<v8::base::Mutex> >::Pointer() + 46 at mutex.h:204
    frame #4: 0x0000000100372f14 node`v8::internal::Isolate::EnsureInitialized() + 4 at isolate.cc:150
    frame #5: 0x0000000100372ecb node`v8::internal::ThreadId::GetCurrentThreadId() + 11 at isolate.cc:50
    frame #6: 0x0000000100474277 node`v8::internal::SignalHandler::HandleProfilerSignal(int, __siginfo*, void*) [inlined] v8::internal::ThreadId::Current() + 87 at isolate.h:167
    frame #7: 0x0000000100474272 node`v8::internal::SignalHandler::HandleProfilerSignal(int, __siginfo*, void*) [inlined] v8::internal::Isolate::thread_manager(this=0x0000000100f07110) + 7 at v8threads.h:80
    frame #8: 0x000000010047426b node`v8::internal::SignalHandler::HandleProfilerSignal(signal=<unavailable>, info=<unavailable>, context=0x00007fff5fbfe0b0) + 75 at sampler.cc:346
    frame #9: 0x00007fff906cb5aa libsystem_platform.dylib`_sigtramp + 26
    frame #10: 0x00000001005b2be6 node`v8::base::Mutex::Unlock(this=0x00000001009a6410) + 6 at mutex.cc:0
    frame #11: 0x000000010037f39a node`v8::internal::AssertionNode::FillInBMInfo(this=0x000000010382f2a8, offset=58914384, budget=<unavailable>, bm=0x0000000000000000, not_at_start=<unavailable>) + 58 at jsregexp.cc:2332
    frame #12: 0x000000010037f315 node`v8::internal::ActionNode::FillInBMInfo(this=0x000000010382f4b0, offset=0, budget=<unavailable>, bm=0x000000010382f650, not_at_start=false) + 165 at jsregexp.cc:2304
    frame #13: 0x0000000100383ffb node`v8::internal::ChoiceNode::Emit(this=0x0000000000000001, compiler=0x00007fff5fbfe8d8, trace=0x00007fff5fbfe6a8) + 2571 at jsregexp.cc:4005
    frame #14: 0x000000010037dc77 node`v8::internal::RegExpCompiler::Assemble(this=0x00007fff5fbfe8d8, macro_assembler=<unavailable>, start=0x000000010382f528, capture_count=<unavailable>, pattern=<unavailable>) + 327 at jsregexp.cc:1124
    frame #15: 0x000000010037cbe8 node`v8::internal::RegExpEngine::Compile(data=<unavailable>, ignore_case=<unavailable>, is_global=<unavailable>, is_multiline=<unavailable>, is_ascii=<unavailable>, zone=<unavailable>, pattern=<unavailable>, sample_subject=<unavailable>) + 2088 at jsregexp.cc:6106
    frame #16: 0x000000010037be5e node`v8::internal::RegExpImpl::CompileIrregexp(is_ascii=true, re=<unavailable>, sample_subject=<unavailable>) + 638 at jsregexp.cc:439
    frame #17: 0x000000010037cea5 node`v8::internal::RegExpImpl::IrregexpPrepare(v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>) [inlined] v8::internal::RegExpImpl::EnsureCompiledIrregexp(v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>, bool) + 279 at jsregexp.cc:371
    frame #18: 0x000000010037cd8e node`v8::internal::RegExpImpl::IrregexpPrepare(regexp=<unavailable>, subject=<unavailable>) + 222 at jsregexp.cc:516
    frame #19: 0x000000010037b781 node`v8::internal::RegExpImpl::IrregexpExec(previous_index=0, regexp=<unavailable>, subject=<unavailable>, last_match_info=<unavailable>) + 49 at jsregexp.cc:635
    frame #20: 0x0000000100433c5d node`v8::internal::Runtime_RegExpExecRT(int, v8::internal::Object**, v8::internal::Isolate*) [inlined] v8::internal::__RT_impl_Runtime_RegExpExecRT(isolate=0x0000000101004c00, arguments=0x00007fff5fbfef98) + 102 at runtime.cc:2492
    frame #21: 0x0000000100433bf7 node`v8::internal::Runtime_RegExpExecRT(args_length=<unavailable>, args_object=0x00007fff5fbfef98, isolate=0x0000000101004c00) + 263 at runtime.cc:2479
    frame #22: 0x00000814f870740e
    frame #23: 0x00000814f8784877
    frame #24: 0x00000814f8782f9a
    frame #25: 0x00000814f878262b
    frame #26: 0x00000814f8707b75
    frame #27: 0x00000814f87608fb
    frame #28: 0x00000814f875fec6
    frame #29: 0x00000814f8780cfd
    frame #30: 0x00000814f8707b75
    frame #31: 0x00000814f87608fb
    frame #32: 0x00000814f875fec6
    frame #33: 0x00000814f877db10
    frame #34: 0x00000814f8707b75
    frame #35: 0x00000814f87608fb
    frame #36: 0x00000814f875fec6
    frame #37: 0x00000814f877a15a
    frame #38: 0x00000814f8707b75
    frame #39: 0x00000814f87608fb
    frame #40: 0x00000814f875fec6
    frame #41: 0x00000814f87795b0
    frame #42: 0x00000814f875f28f
    frame #43: 0x00000814f875e090
    frame #44: 0x00000814f8756fe0
    frame #45: 0x00000814f8728891
    frame #46: 0x0000000100217ef3 node`v8::internal::Invoke(is_construct=<unavailable>, argc=1, args=0x00007fff5fbff7c8, function=<unavailable>, receiver=<unavailable>) + 419 at execution.cc:91
    frame #47: 0x000000010012b65f node`v8::Function::Call(this=0x0000000102806418, argc=1, argv=0x00007fff5fbff7c8, recv=<unavailable>) + 207 at api.cc:4019
    frame #48: 0x0000000100543643 node`node::LoadEnvironment(env=0x0000000102026200) + 517 at node.cc:2875
    frame #49: 0x00000001005449f3 node`node::Start(argc=1, argv=<unavailable>) + 305 at node.cc:3756
    frame #50: 0x0000000100001774 node`start + 52

  thread #2: tid = 0x135bb6, 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'v8:Profiler'
    frame #0: 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x00000001005b2df7 node`v8::base::Semaphore::Wait(this=0x0000000101841044) + 23 at semaphore.cc:46
    frame #2: 0x00000001003a03a9 node`v8::internal::Profiler::Run() [inlined] v8::internal::Profiler::Remove(this=0x0000000101800000, sample=0x00007fff00000002) + 8 at log.cc:625
    frame #3: 0x00000001003a03a1 node`v8::internal::Profiler::Run(this=0x0000000101800000) + 241 at log.cc:755
    frame #4: 0x00000001005b4817 node`v8::base::ThreadEntry(void*) [inlined] v8::base::Thread::NotifyStartedAndRun() + 23 at platform.h:510
    frame #5: 0x00000001005b4800 node`v8::base::ThreadEntry(arg=0x0000000101800000) + 64 at platform-posix.cc:569
    frame #6: 0x00007fff8964e899 libsystem_pthread.dylib`_pthread_body + 138
    frame #7: 0x00007fff8964e72a libsystem_pthread.dylib`_pthread_start + 137
    frame #8: 0x00007fff89652fc9 libsystem_pthread.dylib`thread_start + 13

* thread #3: tid = 0x135bb7, 0x00000001005b3ff4 node`v8::base::OS::Sleep(milliseconds=1) + 4 at platform-posix.cc:273, name = 'SamplerThread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001005b3ff4 node`v8::base::OS::Sleep(milliseconds=1) + 4 at platform-posix.cc:273
    frame #1: 0x0000000100474bb1 node`v8::internal::SamplerThread::Run(this=0x0000000100f09ee0) + 33 at sampler.cc:553
    frame #2: 0x00000001005b4817 node`v8::base::ThreadEntry(void*) [inlined] v8::base::Thread::NotifyStartedAndRun() + 23 at platform.h:510
    frame #3: 0x00000001005b4800 node`v8::base::ThreadEntry(arg=0x0000000100f09ee0) + 64 at platform-posix.cc:569
    frame #4: 0x00007fff8964e899 libsystem_pthread.dylib`_pthread_body + 138
    frame #5: 0x00007fff8964e72a libsystem_pthread.dylib`_pthread_start + 137
    frame #6: 0x00007fff89652fc9 libsystem_pthread.dylib`thread_start + 13

  thread #4: tid = 0x135bb8, 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'OptimizingCompi'
    frame #0: 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x00000001005b2df7 node`v8::base::Semaphore::Wait(this=0x000000010190144c) + 23 at semaphore.cc:46
    frame #2: 0x00000001003fbdf5 node`v8::internal::OptimizingCompilerThread::Run(this=0x0000000101901410) + 101 at optimizing-compiler-thread.cc:47
    frame #3: 0x00000001005b4817 node`v8::base::ThreadEntry(void*) [inlined] v8::base::Thread::NotifyStartedAndRun() + 23 at platform.h:510
    frame #4: 0x00000001005b4800 node`v8::base::ThreadEntry(arg=0x0000000101901410) + 64 at platform-posix.cc:569
    frame #5: 0x00007fff8964e899 libsystem_pthread.dylib`_pthread_body + 138
    frame #6: 0x00007fff8964e72a libsystem_pthread.dylib`_pthread_start + 137
    frame #7: 0x00007fff89652fc9 libsystem_pthread.dylib`thread_start + 13

  thread #5: tid = 0x135bb9, 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'v8:SweeperThrea'
    frame #0: 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x00000001005b2df7 node`v8::base::Semaphore::Wait(this=0x0000000101901658) + 23 at semaphore.cc:46
    frame #2: 0x00000001002e7b17 node`v8::internal::SweeperThread::Run(this=0x0000000101901610) + 39 at sweeper-thread.cc:37
    frame #3: 0x00000001005b4817 node`v8::base::ThreadEntry(void*) [inlined] v8::base::Thread::NotifyStartedAndRun() + 23 at platform.h:510
    frame #4: 0x00000001005b4800 node`v8::base::ThreadEntry(arg=0x0000000101901610) + 64 at platform-posix.cc:569
    frame #5: 0x00007fff8964e899 libsystem_pthread.dylib`_pthread_body + 138
    frame #6: 0x00007fff8964e72a libsystem_pthread.dylib`_pthread_start + 137
    frame #7: 0x00007fff89652fc9 libsystem_pthread.dylib`thread_start + 13

  thread #6: tid = 0x135bba, 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'v8:SweeperThrea'
    frame #0: 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x00000001005b2df7 node`v8::base::Semaphore::Wait(this=0x0000000101901708) + 23 at semaphore.cc:46
    frame #2: 0x00000001002e7b17 node`v8::internal::SweeperThread::Run(this=0x00000001019016c0) + 39 at sweeper-thread.cc:37
    frame #3: 0x00000001005b4817 node`v8::base::ThreadEntry(void*) [inlined] v8::base::Thread::NotifyStartedAndRun() + 23 at platform.h:510
    frame #4: 0x00000001005b4800 node`v8::base::ThreadEntry(arg=0x00000001019016c0) + 64 at platform-posix.cc:569
    frame #5: 0x00007fff8964e899 libsystem_pthread.dylib`_pthread_body + 138
    frame #6: 0x00007fff8964e72a libsystem_pthread.dylib`_pthread_start + 137
    frame #7: 0x00007fff89652fc9 libsystem_pthread.dylib`thread_start + 13

  thread #7: tid = 0x135bbb, 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'v8:SweeperThrea'
    frame #0: 0x00007fff93677a56 libsystem_kernel.dylib`semaphore_wait_trap + 10
    frame #1: 0x00000001005b2df7 node`v8::base::Semaphore::Wait(this=0x00000001019017b8) + 23 at semaphore.cc:46
    frame #2: 0x00000001002e7b17 node`v8::internal::SweeperThread::Run(this=0x0000000101901770) + 39 at sweeper-thread.cc:37
    frame #3: 0x00000001005b4817 node`v8::base::ThreadEntry(void*) [inlined] v8::base::Thread::NotifyStartedAndRun() + 23 at platform.h:510
    frame #4: 0x00000001005b4800 node`v8::base::ThreadEntry(arg=0x0000000101901770) + 64 at platform-posix.cc:569
    frame #5: 0x00007fff8964e899 libsystem_pthread.dylib`_pthread_body + 138
    frame #6: 0x00007fff8964e72a libsystem_pthread.dylib`_pthread_start + 137
    frame #7: 0x00007fff89652fc9 libsystem_pthread.dylib`thread_start + 13
(lldb) 

I have tested that this would be fixed by upgrading to V8 3.29.93.1 or newer, but I haven't had the time to investigate if we could isolate a specific fix and land it without breaking V8's API for v0.12.x.

@misterdjules
Copy link

It also occurs on SmartOS, it's definitely not specific to OSX.

@bmustata
Copy link

Same issue in here for Mac OSX.

@irengrig
Copy link

+1
Unfortunately, seems that this issue prevents using Node.js profiling feature for JetBrains WebStorm/IntelliJ IDEA
(https://www.jetbrains.com/webstorm/whatsnew/#v8profiling)

dmelikyan added a commit to dmelikyan/node that referenced this issue May 14, 2015
A deadlock happens when sampler initiated by SIGPROF tries to lock
the thread and the thread is already locked by the same thread. As
a result, other thread involved in sampling process hangs. The
patch adds a check for thread lock before continuing sampler
operation.

The fix has been tested on a sample app under load with and without
profiling turned on.

Fixes issue nodejs#14576 and specifically the duplicate issue nodejs#25295
@dmelikyan
Copy link

I've run the script by @esbena against the patch #25309 and it works fine too. Any chance to land it in 0.12?

@misterdjules misterdjules modified the milestones: 0.12.4, 0.12.5 May 25, 2015
@misterdjules
Copy link

@dmelikyan Thank you for #25309! As @mhdawson mentioned, @tunniclm will be reviewing your change asap.

@weekens
Copy link

weekens commented Jun 5, 2015

Reproduced this issue on Ubuntu 14.04.2 (x64) with Node v0.12.4. The patch #25309 has resolved the issue.

@misterdjules
Copy link

@weekens Thank you for the heads up!

dmelikyan added a commit to dmelikyan/node that referenced this issue Jun 8, 2015
A deadlock happens when sampler initiated by SIGPROF tries to lock
the thread and the thread is already locked by the same thread. As
a result, other thread involved in sampling process hangs. The
patch adds a check for thread lock before continuing sampler
operation.

The fix has been tested on a sample app under load with and without
profiling turned on.

Fixes issue nodejs#14576 and specifically the duplicate issue nodejs#25295
dmelikyan added a commit to dmelikyan/node that referenced this issue Jun 10, 2015
A deadlock happens when sampler initiated by SIGPROF tries to lock
the thread and the thread is already locked by the same thread. As
a result, other thread involved in sampling process hangs. The
patch adds a check for thread lock before continuing sampler
operation.

The fix has been tested on a sample app under load with and without
profiling turned on.

Fixes issue nodejs#14576 and specifically the duplicate issue nodejs#25295
misterdjules pushed a commit that referenced this issue Jun 10, 2015
A deadlock happens when sampler initiated by SIGPROF tries to lock
the thread and the thread is already locked by the same thread. As
a result, other thread involved in sampling process hangs. The
patch adds a check for thread lock before continuing sampler
operation.

The fix has been tested on a sample app under load with and without
profiling turned on.

Fixes issue #14576 and specifically the duplicate issue #25295

Reviewed-By: Julien Gilli <julien.gilli@joyent.com>
PR-URL: #25309
@misterdjules
Copy link

Fixed by b81a643, thank you all 👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants