Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage Collection Crash #8216

Closed
sehz opened this issue Aug 21, 2016 · 19 comments
Closed

Garbage Collection Crash #8216

sehz opened this issue Aug 21, 2016 · 19 comments
Labels
async_hooks Issues and PRs related to the async hooks subsystem. tls Issues and PRs related to the tls subsystem. v8 engine Issues and PRs related to the V8 dependency.

Comments

@sehz
Copy link

sehz commented Aug 21, 2016

  • Version: v6.4.0:
  • Platform: Mac 15.6.0 Darwin Kernel Version 15.6.0::
  • ** OS X Version 10.11.6 (15G31)**:

Node crashed with following stack trace:

#
# Fatal error in ../deps/v8/src/execution.cc, line 64
# Check failed: AllowJavascriptExecution::IsAllowed(isolate).
#

==== C stack trace ===============================

 1: V8_Fatal
 2: v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::Object>)
 3: v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*)
 4: v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*)
 5: node::TLSWrap::~TLSWrap()
 6: node::TLSWrap::~TLSWrap()
 7: v8::internal::GlobalHandles::DispatchPendingPhantomCallbacks(bool)
 8: v8::internal::GlobalHandles::PostGarbageCollectionProcessing(v8::internal::GarbageCollector, v8::GCCallbackFlags)
 9: v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags)
10: v8::internal::Heap::CollectGarbage(v8::internal::GarbageCollector, char const*, char const*, v8::GCCallbackFlags)
11: v8::internal::Factory::NewCodeRaw(int, bool)
12: v8::internal::Factory::NewCode(v8::internal::CodeDesc const&, unsigned int, v8::internal::Handle<v8::internal::Object>, bool, bool, int, bool)
13: v8::internal::CodeGenerator::MakeCodeEpilogue(v8::internal::MacroAssembler*, v8::internal::CompilationInfo*)
14: v8::internal::LChunk::Codegen()
15: v8::internal::OptimizedCompileJob::GenerateCode()
16: v8::internal::Compiler::GetConcurrentlyOptimizedCode(v8::internal::OptimizedCompileJob*)
17: v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions()
18: v8::internal::StackGuard::HandleInterrupts()
19: v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*)
20: 0xe170e00961b
21: 0xe170e6cb2e5

Process finished with exit code 132 (interrupted by signal 4: SIGILL)

@vkurchatkin vkurchatkin added the tls Issues and PRs related to the tls subsystem. label Aug 21, 2016
@mscdex mscdex added the v8 engine Issues and PRs related to the V8 dependency. label Aug 21, 2016
@mscdex
Copy link
Contributor

mscdex commented Aug 21, 2016

/cc @indutny

@addaleax
Copy link
Member

@sehyoc Thanks for reporting this! Is there any chance you can offer more information? Is the problem reproducible? Do you have something like a test script that can demonstrate the issue?

@mscdex
Copy link
Contributor

mscdex commented Aug 22, 2016

Also: do you know if there is a node version where the same code was working, or is this the first version of node you tried?

@indutny
Copy link
Member

indutny commented Aug 22, 2016

I have no other suspects but the AsyncWrap class. @trevnorris looks like calling JS functions is no longer allowed from the ~AsyncWrap, is it something that we should look into?

@bnoordhuis
Copy link
Member

I agree with Fedor that the call into the VM from the AsyncWrap destructor is responsible.

@sehyoc Something in your project (either your own code or a third-party module) is calling process.binding('async_wrap').setupHooks(). Try disabling that for now as a workaround.

@sehz
Copy link
Author

sehz commented Aug 22, 2016

This problem is reproducible but not everytime. Crashes happen 1 of 5 times during startup. Before I was running v6.3.1 which also crashed but much less frequency.

We are using babel with async plug-in. Since we are heavily relying on 'async' keyword, there is no way to turn this off.

@addaleax
Copy link
Member

addaleax commented Aug 23, 2016

@sehyoc This is not about the async keyword, async_wrap is something very different. Are you sure nothing in your project is touching process.binding('async_wrap')? grep -R async_wrap . should be a reasonable way to check for that.

@sehz
Copy link
Author

sehz commented Aug 23, 2016

Yes, I found who was using async_wrap. It was async-hook package:

node_modules/async-hook/async-hook.js:const asyncWrap = process.binding('async_wrap'); Binary file node_modules/electron-prebuilt/dist/Electron.app/Contents/Frameworks/Electron Framework.framework/Libraries/libnode.dylib matches Binary file node_modules/electron-prebuilt/dist/Electron.app/Contents/Frameworks/Electron Framework.framework/Versions/A/Libraries/libnode.dylib matches Binary file node_modules/electron-prebuilt/dist/Electron.app/Contents/Frameworks/Electron Framework.framework/Versions/Current/Libraries/libnode.dylib matches

And furthermore, async-hook was used by:
node_modules/trace/package.json: "async-hook": "^1.0.0", node_modules/trace/trace.js:const asyncHook = require('async-hook');

I was using to get long stack trace. Current version of trace still using async-hook. I removed the package

@addaleax addaleax added the async_hooks Issues and PRs related to the async hooks subsystem. label Sep 17, 2016
bnoordhuis added a commit to bnoordhuis/io.js that referenced this issue Nov 4, 2016
It is not allowed anymore to call JS code when collecting weakly
persistent handles, it hits the assertion below:

    # Fatal error in ../deps/v8/src/execution.cc, line 103
    # Check failed: AllowJavascriptExecution::IsAllowed(isolate).

Remove the call into the VM from the AsyncWrap destructor.  This commit
breaks the destroy hook but that cannot be helped.

Fixes: nodejs#8216
AndreasMadsen added a commit to AndreasMadsen/trace that referenced this issue Nov 4, 2016
@gibfahn
Copy link
Member

gibfahn commented Nov 4, 2016

cc/ @nodejs/diagnostics

@nakedcity
Copy link

any resolution so far?

@indutny
Copy link
Member

indutny commented Nov 14, 2016

ping @trevnorris

@nakedcity
Copy link

nakedcity commented Nov 14, 2016

well just to add I see this is still happening in versions 7 too. Easy way for me to reproduce is to install any package using async hooks and fire few requests to make it crash, let me know if you need any help

@trevnorris
Copy link
Contributor

I have a fix ready, but don't have a reproducible test case for it. There a minimal one I can include?

@nakedcity
Copy link

nakedcity commented Nov 24, 2016

I easily reproduce using this package https://github.com/AndreasMadsen/trace/ (Uses async-hooks)
and firing more than 5 request with express.

If you give me the branch you are working on I can test my current scenario upon it

@trevnorris
Copy link
Contributor

@nakedcity link is #9753

@nakedcity
Copy link

Thanks @trevnorris I take a look this weekend

trevnorris added a commit to trevnorris/node that referenced this issue Dec 1, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

PR-URL: nodejs#9753
Fixes: nodejs#8216
Fixes: nodejs#9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
trevnorris added a commit to trevnorris/node that referenced this issue Dec 2, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

Fixes: nodejs#8216
Fixes: nodejs#9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
MylesBorins pushed a commit that referenced this issue Dec 2, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

PR-URL: #9753
Fixes: #8216
Fixes: #9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
trevnorris added a commit to trevnorris/node that referenced this issue Dec 2, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

Fixes: nodejs#8216
Fixes: nodejs#9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
trevnorris added a commit to trevnorris/node that referenced this issue Dec 2, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

Fixes: nodejs#8216
Fixes: nodejs#9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
MylesBorins pushed a commit that referenced this issue Dec 6, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

Fixes: #8216
Fixes: #9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
MylesBorins pushed a commit that referenced this issue Dec 6, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

PR-URL: #10096
Fixes: #8216
Fixes: #9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
jmdarling pushed a commit to jmdarling/node that referenced this issue Dec 8, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

PR-URL: nodejs#9753
Fixes: nodejs#8216
Fixes: nodejs#9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
MylesBorins pushed a commit that referenced this issue Dec 13, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

Fixes: #8216
Fixes: #9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
MylesBorins pushed a commit that referenced this issue Dec 21, 2016
Calling JS during GC is a no-no. So intead create a queue of all ids
that need to have their destroy() callback called and call them later.

Removed checking destroy() in test-async-wrap-uid because destroy() can
be called after the 'exit' callback.

Missing a reliable test to reproduce the issue that caused the
FATAL_ERROR.

Fixes: #8216
Fixes: #9465
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
@surajwy
Copy link

surajwy commented Nov 17, 2017

I am also experiencing a similar crash in PostGarbageCollectionProcessing call
I do not have any reproducible case. But it happens 1-2 times a day in production.
Whenever I have got this crash, we were connecting to AWS's endpoint.
I do not have any module which uses async_wrap

Node.js version: v6.11.0
OpenSSL version: OpenSSL 1.0.1k-fips
Platform: AWS EC2
OS: Amazon linux 4.9.27-14.31.amzn1.x86_64

(lldb) v8 bt
 * SBThread: tid = 0x0000  * frame #0: 0x000000000111b346 node`node::TLSWrap::~TLSWrap() + 358
    frame #1: 0x000000000111b421 node`node::TLSWrap::~TLSWrap() + 17
    frame #2: 0x0000000000c85e5c node`v8::internal::GlobalHandles::DispatchPendingPhantomCallbacks(bool) + 252
    frame #3: 0x0000000000c860ba node`v8::internal::GlobalHandles::PostGarbageCollectionProcessing(v8::internal::GarbageCollector, v8::GCCallbackFlags) + 42
    frame #4: 0x0000000000ca6e37 node`v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) + 535
    frame #5: 0x0000000000ca771a node`v8::internal::Heap::CollectGarbage(v8::internal::GarbageCollector, char const*, char const*, v8::GCCallbackFlags) + 330
    frame #6: 0x0000000000ca932e node`v8::internal::Heap::HandleGCRequest() + 222
    frame #7: 0x0000000000c55bac node`v8::internal::StackGuard::HandleInterrupts() + 796
    frame #8: 0x0000000000eb75fc node`v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*) + 380
    frame #9: 0x00001ee4063079a7 <exit>

@bnoordhuis
Copy link
Member

bnoordhuis commented Nov 17, 2017

@surajwy Can you open a new issue and include the contents of the registers and a disassembly of the top frame? Yours is probably a different issue than this bug report.

edit: by the way, is that with a binary from https://nodejs.org/?

@surajwy
Copy link

surajwy commented Nov 17, 2017

@bnoordhuis I have opened a new issue here: #17097
Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem. tls Issues and PRs related to the tls subsystem. v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

No branches or pull requests

10 participants