Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dd-trace causes crash in v8 internals, SIGABRT #1229

Closed
doronrk opened this issue Feb 2, 2021 · 11 comments · Fixed by #1233
Closed

dd-trace causes crash in v8 internals, SIGABRT #1229

doronrk opened this issue Feb 2, 2021 · 11 comments · Fixed by #1233
Labels
bug Something isn't working community core
Milestone

Comments

@doronrk
Copy link

doronrk commented Feb 2, 2021

Describe the bug

After upgrading to node 12.20.1, we are getting the below crash on our localhost consistently. DD_TRACE_ENABLED=false resolves the issue, so it looks like dd-trace is implicated.

W20210201-11:27:47.096(-8)? (STDERR) program[53589]: ../src/api/callback.cc:123:void node::InternalCallbackScope::Close(): Assertion `(env_->execution_async_id()) == (0)' failed.
W20210201-11:27:47.097(-8)? (STDERR)  1: 0x1011f62d5 node::Abort() (.cold.1) [program]
W20210201-11:27:47.098(-8)? (STDERR)  2: 0x1000a5a29 node::Abort() [program]
W20210201-11:27:47.098(-8)? (STDERR)  3: 0x1000a5891 node::Assert(node::AssertionInfo const&) [program]
W20210201-11:27:47.099(-8)? (STDERR)  4: 0x100001c96 node::InternalCallbackScope::Close() [program]
W20210201-11:27:47.099(-8)? (STDERR)  5: 0x10000223a node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) [program]
W20210201-11:27:47.100(-8)? (STDERR)  6: 0x100016598 node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [program]
W20210201-11:27:47.100(-8)? (STDERR)  7: 0x10015d560 node::StreamBase::CallJSOnreadMethod(long, v8::Local<v8::ArrayBuffer>, unsigned long, node::StreamBase::StreamBaseJSChecks) [program]
W20210201-11:27:47.101(-8)? (STDERR)  8: 0x10015f2d1 node::EmitToJSStreamListener::OnStreamRead(long, uv_buf_t const&) [program]
W20210201-11:27:47.102(-8)? (STDERR)  9: 0x1000c3224 node::(anonymous namespace)::Parser::OnStreamRead(long, uv_buf_t const&) [program]
W20210201-11:27:47.103(-8)? (STDERR) 10: 0x100163d6f node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [program]
W20210201-11:27:47.105(-8)? (STDERR) 11: 0x100975355 uv__stream_io [program]
W20210201-11:27:47.107(-8)? (STDERR) 12: 0x10097deac uv__io_poll [program]
W20210201-11:27:47.109(-8)? (STDERR) 13: 0x10096a651 uv_run [program]
W20210201-11:27:47.110(-8)? (STDERR) 14: 0x1000e8d97 node::NodeMainInstance::Run() [program]
W20210201-11:27:47.111(-8)? (STDERR) 15: 0x10007b0e9 node::Start(int, char**) [program]
W20210201-11:27:47.112(-8)? (STDERR) 16: 0x7fff6b74bcc9 start [/usr/lib/system/libdyld.dylib]
=> Exited from signal: SIGABRT
=> Your application is crashing. Waiting for file change.

Environment

localhost development

  • Operation system:

OS X 10.15.7

  • Node version:

12.20.1

  • Tracer version:

0.30.5

  • Agent version:

N/A, running on my localhost

@doronrk doronrk added the bug Something isn't working label Feb 2, 2021
@doronrk doronrk changed the title dd-trace crashes v8 internals, SIGABRT dd-trace causes crash in v8 internals, SIGABRT Feb 2, 2021
@rochdev
Copy link
Member

rochdev commented Feb 2, 2021

I wasn't able to reproduce the issue with a simple app, so a few follow up question:

  • Are you running directly on the host or in a Docker container?
  • Does the error happen right at startup or does it happen later?
  • Can you only replicate when using Meteor? What if you update only either the tracer or Meteor but not both? Does it happen with a brand new Meteor app?
  • What Node version did you upgrade from?
  • Any chance that you can provide a replication snippet?

@rochdev
Copy link
Member

rochdev commented Feb 2, 2021

Just tried with a brand new Meteor app and I still can't replicate the issue.

@doronrk
Copy link
Author

doronrk commented Feb 2, 2021

Are you running directly on the host or in a Docker container?

This is currently happening on all of our eng team's laptops running OS X.

Does the error happen right at startup or does it happen later?

It never happens right away, only after clicking around through our application a little bit.

Can you only replicate when using Meteor?

So far we have only replicated using our meteor application.

What if you update only either the tracer or Meteor but not both?

The crash happens on meteor 2.0 === node 12.20.1, regardless of which dd-trace version is used (only tried with 0.30.5 and 0.29.1)

The crash does not happen on meteor 1.11.1 === node 12.18.4, with dd-trace 0.29.1 - I have yet to test with only upgrading dd-trace to 0.35.5

Does it happen with a brand new Meteor app?

I have not tried yet.

What Node version did you upgrade from?

We upgraded from node 12.18.4

Any chance that you can provide a replication snippet?

I'll try my best to get around to this, but it will be tough for us. The most tempting route at this point is to hold off on upgrading meteor/node (which sucks) so we can focus on stuff that's more core to our business.

@rochdev
Copy link
Member

rochdev commented Feb 2, 2021

The error seems to come from a C++ check in async_hooks, so it's possible it's a bug in Node that is triggered by the combination of your app, the tracer, and async_hooks which we use for context propagation. What happens if you upgrade only Meteor and not Node? Also, do you get the same issue if you upgrade to latest 14 instead of latest 12?

@doronrk
Copy link
Author

doronrk commented Feb 2, 2021

What happens if you upgrade only Meteor and not Node?

My understanding is that the versions are tightly coupled. I'll check whether this is possible, but my understanding is that it is not.

Also, do you get the same issue if you upgrade to latest 14 instead of latest 12?

I don't believe there are any versions of meteor that use node 14. Meteor 2.0 is the latest release, which uses node 12

@pozylon
Copy link

pozylon commented Feb 3, 2021

I have the same issue, had to DD_TRACE_ENABLED: 0, it's not meteor 2.0, downgrading does not help

@bengl
Copy link
Collaborator

bengl commented Feb 3, 2021

Hi @pozylon!

We believe the issue is caused by using AsyncLocalStorage to track async context. This might be a bug upstream in Node.js, or in Meteor, or in the Datadog code, but @doronrk has confirmed that initializing with the async_resource scope manager instead (e.g. via tracer.init({ scope: 'async_resource' })) is a workaround for now.

If either of you can help us reproduce this, it would be super helpful for tracking down the bug so it can be fully fixed.

@rochdev
Copy link
Member

rochdev commented Feb 4, 2021

The issue should be fixed in 0.30.6. However, as mentioned above, the fix was to stop using AsyncLocalStorage meaning that this is only working around the issue. It's possible that we eventually make changes to our code that cause a regression unless we find the actual issue. It would be appreciated if one of you can provide a replication snippet so that we can also provide a fix upstream in either Node or Meteor and resolve the issue completely.

@bengl
Copy link
Collaborator

bengl commented Feb 4, 2021

I'm actually going to go ahead and re-open this, since the underlying issue is not resolved.

@bengl bengl reopened this Feb 4, 2021
@rochdev
Copy link
Member

rochdev commented Feb 8, 2021

@bengl Given the recent development that the issue is actually with Meteor and can only be fixed by them, should we close this issue since this is not something we can do something about?

@bengl
Copy link
Collaborator

bengl commented Apr 23, 2021

Yeah, let's close this since the issue is in how two libraries interoperate that are not within this repo, and the issue that was immediately presenting itself to @doronrk and others was fixed in a patch release. We won't be doing any further fixes in this project/repository with respect to fibers, and the situation is stagnant until Meteor moves off of fibers.

We've also added to our official documentation regarding fibers. https://docs.datadoghq.com/tracing/setup_overview/compatibility_requirements/nodejs/#fibers

We can reopen this if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working community core
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants