Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bluebird leaks domains on exception thrown in a then() function. #1125

Closed
jdmarshall opened this issue Jun 8, 2016 · 11 comments
Closed

Bluebird leaks domains on exception thrown in a then() function. #1125

jdmarshall opened this issue Jun 8, 2016 · 11 comments

Comments

@jdmarshall
Copy link

jdmarshall commented Jun 8, 2016

See https://gist.github.com/jdmarshall/e16424c6f039fb6bf4dab1951cb32129

I've tried this with node 0.10, 0.12, 4.4 and 6.0, and bluebird 2.9 and 3.4.

At process exit there are several domains remaining on the stack, and any data associated with the domain is kept around.

In all of the versions listed above, this code results in 2 orphaned domains, but in 0.10 it results in only one. This causes a pretty big issue if, for instance, you are using domains and promises to service requests in express.

The problem is that domain.bind(), which bluebird uses to service promises that were created in a domain, does not have a try/finally block around the domain.enter(), domain.exit() pairing.

Bluebird needs to do its own cleanup on errors, which it doesn't do.

@petkaantonov
Copy link
Owner

It sounds like a bug in domain.bind implementation and the best solution would be to fix it at the source.

@jdmarshall
Copy link
Author

I disagree.

bind() clearly expects to receive a function that doesn't throw unhandled exceptions. It has always been thus and through 3 refactorings of that code it has always been so. That consistency to me implies intent.

Or to look at it another way, fixing it in Node means it only gets fixed for future versions of Node. If bluebird only works properly on some future version of Node, and Node is planning to replace Bluebird with an ES promises implementation anyway, then the niche for bluebird is effectively tiny. If you iron out the error handling on this end, then bluebird continues to be indispensible for everyone using Node LTS or older.

@petkaantonov
Copy link
Owner

The point of bind is that exceptions thrown from the bound function are forwarded to the domain. That's what the docs say: https://nodejs.org/api/domain.html#domain_domain_bind_callback.

@benjamingr
Copy link
Collaborator

Domains are deprecated anyway.

@petkaantonov
Copy link
Owner

Still seems pretty critical bug that it's unfit for its only purpose

@jdmarshall
Copy link
Author

jdmarshall commented Jun 14, 2016

True enough on the bind() docs, but domains are undead at this point. They exist and there is no viable replacement, and without them Node's async I/O code is more of a footnote than a killer feature, because if you're running one request per node instance then who cares if your IO is non-blocking? Who are you blocking?

But that still leaves the practical matter that nobody will ever be able to use Bluebird cleanly with any version of Node that currently exists, unless they use one request per process.

@spion
Copy link
Collaborator

spion commented Jun 14, 2016

Why one request per process?

Are you using domains as continuation local storage?

@benjamingr
Copy link
Collaborator

@petkaantonov it is unfit for its only purpose. @trevnorris wrote about it in nodejs/node#6159 - basically, the problem is that you can't escape a domain once you entered it when going to node code.

For example - if express wraps requests in a domain and then you call fs and you are still inside the domain inside its callback then your domain is catching errors from fs without giving it a chance to clean up. Promises for example - don't have that problem since you never throw into code running in another context. Domains are therefor inherently broken for error handling.

Work on Zones is undergoing to replace domains at the language level - but that's still very early on and might suffer from the same problem. async_wrap on the other hand is safe and well thought out (IMO) but it doesn't do (or pretend to do) error handling at all.

@jdmarshall
Copy link
Author

@spion After a fashion, yes. It's an internal framework, and we're doing the pattern of 'protect the framework from client code misbehaving'. There's a bit of request state that accumulates and we have services that accumulate and/or repackage data from other services, so the promise chains tend to get big enough to warrant a diagram.

As for my one request per process comment, I sometimes jump too many steps in a logical progression, apologies for not being more detail oriented. This is getting a little off-track but I should explain myself anyway. Without domains, then you're accepting that a 500 error in one request should cause a 500 error in all of the other in-flight requests on the same process, and I'm in agreement with my team that this is not acceptable. So the only ways I'm aware of to isolate an error to a single request are domains, or one process per request.

Most of the errors we get should be treated at 400 errors, anything from missing auth to missing cookie to missing resource. They're intended to be caught and handled within our promise callback chains and indeed they usually are. Usually.

But my thinking on the matter is that node and OSS can deprecate libraries all they want, but until they've provided an alternative it's just a blame assignment trick and doesn't actually address the pain points.

My read of the bluebird code is that the problem comes from a bit of inverted logic in the domain code in bluebird. Rather than bind()ing the function and then wrapping it in a try/catch, you should be wrapping it in a try/catch and then bind()ing it. Then your fail() code happens independent of what Node is or isn't doing.

@spion
Copy link
Collaborator

spion commented Jun 28, 2016

Domains aren't deprecated because they are "not good enough". They are a horribly broken design. If you catch a domain error, node internals and/or other callback based libraries that build on top of them can get into an undefined state. By undefined state, I mean something like letting a dangling file handle or connection open, or some other kind of internal state being bad (non-working event emitter on a server, etc). After a few of these your process may become unusable, consume 100% cpu, refuse to process further requests... the possibilities are endless.

For a contrived example, read http://blog.izs.me/post/65712662830/restart-nodejs-servers-on-domain-errors-sensible

In projects where we use bluebird, we settled for

  • allowing crashes in non-promise code, and
  • using promises everywhere we possibly can

as a good-enough combination to deal with this.

No sensible alternative is possible for current callback-based code.

@petkaantonov
Copy link
Owner

Reverted in 3.4.5, this is impossible to implement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants