Skip to content
This repository has been archived by the owner on Nov 28, 2020. It is now read-only.

Performance impact of async_hooks #181

Open
bmeurer opened this issue Dec 16, 2017 · 52 comments
Open

Performance impact of async_hooks #181

bmeurer opened this issue Dec 16, 2017 · 52 comments

Comments

@bmeurer
Copy link
Member

bmeurer commented Dec 16, 2017

I've already started a related discussions on Twitter earlier here, but Twitter doesn't scale for this kind of discussion, so I thought I should move it here, so it's easier to follow the discussion. For background: @jasnell approached me at NodeConfEU in November 2017, asking for help on the performance of async_hooks on the V8 side, especially when it comes to the Promise lifecycle hooks. And I've talked briefly with @thlorenz about it during the conference (although I have to admit that I was using the wrong terminology and thus kinda ruined the conversation, sorry). And we had chatted briefly about this as well with @trevnorris during one of the CTC-V8 calls.

Since Promise based APIs are likely becoming a (big) thing for Node 10 and at the same time there are estimates (i.e. by @ofrobots) that by the end of next year every Node server in enterprise is going to run with async_hooks, we should start a discussion about this early one to stay ahead of the problem before it becomes a real problem for our users.

In the last year @gsathya and @caitp already spent a lot of effort optimizing promises in V8 in general, so we're in a pretty good position wrt. baseline performance!

I'd like to use this forum to come to an agreement on what the concrete use cases are (at least estimate that), which aspects of the performance matter the most (i.e. promises created inside of C++ vs. in JS land), and what are useful benchmarks to drive the performance work in a meaningful way. I think we need both a set of micro-benchmarks to drive certain, in addition to real-world code that makes heavy use of promises, i.e. koa.js or fastify servers maybe? I feel that these benchmarks are most important, otherwise we might easily sink a lot of effort into work that doesn't help real-world use cases in the end. That's why I opened the issue on the benchmarking WG.

Comments and contributions are very welcome!

@bmeurer
Copy link
Member Author

bmeurer commented Dec 16, 2017

There's been a related discussion in nodejs/promises#31 earlier this year, which yielded a couple of interesting improvements and insights.

Pinging @nodejs/performance @nodejs/async_hooks

@benjamingr
Copy link
Member

I'd like to use this forum to come to an agreement on what the concrete use cases are (at least estimate that), which aspects of the performance matter the most (i.e. promises created inside of C++ vs. in JS land), and what are useful benchmarks to drive the performance work in a meaningful way.

Particularly about async_hooks and promises or promises in general?

For async_hooks and promises - I think a server-application benchmark could be good here - with contexts through async functions through several layers of calls - I'd then listen to promiseResolve and write interesting things about the app while sending it "user data" and measure how long it takes. "interesting things" can be things like how many times the database is hit, request latency for different request stages etc.

For promises in general there are at least 2-3 good ideas in that thread that can get native promises finally faster than userland alternatives like bluebird. Namely skilling the microtask queue when possible and inlining calls - not to mention async iterators that aren't fast yet but could revolutionize programming in Node.js if they were.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 17, 2017

I'd like to use this forum to come to an agreement on what the concrete use cases are (at least estimate that), which aspects of the performance matter the most (i.e. promises created inside of C++ vs. in JS land), and what are useful benchmarks to drive the performance work in a meaningful way.

Particularly about async_hooks and promises or promises in general?

Improving promise performance in general is also a good idea, but this particular issue is primarily about the impact of async_hooks, i.e. about the performance cliff that you fall off when you enable async_hooks.

As for promises in general and in particular async/await and async iterators, we will need good benchmarks as well to drive meaningful performance work. I've started to collect some ideas for promise performance improvements, but I don't feel like it's a good approach to measure our success in terms of the bluebird benchmarks.

I think @mcollina had some comments about that before. Can we derive a simple representative performance test from fastify to get things going?

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Dec 17, 2017

Specifically, regarding PromiseHooks and performance I would like to see:

kDestroy is added. I think in many cases it is straightforward to know when a promise can no longer be refered too. Thus we could emit the destroy hook much sooner, in other cases it would be fine to wait for garbage collection, but if V8 could instrument that for us directly it would be great.

I've gotten many real-life complains from both APM providers and other async_hooks users that they experience a "memory leak". The reality is, that there is no memory leak but that it unintuitive that the destroy event is not emitted as soon as possible but rather at garbage collection.

I'm guessing this would improve performance a lot, as we would no longer have to wrap the promise object itself to get notified about garbage collection. Essentially we could just create an unreferenced basic object on kInit and set then just set the asyncId and triggerAsyncId (two doubles) on the internal fields.


Extra feature requests that are not performance related.

/cc @addaleax

@bmeurer
Copy link
Member Author

bmeurer commented Dec 18, 2017

@AndreasMadsen Thanks for the feedback, that's very interesting. These seem to be more on the feature request side.

I think in many cases it is straightforward to know when a promise can no longer be refered too.

Do you have a suggestion/intuition here how to do that?

I'm guessing this would improve performance a lot, as we would no longer have to wrap the promise object itself to get notified about garbage collection. Essentially we could just create an unreferenced basic object on kInit and set then just set the asyncId and triggerAsyncId (two doubles) on the internal fields.

So you're saying that by changing the API and adding machinery to implement kDestroy on the V8 side we should be able to improve performance significantly?

@bmeurer
Copy link
Member Author

bmeurer commented Dec 18, 2017

cc @gsathya @caitp

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Dec 18, 2017

Do you have a suggestion/intuition here how to do that?

Often promises are only "referenced" once, either by .then() chaining or they are returned and used in await. At least for the user, it is quite obvious that the promise can't be referenced after that.

function wait(ret, ms) {
  // hint: the promise is directly returned
  return new Promise(function (resolve, reject) {
    setTimeout(() => resolve(ret), ms);
  });
}

function main() {
 // hint: the promise is directly used in `await` without being assigned to a variable
 await wait(1, 10) // after this line the user expects the `destroy` hook to emit
 await wait(2, 10) // same for this promise
 await wait(3, 10) // same for this promise
}

So you're saying that by changing the API and adding machinery to implement kDestroy on the V8 side we should be able to improve performance significantly?

Yes, that would be my guess. Wrapping and unwrapping the promise object is definitely the most expensive part of our PromiseHooks integration. Not having to do that, should improve performance.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 18, 2017

cc @jaro-sevcik

@benjamingr
Copy link
Member

Often promises are only "referenced" once, either by .then() chaining or they are returned and used in await. At least for the user, it is quite obvious that the promise can't be referenced after that.

If V8 could detect that with escape analysis and optimize the return value of that function to an entirely different implementation that would be awesome. Fast libraries all hack this by having the single listener case optimized - but an engine that can prove it and avoid allocating extra stuff would be awesome. In terms of promise hooks I suspect it would also simplify things a lot for the common case.

@mcollina
Copy link
Member

On the benchmarks side, we can also consider Hapi v17, which is completely async-await based.

The problem with real-world code is that it typically involves a database, and that makes it very hard to measure small improvements. IMHO we should measure:

  1. receive a request, performs a query in the DB, returns value as JSON
  2. receive a request, performs a query in the DB, performs another query in the DB, returns a JSON
  3. receive a request, performs a query in the DB, send an HTTP request somewhere (maybe using fetch), returns a JSON

I fear this would have to be a synthetic application.

@benjamingr
Copy link
Member

I fear this would have to be a synthetic application.

This is what Doxbee did (the bluebird benchmark suite) which you're familiar with - we can fork and extend it.

@mcollina
Copy link
Member

@benjamingr I think we are all more concern on overall impact of async await rather than a comparative measurement between implementations. Plus, I think we should focus on overall performance including a web framework, a database and HTTP requests rather than just the promise/async await layer.

@gsathya
Copy link
Member

gsathya commented Dec 18, 2017

Thanks @AndreasMadsen, @benjamingr and @mcollina! This is definitely interesting.

Yes, that would be my guess. Wrapping and unwrapping the promise object is definitely the most expensive part of our PromiseHooks integration. Not having to do that, should improve performance.

This seems surprising to me. I would think that calling out to the javascript hook defined by the user from C++ to be the most expensive part. Have you tried benchmarking with and without the wrapping?

If V8 could detect that with escape analysis and optimize the return value of that function to an entirely different implementation that would be awesome.

The way escape analysis works is by figuring out if a given allocation is unobservable. But, we lose all that with PromiseHooks as the kInit will cause the promise to escape. PromiseHooks enable more observable behavior, potentially restricting optimization.

Plus, I think we should focus on overall performance including a web framework, a database and HTTP requests rather than just the promise/async await layer.

@mcollina I agree, 💯. The hard part would be to try and get reproducible results from such a benchmark though.

@AndreasMadsen -- Can you link to a real world usage of async hooks? How are developers using async hooks? Is it for async stack traces? Is the kResolve hook useful? How is the kDestroy hook used? Is it useful for context propagation? PromiseHooks have a big surface area -- they give you the promise and let you do whatever. If we can find patterns in real world and benchmark these patterns, then we can start to optimize for that and go from there.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 18, 2017

@mcollina My gut feeling is that we will need some kind of micro-benchmarks to drive individual work items. Plus real-world code to see overall impact and find appropriate areas worth looking into.

@benjamingr Escape analysis could indeed eliminate promises (sometimes), but with promise hooks that becomes impossible, since every promise already escapes/leaks to the hooks.

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Dec 18, 2017

This seems surprising to me. I would think that calling out to the javascript hook defined by the user from C++ to be the most expensive part. Have you tried benchmarking with and without the wrapping?

I admit, I have not done a full matrix comparison. But this was the observation when we initially added the wrapping. It also depends on what hooks are used. I remember that we have some numbers for this but I can't find them.

The way escape analysis works is by figuring out if a given allocation is unobservable. But, we lose all that with PromiseHooks as the kInit will cause the promise to escape.

I can't think of any that actually use the promise object. If we had the kDestroy hook we could properly live without it. We would also need kInit to allow us to return a custom id (the asyncId) and if PromiseHook would carry that context information for the other events. The parentPromise would then have to be replaced with that id.

Can you link to a real world usage of async hooks?

Is it for async stack traces?

That and continues local storage. Hopefully, we will see it being used as a profiling tool too.

Is the kResolve hook useful?

kResolve is used by angular/zone.js#795 - but it is not enough for what they want. Either they need the resolved value or an extra event.

How is the kDestroy hook used?

It is not implemented by PromiseHooks so it isn't used. But it would directly replace our destroy hook. The destroy hook is used to clean up information that is stored in a new Map with the asyncId as the index value.

Is it useful for context propagation?

Very.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 18, 2017

Is the kResolve hook useful?

kResolve is used by angular/zone.js#795 - but it is not enough for what they want. Either they need the resolved value or an extra event.

I've heard that zone.js might be abandoned by the Angular folks. And async_hooks are available on the Node side only, so they still need a different solution for client side. Not sure who to ask about this?

@benjamingr
Copy link
Member

@mcollina

I think we are all more concern on overall impact of async await rather than a comparative measurement between implementations. Plus, I think we should focus on overall performance including a web framework, a database and HTTP requests rather than just the promise/async await layer.

I think what we're measuring async/await against here (with async_hooks) is callbacks. I agree with the database assumption, I'd focus on the most common stack and write the same app in two ways (callbacks and promises) and add instrumentation (for async hooks).

I tend to agree with database/web framework and http requests - but we need to make sure those are as thin as possible so we don't end up with an implementation that is fine "for the average case" but completely breaks for apps that have a harder performance requirement.

What about a benchmark that does:

  • Make HTTP requests, but make the requests on a unix socket rather than a tcp one to minimize the affect of the network stack. I'm not sure it's not better to swap out the HTTP implementation for the test altogether (and just give koa/express/whatever a "mock HTTP interface") but that's risky.
  • Make DB requests, but very low overhead requests, preferably to the same machine and over a unix socket - this part feels very unreliable - I'm not sure how we can be certain our measurements are significant. Probably a Node server running an in-memory database on the other end (maybe the same one that's making the http requests). Measuring two processes is harder though and more prone to OS scheduling quirks.
  • I'm not sure I'd serialize to JSON so easily - since that might end up dominating the time the app takes and a lot of more performance conscious apps don't do it very often.

In general, it's mostly a scope question of whether Node.js should optimize for "the average app" or for "performance sensitive apps".


@bmeurer

Escape analysis could indeed eliminate promises (sometimes), but with promise hooks that becomes impossible, since every promise already escapes/leaks to the hooks.

I'm not sure I follow - so I'll clarify my request/suggestion. Escape analysis can (probably) greatly optimize most of the modern usage of promises in async/await where we're awaiting a direct invocation of an async function - and use much simpler promise code.

async function bar() {

}

(async function foo() {
  // bar is also an async function, this call only has one listener and doesn't escape
  // this is, presumably "the common case".
  await bar(); 
})();

In this case, we don't actually need to allocate a real promise object - we can continue the execution of foo when bar ends. Properties of promises like multicast and assimilation aren't "required". We don't need await to call Promise.resolve on its parameter since we know it's an async function or anything like that either.

If we can prove other nice properties for bar (like that it doesn't throw) it gets even better and we can inline bar entirely in foo (and 'defer a microtick' since we're awaiting it). This would let us drop any allocation or book-keeping altogether.

I realize this is far from the actual implementation but the benefit is also huge and it doesn't sound "too crazy" (I hope :)).

For my specific request: when/if this is optimized, you can inline async_hooks calls too in the above code. Instead of destroy when a promise allocated by bar gets GCd, there is no promise created (or even a call to bar whose code is inlined in foo) and destroy is emitted synthetically.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 19, 2017

@benjamingr The thing with promise hooks (as required by async_hooks) is that we need to pass the JSPromise object to Node, which immediately makes all escape analysis useless as TurboFan has no idea what Node does with the object and cannot perform scalar replacement of the fields.

@AndreasMadsen
Copy link
Member

The thing with promise hooks (as required by async_hooks) is that we need to pass the JSPromise object to Node, which immediately makes all escape analysis useless as TurboFan has no idea what Node does with the object and cannot perform scalar replacement of the fields.

@bmeurer As said, we don't actually need the JSPromise object in node, we just need a way to pass context.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 19, 2017

That's not how it's implemented currently. I think I need to look at the Node core side again. Don't you need to slap ids onto the JSPromise object?

@AndreasMadsen
Copy link
Member

@bmeurer yes we do. If PromiseHooks allowed us to set those ids (just one 64bit id really) in kInit and then passed them to the other events we wouldn't need the JSPromise object.

Our PromiseHook code is here: https://github.com/nodejs/node/blob/master/src/async_wrap.cc#L283

@bmeurer
Copy link
Member Author

bmeurer commented Dec 21, 2017

@AndreasMadsen But we still need to materialize the JSPromise object to pass it to the kInit hook. So you already defeat the escape analysis at that point. Would it be possible to eliminate the need to pass a concrete JSPromise object to the promise hooks completely?

@AndreasMadsen
Copy link
Member

Would it be possible to eliminate the need to pass a concrete JSPromise object to the promise hooks completely?

This is what I'm saying! We are only using it propagating context, so if PromiseHooks provided a different way of doing this, such as propagating a 64bit id set in kInit then that would work just fine.

@bmeurer
Copy link
Member Author

bmeurer commented Dec 22, 2017

@AndreasMadsen Staring at the PromiseHook on the Node side, you store the PromiseWrap object on the JSPromise and you do check the parent JSPromise during kInit. So this would be a major refactoring on how this works. A few questions come to mind here:

  • Can we get rid of the PromiseWrap for every JSPromise?
  • Can we move the id slapping logic to V8 instead? So that V8 can set these internal fields upon creating of JSPromise objects.

@gsathya does that make sense to you as well? Do we have other users of the PromiseHook?

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Dec 22, 2017

Staring at the PromiseHook on the Node side, you store the PromiseWrap object on the JSPromise and you do check the parent JSPromise during kInit.

Correct. The PromiseWrap holds the ids (asyncId and triggerAsyncId) and help us emits the destroy event. We also send it to the user as the resource object, where it contains two values:

resource = {
  promise: JSPromise,
  // lookup parent asyncId on the internal field of the parent PromiseWrap
  parentId: parentJSPromise.[[PromiseWrap]].[[asyncId]]
}

So this would be a major refactoring on how this works.

On our side, it would be API breaking but I don't see a big issue here. What would be missing is the resource.promise object but I have never seen anyone actually use that. And I don't think it is worth the performance overhead.

Can we get rid of the PromiseWrap for every JSPromise?

yes. It would require two things:

  • V8 does the id slapping logic for us (at least one id).
  • V8 emits a kDestroy event.

These are the only critical things we use the PromiseWrap object for and hence the JSPromise as well.

Can we move the id slapping logic to V8 instead? So that V8 can set these internal fields upon creating of JSPromise objects.

yes

We would need at least one id (asyncId). We have another id (triggerAsyncId) but we could store that in a map as the asyncId is unique for each promise.

@JiaLiPassion
Copy link

@bmeurer,

I've heard that zone.js might be abandoned by the Angular folks. And async_hooks are available on the Node side only, so they still need a different solution for client side. Not sure who to ask about this?

Angular current provide a NgNoopZone to improve performance, it will not load zone.js,but it is just an additional option when user want to fully control when to do change detection.

currently in zone.js, I want to know when promise is really resolved instead of get a kResolve callback because I want to

  • implement a asynchooks based zone.js.
  • let zone.js be able to use native async/await.

I am still looking for walk around, but it seems to be a v8 feature request. I also post a feature request there , https://bugs.chromium.org/p/v8/issues/detail?id=6862

@ofrobots
Copy link
Collaborator

@AndreasMadsen @bmeurer

So this would be a major refactoring on how this works.

On our side, it would be API breaking but I don't see a big issue here. What would be missing is the resource.promise object but I have never seen anyone actually use that. And I don't think it is worth the performance overhead.

Note that async-hooks are still experimental, so the cost of making a semver breaking change is not that high, specially if it has significant benefits. Regardless, if V8 was to implement this today, this would only get into Node 10, a semver major release. Whether or not the change can be backported to Node 8 and 9 would depend on the complexity rather than semver-ness of the impact to async-hooks API.

@bmeurer
Copy link
Member Author

bmeurer commented Jan 17, 2018

@ruimarinho Oh that's pretty bad. Was it due to Promise heavy code? Or just old fashioned callback style? Also what kind of software did you try to deploy using async_hooks?

@ruimarinho
Copy link

@bmeurer yes - heavy Promise code (using async/await). Full stack http request server, including redis and database connections. We are going to try enabling async_hooks without associating any other code to confirm if the issue is simply related to activating them. We used domains which in 9.3.0+ run on top of async_hooks. We're also using bluebird for most of the modules. /cc @kurayama.

@bmeurer
Copy link
Member Author

bmeurer commented Jan 17, 2018

@ruimarinho Ok, if you're using bluebird promises, that's going to be slower than native promises already by itself. async/await though will always use native promises.

@bmeurer
Copy link
Member Author

bmeurer commented Jan 18, 2018

Accidentally commented on the TSC thread instead of this one. So for future reference, I also did a test run of simple hapi and koa servers (using @mcollina's autocannon tester), again with and without async_hooks enabled to get more real-worldish numbers (the Promise benchmarks arguably really stress promises pretty heavily). The results were pretty interesting (with latest Node 9.4.0):

Results for Node 9.4.0

The koa test is super flaky, so the performance difference could also be noise, but for hapi, which makes heavy use of async/await, there's pretty consistent 30% performance drop with just an empty init hook. See bmeurer/async-hooks-performance-impact for the benchmarks and additional information.

And to provide even more data for the discussion: The performance drop also increases with the number of hooks being used (maybe not unsurprising). For example for the Promise benchmarks, we see additional performance regressions:

@ruimarinho
Copy link

Very interesting data @bmeurer, thanks for the insights on this subject. We definitely hit the with async_hooks (all) scenario since the current domains implementation on top of async_hooks registers all handlers. That's more in line with what we've seen in production.

async/await though will always use native promises.

Does this hold true even even when await Promise.all([p1, p2]) where const Promise = require('bluebird')?

@bmeurer
Copy link
Member Author

bmeurer commented Jan 19, 2018

Does this hold true even even when await Promise.all([p1, p2]) where const Promise = require('bluebird')?

In that case you'll probably use bluebird promises for the Promise.all and native promises for the await, which I guess is not what you want. The ES2017 spec forces await and async function to always use native promises (which I consider a good thing FWIW). I think @gsathya brought this up earlier.

@holyjak
Copy link

holyjak commented Nov 1, 2018

Let me know whether it is not relevant and I should delete my post, but I experienced quite a bad increase in CPU usage when I activated async_hooks (Node 8 and 11) - see https://theholyjava.wordpress.com/2018/11/01/beware-the-performance-cost-of-async_hooks-node-8/ for details.

@mcollina
Copy link
Member

mcollina commented Nov 1, 2018

I would recommend to check out the latest Node 8, Node 10 and Node 11, as the runtimes that are used in the blog post are quite old, or not supported anymore (Node 9). There should have been some improvement in the area.

Overall async_hooks  are very costly, especially with promises.

@holyjak
Copy link

holyjak commented Nov 1, 2018 via email

@dnutels
Copy link

dnutels commented May 9, 2019

Tried it on Node 12.2 on Windows 10 U. Still costly.

regular Bluebird-doxbee: 126 ms.
init Bluebird-doxbee: 551 ms.
full Bluebird-doxbee: 707 ms.
regular Bluebird-parallel: 170 ms.
init Bluebird-parallel: 1048 ms.
full Bluebird-parallel: 1316 ms.
regular Wikipedia: 334 ms.
init Wikipedia: 2129 ms.
full Wikipedia: 2621 ms.
regular hapiserver: 5477.2 reqs.
init hapiserver: 3266.6 reqs.
full hapiserver: 2727.3 reqs.
regular koaserver: 8231.8 reqs.
init koaserver: 7309.6 reqs.
full koaserver: 6771.1 reqs.

@omeraha
Copy link

omeraha commented Jul 29, 2019

I would recommend to check out the latest Node 8, Node 10 and Node 11, as the runtimes that are used in the blog post are quite old, or not supported anymore (Node 9). There should have been some improvement in the area.

Overall async_hooks  are very costly, especially with promises.

@mcollina
Can you shortly explain the reasons for the substantial performance impact?

@alekbarszczewski
Copy link

Is it possible to reduce performance impact of async_hooks in the future (in future Node.js versions)? Or the performance will be always so bad because of the nature of async_hooks and how they work?

@benjamingr
Copy link
Member

@alekbarszczewski it is mostly blocked on people doing the product work and suggesting concrete improvements.

@rochdev
Copy link

rochdev commented Oct 2, 2019

If anyone can provide pointers of where to start to try and fix this, and/or a tl;dr of why it's so slow, I can try to look into it.

@mhdawson
Copy link
Member

mhdawson commented Oct 3, 2019

@rochdev how much time will you have to invest? It will be a relatively big ramp up.

@mhdawson
Copy link
Member

mhdawson commented Oct 3, 2019

@rochdev my suggestion is to come to the next https://github.com/nodejs/diagnostics meeting. We are looking for a Champion to help with moving async hooks forward.

As per the calendar the next meeting is scheduled for Oct 9 (next wed) and an issue will be opened in the repo next Monday with the meeting links etc.

@rochdev
Copy link

rochdev commented Oct 3, 2019

I definitely don't expect that it will be an easy task, but it's definitely an issue that is severely impacting APM vendors. I don't know if I'll be able to put a lot of time into it, at least short term, but I'd like to familiarize myself with the issue and how async_hooks currently works in general.

I'll try to join the next Diagnostics meeting. Thanks for the tip!

@rochdev
Copy link

rochdev commented Oct 7, 2019

I won't be able to make it to this week WG meeting. I'll make sure I stay available for the next one.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests