AsyncWrap public API proposal #18
Conversation
| * `parentId` {Number} | ||
|
|
||
| Called when a class is constructed that has the possibility to trigger an | ||
| asynchronous event. This does mean the instance _will_ trigger a |
Fishrock123
Apr 27, 2016
Member
does not mean
does not mean
| longer needs it. | ||
|
|
||
| Because the callback is called during deconstruction of the class instance it | ||
| is not safe to access the JS handle during `destroy()` execution. Reason this |
Fishrock123
Apr 27, 2016
Member
Does that mean that this is not that handle, i.e. undefined?
Does that mean that this is not that handle, i.e. undefined?
trevnorris
Apr 28, 2016
Author
Contributor
Yes. To reiterate a comment I just posted as to why this is the case:
Now, destroy() doesn't have a this because it's essentially a static method. It's unsafe to access a C++ class instance while deconstructing. So only the id of the handle is passed to JS so any storage resources can be released.
So you could essentially see undefined as the handle instance at that point in time. Since it has been (or soon will be) free'd.
Yes. To reiterate a comment I just posted as to why this is the case:
Now, destroy() doesn't have a this because it's essentially a static method. It's unsafe to access a C++ class instance while deconstructing. So only the id of the handle is passed to JS so any storage resources can be released.
So you could essentially see undefined as the handle instance at that point in time. Since it has been (or soon will be) free'd.
Fishrock123
Apr 28, 2016
•
Member
But is the handle instance (i.e. if you access this) actually undefined here?
But is the handle instance (i.e. if you access this) actually undefined here?
trevnorris
Apr 28, 2016
Author
Contributor
Yes. Logging this will show undefined.
Yes. Logging this will show undefined.
jasnell
Apr 29, 2016
Member
The inconsistency here between destroy and the other callbacks is another argument in favor of avoiding the use of this and passing the context as an argument. The destroy method simply would not have that argument set. As @chrisdickinson mentions it's more of a stylistic choice but I'd certainly prefer it
The inconsistency here between destroy and the other callbacks is another argument in favor of avoiding the use of this and passing the context as an argument. The destroy method simply would not have that argument set. As @chrisdickinson mentions it's more of a stylistic choice but I'd certainly prefer it
jasnell
Apr 29, 2016
Member
Nit: The reason this...
Nit: The reason this...
trevnorris
Apr 29, 2016
Author
Contributor
@jasnell I disagree that there's fundamental inconsistency. When destroy() is called the handle should no longer available for access. It reflects the fact that by accessing the handle you have undefined behavior. Remember that it's possible to store the object prior to calling destroy(), and while the handle may exist in JS accessing it leads to undefined behavior because the backing C++ class no longer exists. Thus making this === undefined is because this is literally undefined at that point.
@jasnell I disagree that there's fundamental inconsistency. When destroy() is called the handle should no longer available for access. It reflects the fact that by accessing the handle you have undefined behavior. Remember that it's possible to store the object prior to calling destroy(), and while the handle may exist in JS accessing it leads to undefined behavior because the backing C++ class no longer exists. Thus making this === undefined is because this is literally undefined at that point.
trevnorris
Apr 29, 2016
Author
Contributor
screw it. i'm going through and making it safe to access this regardless of where it's accessed. can't having the application abort, even if they screwed up. so this in destroy() will be set to the handle.
screw it. i'm going through and making it safe to access this regardless of where it's accessed. can't having the application abort, even if they screwed up. so this in destroy() will be set to the handle.
| everything can be safely accessed. So it is not recommended to blindly | ||
| iterate over the object's properties. Despite this, it is important to give | ||
| access of the handle to the user so they can utilize tracking methods of their | ||
| choosing. For example: |
chrisdickinson
Apr 27, 2016
It'd be nice to avoid meaningful this in new APIs, but I understand if it's impossible due to perf concerns.
It'd be nice to avoid meaningful this in new APIs, but I understand if it's impossible due to perf concerns.
trevnorris
Apr 28, 2016
Author
Contributor
I'd be lying if I said it was for performance. It's done this way because it makes syntactic sense. The callback is running as an extension of the handle, thus the handle becomes the calling context for the function. Why do you have an aversion to using this? Other than disallowing the use of arrow functions, which I don't accept as a valid argument over having a syntactically correct API, I don't see any disadvantage. While the advantage is that it more closely mimics the C++ class counterpart. Which is important here because they're so closely bound.
Now, destroy() doesn't have a this because it's essentially a static method. It's unsafe to access a C++ class instance while deconstructing. So only the id of the handle is passed to JS so any storage resources can be released.
I'd be lying if I said it was for performance. It's done this way because it makes syntactic sense. The callback is running as an extension of the handle, thus the handle becomes the calling context for the function. Why do you have an aversion to using this? Other than disallowing the use of arrow functions, which I don't accept as a valid argument over having a syntactically correct API, I don't see any disadvantage. While the advantage is that it more closely mimics the C++ class counterpart. Which is important here because they're so closely bound.
Now, destroy() doesn't have a this because it's essentially a static method. It's unsafe to access a C++ class instance while deconstructing. So only the id of the handle is passed to JS so any storage resources can be released.
chrisdickinson
Apr 29, 2016
The argument against this is primarily one of convention — anecdotally, many JS APIs have been moving away from "load-bearing this" over the last few years. When I was poking at the asyncwrap API (pre-docs), I was surprised that this carried pertinent info — it was not obvious to me that the hook was running as an extension of the handle.
It's admittedly a nit — but if it comes at zero cost & makes the API more obvious, it might be worth addressing.
The argument against this is primarily one of convention — anecdotally, many JS APIs have been moving away from "load-bearing this" over the last few years. When I was poking at the asyncwrap API (pre-docs), I was surprised that this carried pertinent info — it was not obvious to me that the hook was running as an extension of the handle.
It's admittedly a nit — but if it comes at zero cost & makes the API more obvious, it might be worth addressing.
|
|
||
| Node currently doesn't have sufficient API to notify calls to Promise | ||
| callbacks. In order to do so node would have to override the native | ||
| implementation. |
chrisdickinson
Apr 27, 2016
I think we have to solve this before the API can be made public.
I think we have to solve this before the API can be made public.
Fishrock123
Apr 27, 2016
Member
This is a V8 API problem, right?
This is a V8 API problem, right?
chrisdickinson
Apr 27, 2016
Yep — it was on my plate last but I ran out of time to work on it. It was looking like the solution would be pre/post hooks for executing microtaskqueue items — my original approach of making the queue itself pluggable wouldn't work because there are more than just JS functions in that queue.
Yep — it was on my plate last but I ran out of time to work on it. It was looking like the solution would be pre/post hooks for executing microtaskqueue items — my original approach of making the queue itself pluggable wouldn't work because there are more than just JS functions in that queue.
trevnorris
Apr 28, 2016
Author
Contributor
@chrisdickinson If we absolutely have to I'd prefer shimming native promises over not shipping this. We don't currently offer any type of support for Promises in this way. e.g. domains loosing context:
const domain = require('domain');
const print = process._rawDebug;
domain.create().run(() => Promise.resolve(1).then((val) => print(process.domain)));
// output: undefined
And TBH refusing this type of insight into node just because v8 doesn't provide us with more than a black box is a serious pain.
If, in fact, the CTC more unanimously decides that it's important that we wait for v8 to provide the appropriate APIs to accomplish this then I'd at least like to get approval for the EP so I can finish the implementation exposed via process.binding(). Then at least it's exposed in a way that can still be useful to many and will be ready once v8 provides what we need.
@chrisdickinson If we absolutely have to I'd prefer shimming native promises over not shipping this. We don't currently offer any type of support for Promises in this way. e.g. domains loosing context:
const domain = require('domain');
const print = process._rawDebug;
domain.create().run(() => Promise.resolve(1).then((val) => print(process.domain)));
// output: undefinedAnd TBH refusing this type of insight into node just because v8 doesn't provide us with more than a black box is a serious pain.
If, in fact, the CTC more unanimously decides that it's important that we wait for v8 to provide the appropriate APIs to accomplish this then I'd at least like to get approval for the EP so I can finish the implementation exposed via process.binding(). Then at least it's exposed in a way that can still be useful to many and will be ready once v8 provides what we need.
chrisdickinson
Apr 29, 2016
Right — this is a problem with domains as well, and I empathize with your desire to avoid making AsyncWrap dependent on V8 changes. However, without promise support, we can't make the sort of assertions we'd like to make — namely, being able to reliably back-associate any event with the event that queued it. This is somewhat less of an issue for domains, since domains and promises ostensibly both provide solutions for asynchronous error handling, and (leaving aside the domains-as-CLS use case) generally don't have much overlap.
On the upside, this is something we only have to solve once, and after that point any further microtask use from V8 is taken care of.
Right — this is a problem with domains as well, and I empathize with your desire to avoid making AsyncWrap dependent on V8 changes. However, without promise support, we can't make the sort of assertions we'd like to make — namely, being able to reliably back-associate any event with the event that queued it. This is somewhat less of an issue for domains, since domains and promises ostensibly both provide solutions for asynchronous error handling, and (leaving aside the domains-as-CLS use case) generally don't have much overlap.
On the upside, this is something we only have to solve once, and after that point any further microtask use from V8 is taken care of.
jasnell
Apr 29, 2016
Member
Unless I overlooked it, something should be included about whether hooks are blocking or not. What happens, for instance, if I drop an endless loop into pre() for instance? Is there any risk to passing this to setImmediate() or process.nextTick() in the hook callback? That sort of thing.
Unless I overlooked it, something should be included about whether hooks are blocking or not. What happens, for instance, if I drop an endless loop into pre() for instance? Is there any risk to passing this to setImmediate() or process.nextTick() in the hook callback? That sort of thing.
trevnorris
Apr 29, 2016
Author
Contributor
@jasnell as in whether async wrap callbacks run on the same thread? didn't think specifying that would be necessary, but can if you think so. as far as keeping a reference to this indefinitely, i'm in the middle of a PR to fix it so you can.
@jasnell as in whether async wrap callbacks run on the same thread? didn't think specifying that would be necessary, but can if you think so. as far as keeping a reference to this indefinitely, i'm in the middle of a PR to fix it so you can.
jasnell
Apr 29, 2016
Member
I think the documentation should at least give a recommendation one way or the other as to whether the callbacks should be sync or async and what the possible ramifications are.
I think the documentation should at least give a recommendation one way or the other as to whether the callbacks should be sync or async and what the possible ramifications are.
mike-kaufman
May 3, 2016
However, without promise support, we can't make the sort of assertions we'd like to make — namely, being able to reliably back-associate any event with the event that queued it.
+1.
However, without promise support, we can't make the sort of assertions we'd like to make — namely, being able to reliably back-associate any event with the event that queued it.
+1.
trevnorris
May 5, 2016
Author
Contributor
However, without promise support, we can't make the sort of assertions we'd like to make — namely, being able to reliably back-associate any event with the event that queued it.
This isn't completely accurate. It requires that we always drain the microtask queue in a specific way, and you will loose call stack information, but passing the correct parent id along is possible.
However, without promise support, we can't make the sort of assertions we'd like to make — namely, being able to reliably back-associate any event with the event that queued it.
This isn't completely accurate. It requires that we always drain the microtask queue in a specific way, and you will loose call stack information, but passing the correct parent id along is possible.
| // trace filtering of events. | ||
| const providers = async_wrap.Providers; | ||
| // init() is called during object construction. The object handle being |
jasnell
Apr 29, 2016
Member
Can you add a bit more up front about the over all design. Such as, what it a handle and what role does it play here? I know this isn't the API doc but it would help level set for anyone coming in and doing a review
Can you add a bit more up front about the over all design. Such as, what it a handle and what role does it play here? I know this isn't the API doc but it would help level set for anyone coming in and doing a review
mike-kaufman
May 3, 2016
It would be useful to describe the APIs in terms of the intended user model. For example, doesn't init() really mean enqueueAnAsyncCallback(). In the case of nextTick callbacks & promises, it doesn't correspond to an object being constructed, right?
It would be useful to describe the APIs in terms of the intended user model. For example, doesn't init() really mean enqueueAnAsyncCallback(). In the case of nextTick callbacks & promises, it doesn't correspond to an object being constructed, right?
Fishrock123
May 4, 2016
Member
enqueueAnAsyncCallback()
Pardon? This isn't actually where the callback gets called?
enqueueAnAsyncCallback()
Pardon? This isn't actually where the callback gets called?
mike-kaufman
May 4, 2016
My understanding of the user model:
init(id, type, parentId) - invoked when an async callback as identified by id is eligible to be invoked. parentId is the async ID of the currently executing parent at the time init() is called.
pre(id) - invoked immediately before an async callback as identified by id is invoked
post(id, hasThrown) - invoked immediately after an async callback as identified by id has been invoked.
destroy(id) - invoked when an async callback as identified by id is no longer elligible to be called.
Above is brief wrt the "user model". It can be expanded on to provide more succinct definitions. E.g., an async callback is defined as a javascript function which is enqueued for execution at some later time. It is identified by a unique ID refered to as the async ID. It has the following lifecycle events: init() - ...
My understanding of the user model:
init(id, type, parentId)- invoked when an async callback as identified byidis eligible to be invoked.parentIdis the async ID of the currently executing parent at the timeinit()is called.pre(id)- invoked immediately before an async callback as identified byidis invokedpost(id, hasThrown)- invoked immediately after an async callback as identified byidhas been invoked.destroy(id)- invoked when an async callback as identified byidis no longer elligible to be called.
Above is brief wrt the "user model". It can be expanded on to provide more succinct definitions. E.g., an async callback is defined as a javascript function which is enqueued for execution at some later time. It is identified by a unique ID refered to as the async ID. It has the following lifecycle events: init() - ...
trevnorris
May 5, 2016
Author
Contributor
For example, doesn't init() really mean enqueueAnAsyncCallback(). In the case of nextTick callbacks & promises, it doesn't correspond to an object being constructed, right?
Yes it does. It always does. It's always constructed at the time of call. You're confusing init() with something like when a callback is passed to the instances event handler, e.g. on('data' (which is the JS abstraction of the handle and has nothing to do with how the handle operates). The callback that will be called by MakeCallback is always defined at construction time. That will then trigger any callbacks from the user that have been abstracted via the JS API.
For example, doesn't
init()really meanenqueueAnAsyncCallback(). In the case of nextTick callbacks & promises, it doesn't correspond to an object being constructed, right?
Yes it does. It always does. It's always constructed at the time of call. You're confusing init() with something like when a callback is passed to the instances event handler, e.g. on('data' (which is the JS abstraction of the handle and has nothing to do with how the handle operates). The callback that will be called by MakeCallback is always defined at construction time. That will then trigger any callbacks from the user that have been abstracted via the JS API.
Fishrock123
May 5, 2016
Member
Does .created() make more sense?
Does .created() make more sense?
trevnorris
May 5, 2016
Author
Contributor
I used init as short for "initialized", which I believe well explains what's mechanically happening.
I used init as short for "initialized", which I believe well explains what's mechanically happening.
jasnell
May 6, 2016
Member
init() should be fine as a name but as @mike-kaufman and I have indicated a few times, this would be much clearer if it included a bit more detail about the lifecycle of a hook. That's not a criticism, just a suggestion.
init() should be fine as a name but as @mike-kaufman and I have indicated a few times, this would be much clearer if it included a bit more detail about the lifecycle of a hook. That's not a criticism, just a suggestion.
| // post() is called just after the handle's callback has finished, and will be | ||
| // called regardless whether the handle's callback threw. If the handle's | ||
| // callback did throw then hasThrown will be true. | ||
| function post(id, hasThrown) { } |
jasnell
Apr 29, 2016
Member
If hasThrown is true, is there any mechanism here for determining what the error was? Is there any intent for this to be used in any kind of recovery scenario? If not, that should be made explicit
If hasThrown is true, is there any mechanism here for determining what the error was? Is there any intent for this to be used in any kind of recovery scenario? If not, that should be made explicit
Fishrock123
Apr 30, 2016
Member
Some notes that I asked trevor to outline should cover this, but you'd use process.on('unhandledException'), and you can register it safely in pre and cleanup in post without risking other code being caught.
Some notes that I asked trevor to outline should cover this, but you'd use process.on('unhandledException'), and you can register it safely in pre and cleanup in post without risking other code being caught.
trevnorris
May 5, 2016
Author
Contributor
It's also useful if you want to see that someone is using domains or uncaughtException to handle errors. Rather than allowing the application to crash.
It's also useful if you want to see that someone is using domains or uncaughtException to handle errors. Rather than allowing the application to crash.
jasnell
May 6, 2016
Member
Ok, but that's not quite what I asked. Inside post, I look and see that hasThrown is true. Then what? As a suggestion, when writing up the docs for this, it needs to be clear how this particular method relates to the actual handling of the error so that there's no confusion. People will see the hasThrown and will wonder.
Ok, but that's not quite what I asked. Inside post, I look and see that hasThrown is true. Then what? As a suggestion, when writing up the docs for this, it needs to be clear how this particular method relates to the actual handling of the error so that there's no confusion. People will see the hasThrown and will wonder.
trevnorris
Jun 21, 2016
Author
Contributor
Ah, good point. Maybe instead we just pass the error object, if it occurred.
Ah, good point. Maybe instead we just pass the error object, if it occurred.
rvagg
Jul 6, 2016
Member
is there resolution to this @trevnorris? seems to be a dangling artifact that needs to be fixed
is there resolution to this @trevnorris? seems to be a dangling artifact that needs to be fixed
trevnorris
Aug 9, 2016
Author
Contributor
Going to remove it for now. Not a breaking change and can be added later in a minor version.
Going to remove it for now. Not a breaking change and can be added later in a minor version.
| // Allow callbacks of this AsyncHook instance to fire. This is not an implicit | ||
| // action after running createHook(), and must be explicitly run to being | ||
| // executing callbacks. | ||
| asyncHook.enable(); |
jasnell
Apr 29, 2016
Member
Does this return a value? Perhaps a ref to the hook so calls can be chained?
Does this return a value? Perhaps a ref to the hook so calls can be chained?
mhdawson
May 4, 2016
Member
Did you consider if we'd want to enable selectively. ie be able to enable only for a subset of the supported providers ?
Did you consider if we'd want to enable selectively. ie be able to enable only for a subset of the supported providers ?
Fishrock123
May 5, 2016
Member
Is there a benefit do doing that here rather than in your hook handlers?
Is there a benefit do doing that here rather than in your hook handlers?
trevnorris
May 5, 2016
Author
Contributor
@mhdawson That is coming in the future, but not spec'd for initial API. It would roughly look something like (mind you I'm changing API names on the fly, but you'll get what I mean):
const AsyncHook = require('async_wrap');
const pflags = AsyncHook.ProviderFlags; // Providers by flag enum
const ah = new AsyncHook({ init, pre, post });
ah.filter(pflags.TCPWRAP | pflags.UDPWRAP | pflags.CRYPTO);
ah.enable();
The above will only trigger the callbacks for the three flags provided. There are a couple things ATM that need to be addressed:
- Currently stored in an enum, not by or-able flag. Easy fix. Though worried we may run out of room.
- Is there some type of propagation that should happen automatically under the hood? Otherwise the user is left with no stack traces, resource tracing, etc. Because of this I'm considering how useful this feature would be.
@mhdawson That is coming in the future, but not spec'd for initial API. It would roughly look something like (mind you I'm changing API names on the fly, but you'll get what I mean):
const AsyncHook = require('async_wrap');
const pflags = AsyncHook.ProviderFlags; // Providers by flag enum
const ah = new AsyncHook({ init, pre, post });
ah.filter(pflags.TCPWRAP | pflags.UDPWRAP | pflags.CRYPTO);
ah.enable();The above will only trigger the callbacks for the three flags provided. There are a couple things ATM that need to be addressed:
- Currently stored in an enum, not by or-able flag. Easy fix. Though worried we may run out of room.
- Is there some type of propagation that should happen automatically under the hood? Otherwise the user is left with no stack traces, resource tracing, etc. Because of this I'm considering how useful this feature would be.
mhdawson
May 5, 2016
Member
Ok good to hear you have considered it and you have a good start at what it might look like.
Ok good to hear you have considered it and you have a good start at what it might look like.
jasnell
May 6, 2016
•
Member
Granted that you said that filtering would come in the future so just take this with a grain of salt for now but please just put the filter flags on the constructor ;-)
const ah = new AsyncHook({init,pre,post}, flags.TCPWRAP | flags.UDPWRAP);
Granted that you said that filtering would come in the future so just take this with a grain of salt for now but please just put the filter flags on the constructor ;-)
const ah = new AsyncHook({init,pre,post}, flags.TCPWRAP | flags.UDPWRAP);
| lifetime of `AsyncWrap`. These callbacks will also be called to emulate the | ||
| lifetime of handles and requests that do not fit this model. For example, | ||
| `HTTPPARSER` instances are recycled to improve performance. So the `destroy()` | ||
| callback would be called manually after a connection is done using it, just |
jasnell
Apr 29, 2016
Member
Is there any reason to indicate that the handle is not actually being destroyed in cases such as HTTPPARSER? I can't think of any but want it to be clear.
Is there any reason to indicate that the handle is not actually being destroyed in cases such as HTTPPARSER? I can't think of any but want it to be clear.
Fishrock123
Apr 30, 2016
Member
Probably not, actual actions being done are a lot more interesting & useful than handles that are idley waiting from exterior events.
Probably not, actual actions being done are a lot more interesting & useful than handles that are idley waiting from exterior events.
Fishrock123
Apr 30, 2016
Member
We could maybe add notes in the docs that some Providers actually use shared handles and these are then representative of individual events from those handles.
We could maybe add notes in the docs that some Providers actually use shared handles and these are then representative of individual events from those handles.
jasnell
Apr 30, 2016
Member
I'd recommend it. My main concern would be someone either accidentally or unknowingly keeping one of the reusable handles around after destroy. Since the handle is not actually destroyed, you could end up with some odd, hard to debug edge cases. I'm not saying there's anything we should do other than simply document the fact.
I'd recommend it. My main concern would be someone either accidentally or unknowingly keeping one of the reusable handles around after destroy. Since the handle is not actually destroyed, you could end up with some odd, hard to debug edge cases. I'm not saying there's anything we should do other than simply document the fact.
mhdawson
May 2, 2016
Member
Would it be possible to make it so that even though the underlying resource is re-used, the same handle is not re-used ? That was avoid the problem James mentions above.
Would it be possible to make it so that even though the underlying resource is re-used, the same handle is not re-used ? That was avoid the problem James mentions above.
trevnorris
May 5, 2016
Author
Contributor
Is there any reason to indicate that the handle is not actually being destroyed in cases such as HTTPPARSER? I can't think of any but want it to be clear.
There's no reason to. The resource will be assigned a new id and treated like a newly constructed object. We're just saving the hit of allocation and instantiation.
I'd recommend it. My main concern would be someone either accidentally or unknowingly keeping one of the reusable handles around after destroy.
You're thinking of the JS handle. We don't care about that. Internally we remove the pointer to the class from the JS object so it can no longer be referenced.
Would it be possible to make it so that even though the underlying resource is re-used, the same handle is not re-used ?
resource == handle.
Is there any reason to indicate that the handle is not actually being destroyed in cases such as HTTPPARSER? I can't think of any but want it to be clear.
There's no reason to. The resource will be assigned a new id and treated like a newly constructed object. We're just saving the hit of allocation and instantiation.
I'd recommend it. My main concern would be someone either accidentally or unknowingly keeping one of the reusable handles around after destroy.
You're thinking of the JS handle. We don't care about that. Internally we remove the pointer to the class from the JS object so it can no longer be referenced.
Would it be possible to make it so that even though the underlying resource is re-used, the same handle is not re-used ?
resource == handle.
mhdawson
May 5, 2016
Member
Ok, I think what you said is what I had in mind at least to some extent. There will be a new id even if the underlying resource is reused.
Ok, I think what you said is what I had in mind at least to some extent. There will be a new id even if the underlying resource is reused.
jasnell
May 6, 2016
Member
Awesome, ok. Thank you.
Awesome, ok. Thank you.
AndreasMadsen
May 10, 2016
Member
@trevnorris to be clear, the init hook will also be called manually, when the handle is reused?
@trevnorris to be clear, the init hook will also be called manually, when the handle is reused?
trevnorris
Jun 21, 2016
Author
Contributor
@AndreasMadsen Sorry, didn't see this. Yes, it will be manually called. When a resource is brought out of the pool it will be treated as if it were a new resource. All we're basically omitting is the memory allocation and class instantiation cost.
@AndreasMadsen Sorry, didn't see this. Yes, it will be manually called. When a resource is brought out of the pool it will be treated as if it were a new resource. All we're basically omitting is the memory allocation and class instantiation cost.
| asynchronous event. This does mean the instance _will_ trigger a | ||
| `pre()`/`post()` event. Only that the possibility exists. | ||
|
|
||
| The `this` of each call is that of the object being constructed. While core |
jasnell
Apr 29, 2016
Member
Recommend repeating here the exception for destroy
Recommend repeating here the exception for destroy
| that asynchronous branch. | ||
|
|
||
| #### `asyncHook.scope()` | ||
|
|
jasnell
Apr 29, 2016
Member
An example illustrating this would be helpful
An example illustrating this would be helpful
| // Pretend init, pre, post are all defined | ||
| const asyncHook = async_wrap.createHook({ init, pre, post }); | ||
| asyncHook.enable(); |
jasnell
Apr 29, 2016
Member
If enable() returned the hook these could be chained
If enable() returned the hook these could be chained
| List of all providers that may trigger the `init` callback. | ||
|
|
||
| ### Constructor: `AsyncHook` | ||
|
|
jasnell
Apr 29, 2016
Member
API style nit: perhaps export either the Constructor or createHook function as the default export of the async_wrap module the way we do with events?
E.g. const createHook = require('async_wrap'), but based on whichever is expected to be the most used construction method.
API style nit: perhaps export either the Constructor or createHook function as the default export of the async_wrap module the way we do with events?
E.g. const createHook = require('async_wrap'), but based on whichever is expected to be the most used construction method.
trevnorris
May 5, 2016
Author
Contributor
Maybe renaming the require'able to async_hook and have AsyncHook = require('async_hook') then can do new AsyncHook(...).
Maybe renaming the require'able to async_hook and have AsyncHook = require('async_hook') then can do new AsyncHook(...).
jasnell
May 6, 2016
Member
+1
+1
|
|
||
| #### `async_wrap.Providers` | ||
|
|
||
| List of all providers that may trigger the `init` callback. |
jasnell
Apr 29, 2016
Member
What data type? Array of strings? Array of objects?
What data type? Array of strings? Array of objects?
| const net = require('net'); | ||
| // Pretend init, pre, post are all defined | ||
| const asyncHook = async_wrap.createHook({ init, pre, post }); |
jasnell
Apr 29, 2016
Member
Is there a practical limit to how many of these should be created per process? What is the performance impact of creating too many of these?
Is there a practical limit to how many of these should be created per process? What is the performance impact of creating too many of these?
Fishrock123
Apr 30, 2016
Member
As far as I know, almost entirely dependant on the work done from the hook than the hook itself.
As far as I know, almost entirely dependant on the work done from the hook than the hook itself.
|
LGTM, except the mentioned nits. |
|
@trevnorris any chance to you have a branch somewhere with this API implemented that I could checkout and experiment with ? |
|
@mhdawson Not completely. While writing this a few tweaks were added for API consistency. Most of it is implemented in |
|
|
||
| Since its initial introduction with the `AsyncListener` API, `AsyncWrap` has | ||
| slowly evolved to ensure a generalized API that would serve as a solid base for | ||
| module authors who wished to add events to the life cycle of the event loop. |
mike-kaufman
May 3, 2016
I think it would be worthwhile to describe these life-cycle events at a high level. That is, describe the user model for this API. A lot of what I read below is tied up with implementation details around the native handles. I think most users are concerned about when a callback was enqueued, when that callback started execution, and when that callback completed execution.
I think it would be worthwhile to describe these life-cycle events at a high level. That is, describe the user model for this API. A lot of what I read below is tied up with implementation details around the native handles. I think most users are concerned about when a callback was enqueued, when that callback started execution, and when that callback completed execution.
trevnorris
May 4, 2016
Author
Contributor
the high level of node's event loop is already covered in detail in the-event-loop-timers-and-nexttick.md. i can reference that in the EP. also remember this isn't the actual docs entry. it's meant to be a high level overview of the change.
A lot of what I read below is tied up with implementation details around the native handles.
async wrap has always been partially tied to implementation details. that's the reason for its existence, and attempting to abstract that would remove the utility from the user. for example I was doing testing between 4 and 6 and realized that the order of events fires differently for certain operations. this is exactly the type of thing I wanted to know. it helps me debug my code. if we wanted all this abstracted away then we could simply wrap all the js calls. that's not the purpose for async wrap.
I think most users are concerned about when a callback was enqueued
It doesn't work like this. An oncomplete or similar is assigned to the newly created request upon construction. Hence why I chose to notify on construction of the instance
I think what you want is when the I/O request itself was created, which I believe the actual request instance to perform I/O is always created when requested by the user. So init() will handle that. the implementation is also simpler than injecting at all I/O calls. e.g. uv_write(), and since they're run at approximately the same time any counter measurements should be sufficiently correct.
Then pre() will be called when the async operation is complete, and post() will be called when the callback has completed execution. Covering your final points, and both of these are documented in the Hook Callbacks section.
As far as measuring fire-to-completion, it's impossible to do reliably in node. Best we can give is when node knows the request completed, but since all completed requests are queued by the kernel until node asks for them it's impossible to get perfect measurements. For example if 100 requests complete at the same time and each callback runs for 10 ms, we wouldn't know the last request completed until 1 second after it actually did. This is fundamentally part of node and not something we can do much about. Users will be able to track these details, and referring back to exposing implementation details of node this is exactly why it's important and not something that can be abstracted in a way to be useful.
the high level of node's event loop is already covered in detail in the-event-loop-timers-and-nexttick.md. i can reference that in the EP. also remember this isn't the actual docs entry. it's meant to be a high level overview of the change.
A lot of what I read below is tied up with implementation details around the native handles.
async wrap has always been partially tied to implementation details. that's the reason for its existence, and attempting to abstract that would remove the utility from the user. for example I was doing testing between 4 and 6 and realized that the order of events fires differently for certain operations. this is exactly the type of thing I wanted to know. it helps me debug my code. if we wanted all this abstracted away then we could simply wrap all the js calls. that's not the purpose for async wrap.
I think most users are concerned about when a callback was enqueued
It doesn't work like this. An oncomplete or similar is assigned to the newly created request upon construction. Hence why I chose to notify on construction of the instance
I think what you want is when the I/O request itself was created, which I believe the actual request instance to perform I/O is always created when requested by the user. So init() will handle that. the implementation is also simpler than injecting at all I/O calls. e.g. uv_write(), and since they're run at approximately the same time any counter measurements should be sufficiently correct.
Then pre() will be called when the async operation is complete, and post() will be called when the callback has completed execution. Covering your final points, and both of these are documented in the Hook Callbacks section.
As far as measuring fire-to-completion, it's impossible to do reliably in node. Best we can give is when node knows the request completed, but since all completed requests are queued by the kernel until node asks for them it's impossible to get perfect measurements. For example if 100 requests complete at the same time and each callback runs for 10 ms, we wouldn't know the last request completed until 1 second after it actually did. This is fundamentally part of node and not something we can do much about. Users will be able to track these details, and referring back to exposing implementation details of node this is exactly why it's important and not something that can be abstracted in a way to be useful.
mike-kaufman
May 5, 2016
Thanks Trevor for the link to the doc on the event loop - that's really helpful for me. If the goal of the API is to "add events to the life cycle of the event loop", then why isn't the event loop's phase exposed through API calls? Now, I'm not saying that it should be part of the API, but given stated goal, it isn't clear to why this is omitted.
Which is one of the things I'm driving at: Clarification/simplification around the goals API's goals will help get everyone on the same page around the utility of the API & how it is to be used. Ideally, this would include some definitions, goals/non-goals and some canonical use cases/examples.
async wrap has always been partially tied to implementation details. that's the reason for its existence, and attempting to abstract that would remove the utility from the user.
Can you expand on your example here? I'm not following how the description of the API impacts the ability to understand event ordering.
if we wanted all this abstracted away then we could simply wrap all the js calls. that's not the purpose for async wrap.
Again, precisely what I'm driving for - what is the purpose of the API? My understanding is the goal is to expose lifecycle events around async code execution. If not, then that's fine. If the goal is to provide lifecycle events around handles, then great, but it begs the question of "what is a handle and why does the user care?" (per @jasnell's comment).
It doesn't work like this. An oncomplete or similar is assigned to the newly created request upon construction. Hence why I chose to notify on construction of the instance
Sorry, I'm not following this.
I think what you want is...
I'm not following. We may be talking past each other with terminology. What I need is a notification model around async code execution in node. That is, I want to know precisely the following:
- when is a callback c "made available" for invocation, and what is the parent invocation "making it available"?
- when does a callback c begin invocation?
- when does a callback c end invocation?
- when is a callback c no longer available for invocation (arguably optional, but let's stick with it for now as it cleanly maps to proposed
destroy() call).
As far as measuring fire-to-completion, it's impossible to do reliably in node.
I think this is OK, but again, it will help to explicitly list the goals & non-goals of the API. At a minimum it will get everyone reading this on the same page.
I'm happy to take a stab at writing up what I think is a user model for the API, and something that fits cleanly into current async wrap implementation. Likely not 100% correct, but at a minimum useful to highlight differences in understanding. Let me know if people think this is a useful effort.
Thanks Trevor for the link to the doc on the event loop - that's really helpful for me. If the goal of the API is to "add events to the life cycle of the event loop", then why isn't the event loop's phase exposed through API calls? Now, I'm not saying that it should be part of the API, but given stated goal, it isn't clear to why this is omitted.
Which is one of the things I'm driving at: Clarification/simplification around the goals API's goals will help get everyone on the same page around the utility of the API & how it is to be used. Ideally, this would include some definitions, goals/non-goals and some canonical use cases/examples.
async wrap has always been partially tied to implementation details. that's the reason for its existence, and attempting to abstract that would remove the utility from the user.
Can you expand on your example here? I'm not following how the description of the API impacts the ability to understand event ordering.
if we wanted all this abstracted away then we could simply wrap all the js calls. that's not the purpose for async wrap.
Again, precisely what I'm driving for - what is the purpose of the API? My understanding is the goal is to expose lifecycle events around async code execution. If not, then that's fine. If the goal is to provide lifecycle events around handles, then great, but it begs the question of "what is a handle and why does the user care?" (per @jasnell's comment).
It doesn't work like this. An oncomplete or similar is assigned to the newly created request upon construction. Hence why I chose to notify on construction of the instance
Sorry, I'm not following this.
I think what you want is...
I'm not following. We may be talking past each other with terminology. What I need is a notification model around async code execution in node. That is, I want to know precisely the following:
- when is a callback c "made available" for invocation, and what is the parent invocation "making it available"?
- when does a callback c begin invocation?
- when does a callback c end invocation?
- when is a callback c no longer available for invocation (arguably optional, but let's stick with it for now as it cleanly maps to proposed
destroy()call).
As far as measuring fire-to-completion, it's impossible to do reliably in node.
I think this is OK, but again, it will help to explicitly list the goals & non-goals of the API. At a minimum it will get everyone reading this on the same page.
I'm happy to take a stab at writing up what I think is a user model for the API, and something that fits cleanly into current async wrap implementation. Likely not 100% correct, but at a minimum useful to highlight differences in understanding. Let me know if people think this is a useful effort.
Fishrock123
May 5, 2016
Member
but it begs the question of "what is a handle and why does the user care?" (per @jasnell's comment).
I think this has already been answered but if you don't know it may be more helpful to dig though node core while we're working out a potential API for it. We shouldn't expect 100% of things to be documented for end users in an EP. (see the other EPs)
but it begs the question of "what is a handle and why does the user care?" (per @jasnell's comment).
I think this has already been answered but if you don't know it may be more helpful to dig though node core while we're working out a potential API for it. We shouldn't expect 100% of things to be documented for end users in an EP. (see the other EPs)
mike-kaufman
May 5, 2016
I've been through the code. A clear definition of a "handle" and how it relates to the goals of the API aren't in the EP. Again, I'm happy to write this up and iterate on it.
I've been through the code. A clear definition of a "handle" and how it relates to the goals of the API aren't in the EP. Again, I'm happy to write this up and iterate on it.
Fishrock123
May 6, 2016
Member
I think that is probably out of scope. This exposes existing mechanics in a more usable way.
I think that is probably out of scope. This exposes existing mechanics in a more usable way.
jasnell
May 6, 2016
Member
I don't think telling someone to just go dig through node core while we're working out a potential API for it is really all that helpful (or friendly). @mike-kaufman appears to be making the point that while this new API appears useful it's not clear given the description in this eps exactly how it would be used and what benefit it brings. Several examples and a description of the lifecycle of a hook would make that much clearer. No one is asking for 100% of things to be documented for end users. What is being asked for is a bit more clarity... and I think a bit more patience would likely be a good thing also.
I don't think telling someone to just go dig through node core while we're working out a potential API for it is really all that helpful (or friendly). @mike-kaufman appears to be making the point that while this new API appears useful it's not clear given the description in this eps exactly how it would be used and what benefit it brings. Several examples and a description of the lifecycle of a hook would make that much clearer. No one is asking for 100% of things to be documented for end users. What is being asked for is a bit more clarity... and I think a bit more patience would likely be a good thing also.
Fishrock123
May 6, 2016
Member
I still think defining exactly what a handle is, is out of scope here.
I still think defining exactly what a handle is, is out of scope here.
| // pre() is called just before the handle's callback is called. It can be | ||
| // called 0-N times for handles (e.g. TCPWRAP), and should be called exactly 1 | ||
| // time for requests (e.g. FSREQWRAP). | ||
| function pre(id) { } |
mike-kaufman
May 3, 2016
for debug purposes, I suggest passing along the type to all callbacks. There were some cases with current APIs where I wasn't observing init() getting called for some IDs, but was seeing pre()/post(). Having the type in pre()/post)() would help to debug situations like this.
for debug purposes, I suggest passing along the type to all callbacks. There were some cases with current APIs where I wasn't observing init() getting called for some IDs, but was seeing pre()/post(). Having the type in pre()/post)() would help to debug situations like this.
jasnell
May 3, 2016
Member
Agree. I can imagine that in some cases (such as logging) I may not want to have to maintain state to know what type of thing id references.
Agree. I can imagine that in some cases (such as logging) I may not want to have to maintain state to know what type of thing id references.
Fishrock123
May 4, 2016
Member
Might be easier to attach type to the handle?
Might be easier to attach type to the handle?
mike-kaufman
May 4, 2016
Might be easier to attach type to the handle?
Not clear how that will work for the destroy() case where there is no handle. It may be worthwhile defining an async context type that can be passed to these events. Such a type can evolve to include appropriate information such as the type, handle-specific state, etc...
Might be easier to attach type to the handle?
Not clear how that will work for the destroy() case where there is no handle. It may be worthwhile defining an async context type that can be passed to these events. Such a type can evolve to include appropriate information such as the type, handle-specific state, etc...
mike-kaufman
May 4, 2016
The other useful bit of information is the invocation count. i.e., in cases where pre()/post() can be called 0..n times, what will uniquely identify an invocation? I agree that this can be tracked by individually by the hooks, but it is part of the identity of an invocation of a callback. (i.e., an execution of a callback is uniquely identifed by the tuple [asyncId, invocationId])
The other useful bit of information is the invocation count. i.e., in cases where pre()/post() can be called 0..n times, what will uniquely identify an invocation? I agree that this can be tracked by individually by the hooks, but it is part of the identity of an invocation of a callback. (i.e., an execution of a callback is uniquely identifed by the tuple [asyncId, invocationId])
mike-kaufman
May 5, 2016
Sure, let's use "handle". How would you define "handle"? And what term would you use for an invocation of the callback associated with the "handle" and how would you define this?
I'm not claiming the invocation_id is necessary to keep internally. I am claiming that the invocation_id uniquely identifies a specific invocation of a method associated with a "handle".
Sure, let's use "handle". How would you define "handle"? And what term would you use for an invocation of the callback associated with the "handle" and how would you define this?
I'm not claiming the invocation_id is necessary to keep internally. I am claiming that the invocation_id uniquely identifies a specific invocation of a method associated with a "handle".
mike-kaufman
May 5, 2016
Just make a Stacktrace?
Sorry, I'm not following your suggestion.
C++ I/O handler
That's not the case for promises or nextTick() timers. Are there other outliers not represented by an c++ IO handler? Moreover, if the goal of the API is to provide "lifecycle events around async code execution", defining a "handle" as a C++ I/O handler doesn't relate to the goals of the API.
Just make a Stacktrace?
Sorry, I'm not following your suggestion.
C++ I/O handler
That's not the case for promises or nextTick() timers. Are there other outliers not represented by an c++ IO handler? Moreover, if the goal of the API is to provide "lifecycle events around async code execution", defining a "handle" as a C++ I/O handler doesn't relate to the goals of the API.
Fishrock123
May 6, 2016
•
Member
Sorry, I'm not following your suggestion.
Literally just check a strack trace, Besides, that is the only way to know what you are asking for.
That's not the case for promises or nextTick() timers.
Fine. "I/O Handler". Next.
Sorry, I'm not following your suggestion.
Literally just check a strack trace, Besides, that is the only way to know what you are asking for.
That's not the case for promises or nextTick() timers.
Fine. "I/O Handler". Next.
jasnell
May 6, 2016
Member
Fine. "I/O Handler". Next.
Again, this isn't helpful. @Fishrock123 you have the advantage of being far more familiar with the Node.js internals and therefore make certain assumptions about what should just be "known". The questions that @mike-kaufman is asking are not unreasonable and are the kinds of questions that anyone trying to use this API would be asking. Being flippant doesn't address the concern. Can I ask you to please be more patient.
Fine. "I/O Handler". Next.
Again, this isn't helpful. @Fishrock123 you have the advantage of being far more familiar with the Node.js internals and therefore make certain assumptions about what should just be "known". The questions that @mike-kaufman is asking are not unreasonable and are the kinds of questions that anyone trying to use this API would be asking. Being flippant doesn't address the concern. Can I ask you to please be more patient.
| function init(id, type, parentId) { } | ||
| // pre() is called just before the handle's callback is called. It can be | ||
| // called 0-N times for handles (e.g. TCPWRAP), and should be called exactly 1 |
mike-kaufman
May 3, 2016
Also, I don't understand the distinction between "handles" and "requests" and why this impacts the expected number of calls. This is an example of where the implementation details are bleeding through the API.
Also, I don't understand the distinction between "handles" and "requests" and why this impacts the expected number of calls. This is an example of where the implementation details are bleeding through the API.
Fishrock123
May 4, 2016
Member
Which will happen regardless due to how close this API is to tracking what actually happens... which is the point.
Some clarification could perhaps be used though.
Which will happen regardless due to how close this API is to tracking what actually happens... which is the point.
Some clarification could perhaps be used though.
mike-kaufman
May 4, 2016
Which will happen regardless due to how close this API is to tracking what actually happens... which is the point.
Is this what actually happens when you consider promises & next tick callbacks? In these cases, there is no "handle", right? There is no "object" being constructed during init().
My point is, if the goal of the API is to provide life cycle events around async code execution, then the API should be described in terms of life cycle events of async code execution. I believe that getting to a crisp definition of these events will help ensure API consistency across different types of handles, and help drive correct semantics for next tick & promise callbacks.
Which will happen regardless due to how close this API is to tracking what actually happens... which is the point.
Is this what actually happens when you consider promises & next tick callbacks? In these cases, there is no "handle", right? There is no "object" being constructed during init().
My point is, if the goal of the API is to provide life cycle events around async code execution, then the API should be described in terms of life cycle events of async code execution. I believe that getting to a crisp definition of these events will help ensure API consistency across different types of handles, and help drive correct semantics for next tick & promise callbacks.
chrisdickinson
May 4, 2016
With promises there is a handle (or maybe a "request" is a better term), in the form of the microtask queue entry — which does not correspond to the promise, but to a call to resolve/reject() (if there are downstream promises dependent on an upstream promise-yet-to-be-settled) or .then (if the upstream promise is already resolved, but a new dependent promise is introduced.)
With promises there is a handle (or maybe a "request" is a better term), in the form of the microtask queue entry — which does not correspond to the promise, but to a call to resolve/reject() (if there are downstream promises dependent on an upstream promise-yet-to-be-settled) or .then (if the upstream promise is already resolved, but a new dependent promise is introduced.)
trevnorris
May 4, 2016
Author
Contributor
It's not an implementation detail. It's a concrete detail in how node works. handles do work via requests. e.g. when a TCP server handle receives a new connection, it receives a new connection request. which is then converted into a handle that can receive requests like new data packet. this has an important distinction when wanting to track metrics and resources.
and just because an object isn't an instance of ReqWrap doesn't mean it doesn't fit in this model. hypothetically all async activity could be made a subclass of HandleWrap or ReqWrap, but we don't for performance reasons.
There is no "object" being constructed during init().
TickObject() is constructed for every call to nextTick(). Promise has an internal construction mechanism. Promises construct a new object. The ECMA spec shows that each new promise has several internal fields set for each newly constructed instance.
But technically they would each be a request. They only trigger once (Promises will fire on multiple provided resolve/reject callbacks, but only once) and they themselves will never hold references to another request.
It's not an implementation detail. It's a concrete detail in how node works. handles do work via requests. e.g. when a TCP server handle receives a new connection, it receives a new connection request. which is then converted into a handle that can receive requests like new data packet. this has an important distinction when wanting to track metrics and resources.
and just because an object isn't an instance of ReqWrap doesn't mean it doesn't fit in this model. hypothetically all async activity could be made a subclass of HandleWrap or ReqWrap, but we don't for performance reasons.
There is no "object" being constructed during init().
TickObject() is constructed for every call to nextTick(). Promise has an internal construction mechanism. Promises construct a new object. The ECMA spec shows that each new promise has several internal fields set for each newly constructed instance.
But technically they would each be a request. They only trigger once (Promises will fire on multiple provided resolve/reject callbacks, but only once) and they themselves will never hold references to another request.
chrisdickinson
May 6, 2016
•
But technically they would each be a request. They only trigger once (Promises will fire on multiple provided resolve/reject callbacks, but only once) and they themselves will never hold references to another request.
Just to be sure we're all on the same page about promises-as-an-async-resource: the asynchronous component of a promise is not tied to Promise object creation, but rather through the addition of a handler (for a resolved settled promise) or the resolution of a value (for a pending promise with existing handlers.) The asynchronous resource is the microtask — not the promise. In node handle parlance, a promise is a factory for creating requests, not a request in and of itself.
But technically they would each be a request. They only trigger once (Promises will fire on multiple provided resolve/reject callbacks, but only once) and they themselves will never hold references to another request.
Just to be sure we're all on the same page about promises-as-an-async-resource: the asynchronous component of a promise is not tied to Promise object creation, but rather through the addition of a handler (for a resolved settled promise) or the resolution of a value (for a pending promise with existing handlers.) The asynchronous resource is the microtask — not the promise. In node handle parlance, a promise is a factory for creating requests, not a request in and of itself.
| // createHook() returns an instance of the AsyncHook constructor that can | ||
| // control how those hooks are used. For example, enabling or disabling their | ||
| // execution. | ||
| asyncHook = async_wrap.createHook({ init, pre, post, destroy }); |
mike-kaufman
May 3, 2016
I would expect a corresponding unhook() call to deregister a set of callbacks.
I would expect a corresponding unhook() call to deregister a set of callbacks.
jasnell
May 3, 2016
Member
Yes, I was wondering about this also. It appears that with this API there is no way to explicitly remove a hook. Clear documentation about the lifecycle of a hook would be fantastic.
Yes, I was wondering about this also. It appears that with this API there is no way to explicitly remove a hook. Clear documentation about the lifecycle of a hook would be fantastic.
Fishrock123
May 4, 2016
Member
Maybe better as new Hook()? As far as I understand it doesn't actually attach until you enable() it, so the counter is disable().
Maybe better as new Hook()? As far as I understand it doesn't actually attach until you enable() it, so the counter is disable().
mike-kaufman
May 4, 2016
My understanding is adding a "hook" instance will put it in the list of async hooks. On each callback, this list needs to be iterated, and for each hook, test if enabled & if enabled, then invoke the hook. So, a remove() call is different from disable(); it explicitly removes the hook from the set of async hooks. Let me know if my understanding is incorrect.
My understanding is adding a "hook" instance will put it in the list of async hooks. On each callback, this list needs to be iterated, and for each hook, test if enabled & if enabled, then invoke the hook. So, a remove() call is different from disable(); it explicitly removes the hook from the set of async hooks. Let me know if my understanding is incorrect.
trevnorris
May 4, 2016
Author
Contributor
disable() is a flag that toggles which set of callbacks need to be called for that instance. so if there's only one instance and it's disabled then the normal checks won't anything needs to be called.
since even after a hook has been disabled, it will continue to receive notifications on async paths already taken (100% intentional design). if you want an api that removes even those calls then that comes with the caveat that resource cleanup is then on the user I can think of a way to make that possible.
Maybe better as new Hook()?
I could live with that.
disable() is a flag that toggles which set of callbacks need to be called for that instance. so if there's only one instance and it's disabled then the normal checks won't anything needs to be called.
since even after a hook has been disabled, it will continue to receive notifications on async paths already taken (100% intentional design). if you want an api that removes even those calls then that comes with the caveat that resource cleanup is then on the user I can think of a way to make that possible.
Maybe better as
new Hook()?
I could live with that.
mike-kaufman
May 5, 2016
If I understand correctly, desired behavior is that if init() fires, then all callbacks fire for that handle instance. An explicit remove() will break this assertion.
If I understand correctly, desired behavior is that if init() fires, then all callbacks fire for that handle instance. An explicit remove() will break this assertion.
trevnorris
May 5, 2016
Author
Contributor
so basically running hooks.remove() means that those callbacks will never fire again. even if they've propagated though other asynchronous there are a couple technical difficulties there that'll have to be worked out, but probably possible to implement efficiently.
so basically running hooks.remove() means that those callbacks will never fire again. even if they've propagated though other asynchronous there are a couple technical difficulties there that'll have to be worked out, but probably possible to implement efficiently.
| // Enable hooks for the current synchronous execution scope. This will ensure | ||
| // the hooks are not in effect in case of multiple returns, or if an exception | ||
| // is thrown. | ||
| asyncHook.scope(); |
mike-kaufman
May 3, 2016
I'm not following the use of the scope() function. Can you elaborate on this?
I'm not following the use of the scope() function. Can you elaborate on this?
trevnorris
May 4, 2016
Author
Contributor
Running the following:
hook.enable();
net.createServer(fn).listen(8080);
hook.disable();
will allow hook's callback to continue firing on all callbacks that are triggered by the new Server instance. Though this can be error prone if there are multiple return locations. e.g.
setTimeout(() => {
hooks.enable();
if (checkState())
return;
fs.open(path, r, fn);
hooks.disable();
}, 100);
Because of the early return and not first running hooks.disable() that set of hooks could remain enabled indefinitely. So hooks.scope() would guarantee that they are disabled when the current stack unwinded. e.g.
setTimeout(() => {
hooks.scope();
if (checkState())
return;
fs.open(path, r, fn);
}, 100);
I'll probably end up needing to place in massive bold letters that it lasts until the current stack has unwound, or else users will think it's function scope specific. Like const or let.
Running the following:
hook.enable();
net.createServer(fn).listen(8080);
hook.disable();will allow hook's callback to continue firing on all callbacks that are triggered by the new Server instance. Though this can be error prone if there are multiple return locations. e.g.
setTimeout(() => {
hooks.enable();
if (checkState())
return;
fs.open(path, r, fn);
hooks.disable();
}, 100);Because of the early return and not first running hooks.disable() that set of hooks could remain enabled indefinitely. So hooks.scope() would guarantee that they are disabled when the current stack unwinded. e.g.
setTimeout(() => {
hooks.scope();
if (checkState())
return;
fs.open(path, r, fn);
}, 100);I'll probably end up needing to place in massive bold letters that it lasts until the current stack has unwound, or else users will think it's function scope specific. Like const or let.
mike-kaufman
May 5, 2016
Someone could achieve analogous functionality with a try/finally block, right?
Someone could achieve analogous functionality with a try/finally block, right?
mike-kaufman
May 5, 2016
If I understand scope() properly, this
() => {
hooks.scope();
if (checkState())
return;
fs.open(path, r, fn);
}
is equivalent to this:
() => {
try {
hooks.enable();
if (checkState())
return;
fs.open(path, r, fn);
}
finally {
hooks.disable();
}
}
If I understand scope() properly, this
() => {
hooks.scope();
if (checkState())
return;
fs.open(path, r, fn);
}is equivalent to this:
() => {
try {
hooks.enable();
if (checkState())
return;
fs.open(path, r, fn);
}
finally {
hooks.disable();
}
}
jasnell
May 6, 2016
Member
Based on the description that would be my understanding as well @mike-kaufman ... only that try/finally bring along it's own additional cost.
At the risk of repainting the bike shed, my key concern with scope() is with the name -- it's not clear at all and @trevnorris has already indicated that there could be some confusion around what scope it's actually referring to. Not sure if I have a better suggestion tho. I know it's ugly (and don't have a better suggestion) but something like hook.forThisStack() would be clearer but admittedly is a horrible name.
Based on the description that would be my understanding as well @mike-kaufman ... only that try/finally bring along it's own additional cost.
At the risk of repainting the bike shed, my key concern with scope() is with the name -- it's not clear at all and @trevnorris has already indicated that there could be some confusion around what scope it's actually referring to. Not sure if I have a better suggestion tho. I know it's ugly (and don't have a better suggestion) but something like hook.forThisStack() would be clearer but admittedly is a horrible name.
rvagg
Jul 6, 2016
Member
I like .syncScope() for this, it adds a lot more clarity to what it's trying to do
I like .syncScope() for this, it adds a lot more clarity to what it's trying to do
|
|
||
| Execution of the `nextTickQueue` and `MicrotaskQueue` are slightly special | ||
| asynchronous cases. Because they execute in the same `MakeCallback()` as the | ||
| asynchronous callback they will ultimately end up with the same parent as the |
mike-kaufman
May 3, 2016
Because they execute in the same MakeCallback() as the asynchronous callback they will ultimately end up with the same parent as the originating call
IMO this needs to be addressed. Specifically, nextTick() & promises need to maintain correct semantics wrt their async parent and wrt the callbacks that are being fired.
Because they execute in the same
MakeCallback()as the asynchronous callback they will ultimately end up with the same parent as the originating call
IMO this needs to be addressed. Specifically, nextTick() & promises need to maintain correct semantics wrt their async parent and wrt the callbacks that are being fired.
|
My biggest question after reading this and doing some tests with process.binding('async_wrap'); is: don't we have to define what you can/cannot assume in terms of the object passed as 'this' ? I'm thinking there needs to be a mapping between the providers, their callbacks and what object you can expect 'this' to be in each case. If that's true then I start to wonder about how much of the internals we'll be exposing and how that might constrain what we change in the future. Key would be to document what will or won't change across releases in terms of what you get for 'this' and the shape of those objects. As a concrete example for: crypto.randomBytes(): this is -> InternalFieldObject {
ondone:
{ [Function]
[length]: 0,
[name]: '',
[arguments]: null,
[caller]: null,
[prototype]: { [constructor]: [Circular] } } }
crypto.pbkdf2() this is -> InternalFieldObject { ondone: { [Function] [length]: 2, [name]: '' } }
and its not clear how I figure out from what's being passed to the hooks how you figure out which specific hook triggered the callback. Maybe more than is currently encoded in the private api is going to be encoded into type in the init call, currently it just looks like the provider. If the type will help to identify the specific callback per provider then defining what will be in type and the values would help. |
|
Per @mhdawson's comments, are there specific reasons why the actual handle object is being passed to the hooks? I am also concerned about the level of internal details being exposed here. Would be nice to understand the use cases for this. |
I'm fine with the idea that each constructor receives its own unique provider id. Not sure what you mean by "their callbacks", and the expected
All of this is already exposed via As for what we can change, there's no guarantee what fields will be available. Part of the initial concept of this API was users who wanted to know what node was doing. Not have just another abstracted API, that could be done easily enough in another way. I'm sure there'll be disagreement about what we should be able to rely on once a branch has reached stable, but that was never part of the initial design plan. It's basically "accessing Re:
Some users want to store information directly on the handle. Despite the id each has, it's the easiest way to propagate information and allow the GC to clean it up automatically. Ideally in the future there could be a basic set of calls that could be standardized (e.g. 'use strict';
const async_wrap = process.binding('async_wrap');
const print = process._rawDebug;
var handle;
async_wrap.setupHooks({ init() { handle = this } });
async_wrap.enable();
var server = require('net').createServer().listen(8080);
print(server._handle === handle);
server.close();
// output: trueI use it for debugging as well. With the understanding that things change, that's part of its utility. By addition of the id, explicitly passing the provider, etc. we're not forcing use of the handle on anyone. Simply making it available in a way that makes sense for the context of the call, and in a way that users like APMs will find very useful. |
Providing storage for the async context is different than exposing the handle though.
IMO, I think providing a context object which has a consistent shape & properties like provider type and handle is a cleaner API than passing the handle directly. It still meets the criteria of providing arbitrary storage associated with the "handle", it provides a place to define a common interface across handles, and it can evolve independently of the underlying handle.
I'm still not following how APMs will utilize the handle. Is there specfiic data on the handle that is useful? If so, what is this? |
Creating and tracking a new async context for every handle, and tracking it, is expensive. By attaching properties directly to the handle instance GC will take care of it all automatically, and at the least expense.
This can be, or at least should be, construct-able by the user. Creating all these new objects filled with properties is expensive, and you're missing that printing the actual contents of the handle is useful. And I don't share the concern about possibly needing to standardize properties in the handle and making it difficult for node to move forward. I've been aiming for a more standardized lower-level API, and "hiding" properties on an object in a significant way has become easier with ES6. But this is a separate topic.
Here's a really basic example script that should explain how useful it is to be able to see the handles themselves while debugging: 'use strict';
const async_wrap = process.binding('async_wrap');
const print = process._rawDebug;
const ctx_array = [];
async_wrap.setupHooks({
init() { /*print(this)*/ },
pre() {
if (ctx_array.indexOf(this) === -1) {
ctx_array.push(this);
print(this);
}
},
});
async_wrap.enable();
process.on('exit', () => print(ctx_array.length));
require('net').createServer(function(c) {
require('fs').readFile(__filename, () => {
c.end(new Buffer(1024 * 1024 * 100).fill('a'));
this.close();
});
}).listen(8080);
require('net').connect(8080, function() { this.resume() });In there you'll see a I hope this demonstrates the utility for being able to analyse each handle. All of the things mentioned in the previous paragraph cannot be obtained any other way. Removing the ability to see the handle would be a blow to the API, and basically be one step towards moving it to nothing more than a continuation-local-storage API. |
|
@trevnorris what I was referring to in respect to "their callbacks" was that for the CRYPTO provider there are multiple cases were callbacks are wrapped such that they pre, post methods are invoked (as in my example). I think your comment about making each of these have their own provider id addresses that question. I terms of the discussion about visibility of the handles, from what you describe we should document both in this eps what it's ok/not ok to use the handlers for and what expectations are. For example:
If we believe that documenting a list like this is enough protection from being boxed in later when users of the API are broken by later Node.js releases and complain, then passing the handles would be fine. If we were concerned that despite the warnings we'd still be trying to avoid breakage passing some other field from wrapper could make sense. |
|
@trevnorris a couple more clarifying questions ... Let's say I create a hook and some dependency module I'm using creates a hook... when those are called, are they passed the same id and handle, different id's same handle, same id's different handle or different ids and different handles? (and by handle here I mean the js object that wraps the actual handle). The main reason I ask is that if I'm attaching additional context to the handle, it would be helpful to also know that other hooks could be attaching their own context to the same handle. I'm still wondering about the potential cost of creating too many of these which is why I think describing the specific lifecycle from when a hook is created to when it is destroyed would be very helpful. While I understand that you've designed and implemented this to be as low impact on performance as possible, there is a non-zero cost to calling these hooks. Have you had the opportunity yet to benchmark an upper limit to the number of hooks that can be created without having a serious impact on performance? My key concern with this is that an app developer may not have any idea that dependency modules they may be using could be going out and creating hooks. Depending on how many such dependencies they have, they could end up seeing degraded performance without any clear indication as to why since installing the hook appears to be a completely transparent operation (that is, there's no indication that a new hook was created). |
|
9 votes in favor: @indutny @Trott @trevnorris @rvagg @evanlucas @ChALkeR @Fishrock123 @mscdex @cjihrig 0 votes against: 3 abstentions: @misterdjules @addaleax @thealphanerd 6 members who did not cast a vote and did not indicate they were abstaining: @chrisdickinson @shigeki @bnoordhuis @mhdawson @ofrobots @jasnell |
|
@trevnorris Thanks for the deep explanation. My hope with the create/queue event split was that CLS could link In its current state, async_wrap only solves part of the problem of async transaction tracing. You can get the code path the internals took, but you can't really get the more contextually useful path of how it got through the users code. |
|
@trevnorris By the way, thank you for all your work in Node and async_wrap.
Is this how the current unofficial async_wrap works? I'm sorry I'm having trouble following the changes in behavior b/w this EPS and the current released async_wrap. The real hope for myself and I think many others using CLS, is if this EPS version eliminates or help reduces the monkey-patching needed in order to have reliable CLS in node? I might be off-base, but it sounds like it won't b/c of its impact on performance? I'd give up a |
The current implementation is a hybrid. There were a lot of requests to allow multiple hook instances. Instead of having a single global set. This required propagating the state of hooks for each execution stack.
When it comes to patching
I put in many hours in an attempt to make this functionality opt-in, but the problem is a certain amount of state still needs to be maintained and propagated because a hook may be enabled at any time. In my PR nodejs/node#8531 I purposely left the commits intact so anyone could view the effort to keep the old functionality in. |
@Qard I think @trevnorris is going to be making |
|
Maybe. I'll have to dig into it at some point to see how it works. I don't work in APM anymore, so I haven't been paying super close attention to this stuff lately. |
| @@ -286,9 +308,9 @@ const async_hooks = require('async_hooks'); | |||
| const net = require('net'); | |||
| asyn_hooks.createHook({ | |||
evanlucas
Oct 14, 2016
Member
There is a typo here, should be async_hooks
There is a typo here, should be async_hooks
trevnorris
Oct 17, 2016
Author
Contributor
:-P Thanks. On it.
:-P Thanks. On it.
| reasonable use case. This API is meant to err on the side of requiring explicit | ||
| instructions from the user. | ||
|
|
||
| To help simplify this `enable()` returns the instance of itself so the calls |
overlookmotel
Oct 19, 2016
I think this should say "createHook() returns the instance of itself..."
I think this should say "createHook() returns the instance of itself..."
| // used in other emit calls. | ||
| async_hooks.emitInit(id, handle, type[, triggerId]); | ||
| // Call the before() callbacks. The reason for requiring both arguments is |
overlookmotel
Oct 19, 2016
Is this line now redundant?
Is this line now redundant?
trevnorris
Jan 6, 2017
Author
Contributor
forgot to add a parameter to emitBefore().
forgot to add a parameter to emitBefore().
|
|
||
| The following is a simple demonstration of this: | ||
|
|
||
| ```cpp |
watson
Nov 29, 2016
Member
-```cpp
+```js
-```cpp
+```js7c07d82
to
7c6199a
52d7c97
to
30b9268
After much investigation and communication this is the API for AsyncHooks that has evolved. Meant to be minimal, not impose any performance penalty to core when not being used, and minimal impact when it is used, this should serve public needs that have been expressed over the last two years. PR-URL: #18 Reviewed-By: Fedor Indutny <fedor@indutny.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
After much investigation and communication this is the API that has
surfaced. Meant to be minimal, not impose any performance penalty to
core when not being used, and minimal impact when it is used, this
should serve public needs that have been expressed over the last two
years.
@nodejs/ctc I'd like the initial review explicitly from the CTC before this is opened for too much external debate. Because experience has shown me that there will be suggestions/changes for those who want specific features and/or additions to suit their specific use case. Usually not taking the time to realize that this API is enough. They just need to write the additional code for the hooks.