Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AsyncLocalStorage loses the store if used together with createHook if there is a async function in between #53037

Open
mcollina opened this issue May 17, 2024 · 31 comments
Labels
async_hooks Issues and PRs related to the async hooks subsystem.

Comments

@mcollina
Copy link
Member

Consider the following code:

import { AsyncLocalStorage, createHook } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const hook = createHook({
  init () {},
})

// Commenting out this line will make the bug go away
hook.enable();

const asyncLocalStorage = new AsyncLocalStorage();

async function main () {
  // Commenting out this line will make the bug go away
  await 1
  asyncLocalStorage.enterWith({ foo: 'bar' });
}

await main()

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

Note that:

  1. disabling the hook solve the problem
  2. removing the await 1 promise solves the problem

Debugging the executionAsyncResource() in this case:

Promise {
  <pending>,
  [Symbol(async_id_symbol)]: 12,
  [Symbol(trigger_async_id_symbol)]: 9
}

while in without the createHook().enable() call:

{ [Symbol(kResourceStore)]: { foo: 'bar' } }
@mcollina
Copy link
Member Author

cc @bengl @Qard

@bengl
Copy link
Member

bengl commented May 17, 2024

Additional notes:

I attempted with the following changes, and it still happens:

  • Replace the async_hook with a promise hook. This narrows it down to promise hooks, I think.
  • Moved the hook after the storage construction. The order here doesn't seem to matter, regardless of what type of hook is used, except that it must be before the first await in the example below.
  • Replace the main() function, and awaiting it, with a function that reduces the number of awaits used, in case that's an issue. I could not reproduce without awaiting here. Simply putting a then() containing the checks at the end did not suffice to reproduce.
import { AsyncLocalStorage, executionAsyncResource } from 'node:async_hooks';
import { promiseHooks } from 'node:v8';
import { notEqual } from 'assert/strict';

const asyncLocalStorage = new AsyncLocalStorage();

// Commenting out this line will make the bug go away
promiseHooks.onInit(() => {});

await Promise.resolve().then(() => {
  asyncLocalStorage.enterWith({ foo: 'bar' });
})

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

@Qard
Copy link
Member

Qard commented May 18, 2024

This is known and (unfortunately) expected. It's also presently encoded into the AsyncContext spec proposal, which I'm trying to explain to people why it should not work that way. Specifically the issue is that PromiseHook, ContinuationPreservedEmbedderData, and the design of the future AsyncContext API all follow the path of continuation registration rather than promise resolution, so effectively context routes around awaits instead of through them.

The reason it works before the first await is that the code up until that first await runs synchronously and so it changes the current context before the await happens. That await then stores that context and restores it when it resumes. However if you put it after the first await then it will have already captured the context to restore coming out of the await before that happens. It only flows within that async function, which effectively makes it the same as a local variable. 😐

I'll continue to work on explaining to TC39 and V8 folks why this doesn't work.

@Qard
Copy link
Member

Qard commented May 18, 2024

Oh, and the reason why the hooks.enable() in the first example changes the behaviour is because PromiseHook would otherwise not be enabled until that enterWith() is reached and therefore there would be no captured context from the await to restore when it resumes. If you put the hooks.enable() right before that enterWith() it would work because it would have already passed the point when it could have captured a context value for the await.

@mcollina
Copy link
Member Author

mcollina commented May 18, 2024

Can you expand on why this works:

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const hook = createHook({
  init () {},
})

hook.enable();

const asyncLocalStorage = new AsyncLocalStorage();

await 1

asyncLocalStorage.enterWith({ foo: 'bar' });

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

While this doesn't?

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const hook = createHook({
  init () {},
})

hook.enable();

const asyncLocalStorage = new AsyncLocalStorage();

await Promise.resolve(1).then(() => {
  asyncLocalStorage.enterWith({ foo: 'bar' })
});

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

This is really unintuitive.

@VoltrexKeyva VoltrexKeyva added the async_hooks Issues and PRs related to the async hooks subsystem. label May 18, 2024
@Qard
Copy link
Member

Qard commented May 19, 2024

The enterWith(...) call will set the context until something would change it. In your first example the gap between where you set it and where you try to read it is sync so nothing would have changed it. Whereas in your second example it is set within an awaited promise continuation so the context value captured when the await started would replace that value when the await resumes.

Both async functions and promise continuations occur within this sort of dead space where context will only flow internally. In the case of async/await, the context at the point the await yields is restored when the await resumes. In the case of promises the context is captured when a continuation is added and restored when it runs, but it will do this for each point in the chain, so if you chain multiple continuations in place they will all link back to the same context rather than nesting. You would have to nest the continuation attachment for it to nest the contexts.

The more user-intuitive flow one would expect it to have is that promise continuations and await resumes are treated in the same way as callbacks. The point of confusion is that callbacks have a single point that functions both as callback registration which naturally leads into resolution, but the registration is the final point in the flow before it goes to the native side. Whereas promises split the two concerns and people are conflating the current design of AsyncLocalStorage binding callbacks as meaning that the appropriate place to bind such context is at registration point. Which is actually not the case, just the resolve happens in unobservable code and, due to implementation details, async_hooks and therefore AsyncLocalStorage needs that context binding to occur in observable code.

We don't flow context into internals, so there are a bunch of points where the value stored in context would be technically incorrect or empty, but it doesn't matter because it's impossible to acquire the context value at those points as it's in native code at the time. My rewrite to use AsyncContextFrame was an attempt to begin making that more correct, but that got blocked on V8 internal politics and I haven't got back to it yet...hopefully I will soon.

@legendecas
Copy link
Member

legendecas commented May 20, 2024

Isn't this an issue that constructing an asyncLocalStorage doesn't enable promise hooks for performance optimization, until setting a value in an asyncLocalStorage? Enabling promise hooks before and after a promise creation/resolution would change how AsyncLocalStorage propagates a value is a problem on either semantics.

@mcollina
Copy link
Member Author

Isn't this an issue that constructing an asyncLocalStorage doesn't enable promise hooks for performance optimization, until setting a value in an asyncLocalStorage?

Partially, but I think it's a good tradeoff.

We might want to document this "odd" behavior.

@littledan
Copy link

I'm confused about @Qard 's above comment that this issue exists in AsyncContext. The performance optimization that @legendecas mentions would be disabled for any native AsyncContext implementation. (Also enterWith doesn't exist in AsyncContext.) Could you explain the relationship?

@Qard
Copy link
Member

Qard commented May 20, 2024

The issue is that the user is expecting the context to flow out of an await.

async function main () {
  await 1
  asyncLocalStorage.enterWith({ foo: 'bar' });
}

await main()

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

While this does use the current scope modifying enterWith(...) rather than run(...), it could equally be expressed like this and have the same problem:

await asyncLocalStorage.run({ foo: 'bar' }, async () => {
  await 1
})

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

The expected flow is that the context flows into asynchronously following code, which it does not do whereas callback code does. A user could reasonably expect rewriting of callback-based code into promise-based code to continue to flow in roughly the same way, but this is not the case.

@littledan
Copy link

@Qard Do you agree that the bug as originally filed can be fixed without that other change?

@mcollina
Copy link
Member Author

mcollina commented May 20, 2024

The issue is that the user is expecting the context to flow out of an await.

I think the issue is that the user expects these two following code to be identical:

async function main () {
  await 1
  asyncLocalStorage.enterWith({ foo: 'bar' });
}

await main()

// Should not be undefined

notEqual(asyncLocalStorage.getStore(), undefined);
await 1
asyncLocalStorage.enterWith({ foo: 'bar' });
await 1

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

Most developers would think those are identical. Only experts would notice the difference (I was fooled by it on first sight, even after implmeneting a chunk of it).

@Qard
Copy link
Member

Qard commented May 20, 2024

@Qard Do you agree that the bug as originally filed can be fixed without that other change?

How? What other change? I'm not clear what you are referring to.

@littledan
Copy link

The other change being turning off the optimization that legendecas identified in #53037 (comment) . Such an optimization will not apply to AsyncContext.

@Qard
Copy link
Member

Qard commented May 21, 2024

That doesn't fix the flow problem I described. It would make the behaviour consistent, but consistently not what the user is expecting.

@legendecas
Copy link
Member

I think the issue is that the user expects these two following code to be identical:

Most developers would think those are identical. Only experts would notice the difference (I was fooled by it on first sight, even after implmeneting a chunk of it).

enterWith is the factor that make the pattern unintuitive. If the AsyncLocalStorage was accessed with run instead, the two examples would be identical:

async function main () {
  await 1
  asyncLocalStorage.run({ foo: 'bar' }, () => {});
}

await main()

equal(asyncLocalStorage.getStore(), undefined);
await 1
asyncLocalStorage.run({ foo: 'bar' }, () => {});
await 1

equal(asyncLocalStorage.getStore(), undefined);

@mcollina
Copy link
Member Author

Indeed, but that destroys the goal. asyncLocalStorage.run() leads to worse DX, which is the whole point of this exercise.

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const asyncLocalStorage = new AsyncLocalStorage();

await asyncLocalStorage.run({ foo: 'bar' }, async () => {
  await 1
  console.log(executionAsyncResource());
  // Should not be undefined
  notEqual(asyncLocalStorage.getStore(), undefined);
})

vs

import { AsyncLocalStorage, createHook, executionAsyncResource } from 'node:async_hooks';
import { notEqual } from 'assert/strict';

const asyncLocalStorage = new AsyncLocalStorage();

asyncLocalStorage.enterWith({ foo: 'bar' })

await 1

console.log(executionAsyncResource());

// Should not be undefined
notEqual(asyncLocalStorage.getStore(), undefined);

The latter is 100% more idiomatic javascript.

@littledan
Copy link

littledan commented May 21, 2024

worse DX, which is the whole point of this exercise.

Which exercise? If you were talking about AsyncContext: I don’t think the point of AsyncContext is to bring DX improvements vs AsyncLocalStorage. It is more about being a subset which can be implemented across environments, optimized well, etc.

@jridgewell
Copy link
Contributor

jridgewell commented May 21, 2024

Indeed, but that destroys the goal. asyncLocalStorage.run() leads to worse DX, which is the whole point of this exercise. …
The latter is 100% more idiomatic javascript.

Note that we've discussed how to bring DX improvements to AsyncContext proposal in a way that would be consistent to developers:

const ctx = new AsyncContext.Variable();

async function main() {
  using _ = ctx.scope('foo');
  equal(ctx.get(), 'foo');
  await 1
  equal(ctx.get(), 'foo');
}

const m = main();
// Doesn't matter that main started executing, the suspend point restores previous context
equal(ctx.get(), undefined);

await m;
// Doesn't matter that main finished executing, the await's resume restores previous context
equal(ctx.get(), undefined);

This won't solve @Qard's goal, but it does make OP's problematic code consistent. It shouldn't matter how you structure the code, the variable stores the same undefined value outside the using scope.

@Qard
Copy link
Member

Qard commented May 22, 2024

That flow makes it essentially only a local variable though. Users are expecting it to flow into continuations as is shown by the assumption at the start that the context set within main is available in the top-level scope after main completes. This is the point I'm trying to make--code after await points are functionally the same as nested callbacks and those do receive the context, so the behaviour of not flowing through those points is inconsistent and confusing to users.

@jridgewell
Copy link
Contributor

I think this issue has multiple possible goals at the same time:

  • Inconsistent behavior for seemingly equivalent code structure
  • Improved DX without nesting in callbacks
  • Flow-through contexts

I don't know what priority @mcollina sets for these, or even if he intended to hit on all these points in the OP. I'm only saying that we can solve the first 2 without the 3rd. AC already fixes the inconsistent behavior, and we've discussed how to improve DX. If the third is lowest priority, or even unintentional, then it doesn't necessitate a change in design for AC or ALS.

@mcollina
Copy link
Member Author

Improved DX without nesting in callbacks

How can you achieve this without flow-through contexts?

@jridgewell
Copy link
Contributor

See the async function body in #53037 (comment). It is possible to mutate the current context in scope without it leaking out of the scope. It's not defined in the spec yet, needs a extra syntax to be usable, but it's not unsolvable.

If that's the only DX improvement you're looking for, then it's a separate issue from flow-through. If the DX you expect is to leak the value out of the scope (which isn't flow-through semantics either), then that's not possible. If the DX you want is explicitly flow-through (context active during the promise's resolution is used after the await), then that's not the same as scoped mutations.

@mcollina
Copy link
Member Author

If the DX you want is explicitly flow-through (context active during the promise's resolution is used after the await)

Yes. I think this is what the community needs. It appears that quite a few of you are being dismissive of this feedback.
Users would expect this to happen, and be surprised when it doesn't. Framework authors would need to create convoluted APIs to leverage this to work around this limitation.

Why do you think it cannot be done? What are the tradeoffs implementation wise? No one has provided any explanation whatsoever on why, the only comments I have received are "because we say so".
As one of the people that made this possible to begin with (pulling an all nighter in Munich eons ago), I think I have enough context to understand the tradeoffs at play.

@jridgewell
Copy link
Contributor

Users would expect this to happen, and be surprised when it doesn't. Framework authors would need to create convoluted APIs to leverage this to work around this limitation.

Are there links to issues from users showing this confusion? This issue is the only one I've been linked to.

It appears that quite a few of you are being dismissive of this feedback.

Speaking for myself, I think the current semantics that AC, ALS, Zone.js, and CPED expose is the correct one. It's exactly how Vercel implemented it's request store before before I inherited the project (which is the reason I started working on the AC proposal to begin with). If so many implementations land on the same thing, I think that means the semantics must be good.

Why do you think it cannot be done?

I'm assuming this is in response to me saying "If the DX you expect is to leak the value out of the scope (which isn't flow-through semantics either), then that's not possible". My statement is about the leak caused by enterWith. enterWith is not necessary for flow-through semantics. Implementing flow-through semantics in AC is easy, and could be done in a way that does not cause a permanent memory leak of the last stored item.

So, there are 2 orthogonal DX discussions going on here:

  1. Do users expect in-scope mutations?
  2. Do users expect flow-around or flow-through semantics?

We can implement scoped mutations with either flow-around or flow-through semantics. We can chose flow-around or flow-through semantics without implementing scoped mutations.

@Qard
Copy link
Member

Qard commented May 27, 2024

I think the current semantics that AC, ALS, Zone.js, and CPED expose is the correct one. It's exactly how Vercel implemented it's request store.

That is targeting React though, which inherently has a unidirectional execution model--it flows inward to the inner components and only render results flow out so there is no following execution to flow to. Whereas Node.js most definitely has all sorts of branching and converging async which context needs to flow through. The use cases are entirely different, and there are multiple highly experienced Node.js developers in here that have interacted with a huge number of other Node.js developers in the community using AsyncLocalStorage and have a good idea of what user expectations are in that regard. All of those people in this issue are stating the same opinion of how context should flow for typical use cases in Node.js and are seemingly not being heard.

If so many implementations land on the same thing, I think that means the semantics must be good.

No, it just means they're all copying each other because other JS implementations are the closest examples to look to for inspiration.

If you look at literally any other language they all came up with the same different design from this, so you could make the same argument that because there's so many others that came to that same design that there must be something inherently wrong with our current model for being the outlier, and indeed there is. We built our model in this way because of deficiencies in the runtime to be able to express it any differently.


For similar issues demonstrating that ALS flow is not matching expectations:

There's plenty more, that's just the first page of ALS-related issues. The point is, we get lots of usability complaints both within GH issues and I see many more when dealing with APM customers expecting span data to flow and connect in particular ways.

@legendecas
Copy link
Member

@Qard I agree that AsyncLocalStorage could be improved like how it interacts with async generators. But the issues mentioned are not universally agreed on the behavior mentioned described as flow-through in this thread, like #46262 (comment).

@Qard
Copy link
Member

Qard commented May 28, 2024

That linked comment is conflating implementation details rather than consider what actually is the user expectation. Users expect context to flow from execution which, from their perspective, caused the particular continuation to execute. So context flows from calls to callbacks and from resolves to then handlers.

@jridgewell
Copy link
Contributor

jridgewell commented May 28, 2024

That is targeting React though, which inherently has a unidirectional execution model--it flows inward to the inner components and only render results flow out so there is no following execution to flow to.

Vercel's edge runtime isn't specifically React. The use case we had to solve is essentially request logging, so that every console.log() the dev writes is tied to the request that caused it. A particular use case that I had to solve was sharing promises across requests, and if request B starts appending logs using request A's log id because the user awaited request A's promise, they're gonna confuse a bunch of users.

If you look at literally any other language they all came up with the same different design from this, so you could make the same argument that because there's so many others that came to that same design that there must be something inherently wrong with our current model for being the outlier, and indeed there is.

No language that I familiar with uses flows-through semantics, and some but not all implement mutability. Ignoring Node:


Responding to each of the linked ASL issues:

This was actually created in response to our discussions in AC proposal. unhandledrejection is a global handler that can't be installed per request context, so we have to treat it special. The OP's semantics can be solved with either promise construct time's context or the reject time's, but it's cheaper for us to implement the reject time, so we used that. unhandledrejection's registration time never really makes sense, but this isn't an await for flows-through.

The OP there seems to advocate for consistency, and doesn't care if it's flows-around or flows-through. There seems to be a desire for code like the following to work:

describe("store", () => {
  beforeEach(async () => {
    store.enterWith("value");
  });

  it("the store should be set", () => {
    expect(store.getStore()).toEqual("value");
  });
});

This seems like the first mild +1 for flows-through semantics. But Jest could expose APIs for setting up the context snapshot to make this work in either semantics.

This doesn't bring up either flows-around or flows-through, just that async generators should have some context captured during init. Flow-around solves OP by binding the init concrete. Flow-though might actually break OP’s expectations, wouldn’t the .next() call’s context be required since it’s the executions context?

This seems to be an issue with exit() setting the current value to undefined instead of operating as enterWith's pair (restoring the previous value)? I don't think either semantics are asked for here, and AC solves the stacking by having .run(), and potentially if we have a scoped-mutation.

@mcollina
Copy link
Member Author

Vercel's edge runtime isn't specifically React. The use case we had to solve is essentially request logging, so that every console.log() the dev writes is tied to the request that caused it. A particular use case that I had to solve was sharing promises across requests, and if request B starts appending logs using request A's log id because the user awaited request A's promise, they're gonna confuse a bunch of users.

This is the most interesting use-case/feedback of the current behavior and something I haven't considered before (even if "cross the promises" all the time). I can see how a flow-through semantics would be problematic in this case, and how it would break quite a lot of code that make this assumption.

@Qard
Copy link
Member

Qard commented May 29, 2024

That's basically the exact same scenario as connection pooling, which is exactly what the bind(...) method on AsyncResource is for.

No language that I familiar with uses flows-through semantics, and some but not all implement mutability.

.NET does with CallContext.

It also does mutability by using the ExecutionContext class to do scoping. What we do presently is mix scoping and mutation into the same interface rather than allowing the two to be applied separately. This also means we can't have the runtime provide scopes in which to set, we can only make use of user-defined scopes. This is worse DX than using runtime-provided scopes wherever possible.

.NET is by far the most mature language when it comes to context management implementation. It also happens to be the origin of the async/await pattern we're trying to model over so we should probably be drawing most of our inspiration from that.

Responding to each of the linked ASL issues:

I was not implying any of those links were about the flow-through semantics subject, only that ALS has a bunch of related issues. Simply copying existing systems without talking to the people maintaining them to learn where the pains are and what could be done to improve the situation seems to me like a recipe for failure.

Also, and I have been very explicit about this several times now: I am not discounting the value of flow-around semantics. There is definite value to that flow in many scenarios. However for tracing purposes flow-through semantics are almost always what is actually needed. I think we need both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async_hooks Issues and PRs related to the async hooks subsystem.
Projects
None yet
Development

No branches or pull requests

7 participants