Skip to content
This repository has been archived by the owner on Aug 29, 2021. It is now read-only.

Normative: Synchronous based on a syntax and module graph #61

Merged
merged 2 commits into from
Mar 26, 2019

Conversation

littledan
Copy link
Member

This patch is a variant on #49 which determines which module subgraphs
are to be executed synchronously based on syntax (whether the module
contains a top-level await syntactically) and the dependency graph
(whether it imports a module which contains a top-level await,
recursively). This fixed check is designed to be more predictable and
analyzable.

Abstract module record changes:

  • The [[Async]] field stores whether this module or dependencies are
    async. It is expected to be initialized by the Linking phase.
  • The [[ExecutionPromise]] field stores the Promise related to the
    evaluation of a module whose [[Async]] field is true.
  • Evaluate() returns a Promise for [[Async]] modules, a completion
    record for sync modules which throw, or undefined otherwise.

Cyclic Module Record changes:

  • A new [[ModuleAsync]] field stores whether this particular module
    is asynchronous (dependencies aside).
  • The ExecuteModule method on Cyclic Module Records takes an
    argument for the Promise capability, but only if that particular
    module is [[ModuleAsync]].
  • The Link/Instantiate phase is used to propagate the [[Async]]
    field up the module graph to dependencies.
  • When there's a cycle, with some modules sync and some async, the
    whole cycle is considered async, with the Promise of each module
    set to the entrypoint of the cycle, although the cycle-closing
    edge will not actually be awaited (since this would be a deadlock).

Source Text Module Record changes:

  • The check for whether a module contains a top-level await locally
    is in a ContainsAwait algorithm (TBD writing this out, but it
    should be static since await may not appear in a direct eval)
  • Module execution works as before if ContainsAwait is false, and
    works like an async function if ContainsAwait is true.

Closes #47, #48, #43

This patch is a variant on tc39#49 which determines which module subgraphs
are to be executed synchronously based on syntax (whether the module
contains a top-level await syntactically) and the dependency graph
(whether it imports a module which contains a top-level await,
recursively). This fixed check is designed to be more predictable and
analyzable.

Abstract module record changes:
- The [[Async]] field stores whether this module or dependencies are
  async. It is expected to be initialized by the Linking phase.
- The [[ExecutionPromise]] field stores the Promise related to the
  evaluation of a module whose [[Async]] field is *true*.
- Evaluate() returns a Promise for [[Async]] modules, a completion
  record for sync modules which throw, or undefined otherwise.

Cyclic Module Record changes:
- A new [[ModuleAsync]] field stores whether this particular module
  is asynchronous (dependencies aside).
- The ExecuteModule method on Cyclic Module Records takes an
  argument for the Promise capability, but only if that particular
  module is [[ModuleAsync]].
- The Link/Instantiate phase is used to propagate the [[Async]]
  field up the module graph to dependencies.
- When there's a cycle, with some modules sync and some async, the
  whole cycle is considered async, with the Promise of each module
  set to the entrypoint of the cycle, although the cycle-closing
  edge will not actually be awaited (since this would be a deadlock).

Source Text Module Record changes:
- The check for whether a module contains a top-level await locally
  is in a ContainsAwait algorithm (TBD writing this out, but it
  should be static since await may not appear in a direct eval)
- Module execution works as before if ContainsAwait is false, and
  works like an async function if ContainsAwait is true.

Closes tc39#47, tc39#48, tc39#43
Copy link
Collaborator

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, I mainly just have an architectural question about sync subgraphs with async leaves.

spec.html Outdated
</emu-alg>
<emu-note>
<p>An implementation may parse module source text and analyse it for Early Error conditions prior to the evaluation of ParseModule for that module source text. However, the reporting of any errors must be deferred until the point where this specification actually performs ParseModule upon that source text.</p>
</emu-note>
<emu-note type=editor>
<p>ContainsAwait needs a formal definition, but the idea is to check whether the code contains a top-level await.</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you working on this still?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for today, need to pack for a trip. If you're interested in grammar algorithm hacking, go for it!

@@ -377,7 +478,8 @@ <h1>InnerModuleEvaluation( _module_, _stack_, _index_ )</h1>
1. Remove the last element of _stack_.
1. Set _requiredModule_.[[Status]] to `"evaluated"`.
1. If _requiredModule_ and _module_ are the same Module Record, set _done_ to *true*.
1. <ins>Otherwise, set _requiredModule_.[[ExecPromise]] to _module_.[[ExecPromise]].</ins>
1. <ins>Otherwise, if _module_.[[Async]] is *true*,</ins>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's unrelated to this PR, but it might be worth adding a note in this spec for this line that this code path ensures cyles have a single execution promise, which I think is an important concept?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a good point. This is rather subtle logic. (Do you agree with it?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I completely agree with it, and think this is a crucial feature for TLA in cycles, that is worth noting more clearly.

spec.html Outdated
1. Return Completion(_result_).
1. <ins>Otherwise,</ins>
1. <ins>Assert: _capability_ was provided.</ins>
1. <ins>Perform ! AsyncModuleStart(_capability_, _module_.[[ECMAScriptCode]], _moduleCxt_, _module_).</ins>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see where the AsyncModuleStart is that is being referenced here? Should this have stayed AsyncBlockStart?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this should be AsyncBlockStart.

README.md Outdated
@@ -272,9 +272,9 @@ Currently (in a world without top-level `await`), polyfills are synchronous. So,

#### Does the `Promise.all` happen even if none of the imported modules have a top-level `await`?

Yes. In particular, if none of the imported modules have a top-level `await`, there will still be a delay of some turns on the Promise job queue until the module body executes. The goal here is to avoid too much synchronous behavior, which would break if something turns out to be asynchronous in the future, or even alternate between those two depending on runtime conditions ("releasing Zalgo"). Similar considerations led to the decision that `await` should always be asynchronous, even if passed a non-Promise.
If the module's execution is deterministically synchronous synchronous (that is, if it and its dependencies each contain no top-level `await`), there will be no entry in the `Promise.all` for that module. In this case, it will run synchronously.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated synchronous.

1. <ins>Otherwise,</ins>
1. <ins>Let _capability_ be ! NewPromiseCapability(%Promise%).</ins>
1. <ins>Set _module_.[[EvaluationPromise]] to _capability_.[[Promise]].</ins>
1. <ins>Perform ! ExecuteModuleWhenImportsReady(_module_, _asyncDependencies_, _capability_).</ins>
Copy link
Collaborator

@guybedford guybedford Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a graph A imports B and C, and B imports D and E in that order:

A -> B  -> D
     B  -> E
A -> C

The execution of the above graph when importing A is DEBCA.

Now if all of these modules are sync except for D, that means that we are entering this ExecuteModuleWhenImportsReady for the entire graph of E, B, C, A execution, just for D being async.

What execution guarantees do we provide for the execution of the sync sequence EBCA here? Would there be microtasks or anything of that sort running there?

Copy link
Collaborator

@guybedford guybedford Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically what I'm saying is, surely BCA could still execute completely sync here even despite B having an async dependency (as soon as D has completed)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think C and E would both execute synchronously, without any microtasks involved. The only Promise chains involved would be basically, D.then(() => B).then(() => A). B and A wait on D, so they can't start synchronously.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But wouldn't it be more like D.then(() => Promise.all(B, C)).then(() => A)? I guess the most concerning here is the C then A part being synchronous, since this means if I just have A->C and I add an import to B above the import to C in A, I'm now losing the execution order of C happening just before A synchronously? Or am I misdiagnosing this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C doesn't participate in the Promise.all, since it's synchronous.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your first summary is the ordering I was expecting. What would be the motivation for the other ordering? It seems more counterintuitive to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get behind the first ordering as long as it is fully well-defined - that is that synchronous first part of the async function runs in normal sync execution order without invoking any microtasks before its execution. So if there was a dependency F that was synchronous before D, that FD would execute in sequence completely synchronously without microtasks or yielding. As long as that is ok, this sounds incredibly well defined to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(where D above is the synchronous part of D before its first "await" statement)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what this spec text (aims to) describe.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is the case, then this all sounds excellent to me!

@littledan
Copy link
Member Author

Feedback from @sokra :

From webpack side this would be something that is possible to implement. It has basically the same semantics as import await but with import await automatically inferred by the bundler.

I still have the concern about that adding async modules at one place will insert microticks into another unrelated part of the module graph. But that is something developer have to deal with. So technically adding async modules is still a API change and not purely an implementation detail. But the impact is very small and nobody will care about that anyway.

The CommonJS interop story is also a bit concerning as require() will return a Promise when an module in the subgraph is async. This also means API change for CommonJS consumers when adding an async module.

Another concern: Transpiling ESM to CJS could become a bit more involved as any import could be async and one doesn't know when only transpiling a single file at a time. It may makes sense to involve some Babel people here.

@littledan
Copy link
Member Author

cc @nicolo-ribaudo

@nicolo-ribaudo
Copy link
Member

Since Babel doesn't currently have the capability of analyzing the depdency graph of a module, we can't toggle between "sync" and "async" import based on that.

I have to admit that I don't know anything yet about this proposal, but the import await statement mentioned by @sokra would make it a lot easier for Babel to implement top-level await: if there is an async import, we can wrap the whole module with a promise and correctly await the imports known to return a promise.
If the async-ness of a module is implicit and can only be defined by looking at it's dependencies, the only way Babel could support this proposal would be by always* wrap modules with a promise and always* await imports.

* "always" would still be configurable by the user. For example, in the module graph mentioned above (#61 (comment)), an user could configure Babel to only compile modules as async when needed:

A -> B  -> D (async)
     B  -> E
A -> C
// Babel config
module.exports = {
  overrides: [{
    include: ["A", "B", "D"],
    plugins: [
      ["@babel/plugin-transform-modules-commonjs", { async: true }]
    ],
  }, {
    include: ["C", "E"],
    plugins: [
      ["@babel/plugin-transform-modules-commonjs", { async: false }]
    ]
  }],
};

@littledan
Copy link
Member Author

littledan commented Mar 23, 2019

I don't think the module body, excluding imports, would need to be transpiled differently based on whether the imports are async, would it? I was hoping this wouldn't affect Babel for that reason.

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Mar 23, 2019

import foo from "module";

console.log(foo);

This could become either

var _module = require("module");

console.log(_module.foo);

or

module.exports = async function () {
  var exports = {};

  var _module = await require("module");

  console.log(_module.foo);

  return exports;
}();

@sokra
Copy link

sokra commented Mar 25, 2019

Babel would probably put the async logic in a helper function:

import foo from "moduleA";
import bar from "moduleB";

console.log(foo, bar);

export default 42;
var _moduleA = require("moduleA");
var _moduleB = require("moduleB");

module.exports = babelAsyncModuleHelper([_moduleA, _moduleB], ([_moduleA, _moduleB]) => {
  console.log(_moduleA.default, _moduleB.default);
  exports.default = 42;
  return exports;
});
function babelAsyncModuleHelper(modules, fn) {
    var hasPromise = modules.some(m => m instanceof Promise);
    if(hasPromise) {
        return Promise.all(modules).then(fn);
    } else {
        return fn(modules);
    }
}

So seems doable...

@nicolo-ribaudo
Copy link
Member

That's a good idea!

@littledan
Copy link
Member Author

Nit: Rather than awaiting all modules when any one of them are a Promise, you can filter just the Promise ones, and then Promise.all those (if the result is non-empty). That's the semantics of this patch.

@sokra
Copy link

sokra commented Mar 25, 2019

Nit: Rather than awaiting all modules when any one of them are a Promise, you can filter just the Promise ones, and then Promise.all those (if the result is non-empty). That's the semantics of this patch.

I don't think this makes a difference. Instead I think the effect of the patch is:

// with this patch
function babelAsyncModuleHelper(modules, fn) {
    var hasPromise = modules.some(m => m instanceof Promise);
    if(hasPromise) {
        return Promise.all(modules).then(fn);
    } else {
        return fn(modules);
    }
}

vs.

// original proposal
function babelAsyncModuleHelper(modules, fn) {
    return Promise.all(modules).then(fn);
}

@littledan
Copy link
Member Author

littledan commented Mar 25, 2019

To be precise, the effect of this patch is

// with this patch
function babelAsyncModuleHelper(modules, fn) {
    var promises = modules.filter(m => m instanceof Promise);
    if(promises.length !== 0) {
        return Promise.all(promises).then(mods =>
          fn(modules.map(m => m instanceof Promise ? mods.shift() : m)));
    } else {
        return fn(modules);
    }
}

@nicolo-ribaudo
Copy link
Member

That's to avoid awaiting modules which export function then, right?

@littledan
Copy link
Member Author

Well, it's not intended to be any sort of comprehensive fix to that; if such a module has a top-level await, then it will still wait on its exported function then. It's just what this patch does. It's intended to be conservative and leave synchronous module subgraphs synchronous.

@littledan
Copy link
Member Author

I haven't heard any strong objections to this proposal, though I understand if some folks prefer the restrictions of explicit import await #60 , or, on the other hand, the regularity (from some perspectives) of #57. I'm landing this patch as the cleanest compromise I can see between the two in this design space; we can continue to iterate on the semantics in the future.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid reordering execution of module bodies before top-level await is reached
4 participants