Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Top-level await integration for ServiceWorkers running modules #1407

Closed
littledan opened this issue May 19, 2019 · 64 comments · Fixed by #1444
Closed

Top-level await integration for ServiceWorkers running modules #1407

littledan opened this issue May 19, 2019 · 64 comments · Fixed by #1444

Comments

@littledan
Copy link

The top-level await proposal would make module evaluation result in a Promise, rather than synchronous success or failure. This would affect the Update algorithm, which takes the steps in 16.1 if there is such a failure.

Does anyone have such concerns about this evaluation being asynchronous?

If this seems alright, then I'll make sure we have a PR for ServiceWorker after top-level await reaches Stage 3. It looks like there would be a little bit of refactoring on the SW side, so I'll wait for things to settle down editorially on the TC39 side before making more churn with dependent specs.

cc @domenic @guybedford @MylesBorins

@jakearchibald
Copy link
Contributor

I don't see any technical problems adding top-level await. Just a couple of adjustments like you mentioned. However, service workers that use top-level await would be considered bad practice.

When we boot up an active service worker, it's for an event. Sometimes the event is performance sensitive, such as "fetch". Awaiting on anything that isn't needed for TTFB is pretty bad.

So we either:

  1. Allow it since it's part of JavaScript, but provide guidance against it, and maybe warn on the console.
  2. Disable it. Although it kinda feels odd to disable JavaScript features in a particular context like this.

I'm leaning towards 2. What do you think @slightlyoff?

@wanderview
Copy link
Member

Top level await on a module that is offlined probably is no worse than importScripts(). But top-level await on arbitrary async code could be detrimental.

Also, top-level await would not be protected by a waitUntil(). In general we probably couldn't automatically keep it alive because then it would become an abuse vector.

@jakearchibald
Copy link
Contributor

Also, top-level await would not be protected by a waitUntil(). In general we probably couldn't automatically keep it alive because then it would become an abuse vector.

I think we could work around that. We know what event we're starting up for, and we could take any start up time away from that event when/if we get that far. Similar to if you've got a sync infinite loop or some other expensive code at the top level.

@littledan
Copy link
Author

littledan commented May 20, 2019

Top-level await could be useful for certain kinds of module loading, where subresources like dynamically determined JavaScript modules may be important for the code to run. Some of this can be expressed declaratively, e.g., with import-maps, but other cases are more complex.

@wanderview
Copy link
Member

where subresources like dynamically determined JavaScript modules may be important for the code to run

This sort of thing can be a bit of a footgun in service workers, unfortunately. It may work the majority of the time, but it increases the chance they will try to dynamically load a resource that is not offlined when the device does not have network. This is why importScripts() that hit network are only permitted during the install phase and otherwise throw immediately in service workers.

@jakearchibald
Copy link
Contributor

Dynamic module loading can be done with:

const modulePromise = import(useFoo ? './foo' : './bar');

addEventListener('fetch', event => {
  event.respondWith(async function() {
    const { whatever } = await modulePromise;
    // …
  }());
});

This feels better all round, as you won't be blocked on the import if your code doesn't use it.

@littledan
Copy link
Author

Well, thanks for explaining. I think we were hoping that top-level await would be available everywhere (e.g., a deep dependency may use TLA without its users having to worry about it), but it sounds like we may have to rethink that.

@wanderview
Copy link
Member

const modulePromise = import(useFoo ? './foo' : './bar');

Didn't we agree to block dynamic import() for now in #1356 (comment)?

@jakearchibald
Copy link
Contributor

"for now", but we should enable it once we figure out the right way to allow it. I still think the behaviour we have with importScripts is good, so we'd try to make import() match that.

@domenic
Copy link
Contributor

domenic commented Jun 10, 2019

I would like to advocate for "let the user do the bad thing if they want". Remember, top-level await is largely sugar over what would be accomplishable today. (Powerful sugar, because of how it allows dependencies to abstract things away, but still.)

We don't prevent people from doing the following:

(async () => {
  await delay(100);
  self.onfetch = e => ...;
})();

and we should similarly not prevent them from doing the following:

await delay(100);
self.onfetch = e => ...;

@annevk
Copy link
Member

annevk commented Jun 11, 2019

Isn't there a difference between those snippets? In that the first won't handle the incoming fetch event (needs to be handled synchronously as the event is dispatched after parsing) and the second will, but with an unintended delay? The second would be much harder to debug.

@jakearchibald
Copy link
Contributor

And we do prevent people from doing:

(async () => {
  await delay(100);
  addEventListener('fetch', event => );
})();

@jakearchibald
Copy link
Contributor

// 1
import './this-has-top-level-await';

// 2
const modulePromise = import('./this-has-top-level-await');

// 3
await modulePromise;

In a service worker, 1 and 3 are 'bad' because they delay important events. However, 2 is fine.

If we choose to disallow 1 & 3, can we still allow 2? It provides a path to including third party libraries that use top-level-await.

@domenic
Copy link
Contributor

domenic commented Jun 11, 2019

Isn't there a difference between those snippets?

No, they are identical, as far as I can tell.

And we do prevent people from doing:

How is this prevented? I think the same mechanism for prevention should apply to top-level await (instead of a novel mechanism like failing to parse the script).

@wanderview
Copy link
Member

I believe its prevented in step 1 here:

https://dom.spec.whatwg.org/#add-an-event-listener

@littledan
Copy link
Author

littledan commented Jun 11, 2019

Thanks for the context, @wanderview. Maybe all we need to do, then, is ensure the "has ever been evaluated" flag is set synchronously, not asynchronously. This way, we would throw a TypeError if the event listener is added after a top-level await is reached. We can permit top-level await to queue further things to run, but at the same time, not wait for the module graph to complete the async parts of its execution before continuing the installation. In a way, this would match script tags with top-level await (where there is no way to wait on that evaluation either).

This would mean, though, that if you add a WebAssembly module into your dependency graph in some ways, your ServiceWorker would encounter an error after it was initially installed. Maybe this would be very bad, broken behavior that we want to avoid earlier.

@Jamesernator
Copy link

I think there's a false association that async implies slow. A large number of use cases of top-level await (or WASM modules) are likely to only be expensive on first register as things will likely be in various caches on successive loads.

I think it would be fine to just allow whatever and instead add a couple new performance entry types like 'service-worker-first-load'/'service-worker-successive-load' that measures the speed that services workers loaded in response to first register/successive loads.

@jakearchibald
Copy link
Contributor

@Jamesernator

I think there's a false association that async implies slow.

Nah, the assumption is that in-series is slower than in-parallel.

const asyncResult = await doAsyncThing();

addEventListener('fetch', (event) => {
  if (asyncResultNeeded) {
    // …
  }
});

Vs:

const asyncResultP = doAsyncThing();

addEventListener('fetch', async (event) => {
  if (asyncResultNeeded) {
    const asyncResult = await asyncResultP;
    // …
  }
});

@jakearchibald
Copy link
Contributor

@littledan

Maybe all we need to do, then, is ensure the "has ever been evaluated" flag is set synchronously, not asynchronously.

But if we were booting up the service worker to fire a fetch event, wouldn't our dispatch of the event be blocked on all top-level awaits?

@littledan
Copy link
Author

@jakearchibald I'm suggesting that we don't set up such blocking mechanics, and instead, just launch execution of the module and go right on our way in setting up the ServiceWorker. Concretely, in the Run Service Worker algorithm, in step 12, we'd just run the module, take the Promise that it returns, and more or less discard it, continuing on our way with the setup.

There's a small amount of complexity with errors during module execution: Although errors are communicated via Promises, the JS specification knows about synchronous errors synchronously, in a way that doesn't break Promise invariants, and we could thread through an API to query this state, or you could just cheat and check if the Promise was rejected. If there's an error after a top-level await, that'd come via an onerror event "later", after the Service Worker is already running.

Does this setup make sense to you? Or do you see some reason why we'd still have to block?

@jakearchibald
Copy link
Contributor

Yeah, this feels like it could work. I don't think it makes top-level await useful to a service worker, but it seems like an ok way to work around it.

It doesn't prevent:

const modulePromise = import('./this-has-top-level-await');

…which is great.

@jakearchibald
Copy link
Contributor

@littledan thank you for working through this!

@jakearchibald
Copy link
Contributor

jakearchibald commented Sep 15, 2019

For TPAC:

Thoughts:

No one allows modules in service workers yet.

Developers may end up doing things like awaiting state from IDB on each service worker load. The worst gotchas would be async things that are sometimes very slow, but may not show up in dev due to different conditions.

However, developers can currently spin a while loop for seconds as part of service worker start up, and we haven't lost too much sleep over that. Similar things might happen when importing large libraries that do a lot of setup work.

We haven't blocked a JS feature in service worker before. Although, we did block XHR, partially because of sync XHR.

Option 1: Allow top level await as is. Service worker initialisation would be blocked on the main script fully executing, including awaited content. We'd advocate against it using it, and perhaps fire warnings in devtools.

Option 2: Ban it. The service worker would throw on await, so it wouldn't install.

Option 3: Allow top level await, but the service worker is considered ready once execution + microtasks are done. That means the awaits will continue to run, but events may be called before the script has 'completed'. Adding events after execution + microtasks is not allowed, as currently defined.

The worry about 3 is some of the changes of behaviour that could be unwittingly introduced by a third party library #1444 (comment).

@asutherland
Copy link

I like option 1. In #1331 and related issues like #1157 and requests for exposure of localStorage to ServiceWorkers (!) the issue of data available synchronously-ish at SW startup has come up several times. I know I've at least suggested that async APIs should still be used in this case and we will work on making the async APIs more performant.

To me, this means that:

  • Yes, top-level async awaits should be possible so async storage can be used. (And this is preferable to exposing localStorage, etc.)
  • We have a better chance of creating accurate and useful data-preload hints if we are able to know these storage uses are part of top-level evaluation and there's no chance for contamination from "fetch" events interleaving with the top-level evaluation.

Also, option 3 seems hard to reason about and explain to humans.

@littledan
Copy link
Author

@devsnek I'd rather not take this suggestion, because of the context-dependence issue with static vs dynamic import described above.

@devsnek
Copy link

devsnek commented Oct 10, 2019

@littledan, that's why I suggested it, the VM can freely choose whether or not to allow it at each point. from there, html can say to allow X module but not Y module.

@littledan
Copy link
Author

@devsnek How would the context information be passed to/from HTML, about whether it comes from a dynamic or static import? This isn't a function of the realm or script, but rather the spec algorithm that invokes the module import.

@devsnek
Copy link

devsnek commented Oct 10, 2019

@littledan the same way html provides an implementation of HostPromiseRejectionTracker i assume. and it's called in the evaluation for AwaitExpression.

@littledan
Copy link
Author

@devsnek I'm still having trouble understanding. I also don't understand what the issue is with the phrasing @Ms2ger and I collaborated on. If you have a concrete suggestion, can I suggest you make spec PRs for it, the way we have PRs out for the [[Async]] strategy?

@devsnek
Copy link

devsnek commented Oct 10, 2019

@littledan perhaps when i find some time...

the idea @syg and i came up with is basically instead of failing the entire graph, just make a top level await expression itself throw. Since an await can resume abruptly already, this doesn't introduce weird new semantics or the appearance of new parse goals.

@littledan
Copy link
Author

Oh, I missed that you were talking about shifting this to when the TLA is hit. I worry that that would be less predictable and regular than rejecting all async modules. If we want to prevent top-level await in modules, I'd prefer we reject the module graph ahead of time, rather than when the await expression runs. In general, the semantics of TLA have been careful to generally treat modules which have a TLA which is not dynamically hit as "async modules", taking e.g., an extra microtask turn.

@devsnek
Copy link

devsnek commented Oct 10, 2019

@littledan i'd rather have to discuss ideas about the timing than ideas about rejecting the entire graph though. if we want to reject the entire graph I fail to understand why we're bringing module goal to service workers at all.

@littledan
Copy link
Author

Note that we already concluded the discussion about the design of top-level await when we promoted it to Stage 3. Any changes from here could be proposed for consensus, and would generally be expected to be based on implementation experience.

@devsnek
Copy link

devsnek commented Oct 17, 2019

Interestingly an await expression can still throw synchronously, if the operand throws. I think that solves the timing issue pretty neatly.

@domenic
Copy link
Contributor

domenic commented Oct 17, 2019

I agree with @littledan that it should fail at parse time, not at runtime.

@hiroshige-g
Copy link

Sorry to be late for the discussion.

This might be already discussed, but how about, in Step 9 of https://w3c.github.io/ServiceWorker/#update-algorithm, checking script's record's [[Async]] and if it is true, reject the updating just like ordinal fetch errors? (i.e. replacing If the algorithm asynchronously completes with null, then: with If the algorithm asynchronously completes with null or the asynchronous completion value's record's [[Async]] is true, then: and removes the allowAwait argument from whatwg/html#4352)

I'm wondering if we can remove the top-level await allowance flag from fetching algorithm, because the top-level await allowance is more like an attribute of the calling context of module fetching (i.e. disallow TLA on top-level SW script fetch, allow TLA dynamic import on SW if we would allow dynamic import on SW), rather than an attribute of global scope nor fetching itself.

@Ms2ger
Copy link
Contributor

Ms2ger commented Nov 28, 2019

@hiroshige-g yes, I believe that would work. However, it would mean that we keep fetching and parsing any dependencies after the async module. However, since we're going to fail the service worker anyway, that probably isn't an issue. I'll write up some spec text tomorrow.

Ms2ger added a commit to Ms2ger/ServiceWorker that referenced this issue Nov 29, 2019
@Ms2ger
Copy link
Contributor

Ms2ger commented Nov 29, 2019

This was somewhat more difficult than anticipated, because we all forgot that [[Async]] doesn't work like that; it's only true if the module itself is async, not if it has async dependencies. I pushed 8d7ba39.

@Pauan
Copy link

Pauan commented Jan 27, 2020

I saw sync XHR mentioned earlier, so I have a question: is it also planned to ban sync XMLHttpRequest and sync new WebAssembly.Module(foo) / new WebAssembly.Instance(foo, bar) in service workers? Should I file a new issue for that?

@jakearchibald
Copy link
Contributor

XMLHttpRequest is already unavailable in a service worker.

There are no plans to remove those web assembly modules. If you think things should change here, can you explain why?

@Pauan
Copy link

Pauan commented Jan 27, 2020

@jakearchibald I assume the reason sync XHR was removed is because it blocks the event loop, but new WebAssembly.Module / new WebAssembly.Instance are synchronous, so they also block the event loop, which is why they have a 4 KB limit on the main thread. So will they also have a 4 KB limit in service workers, or is that limit removed?

@jakearchibald
Copy link
Contributor

The intent here is to prevent footguns. We aren't blocking while (true), but we are blocking await since its performance impact isn't always clear at dev time.

they have a 4 KB limit on the main thread. So will they also have a 4 KB limit in service workers, or is that limit removed?

That's definitely worth discussing! Could you file an issue?

@Pauan
Copy link

Pauan commented Jan 27, 2020

@jakearchibald Ah, okay, thanks for clarifying! I've opened a new issue here: #1499

@Jamesernator
Copy link

Jamesernator commented Feb 23, 2021

With this pull being merged, currently there is no way to import a module that uses TLA whatsoever.

I understand the reason is that import() invokes fetch and so doesn't work offline, but could we get some way to prefetch/parse modules so that the service worker can still asynchronously evaluate them? Which is to say, we declare them upfront synchronously and only those modules can be dynamically imported later:

// Service worker can treat ./moduleWithTLA.js and it's subgraph as necessary for 
// the service worker, so it should fetch this graph as well, the service worker would
// only become active once the graph for ./moduleWithTLA.js has successfully fetched and
// parsed, it doesn't however actually block evaluation of the current module, once this
// service worker is active this is even simply a no-op
registerModule("./moduleWithTLA.js");

self.addEventListener("some-event", async (evt) => {
  evt.waitUntil(async () => {
    // Allowed, triggers no fetch as once this service worker is active, ./moduleWithTLA.js
    // is already saved and cached with the rest of the modules
    await import("./moduleWithTLA.js");
    
    // Not allowed, this module wasn't registered, so it wasn't fetched before activating
    // the service worker, as such `import()` immediately throws an error
    await import("./notRegistered.js");
  });
});

@Jamesernator
Copy link

Jamesernator commented Feb 23, 2021

Actually I realised the above could be accomplished with the module blocks proposal, e.g. the service worker would not become active until all imports in all module blocks had also fetched. Then import() would only accept module blocks e.g.:

// We can dynamically import module blocks only as they are statically declared so
// the service worker can discover them the same way it does with regular modules
const moduleWithTLA = module {
  // Before becoming the active service worker, this graph must successfully fetch
  // and parse
  export * from "./moduleWithTLA.js";
}

self.addEventListener("some-event", async (evt) => {
  evt.waitUntil(async () => {
    // Is a module block so allowed
    const { someExport } = await import(moduleWithTLA);
    // Is a string so not allowed
    await import("some-string");
  });
});

@jakearchibald
Copy link
Contributor

I think the best solution here is to figure out how import() should work in a service worker. That fixes both problems.

Ms2ger added a commit to Ms2ger/ServiceWorker that referenced this issue May 21, 2021
Fixes w3c#1407.

Co-authored-by: Jake Archibald <jaffathecake@gmail.com>
Co-authored-by: yulia <ystartsev@mozilla.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.