Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Central algorithm for updating service worker state and registrations #1416

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

jakearchibald
Copy link
Contributor

@jakearchibald jakearchibald commented Jun 4, 2019

Fixes #1273.

My aim here was to create a central place that applies updates to workers, registrations, and their various JS objects. This means the rest of the spec doesn't have to try to guess the correct order.

I update all the JS objects before firing events, so objects should be in the current state once the related events fire.

Additionally, I wanted to create a single task (per environment) for firing these events.


Preview | Diff

@jakearchibald
Copy link
Contributor Author

I'll write tests for this. We can tweak the order of events if need be. I tried to do it 'bubbling', so the leaf thing (the service worker) gets the event first, then the registration, then navigator.serviceWorker.

@wanderview
Copy link
Member

Do you expect this to change any previously defined behavior? Or do you believe its a refactor without functional changes.

@jakearchibald
Copy link
Contributor Author

If things were implemented as spec'd, you'd get an event per task for related things. Eg if you have two registration objects representing the ServiceWorkerRegistration, things like updatefound would happen in different tasks per objects. It's now part of the same task.

Same if a worker becomes activating which also results in a controller change, and the previous worker becoming redundant. Those will now be dispatched in the same task whereas it was previously three tasks.

Using a single task for events that are part of the same action seems to make sense. As for the exact event order, I'm open to tweaking that if all browsers seem to behave the same way.

@wanderview
Copy link
Member

The statechange stuff at least matches what I implemented in firefox:

https://searchfox.org/mozilla-central/rev/f8b11433159cbc9cc80500b3e579d767473fa539/dom/serviceworkers/ServiceWorkerRegistration.cpp#396-489

Firefox does not run the updatefound event in the same task currently, but is guaranteed to be after the statechange events. So that kind of matches here.

I'm unsure of the controllerchange event right now.

Getting all this right in the face of cross-process implementations is pretty tricky and I spent a lot of time getting firefox to match the current tests. I expect anything that changes behavior may have a high implementation cost to adjust to. I just want to make sure we're not making behavioral changes that don't have a solid reason to avoid churning implementations.

@jakearchibald
Copy link
Contributor Author

If the cost looks too high I can add an additional task for controllerchange. I'd rather not, but it isn't a big deal. Consistent event ordering is the most important thing.

@wanderview
Copy link
Member

Is the multiple-task vs single-task difference easily observable to script? I expect trying to get statechange, updatefound, and controllerchange in the same task will be nearly impossible with the current firefox implementation.

@wanderview
Copy link
Member

cc @asutherland

@jakearchibald
Copy link
Contributor Author

Is the multiple-task vs single-task difference easily observable to script?

You could use a message port to queue a task, or requestAnimationFrame which might happen before the next task. It isn't guaranteed though.

With these changes, you're guaranteed not to have other tasks or requestAnimationFrame between these related events.

@wanderview
Copy link
Member

I guess I'm not sure guaranteeing consistency with tasks posted from other task sources is worth the implementation pain this would cause.

@jakearchibald
Copy link
Contributor Author

If it looks bad in the tests I can split it back out into tasks but keep the central algorithm for ordering. It'd look weird at that point so I'd add a note saying it's for legacy reasons.

@wanderview
Copy link
Member

I found the code that guarantees that controllerchange from a worker going active will occur after the statechange events:

https://searchfox.org/mozilla-central/rev/f8b11433159cbc9cc80500b3e579d767473fa539/dom/serviceworkers/ServiceWorkerRegistrationInfo.cpp#329-335

The TransitionWaitingToActive() will queue a task for the statechange stuff and then UpdateClientControllers() will queue a task for controllerchange events. Of course, there is nothing here for updatefound because updatefound is not fired for activating a worker.

@mfalken
Copy link
Member

mfalken commented Jun 4, 2019

Do we have an idea which tests will change?

Chrome also does most things in separate tasks. Each change is typically its own IPC, and each IPC is its own task. So Chrome acts like Firefox in that statechange is one task and controllerchange is another. Chrome also doesn't have a good system yet of changing all the JS objects first and then firing the events. Not really sure yet how difficult it would be to change these.

@wanderview
Copy link
Member

@jakearchibald
Copy link
Contributor Author

Isn't it inefficient using many tasks to update objects like this?

@wanderview
Copy link
Member

I don't think these events fire enough to really worry about efficiency. Its not a hot path.

I think we probably queue tasks this way because the spec used to be written that way and that influenced design decisions. Reversing those design decisions might be costly, though.

Also, some of the existing tests may require some ordering that is not clearly defined in the spec. To be honest at some point I was just trying to get the WPT tests to pass.

@jakearchibald
Copy link
Contributor Author

jakearchibald commented Jun 5, 2019

I've started going through the tests and looking at the event ordering and other assumptions.

  • activate-event-after-install-state-change.https
    • Asserts registration.active is null once sw state becomes installed on new registration.
  • activation-after-registration.https
    • Asserts registration.installing.state is installing on a new registration.
  • activation.https
    • After worker becomes installed, it waits on wait_for_activation_on_dummy_scope then asserts registration.waiting is set.
    • Asserts registration.active is a given worker once that worker's state is activating.
    • Asserts active, waiting, installing on a registration are null once the waiting worker's state becomes redundant on an unregistered registration.
    • TODO: wait_for_activation_on_dummy_scope shouldn't be needed with this PR, but maybe leave tests as they are?
  • active.https
    • Asserts registration.active.state is activating once registration.installing's state becomes activating.
  • claim-affect-other-registration.https
    • Once registration.installing becomes state activated, asserts .installing and .waiting are null, and .active is the worker that was previously registration.installing.
    • registration.installing is set when calling register with an existing scope but new url & the promise resolves.
    • Once registration.installing becomes state installed, asserts .waiting is the worker that was previously registration.installing.
    • Once registration.waiting becomes state activated, asserts .active is the worker that was previously registration.waiting, and old .active worker is now state redundant.
  • multiple-update.https.html
    • Asserts registration is updated (new installing worker) once registration.update() resolves.
  • ready.https.html
    • Once ready resolves, registration.active is correctly populated, and registration.active.state is either activating or activated.
  • register-same-scope-different-script-url.https
    • When installing worker becomes state redundant (due to a failing install), registration.installing will be null.
  • registration-service-worker-attributes.https
    • updatefound is fired on a new registration after register has resolved.
  • skip-waiting-installed.https
    • When controllerchange dispatches, controller has updated, and registration.active has updated.

Noteworthy things:

  • registration is expected to be updated by the time the state change event fires on one of the workers.
  • Once ready resolves, the active service worker may be either activating or activated. Feels like it should be one or the other.
  • updatefound is fired on a new registration after register has resolved. I need to update my PR for this.
  • If I'm trying to de-dupe tasks, I should makes sure two iframes with the same loop are updated at the same time.

@wanderview
Copy link
Member

The main thing to note is that registration is expected to be updated by the time the state change event fires on one of the workers.

Yea. Just for extra implementation context, trying to keep the separate registration and worker objects in a coherent state was very tricky for me implementing cross-process support. In the end I basically had to treat the worker objects as children of the registration and funnel all IPC traffic for them through the registration.

@jakearchibald
Copy link
Contributor Author

That seems totally reasonable given that a worker cannot transfer between registrations.

@wanderview
Copy link
Member

That seems totally reasonable given that a worker cannot transfer between registrations.

Its a little bit wonky because there are cases where a document may have a ServiceWorker object, but not have its corresponding ServiceWorkerRegistration object. For example, a client receives a postMessage from the SW and so have the msg.source attribute. My solution ends up creating the ServiceWorkerRegistration even though its not exposed in these cases.

@jakearchibald
Copy link
Contributor Author

Out of curiosity, if it had been spec'd like this PR from the start, would it have been easier, or just the same?

@wanderview
Copy link
Member

Maybe? Not sure. Its definitely easier to read and understand, though.

@jakearchibald jakearchibald changed the base branch from kill-ressurection to master June 6, 2019 07:37
1. If |settingsObject| [=/uses=] |registration|, then:
1. Set |settingsObject|'s [=environment/active service worker=] to |activeWorker|.
1. [=fire an event=] named `controllerchange` on |container|.
1. <span id="update-state-resolve-ready-step">If the [=Match Service Worker Registration|matching=] [=/service worker registration=] for |settingsObject|'s [=environment/creation URL=] is |registration|, and |readyPromise| is pending, then [=/resolve=] |readyPromise| with a new {{ServiceWorkerRegistration}} object representing |registration| in |readyPromise|'s [=relevant Realm=].</span>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a new {{ServiceWorkerRegistration}}. It should return the same object if one exists. This is kinda hand-wavey in the spec.

@jakearchibald
Copy link
Contributor Author

😈 If I'm really trying to cut down the number of tasks, pages on the same event loop should share a task. This would make things consistent between iframes that have synchronous script access.

@jakearchibald
Copy link
Contributor Author

Ok, I've done going through every test, but I didn't find any that require these events to be posted in separate tasks, except updatefound, which happens in a task after register() resolves.

@jakearchibald
Copy link
Contributor Author

jakearchibald commented Jun 6, 2019

When an oldWorker is active, and a newWorker replaces it, Chrome and Safari fire statechange on oldWorker first. Firefox fires it on newWorker first.

We should spec one or the other.

Edit: We already spec oldWorker to be first, but I guess we don't have tests.

@wanderview
Copy link
Member

Firefox fires it on newWorker first.

This should be somewhat straightforward to fix in this code here:

https://searchfox.org/mozilla-central/rev/f8b11433159cbc9cc80500b3e579d767473fa539/dom/serviceworkers/ServiceWorkerRegistration.cpp#405-454

@jakearchibald
Copy link
Contributor Author

The original issue (#1273) is about consistency of state, so I guess that's the important bit.

https://w3c.github.io/ServiceWorker/#installation-algorithm steps 17 & 18 leave the possibility of JavaScript seeing the same worker in .installing and .waiting.

There are many places where the a worker becomes redundant, but it's still in .installing/.waiting/.active.

https://w3c.github.io/ServiceWorker/#activation-algorithm steps 3 & 4 leaves the possibility of JavaScript seeing the same worker in .waiting and .active.

Even if we swap things around (as originally proposed in #1273), you can still end up with JavaScript seeing a worker in .waiting that has a state of installing.

If objects are updating across multiple tasks, we're always going to have this problem.

How about:

  1. For a single update (like a worker being promoted to active, and the current active worker becoming redundant), we update all related JS objects across globals that share the event loop.
  2. Related events then fire, which may be split across tasks, as long as they fire before the next state update.

@mattto

Chrome also doesn't have a good system yet of changing all the JS objects first and then firing the events. Not really sure yet how difficult it would be to change these.

I can't think of a better way to achieve consistency here. Is it too difficult?

@wanderview @asutherland Do you have a feel about this in terms of Mozilla's code base.

@youennf @cdumez Do you have feelings on this in terms of WebKit?

@jakearchibald
Copy link
Contributor Author

jakearchibald commented Jun 10, 2019

Summary for folks arriving here for the first time:

When newWorker (a service worker) is promoted from "waiting" to "active", overwriting oldWorker (another service worker), the following happens:

  1. oldWorker's state becomes "redundant".
  2. newWorker's state becomes "activating".
  3. registration.waiting becomes null.
  4. registration.active becomes newWorker.

The current version of the spec performs this update over 4 separate tasks in a single realm, meaning JavaScript can run in-between and observe a partially-updated state. All updates to JS objects & related events follow this pattern.

I'm proposing we switch to a single task per event loop, where the objects are updated, then the events are fired.

@wanderview suspects this would be a massive amount of work for Firefox. @matto is looking into it for Chrome.

If it makes things easier for implementers, it's probably fine to fire events over a series of tasks. However, I think we should aim to have all JS objects updated before firing any of the related events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should the worker be removed from the registration *before* its state is set to "redundant"?
3 participants