Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MutationObserver invocation order #713

Closed
pmdartus opened this issue Nov 2, 2018 · 12 comments · Fixed by #1160
Closed

MutationObserver invocation order #713

pmdartus opened this issue Nov 2, 2018 · 12 comments · Fixed by #1160
Labels
compat Standard is not web compatible or proprietary feature needs standardizing topic: shadow Relates to shadow trees (as defined in DOM)

Comments

@pmdartus
Copy link
Member

pmdartus commented Nov 2, 2018

Based on: jsdom/jsdom#2398 (comment)

While implementing MutationObserver in jsdom, I wasn't able to make the following test pass while following the spec: https://github.com/web-platform-tests/wpt/blob/master/shadow-dom/slotchange-event.html#L540-L594

The test run the following steps:

  1. Adds 2 mutation observers (one on id attribute and another one on title) and a slot with a slotchange event handler.
  2. Update the id attribute and the append a child in the light dom of the host. We then enqueue a microtask to notify the listeners. At this point the id attribute mutation observer contains 1 mutation record, and the signal slot list contains the slot.
  3. The notifiy mutation observer algorithm is executed later on:
    1. The mutation observer list is copied, it contains the 2 mutation observers, and it iterates over the 2 mutation observers
    2. The callback for the id mutation observer is executed and in it's callback it updates title attribute and append a new element the light dom of the host. By updating the title attribute a new mutation record is enqueued but this time for the title mutation observer.
    3. The callback for the title mutation observer is invoked because it has a mutation record. And the test throws here since it expects that the slotchange event is invoked first.

Chrome and Webkit don't run into the same issue since they keep track of an active mutation observer list instead of iterating over all the mutation observers.

@annevk
Copy link
Member

annevk commented Nov 2, 2018

What's an active mutation observer list? Mutation observers with at least one record?

Firefox also passes this test...

@annevk annevk added topic: shadow Relates to shadow trees (as defined in DOM) compat Standard is not web compatible or proprietary feature needs standardizing labels Nov 2, 2018
@domenic
Copy link
Member

domenic commented Nov 2, 2018

Mutation observers with at least one record?

Yes, see the bottom of jsdom/jsdom#2398 (comment)

@annevk
Copy link
Member

annevk commented Nov 2, 2018

Does this mean that if the id callback first enqueued a promise callback that would also run before the title callback? I.e., this is an issue even without slotchange? It seems like yes.

@pmdartus willing to add a test for that?

@smaug---- do you agree that fixing this would require changing step 2 of https://dom.spec.whatwg.org/#notify-mutation-observers to only include mutation observers whose record queue is not empty in notifyList?

@domenic
Copy link
Member

domenic commented Dec 3, 2018

Ping @pmdartus @smaug---- for @annevk's questions :)

@FremyCompany
Copy link

FremyCompany commented Dec 3, 2018

Hey, for what this is worth, I do believe the current design in the spec is incorrect.

Indeed, if the context stores a list of all MutationObservers, this list can either be holding strong references to the MOs or weak references to the MOs. Both approaches are not satisfactory.

  1. Strong references
    If the context holds strong references to the MOs, since the list of MOs is never emptied anywhere, MOs will leak forever and never get GCed.

  2. Weak references
    If the list holds weak references, it is possible for a MOs that had pending notifications to be GCed before those notifications get processed, which should not be possible.


According to me, the correct design here is not to have a list of all MOs in the context, but only have a list of MOs which currently do have at least one pending notification. When a record is pushed into a MO, the MO is appended to the "active MO" list with a strong reference. When notifications to MOs are about to be sent, the list of active MOs is temporarily copied then emptied, and notifications are sent to the MOs that are in the list at the time.

I do believe this is what Blink and Webkit implement, and I am pretty sure this is how this is implemented in Edge as well, from the last time I looked at the code.

@domenic
Copy link
Member

domenic commented Dec 3, 2018

If the list holds weak references, it is possible for a MOs that had pending notifications to be GCed before those notifications get processed, which should not be possible.

This is not true, because the nodes have references to their corresponding MOs. (Those references get cleared once notifications are delivered.)

Blink's design is similar to my suggestion in #720, but they make the node -> MO references weak, instead of the global list. That may contribute to the differences discussed in the OP, perhaps.

@pmdartus
Copy link
Member Author

pmdartus commented Dec 3, 2018

Sorry @annevk for the late response.

Does this mean that if the id callback first enqueued a promise callback that would also run before the title callback? I.e., this is an issue even without slotchange? It seems like yes.

I am not clear what you mean by a promise callback and what behavior you want to test. But yes I would be willing to help write additional WPT. =)

@FremyCompany
Copy link

FremyCompany commented Dec 4, 2018

You are still wrong about this @domenic. I think it's time you start listening to what I am saying... you realize I am also a browser implementer right? I wouldn't start talking about this if I wasn't entirely sure of what I am saying...

So, yes, the nodes have a strong reference to their MO but this reference list is not cleared when a notification is dispatched; quite the opposite this is the list that is used to find out which notifications to dispatch in the first place, it is only ever cleared when an MO is disconnect()ed.

See the spec:

To queue a mutation record of type for target with name, namespace, oldValue, addedNodes, removedNodes, previousSibling, and nextSibling, run these steps:

  • Let interestedObservers be an empty map.
  • Let nodes be the inclusive ancestors of target.
  • For each node in nodes, and then for each registered observer of node’s registered observer list:
    • [determine if that MO should be added to interestedObservers]
  • [...]
  • Queue a mutation observer compound microtask.

https://dom.spec.whatwg.org/#ref-for-registered-observer-list⑦

if this list was cleared, it could not be used to create the notifications in the first place. The list that is cleared after each notification is the "active MOs list" which holds the reference to the MOs until their notifications have been processed (that list does not exist in the spec, but does exist in Blink; instead the spec has a list of all MOs, see later why this distinction is important).

Now, to address your point that the nodes have a strong ref to the MOs, let's consider this practical case:

  • Node n is in the document
  • A mutation observer m is created to listen to node n to detect when it's being removed from the document
  • Neither the node n nor the mutation observer m are stored in a javascript variable, this is quite common for this scenario
    ---> here are all the references at that point in Blink:
    -------> activeMOs has a strong reference to an empty list: [] // no MO has a pending notification
    -------> n.registeredObservers has a strong reference to MO m // it is used on every dom change to see if a MO must be notified of the change
    -------> m.registrations has a weak reference to node n // it is used to remove m from n.registeredObservers on disconnect()
    -------> n.parentNode has a strong reference on n keeping it alive (dom tree) // this is the only strong reference to the node n at this point

Okay, not let's say node n is then removed from the document.

  • This changes triggers the notification algorithm on node n, and it turns out there is an MO that listens to such change ( m)
  • A mutation record is appended to m and a strong reference to m is pushed to activeMOs
  • Note that the mutation record has a strong reference to the node n that was mutated
  • A task is qeued to process notifications
  • Then n is removed from the document

At this point, n is no longer referenced by any strong reference, except the one from the mutation record. It is not in any document, and javascript doesn't hold a reference to it. The mutation record is only referenced by the MO. So now let's consider the MO. If it wasn't in the activeMOs list, it also wouldn't have any reference to it except the node. Indeed, it is not reference by JS and only the node stores a reference to it. We therefore have a closed cycle, and both the MO and the node can safely be garbage collected.

This is wrong, because the MO should get to process the notification. For instance, it could process the notification, and reinsert the node in the document, keeping itself alive. Also whether the MO would receive the notification or not would leak whether GC has happened or not.

However, this doesn't happen in Blink because there is the activeMOs list which holds a strong reference to all MO that are currently pending a notification; this list is what prevents the MO from being garbage-collected, and therefore it also keeps the node alive, since the node must be alive to be exposed in the mutation record.

The model in the spec is incorrect because it doesn't have an activeMOs list, and instead has a global list of all MOs. This global list cannot function properly, because either it has strong references to every MOs and they can never be garbage collected, or it holds a weak reference to them, and then they can be garbage collected while they have a pending notifications but their linked node has been removed from the document. See above message.

Please reread this until you get it, seriously.

@annevk
Copy link
Member

annevk commented Dec 10, 2018

@FremyCompany please adhere to https://whatwg.org/code-of-conduct. Assuming malice is not the way to go.

@pmdartus I was thinking that if the id callback did something such as Promise.resolve().then(x), when would x run relative to the other things?

@FremyCompany
Copy link

FremyCompany commented Dec 11, 2018

@annevk I am sorry, I wrote that on the heat of the moment. It's been a little rough in the last few months as you can imagine. I would appreciate it however if someone could read my explanation of what is currently implemented in all browsers and acknowledge it :-/

@annevk
Copy link
Member

annevk commented Dec 11, 2018

@FremyCompany I think you're correct that the specification leaks memory. And I'd accept a PR that addresses it or a new issue for the backlog. However, I'll note that we haven't formalized strong/weak references and typically assume infinite resources, and the problem I'd like to keep this issue focused on are the web observable differences pointed out in OP.

@FremyCompany
Copy link

@annevk If you replace the global static list of MOs by the dynamic list of active MOs, the test issue mentioned above disappeared because you will be iterating over active MOs that have a pending notification which is strictly equivalent to looping through all MOs (current spec) but filter out those who had no pending notification at the start of the loop (currently proposed edit).

Basically those two issues are one and the same according to my understanding, which is why I commented here.

annevk added a commit that referenced this issue Feb 13, 2023
The existing design had some leaks and was also different from implementations. In particular, assume you have two mutation observers and a change triggered the first one as well as a slot change. Then if in the reaction to the first one a further change triggers the second one it would run before the slot change per the existing specification, but that didn't match implementations and more importantly didn't allow for garbage collection.

Test: https://github.com/web-platform-tests/wpt/blob/master/shadow-dom/slotchange-event.html#L540-L594.

Fixes #713. Fixes #1159. Closes #720.
annevk added a commit that referenced this issue Feb 20, 2023
The existing design had some leaks and was also different from implementations. In particular, assume you have two mutation observers and a change triggered the first one as well as a slot change. Then if in the reaction to the first one a further change triggers the second one it would run before the slot change per the existing specification, but that didn't match implementations and more importantly didn't allow for garbage collection.

Test: https://github.com/web-platform-tests/wpt/blob/master/shadow-dom/slotchange-event.html#L540-L594.

Fixes #713. Fixes #1159. Closes #720.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat Standard is not web compatible or proprietary feature needs standardizing topic: shadow Relates to shadow trees (as defined in DOM)
Development

Successfully merging a pull request may close this issue.

4 participants