Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential error in connection between "Document destroy" and "Document abort" algorithms #9148

Closed
kalenikaliaksandr opened this issue Apr 11, 2023 · 9 comments · Fixed by #9907

Comments

@kalenikaliaksandr
Copy link

kalenikaliaksandr commented Apr 11, 2023

Seems like there is an issue in https://html.spec.whatwg.org/multipage/document-lifecycle.html#destroy-a-document that "document destroy" algorithm can call "document abort" on navigables with already nulled active documents of descendant navigables:

  1. Destroy the active documents of each of document's descendant navigables.

after recursive destroy on step 1 all descendant navigables will have document=null in active session history because of step 7:

  1. Set document's node navigable's active session history entry's document state's document to null.

and then on step four of initial destroy call

  1. Abort document.

abort will fail to access active document for a child navigable here because it was nulled before on step 7 in destroy steps:

  1. Abort the active documents of each of document's descendant navigables

Questions:
Is null check for active document assumed here? If so maybe it worth to mention null check explicitly.
if this supposed to work without null check on active document then there might be param whether abort should go recursively.

@domenic
Copy link
Member

domenic commented Apr 12, 2023

How this is used to work is that the recursion happened after aborting: https://html.spec.whatwg.org/commit-snapshots/c196f1bca137b37e4ab420dc79dac2360e5dadec/#garbage-collection-and-browsing-contexts . It changed, of course, in 0a97a81.

I can't immediately see why this was changed. Probably I thought it would be more logical, but I forgot that abort was recursive.

So I think our two choices are:

  • Have a non-recursive version of abort, and use that inside destroy
  • Move the recursive part of destroy after the call to abort.

Any thoughts on which of these makes more sense is welcome. I'm leaning toward the former, since it seems more sensible to do everything node-by-node in the tree, especially when the tree might span multiple processes.

@kalenikaliaksandr
Copy link
Author

kalenikaliaksandr commented Apr 12, 2023

@domenic

Move the recursive part of destroy after the call to abort.

that seems to work and I went this way for now. the only disadvantage I see is that with this way: "abort" will actually do "abort steps" for every navigable recursively from initial "destroy" call and then there might be a bunch of subsequent "abort" calls from recursive "destroy's" that will do nothing. I guess it's not really a problem, right? But still it would be nice to call "abort" just once on every navigable during "destroy".

@kalenikaliaksandr
Copy link
Author

Seems like there is another related issue that apply the history step on step 14.10.2:

For each childNavigable of displayedDocument's descendant navigables, queue a global task on the navigation and traversal task source given childNavigable's active window to unload childNavigable's active document.

does unloading of documents of child navigables and destroy will fail to access navigable's active document for them because it was nulled while destroying displayedDocument itself (because "destroy" is recursive).

@domenic
Copy link
Member

domenic commented Jun 28, 2023

I had some spare time today and so tried to tackle this. I can't see an elegant way to do so without splitting all the algorithms into single-document + descendants versions. I think the result is:

  • destroy a document => only aborts a single document (unlike today), then does the rest of the destroy steps for a single document (unlike today)
  • unload a document => only unloads a single document (like today), then destroys a single document (unlike today)
  • abort a document => only aborts a signle document (unlike today)

and then

  • destroy a document and its descendants => gathers a list of inclusive descendants, and destroys them all. Does not need to fan out tasks as it's "just" manipulating global browser-wide associations, I think.
  • unload a document and its descendants => gathers a list of descendants, calls unload, then fans out tasks to all descendants to unload them. Tasks are necessary because of the many JS script interactions.
  • abort a document and its descendants => gathers a list of descendants, calls abort, then fans out tasks to all descendants to abort them. Tasks are necessary, I think, because it messes with parser state.

And then the rest of the spec only ever calls these "and its descendants" versions. (For unload, this will deduplicate a bit of code, and fix a bug where "close a top-level traversable" is failing to fan out the tasks.)

Does this sound like it's on track to you?

@kalenikaliaksandr
Copy link
Author

Splitting algorithms seems reasonable for me. However, even with deleted recursing step in destroy algorithm, it will still indirectly cause destroyment of all descendant navigable documents. This is because setting browsing context to null on step 6:

Set document's browsing context to null.

will make all document's iframes inactive. This, in turn, causes their child navigables to be destroyed. consequently, the navigable's document is destroyed in step 5:

Destroy navigable's active document.

It feels like I might be missing something, but do we need both recursive and non-recursive versions of destroy considering this?

@domenic
Copy link
Member

domenic commented Nov 2, 2023

This, in turn, causes their child navigables to be destroyed.

Why is that? Setting a field to null does not cause another algorithm to run, I don't think.

@domenic
Copy link
Member

domenic commented Nov 2, 2023

  • Does not need to fan out tasks as it's "just" manipulating global browser-wide associations, I think.

This is false, because it calls abort, which deals with parser-related state and can fire readystatechange.

@kalenikaliaksandr
Copy link
Author

Why is that? Setting a field to null does not cause another algorithm to run, I don't think.

Specification for frame https://html.spec.whatwg.org/#frame says:
"When a frame element stops being an active frame element, the user agent must destroy a child navigable given the element."

so setting field to null indeed seems to invoke other algorithm. is that wrong?

This is false, because it calls abort, which deals with parser-related state and can fire readystatechange.

oh, right, I missed that.

@domenic
Copy link
Member

domenic commented Nov 2, 2023

so setting field to null indeed seems to invoke other algorithm. is that wrong?

Ick. Note that the part of the spec you're quoting is for frame, not iframe. It seems very old, and thus indeed it has this sort of action at a distance that we try to avoid these days.

I'll update that to behave more like iframe, and only destroy on removal from the DOM.

domenic added a commit that referenced this issue Nov 2, 2023
As noted in #9148 (comment), frame's processing model has not kept up with iframe's. It defined an "active frame element" concept which resulted in action-at-a-distance whenever the browsing context was cleared, which is hard to reason about. Instead, we can have it mimick iframe and have explicit actions on insertion and removal.

Additionally, iframe itself was not using the most-modern techniques for handling insertion and removal. This updates both iframe and frame to use "HTML element insertion steps" and "HTML element removal steps".

Finally, this makes the normative change of processing iframes when they are in shadow DOM. For some reason the specification was only handling them when they were in a document tree. frame elements are left as only handled when in a document tree.
domenic added a commit that referenced this issue Nov 2, 2023
As noted in #9148, the way in which destroy calls abort, and unload calls destroy, is not sound when most of these algorithms act on the entire tree. Instead, we need to split these algorithms into single-document base variants, which call each other, and whole-tree "document and its descendant" variants, which proceed carefully.

The whole-tree variants in particular need to take care with how they queue tasks. We cannot fire events (like unload for unloading, or readystatechange for aborting) inside of inactive documents, so we need to delay making a document inactive until its children have been processed. (The related issue #9869 around reloading is not yet solved, however, since that is a special case where the same session history entry is reused and thus making documents inactive is harder to avoid.)

Closes #9148. Closes #9904 while we're in the area.
domenic added a commit that referenced this issue Nov 6, 2023
As noted in #9148 (comment), frame's processing model has not kept up with iframe's. It defined an "active frame element" concept which resulted in action-at-a-distance whenever the browsing context was cleared, which is hard to reason about. Instead, we can have it mimick iframe and have explicit actions on insertion and removal.

Additionally, iframe itself was not using the most-modern techniques for handling insertion and removal. This updates both iframe and frame to use "HTML element insertion steps" and "HTML element removal steps".

Finally, this makes the normative change of processing iframes when they are in shadow DOM. For some reason the specification was only handling them when they were in a document tree. frame elements are left as only handled when in a document tree.
domenic added a commit that referenced this issue Nov 15, 2023
As noted in #9148, the way in which destroy calls abort, and unload calls destroy, is not sound when most of these algorithms act on the entire tree. Instead, we need to split these algorithms into single-document base variants, which call each other, and whole-tree "document and its descendant" variants, which proceed carefully.

The whole-tree variants in particular need to take care with how they queue tasks. We cannot fire events (like unload for unloading, or readystatechange for aborting) inside of inactive documents, so we need to delay making a document inactive until its children have been processed. (The related issue #9869 around reloading is not yet solved, however, since that is a special case where the same session history entry is reused and thus making documents inactive is harder to avoid.)

Closes #9148.
domenic added a commit that referenced this issue Nov 15, 2023
As noted in #9148, the way in which destroy calls abort, and unload calls destroy, is not sound when most of these algorithms act on the entire tree. Instead, we need to split these algorithms into single-document base variants, which call each other, and whole-tree "document and its descendant" variants, which proceed carefully.

The whole-tree variants in particular need to take care with how they queue tasks. We cannot fire events (like unload for unloading, or readystatechange for aborting) inside of inactive documents, so we need to delay making a document inactive until its children have been processed. (The related issue #9869 around reloading is not yet solved, however, since that is a special case where the same session history entry is reused and thus making documents inactive is harder to avoid.)

Closes #9148.
domenic added a commit that referenced this issue Jan 11, 2024
As noted in #9148, the way in which destroy calls abort, and unload calls destroy, is not sound when most of these algorithms act on the entire tree. Instead, we need to split these algorithms into single-document base variants, which call each other, and whole-tree "document and its descendant" variants, which proceed carefully.

The whole-tree variants in particular need to take care with how they queue tasks. We cannot fire events (like unload for unloading, or readystatechange for aborting) inside of inactive documents, so we need to delay making a document inactive until its children have been processed. (The related issue #9869 around reloading is not yet solved, however, since that is a special case where the same session history entry is reused and thus making documents inactive is harder to avoid.)

Closes #9148.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants