Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

History traversal - improve spec compliance #23479

Open
gterzian opened this issue May 30, 2019 · 10 comments
Open

History traversal - improve spec compliance #23479

gterzian opened this issue May 30, 2019 · 10 comments

Comments

@gterzian
Copy link
Member

@gterzian gterzian commented May 30, 2019

@asajeffrey @cbrewster

Related: #10994 #10992

I happened to be looking into setting the request-history-navigation-flag as part of #23368, and ended-up looking once more into the structure of the current implementation history traversal algorithm.

It appears to me that most of the "logic" happens in the constellation, with a few back and forth with script with things like ConstellationControlMsg::UpdateHistoryState.

Also, it appears to me that the actual "navigation" happens straight in the constellation, where pipelines are updated and I think also potentially reloaded.

Finally, it appears to me that we're not doing a bunch of stuff that would have to happen in script, such as firing popstate or hashchange events(which are indeed items still left as TODO in #10992).

From a high-level perspective, I think the main problem with the current implementation is that we're not doing Step 5 of traverse-the-history-by-a-delta which reads:

Queue a task that consists of running the following substeps. The relevant event loop is that of the specified browsing context's active document. The task source for the queued task is the history traversal task source.

And those substeps involve:

  1. Potentially cancelling any ongoing navigation of the "specified browsing context"
  2. Potentially unloading the currently active document of the "specified browsing context"
  3. And finally, running all the traverse-the-history steps, which includes stuff like "navigating" to perform an "entry update" in case the entry doesn't hold a document anymore. It also involves stuff like update the state of the history, firing events(either in the same task, or an subsequent one if the non-blocking-events-flag is set), and so on.

All of this happening from within the same task on the "relevant event-loop". I think what we currently do is rather running logic in the constellation, and then performing a few operations in script by sending messages to it. I think the problem there is you loose the proper interleaving with other things happening in the relevant event-loop, unless you're lucky that the constellation messages are handled "properly interleaved" by the script-thread(for example things like cancelling an ongoing navigation of the "target entry" would seem like something that is timing dependent and could benefit from running fully as a task on the relevant event-loop).

Another nice thing about doing all of these steps as a task on the script-thread, is that it means things like "reloading" actually go through the normal navigation flow as it originates inside of script, as opposed to a custom algorithm implemented fully inside the constellation. The spec also relies on this integration in the navigate algorithm, since it is in fact "reentrant" to the history traversal through the entry-update concept

This matters for things like setting a " history-navigation flag" to the request, since now you'd have to, I think, integrate it "by hand" into the algorithm in constellation, whereas otherwise it could fit in more easily into a single unified workflow that always starts inside the script-thread.

Let me know what you think, I just had a quick look around so I probably missed a few things...

@gterzian
Copy link
Member Author

@gterzian gterzian commented May 30, 2019

Also related is the concept of the session-history-traversal-queue and the session-history-event-loop

Each top-level browsing context has a session history traversal queue, initially empty, to which tasks can be added.

Each top-level browsing context, when created, must begin running the following algorithm, known as the session history event loop for that top-level browsing context, in parallel

So since it's "in-parallel", one could say the constellation is a good place to do it, and in some way the current implementation is essentially this "session-history-event-loop" running in the constellation.

To traverse the history by a delta delta, the user agent must append a task to this top-level browsing context's session history traversal queue, the task consisting of running the following steps

Which is really what we do here, since w send a message from script to the constellation:

let msg = ScriptMsg::TraverseHistory(direction);

However what's missing is the "queue a task back on an event-loop, not necessarily the one from which the original message above originated, and run the rest of the traversal algorithm".

It might very well be that the various parts of the current implementation, including the back and forth with various messages between constellation and script, is actually fully spec compliant. However I'm pretty sure it would be more robust the follow the spec for closely and actually queue a task back on an event-loop, and take it from there. So it essentially puts most of the logic to run back on an event-loop, with messages to the constellation for some things like actually loading an url.

The various steps of traverse-the-history should per the spec really all originate sequentially from the same task(unless otherwise stated, such as with the "non-blocking-events-flag ").

So looking at all the stuff that was already done in #10992, I think it's a case of "we're 90% there, but to get that last 10% requires quite a bit of structural change, not just another 10% of work".

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented May 30, 2019

Yeah, the current implementation is spread out a lot, and involves more back-and-forth between script and the constellation. Some of this is inevitable, as there are four threads involved: the constellation, the script thread that performed the navigation, the script thread being navigated away from, and the script thread being navigated to. Not to mention that the page being navigated to might have been evicted, so navigating to it may involve creating script threads and reloading the document,

Part of the reason for leaving this alone was to give the spec some time to settle down, there are a lot of open issues around navigagtion at https://github.com/whatwg/html/labels/topic%3A%20navigation. This may be cowardice on my part though.

@cbrewster
Copy link
Member

@cbrewster cbrewster commented May 30, 2019

I would also like to add that its really difficult to know what to do in a script thread when a history traversal is requested because history traversal always operates on the joint session history. A single script thread cannot really reason about the joint session history as that may require knowing info from other script threads. Naturally the constellation is a place that has all the info that we need.

I'm not entirely convinced that the current spec really reflects the reality of session history that is implemented in The Browsers, but it's been a while since I have read the history traversal spec so it may have improved since then.

@gterzian
Copy link
Member Author

@gterzian gterzian commented May 31, 2019

@cbrewster

I would also like to add that its really difficult to know what to do in a script thread when a history traversal is requested because history traversal always operates on the joint session history. A single script thread cannot really reason about the joint session history as that may require knowing info from other script threads. Naturally the constellation is a place that has all the info that we need.

Yes and that is the role of what the spec refers to as the session-history-event-loop, which is supposed to run "in-parallel" of the script-event-loop. So the constellation would seem like the right place to run that logic, and that part of the current implementation could be said to implicitly follow the spec.

The part that is still missing is "once you've figured out which entry you wan to traverse to, queue a task on the relevant event-loop to perform the rest of the algorithm"

@asajeffrey

there are four threads involved: the constellation, the script thread that performed the navigation, the script thread being navigated away from, and the script thread being navigated to.

Correct, and the only change I propose is once we've figured out which "script thread is being navigated to", is to instead of sending a series of messages to perform various operations(currently I saw we have a dedicated HistoryStateUpdate message, for example), we should do all these operations as part of a single task(except where otherwise stated) on the relevant event-loop.

Basically:

  1. Script-thread enqueues a task onto the parallel "history traversal queue"(read: a message is sent from a script-thread to the constellation).
  2. When the task/message is handled, in parallel, we need to inspect the joint-session-history and determine the entry we are traversing to.
  3. Queue a task, using the history-traversal-task-source, back on the event-loop corresponding to the entry determined at 2. In practice that would probably mean sending a message to script, and then enqueuing another message onto the task-queue.

I'm also aware that there are quite a few things we do in the constellation, or in script but in response to a message from the constellation, which the spec says to do on the event-loop and/or with a specific task queued. So there are cases where the message from the constellation effectively doubles as a task on the event-loop when it is handled. Maybe this is such a case where it makes sense as well to do that.

It does read to me like quite a lot of steps are done in that task that is queued using the history-traversal-task-source on the relevant event-loop, and doing those operations with several messages could result in some intermittent problems when they're not handled one after the other by script.

I think it's worth looking into and see also if it makes some tests pass.


Not to mention that the page being navigated to might have been evicted, so navigating to it may involve creating script threads and reloading the document

Note that if the entry doesn't hold a document anymore, the spec plugs the traversal into navigating the browsing context to reload the document, and then re-enter the traversal with this time a document.

That's still different from whether we need to spin-up a script-thread that might have been shut down. By the way I think we might be shutting down script-threads a bit too early in some cases, which might be related, see for example #22815 (which by the way I think went away but I'm not sure why).

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Jun 4, 2019

Hmm, I think the reason why there's a disconnect between what Servo does and the spec is that there's some tasks that the spec says should be performed on the event loop of the active document of the specified browsing context but which Servo needs to be handled by the constellation.

For example, step 5.3 of https://html.spec.whatwg.org/multipage/history.html#traverse-the-history-by-a-delta is "Traverse the history of the specified browsing context to the specified entry with the history-navigation flag set," and step 4.3 of https://html.spec.whatwg.org/multipage/browsing-the-web.html#traverse-the-history is "Set the active document of the browsing context to entry's Document object."

Unfortunately, in Servo, setting the active document isn't something that a script thread can do directly, it has to message the constellation, which updates its state and then messages all the affected pipelines.

That said, the spec is now a lot closer to what Servo does than it used to be, and reorganizing the code to more closely match the spec is A Good Thing, as long as the spec is reasonably stable.

@annevk do you know if there's any planned major changes to this bit of navigation any time soon?

@gterzian
Copy link
Member Author

@gterzian gterzian commented Jun 5, 2019

Unfortunately, in Servo, setting the active document isn't something that a script thread can do directly, it has to message the constellation, which updates its state and then messages all the affected pipelines.

Yes, and what I'm thinking of is to move the "center of gravity" of our implementation of the algorithm to a script-thread via a task queued using the "history-traversal-task-source"(as per the spec), and message the constellation from that script-thread when necessary.

The main reason would be that I think it would matter less to have the constellation running and handling other messages while the "history traversal(the last part that is speced to run inside a task)" runs on a script-thread, than to have the script-thread continue to run while the main logic is taking place in the constellation with several messages used to affect the world of the script-thread.

For example in traverse-the-history, we have:

  • Step 4.1 "Remove any tasks queued by the history traversal task source that are associated with any Document objects in the top-level browsing context's document family.",
  • 4.6.4 "Fire an event named pageshow at the Document object's relevant global object"
  • 5 Set the document's URL to entry's URL
  • 13 Set history.state to state.
  • 16 "If the non-blocking events flag is not set, then run the following substeps immediately. Otherwise, the non-blocking events flag flag is set; queue a task to run the following substeps instead."
  • 16.1 "If state changed is true, then fire an event named popstate at the Document "

And those are all examples of things affecting the document, which are specced to all happen in the same task queued using the "history-traversal-task-source". When those steps mention "queuing a task", that's that "Dom manipulation task-source".

So I think that if one tries to implement all those steps from the constellation, using messaging to the relevant script-thread, you risk a lot of intermittent issues simply because script-thread will keep running in the meantime, and also the various "queue a subsequent task in some cases, in others fire an event immediately" are hard to express via messaging from the constellation, since that's a bit like queuing a task in itself. Finally, the "Remove any tasks queued by the history traversal task source" step would seem to really require going down the task-source route, since otherwise you'd have to do something manually by ignoring some messages from the constellation in some cases.

On the other hand, we also have things like:

  • 4.3 "Set the active document of the browsing context to entry's Document object."(mentioned earlier)
  • 4.4 "Clear any browsing context names of all entries in the history that are associated with Document objects with the same origin as the new active document and that are contiguous with entry."
  • 9 "Set the current entry to entry."
  • 15 "Set entry's Document object's latest entry to entry."

Which seem like examples of stuff requiring to happen in the constellation, and I'd argue there should be no problem sending either one, or several, messages from the relevant script-thread to the constellation, with the constellation continuously running in the meantime.

Finally there steps like:

  • 6 "If entry has a URL whose fragment differs from that of the current entry's when compared in a case-sensitive manner, and the two share the same Document object, then let hash changed be true, and let old URL be the current entry's URL and new URL be entry's URL. Otherwise, let hash changed be false."

That would seem fit to happen on the constellation, before the actually "history-traversal-task_source" is queued on a relevant script-thread, while passing the results of such operations along to the script-thread.

So I'd say we don't have to follow the spec by the letter, and some things need to happen on the constellation, yet I'd argue that the various changes to the world of the relevant script-thread which are specified to happen inside a single task, with that task potentially queuing others, probably require to happen in a single task, or at least a single operation on the main-loop of the relevant script-thread(and I'd argue a task is better, even if that requires first sending a message to the script-thread, and then queuing the task, since if you handle the message sent by the constellation as a task, you'll loose some interleaving with other tasks and it's harder to cancel all tasks for a given task-source if the task is a message from the constellation), to avoid intermittent and other issues.

Having said all that, it's usually when you try to implement these workflows that you run into the real issues :)

@annevk
Copy link

@annevk annevk commented Jun 5, 2019

@asajeffrey there's whatwg/html#4664 and there are some more plans to enable Cross-Origin-Opener-Policy and friends, but not sure on timeline.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Jun 5, 2019

@annevk okay, sounds like it's stable enough for us to redo the current implementation, thanks!

@annevk
Copy link

@annevk annevk commented Jun 5, 2019

Yeah, though please have a look through the open issues around navigation and history so you don't do wasted work.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Jun 5, 2019

@gterzian yes, that sounds right. We could at least make the implementation follow the latest spec a bit more. The main thing to look out for is deadlock:

//! It's also important that the constellation not deadlock. In particular, we need
//! to be careful that we don't introduce any cycles in the can-block-on relation.
//! Blocking is typically introduced by `receiver.recv()`, which blocks waiting for the
//! sender to send some data. Servo tries to achieve deadlock-freedom by using the following
//! can-block-on relation:
//!
//! * Layout can block on canvas
//! * Layout can block on font cache
//! * Layout can block on image cache
//! * Constellation can block on compositor
//! * Constellation can block on embedder
//! * Constellation can block on layout
//! * Script can block on anything (other than script)
//! * Blocking is transitive (if T1 can block on T2 and T2 can block on T3 then T1 can block on T3)
//! * Nothing can block on itself!
//!
//! There is a complexity intoduced by IPC channels, since they do not support
//! non-blocking send. This means that as well as `receiver.recv()` blocking,
//! `sender.send(data)` can also block when the IPC buffer is full. For this reason it is
//! very important that all IPC receivers where we depend on non-blocking send
//! use a router to route IPC messages to an mpsc channel. The reason why that solves
//! the problem is that under the hood, the router uses a dedicated thread to forward
//! messages, and:
//!
//! * Anything (other than a routing thread) can block on a routing thread
//!
//! See https://github.com/servo/servo/issues/14704

We can have script block on the constellation but not the other way round.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.