Main thread complexity #23899

asajeffrey · 2019-07-31T14:45:57Z

This is intended to be a meta-bug for centralizing the discussions about the complexity of our main thread.

Originally, the main thread ran the embedder (and in particular the window event loop) and webrender, and this wasn't too bad since the only source of blocking on events was the embedder. Webrender blocks on the WebGL thread, but that wasn't too bad when it was off in another thread.

Unfortunately, as Servo has been ported to different devices, we've ended up with more and more cases of things which need to be run on the main thread, including WebXR and WebGL (on devices where GL contexts which share textures need to live on the same thread). This is causing problems.

asajeffrey · 2019-07-31T14:53:11Z

Related issues:

There are probably others!

asajeffrey · 2019-07-31T14:55:38Z

cc @jdm @gterzian @paulrouget @Manishearth

asajeffrey · 2019-07-31T14:57:50Z

In particular, there is discussion starting at #23865 (comment) about whether async/await and futures 0.3 is a good fit for Servo's main thread.

gterzian · 2019-07-31T17:19:31Z

I think the main problem is that

servo/ports/libsimpleservo/api/src/lib.rs

Line 250 in 8ec2897

pub fn perform_updates(&mut self) -> Result<(), &'static str> {

is running various sub-loops.

Since each sub-loops needs to run for different reasons, "waking-up" or "animation mode" is used to achieve a regular heartbeat on each of these sub-loops.

However, since every-time a sub-loop runs, it doesn't know exactly why, it cannot block on messages and otherwise run in a "robust" manner, instead needing to use try_recv or timeouts to try to "catch potentially incoming messages".

The irony of "animation mode" is that there is no lack of "source of events" for the main-thread, in the form of incoming messages from script. In fact, each frame requires messaging from and to script over IPC.

The problem is that those messages are handled by "sub-loops", hence the main-thread needs to keep running so as to keep them lively, even though the flow of messages in itself could be turned into a robust signalling mechanism(which would have happened already if each sub-loop ran in its own thread of execution).

A side-effect of these sub-loops, is that it's possible for the main-thread to become unresponsive to UI events as a result of running such sub-loop, with even the possibility existing that script could put the main-thread in such state on purpose and keep it there indefinitely(example from an issue noted above: servo/webxr#34).

The root of the problem is the interface of embedder_traits::EventLoopWaker, which interestingly these days is actually a re-exported webxr_api::MainThreadWaker(which points towards the need to unify "waking" across use-cases).

wake doesn't come with a specific event. Requiring "running Servo" each time a call occurs.

wake is also implemented, in ipc mode at least, by sending a message, received by a router running in the "main-process" alongside the main-thread, which then calls the embedder-specific "wake"(which is safe to call from another thread, but isn't available across process).

That's really interesting if you think about it, since the reason why the main-thread is awoken, is because another ipc message has been sent and it needs to be handled by a sub-loop.

My proposed solution, is to simply make every communication with the main-thread happen by sending an EmbedderMsg on an ipc channel. Yes that means no more webxr_api::SessionMsg::Raf, instead being replaced by something like EmbedderMsg::XrRaf.

Then, all those messages would be handled by a router thread, running in the same process as the "main-thread".

Each message would result in wake being called, but with a specific DeviceEvent.

it would then be the responsibility of the embedder code(running on the main-thread) to run an event-loop and handle these events, interleaved with UI input.

When handling such events, the embedder could do two things:

Call in to system specific APIs, like swapping buffers or "making itself current".
Call into Servo specific "main-thread Servo" APIs, for example acquiring a webgl lock, or sending a frame to script.

1 and 2 could be interleaved in various sequences, since this would be basically single-threaded sequential code.

The main difference with now is that the Servo code wouldn't have to do anything device-specific.

Instead, Servo would only be called into to do stuff the device cannot do, like communicating a frame back to script, and in the context of the device work having been done(or being assumed to soon complete). For example, in the case of a raf, Servo would only be called into to send a frame to script. Swapping buffers would not be the work of Servo.

it would also mean that the main-thread is basically running a device-specific render loop, in the context of which Servo is called into from time to time.

WebXr/WebgGL would then become part of the API into "main-thread Servo", which the render loop can call into when necessary. Compare to the current situation where each device requires implementing a "webXr backend", in addition to a render loop. And the "backend" simply calls into device apis. I think the render-loop should essentially be the backend, calling into main-thread Servo when necessary.

Essentially the opposite of needing to wake-up the device so as to "make Servo run". Instead, the device render loop is awoken with specific events, with the ability to call into Servo as part of handling these events.

There is a 1-to-1 relation between a "wake-up" and a "main-thread operation".

Finally, since the render-loop is the main-thread, I think it could allow full flexibility for the embedding code to do what's right platform-wise.

asajeffrey · 2019-07-31T17:44:58Z

@gterzian I'm confused as to how sending an EmbedderMsg on an ipc channel would wake up the main event loop, in the case of an embedding like glutin where the embedder is blocked waiting on a UI event. Is the idea that we'd have a wrapper around an ipc channel that would send the EmbedderMsg then call the embedder waker?

gterzian · 2019-08-01T04:59:18Z

I'm confused as to how sending an EmbedderMsg on an ipc channel would wake up the main event loop, in the case of an embedding like glutin where the embedder is blocked waiting on a UI event.

@asajeffrey It could be the same as is currently done with

https://github.com/servo/webxr/blob/fcbdddb66e2d307d86ff109d79001c7a45de79db/webxr-api/registry.rs#L63

With the difference that "waking" would not be done when another message is sent, the message would result in the wake-up.

So for example:

// Sends an IPC message.
let _ = self.sender.send(RegistryMsg::SupportsSession(mode, dest));
// Also sends an IPC message.
self.waker.wake();

would become

// Internally sends (PipelineId, EmbedderMsg::SuppportsXrMode(mode, dest))
self.embedder.supports_session(mode, dest); 

// Elsewhere
// Internally sends (PipelineId, EmbedderMsg::XrRaf(dest))
self.embedder.request_xr_animation_frame(dest)

then, in the "main-proces", there is a router thread that does:

for (pipeline_id, msg) in ipc_receiver.recv() {
    match msg {
        EmbedderMsg::SuppportsXrMode(mode, dest) => {
            // Here we don't need to go through the main-thread
            let _ = dest.send(self.xr_registry.supports_session(mode));
        },
        EmbedderMsg::XrRaf(dest) => {
           // This is a bit unfortunate, and I think we could do it better.
           self.xr_callbacks.lock().push_back((pipeline_id, dest));
           // Here, we want to run something on the main-thread
           self.waker.wake(DeviceEvent::XrRaf(pipeline_id));
       }
    }
}

Then, this calls into the device

void BrowserPage::WakeUp(DeviceEvent event) {
    switch (event.type) {
        case DeviceEvent::XrRaf: {
            auto pipelineId = event.pipelineId;
            RunOnGLThread([=] { mServo->XrRaf(pipelineId); });
        }
    }
}

in "main-thread Servo"

pub fn xr_raf(&mut self, pipeline_id: PipelineId) {
    let frame = self.device.wait_for_animation_frame();
    if let Some((dest_pipeline, dest)) = self.xr_callbacks.lock().pop_front() {
        assert_eq!(dest_pipeline, pipeline_id);
        let timestamp = self.timestamp;
        let _ = dest.send((timestamp, frame));
    }
}

There's a bit of a problem because the dest callback has to be passed to the main-thread. We could do it with some raw pointer with the embedding code, but I guess some shared-state, or maybe a channel, would be safer. I think we could also re-structure the flow a bit so that there is just one "callback" per session, which is setup at the beginning of the session and not sent along for each call.

The above also shows how the router thread can be a bit more than just a router, since some operations do not need to occur on the main-thread. It can do basically half the work, hold some state, and so on. It should never block.

Finally, I know earlier I said "Servo shouldn't do anything device specific", and now looking at https://github.com/servo/webxr/blob/master/webxr/googlevr/device.rs I see that something like wait_for_animation_frame can get pretty complicated and specific.

So maybe we want to keep the Device apis, the various backends, and so on. So the main change is really that the render loop drives the main-thread, and is itself driven either by UI events, or by the router thread handling EmbedderMsg.

Also, we can keep perform_updates to support "legacy main-thread", all the while not adding new stuff to it, and removing "sub-loops" one at the time, moving them into using EmbedderMsg and being driven by the main-loop one event at a time.

gterzian · 2019-08-01T08:21:41Z

More specifically re glutin, it does seem to be a very complicated loop, and even there I think you could simply share an Arc<Mutex<VecDequeu<DeviceEvent>>(Not glutin DeviceEvent) between the EventLoopWaker and the App, and then internally call self.proxy.wakeup(), and then pop_front on the shared queue when glutin::Event::Awakened comes in.

Then the actual glutin::Event::WindowEvent would be handled as they come in.

It could look something like:

struct HeadedEventLoopWaker {
    proxy: Arc<glutin::EventsLoopProxy>,
    event_queue: Sender<DeviceEvent>,
}

impl EventLoopWaker for HeadedEventLoopWaker {
    fn wake(&self, event: DeviceEvent) {
        self.sender.send(event);
        // kick the OS event loop awake.
        if let Ok(err) = self.proxy.wakeup() {
            warn!("Failed to wake up event loop ({}).", err);
        }
    }
}

// In run_loop
self.events_loop.borrow_mut().run_forever(|e| {
    let event = match e {
        glutin::Event::Awakened => {
            // Can we count on not receiving spurious Event::Awakened?
            self.receiver.recv().unwrap()
        },
        glutin::Event::WindowEvent => //translate to a DeviceEvent
    };
    self.handle_event(event)
});

Or find some other way to plug a specific event into the wake-up...

It's not great, and I think the glutin port is more a case of supporting legacy and at least having a semblance of a window on desktop to run tests and so on. I'm more excited by the stuff going on in support these days...

delan · 2023-10-24T11:40:56Z

Related issues: […]

There are probably others!

asajeffrey added I-refactor No impact; the issue is one of maintainability or tidiness. Proposed solution requires refactoring. A-embedding I-cleanup No impact; the issue is one of maintainability or tidiness. A-content/webgl 3d canvas API A-webrender A-xr-zomg AR/VR/MR labels Jul 31, 2019

asajeffrey added the B-meta This issue tracks the status of multiple, related pieces of work label Jul 31, 2019

asajeffrey mentioned this issue Jul 31, 2019

Match event loop wakeups to expected messages #23865

Open

gterzian mentioned this issue Dec 17, 2019

Useless calls from the embedder #25305

Open

This was referenced Mar 30, 2020

Hook up new openxr context menu gesture to embedder context menu machinery #26051

Merged

Embedder/Device interaction API #26057

Closed

This was referenced Oct 24, 2023

Fix servoshell can't respond when there's no minibrowser #30598

Merged

[meta] embedding #30593

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Main thread complexity #23899

Main thread complexity #23899

asajeffrey commented Jul 31, 2019 •

edited by delan

Loading

asajeffrey commented Jul 31, 2019

asajeffrey commented Jul 31, 2019

asajeffrey commented Jul 31, 2019

gterzian commented Jul 31, 2019 •

edited

Loading

asajeffrey commented Jul 31, 2019

gterzian commented Aug 1, 2019 •

edited

Loading

gterzian commented Aug 1, 2019 •

edited

Loading

delan commented Oct 24, 2023

Main thread complexity #23899

Main thread complexity #23899

Comments

asajeffrey commented Jul 31, 2019 • edited by delan Loading

asajeffrey commented Jul 31, 2019

asajeffrey commented Jul 31, 2019

asajeffrey commented Jul 31, 2019

gterzian commented Jul 31, 2019 • edited Loading

asajeffrey commented Jul 31, 2019

gterzian commented Aug 1, 2019 • edited Loading

gterzian commented Aug 1, 2019 • edited Loading

delan commented Oct 24, 2023

asajeffrey commented Jul 31, 2019 •

edited by delan

Loading

gterzian commented Jul 31, 2019 •

edited

Loading

gterzian commented Aug 1, 2019 •

edited

Loading

gterzian commented Aug 1, 2019 •

edited

Loading