Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fire a mozbrowsererror event if pipeline/frame panics #10334

Closed
paulrouget opened this issue Apr 1, 2016 · 42 comments
Closed

Fire a mozbrowsererror event if pipeline/frame panics #10334

paulrouget opened this issue Apr 1, 2016 · 42 comments
Assignees

Comments

@paulrouget
Copy link
Contributor

@paulrouget paulrouget commented Apr 1, 2016

Followup: #10082 (comment)

mozbrowsererror with type = "fatal".

@paulrouget
Copy link
Contributor Author

@paulrouget paulrouget commented Apr 1, 2016

And maybe attach the back trace or add some more details to the event to allow browserhtml to report the crash.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 1, 2016

@paulrouget should that be type = "fatal" or type = "fatal(crash)". https://developer.mozilla.org/en-US/docs/Web/Events/mozbrowsererror has "fatal(crash)" but I think "(crash)" is just meant as a comment, not as part of the type string.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 1, 2016

At the moment, our binding to mozbrowsererror doesn't allow us to specify the type (https://github.com/servo/servo/blob/master/components/script_traits/lib.rs#L418). We'd need to extend MozBrowserEvent to provide type information.

@paulrouget How important is the type? Should we just emit a mozbrowsererror with no type info for now?

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 1, 2016

Also @paulrouget, should there be any other info in the event? For example, would including the stack trace in the detail be useful?

@jdm
Copy link
Member

@jdm jdm commented Apr 1, 2016

@asajeffrey The type is not important to maintain. We extend them as necessary.

@paulrouget
Copy link
Contributor Author

@paulrouget paulrouget commented Apr 2, 2016

should that be type = "fatal" or type = "fatal(crash)".

"fatal"

How important is the type?

Not that important, but it's pretty easy to support: we should create a new event type in BrowserElement.webidl;

dictionary BrowserElementErrorEventDetail {
  DOMString type;
};

For example, would including the stack trace in the detail be useful?

Absolutely. You can add an extra field to the webidl for that.

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 5, 2016

From #10329 (comment)

As far as #10334, one way to handle this might be to have a global privileged "parent window". Windows corresponding to browser.html and those using that API have this. On Servo's side we store a global once-initialized static channel to its script task, and in the panic handler if we find such a channel and sending messages to the window works (use try_send() instead of send()), send a message to the script task with panic details. This triggers an event on the bhtml Window.

The alternative is to have each iframe mozbrowser get a special custom panic handler that bubbles it up. However, panics may not always be in the per-iframe threads, and if something else panics then there's not much we can do about it.

Looking at this perhaps we have to use the latter option, since we need to be able to associate a panic with a specific <iframe>

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 5, 2016

@paulrouget For telemetry and stuff, would it be okay if the bhtml window itself could have a mozbrowsererror fired on it (not just its iframes)? This way, we can attempt to catch crashes in individual iframe threads, and if the panic is elsewhere, try to tell bhtml about it directly so that it can (attempt to) report to telemetry.

@paulrouget
Copy link
Contributor Author

@paulrouget paulrouget commented Apr 5, 2016

@Manishearth Do we need to have a special event for that? Could we use this instead: https://html.spec.whatwg.org/multipage/webappapis.html#handler-onerror. Because we try to make mozbrowser events map contentDocument events. mozbrowserloadend is onload for example.

Regardless of the name of the event, it should be fired only if 1) the mozbrowser pref is on, 2) only to the very top level pipeline.

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 5, 2016

Could it be a mozbrowsererror instead (fired on the Window instead of the iframe)? I don't like exposing a possible exploit vector through a non-mozbrowser interface.

(Yes, it will only trigger on Browser windows, but I like having that second step there too)

@paulrouget
Copy link
Contributor Author

@paulrouget paulrouget commented Apr 5, 2016

Sure. To avoid confusion, could we name it mozbrowsererror_unknownorigin or something like that?

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 5, 2016

sgtm

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 11, 2016

OK, here's my first shot, but I don't think it's quite right (reasons below):

    fn handle_failure_msg(&mut self,
                          pipeline_id: PipelineId,
                          parent_info: Option<(PipelineId, SubpageId)>) {
...
        if Some((parent_id, subpage_id)) = parent_info {
            if Some(parent) = self.pipelines.get(&parent_id) {
                let event = MozBrowserEvent::Error; // TODO: add type and detail 
                parent.trigger_mozbrowser_event(subpage_id, event);
            }
        }
...
    }

The reason why I don't think this is quite right is that it's firing the Error event on the iframe which just failed, so its script thread is not necessarily in any state to be bubbling events. Really the event should be sent to the parent, not the child, but currently AFAICT the script thread doesn't expose a way to send a mozbrowser event to a given pipeline, you have to send it to a given subpage.

@Manishearth thoughts?

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 11, 2016

(One thing holding this back is that we can't get at the error until #10345 lands).

@jdm
Copy link
Member

@jdm jdm commented Apr 11, 2016

The subpage is used to identify the iframe element in the given pipeline, iirc.

@jdm
Copy link
Member

@jdm jdm commented Apr 11, 2016

Which is to say, the script thread which may not be in a good state for dispatching events is entirely different from the script thread that contains the iframe element, since the two are considered cross origin to each other.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 11, 2016

@jdm: yes, but parent.trigger_mozbrowser_event(subpage_id, event) triggers a mozbrowser event inside the script thread of the child, doesn't it? In this case, that's the script thread that's probably just crashed, so is very likely to drop the event on the floor rather than bubble it.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 11, 2016

Goes and reads implementation... ah, I see the event is actually processed by the script thread of the parent, not of the child. This is good!

Not so good is the nested case, since we want the event to reach the browser.html root iframe eventually. When the event reaches the document node of its containing iframe, does it keep bubbling?

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 12, 2016

I was thinking we'd store a separate sender to the toplevel script task in the thread local state. This does break process isolation, however, so it has to be a different channel entirely that can only handle mozbrowserevents, which needs to be selected on. Perhaps this solution is better, though I'm not sure how process isolation is achieved here.

Do child iframes have access to the parent script threads via the pipeline API? Even if x-origin? Is that not plusungood?

The reason I'd prefer using TLS instead of the existing machinery is that we don't know how much is broken with the existing machinery when we hit the panic handler. IEven if something wasn't broken, it probably got cleaned up during unwinding -- panic handlers can't access your usual state since they run after destructors IIRC.

This way we set up a channel when spawning a task (easy to do since it's not during a panic), stick it in TLS, and have the panic handler just blindly send the backtrace down that channel.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 12, 2016

OK, what we could do when an error happens is run up the pipeline tree starting at the pipeline that caused the error, until we find the (parent_id, subpage_id) where parent.parent_info is None, and then call parent.trigger_mozbrowser_event(subpage_id, event). This will fire an event in the root script thread, targeted at the child iframe which contains the pipeline that failed. The event never has to cross origin boundaries, so hopefully does not cause security problems.

Looking at the code for trigger_mozbrowserlocationchange (https://github.com/servo/servo/blob/master/components/compositing/constellation.rs#L1863), I'm not sure why this doesn't have to do something similar. In the case where the iframe that changed location is deeply nested in the iframe tree, why is it enough to trigger a mozbrowser event on the iframe's parent?

@jdm
Copy link
Member

@jdm jdm commented Apr 12, 2016

The idea right now is that we only intend to support mozbrowser frames in the top-level document.

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 12, 2016

@jdm However, we want to support catching crashes in frames that are transitively under mozbrowser, no?

@jdm
Copy link
Member

@jdm jdm commented Apr 12, 2016

That has interesting implications, since the root frame can't directly interact with the iframe containing the pipeline that failed.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 12, 2016

@jdm: indeed, but by this point the iframe contents are in some unknown state, hopefully about:failure. The chrome can't do much, except perhaps ask the user if they want to submit a report, and maybe kill the tab containing the crashed iframe.

bors-servo added a commit that referenced this issue Apr 14, 2016
…earth

Added panic message to failures.

Added the panic message to failures. This is a step towards #10334, since it gives us access to the panic error message when we fire a `mozbrowsererror` event. The remaining steps are also to record the backtrace, and to report the failure in the event.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10587)
<!-- Reviewable:end -->
@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 14, 2016

@paulrouget what should we do if a non-pipeline thread panics? There's no iframe responsible for the panic, should we just trigger a mozbrowsererror event on the top-level window? Should the event details be different?

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 14, 2016

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 14, 2016

@Manishearth okay, I am going over previous discussions. @paulrouget you'd prefer a different event than a mozbrowsererror event with some other way of distinguishing that it didn't come from a particular iframe?

@paulrouget
Copy link
Contributor Author

@paulrouget paulrouget commented Apr 15, 2016

@paulrouget you'd prefer a different event than a mozbrowsererror event with some other way of distinguishing that it didn't come from a particular iframe?

mozbrowser**** events are for the Browser API (hence browser). So for the top level window, let's use something else. I don't really care about the name, as long as it's different.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 15, 2016

We can fire it onto any fixed target, e.g. the root document, there's just no iframe to target, unless we wanted to try to track which pipeline each thread is acting on behalf of.

@Manishearth
Copy link
Member

@Manishearth Manishearth commented Apr 15, 2016

So for the top level window

The top level window is a Browser window though.

@paulrouget
Copy link
Contributor Author

@paulrouget paulrouget commented Apr 15, 2016

It doesn't really matter. mozbrowsererror, mozbrowsererror_unknownorigin, or anything else. The target will be different, so we can differentiate.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 15, 2016

#10641 adds a panic channel that can be used by any thread (not just a pipeline thread) to report panics. Next up: add the backtrace to that channel, get all threads to report panics, and hook up to mozbrowsererror.

bors-servo added a commit that referenced this issue Apr 18, 2016
Trigger mozbrowsererror event when a pipeline fails.

Addresses part of #10334 (it fires the event, but does not include a backtrace).

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10551)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this issue Apr 19, 2016
Dedicated panic channel

Added a dedicated panic channel, and removed the panic messages for the script and layout threads. This is needed so that other threads can report panics, which is part of #10334.

Note that this PR includes the commit from #10572, so should land after it lands.

r? @Manishearth

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10641)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this issue Apr 19, 2016
Dedicated panic channel

Added a dedicated panic channel, and removed the panic messages for the script and layout threads. This is needed so that other threads can report panics, which is part of #10334.

Note that this PR includes the commit from #10572, so should land after it lands.

r? @Manishearth

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10641)
<!-- Reviewable:end -->
@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 23, 2016

@paulrouget what do you want me to do in the case of multiple panics? Just send the first one? Just send the first one for each pipeline? These things usually travel in packs, since panics cascade.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 25, 2016

#10824 sends the panic reason and backtrace to the constellation, for error reporting.

#10837 adds the fields to mozbrowsererror events for forwarding this information.

Once both have landed, we can report panics to the browser chrome.

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 25, 2016

@paulrouget I added description and report fields to the mozbrowsererror details, intended to contain a human-readable and a machine-readable representation of the error. Do these seem sensible to you? Care to bikeshed the names?

@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 26, 2016

#10862 allows us to test this, by making sure the root pipeline survives, even when randomly killing pipelines.

bors-servo added a commit that referenced this issue Apr 26, 2016
…anishearth

Communicate a backtrace to the constellation when panicking.

Send a representation of the backtrace from a pipeline thread to the constellation in the case of panic. This is the next step in communicating the backtrace to the browser chrome (#10334).

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10824)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this issue Apr 27, 2016
Don't kill the root pipeline when randomly killing pipelines.

Useful for stress-testing the browser chrome, e.g. testing #10334.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10862)
<!-- Reviewable:end -->
@asajeffrey
Copy link
Member

@asajeffrey asajeffrey commented Apr 28, 2016

#10898 only keeps the root pipeline alive when dom.mozbrowser.enabled is true.

bors-servo added a commit that referenced this issue Apr 29, 2016
…shearth

Add detail to mozbrowsererror events.

Part of #10334. Once #10824 lands, we can include the panic reason and backtrace in the error report.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10837)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this issue Apr 30, 2016
… r=jdm

Send the panic reason and backtrace in mozbrowsererror.

Closes #10334.  Glues together PRs #10837 and #10824.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/10931)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.