Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid deadlock when closing a pipeline. #11585

Conversation

@asajeffrey
Copy link
Member

asajeffrey commented Jun 3, 2016

At the moment, the constellation blocks on a pipeline during closure. This PR makes pipeline closure asynchronous.


  • ./mach build -d does not report any errors
  • ./mach test-tidy does not report any errors
  • These changes fix #11546.
  • These changes do not require tests because testing for absence of deadlock is difficult.

This change is Reviewable

@highfive
Copy link

highfive commented Jun 3, 2016

Heads up! This PR modifies the following files:

  • @asajeffrey: components/constellation/pipeline.rs, components/constellation/constellation.rs
  • @KiChjang: components/script/script_thread.rs, components/script_traits/script_msg.rs, components/script_traits/script_msg.rs
@highfive
Copy link

highfive commented Jun 3, 2016

warning Warning warning

  • These commits modify script code, but no tests are modified. Please consider adding a test!
@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

@asajeffrey asajeffrey changed the title Avoid deadlock when shutting down. Avoid deadlock when closing a pipeline. Jun 3, 2016
@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

I note we don't have a I-deadlock or I-liveness issue :)

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

Running test-wpt locally, I can see some new TIMEOUTs, which is probably due to not having a timeout on shutdown. The easiest thing to do would be to add a ForceExit message, and send it to the constellation a fixed time after shutdown is initiated.

@larsbergstrom larsbergstrom assigned larsbergstrom and unassigned jdm Jun 3, 2016
@larsbergstrom
Copy link
Contributor

larsbergstrom commented Jun 3, 2016

Are these FAIL->TIMEOUT or PASS->TIMEOUT? The former seems OK; the latter is worrysome.

I get a little bit worried about constant, fixed timeouts, as it's easy to hit them on either slow devices or devices that are, say, building the script crate at the same time.

I've reviewed the code and it appears good to me, though I'll confess I don't nit "this should be a oneliner" as closely as I should.

@larsbergstrom
Copy link
Contributor

larsbergstrom commented Jun 3, 2016

Reviewed 4 of 4 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

@larsbergstrom they were PASS -> TIMEOUT, but it could be they were intermittents before.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

@larsbergstrom on thinking about it, we might be able to replace timeouts on shutdown by ensuring that when a pipeline panics, it is removed from the pipeline set. We'd still have a case where a deadlocked pipeline would cause shutdown to fail though.

@highfive
Copy link

highfive commented Jun 3, 2016

New code was committed to pull request.

1 similar comment
@highfive
Copy link

highfive commented Jun 3, 2016

New code was committed to pull request.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

I'm not sure about this last commit, it adds a heartbeat for shutdown messages. Every 100ms, the compositor sends an Exit message to the constellation, which in turn sends an Exit message to the pipelines. The reason for doing this is that if a pipeline exits silently, without telling the constellation that it's doing so, we will attempt to send it an Exit message, which will fail in the same way as if the pipeline panicked.

This fix turns a deadlock into a panic, which is better, but at the cost of code complexity and the possibility for spurious panic reporting, so I'm not sure it's worth it.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

I re-ran the wpt-tests a few times. The failures are intermittent, but /html/semantics/embedded-content/the-iframe-element/iframe-with-base.html appears to be a new intermittent, and might be related to this PR, sigh:

$ ./mach test-wpt --release
Running 6187 tests in web-platform-tests

  ▶ TIMEOUT [expected PASS] /html/semantics/embedded-content/the-iframe-element/iframe-with-base.html
  │ 
  │ Shutting down the Constellation after generating an output file or exit flag specified
  └ Shutting down the Constellation after generating an output file or exit flag specified

Unsupported test type wdspec for product servo
Ran 6187 tests finished in 948.0 seconds.
  • 6186 ran as expected. 2502 tests skipped.
  • 1 tests timed out unexpectedly
@asajeffrey asajeffrey force-pushed the asajeffrey:constellation-avoid-deadlock-during-pipeline-closure branch from d2a872e to 3422af7 Jun 3, 2016
@highfive
Copy link

highfive commented Jun 3, 2016

New code was committed to pull request.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

Removed heartbeat patch and squashed.

@highfive
Copy link

highfive commented Jun 3, 2016

New code was committed to pull request.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

I think I may know what was causing the problem. Previously I was calling on shutdown:

        for (pipeline_id, ref pipeline) in &self.pipelines {
            debug!("Exiting pipeline {:?}.", pipeline_id);
            pipeline.exit();
        }

which closes the pipelines in random order, not respecting the tree structure. I replaced this by:

        // TODO: exit before the root frame is set?
        if let Some(root_id) = self.root_frame_id {
            self.close_frame(root_id, ExitPipelineMode::Normal);
        }

and everything seems much happier!

@larsbergstrom
Copy link
Contributor

larsbergstrom commented Jun 3, 2016

Awesome!

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

I'll take that as an r+ :)

@bors-servo r=larsbergstrom

@bors-servo
Copy link
Contributor

bors-servo commented Jun 3, 2016

📌 Commit 2263d9c has been approved by larsbergstrom

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 3, 2016

(For those following along at home, there was a previous r+ on IRC http://logs.glob.uno/?c=mozilla%23servo&s=3+Jun+2016&e=3+Jun+2016#c447820 - I'm not quite that YOLO.)

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 4, 2016

I added the test case from #11546. Running it in debug mode kicked up a stack trace:

  ▶ CRASH [expected OK] /_mozilla/mozilla/iframe_replacement.html
  │ 
  │ ERROR:constellation::constellation: Panic: called `Result::unwrap()` on an `Err` value: Error { repr: Os { code: 32, message: "Broken pipe" } }
  │ ERROR:constellation::constellation: Backtrace:
  │ frame #0  - 0x000055bc4d692e28 - backtrace::backtrace::libunwind::trace
  │                                @ /home/ajeffrey/github/asajeffrey/servo/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.2.1/src/backtrace/libunwind.rs:54
  │                                - backtrace::backtrace::trace<closure>
  │                                @ /home/ajeffrey/github/asajeffrey/servo/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.2.1/src/backtrace/mod.rs:70
  │ frame #1  - 0x000055bc4d692cda - backtrace::capture::{{impl}}::new
  │                                @ /home/ajeffrey/github/asajeffrey/servo/target/debug/build/backtrace-379e75c6a2dd4a9a/out/capture.rs:79
  │ frame #2  - 0x000055bc4a6e02bc - util::thread::spawn_named_with_send_on_panic::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h84a17b383f0345bc
  │ frame #3  - 0x000055bc4a6e2c4d - extern $u22$rust.call$u22$$u20$fn$LP$$u5b$closure$SP$DefId$u20$$u7b$$u20$krate.$u20$70$C$$u20$node.$u20$DefIndex$LP$2613$RP$$u20$$u7d$$u20$0.ipc_channel..ipc..IpcSender$LT$$LP$std..option..Option$LT$msg..constellation_msg..PipelineId$GT$$C$$u20$std..string..String$C$$u20$std..string..String$RP$$GT$$C$$u20$1.std..option..Option$LT$msg..constellation_msg..PipelineId$GT$$u5d$$C$$u20$$LP$$RF$$u27$static$u20$std..any..Any$u20$$u2b$$u20$$u27$static$C$$RP$$RP$::once_shim.61225::he372710fe154ed24
  │ frame #4  - 0x000055bc4a6e2bfb - alloc::boxed::{{impl}}::call_box<(&Any),closure>
  │                                @ ../src/liballoc/boxed.rs:543
  │ frame #5  - 0x000055bc4d5a768c - util::panicking::initiate_panic_hook::{{closure}}::{{closure}}::{{closure}}
  │                                @ /home/ajeffrey/github/asajeffrey/servo/components/util/panicking.rs:43
  │ frame #6  - 0x000055bc4d5a7239 - std::thread::local::{{impl}}::with<core::cell::RefCell<core::option::Option<util::panicking::PanicHandlerLocal>>,closure,()>
  │                                @ ../src/libstd/thread/local.rs:211
  │ frame #7  - 0x000055bc4d5a70ba - util::panicking::initiate_panic_hook::{{closure}}::{{closure}}
  │                                @ /home/ajeffrey/github/asajeffrey/servo/components/util/panicking.rs:40
  │ frame #8  - 0x000055bc4dbe359c - std::panicking::rust_panic_with_hook::hfe203e3083c2b544
  │ frame #9  - 0x000055bc4dbfe191 - std::panicking::begin_panic::h4889569716505182
  │ frame #10 - 0x000055bc4dbe4ffa - std::panicking::begin_panic_fmt::h484cd47786497f03
  │ frame #11 - 0x000055bc4dbfe12e - rust_begin_unwind
  │ frame #12 - 0x000055bc4dc3472f - core::panicking::panic_fmt::h257ceb0aa351d801
  │ frame #13 - 0x000055bc4a551c2b - core::result::unwrap_failed<std::io::error::Error>
  │                                @ ../src/libcore/macros.rs:29
  │ frame #14 - 0x000055bc4a715784 - core::result::{{impl}}::unwrap<(),std::io::error::Error>
  │                                @ ../src/libcore/result.rs:723
  │ frame #15 - 0x000055bc4a6e364b - _<gfx..paint_thread..PaintThread<C>>::create::_$u7b$$u7b$closure$u7d$$u7d$::h705efe1c05d2a43c
  │ frame #16 - 0x000055bc4a6df655 - util::thread::spawn_named_with_send_on_panic::_$u7b$$u7b$closure$u7d$$u7d$::h1d42b4e0faf413e5
  │ frame #17 - 0x000055bc4a6df3e7 - std::panic::{{impl}}::call_once<(),closure>
  │                                @ ../src/libstd/panic.rs:284
  │ frame #18 - 0x000055bc4a6df33e - std::panicking::try::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h354ca91dfb46364c
  │ frame #19 - 0x000055bc4a6dfc04 - std::panicking::try::call<closure>
  │                                @ ../src/libstd/panicking.rs:272
  │ frame #20 - 0x000055bc4dc0838b - __rust_try
  │ frame #21 - 0x000055bc4dc0832e - __rust_maybe_catch_panic
  │ frame #22 - 0x000055bc4a6df28d - std::panicking::try::_$u7b$$u7b$closure$u7d$$u7d$::hd516d71661769d0d
  │ frame #23 - 0x000055bc4a6df1cf - std::thread::local::{{impl}}::with<core::cell::Cell<usize>,closure,core::result::Result<(), Box<Any>>>
  │                                @ ../src/libstd/thread/local.rs:211
  │ frame #24 - 0x000055bc4a6df00f - std::panicking::try<(),std::panic::AssertUnwindSafe<closure>>
  │                                @ ../src/libstd/panicking.rs:235
  │ frame #25 - 0x000055bc4a6deeee - std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure>,()>
  │                                @ ../src/libstd/panic.rs:387
  │ frame #26 - 0x000055bc4a6ded61 - std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::h0d443712f85f2de7
  │ frame #27 - 0x000055bc4a6dfde7 - alloc::boxed::{{impl}}::call_box<(),closure>
  │                                @ ../src/liballoc/boxed.rs:543
  │ frame #28 - 0x000055bc4dbfc334 - std::sys::thread::Thread::new::thread_start::h6f266e069bf4ec2b
  │ frame #29 - 0x00007f8ef7d4b6a9 - start_thread
  │ frame #30 - 0x00007f8ef7869e9c - clone
  │ frame #31 - 0x0000000000000000 - <unknown>
  │ 
  └ ERROR:constellation::constellation: Pipeline failed in hard-fail mode.  Crashing!

which I think is cased by the paint thread calling shutdown_chan.send(()).unwrap(). I'm replacing this by a let _ = shutdown_chan,send(()) to see if that helps.

@highfive
Copy link

highfive commented Jun 4, 2016

New code was committed to pull request.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 4, 2016

@larsbergstrom I pushed the change:

-            shutdown_chan.send(()).unwrap();
+            let _ = shutdown_chan.send(());

does this still look good to you? If so, I'll squash and retry.

@asajeffrey asajeffrey force-pushed the asajeffrey:constellation-avoid-deadlock-during-pipeline-closure branch from 21d5115 to 2416072 Jun 4, 2016
@highfive
Copy link

highfive commented Jun 4, 2016

New code was committed to pull request.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 4, 2016

Squashed.

@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 4, 2016

@bors-servo
Copy link
Contributor

bors-servo commented Jun 4, 2016

📌 Commit 2416072 has been approved by larsbergstrom

@bors-servo
Copy link
Contributor

bors-servo commented Jun 4, 2016

Testing commit 2416072 with merge d2bac9f...

bors-servo added a commit that referenced this pull request Jun 4, 2016
…-pipeline-closure, r=larsbergstrom

Avoid deadlock when closing a pipeline.

<!-- Please describe your changes on the following line: -->
At the moment, the constellation blocks on a pipeline during closure. This PR makes pipeline closure asynchronous.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #11546.
- [X] These changes do not require tests because testing for absence of deadlock is difficult.

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/11585)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jun 4, 2016

💔 Test failed - linux-rel

@highfive
Copy link

highfive commented Jun 4, 2016

  ▶ FAIL [expected PASS] /css-transforms-1_dev/html/transform-abspos-007.htm
  └   → /css-transforms-1_dev/html/transform-abspos-007.htm 71f0313eedfbfcce0b0fcc5ae55f34b1daa3b8d8
/css-transforms-1_dev/html/reference/transform-abspos-ref.htm 78d197606924062e8dd2a773c977afcecf8940f8
Testing 71f0313eedfbfcce0b0fcc5ae55f34b1daa3b8d8 == 78d197606924062e8dd2a773c977afcecf8940f8

  ▶ FAIL [expected PASS] /css-transforms-1_dev/html/transform-table-007.htm
  └   → /css-transforms-1_dev/html/transform-table-007.htm a5c014b20ef1363bea6f24eda28c7efb7c45698a
/css-transforms-1_dev/html/reference/transform-blank-ref.htm fa6407b1acbbfea27e27061e7d1bdeca98e4a728
Testing a5c014b20ef1363bea6f24eda28c7efb7c45698a == fa6407b1acbbfea27e27061e7d1bdeca98e4a728
@asajeffrey
Copy link
Member Author

asajeffrey commented Jun 4, 2016

@bors-servo
Copy link
Contributor

bors-servo commented Jun 4, 2016

Testing commit 2416072 with merge 6581e35...

bors-servo added a commit that referenced this pull request Jun 4, 2016
…-pipeline-closure, r=larsbergstrom

Avoid deadlock when closing a pipeline.

<!-- Please describe your changes on the following line: -->
At the moment, the constellation blocks on a pipeline during closure. This PR makes pipeline closure asynchronous.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #11546.
- [X] These changes do not require tests because testing for absence of deadlock is difficult.

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/11585)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jun 4, 2016

@bors-servo bors-servo merged commit 2416072 into servo:master Jun 4, 2016
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@bors-servo bors-servo mentioned this pull request Jun 4, 2016
3 of 3 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

6 participants
You can’t perform that action at this time.