
Run translation and LLVM in parallel when compiling with multiple CGUs #43506

Merged
merged 29 commits into rust-lang:master from michaelwoerister:async-llvm on Aug 1, 2017

Conversation

7 participants
@michaelwoerister
Contributor

michaelwoerister commented Jul 27, 2017

This is still a work in progress but the bulk of the implementation is done, so I thought it would be good to get it in front of more eyes.

This PR makes the compiler start running LLVM while translation is still in progress, effectively allowing for more parallelism towards the end of the compilation pipeline. It also allows the main thread to switch between translation and running LLVM, which helps reduce peak memory usage since not all LLVM modules have to be kept in memory until linking. This is especially good for incr. comp. but it works just as well when running with -Ccodegen-units=N.
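
To make the overall shape easier to picture, here is a minimal, self-contained sketch of that idea (illustrative only, not the actual back/write.rs scheduler): translated modules are handed to LLVM worker threads while tokens are free, and the main thread runs LLVM itself when all tokens are taken, so modules don't pile up in memory.

```rust
// Illustrative sketch only (not the actual back/write.rs scheduler): CGUs are
// "translated" on the main thread, handed to LLVM workers while tokens are
// free, and the main thread runs LLVM itself when all tokens are taken so
// that translated modules don't accumulate in memory.
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn run_llvm(cgu: usize) {
    thread::sleep(Duration::from_millis(10)); // stand-in for optimization + codegen
    println!("LLVM finished CGU {}", cgu);
}

fn main() {
    let num_cgus: usize = 8;
    let max_workers = 2; // pretend jobserver token budget
    let (done_tx, done_rx) = mpsc::channel();

    let mut pending_llvm: Vec<usize> = Vec::new(); // translated, waiting for LLVM
    let mut running = 0;
    let mut finished = 0;

    for cgu in 0..num_cgus {
        pending_llvm.push(cgu); // "translate" the next CGU

        // Hand work to LLVM workers while tokens are available.
        while running < max_workers && !pending_llvm.is_empty() {
            let item = pending_llvm.pop().unwrap();
            let tx = done_tx.clone();
            running += 1;
            thread::spawn(move || {
                run_llvm(item);
                tx.send(()).unwrap();
            });
        }

        // No token left: the main thread runs one LLVM work item itself.
        if running == max_workers {
            if let Some(item) = pending_llvm.pop() {
                run_llvm(item);
                finished += 1;
            }
        }

        // Collect any workers that finished in the meantime.
        while done_rx.try_recv().is_ok() {
            running -= 1;
            finished += 1;
        }
    }

    // Translation is done; drain the remaining LLVM work and wait for workers.
    for item in pending_llvm.drain(..) {
        run_llvm(item);
        finished += 1;
    }
    while finished < num_cgus {
        done_rx.recv().unwrap();
        finished += 1;
    }
}
```

The real implementation is event driven and juggles jobserver tokens, but the memory argument is the same: a translated module only lives until some thread, possibly the main one, has run LLVM on it.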

In order to help with tuning and debugging the work scheduler, the PR adds the -Ztrans-time-graph flag, which spits out HTML files that show how work packages were scheduled:
[Image: -Ztrans-time-graph output for building the regex crate]
(red is translation, green is LLVM)

One side effect here is that -Ztime-passes might show something not quite correct because trans and LLVM are not strictly separated anymore. I plan to have some special handling there that will try to produce useful output.

One open question is how to determine whether the trans-thread should switch to intermediate LLVM processing.

TODO:

  • Restore -Z time-passes output for LLVM.
  • Update documentation, esp. for work package scheduling.
  • Tune the scheduling algorithm.

cc @alexcrichton @rust-lang/compiler

@rust-highfive


Collaborator

rust-highfive commented Jul 27, 2017

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

@michaelwoerister


Contributor

michaelwoerister commented Jul 27, 2017

r? @alexcrichton

Pre-assigning @alexcrichton for review, so he can already start reading :P

@alexcrichton

Member

alexcrichton commented Jul 27, 2017

🎊

@retep998


Member

retep998 commented Jul 27, 2017

As a future thought for how -Ztime-passes could be implemented in a world where everything is threaded, we could use GetThreadTimes on Windows (and the equivalent on Linux) to measure the CPU time used by a given thread and more precisely track where time is being spent.
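
For what it's worth, here is a minimal sketch of the Linux side of that idea, assuming the `libc` crate (the Windows side would call `GetThreadTimes` analogously); this is illustrative only, not what -Ztime-passes does today:

```rust
// Minimal sketch, assuming the `libc` crate (Cargo.toml: libc = "0.2"):
// CPU time consumed by the calling thread via CLOCK_THREAD_CPUTIME_ID,
// the Linux counterpart to GetThreadTimes on Windows.
use std::time::Duration;

fn thread_cpu_time() -> Duration {
    let mut ts: libc::timespec = unsafe { std::mem::zeroed() };
    // CLOCK_THREAD_CPUTIME_ID measures CPU time for the calling thread only.
    let rc = unsafe { libc::clock_gettime(libc::CLOCK_THREAD_CPUTIME_ID, &mut ts) };
    assert_eq!(rc, 0, "clock_gettime failed");
    Duration::new(ts.tv_sec as u64, ts.tv_nsec as u32)
}

fn main() {
    let start = thread_cpu_time();
    let mut x: u64 = 0;
    for i in 0..10_000_000u64 {
        x = x.wrapping_add(i); // burn some CPU on this thread
    }
    println!("thread CPU time: {:?} (x = {})", thread_cpu_time() - start, x);
}
```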

@alexcrichton

Looking great!

The general framework of who's running what when seemed a little confusing to follow, but I think a comment would go a long way towards helping that.

@alexcrichton


Member

alexcrichton commented Jul 28, 2017

Docs look great! So one overall meta-comment as well now. I'm having some difficulty articulating this but I'm slightly worried about a situation like:

  • We've got N codegen units left to translate, but the main thread is stopped
  • Our heuristic blocks the main thread and spins up a worker for an existing codegen unit.
  • All of a sudden we get an influx of jobserver tokens, but the main thread is still stopped.

I haven't convinced myself this is possible and I think that the various handling makes it ok? I've found the interaction between all these threads and the implicit token a little difficult to follow, but does this make sense to you? Can you think of a case where we accidentally starve the translation thread due to our heuristic?

@michaelwoerister


Contributor

michaelwoerister commented Jul 28, 2017

I haven't convinced myself this is possible and I think that the various handling makes it ok? I've found the interaction between all these threads and the implicit token a little difficult to follow, but does this make sense to you? Can you think of a case where we accidentally starve the translation thread due to our heuristic?

So with the latest change, the main thread will be free again as soon as the first LLVM worker is done (thanks to your suggestion). It's still possible to run into an LLVM work shortage the way you describe, though.

I just adapted the strategy to estimate the cost of an LLVM WorkItem. Maybe the main thread should always start the cheapest one available, so it can get back to translating sooner if needed? I don't think this is much of a problem though (with no data to back this claim up in any way :))
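
As a toy illustration of the "start the cheapest one" idea (the struct, names, and cost numbers below are made up, not the actual write.rs cost model):

```rust
// Toy illustration only: pick the cheapest pending item by an estimated cost
// (e.g. a size proxy for the module), so the main thread can return to
// translation as soon as possible.
struct WorkItem {
    cgu_name: String,
    estimated_cost: u64,
}

fn take_cheapest(pending: &mut Vec<WorkItem>) -> Option<WorkItem> {
    let idx = pending
        .iter()
        .enumerate()
        .min_by_key(|(_, item)| item.estimated_cost)
        .map(|(idx, _)| idx)?;
    Some(pending.swap_remove(idx))
}

fn main() {
    let mut pending = vec![
        WorkItem { cgu_name: "regex-cgu0".into(), estimated_cost: 420 },
        WorkItem { cgu_name: "regex-cgu1".into(), estimated_cost: 37 },
        WorkItem { cgu_name: "regex-cgu2".into(), estimated_cost: 910 },
    ];
    if let Some(item) = take_cheapest(&mut pending) {
        println!("main thread runs LLVM on {} (cost {})", item.cgu_name, item.estimated_cost);
    }
}
```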

@alexcrichton


Member

alexcrichton commented Jul 28, 2017

Thinking about this a bit more, I think this is my mental model for what's happening: on each turn of the loop we'll have N slots of work to fill up, depending on what's currently running and how many tokens we have. Given the choice of whether to translate a new unit or codegen an existing unit, it seems fine to have a heuristic. Whenever something happens, though, it'll turn the loop and cause everything to start over.

I think that's roughly what's implemented right now, but I think that it means that we should consider the translation thread idle as soon as we've acquired a new token? That way if the translation thread was blocked and we get a token it should get unblocked? (which I don't think happens today?) I may not be following the code quite right though...

I just adapted the strategy to estimate the cost of a LLVM WorkItem

Neat!
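
Schematically, one turn of that loop could look something like the sketch below; this is purely illustrative, and the heuristic in it is made up rather than the one implemented in the PR:

```rust
// Purely illustrative skeleton of the mental model above, not the real
// write.rs loop: each turn computes the free slots from the tokens we hold
// and decides whether the main thread translates or runs LLVM.
#[derive(Debug)]
enum MainThreadJob {
    Translate,
    RunLlvm,
    Idle,
}

fn plan_turn(
    tokens_held: usize,     // implicit token + any acquired from the jobserver
    workers_running: usize, // LLVM workers currently active
    pending_llvm: usize,    // translated modules waiting for LLVM
    translation_done: bool,
) -> (usize, MainThreadJob) {
    // Fill every free slot with an LLVM worker first (the main thread keeps one token).
    let free_slots = tokens_held.saturating_sub(workers_running + 1);
    let spawn = free_slots.min(pending_llvm);
    let still_pending = pending_llvm - spawn;

    // Then decide what the main thread itself does this turn.
    let job = if !translation_done {
        // Made-up heuristic: keep translating unless LLVM work is piling up.
        if still_pending > workers_running + spawn {
            MainThreadJob::RunLlvm
        } else {
            MainThreadJob::Translate
        }
    } else if still_pending > 0 {
        MainThreadJob::RunLlvm
    } else {
        MainThreadJob::Idle
    };
    (spawn, job)
}

fn main() {
    // 4 tokens held, 2 workers busy, 5 modules queued, translation ongoing.
    let (spawn, job) = plan_turn(4, 2, 5, false);
    println!("spawn {} workers, main thread: {:?}", spawn, job);
}
```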

@michaelwoerister


Contributor

michaelwoerister commented Jul 31, 2017

but I think that it means that we should consider the translation thread idle as soon as we've acquired a new token?

Yes, that makes sense. It's a variation of considering the translation thread idle when a package is finished (another way of getting an additional token).

@bors


Contributor

bors commented Aug 1, 2017

💔 Test failed - status-travis

@michaelwoerister


Contributor

michaelwoerister commented Aug 1, 2017

Thanks, @kennytm. Compiling with LLVM 3.7 right now...

@michaelwoerister


Contributor

michaelwoerister commented Aug 1, 2017

Can't reproduce with LLVM 3.7 either :(

@michaelwoerister


Contributor

michaelwoerister commented Aug 1, 2017

I pushed another change that should give a sensible error message at a likely point of failure. Also adapted the scheduler heuristic slightly. Let's see.

@michaelwoerister


Contributor

michaelwoerister commented Aug 1, 2017

New error, excellent :D

@michaelwoerister


Contributor

michaelwoerister commented Aug 1, 2017

@bors r=alexcrichton

OK, passes travis now.

@bors


Contributor

bors commented Aug 1, 2017

📌 Commit 6468cad has been approved by alexcrichton

bors added a commit that referenced this pull request Aug 1, 2017

Auto merge of #43506 - michaelwoerister:async-llvm, r=alexcrichton
Run translation and LLVM in parallel when compiling with multiple CGUs

@bors


Contributor

bors commented Aug 1, 2017

⌛️ Testing commit 6468cad with merge e772c28...

@bors


Contributor

bors commented Aug 1, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing e772c28 to master...

@bors bors merged commit 6468cad into rust-lang:master Aug 1, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
homu Test successful

@bors bors referenced this pull request Aug 1, 2017

Merged

Profile queries #43345

bors added a commit that referenced this pull request Oct 5, 2017

Auto merge of #45019 - aidanhs:aphs-no-trans-worker-panic, r=alexcrichton

Don't unwrap work item results as the panic trace is useless

Fixes #43402 now that there are no multithreaded panic printouts

Also update a comment

--------

Likely regressed in #43506, where the code was changed to panic in worker threads on error.

Unwrapping gives zero extra information since the stack trace is so short, so we may as well just surface that there was an error and exit the thread properly. Because there are then no multithreaded printouts, I think it should mean the output of the test for #26199 is deterministic and not interleaved (thanks to @philipc #43402 (comment) for a hint).

Sadly the output is now:
```
thread '<unnamed>' panicked at 'aborting due to worker thread panic', src/librustc_trans/back/write.rs:1643:20
note: Run with `RUST_BACKTRACE=1` for a backtrace.
error: could not write output to : No such file or directory

error: aborting due to previous error
```
but it's an improvement over the multi-panic situation before.

r? @alexcrichton

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Nov 3, 2017

Update to 1.21.0
Changelog:
Version 1.21.0 (2017-10-12)
==========================

Language
--------
- [You can now use static references for literals.][43838]
  Example:
  ```rust
  fn main() {
      let x: &'static u32 = &0;
  }
  ```
- [Relaxed path syntax. Optional `::` before `<` is now allowed in all contexts.][43540]
  Example:
  ```rust
  my_macro!(Vec<i32>::new); // Always worked
  my_macro!(Vec::<i32>::new); // Now works
  ```

Compiler
--------
- [Upgraded jemalloc to 4.5.0][43911]
- [Enabled unwinding panics on Redox][43917]
- [Now runs LLVM in parallel during translation phase.][43506]
  This should reduce peak memory usage.

Libraries
---------
- [Generate builtin impls for `Clone` for all arrays and tuples that
  are `T: Clone`][43690]
- [`Stdin`, `Stdout`, and `Stderr` now implement `AsRawFd`.][43459]
- [`Rc` and `Arc` now implement `From<&[T]> where T: Clone`, `From<str>`,
  `From<String>`, `From<Box<T>> where T: ?Sized`, and `From<Vec<T>>`.][42565]

Stabilized APIs
---------------

[`std::mem::discriminant`]

Cargo
-----
- [You can now call `cargo install` with multiple package names][cargo/4216]
- [Cargo commands inside a virtual workspace will now implicitly
  pass `--all`][cargo/4335]
- [Added a `[patch]` section to `Cargo.toml` to handle
  prepublication dependencies][cargo/4123] [RFC 1969]
- [`include` & `exclude` fields in `Cargo.toml` now accept gitignore
  like patterns][cargo/4270]
- [Added the `--all-targets` option][cargo/4400]
- [Using required dependencies as a feature is now deprecated and emits
  a warning][cargo/4364]


Misc
----
- [Cargo docs are moving][43916]
  to [doc.rust-lang.org/cargo](https://doc.rust-lang.org/cargo)
- [The rustdoc book is now available][43863]
  at [doc.rust-lang.org/rustdoc](https://doc.rust-lang.org/rustdoc)
- [A preview of RLS has been made available through rustup][44204]
  Install with `rustup component add rls-preview`
- [`std::os` documentation for Unix, Linux, and Windows now appears on doc.rust-lang.org][43348]
  Previously only showed `std::os::unix`.

Compatibility Notes
-------------------
- [Changes in method matching against higher-ranked types][43880] This may cause
  breakage in subtyping corner cases. [A more in-depth explanation is available.][info/43880]
- [rustc's JSON error output's byte positions now start at the top of the file.][42973]
  They were previously relative to rustc's internal `CodeMap` struct, which
  required the unstable library `libsyntax` to use correctly.
- [`unused_results` lint no longer ignores booleans][43728]

[42565]: rust-lang/rust#42565
[42973]: rust-lang/rust#42973
[43348]: rust-lang/rust#43348
[43459]: rust-lang/rust#43459
[43506]: rust-lang/rust#43506
[43540]: rust-lang/rust#43540
[43690]: rust-lang/rust#43690
[43728]: rust-lang/rust#43728
[43838]: rust-lang/rust#43838
[43863]: rust-lang/rust#43863
[43880]: rust-lang/rust#43880
[43911]: rust-lang/rust#43911
[43916]: rust-lang/rust#43916
[43917]: rust-lang/rust#43917
[44204]: rust-lang/rust#44204
[cargo/4123]: rust-lang/cargo#4123
[cargo/4216]: rust-lang/cargo#4216
[cargo/4270]: rust-lang/cargo#4270
[cargo/4335]: rust-lang/cargo#4335
[cargo/4364]: rust-lang/cargo#4364
[cargo/4400]: rust-lang/cargo#4400
[RFC 1969]: rust-lang/rfcs#1969
[info/43880]: rust-lang/rust#44224 (comment)
[`std::mem::discriminant`]: https://doc.rust-lang.org/std/mem/fn.discriminant.html

@andjo403 andjo403 referenced this pull request Jan 9, 2018

Merged

fix faulty comment #47302

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jan 16, 2018

Rollup merge of rust-lang#47302 - andjo403:commentfix, r=michaelwoerister

fix faulty comment

after rust-lang#43506 there is no fixed number of requests sent.


kennytm added a commit to kennytm/rust that referenced this pull request Jan 17, 2018

Rollup merge of rust-lang#47302 - andjo403:commentfix, r=michaelwoerister

fix faulty comment

after rust-lang#43506 there is no fixed number of requests sent.
