Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate jobserver support to parallel codegen #42682

Merged
merged 1 commit into from
Jun 22, 2017

Conversation

alexcrichton
Copy link
Member

This commit integrates the jobserver crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:

  • Primarily the compiler can cooperate with Cargo on parallelism. When you run
    cargo build -j4 then this'll make sure that the entire build process between
    Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
    instances which may all try to spawn lots of threads.

  • Secondarily rustc/Cargo can now integrate with a foreign GNU make jobserver.
    This means that if you call cargo/rustc from make or another
    jobserver-compatible implementation it'll use foreign parallelism settings
    instead of creating new ones locally.

As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!

Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.

@rust-highfive
Copy link
Collaborator

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member Author

r? @michaelwoerister

Copy link
Member

@michaelwoerister michaelwoerister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! I'm excited about this :)

Regarding the implementation, I'm not quite clear on how token handling works there. Wouldn't it be easier to just move one token into each spawn_work and let it go out of scope there?

@@ -82,16 +84,11 @@ pub fn run(sess: &session::Session,
// For each of our upstream dependencies, find the corresponding rlib and
// load the bitcode from the archive. Then merge it into the current LLVM
// module that we've got.
link::each_linked_rlib(sess, &mut |cnum, path| {
// `#![no_builtins]` crates don't participate in LTO.
if sess.cstore.is_no_builtins(cnum) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you remove this on purpose?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, it's added in later again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this query just ended up having a lot of dependencies on sess so I figured it'd be best to move it way up to the beginning instead of only running it back here.

execute_work_item(&cgcx, work);
let mut tokens = Vec::new();
let mut running = 0;
while work_items.len() > 0 || running > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment here saying something to the effect of "This is our 'main loop', taking care of spawning worker threads and communicating with live ones via message passing -- so we have to keep it running as long as there's still work that hasn't been doled out to a worker (work_items > 0) or if there are still live workers to be communicated with (running > 0)."

scope,
tx.clone(),
work_items.pop().unwrap(),
work_items.len());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very fond this: mutating work_items via pop and then taking its len. I assume that we have a defined evaluation order of function arguments, but I don't like relying on it.

// possible. Remember that we have an ambient token available to us
// hence the `+1` here.
//
// Also note that we may actually acquire more tokens than we need, so
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does that happen? If we abort early because of an error?

//
// Also note that we may actually acquire more tokens than we need, so
// in that case just truncate the `tokens` list every time we pass
// through here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add that truncating implies dropping and thus releasing tokens?

work_items.len());
running += 1;
}
tokens.truncate(running.saturating_sub(1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure how this works. Can't this cause tokens to be lost without a spawn_work having been called for them?


// Set up a destructor which will fire off a message that we're done as
// we exit.
struct Bomb {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have something like this in libstd.

if sess.cstore.is_no_builtins(cnum) {
return
}
each_linked_rlib.push((cnum, path.to_path_buf()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the each_linked_rlib field is LTO-specific, we should probably change to the name to reflect this.

// Execute the work itself, and if it finishes successfully then flag
// ourselves as a success as well.
if execute_work_item(&cgcx, work).is_err() {
drop(cgcx.tx.send(Message::AbortIfErrors));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could argue that it would be cleaner to also mem::forget the bomb in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I wasn't quite sure how this should be handled, I think that if you see a FatalError then a diagnostic has already been sent off, which in turn already sent AbortIfErrors. In that sense it may be fruitless to send another message here, so I'll just ignore the result.

// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! Scoped threads, copied from `crossbeam`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also use crossbeam directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm upon further inspection, I don't see why not!

@shepmaster shepmaster added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 16, 2017
@alexcrichton
Copy link
Member Author

Ok, updated! @michaelwoerister I added a large comment above the "main loop" which I believe should answer your questions about the token management, but if you'd like me to clarify anything please just let me know!

@bors
Copy link
Contributor

bors commented Jun 18, 2017

☔ The latest upstream changes (presumably #42676) made this pull request unmergeable. Please resolve the merge conflicts.

// manner we can ensure that the maximum number of parallel workers is
// capped at any one point in time.
//
// The jobserver protocol is a little unique, however. We, as a running
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because concurrent programming isn't complicated enough by itself already 😛

@michaelwoerister
Copy link
Member

Thanks for the clarifying comment about the jobserver protocol!

r=me once the merge conflict is fixed.

@alexcrichton
Copy link
Member Author

@bors: r=michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 19, 2017

📌 Commit 5d00e5e has been approved by michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 20, 2017

⌛ Testing commit 5d00e5ea2892d376aa18f8db2e4db435e097a81a with merge 6cb3b992db1aa5c9952c187dca9b30ec0c3d98d4...

@bors
Copy link
Contributor

bors commented Jun 20, 2017

💔 Test failed - status-appveyor

@alexcrichton
Copy link
Member Author

@bors: r=michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 20, 2017

📌 Commit a014634 has been approved by michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 20, 2017

⌛ Testing commit a014634c1a0ee939207dcd1e8d64dfcd0ebec586 with merge 016496955016d5d75d27180c10a158aad0083c8d...

@arielb1 arielb1 added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 20, 2017
@bors
Copy link
Contributor

bors commented Jun 20, 2017

💔 Test failed - status-appveyor

@alexcrichton
Copy link
Member Author

@bors: r=michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 21, 2017

📌 Commit 451d392 has been approved by michaelwoerister

This commit integrates the `jobserver` crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:

* Primarily the compiler can cooperate with Cargo on parallelism. When you run
  `cargo build -j4` then this'll make sure that the entire build process between
  Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
  instances which may all try to spawn lots of threads.

* Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver.
  This means that if you call cargo/rustc from `make` or another
  jobserver-compatible implementation it'll use foreign parallelism settings
  instead of creating new ones locally.

As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!

Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
@alexcrichton
Copy link
Member Author

@bors: r=michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 21, 2017

📌 Commit 201f069 has been approved by michaelwoerister

@bors
Copy link
Contributor

bors commented Jun 21, 2017

⌛ Testing commit 201f069 with merge 694adee...

bors added a commit that referenced this pull request Jun 21, 2017
Integrate jobserver support to parallel codegen

This commit integrates the `jobserver` crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:

* Primarily the compiler can cooperate with Cargo on parallelism. When you run
  `cargo build -j4` then this'll make sure that the entire build process between
  Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
  instances which may all try to spawn lots of threads.

* Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver.
  This means that if you call cargo/rustc from `make` or another
  jobserver-compatible implementation it'll use foreign parallelism settings
  instead of creating new ones locally.

As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!

Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
@bors
Copy link
Contributor

bors commented Jun 21, 2017

💔 Test failed - status-travis

@alexcrichton
Copy link
Member Author

alexcrichton commented Jun 21, 2017 via email

@bors
Copy link
Contributor

bors commented Jun 22, 2017

⌛ Testing commit 201f069 with merge 80271e8...

bors added a commit that referenced this pull request Jun 22, 2017
Integrate jobserver support to parallel codegen

This commit integrates the `jobserver` crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:

* Primarily the compiler can cooperate with Cargo on parallelism. When you run
  `cargo build -j4` then this'll make sure that the entire build process between
  Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
  instances which may all try to spawn lots of threads.

* Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver.
  This means that if you call cargo/rustc from `make` or another
  jobserver-compatible implementation it'll use foreign parallelism settings
  instead of creating new ones locally.

As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!

Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
@bors
Copy link
Contributor

bors commented Jun 22, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing 80271e8 to master...

@jdm
Copy link
Contributor

jdm commented Jun 28, 2017

So does this only support invoking cargo/rustc from make, but the behaviour of invoking make from a Cargo build script is unchanged?

@alexcrichton
Copy link
Member Author

@jdm it's a little more nuanced than that. Cargo also creates a jobserver in addition to consuming one, meaning that rustc will basically always use that jobserver now. If Cargo inherits a jobserver though then rustc likely will too.

You need to tweak makefiles calling rustc/cargo though to actually let them inherit the jobserver, notably adding a + to the beginning of the rule definition.

For build scripts invoking make the make subprocess will inherit Cargo's jobserver if no -j argument is passed, but if -jN is passed then that'll override the inherited jobserver.

jonhoo added a commit to mit-pdos/noria that referenced this pull request Oct 10, 2017
This should significantly speed up debug and test builds + cargo check.
With rust-lang/rust#42682, cargo/rustc no longer
spawns lots and lots of workers even when called recursively. Still not
enabled by default in release mode:

https://internals.rust-lang.org/t/help-test-out-thinlto/6017
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 16, 2020
…atsakis

Remove some `ignore-stage1` annotations.

These tests appear to no longer need the `ignore-stage1` marker.

- `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility.

- `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed.
  - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 16, 2020
…atsakis

Remove some `ignore-stage1` annotations.

These tests appear to no longer need the `ignore-stage1` marker.

- `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility.

- `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed.
  - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 16, 2020
…atsakis

Remove some `ignore-stage1` annotations.

These tests appear to no longer need the `ignore-stage1` marker.

- `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility.

- `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed.
  - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
Manishearth added a commit to Manishearth/rust that referenced this pull request Jul 16, 2020
…atsakis

Remove some `ignore-stage1` annotations.

These tests appear to no longer need the `ignore-stage1` marker.

- `run-make-fulldeps/issue-37839` and `run-make-fulldeps/issue-37893`: I believe these were due to the use of proc-macros, and probably were just missed in rust-lang#49219 which fixed the proc-macro compatibility.

- `compile-fail/asm-src-loc-codegen-units.rs`: This was due to an old issue with landing pads (as mentioned in the linked issue rust-lang#20184). `-Zno-landing-pads` was an option when building the first stage (it was much faster), but somewhere along the way (I think the switch from makefiles to rustbuild), the option was removed.
  - NOTE: This test doesn't actually test what it was originally written for, and is probably mostly pointless now. This test was asserting the message "build without -C codegen-units for more exact errors", but that was removed in rust-lang#42682. It is now in essence identical to `asm-src-loc.rs`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants