Integrate jobserver support to parallel codegen #42682

Merged
merged 1 commit into from Jun 22, 2017

Member

alexcrichton commented Jun 15, 2017

This commit integrates the jobserver crate into the compiler. The crate was
previously integrated into Cargo as part of rust-lang/cargo#4110. The purpose
here is two-fold:

  • Primarily the compiler can cooperate with Cargo on parallelism. When you run
    cargo build -j4 then this'll make sure that the entire build process between
    Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
    instances which may all try to spawn lots of threads.

  • Secondarily rustc/Cargo can now integrate with a foreign GNU make jobserver.
    This means that if you call cargo/rustc from make or another
    jobserver-compatible implementation it'll use foreign parallelism settings
    instead of creating new ones locally.

As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrency concerns in the
bud by having all the tools cooperate!

Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
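
The cooperation described above comes down to a shared pool of tokens: a worker may only run while it holds one. Below is a minimal in-process sketch of that idea using only the standard library. `TokenPool`, `Token`, and `peak_concurrency` are hypothetical names for illustration; this is not the `jobserver` crate's API, which shares tokens across processes rather than threads.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical in-process stand-in for a jobserver: a counting semaphore.
struct TokenPool {
    available: Mutex<usize>,
    cond: Condvar,
}

/// A held token; dropping it returns the token to the pool.
struct Token {
    pool: Arc<TokenPool>,
}

impl TokenPool {
    fn new(limit: usize) -> Arc<TokenPool> {
        Arc::new(TokenPool { available: Mutex::new(limit), cond: Condvar::new() })
    }

    /// Blocks until a token is free, then takes it.
    fn acquire(pool: &Arc<TokenPool>) -> Token {
        let mut n = pool.available.lock().unwrap();
        while *n == 0 {
            n = pool.cond.wait(n).unwrap();
        }
        *n -= 1;
        Token { pool: Arc::clone(pool) }
    }
}

impl Drop for Token {
    fn drop(&mut self) {
        *self.pool.available.lock().unwrap() += 1;
        self.pool.cond.notify_one();
    }
}

/// Runs `jobs` workers under `limit` tokens and returns the peak
/// number of workers that were active at the same time.
fn peak_concurrency(limit: usize, jobs: usize) -> usize {
    let pool = TokenPool::new(limit);
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let (pool, running, peak) = (pool.clone(), running.clone(), peak.clone());
            thread::spawn(move || {
                let _token = TokenPool::acquire(&pool);
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::yield_now(); // stand-in for real work
                running.fetch_sub(1, Ordering::SeqCst);
                // `_token` is dropped at the end of the closure, freeing the slot.
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}

fn main() {
    let peak = peak_concurrency(4, 16);
    assert!(peak <= 4);
    println!("peak concurrency: {}", peak);
}
```

However many workers are spawned, the observed peak never exceeds the token limit, which is the guarantee `cargo build -j4` is meant to give across Cargo and rustc together.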

Collaborator

rust-highfive commented Jun 15, 2017

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

Member Author

alexcrichton commented Jun 15, 2017

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch 3 times, most recently from c765aad to 4e8e13a Jun 15, 2017

Contributor

michaelwoerister left a comment

Very nice! I'm excited about this :)

Regarding the implementation, I'm not quite clear on how token handling works there. Wouldn't it be easier to just move one token into each spawn_work and let it go out of scope there?

@@ -82,16 +84,11 @@ pub fn run(sess: &session::Session,
// For each of our upstream dependencies, find the corresponding rlib and
// load the bitcode from the archive. Then merge it into the current LLVM
// module that we've got.
link::each_linked_rlib(sess, &mut |cnum, path| {
    // `#![no_builtins]` crates don't participate in LTO.
    if sess.cstore.is_no_builtins(cnum) {

michaelwoerister Jun 16, 2017

Contributor

Did you remove this on purpose?

michaelwoerister Jun 16, 2017

Contributor

I see, it's added in later again.

alexcrichton Jun 16, 2017

Author Member

Yeah this query just ended up having a lot of dependencies on sess so I figured it'd be best to move it way up to the beginning instead of only running it back here.

execute_work_item(&cgcx, work);
let mut tokens = Vec::new();
let mut running = 0;
while work_items.len() > 0 || running > 0 {

michaelwoerister Jun 16, 2017

Contributor

Could you add a comment here saying something to the effect of "This is our 'main loop', taking care of spawning worker threads and communicating with live ones via message passing -- so we have to keep it running as long as there's still work that hasn't been doled out to a worker (work_items > 0) or if there are still live workers to be communicated with (running > 0)."
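
A rough sketch of the kind of main loop described in this comment, using only the standard library. The names `Message::Done`, `spawn_work`, `run_all`, and `max_parallel` are assumptions made for illustration and are not taken from the PR, and the "work" is a trivial stand-in for codegen.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical message type sent from workers back to the coordinator.
enum Message {
    Done(u32),
}

fn spawn_work(tx: Sender<Message>, item: u32) {
    thread::spawn(move || {
        // Stand-in for real codegen work.
        let result = item * 2;
        // Ignore send errors: the coordinator may already be gone.
        drop(tx.send(Message::Done(result)));
    });
}

fn run_all(mut work_items: Vec<u32>, max_parallel: usize) -> Vec<u32> {
    assert!(max_parallel > 0, "at least one worker must be allowed");
    let (tx, rx) = channel();
    let mut results = Vec::new();
    let mut running = 0;

    // Main loop: keep going while there is work that has not been handed
    // out yet, or while live workers may still send us messages.
    while !work_items.is_empty() || running > 0 {
        // Spawn as many workers as the limit allows.
        while running < max_parallel {
            match work_items.pop() {
                Some(item) => {
                    spawn_work(tx.clone(), item);
                    running += 1;
                }
                None => break,
            }
        }
        // Block until some worker reports back.
        match rx.recv().unwrap() {
            Message::Done(value) => {
                results.push(value);
                running -= 1;
            }
        }
    }
    results.sort();
    results
}

fn main() {
    let results = run_all(vec![1, 2, 3, 4, 5], 2);
    assert_eq!(results, vec![2, 4, 6, 8, 10]);
    println!("{:?}", results);
}
```

Both halves of the loop condition matter: dropping `running > 0` would exit while workers are still live, and dropping the `work_items` check would exit before the first worker is spawned.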

scope,
tx.clone(),
work_items.pop().unwrap(),
work_items.len());

michaelwoerister Jun 16, 2017

Contributor

I'm not very fond of this: mutating work_items via pop and then taking its len. I assume that we have a defined evaluation order of function arguments, but I don't like relying on it.
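
For reference, Rust evaluates function arguments left to right, so the length is taken after the `pop`. The sketch below contrasts the implicit form with one that binds both values first. `describe`, `take_implicit`, and `take_explicit` are hypothetical helpers standing in for the real `spawn_work` call.

```rust
/// Hypothetical stand-in for `spawn_work`: reports which item it received
/// and how many items were still queued at that point.
fn describe(item: u32, remaining: usize) -> (u32, usize) {
    (item, remaining)
}

/// Implicit form: relies on arguments being evaluated left to right.
fn take_implicit(work_items: &mut Vec<u32>) -> (u32, usize) {
    describe(work_items.pop().unwrap(), work_items.len())
}

/// Explicit form: the order of the two operations is visible in the code.
fn take_explicit(work_items: &mut Vec<u32>) -> (u32, usize) {
    let item = work_items.pop().unwrap();
    let remaining = work_items.len();
    describe(item, remaining)
}

fn main() {
    let mut a = vec![10, 20, 30];
    let mut b = a.clone();
    // Both forms observe the length after the pop.
    assert_eq!(take_implicit(&mut a), (30, 2));
    assert_eq!(take_explicit(&mut b), (30, 2));
}
```

The two forms behave the same; the explicit one simply does not ask the reader to remember the evaluation rule.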

// possible. Remember that we have an ambient token available to us
// hence the `+1` here.
//
// Also note that we may actually acquire more tokens than we need, so

michaelwoerister Jun 16, 2017

Contributor

When does that happen? If we abort early because of an error?

//
// Also note that we may actually acquire more tokens than we need, so
// in that case just truncate the `tokens` list every time we pass
// through here.

michaelwoerister Jun 16, 2017

Contributor

Could you add that truncating implies dropping and thus releasing tokens?

work_items.len());
running += 1;
}
tokens.truncate(running.saturating_sub(1));

michaelwoerister Jun 16, 2017

Contributor

I'm not quite sure how this works. Can't this cause tokens to be lost without a spawn_work having been called for them?
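
A small sketch of why truncating the list releases tokens rather than losing them, assuming each token gives itself back when dropped. The `Token` type here is hypothetical and only bumps a counter on drop; `trim_tokens` is likewise an illustrative helper, not code from the PR.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical token: bumps a shared counter when dropped, standing in
/// for handing a token back to the jobserver.
struct Token {
    released: Rc<Cell<usize>>,
}

impl Drop for Token {
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

/// Holds `acquired` tokens, then keeps only what `running` workers need.
/// Returns (tokens still held, tokens released).
fn trim_tokens(acquired: usize, running: usize) -> (usize, usize) {
    let released = Rc::new(Cell::new(0));
    let mut tokens: Vec<Token> = (0..acquired)
        .map(|_| Token { released: Rc::clone(&released) })
        .collect();

    // One worker runs on the ambient token, so only `running - 1` acquired
    // tokens are needed. `truncate` drops the surplus elements right here,
    // which is what releases them.
    tokens.truncate(running.saturating_sub(1));

    (tokens.len(), released.get())
}

fn main() {
    // Five tokens acquired, three workers running: keep two, release three.
    assert_eq!(trim_tokens(5, 3), (2, 3));
    // No workers running: everything goes back.
    assert_eq!(trim_tokens(2, 0), (0, 2));
}
```

When fewer tokens are held than needed, `truncate` is a no-op, so this step never takes a token away from a worker that is entitled to one.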


// Set up a destructor which will fire off a message that we're done as
// we exit.
struct Bomb {

michaelwoerister Jun 16, 2017

Contributor

We should have something like this in libstd.
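
For readers unfamiliar with the pattern, here is a minimal sketch of such a drop "bomb": a guard whose destructor reports completion even if the worker panics. The `Message::Done { success }` shape and the `run_worker` helper are assumptions for illustration; the PR's actual types may differ.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

#[derive(Debug, PartialEq)]
enum Message {
    Done { success: bool },
}

/// Sends `Done` when dropped, whether the worker returned or panicked.
struct Bomb {
    tx: Sender<Message>,
    success: bool,
}

impl Drop for Bomb {
    fn drop(&mut self) {
        // Ignore the result: the receiver may already have gone away.
        drop(self.tx.send(Message::Done { success: self.success }));
    }
}

/// Runs one worker thread and returns the message its bomb sent.
fn run_worker(should_panic: bool) -> Message {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let mut bomb = Bomb { tx, success: false };
        if should_panic {
            panic!("worker failed");
        }
        // Only reached when the work finished without panicking.
        bomb.success = true;
    });
    // A panicking worker makes `join` return `Err`; the bomb has still fired.
    let _ = handle.join();
    rx.recv().unwrap()
}

fn main() {
    assert_eq!(run_worker(false), Message::Done { success: true });
    assert_eq!(run_worker(true), Message::Done { success: false });
}
```

The coordinator therefore always receives exactly one `Done` per worker, so its count of running workers cannot get stuck above zero. This relies on panics unwinding, which is the default.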

if sess.cstore.is_no_builtins(cnum) {
    return
}
each_linked_rlib.push((cnum, path.to_path_buf()));

michaelwoerister Jun 16, 2017

Contributor

If the each_linked_rlib field is LTO-specific, we should probably change the name to reflect this.

// Execute the work itself, and if it finishes successfully then flag
// ourselves as a success as well.
if execute_work_item(&cgcx, work).is_err() {
    drop(cgcx.tx.send(Message::AbortIfErrors));

michaelwoerister Jun 16, 2017

Contributor

One could argue that it would be cleaner to also mem::forget the bomb in this case.

alexcrichton Jun 16, 2017

Author Member

Yeah I wasn't quite sure how this should be handled, I think that if you see a FatalError then a diagnostic has already been sent off, which in turn already sent AbortIfErrors. In that sense it may be fruitless to send another message here, so I'll just ignore the result.

// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! Scoped threads, copied from `crossbeam`

michaelwoerister Jun 16, 2017

Contributor

Could we also use crossbeam directly?

alexcrichton Jun 16, 2017

Author Member

Hm upon further inspection, I don't see why not!
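
As an illustration of what scoped threads buy you, the sketch below lets workers borrow data from the caller's stack without `Arc` or `'static` bounds. It uses `std::thread::scope`, which was added to the standard library well after this PR, so this is an analogue of the `crossbeam` scoped threads discussed here and not the code the PR uses. `parallel_sum` is a hypothetical example function.

```rust
use std::thread;

/// Sums each chunk on its own scoped thread, borrowing `data` directly.
/// `chunk_size` must be non-zero, since `chunks` panics on zero.
fn parallel_sum(data: &[u64], chunk_size: usize) -> u64 {
    thread::scope(|scope| {
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        // Every thread is joined before `scope` returns, which is what
        // makes borrowing `data` sound.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}

fn main() {
    let data = [1, 2, 3, 4, 5, 6, 7];
    assert_eq!(parallel_sum(&data, 3), 28);
    println!("sum = {}", parallel_sum(&data, 3));
}
```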

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from 4e8e13a to b364714 Jun 16, 2017

Member Author

alexcrichton commented Jun 16, 2017

Ok, updated! @michaelwoerister I added a large comment above the "main loop" which I believe should answer your questions about the token management, but if you'd like me to clarify anything please just let me know!

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from b364714 to 0f66436 Jun 17, 2017

Contributor

bors commented Jun 18, 2017

☔️ The latest upstream changes (presumably #42676) made this pull request unmergeable. Please resolve the merge conflicts.

// manner we can ensure that the maximum number of parallel workers is
// capped at any one point in time.
//
// The jobserver protocol is a little unique, however. We, as a running

michaelwoerister Jun 19, 2017

Contributor

Because concurrent programming isn't complicated enough by itself already 😛

Contributor

michaelwoerister commented Jun 19, 2017

Thanks for the clarifying comment about the jobserver protocol!

r=me once the merge conflict is fixed.

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from 0f66436 to 5d00e5e Jun 19, 2017

Member Author

alexcrichton commented Jun 19, 2017

@bors: r=michaelwoerister

Contributor

bors commented Jun 19, 2017

📌 Commit 5d00e5e has been approved by michaelwoerister

Contributor

bors commented Jun 20, 2017

⌛️ Testing commit 5d00e5e with merge 6cb3b99...

bors added a commit that referenced this pull request Jun 20, 2017

Auto merge of #42682 - alexcrichton:jobserver, r=michaelwoerister
Contributor

bors commented Jun 20, 2017

💔 Test failed - status-appveyor

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from 5d00e5e to a014634 Jun 20, 2017

Member Author

alexcrichton commented Jun 20, 2017

@bors: r=michaelwoerister

Contributor

bors commented Jun 20, 2017

📌 Commit a014634 has been approved by michaelwoerister

Contributor

bors commented Jun 20, 2017

⌛️ Testing commit a014634 with merge 0164969...

bors added a commit that referenced this pull request Jun 20, 2017

Auto merge of #42682 - alexcrichton:jobserver, r=michaelwoerister
Contributor

bors commented Jun 20, 2017

💔 Test failed - status-appveyor

Member

Mark-Simulacrum commented Jun 20, 2017

Linking failure?

[01:33:02] error: linking with `link.exe` failed: exit code: 1120
[01:33:02]   |
[01:33:02]   = note: "C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin\\amd64\\link.exe" "/NOLOGO" "/NXCOMPAT" "/LIBPATH:C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\stage2\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "
C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\test\\run-pass\\smallest-hello-world.0.o" "/OUT:C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\test\\run-pass\\smallest-hello-world.stage2-x86_64-pc-windows-msvc.exe" "/OPT:REF,ICF" "
/DEBUG" "/LIBPATH:C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\test\\run-pass" "/LIBPATH:C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\test\\run-pass\\smallest-hello-world.stage2-x86_64-pc-windows-msvc.run-pass.libaux" "/LIBPAT
H:C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\native\\rust-test-helpers" "/LIBPATH:C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\stage2\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib" "C:\\projects\\rust\\build\\x86_64-pc-windows-
msvc\\stage2\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\liballoc_system-ace597b8fd7407ce.rlib" "C:\\projects\\rust\\build\\x86_64-pc-windows-msvc\\stage2\\lib\\rustlib\\x86_64-pc-windows-msvc\\lib\\libcore-216e0f775a886879.rlib"
[01:33:02]   = note: smallest-hello-world.0.o : error LNK2019: unresolved external symbol puts referenced in function main
[01:33:02]           LINK : error LNK2001: unresolved external symbol mainCRTStartup
[01:33:02]           C:\projects\rust\build\x86_64-pc-windows-msvc\test\run-pass\smallest-hello-world.stage2-x86_64-pc-windows-msvc.exe : fatal error LNK1120: 2 unresolved externals
[01:33:02]
[01:33:02]
[01:33:02] error: aborting due to previous error(s)
[01:33:02]

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from a014634 to 2376e0d Jun 20, 2017

Member Author

alexcrichton commented Jun 20, 2017

@bors: r=michaelwoerister

Contributor

bors commented Jun 20, 2017

📌 Commit 2376e0d has been approved by michaelwoerister

Contributor

bors commented Jun 21, 2017

🔒 Merge conflict

Contributor

bors commented Jun 21, 2017

☔️ The latest upstream changes (presumably #42664) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from 2376e0d to 451d392 Jun 21, 2017

Member Author

alexcrichton commented Jun 21, 2017

@bors: r=michaelwoerister

Contributor

bors commented Jun 21, 2017

📌 Commit 451d392 has been approved by michaelwoerister

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from 451d392 to 46dc6da Jun 21, 2017

Integrate jobserver support to parallel codegen

@alexcrichton alexcrichton force-pushed the alexcrichton:jobserver branch from 46dc6da to 201f069 Jun 21, 2017

Member Author

alexcrichton commented Jun 21, 2017

@bors: r=michaelwoerister

Contributor

bors commented Jun 21, 2017

📌 Commit 201f069 has been approved by michaelwoerister

Contributor

bors commented Jun 21, 2017

⌛️ Testing commit 201f069 with merge 694adee...

bors added a commit that referenced this pull request Jun 21, 2017

Auto merge of #42682 - alexcrichton:jobserver, r=michaelwoerister
Contributor

bors commented Jun 21, 2017

💔 Test failed - status-travis

Member Author

alexcrichton commented Jun 21, 2017

Contributor

bors commented Jun 22, 2017

⌛️ Testing commit 201f069 with merge 80271e8...

bors added a commit that referenced this pull request Jun 22, 2017

Auto merge of #42682 - alexcrichton:jobserver, r=michaelwoerister
Contributor

bors commented Jun 22, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing 80271e8 to master...

@bors bors merged commit 201f069 into rust-lang:master Jun 22, 2017

2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
homu: Test successful

@alexcrichton alexcrichton deleted the alexcrichton:jobserver branch Jun 22, 2017

Contributor

jdm commented Jun 28, 2017

So does this only support invoking cargo/rustc from make, but the behaviour of invoking make from a Cargo build script is unchanged?

Member Author

alexcrichton commented Jun 30, 2017

@jdm it's a little more nuanced than that. Cargo also creates a jobserver in addition to consuming one, meaning that rustc will basically always use that jobserver now. If Cargo inherits a jobserver though then rustc likely will too.

You need to tweak makefiles calling rustc/cargo though to actually let them inherit the jobserver, notably adding a + to the beginning of the rule definition.

For build scripts invoking make the make subprocess will inherit Cargo's jobserver if no -j argument is passed, but if -jN is passed then that'll override the inherited jobserver.
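
As background for the inheritance described above: GNU make advertises its jobserver to child processes through the `MAKEFLAGS` environment variable, using `--jobserver-fds=R,W` in older releases and `--jobserver-auth=R,W` in newer ones. The sketch below only parses that descriptor pair. `jobserver_fds` is a hypothetical helper, it is not how Cargo or the `jobserver` crate is implemented, and it ignores other forms such as named pipes.

```rust
/// Extracts the (read, write) file descriptor pair that GNU make advertises
/// in `MAKEFLAGS`, if any. Only the `R,W` form is handled in this sketch.
fn jobserver_fds(makeflags: &str) -> Option<(i32, i32)> {
    for arg in makeflags.split_whitespace() {
        let value = arg
            .strip_prefix("--jobserver-auth=")
            .or_else(|| arg.strip_prefix("--jobserver-fds="));
        if let Some(value) = value {
            let mut parts = value.splitn(2, ',');
            let read: i32 = parts.next()?.parse().ok()?;
            let write: i32 = parts.next()?.parse().ok()?;
            return Some((read, write));
        }
    }
    None
}

fn main() {
    let flags = std::env::var("MAKEFLAGS").unwrap_or_default();
    match jobserver_fds(&flags) {
        Some((r, w)) => println!("inherited jobserver on fds {} and {}", r, w),
        None => println!("no jobserver inherited"),
    }
}
```

Finding the flag is not enough on its own: the descriptors are only usable in the child when make treats the rule as a recursive invocation, which is what the `+` prefix mentioned above requests.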

jonhoo added a commit to mit-pdos/noria that referenced this pull request Oct 10, 2017

Allow cargo to compile with >1 core
This should significantly speed up debug and test builds + cargo check.
With rust-lang/rust#42682, cargo/rustc no longer
spawns lots and lots of workers even when called recursively. Still not
enabled by default in release mode:

https://internals.rust-lang.org/t/help-test-out-thinlto/6017