Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo should release Make jobserver token when it's waiting for file lock #6747

Closed
glandium opened this Issue Mar 14, 2019 · 4 comments

Comments

Projects
None yet
3 participants
@glandium
Copy link

glandium commented Mar 14, 2019

Consider the following Makefile:

default: first second third

.PHONY: first second
first second:
	+cargo build -p $@ --verbose

third:
	echo start $@
	sleep 1
	echo end $@

And the directory containing the Makefile contains the following Cargo.toml:

[workspace]
members = [
  "first",
  "second",
]

first and second are just created with cargo new, with an additional build.rs each that contains the following:

use std::{thread, time};

fn main() {
  thread::sleep(time::Duration::from_secs(2));
}

Now, run the build with make -j2.

The timeline of what happens looks like the following:

  • 0s: make executes the two cargo commands.
  • 0s: first cargo command invokes rustc for build.rs and runs build_script_build ; second cargo command prints "Blocking waiting for file lock on build directory"
  • 2s: first cargo finishes running build_script_build, builds the rest of the first crate
  • 2s: second cargo command invokes rustc for build.rs and runs build_script_build
  • 2s: make executes echo start third then sleep 2
  • 4s: second cargo finishes running build_script_build, builds the rest of the second crate
  • 4s: make executes echo end third.

Obviously, this is all synthetic, and cargo releasing the make jobserver token as proposed in the summary would not change the outcome in this example, being that the build would still take 4 seconds. But what it would allow is for make to start running the third target earlier, and that could make a hell of a difference.

The real world manifestation is https://bugzilla.mozilla.org/show_bug.cgi?id=1533988 which is caused by the Firefox build system invoking three cargo commands in the same workspace, two of which are "Blocking waiting for file lock on build directory", and the one that does work is actually likely to not be using all the cores available because of how a lot of the dependency chain of the crates built is linear. So the two cargos doing nothing are preventing more C++ compiler processes working while the first cargo works.

Cc: @alexcrichton

alexcrichton added a commit to alexcrichton/jobserver-rs that referenced this issue Mar 14, 2019

alexcrichton added a commit to alexcrichton/cargo that referenced this issue Mar 14, 2019

Release a jobserver token while locking a file
This is a possible solution to rust-lang#6747, but we'll ideally get some testing
in before landing!
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 14, 2019

Oh wow, a fascinating find!

@glandium would you be able to test out #6748 to see if it fixes the issue you've found here?

alexcrichton added a commit to alexcrichton/jobserver-rs that referenced this issue Mar 14, 2019

@glandium

This comment has been minimized.

Copy link
Author

glandium commented Mar 14, 2019

It seems to work, at least with the small testcase from this report and a few other small testcases I derived from it.

@upsuper

This comment has been minimized.

Copy link
Contributor

upsuper commented Mar 15, 2019

@glandium pinged me on the original bug, so I tried to test that PR on the exactly same environment (a GCP n1-highcpu-8 instance). In conclusion, that patch helps CPU utilization a lot, although it doesn't seem to help on total build time of Firefox at all (which is a separate issue related to Firefox's build system I guess).

This is the CPU usage graph building Firefox on this instance with an unpatched Cargo:
Screenshot_2019-03-15 Compute Engine - Building Firefox - Google Cloud Platform

This is the CPU usage graph with an patched Cargo:
Screenshot_2019-03-16 Compute Engine - Building Firefox - Google Cloud Platform

As can be seen that with the patched Cargo, CPU utilization is keep being 100% for most of the time, while previously much less.

alexcrichton added a commit to alexcrichton/cargo that referenced this issue Mar 15, 2019

Release a jobserver token while locking a file
This is intended to fix rust-lang#6747 where multiple Cargos invoked with the
same jobserver would all have their own token but not actually run
concurrently due to file locking. Instead the fix is that whenever Cargo
blocks for a file lock with a configured global jobserver, a token is
released just before we block and then reacquired afterwards. This way
we should ensure that we're not hogging a cpu/token unnecessarily
without doing any work!

Closes rust-lang#6747
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 15, 2019

Thanks for investigating @glandium and @upsuper, those are definitely some convincing graphs :)

bors added a commit that referenced this issue Mar 15, 2019

Auto merge of #6748 - alexcrichton:release-then-acquire, r=ehuss
Release a jobserver token while locking a file

This is a possible solution to #6747, but we'll ideally get some testing
in before landing!

bors added a commit that referenced this issue Mar 15, 2019

Auto merge of #6748 - alexcrichton:release-then-acquire, r=ehuss
Release a jobserver token while locking a file

This is a possible solution to #6747, but we'll ideally get some testing
in before landing!

@bors bors closed this in d19b41f Mar 15, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.