Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Globally Locked Cargo #1781

Closed
wants to merge 11 commits into from

Conversation

Projects
None yet
4 participants
@Byron
Copy link
Contributor

Byron commented Jul 4, 2015

This PR is the successor of 'Concurrent Cargo' and uses a single lock to assure multiple cargo invocations will not interfere with each other in an undefined fashion.

Implementation Details

The file-lock crate provides the actual lock implementation. It allows to try obtaining a lock (non-blocking) or to wait until the lock was obtained. All this is done using standard operating system facilities which are available on both windows and posix-compatible systems.

The lock-file is persistent and is currently placed in ${CARGO_HOME}/.global-lock. Please note it s not comparable to the locking mechanism git uses, as the latter doesn't support blocking until a lock is obtained.

Using the cargo configuration and the key build.lock-kind, it is possible to change the default from nowait to wait and thus cause Cargo processes to wait for each other.

Open Questions

In order of perceived relevance:

  • How can I implement a test-case which doesn't suffer from races ?
    • The preliminary test only works occasionally, and for some reason the spawned background process starts after the foreground process and thus fails to obtain the lock. Adding sleep_ms seems like the wrong thing to do.
  • Should the lock better be obtained in cargo::process(...) ?
  • Which name would you prefer for the lock file ?
  • Which configuration key would you prefer ? The current one seems not to be too fitting.

Testing

Manual tests indicate the system works as expected.

$ git clone https://github.com/Byron/google-apis-rs
$ cd google-apis-rs
# Build 4 targets with plenty of dependencies in parallel. Depending on the `build.jobs` variable 
# in `.cargo.config`, there is more or less intra-process contention.
$ CARGO_HOME=$PWD/cargo_home make -j4 groupsmigration1-cli-cargo discovery1-cli-cargo translate2-cli-cargo audit1-cli-cargo ARGS=build

The .cargo/config contained in the aforementioned repository is used to configure cargo. Currently it looks like this:

[build]
target-dir = "target"
lock-kind = "wait"

Work still to be done

  • Tests for both, blocking and nonblocking, lock options
  • Support for the windows platform.
    • The latter is achieved through improvements to the file-lock crate and should be transparent to cargo.

Byron added some commits Jul 4, 2015

Added CargoLock implementation
It was previously used in the `Concurrent Cargo`
[PR](https://goo.gl/pcUvVH).
Lock every primary cargo command
The lock is obtained in a section shared by all cargo sub-command
invocations. By default, failure to obtain a lock will result in
gracefully aborting the operation.

However, it is possible to configure it to wait until a lock can be
obtained.
Added test for global cargo lock
However, it suffers from a race condition that make it succeed only
occasionally.

The lock code was moved into cargo's main execute function, which
should be better as the lock will be obtained even sooner after
cargo comes up.

Removed some special-case code which made the lock work better in a
multi-threaded environment, which now is not needed anymore due to
the changed lock granularity. This also allows to remove
`errno` crate.
@rust-highfive

This comment has been minimized.

Copy link

rust-highfive commented Jul 4, 2015

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@Byron Byron referenced this pull request Jul 5, 2015

Closed

Concurrent Builds #122

@@ -93,6 +94,26 @@ dependencies = [
]

[[package]]
name = "errno"

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

I think this lockfile may need to be regenerated.

This comment has been minimized.

@Byron

Byron Jul 8, 2015

Author Contributor

Regenerating it does not remove the errno entry - it just updates dependencies to the latest patch-levels. I am not sure if this is what you intended.

This comment has been minimized.

@alexcrichton

alexcrichton Jul 8, 2015

Member

Oh right, duh! Forgot this was a dependency of file-lock.

That being said I believe that this crate doesn't provide anything beyond io::Error::last_os_error() and io::Error::raw_os_error(), so I'd prefer if the file-lock crate were adjusted to not have this dependency.

let fg_process = bg_process.clone();
let mut bg_process_handle = bg_process.build_command().spawn().unwrap();

assert_eq!(fg_process.build_command().output().unwrap().status.code().unwrap(), 101);

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

This test looks like it runs a risk of being flaky because of various bits of timing here and there, and I think it'd also be useful to match the stdout/stderr here as well.

This comment has been minimized.

@Byron

Byron Jul 6, 2015

Author Contributor

Yes, this is also why it doesn't work most of the time. I am not sure how to make it work without having to make the two processes communicate (i.e. master waits for child to come up).

Maybe I could launch the cargo implementation from within the test-case directly (without a sub-process), and then spawn the test-executable. That would give me more control, and is similar to the working tests in the file-lock crate.

This comment has been minimized.

@alexcrichton

alexcrichton Jul 7, 2015

Member

Hm yeah, that sounds like a good idea!

fn from(t: ParseError) -> Self {
Box::new(t)
}
}

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

Could you add these to src/cargo/util/errors.rs? That's currently where these conversions and trait implementations are housed.

Cargo.toml Outdated
@@ -36,6 +36,10 @@ toml = "0.1"
url = "0.2"
winapi = "0.1"

[dependencies.file-lock]
git = "https://github.com/Byron/file-lock"
branch = "cargo-adjustments"

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

Before merging I'd prefer to depend on a crates.io version of this library

This comment has been minimized.

@Byron

Byron Jul 6, 2015

Author Contributor

Absolutely - my preferred path of action is to get the cargo-related integration right, then port file-lock to windows, and see if cargo indeed works and builds on all supported platforms. Probably you can ask bors to do that, as travis can't test windows for instance.

Once it does work everywhere, a new crates.io release of the original crate should be made (I just have a fork).

This comment has been minimized.

@alexcrichton

alexcrichton Jul 7, 2015

Member

I'll see if I can't set up appveyor CI for cargo, that should give us some Windows coverage for PRs at least.

const CONFIG_KEY: &'static str = "build.lock-kind";
match try!(config.get_string(CONFIG_KEY)).map(|t| t.0) {
None => Ok(LockKind::NonBlocking),
Some(kind_string) => match kind_string.parse() {

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

I'm a little uneasy putting this parsing of configuration into an external crate, it seems like it's easy to make a change upstream which looks like it's not a breaking change but turns out to actually be a breaking change.

This comment has been minimized.

@Byron

Byron Jul 6, 2015

Author Contributor

I'm a little uneasy putting this parsing of configuration into an external crate [...]

Do you mean external module ? Should I put the configuration handling into the Config type ?

This comment has been minimized.

@alexcrichton

alexcrichton Jul 7, 2015

Member

In this case the FromStr implementation is located in the file_lock crate, and I would expect the actual parsing of a value to happen in this crate. If we're just gonna go with "always wait" though this isn't so bad.

}

pub fn lock(&mut self) -> CargoResult<()> {
// NOTE(ST): This could fail if cargo is run concurrently for the first time

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

This "NOTE(ST)" and "TODO(ST)" isn't currently present throughout the rest of Cargo, can you remove these annotations?

return Err(caused_human(format!("Failed to create parent directory of \
lock-file at '{}'",
lock_dir.display()), io_err));
}

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

This can be replaced with try! + chain_error


pub fn lock_kind(config: &Config) -> CargoResult<LockKind> {
// TODO(ST): rename config key to something more suitable
const CONFIG_KEY: &'static str = "build.lock-kind";

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

I think it's fine to avoid this for now and just pick a reasonable default behavior

This comment has been minimized.

@Byron

Byron Jul 6, 2015

Author Contributor

Do you mean cargo should just always abort, or always wait, without an option to configure it ?

This comment has been minimized.

@alexcrichton

alexcrichton Jul 7, 2015

Member

For now yeah, and I'd probably choose "always wait" over "always abort"

try!(CargoLock::lock_kind(config)));

try!(fl.lock().chain_error(|| {
human("Failed to obtain global cargo lock")

This comment has been minimized.

@alexcrichton

alexcrichton Jul 6, 2015

Member

I think that we'll want to improve this error message slightly, for example how can this come up? If it comes up because another Cargo is running, then it should probably say that, but if it just failed because of an I/O error we may also want to say that.

This comment has been minimized.

@Byron

Byron Jul 6, 2015

Author Contributor

I think that we'll want to improve this error message slightly, for example how can this come up?

Good idea. Also to check for IOErrors specifically. These shouldn't happen though, and will be provided as cause as FileLock errors are declared human right now.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 6, 2015

I think that to start off with we can just pick a reasonable default (either wait or don't wait), and in the case of waiting I also think that a message needs to be printed to the effect of "we're now waiting for some other Cargo to exit"

@Byron

This comment has been minimized.

Copy link
Contributor Author

Byron commented Jul 6, 2015

[...] and in the case of waiting I also think that a message needs to be printed to the effect of "we're now waiting for some other Cargo to exit"

Unfortunately this doesn't work, as we don't know that we are waiting in case we do so. We can only tell we didn't get a lock if we tried to get a lock first, then print "we are waiting" and then wait on the lock. However, that would clearly be a race.

So far I have the impression you favour to not make the handling configurable, which to my mind would restrict the usefulness of the implementation. After all, both behaviours, wait and nowait have their benefits, and the file-lock crate can handle both.

Maybe I just misunderstood you in this regard.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 7, 2015

However, that would clearly be a race.

Could you elaborate on the race here? I'm not quite sure what the problem would be in this case.

So far I have the impression you favour to not make the handling configurable, which to my mind would restrict the usefulness of the implementation.

Yeah I think it's fine to perhaps expand into this in the future, but I'd prefer to get some experience with the current implementation before getting more ambitious, and I figure that waiting-by-default is a reasonable way to start out.

@Byron

This comment has been minimized.

Copy link
Contributor Author

Byron commented Jul 7, 2015

Thanks for the replies, I believe I know enough now to prepare everything for a next review. I will let you know once I think I addressed all issues.

Good luck with getting a window CI environment ready ! I wonder why it's not possible to build cargo along with rustc, which certainly is already tested on all supported platforms. After all, cargo is bundled with rustc and expected to run equally well on all supported platforms.

Could you elaborate on the race here? I'm not quite sure what the problem would be in this case.

A non-waiting (try) lock followed by blocking lock in case try-lock failed would cause cargo to print "waiting for other process". That other process could, while we are printing this for example, drop the lock, and we get the lock without blocking at all. Two sys-calls in short succession are always a race, to my mind. One could argue that we are talking about a fraction of a second, and even if there is a race, the worst thing that could happen is a message printed for no real reason.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 7, 2015

That other process could, while we are printing this for example, drop the lock, and we get the lock without blocking at all.

This doesn't sound like a race condition, just something that can happen? There's no harm in doing this and if Cargo continues quickly then even better!

Byron added some commits Jul 8, 2015

CargoLock LockKind is now specified per lock call
* removed access to configuration, and will instead default to
  blocking semantics. This caused a change in the API of the
  `CargoLock` type, that generally reduced complexity.
Moved LockError related code to errors.rs
That way, all error related code can be found in one module, and isn't
sprinkled all over the place.
Improved error handling in CargoLock.lock()
* just fail if the directory creation fails. This could happen if there
  is a race, but such a race would only possibly occour if cargo
  has never been invoked before on a particular CARGO_HOME
* use try! + chain_error + human instead of more complex error handling
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jul 8, 2015

Ah and I've now set up AppVeyor CI for Cargo so when this PR is rebased I think it'll trigger a new CI build on Windows.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jul 29, 2015

☔️ The latest upstream changes (presumably #1860) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Aug 4, 2015

☔️ The latest upstream changes (presumably #1830) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Aug 17, 2015

☔️ The latest upstream changes (presumably #1885) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Oct 6, 2015

☔️ The latest upstream changes (presumably #2022) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Oct 21, 2015

☔️ The latest upstream changes (presumably #2061) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Nov 19, 2015

☔️ The latest upstream changes (presumably #2154) made this pull request unmergeable. Please resolve the merge conflicts.

@Byron

This comment has been minimized.

Copy link
Contributor Author

Byron commented Dec 7, 2015

Thanks for the time invested into this PR. However, I have to face the bitter truth that I won't be the one finishing this.

The latest attempt to bring it up-to-date with master revealed that libc has moved or renamed constants used by the file-lock crate, which now would need additional work - something inevitable considering the required windows port.

Therefore I believe it's best to close the issue to prevent it from being dragged along into 2016.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.