New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix running Cargo concurrently #2486

Merged
merged 1 commit into from Mar 17, 2016

Conversation

Projects
None yet
5 participants
@alexcrichton
Copy link
Member

alexcrichton commented Mar 15, 2016

Cargo has historically had no protections against running it concurrently. This
is pretty unfortunate, however, as it essentially just means that you can only
run one instance of Cargo at a time globally on a system.

An "easy solution" to this would be the use of file locks, except they need to
be applied judiciously. It'd be a pretty bad experience to just lock the entire
system globally for Cargo (although it would work), but otherwise Cargo must be
principled how it accesses the filesystem to ensure that locks are properly
held. This commit intends to solve all of these problems.

A new utility module is added to cargo, util::flock, which contains two types:

  • FileLock - a locked version of a File. This RAII guard will unlock the
    lock on Drop and I/O can be performed through this object. The actual
    underlying Path can be read from this object as well.
  • Filesystem - an unlocked representation of a Path. There is no "safe"
    method to access the underlying path without locking a file on the filesystem
    first.

Built on the fs2 library, these locks use the flock system call on Unix and
LockFileEx on Windows. Although file locking on Unix is documented as not so
great
, but largely only because of NFS, these are just advisory, and
there's no byte-range locking. These issues don't necessarily plague Cargo,
however, so we should try to leverage them. On both Windows and Unix the file
locks are released when the underlying OS handle is closed, which means that
if the process dies the locks are released.

Cargo has a number of global resources which it now needs to lock, and the
strategy is done in a fairly straightforward way:

  • Each registry's index contains one lock (a dotfile in the index). Updating the
    index requires a read/write lock while reading the index requires a shared
    lock. This should allow each process to ensure a registry update happens while
    not blocking out others for an unnecessarily long time. Additionally any
    number of processes can read the index.
  • When downloading crates, each downloaded crate is individually locked. A lock
    for the downloaded crate implies a lock on the output directory as well.
    Because downloaded crates are immutable, once the downloaded directory exists
    the lock is no longer needed as it won't be modified, so it can be released.
    This granularity of locking allows multiple Cargo instances to download
    dependencies in parallel.
  • Git repositories have separate locks for the database and for the project
    checkout. The datbase and checkout are locked for read/write access when an
    update is performed, and the lock of the checkout is held for the entire
    lifetime of the git source. This is done to ensure that any other Cargo
    processes must wait while we use the git repository. Unfortunately there's
    just not that much parallelism here.
  • Binaries managed by cargo install are locked by the local metadata file that
    Cargo manages. This is relatively straightforward.
  • The actual artifact output directory is just globally locked for the entire
    build. It's hypothesized that running Cargo concurrently in one directory is
    less of a feature needed rather than running multiple instances of Cargo
    globally (for now at least). It would be possible to have finer grained
    locking here, but that can likely be deferred to a future PR.

So with all of this infrastructure in place, Cargo is now ready to grab some
locks and ensure that you can call it concurrently anywhere at any time and
everything always works out as one might expect.

One interesting question, however, is what does Cargo do on contention? On one
hand Cargo could immediately abort, but this would lead to a pretty poor UI as
any Cargo process on the system could kick out any other. Instead this PR takes
a more nuanced approach.

  • First, all locks are attempted to be acquired (a "try lock"). If this
    succeeds, we're done.
  • Next, Cargo prints a message to the console that it's going to block waiting
    for a lock. This is done because it's indeterminate how long Cargo will wait
    for the lock to become available, and most long-lasting operations in Cargo
    have a message printed for them.
  • Finally, a blocking acquisition of the lock is issued and we wait for it to
    become available.

So all in all this should help Cargo fix any future concurrency bugs with file
locking in a principled fashion while also allowing concurrent Cargo processes
to proceed reasonably across the system.

Closes #354

@rust-highfive

This comment has been minimized.

Copy link

rust-highfive commented Mar 15, 2016

r? @wycats

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 15, 2016

r? @brson

@rust-highfive rust-highfive assigned brson and unassigned wycats Mar 15, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 15, 2016

cc #1764 and #1781, previous attempts to do the same thing

})
}

fn create_dir_all(path: &Path) -> io::Result<()> {

This comment has been minimized.

@brson

brson Mar 16, 2016

Contributor

Why does this module have it's own function for this instead of using std?

This comment has been minimized.

@alexcrichton

alexcrichton Mar 16, 2016

Member

The version in the standard library doesn't currently handle the NotFound error as it's not intended to be run concurrently (which I have personally advocated for). This is basically the concurrent variant of those functions.

Maybe we should consider adding it to libstd at some point...

@brson

This comment has been minimized.

Copy link
Contributor

brson commented Mar 16, 2016

Hard to evaluate how correct it is but it looks awesome. r=me

@alexcrichton alexcrichton force-pushed the alexcrichton:flock branch 3 times, most recently from 97c6a31 to e3ffa0c Mar 16, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 17, 2016

@bors: r=brson

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 17, 2016

📌 Commit e3ffa0c has been approved by brson

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 17, 2016

☔️ The latest upstream changes (presumably #2241) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton alexcrichton force-pushed the alexcrichton:flock branch from e3ffa0c to e22b08c Mar 17, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 17, 2016

@bors: r=brson e22b08c

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 17, 2016

⌛️ Testing commit e22b08c with merge 1a5ec9b...

bors added a commit that referenced this pull request Mar 17, 2016

Auto merge of #2486 - alexcrichton:flock, r=brson
Fix running Cargo concurrently

Cargo has historically had no protections against running it concurrently. This
is pretty unfortunate, however, as it essentially just means that you can only
run one instance of Cargo at a time **globally on a system**.

An "easy solution" to this would be the use of file locks, except they need to
be applied judiciously. It'd be a pretty bad experience to just lock the entire
system globally for Cargo (although it would work), but otherwise Cargo must be
principled how it accesses the filesystem to ensure that locks are properly
held. This commit intends to solve all of these problems.

A new utility module is added to cargo, `util::flock`, which contains two types:

* `FileLock` - a locked version of a `File`. This RAII guard will unlock the
  lock on `Drop` and I/O can be performed through this object. The actual
  underlying `Path` can be read from this object as well.
* `Filesystem` - an unlocked representation of a `Path`. There is no "safe"
  method to access the underlying path without locking a file on the filesystem
  first.

Built on the [fs2] library, these locks use the `flock` system call on Unix and
`LockFileEx` on Windows. Although file locking on Unix is [documented as not so
great][unix-bad], but largely only because of NFS, these are just advisory, and
there's no byte-range locking. These issues don't necessarily plague Cargo,
however, so we should try to leverage them. On both Windows and Unix the file
locks are released when the underlying OS handle is closed, which means that
if the process dies the locks are released.

Cargo has a number of global resources which it now needs to lock, and the
strategy is done in a fairly straightforward way:

* Each registry's index contains one lock (a dotfile in the index). Updating the
  index requires a read/write lock while reading the index requires a shared
  lock. This should allow each process to ensure a registry update happens while
  not blocking out others for an unnecessarily long time. Additionally any
  number of processes can read the index.
* When downloading crates, each downloaded crate is individually locked. A lock
  for the downloaded crate implies a lock on the output directory as well.
  Because downloaded crates are immutable, once the downloaded directory exists
  the lock is no longer needed as it won't be modified, so it can be released.
  This granularity of locking allows multiple Cargo instances to download
  dependencies in parallel.
* Git repositories have separate locks for the database and for the project
  checkout. The datbase and checkout are locked for read/write access when an
  update is performed, and the lock of the checkout is held for the entire
  lifetime of the git source. This is done to ensure that any other Cargo
  processes must wait while we use the git repository. Unfortunately there's
  just not that much parallelism here.
* Binaries managed by `cargo install` are locked by the local metadata file that
  Cargo manages. This is relatively straightforward.
* The actual artifact output directory is just globally locked for the entire
  build. It's hypothesized that running Cargo concurrently in *one directory* is
  less of a feature needed rather than running multiple instances of Cargo
  globally (for now at least). It would be possible to have finer grained
  locking here, but that can likely be deferred to a future PR.

So with all of this infrastructure in place, Cargo is now ready to grab some
locks and ensure that you can call it concurrently anywhere at any time and
everything always works out as one might expect.

One interesting question, however, is what does Cargo do on contention? On one
hand Cargo could immediately abort, but this would lead to a pretty poor UI as
any Cargo process on the system could kick out any other. Instead this PR takes
a more nuanced approach.

* First, all locks are attempted to be acquired (a "try lock"). If this
  succeeds, we're done.
* Next, Cargo prints a message to the console that it's going to block waiting
  for a lock. This is done because it's indeterminate how long Cargo will wait
  for the lock to become available, and most long-lasting operations in Cargo
  have a message printed for them.
* Finally, a blocking acquisition of the lock is issued and we wait for it to
  become available.

So all in all this should help Cargo fix any future concurrency bugs with file
locking in a principled fashion while also allowing concurrent Cargo processes
to proceed reasonably across the system.

[fs2]: https://github.com/danburkert/fs2-rs
[unix-bad]: http://0pointer.de/blog/projects/locking.html

Closes #354
@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 17, 2016

💔 Test failed - cargo-mac-64

Fix running Cargo concurrently
Cargo has historically had no protections against running it concurrently. This
is pretty unfortunate, however, as it essentially just means that you can only
run one instance of Cargo at a time **globally on a system**.

An "easy solution" to this would be the use of file locks, except they need to
be applied judiciously. It'd be a pretty bad experience to just lock the entire
system globally for Cargo (although it would work), but otherwise Cargo must be
principled how it accesses the filesystem to ensure that locks are properly
held. This commit intends to solve all of these problems.

A new utility module is added to cargo, `util::flock`, which contains two types:

* `FileLock` - a locked version of a `File`. This RAII guard will unlock the
  lock on `Drop` and I/O can be performed through this object. The actual
  underlying `Path` can be read from this object as well.
* `Filesystem` - an unlocked representation of a `Path`. There is no "safe"
  method to access the underlying path without locking a file on the filesystem
  first.

Built on the [fs2] library, these locks use the `flock` system call on Unix and
`LockFileEx` on Windows. Although file locking on Unix is [documented as not so
great][unix-bad], but largely only because of NFS, these are just advisory, and
there's no byte-range locking. These issues don't necessarily plague Cargo,
however, so we should try to leverage them. On both Windows and Unix the file
locks are released when the underlying OS handle is closed, which means that
if the process dies the locks are released.

Cargo has a number of global resources which it now needs to lock, and the
strategy is done in a fairly straightforward way:

* Each registry's index contains one lock (a dotfile in the index). Updating the
  index requires a read/write lock while reading the index requires a shared
  lock. This should allow each process to ensure a registry update happens while
  not blocking out others for an unnecessarily long time. Additionally any
  number of processes can read the index.
* When downloading crates, each downloaded crate is individually locked. A lock
  for the downloaded crate implies a lock on the output directory as well.
  Because downloaded crates are immutable, once the downloaded directory exists
  the lock is no longer needed as it won't be modified, so it can be released.
  This granularity of locking allows multiple Cargo instances to download
  dependencies in parallel.
* Git repositories have separate locks for the database and for the project
  checkout. The datbase and checkout are locked for read/write access when an
  update is performed, and the lock of the checkout is held for the entire
  lifetime of the git source. This is done to ensure that any other Cargo
  processes must wait while we use the git repository. Unfortunately there's
  just not that much parallelism here.
* Binaries managed by `cargo install` are locked by the local metadata file that
  Cargo manages. This is relatively straightforward.
* The actual artifact output directory is just globally locked for the entire
  build. It's hypothesized that running Cargo concurrently in *one directory* is
  less of a feature needed rather than running multiple instances of Cargo
  globally (for now at least). It would be possible to have finer grained
  locking here, but that can likely be deferred to a future PR.

So with all of this infrastructure in place, Cargo is now ready to grab some
locks and ensure that you can call it concurrently anywhere at any time and
everything always works out as one might expect.

One interesting question, however, is what does Cargo do on contention? On one
hand Cargo could immediately abort, but this would lead to a pretty poor UI as
any Cargo process on the system could kick out any other. Instead this PR takes
a more nuanced approach.

* First, all locks are attempted to be acquired (a "try lock"). If this
  succeeds, we're done.
* Next, Cargo prints a message to the console that it's going to block waiting
  for a lock. This is done because it's indeterminate how long Cargo will wait
  for the lock to become available, and most long-lasting operations in Cargo
  have a message printed for them.
* Finally, a blocking acquisition of the lock is issued and we wait for it to
  become available.

So all in all this should help Cargo fix any future concurrency bugs with file
locking in a principled fashion while also allowing concurrent Cargo processes
to proceed reasonably across the system.

[fs2]: https://github.com/danburkert/fs2-rs
[unix-bad]: http://0pointer.de/blog/projects/locking.html

Closes #354

@alexcrichton alexcrichton force-pushed the alexcrichton:flock branch from e22b08c to 8eac1d6 Mar 17, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 17, 2016

@bors: r=brson

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 17, 2016

📌 Commit 8eac1d6 has been approved by brson

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 17, 2016

⌛️ Testing commit 8eac1d6 with merge c5360c4...

bors added a commit that referenced this pull request Mar 17, 2016

Auto merge of #2486 - alexcrichton:flock, r=brson
Fix running Cargo concurrently

Cargo has historically had no protections against running it concurrently. This
is pretty unfortunate, however, as it essentially just means that you can only
run one instance of Cargo at a time **globally on a system**.

An "easy solution" to this would be the use of file locks, except they need to
be applied judiciously. It'd be a pretty bad experience to just lock the entire
system globally for Cargo (although it would work), but otherwise Cargo must be
principled how it accesses the filesystem to ensure that locks are properly
held. This commit intends to solve all of these problems.

A new utility module is added to cargo, `util::flock`, which contains two types:

* `FileLock` - a locked version of a `File`. This RAII guard will unlock the
  lock on `Drop` and I/O can be performed through this object. The actual
  underlying `Path` can be read from this object as well.
* `Filesystem` - an unlocked representation of a `Path`. There is no "safe"
  method to access the underlying path without locking a file on the filesystem
  first.

Built on the [fs2] library, these locks use the `flock` system call on Unix and
`LockFileEx` on Windows. Although file locking on Unix is [documented as not so
great][unix-bad], but largely only because of NFS, these are just advisory, and
there's no byte-range locking. These issues don't necessarily plague Cargo,
however, so we should try to leverage them. On both Windows and Unix the file
locks are released when the underlying OS handle is closed, which means that
if the process dies the locks are released.

Cargo has a number of global resources which it now needs to lock, and the
strategy is done in a fairly straightforward way:

* Each registry's index contains one lock (a dotfile in the index). Updating the
  index requires a read/write lock while reading the index requires a shared
  lock. This should allow each process to ensure a registry update happens while
  not blocking out others for an unnecessarily long time. Additionally any
  number of processes can read the index.
* When downloading crates, each downloaded crate is individually locked. A lock
  for the downloaded crate implies a lock on the output directory as well.
  Because downloaded crates are immutable, once the downloaded directory exists
  the lock is no longer needed as it won't be modified, so it can be released.
  This granularity of locking allows multiple Cargo instances to download
  dependencies in parallel.
* Git repositories have separate locks for the database and for the project
  checkout. The datbase and checkout are locked for read/write access when an
  update is performed, and the lock of the checkout is held for the entire
  lifetime of the git source. This is done to ensure that any other Cargo
  processes must wait while we use the git repository. Unfortunately there's
  just not that much parallelism here.
* Binaries managed by `cargo install` are locked by the local metadata file that
  Cargo manages. This is relatively straightforward.
* The actual artifact output directory is just globally locked for the entire
  build. It's hypothesized that running Cargo concurrently in *one directory* is
  less of a feature needed rather than running multiple instances of Cargo
  globally (for now at least). It would be possible to have finer grained
  locking here, but that can likely be deferred to a future PR.

So with all of this infrastructure in place, Cargo is now ready to grab some
locks and ensure that you can call it concurrently anywhere at any time and
everything always works out as one might expect.

One interesting question, however, is what does Cargo do on contention? On one
hand Cargo could immediately abort, but this would lead to a pretty poor UI as
any Cargo process on the system could kick out any other. Instead this PR takes
a more nuanced approach.

* First, all locks are attempted to be acquired (a "try lock"). If this
  succeeds, we're done.
* Next, Cargo prints a message to the console that it's going to block waiting
  for a lock. This is done because it's indeterminate how long Cargo will wait
  for the lock to become available, and most long-lasting operations in Cargo
  have a message printed for them.
* Finally, a blocking acquisition of the lock is issued and we wait for it to
  become available.

So all in all this should help Cargo fix any future concurrency bugs with file
locking in a principled fashion while also allowing concurrent Cargo processes
to proceed reasonably across the system.

[fs2]: https://github.com/danburkert/fs2-rs
[unix-bad]: http://0pointer.de/blog/projects/locking.html

Closes #354
@bors

This comment has been minimized.

@bors bors merged commit 8eac1d6 into rust-lang:master Mar 17, 2016

2 of 3 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment