Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo lock on build directory does not work on NFS #4629

Open
bruceg opened this issue Oct 16, 2017 · 13 comments
Open

Cargo lock on build directory does not work on NFS #4629

bruceg opened this issue Oct 16, 2017 · 13 comments
Labels
A-filesystem Area: issues with filesystems S-triage Status: This issue is waiting on initial triage.

Comments

@bruceg
Copy link
Contributor

bruceg commented Oct 16, 2017

When run on a project located on a NFS mount, cargo fails to lock the build directory.

Expected behavior:

# cd /tmp
# cargo new --bin bin
     Created binary (application) `bin` project
# cd bin
# cargo run & cargo run
    Blocking waiting for file lock on build directory
   Compiling bin v0.1.0 (file:///tmp/bin)
    Finished dev [unoptimized + debuginfo] target(s) in 0.57 secs
     Running `target/debug/bin`
Hello, world!
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/bin`
Hello, world!

Actual behavior (my /home is NFSv4 mounted, with no lock-related options on the mount):

# cd $HOME/src
# cargo new --bin bin
     Created binary (application) `bin` project
# cd bin
# cargo run & cargo run
   Compiling bin v0.1.0 (file:///home/bruce/src/bin)
   Compiling bin v0.1.0 (file:///home/bruce/src/bin)
error: failed to remove /home/bruce/src/bin/target/debug/deps/bin-291ef6a9e055a048.0.o: No such file or directory (os error 2)

error: failed to remove /home/bruce/src/bin/target/debug/deps/bin-291ef6a9e055a048.crate.metadata.o: No such file or directory (os error 2)

error: failed to remove /home/bruce/src/bin/target/debug/deps/bin-291ef6a9e055a048.crate.allocator.o: No such file or directory (os error 2)

    Finished dev [unoptimized + debuginfo] target(s) in 0.90 secs
     Running `target/debug/bin`
    Finished dev [unoptimized + debuginfo] target(s) in 0.96 secs
     Running `target/debug/bin`
Hello, world!
Hello, world!

I'm using cargo 0.22.0 (3423351 2017-10-06) on Gentoo Linux, kernel 4.9.52.

@alexcrichton
Copy link
Member

is... there anything we can do about this? AFAIK file locking basically doesn't work at all with NFS

@bruceg
Copy link
Contributor Author

bruceg commented Oct 17, 2017

NFS file locking does work. For NFS v3 Linux has claimed support since about 2.6.12 (in 2005), and NFS v4 has file locking support built into the protocol.

However, there seems to be some conflicting information if flock(2) actually works over NFS. Kernel notes say it's been working for a long time, but others report that it doesn't still. The NFS man page(5) under "Using file locks with NFS" indicates that NFS (all versions?) supports advisory locks only using lockf/fcntl, with flock being emulated by fcntl calls.

If what you need is a single lock for the whole tree, a single lock file would suffice using one of a variety of methods (mkdir is atomic; symlink to a random value and readlink to check if yours was created; create, link, and then stat original file to count links, etc). I don't know the details of how cargo uses locks to be sure what would be most appropriate.

References:
https://unix.stackexchange.com/questions/229675/nfs-file-locking-not-working-am-i-misunderstanding
https://stackoverflow.com/questions/22409780/flock-vs-lockf-on-linux
https://serverfault.com/questions/66919/file-locks-on-an-nfs
https://en.wikipedia.org/wiki/File_locking#Problems

@bruceg
Copy link
Contributor Author

bruceg commented Oct 17, 2017

I see now that cargo did locking, but had issues with an NFS system (#2615). The reading I have done seems to point at NFS flock hangs being a NFS configuration problem more than an application problem. If flock doesn't work across the network, it generally maps to either a no-op (BSD) or a local-only lock (old Linux, others?). Turning locks on again (by disabling the NFS check) works for me, but that's far from conclusive testing it won't hang for me (at some other point) or anybody else.

I'm not sure where to go from here. Adding the locking back for NFS seems to be a non-starter because of problems others are having. Switching to fcntl locks may be the best compromise on Linux, but has portability issues (and isn't supported by fs2). I'd be willing to do the work to make something happen, but have no idea what.

Would it make any sense to switch the locking model to having some wider scale lock (ie a secondary lock file specifically when cargo needs exclusive access) instead of locking each writable file open, or is that just crazy talk?

FTR I ran into this by using assert_cli to implement tests for a command-line program. In in turn runs the program through cargo run. Running the numerous tests in parallel works much faster than sequentially due to the overhead of running cargo for each test.

@alexcrichton
Copy link
Member

Most systems, even if they have an NFS mount, typically have something on a local filesystem. I wonder if we could perhaps detect that the current path is an NFS mount and if so switch the path to the file lock on the filesystem. For example if /home/bar/code/foo.lock is NFS, we could perhaps instead use a file lock at /tmp/$(hash("/home/bar/code/foo.lock")).lock or something like that.

That way we could ensure that Cargo on the same machine is coordinated, and we're not really trying to coordinate Cargo across machines anyway.

@bruceg
Copy link
Contributor Author

bruceg commented Oct 17, 2017

Trying to use a file with a predictable file name in a globally writable directory is a recipe for disaster (symlink attack, etc). It would be more appropriate would be to use a NFS safe locking technique to create a separate lock file beside the file (AKA a dot-lock file or some such).

It may be worth spending some time and building a standalone crate to do just this, ala the lockfile command from the days of procmail or dotlockfile from the current liblockfile package. I could see it being useful in other places. I should go ahead and do that, though I can't test it on much more than Linux.

@stale
Copy link

stale bot commented Sep 19, 2018

As there hasn't been any activity here in over 6 months I've marked this as stale and if no further activity happens for 7 days I will close it.

I'm a bot so this may be in error! If this issue should remain open, could someone (the author, a team member, or any interested party) please comment to that effect?

The team would be especially grateful if such a comment included details such as:

  • Is this still relevant?
  • If so, what is blocking it?
  • Is it known what could be done to help move this forward?

Thank you for contributing!

(The cargo team is currently evaluating the use of Stale bot, and using #6035 as the tracking issue to gather feedback.)

If you're reading this comment from the distant future, fear not if this was closed automatically. If you believe it's still an issue please leave a comment and a team member can reopen this issue. Opening a new issue is also acceptable!

@stale stale bot added the stale label Sep 19, 2018
@bruceg
Copy link
Contributor Author

bruceg commented Oct 12, 2018

This is still a problem, though obviously it doesn't affect many people.

@stale stale bot removed the stale label Oct 12, 2018
@HankB
Copy link

HankB commented Oct 15, 2018

Total Rust newb here. I think I just ran into this too trying to build a project on an NFS share. The code I'm trying to build has errors and that could be contributing as well. The error messages I get finish with

thread 'main' panicked at 'librustc/session/mod.rs:802: Trying to get session directory from IncrCompSession `NotInitialized`', librustc/session/mod.rs:1296:26
note: Run with `RUST_BACKTRACE=1` for a backtrace.
error: aborting due to 3 previous errors


error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.29.1 (b801ae664 2018-09-20) running on x86_64-unknown-linux-gnu

note: compiler flags: -C debuginfo=2 -C incremental --crate-type lib

note: some of the compiler flags provided by cargo are hidden

error: Could not compile `tower`.

To learn more, run the command again with --verbose.
hbarta@olive:/mnt/share/projects/tower$ 

Impact on me is minimal as it would be a simple convenience to build on laptop or desktop. I can just copy the whole directory over. (Not worth the effort at this point to upload to a remote repo.)
I can provide further information if there is any interest. Otherwise I'll just consider this a "me too" to a known issue.
I'm, running on Debian Stretch (both host and NFS server.)

@alexcrichton
Copy link
Member

@HankB FWIW that's an issue for rust-lang/rust, not Cargo (it's coming from the compiler, not Cargo)

@HankB
Copy link

HankB commented Oct 16, 2018

@alexcrichton Thanks for pointing that out. It should have been obvious. I have added my comments to what seems like the same issue. (rust-lang/rust#40830)

@remifontan
Copy link

Hi,
here's another "me too" story.

I've been having pretty weird build issues/crashes recently, and nailed them down to being a concurrent issue with VSCode/Rust-analyser doing cargo check at the same time as I am building on the shell... and my project was on NFS (v3 to be precise).

I could workarounds all those "weird" crashes by either moving my project onto local disk or by disabling the VSCode/RA option "checkOnSave".

@nospam3089
Copy link

For what it is worth, this function is very prototypical code to detect whether a directory is on nfs or not. It seems to reliably succeed in detecting nfs filesystems on illumos, and could possibly be used as a basis for anyone wishing to bring something like the Linux case also to illumos. Please beware that it has only been thrown together without much thought, and do note that it contains an unsafe block.

use std::path::Path;

fn is_on_nfs_mount(path: &Path) -> bool {
    use std::ffi::CString;
    use std::mem;
    use std::os::unix::prelude::*;
    use libc;

    let path = match CString::new(path.as_os_str().as_bytes()) {
        Ok(path) => path,
        Err(_) => return false,
    };

    let oracle = [b'n' as i8, b'f' as i8, b's' as i8, b'\0' as i8];
    unsafe {
        let mut buf: libc::statvfs = mem::zeroed();
        let r = libc::statvfs(path.as_ptr(), &mut buf);

        println!("{:?} {:?}", &buf.f_basetype[0..=4], &oracle);
        buf.f_basetype[0..4] == oracle
    }
}

fn main() {
    let path0 = "/some/local/path";
    println!("{}: {:?}", path0, is_on_nfs_mount(&Path::new(path0)));
    let path1 = "/some/nfs/path";
    println!("{}: {:?}", path1, is_on_nfs_mount(&Path::new(path1)));
}

We decided to work around our issue by avoiding nfs mounted build directories. Thus no attempts have been, or will be, made by us to build or use cargo with the above posted function. In the unlikely case someone picks up from here, please feel free to do so. I have however posted all my available knowledge and can unfortunately not be of much further assistance.

@epage epage added the S-triage Status: This issue is waiting on initial triage. label Oct 17, 2023
@remifontan
Copy link

remifontan commented Apr 28, 2024

Hi. the "prototypical code" from above is not working anymore, libc must have changed and the field f_basetype is not there anymore.

However, it seems possible to find out whether a path is on a NFS mount with this command:

>>> stat -fc %T /my/nfs/mount
nfs

and by retriving the hex code:

>>> stat -fc %t /my/nfs/mount
6969

with this in mind, I updated the prototypical code from above to work in that same way:

use std::path::Path;

fn is_on_nfs_mount(path: &Path) -> bool {
    use std::ffi::CString;
    use std::mem;
    use std::os::unix::prelude::*;
    use libc;

    let path = match CString::new(path.as_os_str().as_bytes()) {
        Ok(path) => path,
        Err(_) => return false,
    };

    
    const TST_NFS_MAGIC : i64 = 0x6969;
    unsafe {
        let mut buf: libc::statfs = mem::zeroed();
        let r = libc::statfs(path.as_ptr(), &mut buf);
        buf.f_type == TST_NFS_MAGIC
    }
}

fn main() {
    let path0 = "/some/local/path";
    println!("{}: {:?}", path0, is_on_nfs_mount(&Path::new(path0)));
    let path1 = "/some/nfs/path";
    println!("{}: {:?}", path1, is_on_nfs_mount(&Path::new(path1)));
}

it would be great if a solution could be found so that people working NFS can safely use cargo without triggering this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-filesystem Area: issues with filesystems S-triage Status: This issue is waiting on initial triage.
Projects
None yet
Development

No branches or pull requests

7 participants