Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtualised cargo fails with a "No such device" error n°19 likely because of MAP_SHARED mmap call on qemu + virtiofsd with --cache=never. #122262

Closed
gl-yziquel opened this issue Mar 9, 2024 · 19 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@gl-yziquel
Copy link

gl-yziquel commented Mar 9, 2024

Problem

I believe this issue is possibly kind of the same issue that was raised back in 2016:

rust-lang/cargo#2808

Context: Virtualised "cargo build" fails with "No such device" whenever setup is qemu +virtiofsd with --cache=never. Essentially the same issue that I just got fixed for gix. (The link below has more details and context.)

GitoxideLabs/gitoxide#1312

As can be seen on the issue for gix, in the links referenced there, rust uses by default a MAP_SHARED (like gix before today's fix) instead of a MAP_PRIVATE (like git and not gix) for mmap system calls. MAP_SHARED has more stringent coherence requirements than MAP_PRIVATE, BUT that bug doesn't materialise with a virtiofs filesystem unless cache is disabled with --cache=never (and I need to disable caching because heavy IO workloads on the guest blow up the file descriptors on the host if cache is enabled).

To see the "No such device" issue, one needs both MAP_SHARED in place of MAP_PRIVATE and --cache=never on virtiosfd.

mini-me@virtucon ~/h/s/my-project (master)> cargo build                                                                
   Compiling rand_core v0.6.4                                                                                                                                                                                                                 
   Compiling siphasher v0.3.11                             
   Compiling libc v0.2.153                                                                                                                                                                                                                    
   Compiling typenum v1.17.0                                                                                           
   Compiling memchr v2.7.1                                 
   Compiling minimal-lexical v0.2.1                                                                                    
   Compiling tinyvec_macros v0.1.1                                                                                     
   Compiling version_check v0.9.4                                                                                      
   Compiling cfg-if v1.0.0                                                                                                                                                                                                                    
   Compiling smallvec v1.13.1                                                                                          
   Compiling bitflags v1.3.2                               
   Compiling fnv v1.0.7                                                                                                
   Compiling bitflags v2.4.2                                                                                           
   Compiling unicode-width v0.1.11                                                                                     
error[E0786]: found invalid metadata files for crate `core`                                                            
  |                                                                                                                    
  = note: failed to mmap file '/home/mini-me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-193cf992125ccd4c.rlib': No such device (os error 19)

The fix is to use MAP_PRIVATE in rust code, like done for gix:

GitoxideLabs/gitoxide@88061a1

I believe this is a bug that is pervasive in the wider rust ecosystem when virtualised because MAP_SHARED is used by default. It hits gix. It hits cargo.

Conclusion: Either the rust ecosystem was wrong to choose to use MAP_SHARED as a default, either the virtiofs has a semi-broken memory mapped file implementation. I assume the latter is true, but that doesn't change the fact that virtualised rust using mmap() system calls tends to be affected, like cargo is, by spurrious "No such device" linux OS n° 19 errors for code that runs perfectly fine when not virtualised.

Debugging gix was easy with strace. Debugging cargo to provide nice reproducible strace dumps is harder to me as I get lost in the subprocesses it triggers. Any guidance as to how to get the relevant strace dumps would be appreciated. But the gix link shows that MAP_SHARED indeed is what triggers these spurrious errors when virtualised, like in that old 2016 bug report linked at the top. Here is my own stracing of cargo build:

mini-me@virtucon ~/h/s/my-project (master) [SIGINT]> strace cargo build
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7795163f1990, parent_tid=0x7795163f1990, exit_signal=0, stack=0x7795161f1000, stack_size=0x1ffbc0, tls=0x7795163f16c0} => {parent_tid=[43528]}, 88) = 43528
[...]
write(2, "error[E0786]", 12error[E0786])            = 12
write(2, ": found invalid metadata files f"..., 47: found invalid metadata files for crate `core`) = 47
[...]
write(2, "note", 4note)                     = 4
write(2, ": failed to mmap file '/home/min"..., 189: failed to mmap file '/home/mini-me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-193cf992125ccd4c.rlib': No such device (os error 19)) = 189

The problem likely occurs in the clone3() system call. I don't know how to strace it to provide clear cut proof.

But MAP_SHARED inside the clone does seem to be something that occured in the 2016 bug:

rust-lang/cargo#2808 (comment)

Steps

Virtualise an Ubuntu Mantic VM, with a virtiofs mount backed up by a ZFS share. Ensure virtiofsd is launched with --cache=never. Install cargo on that virtualised filesystem. Try to build some rust code with cargo build. Observe "No such device" error 19 with "cargo build".

This MAP_SHARED issue likely impacts more virtualisation setups than the one I described. MAP_PRIVATE is arguably more unsafe with ugly corner cases, but MAP_SHARED asks for too much guarantees on some virtualised filesystems.

Possible Solution(s)

Replace MAP_SHARED mmap() system calls in the rust stack of cargo by MAP_PRIVATE mmap() system calls. Like gix / gitoxide just did.

Notes

No response

Version

cargo 1.75.0 (1d8b05cdd 2023-11-20)
release: 1.75.0
commit-hash: 1d8b05cdd1287c64467306cf3ca2c8ac60c11eb0
commit-date: 2023-11-20
host: x86_64-unknown-linux-gnu
libgit2: 1.7.1 (sys:0.18.1 vendored)
libcurl: 8.4.0-DEV (sys:0.4.68+curl-8.4.0 vendored ssl:OpenSSL/1.1.1u)
ssl: OpenSSL 1.1.1u  30 May 2023
os: Ubuntu 23.10 (mantic) [64-bit]
@gl-yziquel gl-yziquel added the C-bug Category: This is a bug. label Mar 9, 2024
@gl-yziquel gl-yziquel changed the title Vitualised cargo fails with a No such device likely because of MAP_SHARED mmap call on qemu + virtiofsd with --cache=never. Vitualised cargo fails with a "No such device" error n°19 likely because of MAP_SHARED mmap call on qemu + virtiofsd with --cache=never. Mar 9, 2024
@gl-yziquel gl-yziquel changed the title Vitualised cargo fails with a "No such device" error n°19 likely because of MAP_SHARED mmap call on qemu + virtiofsd with --cache=never. Virtualised cargo fails with a "No such device" error n°19 likely because of MAP_SHARED mmap call on qemu + virtiofsd with --cache=never. Mar 10, 2024
@gl-yziquel
Copy link
Author

gl-yziquel commented Mar 10, 2024

Confirmation. I did an strace -f cargo build. And, after some hunting...

[pid 43835] mmap(NULL, 48973144, PROT_READ, MAP_SHARED, 9, 0 <unfinished ...>
[pid 43835] <... mmap resumed>)         = -1 ENODEV (No such device)

MAP_SHARED is the culprit that triggers an ENODEV failure.

That MAP_PRIVATE instead of MAP_SHARED fixes it is witnessed by the gix fix in the gitoxide repo.

@weihanglo
Copy link
Member

Thank you for the report. Really appreciate the effort!

The error came from rustc the compiler. Seems pretty likely from here. memmap2 is the crate rustc uses for mmap operaations, so it is mostly the same issue as the gitoxide one. Will transfer this to rust-lang/rust as this doesn't seem too related to Cargo.

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 10, 2024
@weihanglo weihanglo transferred this issue from rust-lang/cargo Mar 10, 2024
@gl-yziquel
Copy link
Author

gl-yziquel commented Mar 10, 2024

Thank you for the report. Really appreciate the effort!

The error came from rustc the compiler.

I'm tracing it here in strace -f:

[pid 43793] execve("/home/mini-me/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc", ["/home/mini-me/.rustup/toolchains"..., "-", "--crate-name", "___", "--print=file-names", "--crate-type", "bin", "--crate-type", "rlib", "--crate-type", "dylib", "--crate-type", "cdylib", "--crate-type", "staticlib", "--crate-type", "proc-macro", "--print=sysroot", "--print=split-debuginfo", "--print=crate-name", "--print=cfg"], 0x61e426d8b710 /* 72 vars */ <unfinished ...>

If anyone knows of toolings able to cross-examine both some strace stuff and a debugger, I'd be happy to pinpoint the offending piece of rust code that calls MAP_SHARED.

Seems pretty likely from here. memmap2 is the crate rustc uses for mmap operaations, so it is mostly the same issue as the gitoxide one. Will transfer this to rust-lang/rust as this doesn't seem too related to Cargo.

OK.

@weihanglo
Copy link
Member

I meant, the rustc code shares the same code path with gitxoide.

We need someone familiar with this part to determine if it is safe and better to use MAP_PRIVATE.

@gl-yziquel
Copy link
Author

gl-yziquel commented Mar 10, 2024

We need someone familiar with this part to determine if it is safe and better to use MAP_PRIVATE.

From what I understood, MAP_PRIVATE has some corner cases that make it unsafe, but I tend to believe it is nitpicking to keep MAP_SHARED because of that. It would be sacrificing pragmatism (having rustc behave when virtualised) for ideological purity (safety in a context where it doesn't mean that much.)

There are quite a bit of pieces of code that rely on virtualisation, specifically given the (horrible) modern tendency to use docker everywhere to build code, the worst of which not being GitHub Actions. That issue seems to pop up unrecognised in many places where people use arcane workarounds.

@saethlin
Copy link
Member

From what I understood, MAP_PRIVATE has some corner cases that make it unsafe

I implemented Miri's mmap support so I'd like to think I know a fair bit about mmap... but what you're saying here doesn't make sense to me. Can you explain what about MAP_PRIVATE is more dangerous than MAP_SHARED?

@saethlin saethlin added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-feature-request Category: A feature request, i.e: not implemented / a PR. and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Mar 10, 2024
@gl-yziquel
Copy link
Author

gl-yziquel commented Mar 10, 2024

From what I understood, MAP_PRIVATE has some corner cases that make it unsafe

I implemented Miri's mmap support so I'd like to think I know a fair bit about mmap... but what you're saying here doesn't make sense to me. Can you explain what about MAP_PRIVATE is more dangerous than MAP_SHARED?

The man page of MAP_PRIVATE says:

MAP_PRIVATE Create a private copy-on-write mapping. Updates to the mapping are not visible to other processes mapping the same file, and are not carried through to the underlying file. It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region.

People on #irc seemed to claim that this unspecified behaviour made MAP_PRIVATE unsafe in some sense or another. I find that besides the point of the current problem. But, well, that's not the kind of discussion that really is up to me.

I'd be happy getting MAP_PRIVATE instead of MAP_SHARED into rustc w/r to such virtualisation issues. Trying to recompile rust with that change at the moment. We'll see if that fixes my virtualisation issue.

@saethlin
Copy link
Member

saethlin commented Mar 10, 2024

Ah. Whatever that IRC discussion was is not about the Rust concept of unsafety. In Rust, the way we use memmap2 is UB if the mapping is updated. So if anything, MAP_PRIVATE is better because it's possible that a modification to the file wouldn't update the mapping and cause UB.

I'd be happy to review such a change (put r? saethlin in the PR description on its own line), or I'll make the PR myself when I have time.

@gl-yziquel
Copy link
Author

Ah. Whatever that IRC discussion was is not about the Rust concept of unsafety. In Rust, the way we use memmap2 is UB if the mapping is updated. So if anything, MAP_PRIVATE is better because it's possible that a modification to the file wouldn't update the mapping and cause UB.

UB ?

I'd be happy to review such a change (put r? saethlin in the PR description on its own line), or I'll make the PR myself when I have time.

I'd be happy to try making a first commit to the rust project, to be honest.

Just busy doing nonsense like resizing my VM to compile rust on it. May take a few days given that I'm new to the rust codebase. I'll try to make it happen ASAP.

@saethlin
Copy link
Member

UB is short for Undefined Behavior; alternatively "things the compiler will assume implicitly in order to optimize". In this case, memmap2 hands out a &[u8] to the mapping, and a Rust compiler is allowed to apply optimizations which assume the memory referred to by the &[u8] is not updated between reads through that reference.

Normally this property of &T is no big deal, but when it's a mapped file, that's an awfully non-local safety contract to uphold. Normally, Rust APIs are designed to contain unsafety, and it's a sore point that mmap is so uncontained.

@gl-yziquel
Copy link
Author

So, roughly, we can safely assume that MAP_SHARED will misbehave, and unsafely assume MAP_PRIVATE will behave.

Great.

@gl-yziquel
Copy link
Author

OK. Got a fix and tested it (in as much as it is possible in my limited context). Familiarising myself with contribution guidelines, and you people should get a PR before tomorrow.

@the8472
Copy link
Member

the8472 commented Mar 10, 2024

I'm not against changing this but I'll note that going around and patching projects using mmap doesn't seem like a great solution either. mmap(..., MAP_SHARED) is part of the posix standard so every day someone might start writing code using it.

If you execute your programs on a weird filesystem that doesn't support it you might hit this issue every time you install new software or upgrade existing ones (that start using it).

So it'd be better to fix the rootcause, either by switching to a different filesystem or by getting the FS fixed.

@gl-yziquel
Copy link
Author

gl-yziquel commented Mar 10, 2024

I'm not against changing this but I'll note that going around and patching projects using mmap doesn't seem like a great solution either. mmap(..., MAP_SHARED) is part of the posix standard so every day someone might start writing code using it.

If you execute your programs on a weird filesystem that doesn't support it you might hit this issue every time you install new software or upgrade existing ones (that start using it).

So it'd be better to fix the rootcause, either by switching to a different filesystem or by getting the FS fixed.

Which is why I am currently trying to raise the issue in the virtiofs gitlab repository.

https://gitlab.com/virtio-fs/virtiofsd/-/issues/149

Note that while this is here specific to virtiofs, it seems many virtualisation bug reports tend to refer to such a situation in somewhat veiled terms. The 2016 bug report reference at the top of this bug report already points to MAP_SHARED in its strace analysis, but it did not quite light a bulb and people have been using workarounds for years now because the issue has been misunderstood.

This is essentially a question of what rust aims to support. Any virtualisation setup is bound at one point to grapple with such issues such as memmap() (as virtiofs is currently doing with dax window which currently is not merged in qemu and requires going down the serial patching route) and this is going to force virtualisation support for rust code to continuously lag behind until virtualisation has become a fully fully mature technology.

If people want to use MAP_SHARED everywhere, fine. But at least get rustc running in these setup. If an email client or whatever doesn't work when virtualised because of MAP_SHARED, fine, whatever. If rustc bails out, that is an entirely different matter.

P.S.: it remains a rust specific issue insofar as MAP_SHARED is being used as a default much more in rust than it is in other language ecosystems. While the POSIX standard may say something, the real reality of the real world is that much less code will break if not written in rust.

@the8472
Copy link
Member

the8472 commented Mar 10, 2024

Any virtualisation setup is bound at one point to grapple with such issues

No, not really. If you just virtualize a plain block device then such an issue does not arise because mmap will be implemented inside the guest OS with a regular filesystem.

I've been using rust in various virtual machines (virtualbox, hyperv, kvm) for years. It is only specific less commonly used filesystems that cause issues. But virtiofs is not alone in this. 9p, NFS, various FUSE drivers, the linux NTFS implementation... many niche filesystems have quirks that cause problems in a lot of software.

Switching to a mature filesystem implementation rather than an immature/incomplete one can very well be the answer.

@gl-yziquel
Copy link
Author

No, not really. If you just virtualize a plain block device then such an issue does not arise because mmap will be implemented inside the guest OS with a regular filesystem.

That is called deflecting. The issue is not whether I like chocolate or vanilla. The issue is that this chocolate tastes bad. And so does vanilla, if you ask me.

@workingjubilee
Copy link
Member

Please cease to argue about this, or at least find a more suitable venue to do so. GitHub issues only survive a limited number of back-and-forth remarks and we have approximately hit that limit. If rustc does not need to use MAP_SHARED it is entirely reasonable for rustc to eschew it. The quasi-philosophical argument about the state of the ecosystem can be taken up separately.

@technetos
Copy link
Member

Hey folks, this discussion is starting to get out of hand and off topic. Please take the discussion to another venue other than Github issues.

@gl-yziquel
Copy link
Author

The quasi-philosophical argument about the state of the ecosystem can be taken up separately.

Yes. Though, it is worth mentionning that go is impacted too in much the same way. It just bailed out on building gosec with that same kind of error. Big C/C++ projects like qemu, however, do not seem to be impacted.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 16, 2024
Use `MAP_PRIVATE` (not unsound-prone `MAP_SHARED`)

Solves rust-lang#122262
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Mar 17, 2024
Use `MAP_PRIVATE` (not unsound-prone `MAP_SHARED`)

Solves rust-lang/rust#122262
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants