Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid cross-device link errors while moving files #453

Closed
wants to merge 3 commits into from

Conversation

orhun
Copy link
Contributor

@orhun orhun commented Jan 2, 2024

Background

#447 changed the archive extraction logic to:

  1. Extract files into a temporary directory. (/tmp)
  2. Move all files to the target directory.

However, step 2 uses fs::rename which poses a potential issue on Linux systems:

This will not work if the new name is on a different mount point.

We can demonstrate this with the following snippet:

#!/usr/bin/env rust-script

//! ```cargo
//! [dependencies]
//! anyhow = "1.0.79"
//! dirs = "5.0.1"
//! tempfile = "3.9.0"
//! ```

use anyhow::Result;
use std::fs::{self, File};

fn main() -> Result<()> {
    let tmp_dir = tempfile::tempdir()?;

    let file_path = tmp_dir.path().join("test.txt");
    File::create(&file_path)?;

    let new_path = dirs::home_dir().unwrap().join("new.txt");
    fs::rename(file_path, new_path)?;
    Ok(())
}
$ chmod +x script.rs && ./script.rs

thread 'main' panicked at src/main.rs:7:41:
called `Result::unwrap()` on an `Err` value: Os { code: 18, kind: CrossesDevices, message: "In
valid cross-device link" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This is because /tmp is a different mount point than ~/.

The issue

Due to the reason above, rattler-build build is not able to move the extracted files for me:

$ rattler-build build -r recipe.yaml --output-dir ../rattler-output/

No target platform specified, using current platform
Found variants:

git-cliff-1.4.0-hb0f4dca_0
╭─────────────────┬──────────╮
│ Variant         ┆ Version  │
╞═════════════════╪══════════╡
│ target_platform ┆ linux-64 │
╰─────────────────┴──────────╯

Fetching source from URL: https://github.com/orhun/git-cliff/archive/refs/tags/v0.4.0.tar.gz
Validated SHA256 values of the downloaded file!
Found valid source cache file.
Error:   × IO Error: failed to rename file from /tmp/.tmpjOYIft/git-cliff-0.4.0/.dockerignore to /home/
  │ orhun/gh/conda-forge-staged-recipes/recipes/rattler-output/bld/rattler-build_git-cliff_1704219413/
  │ work/.dockerignore
  ├─▶ failed to rename file from /tmp/.tmpjOYIft/git-cliff-0.4.0/.dockerignore to /home/orhun/gh/
  │   conda-forge-staged-recipes/recipes/rattler-output/bld/rattler-build_git-cliff_1704219413/
  │   work/.dockerignore
  ╰─▶ Invalid cross-device link (os error 18)

This PR simply fixes that issue by copying + deleting files instead of moving for Linux systems.

Error type

Rust throws std::io::ErrorKind::CrossesDevices for this case. However, we can directly check this type because it is gated behind #![feature(io_error_more)] feature. What we can do is enable this feature for the crate but this means that we will need the nightly compiler for builds. I don't think this is feasible.

Another thing that we can do is to check libc::EXDEV error type:

“Invalid cross-device link.” An attempt to make an improper link across file systems was detected. This happens not only when you use link (see Hard Links) but also when you rename a file with rename (see Renaming Files).

This means adding the libc dependency... so let's not do that.

For additional context, see the "renaming directories" section of the Kernel documentation.

  1. return EXDEV error: this error is returned by rename(2) when trying to
    move a file or directory across filesystem boundaries. Hence
    applications are usually prepared to hande this error (mv(1) for example
    recursively copies the directory tree). This is the default behavior.

Long story short, I find it most applicable to have a simple string check until io_error_more feature is stabilized:

// TODO: replace it with `std::io::ErrorKind::CrossesDevices`
// when <https://github.com/rust-lang/rust/issues/86442> is stabilized.
if format!("{:?}", e.kind()) == *"CrossesDevices" {

Additional context

The implementation is heavily inspired by rustup: https://github.com/rust-lang/rustup/blob/3b0ae079d91144ea9cf99a5fc6d493701a160823/src/utils/utils.rs#L566

Here is a related discussion: rust-lang/rustup#1239

🐻

`fs::rename` does not work on Linux when the source and destinations
are at different mount points. This commit mitigates this by using
copy + delete instead of move for cases like this.
@baszalmstra
Copy link
Contributor

Coincidentally just today I added code to rattler to detect if two files/directories share the same filesystem. see https://github.com/mamba-org/rattler/blob/02e68c9539c6009cc1370fbf46dc69ca5361d12d/crates/rattler/src/install/mod.rs#L562 . Would that be an option instead of the (imho hacky) string comparison?

@baszalmstra
Copy link
Contributor

I think rattler also already depends on libc so we might consider adding that dependency.

Also isnt EXDEV just a constant?

@wolfv
Copy link
Member

wolfv commented Jan 2, 2024

I wonder if this is also a problem on macOS?

@orhun
Copy link
Contributor Author

orhun commented Jan 2, 2024

detect if two files/directories share the same filesystem [..] Would that be an option instead of the (imho hacky) string comparison?

Oh, nice! Will try that.

I think rattler also already depends on libc so we might consider adding that dependency.
Also isnt EXDEV just a constant?

Yes, the problem that I faced was e.raw_os_error() was returning None which made me ditch the libc idea. The constant value is 18 IIRC but we can't really go that route since we don't have a value to compare 😕

I wonder if this is also a problem on macOS?

According to the docs:

This function currently corresponds to the rename function on Unix

Since macOS is Unix based, it might be. Assuming you're on mac, maybe give it a try with the snippet I shared 🐻

@wolfv
Copy link
Member

wolfv commented Jan 3, 2024

I think by default on macOS the temporary files are created in the home folder and live on the same fs, so it's usually not a problem. But if I had two different partitions I believe it would show the same behavior.

I did run your snippet, and that was running fine :)

@orhun
Copy link
Contributor Author

orhun commented Jan 3, 2024

I applied @baszalmstra's suggestion in 20d3122

One thing that I needed to do was create the file/directory before checking the metadata since it fails if the path does not exist.

@wolfv
Copy link
Member

wolfv commented Jan 3, 2024

Hmm, I believe on Windows we would have the same issue if we try to rename a file from C:\bla to D:\xyz since the drives would be different.

I am wondering if we should opt for a simpler solution. Either extract somewhere adjacent to the final location (instead of the tempfile).

For the copy_dir function we actually have one in rattler-build that should also work (https://github.com/prefix-dev/rattler-build/blob/main/src/source/copy_dir.rs).

@orhun
Copy link
Contributor Author

orhun commented Jan 3, 2024

I am wondering if we should opt for a simpler solution. Either extract somewhere adjacent to the final location (instead of the tempfile).

I'm wondering if we can directly extract to the final location. I don't know why it is extracted into temp in the first place.

For the copy_dir function we actually have one in rattler-build that should also work

I can try to use that after we decide which way we want to go 🐻

@baszalmstra
Copy link
Contributor

checking if two paths share the same drive on windows is relatively simple. I also implemented that in the code I linked previously.

@wolfv
Copy link
Member

wolfv commented Jan 3, 2024

@baszalmstra do you prefer doing that or just having a temp folder inside the src_cache folder that we would use to do this (on the same fs)?

@baszalmstra
Copy link
Contributor

Does that still provide out of source builds? If so the simplicity of that approach makes a lot of sense to me.

@orhun
Copy link
Contributor Author

orhun commented Jan 3, 2024

Created #456

@orhun orhun closed this Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants