New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparse-file handling for the Linux implementation of std::fs::copy() #55909

Open
wants to merge 27 commits into
base: master
from

Conversation

Projects
None yet
9 participants
@tarka
Contributor

tarka commented Nov 13, 2018

This PR adds sparse-file support to the Linux implementation of std::fs:copy(). The current implementation uses the Linux sys-call copy_file_range(), which has a number of benefits over user-space copying, but it will expand 'holes' in any sparse files if not explicitly checked-for. This PR also adds a user-space copy function as a fall back for cases where copy_file_range() is unavailable or cannot handle the operation.

There are a few judgement-call changes made to support the additional code; I'm happy to take pointers on style if this doesn't fit with the overall Rust codebase:

  • The Linux-specific implementation has been moved to its own module to minimise #[cfg()]s, minimise nesting, and simplify testing.
  • It uses a number of smaller functions for testing and readability; the compiler should inline these.
  • The global flag for the syscall availability has been made thread-local to avoid global-lock contention in highly multi-threaded programs.

It's worth noting that the user-space copy and sparse-hole seek code could be ported to other Unix-like OSs that support sparse-files, however I didn't want to get too into the weeds.

cc @nicokoch who added the original copy_file_range() implementation.

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Nov 13, 2018

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Nov 13, 2018

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:1480ff98:start=1542078454825909941,finish=1542078457084667265,duration=2258757324
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
[00:52:33] .................................................................................................... 100/5017
[00:52:36] .................................................................................................... 200/5017
[00:52:39] .............................ii............................................ii...................ii.. 300/5017
[00:52:41] ..............................................................................................iii... 400/5017
[00:52:44] .....iiiiiiii.iii............................iii...........................................i........ 500/5017
[00:52:51] .................................................................................................... 700/5017
[00:52:58] .................................................................................i...........i...... 800/5017
[00:53:01] .................................................................................................... 900/5017
[00:53:05] iiiii..................ii.iiii...................................................................... 1000/5017
---
[00:53:41] .................................................................................................... 2200/5017
[00:53:46] .................................................................................................... 2300/5017
[00:53:50] .................................................................................................... 2400/5017
[00:53:53] .................................................................................................... 2500/5017
[00:53:57] .................................................................................iiiiiiiii.......... 2600/5017
[00:54:05] ..............................................ii.................................................... 2800/5017
[00:54:07] .................................................................................................... 2900/5017
[00:54:11] .................................................................................................... 3000/5017
[00:54:14] ........................................i........................................................... 3100/5017
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:08:22] 
[01:08:22] running 116 tests
[01:08:25] i..ii...iii..iiii.....i...i.........i..iii...........i.....i.....ii...i..i.ii..............i...ii..i 100/116
[01:08:25] i.i....iiii.....
[01:08:25] 
[01:08:25]  finished in 3.583
[01:08:25] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:08:40] 
[01:08:40] running 118 tests
[01:09:06] .iiiii...i.....i..i...i..i.i..i.i..i.....i..i....i..........iiii.........i.i....i...i.......ii.i.i.i 100/118
[01:09:10] ......iii.i.....ii
[01:09:10] 
[01:09:10]  finished in 30.049
[01:09:10] travis_fold:end:test_debuginfo

---
[01:29:33] thread '<unnamed>' panicked at 'explicit panic', libstd/sync/rwlock.rs:652:13
[01:29:33] thread '<unnamed>' panicked at 'explicit panic', libstd/sync/rwlock.rs:617:13
[01:29:33] thread '<unnamed>' panicked at 'explicit panic', libstd/sync/rwlock.rs:629:13
[01:29:34] thread '<unnamed>' panicked at 'What the answer to my lifetimes dilemma is?', libstd/sys_common/remutex.rs:241:13
[01:29:34] ..............................................................................F...F................. 700/784
[01:29:34] thread '<unnamed>' panicked at 'index 2 and/or 4 in `"aé 💩"` do not lie on character boundary', libstd/sys_common/wtf8.rs:784:5
[01:29:34] thread '<unnamed>' panicked at 'index 5 and/or 8 in `"aé 💩"` do not lie on character boundary', libstd/sys_common/wtf8.rs:784:5
[01:29:34] thread '<unnamed>' panicked at 'assertion failed: is_code_point_boundary(self, new_len)', libstd/sys_common/wtf8.rs:329:9
[01:29:34] thread '<unnamed>' panicked at 'assertion failed: is_code_point_boundary(self, new_len)', libstd/sys_common/wtf8.rs:329:9
---
[01:29:34] thread '<unnamed>' panicked at 'specified instant was later than self', libstd/sys/unix/time.rs:292:17
[01:29:43] ....................................................................................
[01:29:43] failures:
[01:29:43] 
[01:29:43] ---- sys::unix::fs_linux::tests::test_lseek_data stdout ----
[01:29:43] thread 'sys::unix::fs_linux::tests::test_lseek_data' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 38, kind: Other, message: "Function not implemented" }', libcore/result.rs:1009:5
[01:29:43] ---- sys::unix::fs_linux::tests::test_sparse_copy_middle stdout ----
[01:29:43] ---- sys::unix::fs_linux::tests::test_sparse_copy_middle stdout ----
[01:29:43] thread 'sys::unix::fs_linux::tests::test_sparse_copy_middle' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 38, kind: Other, message: "Function not implemented" }', libcore/result.rs:1009:5
[01:29:43] 
[01:29:43] failures:
[01:29:43] failures:
[01:29:43]     sys::unix::fs_linux::tests::test_lseek_data
[01:29:43]     sys::unix::fs_linux::tests::test_sparse_copy_middle
[01:29:43] test result: FAILED. 782 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out
[01:29:43] 
[01:29:43] error: test failed, to rerun pass '--lib'
[01:29:43] 
[01:29:43] 
[01:29:43] 
[01:29:43] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "std" "--" "--quiet"
[01:29:43] 
[01:29:43] 
[01:29:43] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:29:43] Build completed unsuccessfully in 0:41:03
[01:29:43] Build completed unsuccessfully in 0:41:03
[01:29:43] Makefile:58: recipe for target 'check' failed
[01:29:43] make: *** [check] Error 1
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:042e792a
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Tue Nov 13 04:37:31 UTC 2018

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@nicokoch

This comment has been minimized.

Contributor

nicokoch commented Nov 14, 2018

I can‘t really review right now since I‘m on vacation.

@dtolnay

This comment has been minimized.

Member

dtolnay commented Nov 15, 2018

Reassigning because I am swamped this week:

r? @alexcrichton

@the8472

This comment has been minimized.

the8472 commented Nov 19, 2018

Afaik every call to metadata() results in a stat syscall, i.e. it's not cached in the File instance. Maybe it would make sense to cache the results?

@the8472

This comment has been minimized.

the8472 commented Nov 19, 2018

I have posted some ideas on internals how this could be optimized further, but that's out of scope for this PR.

@tarka

This comment has been minimized.

Contributor

tarka commented Nov 20, 2018

Afaik every call to metadata() results in a stat syscall.

Good point, I've cleaned it up.

// Slightly modified version of io::copy() that only copies a set amount of bytes.
fn copy_bytes_uspace(mut reader: &File, mut writer: &File, nbytes: usize) -> io::Result<u64> {
const BLKSIZE: usize = 4 * 1024; // Assume 4k blocks on disk.

This comment has been minimized.

@the8472

the8472 Nov 20, 2018

This is a reasonable default for io::copy that handles generic Read and Write types but for fs::copy we know we're doing syscalls and can probably do better. E.g. gnu cp has bloated up over the years and now does 128KiB or stat.blksize, whichever is larger.

This is probably due to ever-growing throughput (thus more syscalls per unit of time) and larger true block sizes, e.g. on flash and raid storage.

This comment has been minimized.

@the8472

the8472 Nov 20, 2018

Oh, and of course it only does this for large files.

On the other hand the rust std default buffer size was dropped to rust 8k in #32695 due to problems with jemalloc. But that might not be relevant here since jemalloc is not the default anymore and this would only apply to copying large files.

@the8472

This comment has been minimized.

the8472 commented Nov 20, 2018

GNU cp does fadvise(..., FADVISE_SEQUENTIAL) on the source. We probably should do that too.

https://github.com/coreutils/coreutils/blob/4711c49312d54e84996c13c612f7081c95f821a6/src/copy.c#L1226

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 20, 2018

Thanks for the PR @tarka! Unfortunately I'm very swamped right now and this is a quite a large amount of code that I won't have time to review. Others from @rust-lang/libs may be interested in this as well.

I'm not entirely certain if we want to have such complicated code for this in the standard library, this may be crossing into the boundary of "should start on crates.io and maybe move some time afterwards". If no one else gets around to this I will try to get around to it in the coming weeks

let out_meta = outfd.metadata()?;
let (is_sparse, is_xmount) = copy_parms(&in_meta, &out_meta)?;
let uspace = is_xmount;

This comment has been minimized.

@the8472

the8472 Nov 20, 2018

This check may soon be too strict in new kernels, cross-device clone_file_range might become supported e.g. between NFS mounts or multiple subvolumes of btrfs systems.

Instead one should always try copy_file_range and fall back to regular copy on the first EXDEV.

@tarka

This comment has been minimized.

Contributor

tarka commented Nov 21, 2018

Thanks @alexcrichton, understood.

@the8472 Thanks for the pointers. Based on Alex's comments I might hold off on any changes until it's decided if this is appropriate for std. That said, even if this is accepted into std I think there is a place for an external library with more advanced features and tunables. In particular with xcp I had to duplicate most of what was in std to add callbacks, and that might be better moved to a library. So I might move this code into that library while this PR is in the queue and iterate there. If you have the time I'd love to have your input and contributions. I'll let you know when I've got the basics up and running.

tarka added a commit to tarka/fs-extra that referenced this pull request Nov 22, 2018

@TimNN TimNN added A-allocators and removed A-allocators labels Nov 27, 2018

@Dylan-DPC

This comment has been minimized.

Member

Dylan-DPC commented Dec 3, 2018

Ping from triage. @rust-lang/libs can someone review this?

@Dylan-DPC

This comment has been minimized.

Member

Dylan-DPC commented Dec 10, 2018

@sfackler

This comment has been minimized.

Member

sfackler commented Dec 11, 2018

I don't think I'm really convinced that the significant extra complexity here is worth it. I'd probably agree with @alexcrichton that this may be best served starting externally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment