Auto merge of #75272 - the8472:spec-copy, r=KodrAus
specialize io::copy to use copy_file_range, splice or sendfile

Fixes #74426.
Also covers #60689 but only as an optimization instead of an official API.

The specialization only covers std-owned structs so it should avoid the problems with #71091

Currently Linux-only, but it should be generalizable to other Unix systems that have sendfile/sosplice and similar.

There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.

The test case executes the following:

```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4)        = 4
[pid 103776] write(4, "iklmn", 5)       = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5

```

- 0-1 `stat` calls to identify the source file type. No call is needed if the type can be inferred from the struct from which the FD was extracted.
- 𝖬 `write` calls to drain the `BufReader`/`BufWriter` wrappers. These only happen when buffers are present, and 𝖬 ≾ number of wrappers present. If there is a write buffer, it may absorb the read buffer contents first, so only a single write results. Vectored writes would also be an option, but that would require more invasive changes to `BufWriter`.
- 𝖭 `copy_file_range`/`splice`/`sendfile` calls until the file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also has other benefits such as DMA offload or extent sharing.
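
As a rough illustration of the kind of caller that produces a sequence like the one above, here is a minimal sketch that copies a limited number of bytes between two files through `BufReader`, `Take` and `BufWriter`. The helper name `copy_limited` and the temporary file names are made up for this example, and whether the kernel-assisted path is actually taken depends on the platform and on the std version in use; the observable result is the same either way.

```rust
use std::fs::{self, File};
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::Path;

/// Hypothetical helper: copies at most `limit` bytes from `src` to `dst`
/// through buffered wrappers, the shape described in the list above.
fn copy_limited(src: &Path, dst: &Path, limit: u64) -> io::Result<u64> {
    let reader = BufReader::new(File::open(src)?);
    let mut writer = BufWriter::new(File::create(dst)?);
    // `Take` caps the number of bytes that `io::copy` may transfer.
    let mut limited = reader.take(limit);
    let copied = io::copy(&mut limited, &mut writer)?;
    // Make sure nothing is left in the write buffer before returning.
    writer.flush()?;
    Ok(copied)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir();
    let src = dir.join(format!("io_copy_sketch_src_{}", std::process::id()));
    let dst = dir.join(format!("io_copy_sketch_dst_{}", std::process::id()));
    fs::write(&src, b"abcdefghiklmnopqr")?;

    let copied = copy_limited(&src, &dst, 9)?;
    assert_eq!(copied, 9);
    assert_eq!(fs::read(&dst)?.as_slice(), &b"abcdefghi"[..]);

    fs::remove_file(&src)?;
    fs::remove_file(&dst)?;
    Ok(())
}
```

Running such a program under `strace` on Linux is one way to check which syscalls are issued on a given system.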

## Benchmarks

```

OLD

test io::tests::bench_file_to_file_copy         ... bench:      21,002 ns/iter (+/- 750) = 6240 MB/s    [ext4]
test io::tests::bench_file_to_file_copy         ... bench:      35,704 ns/iter (+/- 1,108) = 3671 MB/s  [btrfs]
test io::tests::bench_file_to_socket_copy       ... bench:      57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy   ... bench:     142,640 ns/iter (+/- 77,851) = 918 MB/s

NEW

test io::tests::bench_file_to_file_copy         ... bench:      14,745 ns/iter (+/- 519) = 8889 MB/s    [ext4]
test io::tests::bench_file_to_file_copy         ... bench:       6,128 ns/iter (+/- 227) = 21389 MB/s   [btrfs]
test io::tests::bench_file_to_socket_copy       ... bench:      13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy   ... bench:      26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
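
To make the tables easier to compare, the per-benchmark speedup can be computed as old time divided by new time. The sketch below does that with the `ns/iter` figures copied from above; the row labels and the `speedup` helper are made up for this example. By this measure the improvements range from roughly 1.4x (file to file on ext4) to roughly 5.8x (file to file on btrfs).

```rust
/// Speedup of the new timing over the old one (values above 1.0 mean faster).
fn speedup(old_ns: f64, new_ns: f64) -> f64 {
    old_ns / new_ns
}

fn main() {
    // (benchmark, old ns/iter, new ns/iter), copied from the tables above.
    let rows = [
        ("file_to_file [ext4]", 21_002.0, 14_745.0),
        ("file_to_file [btrfs]", 35_704.0, 6_128.0),
        ("file_to_socket", 57_002.0, 13_767.0),
        ("socket_pipe_socket", 142_640.0, 26_471.0),
    ];
    for (name, old_ns, new_ns) in rows {
        println!("{:<22} {:.2}x", name, speedup(old_ns, new_ns));
    }
}
```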
bors committed Nov 14, 2020
2 parents 66c1309 + bbfa92c commit 30e49a9
Showing 11 changed files with 930 additions and 152 deletions.
2 changes: 1 addition & 1 deletion library/std/src/fs.rs
Original file line number Diff line number Diff line change
@@ -1656,7 +1656,7 @@ pub fn rename<P: AsRef<Path>, Q: AsRef<Path>>(from: P, to: Q) -> io::Result<()>
/// the length of the `to` file as reported by `metadata`.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with [`File`]s, see the [`io::copy`] function.
/// working with [`File`]s, see the [`io::copy()`] function.
///
/// # Platform-specific behavior
///
88 changes: 88 additions & 0 deletions library/std/src/io/copy.rs
@@ -0,0 +1,88 @@
use crate::io::{self, ErrorKind, Read, Write};
use crate::mem::MaybeUninit;

/// Copies the entire contents of a reader into a writer.
///
/// This function will continuously read data from `reader` and then
/// write it into `writer` in a streaming fashion until `reader`
/// returns EOF.
///
/// On success, the total number of bytes that were copied from
/// `reader` to `writer` is returned.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with filesystem paths, see the [`fs::copy`] function.
///
/// [`fs::copy`]: crate::fs::copy
///
/// # Errors
///
/// This function will return an error immediately if any call to [`read`] or
/// [`write`] returns an error. All instances of [`ErrorKind::Interrupted`] are
/// handled by this function and the underlying operation is retried.
///
/// [`read`]: Read::read
/// [`write`]: Write::write
///
/// # Examples
///
/// ```
/// use std::io;
///
/// fn main() -> io::Result<()> {
///     let mut reader: &[u8] = b"hello";
///     let mut writer: Vec<u8> = vec![];
///
///     io::copy(&mut reader, &mut writer)?;
///
///     assert_eq!(&b"hello"[..], &writer[..]);
///     Ok(())
/// }
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: Read,
    W: Write,
{
    cfg_if::cfg_if! {
        if #[cfg(any(target_os = "linux", target_os = "android"))] {
            crate::sys::kernel_copy::copy_spec(reader, writer)
        } else {
            generic_copy(reader, writer)
        }
    }
}

/// The general read-write-loop implementation of
/// `io::copy` that is used when specializations are not available or not applicable.
pub(crate) fn generic_copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: Read,
    W: Write,
{
    let mut buf = MaybeUninit::<[u8; super::DEFAULT_BUF_SIZE]>::uninit();
    // FIXME: #42788
    //
    //   - This creates a (mut) reference to a slice of
    //     _uninitialized_ integers, which is **undefined behavior**
    //
    //   - Only the standard library gets to soundly "ignore" this,
    //     based on its privileged knowledge of unstable rustc
    //     internals;
    unsafe {
        reader.initializer().initialize(buf.assume_init_mut());
    }

    let mut written = 0;
    loop {
        let len = match reader.read(unsafe { buf.assume_init_mut() }) {
            Ok(0) => return Ok(written),
            Ok(len) => len,
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(unsafe { &buf.assume_init_ref()[..len] })?;
        written += len as u64;
    }
}
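
For readers outside the standard library, the same read-write loop can be sketched on stable Rust with a zero-initialized buffer instead of `MaybeUninit` and the unstable `initializer()` API. This is only an illustration: the name `copy_loop` and the `BUF_SIZE` value are assumptions made for this example and are not part of the diff.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Buffer size assumed for this sketch; it is not taken from the diff.
const BUF_SIZE: usize = 8 * 1024;

/// Read-write loop in the spirit of `generic_copy`, using an initialized
/// stack buffer so that no `unsafe` is needed.
fn copy_loop<R: Read + ?Sized, W: Write + ?Sized>(
    reader: &mut R,
    writer: &mut W,
) -> io::Result<u64> {
    let mut buf = [0u8; BUF_SIZE];
    let mut written = 0u64;
    loop {
        let len = match reader.read(&mut buf) {
            // A zero-length read signals EOF.
            Ok(0) => return Ok(written),
            Ok(len) => len,
            // Interrupted reads are retried rather than reported.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..len])?;
        written += len as u64;
    }
}

fn main() -> io::Result<()> {
    let mut reader: &[u8] = b"hello";
    let mut writer: Vec<u8> = Vec::new();
    let n = copy_loop(&mut reader, &mut writer)?;
    assert_eq!(n, 5);
    assert_eq!(writer.as_slice(), &b"hello"[..]);
    Ok(())
}
```

The trade-off is the cost of zeroing the buffer up front, which the `MaybeUninit` version in the diff avoids.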
5 changes: 4 additions & 1 deletion library/std/src/io/mod.rs
@@ -266,6 +266,8 @@ pub use self::buffered::IntoInnerError;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::buffered::{BufReader, BufWriter, LineWriter};
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::copy::copy;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::cursor::Cursor;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::error::{Error, ErrorKind, Result};
@@ -279,11 +281,12 @@ pub use self::stdio::{_eprint, _print};
#[doc(no_inline, hidden)]
pub use self::stdio::{set_panic, set_print, LocalOutput};
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::util::{copy, empty, repeat, sink, Empty, Repeat, Sink};
pub use self::util::{empty, repeat, sink, Empty, Repeat, Sink};

pub(crate) use self::stdio::clone_io;

mod buffered;
pub(crate) mod copy;
mod cursor;
mod error;
mod impls;
8 changes: 8 additions & 0 deletions library/std/src/io/stdio.rs
@@ -409,6 +409,14 @@ impl Read for Stdin {
}
}

// only used by platform-dependent io::copy specializations, i.e. unused on some platforms
#[cfg(any(target_os = "linux", target_os = "android"))]
impl StdinLock<'_> {
    pub(crate) fn as_mut_buf(&mut self) -> &mut BufReader<impl Read> {
        &mut self.inner
    }
}

#[stable(feature = "rust1", since = "1.0.0")]
impl Read for StdinLock<'_> {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
2 changes: 1 addition & 1 deletion library/std/src/io/tests.rs
@@ -1,7 +1,7 @@
use super::{repeat, Cursor, SeekFrom};
use crate::cmp::{self, min};
use crate::io::prelude::*;
use crate::io::{self, IoSlice, IoSliceMut};
use crate::io::{BufRead, Read, Seek, Write};
use crate::ops::Deref;

#[test]
73 changes: 1 addition & 72 deletions library/std/src/io/util.rs
@@ -4,78 +4,7 @@
mod tests;

use crate::fmt;
use crate::io::{self, BufRead, ErrorKind, Initializer, IoSlice, IoSliceMut, Read, Write};
use crate::mem::MaybeUninit;

/// Copies the entire contents of a reader into a writer.
///
/// This function will continuously read data from `reader` and then
/// write it into `writer` in a streaming fashion until `reader`
/// returns EOF.
///
/// On success, the total number of bytes that were copied from
/// `reader` to `writer` is returned.
///
/// If you’re wanting to copy the contents of one file to another and you’re
/// working with filesystem paths, see the [`fs::copy`] function.
///
/// [`fs::copy`]: crate::fs::copy
///
/// # Errors
///
/// This function will return an error immediately if any call to [`read`] or
/// [`write`] returns an error. All instances of [`ErrorKind::Interrupted`] are
/// handled by this function and the underlying operation is retried.
///
/// [`read`]: Read::read
/// [`write`]: Write::write
///
/// # Examples
///
/// ```
/// use std::io;
///
/// fn main() -> io::Result<()> {
/// let mut reader: &[u8] = b"hello";
/// let mut writer: Vec<u8> = vec![];
///
/// io::copy(&mut reader, &mut writer)?;
///
/// assert_eq!(&b"hello"[..], &writer[..]);
/// Ok(())
/// }
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
R: Read,
W: Write,
{
let mut buf = MaybeUninit::<[u8; super::DEFAULT_BUF_SIZE]>::uninit();
// FIXME: #42788
//
// - This creates a (mut) reference to a slice of
// _uninitialized_ integers, which is **undefined behavior**
//
// - Only the standard library gets to soundly "ignore" this,
// based on its privileged knowledge of unstable rustc
// internals;
unsafe {
reader.initializer().initialize(buf.assume_init_mut());
}

let mut written = 0;
loop {
let len = match reader.read(unsafe { buf.assume_init_mut() }) {
Ok(0) => return Ok(written),
Ok(len) => len,
Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
Err(e) => return Err(e),
};
writer.write_all(unsafe { &buf.assume_init_ref()[..len] })?;
written += len as u64;
}
}
use crate::io::{self, BufRead, Initializer, IoSlice, IoSliceMut, Read, Write};

/// A reader which is always at EOF.
///
1 change: 1 addition & 0 deletions library/std/src/lib.rs
@@ -317,6 +317,7 @@
#![feature(toowned_clone_into)]
#![feature(total_cmp)]
#![feature(trace_macros)]
#![feature(try_blocks)]
#![feature(try_reserve)]
#![feature(unboxed_closures)]
#![feature(unsafe_block_in_unsafe_fn)]
85 changes: 8 additions & 77 deletions library/std/src/sys/unix/fs.rs
@@ -1204,88 +1204,19 @@ pub fn copy(from: &Path, to: &Path) -> io::Result<u64> {

#[cfg(any(target_os = "linux", target_os = "android"))]
pub fn copy(from: &Path, to: &Path) -> io::Result<u64> {
use crate::cmp;
use crate::sync::atomic::{AtomicBool, Ordering};

// Kernel prior to 4.5 don't have copy_file_range
// We store the availability in a global to avoid unnecessary syscalls
static HAS_COPY_FILE_RANGE: AtomicBool = AtomicBool::new(true);

unsafe fn copy_file_range(
fd_in: libc::c_int,
off_in: *mut libc::loff_t,
fd_out: libc::c_int,
off_out: *mut libc::loff_t,
len: libc::size_t,
flags: libc::c_uint,
) -> libc::c_long {
libc::syscall(libc::SYS_copy_file_range, fd_in, off_in, fd_out, off_out, len, flags)
}

let (mut reader, reader_metadata) = open_from(from)?;
let max_len = u64::MAX;
let (mut writer, _) = open_to_and_set_permissions(to, reader_metadata)?;

let has_copy_file_range = HAS_COPY_FILE_RANGE.load(Ordering::Relaxed);
let mut written = 0u64;
while written < max_len {
let copy_result = if has_copy_file_range {
let bytes_to_copy = cmp::min(max_len - written, usize::MAX as u64) as usize;
let copy_result = unsafe {
// We actually don't have to adjust the offsets,
// because copy_file_range adjusts the file offset automatically
cvt(copy_file_range(
reader.as_raw_fd(),
ptr::null_mut(),
writer.as_raw_fd(),
ptr::null_mut(),
bytes_to_copy,
0,
))
};
if let Err(ref copy_err) = copy_result {
match copy_err.raw_os_error() {
Some(libc::ENOSYS | libc::EPERM | libc::EOPNOTSUPP) => {
HAS_COPY_FILE_RANGE.store(false, Ordering::Relaxed);
}
_ => {}
}
}
copy_result
} else {
Err(io::Error::from_raw_os_error(libc::ENOSYS))
};
match copy_result {
Ok(0) if written == 0 => {
// fallback to work around several kernel bugs where copy_file_range will fail to
// copy any bytes and return 0 instead of an error if
// - reading virtual files from the proc filesystem which appear to have 0 size
// but are not empty. noted in coreutils to affect kernels at least up to 5.6.19.
// - copying from an overlay filesystem in docker. reported to occur on fedora 32.
return io::copy(&mut reader, &mut writer);
}
Ok(0) => return Ok(written), // reached EOF
Ok(ret) => written += ret as u64,
Err(err) => {
match err.raw_os_error() {
Some(
libc::ENOSYS | libc::EXDEV | libc::EINVAL | libc::EPERM | libc::EOPNOTSUPP,
) => {
// Try fallback io::copy if either:
// - Kernel version is < 4.5 (ENOSYS)
// - Files are mounted on different fs (EXDEV)
// - copy_file_range is broken in various ways on RHEL/CentOS 7 (EOPNOTSUPP)
// - copy_file_range is disallowed, for example by seccomp (EPERM)
// - copy_file_range cannot be used with pipes or device nodes (EINVAL)
assert_eq!(written, 0);
return io::copy(&mut reader, &mut writer);
}
_ => return Err(err),
}
}
}
use super::kernel_copy::{copy_regular_files, CopyResult};

match copy_regular_files(reader.as_raw_fd(), writer.as_raw_fd(), max_len) {
CopyResult::Ended(result) => result,
CopyResult::Fallback(written) => match io::copy::generic_copy(&mut reader, &mut writer) {
Ok(bytes) => Ok(bytes + written),
Err(e) => Err(e),
},
}
Ok(written)
}
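
The new `match` hands control back to the generic loop when the fast path gives up, and adds the bytes already written to the final count. The sketch below imitates that control flow in a self-contained way. The `CopyResult` shape is inferred from how it is matched above (its real definition lives in `kernel_copy`, which is not shown here), and `fast_path` is a hypothetical stand-in that uses no kernel facilities: it simply stops after `fast_limit` bytes to force the fallback.

```rust
use std::io::{self, Read, Write};

/// Assumed shape, inferred from the `match` in the diff above.
enum CopyResult {
    /// The fast path finished (successfully or with an error).
    Ended(io::Result<u64>),
    /// The fast path gave up after writing this many bytes.
    Fallback(u64),
}

/// Hypothetical stand-in for a kernel-assisted copy: moves at most
/// `fast_limit` bytes, then asks the caller to fall back.
fn fast_path<R: Read, W: Write>(reader: &mut R, writer: &mut W, fast_limit: u64) -> CopyResult {
    match io::copy(&mut reader.by_ref().take(fast_limit), writer) {
        // Fewer bytes than the limit means the reader hit EOF.
        Ok(n) if n < fast_limit => CopyResult::Ended(Ok(n)),
        Ok(n) => CopyResult::Fallback(n),
        Err(e) => CopyResult::Ended(Err(e)),
    }
}

/// Mirrors the structure of the new `fs::copy` body: use the fast path,
/// and on fallback add the bytes it already wrote to the total.
fn copy_with_fallback<R: Read, W: Write>(
    reader: &mut R,
    writer: &mut W,
    fast_limit: u64,
) -> io::Result<u64> {
    match fast_path(reader, writer, fast_limit) {
        CopyResult::Ended(result) => result,
        CopyResult::Fallback(written) => {
            io::copy(reader, writer).map(|bytes| bytes + written)
        }
    }
}

fn main() -> io::Result<()> {
    let mut reader: &[u8] = b"0123456789";
    let mut writer: Vec<u8> = Vec::new();
    let total = copy_with_fallback(&mut reader, &mut writer, 4)?;
    assert_eq!(total, 10);
    assert_eq!(writer.as_slice(), &b"0123456789"[..]);
    Ok(())
}
```

Carrying the partial count in `Fallback` is what lets the caller report the full number of bytes even when two different mechanisms each copied part of the data.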

#[cfg(any(target_os = "macos", target_os = "ios"))]