sendfile syscall #60689

Closed
arobase-che opened this issue May 10, 2019 · 10 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@arobase-che

Should we implement the sendfile syscall for TcpStream and/or UdpSocket?

Even if it's not totally portable (Linux and BSD have sendfile, Windows uses TransmitFile), it may be a convenient way to send a file efficiently.

@hellow554
Contributor

hellow554 commented May 10, 2019

What about macOS? At the very least, all tier 1 targets should be supported.
Also, I'm not really convinced why this should be in std and not in its own crate. Maybe you can elaborate further and convince me.

@rustbot modify labels: C-enhancement T-libs

@rustbot rustbot added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 10, 2019
@jonas-schievink jonas-schievink added C-feature-request Category: A feature request, i.e: not implemented / a PR. and removed C-enhancement Category: An issue proposing an enhancement or a PR with one. labels May 10, 2019
@arobase-che
Author

It should work on macOS too, but I have never used macOS. See the macOS sendfile documentation.


I'm not sure it should be in std. It's a syscall, so a very low-level feature, like send, accept, bind, recvfrom, ... which are already in std::net. Some languages add it to their standard library (Python). Go uses it internally in its "std" but doesn't provide a way to use it with a socket. Other languages, such as Haskell, use an external library. Some simply don't provide it (D's sockets).

I know I could be more convincing, but I'm not sure myself! In fact, that is why I'm asking.

Adding it to std seems like the right way to me. But maybe that's because I don't know how it could be implemented from an external crate.

@abonander
Contributor

This could be an internal specialization for io::copy(), and/or maybe exposed in std::os::unix::io as

pub fn send_file<R: AsRawFd, W: AsRawFd>(r: &mut R, w: &mut W) -> io::Result<u64> {...}

Windows actually has a painfully close equivalent, but it specifically copies from a file handle to a socket (which are separate object types in the Windows API), whereas sendfile() works for any pair of file descriptors. So it couldn't be the same interface unless separate traits were used for the read side and the write side, with one implemented only by File and the other only by the net stream types.

https://docs.microsoft.com/en-us/windows/desktop/api/mswsock/nf-mswsock-transmitfile
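
The "separate traits" idea above could look roughly like the following. This is only a sketch under stated assumptions: `SendFileSource`, `SendFileSink` and this `send_file` are hypothetical names that do not exist in std, and the body simply falls back to `io::copy` where a real implementation would call the platform primitive.

```rust
use std::fs::File;
use std::io;
use std::net::TcpStream;

/// Hypothetical marker trait: types that may act as the source of a transfer.
pub trait SendFileSource: io::Read {}

/// Hypothetical marker trait: types that may act as the destination.
pub trait SendFileSink: io::Write {}

// Only files are sources and only TCP streams are sinks, which is the one
// combination that both sendfile() and TransmitFile can handle.
impl SendFileSource for File {}
impl SendFileSink for TcpStream {}

/// Copies everything remaining in `r` to `w` and returns the byte count.
///
/// This sketch just delegates to `io::copy`; a real implementation would
/// dispatch to the platform call here instead.
pub fn send_file<R: SendFileSource, W: SendFileSink>(r: &mut R, w: &mut W) -> io::Result<u64> {
    io::copy(r, w)
}
```

With these bounds, `send_file(&mut file, &mut stream)` compiles, while passing two files or two sockets is rejected at compile time.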

@abonander
Contributor

abonander commented May 10, 2019

@arobase-che sendfile() is actually already available in the libc crate, but those are the unsafe extern bindings to the actual function in sys/sendfile.h.

It's of course completely possible to build a safe wrapper around it and it's actually pretty trivial:

extern crate libc;
use std::os::unix::io::AsRawFd;
use std::{io, ptr};


// Mutability is not technically required, but it fits API conventions.
pub fn send_file<R: AsRawFd, W: AsRawFd>(r: &mut R, w: &mut W) -> io::Result<usize> {
    // libc::sendfile takes the output fd first, then the input fd.
    // A null offset pointer makes sendfile read from the input's current file
    // position and advance it, which is what we want; a non-null pointer would
    // leave the file position untouched and update the pointed-to offset instead.
    // The last argument is the maximum number of bytes to copy. Linux transfers
    // at most 0x7ffff000 bytes per call regardless of arch, so cap it there.
    match unsafe { libc::sendfile(w.as_raw_fd(), r.as_raw_fd(), ptr::null_mut(), 0x7fff_f000) } {
        -1 => Err(io::Error::last_os_error()),
        copied => Ok(copied as usize),
    }
}

@abonander
Contributor

The libc wrapper function for reference: http://man7.org/linux/man-pages/man2/sendfile.2.html
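
Since a single sendfile() call may transfer fewer bytes than requested, a caller that wants the whole file sent usually needs a loop. The following is a minimal sketch, not a definitive implementation. It assumes 64-bit Linux (where `off_t` is `i64`), and `send_file_all` and `MAX_CHUNK` are hypothetical names. To stay dependency-free it declares the C function by hand; with the libc crate you would call `libc::sendfile` instead.

```rust
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::ptr;

// Assumption: 64-bit Linux, where off_t is i64 and ssize_t is isize.
// With the libc crate, use libc::sendfile rather than this declaration.
unsafe extern "C" {
    fn sendfile(out_fd: c_int, in_fd: c_int, offset: *mut i64, count: usize) -> isize;
}

/// Largest byte count Linux transfers in a single sendfile() call.
const MAX_CHUNK: usize = 0x7fff_f000;

/// Hypothetical helper: keeps calling sendfile() until the input reaches EOF
/// and returns the total number of bytes sent.
pub fn send_file_all<R: AsRawFd, W: AsRawFd>(r: &mut R, w: &mut W) -> io::Result<u64> {
    let mut total = 0u64;
    loop {
        // Null offset: read from, and advance, the input's own file position.
        let n = unsafe { sendfile(w.as_raw_fd(), r.as_raw_fd(), ptr::null_mut(), MAX_CHUNK) };
        match n {
            -1 => {
                let err = io::Error::last_os_error();
                // Retry if a signal interrupted the call.
                if err.kind() == io::ErrorKind::Interrupted {
                    continue;
                }
                return Err(err);
            }
            0 => return Ok(total), // nothing left to send
            n => total += n as u64,
        }
    }
}
```

The sketch assumes a blocking output descriptor; on a non-blocking socket the `WouldBlock` error is simply returned to the caller.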

@the8472
Member

the8472 commented May 11, 2019

> This could be an internal specialization for io::copy()

That would be quite a rabbit hole. There are lots of special copy syscalls on Linux.

splice: any -> pipe; pipe -> any
vmsplice: &'static [u8] -> pipe
sendfile: mmapable fd -> socket
copy_file_range: originally just regular -> regular on same file system, but support is expanding
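
To make the "rabbit hole" concrete, here is a rough sketch of the dispatch a specialized copy would need, based only on the list above. `FdKind`, `CopyStrategy` and `pick_strategy` are hypothetical names. The sketch simplifies on purpose: it treats "mmapable fd" as "regular file", ignores the same-filesystem condition for copy_file_range, and leaves out vmsplice.

```rust
/// Hypothetical classification of a file descriptor.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FdKind {
    Regular,
    Pipe,
    Socket,
    Other,
}

/// Hypothetical set of ways to move bytes between two descriptors.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CopyStrategy {
    CopyFileRange,
    Sendfile,
    Splice,
    ReadWriteLoop,
}

/// Picks a strategy for copying from `src` to `dst`.
pub fn pick_strategy(src: FdKind, dst: FdKind) -> CopyStrategy {
    match (src, dst) {
        // copy_file_range: regular file -> regular file
        (FdKind::Regular, FdKind::Regular) => CopyStrategy::CopyFileRange,
        // sendfile: regular (mmapable) file -> socket
        (FdKind::Regular, FdKind::Socket) => CopyStrategy::Sendfile,
        // splice: at least one side must be a pipe
        (FdKind::Pipe, _) | (_, FdKind::Pipe) => CopyStrategy::Splice,
        // everything else falls back to a plain read/write loop
        _ => CopyStrategy::ReadWriteLoop,
    }
}
```

Even this reduced table has four outcomes, and each real syscall adds its own error cases and fallbacks on top.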

@steveklabnik
Member

Triage: I don't think this was ever implemented.

bors added a commit to rust-lang-ci/rust that referenced this issue Nov 14, 2020
specialize io::copy to use copy_file_range, splice or sendfile

Fixes rust-lang#74426.
Also covers rust-lang#60689 but only as an optimization instead of an official API.

The specialization only covers std-owned structs so it should avoid the problems with rust-lang#71091

Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar.

There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.

The test case executes the following:

```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4)        = 4
[pid 103776] write(4, "iklmn", 5)       = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5

```

- 0-1 `stat` calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted.
- 𝖬 `write` calls to drain the `BufReader`/`BufWriter` wrappers. These only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer, it may absorb the read buffer contents first and so result in only a single write. Vectored writes would also be an option, but that would require more invasive changes to `BufWriter`.
- 𝖭 `copy_file_range`/`splice`/`sendfile` calls until file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.
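
From the caller's side, the shapes described above (a std-owned `File`, optionally wrapped in `BufReader` and `Take`) are ordinary `io::copy` calls. A minimal sketch follows; `copy_prefix` is a hypothetical helper name. Whether a fast path is actually taken depends on the std version and platform, but the result is the same either way.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::path::Path;

/// Hypothetical helper: copies at most `limit` bytes from `src` to `dst`
/// and returns the number of bytes copied.
pub fn copy_prefix(src: &Path, dst: &Path, limit: u64) -> io::Result<u64> {
    // BufReader and Take are both std-owned wrappers around the File.
    let reader = BufReader::new(File::open(src)?);
    let mut limited = reader.take(limit);
    let mut out = File::create(dst)?;
    // io::copy only sees std types here, so std is free to pick a
    // specialized path or fall back to its generic read/write loop.
    io::copy(&mut limited, &mut out)
}
```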

## Benchmarks

```

OLD

test io::tests::bench_file_to_file_copy         ... bench:      21,002 ns/iter (+/- 750) = 6240 MB/s    [ext4]
test io::tests::bench_file_to_file_copy         ... bench:      35,704 ns/iter (+/- 1,108) = 3671 MB/s  [btrfs]
test io::tests::bench_file_to_socket_copy       ... bench:      57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy   ... bench:     142,640 ns/iter (+/- 77,851) = 918 MB/s

NEW

test io::tests::bench_file_to_file_copy         ... bench:      14,745 ns/iter (+/- 519) = 8889 MB/s    [ext4]
test io::tests::bench_file_to_file_copy         ... bench:       6,128 ns/iter (+/- 227) = 21389 MB/s   [btrfs]
test io::tests::bench_file_to_socket_copy       ... bench:      13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy   ... bench:      26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
@the8472
Member

the8472 commented Oct 12, 2021

There's a partial implementation via #75272: you can sendfile to a socket by using io::copy. It's currently Linux-specific and would have to be generalized to work on other platforms.

@the8472
Member

the8472 commented Jul 10, 2023

> There's a partial implementation via #75272, you can sendfile to a socket by using io::copy. It's currently linux-specific and would have to be generalized to work on other platforms.

The sendfile optimization for copying to a socket had to be removed in #108283, since sendfile violates ordering guarantees implicit in the Read/Write traits.

So this will have to be implemented as a separate API which makes fewer promises.

@arobase-che
Author

The situation is pretty clear now. I think we can close the issue.
Thank you everybody.

7 participants