Add splice() to the os module #85791
Comments
The splice system call moves data between two file descriptors without copying between kernel address space and user address space. This can be a very useful addition for libraries implementing low-level file management. |
I don't recall the subtle differences between sendfile() and splice(). I recall that in early Linux versions, one was limited to sockets, and only on one side. But later, it became possible to pass two sockets, or one file on disk and one socket, etc. Python exposes sendfile() as os.sendfile() since Python 3.3: |
Basically, splice() is specialized for pipes: splice() only works if one of the file descriptors refers to a pipe. So you can use it for, e.g., socket-to-pipe or pipe-to-file transfers without copying the data into userspace. But you can't do file-to-file copies with it. sendfile() only works if the source file descriptor refers to something that can be mmap()ed (i.e. mostly normal files), and before Linux 2.6.33 the destination must be a socket. |
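To make the "one end must be a pipe" rule concrete, here is a minimal sketch of a file-to-file copy that routes the data through a pipe with two os.splice() calls. The helper names splice_or_copy and copy_file_via_pipe are hypothetical and not part of any patch in this thread; the sketch assumes os.splice(src, dst, count) as it appears in the tracebacks further down, and it falls back to plain read/write where the call is missing or refuses the descriptors.

```python
import os


def splice_or_copy(src_fd, dst_fd, count):
    """Move up to count bytes from src_fd to dst_fd and return the amount moved.

    Hypothetical helper: uses os.splice() where available, otherwise (or if
    the call fails before moving anything) a userspace read/write.
    """
    splice = getattr(os, "splice", None)  # Linux only, Python 3.10+
    if splice is not None:
        try:
            return splice(src_fd, dst_fd, count)
        except OSError:
            pass  # e.g. descriptor types the platform does not accept
    data = os.read(src_fd, count)
    written = 0
    while written < len(data):
        written += os.write(dst_fd, data[written:])
    return written


def copy_file_via_pipe(src_path, dst_path, chunk=4096):
    """Copy src_path to dst_path as file -> pipe -> file; return bytes copied."""
    read_fd, write_fd = os.pipe()
    total = 0
    try:
        with open(src_path, "rb", buffering=0) as src, \
                open(dst_path, "wb", buffering=0) as dst:
            while True:
                # First hop: regular file into the write end of the pipe.
                moved = splice_or_copy(src.fileno(), write_fd, chunk)
                if moved == 0:  # end of file
                    break
                # Second hop: drain exactly what was queued into the target file.
                remaining = moved
                while remaining:
                    remaining -= splice_or_copy(read_fd, dst.fileno(), remaining)
                total += moved
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total
```

The chunk size is kept small so that each hop fits in the pipe buffer even when the read/write fallback is used.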
The API of splice() looks complicated. How would you use it in Python? Are off_in and off_out adjusted as in copy_file_range() and sendfile()? It is not clear from the man page. If they are, how would you return updated values? Are you going to add vmsplice() and tee() too? Since it is a Linux-specific API, would it not be better to add a dedicated linux module? |
It's not uncommon that a syscall added to the Linux kernel is later added to other platforms. Example: getrandom() exists in Linux and Solaris. Example: memfd_create() was designed in Linux, and added later to FreeBSD: freebsd/freebsd-src@575e351 (see bpo-41013). |
OpenBSD uses a different API:
int sosplice(struct socket *so, int fd, off_t max, struct timeval *tv);
int somove(struct socket *so, int wait);
"The function sosplice() is used to splice together a source and a drain socket." "The function somove() transfers data from the source's receive buffer to the drain's send buffer." "Socket splicing can be invoked from userland via the setsockopt(2) system-call at the SOL_SOCKET level with the socket option SO_SPLICE." |
It has the same API as copy_file_range and other similar system calls that we already expose, so we just need to do the same thing we do there.
It behaves the same as in copy_file_range(), except that one of the offsets has to be None (the one associated with the pipe file descriptor). We don't return the updated values (nor do we in copy_file_range()).
We can certainly discuss adding vmsplice() and tee() (tee() is probably more interesting), but in my humble opinion that would be a different discussion. |
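As an illustration of the offset rule just described, the following sketch reads a slice of a regular file through a pipe, passing offset_src for the file and leaving the pipe side's offset as None. The function names are hypothetical; the offset_src keyword is the one visible in the test tracebacks later in this thread. Per the Linux man page, supplying an explicit input offset leaves the file's own position untouched, so the sketch also reports that position. Where os.splice() is unavailable or fails, it assumes os.pread() as a stand-in.

```python
import os


def splice_from_offset(src_fd, pipe_write_fd, count, offset):
    """Queue count bytes of src_fd, starting at offset, into a pipe.

    Hypothetical helper. Keep count within the pipe buffer size so the
    fallback write cannot block.
    """
    splice = getattr(os, "splice", None)  # Linux only, Python 3.10+
    if splice is not None:
        try:
            # The offset belongs to the regular file; the pipe end keeps None.
            return splice(src_fd, pipe_write_fd, count, offset_src=offset)
        except OSError:
            pass  # fall back below; nothing was moved
    data = os.pread(src_fd, count, offset)  # also leaves the position alone
    return os.write(pipe_write_fd, data)


def read_slice(path, offset, count):
    """Return (data, file_position_after) for a slice of path."""
    read_fd, write_fd = os.pipe()
    try:
        with open(path, "rb", buffering=0) as src:
            moved = splice_from_offset(src.fileno(), write_fd, count, offset)
            # The updated offset is not returned by the call, and the
            # descriptor's own position has not advanced.
            position = os.lseek(src.fileno(), 0, os.SEEK_CUR)
        data = os.read(read_fd, moved) if moved else b""
        return data, position
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

A caller that needs the new offset has to compute it as offset plus the returned byte count.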
The semantics are considerably different (splice() is about pipes while sosplice() talks about general sockets). Also, the point of splice() is to skip copying from kernel buffers, but the sosplice() documentation does not say that it avoids copying between userspace and kernel space. |
This is an interesting point, but I think that right now it would be more confusing for users than not (people normally go to the os module for system calls) and, as Victor mentions, we would need to update the os module anyway if some other operating system adds the system call later. |
Heads up: I plan to land this next week, in case someone wants to do a review or has something against it. |
Do you mean "Linux kernel >= 2.6.17 and glibc >= 2.5" ?
Maybe also add " .. versionadded:: 3.10" on these constants. |
My understanding is that glibc provides emulation for glibc >= 2.5. The section from the man page says:
Not sure how to interpret that. You want to change the "or" to "and"? |
I am reopening the issue. This change broke Python compilation on AIX: https://buildbot.python.org/all/#/builders/302/builds/377
configure: "checking for splice... yes"
"./Modules/posixmodule.c", line 15146.53: 1506-045 (S) Undeclared identifier SPLICE_F_MOVE.
make: 1254-004 The error code from the last command is 1.
The code:
/* constants for splice */
#ifdef HAVE_SPLICE
if (PyModule_AddIntConstant(m, "SPLICE_F_MOVE", SPLICE_F_MOVE)) return -1;
if (PyModule_AddIntConstant(m, "SPLICE_F_NONBLOCK", SPLICE_F_NONBLOCK)) return -1;
if (PyModule_AddIntConstant(m, "SPLICE_F_MORE", SPLICE_F_MORE)) return -1;
#endif |
There is no emulation. It's just a function which wraps the syscall: I understand that you need Linux kernel >= 2.6.17 *and* glibc >= 2.5. |
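Because the function and its flag constants only exist when the build detected a usable splice(), Python code that wants to use them can probe for them at runtime. The following is a small sketch of such a probe; splice_support is a hypothetical helper name, and the three constant names are the ones registered in the posixmodule.c excerpt above.

```python
import os

# Names taken from the PyModule_AddIntConstant() calls quoted in this thread.
SPLICE_FLAG_NAMES = ("SPLICE_F_MOVE", "SPLICE_F_NONBLOCK", "SPLICE_F_MORE")


def splice_support():
    """Report whether this interpreter exposes os.splice() and its flags.

    A flag maps to None when the constant is not defined in this build.
    """
    return {
        "splice": hasattr(os, "splice"),
        "flags": {name: getattr(os, name, None) for name in SPLICE_FLAG_NAMES},
    }


if __name__ == "__main__":
    print(splice_support())
```

On a platform such as AIX, where the function ended up not being exposed, the probe is expected to report False for "splice".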
Nice, AIX can build Python again. But now the 3 tests fail, since the tests use a pipe and a file, whereas on AIX it seems that splice() requires one end to be a socket. I wrote the attached PR 23354 to skip the 3 tests on AIX.
======================================================================
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_os.py", line 406, in test_splice
    i = os.splice(in_fd, write_fd, 5)
OSError: [Errno 57] Socket operation on non-socket
======================================================================
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_os.py", line 440, in test_splice_offset_in
    i = os.splice(in_fd, write_fd, bytes_to_copy, offset_src=in_skip)
OSError: [Errno 57] Socket operation on non-socket
======================================================================
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_os.py", line 479, in test_splice_offset_out
    i = os.splice(read_fd, out_fd, bytes_to_copy, offset_dst=out_seek)
OSError: [Errno 57] Socket operation on non-socket |
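One way to express "skip these tests on AIX" is a unittest skip decorator keyed on the platform. The sketch below is an assumption about how such a skip could look and is not the content of PR 23354; the decorator name requires_pipe_splice and the test class are hypothetical.

```python
import os
import sys
import tempfile
import unittest

# Hypothetical decorator: skip unless os.splice() exists and the platform is
# not AIX, where splice() rejects the pipe/file pair with a socket error.
requires_pipe_splice = unittest.skipUnless(
    hasattr(os, "splice") and not sys.platform.startswith("aix"),
    "needs an os.splice() that accepts a pipe and a regular file",
)


class SpliceTests(unittest.TestCase):
    @requires_pipe_splice
    def test_splice_file_to_pipe(self):
        data = b"hello splice"
        read_fd, write_fd = os.pipe()
        self.addCleanup(os.close, read_fd)
        self.addCleanup(os.close, write_fd)
        with tempfile.TemporaryFile() as f:
            f.write(data)
            f.flush()
            # Regular file into the pipe, starting at offset 0 of the file.
            moved = os.splice(f.fileno(), write_fd, len(data), offset_src=0)
            self.assertEqual(os.read(read_fd, moved), data[:moved])
```

Running the module with python -m unittest reports the test as skipped, rather than failed, on platforms the decorator excludes.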
FYI I checked and AIX is fixed. All tests pass again on POWER6 AIX 3.x buildbot. |
Thanks a lot Victor |
This is still broken. Since this was included in master, the AIX buildbot has been failing to compile (https://buildbot.python.org/all/#/builders/438/builds/391 and https://buildbot.python.org/all/#/builders/302/builds/377). Strangely enough, the first bot continues to fail to compile at the same location, while the second bot (running in a different environment) started passing compile and all tests with https://buildbot.python.org/all/#/builders/302/builds/406. Note: bot 1 uses what I call (personal opinion) a mixed environment, with some libraries coming from OSS packages and some from IBM AIX; bot 2 relies on IBM AIX libraries.
++++++ aixtools@gcc119:[/home/aixtools/cpython/cpython-master]make V=1
|
Is it possible that posixmodule.c lacks an #include to get the function on AIX? On AIX 7.1, man splice says:
#include <sys/types.h>
#include <sys/socket.h>
int splice(socket1, socket2, flags)
int socket1, socket2;
int flags;
posixmodule.c doesn't include it on AIX:
#if defined(__FreeBSD__) || defined(__DragonFly__) || defined(__APPLE__)
# ifdef HAVE_SYS_SOCKET_H
# include <sys/socket.h>
# endif
#endif
Michael: Would you mind trying to build the master branch of Python with the attached socket.patch (on the worker where Python no longer builds)? |
Started custom build of PR 23608 in https://buildbot.python.org/all/#/buildrequests/84365 |
Seems that adding #include <sys/socket.h> does not work so I am going to skip adding this function on AIX. If someone is interested in fixing it, they can remove the #ifdef and figure out what's going on with that buildbot |
I'm fine with not implementing the function on AIX for now. |
Sorry Victor - family matters - so I was not watching for several days. |