-
-
Notifications
You must be signed in to change notification settings - Fork 29.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add os.sendfile() #55091
Comments
Attached is a patch which implements os.sendfile for unix systems (linux, freebsd, apple, solaris, dragonfly). It takes the iov initialization code and off_t parsing from i10812. It encapsulates all the functionality from the various sendfiles which means a fair amount of #ifdefs but the basic case works for all of them. Tested on Linux & FreeBSD - it should work on solaris but since it needs to link with the sendfile library and I have no idea how to link the posix module with the sendfile library only on Solaris, i couldn't test it. If someone could please contribute this... I think it might be possible to get a Windows equivalent of this - i'll leave it for someone else to do ;-) |
Since the posix module is linked statically inside the interpreter, it probably needs some change in the base ldflags. I'll take a look when I have some time. |
Ok, I figured it out to link with sendfile on solaris. Here is the updated patch. |
Thanks for writing this. I would focus on trying to provide a unique interface across all platforms. Being sendfile() not a standard POSIX I think we should not worry about providing a strict one-to-one interface. "headers" and "trailers" arguments should be available everywhere as they are crucial in different protocols such as HTTP. The "offset" parameter should be available everywhere, Linux included (in your patch you dropped Linux support). It turns out the only peculiar argument is "flags", available only on *BSD. sendfile(out, in, count, offset=None, headers=None, trailers=None, flags=0) |
I'm not so sure about this anymore. Please ignore this. |
Just to be clear: There are 3 different interfaces. So it does provide a unique interface across all platforms while still providing the ability to access the native functionality. Preferably, I'd like to see a thin wrapper like this remain and then have a sendfile() method added to the socket object which takes a file-like object (not a file descriptor) and optional headers/trailers. This method can then figure out how best to do it depending on the platform. (i.e. using TCP_CORK if necessary, etc). It could even be made to work with file-like objects that cannot be mmap()ed. Why not put it straight in socket anyway? Well, some of the implementations allow sendfile() to have a normal fd as the output. Putting it in socket then would't make sense. |
I agree then, although I'm not sure having a function with a variable number of args depending on the platform is acceptable. |
I've just tried it against r87935 and it applies cleanly. Perhaps you didn't apply the patch correctly (it requires "-p1" since it was a Mercurial diff), try: With regards to the different arguments, I don't know if that's acceptable or not or if there is a better way. Since you can have mmap.mmap() with differing args between Windows & Unix, maybe it is acceptable. And, Python exposes differing functionality via the posix module since the available functions differs widely between platforms. |
The patch as-is did not work on Linux. |
We absolutely need to expose sendfile as-is. If we want to provide |
Oh sorry, that was because it changed configure.in so "autoreconf" needs to be run to regenerate configure & pyconfig.h.in. I thought that patches weren't meant to include the regenerated files. Especially since differences in the versions between autoconf 2.65 & 2.67 can produce hundreds of differences in "configure" making the patch large and obfuscating the actual changes. |
Correct. Not including them is perfectly fine. |
I notice Linux is described as not taking count=0 to mean to send until the end of "in" is reached. Is it possible to pass (size_t)-1 to this field if None is given, or do a loop on non-zero counts from sendfile to emulate this? I poked around the linux source for 2.6.32, and it appears sendfile() is emulated on top of splice(), but this doesn't provide undocumented count=0 handling as I was hoping. |
Please note that on FreeBSD things work a little bit differently for non-blocking sockets: In details I'm talking about:
...and the similar note about EBUSY, later in the page. ret = sendfile(in, out, offset, &sbytes, &sf, flags); ... if (ret == -1) {
if ((errno == EAGAIN) || (errno == EBUSY)) {
return Py_BuildValue("ll", sbytes, offset + sbytes);
}
return posix_error();
} |
Attached is a new sendfile patch which fixes the issue with FreeBSD (and Mac OS X & DragonFly BSD from what I can see). With regards to anacrolix's request, I think what Martin said in msg126049. i.e. if we want to provide a unifying layer on top of sendfile we can, but this should just expose sendfile() as is. |
Could you please also add support for offset argument on Linux? |
Attached is an updated patch that uses keyword arguments. Using an offset with Linux was always supported although I have cleaned up the documentation a bit to make that clearer. import os
import socket
with open("/tmp/test", "wb") as fp:
fp.write(b"testdata\n" * 1000000)
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(('localhost', 8010))
c = cli.fileno()
f = os.open("/tmp/test", os.O_RDONLY)
os.sendfile(c, f, 1, 7) # "estdata"
cli.send(b"\n\n")
os.sendfile(c, f, None, 4) # "test"
cli.send(b"\n\n")
os.sendfile(c, f, None, 4) # "data"
cli.send(b"\n\n") |
The latter part is incorrect as it is not guaranteed that all bytes specified in "count" argument are going to be sent and it also sounds like the function is blocking. Also, I'd be for using the BSD notation and rename the "count" argument to "nbytes", which is more clear. Docstring should be changed to reflect the keyword arguments:
Finally, tests for header and trailer arguments should be written. The rest of the patch looks ok to me. |
I have a few problems with these parts of the latest patch: + The second case may be used on Mac OS X and FreeBSD where *headers* Why special case these? Why can't Mac OS X and FreeBSD write those manually into the output file descriptor. It's presumptious but I don't see why something so easy to do explicitly is mashed into the interface here, just pretend it doesn't exist. For the sake of simplicity (and sendfile might become very popular in future code), just drop this "feature". for h in headers: out.write(h)
os.sendfile(out, in, offset, count)
for t in trailers: out.write(t) + On Mac OS X and FreeBSD, a value of 0 for *count* specifies to send until Again this should be emulated where it's not available, as it's very common, and a logical feature to have. However as indicated earlier, if os.sendfile is a minimal syscall wrapper, some thought needs to be given to a "higher-level" function, that also includes my other suggestions. + On Solaris, *out* may be the file descriptor of a regular file or the file I'm pretty sure that Solaris isn't the only platform that supports non-socket file descriptors here, Linux (the platform I'm using), is one such case. As a general rule, we want to allow any file descriptors, and not restrict to sockets (except for awful platforms like Windows). |
No, sendfile() on Linux supports only socket file descriptors: |
Thanks for catching that: Presently (Linux 2.6.9): in_fd, must correspond to a file which sup‐ Despite the fact the manpage hasn't changed since 2004, sendfile still only works for sockets. :( |
OK, updated documentation and tests.
These can be a crucial part of certain protocols such as HTTP to ensure that a minimal amount of TCP packets are sent. Also, the posix module is supposed to expose the OS functionality transparently. Besides, if you don't want to use headers/trailers, they can simply be left out. |
In case someone is interested in statistics, I wrote a sendfile() wrapper by using ctypes for pyftpdlib and benchmark results are quite impressive: |
Patch in attachment provides a complete test suite. + if (ret < 0) { The test suite shows that "trailer" argument does not work. |
For trailers to work, I think the line: Also not that tests like this: perhaps should also include solaris since it doesn't support headers/trailers either. |
You're right, that line is certainly wrong as it should compare for bytes but the trailer data still doesn't get appended. |
With no changes, I get: Traceback (most recent call last):
File "/usr/home/ross/py3k_sftest/Lib/test/test_os.py", line 1531, in test_trailers
self.assertEqual(data, "abcde12345")
AssertionError: b'abcde12345' != 'abcde12345' After changing, the tests work perfectly. Perhaps its the FreeBSD version? Here's the output from uname: Maybe its a bug on that platform, you could try building a sendfile program in C with trailer data. Or try a simple test in the interpreter like the one in msg127326 but with added trailer data. |
I was testing against FreeBSD 7.0 RC1 and I confirm the problem doesn't occur on 8.1 version. Patch in attachment adds test for "flags" argument and skips headers/trailers tests on linux and solaris. |
Please leave open until a patch is actually committed ;) |
On a second thought I have two complaints. There is no reason to return the file offset, also because this is only supported on Linux. On all other platforms we are calculating the file offset by making a sum of offset + number of bytes sent. Despite this would normally work it no longer makes sense when headers/trailers arguments are specified. >>> sendfile(in, out, offset, 4096, headers=[b"xxxx"])
(5000, 5000) In this case the right offset is supposed to be 4096. My second complaint is about headers/trailers data type. |
OK, I'm happy to not return the file offset. However, I still think that headers and trailers should remain as is since this matches the native interface very closely. |
Patch in attachment removes offset and return just the number of bytes sent as an integer. |
Yes I agree it can go in now. Unless someone wants to do some tests on more OS's like FreeBSD 7.2, Solaris, etc. (I've only checked on Linux 2.6, FreeBSD 8.1, OpenIndiana and OS X 10.5). |
I'm going to commit the patch and then watch whether some of the buildbots turn red. |
Committed in r88580. |
Three comments:
|
Thanks Georg. It seems we have a failure on Leopard: Also, I think I can add support for AIX if someone gives me SSH access over an AIX box. |
I think the issue is you request a 4096 transfer near the end of file, but there aren't that many bytes remaining. Apparently OS X then doesn't transfer anything and returns 0. Hint: >>> 10485760. / 4096
2560.0
>>> 10486272. / 4096
2560.125 |
OSX failure is tracked in bpo-11351. |
test_os fails on the Fedora bot (--without-threads): test test_os crashed -- Traceback (most recent call last):
File "./Lib/test/regrtest.py", line 1037, in runtest_inner
File "/home/buildbot/buildarea/3.x.krah-fedora/build/Lib/importlib/_bootstrap.py", line 437, in load_module
return self._load_module(fullname)
File "/home/buildbot/buildarea/3.x.krah-fedora/build/Lib/importlib/_bootstrap.py", line 141, in decorated
return fxn(self, module, *args, **kwargs)
File "/home/buildbot/buildarea/3.x.krah-fedora/build/Lib/importlib/_bootstrap.py", line 342, in _load_module
exec(code_object, module.__dict__)
File "/home/buildbot/buildarea/3.x.krah-fedora/build/Lib/test/test_os.py", line 1312, in <module>
class SendfileTestServer(asyncore.dispatcher, threading.Thread):
AttributeError: 'NoneType' object has no attribute 'Thread' |
New changeset c45e92bd4d81 by Giampaolo Rodola' in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: