test_socket.SendfileUsingSendfileTest fails on illumos #86403
I'm investigating some test failures related to sendfile on illumos:
testCount (test.test_socket.SendfileUsingSendfileTest) ... FAIL
======================================================================
Traceback (most recent call last):
File "/root/cpython/Lib/socket.py", line 386, in _sendfile_use_sendfile
sent = os_sendfile(sockno, fileno, offset, blocksize)
OSError: [Errno 22] Invalid argument
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/cpython/Lib/test/test_socket.py", line 374, in _tearDown
raise exc
File "/root/cpython/Lib/test/test_socket.py", line 392, in clientRun
test_func()
File "/root/cpython/Lib/test/test_socket.py", line 6057, in _testOffset
sent = meth(file, offset=5000)
File "/root/cpython/Lib/socket.py", line 399, in _sendfile_use_sendfile
raise _GiveupOnSendfile(err)
socket._GiveupOnSendfile: [Errno 22] Invalid argument
======================================================================
Traceback (most recent call last):
File "/root/cpython/Lib/test/test_socket.py", line 6085, in testCount
self.assertEqual(len(data), count)
AssertionError: 5405948743 != 5000007
======================================================================
Traceback (most recent call last):
File "/root/cpython/Lib/test/test_socket.py", line 6159, in testWithTimeout
self.assertEqual(len(data), self.FILESIZE)
AssertionError: 429006848 != 10485760
Ran 11 tests in 33.603s
FAILED (failures=2, errors=1)
Looking at the testCount case, I could observe repeated calls to sendfile() with out_fd=7, in_fd=6, off=0, len=5000007, each of which returned -1 with errno set to EAGAIN.
I thought I'd give it a shot and I believe I found the issue. Let's use the testCount test as an example. The client side (or the data sending side) looks like this:
def _testCount(self):
    address = self.serv.getsockname()
    file = open(os_helper.TESTFN, 'rb')
    sock = socket.create_connection(address,
                                    timeout=support.LOOPBACK_TIMEOUT)
    with sock, file:
        count = 5000007
        meth = self.meth_from_sock(sock)
        sent = meth(file, count=count)
        self.assertEqual(sent, count)
        self.assertEqual(file.tell(), count)
So we're sending 5000007 bytes of data at once to a socket that has a timeout of 5.0 seconds set (the default in those tests). Somewhere along the way socket._sendfile_use_sendfile() is called, and in it there's a loop:
On my test VM running openindiana 5.11 (I think that's the version number?) this is basically an infinite loop (I think it would end at some point, but I didn't have the patience to verify this). That's because trying to send 5000007 bytes to that socket with a 5-second timeout triggers BlockingIOError. Why? This is the relevant part of the os.sendfile() implementation from posixmodule.c:
offset is 0 in our case, but its exact value doesn't matter. The trouble is that this is what the illumos sendfile man page[1] says:
RETURN VALUES
After some input from Jakub Kulik I believe this behavior is unique to illumos (Linux, Oracle Solaris, macOS, FreeBSD and likely all or almost all other OSes don't do this), and Python doesn't handle it. I tested this, and indeed the sendfile(out_fd, in_fd, &offset, count) call *does* do partial writes in our case. The partial write is reported as EAGAIN, which is converted to BlockingIOError, which makes socket._sendfile_use_sendfile() retry again and again. Each retry resends the very beginning of the data without advancing any further, so a lot of garbage accumulates on the receiving socket's side.
This patch works for me, and I ran the whole test_socket test suite with it with no more failures:
diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c
+    off_t original_offset = offset;
     do {
         Py_BEGIN_ALLOW_THREADS
         ret = sendfile(out_fd, in_fd, &offset, count);
         Py_END_ALLOW_THREADS
     } while (ret < 0 && errno == EINTR && !(async_err = PyErr_CheckSignals()));
-    if (ret < 0)
-        return (!async_err) ? posix_error() : NULL;
+    if (ret < 0) {
+        if (offset != original_offset) {
+            ret = offset - original_offset;
+        }
+        else {
+            return (!async_err) ? posix_error() : NULL;
+        }
+    }
     return Py_BuildValue("n", ret);
 #endif
 }
If it's verified to be a good change I'm happy to resubmit it in a PR with appropriate illumos-specific #ifndefs.
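The failure mode and the effect of the patch can be modeled in pure Python without an illumos machine. The following is a minimal sketch, not the CPython code: illumos_style_sendfile, FakeSocket, send_all, CHUNK and max_retries are hypothetical names made up for illustration, and the retry loop is only a simplified stand-in for the one in socket._sendfile_use_sendfile().

```python
import errno

DATA = bytes(range(256)) * 40   # 10240 bytes standing in for the test file
CHUNK = 4096                    # assumed amount the socket accepts per call


class FakeSocket:
    """Collects everything 'sent' so the receiving side can be inspected."""
    def __init__(self):
        self.received = bytearray()


def illumos_style_sendfile(sock, data, offset, count):
    """Model of the behavior described in the thread: a partial write that
    is still reported as a failure. Returns (ret, new_offset, err)."""
    end = min(offset + count, len(data))
    chunk = data[offset:end][:CHUNK]
    sock.received += chunk
    new_offset = offset + len(chunk)
    if new_offset < end:
        # Bytes were written and the offset advanced, yet the call fails.
        return -1, new_offset, errno.EAGAIN
    return len(chunk), new_offset, 0


def os_sendfile_unpatched(sock, data, offset, count):
    """Mirrors the old logic: any negative return becomes an exception."""
    ret, _new_offset, err = illumos_style_sendfile(sock, data, offset, count)
    if ret < 0:
        raise OSError(err, "simulated failure")  # EAGAIN -> BlockingIOError
    return ret


def os_sendfile_patched(sock, data, offset, count):
    """Mirrors the patch: if the offset moved, report the partial write."""
    ret, new_offset, err = illumos_style_sendfile(sock, data, offset, count)
    if ret < 0:
        if new_offset != offset:
            return new_offset - offset
        raise OSError(err, "simulated failure")
    return ret


def send_all(sendfile, sock, data, count, max_retries=5):
    """Simplified retry loop; max_retries only keeps the demo finite."""
    offset = total = retries = 0
    while total < count:
        try:
            sent = sendfile(sock, data, offset, count - total)
        except BlockingIOError:
            retries += 1
            if retries > max_retries:
                break
            continue  # retry from the same offset, as the caller saw no progress
        if sent == 0:
            break
        offset += sent
        total += sent
    return total


if __name__ == "__main__":
    for fn in (os_sendfile_unpatched, os_sendfile_patched):
        sock = FakeSocket()
        total = send_all(fn, sock, DATA, 10000)
        print(fn.__name__, "reported", total, "received", len(sock.received))
```

In this model the unpatched path reports zero bytes sent while the receiver keeps accumulating copies of the first chunk, which matches the oversized len(data) values in the test failures. The patched path delivers exactly the requested bytes. On a platform where sendfile() never writes and fails in the same call, the offset comparison is never true and the patched wrapper behaves like the unpatched one.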
Excellent analysis, that's it! I've also tested your patch on SmartOS and it works great.
I did some further digging, and this is indeed not possible on Oracle Solaris (that is, sendfile() cannot write bytes and fail during the same call). We considered this a bug and fixed it several years ago. The manual page doesn't mention that either: I am not sure whether illumos will want to change this as well (probably not, since the man page documents it as expected behavior), but either way, the proposed change doesn't harm Oracle Solaris because 'if (offset != original_offset)' can never be true.
Thank you! I submitted a PR with a slightly modified patch (the comparison only happens on the Solaris family of systems). I'd appreciate your confirmation that it still works (it's working for me on openindiana).