New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Python 3, sock.makefile('rb').readline() doesn't handle blocking errors correctly #274

Closed
vstinner opened this Issue Dec 16, 2015 · 18 comments

Comments

Projects
None yet
4 participants
@vstinner
Contributor

vstinner commented Dec 16, 2015

On Python 3, socket.socket is monkey-patched to GreenSocket (of eventlet.greenio.base) object. When sock.makefile('rb') is called,

sock.makefile('rb') creates a SocketIO object (of socket). SocketIO.readline() calls GreenSocket.recv_into() until it gets enough data. It stops when recv_into() returns 0 byte... but also when it returns None.

GreenSocket.recv_into() starts with a trampoline to check if we have data to read, but sometimes this check is not enough: recv_into() raises a BlockingIOError. Sadly, GreenSocket.recv_into() only calls the trampoline once, it doesn't handle BlockingIOError.

In case of BlockingIOError, recv_into() is indirectly converted to None, and readline() interprets None as "end of line" so readline() returns an incomplete or empty line.

In the unit test of the OpenStack Glance project, an unit test fails because the HTTP client gets a BadStatusLine (of http.client). In fact, the underlying recv_into() raised BlockingIOError and the read is not retried.

I will propose a fix.

@temoto

This comment has been minimized.

Show comment
Hide comment
@temoto

temoto Dec 16, 2015

Member

Great, thanks. Please, see comments to 4th commit.

Also, about issue reference in commit text. With this format gh #274 the gh is separate word, not parsed by github or any software. Github parses strings like gh-274. Personally I think the best option is long URL, like https://github.com/eventlet/eventlet/issues/274, because it's clickable in any case.

Member

temoto commented Dec 16, 2015

Great, thanks. Please, see comments to 4th commit.

Also, about issue reference in commit text. With this format gh #274 the gh is separate word, not parsed by github or any software. Github parses strings like gh-274. Personally I think the best option is long URL, like https://github.com/eventlet/eventlet/issues/274, because it's clickable in any case.

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Dec 16, 2015

Contributor

Github parses strings like gh-274

Ok done.

Personally I think the best option is long URL, like #274, because it's clickable in any case.

I also added this link.

Contributor

vstinner commented Dec 16, 2015

Github parses strings like gh-274

Ok done.

Personally I think the best option is long URL, like #274, because it's clickable in any case.

I also added this link.

@temoto

This comment has been minimized.

Show comment
Hide comment
@temoto

temoto Dec 17, 2015

Member

LGTM now

Member

temoto commented Dec 17, 2015

LGTM now

jstasiak added a commit that referenced this issue Jan 7, 2016

gh-274: Handle blocking I/O errors in GreenSocket
Fix recv_into(), recvfrom(), recvfrom_into() and sendto() methods of
GreenSocket to handle blocking I/O errors (ex: BlockingIOError on
Python 3). Even if the trampoline was called, the socket method can
still fails with an I/O errors for various reasons (see manual pages
of the C functions for examples).

Ref: #274
@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 7, 2016

Contributor

#275 (just merged) fixes this I believe, please reopen if I'm wrong.

Contributor

jstasiak commented Jan 7, 2016

#275 (just merged) fixes this I believe, please reopen if I'm wrong.

@jstasiak jstasiak closed this Jan 7, 2016

jstasiak referenced this issue Jan 7, 2016

Revert "gh-274: Handle blocking I/O errors in GreenSocket"
The commit broke tests on Python 3[1]:

======================================================================
FAIL: test_recv_into_timeout (tests.greenio_test.TestGreenSocket)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/eventlet/eventlet/tests/greenio_test.py", line 192, in test_recv_into_timeout
    self.fail("socket.timeout not raised")
AssertionError: socket.timeout not raised

This reverts commit 19035b1.

[1] https://travis-ci.org/eventlet/eventlet/jobs/100720347
@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 7, 2016

Contributor

I completely missed the fact that this breaks tests on Python 3, just reverted, will reopen this issue until that's resolved.

Contributor

jstasiak commented Jan 7, 2016

I completely missed the fact that this breaks tests on Python 3, just reverted, will reopen this issue until that's resolved.

@jstasiak jstasiak reopened this Jan 7, 2016

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Jan 7, 2016

Contributor

I completely missed the fact that this breaks tests on Python 3, just reverted, will reopen this issue until that's resolved.

I was unable to run tests on Python 3 on my Fedora 23. I found the issue: tox.ini requires explicitly an old version of PyOpenSSL which doesn't work on Fedora 23. Remove the version from tox.ini allows me to run tests on Python 3.

So, I fixed the issue by exchanging the trampoline and the call the the read method in recv_loop(). I don't understand well eventlet, so I'm not sure that it's correct.

Contributor

vstinner commented Jan 7, 2016

I completely missed the fact that this breaks tests on Python 3, just reverted, will reopen this issue until that's resolved.

I was unable to run tests on Python 3 on my Fedora 23. I found the issue: tox.ini requires explicitly an old version of PyOpenSSL which doesn't work on Fedora 23. Remove the version from tox.ini allows me to run tests on Python 3.

So, I fixed the issue by exchanging the trampoline and the call the the read method in recv_loop(). I don't understand well eventlet, so I'm not sure that it's correct.

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Jan 7, 2016

Contributor

Oh... It looks like tests now pass on Python 3, but I broke them on Python 2... Any idea how to implement correctly the "recv loop" for GreenSocket?

Or I can write simpler patch with conditional code depending on the Python major version.

Contributor

vstinner commented Jan 7, 2016

Oh... It looks like tests now pass on Python 3, but I broke them on Python 2... Any idea how to implement correctly the "recv loop" for GreenSocket?

Or I can write simpler patch with conditional code depending on the Python major version.

@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 7, 2016

Contributor

Hah, that's annoying. Let me have a look at what that test does in the first place.

Contributor

jstasiak commented Jan 7, 2016

Hah, that's annoying. Let me have a look at what that test does in the first place.

@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 7, 2016

Contributor

I think I know what this is (tests break for me locally on Python 2.7.10 too BTW).

The test:

        buf = array.array('B')
        # ...
            value = client.recv_into(buf)
            self.fail("socket.timeout not raised, got %r instead" % (value,))

The patch makes so that recv_meth is called at least once before trampoline is called:

-    def recv(self, buflen, flags=0):
+    def _recv_loop(self, recv_meth, *args):
         fd = self.fd
         if self.act_non_blocking:
-            return fd.recv(buflen, flags)
+            return recv_meth(*args)
         while True:
             try:
-                return fd.recv(buflen, flags)
+                return recv_meth(*args)
             except socket.error as e:
                 if get_errno(e) in SOCKET_BLOCKING:
                     pass
@@ -328,35 +328,28 @@ class GreenSocket(object):
                 # Perhaps we should return '' instead?
                 raise EOFError()

From what I see the builtin socket.recv_into (and socket.recv too) called with non-blocking socket (as it's used by Eventlet internally) won't raise an exception if one attempts to read zero bytes (Python 2.7.10):

>>> s.settimeout(0)
>>> s.recv(0)
''
>>> s.recv(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.error: [Errno 35] Resource temporarily unavailable
>>> 
>>> s.recv_into(bytearray(b''))
0
>>> s.recv_into(bytearray(b'x'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.error: [Errno 35] Resource temporarily unavailable

The test suite is making sure an exception is raised while the builtin recv_into just returns 0, your patch fixes this incompatibility therefore breaking the tests :) I think the right approach here is to modify the tests to enforce builtin recv_into behaviour (0 when 0 bytes attempted to be read, timeout otherwise in this case). (this is wrong, see below)

Contributor

jstasiak commented Jan 7, 2016

I think I know what this is (tests break for me locally on Python 2.7.10 too BTW).

The test:

        buf = array.array('B')
        # ...
            value = client.recv_into(buf)
            self.fail("socket.timeout not raised, got %r instead" % (value,))

The patch makes so that recv_meth is called at least once before trampoline is called:

-    def recv(self, buflen, flags=0):
+    def _recv_loop(self, recv_meth, *args):
         fd = self.fd
         if self.act_non_blocking:
-            return fd.recv(buflen, flags)
+            return recv_meth(*args)
         while True:
             try:
-                return fd.recv(buflen, flags)
+                return recv_meth(*args)
             except socket.error as e:
                 if get_errno(e) in SOCKET_BLOCKING:
                     pass
@@ -328,35 +328,28 @@ class GreenSocket(object):
                 # Perhaps we should return '' instead?
                 raise EOFError()

From what I see the builtin socket.recv_into (and socket.recv too) called with non-blocking socket (as it's used by Eventlet internally) won't raise an exception if one attempts to read zero bytes (Python 2.7.10):

>>> s.settimeout(0)
>>> s.recv(0)
''
>>> s.recv(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.error: [Errno 35] Resource temporarily unavailable
>>> 
>>> s.recv_into(bytearray(b''))
0
>>> s.recv_into(bytearray(b'x'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.error: [Errno 35] Resource temporarily unavailable

The test suite is making sure an exception is raised while the builtin recv_into just returns 0, your patch fixes this incompatibility therefore breaking the tests :) I think the right approach here is to modify the tests to enforce builtin recv_into behaviour (0 when 0 bytes attempted to be read, timeout otherwise in this case). (this is wrong, see below)

@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 7, 2016

Contributor

In the response above I mixed some things up, sorry. The recv_into behaviour described is correct for nonblocking sockets but this is about a socket with a timeout; "our" recv_into needs to simulate the "socket with a timeout" behaviour. The builtin recv_into:

>>> s.settimeout(0.1)
>>> s.recv_into(array.array('B', b''))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.timeout: timed out
>>> s.recv_into(array.array('B', b'x'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.timeout: timed out

So yes, the test is correct (un)fortunately.

Contributor

jstasiak commented Jan 7, 2016

In the response above I mixed some things up, sorry. The recv_into behaviour described is correct for nonblocking sockets but this is about a socket with a timeout; "our" recv_into needs to simulate the "socket with a timeout" behaviour. The builtin recv_into:

>>> s.settimeout(0.1)
>>> s.recv_into(array.array('B', b''))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.timeout: timed out
>>> s.recv_into(array.array('B', b'x'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.timeout: timed out

So yes, the test is correct (un)fortunately.

@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 8, 2016

Contributor

I played with it locally and I modified recv tests (because builtin recv also times out even if one doesn't want to read any bytes):

diff --git a/tests/greenio_test.py b/tests/greenio_test.py
index 5c404cc..f2a9f86 100644
--- a/tests/greenio_test.py
+++ b/tests/greenio_test.py
@@ -123,6 +123,8 @@ class TestGreenSocket(tests.LimitedTestCase):

         client.connect(addr)

+        expect_socket_timeout(client.recv, 0)
         expect_socket_timeout(client.recv, 8192)

         evt.send()

This introduces a test failure that can solved by something like:

diff --git a/eventlet/greenio/base.py b/eventlet/greenio/base.py
index 18799af..09bb382 100644
--- a/eventlet/greenio/base.py
+++ b/eventlet/greenio/base.py
@@ -310,6 +310,8 @@ class GreenSocket(object):
             return fd.recv(buflen, flags)
         while True:
             try:
+                if buflen == 0:
+                    self._read_trampoline()
                 return fd.recv(buflen, flags)
             except socket.error as e:
                 if get_errno(e) in SOCKET_BLOCKING:
@@ -319,15 +321,19 @@ class GreenSocket(object):
                 else:
                     raise
             try:
-                self._trampoline(
-                    fd,
-                    read=True,
-                    timeout=self.gettimeout(),
-                    timeout_exc=socket.timeout("timed out"))
+                self._read_trampoline()
             except IOClosed as e:
                 # Perhaps we should return '' instead?
                 raise EOFError()

+    def _read_trampoline(self):
+        self._trampoline(
+            self.fd,
+            read=True,
+            timeout=self.gettimeout(),
+            timeout_exc=socket.timeout("timed out"),
+        )
+
Contributor

jstasiak commented Jan 8, 2016

I played with it locally and I modified recv tests (because builtin recv also times out even if one doesn't want to read any bytes):

diff --git a/tests/greenio_test.py b/tests/greenio_test.py
index 5c404cc..f2a9f86 100644
--- a/tests/greenio_test.py
+++ b/tests/greenio_test.py
@@ -123,6 +123,8 @@ class TestGreenSocket(tests.LimitedTestCase):

         client.connect(addr)

+        expect_socket_timeout(client.recv, 0)
         expect_socket_timeout(client.recv, 8192)

         evt.send()

This introduces a test failure that can solved by something like:

diff --git a/eventlet/greenio/base.py b/eventlet/greenio/base.py
index 18799af..09bb382 100644
--- a/eventlet/greenio/base.py
+++ b/eventlet/greenio/base.py
@@ -310,6 +310,8 @@ class GreenSocket(object):
             return fd.recv(buflen, flags)
         while True:
             try:
+                if buflen == 0:
+                    self._read_trampoline()
                 return fd.recv(buflen, flags)
             except socket.error as e:
                 if get_errno(e) in SOCKET_BLOCKING:
@@ -319,15 +321,19 @@ class GreenSocket(object):
                 else:
                     raise
             try:
-                self._trampoline(
-                    fd,
-                    read=True,
-                    timeout=self.gettimeout(),
-                    timeout_exc=socket.timeout("timed out"))
+                self._read_trampoline()
             except IOClosed as e:
                 # Perhaps we should return '' instead?
                 raise EOFError()

+    def _read_trampoline(self):
+        self._trampoline(
+            self.fd,
+            read=True,
+            timeout=self.gettimeout(),
+            timeout_exc=socket.timeout("timed out"),
+        )
+
@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Jan 8, 2016

Contributor
+                if buflen == 0:
+                    self._read_trampoline()

Oh, thanks for the hint. It looks like tests now pass on Python 2 and Python 3.

Contributor

vstinner commented Jan 8, 2016

+                if buflen == 0:
+                    self._read_trampoline()

Oh, thanks for the hint. It looks like tests now pass on Python 2 and Python 3.

jstasiak added a commit that referenced this issue Jan 10, 2016

gh-274: Handle blocking I/O errors in GreenSocket
Fix recv_into(), recvfrom(), recvfrom_into() and sendto() methods of
GreenSocket to handle blocking I/O errors (ex: BlockingIOError on
Python 3). Even if the trampoline was called, the socket method can
still fails with an I/O errors for various reasons (see manual pages
of the C functions for examples).

Ref: #274
@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jan 10, 2016

Contributor

Thanks @haypo, this is now in master.

Contributor

jstasiak commented Jan 10, 2016

Thanks @haypo, this is now in master.

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Jan 10, 2016

Contributor
Contributor

vstinner commented Jan 10, 2016

jstasiak added a commit that referenced this issue Jan 24, 2016

greenio: Fix "TypeError: an integer is required" in sendto()
The sendto() interface as defined in Python documentation:

    socket.sendto(string, address)
    socket.sendto(string, flags, address)

I didn't catch the fact that [1] broke this, this patch fixes it and add
a sendto/recvfrom test to make sure it doesn't happen again (turns out
we didn't have any).

GitHub issue: #290

Fixes: bc4d1b5 - gh-274: Handle blocking I/O errors in GreenSocket

[1] bc4d1b5
@raylu

This comment has been minimized.

Show comment
Hide comment
@raylu

raylu Feb 3, 2016

Contributor

With python 3 and 0.17.4, this code always works:

#!/usr/bin/env python3

from eventlet import wsgi
import eventlet

ten_point_one_MiB = 10 * 1024 * 1024 + 128 * 1024

def iter_response():
    for i in range(10 * 1024 // 256):
        yield 'a' * 256 * 1024
    yield 'a' * 128 * 1024

def app(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain'), ('Content-Length', ten_point_one_MiB)])
    return iter_response()

wsgi.server(eventlet.listen(('', 8090)), app)
$ curl localhost:8090 > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.1M  100 10.1M    0     0   386M      0 --:--:-- --:--:-- --:--:--  389M

but in 0.18.1, sometimes it works and sometimes you get

$ curl localhost:8090 > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 98 10.1M   98  9.9M    0     0  62769      0  0:02:49  0:02:45  0:00:04     0

and it just hangs, never downloading the last 128KiB.

eventlet.wsgi calls writelines

_writelines(towrite)

on wfile, created with the original makefile
return _original_socket.makefile(self, *args, **kwargs)

which is a socket.SocketIO object https://github.com/python/cpython/blob/e2165e5eb2ad2bc792d2752b5366cb9d9f083ad2/Lib/socket.py#L532
which inherits from io.RawIOBase which inherits from io.IOBase which inherits from _io._IOBase whose writelines implementation just calls write and doesn't check the bytes written https://github.com/python/cpython/blob/1fe0fd9feb6a4472a9a1b186502eb9c0b2366326/Modules/_io/iobase.c#L730
and socket.SocketIO's write calls send, not sendall https://github.com/python/cpython/blob/04eca3310445f504e3723d07756b5accfa91421e/Lib/socket.py#L593
so arguably, this is a bug in CPython

Contributor

raylu commented Feb 3, 2016

With python 3 and 0.17.4, this code always works:

#!/usr/bin/env python3

from eventlet import wsgi
import eventlet

ten_point_one_MiB = 10 * 1024 * 1024 + 128 * 1024

def iter_response():
    for i in range(10 * 1024 // 256):
        yield 'a' * 256 * 1024
    yield 'a' * 128 * 1024

def app(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain'), ('Content-Length', ten_point_one_MiB)])
    return iter_response()

wsgi.server(eventlet.listen(('', 8090)), app)
$ curl localhost:8090 > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.1M  100 10.1M    0     0   386M      0 --:--:-- --:--:-- --:--:--  389M

but in 0.18.1, sometimes it works and sometimes you get

$ curl localhost:8090 > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 98 10.1M   98  9.9M    0     0  62769      0  0:02:49  0:02:45  0:00:04     0

and it just hangs, never downloading the last 128KiB.

eventlet.wsgi calls writelines

_writelines(towrite)

on wfile, created with the original makefile
return _original_socket.makefile(self, *args, **kwargs)

which is a socket.SocketIO object https://github.com/python/cpython/blob/e2165e5eb2ad2bc792d2752b5366cb9d9f083ad2/Lib/socket.py#L532
which inherits from io.RawIOBase which inherits from io.IOBase which inherits from _io._IOBase whose writelines implementation just calls write and doesn't check the bytes written https://github.com/python/cpython/blob/1fe0fd9feb6a4472a9a1b186502eb9c0b2366326/Modules/_io/iobase.c#L730
and socket.SocketIO's write calls send, not sendall https://github.com/python/cpython/blob/04eca3310445f504e3723d07756b5accfa91421e/Lib/socket.py#L593
so arguably, this is a bug in CPython

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Feb 3, 2016

Contributor

so arguably, this is a bug in CPython

As a Python core developer, I have to disagree :-D I opened the bug report #295

Contributor

vstinner commented Feb 3, 2016

so arguably, this is a bug in CPython

As a Python core developer, I have to disagree :-D I opened the bug report #295

@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Feb 5, 2016

Contributor

@raylu @haypo I'll just leave this here for reference: https://mail.python.org/pipermail/python-dev/2012-August/121396.html

Considering the above I'm inclined to say the wsgi module only worked before because GreenSocket.send was broken too and was working like sendall internally therefore the fix is to stop using writelines().

Contributor

jstasiak commented Feb 5, 2016

@raylu @haypo I'll just leave this here for reference: https://mail.python.org/pipermail/python-dev/2012-August/121396.html

Considering the above I'm inclined to say the wsgi module only worked before because GreenSocket.send was broken too and was working like sendall internally therefore the fix is to stop using writelines().

@jstasiak

This comment has been minimized.

Show comment
Hide comment
@jstasiak

jstasiak Jul 4, 2016

Contributor

Related Python bug report for reference: Raw I/O writelines() broken for non-blocking I/O

Contributor

jstasiak commented Jul 4, 2016

Related Python bug report for reference: Raw I/O writelines() broken for non-blocking I/O

openstack-gerrit pushed a commit to openstack/deb-python-eventlet that referenced this issue Sep 2, 2016

gh-274: Handle blocking I/O errors in GreenSocket
Fix recv_into(), recvfrom(), recvfrom_into() and sendto() methods of
GreenSocket to handle blocking I/O errors (ex: BlockingIOError on
Python 3). Even if the trampoline was called, the socket method can
still fails with an I/O errors for various reasons (see manual pages
of the C functions for examples).

Ref: eventlet/eventlet#274

openstack-gerrit pushed a commit to openstack/deb-python-eventlet that referenced this issue Sep 2, 2016

gh-274: Handle blocking I/O errors in GreenSocket
Fix recv_into(), recvfrom(), recvfrom_into() and sendto() methods of
GreenSocket to handle blocking I/O errors (ex: BlockingIOError on
Python 3). Even if the trampoline was called, the socket method can
still fails with an I/O errors for various reasons (see manual pages
of the C functions for examples).

Ref: eventlet/eventlet#274
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment