New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eventlet.monkey_patch() on Python 3.4 makes stdout non-blocking #248

Open
rbtcollins opened this Issue Sep 21, 2015 · 9 comments

Comments

Projects
None yet
2 participants
@rbtcollins

rbtcollins commented Sep 21, 2015

We have a test suite which writes it's output to stdout. We started observing short writes on Python 3.4:

http://paste.ubuntu.com/12513141/

  File "/tmp/designate/.tox/py34/lib/python3.4/site-packages/subunit/v2.py", line 215, in _write_packet
    raise Exception("Failed to write packet (%d) bytes" % len(buf))
Exception: Failed to write packet (4271) bytes
{3} subunit.parser [0.000328s] ... FAILED

Now, I could put a loop in there with a timeout etc - but that is the is synchronous codepath in subunit, because we're writing to a known quantity.

This behaviour doesn't happen on Python2.6 or 2.7 with eventlet, so its just a bit of a puzzle.

On Python3.4 we get a binary file object by unwrapping stdout - sys.stdout.buffer is the object we're writing to, and we obtain the handle for that before monkeypatching is done (because the test harness is itself not an eventlet app)

Even more odd, piping the suite output through cat prevents the errors being triggered - I have no idea why!

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Sep 22, 2015

Contributor

Are you sure that standard streams (stdin, stdout, stderr) are non-blocking? I'm unable to reproduce the issue. Example with Python 3.5:

import eventlet
eventlet.monkey_patch()
import os
for fd in range(3):
    print(fd, os.get_blocking(fd))

Output:

0 True
1 True
2 True

How can I try to reproduce the issue? Did you check with "strace -f -o trace" to check that fd 0, 1 or 2 is made non-blocking with fcntl()?

By default (with no redirection), all standard streams are the same device:

$ ls -l /proc/self/fd/[012]
lrwx------. 1 haypo haypo 64 22 sept. 10:12 /proc/self/fd/0 -> /dev/pts/0
lrwx------. 1 haypo haypo 64 22 sept. 10:12 /proc/self/fd/1 -> /dev/pts/0
lrwx------. 1 haypo haypo 64 22 sept. 10:12 /proc/self/fd/2 -> /dev/pts/0

If you modify one stream, the two other streams are impacted. Example:

import os
os.set_blocking(1, False)
for fd in range(3):
    print(fd, os.get_blocking(fd))

Output:

0 False
1 False
2 False

If you redirect a stream and you make it non-blocking, it doesn't use the shared device anymore and only this stream will be make non-blocking. Same code but with stdout redirected (ex: python script.py|cat):

0 True
1 False
2 True

stdin and stderr are still blocking, but the stdout (pipe) became non-blocking.

Contributor

vstinner commented Sep 22, 2015

Are you sure that standard streams (stdin, stdout, stderr) are non-blocking? I'm unable to reproduce the issue. Example with Python 3.5:

import eventlet
eventlet.monkey_patch()
import os
for fd in range(3):
    print(fd, os.get_blocking(fd))

Output:

0 True
1 True
2 True

How can I try to reproduce the issue? Did you check with "strace -f -o trace" to check that fd 0, 1 or 2 is made non-blocking with fcntl()?

By default (with no redirection), all standard streams are the same device:

$ ls -l /proc/self/fd/[012]
lrwx------. 1 haypo haypo 64 22 sept. 10:12 /proc/self/fd/0 -> /dev/pts/0
lrwx------. 1 haypo haypo 64 22 sept. 10:12 /proc/self/fd/1 -> /dev/pts/0
lrwx------. 1 haypo haypo 64 22 sept. 10:12 /proc/self/fd/2 -> /dev/pts/0

If you modify one stream, the two other streams are impacted. Example:

import os
os.set_blocking(1, False)
for fd in range(3):
    print(fd, os.get_blocking(fd))

Output:

0 False
1 False
2 False

If you redirect a stream and you make it non-blocking, it doesn't use the shared device anymore and only this stream will be make non-blocking. Same code but with stdout redirected (ex: python script.py|cat):

0 True
1 False
2 True

stdin and stderr are still blocking, but the stdout (pipe) became non-blocking.

@rbtcollins

This comment has been minimized.

Show comment
Hide comment
@rbtcollins

rbtcollins Sep 22, 2015

I'm not sure that its non-blockingness; The first report I had of this was from designate, but since then a few other OpenStack projects have followed along - look for the ones with | cat in their .testr.conf.

rbtcollins commented Sep 22, 2015

I'm not sure that its non-blockingness; The first report I had of this was from designate, but since then a few other OpenStack projects have followed along - look for the ones with | cat in their .testr.conf.

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Sep 24, 2015

Contributor

Ok, I found the code.

Part1 (subunit): subunit.run.SubunitTestRunner.run() => SubunitTestRunner._list() => os.fdopen(fileno) where fileno is 1 (stdout) in my case

Part2 (eventlet): os.fdopen(fd, ...) => eventlet.greenio.py3.GreenPipe() => eventlet.greenio.py3._open() => FileIO(fd, ...) => eventlet.greenio.py3.GreenFileIO()

And greenio.py3.GreenFileIO() constructor calls set_nonblocking(self).

Ok, well, why not. An event loop hates blocking things, so it makes sense.

The issue title is "eventlet.monkey_patch() on Python 3.4 makes stdout non-blocking". Ok, but what is the root issue? eventlet does exactly the same on Python 2: it also makes stdout non-blocking...

Contributor

vstinner commented Sep 24, 2015

Ok, I found the code.

Part1 (subunit): subunit.run.SubunitTestRunner.run() => SubunitTestRunner._list() => os.fdopen(fileno) where fileno is 1 (stdout) in my case

Part2 (eventlet): os.fdopen(fd, ...) => eventlet.greenio.py3.GreenPipe() => eventlet.greenio.py3._open() => FileIO(fd, ...) => eventlet.greenio.py3.GreenFileIO()

And greenio.py3.GreenFileIO() constructor calls set_nonblocking(self).

Ok, well, why not. An event loop hates blocking things, so it makes sense.

The issue title is "eventlet.monkey_patch() on Python 3.4 makes stdout non-blocking". Ok, but what is the root issue? eventlet does exactly the same on Python 2: it also makes stdout non-blocking...

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Sep 24, 2015

Contributor

== Python 2 ==

GreePipe.write() => socket._fileobject.write() => socket._fileobject.flush() => call sendall() in a loop until the write buffer is empty

sendall() is eventlet.greenio.py3._SocketDuckForFd.sendall() which calls os.write() in a loop until all bytes are written. sendall() calls self._trampoline(self, write=True) to wait until the non-blocking file (a pipe in your case) becomes writable.

== Python 3 ==

GreenFileIO.write() calls os.write() once, but it may wait until the non-blocking file (pipe) becomes writable using trampoline(self, write=True).

Hum, it's every different. On Python 2, there are two loops to ensure that all bytes are written. On Python 3, there is no loop to ensure that all bytes are written.

== subunit ==

In your subunit program, to support the current version of eventlet on Python 3, you should accept that write() can be partial and write less than bytes than requested. write() always return the number of written bytes. You may implement your own loop. Example using memoryview to avoid memory copies:

view = memoryview(data)
datalen = len(data)
offset = 0
while offset < datalen:
  written = write(view[offset:])
  offset += written
Contributor

vstinner commented Sep 24, 2015

== Python 2 ==

GreePipe.write() => socket._fileobject.write() => socket._fileobject.flush() => call sendall() in a loop until the write buffer is empty

sendall() is eventlet.greenio.py3._SocketDuckForFd.sendall() which calls os.write() in a loop until all bytes are written. sendall() calls self._trampoline(self, write=True) to wait until the non-blocking file (a pipe in your case) becomes writable.

== Python 3 ==

GreenFileIO.write() calls os.write() once, but it may wait until the non-blocking file (pipe) becomes writable using trampoline(self, write=True).

Hum, it's every different. On Python 2, there are two loops to ensure that all bytes are written. On Python 3, there is no loop to ensure that all bytes are written.

== subunit ==

In your subunit program, to support the current version of eventlet on Python 3, you should accept that write() can be partial and write less than bytes than requested. write() always return the number of written bytes. You may implement your own loop. Example using memoryview to avoid memory copies:

view = memoryview(data)
datalen = len(data)
offset = 0
while offset < datalen:
  written = write(view[offset:])
  offset += written
@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Sep 24, 2015

Contributor

eventlet has a different behaviour on Python 2 and Python 3, but I don't know which behaviour is the "good" behaviour :-)

In subunit/run.py, I see that the code uses "stream = os.fdopen(fileno, 'wb', 0)". So it explicitly asks a unbuffered writter. On Python 3, unbuffered really means unbuffered: FileIO.write() only calls os.write() once. It can returns less bytes than requested, it's part of FileIO specification.

The I/O stack of Python 2 is based on the stdio.h of the standard C library which doesn't give a full access on low-level files. It's less clear if write() calls the write syscall once or more. But you must expect that calling the write(data) method can be a partial write, write method of the object created by os.fdopen(fileno, 'wb', 0).

Ok now to come back to eventlet: it would be a bad idea to modify write() of eventlet on Python 2 to allow partial write, it will likely break a lot of applications in the wild. It would be tricky to debug such issue: partial write don't occur always, it depends on the network speed, on the size of the pipe buffer, etc. And it would take a lot of effort to "fix" all applications to support the new "pedantic" eventlet.

My conclusion is that eventlet must behave the same on Python 2 and Python 3, and so use a loop to write on Python 3 too. Even if the user asked an unbuffered writer.

Contributor

vstinner commented Sep 24, 2015

eventlet has a different behaviour on Python 2 and Python 3, but I don't know which behaviour is the "good" behaviour :-)

In subunit/run.py, I see that the code uses "stream = os.fdopen(fileno, 'wb', 0)". So it explicitly asks a unbuffered writter. On Python 3, unbuffered really means unbuffered: FileIO.write() only calls os.write() once. It can returns less bytes than requested, it's part of FileIO specification.

The I/O stack of Python 2 is based on the stdio.h of the standard C library which doesn't give a full access on low-level files. It's less clear if write() calls the write syscall once or more. But you must expect that calling the write(data) method can be a partial write, write method of the object created by os.fdopen(fileno, 'wb', 0).

Ok now to come back to eventlet: it would be a bad idea to modify write() of eventlet on Python 2 to allow partial write, it will likely break a lot of applications in the wild. It would be tricky to debug such issue: partial write don't occur always, it depends on the network speed, on the size of the pipe buffer, etc. And it would take a lot of effort to "fix" all applications to support the new "pedantic" eventlet.

My conclusion is that eventlet must behave the same on Python 2 and Python 3, and so use a loop to write on Python 3 too. Even if the user asked an unbuffered writer.

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Sep 24, 2015

Contributor

Ok, I created the pull request #250 to "fix" GreenFileIO.write() on Python 3.

Again, the workaround in subunit is to call write() in a loop to write all bytes.

Contributor

vstinner commented Sep 24, 2015

Ok, I created the pull request #250 to "fix" GreenFileIO.write() on Python 3.

Again, the workaround in subunit is to call write() in a loop to write all bytes.

@rbtcollins

This comment has been minimized.

Show comment
Hide comment
@rbtcollins

rbtcollins Sep 24, 2015

Yeah, I need to cater for the FileIO case - it was oversight when I added the unbuffered stuff - its a legitimate bug i nsubunit there, I just need to do some gnarly type checking to determine if I have a high or low level method - e.g. can I consult the return value of write or not.

rbtcollins commented Sep 24, 2015

Yeah, I need to cater for the FileIO case - it was oversight when I added the unbuffered stuff - its a legitimate bug i nsubunit there, I just need to do some gnarly type checking to determine if I have a high or low level method - e.g. can I consult the return value of write or not.

@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Sep 24, 2015

Contributor

FYI I proposed this change to subunit: testing-cabal/subunit#14

Contributor

vstinner commented Sep 24, 2015

FYI I proposed this change to subunit: testing-cabal/subunit#14

temoto added a commit that referenced this issue Oct 13, 2015

GH #248: Fix GreenFileIO.write()
On Python 3, GreenFileIO.write() now always write all bytes using a
loop, to have the same behaviour than GreenPipe.write() on Python 2.
Before, the write() could be partial, write less bytes than expected.

On Python 2, GreenPipe.write() doesn't allow partial writes, it
always write all bytes, even if the user requested an unbuffered pipe
(ex: by calling os.fdopen(fd, 'wb', 0)).

Modifying Python 2 to be pedantic and allow partial writes for
unbuffered pipes will likely break a lot of applications. It's
simpler to modify Python 3 GreenFileIO to behave the same than Python
2 GreenPipe.
@vstinner

This comment has been minimized.

Show comment
Hide comment
@vstinner

vstinner Jan 4, 2016

Contributor

Since #250 is closed, this issue can also be closed no?

Contributor

vstinner commented Jan 4, 2016

Since #250 is closed, this issue can also be closed no?

openstack-gerrit added a commit to openstack/openstack that referenced this issue Feb 19, 2016

Updated openstack/openstack
Project: openstack/oslo.messaging  7305fc5af059a886fa6d27292c14ec99b4439a1d

.testr.conf: revert workaround of testtools bug

Revert the change I6507e693fc929e03884f933bbda241f744d3a7c0. The
testtools bug was fixed, the "| cat" workaround is no more needed.

Fix for subunit (for testtools), workaround the eventlet bug:
testing-cabal/subunit#14

Fix of the root cause in eventlet:
eventlet/eventlet#248

The bug was limited to Python 3 and related to non-blocking stdout.

Change-Id: I207b7a7e82cbd7e5848231115b3577be7b600638
Related-Bug: 1492505

openstack-gerrit pushed a commit to openstack/oslo.messaging that referenced this issue Feb 19, 2016

.testr.conf: revert workaround of testtools bug
Revert the change I6507e693fc929e03884f933bbda241f744d3a7c0. The
testtools bug was fixed, the "| cat" workaround is no more needed.

Fix for subunit (for testtools), workaround the eventlet bug:
testing-cabal/subunit#14

Fix of the root cause in eventlet:
eventlet/eventlet#248

The bug was limited to Python 3 and related to non-blocking stdout.

Change-Id: I207b7a7e82cbd7e5848231115b3577be7b600638
Related-Bug: 1492505

openstack-gerrit added a commit to openstack/openstack that referenced this issue Feb 19, 2016

Updated openstack/openstack
Project: openstack/oslo.messaging  7305fc5af059a886fa6d27292c14ec99b4439a1d

.testr.conf: revert workaround of testtools bug

Revert the change I6507e693fc929e03884f933bbda241f744d3a7c0. The
testtools bug was fixed, the "| cat" workaround is no more needed.

Fix for subunit (for testtools), workaround the eventlet bug:
testing-cabal/subunit#14

Fix of the root cause in eventlet:
eventlet/eventlet#248

The bug was limited to Python 3 and related to non-blocking stdout.

Change-Id: I207b7a7e82cbd7e5848231115b3577be7b600638
Related-Bug: 1492505

openstack-gerrit added a commit to openstack/openstack that referenced this issue Feb 21, 2016

Updated openstack/openstack
Project: openstack/neutron  e07b9b8207dc5be561a1ae101c0c530eee5093f6

.testr.conf: revert workaround of testtools bug

Revert the change I4c88f1891f53c6559bca71bf657aa30df2101280.
The testtools bug was fixed, the "| cat" workaround is no more
needed.

Fix for subunit (for testtools), workaround the eventlet bug:
testing-cabal/subunit#14

Fix of the root cause in eventlet:
eventlet/eventlet#248

The bug was limited to Python 3 and related to non-blocking stdout.

See also the similar change for oslo.messaging:
I207b7a7e82cbd7e5848231115b3577be7b600638.

Change-Id: Iad759af7fbd8a404b7ac361763b27214f3a49ac3
Related-Bug: 1492505

openstack-gerrit pushed a commit to openstack/neutron that referenced this issue Feb 21, 2016

.testr.conf: revert workaround of testtools bug
Revert the change I4c88f1891f53c6559bca71bf657aa30df2101280.
The testtools bug was fixed, the "| cat" workaround is no more
needed.

Fix for subunit (for testtools), workaround the eventlet bug:
testing-cabal/subunit#14

Fix of the root cause in eventlet:
eventlet/eventlet#248

The bug was limited to Python 3 and related to non-blocking stdout.

See also the similar change for oslo.messaging:
I207b7a7e82cbd7e5848231115b3577be7b600638.

Change-Id: Iad759af7fbd8a404b7ac361763b27214f3a49ac3
Related-Bug: 1492505

jeffrey4l pushed a commit to jeffrey4l/subunit that referenced this issue Jul 6, 2016

Fix StreamResultToBytes._write_packet()
Call write() in a loop until all bytes are written. The write()
method doesn't ensure that all bytes are written.

This change should workaround the eventlet bug:
eventlet/eventlet#248
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment