dagpool_test busy loop on CPython 3.7 epoll/select hubs #475

Closed
temoto opened this issue Mar 10, 2018 · 14 comments

@temoto
Member

temoto commented Mar 10, 2018

While executing dagpool_test, random tests will burn CPU in kernel. Observed as

./venv-37/bin/nosetests -svx tests/dagpool_test.py
tests.dagpool_test.test_spawn_collision_spawn ... ok
tests.dagpool_test.test_spawn_multiple ... ok
tests.dagpool_test.test_spawn_many ... ok
tests.dagpool_test.test_wait_each_all ... ok
tests.dagpool_test.test_kill ... ok
tests.dagpool_test.test_post_collision_preload ... ok
tests.dagpool_test.test_post_collision_post ... ok
tests.dagpool_test.test_post_collision_spawn ... ok
tests.dagpool_test.test_post_replace ... ok
tests.dagpool_test.test_getitem ... ok
tests.dagpool_test.test_waitall_exc ... ok
tests.dagpool_test.test_propagate_exc ... ^C^C^\[1]    26649 quit       ./venv-37/bin/nosetests -svx tests/dagpool_test.py
   23.65s user 0.06s system 99% cpu 23.727 total

Note the 99% CPU usage.
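For reading the timing line above, here is a minimal sketch. It assumes the line is in zsh's `time` summary format (user seconds, system seconds, overall CPU percentage, wall-clock total). The helper name `parse_time_summary` is hypothetical. It separates the user and system components so the 99% figure is not mistaken for either of them.

```python
import re

# Assumed format (zsh `time` summary), e.g.
#   "23.65s user 0.06s system 99% cpu 23.727 total"
# Only plain-seconds totals are handled; "m:ss" style totals are not.
_TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)s user\s+"
    r"(?P<system>[\d.]+)s system\s+"
    r"(?P<cpu>\d+)% cpu\s+"
    r"(?P<total>[\d.]+) total"
)


def parse_time_summary(line):
    """Split a timing summary line into its numeric fields (hypothetical helper)."""
    match = _TIME_LINE.search(line)
    if match is None:
        raise ValueError("not a recognised time summary: %r" % line)
    user = float(match.group("user"))
    system = float(match.group("system"))
    cpu_time = user + system
    return {
        "user_s": user,                           # CPU seconds spent in user space
        "system_s": system,                       # CPU seconds spent in the kernel
        "cpu_percent": int(match.group("cpu")),   # (user + system) / wall clock
        "wall_s": float(match.group("total")),
        # Fraction of the consumed CPU time that was user-space time.
        "user_share": user / cpu_time if cpu_time else 0.0,
    }


if __name__ == "__main__":
    print(parse_time_summary("23.65s user 0.06s system 99% cpu 23.727 total"))
```

Applied to the line from the run above, `user_share` shows how the consumed CPU time splits between user space and kernel.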

  • The nose process does not respond to SIGINT (KeyboardInterrupt).
  • Tests that trigger the busy loop: test_propagate_exc, test_wait_each_exc.
  • They pass when run alone; the problem occurs only when running the whole dagpool_test.
  • Environment: Linux, Python 3.7, epoll and select hubs.
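Since the process ignores SIGINT, one possible way to see where it is spinning is the standard library's `faulthandler` module. This is a sketch under assumptions, not something used in this thread: it is Unix-only, the choice of SIGUSR1 and the helper names are hypothetical, and it only shows the frames currently running in each OS thread, not suspended greenlets.

```python
import faulthandler
import os
import signal
import tempfile
import time


def install_stack_dump_handler(log_file, signum=signal.SIGUSR1):
    """Dump the Python stacks of all threads to log_file when signum arrives.

    The handler is installed at the C level, so it can fire even when the
    interpreter never gets to raise KeyboardInterrupt.
    """
    faulthandler.register(signum, file=log_file, all_threads=True)


def request_stack_dump(pid, signum=signal.SIGUSR1):
    """Ask the process with the given pid to write its stack dump."""
    os.kill(pid, signum)


def dump_own_stack():
    """Demo: signal this very process and return the text that was written."""
    with tempfile.TemporaryFile() as log_file:
        install_stack_dump_handler(log_file)
        try:
            request_stack_dump(os.getpid())
            time.sleep(0.1)  # give the signal a moment to be delivered
            log_file.seek(0)
            return log_file.read().decode("utf-8", "replace")
        finally:
            faulthandler.unregister(signal.SIGUSR1)


if __name__ == "__main__":
    print(dump_own_stack())
```

In a real run, the handler would be installed early (for example in a test setup module) and the signal sent from another shell with `kill -USR1 <pid>` once the test hangs.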

Example of build failed for this reason:
https://travis-ci.org/eventlet/eventlet/jobs/351626233

@nat-goodspeed could you look into what the problem is here?

There is a similar busy-loop problem with test_accept_deflate_ext_context_takeover_13 (tests.websocket_new_test.TestWebSocketWithCompression). The difference: under strace, the websocket tests segfault with

recvfrom(9, "", 2, 0, NULL, NULL)       = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x99} ---

whereas dagpool doesn't:

write(2, "tests.dagpool_test.test_propagate_exc ... ", 42tests.dagpool_test.test_propagate_exc ... ) = 42
clock_gettime(CLOCK_MONOTONIC, {3917231, 687934179}) = 0
clock_gettime(CLOCK_MONOTONIC, {3917231, 687991532}) = 0
clock_gettime(CLOCK_MONOTONIC, {3917231, 688159907}) = 0
clock_gettime(CLOCK_MONOTONIC, {3917231, 688235242}) = 0
clock_gettime(CLOCK_MONOTONIC, {3917231, 688261079}) = 0
^C--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
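The trace above shows back-to-back clock_gettime(CLOCK_MONOTONIC) calls with no blocking call between them. As a purely hypothetical illustration of that shape (this is not Eventlet's or greenlet's code, and the function names are made up), the sketch below contrasts a loop that polls the monotonic clock with a blocking sleep. On Linux, `time.monotonic()` is backed by clock_gettime(CLOCK_MONOTONIC).

```python
import time


def spin_until(duration_s):
    """Busy-wait: poll the monotonic clock until duration_s has elapsed.

    Returns the number of polls, each of which reads the clock again
    without ever yielding the CPU.
    """
    polls = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        polls += 1
    return polls


def cpu_cost(func, *args):
    """Return the CPU seconds (user + system) consumed while running func."""
    before = time.process_time()
    func(*args)
    return time.process_time() - before


if __name__ == "__main__":
    # The busy-wait consumes CPU for the whole interval; the sleep barely any.
    print("spin :", cpu_cost(spin_until, 0.1))
    print("sleep:", cpu_cost(time.sleep, 0.1))
```

A wait that blocks in the kernel (sleep, select, epoll_wait) would instead show up in strace as a single long call, which is what a healthy hub loop is expected to look like.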
@temoto
Member Author

temoto commented Mar 10, 2018

Current status (2018-03-10): on Linux with CPython 3.7, Eventlet may enter a busy loop. Use in production is not recommended. Subscribe to this issue to get updates.

@nat-goodspeed
Contributor

Sorry, I'm at a conference all this week. I will look into it next week.

@nat-goodspeed
Contributor

Are you building Python 3.7 from source? I tried to find it on python.org, but it seems to be unavailable as yet. That makes it tough to debug: I don't know how to obtain the interpreter version with which this fails.

If Python 3.7 has not yet been released, isn't it possible that what you're seeing is a bug in Python 3.7?

@temoto
Member Author

temoto commented Apr 7, 2018

@nat-goodspeed it is there, https://www.python.org/downloads/release/python-370b3/

3.7-dev supplied by Travis also produces this problem.

@nirik

nirik commented Jun 29, 2018

We are hitting this in Fedora as well: https://bugzilla.redhat.com/show_bug.cgi?id=1594248

If I comment out test_wait_each_exc(), it hangs at test_post_get_exc() instead.

If I disable both of those it crashes at: (tests.websocket_new_test.TestWebSocketWithCompression) ... /var/tmp/rpm-tmp.mxJJ4m: line 34: 11070 Segmentation fault (core dumped) nosetests-3.7 -v
(which may be a different issue)

@nat-goodspeed
Contributor

Ah, so Python 3.7 was officially released yesterday. There goes my hope that it was a bug in the pre-release version. I haven't forgotten about this, really!

@vstinner
Contributor

I created an upstream bug report: https://bugs.python.org/issue33996

I'm able to reproduce the crash without eventlet, only using greenlet.

@nat-goodspeed
Contributor

Thank you!
Doesn't that mean the fix for Python 3.7 would be in greenlet, though, rather than the Python interpreter? Or am I missing the point?

@vstinner
Contributor

Doesn't that mean the fix for Python 3.7 would be in greenlet, though, rather than the Python interpreter?

I wrote a script to reproduce the bug only using greenlet. Maybe it's a bug in greenlet, maybe in Python. Honestly, I don't know at this point. At least, it doesn't seem to be a bug in eventlet.

@vstinner
Contributor

vstinner commented Jul 3, 2018

The bug is in greenlet, not in Python: python-greenlet/greenlet#131

@hroncok
Contributor

hroncok commented Jul 3, 2018

With @vstinner's patch, this issue is solved, yet eventlet still fails with 3.7:

#502

@vstinner
Contributor

vstinner commented Jul 3, 2018

With @vstinner's patch, this issue is solved, yet eventlet still fails with 3.7: (...)

I suggest opening a new issue to track these failures, since they are different (unrelated) ;-)

@nat-goodspeed
Contributor

Or at least change the name of this one... dagpool_test doesn't seem to be the culprit after all.

@nat-goodspeed
Contributor

None of the test failures in #502 seem related to dagpool_test. Closing.

clrpackages pushed a commit to clearlinux-pkgs/eventlet that referenced this issue Mar 20, 2020
… 0.25.1

AGSPhoenix (1):
      Add locals() call to example backdoor invocation

Aayush Kasurde (2):
      Modified pyopenssl example using evenlet
      contributing.md

Anthony Sottile (1):
      Use tox's `TOXENV` environment variable

Chris Kerr (1):
      dns: reading /etc/hosts raised DeprecationWarning for universal lines on Python 3.4+

Daniel Alvarez (1):
      greendns: don't contact nameservers if one entry is returned from hosts file

David Szotten (2):
      workaround for pathlib on py 3.7
      release notes and version bump for 0.25.1

Feng (1):
      greenio: Fixed OSError: [WinError 10038] Socket operation on nonsocket

Geoffrey Thomas (1):
      patcher: workaround for monotonic "no suitable implementation"

Gevorg Davoian (1):
      green.zmq: socket.{recv,send}_* signatures did not match recent upstream pyzmq

Haikel Guemar (1):
      Drop OpenSSL.rand support

Hugo (1):
      Add Python 3.6

James Page (1):
      Avoid dependency on enum-compat

Jaume Marhuenda (2):
      support.greendns: ImportError when dns.rdtypes was imported before eventlet
      greendns: resolving over TCP produced ValueError

Jesse (1):
      tpool: exception in tpool-ed call leaked memory via backtrace

Julien Kasarherou (1):
      wsgi: make Expect 100-continue field-value case-insensitive.

Junyi (2):
      [bug] reimport submodule as well in patcher.inject (#540)
      Fix compatibility with Python 3.7 ssl.SSLSocket  (#531)

Konstantin Enchant (1):
      websocket: fd leak when client did not close connection properly

Lon Hohberger (4):
      Fix bad ipv6 comparison
      greendns udp: Fix infinite loop when source address mismatch
      tests: Add ipv6 tests for greendns udp() function
      tests: Add ipv4 udp tests for greendns

Marcel Plch (1):
      Fix for Python 3.7 (#506)

Matt Bennett (1):
      greendns: be explicit about expecting bytes from sock.recv

Miguel Grinberg (1):
      socket: context manager support

Miro Hrončok (1):
      Stop using deprecated cgi.parse_qs() to support Python 3.8

Ondřej Kobližek (1):
      Fixed tests.greendns_test.TestGetaddrinfo eventlet/eventlet#373

Ondřej Nový (1):
      Regenerate test crt

Quan Tian (1):
      patcher: set locked RLocks' owner only when patching existing locks

Ralf Haferkamp (1):
      greendns: Treat /etc/hosts entries case-insensitive

Sam Merritt (1):
      pools: put to empty pool would block sometimes

Sergey Shepelev (59):
      zmq: autogenerated documentation was missing
      Explicit environ flag for importing eventlet.__version__ without ignoring import errors
      release: use twine for PyPI upload
      readme: latest dev version was pointing to bitbucket
      tests: patcher_import_patched_defaults was failing in presence of pyopenssl package
      dns: try unqualified queries as top level
      Type check Semaphore, GreenPool arguments; Thanks to Matthew D. Pagel
      test_import_patched_defaults bended to play with pyopenssl>=16.1.0
      v0.20.1 release
      test coverage reports
      tests cleanup, CI with Python 3.6
      python3.6: http.client.request support chunked_encoding
      New timeout error API: .is_timeout=True on exception object
      support: upgrade bundled six to 1.10 (dbfbfc818e3d)
      green.profile: Python3 compatibility; Thanks to Artur Stawiarski
      Timeout was marked deprecated along with TimeoutError by mistake
      tests: socket_resolve_green was giving false fails
      dns: hosts file was consulted after nameservers
      hubs: use monotonic clock by default (bundled package); Thanks to Roman Podoliaka and Victor Stinner
      dns: EVENTLET_NO_GREENDNS option is back, green is still default
      dns: EAI_NODATA was removed from RFC3493 and FreeBSD
      db_pool: proxy Connection.set_isolation_level()
      ssl: RecursionError on Python3.6+; Thanks to justdoit0823@github and Gevent developers
      wsgi: log_output=False was not disabling startup and accepted messages
      v0.21.0 release
      update monotonic 1.3 5c0322dc559bf961f7e111d97cd3ed9ab5c1a73b
      queue: empty except was catching too much
      wsgi: push deprecated options one step
      wsgi: close idle connections (also applies to websockets)
      convenience: skip SO_REUSEPORT for bind on random port (0)
      convenience: SO_REUSEPORT is not available on WSL platform (Linux on Windows)
      green.subprocess: keep CalledProcessError identity; Thanks to Linbing@github
      support: upgrade bundled dnspython to 1.16.0 (22e9de1d7957e)
      convenience: (SO_REUSEPORT) socket.error is not OSError on Python 2; Thanks to JacoFourie@github
      travis: codecov flags format was running `tr` with invalid arguments
      greendns: early socket.timeout was breaking IO retry loops
      init: second workaround for monotonic "no suitable implementation"; Thanks to Geoffrey Thomas
      Travis broke ipv6, allow failure; test against Python 2.7
      travis: crutch to get ipv6 back
      v0.22.0 release
      event: Event.wait() timeout=None argument to be compatible with upstream CPython
      v0.22.1 release
      support: psycopg2_patcher import module, not function
      travis: update test dependencies, use psycopg2-binary
      moved function eventlet.support.capture_stderr to tests
      travis: allow fail python 3.7 see issue eventlet/eventlet#475
      wsgi: latin-1 encoding dance for environ[PATH_INFO]
      green.threading: current_thread() did not see new monkey-patched threads; Thanks to Jake Tesler
      v0.23.0 release
      Drop support for Python2.6 and python-epoll package
      Drop support for Python3.3
      website: link to PyPI project page w/o version; reflect current state of installation and development
      v0.24.0 release
      v0.24.1 release
      ssl: connect used non-monotonic time.time() for timeout (#520)
      New benchmarks runner
      greenthread: optimize _exit_funcs getattr/del dance; Thanks to Alex Kashirin
      IMPORTANT: late import in `use_hub()` + thread race caused using epolls even when it is unsupported on current platform
      maintainers list

Sourabh Deshmukh (1):
      typo

Stefan Nica (1):
      wsgi: handle remote connection resets

Tim Burke (10):
      wsgi: Don't strip all Unicode whitespace from headers on py3 (#504)
      wsgi: Use byte strings on py2 and unicode strings on py3
      wsgi: Catch and swallow IOErrors during discard() (#532)
      wsgi: Stop replacing invalid UTF-8 on py3
      wsgi: fix Input.readline on Python 3
      wsgi: fix Input.readlines when dealing with chunked input
      wsgi: Return 400 on negative Content-Length request headers (#537)
      wsgi: Only send 100 Continue response if no response has been sent yet (#557)
      wsgi: minimize API changes for 100-continue fix (#569)
      v0.25.0 release

Yuichi Bando (1):
      New feature: Add zipkin tracing to eventlet

costasgambit (1):
      websocket: support permessage-deflate extension; Thanks to Costas Christofi and Peter Kovary

jaimefrites (1):
      green.select: fix mark_as_closed() wrong number of args

nat-goodspeed (5):
      greendns: full comment lines were not skipped; Thanks to nat-goodspeed
      external dependencies for six, monotonic, dnspython
      Issue 535: use Python 2 compatible syntax for keyword-only args. (#536)
      Increase Travis slop factor for ZMQ CPU usage.  (#542)
      #53: Make a GreenPile with no spawn()s an empty sequence. (#555)

orishoshan (1):
      Issue #405: GreenSocket.accept does not notify_open (#406)

talwrii (1):
      green.zmq: support RCVTIMEO (receive timeout)

0.25.1
======
* wsgi (tests): Stop using deprecated cgi.parse_qs() to support Python 3.8; Thanks to Miro Hrončok
* os: Add workaround to `open` for pathlib on py 3.7; Thanks to David Szotten

0.25.0
======
* wsgi: Only send 100 Continue response if no response has been sent yet; Thanks to Tim Burke
* wsgi: Return 400 on negative Content-Length request headers; Thanks to Tim Burke
* Make a GreenPile with no spawn()s an empty sequence; Thanks to nat-goodspeed
* wsgi: fix Input.readlines when dealing with chunked input; Thanks to Tim Burke
* wsgi: fix Input.readline on Python 3; Thanks to Tim Burke
* wsgi: Stop replacing invalid UTF-8 on py3; Thanks to Tim Burke
* ssl: Fix compatibility with Python 3.7 ssl.SSLSocket; Thanks to Junyi
* reimport submodule as well in patcher.inject; Thanks to Junyi
* use Python 2 compatible syntax for keyword-only args; Thanks to nat-goodspeed

(NEWS truncated at 15 lines)