Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eventlet 0.36.0 incompatible with flask-socketio running under gunicorn #946

Closed
dhdaines opened this issue Mar 25, 2024 · 16 comments · Fixed by #949
Closed

eventlet 0.36.0 incompatible with flask-socketio running under gunicorn #946

dhdaines opened this issue Mar 25, 2024 · 16 comments · Fixed by #949
Labels

Comments

@dhdaines
Copy link
Contributor

Haven't figured out a minimal test case for this yet, but our CI pipelines for https://github.com/roedoejet/g2p/ have all stopped working because of the 0.36.0 release. We are running a background flask-socketio server with gunicorn and eventlet, and trying to access it via Playwright, which hangs forever when trying to connect to the socket.

The documentation seems to indicate that we should not use eventlet anymore, so we'll do that :) but just to note that the most recent eventlet seems to be particularly ill-behaved.

@4383
Copy link
Member

4383 commented Mar 25, 2024

Hello @dhdaines,

Thanks for reporting this problem.
We recently modified how the eventlet wsgi server behaves with sockets (1f3db00), so, indeed, it could have some side effects on your tests.

Please can you share links where we can observes that broken jobs?

If you are able to migrate off of eventlet then fire! That's the solution we encourgage.

Else, having some details about your recent problem can help us troubleshooting what happens with the last version, and hence, helps far less lucky other users that can't migrate off easily.

@4383 4383 added the bug label Mar 25, 2024
@dhdaines
Copy link
Contributor Author

Sure! Not sure how much it will help but here is before reverting eventlet:

https://github.com/roedoejet/g2p/actions/runs/8422888910/job/23064740602

And here is after:

https://github.com/roedoejet/g2p/actions/runs/8423476069/job/23065192040

We'll try using gevent, since it seems like that might be the only alternative Flask supports...

@4383
Copy link
Member

4383 commented Mar 26, 2024

Rather than moving to gevent, what do you think of using quart as a replacement for flask?

Quart is the official response of the flask team to asyncio. I'd encourage that solution rather than the gevent solution, which at some point, will surely lead you to the same maintenance problems that are currently seen in eventlet, see #824.

Quart offer the same API as Flask.

For other candidates for replacement, you can take a look to this curated list:
https://github.com/timofurrer/awesome-asyncio

I just proposed a pull request to update our migration doc with awesome-asyncio #947

@4383
Copy link
Member

4383 commented Mar 26, 2024

I don't know how is designed your code base, but, either you could switch to the Eventlet asyncio hub and then migrate to quart or, directly move to quart if you do not use eventlet for something else in your code base.

That will be a win-win accomplishment.

Forgot to thank you for your previous links :)

@dhdaines
Copy link
Contributor Author

dhdaines commented Mar 26, 2024

Thank you! At one point I already recreated our API + web app using FastAPI, but we didn't end up switching because Flask is fairly embedded in our codebase - so if Quart has the same API it seems like the migration would be simpler.

We don't particularly care about asyncio (though I agree that it is better), we just need something that supports SocketIO, but it seems that to do so with Flask in production requires eventlet, gevent, or something else, which from what I understand are all somewhat of a compromise.

To reproduce the error, it's not all browsers that seem to trigger it - Safari definitely does, while Chrome on Linux seems to work okay, yet somehow Playwright hangs in testing. Full sequence to get the error is:

git clone https://github.com/roedoejet/g2p.git
pip install g2p
gunicorn --worker-class eventlet -w 1 g2p.app:APP --no-sendfile --bind 0.0.0.0:5000
python g2p/g2p/tests/test_studio.py  # in another window obiously

Actually it seems that the error may be triggered by the websocket upgrade request, as I get this exception:

[2024-03-26 08:45:40 -0400] [16229] [ERROR] Error handling request /socket.io/?EIO=4&transport=websocket&sid=62vsmkRvQIZ13WscAAAU
Traceback (most recent call last):
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/gunicorn/workers/base_async.py", line 55, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/gunicorn/workers/base_async.py", line 108, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/flask/app.py", line 2552, in __call__
    return self.wsgi_app(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/flask_socketio/__init__.py", line 43, in __call__
    return super(_SocketIOMiddleware, self).__call__(environ,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/engineio/middleware.py", line 63, in __call__
    return self.engineio_app.handle_request(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/socketio/server.py", line 430, in handle_request
    return self.eio.handle_request(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/engineio/server.py", line 274, in handle_request
    packets = socket.handle_get_request(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/engineio/socket.py", line 90, in handle_get_request
    return getattr(self, '_upgrade_' + transport)(environ,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/engineio/socket.py", line 146, in _upgrade_websocket
    return ws(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/engineio/async_drivers/eventlet.py", line 44, in __call__
    return super().__call__(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dhd/work/ReadAlongs/venv/lib/python3.11/site-packages/eventlet/websocket.py", line 137, in __call__
    environ['eventlet.set_idle']()
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^
KeyError: 'eventlet.set_idle'

@dhdaines
Copy link
Contributor Author

Oh. So, in fact, it is definitely related to that change you mentioned above (1f3db00)

@dhdaines
Copy link
Contributor Author

dhdaines commented Mar 26, 2024

So, this will fix it, but I don't know enough about the internals of eventlet to know if it's the right fix:

diff --git a/eventlet/websocket.py b/eventlet/websocket.py
index 9d6becc2..a2eba800 100644
--- a/eventlet/websocket.py
+++ b/eventlet/websocket.py
@@ -134,8 +134,8 @@ class WebSocketWSGI:
 
         # We're ready to switch protocols; flag the connection
         # as idle to play well with a graceful stop
-        environ['eventlet.set_idle']()
-
+        if 'eventlet.set_idle' in environ:
+            environ['eventlet.set_idle']()
         try:
             self.handler(ws)
         except OSError as e:

@dhdaines
Copy link
Contributor Author

Ah, I understand, the problem is that WebSocketWSGI is being used from a server that is not eventlet.wsgi.Server and so the magical environment function is not set. There are examples in code below where we simply check if keys are set in there, so it should be the right fix I believe. I'll dig a bit more as this is interesting to understand, then make a PR.

@4383
Copy link
Member

4383 commented Mar 26, 2024

@dhdaines: thanks for all these details.

Do not hesitate to ping me when you propose your PR, I'll be happy to review it.

@tipabu: FYI, this issue may interest you.

@dhdaines dhdaines changed the title eventlet 0.36.0 hangs forever in flask-socketio server eventlet 0.36.0 incompatible with flask-socketio running under gunicorn Mar 26, 2024
@dhdaines
Copy link
Contributor Author

@4383 okay, PR submitted! as noted ... mocking a Gunicorn worker is not something I have the time or talent to do at the moment so I couldn't add a test for it, but I can confirm that it fixes the problem

@flyingclimber
Copy link

Ran into this as well. I ended up rolling back to 0.35.2 which worked fine

@dhdaines
Copy link
Contributor Author

Ran into this as well. I ended up rolling back to 0.35.2 which worked fine

Yes, you can simply version your dependency to "eventlet!=0.36.0" and that should work as I presume the bug will be fixed in the next release, being rather uncontroversial :)

@4383
Copy link
Member

4383 commented Mar 29, 2024

Thanks @dhdaines for your PR and for the advices.

I'm going to merge your fix. Once merged I'll start to prepare a new release (0.36.1).
@flyingclimber by pinning your requirements like David suggested, you will be able to pull this new version.

@4383 4383 closed this as completed in #949 Mar 29, 2024
4383 added a commit that referenced this issue Mar 29, 2024
… (#949)

Co-authored-by: Hervé Beraud <hberaud@redhat.com>
@4383
Copy link
Member

4383 commented Mar 29, 2024

Fix released!

Let us know if it fixed your problem and feel free to reopen this issue if you observe unexpected behaviors.

@dhdaines
Copy link
Contributor Author

Thank you so much! Just reran our test suite, it picked up 0.36.1, and everything works: https://github.com/roedoejet/g2p/actions/runs/8452592573/job/23244684343?pr=345

@4383
Copy link
Member

4383 commented Apr 2, 2024

You are welcome, that have been our pleasure. Thanks.

openstack-mirroring pushed a commit to openstack/requirements that referenced this issue Apr 5, 2024
It will allow us to benefit from several recent fixes
related to the new asyncio hub.

We started to implement some devstack integrations
with the new eventlet asyncio hub [1], so it is worth
using a version that contains useful fixes.
See https://review.opendev.org/c/openstack/devstack/+/914108

Also this patch propose to blacklist version 0.36.0 which is
buggy version in some wsgi context. 0.36.1 fix that wsgi problem.
For further details see:

- eventlet/eventlet#946
- eventlet/eventlet#949

Change-Id: I2add78de0e3d2f439964afa0d0c9d854a52b5f7f
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Apr 5, 2024
* Update requirements from branch 'master'
  to 4ba203d335842ba76005e0ab3a395c6b1f5853c4
  - Merge "bump eventlet to the latest version"
  - bump eventlet to the latest version
    
    It will allow us to benefit from several recent fixes
    related to the new asyncio hub.
    
    We started to implement some devstack integrations
    with the new eventlet asyncio hub [1], so it is worth
    using a version that contains useful fixes.
    See https://review.opendev.org/c/openstack/devstack/+/914108
    
    Also this patch propose to blacklist version 0.36.0 which is
    buggy version in some wsgi context. 0.36.1 fix that wsgi problem.
    For further details see:
    
    - eventlet/eventlet#946
    - eventlet/eventlet#949
    
    Change-Id: I2add78de0e3d2f439964afa0d0c9d854a52b5f7f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants