Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Flask services' performance issues on Windows #1134

Merged
merged 10 commits into from Jul 21, 2022

Conversation

ricklamers
Copy link
Member

@ricklamers ricklamers commented Jul 19, 2022

Description

Using eventlet caused performance issues when running Orchest on Windows using minikube and when using the Docker for Desktop included Kubernetes cluster.

According to Flask SocketIO they recommend the gthreads + simple-web-socket deployment configuration when using Gunicorn if you don't want to rely on the functioning of green threads.

https://flask-socketio.readthedocs.io/en/latest/deployment.html?highlight=threads#gunicorn-web-server

@fruttasecca will probably need to some more testing, but so far the performance has been rock solid with this configuration for me.

Closes: #681, #235

Checklist

  • I have manually tested my changes and I am happy with the result.
  • The documentation reflects the changes.
  • The PR branch is set up to merge into dev instead of master.
  • I haven't introduced breaking changes that would disrupt existing jobs, i.e. backwards compatibility is maintained.
  • In case I changed the dependencies in any requirements.in I have run pip-compile to update the corresponding requirements.txt.

@yannickperrenet
Copy link
Contributor

This PR would sidestep (but thereby fix) #681 and #235.

@yannickperrenet
Copy link
Contributor

yannickperrenet commented Jul 19, 2022

Some general notes/thoughts.

I guess that "gthreads" means "green-threads"? Which means that those threads would be scheduled cooperatively and running on one process (due to -w 1). I know that eventlet patches the Python standard library to ensure tasks yield control during IO tasks. How does "gthreads" switch between threads? Probably good to know as some issues appear to be fixed, but might have introduced others 🤔

Going with gthreads I guess we still can't change our "custom" rmtree, copytree and alike functions to functions from the shutil module given that they would still block the main thread?

def rmtree(path, ignore_errors=False) -> None:
"""A wrapped rm -rf.
If eventlet is being used and it's either patching all modules or
patchng subprocess, this function is not going to block the thread.


EDIT: To answer my own question. I guess "gthreads" means "gunicorn-threads" (aka their own implementation). From the source code we can see that they are using futures.ThreadPoolExecutor: link.

Which would mean that the threads are scheduled preemptively by the OS.

I think it would be good to test what happens when a large project-dir needs to be snapshotted for job creation (thus triggering the custom copytree implementation) as the OS will then have to decide between the CPU intensive task (the copy) and handling new requests. Although, that should be handled correctly due to us executing the copy in a subprocess.

@fruttasecca
Copy link
Member

I guess we want to update the auth server as well, for consistency?

@fruttasecca
Copy link
Member

fruttasecca commented Jul 20, 2022

I'm getting socketio errors in the orchest-webserver due to some breakage between dependencies of the orchest-api and orchest-webserver, thus any form of logs (env builds, steps, etc.) won't work.

ERROR:engineio.server:server.py - [20/Jul/2022 08:00:34] - The client is using an unsupported version of the [Socket.IO](http://socket.io/) or [Engine.IO](http://engine.io/) protocols

I have tried pinning the following for the webserver, didn't work

Flask-SocketIO==4.3.2
python-engineio==3.14.2
python-socketio==4.6.1

ricklamers and others added 4 commits July 20, 2022 11:35
@ricklamers
Copy link
Member Author

Moved to Flask-SocketIO==5.2.0 (and updated JS socket.io client to 4.5.1 to support the new socket.io protocol required by Flask-SocketIO 5.X).

@yannickperrenet yannickperrenet changed the title Fix orchest-webserver/orchest-api performance issues Fix Flask services' performance issues on Windows Jul 20, 2022
@yannickperrenet yannickperrenet added improvement An improvement or enhancement to an existing feature. fix PR fixes a bug labels Jul 20, 2022
We no longer use `eventlet` but still require the patched implementation
for gunicorn's `gthread`.
@yannickperrenet
Copy link
Contributor

Did some testing and everything seemed to work fine.

The testing that I did:

  • Add artificial load on the webserver by running for i in {1..10000}; do curl localhost/async/projects; done and manually going through the application as well
  • Created a reasonably large file (head -c 3GB /dev/zero > data) in the project directory and created jobs (all whilst doing the above load).

Everything worked smoothly.

Probably something went wrong during the `script/build_container.sh`
that didn't clean up the `orchest-cli` in the first place.
To allow for easier alphabetical sorting of the lines.
Copy link
Contributor

@yannickperrenet yannickperrenet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I don't own a Windows machine so I trust that everything works 😉 Glad that everything still works fine on Linux as well. (Should we also run a round of tests on macOS?)

Copy link
Member

@fruttasecca fruttasecca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, did some testing as well and things worked smoothly, some testing on mac wouldn't hurt given that performance is sometimes an issue on docker on desktop etc.

@ricklamers ricklamers removed the request for review from iannbing July 21, 2022 13:43
@mausworks
Copy link
Member

Did some basic testing on macOS Monterey 12.4.

  • Pulled this branch
  • Ran the build scripts/build_container.sh -M -t $TAG -o $TAG with the latest tag
  • Ran orchest install --dev
  • Opened Orchest and did a hard reload
  • Imported a project: a bit sluggish, but it's really hot here now, so everything is a bit slow
  • Let the environment build
  • Added some files, removed some, moved some around
  • Ran the pipeline
  • Viewed the logs, etc …

No hiccups from what I could see. 👍

Copy link
Member

@mausworks mausworks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not Python guru, but LGTM. 👍

@ricklamers ricklamers merged commit 1ce1d0f into dev Jul 21, 2022
@ricklamers ricklamers deleted the fix/webserver-gthreads branch July 21, 2022 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix PR fixes a bug improvement An improvement or enhancement to an existing feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants