
Gunicorn plumbing #357

Merged: 8 commits into master from gunicorn, Sep 11, 2020

Conversation

c0c0n3
Member

@c0c0n3 c0c0n3 commented Sep 9, 2020

This PR builds the plumbing to run QuantumLeap as a WSGI app in Gunicorn---details below.

WSGI

The construction of the QuantumLeap WSGI callable got factored out to a separate module (server.wsgi) so as to have a clean WSGI interface provider. The callable is a Connexion wrapper that manages the Flask application in which QuantumLeap runs, just like we had it before this PR. It's a singleton object, exposed through the application variable, which you can use to run QuantumLeap in your WSGI container of choice (Gunicorn, uWSGI, etc.). Here's an example of running the QuantumLeap WSGI app in Gunicorn:

$ cd ngsi-timeseries-api/src
$ gunicorn server.wsgi --config server/gconfig.py

The above Gunicorn config file contains settings that should be okay in most situations (more about it in the next section) but YMMV.
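To make the WSGI side concrete, here's a minimal sketch of the contract server.wsgi fulfils: a module-level callable, conventionally named application, that any WSGI container (Gunicorn, uWSGI, etc.) can load. The toy app below is purely illustrative and is not QuantumLeap's actual code.

```python
# Minimal illustration of a WSGI callable like the one server.wsgi
# exposes as `application`. The body and headers here are made up;
# only the calling convention matches what a WSGI container expects.
def application(environ, start_response):
    # environ: CGI-style request dict; start_response: status/header callback.
    body = b'{"status": "ok"}'
    start_response('200 OK', [
        ('Content-Type', 'application/json'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

A container such as Gunicorn imports the module, looks up application, and calls it once per request.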

Also part of the server.wsgi module is a function to run the bare-bones QuantumLeap WSGI app in the Flask dev server, just as we've done up to now. This function comes in handy for debugging and provides a safety net: if we realise the QuantumLeap app is unstable when run in Gunicorn, we can quickly switch back to the old Flask dev server setup (see the USE_FLASK env var in the next section). Existing QuantumLeap users can also leverage it for backward compatibility, although I doubt this is a real use case: I can't think of any good reason for our users not wanting to upgrade to a proper WSGI container, but someone may have a good use case for sticking with Flask.
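The dev-server fallback described above boils down to serving the same WSGI callable with a simple single-process server instead of Gunicorn. Here's a hedged sketch using the standard library's wsgiref server as a stand-in (function names are assumptions, not the actual server.wsgi API):

```python
# Hypothetical sketch of a dev-server fallback: run the same WSGI
# callable in a toy single-threaded server. Debugging/safety-net only;
# never use this in production.
from wsgiref.simple_server import make_server

def make_dev_server(app, host='127.0.0.1', port=8668):
    # port=0 asks the OS for any free port, which is handy in tests.
    return make_server(host, port, app)

def run_dev_server(app, host='127.0.0.1', port=8668):
    with make_dev_server(app, host, port) as httpd:
        httpd.serve_forever()   # blocks until interrupted
```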

Gunicorn support

While server.wsgi gives QuantumLeap users the freedom to choose any WSGI runtime, we also provide fully-fledged support for Gunicorn in code. The rationale for that is:

  • we decided to use Gunicorn in production in place of uWSGI
  • the plumbing required to run QuantumLeap decently in Gunicorn is quite involved, error-prone, and takes lots of experimentation
  • ditto for the ability to easily debug in your IDE/editor with the same software stack you'd use in prod

The last point is critical when running QuantumLeap in Gunicorn+gevent. In fact, while you could debug in the Flask dev server, that'd be miles away from what actually happens in Gunicorn: gevent/greenlet monkey-patches Python, Gunicorn spawns several processes, and so on, so it's a dramatically different runtime!

So we've got our own drop-in replacement for the Gunicorn server runner. It starts QuantumLeap in Gunicorn just like you'd do yourself with a suitable command line (see the example in the previous section), but is more flexible w/r/t configuration and can be easily debugged in your IDE/editor. In fact, the server.grunner module comes with a run function that starts a fully-fledged Gunicorn server to run the QuantumLeap WSGI app. This is the function QuantumLeap's entry point (app.py) now calls. As a result, invoking the Python interpreter on app.py triggers the following sequence of actions:

  1. The Gunicorn master process gets booted.
  2. Our server.grunner reads in the Gunicorn config settings in server.gconfig.
  3. Then it reads in the config file specified through the --config CLI option, if given. Any setting in this file with the same name as one in server.gconfig overrides the server.gconfig value.
  4. Finally it processes any other CLI arguments as specified by the Gunicorn docs. In particular, any config settings given on the CLI override settings with the same name from (2) and (3).
  5. server.grunner tells the master to load the QuantumLeap WSGI app. (This can't be configured; any other WSGI callable specified in (3) or (4) is ignored.)
  6. The master completes the startup procedure by forking worker processes.

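The precedence in steps (2) to (4) amounts to layered overriding: defaults first, then the --config file, then the CLI, with later layers winning. A hedged sketch (the dicts below are stand-ins for illustration, not the real Gunicorn config machinery):

```python
# Illustrative config layering in the spirit of steps (2)-(4):
# gconfig defaults < --config file < CLI flags.
def merge_settings(gconfig_defaults, config_file_settings, cli_settings):
    merged = dict(gconfig_defaults)       # step (2): baseline from server.gconfig
    merged.update(config_file_settings)   # step (3): --config file wins over defaults
    merged.update(cli_settings)           # step (4): CLI wins over both
    return merged
```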
If you pass no CLI args as in

$ python app.py

then this is the same as running

$ gunicorn server.wsgi --config server/gconfig.py

Here's an example of overriding the default bind address in server/gconfig.py with a CLI option to make QuantumLeap listen on port 8080 instead of the default 8668:

$ python app.py -b '0.0.0.0:8080'

You'll find an example of using a custom config file in the section below about the new Docker image. Before moving on, we should mention there's an escape hatch. If you don't want Gunicorn and would rather run QuantumLeap classic (i.e. exactly the way it was before this PR, with the Flask dev server), all you need to do is set the USE_FLASK environment variable to one of the following values (case doesn't matter): true, yes, 1, t, y.
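The USE_FLASK check could look something like the sketch below (the helper name is an assumption; only the accepted truthy values come from this PR):

```python
# Hypothetical sketch of the USE_FLASK escape hatch: a case-insensitive
# check against the truthy values listed above.
import os

TRUTHY = {'true', 'yes', '1', 't', 'y'}

def use_flask():
    return os.environ.get('USE_FLASK', '').strip().lower() in TRUTHY
```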

Gunicorn parallelism and concurrency

QuantumLeap does a lot of network IO, so we configure worker processes to use multi-threading (gthread) to improve performance (see the server.gconfig module). With this setting, each request gets handled in its own thread taken from a thread pool. In our tests, the gthread worker type had better throughput and latency than gevent, but gevent used less memory, most likely because of the difference in actual OS threads. So for now we go with gthread and a low number of threads. This gives us better performance and reasonable memory consumption, and keeps us from accidentally falling into the gevent monkey-patching rabbit hole. Also note that, according to the Gunicorn docs, when using gevent, Psycopg (the Timescale driver) needs psycogreen properly configured to take full advantage of async IO. (Not sure what to do for the Crate driver!)
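A Gunicorn config file is just a Python module assigning to well-known setting names. Here's a hedged sketch of a gthread-style config in the spirit of server/gconfig.py; the actual values in the repo may differ:

```python
# Illustrative Gunicorn config (not the repo's actual gconfig.py).
# Gunicorn reads module-level names like bind, workers, worker_class, threads.
import multiprocessing

bind = '0.0.0.0:8668'
worker_class = 'gthread'                       # threaded workers for IO-bound load
workers = multiprocessing.cpu_count() * 2 + 1  # common Gunicorn rule of thumb
threads = 2                                    # low thread count, as discussed above
```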

But here's the surprise. In our tests, w/r/t throughput gthread outperformed gevent by 27%. Latency was pretty much the same though. The funny thing is that we used exactly the same number of worker processes and the default number of threads per process, which is, wait for it, 1. Yes, 1. We did some initial quick & dirty benchmarking to get these results. We'll likely have to measure more carefully and also understand better how the various Gunicorn worker types actually work. (Pun intended.)

Docker image

The Docker image got slightly updated too. It still calls python app.py, but now it does that as an entry point rather than a command. This makes it easier to specify CLI args to docker run or in k8s/docker-compose YAML. In particular, it gives you a convenient way to reconfigure Gunicorn: mount your config file into the container and run the container with the following option

--config /path/to/where/you/mounted/your/gunicorn.conf.py

as in the below example

 $ echo 'workers = 2' > gunicorn.conf.py
 $ docker run -it --rm \
              -p 8668:8668 \
              -v $(pwd)/gunicorn.conf.py:/gunicorn.conf.py \
              smartsdk/quantumleap --config /gunicorn.conf.py

@c0c0n3
Member Author

c0c0n3 commented Sep 11, 2020

@chicco785 the build is broken because of an OSM issue unrelated to this PR's change set; I created a separate GitHub issue for that, #358. Not sure why GitHub isn't displaying a link to the build log, but here it is:

Should we go ahead and merge or is there something else you'd like to do?

@chicco785
Contributor

I would go ahead, but there isn't even a CI result to look at in my GUI :/

# Logging config section.
#

loglevel = 'debug'
Contributor


is this logging for gunicorn or also for the app?

Member Author

@c0c0n3 c0c0n3 Sep 11, 2020


It's Gunicorn's own log level; it doesn't affect QuantumLeap. That's the log level you had in #354 and I suggest we keep it that way until we get to know Gunicorn better :-)

@c0c0n3
Member Author

c0c0n3 commented Sep 11, 2020

I would go ahead, but there isn't even a CI result to look at in my GUI :/

Yea, for some weird reason GitHub isn't displaying the link. No idea why, but here's Travis's log:

@chicco785
Contributor

@c0c0n3 let's disable the failing geocoding (to have clean tests) and let's merge

@c0c0n3 c0c0n3 mentioned this pull request Sep 11, 2020
@c0c0n3
Member Author

c0c0n3 commented Sep 11, 2020

disable the failing geocoding (to have clean tests)

agreed :-) just pushed the change and mentioned in #358 that we should re-enable it as soon as we have a fix.

Contributor

@chicco785 chicco785 left a comment


LGTM

@chicco785
Contributor

the build passed: https://travis-ci.org/github/smartsdk/ngsi-timeseries-api/builds/726220256, but we are still stuck; let's merge somehow

@chicco785 chicco785 merged commit 098e3e5 into master Sep 11, 2020
@c0c0n3 c0c0n3 deleted the gunicorn branch September 11, 2020 10:40
@c0c0n3 c0c0n3 mentioned this pull request Feb 1, 2021