
Sync vs Async workers #1409

Closed
gukoff opened this issue Dec 15, 2016 · 9 comments
Labels
- Forum
- Documentation
- help wanted (Open for everyone. You do not need permission to work on these. May need familiarity with codebase.)
- Improvement
Comments

gukoff commented Dec 15, 2016

Hi!

The documentation doesn't state clearly enough how async workers differ from sync ones, or what a programmer should do to take advantage of the difference.

I assume that asynchronous workers are spawned in separate processes based on the pre-fork model.

Say we want to see the difference between the sync and gevent worker classes using a simple example application. Here are four scenarios:

Scenario №1

The application accepts a request, makes 10 external calls using the requests library and returns an array of 10 responses.

My assumption: there is no difference; the worker class doesn't matter.
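For concreteness, the application in Scenario №1 could be sketched as the WSGI app below. The URLs are hypothetical placeholders, and the stdlib `urllib` stands in for the `requests` library so the sketch stays self-contained; both block the worker in the same way under a sync worker.

```python
import json
from urllib.request import urlopen

# Hypothetical endpoints; the original scenario uses the requests
# library, but the blocking behaviour under a sync worker is the same.
URLS = ["http://example.com/item/%d" % i for i in range(10)]

def fetch(url):
    # Blocks the whole worker unless the socket module is cooperative
    # (e.g. after gevent monkey-patching).
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8", "replace")

def app(environ, start_response):
    # Make the 10 external calls sequentially and return them as JSON.
    body = json.dumps([fetch(u) for u in URLS]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]
```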

Scenario №2

The application calls gevent.monkey.patch_all() in the pre_fork() function of the master process. Then the first scenario takes place: the app accepts a request, makes 10 external calls using the requests library and returns an array of 10 responses.

My assumption: synchronous workers implicitly turn into gevent workers.
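As a sketch, Scenario №2 could be set up in a `gunicorn.conf.py` like the following (assuming gevent is installed; whether this is advisable is exactly what the scenario probes):

```python
# gunicorn.conf.py -- sketch of Scenario 2: patch in the master.
workers = 4
worker_class = "sync"

def pre_fork(server, worker):
    # pre_fork runs in the master process just before each worker is
    # forked, so the patched modules also affect the arbiter and are
    # inherited by every worker.
    import gevent.monkey
    gevent.monkey.patch_all()
```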

Scenario №3

The same as Scenario №2, but the monkey-patch is called in a worker.

My assumptions:

  1. gevent.monkey.patch_all() doesn't affect the way workers listen on the socket. Synchronous workers don't turn into gevent workers and don't accept new calls until the previous ones are handled.
  2. A gevent worker might accept a new call while an external call with requests is in progress. The number of concurrently handled calls is capped by the worker_connections setting. That's the only difference.
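For reference, the per-worker variant of the patch in Scenario №3 could be sketched with the `post_fork` hook, which runs inside the freshly forked worker process (a hypothetical `gunicorn.conf.py` fragment, assuming gevent is installed):

```python
# gunicorn.conf.py -- sketch of Scenario 3: patch inside each worker.

def post_fork(server, worker):
    # post_fork runs in the worker process right after the fork, so
    # the master/arbiter is left unpatched.
    import gevent.monkey
    gevent.monkey.patch_all()
```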

Scenario №4

The application accepts a request, spawns 5 gevent jobs and joins them; their 5 responses will be the result. After that it spawns another 10 jobs, doesn't join them and returns immediately.

My assumptions:

  1. After finishing the first request, a synchronous worker keeps listening on the socket. The 10 spawned jobs wait until the second request is accepted, when gevent.joinall(...) is called and they might be scheduled for execution.
  2. After finishing the first request, a gevent worker executes the 10 spawned jobs one after another until the second request arrives. It can switch to handling the second request only after a gevent context switch (gevent.joinall(...), finishing any of the jobs, etc.).
  3. In case workers are reloaded (SIGHUP, max_requests, etc.), synchronous ones lose all pending spawned jobs, while gevent workers are terminated gracefully.
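The handler described in Scenario №4 could be sketched as follows (assuming gevent is installed; `do_work` and `handle_request` are illustrative names, not gunicorn API):

```python
import gevent

def handle_request(do_work):
    # First batch: spawn 5 jobs and wait for all of them; their
    # results form the response.
    first = [gevent.spawn(do_work, i) for i in range(5)]
    gevent.joinall(first)
    result = [g.value for g in first]

    # Second batch: fire-and-forget. If nothing ever yields to the
    # gevent hub again (the sync-worker case), these greenlets may
    # never run, and a worker reload discards them silently.
    for i in range(5, 15):
        gevent.spawn(do_work, i)
    return result
```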

I suspect many of these assumptions are wrong. Could you please correct me and expand the documentation accordingly?

tilgovi added the Documentation, help wanted, and Improvement labels Dec 20, 2016
benoitc (Owner) commented Dec 22, 2016

This page describes the design and gives some information about the workers:
http://docs.gunicorn.org/en/stable/design.html

I will answer in a generic manner, if that's OK with you. Hopefully it will give you enough hints to answer the scenarios above yourself.

If you run gunicorn behind a proxy that buffers the connection, the key point is not the number of connections gunicorn can accept, but rather the number of slow connections (a worker doing a huge task, connecting to an API, a database, ...) or connections that will be used for a long time (long polling, ...). In such cases an async worker is advised. When you return almost immediately, a sync worker is enough, and in most cases when the database is local or on the same network with low latency it can be used as well.

If you run gunicorn standalone, you will need a threaded or an async worker if you expect a lot of concurrent connections. Increasing the number of workers with a sync worker is also sometimes enough, when you don't expect a large number of connections or can tolerate some latency in the responses.
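The advice above could be expressed as a `gunicorn.conf.py` sketch; the values are placeholders, not recommendations:

```python
# gunicorn.conf.py -- illustrative settings for the cases above.
import multiprocessing

# Behind a buffering proxy with mostly fast responses: sync workers.
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "sync"

# Standalone, or many slow/long-lived connections: an async worker,
# with worker_connections capping concurrent requests per worker.
# worker_class = "gevent"
# worker_connections = 1000

# A middle ground without monkey-patching: the threaded worker.
# worker_class = "gthread"
# threads = 8
```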

benoitc (Owner) commented Dec 22, 2016

I will also add that monkey patching adds some side effects to your application, which may or may not be an issue. Other async workers, at least the tornado and threaded workers, don't suffer from such side effects.

gukoff (Author) commented Jan 16, 2017

@benoitc thanks for your answer!

I've already read the docs. Essentially, my point is that the docs are way too short. There are important implementation details that aren't mentioned yet.

Firstly, it came as a surprise to me that gevent workers implicitly call gevent.monkey.patch_all(). It is quite a blunt strategy, unacceptable in many cases. There should be another type of gevent worker that simply listens on a gevent socket and doesn't monkey-patch anything. This behaviour isn't explicitly documented. It's also important to know whether the main process gets monkey-patched as well as the worker processes.

Secondly, it's not very clear how the max_requests option works. If given, does it use the graceful_timeout option? If so, how does graceful_timeout work? Does it make a worker stop accepting new requests, or is that up to the developer?

Thirdly, how exactly does gunicorn restart after a HUP signal? The documentation states:

HUP: Reload the configuration, start the new worker processes with a new configuration and gracefully shutdown older workers

So, in case I have a server with 30 workers, a long-running pre_fork function (1 minute), and a graceful timeout of 20 seconds, what are the actions after the HUP? I suppose they are:

  1. Reload the application and configuration in the master process;
  2. run the pre_fork function in the master process. Wait a minute for it to finish. Don't touch the workers;
  3. fork 30 new workers. Let them work alongside the older ones. In other words, for a short period of time consume double the RAM and let 60 workers run on the same socket;
  4. gracefully shut down the older workers. Give them 20 seconds to handle the pending requests and terminate.

Am I right?

Fourthly, what happens if the master process is sent two HUP signals at the same time? Are they put in some kind of signal queue and handled consecutively? What about other signals?

Fifthly, does the recommendation of 2*CORES + 1 workers have anything to do with asynchronous workers? I think gevent workers are expected to utilise the CPU to the limit and never wait on IO-bound tasks, so ~CORES workers should be OK. Otherwise the load isn't high enough and the number of workers can be even lower.

And so on.

benoitc added this to Answered, waiting in Forum Feb 26, 2017
benoitc (Owner) commented Mar 22, 2017

@gukoff:

> Firstly, it came as a surprise to me, that gevent-workers implicitly call gevent.monkey.patch_all(). It is quite a rough strategy, unacceptable in many cases. There should be another type of gevent workers, which simply listen on a gevent socket and don't monkey-patch anything. And this behaviour isn't explicitly documented. It's also important to know, whether the main process gets monkey-patched as well as the worker processes.

Can you open a ticket about it?

benoitc (Owner) commented Mar 22, 2017

To answer your last questions: max_requests is used when you know you will have to recycle (kill) the current worker at some point. It's useful when you know that your worker is leaking memory, or needs to be reset at some point.
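A recycling setup like the one described might look like this in a `gunicorn.conf.py` (values are placeholders):

```python
# gunicorn.conf.py -- recycle leaky workers periodically.
max_requests = 1000        # restart a worker after ~1000 requests
max_requests_jitter = 50   # stagger restarts so workers don't all recycle at once
graceful_timeout = 30      # seconds to finish in-flight requests before the kill
```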

Hooks must be processed fast. If not, you may block either the worker or the arbiter, preventing any other scheduled actions.

The arbiter queues the signals, so two HUPs will be handled sequentially.

2*N+1 is a generic rule to load-balance the sockets between the workers across CPUs/cores, especially useful for the sync worker.
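That rule of thumb is often written directly in the config (a sketch; treat it as a starting point and tune from measurements, not from the formula alone):

```python
# gunicorn.conf.py -- generic starting point for the worker count.
import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1
```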

mdomans commented Jul 29, 2017

Actually, I stumbled into this line of questioning recently at work and was rather badly bitten by mixing gevent with gunicorn without considering that it's Python too.

@benoitc my simple question is this: assuming I'm using SyncWorker as my worker, and somewhere in the code serving a request I call monkey.patch_all(), how far up the component tree will patch_all go? Will it patch the SyncWorker for other requests too, effectively making it a gevent worker?

RonRothman commented

@mdomans In what way were you bitten by mixing gevent and gunicorn?

(I'm curious because we've been using gunicorn+gevent successfully for 2 years now.)

mdomans commented Aug 3, 2017

I used the gevent-based grequests library, which calls monkey.patch_all(). This in turn resulted in a lot of socket errors for other requests.
Important note: we use SyncWorkers, and I needed gevent to be very precisely scoped to only one function. As it turns out, the patching somehow leaked out.

@RonRothman curious to talk about your architecture :)

benoitc (Owner) commented Oct 9, 2018

Closing the issue, superseded by #1746.

benoitc closed this as completed Oct 9, 2018