
Documentation on scaling and concurrency #960

Open · ipartola opened this issue Mar 7, 2018 · 20 comments

ipartola commented Mar 7, 2018

Current documentation (2.0.2) does not mention how to run multiple workers beyond running a supervisor process and using a shared FD. What is the correct way to scale beyond a single daphne/worker process?

Also, there is no discussion of concurrency. How many threads are run at once? How is that affected by sync code that, e.g., talks to the database or to remote APIs via blocking sockets? Is the number of threads a tunable parameter?

andrewgodwin (Member) commented

Documentation of how to deploy could fill an entire book, but I agree there should be a bit more; I'll put it on the backlog to write some up. Scaling beyond a single daphne process should be done the same way you scale beyond a single process/machine for any other Python webserver (gunicorn, etc.): reverse proxies, load balancers, and so on.

For an immediate answer on threading: the number of threads run at once is, by default, the default for ThreadPoolExecutor (which I believe is CPUs * 5). It's tunable via the ASGI_THREADS environment variable. More threads will not necessarily be faster, though, as Python's context switching will start to hurt.
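As a rough sketch of the sizing described above (illustrative only; the actual lookup lives inside asgiref/daphne and varies by version):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Illustrative only, not Daphne's actual code: on Python 3.5-3.7 the
# ThreadPoolExecutor default is os.cpu_count() * 5 workers, and the
# ASGI_THREADS environment variable overrides that default.
default_workers = (os.cpu_count() or 1) * 5
max_workers = int(os.environ.get("ASGI_THREADS", default_workers))

# Sync code (ORM queries, blocking sockets, etc.) runs on this shared pool.
executor = ThreadPoolExecutor(max_workers=max_workers)
```

So, for example, `ASGI_THREADS=4 daphne myproject.asgi:application` would cap the pool at four threads.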

ipartola (Author) commented Mar 7, 2018

@andrewgodwin would a pull request that adds child process management to daphne be welcome? The reason is that my use case involves running channels-based apps on Heroku, where, as far as I can tell, I can't use the process manager + FD method to run multiple processes on a single dyno, even though that is the desirable and cost-effective way of doing it.

Being able to do what other (non-ASGI) protocol servers such as gunicorn do, and run daphne with --concurrency 2 to start two protocol servers, with the parent accepting connections and handing them to the children for actual processing, would be ideal. I'd be happy to code this up if you think it would be valuable as another simple scaling option.
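For readers unfamiliar with the pre-fork pattern being proposed, here is a minimal, self-contained sketch (Unix-only, and purely illustrative: the --concurrency flag does not exist in daphne, and a real worker would hand the socket to a protocol server rather than answer requests itself):

```python
import os
import socket

# The parent binds the listening socket once; forked children inherit the
# file descriptor and the kernel balances accept() calls among them.
port = int(os.environ.get("PORT", "8000"))  # Heroku supplies PORT this way
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", port))
listener.listen(128)

NUM_WORKERS = 2  # what a hypothetical --concurrency 2 would mean

def serve_forever(sock: socket.socket) -> None:
    # Stand-in for a real protocol server loop.
    while True:
        conn, _addr = sock.accept()
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()

for _ in range(NUM_WORKERS):
    if os.fork() == 0:  # child process: serve on the inherited socket
        serve_forever(listener)

for _ in range(NUM_WORKERS):  # parent: wait for the children
    os.wait()
```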

andrewgodwin (Member) commented

@ipartola I really don't want to take on the maintenance and security overhead of child process management if I can avoid it - there's plenty of other software out there that does it better than a lacklustre version we'd roll into Daphne.

Can Heroku really not run a process manager with socket support on a dyno? That seems... lacking. It's very easy to do in Docker.

ipartola (Author) commented Mar 9, 2018

Heroku can absolutely run a process manager; there just isn't a documented one that supports inheriting a socket file descriptor. We found Circus, which would work, except that it doesn't support getting the port to bind from an environment variable or a command-line parameter, which is how you'd do it on Heroku.

So one way to accomplish this would be to find a process manager that fits both needs: it can take the port number on the command line or via an env var, and it can do the FD inheritance. At that point, just documenting it would be ideal. Are you aware of a process manager that fits these criteria?

Another option would be to use Circus as a library (which is supported) and adapt it specifically for use with daphne. That way, maintenance of most of that code would lie with Circus, with the channels project providing only the glue code needed for Circus to manage several daphne processes (see https://circus.readthedocs.io/en/latest/for-devs/). Would you be open to doing it that way?
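For reference, the library usage mentioned above looks roughly like this, based on the linked Circus for-devs page (the daphne command line is illustrative; as written, the two children would race to bind the same port, so real FD sharing would need a Circus socket definition on top):

```python
from circus import get_arbiter

# Ask Circus to supervise two daphne processes as a single watcher.
arbiter = get_arbiter([{
    "cmd": "daphne -p 8000 myproject.asgi:application",
    "numprocesses": 2,
}])
try:
    arbiter.start()  # blocks, managing the children
finally:
    arbiter.stop()
```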

andrewgodwin (Member) commented

I'm not really open to solving this problem in the Channels project directly, as it only affects a specific hosting platform. It would be better to do something like this as a separate third-party app (running Daphne directly via its Server class interface is easy).
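A minimal sketch of what running Daphne via its Server class might look like; the constructor arguments here are an assumption to verify against daphne/server.py for your version, and myproject.asgi is a placeholder module:

```python
from daphne.server import Server

from myproject.asgi import application  # placeholder ASGI application

# One Server per process; a third-party wrapper could fork first, then
# build a Server in each child around a shared port or inherited FD.
server = Server(
    application=application,
    endpoints=["tcp:port=8000:interface=127.0.0.1"],  # Twisted endpoint string
)
server.run()  # blocks until shutdown
```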

ChillarAnand commented

@ipartola Did you manage to run multiple instances?

I am not able to run multiple instances from one daphne invocation. For now, I am running multiple instances on multiple ports and load balancing across them with nginx.

ipartola (Author) commented

@ChillarAnand I did not. I tried using Circus (https://circus.readthedocs.io) and the FD handoff that is documented for daphne and Circus. Somewhere in there something doesn't work, and I never figured out which project has the bug. Regardless, my team found that the FD handoff is basically not working, and there aren't many alternatives to Circus that are platform-independent and written in Python. For now our solution on Heroku is to run one daphne instance per dyno. This is more expensive than we'd like, but it works for now.

@andrewgodwin I don't think this is a single-platform problem. Being able to run multiple child processes is pretty standard: gunicorn certainly does it, as do apache2, uwsgi, etc. I would argue that having some kind of solution for this is important. I can see not wanting that code in this codebase, but having clear, working instructions for getting this working with another, platform-independent project would be a good thing. At some point I know I'll be circling back to getting daphne working with Circus or an alternative (someone please suggest one!). Would you be open to a pull request for documentation covering that?

andrewgodwin (Member) commented

I'd be happy to take a pull request or docs change to make this better. I agree other servers implement this - most also have bigger maintenance teams, and I unfortunately can't take on something as complex as process launching and management all by myself.

ChillarAnand commented

@ipartola uvicorn might fit your needs? You can start multiple worker processes with `uvicorn foo.asgi --workers 5`.

ipartola (Author) commented

@ChillarAnand very interesting! I'll check it out.

luk156 commented May 20, 2018

What is the relation between ASGI_THREADS and the maximum number of simultaneous connections to my sockets?

andrewgodwin (Member) commented

ASGI_THREADS is the max number of simultaneous sync operations you can run. It's not directly related to the number of simultaneous connections, but if all your connections are trying to do sync things it could become a bottleneck (though often turning it down helps more, as Python threading is not very good at context switching).
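To make that concrete, here is a small hedged sketch (the function is made up) showing why blocking work, rather than connection count, is what consumes the pool:

```python
import time
from asgiref.sync import sync_to_async

@sync_to_async
def blocking_lookup():
    # Stand-in for an ORM query or a blocking API call. Each in-flight call
    # occupies one thread from the ASGI_THREADS-sized pool until it returns,
    # so with ASGI_THREADS=4 a fifth concurrent call waits for a free thread.
    time.sleep(1)

# Inside an async consumer you would write:  await blocking_lookup()
```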

luk156 commented May 20, 2018

I changed my asgi.py:

```python
import os
import django
from channels.routing import get_default_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "get4.settings.10__settings")
os.environ.setdefault("ASGI_THREADS", "5")
django.setup()
application = get_default_application()
```

When supervisor starts, I see 5 processes, but after a few minutes they become 10. Is there auto-scaling inside daphne?

Thanks in advance.

andrewgodwin (Member) commented

Daphne only ever launches a single process - are you seeing processes or OS-level threads? As for threads, Daphne launches at most the number you specify in ASGI_THREADS, via a Python ThreadPoolExecutor.

luk156 commented May 20, 2018

I see OS-level threads, sorry.

When supervisor starts, I see 1 process with 4 threads; after some minutes there are more than 20 threads...

https://ibb.co/jCUrko
https://ibb.co/jS8TWT

After some hours there are 25 threads (5 * CPUs) again.

My problem is RAM consumption: I have a lot of MySQL threads, I'm running 30 daphne instances, and I need 8 GB of RAM.

I opened an issue at django/daphne#201.

umgelurgel commented

I was also looking for information about scaling in the context of Heroku. Is there any way to scale beyond a single worker process? That is, is there a way for one or many daphne processes to communicate with multiple python manage.py runworker processes running the same consumer?

andrewgodwin (Member) commented

@umgelurgel If you are using Channels 2, then scaling works just like it does for any other Python server. With Channels 1 you can still scale, but you'll need to run daphne and the workers as separate process types, scale them independently, and use the Redis channel layer.

ChillarAnand commented

If anyone knows a good tutorial/documentation on how to scale Daphne with a process supervisor and a shared file descriptor, please share it here or add it to the documentation.

If it is OK to reference third-party applications, I can submit a patch which shows how to scale using uvicorn or hypercorn.

mojimi commented Jul 16, 2019

I don't see why the deployment docs can't be a book - there are hundreds of books about Django already!

This will really come in handy with the release of Django 3.0/3.1, which includes native ASGI support.

ChillarAnand commented

A year back, I wrote a blog post about Django Channels deployment, as there weren't any resources for it.
