Workers silently exit when allocating more memory than the system allows. #1937
Comments
I like both proposals.
I don't think we should try to respawn indefinitely in any case. We should probably track the number of times we tried to respawn a worker within a time window and decide to stop at some point, shouldn't we? Additionally, we should indeed log the status code error. IMO it's better to crash and let the user handle restarting and so on.
Hi @benoitc
@adoukkali What do you mean by "silently"? If something is crashing, then it will crash. If you don't want the worker to crash, you should take measures in your application so it doesn't trigger an exception that makes it crash.
Any update on this? @adoukkali @benoitc What about the 2nd proposal?
Problem Description
Gunicorn workers silently exit when allocating more memory than the system allows. This causes the master to enter an infinite loop where it keeps booting new workers unsuccessfully. No hook/logging exists to detect or handle this behavior.
Files
These files will allow you to reproduce the behavior (clonable version here: https://github.com/jonathanlunt/gunicorn-memory-example).
- `docker` is used to artificially constrain system resources.
- `app.py`: the generic "hello world" app from the gunicorn documentation.
- `config.py`: includes a `post_fork` function that allocates 100MB (a sketch follows below).
- `Dockerfile`
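The exact file lives in the linked repository; a minimal sketch of such a `config.py`, using gunicorn's real `post_fork(server, worker)` hook, might look like:

```python
# config.py -- sketch of the repro config, not the repository's exact file.
# post_fork is a standard gunicorn server hook; allocating ~100MB here
# exceeds the 50MB container limit, so the kernel OOM killer SIGKILLs
# the freshly forked worker.
def post_fork(server, worker):
    worker.big_allocation = bytearray(100 * 1024 * 1024)
```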
Usage
The run command limits the container to only 50MB of memory.
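A plausible invocation (the image tag is illustrative; see the linked repository for the exact commands):

```sh
docker build -t gunicorn-memory-example .
docker run --memory=50m --memory-swap=50m gunicorn-memory-example
```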
Proposed Solutions
Arbiter Logging:
Since no error is logged, it is difficult to determine when this behavior is triggered other than by tracking the number of times a worker is created. One possible fix would be to log the exit status code in the `Arbiter`.

Update `Arbiter.reap_workers` (gunicorn/gunicorn/arbiter.py, line 524 in 33025cf) to include:
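(The embedded snippet did not survive extraction; the following is a sketch of what such logging could look like, using the `wpid`/`status` pair that `os.waitpid` already returns inside `reap_workers` -- not the issue author's original code.)

```python
# Sketch only: inside Arbiter.reap_workers, after
# wpid, status = os.waitpid(-1, os.WNOHANG) returns a dead worker.
if os.WIFSIGNALED(status):
    # e.g. the kernel OOM killer delivers SIGKILL (signal 9)
    self.log.error("Worker (pid:%s) was killed by signal %s",
                   wpid, os.WTERMSIG(status))
elif os.WEXITSTATUS(status) != 0:
    self.log.error("Worker (pid:%s) exited with code %s",
                   wpid, os.WEXITSTATUS(status))
```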
Worker Exit Code Tracking:
Another way to handle the error is to allow the user to perform an action based on the process exit code. However, the exit code does not currently appear to be tracked by the worker class.

Update `Worker.__init__` (gunicorn/gunicorn/workers/base.py, line 36 in 33025cf) to include:
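(Again, the embedded snippet was lost; a sketch of the proposed attribute, with the constructor signature approximated from roughly that revision and its body elided:)

```python
# Sketch only: proposed addition to Worker.__init__.
def __init__(self, age, ppid, sockets, app, timeout, cfg, log):
    ...
    # Raw waitpid status, set by the arbiter when it reaps this
    # worker; None while the worker is alive.
    self.exitcode = None
```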
Update `Arbiter.reap_workers` (gunicorn/gunicorn/arbiter.py, line 534 in 33025cf) to include:
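(A sketch of where the status could be recorded, just before the `child_exit` hook fires for the dead worker:)

```python
# Sketch only: in Arbiter.reap_workers, where the dead worker is
# popped from self.WORKERS.
worker = self.WORKERS.pop(wpid, None)
if worker:
    worker.exitcode = status  # proposed: record the raw waitpid status
    worker.tmp.close()
    self.cfg.child_exit(self, worker)
```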
The `status` will come up as 9 in this case, since the kernel OOM killer terminates the worker with SIGKILL (signal 9) and the low byte of the `waitpid` status is the signal number. This would allow user-provided `child_exit` code to make a decision based on `Worker.exitcode`, for example:
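A hypothetical hook in the gunicorn config file (`child_exit(server, worker)` is a real gunicorn server hook; the `worker.exitcode` attribute is the proposed addition):

```python
# Hypothetical gunicorn config snippet, assuming the proposed
# Worker.exitcode attribute exists.
import signal

def child_exit(server, worker):
    # exitcode 9 == SIGKILL, the signal the OOM killer sends
    if worker.exitcode == signal.SIGKILL:
        server.log.error("Worker (pid:%s) was SIGKILLed, possibly out of memory",
                         worker.pid)
```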
Comments
If there are other solutions to this issue, I'd be happy to hear them, but for now I don't know if there's a good way to track/handle this situation with gunicorn by default.
I would be willing to submit a PR for the proposed solutions, but I wanted to raise this as an issue first to get feedback on the best way to handle this behavior.