Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set REMAP_SIGTERM=SIGQUIT to prevent Celery task loss on dyno restarts #817

Open
edmorley opened this issue Apr 10, 2019 · 1 comment
Open

Comments

@edmorley
Copy link
Member

edmorley commented Apr 10, 2019

When a Heroku dyno is restarted (eg daily restart or deploy), all processes on the dyno are sent a SIGTERM signal prior to being shutdown, to allow them time to gracefully exit:
https://devcenter.heroku.com/articles/dynos#graceful-shutdown-with-sigterm

However when using Celery, this SIGTERM signal results in Celery's worker processes being shutdown before the master process has had a chance to send unfinished tasks back to the broker (since SIGTERM is the signal used internally by the master process to instruct the workers to shutdown). This causes tasks to be lost.

As such, a new feature was added to Celery 4, which allows customisation of the signal used internally by Celery (via the environment variable REMAP_SIGTERM), so that it no longer clashes with the SIGTERM sent by Heroku.

See:
celery/celery#2839
https://devcenter.heroku.com/articles/celery-heroku#using-remap_sigterm

If users miss the mention in the devcenter article and don't set the environment variable, they will lose tasks upon deployments and daily dyno restarts. At my previous employer we regularly experienced lost tasks, which were eventually tracked down to this issue (at that point we were stuck on Celery 3 which didn't support REMAP_SIGTERM).

As such it seems like this buildpack should set REMAP_SIGTERM=SIGQUIT automatically for the user when using Celery (if it's not already set in their environment).

In fact, the environment variable doesn't appear to be used by anything other than Celery (and its internal library, billiard), so it may be safe to enable globally without a Celery conditional (similar to how FORWARDED_ALLOW_IPS is configured regardless of the presence of gunicorn).

Tracked internally at W-8133993.

@zyv
Copy link
Contributor

zyv commented Feb 22, 2023

As a workaround we had to add it to our Procfile for now :-/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants