
Configure celery to use soft shutdown #5000

@bjester


This issue is not open for contribution. Visit Contributing guidelines to learn about the contributing process and how to find suitable issues.

Current behavior

During continuous deployment, when the containers running production code are shut down, in particular the celery workers, tasks may get interrupted and may not be reprocessed after the new code and workers have started. Kubernetes gives the worker container a 30-second grace period to shut down.

Desired behavior

With Celery 5.5.0, a new feature was added to enable a 'soft shutdown', which is a time-limited warm shutdown. A warm shutdown allows the workers to finish any in-progress tasks before they actually terminate. We can take advantage of this feature by setting a soft shutdown timeout. This likely won't help long-running publishing tasks, but would likely help all others. To implement this, we should:

  • upgrade to Celery 5.5.0 if we haven't already
  • set worker_soft_shutdown_timeout=28 for Celery (28 is slightly less than the 30 seconds of grace given by k8s)
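
The two steps above could be sketched roughly as follows. This is a minimal illustration, not Studio's actual configuration: the helper names (supports_soft_shutdown, soft_shutdown_timeout), the constants, and the CELERY_SETTINGS dict are hypothetical, and only the worker_soft_shutdown_timeout setting name and the 30/28 second values come from this issue.

```python
# Hypothetical constants; the 30 s grace period and 28 s timeout come from the issue.
K8S_GRACE_PERIOD_SECONDS = 30
SHUTDOWN_MARGIN_SECONDS = 2


def supports_soft_shutdown(celery_version: str) -> bool:
    """Return True if the given Celery version string is 5.5 or newer."""
    major, minor = (int(part) for part in celery_version.split(".")[:2])
    return (major, minor) >= (5, 5)


def soft_shutdown_timeout(
    grace_period: int = K8S_GRACE_PERIOD_SECONDS,
    margin: int = SHUTDOWN_MARGIN_SECONDS,
) -> int:
    """Pick a timeout slightly shorter than the grace period k8s allows."""
    if margin <= 0 or margin >= grace_period:
        raise ValueError("margin must be positive and smaller than the grace period")
    return grace_period - margin


# Settings that would be applied to the project's Celery app (assumed layout).
CELERY_SETTINGS = {
    "worker_soft_shutdown_timeout": soft_shutdown_timeout(),
}
```

Assuming the project exposes a Celery app object, the dict could then be applied with something like app.conf.update(CELERY_SETTINGS); how Studio actually loads its Celery settings is not stated here.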

We may also need the following to ensure a soft shutdown is triggered for SIGTERM:

  • set REMAP_SIGTERM="SIGQUIT" in the environment of the celery workers (for example, export REMAP_SIGTERM="SIGQUIT"), and
  • coordinate with infrastructure to ensure this is set for production
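
One way to make the production requirement verifiable is a small check that the worker's environment actually carries the remapping. This is a sketch only: the function name sigterm_remapped_to_sigquit is hypothetical, and the exact string comparison is an assumption about how strictly the value is matched.

```python
import os
from typing import Mapping, Optional


def sigterm_remapped_to_sigquit(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if REMAP_SIGTERM is set to SIGQUIT in the given environment.

    Defaults to the current process environment. An exact match is assumed
    here; different casing or surrounding whitespace counts as "not set".
    """
    env = os.environ if environ is None else environ
    return env.get("REMAP_SIGTERM") == "SIGQUIT"
```

A worker entrypoint or deployment smoke test could call this and log a warning when it returns False, which would surface a missing variable before a deploy interrupts tasks.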

Value add

Better reliability for Studio's celery tasks

Possible tradeoffs

As mentioned, this will not accommodate tasks that normally take longer than 30 seconds, like channel publishing.

References

https://github.com/celery/celery/releases/tag/v5.5.0
https://docs.celeryq.dev/en/latest/userguide/workers.html#soft-shutdown
