Gracefully stop celery tasks when calling helm upgrade #5412

Open

RonZhang724 opened this issue Mar 27, 2019 · 4 comments
Comments

@RonZhang724

RonZhang724 commented Mar 27, 2019

Checklist

  • I have checked the issues list for similar or identical feature requests.
  • I have checked the commit log to find out if a feature was already implemented in master.

Brief Summary

I have a minikube cluster running my RabbitMQ, Flower, and Celery workers. Each task takes approximately 30 seconds or longer to complete. When I run a bunch of tasks at once, they are put into the queue to wait to be processed by the workers. However, when I upgrade the workers to a new image by running `helm upgrade myworker /path/to/helmchart`, the old pod is terminated and all the running and waiting tasks are nuked and gone. Hence, a graceful way of stopping those tasks should probably be implemented.

Design

Architectural Considerations

Helm and Kubernetes are involved; therefore, some orchestration might be required.

Proposed Behavior

When `helm upgrade myworker /path/to/helmchart` is run, pod termination should wait until all the existing tasks have been processed. Then, a new pod will be spun up to process new tasks using the new code.

Proposed UI/UX

None

Diagrams

N/A

Alternatives

A similar question has been asked on Stack Overflow (https://stackoverflow.com/questions/55334885/celery-task-stops-when-calling-helm-install), and the author proposed that

"using the pod PreStop hook OR creating something that will prevent the task from stopping."

might work. Thanks for noticing, and I will keep exploring solutions to this problem.
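
For what it's worth, the preStop idea from that answer might look roughly like the snippet below. This is only a sketch I put together; the image, the app name `myapp`, and the drain check are placeholders, and the time spent in preStop counts against the pod's termination grace period:

```yaml
# Sketch of the preStop idea (not tested): all names, the image, and the grep
# pattern are placeholders, not part of any existing chart.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 600
      containers:
        - name: worker
          image: myregistry/myworker:latest
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - |
                    # Poll until the worker reports no active tasks, then return
                    # so Kubernetes can send SIGTERM to the container.
                    # The grep pattern depends on your Celery version's output.
                    while celery -A myapp inspect active | grep -q "id"; do
                      sleep 5
                    done
```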

@thedrow
Member

thedrow commented Mar 28, 2019

Related issue #4213.

The solution would be to send a SIGINT to the master process and wait until it exits before allowing the pod to shut down.

@thedrow
Member

thedrow commented Apr 2, 2019

Sorry, not SIGINT but SIGTERM.

Does that work for you?

@RonZhang724
Author

RonZhang724 commented Apr 9, 2019

Sorry about the late response; I was experimenting with different ways to get this to work. Thanks to the hint, I looked into the lifecycle of Kubernetes pods and found out that there is something called a "Helm hook" that gets executed at different phases of a deployment. One of the hooks I found useful is the pre-upgrade hook. It requires successful execution of a command specified in the hook YAML file before the new pod gets deployed and replaces the old pod.
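
A pre-upgrade hook is declared by annotating a Job in the chart; Helm runs the Job and only proceeds with the upgrade once it succeeds. A rough sketch of what I have in mind, where every name, the image, and especially the drain-check command are placeholders (the check itself is exactly the open question below):

```yaml
# Rough sketch of a pre-upgrade hook Job (not tested). All names, the image,
# and the drain check are placeholders; the Job must exit 0 before Helm
# replaces the old worker pod.
apiVersion: batch/v1
kind: Job
metadata:
  name: wait-for-queue-drain
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  activeDeadlineSeconds: 1800   # give up eventually so the upgrade is not blocked forever
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: check-queue
          image: myregistry/queue-check:latest   # hypothetical image with the celery CLI and broker config
          command:
            - /bin/sh
            - -c
            - |
              # Succeed only once no worker reports active tasks.
              # The grep pattern is a placeholder for whatever check fits your setup.
              until celery -A myapp inspect active | grep -q "empty"; do
                sleep 10
              done
```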

Now my question would be: how do I create an image that checks whether there are any tasks waiting to be processed by the worker pod?

@thedrow
Member

thedrow commented Apr 11, 2019

You can of course inspect a worker if gossip is enabled.
However, when you shut down the Celery master with SIGTERM we wait for all tasks to be executed before exiting.

I'm not sure if you really need a helm hook for this.
You just need to extend k8s' timeout since it already sends a SIGTERM to PID 1.

Ensure that Celery master is PID 1 and that you have configured terminationGracePeriodSeconds to a large enough number.
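
Something like this, where the names and values are placeholders for your own chart: run the worker directly as the container command, with no shell wrapper, so the Celery master process is PID 1 and receives the SIGTERM Kubernetes sends on pod termination.

```yaml
# Minimal sketch (names and values are placeholders).
spec:
  template:
    spec:
      # Must exceed the longest task runtime, so the warm shutdown triggered by
      # SIGTERM can finish before Kubernetes follows up with SIGKILL.
      terminationGracePeriodSeconds: 600
      containers:
        - name: worker
          image: myregistry/myworker:latest
          # Exec form, no shell: the Celery master becomes PID 1.
          command: ["celery", "-A", "myapp", "worker", "--loglevel=INFO"]
```

If the image starts the worker through a shell or wrapper script instead, the wrapper ends up as PID 1 and the worker may never see the signal unless the wrapper execs or forwards it.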

@auvipy auvipy added this to the Future milestone Oct 31, 2021