Child worker process should send WORKER_UP message within 2 seconds #1715

Closed
anti-social opened this issue Dec 4, 2013 · 6 comments

Comments

@anti-social
Contributor

In our project we have global data which is loaded in the Loader.on_worker_process_init method. Sometimes this takes more than 2 seconds. In that case, when a child process dies, AsynPool starts a new one and then kills it.

There are many messages in logs:

Timed out waiting for UP message from <Worker(Worker-6, started daemon)>

Moving the data loading into a worker_process_init signal handler doesn't help.
For now I have moved the data loading into Loader.on_task_init.

Maybe there should be an option or setting to control the AsynPool._proc_alive_timeout attribute?
Or perhaps the docs could note that worker_process_init signal handlers and Loader.on_worker_process_init must finish within 2 seconds.
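
A minimal sketch of the lazy-loading idea behind that workaround: load the data on first use rather than during process init, so startup stays fast. This is plain Python with no Celery dependency. `load_global_data` and `handle_task` are hypothetical names, and the sleep is only a stand-in for a slow load.

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def load_global_data():
    """Hypothetical slow loader; the body runs once per process, on first call."""
    time.sleep(0.1)  # stand-in for a load that could exceed the startup timeout
    return {"ready": True}


def handle_task(key):
    """Hypothetical task body: fetches the cached data instead of relying on process init."""
    data = load_global_data()
    return data.get(key)
```

The trade-off is that the first task handled by each child process pays the loading cost instead of the process startup.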

@ask ask closed this as completed in a9b000f Dec 4, 2013
@ask
Contributor

ask commented Dec 4, 2013

There could be a worker_process_ready signal, I guess, but ideally none of the signal handlers should block for long.

@oppianmatt

I'm getting this error but we don't have any signal handlers on init.

We get this if a task has exceeded its time limit.

@oppianmatt

Did some stracing, and it was just because we had so many Python files that it was taking too long to start up. Managed to fix it with a monkey patch in the settings file:

# try monkey patch startup timeout since we take longer than 4.0 seconds to startup
from celery.concurrency import asynpool
asynpool.PROC_ALIVE_TIMEOUT = 60.0

@bryanhelmig
Contributor

We hit this too @ask - it might be pretty handy to have this as a configuration option. Lots of head scratching as a result...

@kotrakrishna

@oppianmatt Thanks for the monkey patch script.

@jpays

jpays commented May 7, 2020

In case someone is still having this kind of problem, you can upgrade to 4.4.2 and use the new configuration option "worker_proc_alive_timeout" to increase the init timeout.
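
For reference, a sketch of how that option might be set in a settings module. The module name is an assumption, and the 60-second value is simply borrowed from the monkey patch earlier in this thread, not a recommendation.

```python
# celeryconfig.py (hypothetical module name, e.g. loaded with app.config_from_object)

# Seconds to wait for a newly started worker process to report that it is up.
# 60.0 is an illustrative value taken from the monkey patch earlier in the thread.
worker_proc_alive_timeout = 60.0
```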
