celeryd simply stops processing tasks #911

Closed
cyounkins opened this issue Aug 9, 2012 · 2 comments

@cyounkins

Today celeryd stopped responding again. It was running in gevent mode with 50 threads, on 3.0.5.

Logging stopped, and tasks are not being processed. The last thing it was doing was trying to unlock a chord that would never unlock because of a failed task (a separate issue). The process is still running.

It writes the following to the log after sending SIGTERM:

[2012-08-09 10:53:44,931: WARNING/MainProcess] Traceback (most recent call last):
[2012-08-09 10:53:44,932: WARNING/MainProcess] File "core.pyx", line 132, in gevent.core.__event_handler (gevent/core.c:1404)
[2012-08-09 10:53:44,949: WARNING/MainProcess] File "/srv/autoref/virtual_env/lib/python2.6/site-packages/gevent/socket.py", line 145, in _wait_helper
[2012-08-09 10:53:44,997: WARNING/MainProcess] def _wait_helper(ev, evtype):
[2012-08-09 10:53:44,997: WARNING/MainProcess] File "/srv/autoref/virtual_env/lib/python2.6/site-packages/celery/apps/worker.py", line 305, in _handle_request
[2012-08-09 10:53:45,013: WARNING/MainProcess] raise exc()
[2012-08-09 10:53:45,013: WARNING/MainProcess] SystemExit
[2012-08-09 10:53:45,014: WARNING/MainProcess] Failed to execute callback for event fd=13 READ flags=INIT
  cb  = <function _wait_helper at 0x21cb410>
  arg = (<greenlet.greenlet object at 0x2876730>, timeout('timed out',))

But the process does not end. I had to use SIGKILL.

What else can I do to debug this?


Update: Now I'm seeing this with separate processes as well.

On 8/21/2012 at 22:24, celery printed the last message in its log, a 'task succeeded' message.
On 8/22/2012 at 16:01, I manually restarted celery through an init script. It logged that it was shutting down.

It came back up, then proceeded to process a number of items that were inserted into the queue after 8/21/2012 22:24.

The tasks involve making socket connections, and it may be the case that there is no default timeout:

In [1]: import socket
In [2]: socket.getdefaulttimeout()

In [3]:

It seems unlikely that it would wait 10+ hours on a socket. I'll add a socket.setdefaulttimeout() call at the start of my program.
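
A minimal sketch of what that call could look like, using only the standard library. The 30-second value and the `configure_socket_timeout` helper name are assumptions for illustration, not something taken from the tasks in question:

```python
import socket

# Assumed value for illustration; choose a timeout that suits the tasks.
DEFAULT_SOCKET_TIMEOUT = 30.0  # seconds


def configure_socket_timeout(seconds=DEFAULT_SOCKET_TIMEOUT):
    """Set the process-wide default timeout for sockets created afterwards.

    Hypothetical helper: it only wraps socket.setdefaulttimeout() and
    returns the value now in effect so the caller can log it.
    """
    socket.setdefaulttimeout(seconds)
    return socket.getdefaulttimeout()


# Call once, early in program startup, before any sockets are created.
configure_socket_timeout()

# Sockets created after the call inherit the default timeout.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(s.gettimeout())
s.close()
```

The default applies only to sockets created after the call, and any code that sets its own timeout on a socket (including `settimeout(None)`) overrides it, so libraries that manage their own timeouts may be unaffected.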

@ask
Contributor

ask commented Aug 17, 2012

Hard to say, do you have any idea of how I would be able to reproduce it? What are your tasks doing?

@cyounkins
Author

Please see the update to the original post.
