New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.1.5 crasher, Unrecoverable error: KeyError(14177,) #1711

Closed
ionelmc opened this Issue Dec 2, 2013 · 3 comments

Comments

Projects
None yet
1 participant
@ionelmc
Member

ionelmc commented Dec 2, 2013

[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,748]: celery.worker.job - INFO - Task sometask[42bbb6a2-bd7d-4314-8c6f-c04d751de840] succeeded in 0.145124898001s: []
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,776]: celery.worker - ERROR - Unrecoverable error: KeyError(14177,)
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 212, in start
    self.blueprint.start(self)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 373, in start
    return self.obj.start()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 270, in start
    blueprint.start(self)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 789, in start
    c.loop(*c.loop_args())
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/loops.py", line 72, in asynloop
    next(loop)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/kombu/async/hub.py", line 333, in create_loop
    cb(*cbargs)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 393, in _event_process_exit
    self.maintain_pool()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1239, in maintain_pool
    self._maintain_pool()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1230, in _maintain_pool
    joined = self._join_exited_workers()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1112, in _join_exited_workers
    exitcode = exitcodes[acked_by_gone]
KeyError: 14177
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,779]: celery.bootsteps - DEBUG - | Worker: Closing Hub...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,780]: celery.bootsteps - DEBUG - | Worker: Closing Pool...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,780]: celery.bootsteps - DEBUG - | Worker: Closing Consumer...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,781]: celery.bootsteps - DEBUG - | Worker: Stopping Consumer...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,781]: celery.bootsteps - DEBUG - | Consumer: Closing Connection...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,782]: celery.bootsteps - DEBUG - | Consumer: Closing Events...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,782]: celery.bootsteps - DEBUG - | Consumer: Closing Gossip...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,783]: celery.bootsteps - DEBUG - | Consumer: Closing Mingle...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,783]: celery.bootsteps - DEBUG - | Consumer: Closing Tasks...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,783]: celery.bootsteps - DEBUG - | Consumer: Closing Heart...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,784]: celery.bootsteps - DEBUG - | Consumer: Closing Control...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,784]: celery.bootsteps - DEBUG - | Consumer: Closing event loop...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,785]: celery.bootsteps - DEBUG - | Consumer: Stopping event loop...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,785]: celery.bootsteps - DEBUG - | Consumer: Stopping Control...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,788]: celery.bootsteps - DEBUG - | Consumer: Stopping Heart...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,789]: celery.bootsteps - DEBUG - | Consumer: Stopping Tasks...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,790]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,891]: celery.bootsteps - DEBUG - | Consumer: Stopping Mingle...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,892]: celery.bootsteps - DEBUG - | Consumer: Stopping Gossip...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,895]: celery.bootsteps - DEBUG - | Consumer: Stopping Events...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,895]: celery.bootsteps - DEBUG - | Consumer: Stopping Connection...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,896]: celery.bootsteps - DEBUG - | Worker: Stopping Pool...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,907]: celery.bootsteps - ERROR - Error on stopping Pool: KeyError(14177,)
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 155, in send_all
    fun(parent, *args)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 377, in stop
    return self.obj.stop()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/base.py", line 119, in stop
    self.on_stop()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/prefork.py", line 154, in on_stop
    self._pool.join()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1511, in join
    stop_if_not_current(self._result_handler)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 148, in stop_if_not_current
    thread.stop(timeout)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 488, in stop
    self.on_stop_not_started()
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 296, in on_stop_not_started
    join_exited_workers(shutdown=True)
  File "/var/lib/jenkins/jobs/xxx/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1112, in _join_exited_workers
    exitcode = exitcodes[acked_by_gone]
KeyError: 14177
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,908]: celery.bootsteps - DEBUG - | Worker: Stopping Hub...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,909]: celery.bootsteps - DEBUG - | Consumer: Shutdown Control...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,910]: celery.bootsteps - DEBUG - | Consumer: Shutdown Heart...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,911]: celery.bootsteps - DEBUG - | Consumer: Shutdown Tasks...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,911]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,911]: celery.worker.consumer - DEBUG - Closing consumer channel...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,912]: celery.bootsteps - DEBUG - | Consumer: Shutdown Gossip...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,912]: celery.bootsteps - DEBUG - | Consumer: Shutdown Events...
[pid=14177 processName=MainProcess - 2013-12-02 19:13:39,914]: celery.bootsteps - DEBUG - | Consumer: Shutdown Connection...

 -------------- xxx@xxx-hudson v3.1.5 (Cipater)
---- **** -----
--- * ***  * -- Linux-3.2.0-32-generic-x86_64-with-Ubuntu-12.04-precise
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> broker:      amqp://xxx@localhost:5672/xxx
- ** ---------- .> app:         default:0x2cb4b10 (djcelery.loaders.DjangoLoader)
- ** ---------- .> concurrency: 5 (prefork)
- *** --- * --- .> events:      ON
-- ******* ----
--- ***** ----- [queues]
 -------------- .> xxx@xxx-hudson.dq exchange=C.dq(direct) key=xxx@xxx-hudson
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . sometask
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap
  . test_xxx.test_engine.test_content_tracer.create_content

[pid=14177 processName=MainProcess - 2013-12-02 19:13:40,007]: celery.concurrency.asynpool - DEBUG - removing tasks from inqueue until task handler finished
@ionelmc

This comment has been minimized.

Show comment
Hide comment
@ionelmc

ionelmc Dec 3, 2013

Member

This one is from a local run:

[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,622]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='7a557dc3-b516-4838-aa97-a1f99e51e598') [reply_to:None ticket:None]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,628]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='56294677-89eb-4228-a84f-b0480f49001c') [reply_to:None ticket:None]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,629]: celery.worker.control - INFO - Terminating 56294677-89eb-4228-a84f-b0480f49001c (15)
[pid=51995 processName=Worker-1851 - 2013-12-03 12:49:16,629]: xxx.utility.memory - DEBUG - Restored resource.setrlimit to (-1, -1).
[pid=51995 processName=Worker-1851 - 2013-12-03 12:49:16,633]: xxx.tools - INFO - Completed task transform with id: 56294677-89eb-4228-a84f-b0480f49001c, run time: 1.81sec.
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,636]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=False, task_id='137f2c0f-5a6a-4321-bb78-6e21ecffc691') [reply_to:None ticket:None]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,636]: celery.worker.control - INFO - Tasks flagged as revoked: 137f2c0f-5a6a-4321-bb78-6e21ecffc691
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,670]: celery.worker.job - DEBUG - Task accepted: xxx.tasks.script[137f2c0f-5a6a-4321-bb78-6e21ecffc691] pid:44845
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,670]: celery.worker - ERROR - Unrecoverable error: TypeError("bad operand type for unary -: 'NoneType'",)
Traceback (most recent call last):
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 212, in start
    self.blueprint.start(self)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 373, in start
    return self.obj.start()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 270, in start
    blueprint.start(self)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 790, in start
    c.loop(*c.loop_args())
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/loops.py", line 72, in asynloop
    next(loop)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/kombu/async/hub.py", line 333, in create_loop
    cb(*cbargs)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 393, in _event_process_exit
    self.maintain_pool()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1251, in maintain_pool
    self._maintain_pool()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1242, in _maintain_pool
    joined = self._join_exited_workers()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1139, in _join_exited_workers
    self.on_job_process_down(job, write_to.pid)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 978, in on_job_process_down
    self.on_partial_read(job, job._write_to)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 1111, in on_partial_read
    self._put_back(job)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 606, in _put_back
    job._set_terminated(job._terminated)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1668, in _set_terminated
    raise Terminated(-signum)
TypeError: bad operand type for unary -: 'NoneType'
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Closing Hub...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Closing Pool...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Closing Consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Stopping Consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Connection...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Events...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Mingle...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Gossip...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing Tasks...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing Heart...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing Control...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing event loop...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Stopping event loop...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Stopping Control...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,673]: celery.bootsteps - DEBUG - | Consumer: Stopping Heart...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,673]: celery.bootsteps - DEBUG - | Consumer: Stopping Tasks...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,673]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,678]: celery.bootsteps - DEBUG - | Consumer: Stopping Gossip...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Consumer: Stopping Mingle...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Consumer: Stopping Events...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Consumer: Stopping Connection...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Worker: Stopping Pool...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: multiprocessing - CRITICAL - Ghost acked_by_gone 44845 from job <Result: 1848 ack:True ready:False> with worker pids [44845]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: celery.bootsteps - ERROR - Error on stopping Pool: KeyError(44845,)
Traceback (most recent call last):
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 155, in send_all
    fun(parent, *args)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 377, in stop
    return self.obj.stop()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/base.py", line 119, in stop
    self.on_stop()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/prefork.py", line 140, in on_stop
    self._pool.join()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1523, in join
    stop_if_not_current(self._result_handler)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 152, in stop_if_not_current
    thread.stop(timeout)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 493, in stop
    self.on_stop_not_started()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 296, in on_stop_not_started
    join_exited_workers(shutdown=True)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1124, in _join_exited_workers
    exitcode = exitcodes[acked_by_gone]
KeyError: 44845
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: celery.bootsteps - DEBUG - | Worker: Stopping Hub...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: celery.bootsteps - DEBUG - | Consumer: Shutdown Control...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Heart...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Tasks...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.worker.consumer - DEBUG - Closing consumer channel...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Gossip...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Events...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,701]: celery.bootsteps - DEBUG - | Consumer: Shutdown Connection...

 -------------- xxx@dkbox v3.1.6 (Cipater)
---- **** ----- 
--- * ***  * -- Linux-3.8.0-33-generic-x86_64-with-Ubuntu-12.04-precise
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> broker:      amqp://xxx@localhost:5672/xxx
- ** ---------- .> app:         default:0x2816950 (djcelery.loaders.DjangoLoader)
- ** ---------- .> concurrency: 5 (prefork)
- *** --- * --- .> events:      ON
-- ******* ---- 
--- ***** ----- [queues]
 -------------- .> xxx@dkbox.dq exchange=C.dq(direct) key=xxx@dkbox
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . xxx.tasks.analyse
  . xxx.tasks.chunked_parse
  . xxx.tasks.script
  . xxx.tasks.transform
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap

[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,743]: celery.concurrency.asynpool - DEBUG - removing tasks from inqueue until task handler finished
[pid=51905 processName=Worker-1835 - 2013-12-03 12:49:17,246]: xxx.tools - INFO - Completed task script with id: c4831f6d-a4d8-4375-98b8-e80050541242, run time: 31.25sec. 
Member

ionelmc commented Dec 3, 2013

This one is from a local run:

[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,622]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='7a557dc3-b516-4838-aa97-a1f99e51e598') [reply_to:None ticket:None]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,628]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='56294677-89eb-4228-a84f-b0480f49001c') [reply_to:None ticket:None]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,629]: celery.worker.control - INFO - Terminating 56294677-89eb-4228-a84f-b0480f49001c (15)
[pid=51995 processName=Worker-1851 - 2013-12-03 12:49:16,629]: xxx.utility.memory - DEBUG - Restored resource.setrlimit to (-1, -1).
[pid=51995 processName=Worker-1851 - 2013-12-03 12:49:16,633]: xxx.tools - INFO - Completed task transform with id: 56294677-89eb-4228-a84f-b0480f49001c, run time: 1.81sec.
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,636]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=False, task_id='137f2c0f-5a6a-4321-bb78-6e21ecffc691') [reply_to:None ticket:None]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,636]: celery.worker.control - INFO - Tasks flagged as revoked: 137f2c0f-5a6a-4321-bb78-6e21ecffc691
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,670]: celery.worker.job - DEBUG - Task accepted: xxx.tasks.script[137f2c0f-5a6a-4321-bb78-6e21ecffc691] pid:44845
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,670]: celery.worker - ERROR - Unrecoverable error: TypeError("bad operand type for unary -: 'NoneType'",)
Traceback (most recent call last):
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 212, in start
    self.blueprint.start(self)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 373, in start
    return self.obj.start()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 270, in start
    blueprint.start(self)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 790, in start
    c.loop(*c.loop_args())
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/worker/loops.py", line 72, in asynloop
    next(loop)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/kombu/async/hub.py", line 333, in create_loop
    cb(*cbargs)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 393, in _event_process_exit
    self.maintain_pool()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1251, in maintain_pool
    self._maintain_pool()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1242, in _maintain_pool
    joined = self._join_exited_workers()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1139, in _join_exited_workers
    self.on_job_process_down(job, write_to.pid)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 978, in on_job_process_down
    self.on_partial_read(job, job._write_to)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 1111, in on_partial_read
    self._put_back(job)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 606, in _put_back
    job._set_terminated(job._terminated)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1668, in _set_terminated
    raise Terminated(-signum)
TypeError: bad operand type for unary -: 'NoneType'
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Closing Hub...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Closing Pool...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Closing Consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Worker: Stopping Consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Connection...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Events...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Mingle...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,671]: celery.bootsteps - DEBUG - | Consumer: Closing Gossip...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing Tasks...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing Heart...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing Control...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Closing event loop...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Stopping event loop...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,672]: celery.bootsteps - DEBUG - | Consumer: Stopping Control...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,673]: celery.bootsteps - DEBUG - | Consumer: Stopping Heart...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,673]: celery.bootsteps - DEBUG - | Consumer: Stopping Tasks...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,673]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,678]: celery.bootsteps - DEBUG - | Consumer: Stopping Gossip...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Consumer: Stopping Mingle...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Consumer: Stopping Events...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Consumer: Stopping Connection...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,679]: celery.bootsteps - DEBUG - | Worker: Stopping Pool...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: multiprocessing - CRITICAL - Ghost acked_by_gone 44845 from job <Result: 1848 ack:True ready:False> with worker pids [44845]
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: celery.bootsteps - ERROR - Error on stopping Pool: KeyError(44845,)
Traceback (most recent call last):
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 155, in send_all
    fun(parent, *args)
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 377, in stop
    return self.obj.stop()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/base.py", line 119, in stop
    self.on_stop()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/prefork.py", line 140, in on_stop
    self._pool.join()
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1523, in join
    stop_if_not_current(self._result_handler)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 152, in stop_if_not_current
    thread.stop(timeout)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 493, in stop
    self.on_stop_not_started()
  File "/home/ionel/projects/core/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 296, in on_stop_not_started
    join_exited_workers(shutdown=True)
  File "/home/ionel/projects/core/.ve/src/billiard/billiard/pool.py", line 1124, in _join_exited_workers
    exitcode = exitcodes[acked_by_gone]
KeyError: 44845
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: celery.bootsteps - DEBUG - | Worker: Stopping Hub...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,698]: celery.bootsteps - DEBUG - | Consumer: Shutdown Control...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Heart...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Tasks...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.worker.consumer - DEBUG - Closing consumer channel...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Gossip...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,700]: celery.bootsteps - DEBUG - | Consumer: Shutdown Events...
[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,701]: celery.bootsteps - DEBUG - | Consumer: Shutdown Connection...

 -------------- xxx@dkbox v3.1.6 (Cipater)
---- **** ----- 
--- * ***  * -- Linux-3.8.0-33-generic-x86_64-with-Ubuntu-12.04-precise
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> broker:      amqp://xxx@localhost:5672/xxx
- ** ---------- .> app:         default:0x2816950 (djcelery.loaders.DjangoLoader)
- ** ---------- .> concurrency: 5 (prefork)
- *** --- * --- .> events:      ON
-- ******* ---- 
--- ***** ----- [queues]
 -------------- .> xxx@dkbox.dq exchange=C.dq(direct) key=xxx@dkbox
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . xxx.tasks.analyse
  . xxx.tasks.chunked_parse
  . xxx.tasks.script
  . xxx.tasks.transform
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap

[pid=44845 processName=MainProcess - 2013-12-03 12:49:16,743]: celery.concurrency.asynpool - DEBUG - removing tasks from inqueue until task handler finished
[pid=51905 processName=Worker-1835 - 2013-12-03 12:49:17,246]: xxx.tools - INFO - Completed task script with id: c4831f6d-a4d8-4375-98b8-e80050541242, run time: 31.25sec. 
@ionelmc

This comment has been minimized.

Show comment
Hide comment
@ionelmc

ionelmc Dec 3, 2013

Member

I'm doing this change

--- a/billiard/pool.py
+++ b/billiard/pool.py
@@ -1124,7 +1124,10 @@ class Pool(object):
                         exitcode = exitcodes[acked_by_gone]
                         if getattr(cleaned[acked_by_gone],
                                    '_job_terminated', False):
-                            job._set_terminated(exitcode)
+                            if exitcode:
+                                job._set_terminated(exitcode)
+                            else:
+                                job._set_terminated()
                         else:
                             self.on_job_process_lost(
                                 job, acked_by_gone, exitcode,

But meanwhile, how did the exitcode get the value None ?

Member

ionelmc commented Dec 3, 2013

I'm doing this change

--- a/billiard/pool.py
+++ b/billiard/pool.py
@@ -1124,7 +1124,10 @@ class Pool(object):
                         exitcode = exitcodes[acked_by_gone]
                         if getattr(cleaned[acked_by_gone],
                                    '_job_terminated', False):
-                            job._set_terminated(exitcode)
+                            if exitcode:
+                                job._set_terminated(exitcode)
+                            else:
+                                job._set_terminated()
                         else:
                             self.on_job_process_lost(
                                 job, acked_by_gone, exitcode,

But meanwhile, how did the exitcode get the value None ?

@ionelmc

This comment has been minimized.

Show comment
Hide comment
@ionelmc

ionelmc Dec 4, 2013

Member

With billiard 3.3.0.11:

[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:36,753]: xxx.utility.memory - DEBUG - Calling resource.setrlimit with (1073741824, -1). Previous limit was (-1, -1).
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:36,755]: xxx.content.tools - DEBUG - For contexts (Context(context_id:None, operator: Test-Customer, element:host-name-1, product:TYPEA, start_time:1287097667, time_zone:5),), found customizations: {}
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:36,755]: xxx.engine.transform.python_transformer - DEBUG - Run transformer on contexts: [Context(context_id:None, operator: Test-Customer, element:host-name-1, product:TYPEA, start_time:1287097667, time_zone:5)]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,187]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='082ee8af-acc2-4c76-b198-798779e3a5c1') [reply_to:None ticket:None]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,213]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='7b5adfd3-c057-450a-857e-57e24e0da8fd') [reply_to:None ticket:None]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,213]: celery.worker.control - INFO - Terminating 7b5adfd3-c057-450a-857e-57e24e0da8fd (15)
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:37,214]: xxx.utility.memory - DEBUG - Restored resource.setrlimit to (-1, -1).
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:37,230]: xxx.engine.tools - INFO - Completed task transform with id: 7b5adfd3-c057-450a-857e-57e24e0da8fd, run time: 0.54sec.
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,235]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=False, task_id='25eaf53d-c1ed-4ad8-add5-19bd617fc460') [reply_to:None ticket:None]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,235]: celery.worker.control - INFO - Tasks flagged as revoked: 25eaf53d-c1ed-4ad8-add5-19bd617fc460
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,261]: celery.worker.job - DEBUG - Task accepted: xxx.engine.tasks.script[25eaf53d-c1ed-4ad8-add5-19bd617fc460] pid:28409
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,261]: celery.worker - ERROR - Unrecoverable error: TypeError("bad operand type for unary -: 'NoneType'",)
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 212, in start
    self.blueprint.start(self)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 373, in start
    return self.obj.start()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 270, in start
    blueprint.start(self)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 790, in start
    c.loop(*c.loop_args())
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/loops.py", line 72, in asynloop
    next(loop)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/kombu/async/hub.py", line 333, in create_loop
    cb(*cbargs)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 393, in _event_process_exit
    self.maintain_pool()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1246, in maintain_pool
    self._maintain_pool()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1237, in _maintain_pool
    joined = self._join_exited_workers()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1134, in _join_exited_workers
    self.on_job_process_down(job, write_to.pid)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 978, in on_job_process_down
    self.on_partial_read(job, job._write_to)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 1111, in on_partial_read
    self._put_back(job)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 606, in _put_back
    job._set_terminated(job._terminated)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1663, in _set_terminated
    raise Terminated(-signum)
TypeError: bad operand type for unary -: 'NoneType'
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,263]: celery.bootsteps - DEBUG - | Worker: Closing Hub...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,263]: celery.bootsteps - DEBUG - | Worker: Closing Pool...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,263]: celery.bootsteps - DEBUG - | Worker: Closing Consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Worker: Stopping Consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Consumer: Closing Connection...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Consumer: Closing Events...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Consumer: Closing Heart...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Mingle...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Gossip...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Control...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Tasks...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing event loop...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Stopping event loop...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Stopping Tasks...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,266]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,358]: celery.bootsteps - DEBUG - | Consumer: Stopping Control...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,360]: celery.bootsteps - DEBUG - | Consumer: Stopping Gossip...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Mingle...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Heart...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Events...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Connection...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,363]: celery.bootsteps - DEBUG - | Worker: Stopping Pool...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,887]: celery.bootsteps - ERROR - Error on stopping Pool: KeyError(28409,)
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 155, in send_all
    fun(parent, *args)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 377, in stop
    return self.obj.stop()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/base.py", line 119, in stop
    self.on_stop()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/prefork.py", line 140, in on_stop
    self._pool.join()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1518, in join
    stop_if_not_current(self._result_handler)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 148, in stop_if_not_current
    thread.stop(timeout)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 491, in stop
    self.on_stop_not_started()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 296, in on_stop_not_started
    join_exited_workers(shutdown=True)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1120, in _join_exited_workers
    if getattr(cleaned[acked_by_gone],
KeyError: 28409
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,887]: celery.bootsteps - DEBUG - | Worker: Stopping Hub...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.bootsteps - DEBUG - | Consumer: Shutdown Tasks...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.worker.consumer - DEBUG - Closing consumer channel...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.bootsteps - DEBUG - | Consumer: Shutdown Control...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,889]: celery.bootsteps - DEBUG - | Consumer: Shutdown Gossip...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,890]: celery.bootsteps - DEBUG - | Consumer: Shutdown Heart...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,890]: celery.bootsteps - DEBUG - | Consumer: Shutdown Events...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,891]: celery.bootsteps - DEBUG - | Consumer: Shutdown Connection...

 -------------- xxx_pipeline_worker@xxx-hudson v3.1.6 (Cipater)
---- **** -----
--- * ***  * -- Linux-3.2.0-57-generic-x86_64-with-Ubuntu-12.04-precise
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         default:0x1c9e150 (djcelery.loaders.DjangoLoader)
- ** ---------- .> transport:   amqp://xxx@localhost:5672/xxx
- ** ---------- .> results:     mongodb
- *** --- * --- .> concurrency: 5 (prefork)
-- ******* ----
--- ***** ----- [queues]
 -------------- .> xxx_pipeline_worker@xxx-hudson.dq exchange=C.dq(direct) key=xxx_pipeline_worker@xxx-hudson
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . xxx.engine.tasks.analyse
  . xxx.engine.tasks.chunked_parse
  . xxx.engine.tasks.script
  . xxx.engine.tasks.transform
  . xxx.service.tasks.process
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap

[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,953]: celery.concurrency.asynpool - DEBUG - removing tasks from inqueue until task handler finished
[pid=3370 processName=Worker-1832 - 2013-12-04 01:35:39,455]: xxx.engine.tools - INFO - Completed task script with id: b9f86249-902b-4296-951f-9e9d27c87dca, run time: 31.52sec.

I'm gonna try it again with celery/billiard#90 applied ....

Member

ionelmc commented Dec 4, 2013

With billiard 3.3.0.11:

[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:36,753]: xxx.utility.memory - DEBUG - Calling resource.setrlimit with (1073741824, -1). Previous limit was (-1, -1).
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:36,755]: xxx.content.tools - DEBUG - For contexts (Context(context_id:None, operator: Test-Customer, element:host-name-1, product:TYPEA, start_time:1287097667, time_zone:5),), found customizations: {}
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:36,755]: xxx.engine.transform.python_transformer - DEBUG - Run transformer on contexts: [Context(context_id:None, operator: Test-Customer, element:host-name-1, product:TYPEA, start_time:1287097667, time_zone:5)]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,187]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='082ee8af-acc2-4c76-b198-798779e3a5c1') [reply_to:None ticket:None]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,213]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=True, task_id='7b5adfd3-c057-450a-857e-57e24e0da8fd') [reply_to:None ticket:None]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,213]: celery.worker.control - INFO - Terminating 7b5adfd3-c057-450a-857e-57e24e0da8fd (15)
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:37,214]: xxx.utility.memory - DEBUG - Restored resource.setrlimit to (-1, -1).
[pid=3453 processName=Worker-1845 - 2013-12-04 01:35:37,230]: xxx.engine.tools - INFO - Completed task transform with id: 7b5adfd3-c057-450a-857e-57e24e0da8fd, run time: 0.54sec.
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,235]: kombu.pidbox - DEBUG - pidbox received method revoke(signal='SIGTERM', terminate=False, task_id='25eaf53d-c1ed-4ad8-add5-19bd617fc460') [reply_to:None ticket:None]
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,235]: celery.worker.control - INFO - Tasks flagged as revoked: 25eaf53d-c1ed-4ad8-add5-19bd617fc460
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,261]: celery.worker.job - DEBUG - Task accepted: xxx.engine.tasks.script[25eaf53d-c1ed-4ad8-add5-19bd617fc460] pid:28409
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,261]: celery.worker - ERROR - Unrecoverable error: TypeError("bad operand type for unary -: 'NoneType'",)
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/__init__.py", line 212, in start
    self.blueprint.start(self)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 373, in start
    return self.obj.start()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 270, in start
    blueprint.start(self)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 790, in start
    c.loop(*c.loop_args())
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/worker/loops.py", line 72, in asynloop
    next(loop)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/kombu/async/hub.py", line 333, in create_loop
    cb(*cbargs)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 393, in _event_process_exit
    self.maintain_pool()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1246, in maintain_pool
    self._maintain_pool()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1237, in _maintain_pool
    joined = self._join_exited_workers()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1134, in _join_exited_workers
    self.on_job_process_down(job, write_to.pid)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 978, in on_job_process_down
    self.on_partial_read(job, job._write_to)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 1111, in on_partial_read
    self._put_back(job)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 606, in _put_back
    job._set_terminated(job._terminated)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1663, in _set_terminated
    raise Terminated(-signum)
TypeError: bad operand type for unary -: 'NoneType'
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,263]: celery.bootsteps - DEBUG - | Worker: Closing Hub...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,263]: celery.bootsteps - DEBUG - | Worker: Closing Pool...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,263]: celery.bootsteps - DEBUG - | Worker: Closing Consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Worker: Stopping Consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Consumer: Closing Connection...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Consumer: Closing Events...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,264]: celery.bootsteps - DEBUG - | Consumer: Closing Heart...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Mingle...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Gossip...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Control...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing Tasks...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Closing event loop...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Stopping event loop...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,265]: celery.bootsteps - DEBUG - | Consumer: Stopping Tasks...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,266]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,358]: celery.bootsteps - DEBUG - | Consumer: Stopping Control...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,360]: celery.bootsteps - DEBUG - | Consumer: Stopping Gossip...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Mingle...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Heart...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Events...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,362]: celery.bootsteps - DEBUG - | Consumer: Stopping Connection...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:37,363]: celery.bootsteps - DEBUG - | Worker: Stopping Pool...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,887]: celery.bootsteps - ERROR - Error on stopping Pool: KeyError(28409,)
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 155, in send_all
    fun(parent, *args)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/bootsteps.py", line 377, in stop
    return self.obj.stop()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/base.py", line 119, in stop
    self.on_stop()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/prefork.py", line 140, in on_stop
    self._pool.join()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1518, in join
    stop_if_not_current(self._result_handler)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 148, in stop_if_not_current
    thread.stop(timeout)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 491, in stop
    self.on_stop_not_started()
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/celery/concurrency/asynpool.py", line 296, in on_stop_not_started
    join_exited_workers(shutdown=True)
  File "/var/lib/jenkins/jobs/xxx-trunk/workspace/.ve/local/lib/python2.7/site-packages/billiard/pool.py", line 1120, in _join_exited_workers
    if getattr(cleaned[acked_by_gone],
KeyError: 28409
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,887]: celery.bootsteps - DEBUG - | Worker: Stopping Hub...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.bootsteps - DEBUG - | Consumer: Shutdown Tasks...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.worker.consumer - DEBUG - Cancelling task consumer...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.worker.consumer - DEBUG - Closing consumer channel...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,888]: celery.bootsteps - DEBUG - | Consumer: Shutdown Control...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,889]: celery.bootsteps - DEBUG - | Consumer: Shutdown Gossip...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,890]: celery.bootsteps - DEBUG - | Consumer: Shutdown Heart...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,890]: celery.bootsteps - DEBUG - | Consumer: Shutdown Events...
[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,891]: celery.bootsteps - DEBUG - | Consumer: Shutdown Connection...

 -------------- xxx_pipeline_worker@xxx-hudson v3.1.6 (Cipater)
---- **** -----
--- * ***  * -- Linux-3.2.0-57-generic-x86_64-with-Ubuntu-12.04-precise
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         default:0x1c9e150 (djcelery.loaders.DjangoLoader)
- ** ---------- .> transport:   amqp://xxx@localhost:5672/xxx
- ** ---------- .> results:     mongodb
- *** --- * --- .> concurrency: 5 (prefork)
-- ******* ----
--- ***** ----- [queues]
 -------------- .> xxx_pipeline_worker@xxx-hudson.dq exchange=C.dq(direct) key=xxx_pipeline_worker@xxx-hudson
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . xxx.engine.tasks.analyse
  . xxx.engine.tasks.chunked_parse
  . xxx.engine.tasks.script
  . xxx.engine.tasks.transform
  . xxx.service.tasks.process
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap

[pid=28409 processName=MainProcess - 2013-12-04 01:35:38,953]: celery.concurrency.asynpool - DEBUG - removing tasks from inqueue until task handler finished
[pid=3370 processName=Worker-1832 - 2013-12-04 01:35:39,455]: xxx.engine.tools - INFO - Completed task script with id: b9f86249-902b-4296-951f-9e9d27c87dca, run time: 31.52sec.

I'm gonna try it again with celery/billiard#90 applied ....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment