Celery task retries not executed with MongoDB broker #1072

Closed
dmytton opened this Issue Nov 16, 2012 · 10 comments

2 participants

@dmytton
dmytton commented Nov 16, 2012

If you set the broker to mongodb, task retries never get executed. The same code works as expected when the broker is changed back to the default (AMQP/RabbitMQ).

There is a log line indicating the retry was accepted:

[2012-11-16 21:58:36,237: INFO/MainProcess] Task tasks.name[c77b9dbd-5f37-461e-887b-decf4c30a624] retry: Task can be retried

After this, nothing happens.

This was also reported at http://stackoverflow.com/questions/11902821/celery-worker-does-not-retry-task-after-calling-retry

Does not work:

david@Shibuya ~/Sites/honshuu-metrics: celery worker --loglevel=debug --app=metrics -B

 -------------- celery@shibuya.lan v3.0.5 (Chiastic Slide)
---- **** ----- 
--- * ***  * -- [Configuration]
-- * - **** --- . broker:      mongodb://localhost%2Fmetrics_celery_broker/metrics_celery_broker
- ** ---------- . app:         __main__:0x10b61d2d0
- ** ---------- . concurrency: 2 (processes)
- ** ---------- . events:      OFF (enable -E to monitor this worker)
- ** ---------- 
- *** --- * --- [Queues]
-- ******* ---- . celery:      exchange:celery(direct) binding:celery
--- ***** ----- 

[Tasks]
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap
  . metrics.tasks.postTempoDB

[2012-11-16 21:58:34,066: DEBUG/MainProcess] [Worker] Loading modules.
[2012-11-16 21:58:34,072: DEBUG/MainProcess] [Worker] Claiming components.
[2012-11-16 21:58:34,073: DEBUG/MainProcess] [Worker] Building boot step graph.
[2012-11-16 21:58:34,074: DEBUG/MainProcess] [Worker] New boot order: {ev, queues, beat, pool, mediator, autoreloader, timers, state-db, autoscaler, consumer}
[2012-11-16 21:58:34,080: DEBUG/MainProcess] Starting celery.beat._Process...
[2012-11-16 21:58:34,091: DEBUG/MainProcess] celery.beat._Process OK!
[2012-11-16 21:58:34,092: DEBUG/MainProcess] Starting celery.concurrency.processes.TaskPool...
[2012-11-16 21:58:34,214: DEBUG/MainProcess] celery.concurrency.processes.TaskPool OK!
[2012-11-16 21:58:34,215: DEBUG/MainProcess] Starting celery.worker.mediator.Mediator...
[2012-11-16 21:58:34,233: DEBUG/MainProcess] celery.worker.mediator.Mediator OK!
[2012-11-16 21:58:34,233: DEBUG/MainProcess] Starting celery.worker.consumer.BlockingConsumer...
[2012-11-16 21:58:34,234: WARNING/MainProcess] celery@shibuya.lan has started.
[2012-11-16 21:58:34,235: DEBUG/MainProcess] Consumer: Re-establishing connection to the broker...
[2012-11-16 21:58:34,238: DEBUG/MainProcess] Consumer: Connection established.
[2012-11-16 21:58:34,251: WARNING/MainProcess] /Library/Python/2.7/site-packages/pymongo-2.2.1-py2.7-macosx-10.7-intel.egg/pymongo/connection.py:294: UserWarning: must provide a username and password to authenticate to metrics_celery_broker
  "to authenticate to %s" % (db,))
[2012-11-16 21:58:34,260: DEBUG/MainProcess] Consumer: basic.qos: prefetch_count->8
[2012-11-16 21:58:34,300: DEBUG/MainProcess] Consumer: Ready to accept tasks!

Works as expected:

david@Shibuya ~/Sites/honshuu-metrics: celery worker --loglevel=debug --app=metrics -B

 -------------- celery@shibuya.lan v3.0.5 (Chiastic Slide)
---- **** ----- 
--- * ***  * -- [Configuration]
-- * - **** --- . broker:      amqp://guest@localhost:5672//
- ** ---------- . app:         __main__:0x10fa1a2d0
- ** ---------- . concurrency: 2 (processes)
- ** ---------- . events:      OFF (enable -E to monitor this worker)
- ** ---------- 
- *** --- * --- [Queues]
-- ******* ---- . celery:      exchange:celery(direct) binding:celery
--- ***** ----- 

[Tasks]
  . celery.backend_cleanup
  . celery.chain
  . celery.chord
  . celery.chord_unlock
  . celery.chunks
  . celery.group
  . celery.map
  . celery.starmap
  . metrics.tasks.postTempoDB

[2012-11-16 22:03:07,148: DEBUG/MainProcess] [Worker] Loading modules.
[2012-11-16 22:03:07,156: DEBUG/MainProcess] [Worker] Claiming components.
[2012-11-16 22:03:07,156: DEBUG/MainProcess] [Worker] Building boot step graph.
[2012-11-16 22:03:07,157: DEBUG/MainProcess] [Worker] New boot order: {ev, queues, beat, pool, mediator, autoreloader, timers, state-db, autoscaler, consumer}
[2012-11-16 22:03:07,164: DEBUG/MainProcess] Starting celery.worker.hub.Hub...
[2012-11-16 22:03:07,164: DEBUG/MainProcess] celery.worker.hub.Hub OK!
[2012-11-16 22:03:07,164: DEBUG/MainProcess] Starting celery.beat._Process...
[2012-11-16 22:03:07,169: DEBUG/MainProcess] celery.beat._Process OK!
[2012-11-16 22:03:07,172: DEBUG/MainProcess] Starting celery.concurrency.processes.TaskPool...
[2012-11-16 22:03:07,173: INFO/Beat] Celerybeat: Starting...
[2012-11-16 22:03:07,234: DEBUG/Beat] Current schedule:
<Entry: celery.backend_cleanup celery.backend_cleanup() {<crontab: 0 4 * * * (m/h/d/dM/MY)>}
[2012-11-16 22:03:07,236: DEBUG/Beat] Celerybeat: Ticking with max interval->5.00 minutes
[2012-11-16 22:03:07,235: DEBUG/MainProcess] celery.concurrency.processes.TaskPool OK!
[2012-11-16 22:03:07,238: DEBUG/MainProcess] Starting celery.worker.mediator.Mediator...
[2012-11-16 22:03:07,239: DEBUG/MainProcess] celery.worker.mediator.Mediator OK!
[2012-11-16 22:03:07,240: DEBUG/MainProcess] Starting celery.worker.consumer.Consumer...
[2012-11-16 22:03:07,241: WARNING/MainProcess] celery@shibuya.lan has started.
[2012-11-16 22:03:07,243: DEBUG/MainProcess] Consumer: Re-establishing connection to the broker...
[2012-11-16 22:03:07,244: DEBUG/MainProcess] Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2012 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.8.4'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
[2012-11-16 22:03:07,247: DEBUG/MainProcess] Open OK! known_hosts []
[2012-11-16 22:03:07,248: DEBUG/MainProcess] using channel_id: 1
[2012-11-16 22:03:07,252: DEBUG/MainProcess] Channel open
[2012-11-16 22:03:07,256: DEBUG/Beat] Celerybeat: Waking up in 5.00 minutes.
[2012-11-16 22:03:07,257: DEBUG/MainProcess] Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2012 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.8.4'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
[2012-11-16 22:03:07,259: DEBUG/MainProcess] Open OK! known_hosts []
[2012-11-16 22:03:07,259: DEBUG/MainProcess] Consumer: Connection established.
[2012-11-16 22:03:07,261: DEBUG/MainProcess] using channel_id: 1
[2012-11-16 22:03:07,262: DEBUG/MainProcess] Channel open
[2012-11-16 22:03:07,266: DEBUG/MainProcess] Consumer: basic.qos: prefetch_count->8
[2012-11-16 22:03:07,267: DEBUG/MainProcess] using channel_id: 2
[2012-11-16 22:03:07,270: DEBUG/MainProcess] Channel open
[2012-11-16 22:03:07,290: DEBUG/MainProcess] Consumer: Ready to accept tasks!
@ask
Celery member
ask commented Nov 19, 2012

Do you restart the worker in between?

@dmytton
dmytton commented Nov 19, 2012

In between what?

@ask
Celery member
ask commented Nov 19, 2012

Between calling retry and the time it should've been retried.

@dmytton
dmytton commented Nov 19, 2012

No, I didn't restart the worker. There is an immediate retry and then it backs off until the retry limit is exceeded. Using the default broker this works as expected (without restarting the worker), but using the MongoDB broker there is never any retry.

@ask
Celery member
ask commented Nov 19, 2012

Are you sure you do not have any old workers running that are hijacking the messages? Try killing all the workers (changing the broker in this case would form a new message bus, so it wouldn't be affected).

@dmytton
dmytton commented Nov 19, 2012

I monitored it in Celery Flower and there was only one worker listed. The task went to the retry state but never got retried; there was no item in the scheduled tasks list either.

@ask
Celery member
ask commented Nov 19, 2012

Workers may accept messages without sending heartbeats, so Flower would not necessarily be able to detect that one is running.

@dmytton
dmytton commented Nov 19, 2012

Righto. How should I find out if there is another worker that has processed the message? This is local development, so I have restarted and started up a single worker and still get the same result.

I'm logging within the task so I can see when it's being executed. Using --logfile should mean that any other worker would also log to that file, and I'm only seeing the one worker logging anything.

@ask
Celery member
ask commented Nov 20, 2012

This patch is extremely dirty, but works until I've found a better solution (which will probably be just as dirty).

Make sure you do not use `__main__` as the app name by giving it a name:

celery = Celery('tasks')

@ask ask closed this in 89b760c Nov 20, 2012
@dmytton
dmytton commented Nov 20, 2012

Is giving the app a name a workaround or does it depend on those commits?
