Celery Worker crashing after first task with TypeError: 'NoneType' object is not callable #3620

Closed
alanhamlett opened this Issue Nov 24, 2016 · 54 comments

Contributor

alanhamlett commented Nov 24, 2016

Checklist

  • I have included the output of celery -A proj report in the issue.
    (if you are not able to do this, then at least specify the Celery
    version affected).
software -> celery:4.0.0 (latentcall) kombu:4.0.0 py:3.4.3
            billiard:3.5.0.2 py-amqp:2.1.1
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled
  • I have verified that the issue exists against the master branch of Celery.
    Yes I've tested and it behaves the same using master.

Steps to reproduce

Not exactly sure how to reproduce it, because other machines with the same specs and requirements are working.

Expected behavior

Should consume tasks.

Actual behavior

A task is accepted, then a traceback is logged, then the worker reconnects to the broker for some reason. This repeats forever:

[2016-11-23 23:09:00,468: INFO/MainProcess] Connected to amqp://user:**@10.136.131.6:5672//
[2016-11-23 23:09:00,484: INFO/MainProcess] mingle: searching for neighbors
[2016-11-23 23:09:01,921: INFO/MainProcess] mingle: sync with 1 nodes
[2016-11-23 23:09:01,922: INFO/MainProcess] mingle: sync complete
[2016-11-23 23:09:01,970: INFO/MainProcess] Received task: tasks.calculate_user_running_total[ddd103af-d527-4564-83f8-96b747767a0c]
[2016-11-23 23:09:01,972: CRITICAL/MainProcess] Unrecoverable error: TypeError("'NoneType' object is not callable",)
Traceback (most recent call last):
  File "./venv/lib/python3.4/site-packages/celery/worker/worker.py", line 203, in start
    self.blueprint.start(self)
  File "./venv/lib/python3.4/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "./venv/lib/python3.4/site-packages/celery/bootsteps.py", line 370, in start
    return self.obj.start()
  File "./venv/lib/python3.4/site-packages/celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "./venv/lib/python3.4/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "./venv/lib/python3.4/site-packages/celery/worker/consumer/consumer.py", line 584, in start
    c.loop(*c.loop_args())
  File "./venv/lib/python3.4/site-packages/celery/worker/loops.py", line 47, in asynloop
    consumer.consume()
  File "./venv/lib/python3.4/site-packages/kombu/messaging.py", line 470, in consume
    self._basic_consume(T, no_ack=no_ack, nowait=False)
  File "./venv/lib/python3.4/site-packages/kombu/messaging.py", line 591, in _basic_consume
    no_ack=no_ack, nowait=nowait)
  File "./venv/lib/python3.4/site-packages/kombu/entity.py", line 737, in consume
    arguments=self.consumer_arguments)
  File "./venv/lib/python3.4/site-packages/amqp/channel.py", line 1578, in basic_consume
    wait=None if nowait else spec.Basic.ConsumeOk,
  File "./venv/lib/python3.4/site-packages/amqp/abstract_channel.py", line 73, in send_method
    return self.wait(wait, returns_tuple=returns_tuple)
  File "./venv/lib/python3.4/site-packages/amqp/abstract_channel.py", line 93, in wait
    self.connection.drain_events(timeout=timeout)
  File "./venv/lib/python3.4/site-packages/amqp/connection.py", line 464, in drain_events
    return self.blocking_read(timeout)
  File "./venv/lib/python3.4/site-packages/amqp/connection.py", line 469, in blocking_read
    return self.on_inbound_frame(frame)
  File "./venv/lib/python3.4/site-packages/amqp/method_framing.py", line 88, in on_frame
    callback(channel, msg.frame_method, msg.frame_args, msg)
  File "./venv/lib/python3.4/site-packages/amqp/connection.py", line 473, in on_inbound_method
    method_sig, payload, content,
  File "./venv/lib/python3.4/site-packages/amqp/abstract_channel.py", line 142, in dispatch_method
    listener(*args)
  File "./venv/lib/python3.4/site-packages/amqp/channel.py", line 1613, in _on_basic_deliver
    fun(msg)
  File "./venv/lib/python3.4/site-packages/kombu/messaging.py", line 617, in _receive_callback
    return on_m(message) if on_m else self.receive(decoded, message)
  File "./venv/lib/python3.4/site-packages/celery/worker/consumer/consumer.py", line 558, in on_task_received
    callbacks,
  File "./venv/lib/python3.4/site-packages/celery/worker/strategy.py", line 145, in task_message_handler
    handle(req)
  File "./venv/lib/python3.4/site-packages/celery/worker/worker.py", line 221, in _process_task_sem
    return self._quick_acquire(self._process_task, req)
  File "./venv/lib/python3.4/site-packages/kombu/async/semaphore.py", line 62, in acquire
    callback(*partial_args, **partial_kwargs)
  File "./venv/lib/python3.4/site-packages/celery/worker/worker.py", line 226, in _process_task
    req.execute_using_pool(self.pool)
  File "./venv/lib/python3.4/site-packages/celery/worker/request.py", line 532, in execute_using_pool
    correlation_id=task_id,
  File "./venv/lib/python3.4/site-packages/celery/concurrency/base.py", line 155, in apply_async
    **options)
  File "./venv/lib/python3.4/site-packages/billiard/pool.py", line 1487, in apply_async
    self._quick_put((TASK, (result._job, None, func, args, kwds)))
TypeError: 'NoneType' object is not callable

The above lines keep repeating every few seconds and no tasks are consumed from the queue.

@alanhamlett alanhamlett changed the title from Repeatedly reconnecting to broker, tasks accepted but never processed and still left in queue to Celery Worker crashing after first task with TypeError: 'NoneType' object is not callable Nov 26, 2016


Contributor

alanhamlett commented Nov 26, 2016

The error is repeating in the log because the Celery worker daemon is crashing, so systemd restarts it.
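For reference, a hypothetical systemd service fragment consistent with this behaviour (the actual unit file isn't shown in this thread; Restart=always is what turns the crash into an endless restart loop):

[Service]
ExecStart=/opt/app/venv/bin/celery worker --app=wakatime.celery --loglevel=INFO
Restart=always
RestartSec=5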


Contributor

alanhamlett commented Nov 26, 2016

@ask, self._quick_put is somehow not defined. Should billiard check for a None value before calling, catch the exception, or should self._quick_put never be None?


Contributor

alanhamlett commented Nov 26, 2016

When I change billiard/pool.py#L1483 to if self.threads or self._quick_put is None: Celery no longer crashes, but for some reason the workers never process any tasks. A sketch of that change is below.
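For context, here is roughly where that change lands inside billiard's Pool.apply_async, paraphrased from the traceback above rather than copied from upstream, so treat it as a sketch:

# Paraphrased from billiard/pool.py around apply_async (per the traceback).
if self.threads or self._quick_put is None:  # changed: also guard against None
    self._taskqueue.put(([(TASK, (result._job, None, func, args, kwds))], None))
else:
    self._quick_put((TASK, (result._job, None, func, args, kwds)))

With the guard, the task is only rerouted to the internal _taskqueue; if nothing drains that queue in the async pool, that would explain why the workers stop crashing but still never process anything.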


Contributor

alanhamlett commented Nov 27, 2016

More verbose output with logging level DEBUG:

[2016-11-27 14:48:09,875: DEBUG/MainProcess] | Worker: Preparing bootsteps.
[2016-11-27 14:48:09,877: DEBUG/MainProcess] | Worker: Building graph...
[2016-11-27 14:48:09,878: DEBUG/MainProcess] | Worker: New boot order: {Timer, Hub, Pool, Autoscaler, StateDB, Beat, Consumer}
[2016-11-27 14:48:09,889: DEBUG/MainProcess] | Consumer: Preparing bootsteps.
[2016-11-27 14:48:09,889: DEBUG/MainProcess] | Consumer: Building graph...
[2016-11-27 14:48:09,898: DEBUG/MainProcess] | Consumer: New boot order: {Connection, Agent, Events, Mingle, Tasks, Control, Gossip, Heart, event loop}
[2016-11-27 14:48:09,908: DEBUG/MainProcess] | Worker: Starting Hub
[2016-11-27 14:48:09,908: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:09,908: DEBUG/MainProcess] | Worker: Starting Pool
[2016-11-27 14:48:09,998: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:09,999: DEBUG/MainProcess] | Worker: Starting Consumer
[2016-11-27 14:48:10,000: DEBUG/MainProcess] | Consumer: Starting Connection
[2016-11-27 14:48:10,016: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'cluster_name': 'rabbit@rabbitmq', 'product': 'RabbitMQ', 'version': '3.5.6', 'information': 'Licensed under the MPL.
  See http://www.rabbitmq.com/', 'capabilities': {'authentication_failure_close': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'per_consumer_qos': True, 'basic.nack': True, 'publisher_confirms': True, 'connection.blocked': True, 'exchange_exchange_bindings': True}, 'copyright': 'Copyright (C) 2007-2015 Pivotal Software, Inc.', 'platform': 'Erlang/OTP'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: ['en_US']
[2016-11-27 14:48:10,018: INFO/MainProcess] Connected to amqp://user:**@10.136.131.6:5672//
[2016-11-27 14:48:10,018: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:10,019: DEBUG/MainProcess] | Consumer: Starting Events
[2016-11-27 14:48:10,031: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'cluster_name': 'rabbit@rabbitmq', 'product': 'RabbitMQ', 'version': '3.5.6', 'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'capabilities': {'authentication_failure_close': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'per_consumer_qos': True, 'basic.nack': True, 'publisher_confirms': True, 'connection.blocked': True, 'exchange_exchange_bindings': True}, 'copyright': 'Copyright (C) 2007-2015 Pivotal Software, Inc.', 'platform': 'Erlang/OTP'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: ['en_US']
[2016-11-27 14:48:10,034: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:10,034: DEBUG/MainProcess] | Consumer: Starting Mingle
[2016-11-27 14:48:10,035: INFO/MainProcess] mingle: searching for neighbors
[2016-11-27 14:48:10,036: DEBUG/MainProcess] using channel_id: 1
[2016-11-27 14:48:10,041: DEBUG/MainProcess] Channel open
[2016-11-27 14:48:10,061: DEBUG/MainProcess] Start from server, version: 0.9, properties: {'cluster_name': 'rabbit@rabbitmq', 'product': 'RabbitMQ', 'version': '3.5.6', 'information': 'Licensed under the MPL.  See http://www.rabbitmq.com/', 'capabilities': {'authentication_failure_close': True, 'consumer_priorities': True, 'consumer_cancel_notify': True, 'per_consumer_qos': True, 'basic.nack': True, 'publisher_confirms': True, 'connection.blocked': True, 'exchange_exchange_bindings': True}, 'copyright': 'Copyright (C) 2007-2015 Pivotal Software, Inc.', 'platform': 'Erlang/OTP'}, mechanisms: ['AMQPLAIN', 'PLAIN'], locales: ['en_US']
[2016-11-27 14:48:10,063: DEBUG/MainProcess] using channel_id: 1
[2016-11-27 14:48:10,064: DEBUG/MainProcess] Channel open
[2016-11-27 14:48:11,189: INFO/MainProcess] mingle: sync with 3 nodes
[2016-11-27 14:48:11,190: DEBUG/MainProcess] mingle: processing reply from celery@worker03
[2016-11-27 14:48:11,190: DEBUG/MainProcess] mingle: processing reply from celery@worker02
[2016-11-27 14:48:11,190: DEBUG/MainProcess] mingle: processing reply from celery@worker01
[2016-11-27 14:48:11,190: INFO/MainProcess] mingle: sync complete
[2016-11-27 14:48:11,191: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:11,191: DEBUG/MainProcess] | Consumer: Starting Tasks
[2016-11-27 14:48:11,244: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:11,244: DEBUG/MainProcess] | Consumer: Starting Control
[2016-11-27 14:48:11,244: DEBUG/MainProcess] using channel_id: 2
[2016-11-27 14:48:11,246: DEBUG/MainProcess] Channel open
[2016-11-27 14:48:11,251: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:11,251: DEBUG/MainProcess] | Consumer: Starting Gossip
[2016-11-27 14:48:11,252: DEBUG/MainProcess] using channel_id: 3
[2016-11-27 14:48:11,253: DEBUG/MainProcess] Channel open
[2016-11-27 14:48:11,257: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:11,258: DEBUG/MainProcess] | Consumer: Starting Heart
[2016-11-27 14:48:11,259: DEBUG/MainProcess] using channel_id: 1
[2016-11-27 14:48:11,260: DEBUG/MainProcess] Channel open
[2016-11-27 14:48:11,261: DEBUG/MainProcess] ^-- substep ok
[2016-11-27 14:48:11,261: DEBUG/MainProcess] | Consumer: Starting event loop
[2016-11-27 14:48:11,264: INFO/MainProcess] Received task: wakatime.tasks.cache_coding_activity[0eba267c-72e4-40ea-91dd-a1a7ab17c514]
[2016-11-27 14:48:11,265: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x7ff469300950> (args:('wakatime.tasks.cache_coding_activity', '0eba267c-72e4-40ea-91dd-a1a7ab17c514', {'argsrepr': '()', 'task': 'wakatime.tasks.cache_coding_activity', 'lang': 'py', 'parent_id': '81f0c7ce-1396-496f-bf64-ae243736c845', 'timelimit': [None, None], 'root_id': '128647cc-f558-4b7d-bafc-338d186b5cfa', 'reply_to': 'e3c2b067-a058-3aa0-a3a1-384d4b917bbf', 'retries': 0, 'expires': None, 'delivery_info': {'exchange': '', 'priority': None, 'routing_key': 'cache', 'redelivered': True}, 'id': '0eba267c-72e4-40ea-91dd-a1a7ab17c514', 'correlation_id': '0eba267c-72e4-40ea-91dd-a1a7ab17c514', 'group': None, 'eta': None, 'kwargsrepr': "{'cache_projects': True, 'timeout': 15, 'user_id': UUID('d9c69ce0-f194-45a6-83cf-98f931fca8aa'), 'writes_only': False}", 'origin': 'gen3021@worker02'}, '[[], {"cache_projects": true, "timeout": 15, "user_id": "d9c69ce0-f194-45a6-83cf-98f931fca8aa", "writes_only": false}, {"callbacks": null, "chain": null, "chord": null, "errbacks": null}]', 'application/json', 'utf-8') kwargs:{})
[2016-11-27 14:48:11,266: CRITICAL/MainProcess] Unrecoverable error: TypeError("'NoneType' object is not callable",)
Traceback (most recent call last):
  File "./venv/src/celery/celery/worker/worker.py", line 203, in start
    self.blueprint.start(self)
  File "./venv/src/celery/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "./venv/src/celery/celery/bootsteps.py", line 370, in start
    return self.obj.start()
  File "./venv/src/celery/celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "./venv/src/celery/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "./venv/src/celery/celery/worker/consumer/consumer.py", line 593, in start
    c.loop(*c.loop_args())
  File "./venv/src/celery/celery/worker/loops.py", line 47, in asynloop
    consumer.consume()
  File "./venv/lib/python3.4/site-packages/kombu/messaging.py", line 470, in consume
    self._basic_consume(T, no_ack=no_ack, nowait=False)
  File "./venv/lib/python3.4/site-packages/kombu/messaging.py", line 591, in _basic_consume
    no_ack=no_ack, nowait=nowait)
  File "./venv/lib/python3.4/site-packages/kombu/entity.py", line 737, in consume
    arguments=self.consumer_arguments)
  File "./venv/lib/python3.4/site-packages/amqp/channel.py", line 1578, in basic_consume
    wait=None if nowait else spec.Basic.ConsumeOk,
  File "./venv/lib/python3.4/site-packages/amqp/abstract_channel.py", line 73, in send_method
    return self.wait(wait, returns_tuple=returns_tuple)
  File "./venv/lib/python3.4/site-packages/amqp/abstract_channel.py", line 93, in wait
    self.connection.drain_events(timeout=timeout)
  File "./venv/lib/python3.4/site-packages/amqp/connection.py", line 464, in drain_events
    return self.blocking_read(timeout)
  File "./venv/lib/python3.4/site-packages/amqp/connection.py", line 469, in blocking_read
    return self.on_inbound_frame(frame)
  File "./venv/lib/python3.4/site-packages/amqp/method_framing.py", line 88, in on_frame
    callback(channel, msg.frame_method, msg.frame_args, msg)
  File "./venv/lib/python3.4/site-packages/amqp/connection.py", line 473, in on_inbound_method
    method_sig, payload, content,
  File "./venv/lib/python3.4/site-packages/amqp/abstract_channel.py", line 142, in dispatch_method
    listener(*args)
  File "./venv/lib/python3.4/site-packages/amqp/channel.py", line 1613, in _on_basic_deliver
    fun(msg)
  File "./venv/lib/python3.4/site-packages/kombu/messaging.py", line 617, in _receive_callback
    return on_m(message) if on_m else self.receive(decoded, message)
  File "./venv/src/celery/celery/worker/consumer/consumer.py", line 567, in on_task_received
    callbacks,
  File "./venv/src/celery/celery/worker/strategy.py", line 145, in task_message_handler
    handle(req)
  File "./venv/src/celery/celery/worker/worker.py", line 221, in _process_task_sem
    return self._quick_acquire(self._process_task, req)
  File "./venv/lib/python3.4/site-packages/kombu/async/semaphore.py", line 62, in acquire
    callback(*partial_args, **partial_kwargs)
  File "./venv/src/celery/celery/worker/worker.py", line 226, in _process_task
    req.execute_using_pool(self.pool)
  File "./venv/src/celery/celery/worker/request.py", line 532, in execute_using_pool
    correlation_id=task_id,
  File "./venv/src/celery/celery/concurrency/base.py", line 155, in apply_async
    **options)
  File "./venv/lib/python3.4/site-packages/billiard/pool.py", line 1487, in apply_async
    self._quick_put((TASK, (result._job, None, func, args, kwds)))
TypeError: 'NoneType' object is not callable
[2016-11-27 14:48:11,273: DEBUG/MainProcess] | Worker: Closing Hub...
[2016-11-27 14:48:11,274: DEBUG/MainProcess] | Worker: Closing Pool...
[2016-11-27 14:48:11,274: DEBUG/MainProcess] | Worker: Closing Consumer...
[2016-11-27 14:48:11,274: DEBUG/MainProcess] | Worker: Stopping Consumer...
[2016-11-27 14:48:11,274: DEBUG/MainProcess] | Consumer: Closing Connection...
[2016-11-27 14:48:11,275: DEBUG/MainProcess] | Consumer: Closing Events...
[2016-11-27 14:48:11,275: DEBUG/MainProcess] | Consumer: Closing Mingle...
[2016-11-27 14:48:11,275: DEBUG/MainProcess] | Consumer: Closing Tasks...
[2016-11-27 14:48:11,275: DEBUG/MainProcess] | Consumer: Closing Control...
[2016-11-27 14:48:11,275: DEBUG/MainProcess] | Consumer: Closing Gossip...
[2016-11-27 14:48:11,276: DEBUG/MainProcess] | Consumer: Closing Heart...
[2016-11-27 14:48:11,276: DEBUG/MainProcess] | Consumer: Closing event loop...
[2016-11-27 14:48:11,276: DEBUG/MainProcess] | Consumer: Stopping event loop...
[2016-11-27 14:48:11,276: DEBUG/MainProcess] | Consumer: Stopping Heart...
[2016-11-27 14:48:11,277: DEBUG/MainProcess] | Consumer: Stopping Gossip...
[2016-11-27 14:48:11,278: INFO/MainProcess] Received task: wakatime.tasks.cache_coding_activity[f786fc75-0518-4893-8988-ff7f063edd12]
[2016-11-27 14:48:11,278: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x7ff469300950> (args:('wakatime.tasks.cache_coding_activity', 'f786fc75-0518-4893-8988-ff7f063edd12', {'argsrepr': '()', 'task': 'wakatime.tasks.cache_coding_activity', 'lang': 'py', 'parent_id': '81f0c7ce-1396-496f-bf64-ae243736c845', 'timelimit': [None, None], 'root_id': '128647cc-f558-4b7d-bafc-338d186b5cfa', 'reply_to': 'e3c2b067-a058-3aa0-a3a1-384d4b917bbf', 'retries': 0, 'expires': None, 'delivery_info': {'exchange': '', 'priority': None, 'routing_key': 'cache', 'redelivered': True}, 'id': 'f786fc75-0518-4893-8988-ff7f063edd12', 'correlation_id': 'f786fc75-0518-4893-8988-ff7f063edd12', 'group': None, 'eta': None, 'kwargsrepr': "{'cache_projects': True, 'timeout': 15, 'user_id': UUID('7056644f-2564-4074-b89e-631973879f44'), 'writes_only': False}", 'origin': 'gen3021@worker02'}, '[[], {"cache_projects": true, "timeout": 15, "user_id": "7056644f-2564-4074-b89e-631973879f44", "writes_only": false}, {"callbacks": null, "chain": null, "chord": null, "errbacks": null}]', 'application/json', 'utf-8') kwargs:{})
[2016-11-27 14:48:11,279: INFO/MainProcess] Received task: wakatime.tasks.cache_coding_activity[d5c8dc57-116c-467d-9924-e2999280c2f8]
[2016-11-27 14:48:11,280: INFO/MainProcess] Received task: wakatime.tasks.cache_coding_activity[460ef864-e482-4b0f-8580-d0095750bae6]
[2016-11-27 14:48:11,281: DEBUG/MainProcess] Closed channel #3
[2016-11-27 14:48:11,281: DEBUG/MainProcess] | Consumer: Stopping Control...
[2016-11-27 14:48:11,283: DEBUG/MainProcess] Closed channel #2
[2016-11-27 14:48:11,283: DEBUG/MainProcess] | Consumer: Stopping Tasks...
[2016-11-27 14:48:11,284: DEBUG/MainProcess] Canceling task consumer...
[2016-11-27 14:48:11,286: DEBUG/MainProcess] | Consumer: Stopping Mingle...
[2016-11-27 14:48:11,286: DEBUG/MainProcess] | Consumer: Stopping Events...
[2016-11-27 14:48:11,286: DEBUG/MainProcess] | Consumer: Stopping Connection...
[2016-11-27 14:48:11,286: DEBUG/MainProcess] | Worker: Stopping Pool...
[2016-11-27 14:48:12,800: DEBUG/MainProcess] result handler: all workers terminated
[2016-11-27 14:48:12,801: DEBUG/MainProcess] | Worker: Stopping Hub...
[2016-11-27 14:48:12,801: DEBUG/MainProcess] | Consumer: Shutdown Heart...
[2016-11-27 14:48:12,802: DEBUG/MainProcess] | Consumer: Shutdown Gossip...
[2016-11-27 14:48:12,802: DEBUG/MainProcess] | Consumer: Shutdown Control...
[2016-11-27 14:48:12,802: DEBUG/MainProcess] | Consumer: Shutdown Tasks...
[2016-11-27 14:48:12,803: DEBUG/MainProcess] Canceling task consumer...
[2016-11-27 14:48:12,803: DEBUG/MainProcess] Closing consumer channel...
[2016-11-27 14:48:12,803: DEBUG/MainProcess] | Consumer: Shutdown Events...
[2016-11-27 14:48:12,804: DEBUG/MainProcess] Closed channel #1
[2016-11-27 14:48:12,805: DEBUG/MainProcess] | Consumer: Shutdown Connection...
[2016-11-27 14:48:12,806: DEBUG/MainProcess] Closed channel #1
[2016-11-27 14:48:12,807: DEBUG/MainProcess] removing tasks from inqueue until task handler finished

Contributor

alanhamlett commented Nov 27, 2016

This was introduced in Celery 4.x, because downgrading to 3.1.24 prevents the traceback.


Member

ask commented Dec 1, 2016

Doesn't happen for me here on Linux with Python 3.4. What arguments do you use to start the worker?


Member

ask commented Dec 2, 2016

_quick_put should never be None btw. Does this happen at startup or always after a connection failure?


Member

ask commented Dec 2, 2016

I've been trying to reproduce by stopping the broker while executing tasks, and still no luck reproducing it.


Contributor

alanhamlett commented Dec 4, 2016

Always at startup. The worker arguments are:

/opt/app/venv/bin/python /opt/app/venv/bin/celery worker --app=wakatime.celery --workdir=/opt/app --logfile=/var/log/celery/worker.log --loglevel=INFO --concurrency=50 --exclude-queues=medium,low,cache

phretor commented Dec 16, 2016

👍 for this. Getting the very same issue, even on 4.0.1.


jmesquita commented Dec 16, 2016

@ask I can reproduce it every time there are messages waiting on the broker when the worker comes up. This is often the case when using beat, as it is for me: if the beat service comes online before the worker, you won't be able to start the worker because of the issue mentioned above. I'm using Python 2.7, for what it's worth, and can reproduce it consistently.

This is the same error as the one mentioned on #3539
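A hypothetical reproduction following that description (proj and add are placeholder names, not from this thread):

# 1. With all workers stopped, publish a message so it sits on the broker:
python -c "from proj import add; add.delay(2, 2)"
# 2. Start a worker; it should crash on the first pre-queued message:
celery worker -A proj --loglevel=INFO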


Contributor

alanhamlett commented Dec 16, 2016

@jmesquita that's consistent with my scenario, since my queues always have pending messages on the broker when starting the workers.


jmesquita commented Dec 16, 2016

@alanhamlett I'm trying to get this fixed and am reading the code, but I'm new to Celery so it might take me some time. What's strange to me is that, with so many people using Celery and messages being queued to workers by default, this hasn't exploded within the community. It makes me wonder if I'm misusing it somehow.


adewes commented Dec 18, 2016

I dug into the code a bit. _quick_put gets assigned by AsyncPool._create_write_handlers, which gets called by AsyncPool.register_with_event_loop, which in turn gets called by celery.worker.loops.asynloop. Superficially, the problem seems to be that asynloop first calls consumer.consume() and only then calls obj.register_with_event_loop, which causes _quick_put to still be None when it gets called from within consume().

This would explain why the problem does not occur when there are no messages in the queue when the event loop starts up: consume() will then do nothing, and by the next time it gets called, register_with_event_loop will already have run.

I could fix this by moving

obj.controller.register_with_event_loop(hub)
obj.register_with_event_loop(hub)

before consumer.consume(), though this is of course only a very naive (and possibly wrong) fix.
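To make the ordering concrete, here is a minimal, self-contained toy illustration of the failure mode. This is not Celery code; the class and method names only mimic the pieces named above:

class FakePool:
    """Mimics AsyncPool: the write handler only exists after registration."""

    def __init__(self):
        self._quick_put = None  # assigned later, as in _create_write_handlers

    def register_with_event_loop(self):
        self._quick_put = lambda job: print('dispatched', job)


def consume(pool, pending):
    # Dispatch messages that were already waiting on the broker.
    for job in pending:
        pool._quick_put(job)  # TypeError if registration hasn't happened yet


pending = ['task-1']  # a message queued before the worker started

pool = FakePool()
try:
    consume(pool, pending)           # current order: consume() first...
    pool.register_with_event_loop()  # ...registration never gets this far
except TypeError as exc:
    print('crash:', exc)             # 'NoneType' object is not callable

pool = FakePool()
pool.register_with_event_loop()      # proposed order: register first
consume(pool, pending)               # dispatches normally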


jmesquita commented Dec 19, 2016

So I worked around my problem by making the celery beat messages transient, which is actually my intended behaviour anyway. I'll revisit this as soon as I have a bit more experience with Celery and its codebase.
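For anyone wanting the same workaround, here is a sketch of one way to do it, assuming a typical Celery 4 app; the app, queue, and task names are illustrative, not from this thread. Beat-scheduled tasks are routed to a transient (delivery_mode=1), non-durable queue, so no messages persist on the broker while the worker is down:

from celery import Celery
from kombu import Exchange, Queue

app = Celery('proj', broker='amqp://')

app.conf.task_queues = (
    Queue('default', Exchange('default'), routing_key='default'),
    # delivery_mode=1 makes messages transient (held in memory, never persisted)
    Queue('beat', Exchange('beat', delivery_mode=1),
          routing_key='beat', durable=False),
)

app.conf.beat_schedule = {
    'ping-every-minute': {
        'task': 'proj.tasks.ping',
        'schedule': 60.0,
        'options': {'queue': 'beat'},  # send the scheduled task to the transient queue
    },
}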


gkoller commented Dec 21, 2016

@jmesquita:

So I worked around my problem by making celery beat messages transient

That prevents the error from occurring. Thank you for the suggestion.


sterutkb commented Dec 29, 2016

I can only reproduce this bug when consuming from multiple queues. If everything is consumed from a single queue, then startup works as expected (messages on the queue are properly consumed).

@adewes I tested your proposed solution and at least on the surface it seems to solve the problem.


Member

thedrow commented Jan 2, 2017

@adewes Can you issue a pull request so we can discuss your proposed change?


jdotjdot commented Jan 9, 2017

Are there any updates on this issue? This is causing us major problems now, including in production. I can't even test locally because I get the TypeError issue. We may have to downgrade back to Celery 3.


adewes commented Jan 9, 2017

I haven't been able to resolve it so far, and I also downgraded to version 3 for now; I hope the problem will be fixed soon. @thedrow my "quick fix" did not yield a complete resolution of the problem, so I'm not opening a pull request. I'm not fully versed in the data flow of the components involved (there are several libraries in play here), so unfortunately I'm not able to debug this further right now.


jdotjdot commented Jan 9, 2017

I'm actually not even sure we can downgrade because it's possible we might be relying on the new usage of message headers in the v2 task design.

@ask, I'm happy to screen-share with you so you can see my exact environment to help debug, and maybe even open a remote debugging session if we need to. We're in a bit of a bind because we went all in on Celery 4 and now can't start our workers in production.


Contributor

alanhamlett commented Jan 9, 2017

For now, you can install my fork to get things running in your prod environment:

pip install -e git://github.com/alanhamlett/celery.git@73147a9da31f2932eb4778e9474fbe72f23d21c2#egg=Celery

I just opened #3752 to fix this, but need to figure out a good test to cover the bug first.


jdotjdot commented Jan 9, 2017

Thanks so much.

Starting to think I'm going crazy... I tried upgrading to 4.0.2 (currently on 4.0.0), and with that upgrade, all of a sudden self.retry() stopped working as well.


mrfuxi commented Jan 13, 2017

Specifying the queues explicitly on the CLI looks like a workaround (on 4.0.0).
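Presumably something along these lines, with illustrative app and queue names:

celery worker -A proj -Q default,cache --loglevel=INFO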


sww314 commented Jan 13, 2017

@jdotjdot I am seeing the same thing when specifying queues on the command line. I only see the issue when there is data in the queue; if the queues are empty, I can start workers and they will run.


ianseyer commented Jan 18, 2017

+1

Having this exact issue. The only solution seems to be resetting my broker (which is not good for a queue).



Eugeny commented Feb 14, 2017

In light of this unfortunate delay, I decided to merge some fixes for known Celery 4 issues together in git+http://github.com/Eugeny/celery#egg=celery==4.0.2.1. It's working fine in production so far.


johnatron commented Feb 14, 2017

The message spec may be cross-compatible, but there are subtle, incompatible differences between the versions when it comes to Celery canvas and chaining (a feature we use a lot). A considerable amount of effort went into porting our app from v3 to v4. Having to go back would be a bad time.


johnatron commented Feb 16, 2017

O frabjous day! Thank you.

avantrino added a commit to avantrino/dockerimages that referenced this issue Mar 12, 2017


iiilx commented Apr 12, 2017

I have this same problem using Python 3.6 and Celery 4.0.2.

Shut down Celery, create a task, start Celery back up, and I immediately get the error TypeError: 'NoneType' object is not callable. @ask could you please consider the proposed fixes and merge one? This is preventing previously happy users of Celery from continuing to use Celery :(

Thank you for your time, @ask!


tiptoettt commented Apr 12, 2017

"Microsoft Windows is no longer supported.

The test suite is passing, and Celery seems to be working with Windows, but we make no guarantees as we are unable to diagnose issues on this platform. If you are a company requiring support on this platform, please get in touch."
Probably not going to be fixed.


iiilx commented Apr 12, 2017

@tiptoettt this is not an issue specific to Windows; I'm on Mac. Can you please look closely at everyone's comments? Developers will ditch Celery because of this. I've used Celery for 5 years and this is a major issue. I am going to use the earlier version 3.1.25 if it doesn't have this issue.


LPiner commented Apr 24, 2017

Also having this issue.

lpiner@host:~$ uname -a
Linux host 4.4.0-72-generic #93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
lpiner@host:~$ celery --version
4.0.2 (latentcall)

This is the queue configuration that seems to cause the issue:

task_queues = (
    Queue('queue1_name', Exchange('default'), routing_key='default'),
    Queue('queue2_name', Exchange('default'), routing_key='general.routing_key'),
)

Removing the second queue causes the issue to go away.


Vslira commented May 8, 2017

From what I understand of @ChillarAnand's post in celery/kombu/issues/675, this issue should be solved by 4.0.3, right?



Vslira commented May 8, 2017

Thanks, it worked!


sireliah commented May 22, 2017

I had the same problem on Celery 3.1.23 (Cipater) using RabbitMQ. After long debugging I finally found this thread and fixed the problem by using version 3.1.25. But it was like banging my head against the wall, really.


xarg commented Jun 1, 2017

Same issue on v4.0.2 (latentcall) with multiple queues + heartbeat.


MrKiven commented Jun 2, 2017

Same issue for me on v4.0.2; the workaround is downgrading to v3.x.


ba1dr commented Jun 26, 2017

v4.0.2
For me this issue is a showstopper. I am not considering downgrading to 3.x, and I hope the new version will be released soon.


LPiner commented Jun 26, 2017

@ba1dr build from master. That fixes it for me.
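For anyone else trying that, installing straight from the master branch looks something like this (pin to a specific commit if you need reproducible builds):

pip install git+https://github.com/celery/celery.git#egg=celery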


ba1dr commented Jun 26, 2017

@LPiner, thanks, this fixed the issue for me. But this is not a production-ready solution, eh?


LPiner commented Jun 26, 2017

@ba1dr not much of a choice, TBH. It's either this or downgrade back to 3.x. We use it in production with no issues, but our scale is only a few hundred jobs a day; your mileage may vary.


Member

thedrow commented Jul 11, 2017

We're gonna release soon. See #4109


rholloway commented Jul 13, 2017

Affected here as well. Just hit us. Ready for the release!

AKSHAYUBHAT added a commit to AKSHAYUBHAT/DeepVideoAnalytics that referenced this issue Jul 15, 2017

worker fails due to celery/celery#3620 when consuming from multiple queues, disabling broadcast tasks for now.

ketanbhatt added a commit to ketanbhatt/celery that referenced this issue Sep 13, 2017

gitmate-test-user pushed a commit to GitMateIO/gitmate-2 that referenced this issue Nov 22, 2017
