Worker hangs at startup when running without mingle and gossip and many messages in queue #1847
Some syscalls while this is happening:
Stacktrace
Never mind.
Ok so I've patched kombu a bit to have some extra logging:
Fds to worker processes are from 4 to 13 (inclusive). It looks like none of those are added to the hub for writing (thus tasks don't get run).
It appears that the amqp client is stalling (it has data in internal buffers, but the hub doesn't know about it). I have traced the buffering in the pyamqp Connection object to this commit: celery/py-amqp@737fa58
It would appear that librabbitmq has the same stalling issue, but it looks harder to fix as it just reads up to the buffer's size; see https://github.com/ask/rabbitmq-c/blob/bfbe693e88f8495073f43571904cdbb817dc50ae/librabbitmq/amqp_socket.c#L211
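For readers who want to see the stall in isolation, here is a minimal sketch that does not use pyamqp or kombu at all. It assumes a made-up protocol with fixed 4-byte frames; `BufferedReader`, `FRAME_SIZE` and `demo` are hypothetical names for illustration only. It shows that once a client has pulled more bytes than it needs into a user-space buffer, `select()` has no way of knowing that a complete frame is still waiting.

```python
import select
import socket

FRAME_SIZE = 4  # hypothetical fixed frame length, not an AMQP value


class BufferedReader:
    """Toy client that reads greedily and keeps the surplus in memory."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b""

    def read_frame(self):
        # Greedy read: take whatever the kernel has, not just one frame.
        if len(self.buffer) < FRAME_SIZE:
            self.buffer += self.sock.recv(4096)
        frame, self.buffer = self.buffer[:FRAME_SIZE], self.buffer[FRAME_SIZE:]
        return frame


def demo():
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"MSG1MSG2")  # two frames arrive back to back
        reader = BufferedReader(receiver)
        select.select([receiver], [], [], 1.0)  # wait until data is there
        first = reader.read_frame()
        # The second frame now lives only in reader.buffer, so the
        # kernel reports the socket as not readable.
        readable, _, _ = select.select([receiver], [], [], 0)
        return first, reader.buffer, bool(readable)
    finally:
        sender.close()
        receiver.close()
```

Calling `demo()` returns the first frame, the bytes still sitting in the buffer, and whether `select()` considers the socket readable. An event loop that only wakes up on readable file descriptors would never get to the buffered frame on its own.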
@sabw8217 You can test this fix by using the 'pyamqp://' broker protocol (in case you have librabbitmq installed). Don't forget to install the code from the 2 PRs mentioned above.
I think there is still an issue here. It looks to me like this call in asynloop(), on line 41 of celery/worker/loops.py, can receive messages up to the entire prefetch count of the worker in addition to the basic_consume_ok: consumer.consume(). When that happens, the messages end up in the method_queue of the amqp Channel. Once the entire prefetch count has been received, RMQ won't send us anything until a message gets acked. But because RMQ isn't sending anything, the socket FD never becomes ready to read, and we never end up calling the task handler (which would call drain_events and kick us out of this bad state, where there are prefetch_count basic_deliver messages in pyamqp's internal method queue). It looks to me like doing this will fix it, but I'm guessing there should be a timeout on this call, and possibly this whole issue should be addressed at a lower level:
@sabw8217 are you still having the issue, or is this a different problem?
It's the same issue. The worker hangs on startup, and you can wake it up by issuing a celery inspect active command, which I think causes RabbitMQ to deliver another message to the worker; the socket then becomes readable and we go through drain_events and consume all the messages. I added some debugging code to amqp in connection.Connection._wait_method and got output like this:
[2014-03-07 08:06:44,093: WARNING/MainProcess] called _wait_method for channel 2, there was 0 in the queue
I was running with CELERYD_PREFETCH_MULTIPLIER = 16, and I can see 16 messages go into that method_queue property, and I never see them get read out of there until I trigger a read from the socket with a management command.
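To make the prefetch stall described in the comments above concrete, here is a toy model of a broker that enforces a prefetch window. It is only an illustration of the general AMQP prefetch behaviour, not RabbitMQ or pyamqp code; `ToyBroker`, `deliver`, `ack` and `demo` are hypothetical names.

```python
from collections import deque


class ToyBroker:
    """Pushes messages until `prefetch_count` of them are unacknowledged."""

    def __init__(self, backlog, prefetch_count):
        self.backlog = deque(backlog)
        self.prefetch_count = prefetch_count
        self.unacked = 0

    def deliver(self):
        """Return every message the broker is allowed to push right now."""
        pushed = []
        while self.backlog and self.unacked < self.prefetch_count:
            pushed.append(self.backlog.popleft())
            self.unacked += 1
        return pushed

    def ack(self):
        """The consumer acknowledges one message, freeing one slot."""
        self.unacked -= 1


def demo():
    broker = ToyBroker(range(100), prefetch_count=16)
    burst = broker.deliver()       # fills the whole prefetch window
    while_stuck = broker.deliver() # nothing more until something is acked
    broker.ack()
    after_ack = broker.deliver()   # one slot freed, one message pushed
    return burst, while_stuck, after_ack
```

If the first burst is swallowed into a client-side queue during startup and nothing ever processes (and acks) it, the broker stays in the `while_stuck` state, which matches the symptom that the socket only becomes readable again when something unrelated, such as a management command, arrives.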
@sabw8217 can you try a different client library? (e.g. uninstall or install rabbitmq) Also, you're not using SSL, right?
Reinstalling the rabbitmq-server package (which I assume is what you meant) didn't have any effect. I'm not using SSL, I'm just connecting to local RabbitMQ:
@sabw8217 This may sound silly, but are you using the latest packages of kombu/billiard/celery/pyamqp?
Sorry about the slow response. Here's the output of pip freeze in the virtualenv I am running this in:
Having the same problem. Running I can also confirm that adding the call to
See celery#1847 for more details.
Having the same issue; inspect active seems to get things going. What version of celery is this supposed to be fixed in?
Hmm. The "fix" in py-amqp is to never read more from the socket than absolutely necessary. Doesn't Celery have a way to tell its scheduler to mark a task as ready, without having a readable/writeable socket? I really don't like the inefficiency caused by not buffering any more.
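As a rough sketch of the alternative raised in the comment above (keep the buffering, but let the loop look at buffered data before it blocks), here is a self-contained toy loop. It is not how Celery's hub or py-amqp work; the 4-byte framing and the names `loop_once` and `demo` are assumptions made for the example.

```python
import select
import socket

FRAME_SIZE = 4  # hypothetical fixed frame length


def loop_once(sock, pending, on_frame, timeout=1.0):
    """Run one loop iteration and return the bytes still buffered."""
    if len(pending) < FRAME_SIZE:
        # No complete frame in user space, so waiting on the socket is safe.
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            pending += sock.recv(4096)
    if len(pending) >= FRAME_SIZE:
        # Dispatch from the buffer without consulting select() first.
        on_frame(pending[:FRAME_SIZE])
        pending = pending[FRAME_SIZE:]
    return pending


def demo():
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"MSG1MSG2")
        seen = []
        pending = b""
        for _ in range(2):
            pending = loop_once(receiver, pending, seen.append)
        return seen, pending
    finally:
        sender.close()
        receiver.close()
```

The first iteration reads both frames in one `recv()` and handles one; the second iteration handles the other straight from the buffer even though the socket is no longer readable. The trade-off is that the loop needs to know about the client's internal buffer, which is the coupling the thread is debating.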
What @boffbowsh said isn't entirely accurate. Turned out that librabbitmq +
Any recent discoveries on this issue? We're experiencing this problem intermittently and a
No, we're really sorry, but we're suffering from a low number of maintainers and @ask is pretty busy lately.
Thanks for the update!
@Rigdon feel free to contribute. We need pull requests fixing bugs like these.
Kudos to @sabw8217, whose fix I copied verbatim.
Closed by #2823. Thanks everyone :)
Applying the patch to 3.1.18 doesn't work for me:
Changelog Details:

Change history
================

This document contains change notes for bugfix releases in the 3.1.x series
(Cipater), please see :ref:`whatsnew-3.1` for an overview of what's
new in Celery 3.1.

.. _version-3.1.26:

3.1.26
======
:release-date: 2018-23-03 16:00 PM IST
:release-by: Omer Katz

- Fixed a crash caused by tasks cycling between Celery 3 and Celery 4 workers.

.. _version-3.1.25:

3.1.25
======
:release-date: 2016-10-10 12:00 PM PDT
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.37 <kombu:version-3.0.37>`

- Fixed problem with chords in group introduced in 3.1.24 (Issue #3504).

.. _version-3.1.24:

3.1.24
======
:release-date: 2016-09-30 04:21 PM PDT
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.36 <kombu:version-3.0.36>`.

- Now supports Task protocol 2 from the future 4.0 release.

    Workers running 3.1.24 are now able to process messages
    sent using the `new task message protocol`_ to be introduced in
    Celery 4.0.

    Users upgrading to Celery 4.0 when this is released are encouraged to
    upgrade to this version as an intermediate step, as this
    means workers not yet upgraded will be able to process
    messages from clients/workers running 4.0.

.. _`new task message protocol`:
    http://docs.celeryproject.org/en/master/internals/protocol.html#version-2

- ``Task.send_events`` can now be set to disable sending of events
  for that task only.

    Example when defining the task:

    .. code-block:: python

        @app.task(send_events=False)
        def add(x, y):
            return x + y

- **Utils**: Fixed compatibility with recent :pypi:`psutil` versions
  (Issue #3262).

- **Canvas**: Chord now forwards partial arguments to its subtasks.

    Fix contributed by Tayfun Sen.

- **App**: Arguments to app such as ``backend``, ``broker``, etc
  are now pickled and sent to the child processes on Windows.

    Fix contributed by Jeremy Zafran.

- **Deployment**: Generic init scripts now supports being symlinked
  in runlevel directories (Issue #3208).

- **Deployment**: Updated CentOS scripts to work with CentOS 7.

    Contributed by Joe Sanford.

- **Events**: The curses monitor no longer crashes when the
  result of a task is empty.

    Fix contributed by Dongweiming.

- **Worker**: ``repr(worker)`` would crash when called early
  in the startup process (Issue #2514).

- **Tasks**: GroupResult now defines __bool__ and __nonzero__.

    This is to fix an issue where a ResultSet or GroupResult with an empty
    result list are not properly tupled with the as_tuple() method when it is a
    parent result. This is due to the as_tuple() method performing a logical
    and operation on the ResultSet.

    Fix contributed by Colin McIntosh.

- **Worker**: Fixed wrong values in autoscale related logging message.

    Fix contributed by ``@raducc``.

- Documentation improvements by

    * Alexandru Chirila
    * Michael Aquilina
    * Mikko Ekström
    * Mitchel Humpherys
    * Thomas A. Neil
    * Tiago Moreira Vieira
    * Yuriy Syrovetskiy
    * ``@dessant``

.. _version-3.1.23:

3.1.23
======
:release-date: 2016-03-09 06:00 P.M PST
:release-by: Ask Solem

- **Programs**: Last release broke support for the ``--hostname`` argument
  to :program:`celery multi` and :program:`celery worker --detach`
  (Issue #3103).

- **Results**: MongoDB result backend could crash the worker at startup
  if not configured using an URL.

.. _version-3.1.22:

3.1.22
======
:release-date: 2016-03-07 01:30 P.M PST
:release-by: Ask Solem

- **Programs**: The worker would crash immediately on startup on
  ``backend.as_uri()`` when using some result backends (Issue #3094).

- **Programs**: :program:`celery multi`/:program:`celery worker --detach`
  would create an extraneous logfile including literal formats (e.g. ``%I``)
  in the filename (Issue #3096).

.. _version-3.1.21:

3.1.21
======
:release-date: 2016-03-04 11:16 A.M PST
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.34 <kombu:version-3.0.34>`.

    - Now depends on :mod:`billiard` 3.3.0.23.

- **Prefork pool**: Fixes 100% CPU loop on Linux epoll (Issue #1845).

    Also potential fix for: Issue #2142, Issue #2606

- **Prefork pool**: Fixes memory leak related to processes exiting
  (Issue #2927).

- **Worker**: Fixes crash at startup when trying to censor passwords
  in MongoDB and Cache result backend URLs (Issue #3079, Issue #3045,
  Issue #3049, Issue #3068, Issue #3073).

    Fix contributed by Maxime Verger.

- **Task**: An exception is now raised if countdown/expires is less
  than -2147483648 (Issue #3078).

- **Programs**: :program:`celery shell --ipython` now compatible with newer
  IPython versions.

- **Programs**: The DuplicateNodeName warning emitted by inspect/control
  now includes a list of the node names returned.

    Contributed by Sebastian Kalinowski.

- **Utils**: The ``.discard(item)`` method of
  :class:`~celery.datastructures.LimitedSet` did not actually remove the item
  (Issue #3087).

    Fix contributed by Dave Smith.

- **Worker**: Node name formatting now emits less confusing error message
  for unmatched format keys (Issue #3016).

- **Results**: amqp/rpc backends: Fixed deserialization of JSON exceptions
  (Issue #2518).

    Fix contributed by Allard Hoeve.

- **Prefork pool**: The `process inqueue damaged` error message now includes
  the original exception raised.

- **Documentation**: Includes improvements by:

    - Jeff Widman.

.. _version-3.1.20:

3.1.20
======
:release-date: 2016-01-22 06:50 P.M UTC
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.33 <kombu:version-3.0.33>`.

    - Now depends on :mod:`billiard` 3.3.0.22.

        Includes binary wheels for Microsoft Windows x86 and x86_64!

- **Task**: Error emails now uses ``utf-8`` charset by default (Issue #2737).

- **Task**: Retry now forwards original message headers (Issue #3017).

- **Worker**: Bootsteps can now hook into ``on_node_join``/``leave``/``lost``.

    See :ref:`extending-consumer-gossip` for an example.

- **Events**: Fixed handling of DST timezones (Issue #2983).

- **Results**: Redis backend stopped respecting certain settings.

    Contributed by Jeremy Llewellyn.

- **Results**: Database backend now properly supports JSON exceptions
  (Issue #2441).

- **Results**: Redis ``new_join`` did not properly call task errbacks on chord
  error (Issue #2796).

- **Results**: Restores Redis compatibility with redis-py < 2.10.0
  (Issue #2903).

- **Results**: Fixed rare issue with chord error handling (Issue #2409).

- **Tasks**: Using queue-name values in :setting:`CELERY_ROUTES` now works
  again (Issue #2987).

- **General**: Result backend password now sanitized in report output
  (Issue #2812, Issue #2004).

- **Configuration**: Now gives helpful error message when the result backend
  configuration points to a module, and not a class (Issue #2945).

- **Results**: Exceptions sent by JSON serialized workers are now properly
  handled by pickle configured workers.

- **Programs**: ``celery control autoscale`` now works (Issue #2950).

- **Programs**: ``celery beat --detached`` now runs after fork callbacks.

- **General**: Fix for LRU cache implementation on Python 3.5 (Issue #2897).

    Contributed by Dennis Brakhane.

    Python 3.5's ``OrderedDict`` does not allow mutation while it is being
    iterated over. This breaks "update" if it is called with a dict
    larger than the maximum size.

    This commit changes the code to a version that does not iterate over
    the dict, and should also be a little bit faster.

- **Init scripts**: The beat init script now properly reports service as down
  when no pid file can be found.

    Eric Zarowny

- **Beat**: Added cleaning of corrupted scheduler files for some storage
  backend errors (Issue #2985).

    Fix contributed by Aleksandr Kuznetsov.

- **Beat**: Now syncs the schedule even if the schedule is empty.

    Fix contributed by Colin McIntosh.

- **Supervisord**: Set higher process priority in supervisord example.

    Contributed by George Tantiras.

- **Documentation**: Includes improvements by:

    - Bryson
    - Caleb Mingle
    - Christopher Martin
    - Dieter Adriaenssens
    - Jason Veatch
    - Jeremy Cline
    - Juan Rossi
    - Kevin Harvey
    - Kevin McCarthy
    - Kirill Pavlov
    - Marco Buttu
    - Mayflower
    - Mher Movsisyan
    - Michael Floering
    - michael-k
    - Nathaniel Varona
    - Rudy Attias
    - Ryan Luckie
    - Steven Parker
    - squfrans
    - Tadej Janež
    - TakesxiSximada
    - Tom S

.. _version-3.1.19:

3.1.19
======
:release-date: 2015-10-26 01:00 P.M UTC
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.29 <kombu:version-3.0.29>`.

    - Now depends on :mod:`billiard` 3.3.0.21.

- **Results**: Fixed MongoDB result backend URL parsing problem
  (Issue celery/kombu#375).

- **Worker**: Task request now properly sets ``priority`` in delivery_info.

    Fix contributed by Gerald Manipon.

- **Beat**: PyPy shelve may raise ``KeyError`` when setting keys
  (Issue #2862).

- **Programs**: :program:`celery beat --detached` now working on PyPy.

    Fix contributed by Krzysztof Bujniewicz.

- **Results**: Redis result backend now ensures all pipelines are cleaned up.

    Contributed by Justin Patrin.

- **Results**: Redis result backend now allows for timeout to be set in the
  query portion of the result backend URL.

    E.g. ``CELERY_RESULT_BACKEND = 'redis://?timeout=10'``

    Contributed by Justin Patrin.

- **Results**: ``result.get`` now properly handles failures where the
  exception value is set to :const:`None` (Issue #2560).

- **Prefork pool**: Fixed attribute error ``proc.dead``.

- **Worker**: Fixed worker hanging when gossip/heartbeat disabled
  (Issue #1847).

    Fix contributed by Aaron Webber and Bryan Helmig.

- **Results**: MongoDB result backend now supports pymongo 3.x
  (Issue #2744).

    Fix contributed by Sukrit Khera.

- **Results**: RPC/amqp backends did not deserialize exceptions properly
  (Issue #2691).

    Fix contributed by Sukrit Khera.

- **Programs**: Fixed problem with :program:`celery amqp`'s
  ``basic_publish`` (Issue #2013).

- **Worker**: Embedded beat now properly sets app for thread/process
  (Issue #2594).

- **Documentation**: Many improvements and typos fixed.

    Contributions by:

        Carlos Garcia-Dubus
        D. Yu
        jerry
        Jocelyn Delalande
        Josh Kupershmidt
        Juan Rossi
        kanemra
        Paul Pearce
        Pavel Savchenko
        Sean Wang
        Seungha Kim
        Zhaorong Ma

.. _version-3.1.18:

3.1.18
======
:release-date: 2015-04-22 05:30 P.M UTC
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.25 <kombu:version-3.0.25>`.

    - Now depends on :mod:`billiard` 3.3.0.20.

- **Django**: Now supports Django 1.8 (Issue #2536).

    Fix contributed by Bence Tamas and Mickaël Penhard.

- **Results**: MongoDB result backend now compatible with pymongo 3.0.

    Fix contributed by Fatih Sucu.

- **Tasks**: Fixed bug only happening when a task has multiple callbacks
  (Issue #2515).

    Fix contributed by NotSqrt.

- **Commands**: Preload options now support ``--arg value`` syntax.

    Fix contributed by John Anderson.

- **Compat**: A typo caused ``celery.log.setup_logging_subsystem`` to be
  undefined.

    Fix contributed by Gunnlaugur Thor Briem.

- **init scripts**: The celerybeat generic init script now uses
  ``/bin/sh`` instead of bash (Issue #2496).

    Fix contributed by Jelle Verstraaten.

- **Django**: Fixed a :exc:`TypeError` sometimes occurring in logging
  when validating models.

    Fix contributed by Alexander.

- **Commands**: Worker now supports new ``--executable`` argument that can
  be used with ``--detach``.

    Contributed by Bert Vanderbauwhede.

- **Canvas**: Fixed crash in chord unlock fallback task (Issue #2404).

- **Worker**: Fixed rare crash occurring with ``--autoscale`` enabled
  (Issue #2411).

- **Django**: Properly recycle worker Django database connections when the
  Django ``CONN_MAX_AGE`` setting is enabled (Issue #2453).

    Fix contributed by Luke Burden.

.. _version-3.1.17:

3.1.17
======
:release-date: 2014-11-19 03:30 P.M UTC
:release-by: Ask Solem

.. admonition:: Do not enable the :setting:`CELERYD_FORCE_EXECV` setting!

    Please review your configuration and disable this option if you're using the
    RabbitMQ or Redis transport.

    Keeping this option enabled after 3.1 means the async based prefork pool will
    be disabled, which can easily cause instability.

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.24 <kombu:version-3.0.24>`.

        Includes the new Qpid transport coming in Celery 3.2, backported to
        support those who may still require Python 2.6 compatibility.

    - Now depends on :mod:`billiard` 3.3.0.19.

    - ``celery[librabbitmq]`` now depends on librabbitmq 1.6.1.

- **Task**: The timing of ETA/countdown tasks were off after the example
  ``LocalTimezone`` implementation in the Python documentation no longer works
  in Python 3.4. (Issue #2306).

- **Task**: Raising :exc:`~celery.exceptions.Ignore` no longer sends
  ``task-failed`` event (Issue #2365).

- **Redis result backend**: Fixed unbound local errors.

    Fix contributed by Thomas French.

- **Task**: Callbacks was not called properly if ``link`` was a list of
  signatures (Issue #2350).

- **Canvas**: chain and group now handles json serialized signatures
  (Issue #2076).

- **Results**: ``.join_native()`` would accidentally treat the ``STARTED``
  state as being ready (Issue #2326).

    This could lead to the chord callback being called with invalid arguments
    when using chords with the :setting:`CELERY_TRACK_STARTED` setting
    enabled.

- **Canvas**: The ``chord_size`` attribute is now set for all canvas
  primitives, making sure more combinations will work with the ``new_join``
  optimization for Redis (Issue #2339).

- **Task**: Fixed problem with app not being properly propagated to
  ``trace_task`` in all cases.

    Fix contributed by kristaps.

- **Worker**: Expires from task message now associated with a timezone.

    Fix contributed by Albert Wang.

- **Cassandra result backend**: Fixed problems when using detailed mode.

    When using the Cassandra backend in detailed mode, a regression
    caused errors when attempting to retrieve results.

    Fix contributed by Gino Ledesma.

- **Mongodb Result backend**: Pickling the backend instance will now include
  the original url (Issue #2347).

    Fix contributed by Sukrit Khera.

- **Task**: Exception info was not properly set for tasks raising
  :exc:`~celery.exceptions.Reject` (Issue #2043).

- **Worker**: Duplicates are now removed when loading the set of revoked
  tasks from the worker state database (Issue #2336).

- **celery.contrib.rdb**: Fixed problems with ``rdb.set_trace`` calling stop
  from the wrong frame.

    Fix contributed by llllllllll.

- **Canvas**: ``chain`` and ``chord`` can now be immutable.

- **Canvas**: ``chord.apply_async`` will now keep partial args set in
  ``self.args`` (Issue #2299).

- **Results**: Small refactoring so that results are decoded the same way in
  all result backends.

- **Logging**: The ``processName`` format was introduced in Py2.6.2 so for
  compatibility this format is now excluded when using earlier versions
  (Issue #1644).

.. _version-3.1.16:

3.1.16
======
:release-date: 2014-10-03 06:00 P.M UTC
:release-by: Ask Solem

- **Worker**: 3.1.15 broke ``-Ofair`` behavior (Issue #2286).

    This regression could result in all tasks executing
    in a single child process if ``-Ofair`` was enabled.

- **Canvas**: ``celery.signature`` now properly forwards app argument
  in all cases.

- **Task**: ``.retry()`` did not raise the exception correctly
  when called without a current exception.

    Fix contributed by Andrea Rabbaglietti.

- **Worker**: The ``enable_events`` remote control command
  disabled worker-related events by mistake (Issue #2272).

    Fix contributed by Konstantinos Koukopoulos.

- **Django**: Adds support for Django 1.7 class names in INSTALLED_APPS
  when using ``app.autodiscover_tasks()`` (Issue #2248).

- **Sphinx**: ``celery.contrib.sphinx`` now uses ``getfullargspec``
  on Python 3 (Issue #2302).

- **Redis/Cache Backends**: Chords will now run at most once if one or more
  tasks in the chord are executed multiple times for some reason.

.. _version-3.1.15:

3.1.15
======
:release-date: 2014-09-14 11:00 P.M UTC
:release-by: Ask Solem

- **Django**: Now makes sure ``django.setup()`` is called
  before importing any task modules (Django 1.7 compatibility, Issue #2227)

- **Results**: ``result.get()`` was misbehaving by calling
  ``backend.get_task_meta`` in a finally call leading to
  AMQP result backend queues not being properly cleaned up (Issue #2245).

.. _version-3.1.14:

3.1.14
======
:release-date: 2014-09-08 03:00 P.M UTC
:release-by: Ask Solem

- **Requirements**

    - Now depends on :ref:`Kombu 3.0.22 <kombu:version-3.0.22>`.

- **Init scripts**: The generic worker init scripts ``status`` command
  now gets an accurate pidfile list (Issue #1942).

- **Init scripts**: The generic beat script now implements the ``status``
  command.

    Contributed by John Whitlock.

- **Commands**: Multi now writes informational output to stdout instead of
  stderr.

- **Worker**: Now ignores not implemented error for ``pool.restart``
  (Issue #2153).

- **Task**: Retry no longer raises retry exception when executed in eager
  mode (Issue #2164).

- **AMQP Result backend**: Now ensured ``on_interval`` is called at least
  every second for blocking calls to properly propagate parent errors.

- **Django**: Compatibility with Django 1.7 on Windows (Issue #2126).

- **Programs**: `--umask` argument can be now specified in both octal (if
  starting with 0) or decimal.

.. _version-3.1.13:

3.1.13
======
_version-3.1.15: 3.1.15 ====== :release-date: 2014-09-14 11:00 P.M UTC :release-by: Ask Solem - **Django**: Now makes sure ``django.setup()`` is called before importing any task modules (Django 1.7 compatibility, Issue #2227) - **Results**: ``result.get()`` was misbehaving by calling ``backend.get_task_meta`` in a finally call leading to AMQP result backend queues not being properly cleaned up (Issue #2245). .. _version-3.1.14: 3.1.14 ====== :release-date: 2014-09-08 03:00 P.M UTC :release-by: Ask Solem - **Requirements** - Now depends on :ref:`Kombu 3.0.22 <kombu:version-3.0.22>`. - **Init scripts**: The generic worker init scripts ``status`` command now gets an accurate pidfile list (Issue #1942). - **Init scripts**: The generic beat script now implements the ``status`` command. Contributed by John Whitlock. - **Commands**: Multi now writes informational output to stdout instead of stderr. - **Worker**: Now ignores not implemented error for ``pool.restart`` (Issue #2153). - **Task**: Retry no longer raises retry exception when executed in eager mode (Issue #2164). - **AMQP Result backend**: Now ensured ``on_interval`` is called at least every second for blocking calls to properly propagate parent errors. - **Django**: Compatibility with Django 1.7 on Windows (Issue #2126). - **Programs**: `--umask` argument can be now specified in both octal (if starting with 0) or decimal. .. _version-3.1.13: 3.1.13 ======
See test case here:
https://github.com/sabw8217/celery_test
If I enqueue a number of tasks and then start a worker with --without-mingle and --without-gossip, the worker seems to hang and does not run any of the tasks I enqueued. I'm running this on Debian. On my dev box I can fairly consistently make the worker "wake up" and start consuming tasks by issuing a 'celery inspect' command. 'celery inspect active' will return something like:
-> celery@aaron-dev.localdomain: OK
- empty -
And then the worker will go through the tasks.
Debug output from the worker looks like:
[2014-02-04 13:53:47,628: DEBUG/MainProcess] | Worker: Starting Hub
[2014-02-04 13:53:47,628: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,628: DEBUG/MainProcess] | Worker: Starting Pool
[2014-02-04 13:53:47,633: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,634: DEBUG/MainProcess] | Worker: Starting Consumer
[2014-02-04 13:53:47,635: DEBUG/MainProcess] | Consumer: Starting Connection
[2014-02-04 13:53:47,644: INFO/MainProcess] Connected to amqp://guest@localhost:5672//
[2014-02-04 13:53:47,645: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,645: DEBUG/MainProcess] | Consumer: Starting Events
[2014-02-04 13:53:47,659: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,661: DEBUG/MainProcess] | Consumer: Starting Heart
[2014-02-04 13:53:47,663: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,663: DEBUG/MainProcess] | Consumer: Starting Tasks
[2014-02-04 13:53:47,666: DEBUG/MainProcess] basic.qos: prefetch_count->4
[2014-02-04 13:53:47,667: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,667: DEBUG/MainProcess] | Consumer: Starting Control
[2014-02-04 13:53:47,669: DEBUG/MainProcess] ^-- substep ok
[2014-02-04 13:53:47,670: DEBUG/MainProcess] | Consumer: Starting event loop
[2014-02-04 13:53:47,670: WARNING/MainProcess] celery@aaron-dev.localdomain ready.
[2014-02-04 13:53:47,670: DEBUG/MainProcess] | Worker: Hub.register Pool...
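One possible way to picture a hang like this, assuming the client library has already read the delivered messages into an internal buffer while the event loop only watches the socket, is the toy sketch below. It uses only the Python standard library. `BufferedReader`, `fd_is_readable` and the frame strings are hypothetical names made up for illustration; none of them are Celery, kombu or py-amqp APIs.

```python
import select
import socket


class BufferedReader:
    """Toy client: reads everything available, hands out one frame at a time."""

    def __init__(self, sock):
        self.sock = sock
        self.pending = []  # frames already read off the socket, not yet processed

    def read_one(self):
        # Pull whatever the kernel has into the internal buffer ...
        if not self.pending:
            data = self.sock.recv(4096)
            self.pending.extend(data.decode().splitlines())
        # ... but return only a single frame to the caller.
        return self.pending.pop(0)


def fd_is_readable(sock, timeout=0.1):
    """Ask the OS, as an event loop would, whether the socket has data to read."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


broker, worker = socket.socketpair()
# The "broker" sends a reply plus a full prefetch window of 4 messages
# (hypothetical frame names, matching prefetch_count->4 in the log above).
broker.sendall(b"consume_ok\ntask1\ntask2\ntask3\ntask4\n")

reader = BufferedReader(worker)
first = reader.read_one()             # setup code consumes only the first frame
stalled = not fd_is_readable(worker)  # socket is drained: the loop sees nothing
leftover = list(reader.pending)       # yet four frames are waiting in the buffer

print(first, stalled, leftover)
broker.close()
worker.close()
```

In this sketch the loop never wakes up on its own, because the prefetch window is full and the sender has nothing more to write. Any unrelated traffic on the socket would make it readable again, which would be consistent with a 'celery inspect' command unblocking the worker.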