mingle: searching for neighbors #1664

Closed
vpukhanov opened this Issue Nov 18, 2013 · 9 comments

@vpukhanov

Hello.
I'm using djcelery and djkombu. When I start celery like this
python3 manage.py celeryd --loglevel=info
or like this
python3 manage.py celery worker --loglevel=info

startup stops at this line
[2013-11-18 19:27:22,624: INFO/MainProcess] mingle: searching for neighbors

and never gets any further.

I'm using Lubuntu 13.10 with Python 3.3

@ask
Member
ask commented Nov 19, 2013

If you use the Django database transport, a workaround is to start the worker with --without-mingle --without-gossip. The transport does not support these features, so disabling them is fine.
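
For anyone launching the worker from a script, the workaround could be wired in roughly like this. This is only a sketch: the helper name worker_command is made up for illustration, and it assumes manage.py is in the current directory and that the worker is started through django-celery as in the original report.

```python
import sys


def worker_command(python=sys.executable, loglevel="info"):
    """Build the argv for a worker started with mingle and gossip disabled.

    worker_command is a hypothetical helper, not part of celery.
    The two --without-* flags are the ones suggested in this thread.
    """
    return [
        python,
        "manage.py",              # assumed to be in the current directory
        "celery",
        "worker",
        "--loglevel=" + loglevel,
        "--without-mingle",
        "--without-gossip",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(worker_command()))
```

To actually start the worker, the returned list can be passed to subprocess.call.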

@ask ask closed this in 3c2db4e Nov 19, 2013
@vpukhanov

Thanks. That did the trick.

@zebulon2

I have observed the same problem with celery 3.1.6 and django-celery recently. I have tried to reproduce it on another machine (which was a clone) with celery 3.1.3. I updated to 3.1.6 using pip, but the problem did not happen on that machine. I am using amqp+rabbitmq.
Using --without-mingle --without-gossip, celery starts and says celery@djangosrv ready, but it is still not working. Could it be a problem with database access? We recently changed from sqlite to postgres, however celery was working for a while after the migration.
Is there anything more I can do to diagnose this problem?

@ask
Member
ask commented Dec 13, 2013

@zebulon2 Does it not say anything after 'searching for neighbors'? It will only use amqp communication at that point, so one reason could be that the rabbitmq disk/memory alerts are in effect blocking it from publishing messages.
If that is the case, the broker host should be low on disk space or memory, and the rabbit logs should tell you (note that the disk/memory percentage limits are a bit lower than most monitoring tools use).

@vpukhanov

I'm actually using postgresql, but I'm using kombu to connect celery to the database. Everything is working just brilliantly.

@zebulon2

I have been dissecting the issue. It was caused by the installation of the linux-image-3.11-0.bpo.2-amd64 package on Debian 7 (Wheezy). Removing the package restores celery functionality. So this is external to the celery code, maybe related to amqp+rabbitmq, and I have reported the problem to Debian. However, since I am able to trigger the bug at will (just by reinstalling that package), I am wondering if I could help pin down the cause, which could be informative to everybody.

@zebulon2

Hi,

Going back to this: I found the cause of the issue. The free disk space had fallen below the rabbitmq free disk space limit, which is 1GB by default. So if celery is unresponsive, you should check the free disk space on the broker host and the log messages from rabbitmq.
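
A quick way to check for this condition from Python, as a rough sketch. The 1GB threshold is just the figure mentioned above, not something read from the broker configuration, and the helper names are made up. Pass the directory where your broker stores its data as the first argument; the script falls back to the current directory if none is given.

```python
import shutil
import sys

# Figure quoted in this thread; the real limit is whatever the broker
# is configured with, so treat this as an assumption.
ASSUMED_LIMIT_BYTES = 1024 ** 3


def free_bytes(path):
    """Return the free space, in bytes, on the filesystem holding path."""
    return shutil.disk_usage(path).free


def below_limit(path, limit_bytes=ASSUMED_LIMIT_BYTES):
    """True if the free space at path is under the given limit."""
    return free_bytes(path) < limit_bytes


if __name__ == "__main__":
    # The directory to check is supplied by the caller; "." is only a fallback.
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    status = "BELOW" if below_limit(path) else "above"
    print("%s: %d bytes free, %s the assumed limit" % (path, free_bytes(path), status))
```

This only tells you about disk space on the machine where it runs; the rabbitmq logs remain the authoritative place to see whether a disk or memory alarm is active.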

@romanukyan

@zebulon2 Same cause for me. Things are back to normal after freeing up space.

@jason-kane

Ooh, just ran into this. In my case the issue was a rabbitmq partition. One more thing to check if you see this symptom.
