Framing Error #922

Closed
jbiel opened this Issue Aug 16, 2012 · 9 comments

Comments

Projects
None yet
5 participants
Contributor

jbiel commented Aug 16, 2012

Hello, we're ramping up our celery use and receiving lots of framing errors which cause the worker processes to crash. We're trying out pika to see if the problem goes away. Is pika the preferred AMQP transport?

Versions:

celery: 3.0.5
kombu: 2.3.2
billiard: 2.7.3.12
amqplib: 1.0.2
rabbitmq: 2.8.2

[2012-08-16 19:08:34,029: ERROR/MainProcess] Unrecoverable error: Exception('Framing Error, received 0x00 while expecting 0xce',)
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/celery/worker/__init__.py", line 353, in start
    component.start()
  File "/usr/lib/python2.7/dist-packages/celery/worker/consumer.py", line 369, in start
    self.consume_messages()
  File "/usr/lib/python2.7/dist-packages/celery/worker/consumer.py", line 465, in consume_messages
    drain_nowait()
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 195, in drain_nowait
    self.drain_events(timeout=0)
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 191, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 262, in drain_events
    return connection.drain_events(**kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 97, in drain_events
    chanmap, None, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 155, in _wait_multiple
    channel, method_sig, args, content = read_timeout(timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 129, in read_timeout
    return self.method_reader.read_method()
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
Exception: Framing Error, received 0x00 while expecting 0xce
[2012-08-16 19:08:35,182: WARNING/MainProcess] Traceback (most recent call last):
[2012-08-16 19:08:35,183: WARNING/MainProcess] File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
[2012-08-16 19:08:35,195: WARNING/MainProcess] "__main__", fname, loader, pkg_name)
[2012-08-16 19:08:35,196: WARNING/MainProcess] File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
[2012-08-16 19:08:35,196: WARNING/MainProcess] exec code in run_globals
[2012-08-16 19:08:35,196: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/bin/celeryd.py", line 207, in <module>
[2012-08-16 19:08:35,196: WARNING/MainProcess] main()
[2012-08-16 19:08:35,196: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/bin/celeryd.py", line 203, in main
[2012-08-16 19:08:35,197: WARNING/MainProcess] worker.execute_from_commandline()
[2012-08-16 19:08:35,197: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/bin/celeryd.py", line 136, in execute_from_commandline
[2012-08-16 19:08:35,197: WARNING/MainProcess] return super(WorkerCommand, self).execute_from_commandline(argv)
[2012-08-16 19:08:35,197: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/bin/base.py", line 177, in execute_from_commandline
[2012-08-16 19:08:35,197: WARNING/MainProcess] return self.handle_argv(prog_name, argv[1:])
[2012-08-16 19:08:35,198: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/bin/base.py", line 228, in handle_argv
[2012-08-16 19:08:35,198: WARNING/MainProcess] return self.run(*args, **options)
[2012-08-16 19:08:35,198: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/bin/celeryd.py", line 155, in run
[2012-08-16 19:08:35,198: WARNING/MainProcess] return self.app.Worker(**kwargs).run()
[2012-08-16 19:08:35,198: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/apps/worker.py", line 171, in run
[2012-08-16 19:08:35,199: WARNING/MainProcess] self.run_worker()
[2012-08-16 19:08:35,199: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/apps/worker.py", line 256, in run_worker
[2012-08-16 19:08:35,199: WARNING/MainProcess] worker.start()
[2012-08-16 19:08:35,199: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/worker/__init__.py", line 360, in start
[2012-08-16 19:08:35,199: WARNING/MainProcess] self.stop()
[2012-08-16 19:08:35,200: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/worker/__init__.py", line 402, in stop
[2012-08-16 19:08:35,200: WARNING/MainProcess] self._shutdown(warm=True)
[2012-08-16 19:08:35,200: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/worker/__init__.py", line 437, in _shutdown
[2012-08-16 19:08:35,200: WARNING/MainProcess] self.consumer.close_connection()
[2012-08-16 19:08:35,200: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/worker/consumer.py", line 583, in close_connection
[2012-08-16 19:08:35,201: WARNING/MainProcess] self.maybe_conn_error(self.task_consumer.close)
[2012-08-16 19:08:35,201: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/celery/worker/consumer.py", line 567, in maybe_conn_error
[2012-08-16 19:08:35,201: WARNING/MainProcess] fun()
[2012-08-16 19:08:35,201: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 355, in cancel
[2012-08-16 19:08:35,202: WARNING/MainProcess] cancel(tag)
[2012-08-16 19:08:35,202: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 233, in basic_cancel
[2012-08-16 19:08:35,202: WARNING/MainProcess] return super(Channel, self).basic_cancel(consumer_tag, **kwargs)
[2012-08-16 19:08:35,202: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/channel.py", line 1791, in basic_cancel
[2012-08-16 19:08:35,220: WARNING/MainProcess] (60, 31),    # Channel.basic_cancel_ok
[2012-08-16 19:08:35,220: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/abstract_channel.py", line 95, in wait
[2012-08-16 19:08:35,221: WARNING/MainProcess] self.channel_id, allowed_methods)
[2012-08-16 19:08:35,221: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 202, in _wait_method
[2012-08-16 19:08:35,222: WARNING/MainProcess] self.method_reader.read_method()
[2012-08-16 19:08:35,222: WARNING/MainProcess] File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
[2012-08-16 19:08:35,222: WARNING/MainProcess] raise m
[2012-08-16 19:08:35,222: WARNING/MainProcess] Exception
[2012-08-16 19:08:35,223: WARNING/MainProcess] :
[2012-08-16 19:08:35,223: WARNING/MainProcess] Framing Error, received 0x2d while expecting 0xce

Thanks!

Owner

ask commented Aug 17, 2012

What's your configuration like? pika is not recommended, you could try it but it was very buggy last time I tried. librabbitmq is better supported (pip install librabbitmq)

Jared is out on holiday but I think I can add some useful detail. We moved to pika last night and discovered what appears to be a much more useful error message this morning:

File "/usr/lib/python2.7/dist-packages/celery/worker/consumer.py", line 839, in consume_messages
self.connection.drain_events(timeout=10.0)
File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 191, in drain_events
return self.transport.drain_events(self.connection, *_kwargs)
...snip...
File "/usr/lib/python2.7/dist-packages/pika/callback.py", line 124, in process
callback(_args, **keywords)
File "/usr/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 269, in _on_remote_close
frame.method.reply_text)
AMQPChannelError: (406, 'PRECONDITION_FAILED - unknown delivery tag 13235')

Given this hint that it appears the system is likely acking the same message twice (or making up a delivery ID - less likely) I went ahead and tried turning off delayed acks in the celery config. This completely cleared up or issues in both ampqlib and pika.

I hope this helps. Let me know if there is anything else we can do. I can probably get a packet capture to try to prove out my guess about the system re-sending an ack if necessary. I know that is against the AMQP spec and will cause this error.

-James

Owner

ask commented Aug 17, 2012

James,

Thanks for the update, I guess you mean 'acks_late'. I'll try to enable that to see if I can reproduce the issue here.

Owner

steeve commented Aug 21, 2012

I think it might be linked to #318

Switching from amqplib to librabbitmq solved similar problem for me (I had CELERY_ACKS_LATE already enabled).

Owner

steeve commented Aug 23, 2012

That's because librabbitmq is blocking, whereas amqplib (or pyamqp) are non-blocking in the event loop.

ask added a commit to celery/py-amqp that referenced this issue Aug 29, 2012

ask added a commit to celery/kombu that referenced this issue Aug 29, 2012

Owner

ask commented Aug 29, 2012

This issue is fixed in kombu 2.4.4 (or if you use pyamqp:// you have to upgrade to amqp 0.9.3)

@ask ask closed this Aug 29, 2012

Contributor

jbiel commented Aug 29, 2012

Thanks Ask! We've been successfully using librabbitmq:// for a few days now but we'll be sure to test out the new version of kombu + amqp://.

ask added a commit that referenced this issue Aug 29, 2012

Owner

steeve commented Aug 30, 2012

Yep, I can confirm it's fixed! Great work @ask!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment