"ERROR:pika.adapters.base_connection:Socket Error on fd 8: 104" #266

Closed
vgoklani opened this issue Jan 29, 2013 · 12 comments

Comments

@vgoklani

Hi,

Could someone please explain this error and how to handle it? It looks like the socket is closing, but it's not clear how to address this. Is there an exception I should handle?

I am running the latest release build of both RabbitMQ and Pika.

Thanks,

Vishal

@atatsu
Contributor

atatsu commented Jan 30, 2013

How are you using your channels? Are you using the same channel to do everything? As in declaring queues/exchanges, setting up bindings, consuming queues?

@vgoklani
Author

Yes, I am using the default channel for everything. (presumably this is bad)

My base code is here: https://gist.github.com/4669710

I followed the RabbitMQ tutorial. Could you please point me to a better example?

My callback function is passed in, and the messages are a list of JSON objects that are produced by the producer and consumed by the consumer.

Thanks!

@gmr
Member

gmr commented Jan 30, 2013

Can you please paste the full traceback?

@vgoklani
Author

There is no full traceback from pika, I only see these messages:

ERROR:pika.adapters.base_connection:Socket Error on fd 4: 104

Are you referring to the RabbitMQ logs?

@gmr
Member

gmr commented Jan 30, 2013

No, I am referring to the python traceback when the exception is raised in your application. That error tells us the problem but not where it is occurring or why. The Python traceback when the app breaks is more useful.
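A minimal sketch of how the full Python traceback could be captured around a failing call, using only the standard library. The names `call_and_report` and `failing_publish` are hypothetical stand-ins, not part of pika or of the code in the gist.

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("publisher")


def call_and_report(operation):
    """Run operation(); on failure, log and return the full traceback text."""
    try:
        operation()
    except Exception:
        # format_exc() returns the text Python would print for an uncaught
        # error, including the file and line where the exception was raised.
        details = traceback.format_exc()
        log.error("operation failed:\n%s", details)
        return details
    return None


def failing_publish():
    # Hypothetical stand-in for a publish call that fails.
    raise RuntimeError("connection closed")


report = call_and_report(failing_publish)
```

Printing only the exception message drops the stack frames, which is the part that shows where the failure happened.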

@atatsu
Contributor

atatsu commented Jan 30, 2013

From my experience you can save yourself a lot of headaches if you use a new channel for each new operation you wish to perform. The number of channels you can have open at one time is really only limited by the resources of the box running RabbitMQ. We have thousands of active channels and the box we're running Rabbit on still has plenty to give.

There are a number of reasons not to use the same channel for multiple operations (e.g. queue_declare, exchange_declare, basic_consume), IMO. Say, for example, you've instructed a channel to consume ten different queues. Now pretend there's a channel error: you tried declaring a queue that already existed with different properties. RabbitMQ will close the channel. Your ten consumers are no longer consuming. You'll have to handle that by setting up all ten consumers again, assuming you know which ten were impacted.

Another problem you run into when reusing the same channel is that RabbitMQ expects a certain response from the channel while your code tries to do something else too soon. The result is an unexpected response on your end, which can lead to RabbitMQ closing your connection (which I'm guessing is the problem you're experiencing).

In order to reuse the same channel you have to set up callbacks to know when Rabbit is finished doing its thing. So if you want to call channel.queue_declare five times consecutively, you need to call it once, and then in the callback you can call it again. Rinse and repeat three more times.

Personally I find it easier to just use a separate channel for everything and then just close it when it isn't needed anymore. Does this make sense?
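The channel-per-operation approach could be sketched roughly as follows. This assumes a blocking-style connection whose `channel()` method returns an open channel directly (as pika's BlockingConnection does); `fresh_channel` and `declare_queues` are hypothetical helper names.

```python
from contextlib import contextmanager


@contextmanager
def fresh_channel(connection):
    """Open a channel for a single operation and close it afterwards."""
    channel = connection.channel()
    try:
        yield channel
    finally:
        # The broker may already have closed the channel after an error,
        # so only close it if it is still open.
        if channel.is_open:
            channel.close()


def declare_queues(connection, queue_names):
    """Declare each queue on its own short-lived channel."""
    for name in queue_names:
        with fresh_channel(connection) as channel:
            channel.queue_declare(queue=name)
```

With this shape, a failed declaration only takes down the throwaway channel it ran on, not a channel that also carries consumers.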

@vgoklani
Author

I created a try / catch block:

    try:
        queue.publish(messages)
    except Exception as e:
        print e.message

and got this: 'ConnectionClosed' object has no attribute 'messages'. I don't see any other traceback.

So it's throwing a ConnectionClosed exception. The solution would be to catch the exception and just open another connection. The bigger question is why the connection is closing, and whether the correct solution is simply to open another connection.

This is a rough sketch of my code:

  1. open connection to queue
  2. call producer and put messages into queue
  3. sleep for 20min
  4. goto 2

But it seems now that the connection is closing after step 3.

Is it possible that the 20min delay is closing the connection?

Thanks for all the help!

Best,

Vishal
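One way to avoid holding an idle connection across the long pause is to open a connection per batch and close it before sleeping. The sketch below is an illustration under assumptions, not the code from the gist: `connect` is a hypothetical factory returning a blocking-style connection, and `produce` is a hypothetical function returning the next list of message bodies.

```python
import time


def publish_batch(connect, queue_name, messages):
    """Open a connection, publish every message, then close the connection."""
    connection = connect()
    try:
        channel = connection.channel()
        for body in messages:
            # With the default exchange (""), the routing key is the queue name.
            channel.basic_publish(exchange="", routing_key=queue_name, body=body)
    finally:
        connection.close()


def run_producer(connect, queue_name, produce, pause_seconds, rounds,
                 sleep=time.sleep):
    """Publish a batch, pause with no connection open, and repeat."""
    for _ in range(rounds):
        publish_batch(connect, queue_name, produce())
        sleep(pause_seconds)
```

Reconnecting every 20 minutes costs very little, and it means no connection is open while the process is blocked in the pause.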

@gmr
Member

gmr commented Jan 30, 2013

If you're not getting the error message from:

    from pika.exceptions import ConnectionClosed

    try:
        pass  # foo: do whatever here
    except ConnectionClosed as error:
        print('Connection was closed due to: %s' % error)

Then the RabbitMQ logs are the next place to look.

@vgoklani
Author

Is it normal to get messages like this (from the RabbitMQ log)?

=WARNING REPORT==== 30-Jan-2013::03:52:16 ===
closing AMQP connection <0.466.0> (10.169.2.187:35609 -> 10.169.2.187:5672):
connection_closed_abruptly

=INFO REPORT==== 30-Jan-2013::03:52:21 ===
accepting AMQP connection <0.515.0> (10.169.2.187:35624 -> 10.169.2.187:5672)

=INFO REPORT==== 30-Jan-2013::03:52:38 ===
accepting AMQP connection <0.527.0> (10.169.2.187:35630 -> 10.169.2.187:5672)

=ERROR REPORT==== 30-Jan-2013::04:32:21 ===
closing AMQP connection <0.515.0> (10.169.2.187:35624 -> 10.169.2.187:5672):
{heartbeat_timeout,running}

=INFO REPORT==== 30-Jan-2013::04:34:32 ===
accepting AMQP connection <0.802.0> (10.169.2.187:37799 -> 10.169.2.187:5672)

How would I debug this further?

@gmr
Member

gmr commented Jan 30, 2013

It appears that you are blocking in your consumer for longer than the heartbeat timeout, so Pika never gets a chance to respond to heartbeats. The telling line is:

=ERROR REPORT==== 30-Jan-2013::04:32:21 ===
closing AMQP connection <0.515.0> (10.169.2.187:35624 -> 10.169.2.187:5672):
{heartbeat_timeout,running}

You can either turn heartbeats off or make the interval much longer. In pika it's heartbeat_interval=0 to turn them off or heartbeat_interval={seconds} to set the interval in seconds. My guess is your consumer is blocking in Python processing for a fair amount of time if this is happening.

Are you using time.sleep or any such thing?

@vgoklani
Author

Yes, I am using time.sleep to take 30min breaks in between requests: time.sleep(60 * 30).

What's the proper way of pausing the producer/consumer? Should I switch to connection.sleep(60 * 30)?

Thanks again for all the help!
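A rough sketch of pausing through the connection instead of time.sleep. It assumes a blocking-style connection whose `sleep(seconds)` keeps servicing the socket while waiting (which is what pika's BlockingConnection.sleep is meant to do) and which exposes an `is_open` flag. `pause` and `step_seconds` are hypothetical names.

```python
def pause(connection, total_seconds, step_seconds=10):
    """Wait total_seconds in short steps, letting the connection do its I/O.

    Returns True if the connection is still open at the end, False if it
    was found closed along the way.
    """
    remaining = total_seconds
    while remaining > 0:
        if not connection.is_open:
            return False
        step = min(step_seconds, remaining)
        # Unlike time.sleep, this hands control to the connection so it can
        # read and answer frames (including heartbeats) during the wait.
        connection.sleep(step)
        remaining -= step
    return connection.is_open
```

Waiting in short steps is a design choice made here so a dropped connection is noticed within one step rather than after the whole pause.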

@gmr
Member

gmr commented Jan 30, 2013
