Message not published, AMQPConnection not closed #43

Closed
pulse00 opened this Issue · 16 comments

4 participants

@pulse00

We're experiencing a strange behavior which is only happening on our staging server:

When a producer publishes a message, RabbitMQ never receives it. The RabbitMQ log shows that a connection is opened, but the connection is never closed.

When this is done from the CLI, the PHP process hangs forever and has to be killed manually.

The weird thing is that consumers can connect successfully to rabbitMQ.

Any hints how to debug this problem?

@pulse00

fyi: when the producer publishes a message, i have the following entry in the rabbitmq log:

=INFO REPORT==== 24-Oct-2012::15:55:16 ===
accepting AMQP connection <0.2104.0> (192.168.0.1:11319 -> 192.168.0.1:5672)

note that there's no corresponding entry that the connection is being closed. when i run the same code on my local machine, the AMQP connection is first accepted and then closed afterwards.

@videlalvaro
Owner

Can you post the minimum amount of code I could use to reproduce the issue?

@pulse00

i'll try to create a small testcase to reproduce it. Here's the debug output when the producer is publishing the message: https://gist.github.com/3946447.

This is the part where the script hangs:

< 20,40: Channel.close
waiting for 20,41
waiting for a new frame

Our server runs freebsd 9. Another developer on our team has the same issue on Mac OSX with rabbitMQ installed through homebrew.

@pulse00

@videlalvaro i've created an isolated test script in a composer package. running this script produces this problem on our server: https://github.com/pulse00/queue-test/blob/master/test.php

When running test.php, the script hangs until it gets killed, waiting for the channel to be closed:

< 20,40: Channel.close
waiting for 20,41
waiting for a new frame

@pulse00

as far as i can see the Problem seems to be here: https://github.com/videlalvaro/php-amqplib/blob/master/PhpAmqpLib/Wire/AMQPReader.php#L54

That loop is never finished on our server.
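
For illustration, here is a minimal sketch of a read loop that gives up after a deadline instead of waiting forever. It is not the code from AMQPReader.php: `readExactly()`, its parameters, and the exception message are hypothetical names, and a local socket pair stands in for the broker connection.

```php
<?php
// Hypothetical helper, not part of php-amqplib: read exactly $n bytes, or
// throw once $timeoutSeconds pass without the stream becoming readable.
function readExactly($stream, int $n, float $timeoutSeconds): string
{
    $data = '';
    $deadline = microtime(true) + $timeoutSeconds;

    while (strlen($data) < $n) {
        $remaining = $deadline - microtime(true);
        if ($remaining <= 0) {
            throw new RuntimeException('Timed out waiting for data');
        }

        $read = [$stream];
        $write = null;
        $except = null;
        $sec = (int) floor($remaining);
        $usec = (int) (($remaining - $sec) * 1000000);

        // stream_select() returns 0 if nothing became readable in time.
        $ready = stream_select($read, $write, $except, $sec, $usec);
        if ($ready === false) {
            throw new RuntimeException('stream_select() failed');
        }
        if ($ready === 0) {
            throw new RuntimeException('Timed out waiting for data');
        }

        $chunk = fread($stream, $n - strlen($data));
        if ($chunk === false || ($chunk === '' && feof($stream))) {
            throw new RuntimeException('Connection closed by peer');
        }
        $data .= $chunk;
    }

    return $data;
}

// Demo: the "broker" end of a local socket pair.
[$client, $server] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

fwrite($server, 'hello');
$greeting = readExactly($client, 5, 0.5); // all 5 bytes are available

fwrite($server, 'abc'); // only 3 of the 8 requested bytes ever arrive
try {
    readExactly($client, 8, 0.2);
    $partialTimedOut = false;
} catch (RuntimeException $e) {
    $partialTimedOut = true; // gave up after roughly 0.2 s
}
```

The deadline is checked on every pass, so a peer that sends part of a frame and then goes silent cannot keep the caller waiting indefinitely.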

@pulse00

@videlalvaro forget the testcase. i can reproduce the exact same problem with the scripts from the demo folder when i run them on our server:

  1. start the consumer
  2. run the publisher

The consumer script will stay open, but the publisher script will hang when it tries to close the channel:

< 20,40: Channel.close
waiting for 20,41
waiting for a new frame

@pulse00

@videlalvaro the issue has been resolved after we restarted the server. although it's not happening anymore, it's a bit scary for us because we don't know what caused the problem.

Have you ever experienced any issues like this?

@pulse00 closed this

@videlalvaro
Owner

Hi, sorry I didn't answer, actually I was expecting to try this today since I've been traveling.

I haven't experienced issues like this, and the lib tests are running on travis-ci, sending messages and so on, so if we had such a bug it would appear there.

Have you seen anything strange in the logs? What about file descriptors? Amount of Erlang processes? I don't know what might have caused that… maybe broken flow control? dunno really

@pulse00

Something must have been broken on the rabbitMQ side, because i could reproduce the exact same behavior with the amqp pecl extension. I just wanted to check back if behavior like this is known under certain circumstances, as we're pretty new to rabbitmq.

Thanks for your help.

@keizie

RabbitMQ stops accepting published messages once its memory use passes the memory high watermark. (ref. http://www.rabbitmq.com/memory.html)
Using stream_set_timeout() together with stream_set_blocking(true) does not help here. (Even in non-blocking mode, stream_get_meta_data() does not report the timed_out flag.)

As the document at http://www.amqp.org/specification/1.0/amqp-org-download says, each peer should implement a proper timeout to prevent exactly this issue.

2.4.3 Closing A Connection
Prior to closing a Connection, each peer MUST write a close frame with a code indicating the reason
for closing. This frame MUST be the last thing ever written onto a Connection. After writing this
frame the peer SHOULD continue to read from the Connection until it receives the partner's close
frame (in order to guard against erroneously or maliciously implemented partners, a peer SHOULD
implement a timeout to give its partner a reasonable time to receive and process the close before
giving up and simply closing the underlying transport mechanism). A close frame may be received
on any channel up to the maximum channel number negotiated in open. However, implementations
SHOULD send it on channel 0, and MUST send it on channel 0 if pipelined in a single batch with the
corresponding open.
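
The close sequence quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: `closeWithTimeout()` is a hypothetical helper, the payloads are placeholder strings rather than real AMQP frames, and any bytes received are treated as the partner's close frame, whereas a real client would parse the frame.

```php
<?php
// Hypothetical helper: write our close frame, wait up to $timeoutSeconds for
// the partner's reply, then close the transport either way.
// Returns true if the partner answered in time, false if we gave up.
function closeWithTimeout($stream, string $closeFrame, float $timeoutSeconds): bool
{
    // The close frame is the last thing we write on this connection.
    fwrite($stream, $closeFrame);

    $read = [$stream];
    $write = null;
    $except = null;
    $sec = (int) floor($timeoutSeconds);
    $usec = (int) (($timeoutSeconds - $sec) * 1000000);

    // Wait for the partner, but only for a bounded time.
    $ready = stream_select($read, $write, $except, $sec, $usec);

    $answered = false;
    if ($ready > 0) {
        $reply = fread($stream, 8192);
        $answered = ($reply !== false && $reply !== '');
    }

    // Give up on the handshake and close the underlying transport.
    fclose($stream);

    return $answered;
}

// Demo 1: a well-behaved partner has already sent its reply.
[$a, $b] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
fwrite($b, 'CLOSE-OK');
$politePeer = closeWithTimeout($a, 'CLOSE', 0.2);

// Demo 2: the partner never replies, so we stop waiting after about 0.2 s.
[$c, $d] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
$silentPeer = closeWithTimeout($c, 'CLOSE', 0.2);
```

In both cases the function returns within the timeout, which is the property the hanging publisher in this thread was missing.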

@videlalvaro
Owner

Hi @pulse00, I merged #56. Could you check whether it solves this issue?

@pulse00

@videlalvaro yes, that PR fixed the issue, now we're getting a proper timeout exception:

PHP Fatal error:  Uncaught exception 'Exception' with message 'Error reading data. Socket connection timed out'

@seansullivan great PR. thanks a lot.

@videlalvaro
Owner

@pulse00 great

@jhirbour

I'm having the same issue as @pulse00 . I recently ran yum update and it seems to be related to the version of Rabbit MQ. I'm testing with the master branch of this repo (just fetched again to make sure). I've also checked the amount of RAM etc... because I've read that Rabbit MQ gets cranky when you run out of RAM.

Also not sure if this may be related... I recently ran out of HD space on my VM which this isn't working on... I've since freed up disk space and rebooted it... RabbitMQ is up and running as far as I can tell.

< 20,40: Channel.close
waiting for 20,41
waiting for a new frame

The reason I think this is related to the version of Rabbit MQ is:

My VM that is not working runs CentOS 6.4 with rabbitmq-server-3.1.5-1.el6.noarch and Erlang R14B04.

The issue doesn't happen on our production Red Hat 5 box (I know, old), which runs rabbitmq-server-3.1.3-1 and Erlang R14B04, or on my local Mac, which shows "RabbitMQ 3.1.1, Erlang R15B03" in the web admin (not sure how to check the package version any other way).

I'm going to try downgrading my version of RabbitMQ for now, but I'd appreciate any input from you guys on how to solve this.

@videlalvaro
Owner

great to know. I was starting to get worried :)
