Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock on close() with congested server. #46

Merged
merged 1 commit into from Apr 23, 2015

Conversation

dpratt
Copy link
Contributor

@dpratt dpratt commented Apr 23, 2015

The call BlockingRPCContinuation#getReply used in the close() method of ChannelN does not specify a timeout. If the server is congested, this call to close() can (and does) block indefinitely and eats up the current thread.

From an API point of view, a close() method should really have a timeout associated with it, and if the server doesn't respond with a cleanly closed connection, the channel should just be discarded.

The RPC call for close() should not necessarily listen indefinitely,
since it will block essentially forever if the server is congested.
michaelklishin added a commit that referenced this pull request Apr 23, 2015
Fix deadlock on close() with congested server.
@michaelklishin michaelklishin merged commit a1156ad into rabbitmq:stable Apr 23, 2015
@michaelklishin
Copy link
Member

Thank you. If you have a moment, we'd consumer an API extension that introduces a close() version with a timeout, against the same branch.

@michaelklishin
Copy link
Member

I've modified this a bit: now the exception will be re-thrown if you use close() but will be ignored for abort() (just like in every other case). I hope this makes sense.

@dumbbell
Copy link
Member

Since RabbitMQ 1.5.0.

@katielim
Copy link

ChannelN.basicCancel can also hang since it calls getReply without timeout value.
Can we add a similar change in basicCancel as well?

@michaelklishin
Copy link
Member

@katielim it's not as clear cut as for Connection#start and channel closure. The reason being, if RabbitMQ applies TCP back pressure, basic.cancel response will be delayed but nonetheless will be sent after basic.cancel itself is read from the socket and processed. This means we need to adapt our continuations implementation to somehow handle this.

For connection negotiation and channel/connection closure, any exception can be treated as fatal. Otherwise I have no objections to having timeouts for more operations.

See also #55, this should make the lack of timeouts for channel methods somewhat less annoying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants