A second connection watching a tube hangs reserve-with-timeout #78

Closed
jreitz opened this Issue Oct 13, 2011 · 7 comments

jreitz commented Oct 13, 2011

When using reserve-with-timeout from multiple connections on the same tube, only the last connection receives "TIMED_OUT". The other connections hang indefinitely. This occurs with the trunk codebase.

More specifically:

  • connection A watches tube 'foo' then issues a reserve-with-timeout N seconds
  • connection B watches tube 'foo' within the N seconds period
  • connection A is never issued a "TIMED_OUT" response
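
The steps above can be sketched as a small reproduction script. This is a minimal sketch and not part of the original report: it assumes a beanstalkd server listening on 127.0.0.1:11300 (the default port) and that tube 'foo' has no ready jobs. The helper names (`watch_cmd`, `reserve_with_timeout_cmd`, `read_line`, `reproduce`) are hypothetical and chosen only for illustration.

```python
import socket


def watch_cmd(tube):
    # Protocol line for "watch <tube>"; commands end with CRLF.
    return f"watch {tube}\r\n".encode("ascii")


def reserve_with_timeout_cmd(seconds):
    # Protocol line for "reserve-with-timeout <seconds>".
    return f"reserve-with-timeout {seconds}\r\n".encode("ascii")


def read_line(sock):
    # Read one CRLF-terminated reply line and return it without the CRLF.
    buf = b""
    while not buf.endswith(b"\r\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
    return buf[:-2].decode("ascii")


def reproduce(host="127.0.0.1", port=11300, tube="foo", timeout=2, grace=3):
    # Returns connection A's reply line, or None if A gets no reply
    # within timeout + grace seconds (the hang described above).
    a = socket.create_connection((host, port))
    b = socket.create_connection((host, port))
    try:
        # Connection A watches the tube, then issues reserve-with-timeout.
        a.sendall(watch_cmd(tube))
        read_line(a)  # expected to be "WATCHING <count>"
        a.sendall(reserve_with_timeout_cmd(timeout))

        # Connection B watches the same tube while A is still waiting.
        b.sendall(watch_cmd(tube))
        read_line(b)

        # Wait a little longer than the reserve timeout for A's reply.
        a.settimeout(timeout + grace)
        try:
            return read_line(a)
        except socket.timeout:
            return None
    finally:
        a.close()
        b.close()
```

On an affected server, `reproduce()` would be expected to return `None` because connection A never hears back. On a server without the bug it should return "TIMED_OUT".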

The issue seems to stem from the existing connection being removed from srv->conns when the most recent connection issues the watch foo command. I'm still digging into the code to understand why that is happening.

Owner

kr commented Oct 14, 2011

Confirmed, thanks. If there's anything I can do to help you track this down, let me know!

jreitz closed this Oct 14, 2011

jreitz reopened this Oct 14, 2011

jreitz commented Oct 14, 2011

Cool, I didn't mean to close this just now. Anyway, I think I'm pretty close and I hope to have a patch by Monday.

jreitz commented Oct 17, 2011

OK, I have a one-line fix, but developing a unit test for the bug will be trickier because the current unit test framework does not appear to support multiple concurrent connections. I'll see if I can figure something out this evening.

Owner

kr commented Jan 5, 2012

Hey, even if it's not totally finished, can you push the fix for this in a branch somewhere? I'd like to take a look at it.

It shouldn't be too hard to copy testsrv in integ-test.c to a new function and edit it to make two concurrent connections.

Owner

kr commented Jan 15, 2012

I just wrote a test for this.

kr closed this in 0db458b Jan 15, 2012

jreitz commented Jan 18, 2012

Great! I saw your message about adding the test and was going to check my fix (which we've been running in production since October), but it looks like I'm too late. I had slightly changed the logic in srvscheconn in srv.c, but your fix is better.

cpisto commented Mar 7, 2012

I'm still seeing exactly this behavior on 1.5.
