Requests end up interleaved on Ruby 2.1 when Timeout module is used #501

tarcieri · 2015-01-19T22:47:03Z

The following snippet is able to reproduce a condition whereby the response of one command is never consumed, and the next command ends up consuming the response from the previous command:

https://gist.github.com/tarcieri/303c5dcf803b0e5d021d

Here's another reproduction of the same issue, I believe:

https://gist.github.com/stevehodgkiss/10204388

The issue manifests for me after 1000+ iterations of the loop. It only manifests on Ruby 2.1, and not on previous versions.

It may very well be a bug in Ruby's Timeout implementation, but I thought I'd start here.

antirez · 2015-01-19T22:49:08Z

Thank you @tarcieri, made an hypothesis about what could be the cause in issue #490. Probably Damian will identify the root cause much easily btw.

tarcieri · 2015-01-19T22:51:10Z

Given this only manifests on Ruby 2.1, I'm actually leaning towards it being a bug in Ruby itself (specifically in the 2.1+ implementation of Timeout)

antirez · 2015-01-19T22:52:52Z

Oh strange it only affects only 2.1, yep probably you are right. In the general form the bug looks like something every client out there should have a regression for... very easy to have or re-introduce it.

badboy · 2015-01-19T22:55:58Z

It might very well be the Timeout class, it's broken afterall. Maybe 2.1 is just faster by now and thus this racing conditions manifest. I'll try tracking it down tomorrow.

zanker · 2015-01-20T00:28:02Z

One thing to note, is that while the timeout issue is the most common case this happens, killing a thread can cause it too regardless of version (or more accurately, I've only tested on 2.0/2.1/JRuby 1.7).

Keeping track of whether or not you have a request outstanding when writing data on anything that's not thread safe handles all of the corruption problems though (like in mperham/connection_pool#67, mperham/connection_pool#70)

tamird · 2015-01-20T05:52:40Z

The issue manifests for me after 1000+ iterations of the loop.

This is because the timeout is very high 0.5s -- setting it to 0.001s manifests the issue after one iteration.

It only manifests on Ruby 2.1, and not on previous versions.

Definitely manifests on 2.2 as well, but interestingly not on 2.0/rbx/jruby.

zmack · 2015-01-20T10:19:05Z

I've run into this one as well, and I've seen similar bugs happen with the ruby mogilefs client.

I suspect what it is is that you've got data in the socket's buffer from the previous request ( the one that the timeout kills before actually reading from it ), and the subsequent read just gets whatever's in the buffer ( i.e. the response from the previous request ). On the next request, this happens all over again: the response from the previous request is in the socket's buffer, that's read before the new one's had a chance to make it in, and so on.

The solution we came up with was to just reset the connection whenever we hit a timeout, that way we know for sure whatever result comes up next it'll be fresh.

Unless the intent is to add manage request timeouts in the lib, i'm not sure there's anything we can do to protect users from setting this off with a timeout in their own code.

tarcieri · 2015-01-20T21:21:27Z

The hiredis backend already supports native timeouts. I mostly opened this for clarity around what was happening.

It seems the answer is (as I suspected) that Timeout is inherently unsafe and shouldn't be used.

djanowski · 2015-01-27T20:23:50Z

I implemented a possible fix in #502. Please take a look and provide feedback there. Thanks!

djanowski · 2015-02-11T19:26:03Z

Fixed in 3.2.1.

If the thread is abruptly terminated (`Thread#kill`), or an exception is raised asynchronously (`Thread#raise` or `Timeout.timeout`), it's possible to leave the connection with pending reads. If the connection is then re-used, the next call can potentially read responses to the interupted commands. Ref: redis/redis-rb#501

soveran mentioned this issue Jan 20, 2015

Calling collection attribute is returning string instead of collection soveran/ohm#185

Closed

tarcieri closed this as completed Jan 20, 2015

djanowski mentioned this issue Jan 27, 2015

Fix inconsistent connections when using Timeout under MRI 2.1+. #502

Merged

fw42 mentioned this issue Feb 11, 2015

Client possibly leaking state under high load #507

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Requests end up interleaved on Ruby 2.1 when Timeout module is used #501

Requests end up interleaved on Ruby 2.1 when Timeout module is used #501

tarcieri commented Jan 19, 2015

antirez commented Jan 19, 2015

tarcieri commented Jan 19, 2015

antirez commented Jan 19, 2015

badboy commented Jan 19, 2015

zanker commented Jan 20, 2015

tamird commented Jan 20, 2015

zmack commented Jan 20, 2015

tarcieri commented Jan 20, 2015

djanowski commented Jan 27, 2015

djanowski commented Feb 11, 2015

Requests end up interleaved on Ruby 2.1 when Timeout module is used #501

Requests end up interleaved on Ruby 2.1 when Timeout module is used #501

Comments

tarcieri commented Jan 19, 2015

antirez commented Jan 19, 2015

tarcieri commented Jan 19, 2015

antirez commented Jan 19, 2015

badboy commented Jan 19, 2015

zanker commented Jan 20, 2015

tamird commented Jan 20, 2015

zmack commented Jan 20, 2015

tarcieri commented Jan 20, 2015

djanowski commented Jan 27, 2015

djanowski commented Feb 11, 2015