This is sort of related to #1236 that I filed earlier today, and might be related to #1120.
We had a machine running one of our redis cluster masters go down hard today, and while it was down we saw some interesting behavior from the jedis pool. We got a lot of errors like this:
redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections?
It looks like redis.clients.jedis.Connection.connect() is wrapping the java.net.ConnectException from the failed socket up in a JedisConnectionException, which gets wrapped again by redis.clients.util.Pool.getResource(), and because it's a JedisConnectionException, it triggers a retry.
I think this issue is maybe a little less obviously a bug than #1236, because I can see some uses where someone might want to retry immediately if a socket threw a ConnectException. But one could also argue that if a ConnectException is being thrown, something is seriously wrong, and failing immediately seems reasonable. (If users were seeing ConnectExceptions for nodes that weren't down, they could always increase the connect timeout.) As it is, when a node goes down, users could potentially end up waiting a total of (maxRedirects * connectTimeout) ms and still get an error, which is kind of unfortunate, especially when a regular cache hit is over in just a few ms. :)
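To put a number on that worst case, here's a quick sketch. The 5 and 2000 below are just illustrative values (they happen to match common Jedis defaults, but that's an assumption, not something read from our config):

```java
// Toy worst-case latency model for the failure mode described above:
// every redirection attempt can block for a full connect timeout before
// the JedisConnectionException triggers the next retry.
public class WorstCaseWait {

    static long worstCaseMillis(int maxRedirects, long connectTimeoutMs) {
        return (long) maxRedirects * connectTimeoutMs;
    }

    public static void main(String[] args) {
        // e.g. 5 redirects * 2000 ms connect timeout = 10 full seconds
        // of blocking before the caller even sees an exception.
        System.out.println(worstCaseMillis(5, 2000) + " ms");
    }
}
```

Compared to a few-millisecond cache hit, that's a multi-second stall per request while the node is down.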
@roblg If you want to fail immediately, you can set the JedisCluster retries to 0; in that case JedisCluster won't try to contact other nodes upon failure. I'm on mobile now; I'll explain the rationale for this functionality later.
@marcosnils Thanks for the response. Yeah, I originally considered lowering retries to 0, but I think I actually do want retries for "MOVED" responses, so that we can add/remove nodes from the cluster and pick up those changes. I think maybe ideally there would be two kinds of retries -- error retries and redirect retries, but they seem to be the same right now.
In all honesty, it would probably work well enough to just set retries to 1, so that if we add a new node and rebalance the slots, Jedis would be able to correct for that, but in the case of a socket connect timeout, we don't pay a huge penalty.
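The "two kinds of retries" idea could look something like the sketch below. This is purely hypothetical (the `RetryBudget` class and its method names are mine, not part of the Jedis API), just to make the distinction concrete:

```java
// Hypothetical sketch of separate retry budgets -- NOT the actual Jedis
// API. Redirect retries (after MOVED/ASK) are cheap and desirable for
// picking up slot rebalancing; error retries (after connect failures)
// each cost up to a full connect timeout, so we may want far fewer.
public class RetryBudget {
    private int redirectRetriesLeft;
    private int errorRetriesLeft;

    RetryBudget(int redirectRetries, int errorRetries) {
        this.redirectRetriesLeft = redirectRetries;
        this.errorRetriesLeft = errorRetries;
    }

    /** Returns true if we may retry after a MOVED/ASK redirection. */
    boolean consumeRedirect() {
        return redirectRetriesLeft-- > 0;
    }

    /** Returns true if we may retry after a connection error. */
    boolean consumeError() {
        return errorRetriesLeft-- > 0;
    }
}
```

Something like `new RetryBudget(5, 1)` would keep following MOVED responses after a rebalance while failing fast on repeated connect errors.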
Hmm... it also looks like if there's a ConnectException when trying to connect to a particular node, the request will be retried, but is retried with tryRandomNode=true, so the follow-up request is unlikely to land on a node that has the data. It'll get a MOVED response, and be forced to try again. Am I reading that right?
Yes, that's right. That implementation was a port of https://github.com/antirez/redis-rb-cluster.
When we receive a JedisConnectionException, we can't determine whether the Redis instance is down or the request just timed out. (We might be able to distinguish the two, but let's abstract that away for now.)
If the issue is the former, reconnecting to the same instance is meaningless, since Redis Cluster will try to fail over and that information is not pushed to Jedis. We just have to send the request to a random instance and let it redirect us to a valid instance.
Hmm... I think whether that logic works depends on how fast Redis Cluster completes the failover.
If Redis Cluster can't complete the failover fast enough, roughly half of the max retry count is likely to be spent cycling between connecting to a random node and the failed node, and Jedis will finally throw a JedisConnectionException anyway.
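That "half the retry count" point can be made concrete with a bit of arithmetic. This is a toy model (my own, not Jedis code), assuming every random probe lands on a healthy node that answers MOVED pointing back at the still-down node:

```java
// Toy model of the retry loop described above. While failover is
// incomplete, each useful cycle costs two attempts from the budget:
//   1. connect to a random node, which replies MOVED -> failed node
//   2. connect to the failed node, which fails, sending us random again
// So only about maxRetries / 2 cycles actually probe a fresh node.
public class RetryMath {

    static int usefulProbes(int maxRetries) {
        final int attemptsPerCycle = 2; // random probe + doomed MOVED follow-up
        return maxRetries / attemptsPerCycle;
    }
}
```

With a default-ish budget of 5 retries, that's only 2 real probes before giving up, which is why the failover-speed race matters.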
@HeartSaVioR I believe this shouldn't be much trouble. Even though Jedis might return MaxRedirects very quickly, depending on how long Redis Cluster takes to complete its failover, whenever the Jedis user issues another command through the same JedisCluster instance, Jedis will try to reconnect and dispatch the command to the appropriate node.
In this scenario, Jedis users can catch the MaxRedirectException and implement some sort of exponential backoff before sending the next command to the cluster.
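That backoff could be sketched like this. The `delayMillis` helper and the 100 ms / 5 s bounds are my own illustrative choices; the commented usage loop assumes a `jedisCluster` instance and the Jedis 2.x exception class name from the stack trace above:

```java
// Minimal capped exponential backoff: 100 ms, 200 ms, 400 ms, ... up to 5 s.
public class Backoff {

    static long delayMillis(int attempt) {
        long base = 100L << Math.min(attempt, 10); // clamp shift to avoid overflow
        return Math.min(base, 5_000L);
    }

    // Sketch of the usage described above (not runnable without a live
    // cluster, hence commented out):
    //
    // for (int attempt = 0; ; attempt++) {
    //     try {
    //         return jedisCluster.get(key);
    //     } catch (JedisClusterMaxRedirectionsException e) {
    //         Thread.sleep(delayMillis(attempt)); // back off, then retry
    //     }
    // }
}
```

This keeps the fail-fast behavior inside Jedis while letting the application decide how patiently to wait for the failover.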
Yes, agreed. It's just a question of who is responsible for retrying.
If we want to reduce the max retry count to fail fast, the retry count should still be at least 2 (random node -> MOVED).