Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename ReadFrom.NEAREST to LOWEST_LATENCY to clarify its functionality #1997

Closed
jhmartin opened this issue Feb 8, 2022 · 4 comments
Closed
Labels
type: enhancement A general enhancement
Milestone

Comments

@jhmartin
Copy link

jhmartin commented Feb 8, 2022

ReadFrom.NEAREST is described as the node with the 'lowest latency'. It appears this is measured via the total response time to the CLUSTER NODES command. This can be confusing if you are expecting it to be based purely on network latency, as a busy-but-near node may take longer than idle-but-far.

Please let me know if I am correct here and I'll submit some documentation updates to clarify this.

@mp911de
Copy link
Collaborator

mp911de commented Feb 10, 2022

This can be confusing if you are expecting it to be based purely on network latency

How else would you measure the network latency to the node? We can only leverage mechanisms provided by the Redis connection itself and using the latency of CLUSTER NODES completion is a by-product of obtaining the topology.

I agree that a busy node is going to yield a higher latency than an idle node. A way out could be renaming the constant to LOWEST_LATENCY.

@jhmartin
Copy link
Author

Yes, LOWEST_LATENCY or FASTEST might be a better name.
I looked as I was originally thinking NEAREST could act as an optimization to reduce inter-AZ traffic in AWS, assuming that redis was spread across the AZs along with clients, and replication lag wasn't a concern. I was then curious how that was calculated. I had hoped redis had some in-protocol method for determining that.

I ran a test in AWS with one node in the same availability zone as the client and one in another, and observed that the client flip-flopped between the two. That is a testament to AWS's inter-zone network speed I suppose.

As to how to better detect NEAREST:
First I might consider the response time to a redis PING command instead of CLUSTER NODES, since PING feels like it would be lighter weight.

Another thought is to use a ICMP Ping, but that is frequently blocked plus I gather it is difficult to do from Java.

Along those lines, sending a packet we know will trigger a connection-reset from the other and timing that could do it, but constructing those kinds of packets frequently requires root, and again I'm not sure there is a way to express that in Java. Plus packet games make firewalls unhappy and draw the ire of those who monitor firewall logs.

Timing the duration of creating a tcp connection, but not any application-level interaction, might be a way. The redis still have to call accept() on the socket but not deal with any data yet. This may not be a great idea if you have a lot of clients and a short interval, as they'll now be inundating Redis with connection open/closes.

Of those I'm thinking that using a Redis PING command is the best we can do, which is only a small improvement over CLUSTER NODES.

As a potential additional mechanism, one might look at the IP of the remote connection vs the local connection and give priority to the one that has the most matching bits. That would definitely prefer same-subnet nodes. In AWS one can also arrange the IP space such that AvailabilityZone (ie datacenter) is indicated in the most-significant-bits -- then one could use this new mechanism to reduce inter-AZ traffic even if redis is not in the same specific subnet as the clients. It'd have to be an explicit alternative (ReadFrom.CLOSEST_MATCHING_IP or something like that) though.

@mp911de
Copy link
Collaborator

mp911de commented Feb 11, 2022

With regards to locality, we have ReadFrom.subnet(cidrNotations) and ReadFrom.regex(Pattern) methods to allow applications sticking to certain subnets or addresses matching a pattern. LOWEST_LATENCY seems to be the least clunky name so I'd go with that one and reflect how it works in the docs.

@mp911de mp911de added the type: enhancement A general enhancement label Feb 11, 2022
@jhmartin
Copy link
Author

That seems reasonable.

@mp911de mp911de changed the title Clarify behavior of ReadFrom.NEAREST Rename ReadFrom.NEAREST to LOWEST_LATENCY to clarify its functionality Feb 25, 2022
@mp911de mp911de added this to the 6.1.7 milestone Feb 25, 2022
mp911de added a commit that referenced this issue Feb 25, 2022
…ality #1997

Introduce LOWEST_LATENCY and deprecate NEAREST.
mp911de added a commit that referenced this issue Feb 25, 2022
…ality #1997

Introduce LOWEST_LATENCY and deprecate NEAREST.
@mp911de mp911de closed this as completed Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

2 participants