Add get-master-that-met-quorum-addr-by-name, to be used by sentinel-aware clients. #821
Hi,
I asked the question on the mailing list, but I figured it would be easier to post some code.
Basically, according to the guidelines for sentinel-aware clients (http://redis.io/topics/sentinel-clients), clients should call get-master-addr-by-name to get the master address. However, a sentinel will respond to this call with the master address even if the number of sentinels that appear to be connected to this master is less than the quorum. When the sentinels are desynchronised, this means the resulting master address can differ depending on the order in which the sentinels are queried.
This pull request adds a new method, get-master-that-met-quorum-addr-by-name (awful name, I agree), that answers with IDONTKNOW if the number of sentinels that appear to be connected to this master is less than the quorum.
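To make the intended rule concrete, here is a minimal sketch in Python of the decision described above. It is not the code from this pull request: the function name, the `Address` type, and the assumption that the sentinel counts itself among the sentinels connected to the master are all hypothetical and only serve as illustration.

```python
from typing import Tuple, Union

Address = Tuple[str, int]   # (host, port), a hypothetical representation
IDONTKNOW = "IDONTKNOW"     # reply named in this pull request


def quorum_checked_reply(master_addr: Address,
                         sentinels_seeing_master: int,
                         quorum: int) -> Union[Address, str]:
    """Return the master address only if enough sentinels agree on it.

    sentinels_seeing_master is assumed to include the answering
    sentinel itself; that detail is an assumption of this sketch.
    """
    if sentinels_seeing_master < quorum:
        return IDONTKNOW
    return master_addr


# A sentinel that is alone in seeing this master, with quorum 2.
print(quorum_checked_reply(("10.0.0.1", 6379), 1, 2))
# A sentinel that agrees with one other sentinel, with quorum 2.
print(quorum_checked_reply(("10.0.0.2", 6379), 2, 2))
```

The existing get-master-addr-by-name corresponds to returning `master_addr` unconditionally, which is why it can give an answer that only one sentinel believes.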
Example of a real-world scenario that would trigger a desynchronization:
Two normal redis instances (called R1 and R2, with R1 being the master), and 3 sentinel instances S1, S2 and S3, quorum set to 2.
S1 and R1 become unavailable. Failover is initiated and completed, R2 is the new master, S2 and S3 respond accordingly.
Now S1 and R1 are available again. They don't know about the failover, so R1 comes back as a master, and S1 answers get-master-addr-by-name with R1.
At this point, if a client was given S1, S2 and S3 as starting sentinels, depending on the order in which it queries these sentinels, the master could be set to R1 (if calling S1 first) or R2 (if calling S2 or S3 first). If it were to call get-master-that-met-quorum-addr-by-name, S1 would respond with IDONTKNOW, and S2 and S3 would answer with R2, which is IMO the expected behaviour.
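The scenario can be replayed with a small simulation. This is only a sketch: `FakeSentinel` and `first_answer` are hypothetical names, no real sentinel is contacted, and the per-sentinel agreement counts (1 for S1, 2 for S2 and S3, each counting itself) are assumptions that match the description above.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterable, Optional

IDONTKNOW = "IDONTKNOW"


@dataclass(frozen=True)
class FakeSentinel:
    """In-memory stand-in for a sentinel (hypothetical, no networking)."""
    name: str
    believed_master: str
    sentinels_in_agreement: int  # assumed to include this sentinel
    quorum: int = 2

    def get_master_addr_by_name(self) -> str:
        # Existing behaviour: answer regardless of quorum.
        return self.believed_master

    def get_master_that_met_quorum_addr_by_name(self) -> str:
        # Proposed behaviour: refuse to answer below quorum.
        if self.sentinels_in_agreement < self.quorum:
            return IDONTKNOW
        return self.believed_master


def first_answer(sentinels: Iterable[FakeSentinel],
                 use_quorum_check: bool) -> Optional[str]:
    """Query sentinels in order and return the first usable reply."""
    for s in sentinels:
        if use_quorum_check:
            reply = s.get_master_that_met_quorum_addr_by_name()
        else:
            reply = s.get_master_addr_by_name()
        if reply != IDONTKNOW:
            return reply
    return None


# State after S1 and R1 come back: S1 still believes in R1, alone.
S1 = FakeSentinel("S1", "R1", sentinels_in_agreement=1)
S2 = FakeSentinel("S2", "R2", sentinels_in_agreement=2)
S3 = FakeSentinel("S3", "R2", sentinels_in_agreement=2)

orders = list(permutations([S1, S2, S3]))
plain = {first_answer(order, use_quorum_check=False) for order in orders}
checked = {first_answer(order, use_quorum_check=True) for order in orders}

print(sorted(plain))    # ['R1', 'R2']: result depends on query order
print(sorted(checked))  # ['R2']: same result for every query order
```

With the quorum check, a client that moves on to the next sentinel on IDONTKNOW reaches R2 in every ordering.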