loqrecovery: when performing loss of quorum recovery use NON_VOTER replicas if no VOTERs survived #80620
Labels
A-kv-replication
Relating to Raft, consensus, and coordination.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv-replication
KV Replication Team
Is your feature request related to a problem? Please describe.
Currently loss of quorum recovery only uses VOTERs or INCOMING_VOTERs as candidates for survivors.
Consider a scenario where we have a REGIONAL BY ROW table with all voters in the primary region, plus non-voting replicas serving follower reads in other regions. If the primary region for a range is lost, the data still exists in the other regions, but the existing recovery procedure ignores those replicas.
Describe the solution you'd like
Followers contain all the data; the only difference is that they may lag further behind. Recovering from a lagging follower is still better than having no data at all. We should consider non-voters as candidates when no voter candidates survive.
Note that this change is not straightforward: simply marking a follower as a voter is not enough. DistSender won't route requests to it, and up-replication won't happen either.
We could treat this as a half-online approach, where we do things like manipulate meta2 ranges or change DistSender.
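A minimal sketch of the fallback proposed above, not the actual loqrecovery code. The `Replica` type, its fields, and `pickSurvivor` are hypothetical names invented for illustration, and choosing the replica with the highest applied index within a tier is an assumption. It only covers candidate selection; the DistSender routing and up-replication concerns are not addressed.

```go
package main

import "fmt"

// ReplicaType is a hypothetical stand-in for the replica roles named in this issue.
type ReplicaType int

const (
	Voter ReplicaType = iota
	IncomingVoter
	NonVoter
)

// Replica is a hypothetical, simplified view of one surviving replica of a range.
type Replica struct {
	StoreID      int
	Type         ReplicaType
	AppliedIndex uint64 // assumed measure of how far the replica has caught up
}

// mostAdvanced returns the replica with the highest AppliedIndex among those
// accepted by keep, and false if none is accepted.
func mostAdvanced(replicas []Replica, keep func(Replica) bool) (Replica, bool) {
	var best Replica
	found := false
	for _, r := range replicas {
		if !keep(r) {
			continue
		}
		if !found || r.AppliedIndex > best.AppliedIndex {
			best, found = r, true
		}
	}
	return best, found
}

// pickSurvivor prefers voter candidates and falls back to non-voters only
// when no voter candidate survived.
func pickSurvivor(replicas []Replica) (Replica, bool) {
	isVoter := func(r Replica) bool { return r.Type == Voter || r.Type == IncomingVoter }
	if r, ok := mostAdvanced(replicas, isVoter); ok {
		return r, true
	}
	return mostAdvanced(replicas, func(r Replica) bool { return r.Type == NonVoter })
}

func main() {
	// Primary region lost: only non-voters in other regions remain.
	survivors := []Replica{
		{StoreID: 4, Type: NonVoter, AppliedIndex: 90},
		{StoreID: 5, Type: NonVoter, AppliedIndex: 97},
	}
	if r, ok := pickSurvivor(survivors); ok {
		fmt.Printf("survivor: store %d at index %d\n", r.StoreID, r.AppliedIndex)
	}
}
```

With this ordering a surviving voter always wins, even over a non-voter that is further ahead, which keeps the current behavior unchanged whenever any voter candidate exists.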
Jira issue: CRDB-15578
Epic CRDB-14205