New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rack-awareness to default LBP #666
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You added the implementation of rack awareness, but I can't see a way to enable it from the user's perspective. The DefaultPolicyBuilder
should be adjusted.
@@ -1066,6 +1104,31 @@ mod tests { | |||
.group([B, C, E]) // remote nodes | |||
.build(), | |||
}, | |||
// Keyspace SS with RF=2 with enabled DC failover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please adjust this description to mention that this test also enables rack-awareness
// us eu eu us eu eu us | ||
// r2 r1 r1 r1 r2 r1 r1 | ||
expected_groups: ExpectedGroupsBuilder::new() | ||
.group([G, B]) // pick + fallback local replicas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't those be in separate groups? G
is in the local rack but B
is not.
V2:
Fixed all suggestions mentioned above. |
Please rebase on main. The PR doesn't seem to account for the newly added option that disables replica shuffling. |
1e9e7cf
to
acb2eda
Compare
Now that we want filter replicas not only by DC, but also by rack, we should consider adding this option to the replica locator, similarly to how filtering by DC is currently implemented. WDYT @havaker @cvybhu? |
let local_rack_replicas = self.replicas( | ||
ts, | ||
true, | ||
|node| { | ||
Self::is_alive(node) | ||
&& self.preferred_rack.is_some() | ||
&& DefaultPolicy::rack_predicate(node, &self.preferred_rack) | ||
}, | ||
cluster, | ||
); | ||
let local_replicas = self.shuffled_replicas( | ||
ts, | ||
true, | ||
|node| { | ||
Self::is_alive(node) | ||
&& (self.preferred_rack.is_none() | ||
|| !DefaultPolicy::rack_predicate(node, &self.preferred_rack)) | ||
}, | ||
cluster, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are local_rack_replicas
not chosen with self.shuffled_replicas
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn pick_local_rack_replica<'a>( | ||
&'a self, | ||
ts: &TokenWithStrategy<'a>, | ||
should_be_local: bool, | ||
predicate: &impl Fn(&NodeRef<'a>) -> bool, | ||
cluster: &'a ClusterData, | ||
) -> Option<NodeRef<'a>> { | ||
if let Some(rack) = &self.preferred_rack { | ||
return self | ||
.nonfiltered_replica_set(ts, should_be_local, cluster) | ||
.choose_filtered(&mut thread_rng(), |node| { | ||
predicate(&node) && node.rack.as_deref() == Some(rack) | ||
}); | ||
} | ||
|
||
None | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, creating a distinct function for selecting local rack replicas is unnecessary. Instead, the desired outcome can be accomplished by utilizing a suitable predicate with the pick_replicas
function.
Furthermore, it appears that the modifications introduced by rebasing were not taken into account. While pick_replica
has the ability to select from an unshuffled sequence, this capability is not present in pick_local_rack_replica
, which results in a lack of support for the enable_shuffling_replicas(false)
option.
Overall, I believe this reinforces the argument against replicating the functionality of pick_replica
and instead utilizing a custom predicate to choose local rack replicas.
// us eu eu us eu eu us | ||
// r2 r1 r1 r1 r2 r1 r1 | ||
expected_groups: ExpectedGroupsBuilder::new() | ||
.group([G]) // pick local rack replicas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a single element rack group fails to test shuffling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test, that tests shuffling of local-rack replicas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand by reading the ExpectedGroups::assert_proper_grouping_in_plans
code, it does not test the shuffling of nodes, it just verifies that the plan contains all expected nodes in the group. So, does it matter if there are one or more replicas if you collect them with group
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it does test shuffling.
scylla-rust-driver/scylla/src/transport/load_balancing/default.rs
Lines 707 to 715 in 758ec3e
ExpectedGroup::NonDeterministic(s) => { | |
// The group is supposed to have non-deterministic | |
// ordering. If the group size is larger than one, | |
// then expect there to be more than one group | |
// in the set. | |
if gots.len() > 1 && s.len() > 1 { | |
assert!(sets.len() > 1); | |
} | |
} |
After some thought though, I realized that it would be quite difficult to test if returned local-rack replicas as shuffled. It is because local-rack replicas are always first in the plan (if available), an hence the first element from the plan is a picked local-rack replica. If pick
chooses nodes in a random way but fallback
doesn't, the returned group in plan is not deterministic anyway. Our testing framework has no way of detecting that a group is only partially shuffled.
I don't know if adding new functionality to the testing framework is worth it. I guess that I'm withdrawing my request for a test that checks shuffling of local-rack replicas. @wprzytula any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to at least verify that pick
chooses rack-local replicas randomly. Having a test with the "local rack replicas" group of size 2 should check this.
Later we can consider adjusting the tests - for example, in addition to checking plans that combine pick
and fallback
we could also check plans returned by fallback
only.
/// in the preferred datacenter, and then the other replicas in the datacenter. | ||
/// | ||
/// When a preferred datacenter is not set, setting preferred rack will not have any effect. | ||
pub fn prefer_rack(mut self, rack_name: String) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this function's documentation to the doc book.
&& self.preferred_rack.is_some() | ||
&& DefaultPolicy::rack_predicate(node, &self.preferred_rack) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to include checking if preferred_rack
is not None
to rack_predicate
. It would reduce the complexity of this expression.
@@ -353,6 +405,10 @@ impl DefaultPolicy { | |||
node.is_enabled() | |||
} | |||
|
|||
fn rack_predicate(node: &NodeRef<'_>, preferred_rack: &Option<String>) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other predicates are named after the condition they check, such as is_alive
which verifies if a node is alive (it doesn't at the moment but will be;))). It may be worthwhile to rename rack_predicate
to adhere to this convention, e.g. is_rack_local
.
I was the one to suggest filtering replicas by rack on the higher level to @Gor027;))) The idea of putting more filtering options to the locator came to my mind, but I'm not a fan of it.
To simplify the scope of this pull request I would go with the predicate-filtering. We could later assess performance gains of moving such filtering to the locator module and implement it if needed. |
V3:
|
@@ -13,6 +13,12 @@ use scylla_cql::{errors::QueryError, frame::types::SerialConsistency, Consistenc | |||
use std::{fmt, sync::Arc, time::Duration}; | |||
use tracing::warn; | |||
|
|||
pub enum ReplicaLocation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was added by mistake, fixed it.
let local_rack_replicas = self.replicas( | ||
ts, | ||
true, | ||
|node| { | ||
Self::is_alive(node) | ||
&& self.preferred_rack.is_some() | ||
&& node.rack == self.preferred_rack | ||
}, | ||
cluster, | ||
); | ||
let local_replicas = self.shuffled_replicas( | ||
ts, | ||
true, | ||
|node| { | ||
Self::is_alive(node) | ||
&& (self.preferred_rack.is_none() || node.rack != self.preferred_rack) | ||
}, | ||
cluster, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.replicas
, self.shuffled_replicas
and self.pick_replica
had a very similar signature. I feel that it's a bit inconsistent that you introduced a ReplicaLocation
enum only in self.pick_replica
. Was there any specific reason behind it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the implementation of retrieving local rack replicas to use shuffled_replicas
and appropriately changed the should_be_local: bool
to replica_location: ReplicaLocation
enum in all of the methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After your changes, semantics of self.shuffled_replicas
and self.pick_replica
differ. self.pick_replica
modifies its predicate according to the value of ReplicaLocation
argument, but self.shuffled_replicas
does not. Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, the predicate logic was passed in the fallback
function, so I thought it is redundant to modify the predicate in self.replicas
. I fixed it, but the predicate modification required the ReplicaLocation
enum to implement Clone
and Copy
.
let local_rack_replicas = self.replicas( | ||
ts, | ||
true, | ||
|node| { | ||
Self::is_alive(node) | ||
&& self.preferred_rack.is_some() | ||
&& DefaultPolicy::rack_predicate(node, &self.preferred_rack) | ||
}, | ||
cluster, | ||
); | ||
let local_replicas = self.shuffled_replicas( | ||
ts, | ||
true, | ||
|node| { | ||
Self::is_alive(node) | ||
&& (self.preferred_rack.is_none() | ||
|| !DefaultPolicy::rack_predicate(node, &self.preferred_rack)) | ||
}, | ||
cluster, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// us eu eu us eu eu us | ||
// r2 r1 r1 r1 r2 r1 r1 | ||
expected_groups: ExpectedGroupsBuilder::new() | ||
.group([G]) // pick local rack replicas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test, that tests shuffling of local-rack replicas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the Creating a DefaultPolicy
section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Our convention regarding commit names is something like |
280a1e1
to
90edd07
Compare
@@ -68,6 +69,21 @@ or refrain from preferring datacenters (which may ban all other datacenters, if | |||
} | |||
} | |||
if let Some(ts) = &routing_info.token_with_strategy { | |||
// Try to pick some alive local rack random replica. | |||
// If preferred rack is not specified, try to pick local DC replica. | |||
if let Some(_local_rack) = &self.preferred_rack { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if let
is a bit superfluous here, you are creating an unused _local_rack
variable. if self.preferred_rack.is_some()
would be a better choice there probably.
let predicate = move |node| match replica_location { | ||
ReplicaLocation::Any | ReplicaLocation::Datacenter => predicate(&node), | ||
ReplicaLocation::DatacenterAndRack => { | ||
predicate(&node) && node.rack == self.preferred_rack | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it belongs to the previous commit, where you added ReplicaLocation
parameter to pick_replica
, replicas
and shuffled_replicas
.
This is because without this predicate modification, those functions are not fully respecting ReplicaLocation
values - replicas(..., ReplicaLocation::DatacenterAndRack, ...)
won't return a set that is restricted to local rack.
let predicate = |node| match replica_location { | ||
ReplicaLocation::Any | ReplicaLocation::Datacenter => predicate(&node), | ||
ReplicaLocation::DatacenterAndRack => { | ||
predicate(&node) && node.rack == self.preferred_rack | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
let should_be_local = match replica_location { | ||
ReplicaLocation::Any => false, | ||
ReplicaLocation::Datacenter | ReplicaLocation::DatacenterAndRack => true, | ||
}; | ||
self.nonfiltered_replica_set(ts, should_be_local, cluster) | ||
.into_iter() | ||
.filter(predicate) | ||
.filter(move |node: &NodeRef<'a>| predicate(node)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should also be applied to the previous commit - it does not compile without it.
prioritize nodes based on their availability zones in the preferred datacenter. | ||
When a preferred rack is set, the policy will first return replicas in the local rack | ||
in the preferred datacenter, and then the other replicas in the datacenter. | ||
When a preferred datacenter is not set, setting preferred rack will not have any effect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test that checks this behavior.
Updated documentation in the docs book.
Changed `should_be_local` parameter in `pick_replica`, `replicas` and `shuffled_replicas` to `ReplicaLocation` enum. It is more expressive than a boolean and allows for more ways of filtering while picking a replica. Modified pick and fallback plan to prefer local rack replicas.
// eu eu us eu eu us us | ||
// r1 r1 r1 r2 r1 r2 r1 | ||
expected_groups: ExpectedGroupsBuilder::new() | ||
.deterministic([B, C]) // pick local rack + fallback replicas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: instead of this single .deterministic
, you should do:
.deterministic([B])
.deterministic([C])
The .deterministic()
method does not check that the load balancing policy always returns in the specified order - e.g. in your case it does not check that it always return B
and then C
. It only checks that the nodes within the group are always returned in an unspecified, but always the same order.
In your case, both [B, C]
and [C, B]
orders will be accepted by the test as long as the load balancing policy always sticks to the same order. I assume that you wanted to check that B
then C
is always returned - in such case you should use two groups like I suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There's one minor thing left that I think is worth fixing - the main thing transport: load_balancing: Add ReplicationLocation enum
commit introduces is rack-awareness to the default policy internals, but it's not mentioned anywhere in the commit message.
It's mentioned in a short sentence: |
Added rack-awareness option to prefer replicas in local rack while picking a coordinator. It will allow to create a top-level tier with local datacenter and local rack replicas in the fallback plan.
Fixes: #604
Pre-review checklist
Fixes:
annotations to PR description.