LB: introduce randomization in locality LB scheduler initialization #32075
adisuissa merged 6 commits into envoyproxy:main
Conversation
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
    locality_entries.emplace_back(std::make_shared<LocalityEntry>(i, effective_weight));
  }
}
// If not all effective weights were zero, create the scheduler.
Do we still want to create a scheduler even if all weights are equal?
AFAICT this was the behavior prior to this PR, and I haven't modified that.
This comment (and code) is looking at the case where all weights are zero. The end-result of this method (both prior to this change and after this change) is that if all weights are zero, the locality_scheduler will be null.
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
…b_seed Signed-off-by: Adi Suissa-Peleg <adip@google.com>
    hosts_added, hosts_removed, absl::nullopt, absl::nullopt);
priority_set_.updateHosts(
    0, HostSetImpl::partitionHosts(hosts_, hosts_per_locality_), {}, hosts_added, hosts_removed,
    server_context_.api().randomGenerator().random(), absl::nullopt, absl::nullopt);
I checked randomGenerator().random() briefly; it does not seem to rely on any properties that would increase the chance of duplication across a large Envoy fleet (e.g., using the current time in the seed).
I think it is quite challenging to sync the time across a fleet, especially if we are thinking in the context of EDS updates that are being sent to many Envoys (and each triggers the update at a slightly different time).
BTW I'm also basing this code on the comment that already exists in envoy/source/common/upstream/load_balancer_impl.h (lines 540 to 545 at b34d122).
locality_scheduler = std::make_unique<EdfScheduler<LocalityEntry>>(
    EdfScheduler<LocalityEntry>::createWithPicks(
        locality_entries,
        [](const LocalityEntry& entry) { return entry.effective_weight_; }, seed));
do we need to ensure that seed > 0?
seed is defined as a uint64_t in the input of the current function, and as a uint32_t in the input of createWithPicks(), so both are unsigned and therefore never negative. The implicit cast seems fine to me, as the value just specifies the number of initial picks, and more than 10^6 picks is probably bad anyway.
I was curious whether a 0 value (which is a valid uint64_t) could lead to unpredictable results.
nezdolik left a comment
If the edf_lb_locality_scheduler_init_fix flag is now on by default, we need to test both code branches in the tests.
…b_seed Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
In some cases we can do a parameterized test that works with both. I think the question is what the added value of this is, and in this case I'm not sure the benefit outweighs the cost.

Thanks for the detailed review and comments @nezdolik!

Would not exercising both if branches affect coverage?

Yes. I added a specific test that sets the runtime flag to false, to validate that it works as expected; that will cover the old-path code.

No problem, sorry I missed that test, it was lots of affected test code :)
// If not all effective weights were zero, create the scheduler.
if (!locality_entries.empty()) {
  locality_scheduler = std::make_unique<EdfScheduler<LocalityEntry>>(
      EdfScheduler<LocalityEntry>::createWithPicks(
This LGTM overall.
I have only one question. createWithPicks will use seed % 429496729 as the number of pre-picks. Because the seed is a random value from the random() method, it's possible we may do 429496728 pre-picks when we update the hosts.
Did we actually evaluate the possible performance impact in the worst case? If users update the host set frequently, will this bring a huge performance burden?
(PS: the host scheduler will use seed % hosts.size() as the number of pre-picks; shouldn't this be something similar? E.g., we could use seed % locality_entries.size() or std::max(8, seed % locality_entries.size()) here?)
> This LGTM overall.
> I have only one question. createWithPicks will use seed % 429496729 as the number of pre-picks. Because the seed is a random value from the random() method, it's possible we may do 429496728 pre-picks when we update the hosts. Did we actually evaluate the possible performance impact in the worst case? If users update the host set frequently, will this bring a huge performance burden?
While conceptually the number of pre-picks could be large, the code is bounded by the number of entries/weights (N; see here), and the operation has O(N*log(N)) complexity, which is the same as it is right now (because the localities are added one after the other to the priority queue).
A proof of the bounded number of iterations can be found in the doc linked for this comment.
> (PS: the host scheduler will use seed % hosts.size() as the number of pre-picks; shouldn't this be something similar? E.g., we could use seed % locality_entries.size() or std::max(8, seed % locality_entries.size()) here?)
Yes, the next PR will be to fix the wrong initialization there as well.
I actually don't completely get it. I will read the code in more detail tomorrow. I'll unblock the PR first.
Feel free to reach out directly and I can try to give a guided explanation.
I got it. Sorry for the previous negligence.
I have to say, it's awesome 👍.
wbpcode left a comment
LGTM overall. I marked it as request-changes because there is an important question to be checked first.
…b_seed Signed-off-by: Adi Suissa-Peleg <adip@google.com>
/retest
Commit Message: LB: introduce randomization in locality LB scheduler initialization
Additional Description:
Following up on the randomized picking introduced in #31592, this PR plumbs a seed into the locality-LB.
Prior to this PR, whenever a new EDS assignment arrived, the order of locality picks was always reset.
For example, if there are 2 localities, locality_A with weight 99 and locality_B with weight 1, an update of the EDS message (that doesn't update the weights) may result in starting from the beginning of the "pick list".
This may cause a skew in the traffic distribution, especially in a fleet of Envoys, when the Envoys receive periodic assignment updates and have very low QPS.
The change ensures that the locality picking starts from a randomized starting point.
Risk Level: Medium - a bug fix that may impact current traffic flows.
Testing: Added tests, and updated old ones.
Docs Changes: None.
Release Notes: Added.
Platform Specific Features: N/A.
Runtime guard: Added envoy.reloadable_features.edf_lb_locality_scheduler_init_fix