Conversation

@tleach (Contributor) commented May 14, 2018

RoundRobinLoadBalancerFactory creates a new Picker instance every time the set of provided address groups changes or the connection state of subchannels associated with existing address groups changes. In certain scenarios, such as deployment/replacement of the target service cluster, this can lead to high churn of Picker objects. Because each new Picker's subchannel index is initialized to zero, in these scenarios requests can end up disproportionately routed through subchannels (and hence server nodes) that appear earlier in the list of address groups.

At Netflix we have measured some service nodes taking 3-4x the load of other nodes during deployment.

This commit randomizes the start index of the RoundRobinLoadBalancerFactory.Picker, which eliminates this behavior.
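
In concrete terms, here is a minimal sketch of the change being described, using a hypothetical RandomStartPicker class (illustrative only, not the actual grpc-java Picker):

import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the grpc-java implementation: a round-robin
// picker whose rotation starts at a random offset rather than index 0,
// so frequent picker re-creation does not concentrate the first picks
// on low-indexed addresses. Assumes a non-empty list.
final class RandomStartPicker<T> {
  private final List<T> list;
  private final AtomicInteger index;

  RandomStartPicker(List<T> list, Random random) {
    this.list = list;
    // Randomizing the start index means freshly created pickers do not
    // all begin with the first address in the list.
    this.index = new AtomicInteger(list.isEmpty() ? 0 : random.nextInt(list.size()));
  }

  T pick() {
    // Advance through the list in order, wrapping at the end.
    int i = index.getAndUpdate(v -> (v + 1) % list.size());
    return list.get(i);
  }
}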

@thelinuxfoundation

Thank you for your pull request. Before we can look at your contribution, we need to ensure all contributors are covered by a Contributor License Agreement.

After the following items are addressed, please respond with a new comment here, and the automated system will re-verify.

Regards,
CLA GitHub bot

@carl-mastrangelo (Contributor)

@tleach One alternative is to shuffle the addresses in the name resolver before passing it to the LB. This avoids the problem of all clients overloading the backends in the same order. Picking a random start index means all backends would still end up synchronizing (like metronomes).
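
A rough sketch of that alternative, using a hypothetical AddressShuffler helper (illustrative only, not a grpc-java API): each client shuffles its own copy of the resolved addresses once, so different clients walk the backends in different orders.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative helper, not a real grpc-java API: give each client its own
// random permutation of the resolved addresses before round-robin starts.
final class AddressShuffler {
  static <T> List<T> shuffledCopy(List<T> addresses) {
    List<T> copy = new ArrayList<>(addresses); // keep the resolver's list intact
    Collections.shuffle(copy);                 // per-client permutation
    return copy;
  }
}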

@tleach (Contributor, Author) commented May 14, 2018

@carl-mastrangelo great point regarding clients synchronizing their access to backends.

I considered shuffling the addresses but was concerned about introducing an O(n) operation every time the address list is updated if that list changes frequently. Perhaps that concern is overblown.

@tleach (Contributor, Author) commented May 14, 2018

Another thought that occurred to me: shuffling the addresses is likely a redundant operation for other load balancer implementations (e.g. a random load balancer), so the name resolver is probably the wrong place to do it (and arguably it leaks information about the choice of load-balancing algorithm). It might still make sense to shuffle the address list inside RoundRobinLoadBalancer.handleResolvedAddressGroups.

@carl-mastrangelo (Contributor)

@tleach I wouldn't worry about the time complexity here. The number of backends is usually less than 1000, and the list being shuffled very likely fits in the cache. I have an internal benchmark that uses 50,000 backends and didn't see any significant issues. (Also, see #3579.)

As for a random load balancer: probably not a good idea, since some of the backends get far more load than others. For an example of how this happens, see Genius's blog post.

For gRPC we try to keep the LB implementations the same across languages, so they all behave roughly the same. @zhangkun83 and @ejona86 might be able to shed light on potentially applying your change across implementations.

@ejona86 (Member) commented May 14, 2018

This is probably a good idea to do cross-language. But I'd like to discuss it with Kun, who will be back next week. I do agree that shuffling every update is a bit weird across all NameResolvers (for some it would be a good idea, for others not so much). But I can see the discussion going in a few different directions.

Can we sit on this for a week-ish and get back to you?

@tleach (Contributor, Author) commented May 14, 2018

@ejona86 sure - no urgency on my end. We've shifted to a different (in-house) LB implementation for now. Thanks.

@zhangkun83 (Contributor)

I am back :). I think shuffling in the round robin balancer is the right thing to do. The shuffling helps balance the load; it has nothing to do with the NameResolver.

@ejona86 requested a review from zhangkun83 on June 15, 2018
this.stickinessState = stickinessState;
// start off with a random address to ensure significant Picker churn does not skew
// subchannel selection toward lower-indexed addresses in the list
Random random = new Random();
Review comment (Member):

Don't re-create the Random instance so frequently. Instead, make it a field of RoundRobinLoadBalancer and pass the value (or the chosen starting index) into the Picker constructor. Since Pickers are only created from one thread at a time, there are no concurrency issues from using Random at that point.
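
A sketch of what that suggested structure might look like; the class names BalancerSketch and PickerSketch are illustrative, not the real grpc-java types:

import java.util.List;
import java.util.Random;

// Illustrative sketch of the suggested structure: the balancer owns a single
// Random for its lifetime and passes the chosen start index to each Picker.
final class BalancerSketch<T> {
  private final Random random = new Random(); // created once, not per Picker

  PickerSketch<T> newPicker(List<T> activeList) {
    int startIndex = activeList.isEmpty() ? 0 : random.nextInt(activeList.size());
    return new PickerSketch<>(activeList, startIndex);
  }
}

// Assumes a non-empty list when pick() is called.
final class PickerSketch<T> {
  private final List<T> list;
  private int index;

  PickerSketch(List<T> list, int startIndex) {
    this.list = list;
    this.index = startIndex; // rotation begins at the index chosen by the balancer
  }

  T pick() {
    T picked = list.get(index);
    index = (index + 1) % list.size(); // advance, wrapping at the end
    return picked;
  }
}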

@ejona86 added the kokoro:run label on Jun 15, 2018
@kokoro-team removed the kokoro:run label on Jun 15, 2018
@tleach force-pushed the tl-random-round-robin-start-index branch 2 times, most recently from 52a445f to 9d8290d, on June 25, 2018
@tleach force-pushed the tl-random-round-robin-start-index branch from 9d8290d to 40c4aef on July 2, 2018
helper.updateBalancingState(state, new Picker(activeList, error, stickinessState));
// initialize the Picker to a random start index to ensure that a high frequency of Picker
// churn does not skew subchannel selection.
int startIndex = activeList.isEmpty() ? -1 : random.nextInt(activeList.size()) - 1;
Review comment (Contributor):

nit: startIndex being -1 is a bit obscure here. How about making startIndex 0-based and, in the Picker constructor, setting index = startIndex - 1?
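
A sketch of that suggested shape, assuming the Picker pre-increments its index before each pick (the class name and details are illustrative):

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the nit: the balancer computes a plain 0-based
// start index, e.g.
//   int startIndex = activeList.isEmpty() ? 0 : random.nextInt(activeList.size());
// and the Picker constructor absorbs the off-by-one, because pick()
// advances the index before reading. Assumes a non-empty list.
final class ZeroBasedStartPicker<T> {
  private final List<T> list;
  private final AtomicInteger index;

  ZeroBasedStartPicker<T>(List<T> list, int startIndex) {
    this.list = list;
    this.index = new AtomicInteger(startIndex - 1); // first pre-increment lands on startIndex
  }

  T pick() {
    int i = index.updateAndGet(v -> (v + 1) % list.size()); // advance, then read
    return list.get(i);
  }
}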

assertEquals(subchannel1, picker.pickSubchannel(mockArgs).getSubchannel());
assertEquals(subchannel2, picker.pickSubchannel(mockArgs).getSubchannel());
assertEquals(subchannel, picker.pickSubchannel(mockArgs).getSubchannel());
Subchannel picked1 = picker.pickSubchannel(mockArgs).getSubchannel();
Review comment (Contributor):

This test is actually weaker than the previous one since it no longer verifies that subchannels are visited in order and no subchannel is skipped.

I would define an internal interface RandomProvider which can be mocked out, and deterministically test that it has been called and that its value is used as the start index.
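
Such an indirection might look like the following sketch; only the name RandomProvider comes from this comment, the rest is illustrative:

import java.util.Random;

// Sketch of the mockable indirection suggested above.
interface RandomProvider {
  int nextInt(int bound);
}

// Production code delegates to java.util.Random:
final class JdkRandomProvider implements RandomProvider {
  private final Random random = new Random();
  @Override public int nextInt(int bound) {
    return random.nextInt(bound);
  }
}

// A test can substitute a fixed value and assert it is used as the start index:
// RandomProvider fixedAtTwo = bound -> 2;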

@tleach force-pushed the tl-random-round-robin-start-index branch from 40c4aef to 14a9260 on July 31, 2018
@tleach (Contributor, Author) commented Jul 31, 2018

@zhangkun83 I've updated the pickerRoundRobin() test, which now simply passes in a start index and verifies that specific subchannels are returned in a specific order, as before. This should address your concern about weakening the test.

Given that, in the other tests I've preserved the approach of using nextSubchannel() to dynamically determine the next expected subchannel. This seems preferable to wiring through a random provider for the start index, which would introduce a fair amount of boilerplate.

@zhangkun83 (Contributor) left a comment:

LGTM.

P.S. It's better to keep the commits so reviewers can track the progression; they will be squashed when the PR is merged.

@zhangkun83 added the kokoro:force-run label on Aug 1, 2018
@kokoro-team removed the kokoro:force-run label on Aug 1, 2018
@zhangkun83 merged commit b9d1bb8 into grpc:master on Aug 1, 2018
@lock bot locked this conversation as resolved and limited it to collaborators on Oct 30, 2018