
storage: Fix allocator diversity score calculations #17570

Merged 2 commits on Aug 11, 2017

Conversation

a-robinson
Contributor

Plus a couple other small changes.

This still leaves open the question (in my mind, at least) of whether bad data diversity should be included in the conditions for shouldRebalance. Right now, we don't consider bad diversity a reason to rebalance a range even though a human would probably consider something like all of a range's replicas being in the same locality a pretty great reason to rebalance. What do you think? In a clean-slate implementation I'd definitely say yes, but I'm not sure about changing it at this point in the release cycle.
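For concreteness, a minimal hypothetical sketch of what folding diversity into the shouldRebalance decision could look like. All names here are illustrative stand-ins, not the real allocator code, and it assumes a mean-pairwise-diversity score where 1.0 means fully diverse:

```go
package main

import "fmt"

// maxDiversityScore is the best possible diversity score (illustrative).
const maxDiversityScore = 1.0

// rangeDiversityScore returns the mean pairwise diversity of the
// replicas' localities: each pair contributes maxDiversityScore if the
// localities differ and 0 if they are identical.
func rangeDiversityScore(localities []string) float64 {
	var sum float64
	var pairs int
	for i := range localities {
		for j := i + 1; j < len(localities); j++ {
			pairs++
			if localities[i] != localities[j] {
				sum += maxDiversityScore
			}
		}
	}
	if pairs == 0 {
		return maxDiversityScore
	}
	return sum / float64(pairs)
}

// shouldRebalanceForDiversity treats any range that is less diverse
// than it could be as a rebalance candidate.
func shouldRebalanceForDiversity(localities []string) bool {
	return rangeDiversityScore(localities) < maxDiversityScore
}

func main() {
	fmt.Println(shouldRebalanceForDiversity([]string{"a", "a", "a"})) // true
	fmt.Println(shouldRebalanceForDiversity([]string{"a", "b", "c"})) // false
}
```

Under this sketch, all replicas sharing one locality scores 0 and triggers a rebalance, which matches the "human would probably consider" intuition above.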

@a-robinson a-robinson requested review from BramGruneir and a team August 10, 2017 13:57
@cockroach-teamcity
Member

This change is Reviewable

@bdarnell
Contributor

I think we should make bad diversity a reason to rebalance early in the 1.2 cycle.


Review status: 0 of 2 files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

@a-robinson
Contributor Author

Yeah, there's already plenty of change in this release and it's quite late, so I'm fine waiting. The main counterargument, though, is that it's not clear it'll really get much extra testing in the next release cycle given that we don't have many folks running alpha versions with localities enabled.

@tamird
Contributor

tamird commented Aug 10, 2017

This has been running on indigo since Saturday.

Perhaps write this in a way that is less sensitive to time of reading. How long was it running? I don't even know when this commit was written.


Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 2 of 2 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/storage/allocator_scorer.go, line 680 at r3 (raw file):

	nodeID roachpb.NodeID, existingNodeLocalities map[roachpb.NodeID]roachpb.Locality,
) float64 {
	minScore := 1.0

where does 1.0 come from? is it locality.DiversityScore(locality)?


pkg/storage/allocator_scorer_test.go, line 598 at r3 (raw file):

		},
		{
			name: "one existing replicas",

singular?


Comments from Reviewable

@a-robinson
Contributor Author

Perhaps write this in a way that is less sensitive to time of reading. How long was it running? I don't even know when this commit was written.

Good point, done.


Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/storage/allocator_scorer.go, line 680 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

where does 1.0 come from? is it locality.DiversityScore(locality)?

Yeah, it's the max score returned by locality.DiversityScore. I've pulled it out into a constant.
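To illustrate the refactor being discussed, a self-contained sketch of seeding minScore with a named constant rather than a bare 1.0. The names maxDiversityScore, diversityScore, and minDiversityToExisting are illustrative stand-ins, not the real pkg/storage identifiers:

```go
package main

import "fmt"

// maxDiversityScore names the magic 1.0: the maximum value a pairwise
// locality comparison can return (localities sharing no tiers).
const maxDiversityScore = 1.0

// diversityScore stands in for locality.DiversityScore: it returns
// maxDiversityScore when two localities differ and 0 when identical.
func diversityScore(a, b string) float64 {
	if a == b {
		return 0
	}
	return maxDiversityScore
}

// minDiversityToExisting seeds minScore with maxDiversityScore (the
// pattern questioned in the review) and lowers it toward the least
// diverse comparison against any existing replica.
func minDiversityToExisting(candidate string, existing []string) float64 {
	minScore := maxDiversityScore
	for _, loc := range existing {
		if s := diversityScore(candidate, loc); s < minScore {
			minScore = s
		}
	}
	return minScore
}

func main() {
	fmt.Println(minDiversityToExisting("a", []string{"a", "b"})) // 0
	fmt.Println(minDiversityToExisting("c", []string{"a", "b"})) // 1
}
```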


pkg/storage/allocator_scorer_test.go, line 598 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

singular?

Done.


Comments from Reviewable

@tamird
Contributor

tamird commented Aug 10, 2017

Reviewed 2 of 2 files at r4, 3 of 3 files at r5.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


pkg/storage/allocator_scorer.go, line 680 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Yeah, it's the max score returned by locality.DiversityScore. I've pulled it out into a constant.

Nice, thanks.


Comments from Reviewable

@a-robinson
Contributor Author

I've removed the second commit (the one with the tiny scoring change) since it's included in / superseded by #17593.

@BramGruneir
Member

Great job locating some weirdness in the scores.

What about the situation where we have [a,a,b,c]. Would this allow movement of either a to another a, or a b or a c?

On a side note, when we have a larger set of replicas than the available top-level localities, should we also try for the most diverse scores in a second round? So if you have 3 localities, a, b, c, and we have 6 replicas, then we should try to double up in each locality.

That's a bit easier if there's a second locality tier, but should happen regardless.
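A toy sketch of the "second round" idea above, assuming the simplest possible placement policy (round-robin over localities, so counts stay as even as possible); this is purely illustrative, not the allocator's logic:

```go
package main

import "fmt"

// assignReplicas spreads n replicas across the given localities
// round-robin, doubling up only once every locality already holds a
// replica. Illustrative placement logic only.
func assignReplicas(localities []string, n int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < n; i++ {
		counts[localities[i%len(localities)]]++
	}
	return counts
}

func main() {
	// 3 localities, 6 replicas: each locality ends up with 2.
	fmt.Println(assignReplicas([]string{"a", "b", "c"}, 6)) // map[a:2 b:2 c:2]
}
```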


Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 3 of 3 files at r6, 2 of 3 files at r7.
Review status: 2 of 3 files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@BramGruneir
Member

Oh, forgot to mention, LGTM.

I'm starting to wonder if we might need specialized diversity scoring based not just on which operation is being performed (add/remove/rebalance) but also on whether the replica count is greater than, equal to, or less than the locality count.

But this change is a clear improvement over the status quo.
Have you confirmed that it works when doing full movements in indigo, setting the zone config constraints to just east, and then -east?


Reviewed 1 of 3 files at r7.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@a-robinson
Contributor Author

What about the situation where we have [a,a,b,c]. Would this allow movement of either a to another a, or a b or a c?

It differs slightly depending on the situation:

  • Adding a replica: a, b, and c would all be considered equally bad choices (meaning we could add a third a rather than a second b or c)
  • Removing a replica: after this change, we'll always prefer to remove an a
  • Considering a rebalance: it would allow movement to b or c, but not to a. In fact, I think it may always trigger a rebalance. Four replica configurations are obviously a bad idea anyways but that's... not great, and kind of invalidates this approach. I'm not sure we could actually fix this without changing the way we compare new replicas to old replicas in (Allocator).RebalanceTarget, since we're comparing apples (diversity removal score) to oranges (diversity rebalance-to score). I tried to make them more comparable with this change, but didn't consider that case.
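To make the removal half of the [a,a,b,c] example concrete, here is a toy scoring sketch (illustrative functions only, not the real allocator_scorer.go code) where a replica's removal score is the mean pairwise diversity of what remains, so a higher score means a better removal candidate:

```go
package main

import "fmt"

// meanPairwiseDiversity averages pairwise diversity over the given
// localities: 1 for a differing pair, 0 for an identical pair.
func meanPairwiseDiversity(locs []string) float64 {
	var sum float64
	var pairs int
	for i := range locs {
		for j := i + 1; j < len(locs); j++ {
			pairs++
			if locs[i] != locs[j] {
				sum++
			}
		}
	}
	if pairs == 0 {
		return 1
	}
	return sum / float64(pairs)
}

// removalScore scores removing the replica at index idx by the
// diversity of the replicas left behind.
func removalScore(locs []string, idx int) float64 {
	remaining := append(append([]string{}, locs[:idx]...), locs[idx+1:]...)
	return meanPairwiseDiversity(remaining)
}

func main() {
	locs := []string{"a", "a", "b", "c"}
	for i, l := range locs {
		fmt.Printf("remove %s (idx %d): %.2f\n", l, i, removalScore(locs, i))
	}
}
```

Under this sketch, removing either a leaves [a, b, c] (score 1.00) while removing b or c leaves a duplicated locality (score 0.67), matching the "always prefer to remove an a" behavior described above.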

On a side note, when we have a larger set of replicas than the available top-level localities, should we also try for the most diverse scores in a second round? So if you have 3 localities, a, b, c, and we have 6 replicas, then we should try to double up in each locality.

I wouldn't say it's super urgent, but yeah, that'd certainly be nice.


Review status: :shipit: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

@a-robinson
Contributor Author

I'm starting to wonder if we might need specialized diversity scoring based not just on which operation is being performed (add/remove/rebalance) but also on whether the replica count is greater than, equal to, or less than the locality count.

Yeah, I think we may need to handle the case mentioned in my above response.


Review status: :shipit: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

Fixes both problems specified in cockroachdb#17479, which were:

1. "When considering a rebalance, we use one diversity score method for
existing replicas and a different one for other replicas. We consider an
existing replica to have good diversity if we can remove it and the
range will still be diverse, but we only consider a new replica diverse
if it doesn't overlap with any existing replica's locality. That means
that we never choose to rebalance from one node in a locality to another
node in the same locality."

2. "diversityRemovalScore doesn't seem to do what we want. It returns the
max diversity of any remaining pair of replicas in a range. That means
that if a range has 6 replicas (say it's down-replicating from 6 to 5)
in localities [a, b, c, c, c, c] respectively, it'll consider all
replicas to be equally valid for removal because no matter which one is
removed, there will still be a pair of replicas with no locality tiers
in common. To a human, this seems like an obviously bad choice. We may
want to do something like average locality remaining after removing a
replica to ensure that in cases like this we'd pick one of the replicas
in locality c to remove."
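The max-vs-average distinction in problem 2 can be demonstrated with a toy sketch for the [a, b, c, c, c, c] example (both functions are illustrative, not the real code): the max of the remaining pair diversities is 1.0 no matter which replica is removed, while the mean distinguishes the c replicas as better removal candidates:

```go
package main

import "fmt"

// pairDiversity is 1 for differing localities, 0 for identical ones.
func pairDiversity(a, b string) float64 {
	if a == b {
		return 0
	}
	return 1
}

// maxRemainingDiversity mirrors the behavior criticized in the issue:
// the maximum diversity of any pair left after removing idx.
func maxRemainingDiversity(locs []string, idx int) float64 {
	best := 0.0
	for i := range locs {
		for j := i + 1; j < len(locs); j++ {
			if i == idx || j == idx {
				continue
			}
			if d := pairDiversity(locs[i], locs[j]); d > best {
				best = d
			}
		}
	}
	return best
}

// meanRemainingDiversity is the averaging alternative suggested in the
// issue text: the mean diversity of all pairs left after removing idx.
func meanRemainingDiversity(locs []string, idx int) float64 {
	var sum float64
	var pairs int
	for i := range locs {
		for j := i + 1; j < len(locs); j++ {
			if i == idx || j == idx {
				continue
			}
			pairs++
			sum += pairDiversity(locs[i], locs[j])
		}
	}
	if pairs == 0 {
		return 0
	}
	return sum / float64(pairs)
}

func main() {
	locs := []string{"a", "b", "c", "c", "c", "c"}
	for i, l := range locs {
		fmt.Printf("remove %s: max=%.2f mean=%.2f\n", l,
			maxRemainingDiversity(locs, i), meanRemainingDiversity(locs, i))
	}
}
```

With this input the max score is 1.00 for every removal candidate, while the mean is 0.40 for removing a or b and 0.70 for removing any c, so averaging picks a c as the issue suggests it should.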
@BramGruneir
Member

Considering a rebalance: it would allow movement to b or c, but not to a. In fact, I think it may always trigger a rebalance. Four replica configurations are obviously a bad idea anyways but that's... not great, and kind of invalidates this approach. I'm not sure we could actually fix this without changing the way we compare new replicas to old replicas in (Allocator).RebalanceTarget, since we're comparing apples (diversity removal score) to oranges (diversity rebalance-to score). I tried to make them more comparable with this change, but didn't consider that case.

Well, if we go with 5 replicas, which is going to be a common case, then [a,a,b,b,c] might cause a good amount of thrashing. Is there an easy way to guard against it?

Yeah, I think we may need to handle the case mentioned in my above response.

The complexity of this scoring is going to get out of hand pretty quickly. Maybe we should consider a new approach rather than just a simple score.


Review status: 2 of 3 files reviewed at latest revision, all discussions resolved.


Comments from Reviewable
