Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coordinator drop segment selection through cost balancer #5529

Merged
merged 6 commits into from Apr 3, 2018

Conversation

@clintropolis
Copy link
Member

commented Mar 23, 2018

Fixes issues that occur due to load rule segment drop does not use the segment balancer, one of the issues described in #5521.

Drop segment will now use the balancer to get an iterator of ServerHolder objects which can be consumed to get the best candidate servers to drop segments from.

@clintropolis clintropolis changed the title drop segment selection through cost balancer Coordinator drop segment selection through cost balancer Mar 24, 2018

@jihoonson
Copy link
Contributor

left a comment

LGTM overall. I'd like to comment one thing. I'm not sure it's easy to add a test or even it's possible, but it would be great if we can have a unit test to avoid the same issue in the future. @clintropolis what do you think?

@@ -31,5 +33,10 @@

BalancerSegmentHolder pickSegmentToMove(List<ServerHolder> serverHolders);

default Iterator<ServerHolder> pickServersToDrop(DataSegment toDropSegment, NavigableSet<ServerHolder> serverHolders)
{
return serverHolders.descendingIterator();

This comment has been minimized.

Copy link
@gianm

gianm Apr 2, 2018

Contributor

Would be nice to include a comment about why this descendingIterator is meaningful. (Like the comment on the old code: // Use the reverse order to get the holders with least available size first.)

// Use the reverse order to get the holders with least available size first.
final Iterator<ServerHolder> iterator = holdersInTier.descendingIterator();
final NavigableSet<ServerHolder> isServingSubset =
holdersInTier.stream().filter(s -> s.isServingSegment(segment)).collect(Collectors.toCollection(TreeSet::new));

This comment has been minimized.

Copy link
@gianm

gianm Apr 2, 2018

Contributor

There's still a (now pointless) check for isServingSegment below. It should be removed lest it confuse people.

This comment has been minimized.

Copy link
@clintropolis

clintropolis Apr 2, 2018

Author Member

I left the check in case in the time between the initial filtering and the loop something changed and the server was no longer serving the segment, but I can remove the extra check if you think it's safe.

This comment has been minimized.

Copy link
@gianm

gianm Apr 2, 2018

Contributor

If there is a race between pickServersToDrop and this line, then there's still a race even with the check. I think it's ok to leave the check in, but maybe log a warning if it actually gets tripped. (We don't expect it to unless there is some race)

This comment has been minimized.

Copy link
@clintropolis

clintropolis Apr 2, 2018

Author Member

👍 added log.warn

@@ -31,5 +33,10 @@

BalancerSegmentHolder pickSegmentToMove(List<ServerHolder> serverHolders);

default Iterator<ServerHolder> pickServersToDrop(DataSegment toDropSegment, NavigableSet<ServerHolder> serverHolders)

This comment has been minimized.

Copy link
@gianm

gianm Apr 2, 2018

Contributor

This should have some javadoc explaining what the returned iterator should be. It's non-obvious: the "obvious" return value would be a set of servers to drop from, where the order doesn't matter. But the actual return value is ordered by preferredness, and may contain more servers than we actually want to drop from.

This comment has been minimized.

Copy link
@clintropolis

clintropolis Apr 2, 2018

Author Member

I went ahead and added javadoc for whole interface, lmk if I got anything wrong


try {
List<Pair<Double, ServerHolder>> results = resultsFuture.get();
return results.stream()

This comment has been minimized.

Copy link
@gianm

gianm Apr 2, 2018

Contributor

This code is slightly obtuse (what's the double? why reverse it?) and would benefit from a comment.

@gianm
gianm approved these changes Apr 2, 2018
Copy link
Contributor

left a comment

👍

@clintropolis

This comment has been minimized.

Copy link
Member Author

commented Apr 2, 2018

The test that failed, CuratorDruidCoordinatorTest added in #5528, is hopefully is fixed by this commit in #5555 which increases the timeout.

@gianm gianm merged commit f31dba6 into apache:master Apr 3, 2018

2 checks passed

Inspections: pull requests (Druid) TeamCity build finished
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jon-wei added a commit to implydata/druid that referenced this pull request Apr 4, 2018
Coordinator drop segment selection through cost balancer (apache#5529)
* drop selection through cost balancer

* use collections.emptyIterator

* add test to ensure does not drop from server with larger loading queue with cost balancer

* javadocs and comments to clear things up

* random drop for completeness
gianm added a commit to implydata/druid that referenced this pull request May 16, 2018
Coordinator drop segment selection through cost balancer (apache#5529)
* drop selection through cost balancer

* use collections.emptyIterator

* add test to ensure does not drop from server with larger loading queue with cost balancer

* javadocs and comments to clear things up

* random drop for completeness

@clintropolis clintropolis deleted the clintropolis:coordinator-drop-balance-fix branch Jul 4, 2018

@jihoonson jihoonson added this to the 0.12.2 milestone Jul 5, 2018

jihoonson added a commit to jihoonson/druid that referenced this pull request Jul 5, 2018
Coordinator drop segment selection through cost balancer (apache#5529)
* drop selection through cost balancer

* use collections.emptyIterator

* add test to ensure does not drop from server with larger loading queue with cost balancer

* javadocs and comments to clear things up

* random drop for completeness
fjy added a commit that referenced this pull request Jul 6, 2018
Coordinator drop segment selection through cost balancer (#5529) (#5948)
* drop selection through cost balancer

* use collections.emptyIterator

* add test to ensure does not drop from server with larger loading queue with cost balancer

* javadocs and comments to clear things up

* random drop for completeness
leventov added a commit to metamx/druid that referenced this pull request Jul 20, 2018
Coordinator drop segment selection through cost balancer (apache#5529) (
apache#5948)

* drop selection through cost balancer

* use collections.emptyIterator

* add test to ensure does not drop from server with larger loading queue with cost balancer

* javadocs and comments to clear things up

* random drop for completeness
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.