Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coordinator balancer move then drop fix #5528

Merged
merged 4 commits into from
Mar 29, 2018

Conversation

clintropolis
Copy link
Member

Fixes issue where coordinator balancer will move a segment to the destination server but never actually drop the segment from the source server, one of the issues described in #5521.

With this bug in place, the effective behavior of segment movement was that the balancer would copy the segment to where it thought it needed to be, and the load rule would then decide from where to drop. Because the drop here was never previously working, this may be worth discussion of any implications.

@jihoonson
Copy link
Contributor

@clintropolis thanks for working on this. This PR contains some changes to use lambda and I think it makes me difficult to figure out which parts are bug fixes. I would say it's better to split this PR into two sub PRs for bug fix only and remaining.

@jon-wei jon-wei added the Bug label Mar 27, 2018
@clintropolis
Copy link
Member Author

Sorry for mixing stuff, this was already part of a split up chain of commits that originally also included the stuff in #5529. Main bug fix is here and additions to test to test the original bug are some additions here and the test itself

I can split this stuff out if you prefer, I was just lazily updating stuff as I came across it.

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @clintropolis. Left some comments.

sourceLoadQueueChildrenCache.start();
destinationLoadQueueChildrenCache.start();

Thread.sleep(100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might cause test failures in Travis. If this is to wait for childrenCaches to be ready, some kind of real check (by polling or using a latch) should be used instead. Same for other sleeps.

tearDownServerAndCurator();
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a timeout to avoid too long waits even when this test becomes broken in the future.

loadManagementPeons.put("localhost:1", loadQueuePeon);
loadManagementPeons.put("localhost:2", loadQueuePeon2);


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary line. Please remove other unnecessary lines too.

@jihoonson jihoonson merged commit 30fc4d3 into apache:master Mar 29, 2018
@clintropolis clintropolis deleted the coordinator-move-drop-fix branch March 30, 2018 20:19
jon-wei pushed a commit to implydata/druid-public that referenced this pull request Apr 4, 2018
* apache#5521 part 1

* formatting

* oops

* less magic tests
gianm pushed a commit to implydata/druid-public that referenced this pull request May 16, 2018
* apache#5521 part 1

* formatting

* oops

* less magic tests
@jihoonson jihoonson added this to the 0.12.2 milestone Jul 5, 2018
jihoonson pushed a commit to jihoonson/druid that referenced this pull request Jul 5, 2018
* apache#5521 part 1

* formatting

* oops

* less magic tests
fjy pushed a commit that referenced this pull request Jul 6, 2018
* #5521 part 1

* formatting

* oops

* less magic tests
leventov pushed a commit to metamx/druid that referenced this pull request Jul 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants