
Take into account expectedShardSize when initializing shard in simulation #95734

Merged — 7 commits merged into elastic:main on May 4, 2023

Conversation

@idegtiarenko (Contributor):

This change takes expectedShardSize into account when simulating new shard initialization during desired balance computation.
This should reduce balance movements when restoring a shard from a snapshot, as its size is known beforehand.
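The idea can be sketched as follows. This is a simplified standalone example, not the actual Elasticsearch code: the class and method names are hypothetical, and using -1 as the "size unknown" sentinel is an assumption for illustration.

```java
// Sketch of the sizing logic this change introduces (hypothetical names,
// not the real Elasticsearch classes). When a shard is restored from a
// snapshot its expected size is known up front, so the simulation can
// account for it instead of treating the shard as an empty (0-byte) primary.
public class ShardSizeSketch {
    static final long UNAVAILABLE = -1L; // assumed "size unknown" marker

    // Returns the size to charge against a node's disk in the simulation.
    static long simulatedShardSize(long expectedShardSize) {
        // If the expected size is known (e.g. snapshot restore), use it;
        // otherwise fall back to 0 for a genuinely new, empty primary.
        return expectedShardSize == UNAVAILABLE ? 0L : expectedShardSize;
    }

    public static void main(String[] args) {
        System.out.println(simulatedShardSize(UNAVAILABLE)); // prints 0
        System.out.println(simulatedShardSize(2048L));       // prints 2048
    }
}
```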

@idegtiarenko idegtiarenko added >enhancement :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. :Distributed/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.9.0 labels May 2, 2023
@idegtiarenko idegtiarenko added Team:Distributed Meta label for distributed team and removed :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels May 2, 2023
@elasticsearchmachine (Collaborator):

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner (Contributor) left a comment:

LGTM

- // initializing new (empty) primary
- return 0L;
+ // initializing new (empty?) primary
+ return Math.max(0L, shard.getExpectedShardSize());
Contributor:

nit: I guess there's no need for a max here since the caller ignores negative values.

Suggested change:

- return Math.max(0L, shard.getExpectedShardSize());
+ return shard.getExpectedShardSize();


This change asserts that the produced balance does not use up all of the disk space.
if (replicaNodeId != null) {
dataPath.put(new NodeAndShard(replicaNodeId, shardId), "/data");
usedDiskSpace.compute(primaryNodeId, (k, v) -> v + thisShardSize);
usedDiskSpace.compute(replicaNodeId, (k, v) -> v + thisShardSize);
@idegtiarenko (Author):

This was causing some of the assertions to fail.
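The bookkeeping in the diff above can be sketched as a standalone example. The node ids and helper name here are made up; only the `Map.compute` pattern of charging each shard copy's size to the node holding it mirrors the test code.

```java
import java.util.HashMap;
import java.util.Map;

public class UsedDiskSpaceDemo {
    // Charges one shard's size to the primary node and, if present, to the
    // replica node. Assumes both node ids are already present in the map
    // (the remapping function would NPE on an absent key otherwise).
    static void chargeShard(Map<String, Long> usedDiskSpace,
                            String primaryNodeId, String replicaNodeId, long shardSize) {
        usedDiskSpace.compute(primaryNodeId, (k, v) -> v + shardSize);
        if (replicaNodeId != null) {
            usedDiskSpace.compute(replicaNodeId, (k, v) -> v + shardSize);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> usedDiskSpace = new HashMap<>();
        usedDiskSpace.put("node-1", 0L);
        usedDiskSpace.put("node-2", 0L);
        chargeShard(usedDiskSpace, "node-1", "node-2", 1_024L); // replicated shard
        chargeShard(usedDiskSpace, "node-2", null, 512L);       // unreplicated shard
        System.out.println(usedDiskSpace); // node-1 -> 1024, node-2 -> 1536
    }
}
```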

@elasticsearchmachine (Collaborator):

Hi @idegtiarenko, I've created a changelog YAML for you.

- var deviation = randomIntBetween(0, 50) - 100L;
- return originalSize * (1000 + deviation) / 1000;
+ var deviation = randomIntBetween(0, 10) - 5;
+ return originalSize * (100 + deviation) / 100;
@idegtiarenko (Author):

Previously this only produced shards 5-10% smaller than the original size, while it should generate deviations in the -5%..+5% range.
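The arithmetic can be checked directly. The sketch below evaluates both formulas at the endpoints that `randomIntBetween` would produce (no randomness involved; the class and method names are just for illustration):

```java
public class DeviationRanges {
    // Old formula: deviation = randomIntBetween(0, 50) - 100 is in [-100, -50],
    // so the factor (1000 + deviation) / 1000 is in [0.90, 0.95].
    static long oldFormula(long originalSize, int deviation) {
        return originalSize * (1000 + deviation) / 1000;
    }

    // New formula: deviation = randomIntBetween(0, 10) - 5 is in [-5, +5],
    // so the factor (100 + deviation) / 100 is in [0.95, 1.05].
    static long newFormula(long originalSize, int deviation) {
        return originalSize * (100 + deviation) / 100;
    }

    public static void main(String[] args) {
        long size = 100_000L;
        // Old range endpoints: always 5-10% smaller than the original.
        System.out.println(oldFormula(size, -100) + " " + oldFormula(size, -50)); // 90000 95000
        // New range endpoints: 5% smaller to 5% larger.
        System.out.println(newFormula(size, -5) + " " + newFormula(size, 5));     // 95000 105000
    }
}
```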

@DaveCTurner (Contributor) left a comment:

LGTM still. I think there were only really test changes here, but you force-pushed so I can't see the diffs.

  }

  private static long smallShardSizeDeviation(long originalSize) {
-     var deviation = randomIntBetween(0, 50) - 100L;
-     return originalSize * (1000 + deviation) / 1000;
+     var deviation = randomIntBetween(0, 10) - 5;
Contributor:

nit/suggestion:

Suggested change:

- var deviation = randomIntBetween(0, 10) - 5;
+ var deviation = randomIntBetween(-5, 5);

or maybe even drop the 100 + and do this:

Suggested change:

- var deviation = randomIntBetween(0, 10) - 5;
+ var deviation = randomIntBetween(95, 105);
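The three spellings discussed in this thread should be interchangeable. A quick standalone check (an assumption-laden sketch: a loop variable `r` stands in for the random draw, covering every value each form could produce):

```java
public class SuggestionEquivalence {
    // r in [0, 10] plays the role of the random draw for all three forms.
    static long viaOffset(long size, int r) { return size * (100 + (r - 5)) / 100; }  // randomIntBetween(0, 10) - 5
    static long viaSigned(long size, int r) { return size * (100 + (-5 + r)) / 100; } // randomIntBetween(-5, 5)
    static long viaFactor(long size, int r) { return size * (95 + r) / 100; }         // randomIntBetween(95, 105)

    public static void main(String[] args) {
        long size = 100_000L;
        for (int r = 0; r <= 10; r++) {
            long a = viaOffset(size, r), b = viaSigned(size, r), c = viaFactor(size, r);
            if (a != b || b != c) throw new AssertionError("mismatch at r=" + r);
        }
        System.out.println("all three forms agree"); // prints all three forms agree
    }
}
```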

@idegtiarenko idegtiarenko merged commit 251f923 into elastic:main May 4, 2023
12 checks passed
@idegtiarenko idegtiarenko deleted the assert_balanced_disk_usage branch May 4, 2023 08:34