Improve shard balancing #91603
# Conflicts: # server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java # server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java
…ion/allocator/BalancedShardsAllocator.java Co-authored-by: David Turner <david.turner@elastic.co>
Pinging @elastic/es-distributed (Team:Distributed)
Hi @fcofdez, I've created a changelog YAML for you.
if (forecastedShardSizeInBytes.isPresent()) {
    indexDiskUsageInBytes = forecastedShardSizeInBytes.getAsLong() * numberOfCopies(indexMetadata);
} else {
    long totalSizeInBytes = 0;
I'm not sure whether we should fall back to the cluster info in those cases.
}

float weight(Balancer balancer, ModelNode node, String index) {
    final float weightShard = node.numShards() - balancer.avgShardsPerNode();
    final float weightIndex = node.numShards(index) - balancer.avgShardsPerNode(index);
    return theta0 * weightShard + theta1 * weightIndex;
    final float ingestLoad = (float) (node.writeLoad() - balancer.avgWriteLoadPerNode());
    // TODO: can this overflow?
We're casting from long to float here. I think it should be fine in most cases, but I just wanted to double-check.
If I am not mistaken, the float range is [1.175494351E-38, 3.402823466E+38].
A petabyte is 1e15 bytes, so we should be in range for some time.
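To make the range argument above concrete, here is a minimal sketch (not code from this PR; the class and method names are made up for illustration). It checks that a long-to-float cast cannot overflow, since even `Long.MAX_VALUE` is far below the float maximum. It also shows the separate point that a float carries only 24 significant bits, so large byte counts are rounded rather than stored exactly.

```java
import java.math.BigDecimal;

// Hypothetical helper, for illustration only; not part of BalancedShardsAllocator.
final class LongToFloatCast {

    // True when the float nearest to `value` represents it exactly.
    // BigDecimal is used so the comparison is exact for every long.
    static boolean isExact(long value) {
        return new BigDecimal((float) value).compareTo(BigDecimal.valueOf(value)) == 0;
    }

    public static void main(String[] args) {
        long petabyte = 1_000_000_000_000_000L; // 1e15 bytes

        // No overflow: the largest long still maps to a finite float.
        System.out.println("finite at Long.MAX_VALUE: " + !Float.isInfinite((float) Long.MAX_VALUE));

        // Precision: integers are exact up to 2^24, then rounding starts.
        System.out.println("2^24 exact: " + isExact(1L << 24));
        System.out.println("2^24 + 1 exact: " + isExact((1L << 24) + 1));
        System.out.println("1 PB exact: " + isExact(petabyte));
    }
}
```

For a balancing weight, rounding at this scale is probably acceptable, but that is a judgment call rather than something the sketch proves.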
…into balance-disk-usage
LGTM.
}

// TODO: Should we go through the cluster info service and compute the average in this case?
return shardCount == 0 ? 0 : (totalSizeInBytes / shardCount) * numberOfCopies(indexMetadata);
@henningandersen do you think that computing the average with the available shards and then multiplying by the number of copies is the right call here? I think it's fine in most cases, but I just want to confirm that my intuition is correct.
Yeah, that sounds fine to me.
I wonder if we should also ask the cluster info, though, in case there is no forecasted shard size in diskUsageInBytesPerShard? Different topic, and also fine to do in a follow-up, of course.
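For readers following this thread, the estimate under discussion can be sketched roughly as follows. This is a simplified stand-in, not the PR's code: the class name, the `long[]` of known shard sizes, and the plain `numberOfCopies` parameter are assumptions that replace the real `IndexMetadata` and shard-size lookups. It prefers the forecasted shard size when present and otherwise averages the shards whose size is known, then multiplies by the number of copies.

```java
import java.util.OptionalLong;

// Hypothetical, simplified version of the estimate discussed above.
final class IndexDiskUsageSketch {

    static long indexDiskUsageInBytes(
        OptionalLong forecastedShardSizeInBytes,
        long[] knownShardSizesInBytes,   // assumed input: sizes of shards we have data for
        int numberOfCopies               // assumed input: total shard copies of the index
    ) {
        if (forecastedShardSizeInBytes.isPresent()) {
            return forecastedShardSizeInBytes.getAsLong() * numberOfCopies;
        }
        long totalSizeInBytes = 0;
        for (long size : knownShardSizesInBytes) {
            totalSizeInBytes += size;
        }
        int shardCount = knownShardSizesInBytes.length;
        // Integer division: the average is truncated before scaling by the copies.
        return shardCount == 0 ? 0 : (totalSizeInBytes / shardCount) * numberOfCopies;
    }

    public static void main(String[] args) {
        System.out.println(indexDiskUsageInBytes(OptionalLong.of(100), new long[0], 4));
        System.out.println(indexDiskUsageInBytes(OptionalLong.empty(), new long[] { 10, 20, 31 }, 6));
    }
}
```

Note that when no shard sizes are known at all, this returns 0, which is the case where asking the cluster info might give a better answer.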
@elasticmachine run elasticsearch-ci/bwc
Include shard write load and disk usage as a balancing factor.
Relates #17213
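As a rough illustration of what adding write load and disk usage as balancing factors could look like, the sketch below extends the two-term weighted sum from the diff with two more terms. Only `theta0`, `theta1`, and the shard, index, and ingest-load deltas appear in the snippets above. The names `theta2`, `theta3`, `diskUsageDelta`, and the class itself are assumptions, and the merged code may well differ.

```java
// Hypothetical sketch of a four-factor node weight; not the merged implementation.
final class WeightSketch {
    private final float theta0; // shard-count factor (from the diff)
    private final float theta1; // per-index shard-count factor (from the diff)
    private final float theta2; // write-load factor (assumed name)
    private final float theta3; // disk-usage factor (assumed name)

    WeightSketch(float theta0, float theta1, float theta2, float theta3) {
        this.theta0 = theta0;
        this.theta1 = theta1;
        this.theta2 = theta2;
        this.theta3 = theta3;
    }

    // Each delta is "node value minus cluster average" for that dimension.
    float weight(float shardDelta, float indexDelta, float writeLoadDelta, float diskUsageDelta) {
        return theta0 * shardDelta + theta1 * indexDelta + theta2 * writeLoadDelta + theta3 * diskUsageDelta;
    }

    public static void main(String[] args) {
        // With the two new factors set to zero, only the original two terms contribute.
        WeightSketch legacy = new WeightSketch(1f, 1f, 0f, 0f);
        System.out.println(legacy.weight(2f, -1f, 5f, 7f));
    }
}
```

Setting the new coefficients to zero reproduces the earlier two-factor behaviour, which is one way such a change could stay backwards compatible.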