-
Notifications
You must be signed in to change notification settings - Fork 25.6k
allocation: add balancer round summary as metrics #136043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allocation: add balancer round summary as metrics #136043
Conversation
This commit adds the BalancerRoundSummary as a collection of APM/open telemetry metrics. These are already logged. The summary collected every ten seconds or so is set as the current state into the telemetry metrics class (AllocationBalancingRoundMetrics). Whenever the telemetry runs, each metric picks up its current view.
|
Hi @schase-es, I've created a changelog YAML for you. |
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some comments, I think we should discuss with Dianna what we're aiming for with the node deltas
| bind(AllocationStatsService.class).toInstance(allocationStatsService); | ||
| bind(TelemetryProvider.class).toInstance(telemetryProvider); | ||
| bind(DesiredBalanceMetrics.class).toInstance(desiredBalanceMetrics); | ||
| bind(AllocationBalancingRoundMetrics.class).toInstance(balancingRoundMetrics); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binding like this is only necessary if we use the instance in a cluster annotated @Inject, I'm not sure if we do with the AllocationBalancingRoundMetrics? It probably doesn't matter, but I think in general we don't do it unless we need to.
| assert summary != null : "balancing round metrics cannot be null"; | ||
|
|
||
| nodeNameToWeightChangesRef.set(summary.nodeNameToWeightChanges()); | ||
| if (enableSending) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set the nodeNameToWeightChangesRef if enableSending = false ?
| long shardCount = nodeWeightChanges.baseWeights().shardCount() + nodeWeightChanges.weightsDiff().shardCountDiff(); | ||
| metrics.add(new LongWithAttributes(shardCount, getNodeAttributes(nodeWeights.getKey()))); | ||
| } | ||
| return metrics; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want the current values of shard count, disk usage and write load, because we already have those in the cluster balance dashboard
Perhaps the deltas are more interesting? I think number of balancing rounds, and shard movements are definitely valuable as you have them published now, but I'm less clear on the value of the specific node shard/weight/disk usage deltas/values that we don't get already from existing metrics.
Maybe the absolute amount of change (e.g. if one node loses X weight and another gains X the sum would be 2X) in those values might be interesting?
If we plotted any of these values for the cluster by simply adding them together, they'd sum to zero I think? because for every node gaining X shards there are other node(s) losing X shards. @DiannaHohensee might have more clarity on the direction here.
I think for serverless as well, disk usage will always be zero after this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed I think we should reduce the scope on this PR to just the two metrics we know how to publish and do a second PR when we've discussed how we'd like to publish the write load/shard count/disk usage
.../org/elasticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundMetrics.java
Show resolved
Hide resolved
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
| shardCountHistogram.record(baseWeights.shardCount() + weightsDiff.shardCountDiff(), getNodeAttributes(nodeName)); | ||
| diskUsageHistogram.record(baseWeights.diskUsageInBytes() + weightsDiff.diskUsageInBytesDiff(), getNodeAttributes(nodeName)); | ||
| writeLoadHistogram.record(baseWeights.writeLoad() + weightsDiff.writeLoadDiff(), getNodeAttributes(nodeName)); | ||
| totalWeightHistogram.record(baseWeights.nodeWeight() + weightsDiff.totalWeightDiff(), getNodeAttributes(nodeName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the metrics I'm wondering about
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the question you're thinking about? What value to put in the histogram?
DiannaHohensee
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do a thorough review, I'll have to take another look/think later. But publishing some comments as is.
.../org/elasticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundMetrics.java
Outdated
Show resolved
Hide resolved
| NUMBER_OF_SHARD_MOVES_METRIC_NAME, | ||
| "Current number of shard moves", | ||
| "{shard}" | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we discuss having a histogram for shard moves? To avoid the combine effect, merging multiple balancing rounds together 🤔 I can't recall what we decided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nick and I talked about it last week again, and we decided to send the absolute difference with each summary. (The comment I made that seemed out of place was to create a link for Nick to see.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have a histogram for shard moves as well, those differences are per-node, so we'll see how many shards moved on/off each node, but I think having a histogram of total shard move count would be good too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have the shard moves per node as a stat right now, so we can't have that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The high level conclusion Simon and I were discussing is that generally histograms are going to be more useful for the balancer round summary metrics because our goal is to see per balancing round granularity, as best we can. We don't have a use case for wanting to see how many total shard moves take place over time, but we do want to see roughly how many shard moves happen per balancing round, and the histogram will give us that.
So let's go with a histogram and remove the counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have a histogram for shard moves as well, those differences are per-node, so we'll see how many shards moved on/off each node.
Right now, these metrics aren't available in the balancing round summary. We can certainly try to get them out of the Desired Balancer and get them in through here.
| shardCountHistogram.record(baseWeights.shardCount() + weightsDiff.shardCountDiff(), getNodeAttributes(nodeName)); | ||
| diskUsageHistogram.record(baseWeights.diskUsageInBytes() + weightsDiff.diskUsageInBytesDiff(), getNodeAttributes(nodeName)); | ||
| writeLoadHistogram.record(baseWeights.writeLoad() + weightsDiff.writeLoadDiff(), getNodeAttributes(nodeName)); | ||
| totalWeightHistogram.record(baseWeights.nodeWeight() + weightsDiff.totalWeightDiff(), getNodeAttributes(nodeName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the question you're thinking about? What value to put in the histogram?
.../org/elasticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundMetrics.java
Outdated
Show resolved
Hide resolved
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just added some comments
.../org/elasticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundMetrics.java
Outdated
Show resolved
Hide resolved
| NUMBER_OF_SHARD_MOVES_METRIC_NAME, | ||
| "Current number of shard moves", | ||
| "{shard}" | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have a histogram for shard moves as well, those differences are per-node, so we'll see how many shards moved on/off each node, but I think having a histogram of total shard move count would be good too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sync'ed with Simon. I think this looks great, just need a histogram for the shardMoves to replace the LongCounter shardMovesCounter.
@nicktindall can I leave the final sign off to you? You're more familiar with this area of the code (APM metrics) than I am right now.
Update: to add an offline discussion, we decided that we'd like the shardMoveCount to also be reported as a histogram. So we'll have both a LongCounter shardMoveCount and LongHistogram shardMoveHistogram.
| public static final String NUMBER_OF_SHARDS_METRIC_NAME = "es.allocator.balancing_round.shard_count"; | ||
| public static final String DISK_USAGE_BYTES_METRIC_NAME = "es.allocator.balancing_round.disk_usage_bytes"; | ||
| public static final String WRITE_LOAD_METRIC_NAME = "es.allocator.balancing_round.write_load"; | ||
| public static final String TOTAL_WEIGHT_METRIC_NAME = "es.allocator.balancing_round.total_weight"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the metrics registry made you end histogram metric names in .histogram or something?
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, with a few nits and a very annoying change request to the labels we're using.
| this.shardMovesCounter = meterRegistry.registerLongCounter( | ||
| NUMBER_OF_SHARD_MOVES_METRIC_NAME, | ||
| "Current number of shard moves", | ||
| "{shard}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think the convention is to use unit when it's just a count ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out -- this is one of the conventions that's been hard to understand.
|
|
||
| this.shardMovesHistogram = meterRegistry.registerLongHistogram( | ||
| NUMBER_OF_SHARD_MOVES_HISTOGRAM_METRIC_NAME, | ||
| "Histogram of shard moves", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we say "number of shard movements executed in a balancing round" to make it clear it's a per-round count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are great docs suggestions -- thanks!
| ); | ||
| this.shardMovesCounter = meterRegistry.registerLongCounter( | ||
| NUMBER_OF_SHARD_MOVES_METRIC_NAME, | ||
| "Current number of shard moves", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we say "total number of shard moves" also above, current is a term often used for gauges
| "Histogram of shard moves", | ||
| "unit" | ||
| ); | ||
| this.shardCountHistogram = meterRegistry.registerLongHistogram(NUMBER_OF_SHARDS_METRIC_NAME, "Current number of shards", "unit"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: perhaps "change in node shard count per balancing round" to make it clear it's node-scoped and a delta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also similar for the descriptions below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
.../org/elasticsearch/cluster/routing/allocation/allocator/AllocationBalancingRoundMetrics.java
Show resolved
Hide resolved
nicktindall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with some minor nits. Nice one.
| long shardCountDiff = weightsDiff.shardCountDiff(); | ||
| if (shardCountDiff < 0) { | ||
| shardCountDiff *= -1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow the reasoning here? can't we just use Math.abs if we know the diff won't be MIN_VALUE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the forbidden APIs unfortunately -- it fails the math.abs call, but curious does not fail this *= -1 call. It's an annoyance because the whole point of longs is that the max or min value are so, so big that you'll never see it.
I should mention the forbidden APIs in the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that makes sense, yes either mention it in the comment, or there's also org.elasticsearch.core.SuppressForbidden, but on a large method that might be dangerous because I think it's scoped to the whole method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it into a private method longAbsNegativeSafe -- it matches the other metrics, and it's clearer as the suppress annotation has space for a reason.
| nodeToWeightChanges.forEach((node, nodesWeightChanges) -> nodeNameToWeightChanges.put(node.getName(), nodesWeightChanges)); | ||
|
|
||
| return Strings.format( | ||
| "CombinedBalancingRoundSummary[numberOfBalancingRounds=%d, nodeToWeightChange=%s, " + "numberOfShardMoves=%d]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: looks like an auto-format mishap (unnecessary concatenation)
| Map.of(), | ||
| Set.of(), | ||
| null | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: there is also org.elasticsearch.cluster.node.DiscoveryNodeUtils for creating DiscoveryNode instances which might make this less verbose?
| final String NODE_1 = "node1"; | ||
| final String NODE_2 = "node2"; | ||
| final DiscoveryNode NODE_1 = new DiscoveryNode("node1", "node1Id", "eph-node1", "abc", "abc", null, Map.of(), Set.of(), null); | ||
| final DiscoveryNode NODE_2 = new DiscoveryNode("node2", "node2Id", "eph-node2", "abc", "abc", null, Map.of(), Set.of(), null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here (org.elasticsearch.cluster.node.DiscoveryNodeUtils)
|
|
||
| @SuppressForbidden(reason = "ForbiddenAPIs bans Math.abs on longs, but long.MIN_VALUE is impossible here") | ||
| private long longAbsNegativeSafe(long input) { | ||
| return Math.abs(input); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just put an assertion above the call to abs that ensures input isn't Long.MIN_VALUE
This commit adds the BalancerRoundSummary as a collection of APM/open telemetry metrics. These are already logged. The summary collected every ten seconds or so is set as the current state into the telemetry metrics class (AllocationBalancingRoundMetrics). Whenever the telemetry runs, each metric picks up its current view.
Fixes: ES-10343