feat: rebalance cost-based autoscaler for best throughput by Fly-Style · Pull Request #19646 · apache/druid

Fly-Style · 2026-07-01T09:42:59Z

This PR retunes the cost-based autoscaler scoring for better throughput-oriented decisions and exposes idealIdleRatio as a configurable knob for the U-shaped idle-cost function.
It also updates autoscaler logging to show the effective idle ratio and max observed processing rate used during optimal task-count calculation.
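
For readers unfamiliar with the shape being tuned, here is a minimal sketch of a U-shaped idle cost with a configurable ideal point. It is not the code in WeightedCostFunction: the class name, method name, quadratic shape, and the two penalty parameters are assumptions made purely for illustration.

```java
final class IdleCostSketch
{
  /**
   * Hypothetical U-shaped cost: zero at idealIdleRatio and growing
   * quadratically on both sides. The quadratic form is an assumption,
   * not the formula used by Druid.
   */
  static double idleCost(
      double idleRatio,
      double idealIdleRatio,
      double underProvisioningPenalty,
      double overProvisioningPenalty
  )
  {
    final double deviation = idleRatio - idealIdleRatio;
    // Idle ratio above the ideal means tasks sit idle, i.e. over-provisioning.
    final double penalty = deviation > 0 ? overProvisioningPenalty : underProvisioningPenalty;
    return penalty * deviation * deviation;
  }

  public static void main(String[] args)
  {
    final double ideal = 0.3; // hypothetical ideal idle ratio
    for (double idle : new double[]{0.0, 0.15, 0.3, 0.5, 0.8}) {
      // Penalties 1.0 (under) and 2.0 (over) are illustrative values only.
      System.out.printf("idle=%.2f cost=%.4f%n", idle, idleCost(idle, ideal, 1.0, 2.0));
    }
  }
}
```

In this sketch, moving the ideal point shifts the bottom of the U, while the two penalties control how steeply the cost rises on each side of it.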

Details

Picture 1

Small supervisor case: 48 partitions, starting from 6 tasks.
Shows that the new cost function scales up gradually as lag becomes more important, while higher observed idle ratios keep the recommendation conservative.
Useful sanity check for avoiding aggressive jumps on smaller supervisors.

Picture 2

Medium supervisor case: 125 partitions, plot title shows startTaskCount=25.
Shows that low-idle/high-pressure cases recommend materially more tasks, while moderate or high idle ratios stay well below the partition count.
Useful for checking that the scoring does not blindly chase one-task-per-partition when idle capacity is already available.

Picture 3

Large supervisor case: 500 partitions, starting at the maximum task count.
Shows the strongest separation between low-idle and high-idle scenarios: low idle can keep the recommendation near max capacity, while higher idle ratios strongly favor scale-down.
Useful for validating that the idle-cost side still prevents waste at large partition counts.

This PR has:

been self-reviewed.
a release note entry in the PR description.

kfaraz

In conclusion, does the auto-scaler now prefer under-provisioning by default?

kfaraz · 2026-07-01T11:14:35Z

   * Controls the steepness of the U-shape on the over-provisioning side.
   */
-  static final double OVER_PROVISIONING_PENALTY = 1.0;
+  static final double OVER_PROVISIONING_PENALTY = 2.5;


Why change this to 2.5 instead of 2.0?

I tried 2.0, but 2.5 looked better in simulation.

I feel 2x penalty on over-provisioning is already enough. Changing these constants very frequently will make it difficult to gather enough data to refine the constants empirically.

Let's use 2 for now and see how it does in real clusters. This PR already swaps the under-provisioning and over-provisioning penalties, so we will have plenty to validate anyway.

Okay, let's use 2, but I can assure you 2.5 looks even better in simulation. We'll see 😺

kfaraz · 2026-07-01T11:18:52Z

+   * do not return infinitely large lag recovery times, at the expense of underestimating the lag cost.
   */
-  static final double MIN_PROCESSING_RATE = 1_000;
+  static final double MIN_PROCESSING_RATE = 5_000;


Why this change?
Technically, 1000 was also an arbitrary number but 5000 definitely seems to be on the higher side, especially when dealing with bulkier records with large (say JSON) column values. I would rather the user choose whether they want to prefer lag recovery or throughput by tweaking the weights.

I decided to tweak it too and realized that 1000 is too permissive about scaling up during minimal lag. 5000 implies stricter behaviour in that critical state where we have not received metrics yet.

If we have not received metrics yet, auto-scaling would be skipped since CostBasedAutoScaler.validateMetricsForScaling would return an error.

I have personally seen tasks in prod clusters maxing out at 5000 records/sec when dealing with large records.
So, using a large MIN_PROCESSING_RATE would cause us to always under-estimate lagRecoveryTime, irrespective of the actual avgProcessingRate.
As such, let's keep a low value of MIN_PROCESSING_RATE (maybe even as low as 100), since it is meant to be a safe-side measure that kicks in only when avgProcessingRate is very low.

The penalty for scale-up is already driven by the optimal task idleness, and can be controlled using the weights.
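
To make the under-estimation argument concrete, here is a small sketch of how a rate floor interacts with a lag recovery estimate. The formula, the method name, and the assumption that avgProcessingRate is a per-task rate are all hypothetical; only MIN_PROCESSING_RATE, avgProcessingRate, and the use of Math.max come from this thread.

```java
final class LagRecoverySketch
{
  /**
   * Hypothetical estimate of the seconds needed to drain the lag, assuming
   * every task processes at the floored average rate. Not Druid's formula.
   */
  static double lagRecoverySeconds(
      double totalLag,
      int taskCount,
      double avgProcessingRate,
      double minProcessingRate
  )
  {
    // The floor only matters when the observed rate falls below it.
    final double effectiveRate = Math.max(avgProcessingRate, minProcessingRate);
    return totalLag / (taskCount * effectiveRate);
  }

  public static void main(String[] args)
  {
    final double totalLag = 1_000_000; // illustrative lag in records
    final int taskCount = 10;
    final double observedRate = 2_000; // records/sec per task, illustrative

    // Floor of 1_000: the observed rate wins, estimate is 50.0 seconds.
    System.out.printf("%.1f%n", lagRecoverySeconds(totalLag, taskCount, observedRate, 1_000));
    // Floor of 5_000: the floor wins, estimate drops to 20.0 seconds.
    System.out.printf("%.1f%n", lagRecoverySeconds(totalLag, taskCount, observedRate, 5_000));
  }
}
```

With the higher floor, any task whose real rate is below 5000 records/sec gets an optimistic recovery time, which is the under-estimation described above.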

P.S. reverting

In a future PR, I think we can remove the MIN_PROCESSING_RATE altogether and maybe use the window maxProcessingRate, but I haven't fully thought it through yet. It might have some unforeseen side effects.

@kfaraz I forgot about Math.max(...) and then reconsidered my approach 😁

FrankChen021

Severity	Findings
P0	0
P1	0
P2	1
P3	0
Total	1

Found 1 issue.

Reviewed 6 of 6 changed files.

This is an automated review by Codex GPT-5.5

kfaraz

Left a non-blocking comment.

github-actions Bot added the Area - Ingestion label Jul 1, 2026

Fly-Style changed the title ~~Rebalance cost-based autoscaler for best troughput~~ feat: rebalance cost-based autoscaler for best throughput Jul 1, 2026

Rebalance cost-based autoscaler for best troughput

8a3e6dd

Fly-Style force-pushed the cba-expose-more-configs branch from f5be5c2 to 8a3e6dd Compare July 1, 2026 09:46

Complement Config test

ffcd9a5

Fly-Style marked this pull request as ready for review July 1, 2026 10:16

Fly-Style self-assigned this Jul 1, 2026

Fly-Style requested a review from kfaraz July 1, 2026 10:37

Relaxed the scaleup condition

8cf67eb

kfaraz reviewed Jul 1, 2026

review comments

7746e83

FrankChen021 reviewed Jul 1, 2026

Comment thread ...ava/org/apache/druid/indexing/seekablestream/supervisor/autoscaler/WeightedCostFunction.java Outdated

Revert

4e0f04f

Fly-Style requested a review from kfaraz July 1, 2026 12:37

kfaraz approved these changes Jul 2, 2026

Decrease over-provisioning penalty value

e3e0ae3

Fly-Style merged commit a2105da into apache:master Jul 2, 2026
38 checks passed

Fly-Style deleted the cba-expose-more-configs branch July 2, 2026 10:32

github-actions Bot added this to the 38.0.0 milestone Jul 2, 2026

kfaraz mentioned this pull request Jul 2, 2026

minor: Fix metric values of costBased auto-scaler #19641

Open

10 tasks
