Clarify compactTask/availableSlot/count metric description
#18663
This adds the compactTask/busySlot/count metric directly, rather than deducing it indirectly from the formula for available task slots and maximum task slots and taking their minimum. The formula to compute available compact task slots in a run is: availableCompactionTaskSlots = Math.max(0, compactionTaskCapacity - busyCompactionTaskSlots); It is often confusing why compactTask/availableSlot/count is lower than expected. It turns out that the compact duty using the native engine caps it using maxNumConcurrentSubTasks, regardless of the phase a currently running compact supervisor is in. This is likely the safest and most conservative thing to do. This metric should help operators better plan for compaction task slots in an MM-based setup.
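To make the arithmetic above concrete, here is a minimal sketch of how the available slot count can end up lower than expected. It is not the actual duty code: the class and method names are hypothetical, and it assumes the busy slot estimate simply charges every running compact task its configured maxNumConcurrentSubTasks, whatever phase the task is in.

```java
import java.util.List;

public class CompactionSlotEstimate {
    // Hypothetical helper (assumption, not the real Druid implementation):
    // each running compact task is charged its configured
    // maxNumConcurrentSubTasks, even if fewer sub-tasks are running right now.
    public static int estimateBusySlots(List<Integer> maxNumConcurrentSubTasksPerTask) {
        int busy = 0;
        for (int maxSubTasks : maxNumConcurrentSubTasksPerTask) {
            busy += maxSubTasks;
        }
        return busy;
    }

    // Formula from the description: never report fewer than zero available slots.
    public static int availableSlots(int compactionTaskCapacity, int busyCompactionTaskSlots) {
        return Math.max(0, compactionTaskCapacity - busyCompactionTaskSlots);
    }

    public static void main(String[] args) {
        int capacity = 10;
        // Two running compact tasks, each configured with maxNumConcurrentSubTasks = 4.
        int busy = estimateBusySlots(List.of(4, 4));
        System.out.println("busy=" + busy + " available=" + availableSlots(capacity, busy));
        // A third such task pushes the estimate past capacity, so available clamps to 0.
        int busyWithThird = estimateBusySlots(List.of(4, 4, 4));
        System.out.println("busy=" + busyWithThird + " available=" + availableSlots(capacity, busyWithThird));
    }
}
```

Under these assumptions, two tasks leave only 2 of 10 slots available even when each task is currently running a single sub-task, which is the kind of gap an operator would see in compactTask/availableSlot/count.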
@abhishekrb19, I am not sure how the To determine the actual count of tasks, we could count the sub-tasks for each currently running What do you think? Edit: Although, it might be useful to emit the "actual" busy slot count which would be computed using the sub-task counts as mentioned above. Is that what you mean to do here?
@kfaraz thanks for taking a look!
Yeah, I was just noting my observation on how this metric is calculated. The docs for the While at it, I was thinking that the busy slot estimate could just be emitted directly given the condition - consistently higher utilization in the busy slots metric (and a corresponding drop in available slots) would indicate the need to tune the compaction task slots, which is what we ended up doing for some clusters.
I wasn't necessarily thinking of emitting the "actual" busy slot count because it doesn't currently influence the auto-compaction algorithm; also I think this can be determined using Let me know if that makes sense.
Fair point, @abhishekrb19, it makes sense to update the docs.
Yes, that's true, we can get the running count for different task types/datasources using the
I am not against emitting the busy slot count per se.
Title changed from "… compactTask/busySlot/count metric to compact duty." to "Clarify compactTask/availableSlot/count metric description"
@kfaraz, I went ahead and just made the doc change to clarify and reverted the
kfaraz
left a comment
Thanks for the doc fix, @abhishekrb19!
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Clarify the compactTask/availableSlot/count metric description.
The formula to compute available compact task slots in a run is: availableCompactionTaskSlots = Math.max(0, compactionTaskCapacity - busyCompactionTaskSlots). It is often confusing why compactTask/availableSlot/count is lower than expected. This happens because the value is capped by maxNumConcurrentSubTasks, and the current metric description can be slightly misleading.
This PR has: