[Backport] Add SegmentAllocationQueue to batch SegmentAllocateActions (#13369) by kfaraz · Pull Request #13493 · apache/druid

kfaraz · 2022-12-05T08:32:48Z

Backports #13369

In a cluster with a large number of streaming tasks (~1000), SegmentAllocateActions on the overlord can often take a very long time to finish, causing spikes in the `task/action/run/time` metric. This may result in lag building up while a task waits for a segment to get allocated.

The root causes are:
- a large number of metadata calls made to the segments and pending segments tables
- the `giant` lock held in `TaskLockbox.tryLock()` to acquire task locks and allocate segments

Since the contention typically arises when several tasks of the same datasource try to allocate segments for the same interval/granularity, the allocation run times can be improved by batching the requests together.

Changes
- Add flags
  - `druid.indexer.tasklock.batchSegmentAllocation` (default `false`)
  - `druid.indexer.tasklock.batchAllocationMaxWaitTime` (in millis) (default `1000`)
- Add methods `canPerformAsync` and `performAsync` to `TaskAction`
- Submit each allocate action to a `SegmentAllocationQueue`, and add it to the correct batch
- Process a batch after `batchAllocationMaxWaitTime`
- Acquire the `giant` lock just once per batch in `TaskLockbox`
- Reduce metadata calls by batching statements together and updating query filters
- Except for batching, retain the existing behaviour (order of steps, retries, etc.)
- Respond to leadership changes and fail items in the queue when not leader
- Emit batch and request level metrics
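The batching idea described above can be illustrated with a minimal sketch. This is not the `SegmentAllocationQueue` added by this PR: the class `AllocationBatcher`, its methods, and the string batch key are hypothetical names chosen for illustration, and a counter stands in for acquiring the `giant` lock. It only shows requests for the same datasource and interval being grouped, and each group being processed once after a maximum wait time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of request batching; not Druid's actual implementation.
class AllocationBatcher {
    // One batch per (dataSource, interval) pair, stamped with its creation time.
    private static final class Batch {
        final long createdAtMillis;
        final List<String> taskIds = new ArrayList<>();

        Batch(long createdAtMillis) {
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final long maxWaitMillis;
    private final Map<String, Batch> pending = new LinkedHashMap<>();
    private int lockAcquisitions = 0;

    AllocationBatcher(long maxWaitMillis) {
        this.maxWaitMillis = maxWaitMillis;
    }

    // Adds a request to the batch for its key, creating the batch if needed.
    synchronized void submit(String taskId, String dataSource, String interval, long nowMillis) {
        String key = dataSource + "|" + interval; // simplified, assumed batch key
        Batch batch = pending.get(key);
        if (batch == null) {
            batch = new Batch(nowMillis);
            pending.put(key, batch);
        }
        batch.taskIds.add(taskId);
    }

    // Processes every batch that has waited at least maxWaitMillis and
    // returns the task ids of each processed batch.
    synchronized List<List<String>> processDue(long nowMillis) {
        List<List<String>> processed = new ArrayList<>();
        Iterator<Map.Entry<String, Batch>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Batch batch = it.next().getValue();
            if (nowMillis - batch.createdAtMillis >= maxWaitMillis) {
                lockAcquisitions++; // stands in for taking the lock once per batch
                processed.add(new ArrayList<>(batch.taskIds));
                it.remove();
            }
        }
        return processed;
    }

    synchronized int lockAcquisitions() {
        return lockAcquisitions;
    }
}
```

In this sketch the caller passes the current time explicitly, which keeps the example deterministic; a real queue would more likely rely on a scheduler and a clock.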

lgtm-com · 2022-12-05T11:27:30Z

This pull request introduces 2 alerts when merging 3e26b96 into baf6ca4 - view on LGTM.com

new alerts:

2 for User-controlled data in numeric cast

Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine ⚙️ that powers LGTM.com. For more information, please check out our post on the GitHub blog.

kfaraz added the Backport label Dec 5, 2022

kfaraz added this to the 25.0 milestone Dec 5, 2022

AmatyaAvadhanula approved these changes Dec 5, 2022

kfaraz merged commit c04ecde into apache:25.0.0 Dec 5, 2022

kfaraz deleted the backport_batch_alloc branch December 5, 2022 14:04
