Makes schedule task async #18272
Introduces an opt-in concurrent scheduling path in PinotTaskManager so that task generation for different tables can run in parallel instead of serializing on a single controller-wide `synchronized(this)`.

- New cluster flag `controller.task.concurrentSchedulingEnabled` (default false) and per-table override `TableTaskConfig.concurrentSchedulingEnabled` (null = inherit).
- `scheduleTasks(TaskSchedulingContext)` is no longer `synchronized`; a dispatcher picks between the legacy `synchronized(this)`-wrapped path and a concurrent path that takes per-table JVM `ReentrantLock`s before the distributed ZK lock. Both paths share the same body in `doScheduleTasks`.
- Per-table JVM locks are acquired in sorted order to avoid deadlock and cleaned up in `cleanUpCronTaskSchedulerForTable` to prevent unbounded map growth.
- Warns at startup if the concurrent default is enabled without distributed locking, since ad-hoc `createTask` still takes only `synchronized(this)`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
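The sorted-acquisition idea in the commit above can be sketched as follows; the class and field names are illustrative, not the actual Pinot code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Illustration of the deadlock-avoidance idea only: acquire per-table JVM
// locks in sorted table-name order so that two threads locking overlapping
// table sets can never wait on each other in a cycle.
public class SortedTableLocks {
  private final Map<String, ReentrantLock> _tableLocks = new ConcurrentHashMap<>();

  /** Acquires one lock per table, in sorted order; returns the locks held. */
  public List<ReentrantLock> lockAll(List<String> tableNames) {
    List<String> sorted = new ArrayList<>(tableNames);
    sorted.sort(String::compareTo);
    List<ReentrantLock> held = new ArrayList<>();
    for (String table : sorted) {
      ReentrantLock lock = _tableLocks.computeIfAbsent(table, t -> new ReentrantLock());
      lock.lock();
      held.add(lock);
    }
    return held;
  }

  /** Releases in reverse acquisition order. */
  public void unlockAll(List<ReentrantLock> held) {
    for (int i = held.size() - 1; i >= 0; i--) {
      held.get(i).unlock();
    }
  }
}
```

Because every thread acquires in the same global order, a cycle in the wait-for graph is impossible, which is the standard argument for ordered lock acquisition.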
The concurrent scheduling path now relies solely on the distributed ZK lock for same-table coordination. If two concurrent scheduleTasks calls target the same table, one acquires the ZK lock and the other skips generation with a lock-contention error, retrying on the next cron fire. This is acceptable because the operator cluster always runs with distributed locking enabled.

- Removed the `_tableJvmLocks` map, the ReentrantLock import, and the sort/acquire/release block inside `doScheduleTasks`.
- Simplified the `doScheduleTasks` signature (dropped the flag parameter).
- Updated the startup warning to reflect that the concurrent path requires distributed locking for correctness.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
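A minimal sketch of the acquire-or-skip behavior described above, using an in-process `tryLock` as a single-table stand-in for the distributed ZK lock; the class and method names are hypothetical, not the Pinot API:

```java
import java.util.concurrent.locks.ReentrantLock;

// Single-table stand-in for the distributed ZK lock described above: the
// first caller wins, a concurrent caller skips and retries on the next
// cron fire. Names are hypothetical, not the actual Pinot code.
public class AcquireOrSkip {
  private final ReentrantLock _zkLockStandIn = new ReentrantLock();

  /** Returns true if generation ran, false if skipped due to lock contention. */
  public boolean scheduleForTable(String table, Runnable generate) {
    if (!_zkLockStandIn.tryLock()) {
      // Another scheduler holds the table lock: skip; the next cron fire retries.
      return false;
    }
    try {
      generate.run();
      return true;
    } finally {
      _zkLockStandIn.unlock();
    }
  }
}
```

The key property is that a contended call returns immediately instead of blocking a controller thread, which is the behavior the commit describes.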
The earlier concurrent-scheduling refactor inadvertently stripped the `synchronized` modifier from two `@Deprecated(forRemoval = true)` wrappers (`scheduleTasks(List<String>, boolean, String)` and `scheduleTask(String, List<String>, String)`). Restore it so the deprecated surface stays byte-for-byte unchanged against master, matching the six other deprecated wrappers in this class. The canonical `scheduleTasks(TaskSchedulingContext)` continues to handle concurrent dispatch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Covers the two dispatch helpers introduced on this branch: `PinotTaskManager#shouldUseConcurrentPath` and `#resolveConcurrentScheduling`. Both are widened from `private` to package-private with `@VisibleForTesting` so the tests can exercise them directly without booting a controller.

- `TableTaskConfigTest` (pinot-spi): single/two-arg constructor defaults, explicit true/false, JSON round-trip including omitted and explicit-null values.
- `ControllerConfTest`: default and override of `isPinotTaskManagerConcurrentSchedulingEnabled`.
- `PinotTaskManagerConcurrentSchedulingTest`: table flag vs. cluster default precedence, per-table opt-out forcing the legacy path, database-scope expansion, all-tables scope, missing TableConfig skipped, empty-scope fallback to the cluster default.

Also corrects a stale Javadoc on `TableTaskConfig#getConcurrentSchedulingEnabled` that still described the per-table JVM lock dropped in 443d4da.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
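The precedence rule those tests pin down can be sketched as follows; the names are illustrative, and the real resolver reads the Boolean from TableTaskConfig:

```java
// Illustration of the per-table-over-cluster precedence: a non-null table
// flag wins, while a null flag inherits the cluster default.
public class ConcurrentSchedulingResolver {
  private final boolean _clusterConcurrentSchedulingEnabled;

  public ConcurrentSchedulingResolver(boolean clusterDefault) {
    _clusterConcurrentSchedulingEnabled = clusterDefault;
  }

  /** A non-null table flag wins; null inherits the cluster default. */
  boolean resolveConcurrentScheduling(Boolean tableFlag) {
    return tableFlag != null ? tableFlag : _clusterConcurrentSchedulingEnabled;
  }
}
```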
Boots a controller, registers a rendezvous-style task generator, and fires two parallel scheduleTasks calls against two tables. When both tables opt into concurrent scheduling via TableTaskConfig, the generator observes maxInFlight == 2; with the cluster default (false), the legacy path serializes them and maxInFlight stays at 1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
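A rendezvous-style generator of this kind can be sketched with a barrier and an in-flight counter; this is a standalone illustration, not the test's actual code:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Each call bumps an in-flight counter, records the peak, then waits at the
// barrier. A two-party barrier is only passable when two generations truly
// overlap, so maxInFlight == 2 proves the calls ran concurrently.
public class RendezvousGenerator {
  private final CyclicBarrier _barrier;
  private final AtomicInteger _inFlight = new AtomicInteger();
  private final AtomicInteger _maxInFlight = new AtomicInteger();

  public RendezvousGenerator(int parties) {
    _barrier = new CyclicBarrier(parties);
  }

  public void generate() throws InterruptedException, BrokenBarrierException {
    int now = _inFlight.incrementAndGet();
    _maxInFlight.accumulateAndGet(now, Math::max);
    _barrier.await();  // rendezvous: all concurrent callers must arrive
    _inFlight.decrementAndGet();
  }

  public int getMaxInFlight() {
    return _maxInFlight.get();
  }
}
```

A serialized path would deadlock at the barrier (only one caller arrives), which is why the real test presumably uses a timeout or a smaller party count for the legacy case.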
Codecov Report

Additional details and impacted files:

@@             Coverage Diff              @@
##             master   #18272      +/-  ##
============================================
+ Coverage     63.60%   63.62%   +0.01%
+ Complexity     1660     1659       -1
============================================
  Files          3246     3246
  Lines        197514   197552      +38
  Branches      30578    30589      +11
============================================
+ Hits         125633   125686      +53
+ Misses        61835    61824      -11
+ Partials      10046    10042       -4
============================================
      }
      checkedAnyTable = true;
      if (!resolveConcurrentScheduling(tableConfig)) {
        return false;
Note: to avoid over-engineering and splitting TaskSchedulingContext, we just return false here for the entire batch. scheduleTasks called by the cron scheduler only sends 1 table per batch, so this should be fine for cron-scheduled tables.
xiangfu0 left a comment
My main concern is that task generation for many large minion tasks may just blow up the controller.
Shall we consider:
- limiting the concurrency
- synchronizing per task type or per table?
If the manual schedule API is called, the generation happens in a for-loop where generateTask is called synchronously.
xiangfu0 left a comment
Found one high-signal concurrency issue; see inline comment.
-  public synchronized Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
+  public Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
     _controllerMetrics.addMeteredGlobalValue(ControllerMeter.NUMBER_TIMES_SCHEDULE_TASKS_CALLED, 1L);
     if (shouldUseConcurrentPath(context)) {
This can enter the unlocked path when a table opts in via TableTaskConfig.concurrentSchedulingEnabled while controller.task.enableDistributedLocking is still false: acquireTaskLock then returns null, so same-table scheduled runs and ad-hoc createTask no longer share any mutex. That can double-generate or submit minion tasks for the same table/window. Gate the concurrent path on distributed locking being enabled, or keep a per-table JVM lock when the distributed lock manager is absent.
This is expected. The prerequisite for this change is that enableDistributedLocking is enabled.
Agree with Xiang on having per-table local locking to avoid concurrent task generations, especially when a user has configured an aggressive schedule and the same controller ends up getting multiple triggers while older generations are still active.
Also, if we decide to have a lock, let's key it on table + task type as well to avoid dropping schedules of different task types from the same table.
Agree with Xiang on having per-table local locking to avoid concurrent task generations, especially when a user has configured an aggressive schedule and the same controller ends up getting multiple triggers while older generations are still active.

Why do we need a table-local lock when we have the distributed table lock?
This is for the case where distributed locking is not enabled
This feature doesn't make sense if the distributed lock is disabled. I think we are conflating the short-term and the long-term solution; please re-check the PR description.
shounakmk219 left a comment
Let's also call out in the description (and in a code comment) that when this is enabled along with distributed locking, it may drop task generations of different task types from the same table if their schedules/triggers coincide, because the distributed lock is held at the table level only.
     // The concurrent path relies on the distributed ZK lock to coordinate same-table task
     // generation (and to mutually exclude with ad-hoc createTask, which still takes
     // synchronized(this)). Running without distributed locking leaves those races unprotected.
     LOGGER.warn("Concurrent task scheduling is enabled but distributed locking is disabled. "
Should we set _clusterConcurrentSchedulingEnabled to false in this case?
I initially thought about it but decided against it: it would be an anti-pattern for the code to internally disable a config the user enabled explicitly. That gives a wrong signal to the user.
I think we can call it out as a dependent feature which needs distributed locking to be enabled to allow concurrent task generations.
Already called it out in the description.
We have the distributed lock enabled by default, and I will just set my table with concurrentScheduling enabled.
-  public synchronized Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
+  public Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
     _controllerMetrics.addMeteredGlobalValue(ControllerMeter.NUMBER_TIMES_SCHEDULE_TASKS_CALLED, 1L);
     if (shouldUseConcurrentPath(context)) {
Agree with Xiang on having per-table local locking to avoid concurrent task generations, especially when a user has configured an aggressive schedule and the same controller ends up getting multiple triggers while older generations are still active.
     }
     // If at least one table was inspected and none opted out, use concurrent path. Otherwise (no
     // tables in scope) fall back to the cluster default so the decision is deterministic.
     return checkedAnyTable || _clusterConcurrentSchedulingEnabled;
Promote the check on _clusterConcurrentSchedulingEnabled earlier in the method to avoid the ZK calls made here: when _clusterConcurrentSchedulingEnabled is true, the table checks are redundant anyway.
The table check is not redundant; the table flag has higher priority than the cluster-level flag.
Only if the table flag is not present in the TableConfig is the cluster flag consulted.
Oh I see, we run everything serially if any one of the tables does not opt into concurrency.
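The batch decision discussed in this thread can be sketched as follows. This is a standalone illustration with hypothetical names: flags are modeled as a plain `List<Boolean>`, whereas the real method resolves them from TableConfigs fetched via ZK, which is exactly the cost the comment above wants to avoid:

```java
import java.util.List;

// Any table in scope that resolves to "not concurrent" forces the whole
// batch onto the legacy synchronized path; an empty scope falls back to
// the cluster default. A null flag inherits the cluster default.
public class BatchDispatchDecision {
  private final boolean _clusterDefault;

  public BatchDispatchDecision(boolean clusterDefault) {
    _clusterDefault = clusterDefault;
  }

  boolean shouldUseConcurrentPath(List<Boolean> tableFlags) {
    boolean checkedAnyTable = false;
    for (Boolean flag : tableFlags) {
      checkedAnyTable = true;
      boolean resolved = flag != null ? flag : _clusterDefault;
      if (!resolved) {
        return false;  // one opt-out serializes the entire batch
      }
    }
    return checkedAnyTable || _clusterDefault;
  }
}
```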
-  public synchronized Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
+  public Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
     _controllerMetrics.addMeteredGlobalValue(ControllerMeter.NUMBER_TIMES_SCHEDULE_TASKS_CALLED, 1L);
     if (shouldUseConcurrentPath(context)) {
Also, if we decide to have a lock, let's key it on table + task type as well to avoid dropping schedules of different task types from the same table.
shounakmk219 left a comment
This feature is safe to use only when distributed locking is enabled, or on a cluster where tables do not have frequent cron schedules. With these limitations, this short-term fix is good enough to unblock concurrent task generation.
…k_async

# Conflicts:
#	pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotTaskManager.java
Thanks Shounak. @xiangfu0 @swaminathanmanish can you please help with merging the PR?
Opened docs follow-up PR: pinot-contrib/pinot-docs#772
…tension (#18431)

* Make concurrent-scheduling dispatch helpers protected for subclass extension

Bumps PinotTaskManager#shouldUseConcurrentPath and #resolveConcurrentScheduling from package-private @VisibleForTesting to protected @VisibleForTesting so that subclasses in other packages can integrate with the concurrent scheduling path introduced in #18272.

In particular, this lets a subclass override resolveConcurrentScheduling to plug in a different per-table policy (for example, defaulting concurrent scheduling to true for specific task types) and have its override invoked from the parent's shouldUseConcurrentPath. Without this change, a different-package subclass cannot polymorphically override the package-private resolver (Java treats the same-named method as hidden rather than overridden), so the extension hook is effectively unreachable.

No behavioral change. The methods retain @VisibleForTesting; existing same-package tests continue to work unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Removes visibleForTesting

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem

Schedule task holds a coarse synchronized lock that blocks any task generation while another generation run is live. The synchronized scheduleTasks also blocks the grizzly controller API threads if a task is scheduled via the API, making the Pinot controller UI unresponsive.

Solution

Focusing on a quick short-term solution here: add table-level and cluster-level flags to control whether scheduleTasks is blocking or non-blocking.

The long-term fix will be to remove the synchronized block from scheduleTasks and make the distributed lock inside it granular at the table + task-type level.

Note: for this change to work, the enableDistributedLocking config needs to be enabled.
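Concretely, enabling the feature involves the cluster flag plus its distributed-locking prerequisite, with an optional per-table override. The property names come from this PR; the exact placement of the per-table flag inside the table config JSON is assumed here for illustration, not verified against the schema:

```
# controller configuration
controller.task.enableDistributedLocking=true
# cluster-wide default for concurrent scheduling (off unless set)
controller.task.concurrentSchedulingEnabled=true
```

```json
{
  "tableName": "myTable_OFFLINE",
  "task": {
    "taskTypeConfigsMap": {},
    "concurrentSchedulingEnabled": true
  }
}
```

A table with `"concurrentSchedulingEnabled": false` forces the legacy serialized path for any batch containing it; omitting the field inherits the cluster default.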