
[enhance] Support dynamically modify server thread pool#18047

Merged
xiangfu0 merged 5 commits into apache:master from hongkunxu:enhance/add_more_cluster_config
Apr 12, 2026

Conversation

@hongkunxu
Contributor

Make pinot.query.scheduler.query_runner_threads and pinot.query.scheduler.query_worker_threads dynamically adjustable at runtime via Helix cluster config, without requiring a server restart.

Previously, tuning these thread pool sizes to match available CPU/IO capacity required editing server.conf and restarting the server. This PR adds a PinotClusterConfigChangeListener that listens for changes to these two keys and resizes the underlying ThreadPoolExecutor pools on the fly.
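As a rough illustration of what resizing a `ThreadPoolExecutor` on the fly involves, here is a minimal sketch. It is not the code from this PR: `PoolResizer` and `resize` are hypothetical names, and the sketch assumes a fixed-size pool where core size equals maximum size. The one detail worth showing is the call order, because `setCorePoolSize` and `setMaximumPoolSize` each reject a value that would leave core above max.

```java
import java.util.concurrent.ThreadPoolExecutor;

/** Hypothetical helper for illustration only; not the PR's ResourceManager code. */
final class PoolResizer {
  private PoolResizer() {
  }

  /** Resizes a fixed-size pool (core == max) to newSize threads. */
  static void resize(ThreadPoolExecutor pool, int newSize) {
    if (newSize <= 0) {
      throw new IllegalArgumentException("Pool size must be positive: " + newSize);
    }
    if (newSize > pool.getMaximumPoolSize()) {
      // Growing: raise the maximum first so the core size never exceeds it.
      pool.setMaximumPoolSize(newSize);
      pool.setCorePoolSize(newSize);
    } else {
      // Shrinking (or no-op): lower the core size first so max never drops below core.
      pool.setCorePoolSize(newSize);
      pool.setMaximumPoolSize(newSize);
    }
  }
}
```

When shrinking, surplus threads are not interrupted; they exit once they become idle, so running queries are not cut off by the resize.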

@codecov-commenter

codecov-commenter commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 89.56522% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.96%. Comparing base (4259af7) to head (4fb78f8).
⚠️ Report is 13 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../pinot/server/starter/helix/BaseServerStarter.java | 0.00% | 5 Missing ⚠️ |
| ...ore/query/scheduler/resources/ResourceManager.java | 92.30% | 1 Missing and 2 partials ⚠️ |
| ...duler/resources/BinaryWorkloadResourceManager.java | 0.00% | 2 Missing ⚠️ |
| .../pinot/core/query/scheduler/PriorityScheduler.java | 93.33% | 0 Missing and 1 partial ⚠️ |
| ...che/pinot/core/query/scheduler/QueryScheduler.java | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18047      +/-   ##
============================================
+ Coverage     63.93%   63.96%   +0.03%     
  Complexity     1594     1594              
============================================
  Files          3178     3179       +1     
  Lines        193466   193574     +108     
  Branches      29880    29891      +11     
============================================
+ Hits         123683   123811     +128     
+ Misses        60010    59986      -24     
- Partials       9773     9777       +4     
| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 63.91% <89.56%> (+0.03%) ⬆️ |
| java-21 | 63.91% <89.56%> (+0.01%) ⬆️ |
| temurin | 63.96% <89.56%> (+0.03%) ⬆️ |
| unittests | 63.95% <89.56%> (+0.03%) ⬆️ |
| unittests1 | 55.86% <80.90%> (+0.03%) ⬆️ |
| unittests2 | 34.35% <12.17%> (-0.01%) ⬇️ |

@hongkunxu hongkunxu force-pushed the enhance/add_more_cluster_config branch from 91087a6 to a9c049c Compare March 31, 2026 14:13
@xiangfu0 xiangfu0 added the enhancement (Improvement to existing functionality) and feature (New functionality) labels Mar 31, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds runtime (cluster-config-driven) resizing for the server query scheduler’s runner/worker thread pools, aiming to let operators tune pinot.query.scheduler.query_runner_threads and pinot.query.scheduler.query_worker_threads without restarting servers.

Changes:

  • Register a new PinotClusterConfigChangeListener from the server starter to react to cluster config updates for query scheduler thread-pool sizing.
  • Introduce a listener that parses the updated config values and calls ResourceManager.resizeThreadPools(...).
  • Update ResourceManager to keep references to the underlying ThreadPoolExecutors and support resizing; add/expand unit tests for resizing and listener behavior.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| pinot-server/src/main/java/org/apache/pinot/server/starter/helix/BaseServerStarter.java | Registers the new cluster-config change listener during server startup. |
| pinot-core/src/main/java/org/apache/pinot/core/query/scheduler/resources/ResourceManager.java | Stores the underlying executors and adds dynamic resize support. |
| pinot-core/src/main/java/org/apache/pinot/core/query/scheduler/QuerySchedulerThreadPoolConfigChangeListener.java | New listener that watches cluster config keys and triggers thread-pool resizing. |
| pinot-core/src/main/java/org/apache/pinot/core/query/scheduler/QueryScheduler.java | Exposes getResourceManager() to allow the server starter to wire the listener. |
| pinot-core/src/test/java/org/apache/pinot/core/query/scheduler/resources/ResourceManagerTest.java | Adds tests for increasing/decreasing/no-op/invalid resizes; ensures cleanup. |
| pinot-core/src/test/java/org/apache/pinot/core/query/scheduler/QuerySchedulerThreadPoolConfigChangeListenerTest.java | New tests verifying listener behavior for relevant/unrelated/invalid config changes. |
| pinot-core/src/test/java/org/apache/pinot/core/query/scheduler/PrioritySchedulerTest.java | Adjusts method visibility/override to match new QueryScheduler#getResourceManager(). |

@hongkunxu hongkunxu force-pushed the enhance/add_more_cluster_config branch from 0f0f982 to 7748e80 Compare April 2, 2026 02:32
@xiangfu0 xiangfu0 requested a review from Copilot April 2, 2026 02:34
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.

Comment on lines +142 to +146
resizePool(_queryRunnerPool, oldRunnerThreads, newRunnerThreads, "queryRunner");
_numQueryRunnerThreads = newRunnerThreads;

resizePool(_queryWorkerPool, oldWorkerThreads, newWorkerThreads, "queryWorker");
_numQueryWorkerThreads = newWorkerThreads;
Copilot AI Apr 2, 2026

resizeThreadPools() has no synchronization, but it mutates multiple related fields and triggers callbacks that may themselves depend on previous values (e.g., PriorityScheduler uses an old->new delta). If resizeThreadPools() is invoked concurrently, deltas and listener state can become inconsistent. Consider guarding the entire resize operation with synchronization (e.g., synchronize the method or use a lock) so resizing + listener notifications are atomic and ordered.

Contributor

This is true, though it is very unlikely to happen in practice.

Contributor Author

Fixed, @xiangfu0.
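For readers following this thread, a minimal sketch of the kind of guard being discussed is shown below. It is an assumption-laden illustration, not the actual fix: `ResizableLimits`, `ResizeListener`, and `onResize` are hypothetical names standing in for the resource manager and its callbacks. The idea is that reading the old sizes, writing the new ones, and notifying listeners all happen under one lock, so listeners that apply an old-to-new delta always see a consistent sequence.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a resource manager; all names are illustrative. */
final class ResizableLimits {
  /** Hypothetical callback that receives both the old and the new sizes. */
  interface ResizeListener {
    void onResize(int oldRunner, int newRunner, int oldWorker, int newWorker);
  }

  private final List<ResizeListener> _listeners = new ArrayList<>();
  private int _runnerThreads;
  private int _workerThreads;

  ResizableLimits(int runnerThreads, int workerThreads) {
    _runnerThreads = runnerThreads;
    _workerThreads = workerThreads;
  }

  synchronized void addListener(ResizeListener listener) {
    _listeners.add(listener);
  }

  /** Updates both sizes and notifies listeners as one atomic, ordered step. */
  synchronized void resize(int newRunner, int newWorker) {
    if (newRunner <= 0 || newWorker <= 0) {
      throw new IllegalArgumentException("Thread counts must be positive");
    }
    int oldRunner = _runnerThreads;
    int oldWorker = _workerThreads;
    if (oldRunner == newRunner && oldWorker == newWorker) {
      return; // Nothing changed, so listeners are not notified.
    }
    _runnerThreads = newRunner;
    _workerThreads = newWorker;
    // Listeners run while the lock is held, so deltas cannot interleave.
    for (ResizeListener listener : _listeners) {
      listener.onResize(oldRunner, newRunner, oldWorker, newWorker);
    }
  }

  synchronized int getRunnerThreads() {
    return _runnerThreads;
  }

  synchronized int getWorkerThreads() {
    return _workerThreads;
  }
}
```

Calling listeners while holding the lock keeps notifications ordered, at the cost of blocking other resize calls until every listener returns; since config changes are rare, that trade-off is usually acceptable.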

@hongkunxu hongkunxu force-pushed the enhance/add_more_cluster_config branch from af652b4 to 270ee86 Compare April 2, 2026 03:07
@xiangfu0 xiangfu0 changed the title [enhance] Support dynamiclly modify server thread pool [enhance] Support dynamically modify server thread pool Apr 2, 2026
int newRunnerThreads = _resourceManager.getNumQueryRunnerThreads();
int newWorkerThreads = _resourceManager.getNumQueryWorkerThreads();

if (changedConfigs.contains(QUERY_RUNNER_THREADS_KEY) && clusterConfigs.containsKey(QUERY_RUNNER_THREADS_KEY)) {
Contributor

DefaultClusterConfigChangeHandler reports deleted cluster keys in changedConfigs, but this listener only reparses a value when the key is still present in clusterConfigs.

If an operator removes pinot.query.scheduler.query_runner_threads or pinot.query.scheduler.query_worker_threads to roll back the override (expecting the value to revert to the system default), resizeThreadPools is invoked with the current value instead, so the last live size stays sticky until another explicit number is pushed.

Contributor Author

Good catch, thanks for pointing this out! I’ve updated the PR accordingly.
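One way to handle the deleted-key case raised above can be sketched as follows. This is a hedged illustration rather than the updated PR code: `ThreadCountResolver` and `resolve` are hypothetical names, and `defaultValue` stands for whatever baseline the server should revert to (which baseline the PR actually uses is not stated in this thread). The sketch also assumes that an unparseable or non-positive value should leave the current size untouched.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not the PR's listener code. */
final class ThreadCountResolver {
  private ThreadCountResolver() {
  }

  /**
   * Picks the thread count to apply after a cluster config change.
   * A key that changed but is no longer present is treated as deleted.
   */
  static int resolve(String key, Set<String> changedConfigs, Map<String, String> clusterConfigs,
      int currentValue, int defaultValue) {
    if (!changedConfigs.contains(key)) {
      return currentValue; // This key was not part of the change.
    }
    String raw = clusterConfigs.get(key);
    if (raw == null) {
      return defaultValue; // Key was deleted: revert instead of keeping the last live size.
    }
    try {
      int parsed = Integer.parseInt(raw.trim());
      return parsed > 0 ? parsed : currentValue; // Assumption: ignore non-positive values.
    } catch (NumberFormatException e) {
      return currentValue; // Assumption: ignore values that are not integers.
    }
  }
}
```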

}
_clusterConfigChangeHandler.registerClusterConfigChangeListener(_segmentOperationsThrottlerSet);
_clusterConfigChangeHandler.registerClusterConfigChangeListener(keepPipelineBreakerStatsPredicate);
_clusterConfigChangeHandler.registerClusterConfigChangeListener(
Contributor

There is another usage of query_worker_threads at lines 707-711:

    // Create a thread pool used for mutable lucene index searches, with size based on query_worker_threads config
    LOGGER.info("Initializing lucene searcher thread pool");
    int queryWorkerThreads =
        _serverConf.getProperty(ResourceManager.QUERY_WORKER_CONFIG_KEY, ResourceManager.DEFAULT_QUERY_WORKER_THREADS);
    _realtimeLuceneTextIndexSearcherPool = RealtimeLuceneTextIndexSearcherPool.init(queryWorkerThreads);

Contributor Author

Good point, I didn’t consider this case before. I’ve updated the logic to handle it. Please take another look.
cc @xiangfu0

Signed-off-by: Hongkun Xu <xuhongkun666@163.com>
@hongkunxu hongkunxu requested a review from xiangfu0 April 7, 2026 09:13
@hongkunxu hongkunxu force-pushed the enhance/add_more_cluster_config branch from fcc9c06 to 4fb78f8 Compare April 7, 2026 09:13
@xiangfu0 xiangfu0 merged commit 7e10a36 into apache:master Apr 12, 2026
44 of 48 checks passed
xiangfu0 pushed a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 13, 2026
@xiangfu0
Contributor

A documentation PR has been opened for this change: pinot-contrib/pinot-docs#732

xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 13, 2026
…#18047)

This PR documents the dynamic server thread pool configuration added in
apache/pinot#18047:

- `pinot.query.scheduler.query_runner_threads` and
`pinot.query.scheduler.query_worker_threads` can now be updated at
runtime via Helix cluster config without restarting the server
- A PinotClusterConfigChangeListener automatically resizes the thread
pools on config change

Upstream PR: apache/pinot#18047

Co-authored-by: Pinot Docs Bot <pinot-docs-bot@apache.org>
xiangfu0 pushed a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 13, 2026
@xiangfu0
Contributor

📚 Documentation PR opened: pinot-contrib/pinot-docs#735

Labels

enhancement Improvement to existing functionality feature New functionality

4 participants