Reduce size of MANAGEMENT threadpool on small node #71171

Conversation

DaveCTurner (Contributor)

Today, by default, the `MANAGEMENT` threadpool always permits 5 threads
even if the node has a single CPU, which unfairly prioritises management
activities on small nodes. With this commit we limit the size of this
threadpool to the number of processors, if that is less than 5.

Relates #70435
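
A minimal sketch of the sizing rule described above (illustrative plain Java, not the actual `ThreadPool` builder code):

```java
// Hypothetical helper, not the actual ThreadPool builder code.
public class ManagementPoolSize {
    // At least 1 thread; at most the old default of 5; otherwise one per processor.
    static int managementPoolSize(int allocatedProcessors) {
        return Math.max(1, Math.min(5, allocatedProcessors));
    }

    public static void main(String[] args) {
        System.out.println(managementPoolSize(1));  // 1 on a single-CPU node
        System.out.println(managementPoolSize(16)); // capped at 5
    }
}
```

On a single-CPU node this yields 1 thread instead of 5, so management work can no longer crowd out other activity.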

@DaveCTurner added the >enhancement, :Distributed/Distributed, v8.0.0, and v7.13.0 labels on Apr 1, 2021.
@elasticmachine added the Team:Distributed label on Apr 1, 2021.
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear (Member) left a comment:

LGTM

@DaveCTurner (Contributor, Author)

Darn it. This change results in deadlocks in the `org.elasticsearch.http.*RestCancellationIT` tests for stats actions, because those tests rely on blocking the responding thread while needing another management thread to check whether the tasks have started.
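
The failure mode is the classic pool-starvation deadlock. A self-contained sketch (plain `java.util.concurrent`, not the test code itself):

```java
import java.util.concurrent.*;

// Once the pool has a single thread, a task that blocks on another task
// submitted to the same pool can never make progress.
public class SingleThreadStarvation {
    public static void main(String[] args) throws Exception {
        ExecutorService management = Executors.newFixedThreadPool(1);
        management.submit(() -> {
            // This "check" task needs the pool's only thread, which we
            // are currently occupying, so it never starts.
            Future<?> check = management.submit(() -> {});
            try {
                check.get(); // blocks forever
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        });
        management.awaitTermination(2, TimeUnit.SECONDS); // times out
        System.out.println("deadlocked; shutting down");
        management.shutdownNow();
    }
}
```

With 5 threads the second task simply ran on a spare thread, which is why these tests passed before this change.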

@DaveCTurner (Contributor, Author)

Darn it some more. `TransportCancelTasksAction` runs on the `MANAGEMENT` threadpool too. That seems like a bug: we might not be able to cancel remote management tasks if the remote node is too busy running them all. However, it's not obvious that we can move it to `SAME`, since cancelling some tasks looks to involve nontrivial work.

@DaveCTurner (Contributor, Author) commented on Apr 1, 2021

@original-brownbear this has moved enough since your LGTM that it's worth another look.

@elasticmachine test this please (just for another CI run in case it shakes out anything else before merging)

@DaveCTurner (Contributor, Author)

once more with feeling ... @elasticmachine test this please

@henningandersen (Contributor) left a comment:
LGTM.

Review thread on this hunk:

```java
// The delete by query request will be executed successfully because the block will be released
assertThat(deleteByQuery().source("test").filter(QueryBuilders.matchAllQuery()).refresh(true).get(),
    matcher().deleted(docs));
// Fire off the delete-by-query first
```

Contributor:
I like this simplification, so just out of curiosity: did not changing this cause a failure? I need a hint to see what I'm missing.

@DaveCTurner (Contributor, Author)

Yes, the `threadPool.schedule(..., ThreadPool.Names.MANAGEMENT)` made a blocking call on the (sole) management thread that could not complete, because the refresh needs to make stats calls which also need management threads. I did a targeted fix in b8a20dd but then decided that the whole thing could be simplified.
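
A hedged sketch of the simplified shape (plain `java.util.concurrent`; names are illustrative, not the actual test code): rather than scheduling a task that blocks the sole management thread, fire the request off and block only on the caller's thread.

```java
import java.util.concurrent.*;

// Illustrative rework: block on a Future from the caller's thread so the
// (possibly single-threaded) pool stays free for follow-up work such as
// the stats calls mentioned above.
public class FireOffFirst {
    public static void main(String[] args) throws Exception {
        ExecutorService management = Executors.newFixedThreadPool(1);

        // Fire off the long-running request...
        Future<String> deleteByQuery = management.submit(() -> "deleted 10 docs");
        // ...and the pool can still serve the work it depends on.
        Future<String> stats = management.submit(() -> "stats ok");

        // Only the caller's thread blocks; no pool thread waits on the pool.
        System.out.println(deleteByQuery.get(5, TimeUnit.SECONDS));
        System.out.println(stats.get(5, TimeUnit.SECONDS));
        management.shutdown();
    }
}
```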

Contributor:

Thanks, got it now.

@original-brownbear (Member) left a comment:

LGTM2

@DaveCTurner merged commit b690798 into elastic:master on Apr 6, 2021.
@DaveCTurner deleted the 2021-04-01-smaller-management-threadpool branch on Apr 6, 2021 at 11:58.
DaveCTurner added a commit referencing this pull request on Apr 6, 2021, with the same message as the description above (relates #70435).