
Internal: Management thread pool should reject requests when there are too many #7318

Closed
mikemccand opened this Issue Aug 18, 2014 · 13 comments

mikemccand commented Aug 18, 2014

Today, the management thread pool (used by the stats and cat APIs) is bounded to 5 threads, but it still accepts further requests and then waits indefinitely for a thread to free up.

This is dangerous because node stats can be a somewhat costly operation (in proportion to the number of shards on the node).

It also confounds debugging, because it can cause long hangs in e.g. node stats requests via browser/curl, and it does not recover gracefully from "too many management requests" overload.

If we instead rejected the request, it would make it clearer which clients are causing too much load.
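For illustration, here is a minimal plain-JDK sketch (not Elasticsearch's actual ThreadPool code) of the behavior described above: a 5-thread pool backed by an unbounded queue accepts every submission, so excess stats requests simply pile up and their callers wait however long it takes for a thread to free up.

import java.util.concurrent.*;

public class UnboundedBacklogSketch {
    public static void main(String[] args) {
        // 5 worker threads, but the queue accepts any number of waiting tasks.
        ExecutorService management = new ThreadPoolExecutor(
                5, 5, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());   // unbounded backlog

        for (int i = 0; i < 1000; i++) {
            // Every submission is accepted; slow "node stats"-style tasks
            // queue up silently and their callers just hang until served.
            management.execute(UnboundedBacklogSketch::simulateNodeStats);
        }
        management.shutdown();
    }

    static void simulateNodeStats() {
        try {
            Thread.sleep(100); // stand-in for a stats call proportional to shard count
        } catch (InterruptedException ignored) {
            Thread.currentThread().interrupt();
        }
    }
}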

nik9000 commented Aug 18, 2014

+1

kevinkluge commented Aug 18, 2014

I assume we'd add a configurable queue size and reject when the queue is full. I like the idea; it would help when debugging cases where management tools are overloading the server.

kimchy commented Aug 18, 2014

The scaling thread pool today doesn't support rejection based on queue size (just an FYI). We can try to add this support, but it can be a bit tricky, since the scaling pool relies on the rejection mechanism to decide whether to add a thread. I think the simplest solution is to make the management pool a fixed pool with a bounded queue size.
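For context, a hedged plain-JDK sketch of why that is tricky (this is only the general ThreadPoolExecutor pattern, not Elasticsearch's scaling executor): a scaling pool typically uses a hand-off queue whose offer() fails when no thread is idle, because the executor only spawns threads beyond the core size when the queue refuses a task, so the rejection path is exactly where growth and queueing decisions collide.

import java.util.concurrent.*;

public class ScalingPoolSketch {

    // Illustrative scaling pool: grows from 0 up to `max` threads via queue hand-off.
    static ThreadPoolExecutor newScaling(int max, long keepAliveMinutes) {
        return new ThreadPoolExecutor(
                0, max,
                keepAliveMinutes, TimeUnit.MINUTES,
                new SynchronousQueue<>(),   // offer() fails unless a thread is idle, forcing growth
                (task, executor) -> {
                    // Only reached once all `max` threads are busy; a scaling pool
                    // would normally park the task somewhere around here, which is
                    // exactly where a bounded queue size would have to be enforced.
                    throw new RejectedExecutionException("all " + max + " scaling threads busy");
                });
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newScaling(5, 5);
        pool.execute(() -> System.out.println("ran on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}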

mikemccand commented Aug 18, 2014

A fixed pool with a bounded queue size sounds like a good compromise; I'll take a stab at this.

mikemccand self-assigned this Aug 18, 2014

mikemccand commented Aug 18, 2014

OK all I changed was this:

diff --git a/src/main/java/org/elasticsearch/threadpool/ThreadPool.java b/src/main/java/org/elasticsearch/threadpool/ThreadPool.java
index 0152c0e..19692e6 100644
--- a/src/main/java/org/elasticsearch/threadpool/ThreadPool.java
+++ b/src/main/java/org/elasticsearch/threadpool/ThreadPool.java
@@ -119,7 +119,7 @@ public class ThreadPool extends AbstractComponent {
                 .put(Names.SEARCH, settingsBuilder().put("type", "fixed").put("size", availableProcessors * 3).put("queue_size", 1000).build())
                 .put(Names.SUGGEST, settingsBuilder().put("type", "fixed").put("size", availableProcessors).put("queue_size", 1000).build())
                 .put(Names.PERCOLATE, settingsBuilder().put("type", "fixed").put("size", availableProcessors).put("queue_size", 1000).build())
-                .put(Names.MANAGEMENT, settingsBuilder().put("type", "scaling").put("keep_alive", "5m").put("size", 5).build())
+                .put(Names.MANAGEMENT, settingsBuilder().put("type", "fixed").put("size", availableProcessors).put("queue_size", 100).build())
                 .put(Names.FLUSH, settingsBuilder().put("type", "scaling").put("keep_alive", "5m").put("size", halfProcMaxAt5).build())
                 .put(Names.MERGE, settingsBuilder().put("type", "scaling").put("keep_alive", "5m").put("size", halfProcMaxAt5).build())
                 .put(Names.REFRESH, settingsBuilder().put("type", "scaling").put("keep_alive", "5m").put("size", halfProcMaxAt10).build())

But just to confirm: if the backlog tries to exceed 100, then ES will throw an EsRejectedExecutionException back to the client? It looks like it will, because EsExecutors.newFixed passes new EsAbortPolicy()...
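To sanity-check that expectation with plain JDK classes (Elasticsearch's EsAbortPolicy plays a role similar to the standard AbortPolicy used here, throwing rather than blocking), here is a quick hedged demo: with 5 threads busy and 100 tasks queued, the next submission is rejected immediately.

import java.util.concurrent.*;

public class BoundedQueueDemo {
    public static void main(String[] args) {
        CountDownLatch block = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, 5, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(100),          // bounded backlog of 100
                new ThreadPoolExecutor.AbortPolicy());   // throw instead of blocking
        try {
            // Occupy all 5 threads and fill the 100-slot queue.
            for (int i = 0; i < 105; i++) {
                pool.execute(() -> {
                    try {
                        block.await();
                    } catch (InterruptedException ignored) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            pool.execute(() -> {});   // 106th task: queue full, so this is rejected
        } catch (RejectedExecutionException expected) {
            System.out.println("rejected as expected: " + expected);
        } finally {
            block.countDown();
            pool.shutdown();
        }
    }
}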

kimchy commented Aug 18, 2014

@mikemccand correct, changing to a fixed pool with a bounded queue will cause rejections to be thrown. I think availableProcessors is too big a value for this thread pool, especially on beefy machines; I would use Math.min(5, availableProcessors). I think 5 threads should be enough; otherwise the operation is just too heavy and that's a bug, since a stats call should not be heavy (like the "usable space on file length" issue).

mikemccand commented Aug 18, 2014

OK, thanks @kimchy, I'll switch to Math.min(5, availableProcessors).

kimchy commented Aug 18, 2014

ahh, we already have halfProcMaxAt5, maybe just use that one?
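For readers unfamiliar with that constant: judging by its name alone (this is an assumption for illustration, not a quote of the actual code), halfProcMaxAt5 is roughly half the available processors capped at 5, something along the lines of:

public class HalfProcSketch {
    public static void main(String[] args) {
        // Assumption for illustration only: about half the processors, capped at 5,
        // so the management pool stays small even on machines with many cores.
        int availableProcessors = Runtime.getRuntime().availableProcessors();
        int halfProcMaxAt5 = Math.min((availableProcessors + 1) / 2, 5);
        System.out.println("management pool size would be " + halfProcMaxAt5);
    }
}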

mikemccand commented Aug 18, 2014

Ahh super.

mikemccand added a commit that referenced this issue Aug 18, 2014

Core: switch to fixed thread pool by default for management threads
Switch management threads to a fixed thread pool with up to 5 threads and a queue size of 100 by default, after which excess incoming requests are rejected.

Closes #7318

Closes #7320

mikemccand added the v1.3.3 label Aug 18, 2014

mikemccand added a commit that referenced this issue Aug 18, 2014

clintongormley changed the title from "Core: should management thread pool reject requests when there are too many?" to "Internal: Management thread pool should reject requests when there are too many" Sep 8, 2014

mikemccand added a commit that referenced this issue Sep 8, 2014

mikemccand added a commit that referenced this issue Sep 30, 2014

mikemccand commented Sep 30, 2014

I've reverted this change, since it causes #7916 ...

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015


mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

bleskes commented Nov 4, 2015

We should probably revisit this now that we have a different execution model for indices stats, etc. (#7990). Reopening...

clintongormley commented Nov 25, 2016

@jasontedor this may be of interest to you?

jasontedor commented Mar 14, 2018

Superseded by #18613

jasontedor closed this Mar 14, 2018
