-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use thread priorities. (aka set nice
values for background-like tasks)
#984
Conversation
8b0ec82
to
d05032b
Compare
7171d10
to
89dca15
Compare
@@ -54,8 +54,33 @@ | |||
}) | |||
public interface Task | |||
{ | |||
public static enum Priority | |||
{ | |||
// NOTE: Setting negative nice values on linux sysetems (priority > Thread.NORM_PRIORITY) requires running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
systems*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
f04d7e3
to
4a3d9cf
Compare
I think we need to test this PR a bit more to understand if the possible benefits justify the code complexity. I also think we should make sure that persists will not be starved out with these changes. |
I did simple tests, and when running a local firehose (so just firing off events as fast as they can be parsed from TPCH file) then the realtime index task runs in about 20% LESS TIME (good) to get to the first persist-n-merge when the persist and the merge thread pools are LOW priority as compared to NORM. Executing any queries on the nodes tends to have no measurable impact on the total ingestion time on my local machine (<50% cpu usage for ingest, persist, merge, and query all combined). I did not wait for the persist-n-merge to complete. The reason is that looking at the thread breakout, the persist and realtime index task (plumber) are the ones that run concurrently, with the merge task running all by its lonesome. This is on my Mac though, so mileage on Linux may vary since the schedulers are not exactly the same. |
@@ -204,6 +204,9 @@ public TaskStatus call() | |||
command.add(String.format("-Ddruid.host=%s", childHost)); | |||
command.add(String.format("-Ddruid.port=%d", childPort)); | |||
|
|||
command.add("-XX:+UseThreadPriorities"); | |||
command.add("-XX:ThreadPriorityPolicy=42"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a link to the documentation explaining this parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is some weird undocumented behavior in the vm that we rely on, I would prefer we simply put a note in the documentation about this and let users set the parameter themselves in the forking task runner vm parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.oracle.com/cd/E15289_01/doc.40/e15062/optionxx.htm#BABGBFHF
Without setting the command line flags the behavior is disabled at the JVM level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- for linux
63f027f
to
0027b91
Compare
@@ -35,24 +37,52 @@ | |||
*/ | |||
public class Execs | |||
{ | |||
public static ExecutorService singleThreaded(String nameFormat) | |||
private static int nullToNormal(@Nullable Integer priority) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would greatly simplify the code if we removed this method and just defaulted to null, only setting the priority if it is actually set to something, rather than assuming the default is normal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default IS normal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the code would actually be more complex since you couldn't have the cases with no priority call the cases with priority with NULL values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default is technically the priority of the creating thread.
Even if in our case this might be normal all the time, I think it will make things less cluttered to not not define a default everywhere and leave it null
afe4a04
to
109d92a
Compare
@nishantmonu51 I updated this PR to use your Task context |
109d92a
to
57adf0a
Compare
public class TaskPriority | ||
{ | ||
// The task context key to grab the task priority from | ||
public static final String CONTEXT_KEY = "taskPriority"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the changes in #1679,
there will be two context values "priority" and "taskPriority" which looks similar but have different semantics.
Can we rename this to taskThreadPriority to be more explicit ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, sounds good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renaming to backgroundThreadPriority would also be fine.
we should also add docs for the new configs. |
This PR needs to be rebased. |
@thedrow yes, and I'm supposed to have a meeting with @himanshug and some other folks on task priority. |
Then I'll worry about rebasing against current master |
@drcrallen Any news on this one? It should be useful if merged. |
57adf0a
to
5166ebe
Compare
@nishantmonu51 how would you feel about leaving it as an advanced feature for now? |
f8d7e5e
to
1dfe80f
Compare
👍 |
1dfe80f
to
8a43d77
Compare
Conflicts were trivial. All around extra config options |
@JsonProperty("buildV9Directly") Boolean buildV9Directly | ||
@JsonProperty("buildV9Directly") Boolean buildV9Directly, | ||
@JsonProperty("persistThreadPriority") int persistThreadPriority, | ||
@JsonProperty("mergeThreadPriority") int mergeThreadPriority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs and good default values required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defaults revert to prior behavior and do not set thread priority.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added docs
* Defaults the thread priority to java.util.Thread.NORM_PRIORITY in io.druid.indexing.common.task.AbstractTask * Each exec service has its own Task Factory which is assigned a priority for spawned task. Therefore each priority class has a unique exec service * Added priority to tasks as taskPriority in the task context. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers * Add options to ForkingTaskRunner * Add "-XX:+UseThreadPriorities" default option * Add "-XX:ThreadPriorityPolicy=42" default option * AbstractTask - Removed unneded @JsonIgnore on priority * Added priority to RealtimePlumber executors. All sub-executors (non query runners) get Thread.MIN_PRIORITY * Add persistThreadPriority and mergeThreadPriority to realtime tuning config
8a43d77
to
2e1d6aa
Compare
👍 |
Use thread priorities. (aka set `nice` values for background-like tasks)
taskPriority
priority as a task context parameter. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
Running with
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=1
with root privileges enables thisRunning with
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=1
WITHOUT root privileges DOES NOT enables this. And will give the warningJava HotSpot(TM) 64-Bit Server VM warning: -XX:ThreadPriorityPolicy requires root privilege on Linux
Running with
-XX:+UseThreadPriorities
does NOT enable thisRunning with
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42
REGARDLESS of root privileges enables thisThis PR is to hopefully help when a task is running on a node that also needs to service queries (notably a realtime node doing persisting and merging)
The impact on query time can be seen in the image below.
As you can see, the maximum query time, and the query time during the long merge process, seem unaffected by this patch. While the persists are accumulating before the merge, the query time linearly increases (assumed proportional to the data intake, though I have nothing to back up that assumption). The max value reached before the persist-n-merge portion is also too close to make any sort of claim. What is obvious, however, is that the first part is completed notably faster.
It is worth noting that CGroups override nice values, so this setting should not interfere with CGroup resources for co-tenant workloads.