-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix TaskQueue-HRTR deadlock #6212
Conversation
private void taskComplete( | ||
HttpRemoteTaskRunnerWorkItem taskRunnerWorkItem, | ||
WorkerHolder workerHolder, | ||
TaskStatus taskStatus | ||
) | ||
{ | ||
Preconditions.checkArgument(!Thread.holdsLock(statusLock), "Current thread must not hold statusLock."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: checkState is more appropriate here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, fixed
@@ -376,36 +376,46 @@ private boolean runTaskOnWorker( | |||
// on a worker - this avoids overflowing a worker with tasks | |||
long waitMs = config.getTaskAssignmentTimeout().toStandardDuration().getMillis(); | |||
long waitStart = System.currentTimeMillis(); | |||
boolean isTaskAssignmentTimedout = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: isTaskAssignmentTimedOut (spelling: timed out is two words.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed :)
Probably the right thing to do is have the caller that sets a callback also pass an ExecutorService to use for that callback. That way the caller has some control over what thread pool to use, and the callee doesn't need to have its own callbacks thread pool that might never get used. If it doesn't pass one in, then use a same-thread executor. |
LGTM, thanks @himanshug |
@himanshug, I moved this to 0.13, since 0.12 doesn't have the HTTP remote task runner. |
* fix TaskQueue-HRTR deadlock causing apache#6201 * address review comments
@gianm sure, thanks. |
Fixes #6201
FWIW: RTR has same bug https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/io/druid/indexing/overlord/RemoteTaskRunner.java#L908 but rarity of that happening causes getting into the deadlock very unlikely.
from the design perspective, to guarantee safety, I think it would eventually make more sense to make sure that all kinds of listeners are always executed in something other than
SameThreadExecutor
or that listeners are never called with internal locks being held as you don't know what listeners might end up doing.