Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Prevent submit after autodetect worker is stopped #37700

Merged
merged 6 commits into from Jan 29, 2019

Conversation

Projects
None yet
3 participants
@davidkyle
Copy link
Member

commented Jan 22, 2019

Runnables can be submitted to AutodetectProcessManager.AutodetectWorkerExecutorService without error after it has been shutdown which can lead to requests timing out as their handlers are never called by the terminated executor. This change throws an EsRejectedExecutionException if a runnable is submitted after after the shutdown and calls AbstractRunnable.onRejection on any tasks not run.

Closes #37108

@davidkyle

This comment has been minimized.

Copy link
Member Author

commented Jan 23, 2019

run elasticsearch-ci/1

1 similar comment
@davidkyle

This comment has been minimized.

Copy link
Member Author

commented Jan 23, 2019

run elasticsearch-ci/1

davidkyle added some commits Jan 15, 2019

Call rejected for AbstractRunnables
This reverts commit ba4b275.

@davidkyle davidkyle force-pushed the davidkyle:prevent-submit-after-stop branch from 34ca548 to c6688df Jan 24, 2019

new EsRejectedExecutionException("unable to process as autodetect worker service has shutdown", true));
}
}
}

This comment has been minimized.

Copy link
@droberts195

droberts195 Jan 28, 2019

Contributor

I think there's still a race condition, although probably less likely than it was before. Consider this scenario:

  1. Thread 1 is running start(), and is preempted on line 841.
  2. Thread 2 is calling execute(), and is preempted on line 828 while running == true.
  3. Thread 3 calls shutdown() and sets running = false.
  4. Thread 1 goes back on CPU and progresses to line 860, i.e. all the way through the // if shutdown with tasks pending notify the handlers block.
  5. Thread 2 goes back on CPU and adds its command to the queue.

Step 5 happens after the queue is drained, so the problem this PR is trying to fix will still occur - its command is never run nor rejected.

One way to fix this race would be to make the whole of execute() synchronized, plus the queue draining part of start(), i.e. lines 849-860.

This comment has been minimized.

Copy link
@davidkyle

davidkyle Jan 28, 2019

Author Member

Good point, there is a lesson here about rolling your own instead of using a proven executor.

I looked at using a fixed threadpool of size 1 from EsExecutors.newFixed but that method wants ThreadFactory whereas here we use a single thread from the ml autodetect pool. My preference is to use one of the standard Executors but I can't see how to make that work

@droberts195
Copy link
Contributor

left a comment

LGTM

@droberts195 droberts195 merged commit 6d1693f into elastic:master Jan 29, 2019

7 checks passed

CLA Commit author has signed the CLA
Details
elasticsearch-ci/1 Build finished.
Details
elasticsearch-ci/2 Build finished.
Details
elasticsearch-ci/default-distro Build finished.
Details
elasticsearch-ci/docbldesx Build finished.
Details
elasticsearch-ci/oss-distro-docs Build finished.
Details
elasticsearch-ci/packaging-sample Build finished.
Details

droberts195 added a commit that referenced this pull request Jan 29, 2019

[ML] Prevent submit after autodetect worker is stopped (#37700)
Runnables can be submitted to
AutodetectProcessManager.AutodetectWorkerExecutorService
without error after it has been shutdown which can lead
to requests timing out as their handlers are never called
by the terminated executor.

This change throws an EsRejectedExecutionException if a
runnable is submitted after after the shutdown and calls
AbstractRunnable.onRejection on any tasks not run.

Closes #37108

henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Jan 30, 2019

[ML] Prevent submit after autodetect worker is stopped (elastic#37700)
Runnables can be submitted to
AutodetectProcessManager.AutodetectWorkerExecutorService
without error after it has been shutdown which can lead
to requests timing out as their handlers are never called
by the terminated executor.

This change throws an EsRejectedExecutionException if a
runnable is submitted after after the shutdown and calls
AbstractRunnable.onRejection on any tasks not run.

Closes elastic#37108

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.