Don't expose cleaned-up tasks as pending in PrioritizedEsThreadPoolExecutor #24237

Merged

@ywelsch ywelsch commented Apr 21, 2017

Changes in #24102 exposed the following oddity: PrioritizedEsThreadPoolExecutor.getPending() can return Pending entries where pending.task == null.

This can happen, for example, when tasks are added to the pending list while they are in the cleanup phase, i.e. TieBreakingPrioritizedRunnable#runAndClean has already run, but afterExecute has not yet removed the task.

Instead of safeguarding consumers of the API (as was done before #24102), I think that we should not count these tasks as pending at all.
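
To illustrate the idea, here is a minimal, self-contained sketch of the approach described above. It is not the actual Elasticsearch code: `PendingSketch`, `CleaningRunnable`, and `pendingTasks` are hypothetical names, and the wrapper only loosely imitates how a runnable might drop its inner task after running while still being visible to whoever lists pending work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PendingSketch {

    /** Hypothetical stand-in for a wrapper that clears its inner task once it has run. */
    static final class CleaningRunnable implements Runnable {
        private volatile Runnable inner;

        CleaningRunnable(Runnable inner) {
            this.inner = inner;
        }

        @Override
        public void run() {
            Runnable r = inner;
            if (r == null) {
                return; // already cleaned up
            }
            try {
                r.run();
            } finally {
                // Cleanup phase: the wrapper may still be tracked by the executor at this point.
                inner = null;
            }
        }

        Runnable inner() {
            return inner;
        }
    }

    /** Collects inner tasks that are still pending and skips wrappers that were already cleaned up. */
    static List<Runnable> pendingTasks(List<CleaningRunnable> snapshot) {
        List<Runnable> pending = new ArrayList<>();
        for (CleaningRunnable wrapper : snapshot) {
            Runnable innerRunnable = wrapper.inner(); // read the field exactly once
            if (innerRunnable != null) {
                pending.add(innerRunnable);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        CleaningRunnable finished = new CleaningRunnable(() -> {});
        CleaningRunnable waiting = new CleaningRunnable(() -> {});
        finished.run(); // finished is now cleaned up but still in the snapshot below
        System.out.println(pendingTasks(Arrays.asList(finished, waiting)).size());
    }
}
```

With this shape, consumers of the pending list never see an entry whose task is null, so they need no null checks of their own.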

Test failures: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+java9-periodic/2235/consoleFull

ERROR   0.76s J2 | SharedClusterSnapshotRestoreIT.testBatchingShardUpdateTask <<< FAILURES!
   > Throwable #1: java.lang.NullPointerException
   > 	at org.elasticsearch.cluster.service.ClusterService.lambda$pendingTasks$2(ClusterService.java:491)
   > 	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
   > 	at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
   > 	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
   > 	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
   > 	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
   > 	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   > 	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:511)
   > 	at org.elasticsearch.cluster.service.ClusterService.pendingTasks(ClusterService.java:495)
   > 	at org.elasticsearch.action.admin.cluster.tasks.TransportPendingClusterTasksAction.masterOperation(TransportPendingClusterTasksAction.java:68)
   > 	at org.elasticsearch.action.admin.cluster.tasks.TransportPendingClusterTasksAction.masterOperation(TransportPendingClusterTasksAction.java:38)
   > 	at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:87)
   > 	at  ...

pending.add(new Pending(unwrap(t.runnable), t.priority(), t.insertionOrder, executing));
Runnable innerRunnable = t.runnable;
if (innerRunnable != null) {
/** innerRunnable can be null if task is finished but not removed from executor yet,
Contributor

Doesn't this have to do with the gap between capturing the pending list and when this code runs?

Contributor Author

That's another reason why innerRunnable can be null. Both the documented reason and the one you gave (either in isolation or in combination) will lead to this.
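
The gap mentioned in this exchange can be simulated deterministically. The sketch below is an illustration under stated assumptions, not code from the pull request: `ReadOnceSketch` and its methods are hypothetical, and the `Supplier` stands in for a field such as `t.runnable` that another thread clears between two reads. It shows why copying the field into a local variable before the null check matters.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

public class ReadOnceSketch {

    /** Simulates a field that is cleared by another thread right after the first read. */
    static Supplier<Runnable> clearedAfterFirstRead() {
        AtomicBoolean first = new AtomicBoolean(true);
        Runnable task = () -> {};
        return () -> first.getAndSet(false) ? task : null;
    }

    /** Fragile: checks one read, then uses a second read that may already be null. */
    static boolean isPendingCheckTwice(Supplier<Runnable> field) {
        if (field.get() != null) {
            field.get().toString(); // second read; throws NullPointerException if cleared in between
            return true;
        }
        return false;
    }

    /** Robust: reads once into a local, so the check and the use see the same value. */
    static boolean isPendingReadOnce(Supplier<Runnable> field) {
        Runnable innerRunnable = field.get();
        if (innerRunnable != null) {
            innerRunnable.toString();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        try {
            isPendingCheckTwice(clearedAfterFirstRead());
        } catch (NullPointerException e) {
            System.out.println("check-twice variant hit a null on the second read");
        }
        System.out.println("read-once variant: " + isPendingReadOnce(clearedAfterFirstRead()));
    }
}
```

The read-once variant can still report a task that finishes a moment later, which is acceptable for a point-in-time listing; what it avoids is dereferencing a value that has been cleared between the check and the use.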

@ywelsch ywelsch merged commit c2deb1c into elastic:master Apr 21, 2017
ywelsch added a commit that referenced this pull request Apr 21, 2017
…ecutor (#24237)

Changes in #24102 exposed the following oddity: PrioritizedEsThreadPoolExecutor.getPending() can return Pending entries where pending.task == null. This can happen for example when tasks are added to the pending list while they are in the clean up phase, i.e. TieBreakingPrioritizedRunnable#runAndClean has run already, but afterExecute has not removed the task yet. Instead of safeguarding consumers of the API (as was done before #24102) this changes the executor to not count these tasks as pending at all.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Apr 21, 2017
* master: (61 commits)
  Build: Move plugin cli and tests to distribution tool (elastic#24220)
  Peer Recovery: remove maxUnsafeAutoIdTimestamp hand off (elastic#24243)
  Adds version 5.3.2 and backwards compatibility indices for 5.3.1
  Add utility method to parse named XContent objects with typed prefix (elastic#24240)
  MultiBucketsAggregation.Bucket should not extend Writeable (elastic#24216)
  Don't expose cleaned-up tasks as pending in PrioritizedEsThreadPoolExecutor (elastic#24237)
  Adds declareNamedObjects methods to ConstructingObjectParser (elastic#24219)
  ESIntegTestCase.indexRandom should not introduce types. (elastic#24202)
  Tests: Extend InternalStatsTests (elastic#24212)
  IndicesQueryCache should delegate the scorerSupplier method. (elastic#24209)
  Speed up parsing of large `terms` queries. (elastic#24210)
  [TEST] make sure that the random query_string query generator defines a default_field or a list of fields
  token_count type : add an option to count tokens (fix elastic#23227) (elastic#24175)
  Query string default field (elastic#24214)
  Make Aggregations an abstract class rather than an interface (elastic#24184)
  [TEST] ensure expected sequence no and version are set when index/delete engine operation has a document failure
  Extract batch executor out of cluster service (elastic#24102)
  Add 5.3.1 to bwc versions
  Added "release-state" support to plugin docs
  Added examples to cross cluster search of using cluster settings
  ...
asettouf pushed a commit to asettouf/elasticsearch that referenced this pull request Apr 23, 2017
…ecutor (elastic#24237)

Changes in elastic#24102 exposed the following oddity: PrioritizedEsThreadPoolExecutor.getPending() can return Pending entries where pending.task == null. This can happen for example when tasks are added to the pending list while they are in the clean up phase, i.e. TieBreakingPrioritizedRunnable#runAndClean has run already, but afterExecute has not removed the task yet. Instead of safeguarding consumers of the API (as was done before elastic#24102) this changes the executor to not count these tasks as pending at all.