HistoryTemplateSearchInputMappingsTests failure on master with RejectedExecutionException #47749

dakrone · 2019-10-08T21:29:40Z

The failure looks like:

 2> Oct 08, 2019 2:38:57 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
  2> WARNING: Uncaught exception in thread: Thread[elasticsearch[node_sm1][snapshot][T#2],5,TGRP-HistoryTemplateSearchInputMappingsTests]
  2> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@71681ccd[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@305516f6[Wrapped task = org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule@346a8424]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@5829a5fb[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
  2> 	at __randomizedtesting.SeedInfo.seed([B75BDA1771E40CE4]:0)
  2> 	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
  2> 	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
  2> 	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340)
  2> 	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562)
  2> 	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.scheduleNextRun(SchedulerEngine.java:222)
  2> 	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.<init>(SchedulerEngine.java:196)
  2> 	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine.add(SchedulerEngine.java:147)
  2> 	at org.elasticsearch.xpack.slm.SnapshotRetentionService.rescheduleRetentionJob(SnapshotRetentionService.java:88)
  2> 	at org.elasticsearch.xpack.slm.SnapshotRetentionService.onMaster(SnapshotRetentionService.java:73)
  2> 	at org.elasticsearch.cluster.service.ClusterApplierService$OnMasterRunnable.run(ClusterApplierService.java:644)
  2> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:699)
  2> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
  2> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
  2> 	at java.base/java.lang.Thread.run(Thread.java:834)

  2> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=168, name=elasticsearch[node_sm1][snapshot][T#2], state=RUNNABLE, group=TGRP-HistoryTemplateSearchInputMappingsTests]

        Caused by:
        java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@71681ccd[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@305516f6[Wrapped task = org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule@346a8424]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@5829a5fb[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]

I was not able to reproduce this with:

REPRODUCE WITH: ./gradlew ':x-pack:plugin:watcher:test' --tests "org.elasticsearch.xpack.watcher.history.HistoryTemplateSearchInputMappingsTests" -Dtests.seed=B75BDA1771E40CE4 -Dtests.security.manager=true -Dtests.locale=en-US -Dtests.timezone=Etc/UTC -Dcompiler.java=12 -Druntime.java=11

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request-2/8367/console
https://gradle-enterprise.elastic.co/s/bsbxzot5q5dfy

elasticmachine · 2019-10-08T21:29:41Z

Pinging @elastic/es-core-features (:Core/Features/Watcher)

romseygeek · 2019-10-18T09:35:16Z

ChainIntegrationTests failed with the same error:

2> Oct 18, 2019 11:34:05 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
2> WARNING: Uncaught exception in thread: Thread[elasticsearch[node_sm1][snapshot][T#1],5,TGRP-ChainIntegrationTests]
2> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2fce700a[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@2fa46546[Wrapped task = org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule@6aa6eed6]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@5b864592[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
2> 	at __randomizedtesting.SeedInfo.seed([9CD142BD98DB2226]:0)
2> 	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
2> 	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
2> 	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340)
2> 	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562)
2> 	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.scheduleNextRun(SchedulerEngine.java:222)
2> 	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.<init>(SchedulerEngine.java:196)
2> 	at org.elasticsearch.xpack.core.scheduler.SchedulerEngine.add(SchedulerEngine.java:147)
2> 	at org.elasticsearch.xpack.slm.SnapshotRetentionService.rescheduleRetentionJob(SnapshotRetentionService.java:88)
2> 	at org.elasticsearch.xpack.slm.SnapshotRetentionService.onMaster(SnapshotRetentionService.java:73)
2> 	at org.elasticsearch.cluster.service.ClusterApplierService$OnMasterRunnable.run(ClusterApplierService.java:644)
2> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:699)
2> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2> 	at java.base/java.lang.Thread.run(Thread.java:834)

It seems to be something to do with the way the SnapshotRetentionService is being shut down?

Build scan is here: https://gradle-enterprise.elastic.co/s/5z5xvauuwaqxg/console-log?task=:x-pack:plugin:watcher:test
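For illustration, the rejection at the top of the stack trace can be reproduced in isolation. This is only a minimal sketch of the JDK behavior, assuming the scheduler has already been shut down by the time a late callback tries to schedule work. The class and method names are hypothetical and are not taken from the Elasticsearch code base.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Elasticsearch.
class RejectedAfterShutdownDemo {

    /** Returns true if scheduling on an already shut-down executor is rejected. */
    static boolean scheduleAfterShutdownIsRejected() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        // Stands in for the node stopping: the scheduler is shut down first.
        scheduler.shutdownNow();
        try {
            // A late callback (for example an onMaster-style notification) still tries to schedule.
            scheduler.schedule(() -> {}, 1, TimeUnit.SECONDS);
            return false;
        } catch (RejectedExecutionException e) {
            // The default AbortPolicy rejects the task with this unchecked exception.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + scheduleAfterShutdownIsRejected()); // prints "rejected = true"
    }
}
```

If nothing catches the exception on the calling thread, it surfaces as an uncaught exception, which is what the randomized test runner reports above.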

This adds a guard for the SLM lifecycle and retention services that prevents new jobs from being scheduled once the service has been stopped. Previously, if the node was shut down the service would be stopped, but a cluster state update or local master election could still cause a job to attempt to be scheduled. This could lead to an uncaught `RejectedExecutionException`. Resolves elastic#47749

* Don't schedule SLM jobs when services have been stopped (#48658)

  This adds a guard for the SLM lifecycle and retention services that prevents new jobs from being scheduled once the service has been stopped. Previously, if the node was shut down the service would be stopped, but a cluster state update or local master election could still cause a job to attempt to be scheduled. This could lead to an uncaught `RejectedExecutionException`. Resolves #47749

* Fix test for backport
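The guard described in the commit message could look roughly like the following. This is a hedged sketch, not the code merged in #48658: the class `GuardedRetentionService`, its `running` flag, and the method names are assumptions made for illustration, and the methods are synchronized here simply so that a stop cannot interleave with a scheduling attempt.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a service that schedules jobs on master election.
class GuardedRetentionService {
    private final ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
    private boolean running = true; // assumed flag, flipped once on close()

    /** Called on master election or cluster state change. Returns true if a job was scheduled. */
    synchronized boolean rescheduleJob() {
        if (!running) {
            // Service already stopped: skip scheduling instead of hitting a terminated executor.
            return false;
        }
        scheduler.schedule(() -> {}, 1, TimeUnit.HOURS);
        return true;
    }

    /** Stops the service; later rescheduleJob() calls become no-ops. */
    synchronized void close() {
        if (running) {
            running = false;
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        GuardedRetentionService service = new GuardedRetentionService();
        System.out.println(service.rescheduleJob()); // true: scheduled while running
        service.close();
        System.out.println(service.rescheduleJob()); // false: skipped, no exception thrown
    }
}
```

With the check in place, a late master-election callback after shutdown returns quietly rather than throwing `RejectedExecutionException` on the calling thread.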

dakrone added the >test-failure (Triaged test failures from CI) and :Data Management/Watcher labels Oct 8, 2019

dakrone mentioned this issue Oct 8, 2019

Separate SLM stop/start/status API from ILM #47710

Merged

romseygeek added the :Data Management/ILM+SLM (Index and Snapshot lifecycle management) label Oct 18, 2019

dakrone mentioned this issue Oct 29, 2019

Don't schedule SLM jobs when services have been stopped #48658

Merged

dakrone removed the :Data Management/Watcher label Oct 29, 2019

dakrone closed this as completed in #48658 Oct 30, 2019

This was referenced Feb 3, 2020

[meta] 7.6 release elastic/elasticsearch-net#4340

Closed

[meta] 7.6 release elastic/elasticsearch-net#4341

Closed
