RemoteClusterConnectionTests.testCloseWhileConcurrentlyConnecting sporadically fails #24179

Closed
danielmitterdorfer opened this issue Apr 19, 2017 · 2 comments
Labels: >test, >test-failure, v6.0.0-alpha2

Comments

@danielmitterdorfer
Member

danielmitterdorfer commented Apr 19, 2017

Failure trace (source):

Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1916, name=elasticsearch[org.elasticsearch.action.search.RemoteClusterConnectionTests][management][T#2], state=RUNNABLE, group=TGRP-RemoteClusterConnectionTests]
   > 	at __randomizedtesting.SeedInfo.seed([45ECFED169707D4D:F4A7F9173C63FBE1]:0)
   > Caused by: java.lang.AssertionError: shit's been called twice
   > 	at __randomizedtesting.SeedInfo.seed([45ECFED169707D4D]:0)
   > 	at org.elasticsearch.action.search.RemoteClusterConnectionTests$4.lambda$run$1(RemoteClusterConnectionTests.java:508)
   > 	at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68)
   > 	at org.elasticsearch.action.ActionListener.onFailure(ActionListener.java:102)
   > 	at org.elasticsearch.action.search.RemoteClusterConnection$ConnectHandler$1.lambda$doRun$1(RemoteClusterConnection.java:347)
   > 	at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68)
   > 	at org.elasticsearch.action.search.RemoteClusterConnection$ConnectHandler$SniffClusterStateResponseHandler.handleResponse(RemoteClusterConnection.java:493)
   > 	at org.elasticsearch.action.search.RemoteClusterConnection$ConnectHandler$SniffClusterStateResponseHandler.handleResponse(RemoteClusterConnection.java:440)
   > 	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1030)
   > 	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1030)
   > 	at org.elasticsearch.transport.TcpTransport$2.doRun(TcpTransport.java:1384)
   > 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638)
   > 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   > 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   > 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   > 	at java.lang.Thread.run(Thread.java:745)
   > Caused by: java.lang.RuntimeException
   > 	at org.elasticsearch.action.search.RemoteClusterConnectionTests$4.lambda$run$1(RemoteClusterConnectionTests.java:505)
   > 	at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68)
   > 	at org.elasticsearch.action.ActionListener.onFailure(ActionListener.java:102)
   > 	at org.elasticsearch.action.search.RemoteClusterConnection$ConnectHandler$1.lambda$doRun$1(RemoteClusterConnection.java:347)
   > 	at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68)
   > 	at org.elasticsearch.action.search.RemoteClusterConnection$ConnectHandler.collectRemoteNodes(RemoteClusterConnection.java:408)
   > 	at org.elasticsearch.action.search.RemoteClusterConnection$ConnectHandler$1.doRun(RemoteClusterConnection.java:352)
  2> NOTE: test params are: codec=Asserting(Lucene70), sim=RandomSimilarity(queryNorm=true): {}, locale=nl-NL, timezone=America/Yakutat
   > 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
  2> NOTE: Linux 4.8.6-300.fc25.x86_64 amd64/Oracle Corporation 1.8.0_121 (64-bit)/cpus=4,threads=1,free=408114792,total=529530880
   > 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   > 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   > 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
   > 	... 3 more
  • Affected branches (based on builds that failed in the past): master, 5.3
  • Reproduction: gradle :core:test -Dtests.seed=45ECFED169707D4D -Dtests.class=org.elasticsearch.action.search.RemoteClusterConnectionTests -Dtests.method="testFilterDiscoveredNodes" -Dtests.security.manager=true -Dtests.locale=nl-NL -Dtests.timezone=America/Yakutat
  • Reproducibility: fails roughly once per month in CI; not reproducible locally after 100 iterations (reproduction line + -Dtests.iters=100)

Likely related to #24010

@danielmitterdorfer danielmitterdorfer added >test-failure Triaged test failures from CI >test Issues or PRs that are addressing/adding tests v6.0.0-alpha1 labels Apr 19, 2017
@colings86
Contributor

Another case of this but on a different test method: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/2570/console

reproduce command:

 gradle :core:test -Dtests.seed=548442885855C658 -Dtests.class=org.elasticsearch.action.search.RemoteClusterConnectionTests -Dtests.method="testCloseWhileConcurrentlyConnecting" -Dtests.security.manager=true -Dtests.jvm.argline="-XX:-UseConcMarkSweepGC -XX:+UseG1GC" -Dtests.locale=mk -Dtests.timezone=Asia/Famagusta

In addition to the above exception, it also throws the following, which is strange because 5.0.0 > 2.0.0:

  1> java.lang.IllegalStateException: Received message from unsupported version: [5.0.0] minimal compatible version is: [2.0.0]
  1> 	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1321) ~[main/:?]
  1> 	at org.elasticsearch.transport.MockTcpTransport.readMessage(MockTcpTransport.java:172) ~[framework-6.0.0-alpha1-SNAPSHOT.jar:?]
  1> 	at org.elasticsearch.transport.MockTcpTransport.access$900(MockTcpTransport.java:73) ~[framework-6.0.0-alpha1-SNAPSHOT.jar:?]
  1> 	at org.elasticsearch.transport.MockTcpTransport$MockChannel$1.lambda$doRun$0(MockTcpTransport.java:359) ~[framework-6.0.0-alpha1-SNAPSHOT.jar:?]
  1> 	at org.elasticsearch.common.util.CancellableThreads.executeIO(CancellableThreads.java:105) ~[main/:6.0.0-alpha1-SNAPSHOT]
  1> 	at org.elasticsearch.transport.MockTcpTransport$MockChannel$1.doRun(MockTcpTransport.java:359) ~[framework-6.0.0-alpha1-SNAPSHOT.jar:?]
  1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_121]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_121]
  1> 	at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_121]

@jdconrad
Contributor

jdconrad commented May 8, 2017

@s1monw s1monw closed this as completed in f9cfe86 May 16, 2017
s1monw added a commit that referenced this issue May 16, 2017
…ads aborts

Today we assert hard if failure listeners are invoked more than once. Yet, this
can happen if we cancel the execution, since the caller and the handler will both
get the exception on the cancelable threads and will notify the listener
concurrently if timing allows. This commit relaxes the assertion to tolerate
multiple invocations with `ExecutionCancelledException`.

Closes #24010
Closes #24179
Closes vagnerclementino/elasticsearch/#98
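
The relaxation described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the class name, the helper `tolerateRepeatedCancellation`, and the local `ExecutionCancelledException` are hypothetical stand-ins, and a plain `Consumer<Exception>` is assumed in place of the real failure listener. The first failure is delivered, a repeated cancellation is ignored, and any other repeated failure still trips the assertion.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Sketch only: all names here are hypothetical stand-ins, not the Elasticsearch sources. */
final class RelaxedFailureListenerSketch {

    /** Hypothetical stand-in for the cancellation exception named in the commit message. */
    static class ExecutionCancelledException extends RuntimeException {
        ExecutionCancelledException(String message) {
            super(message);
        }
    }

    /**
     * Wraps a failure callback: the first failure is passed on, repeated
     * cancellations are silently ignored, and any other repeated failure
     * raises an AssertionError as before.
     */
    static Consumer<Exception> tolerateRepeatedCancellation(Consumer<Exception> delegate) {
        AtomicInteger calls = new AtomicInteger();
        return e -> {
            if (calls.incrementAndGet() == 1) {
                delegate.accept(e);
            } else if (!(e instanceof ExecutionCancelledException)) {
                throw new AssertionError("failure listener called twice", e);
            }
        };
    }

    /** Simulates the caller and the handler both reporting the same cancellation. */
    static int deliverConcurrently() {
        AtomicInteger delivered = new AtomicInteger();
        Consumer<Exception> listener = tolerateRepeatedCancellation(e -> delivered.incrementAndGet());
        CountDownLatch start = new CountDownLatch(1);
        Runnable notifier = () -> {
            try {
                start.await(); // release both threads at the same moment
                listener.accept(new ExecutionCancelledException("connection closed"));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        };
        Thread caller = new Thread(notifier, "caller");
        Thread handler = new Thread(notifier, "handler");
        caller.start();
        handler.start();
        start.countDown();
        try {
            caller.join();
            handler.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return delivered.get();
    }

    public static void main(String[] args) {
        System.out.println("delivered=" + deliverConcurrently()); // prints "delivered=1"
    }
}
```

Whichever thread wins the race delivers the failure and the other notification is dropped, so the result no longer depends on timing. A second failure of any other type would still surface as an `AssertionError`.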