[CI] org.elasticsearch.xpack.ccr.FollowerFailOverIT#testFailOverOnFollower #58534

Closed
matriv opened this issue Jun 25, 2020 · 3 comments · Fixed by #59375
Assignees: dnhatn
Labels:
- :Distributed/CCR (Issues around the Cross Cluster State Replication features)
- Team:Distributed (Meta label for distributed team)
- >test-failure (Triaged test failures from CI)

Comments

matriv commented Jun 25, 2020

Failing on master

Build scan: https://gradle-enterprise.elastic.co/s/jm5pkdiktr2zq

Repro line:

./gradlew ':x-pack:plugin:ccr:internalClusterTest' --tests "org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower" \
  -Dtests.seed=A1E8FAE62B8B6C9E \
  -Dtests.security.manager=true \
  -Dtests.locale=cs-CZ \
  -Dtests.timezone=America/Argentina/Rio_Gallegos \
  -Druntime.java=11

Reproduces locally?: No

Applicable branches: master

Failure history:
https://build-stats.elastic.co/app/kibana#/discover?_g=(refreshInterval:(pause:!f,value:7200000),time:(from:now-30d,mode:quick,to:now))&_a=(columns:!(_source),index:e58bf320-7efd-11e8-bf69-63c8ef516157,interval:auto,query:(language:lucene,query:'class:%20%22org.elasticsearch.xpack.ccr.FollowerFailOverIT%22%20test:%22%20testFailOverOnFollower%22%20stacktrace:%20%22timed%20out%20waiting%20for%20green%20state%22'),sort:!(time,desc))

Failure excerpt:

07:36:11 org.elasticsearch.xpack.ccr.FollowerFailOverIT > testFailOverOnFollower FAILED
07:36:11     java.lang.AssertionError: timed out waiting for green state
07:36:11         at __randomizedtesting.SeedInfo.seed([A1E8FAE62B8B6C9E:7EBB537ABFEBCFA4]:0)
07:36:11         at org.junit.Assert.fail(Assert.java:88)
07:36:11         at org.elasticsearch.xpack.CcrIntegTestCase.ensureColor(CcrIntegTestCase.java:337)
07:36:11         at org.elasticsearch.xpack.CcrIntegTestCase.ensureFollowerGreen(CcrIntegTestCase.java:311)
07:36:11         at org.elasticsearch.xpack.CcrIntegTestCase.ensureFollowerGreen(CcrIntegTestCase.java:306)
07:36:11         at org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower(FollowerFailOverIT.java:101)
07:36:11 REPRODUCE WITH: ./gradlew ':x-pack:plugin:ccr:internalClusterTest' --tests "org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower" -Dtests.seed=A1E8FAE62B8B6C9E -Dtests.security.manager=true -Dtests.locale=cs-CZ -Dtests.timezone=America/Argentina/Rio_Gallegos -Druntime.java=11
07:36:11 
07:36:11 org.elasticsearch.xpack.ccr.FollowerFailOverIT > classMethod FAILED
07:36:11     com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.elasticsearch.xpack.ccr.FollowerFailOverIT: 
07:36:11        1) Thread[id=712, name=Thread-23, state=TIMED_WAITING, group=TGRP-FollowerFailOverIT]
07:36:11             at java.base@11.0.6/jdk.internal.misc.Unsafe.park(Native Method)
07:36:11             at java.base@11.0.6/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
07:36:11             at java.base@11.0.6/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079)
07:36:11             at java.base@11.0.6/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1369)
07:36:11             at java.base@11.0.6/java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:415)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT.lambda$testFailOverOnFollower$0(FollowerFailOverIT.java:70)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT$$Lambda$4075/0x0000000100d3dc40.run(Unknown Source)
07:36:11             at java.base@11.0.6/java.lang.Thread.run(Thread.java:834)
07:36:11        2) Thread[id=711, name=Thread-22, state=TIMED_WAITING, group=TGRP-FollowerFailOverIT]
07:36:11             at java.base@11.0.6/jdk.internal.misc.Unsafe.park(Native Method)
07:36:11             at java.base@11.0.6/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
07:36:11             at java.base@11.0.6/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079)
07:36:11             at java.base@11.0.6/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1369)
07:36:11             at java.base@11.0.6/java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:415)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT.lambda$testFailOverOnFollower$0(FollowerFailOverIT.java:70)
07:36:11             at app//org.elasticsearch.xpack.ccr.FollowerFailOverIT$$Lambda$4075/0x0000000100d3dc40.run(Unknown Source)
07:36:11             at java.base@11.0.6/java.lang.Thread.run(Thread.java:834)
07:36:11         at __randomizedtesting.SeedInfo.seed([A1E8FAE62B8B6C9E]:0)
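
The leak report above shows two test-started threads still parked in a timed `Semaphore.tryAcquire` when the suite ended, presumably because the assertion failure at `FollowerFailOverIT.java:101` skipped whatever code normally stops them. The following is a minimal sketch, not the actual test code, of one way to guarantee that such workers are stopped and joined even when the test body fails. The class name `WorkerCleanupSketch` and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: not part of Elasticsearch or its test framework.
public class WorkerCleanupSketch {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final Semaphore permits = new Semaphore(0);
    private final List<Thread> threads = new ArrayList<>();

    /** Starts `count` workers that poll the semaphore until asked to stop. */
    void start(int count) {
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                while (!stopped.get()) {
                    try {
                        // Timed wait, so the stop flag is rechecked regularly.
                        if (permits.tryAcquire(10, TimeUnit.MILLISECONDS)) {
                            // A real worker would do one unit of work here.
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
            t.start();
            threads.add(t);
        }
    }

    /** Signals the workers to stop and waits for each of them to exit. */
    void stopAndJoin() {
        stopped.set(true);
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Returns true if any worker thread has not terminated yet. */
    boolean anyAlive() {
        for (Thread t : threads) {
            if (t.isAlive()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        WorkerCleanupSketch workers = new WorkerCleanupSketch();
        workers.start(2);
        try {
            // Stand-in for a test body that fails like the run above.
            throw new AssertionError("timed out waiting for green state");
        } catch (AssertionError e) {
            System.out.println("test body failed: " + e.getMessage());
        } finally {
            // Runs whether or not the body failed, so no thread is leaked.
            workers.stopAndJoin();
        }
        System.out.println("workers alive after cleanup: " + workers.anyAlive());
    }
}
```

With cleanup in a `finally` block, a failing assertion would still be reported, but the secondary `ThreadLeakError` would not mask or duplicate it.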
matriv added the >test-failure and :Distributed/CCR labels Jun 25, 2020
elasticmachine commented

Pinging @elastic/es-distributed (:Distributed/CCR)

elasticmachine added the Team:Distributed label Jun 25, 2020
dnhatn self-assigned this Jun 25, 2020
dimitris-athanasiou commented

Another failure in: https://gradle-enterprise.elastic.co/s/t5ncm7jvhfo3k

jrodewig commented

danielmitterdorfer added a commit that referenced this issue Jul 9, 2020
Relates #58534

Co-authored-by: Dimitris Athanasiou <dimitris@elastic.co>
dnhatn added a commit that referenced this issue Jul 14, 2020
)

The primary shards of follower indices during the bootstrap need to be
on nodes with the remote cluster client role as those nodes reach out to
the corresponding leader shards on the remote cluster to copy Lucene
segment files and renew the retention leases. This commit introduces a
new allocation decider that ensures bootstrapping follower primaries are
allocated to nodes with the remote cluster client role.

Relates #54146
Relates #53924
Closes #58534

Co-authored-by: Jason Tedor <jason@tedor.me>
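
The rule described in the commit message above can be illustrated with a small, self-contained sketch. This is not the decider added by the fix and does not use Elasticsearch's allocation APIs: the `Node`, `Shard`, and `Decision` types, the `canAllocate` method, and the role string `remote_cluster_client` are simplified stand-ins assumed for illustration only.

```java
import java.util.Set;

// Hypothetical model of the rule; none of these types come from Elasticsearch.
public class FollowerPrimaryAllocationSketch {

    // Assumed role name, used here purely for illustration.
    static final String REMOTE_CLUSTER_CLIENT = "remote_cluster_client";

    enum Decision { YES, NO }

    /** Stripped-down view of a node: a name plus the roles it holds. */
    static final class Node {
        final String name;
        final Set<String> roles;

        Node(String name, Set<String> roles) {
            this.name = name;
            this.roles = roles;
        }
    }

    /** Stripped-down view of a shard that is waiting to be allocated. */
    static final class Shard {
        final boolean followerIndex;  // belongs to a CCR follower index
        final boolean primary;        // primary rather than replica
        final boolean bootstrapping;  // still has to copy files from the leader

        Shard(boolean followerIndex, boolean primary, boolean bootstrapping) {
            this.followerIndex = followerIndex;
            this.primary = primary;
            this.bootstrapping = bootstrapping;
        }
    }

    /**
     * Only a bootstrapping primary of a follower index is restricted:
     * it may go only to a node that can reach the remote (leader) cluster.
     */
    static Decision canAllocate(Shard shard, Node node) {
        if (!shard.followerIndex || !shard.primary || !shard.bootstrapping) {
            return Decision.YES; // this rule has no opinion on other shards
        }
        return node.roles.contains(REMOTE_CLUSTER_CLIENT) ? Decision.YES : Decision.NO;
    }

    public static void main(String[] args) {
        Node client = new Node("node-a", Set.of("data", REMOTE_CLUSTER_CLIENT));
        Node dataOnly = new Node("node-b", Set.of("data"));
        Shard bootstrappingPrimary = new Shard(true, true, true);
        Shard replica = new Shard(true, false, false);

        System.out.println(canAllocate(bootstrappingPrimary, client));   // YES
        System.out.println(canAllocate(bootstrappingPrimary, dataOnly)); // NO
        System.out.println(canAllocate(replica, dataOnly));              // YES
    }
}
```

Under this reading, a follower primary placed on a node without the role could never finish bootstrapping, which would be consistent with the "timed out waiting for green state" failure reported in this issue.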
dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Jul 14, 2020
…stic#59375)

Relates elastic#54146
Relates elastic#53924
Closes elastic#58534

Co-authored-by: Jason Tedor <jason@tedor.me>