[CI] FollowerFailOverIT testAddNewReplicasOnFollower failed #38894

Closed
droberts195 opened this issue Feb 14, 2019 · 5 comments
Labels
:Distributed/CCR (Issues around the Cross Cluster State Replication features)
>test-failure (Triaged test failures from CI)

Comments

droberts195 commented Feb 14, 2019

FollowerFailOverIT.testAddNewReplicasOnFollower failed in 7.0 in
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+internalClusterTest/530/console

The repro command is:

./gradlew :x-pack:plugin:ccr:internalClusterTest \
  -Dtests.seed=19484C0EDDEDE80D \
  -Dtests.class=org.elasticsearch.xpack.ccr.FollowerFailOverIT \
  -Dtests.method="testAddNewReplicasOnFollower" \
  -Dtests.security.manager=true \
  -Dtests.locale=ca-ES \
  -Dtests.timezone=Asia/Ashgabat \
  -Dcompiler.java=11 \
  -Druntime.java=8

This didn't reproduce locally for me.

There are 20 assertion errors in the log. The first one is:

10:17:12   2> java.lang.AssertionError: [follower-index] IndexNotFoundException[no such index [follower-index]]

Then:

10:17:12   2> java.lang.AssertionError: java.lang.RuntimeException: Cluster is already closed

Then the one that causes the test failure:

10:17:12 ERROR   43.1s J7 | FollowerFailOverIT.testAddNewReplicasOnFollower <<< FAILURES!
10:17:12    > Throwable #1: java.lang.AssertionError: timed out waiting for green state

droberts195 added the >test-failure and :Distributed/CCR labels Feb 14, 2019

@elasticmachine

Pinging @elastic/es-distributed

@droberts195

I muted the test for 7.0 in 23716b7

dnhatn self-assigned this Feb 14, 2019

dnhatn commented Feb 15, 2019

Log: add_new_replica.zip

dnhatn commented Feb 15, 2019

The seq_no-based optimization tripped the assertDocDoesNotExist assertion during peer recovery:

  2> java.lang.AssertionError: doc [doc][63] exists [1] times in index
  2> 	at __randomizedtesting.SeedInfo.seed([19484C0EDDEDE80D]:0)
  2> 	at 
  2> 	at org.elasticsearch.index.engine.InternalEngine.indexIntoLucene(InternalEngine.java:1042)
  2> 	at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:871)
  2> 	at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:784)
  2> 	at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:758)
  2> 	at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:1323)
  2> 	at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:1310)
  2> 	at org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$2(RecoveryTarget.java:358)
  2> 	at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:191)
  2> 	at org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:333)
  2> 	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:521)

I believe this relates to #38879.

dnhatn commented Feb 16, 2019

This was fixed by #38879. I unmuted this test on 7.0. Thanks @droberts195 for reporting this issue.

dnhatn closed this as completed Feb 16, 2019