CcrRepositoryIT#testFollowerMappingIsUpdated can fail with assertion error #37887

jtibshirani · 2019-01-25T20:44:05Z

Unfortunately the failure doesn't reproduce for me locally.

Link to the build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=oraclelinux/202

Command to reproduce:

./gradlew :x-pack:plugin:ccr:internalClusterTest \
  -Dtests.seed=20B21185CE3EF1E3 \
  -Dtests.class=org.elasticsearch.xpack.ccr.CcrRepositoryIT \
  -Dtests.method="testFollowerMappingIsUpdated" \
  -Dtests.security.manager=true \
  -Dtests.locale=ar-TN \
  -Dtests.timezone=Atlantic/Jan_Mayen \
  -Dcompiler.java=11 \
  -Druntime.java=8

Relevant excerpt from the logs:

  1> [2019-01-25T21:16:59,592][INFO ][o.e.c.m.MetaDataCreateIndexService] [leader0] [index1] creating index, cause [api], templates [], shards [1]/[0], mappings [doc]
  1> [2019-01-25T21:16:59,624][INFO ][o.e.c.r.a.AllocationService] [leader0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[index1][0]] ...]).
  1> [2019-01-25T21:16:59,643][INFO ][o.e.x.c.CcrRepositoryIT  ] [testFollowerMappingIsUpdated] ensure green leader indices [index1]
  1> [2019-01-25T21:16:59,684][WARN ][o.e.r.b.FileRestoreContext] [followerd3] [[index2][0]] [_latest_/_latest_] failed to delete file [extra0] during snapshot cleanup
  1> [2019-01-25T21:16:59,694][INFO ][o.e.c.m.MetaDataMappingService] [leader0] [index1/yHsi0cZ5TpmL9dnQq2hcDQ] update_mapping [doc]
  1> [2019-01-25T21:16:59,728][INFO ][o.e.c.r.a.AllocationService] [followerm2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[index2][0]] ...]).
  1> [2019-01-25T21:16:59,766][INFO ][o.e.c.m.MetaDataDeleteIndexService] [leader0] [index1/yHsi0cZ5TpmL9dnQq2hcDQ] deleting index
  1> [2019-01-25T21:16:59,816][INFO ][o.e.c.m.MetaDataDeleteIndexService] [followerm2] [index2/sXK7-Cy5SSqjmw9hdCu6rQ] deleting index
  1> [2019-01-25T21:16:59,837][INFO ][o.e.x.c.CcrRepositoryIT  ] [testFollowerMappingIsUpdated] after test
FAILURE 0.40s J3 | CcrRepositoryIT.testFollowerMappingIsUpdated <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: expected:<2> but was:<1>
   > 	at __randomizedtesting.SeedInfo.seed([20B21185CE3EF1E3:2B41836BA05FE850]:0)
   > 	at org.elasticsearch.xpack.ccr.CcrRepositoryIT.testFollowerMappingIsUpdated(CcrRepositoryIT.java:342)
   > 	at java.lang.Thread.run(Thread.java:748)

elasticmachine · 2019-01-25T20:44:06Z

Pinging @elastic/es-distributed

jtibshirani · 2019-01-25T22:57:46Z

I muted the test in 7c130d2, since it failed a few times.

dnhatn · 2019-01-31T06:44:44Z

I opened #38071.

There are two issues with the way we sync the mapping from the leader to the follower when a CCR restore completes:

1. The mapping returned from the cluster service might not be as up to date as the mapping of the restored index commit.
2. We should not compare the mapping version of the follower with that of the leader. They are not related to one another.

Moreover, I think we should only ensure that, once the restore is done, the mapping on the follower is at least as recent as the mapping of the copied index commit. We don't have to sync mapping updates made after we opened the session.

Relates #36879
Closes #37887
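To illustrate point 1, here is a minimal sketch of the "at least as recent as the copied commit" check. It is not the code from #38071: the class `MappingSyncSketch`, the method `awaitLeaderMappingVersion`, and the retry limit are hypothetical names chosen for illustration. It assumes the leader's mapping version at the time of the copied index commit is known, and that the leader's mapping version as seen through the cluster service can be re-read until it catches up. Both versions come from the leader, so the follower's own mapping version is never compared against them.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; names and structure are assumptions, not Elasticsearch APIs.
final class MappingSyncSketch {

    /**
     * Re-reads the leader mapping version until it is at least the version
     * that was current when the index commit was copied.
     *
     * @param observedLeaderVersion source of the leader mapping version as seen
     *                              by the cluster service (may lag behind)
     * @param versionAtCommit       leader mapping version of the copied index commit
     * @param maxAttempts           upper bound on reads before giving up
     * @return the first observed version that is >= versionAtCommit
     */
    public static long awaitLeaderMappingVersion(LongSupplier observedLeaderVersion,
                                                 long versionAtCommit,
                                                 int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long observed = observedLeaderVersion.getAsLong();
            // A mapping newer than the commit is fine; only a stale one is a problem.
            if (observed >= versionAtCommit) {
                return observed;
            }
        }
        throw new IllegalStateException(
            "leader mapping version did not reach " + versionAtCommit
                + " after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulated cluster service that returns a stale version twice.
        long[] versions = {1L, 1L, 2L};
        int[] next = {0};
        long synced = awaitLeaderMappingVersion(
            () -> versions[Math.min(next[0]++, versions.length - 1)], 2L, 5);
        System.out.println("mapping version used for the follower: " + synced);
    }
}
```

A real implementation would presumably wait on cluster state updates with a timeout rather than poll in a loop; the loop only keeps the sketch self-contained.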

jtibshirani added the >test-failure (Triaged test failures from CI) and :Distributed/CCR (Issues around the Cross Cluster State Replication features) labels Jan 25, 2019

dnhatn self-assigned this Jan 25, 2019

Tim-Brooks assigned Tim-Brooks and dnhatn and unassigned dnhatn and Tim-Brooks Jan 25, 2019

jtibshirani added a commit that referenced this issue Jan 25, 2019

Mute CcrRepositoryIT#testFollowerMappingIsUpdated

7c130d2

Tracked in #37887.

jtibshirani added a commit that referenced this issue Jan 25, 2019

Mute CcrRepositoryIT#testFollowerMappingIsUpdated

8dfa241

Tracked in #37887.

pgomulka mentioned this issue Jan 30, 2019

CcrRepositoryIT.testIndividualActionsTimeout failing #38027

Closed

dnhatn mentioned this issue Jan 31, 2019

Tighten mapping syncing in ccr remote restore #38071

Merged

dnhatn closed this as completed in #38071 Feb 4, 2019

Tim-Brooks mentioned this issue Feb 5, 2019

Tighten mapping syncing in ccr remote restore #38459

Merged
