CcrRepositoryIT#testFollowerMappingIsUpdated can fail with assertion error #37887

Closed
jtibshirani opened this issue Jan 25, 2019 · 3 comments
Labels
:Distributed/CCR Issues around the Cross Cluster State Replication features >test-failure Triaged test failures from CI

Comments

@jtibshirani
Contributor

Unfortunately the failure doesn't reproduce for me locally.


Link to the build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=oraclelinux/202

Command to reproduce:

./gradlew :x-pack:plugin:ccr:internalClusterTest \
  -Dtests.seed=20B21185CE3EF1E3 \
  -Dtests.class=org.elasticsearch.xpack.ccr.CcrRepositoryIT \
  -Dtests.method="testFollowerMappingIsUpdated" \
  -Dtests.security.manager=true \
  -Dtests.locale=ar-TN \
  -Dtests.timezone=Atlantic/Jan_Mayen \
  -Dcompiler.java=11 \
  -Druntime.java=8

Relevant excerpt from the logs:

1> [2019-01-25T21:16:59,592][INFO ][o.e.c.m.MetaDataCreateIndexService] [leader0] [index1] creating index, cause [api], templates [], shards [1]/[0], mappings [doc]
  1> [2019-01-25T21:16:59,624][INFO ][o.e.c.r.a.AllocationService] [leader0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[index1][0]] ...]).
  1> [2019-01-25T21:16:59,643][INFO ][o.e.x.c.CcrRepositoryIT  ] [testFollowerMappingIsUpdated] ensure green leader indices [index1]
  1> [2019-01-25T21:16:59,684][WARN ][o.e.r.b.FileRestoreContext] [followerd3] [[index2][0]] [_latest_/_latest_] failed to delete file [extra0] during snapshot cleanup
  1> [2019-01-25T21:16:59,694][INFO ][o.e.c.m.MetaDataMappingService] [leader0] [index1/yHsi0cZ5TpmL9dnQq2hcDQ] update_mapping [doc]
  1> [2019-01-25T21:16:59,728][INFO ][o.e.c.r.a.AllocationService] [followerm2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[index2][0]] ...]).
  1> [2019-01-25T21:16:59,766][INFO ][o.e.c.m.MetaDataDeleteIndexService] [leader0] [index1/yHsi0cZ5TpmL9dnQq2hcDQ] deleting index
  1> [2019-01-25T21:16:59,816][INFO ][o.e.c.m.MetaDataDeleteIndexService] [followerm2] [index2/sXK7-Cy5SSqjmw9hdCu6rQ] deleting index
  1> [2019-01-25T21:16:59,837][INFO ][o.e.x.c.CcrRepositoryIT  ] [testFollowerMappingIsUpdated] after test
FAILURE 0.40s J3 | CcrRepositoryIT.testFollowerMappingIsUpdated <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: expected:<2> but was:<1>
   > 	at __randomizedtesting.SeedInfo.seed([20B21185CE3EF1E3:2B41836BA05FE850]:0)
   > 	at org.elasticsearch.xpack.ccr.CcrRepositoryIT.testFollowerMappingIsUpdated(CcrRepositoryIT.java:342)
   > 	at java.lang.Thread.run(Thread.java:748)
@jtibshirani jtibshirani added >test-failure Triaged test failures from CI :Distributed/CCR Issues around the Cross Cluster State Replication features labels Jan 25, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@dnhatn dnhatn self-assigned this Jan 25, 2019
@Tim-Brooks Tim-Brooks assigned Tim-Brooks and dnhatn and unassigned dnhatn and Tim-Brooks Jan 25, 2019
jtibshirani added a commit that referenced this issue Jan 25, 2019
@jtibshirani
Contributor Author

I muted the test in 7c130d2, since it failed a few times.

@dnhatn
Member

dnhatn commented Jan 31, 2019

I opened #38071.

dnhatn added a commit that referenced this issue Feb 4, 2019
There are two issues with the way we sync the mapping from the leader
to the follower when a CCR restore completes:

1. The mapping returned from the cluster service might not be as up to
date as the mapping of the restored index commit.

2. We should not compare the mapping version of the follower with that
of the leader. They are not related to one another.

Moreover, I think we should only ensure that, once the restore is done,
the mapping on the follower is at least as recent as the mapping of the
copied index commit. We don't have to sync mapping updates made after
we opened the session.

Relates #36879
Closes #37887
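
The rule described in the commit message above can be illustrated with a small, self-contained sketch. This is not the Elasticsearch implementation: the class `MappingSyncSketch`, the method `fetchMappingAtLeast`, and the `LongSupplier` standing in for "ask the leader for its current mapping version" are all hypothetical names chosen for illustration. The sketch only shows the idea of accepting a leader mapping once its version is at least the version recorded in the restored index commit, and of never consulting the follower's own mapping version.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch, not Elasticsearch code.
final class MappingSyncSketch {

    /**
     * Polls the (hypothetical) leader mapping source until it reports a version
     * that is at least the mapping version recorded in the restored index commit.
     * The follower's own mapping version is deliberately not an input here,
     * because it is unrelated to the leader's version counter.
     */
    static long fetchMappingAtLeast(long versionInRestoredCommit,
                                    LongSupplier leaderMappingVersion,
                                    int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long observed = leaderMappingVersion.getAsLong();
            if (observed >= versionInRestoredCommit) {
                // Newer mappings are acceptable; only older ones are rejected.
                return observed;
            }
        }
        throw new IllegalStateException(
            "leader mapping never reached version " + versionInRestoredCommit);
    }

    public static void main(String[] args) {
        // Simulated leader responses: the first answer is stale (version 1),
        // later answers catch up.
        long[] responses = {1, 2, 3};
        int[] calls = {0};
        LongSupplier leader =
            () -> responses[Math.min(calls[0]++, responses.length - 1)];

        long applied = fetchMappingAtLeast(2, leader, 5);
        System.out.println("applied leader mapping version: " + applied);
    }
}
```

In this sketch a stale first response is simply skipped, which mirrors the first point of the commit message; a real implementation would presumably wait on a cluster state update rather than poll in a loop.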
Tim-Brooks pushed a commit to Tim-Brooks/elasticsearch that referenced this issue Feb 5, 2019
Tim-Brooks added a commit that referenced this issue Feb 6, 2019