[CI] checkpoint tracker failure in 6.8 cluster upgrade test #46311

Closed
tvernum opened this issue Sep 4, 2019 · 2 comments
Labels
:Distributed/Distributed (A catch all label for anything in the Distributed Area. If you aren't sure, use this one.)
>test-failure (Triaged test failures from CI)

Comments

@tvernum
Contributor

tvernum commented Sep 4, 2019

BWC tests: 6.5.0 -> 6.8

In :x-pack:qa:full-cluster-restart:without-system-key:v6.5.0#upgradedClusterTestCluster

[2019-09-04T10:07:34,985][INFO ][o.e.x.m.e.l.LocalExporter] [node-0] waiting for elected master node [{node-1}{Iu3gP669TKqwgd5WOGm5fw}{IadoUf9wQyGoOw05oGWWrw}{127.0.0.1}{127.0.0.1:35699}{testattr=test, ml.machine_memory=63158317056, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}] to setup local exporter [default_local] (does it have x-pack installed?)
[2019-09-04T10:07:35,162][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-0] fatal error in thread [elasticsearch[node-0][generic][T#3]], exiting
java.lang.AssertionError: local checkpoint tracker is not updated seq_no=49 id=49
	at org.elasticsearch.index.engine.InternalEngine.compareOpToLuceneDocBasedOnSeqNo(InternalEngine.java:729) ~[elasticsearch-6.8.3-SNAPSHOT.jar:6.8.3-SNAPSHOT]
	at org.elasticsearch.index.engine.InternalEngine.planIndexingAsNonPrimary(InternalEngine.java:1012) ~[elasticsearch-6.8.3-SNAPSHOT.jar:6.8.3-SNAPSHOT]
tvernum added the >test-failure and :Distributed/Distributed labels Sep 4, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

dnhatn self-assigned this Sep 4, 2019
dnhatn added a commit that referenced this issue Sep 7, 2019
The max_seq_no recorded in a Lucene commit of an old index (before 6.6.2)
can be smaller than the seq_no of some documents in that commit (see #38879).
Although we fixed this bug in 6.6.2 and 7.0.0, a problematic index
commit can still affect newer versions after a rolling upgrade or
full cluster restart. In particular, if a FollowingEngine (or an internal
engine with MSU enabled) restores from a problematic commit, it can
apply the MSU optimization to documents that already exist. The symptom
we see here is a violation of the local checkpoint tracker assertion.

Closes #46311
Relates #38879
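
The condition described in the commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `CommitSeqNoCheck` and its methods are hypothetical names, not Elasticsearch APIs, the shortcut rule is a simplification of the MSU optimization, and the sequence numbers 47 and 48 are made up for illustration (49 is the one from the log above). It is not the change made in #46340.

```java
import java.util.List;

// Hypothetical model, not Elasticsearch code: all names here are illustrative only.
final class CommitSeqNoCheck {

    /** Highest seq_no actually carried by any document in the commit, or -1 if empty. */
    static long maxDocSeqNo(List<Long> docSeqNos) {
        long max = -1L;
        for (long seqNo : docSeqNos) {
            max = Math.max(max, seqNo);
        }
        return max;
    }

    /** True when the commit's recorded max_seq_no understates what its documents contain. */
    static boolean isProblematicCommit(long recordedMaxSeqNo, List<Long> docSeqNos) {
        return maxDocSeqNo(docSeqNos) > recordedMaxSeqNo;
    }

    /**
     * Simplified stand-in for the optimization: skip the document lookup and append
     * directly when the incoming seq_no is above the assumed upper bound.
     */
    static boolean wouldSkipLookup(long incomingSeqNo, long assumedMaxSeqNo) {
        return incomingSeqNo > assumedMaxSeqNo;
    }

    public static void main(String[] args) {
        List<Long> docSeqNos = List.of(47L, 48L, 49L); // assumed commit contents
        long recordedMaxSeqNo = 48L;                   // assumed understated value

        System.out.println(isProblematicCommit(recordedMaxSeqNo, docSeqNos));
        // Trusting the recorded value treats seq_no 49 as new although it already exists.
        System.out.println(wouldSkipLookup(49L, recordedMaxSeqNo));
        // Deriving the bound from the documents themselves avoids the shortcut.
        System.out.println(wouldSkipLookup(49L, maxDocSeqNo(docSeqNos)));
    }
}
```

In this model, the understated bound is what lets an operation for an existing document take the append-only path, which is the kind of inconsistency the assertion in the log is guarding against.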
@dnhatn
Member

dnhatn commented Sep 7, 2019

Fixed in #46340

dnhatn closed this as completed Sep 7, 2019