New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialize pre-closing checks for engine implementations (#38702) #38727

Merged
merged 2 commits into from Feb 11, 2019

Conversation

Projects
None yet
2 participants
@tlrx
Copy link
Member

tlrx commented Feb 11, 2019

Backport of #37426 and #38702 to 6.7.

tlrx and others added some commits Jan 22, 2019

Ensure that max seq # is equal to the global checkpoint when creating…
… ReadOnlyEngines (#37426)

Since version 6.7.0 the Close Index API guarantees that all translog 
operations have been correctly flushed before the index is closed. If 
the index is reopened as a Frozen index (which uses a ReadOnlyEngine) 
we can verify that the maximum sequence number from the last Lucene 
commit is indeed equal to the last known global checkpoint and refuses 
to open the read only engine if it's not the case. In this PR the check is 
only done for indices created on or after 6.7.0 as they are guaranteed 
to be closed using the new Close Index API.

Related #33888
Specialize pre-closing checks for engine implementations (#38702)
The Close Index API has been refactored in 6.7.0 and it now performs
pre-closing sanity checks on shards before an index is closed: the maximum
sequence number must be equals to the global checkpoint. While this is a
strong requirement for regular shards, we identified the need to relax this
check in the case of CCR following shards.

The following shards are not in charge of managing the max sequence
number or global checkpoint, which are pulled from a leader shard. They
also fetch and process batches of operations from the leader in an unordered
way, potentially leaving gaps in the history of ops. If the following shard lags
a lot it's possible that the global checkpoint and max seq number never get
in sync, preventing the following shard to be closed and a new PUT Follow
action to be issued on this shard (which is our recommended way to
resume/restart a CCR following).

This commit allows each Engine implementation to define the specific
verification it must perform before closing the index. In order to allow
following/frozen/closed shards to be closed whatever the max seq number
or global checkpoint are, the FollowingEngine and ReadOnlyEngine do
not perform any check before the index is closed.

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
@elasticmachine

This comment has been minimized.

Copy link

elasticmachine commented Feb 11, 2019

@tlrx tlrx merged commit 4b38523 into elastic:6.7 Feb 11, 2019

7 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci/1 Build finished.
Details
elasticsearch-ci/2 Build finished.
Details
elasticsearch-ci/default-distro Build finished.
Details
elasticsearch-ci/docbldesx Build finished.
Details
elasticsearch-ci/oss-distro-docs Build finished.
Details
elasticsearch-ci/packaging-sample Build finished.
Details

@tlrx tlrx deleted the tlrx:backport-pre-close-checks-to-6.7 branch Feb 11, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment