
Peer recovery should flush at the end #41660

Merged
dnhatn merged 14 commits into elastic:master from dnhatn:peer-recovery-flush on May 22, 2019

@dnhatn (Contributor) commented Apr 30, 2019

Flushing at the end of a peer recovery (if needed) can bring these benefits:

  • Closing an index won't end up in a red state, because a recovering replica will always be ready for closing, whether or not it performs the verifying-before-close step.
  • Good opportunities to compact store (i.e., flushing and merging Lucene, and trimming translog)

Closes #40024
Closes #39588
Relates #33888
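
A minimal sketch of the idea, using hypothetical names (Shard, finalizeRecovery, hasUncommittedOperations) rather than the actual Elasticsearch recovery code: once peer recovery completes, the replica flushes only if it holds operations that are not yet contained in its last Lucene commit.

// Illustrative sketch only; the names below are hypothetical and do not
// mirror the real Elasticsearch recovery implementation.
final class RecoveryEndSketch {

    interface Shard {
        boolean hasUncommittedOperations(); // operations not yet in the last Lucene commit
        void flush();                       // commits pending operations to Lucene
    }

    /** Called once peer recovery has completed on the replica. */
    static void finalizeRecovery(Shard shard) {
        // Flush only when needed: if the last commit already covers everything,
        // keep it (and any sync id it carries) untouched.
        if (shard.hasUncommittedOperations()) {
            shard.flush(); // persists the replayed ops; the translog can then be trimmed
        }
    }

    public static void main(String[] args) {
        Shard replica = new Shard() {
            @Override public boolean hasUncommittedOperations() { return true; }
            @Override public void flush() { System.out.println("flushed after recovery"); }
        };
        finalizeRecovery(replica); // prints "flushed after recovery"
    }
}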

@ywelsch (Contributor) left a comment

I've left a question.

// if all those uncommitted operations have baked into the existing Lucene index commit already.
final SequenceNumbers.CommitInfo commitInfo = SequenceNumbers.loadSeqNoInfoFromLuceneCommit(
indexShard.commitStats().getUserData().entrySet());
return commitInfo.maxSeqNo != commitInfo.localCheckpoint

@ywelsch (Contributor) Apr 30, 2019

I wonder if the condition above about the translog is sufficient. What situation is the condition here addressing that's not addressed by the above one?

@dnhatn (Author, Contributor) Apr 30, 2019

If a file-based recovery occurs, the primary also sends its translog to the replica. These operations are uncommitted on the replica even though they are already baked into the commit. We need this condition to avoid flushing in that case, so that the syncId is kept. I pushed 07c3a7c to use another check.
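
To make the distinction concrete, here is a purely illustrative sketch (plain Java with hypothetical names, not the actual Elasticsearch code) contrasting the two conditions: after a file-based recovery the translog reports uncommitted operations even though the copied Lucene commit already contains them, so only the commit-based check avoids an unnecessary flush that would drop the sync id.

// Illustrative sketch only: a hypothetical model of the two flush conditions
// discussed above, not actual Elasticsearch code.
final class FlushConditionSketch {

    // Translog-based condition: "are there operations not yet committed?"
    static boolean translogHasUncommittedOps(int uncommittedOps) {
        return uncommittedOps > 0;
    }

    // Commit-based condition (the check quoted in the review): "does the last
    // Lucene commit already cover every operation up to its max_seq_no?"
    static boolean commitIsMissingOps(long maxSeqNoOfCommit, long localCheckpointOfCommit) {
        return maxSeqNoOfCommit != localCheckpointOfCommit;
    }

    public static void main(String[] args) {
        // File-based recovery: the primary also ships its translog, so the
        // replica sees uncommitted operations even though the copied commit
        // already contains them (max_seq_no == local checkpoint of the commit).
        System.out.println(translogHasUncommittedOps(25));   // true  -> would flush and lose the sync id
        System.out.println(commitIsMissingOps(100L, 100L));  // false -> no flush needed, sync id kept
    }
}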

dnhatn added some commits Apr 30, 2019

dnhatn requested a review from ywelsch Apr 30, 2019

@dnhatn (Contributor, Author) commented Apr 30, 2019

@elasticmachine test this please

@henningandersen (Contributor) left a comment

I think this could solve the issue and have other benefits as described.

But I am a bit worried about the implications, especially for future maintenance. If we ever add anything into VerifyShardBeforeClose, we need to also ensure the same holds at the end of a recovery. Also, I am not 100% sure recovery is the only place to ensure this (though I have no concrete cases).

I would find it more intuitive to (maybe in addition to this) add a check in MetaDataIndexStateService.closeRoutingTable to fail closing the index if the routing table contains unvalidated shard copies (meaning we would have to collect more info in the previous steps).

@ywelsch (Contributor) commented May 2, 2019

Good point @henningandersen, but failing the closing operation would also not be very user-friendly, as shards are free to move around based on rebalancing decisions. Let's consider more options here.

@dnhatn (Contributor, Author) commented May 6, 2019

@henningandersen found that with this change we can always validate that max_seq_no equals the global checkpoint in ReadOnlyEngine. I pushed 6e952c5 to enable it.
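
A hedged sketch of what such a validation could look like (hypothetical names; the actual check added to ReadOnlyEngine may differ): when opening a read-only engine, fail fast if the commit's max_seq_no is ahead of the global checkpoint, since a fully recovered and flushed copy should not be in that state.

// Illustrative sketch of the invariant discussed above, with hypothetical names.
final class ReadOnlyInvariantSketch {

    /**
     * With this change, a fully recovered (and flushed) copy should satisfy
     * max_seq_no == global checkpoint, so a read-only engine can verify it up front.
     */
    static void ensureMaxSeqNoEqualsGlobalCheckpoint(long maxSeqNo, long globalCheckpoint) {
        if (maxSeqNo != globalCheckpoint) {
            throw new IllegalStateException(
                "max_seq_no [" + maxSeqNo + "] differs from global checkpoint [" + globalCheckpoint + "]");
        }
    }

    public static void main(String[] args) {
        ensureMaxSeqNoEqualsGlobalCheckpoint(57L, 57L); // passes
        // ensureMaxSeqNoEqualsGlobalCheckpoint(57L, 42L); // would throw IllegalStateException
    }
}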

tlrx referenced this pull request May 7, 2019: Replicate closed indices #33888 (closed; 50 of 50 tasks complete)

@henningandersen (Contributor) commented May 8, 2019

> @henningandersen found that with this change we can always validate that max_seq_no equals the global checkpoint in ReadOnlyEngine. I pushed 6e952c5 to enable it.

I tend to think I was wrong about this, since FrozenEngine extends ReadOnlyEngine. If a shard was frozen on 6.7 or 7.0, it might not obey the invariant if it is affected by #41041?

dnhatn added some commits May 16, 2019

@ywelsch (Contributor) left a comment

LGTM

@tlrx (Member) approved these changes May 17, 2019

LGTM

dnhatn added some commits May 21, 2019

@dnhatn (Contributor, Author) commented May 22, 2019

Thanks everyone!

dnhatn merged commit 75be2a6 into elastic:master May 22, 2019

8 checks passed

  • CLA: All commits in pull request signed
  • elasticsearch-ci/1: Build finished.
  • elasticsearch-ci/2: Build finished.
  • elasticsearch-ci/bwc: Build finished.
  • elasticsearch-ci/default-distro: Build finished.
  • elasticsearch-ci/docbldesx: Build finished.
  • elasticsearch-ci/oss-distro-docs: Build finished.
  • elasticsearch-ci/packaging-sample: Build finished.

dnhatn deleted the dnhatn:peer-recovery-flush branch May 22, 2019

dnhatn added a commit that referenced this pull request May 22, 2019

Peer recovery should flush at the end (#41660)
Flushing at the end of a peer recovery (if needed) can bring these
benefits:

1. Closing an index won't end up in a red state, because a recovering
replica will always be ready for closing, whether or not it performs the
verifying-before-close step.

2. Good opportunities to compact store (i.e., flushing and merging
Lucene, and trimming translog)

Closes #40024
Closes #39588

gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019

Peer recovery should flush at the end (elastic#41660)
Flushing at the end of a peer recovery (if needed) can bring these
benefits:

1. Closing an index won't end up in a red state, because a recovering
replica will always be ready for closing, whether or not it performs the
verifying-before-close step.

2. Good opportunities to compact store (i.e., flushing and merging
Lucene, and trimming translog)

Closes elastic#40024
Closes elastic#39588