Peer recovery should flush at the end #41660
Conversation
Pinging @elastic/es-distributed
I've left a question.
// if all those uncommitted operations have baked into the existing Lucene index commit already.
final SequenceNumbers.CommitInfo commitInfo = SequenceNumbers.loadSeqNoInfoFromLuceneCommit(
    indexShard.commitStats().getUserData().entrySet());
return commitInfo.maxSeqNo != commitInfo.localCheckpoint
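For context, the quoted snippet reads the sequence-number metadata out of the last Lucene commit and treats the shard as having uncommitted work when the commit's max sequence number is ahead of its local checkpoint. A minimal standalone sketch of that comparison (the class and method names here are illustrative, not Elasticsearch's API):

```java
// Illustrative model of the check in the snippet above: a commit whose
// maxSeqNo is ahead of its localCheckpoint still has operations that are
// not fully accounted for in that commit.
class CommitInfoSketch {
    final long maxSeqNo;          // highest sequence number recorded in the commit's user data
    final long localCheckpoint;   // all operations at or below this seq no are in the commit

    CommitInfoSketch(long maxSeqNo, long localCheckpoint) {
        this.maxSeqNo = maxSeqNo;
        this.localCheckpoint = localCheckpoint;
    }

    // True when the commit does not yet cover every operation up to maxSeqNo,
    // i.e. a flush could produce a more complete commit.
    boolean hasUncommittedOps() {
        return maxSeqNo != localCheckpoint;
    }
}
```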
I wonder if the condition above about the translog is sufficient. What situation is the condition here addressing that's not addressed by the above one?
If a file-based recovery occurs, the primary also sends its translog to the replica. These operations are uncommitted on the replica even though they are already baked into the commit. We need this condition to avoid flushing in that case so the syncId is preserved. I pushed 07c3a7c to use another check.
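The scenario described above can be sketched as a guard that skips the flush when every replayed translog operation is already contained in the copied commit, so the existing commit (and its syncId) survives. This is a hedged, simplified illustration, not the actual change in 07c3a7c:

```java
// Illustrative guard: after a file-based recovery, the replayed translog
// operations may already be inside the Lucene commit that was copied over.
// Flushing then would only replace the commit and lose its syncId, so we
// flush only when the translog carries operations beyond the commit.
class FlushGuardSketch {
    static boolean shouldFlush(long translogMaxSeqNo,
                               long commitLocalCheckpoint) {
        // Everything at or below the commit's local checkpoint is already
        // durably committed; a flush adds nothing in that case.
        boolean translogFullyCommitted = translogMaxSeqNo <= commitLocalCheckpoint;
        return !translogFullyCommitted;
    }
}
```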
@elasticmachine test this please
I think this could solve the issue and has other benefits as described. But I am a bit worried about the implications, especially for future maintenance. If we ever add anything to VerifyShardBeforeClose, we need to ensure the same holds at the end of a recovery. Also, I am not 100% sure recovery is the only place where this must be ensured (though I have no concrete cases). I would find it more intuitive to (maybe in addition to this) add a check in MetaDataIndexStateService.closeRoutingTable to fail closing the index if the routing table contains unvalidated shard copies (meaning we would have to collect more info in the previous steps).
Good point @henningandersen, but failing the closing operation would also not be very user-friendly, as shards are free to move around based on rebalancing decisions. Let's consider more options here.
@henningandersen found that with this change we can always validate that max_seq_no equals the global checkpoint in ReadOnlyEngine. I pushed 6e952c5 to enable it.
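The invariant mentioned here (max_seq_no equal to the global checkpoint when opening a read-only engine) can be expressed as a simple predicate. A hedged sketch, not the actual ReadOnlyEngine code:

```java
// Illustrative form of the invariant enabled in 6e952c5: a shard opened
// read-only should have no operations above the global checkpoint, i.e.
// every indexed operation is fully accounted for before it goes read-only.
class ReadOnlyInvariantSketch {
    static boolean obeysInvariant(long maxSeqNo, long globalCheckpoint) {
        return maxSeqNo == globalCheckpoint;
    }
}
```

As the next comment notes, indices frozen on older versions may predate this guarantee, which is why the check cannot be unconditional.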
I tend to think I was wrong about this, since FrozenEngine extends ReadOnlyEngine. If something was frozen on 6.7 or 7.0, it might not obey the invariant if it has the issue from #41041?
LGTM
LGTM
Thanks everyone!
Flushing at the end of a peer recovery (if needed) can bring these benefits: 1. Closing an index won't end up in a red state, for a recovering replica will always be ready for closing whether or not it performs the verifying-before-close step. 2. A good opportunity to compact the store (i.e., flushing and merging Lucene, and trimming the translog). Closes #40024 Closes #39588
Relates #33888