ZOOKEEPER-4643: Committed txns may be improperly truncated if follower crashes right after updating currentEpoch but before persisting txns to disk #2028

Closed
AlphaCanisMajoris wants to merge 0 commits into apache:master from AlphaCanisMajoris:ZOOKEEPER-4643

@AlphaCanisMajoris
Contributor

See ZOOKEEPER-4643 for details on the symptom, an example trace, the diagnosis, and a possible fix.

One possible fix for ZOOKEEPER-4643 is to guarantee that a follower updates its currentEpoch file only after it has synced the leader's history (that is, persisted the pending transactions to disk) when it receives NEWLEADER in the SYNC phase.

The solution in this patch builds on the fixes for ZOOKEEPER-4646 and ZOOKEEPER-4685, which guarantee that a follower syncs the leader's history (logs the pending transactions to disk) before replying with an ACK of NEWLEADER.

Overall, when a follower receives the NEWLEADER message, it will persist the pending transactions to disk first, then update the currentEpoch file, and finally reply with an ACK of NEWLEADER. This specific order ensures that issues such as ZOOKEEPER-4643, ZOOKEEPER-4646 & ZOOKEEPER-4685 are avoided.
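
To make the ordering concrete, here is a minimal, self-contained sketch of the three steps with a simulated crash point. It is not the actual follower code in ZooKeeper: the class `NewLeaderOrderSketch`, its fields, and the `crashBeforeStep` parameter are all hypothetical names introduced only for illustration, and in-memory lists stand in for the transaction log and the currentEpoch file.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; names and structure are assumptions, not ZooKeeper APIs.
class NewLeaderOrderSketch {
    // Transactions received during SYNC but not yet written to disk.
    final List<Long> pendingTxns = new ArrayList<>();
    // Stand-in for the on-disk transaction log.
    final List<Long> persistedTxns = new ArrayList<>();
    // Stand-in for the contents of the currentEpoch file.
    long currentEpoch = 1L;
    // Whether the ACK of NEWLEADER has been sent to the leader.
    boolean ackSent = false;

    // Handles NEWLEADER. crashBeforeStep simulates a crash just before
    // step 1, 2, or 3; any other value means no crash.
    void onNewLeader(long newEpoch, int crashBeforeStep) {
        if (crashBeforeStep == 1) return;
        // Step 1: persist the pending transactions first.
        persistedTxns.addAll(pendingTxns);
        pendingTxns.clear();

        if (crashBeforeStep == 2) return;
        // Step 2: only now update currentEpoch.
        currentEpoch = newEpoch;

        if (crashBeforeStep == 3) return;
        // Step 3: finally reply with the ACK of NEWLEADER.
        ackSent = true;
    }

    public static void main(String[] args) {
        for (int crash = 1; crash <= 4; crash++) {
            NewLeaderOrderSketch f = new NewLeaderOrderSketch();
            f.pendingTxns.add(101L);
            f.pendingTxns.add(102L);
            f.onNewLeader(2L, crash);
            System.out.println("crashBeforeStep=" + crash
                    + " currentEpoch=" + f.currentEpoch
                    + " persisted=" + f.persistedTxns
                    + " ackSent=" + f.ackSent);
        }
    }
}
```

In this model, whichever crash point is chosen, the follower is never left with the new currentEpoch while transactions are still unpersisted, and it never sends the ACK before both earlier steps have completed.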
