How do I know that FSM.Apply has completed on a majority of nodes? #508
Comments
Thank you ❤️
There is no way to do exactly this in the library... but read on!
This is true - good catch. The way we deal with this, though, is not by reading from a quorum to check that all nodes have applied the entry. Instead the library provides two things:

1. The guarantee from the raft spec itself: an entry is only committed, and Apply only returns success, once the entry has been durably stored in the logs of a quorum of servers, so a committed write survives a leader crash.
2. The Barrier method, which lets a newly elected leader wait until its FSM has caught up with every committed log entry before it starts serving requests.
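A minimal sketch of how those two mechanisms look in Go against hashicorp/raft's public API (Apply and Barrier are real methods; the helper names and timeouts here are just illustrative):

```go
package example

import (
	"time"

	"github.com/hashicorp/raft"
)

// applyCommand submits a command and blocks until the entry is committed,
// i.e. durably stored in a quorum's logs, and applied to the leader's own
// FSM. A nil error is mechanism 1: the write can no longer be lost by a
// leader failover.
func applyCommand(r *raft.Raft, cmd []byte) error {
	f := r.Apply(cmd, 10*time.Second)
	return f.Error()
}

// awaitFSMCatchUp is mechanism 2: Barrier blocks until every entry
// committed so far has been applied to this node's FSM, which a freshly
// elected leader uses to bring its local state up to date.
func awaitFSMCatchUp(r *raft.Raft) error {
	return r.Barrier(10 * time.Second).Error()
}
```

Note that Apply may only be called on the leader; on any other node the future fails with raft.ErrNotLeader.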
Does that help?
I see.
Does this mean that whenever Barrier is called, thanks to mechanisms 1 and 2, the Raft log is already stored in the followers' stable storage, the newly elected leader restores its state from the latest Raft log, and no data is lost?
More or less, yes. Specifically, because of point 1 (the raft spec), for the new leader to have been elected by a quorum at all, it must already have the most up-to-date log, containing at least every committed log entry that might have been acknowledged by any previous leader. So its log on disk is already "complete". The Barrier apply itself is a no-op, but because committed log entries are only ever applied in order, waiting until this new "write" has been applied to the FSM ensures that every write ever committed by a previous leader is also now applied to the new leader's FSM, so it may proceed to accept new requests with a fully consistent local state.
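As a hedged illustration of that pattern, a new leader can watch hashicorp/raft's LeaderCh and issue a Barrier before serving; serveRequests and stopServing are hypothetical application hooks, not library functions:

```go
package example

import (
	"log"
	"time"

	"github.com/hashicorp/raft"
)

// Hypothetical application hooks, stubbed out for illustration.
func serveRequests() {}
func stopServing()   {}

// runLeaderLoop watches leadership changes and only starts serving once
// Barrier confirms the local FSM has applied every entry committed by any
// previous leader.
func runLeaderLoop(r *raft.Raft) {
	for isLeader := range r.LeaderCh() {
		if !isLeader {
			stopServing()
			continue
		}
		// Barrier appends a no-op entry and waits for it to be applied;
		// since entries apply in order, its completion implies every
		// earlier committed write is now reflected in this node's FSM.
		if err := r.Barrier(30 * time.Second).Error(); err != nil {
			log.Printf("barrier failed (leadership may have been lost): %v", err)
			continue
		}
		serveRequests()
	}
}
```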
I see.
Thank you for the wonderful library.
I really like hashicorp's OSS because it's simple and robust.
By the way, I'd like to know whether FSM.Apply has completed on a majority of nodes after FSM.Apply has been called.
Is there a way to do this?
A specific case where this might be necessary:
Assume that immediately after FSM.Apply writes data, the leader node crashes and leadership is transferred to another node.
In this case, to avoid losing the data written by FSM.Apply, we might want to wait until FSM.Apply has been called on a majority of nodes.
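For what it's worth, hashicorp/raft already surfaces this failure to the caller: the future returned by Apply resolves with an error such as raft.ErrLeadershipLost if leadership is lost before the entry reaches a quorum, so the client never receives a false acknowledgment. A small sketch of checking for that (the command bytes and retry policy are up to the application):

```go
package example

import (
	"errors"
	"log"
	"time"

	"github.com/hashicorp/raft"
)

func writeWithAck(r *raft.Raft, cmd []byte) error {
	f := r.Apply(cmd, 5*time.Second)
	err := f.Error()
	if errors.Is(err, raft.ErrLeadershipLost) {
		// Leadership was lost before commit: the write may or may not
		// survive on the new leader, so the client must retry (ideally
		// with an idempotent command) rather than assume success.
		log.Printf("write not acknowledged: %v", err)
	}
	return err
}
```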