-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: rework comment for advance interface #4049
Conversation
@bdarnell I did not change rawNode since it does not matter. The application drives raftNode, so application can always get Ready when there is one. |
What are the consequences if the applied index in the application falls behind the one tracked by the Node? I don't think it really matters, so this could be left out of the comment. Instead, I would explain the effects of an early advance: "However, no committed entries from the next LGTM. Cockroach has always called |
If the application only have one routine that applies the entries and only buffer the last batch of entries, then it is fine. However, if an application keeps on buffering all incoming committed entries, then there might be an issue when raft is running faster than the application can handle. The application might end up with buffering "infinite" number of entries to apply. Ideally, we want to keep raft and application in sync or at least not out of sync too much. I will add your suggestions into the comment. |
@andres-erbsen @bdarnell updated comment according to your feedbacks. |
@heyitsanthony We probably need to fix the possible two unapplied configuration issue in etcd. See (*) section. |
/cc @ngaut who maintains a rust port. |
// to call Advance before finish applying the last ready. But the applied index in your application | ||
// will be behind the applied index that tracks by Node. | ||
// | ||
// (*) NOTE: If a follower changes its state to candidate, it MUST wait for finishing all previous |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is subtle, and I think we were getting it wrong before (I didn't realize this was a requirement). What exactly does "changes its state to candidate" mean? A node may change its state to candidate on a Tick()
and then back to follower when it receives a new message; the application may not observe this if it doesn't call Ready
during that time. Does that still count, or is the problem only when Ready.SoftState.RaftState == StateCandidate
?
I think it would be better for the comment to start with the safe option: "the application should generally apply all entries before calling Advance
. However, as an optimization, the application may call Advance
while it is applying the commands if the following conditions hold:... This is most useful when applying the snapshot, which might take a long time and during which the application should continue to respond to heartbeats."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I right in assuming that this is required because of configuration changes needing to be applied?
Say I decide to queue up all non-configuration entries and apply them asynchronously (but apply configuration changes immediately). This is probably wrong, but if on restart I retrieve all committed entries from Storage and apply them to my state machine before starting the Node with Applied: hardState.Commit
, will I still run into the problem?
It feels weird that there wouldn't be a way where I can simply apply changes to my state machine asynchronously, without jumping through this (seemingly tricky) hoop of checking whether my node is a candidate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels weird that there wouldn't be a way where I can simply apply changes to my state machine asynchronously, without jumping through this (seemingly tricky) hoop of checking whether my node is a candidate.
You can apply log entries asynchronously. The only problem is that we need to ensure there is no more than one unapplied configuration change in the log of leader. So after the application receives the state change from follower to candidate (even before it can become a leader), it should block and wait for all pending configurations to be applied if there is any. I think this is very easy to achieve.
6e2eaba
to
390a451
Compare
@bdarnell @srijs @andres-erbsen |
LGTM |
raft: rework comment for advance interface
@heyitsanthony For your last changes in etcd.
/cc @bdarnell We discussed about adding a arg to Advance to indicate the actual apply index progress a while ago. But after more thoughts, we found it is easy enough for application to track its own applied index and we can accept the inconsistency between application and raft. Let me know if you have any concerns.