What is CommitTimeout for? #28

Closed
stapelberg opened this Issue Jan 12, 2015 · 3 comments

stapelberg commented Jan 12, 2015

CommitTimeout is set to 50ms by default, and I notice that the leader sends RPCs to all followers in that interval. config.go says “Time without an Apply() operation before we heartbeat to ensure a timely commit”, but I am having trouble finding the necessity of such a timeout/forced AppendEntries RPC in the raft paper.

Could you clarify what this is for please? Is it a performance optimization only? What is the recommended value to set it to (is there a rule of thumb)? What will happen when I set it to e.g. 10s, while not changing any of the other timeouts?

Thanks!

armon commented Jan 12, 2015

Yeah, it's a subtle detail of how we've implemented raft. It's from section 5.3 on log replication. For a number of reasons, we send heartbeats asynchronously from the AppendEntries that carry actual log data, mostly to avoid head-of-line blocking on network and disk. Because of this decoupling, the heartbeat does not include the leaderCommit value (highest committed index), since replication may be lagging (for any number of reasons). We only send that value in the replication stream. As a result, suppose I commit the log entry at index 10. If there are no more write operations (e.g. nothing at index 11), then there is no way to update the leaderCommit of the followers. So we use CommitTimeout to send an AppendEntries that replicates no logs but still updates the leaderCommit of the followers. In practice, it works the same as doing it in the heartbeat, but we avoid a number of other issues in favor of higher stability.
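
To make the index-10 scenario above concrete, here is a minimal Go sketch. It is not the library's code: the `Follower` type and its `Heartbeat` and `AppendEntries` methods are hypothetical stand-ins that only model which message carries leaderCommit.

```go
package main

import "fmt"

// Follower is a hypothetical, stripped-down follower. It tracks only the last
// log index it stores and the highest index it knows to be committed.
type Follower struct {
	LastLogIndex int
	CommitIndex  int
}

// Heartbeat models the decoupled heartbeat: it shows the leader is alive but
// carries no leaderCommit, so CommitIndex is left untouched.
func (f *Follower) Heartbeat() {}

// AppendEntries models the replication stream. entries may be empty; either
// way the follower learns leaderCommit and advances its commit index, capped
// at the last entry it actually holds.
func (f *Follower) AppendEntries(entries []int, leaderCommit int) {
	for _, idx := range entries {
		if idx > f.LastLogIndex {
			f.LastLogIndex = idx
		}
	}
	if leaderCommit > f.CommitIndex {
		c := leaderCommit
		if c > f.LastLogIndex {
			c = f.LastLogIndex
		}
		f.CommitIndex = c
	}
}

func main() {
	f := &Follower{LastLogIndex: 9, CommitIndex: 9}

	// The leader ships entry 10. It is not committed yet, so leaderCommit is 9.
	f.AppendEntries([]int{10}, 9)
	fmt.Println("after replication:", f.CommitIndex)

	// The leader commits index 10, but no further writes arrive.
	// Heartbeats alone never tell the follower about it.
	f.Heartbeat()
	f.Heartbeat()
	fmt.Println("after heartbeats:", f.CommitIndex)

	// CommitTimeout fires: an AppendEntries with no entries carries leaderCommit.
	f.AppendEntries(nil, 10)
	fmt.Println("after CommitTimeout:", f.CommitIndex)
}
```

In this model the follower stays at commit index 9 until the empty AppendEntries arrives, which is the gap CommitTimeout is described as closing.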

@armon armon closed this Jan 12, 2015

stapelberg commented Jan 12, 2015

Thanks for clarifying that. I’m still wondering, though: wouldn’t it be enough to trigger one additional AppendEntries after a successful commit to get leaderCommit to the followers, instead of sending 20/s (default setting)? :)

armon commented Jan 12, 2015

Indeed. We used to do that :) But in practice you get all sorts of fun situations like a user restarting Consul in a way that causes data loss (e.g. replication log moves backward in time). With the no-op optimization we end up missing that and stalling the replication. Doing the AppendEntries is relatively cheap.
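
One way to read the data-loss case, sketched in Go under stated assumptions: the `follower` and `leader` types, `syncOnce`, and `repair` are hypothetical, and the consistency check is the simplified prevLogIndex/prevLogTerm rule from the Raft paper rather than this library's implementation. The point is that a leader which keeps sending AppendEntries notices a follower whose log moved backward, even with no new writes.

```go
package main

import "fmt"

// follower stores its log as a slice of terms; log[i] is the entry at index i+1.
type follower struct{ log []int }

// appendEntries applies a simplified Raft consistency check: reject unless the
// follower holds an entry at prevIndex whose term is prevTerm.
func (f *follower) appendEntries(prevIndex, prevTerm int, entries []int) bool {
	if prevIndex > len(f.log) {
		return false // follower is missing entries the leader assumes it has
	}
	if prevIndex > 0 && f.log[prevIndex-1] != prevTerm {
		return false
	}
	if len(entries) > 0 {
		// Simplification: overwrite everything after prevIndex.
		f.log = append(f.log[:prevIndex:prevIndex], entries...)
	}
	return true
}

// leader tracks its own log and the next index it will send to one follower.
type leader struct {
	log       []int
	nextIndex int
}

// syncOnce sends one AppendEntries (possibly empty) and backs off on rejection.
func (l *leader) syncOnce(f *follower) bool {
	prevIndex := l.nextIndex - 1
	prevTerm := 0
	if prevIndex > 0 {
		prevTerm = l.log[prevIndex-1]
	}
	if f.appendEntries(prevIndex, prevTerm, l.log[prevIndex:]) {
		l.nextIndex = len(l.log) + 1
		return true
	}
	if l.nextIndex > 1 {
		l.nextIndex--
	}
	return false
}

// repair keeps calling syncOnce, as a periodic timer would, until the follower
// accepts. It returns the number of AppendEntries calls that took.
func repair(l *leader, f *follower) int {
	rounds := 1
	for !l.syncOnce(f) {
		rounds++
	}
	return rounds
}

func main() {
	l := &leader{log: []int{1, 1, 1}, nextIndex: 4} // follower believed caught up
	f := &follower{log: []int{1}}                   // restarted and lost entries 2 and 3

	// No new writes happen. A single AppendEntries sent right after the last
	// commit would already be in the past; only repeated calls find the gap.
	fmt.Println("rounds to repair:", repair(l, f))
	fmt.Println("follower log length:", len(f.log))
}
```

With a one-shot trigger, nothing would prompt the leader to contact this follower again until the next write, so the rejection and back-off above would never start.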