This repository has been archived by the owner on Dec 17, 2018. It is now read-only.
Configuration: 3-machine cluster. Machines 1 and 2 were left running for a long period with a NOPCommand being applied every second. Machine 3 was left offline. Eventually there was a backlog of over 7000 entries for Machine 3 to apply. When I started Machine 3, I observed that it was not catching up quickly. This was quickly traced to two factors:
On receiving a negative AppendEntriesReply, the leader does not send a new AppendEntries immediately. Instead, it waits for the next heartbeat timeout. On KayVee the heartbeats are sent at multi-second intervals, so each failed attempt costs a full heartbeat interval and clearing the backlog takes a very long time.
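The leader rolls back its prefix one index position at a time, so a backlog of N entries needs on the order of N rejected AppendEntries before the logs match. Perhaps the optimization described in the Raft paper would be useful, where the follower reports information about its log entries in the rejection so the leader can skip back in one step.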
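To illustrate both factors, here is a rough sketch (not libraft or KayVee code) that simulates a leader catching up a lagging follower and counts AppendEntries round trips. Everything in it is an assumption made for illustration: the `Reply.conflict_index` hint field, the `handle_append_entries` and `catch_up` helpers, the batch size of 100, and the representation of a log as a plain list of terms. The leader loop resends as soon as a reply arrives, which is the behaviour suggested for the first factor, and the `use_hint` flag switches between one-at-a-time rollback and a follower-supplied hint in the spirit of the Raft paper's optimization.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reply:
    success: bool
    # Hypothetical hint: the index the leader should use as its next nextIndex.
    conflict_index: Optional[int] = None


def handle_append_entries(log: List[int], prev_index: int, prev_term: int,
                          entries: List[int]) -> Reply:
    """Follower side. `log` holds entry terms; Raft index i lives at log[i - 1]."""
    if prev_index > len(log):
        # Log too short: ask the leader to resume right after our last entry.
        return Reply(False, len(log) + 1)
    if prev_index > 0 and log[prev_index - 1] != prev_term:
        # Term mismatch: report the first index we hold for the conflicting term.
        bad_term = log[prev_index - 1]
        first = prev_index
        while first > 1 and log[first - 2] == bad_term:
            first -= 1
        return Reply(False, first)
    # Prefix matches. Simplification: replace everything after prev_index,
    # which is safe here because this simulation never reorders messages.
    del log[prev_index:]
    log.extend(entries)
    return Reply(True)


def catch_up(leader_log: List[int], follower_log: List[int],
             use_hint: bool, batch: int = 100) -> int:
    """Leader side. Resends immediately after every reply (no heartbeat wait)
    and returns the number of AppendEntries round trips needed."""
    next_index = len(leader_log) + 1
    match_index = 0
    rpcs = 0
    while match_index < len(leader_log):
        prev_index = next_index - 1
        prev_term = leader_log[prev_index - 1] if prev_index > 0 else 0
        entries = leader_log[prev_index:prev_index + batch]
        reply = handle_append_entries(follower_log, prev_index, prev_term, entries)
        rpcs += 1
        if reply.success:
            match_index = prev_index + len(entries)
            next_index = match_index + 1
        elif use_hint:
            next_index = reply.conflict_index  # jump straight to the hint
        else:
            next_index -= 1                    # roll back one position at a time
    return rpcs


leader = [1] * 7000  # 7000 entries, all from term 1
slow = catch_up(leader, [1] * 100, use_hint=False)
fast = catch_up(leader, [1] * 100, use_hint=True)
print(slow, fast)
```

With a follower that already holds the first 100 entries, the one-at-a-time rollback needs thousands of round trips while the hinted version needs only a handful of batches plus one rejection. If each of those round trips also had to wait for a multi-second heartbeat, the slow path would be dominated by waiting, which matches what I observed.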
It's also possible that this will be mitigated through the use of snapshots.