[FIXED] Clustering: leadership acquired actions could get stuck
If a leadership change occurred while the leadership-acquired actions were executing, before the raft.Barrier() call was made, the server would get stuck in that call. The RAFT library notifies the Streaming server code of a leadership change through a Go channel that had a capacity of only 1. The streaming server reads from that channel and then executes the leadership-acquired code, so while that code was running it could not read from the notification channel. This caused the RAFT library to block on a Go channel send, which in turn made the Barrier() call block.

I believe the right approach is to use a bigger notification channel rather than making Barrier() time out. If it did time out, the server should then transfer leadership, which I am afraid could cause a cascading effect if every server that gets elected needs longer than the chosen timeout to apply all the preceding entries to the FSM.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Showing 6 changed files with 224 additions and 69 deletions.