JSC Updates #1996
I am a hard no on lxfer at this time.
Given that we are so close to GA, I'd rather not introduce this quite yet as it appears too risky.
If anything, we should avoid the full wait when there was no leadership change (the TODO).
I know that this will improve things, but I wouldn't necessarily want to force that in just yet.
The reason I'm wary about skipping the full wait until after is the following:
When the replacement is picked there is no guarantee it will still be current by the time the operation is received.
And being the server that is current is what my change is all about.
This could potentially introduce short truncation which is hard to test for.
The reason a full wait is not needed when there was no leadership change is that the previous leader is the only server for which assumptions about being current are guaranteed to hold true.
This will not be true for any other server.
---- edit:
The more I contemplate the change the less I like it.
Also how often is this going to improve things?
Yes we look at this a lot right now but we specifically tested truncation.
Again, the thing to do here is to make sure we don't wait if there was no leader change; this will apply every time a follower experiences issues.
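As a rough illustration of the rule argued for in this comment (not code from this PR), the decision could be sketched as below. The `candidateState` type, its fields, and `skipFullWait` are hypothetical names introduced for this sketch only. The `wasLeaderLastTerm` flag in particular is an assumption standing in for the TODO in the diff; it is not a field of the raft type in `server/raft.go`.

```go
package main

import "fmt"

// candidateState is a hypothetical, simplified stand-in for the state that
// matters to this decision. It is not the raft type from server/raft.go.
type candidateState struct {
	peers             int  // number of peers known to this server
	votes             int  // votes received in this election
	wasLeaderLastTerm bool // assumption: set when this server led term-1 (the TODO)
}

// skipFullWait reports whether a newly elected leader may skip the full wait
// under the two conditions discussed in this comment.
func skipFullWait(c candidateState) bool {
	// Every peer voted for us, so no other server can hold newer history.
	if c.votes == c.peers {
		return true
	}
	// No leadership change: the previous leader is the only server whose
	// assumptions about being current are guaranteed to hold.
	return c.wasLeaderLastTerm
}

func main() {
	// Unanimous vote.
	fmt.Println(skipFullWait(candidateState{peers: 3, votes: 3})) // true
	// Bare quorum, but we were leader in the previous term.
	fmt.Println(skipFullWait(candidateState{peers: 3, votes: 2, wasLeaderLastTerm: true})) // true
	// Bare quorum after a leadership change: keep the full wait.
	fmt.Println(skipFullWait(candidateState{peers: 3, votes: 2})) // false
}
```

Note that a leadership transfer is deliberately absent from this sketch: a server picked as the replacement is a different server from the previous leader, so the second condition does not cover it.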
server/raft.go
@@ -2084,7 +2088,7 @@ func (n *raft) runAsCandidate() {
 if n.wonElection(votes) {
 // TODO If this server was also leader in n.term-1, then we could skip the timer as well.
 // This would be ok as we'd be guaranteed to have the latest history.
-if len(n.peers) == votes {
+if len(n.peers) == votes || n.lxfer {
I mentioned to Matthias that I feel that this addition "negates" his previous fix, so I am not sure about this change.
I am a hard no on this.
This is when a current leader with quorum selects us to be the new leader; in that case there is no reason to wait. IMO, without this the system is painful to work with when you just ctrl-c a server. If we have lost quorum and do not have a leader, Matthias's code will apply.
Short circuit full wait for leaders if we were a leadership xfer of preferred. Signed-off-by: Derek Collison <derek@nats.io>
Made sure to not remove us if we were remapped after the peer removal. Fixed some raft behaviors. Signed-off-by: Derek Collison <derek@nats.io>
… stream config. Signed-off-by: Derek Collison <derek@nats.io>
… banner consistently Signed-off-by: Derek Collison <derek@nats.io>
1. With snapshots being installed under heavy load. 2. Running catchup and missing responses due to bug in chan size for catchup. 3. various other tweaks. Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Waldemar Quevedo <wally@synadia.com>
Fix for JS reload and exports
Signed-off-by: Derek Collison <derek@nats.io>
We also delay restarting JetStream to make sure accounts are enabled. Signed-off-by: Derek Collison <derek@nats.io>
LGTM
Tweaked raft leader election under transfer or campaign.
Changed raft startup to not require certain cluster size.
Tweaked default block size for streams.
/cc @nats-io/core