-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lease: randomize expiry on initial refresh call #8101
Conversation
I wonder with rate limit revoking in runLoop added, is TLL randomization still necessary? |
lease/lessor.go
Outdated
@@ -475,6 +476,9 @@ func (le *lessor) initAndRecover() { | |||
if lpb.TTL < le.minLeaseTTL { | |||
lpb.TTL = le.minLeaseTTL | |||
} | |||
// randomize +/-10%, so that recovered leases with same TTL | |||
// do not expire all at the same time | |||
lpb.TTL += rand.Int63n(lpb.TTL/10) * int64((-(i & 1))|1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lpb.TTL *= 1.0+(0.1*(rand.Float64()-0.5))
? I think Int63n will panic if TTL < 10 since the docs say it expects >= 1.
@fanminshi without randomization the leases are forced to wait to drain at 1000 leases/s so if there's 10k leases, some leases will need to wait 10s to expire. If there's TTL randomization the queuing effect will be less pronounced: suppse those 10k leases TTLs are distributed over 10 seconds-- on average the leases will expire immediately since there's 1k leases expiring every second instead of 10k leases expiring in one instant. |
@heyitsanthony make sense. |
e6d81d9
to
d6c5035
Compare
@heyitsanthony PTAL. Thanks. |
cae0e47
to
58f4999
Compare
lease/lessor.go
Outdated
@@ -491,6 +492,7 @@ func (le *lessor) initAndRecover() { | |||
itemSet: make(map[LeaseItem]struct{}), | |||
expiry: forever, | |||
revokec: make(chan struct{}), | |||
fresh: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably this can be a flag on lessor itself instead? we change the extend accordingly. i do not feel we should put the calculation logic into lease.
lease/lessor.go
Outdated
@@ -147,6 +148,9 @@ type lessor struct { | |||
stopC chan struct{} | |||
// doneC is a channel whose closure indicates that the lessor is stopped. | |||
doneC chan struct{} | |||
|
|||
// 'true' when lease is just recovered | |||
fresh bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promoting? // 'true' when the lessor is promoting
lease/lessor.go
Outdated
l.refresh(extend) | ||
ext := extend | ||
if fresh { | ||
// randomize expiry with 士10%, otherwise leases of same TTL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this to a func?
lease/lessor.go
Outdated
expiredC: make(chan []*Lease, 16), | ||
stopC: make(chan struct{}), | ||
doneC: make(chan struct{}), | ||
promoting: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be set to false initially? when promote is called, we should change promoting to true. after finishing promotion, we set it back to false again. right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promoting
field is only used by Promote
method. If we set it true
and then false
inside Promote
, there is no way for Promote
to know if it has promoted before? Or if we want to randomize for every Promote
call, we can just remove the field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when there is a new leader, the new leader will call promote to refresh all its leases, right? for each refresh caused by election, we need the randomization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we only call Promote
after leader election https://github.com/coreos/etcd/blob/master/etcdserver/server.go#L1307-L1319. So should be safe to randomize for every Promote
call, and no need promoting
field?
lgtm. defer to @heyitsanthony |
Randomize the very first expiry on lease recovery to prevent recovered leases from expiring all at the same time. Address etcd-io#8096. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
Lease with TTL 5 should be renewed with randomization, thus it's still possible to exist after 3 seconds. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
Address #8096.