-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop timer and correctly drain it #101475
Conversation
@@ -475,14 +475,14 @@ func (c *csiAttacher) waitForVolumeAttachDetachStatusWithLister(volumeHandle, at | |||
clock = &clock.RealClock{} | |||
) | |||
backoffMgr := wait.NewExponentialBackoffManager(initBackoff, maxBackoff, resetDuration, backoffFactor, jitter, clock) | |||
defer backoffMgr.Backoff().Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect as this may get another timer before draining the existing one. This can happen if ctx
signals done. See docs in
kubernetes/staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go
Lines 291 to 295 in 5b42681
// BackoffManager manages backoff with a particular scheme based on its underlying implementation. It provides | |
// an interface to return a timer for backoff, and caller shall backoff until Timer.C() drains. If the second Backoff() | |
// is called before the timer from the first Backoff() call finishes, the first timer will NOT be drained and result in | |
// undetermined behavior. | |
// The BackoffManager is supposed to be called in a single-threaded environment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Yes, after reviewing the API this defer clearly doesn't work.
@@ -491,6 +491,7 @@ func (c *csiAttacher) waitForVolumeAttachDetachStatusWithLister(volumeHandle, at | |||
return nil | |||
} | |||
case <-ctx.Done(): | |||
t.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have to drain the timer's channel because the backoff manager is private to this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the <-t.C()
case can return, do we need to call t.Stop()
if it does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's a Timer
(not a Ticker
), it fires once and there is no need to stop it after that.
@@ -166,6 +166,9 @@ func BackoffUntil(f func(), backoff BackoffManager, sliding bool, stopCh <-chan | |||
// of every loop to prevent extra executions of f(). | |||
select { | |||
case <-stopCh: | |||
if !t.Stop() { | |||
<-t.C() | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backoff manager was provided to this function so it must stop and drain the timer's channel here, otherwise the owner of the manager cannot reuse it. Even if the owner didn't care, stopping the timer earlier releases some resources so is beneficial.
/retest |
/assign @jpbetz |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-contributor-experience at kubernetes/community. |
/remove-lifecycle stale ping @jpbetz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question then LGTM.
@@ -475,14 +475,14 @@ func (c *csiAttacher) waitForVolumeAttachDetachStatusWithLister(volumeHandle, at | |||
clock = &clock.RealClock{} | |||
) | |||
backoffMgr := wait.NewExponentialBackoffManager(initBackoff, maxBackoff, resetDuration, backoffFactor, jitter, clock) | |||
defer backoffMgr.Backoff().Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Yes, after reviewing the API this defer clearly doesn't work.
@@ -491,6 +491,7 @@ func (c *csiAttacher) waitForVolumeAttachDetachStatusWithLister(volumeHandle, at | |||
return nil | |||
} | |||
case <-ctx.Done(): | |||
t.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the <-t.C()
case can return, do we need to call t.Stop()
if it does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/priority important-longterm |
@caesarxuchao, @saad-ali would either of you be willing to approve? |
/lgtm |
/assign davidz627 @davidz627 mind taking a look |
/unassign davidz627 @jsafrane mind taking a look? |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ash2k, caesarxuchao, jpbetz, jsafrane The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I am in the process of unifying and aligning some of the methods here, and I concur that clock.Timer is very dangerous and has no clear semantics for a consumer to use, because it differs significantly enough from Go's From a design perspective the right option is probably to have the backoff manager return the duration to wait (i.e., be just a variant of backoff), and let the loop manage the timer reuse itself. I may try that along with #107826 and consider the input of #114531. |
I took some of the comments in this and the related issues and applied them in #115064 to reduce the potential for accidental misuse. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Correctly drains timers acquired from backoff manager. This API is easy to misuse, I wonder if a better one is possible. Something like a function that takes a function and, after it returns, cleans things up properly. That would probably cater for only a subset of use cases though.
Does this PR introduce a user-facing change?
/sig api-machinery