New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Boskos: Retry on conflict #16529
Boskos: Retry on conflict #16529
Conversation
@alvaroaleman: GitHub didn't allow me to assign the following users: bbguimaraes. Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @bbguimaraes |
@alvaroaleman: GitHub didn't allow me to request PR reviews from the following users: bbguimaraes. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this somewhat feels weird to me, like the abstraction is leaking - why have the handlers do retries rather than putting that into the ranch implementation? (this is arguably easier, probably - the ranch functions are very long.)
boskos/handlers/handlers.go
Outdated
@@ -390,3 +410,35 @@ func handleMetric(r *ranch.Ranch) http.HandlerFunc { | |||
res.Write(js) | |||
} | |||
} | |||
|
|||
// retryOnConflict is a copy of https://godoc.org/k8s.io/client-go/util/retry#RetryOnConflict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like this is really a copy of retry.OnError
except that you curried the retriable
parameter.
Couldn't we instead do this?
func retryOnConflict(backoff wait.Backoff, fn func() error) error {
return retry.OnError(backoff, isConflict, fn)
}
Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, retry.RetryOnError
is not available in the client-go version we currently use. I am working on another PR that will also bump the kube dependencies, once that is in I could change this to use retry.OnError
: #16524
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you working on a follow-up PR now that #16524 is in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to answer my own question: #16698
b68c37c
to
03d5ced
Compare
That is a very good point, updated the PR accordingly. Good you caught this, because we do the locking within ranch, so the retying on handler level would have meant that we might have to acquire the lock multiple times for a single request 😬 I also added retrying to |
|
||
// retryOnConflict is a copy of https://godoc.org/k8s.io/client-go/util/retry#RetryOnConflict | ||
// that only differs in the isConflict check, we use a variant that supports wrapped errors. | ||
// TODO: Simplify this by using retry.OnError once we have a sufficiently new client-go dependency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want a tracking issue on this
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alvaroaleman, stevekuznetsov The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This PR adds retrying on conflicts to the boskos handlers, in order to resolve the occasional
the object has been modified; please apply your changes to the latest version and try again
errors we are seeing and that will presumably vastly increase once we remove the global lock.Fixes https://github.com/openshift/release/issues/7183
/assign @ixdy @bbguimaraes