Make resourceVersion parameter semantics consistent across all storage.Interface implementations #72170
/test pull-kubernetes-integration
/test pull-kubernetes-kubemark-e2e-gce-big
This looks reasonable to me - I just left one comment.
But I would like at least someone else to take a look into that too.
@liggitt @smarterclayton
c9a3909 to 1b2c3b7
This is ready for another review pass.
Friendly nudge.
Just two minor nits - other than that LGTM.
Though I would also wait for @liggitt to make a final look too.
1b2c3b7 to d9c7dac
/test pull-kubernetes-dependencies
Friendly nudge.
I'm really hoping this is true. Is watch cache enabled for CRDs by default? Edit: looks like not - so are we really sure no one is relying on this?
There is no in-tree code relying on it. We can't entirely eliminate the possibility there is out-of-tree code relying on it. The conditions they'd have to meet are quite specific: (1) they've disabled the watch cache, or are using events, and (2) they're performing a |
…s enabled and disabled
/retest
So if we had to bring back exact semantics? Say we were wrong, we break a large chunk of people, they need a fix. How do we make them whole?
Worst case is that it's widespread (i.e. we really misread the situation), in which case we could revert the part of this change that applies to lists. We'd then have to regroup and figure out how to deprecate the old behavior and migrate. Slightly less improbable is that there is a limited set of use cases, but they're important and cannot easily migrate to "minimum RV" semantics. In this case I suppose we could introduce a separate param to GetOptions to support exact RV semantics. I'd really like to avoid doing that, but it would be an option.
I updated the release note, please double check it for me to see if you agree.
Release note looks great. Thanks!
/lgtm I hope I don't regret this.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jpbetz, smarterclayton The full list of commands accepted by this bot can be found here. The pull request process is described here
/retest Review the full test history for this PR. Silence the bot with an |
[This is part of an effort to improve Apiserver Resource Version Semantics (as previously circulated with sig-apimachinery) and fix the "stale read" issue]
Make the minimum resourceVersion semantics consistent across all storage.Interface implementations by:
- Updating the storage.Interface documentation
- Changing the List() implementation to use "minimum RV" semantics instead of "exact RV" semantics when RV>0

Important: the watch cache is enabled by default on the kube-apiserver for all but a select few types (e.g. Events, CRDs). So while API clients do not get to decide, when they make requests, which semantics they get, they will typically get the watch cache semantics.
Before:
Note that "quorum read, RV ignored" behaves like "min RV" for all RVs that were retrieved from some previous operation on the store because quorum reads are guaranteed to be processed with the latest RV. So it appears from these semantics that "min RV" semantics are pervasive across both implementations and what we should converge to.
After:
The change from "exact RV" to "min RV" for the List operation is the most notable change here. But given that the watch cache is enabled by default for all but a select few types, and that the semantics change here would only be visible if providing a non-zero RV, we don't expect this change to be particularly risky/visible. And we believe it to be lower risk than leaving the semantics inconsistent across implementations, esp. given that clients don't necessarily have any way of knowing if the watch cache is enabled and what semantics to expect.

As part of this change we're introducing a ResourceVersionTooLargeError type to the storage interface so that both the watch cache and etcd3 return the same error type if unable to serve a requested resource version because it's too high. Previously the watch cache would return a timeout error in this case, because it waits up to 3 seconds to see if the "too high" resource version appears. I've done a quick review and it looks like this change is low risk, but if reviewers know of reasons why this might be breaking, please let me know.

cc @cheftako @lavalamp @wojtek-t @liggitt @smarterclayton