Admission chain for underlying resource is not called for requests to /scale subresource #84530
Comments
/sig api-machinery
This is related to (one of the causes of) #82046, "Server-side Apply: Ownership not tracked for scale subresource".
/priority backlog
This probably isn't well-defined enough to be marked as help wanted.
/remove-help
@liggitt I should have read https://github.com/kubernetes/community/blob/master/contributors/guide/help-wanted.md first.

As for the implications on existing webhooks, I think the only safe thing to do is to call both the /scale admission chain and the admission chain for the underlying resource.
Implicitly this would mean that a mutating webhook might change some part of the object other than the replicas field.
Also, webhook changes are typically visible because the mutated object is returned to the client. In this case, only the Scale object is returned to the client, so changes elsewhere in the underlying object would not be visible.
Validating & Mutating webhooks can't be interleaved, I think, so this isn't an option. |
That seems really problematic… the whole point of the scale subresource is that only the replicas field will change.
A few questions to explore the implications of that:
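For reference, the object that admission plugins receive on /scale requests is the autoscaling/v1 Scale, which exposes little more than a replica count. Trimmed from k8s.io/api/autoscaling/v1 (json tags and doc comments elided):

```go
type Scale struct {
	metav1.TypeMeta
	metav1.ObjectMeta

	Spec   ScaleSpec   // desired state
	Status ScaleStatus // observed state
}

type ScaleSpec struct {
	Replicas int32 // the only writable field on a /scale update
}

type ScaleStatus struct {
	Replicas int32
	Selector string // label selector for the scaled pods, in string form
}
```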
I can see value in adjusting this behavior, but it is a significant behavioral change for admission plugin authors. They should be relying on diffs of old and new objects to make decisions about whether an action is allowed, but I think we can be fairly certain they aren't all doing it correctly. How would you phase this in so it is safe for existing consumers? Just an opt-in, or would we be one of the first v2s? As Jordan pointed out, the same logical problem exists for some (but not all) other subresources, so that would need to be crisped up.
Results of the discussion from the sig-apimachinery meeting on Nov 6:
- Run the mutating admission controllers first for the subresource, then for the parent resource. Optionally, invoke the mutating admission controllers twice in that same order. Finally, run all validating admission controllers in parallel (see the sketch after this list).
- The API webhook configuration for the parent resource must opt in to observe changes to the parent from specific subresources. The webhook on the parent resource will then receive the actual user for the subresource request.
- This is unresolved: can mutating webhooks change fields that cannot be changed via a normal API call to that subresource? (The options are enumerated further down.)
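A rough sketch of that proposed ordering, with invented helper names standing in for the webhook chains (this is not actual kube-apiserver code):

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// object stands in for the real runtime.Object; all helpers below are
// hypothetical placeholders for the mutating/validating webhook chains.
type object map[string]interface{}

func mutateSubresource(ctx context.Context, o object)         {}
func mutateParent(ctx context.Context, o object)              {}
func validateSubresource(ctx context.Context, o object) error { return nil }
func validateParent(ctx context.Context, o object) error      { return nil }

// admitScale follows the ordering from the Nov 6 meeting notes: mutate the
// subresource, then the parent (optionally twice), then validate both in parallel.
func admitScale(ctx context.Context, scale, parent object) error {
	for i := 0; i < 2; i++ { // the second pass is the optional re-invocation
		mutateSubresource(ctx, scale)
		mutateParent(ctx, parent)
	}
	g, gctx := errgroup.WithContext(ctx)
	g.Go(func() error { return validateSubresource(gctx, scale) })
	g.Go(func() error { return validateParent(gctx, parent) })
	return g.Wait()
}

func main() {
	fmt.Println(admitScale(context.Background(), object{}, object{}))
}
```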
I would vote for the above option.
To re-summarize the current solution for this issue:
Decide: what should happen when a mutating webhook changes fields that the subresource itself cannot change (the options below).
Option 1: silently drop such changes; if that would leave the object in an inconsistent or bad state, a validating webhook can fail the entire operation.

Options 1 and 2 both have the con that user changes that used to work will stop working when the webhook is in the path, and the end user (the one using the subresource) has no recourse. Option 1 would permit more such changes to go through, at the cost of producing (from the webhook's perspective) objects with partially applied changes. Option 3 has the con that changes that used to be confined to known fields could have effects on more and different fields now.

Personally, I think that webhooks are highly privileged, and therefore we can trust them to make changes that we would not trust the subresource caller to make, and therefore option 3 is best. We can formalize this even more by e.g. explicitly giving webhooks identities and doing an RBAC check for update on the parent resource. If the webhook has this permission, then use option 3. If the webhook lacks this permission, then I guess option 2 is clearer than option 1.
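That RBAC check could, hypothetically, be expressed with a SubjectAccessReview, assuming webhooks were given identities. SubjectAccessReview is a real API, but the webhook identity string and the gating function here are invented for illustration:

```go
package main

import (
	"context"

	authorizationv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// webhookMayUpdateParent is a hypothetical gate: use option 3 only when the
// webhook's (invented) identity is authorized to update the parent resource.
func webhookMayUpdateParent(ctx context.Context, client kubernetes.Interface) (bool, error) {
	sar := &authorizationv1.SubjectAccessReview{
		Spec: authorizationv1.SubjectAccessReviewSpec{
			User: "system:webhook:example", // invented webhook identity
			ResourceAttributes: &authorizationv1.ResourceAttributes{
				Verb:      "update",
				Group:     "apps",
				Resource:  "deployments",
				Namespace: "default",
				Name:      "web",
			},
		},
	}
	resp, err := client.AuthorizationV1().SubjectAccessReviews().Create(ctx, sar, metav1.CreateOptions{})
	if err != nil {
		return false, err
	}
	return resp.Status.Allowed, nil // allowed => option 3; otherwise option 1 or 2
}
```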
Option 1: silently drop changes to fields the subresource cannot change.
Option 2: reject the entire request.
Option 3: accept the changes and apply them to the parent object.
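To make option 1 concrete, a minimal sketch for /scale on Deployments (the function is invented, not apiserver code): after the parent's mutating chain runs, only the field the subresource may change survives.

```go
package admission

import (
	appsv1 "k8s.io/api/apps/v1"
)

// applyScaleMutationOnly implements option 1 semantics for /scale: whatever
// the mutating chain did to `mutated`, keep only spec.replicas and silently
// discard every other change. A validating webhook can still fail the
// request if the resulting object is inconsistent.
func applyScaleMutationOnly(original, mutated *appsv1.Deployment) *appsv1.Deployment {
	result := original.DeepCopy()
	result.Spec.Replicas = mutated.Spec.Replicas
	return result
}
```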
I would find status fields changing in spec updates (or vice versa) in ways that could not be accomplished simply via the API very confusing.
Let me modify option 3: the status/spec distinction is solid enough that it should probably always be handled via option 1 or 2; no existing webhook should be attempting to set both. I had in mind things like /scale, not status, for option 3.
Since this is still in the beta period and we have plenty to do, I will vote for option 1 for the purpose of expediency. Options 2 and 3 will be more work, so we should wait until we have evidence that option 1 isn't sufficient.
OK, so as a next step, now that we are closer to a decision, we might want an updated or new KEP to document this, since among the changes there's a new field.
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
/remove-lifecycle stale
Hi, I'm also looking into this area. It's not quite the same use case, but I want to mutate things about a Pod depending on the scheduler's placement of it (a Binding create). If anyone has opinions on this, I'd be interested. I'd also like to continue my own research, but I can't find the "apply" KEP being talked about; it's not server-side apply, I guess.
The feature started before the KEP process, so our KEP is mostly work-in-progress. I'm trying to get people to update the KEP as we implement specific features for it, but it's still far from complete. This feature/bug is mostly unrelated to server-side apply, and the api-expression working group is focused on graduating server-side apply to GA at this time. If you want to get involved in solving this problem, feel free to let us know so that we can discuss the next steps! |
Yeah I'll take a look, shall we have a chat? 🙂 |
I'd love it! Happy to talk on Slack (apelisse@) whenever you want :-) Also feel free to join #wg-api-expression |
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
/remove-lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
/lifecycle frozen
Bot, these issues don't magically fix themselves and closing them doesn't make them go away...
What happened:
When a user makes a modifying request to an object through the /scale subresource, the /scale admission chain is called, but mutating and validating admission for the underlying resource are not. This introduces some unexpected behavior and also makes certain valid use cases impossible to enforce consistently.
For example, suppose a user wants to register a webhook on Deployments that prevents the total resource usage of a deployment from exceeding a certain amount, by validating that the product replicas * memory stays below a limit. This is not possible today, because a user can always increase the replicas through /deployments/scale and bypass the validating webhooks registered for /deployments.
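A minimal client-go sketch of the bypass (the deployment name "web" and namespace are invented): the replica change goes through UpdateScale, so a webhook matching only `resources: ["deployments"]` never sees it.

```go
package main

import (
	"context"
	"fmt"

	autoscalingv1 "k8s.io/api/autoscaling/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// This request hits only the admission chain registered for
	// deployments/scale; webhooks registered for "deployments" are skipped.
	scale := &autoscalingv1.Scale{
		ObjectMeta: metav1.ObjectMeta{Name: "web", Namespace: "default"},
		Spec:       autoscalingv1.ScaleSpec{Replicas: 100},
	}
	if _, err := client.AppsV1().Deployments("default").
		UpdateScale(context.TODO(), "web", scale, metav1.UpdateOptions{}); err != nil {
		fmt.Println("scale update rejected:", err)
		return
	}
	fmt.Println("replicas set to 100; the replicas*memory webhook was never consulted")
}
```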
What you expected to happen:
Mutating and validating admission for the underlying resource would be called when making a request to the /scale subresource.
How to reproduce it (as minimally and precisely as possible):
1. Register a webhook for /deployments which prevents changing replicas
2. Make a request to /deployments/scale
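Today's workaround is to register the webhook for the subresource as well. A sketch of the relevant rule (the rest of the webhook configuration, such as name and client config, is omitted and would be specific to your setup):

```go
package main

import (
	"fmt"

	admissionregistrationv1 "k8s.io/api/admissionregistration/v1"
)

func main() {
	rule := admissionregistrationv1.RuleWithOperations{
		Operations: []admissionregistrationv1.OperationType{admissionregistrationv1.Update},
		Rule: admissionregistrationv1.Rule{
			APIGroups:   []string{"apps"},
			APIVersions: []string{"v1"},
			// "deployments" alone does not match /scale requests; the
			// subresource must be listed explicitly. Note that for those
			// requests the webhook receives a Scale object, not a Deployment,
			// which is exactly the limitation this issue is about.
			Resources: []string{"deployments", "deployments/scale"},
		},
	}
	fmt.Printf("%+v\n", rule)
}
```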
/cc @apelisse