-
Notifications
You must be signed in to change notification settings - Fork 39.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.10 AlwaysPullImages admission control order breaks MutatingAdmissionWebhook functionality like Istio #64333
Comments
@kubernetes/sig-api-machinery-bugs |
@billpratt: Reiterating the mentions to trigger a notification: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
looks like this is the istio bug where this came from - istio/istio#5810 |
cc @sttts @hzxuzhonghu thoughts please |
I am looking into it. |
/priority important-soon |
@k8s-mirror-api-machinery-bugs please feel free to change the priority and/or the milestone |
I have a doubt: why enable |
That make no sense. |
Would putting AlwaysPullImages after mutating webhook fix the issue? In 1.9, it comes after. See https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#is-there-a-recommended-set-of-admission-controllers-to-use |
Mutating webhooks are run at the end for the reason to be able to tweak everything other stages of the system have done. We cannot arbitrarily move some plugin adhoc somewhere else. This also breaks compatibility for people relying on the old order. I can see though that some plugins make more sense at the very end. I wonder whether we need some very limited ordering feature (please not priorities) in the webhook API, maybe a binary flag. |
Correct.
Everyone wants to be at the very end. Istio is likely to run afoul of PSP as well and changing an ordering for PSP will eliminate some power for mutating admission plugins in their ability to reconfigure objects. It seems more like istio may need to understand the world it's playing in than trying to twiddle the world around istio (of which there are likely to be many and various use-cases). |
@kubernetes/sig-api-machinery-misc @liggitt @smarterclayton @lavalamp |
Webhooks run after built-in plugins. There is no way to determine the relative ordering. Eventually, we will have to solve that, I think by giving built-ins their own configuration objects (and possibly even making them webhooks). (Clarification: @sttts and @deads2k did not mean to say that mutating webhooks run after validating webhooks: that's not true, validating webhooks run last.) The general rule @bgrant0607 has been arguing for is that webhooks should not modify user intent. "User intent" == fields that are already set. There are (at least) two problems with this: you can't tell the difference between a default and user intent. And existing admission plugins (like AlwaysPull) flagrantly violate this rule. But I do think overall life would be better if we could make this the rule. In this specific case, clearly you cannot enable both AlwaysPull and the Istio sidecar at the same time, no matter what order they are going to be executed in. They have contradictory desires. IMO AlwaysPull is a pretty bad plugin; it should be removed. People who want that behavior can make a webhook enforcing it, which can be more intelligent (e.g., make an exception for images known to be loaded on the node, like Istio's side car). |
@liggitt had some thoughts about a two pass approach that he was going to share.
It may be the case (and I don't know for sure), that the istio plugin didn't express a desire and got a default. However, alwaypullimages is just one case of many. Something like PSP is more likely to be problematic. |
I was assuming Istio specifically wanted to use an image that's pre-loaded
on nodes, so as not to increase a pod's startup time unnecessarily. I could
be wrong :)
If they're getting an unexpected default and *that* causes a problem, then
clearly the sidecar injector is very buggy.
…On Wed, May 30, 2018 at 5:26 AM David Eads ***@***.***> wrote:
@liggitt <https://github.com/liggitt> had some thoughts about a two pass
approach that he was going to share.
In this specific case, clearly you cannot enable both AlwaysPull and the
Istio sidecar at the same time, no matter what order they are going to be
executed in.
It *may* be the case (and I don't know for sure), that the istio plugin
didn't express a desire and got a default. However, alwaypullimages is just
one case of many. Something like PSP is more likely to be problematic.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#64333 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAngluyJG2hiBwRHjufaAHys2hcv9nElks5t3o_6gaJpZM4UOeh->
.
|
[MILESTONENOTIFIER] Milestone Issue Needs Approval @billpratt @kubernetes/sig-api-machinery-misc Action required: This issue must have the Issue Labels
|
/cc @yliaog @jennybuckley |
I also had a hard time understanding if Istio cared about the pull policy of the container it injects, or if it was just trying to mimic the server-side defaulting behavior.
It is currently the only way to ensure imagePullSecrets at the pod level are limited to providing access to images for that pod. A node/container-runtime-level alternative would be welcome, but is not forthcoming. In any case, assuming Istio doesn't actually care about the pull policy, this issue isn't specific to AlwaysPull. An identical scenario would be encountered with an admission plugin that set or required runAsUser to be a specific value on all containers in the pod. Solutions that come to mind:
|
to address #67050 i would support this. as a webhook dev, i want to do the minimum possible and inherit the controls the system puts in place. In my own request I don't want to have to re-implement the logic from previous admission plugins in my webhook. To this end, I want my mutating webhook to run first so as to be as close to the user's initial request as possible. This way I can inject my changes (update securityContext, add containers, volumes, etc) without having to know what the name of a service account's token is. This strategy leads to:
As an example, in order to inject the kerberos sidecar in our webhook we had to generate the following patches in OPA:
where as if the webhook could be run at the beginning of the chain my code would be:
The second block of code eliminates updates to individual container's service accounts and is portable between k8s distros (since its no longer coded to openshift's annotations). It also eliminates the need for me to lookup secret names for specific service accounts leading to fewer possible chances to leak information. |
Followup to my comment above (#64333 (comment)): As part of apply, we've developed the concept of "field sets"; we will be using this to track which fields the user deliberately set (and implicitly, which were defaulted). We probably will use this to indicate in the object which fields each mutating webhook set--we haven't written that part yet. Next we have a choice:
(1) doesn't solve things, it just gives a slightly better error message ("webhook X and Y conflict; talk to your cluster administrator") One example that came up in the discussion was a hypothetical webhook that took an image name:tag and resolved it to a hash, and put that in the image field (of e.g. a deployment's pod template), and didn't re-resolve on an update unless the user changes the image name:tag. Since on an update, webhooks will see the object post-apply, only (3) above allows the user to make a change without forcing every time. I'm not sure how important this use case is. Anyway if folks have use cases for choices 1 or 2 above, now would be a great time to give us that feedback. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
To me, it seems as though there are two types of mutating WebHooks:
The Conformance WebHooks are typically tied to a Validation WebHook that then validates the object adhere's to policy. Could the structure of validation AdmissionsWebhooks be modified to contain a reference to a "remediation" MutatingWebHook that that gets executed on failure, and then revalidated. If it still fails, then the object is rejected. This allows for real mutation hooks to be executed before validation (as is currently implemented) but differentiation MHW that only serve to ensure functionality of a Validation WebHook. Another idea is as follows: Concerns: Would require logic to ensure that objects aren't getting caught in a modification loop with a bad acting/conflicting webhooks. Pros: It would ensure that every MutatingWebhook has been applied to the object that gets passed to the validation webhooks. |
I came across this issue when attempting to use the k8s api from a container injected via a mutating webhook. It failed because the service account token volume mount doesn't exist. I think the reason for this is that the service account admission controller runs before the MAWH. @runyontr mentioned the idea of dividing MAWH into two categories, how about early and late? Early MAWHs run early and get access to the request as posted by the user. Late MAWHs run late, and see the results of most of the other plugins. For my usecase, I would use an early MAWH to add another container into the podspec, this would then go through the serviceaccount plugin where both containers would have their service account token volume mounts added. |
This commit copies the service account token volume vmount from an existing container in the pod to the injected pod. This ensures that injected containers can access the k8s api. This is a fragile hack and should be fixed somewhere else(!)
It seems there are other tools that solve this problem, and we might be able to leverage their implementations Systemdhttps://www.freedesktop.org/software/systemd/man/systemd.unit.html
chkconfighttps://linux.die.net/man/8/chkconfig RunLevel might be already captured with the two types of WebHooks (Mutating and Validation) but we may want to split them out and have different RunLevels for each type. Within a type, we could then specify the run position # What RunLevel to be in. Valid 0-6
Level int
# What Position within the RunLevel to be in. Valid >0
Position int |
We haven't thought of a good solution yet, where "good" is some combination of:
Requiring/permitting a graph (as in systemd) would:
"Run levels" are easier to understand/explain/predict but still have the latter two problems. Our current working plan is basically to run every mutating webhook a second time, if a webhook that ran after it changed something. (max 2 runs) There may be some factors I've forgotten to mention; @liggitt and @mbohlool have convinced me it's a hard problem. Note that today you can explicitly control the order of webhooks by changing the webhook configuration name and/or combining them into the same configuration object. |
fixed by #78080 in 1.15 |
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
In v1.10, 7c5f9e0 introduced the ability to not worry about admission control order because it's handled here
AlwaysPullImages
is beforeMutatingAdmissionWebhook
. When trying to use Istio sidecar injection, the pod fails to initialize statingError creating: pods "sleep-86f6b99f94-qxvq6" is forbidden: spec.initContainers[0].imagePullPolicy: Unsupported value: "IfNotPresent": supported values: "Always"
In v1.9, everything works as expected when placing
AlwaysPullImages
afterMutatingAdmissionWebhook
. If you putAlwaysPullImages
beforeMutatingAdmissionWebhook
, the same error above occurs.What you expected to happen:
In v1.10, when
AlwaysPullImages
andMutatingAdmissionWebhook
are turned on, sidecar injection like Istio should work.How to reproduce it (as minimally and precisely as possible):
AlwaysPullImages
andMutatingAdmissionWebhook
admission controllers.kubectl label namespace default istio-injection=enabled
kubectl describe rs [REPLICA_SET_NAME]
. You should see error events similar toError creating: pods "sleep-86f6b99f94-qxvq6" is forbidden: spec.initContainers[0].imagePullPolicy: Unsupported value: "IfNotPresent": supported values: "Always"
AlwaysPullImages
seems to fix IstioAnything else we need to know?:
Environment:
kubectl version
): v1.10.3uname -a
):The text was updated successfully, but these errors were encountered: