Add support for proxy injector reinvocation #6347

alpeb · 2021-06-22T20:46:02Z

Fixes #3750 and partially #6267

As of k8s 1.15 mutating webhooks can be reinvoked whenever another mutating webhook running after the current one mutates the pod being persisted.
Enabling reinvocation for the injector will allow configuring the proxy with annotations generated by such other mutating webhooks.

The implementation consists on adding "remove" statements into the json patch returned by the injector, implemented mostly in the new file pkg/inject/pod_patch.go which now holds the podPatch, moved from pkg/inject/inject.go.

This also updates the "reinvocation" integration test introduced in #6309 to properly verify the reinvocation is happening.

And this also means the "existing proxy sidecar" check is no longer relevant, and has been limited to "existing 3rd party sidecar".

Possible followup: refactor the CLI inject and uninject commands to properly leverage the cleanup done by this patch.

Fixes #3750 and partially #6267 As of k8s 1.15 mutating webhooks can be reinvoked whenever another mutating webhook running after the current one mutates the pod being persisted. Enabling reinvocation for the injector will allow configuring the proxy with annotations generated by such other mutating webhooks. The implementation consists on adding "remove" statements into the json patch returned by the injector, implemented mostly in the new file `pkg/inject/pod_patch.go` which now holds the `podPatch`, moved from `pkg/inject/inject.go`. This also updates the "reinvocation" integration test introduced in #6309 to properly verify the reinvocation is happening. And this also means the "existing proxy sidecar" check is no longer relevant, and has been limited to "existing 3rd party sidecar". Possible followup: refactor the CLI inject and uninject commands to properly leverage the cleanup done by this patch.

mateiidavid

This change is super cool! I think everything looks great as it is, although I haven't managed to test it yet since I couldn't find a readily available MWC to deploy alongside the injector and see how it all plays out in practice.

I've left a nitpicky comment about the removed log statement and a suggestion to contextualise the patch.addRemovals(..) call.

controller/webhook/server.go

pkg/inject/inject.go

dadjeibaah

These changes look great. I haven't tested them out as well with another MWC but they look good. I left some super tiny nits.

pkg/healthcheck/sidecar.go

pkg/inject/pod_patch.go

codecov-commenter · 2021-06-23T16:48:53Z

Codecov Report

❗ No coverage uploaded for pull request base (main@9b7a548). Click here to learn what that means.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #6347   +/-   ##
=======================================
  Coverage        ?   44.30%           
=======================================
  Files           ?      184           
  Lines           ?    17616           
  Branches        ?      265           
=======================================
  Hits            ?     7805           
  Misses          ?     9077           
  Partials        ?      734

Flag	Coverage Δ
golang	`42.38% <0.00%> (?)`
javascript	`66.69% <0.00%> (?)`
unittests	`44.30% <0.00%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b7a548...ca1e22a. Read the comment docs.

alpeb · 2021-06-23T16:54:16Z

@mateiidavid @dadjeibaah Thanks for your prompt reviews 🙂

If you wanna give it a test with another mutating webhook, you can install kubemod by applying test/integration/reinvocation/testdata/kubemod.yaml. Then you can configure it by instantiating a ModRule instance. The one in test/integration/reinvocation/testdata/modrule.yaml for example sets it so the kubemod mutating webhook adds the config.linkerd.io/proxy-log-level: debug annotation to the pod template.

charts/patch/templates/patch.json

adleong · 2021-06-23T17:28:48Z

charts/patch/templates/patch.json

@@ -29,6 +62,12 @@
  },
  {{- end }}
  {{- range $label, $value := .Values.labels }}
+  {{- if $alreadyInjected }}


Is it possible that the set of labels here during reinvocation is different from the set of a labels from a previous invocation? If so, we could have some previous labels which are not cleaned up here. Would it make sense to remove all labels with the linkerd.io prefix?

On the other hand, if the label set changing is not a concern, do we need to do the remove at all? Doing the add should replace the previous value.

Yeah, I think it'd make sense to remove all labels prefixed with linkerd.io 👍

alpeb · 2021-06-23T20:15:14Z

I realized the injector is now erasing what the jaeger mutating webhook is doing. Stay tuned...

alpeb · 2021-06-25T19:14:11Z

So the jaeger injector adds the env vars LINKERD2_PROXY_TRACE_ATTRIBUTES_PATH, LINKERD2_PROXY_TRACE_COLLECTOR_SVC_ADDR and LINKERD2_PROXY_TRACE_COLLECTOR_SVC_NAME into the proxy container. That mutation triggers again the proxy injector which first deletes the existing proxy container and adds it back, but doesn't know about those vars.

Here are some possible fixes:

Move some of the proxy template logic out of the helm template

Prior to generating the proxy container patch, the proxy injector would fetch all the env vars in the proxy container and merge them against the list of known "core" env vars, giving precedence to the latter, and use the resulting list in the patch. That means moving the rendering of the env vars out of the _proxy.tpl helm template and into the proxy injector code. The template would instead just iterate over a list of vars it receives as a param.

Use special names for env vars injected by extensions

Use a LINKERD2_PROXY_EXT prefix in env vars injected by extensions. So the jaeger ones would end up like LINKERD2_PROXY_EXT_TRACE_ATTRIBUTES_PATH, LINKERD2_PROXY_EXT_TRACE_COLLECTOR_SVC_ADDR and LINKERD2_PROXY_EXT_TRACE_COLLECTOR_SVC_NAME. So prior to removing the existing proxy container, the proxy-injector would gather any vars with that prefix, to re-apply them in the final patch. This requires changes in the proxy to use these new names.

adleong · 2021-06-25T21:28:45Z

Something to note about the proposals here is that they assume that extension webhooks will only add environment variables and will not modify any of the core env variables. This is true today, but we should consider if we're okay imposing this limitation going forward.

One of the very nice properties of the extension system today is that the proxy-injector has no coupling whatsoever to the extensions and the extensions may perform any arbitrary mutation to the proxy container. This could include modifying core env variables or other aspects of the proxy sidecar such as resource requests, security policy, etc. etc. This clean layering is possible because we can guarantee that the extension injectors will be invoked (or re-invoked) after the proxy-injector.

With this change, we would no longer be able to make that guarantee and would need to the proxy-injector to take into account that the extensions may have already performed mutations and be aware of what types of mutations are possible. Overall, I'm not sure that the benefit of allowing webhooks to add linkerd configuration through annotations is worth the increased complexity, increased coupling, and decreased flexibility that we would incur.

Instead, I would recommend that users either ensure the pod spec has all the desired annotations BEFORE it starts admission control, or that any webhooks which wish to perform mutation on the proxy container do it directly (i.e. by setting environment variables just like an extension would) instead of by setting annotations to control the proxy-injector.

alpeb · 2021-06-28T12:56:23Z

Overall, I'm not sure that the benefit of allowing webhooks to add linkerd configuration through annotations is worth the increased complexity, increased coupling, and decreased flexibility that we would incur.

There's also #6267 to cater for, to assist policy discovery. Not a hard requirement, but would make that feature more complete and avoid surprises when using additional sidecars.

I have in mind a third alternative, that I think would actually make the injector current behavior smoother, without adding any kind of coupling. Currently the injector's patch consists of multiple add ops for annotations, labels and volumes. However, the proxy init and non-init containers are parsed in separate templates and fed wholesale into the patch. We could instead treat each part of them as an add op, which shouldn't interfere with mutations from other webhooks. Let me me give this a try and I'll report back.

alpeb · 2021-06-30T20:23:40Z

Ok it is possible for the injector to generate a nice non-destructive json patch, instead of removing and then re-adding the proxy. However I painfully remembered that the purpose of the wholesale _proxy.tpl template is to be used as-is in the control plane deployments, particularly when installing through Helm, where templates is all we have at hand. So we can't get rid of that, and anything fancier like a non-destructive json patch in the injector requires separate logic, which would duplicate a lot of what we have in that template.

Yet another idea: stick to the rule that the proxy is to be configured only through annotations, forbidding extensions to directly mutate the proxy container. Instead, any such mutation (adding env vars, security policy, resources, etc) would then be declared by having the extension inject a new annotation containing the actual json patch for the mutation. Then when the injector runs, it would apply it on top of the regular patch.

What do you folks think? Any other suggestions?

adleong · 2021-06-30T20:28:29Z

I'm not sure what you mean by:

having the extension inject a new annotation containing the actual json patch for the mutation

Does this mean the extension webhook would add an annotation containing a json patch and then the proxy-injector would apply that patch? Why not just have the extension webhook perform the mutation directly? (like our existing extension webhooks do)

alpeb · 2021-06-30T20:33:46Z

Does this mean the extension webhook would add an annotation containing a json patch and then the proxy-injector would apply that patch?

Correct. Assuming we want to go ahead with reinvocation, the problem we current have is that after the extension performs a mutation, the injector needs to remove the proxy and then add it again (as to re-use the Helm template logic), thus losing the changes made by the extension.

adleong · 2021-06-30T20:43:11Z

What problem are we trying to solve with reinvocation? If it's just to allow extensions to configure the proxy, I think it makes more sense for those extensions to configure the proxy directly and keep reinvocation disabled on the proxy-injector. This gives us the invariant that extension webhooks will always be invoked (or re-invoked) AFTER the proxy-injector.

alpeb · 2021-07-01T15:27:01Z

What problem are we trying to solve with reinvocation? If it's just to allow extensions to configure the proxy, I think it makes more sense for those extensions to configure the proxy directly and keep reinvocation disabled on the proxy-injector. This gives us the invariant that extension webhooks will always be invoked (or re-invoked) AFTER the proxy-injector.

The problem reinvocation solves is to be able to configure the proxy given mutations done to the pod by webhooks running after the proxy injector, in particular other sidecar containers added after the proxy.

For example Cloudstate, which adds a sidecar and a config.linkerd.io/skip-outbound-ports annotation to have the proxy skip that sidecar. Granted, the workaround there is to require that annotation to be added beforehand.

Another more critical case is for policy enforcement (#6267). The proxy needs to know the list of ports exposed by all the containers as it will not allow connections to other ports. That list is not available through the downward API; only the injector can inspect the pod to retrieve it, and any container created after the injector runs will be locked out if the injector doesn't get another chance.

adleong · 2021-07-01T15:59:36Z

I think it makes things much simpler if we have the requirement that any mutating webhooks must assume that they will be invoked (or re-invoked) AFTER the proxy-injector. In the case of Cloudstate, this means that to add to the skip port list, their webhook should mutate the --outbound-ports-to-ignore flag of the proxy-init container instead of setting the config.linkerd.io/skip-outbound-ports annotation. Webhooks which add additional containers that listen on new ports would need to mutate the LINKERD2_PROXY_INBOUND_PORTS env variable to add their ports.

alpeb · 2021-07-09T14:19:37Z

Closing this one for now, until need for reinvocation is clarified. #6267 is getting addressed by #6445 which also adds an annotation to handle the additional sidecar case.

alpeb added the area/injector label Jun 22, 2021

alpeb requested a review from a team as a code owner June 22, 2021 20:46

Formatting fix

bc4d329

mateiidavid reviewed Jun 23, 2021

View reviewed changes

controller/webhook/server.go Show resolved Hide resolved

pkg/inject/inject.go Outdated Show resolved Hide resolved

Rename addRemovals to addReinvokeremovals, and added some extra logging

db4ad18

dadjeibaah reviewed Jun 23, 2021

View reviewed changes

pkg/healthcheck/sidecar.go Outdated Show resolved Hide resolved

pkg/inject/pod_patch.go Show resolved Hide resolved

Rename has3rdPartySidecars to hasThirdPartySidecars

ca1e22a

adleong reviewed Jun 23, 2021

View reviewed changes

alpeb closed this Jul 9, 2021

kleimkuhler deleted the alpeb/injector-reinvoke-2 branch July 9, 2021 19:40

mateiidavid mentioned this pull request Jan 18, 2022

Remove unused reinvocation test #7630

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for proxy injector reinvocation #6347

Add support for proxy injector reinvocation #6347

alpeb commented Jun 22, 2021

mateiidavid left a comment

dadjeibaah left a comment

codecov-commenter commented Jun 23, 2021 •

edited

alpeb commented Jun 23, 2021

adleong Jun 23, 2021

alpeb Jun 23, 2021

alpeb commented Jun 23, 2021

alpeb commented Jun 25, 2021

adleong commented Jun 25, 2021

alpeb commented Jun 28, 2021

alpeb commented Jun 30, 2021

adleong commented Jun 30, 2021

alpeb commented Jun 30, 2021

adleong commented Jun 30, 2021

alpeb commented Jul 1, 2021

adleong commented Jul 1, 2021

alpeb commented Jul 9, 2021 •

edited

Add support for proxy injector reinvocation #6347

Add support for proxy injector reinvocation #6347

Conversation

alpeb commented Jun 22, 2021

mateiidavid left a comment

Choose a reason for hiding this comment

dadjeibaah left a comment

Choose a reason for hiding this comment

codecov-commenter commented Jun 23, 2021 • edited

Codecov Report

alpeb commented Jun 23, 2021

adleong Jun 23, 2021

Choose a reason for hiding this comment

alpeb Jun 23, 2021

Choose a reason for hiding this comment

alpeb commented Jun 23, 2021

alpeb commented Jun 25, 2021

Move some of the proxy template logic out of the helm template

Use special names for env vars injected by extensions

adleong commented Jun 25, 2021

alpeb commented Jun 28, 2021

alpeb commented Jun 30, 2021

adleong commented Jun 30, 2021

alpeb commented Jun 30, 2021

adleong commented Jun 30, 2021

alpeb commented Jul 1, 2021

adleong commented Jul 1, 2021

alpeb commented Jul 9, 2021 • edited

codecov-commenter commented Jun 23, 2021 •

edited

alpeb commented Jul 9, 2021 •

edited