Proxy init and sidecar containers auto-injection #1714
Conversation
Holy cow, look at those tests! @ihcsim why did you put the inject label on the deployment instead of the pod itself?
I don't remember why I did it that way 😆 I don't see why it needs to be at the deployment level. I had to change the mutating webhook configuration.
Not perfect though, because the webhook gets only as far as generating the JSON patch. Whether the JSON patch can be applied or not is beyond the webhook's scope.
I'd love to see if it works at the pod level; that way we get all the other resource types for free.
@ihcsim Great work! I was able to run this locally using the gist you provided, and it's awesome.
I didn't make it all the way through on this pass, but I left a bunch of miscellaneous comments below. I admittedly don't know a lot about admission controllers, so some of my feedback may not make sense. And I'll spend some more time looking at this over the next day or two.
From an architectural perspective, how would you feel about running the proxy-injector service in a separate proxy-injector pod? I think my preference would be to not modify the containers running in the existing controller pod, since this feature is an add-on. Similarly, I couldn't quite follow the changes to the ca controller. Is it the case that we'll receive a new MutatingWebhookConfiguration event for each new deployment that's created, and we need to watch those events in order to propagate certs to those deployments? And that replaces the Pod watch events, since the pods no longer have injected YAML that tells us they're part of the data plane? If so, it seems like we should be running either the Pod watcher or the MutatingWebhookConfiguration watcher, but not both -- see my comment in that file about gating that behavior behind a flag to the ca server.
controller/k8s/api.go
Outdated
@@ -404,6 +419,29 @@ func (api *API) getServices(namespace, name string) ([]runtime.Object, error) {
return objects, nil
}

func (api *API) getMutatingWebhookConfigurations(name string) ([]runtime.Object, error) {
log.Infof(">>>>> calling getmuttingwebhookconfigurations with s%", name)
Maybe this is leftover debug logging? If you want to keep it, I'd change it to log.Debugf, and fix the issue with string substitution (%s, not s%).
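For instance, the corrected call would read something like (a trivial sketch of the suggested fix):

log.Debugf("calling getMutatingWebhookConfigurations with %s", name)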
// NewClientset returns a new instance of Clientset. It authenticates with the cluster using the service account token mounted inside the webhook pod.
func NewClientset(kubeconfig string) (kubernetes.Interface, error) {
return k8s.NewClientSet(kubeconfig)
This method seems like an alias for k8s.NewClientSet -- why not just call that method directly from proxy-injector/main.go and remove this file?
Yes, this is no longer needed. It was introduced in some of my earlier tests to polymorphically return a fake k8s client. I will remove it.
controller/k8s/api.go
Outdated
@@ -184,6 +197,8 @@ func (api *API) GetObjects(namespace, restype, name string) ([]runtime.Object, e
return api.getRCs(namespace, name)
case k8s.Service:
return api.getServices(namespace, name)
case k8s.MutatingWebhookConfiguration:
return api.getMutatingWebhookConfigurations(name)
I don't think you're ever hitting this code path, since GetObjects is only called by the tap server and the public-api server, and neither of those should be requesting MWCs. I recommend removing these two lines and the getMutatingWebhookConfigurations method below entirely.
👍 It's one of those "added for completeness' sake without knowing what it does" changes. I will remove them.
flag.StringVar(&certFile, "tls-cert-file", "/var/linkerd-io/identity/certificate.crt", "location of the webhook server's TLS cert file")
flag.StringVar(&keyFile, "tls-key-file", "/var/linkerd-io/identity/private-key.p8", "location of the webhook server's TLS private key file")
flag.StringVar(&trustAnchorsPath, "trust-anchors-path", "/var/linkerd-io/trust-anchors/trust-anchors.pem", "path to the CA trust anchors PEM file used to create the mutating webhook configuration")
flag.StringVar(&volumeMountsWaitTime, "volume-mounts-wait", "3m", "maximum wait time for the secret volumes to mount before the timeout expires")
You could use a duration flag here instead of a string flag, which would remove the need to call ParseDuration later in this file. Duration flags will also automatically reject invalid durations.
volumeMountsWaitTime := flag.Duration("volume-mounts-wait", 3*time.Minute, "maximum wait time for the secret volumes to mount before the timeout expires")
flag.StringVar(&volumeMountsWaitTime, "volume-mounts-wait", "3m", "maximum wait time for the secret volumes to mount before the timeout expires")
flag.StringVar(&webhookServiceName, "webhook-service", "proxy-injector.linkerd.io", "name of the admission webhook")
flags.ConfigureAndParse()
}
This is super minor, but the convention in the rest of our controller mains is to define flags as local variables within the main func itself, rather than as global variables that are configured via an init func. So for consistency with those, I'd do something like:
func main() {
port := flag.String("port", "443", "port that this webhook admission server listens on")
kubeconfig := flag.String("kubeconfig", "", "path to kubeconfig")
controllerNamespace := flag.String("controller-namespace", "linkerd", "namespace in which Linkerd is installed")
certFile := flag.String("tls-cert-file", "/var/linkerd-io/identity/certificate.crt", "location of the webhook server's TLS cert file")
keyFile := flag.String("tls-key-file", "/var/linkerd-io/identity/private-key.p8", "location of the webhook server's TLS private key file")
trustAnchorsPath := flag.String("trust-anchors-path", "/var/linkerd-io/trust-anchors/trust-anchors.pem", "path to the CA trust anchors PEM file used to create the mutating webhook configuration")
volumeMountsWaitTime := flag.Duration("volume-mounts-wait", 3*time.Minute, "maximum wait time for the secret volumes to mount before the timeout expires")
webhookServiceName := flag.String("webhook-service", "proxy-injector.linkerd.io", "name of the admission webhook")
flags.ConfigureAndParse()
pkg/k8s/k8s.go
Outdated
@@ -119,6 +122,8 @@ func ShortNameFromCanonicalResourceName(canonicalName string) string {
return "sts"
case Authority:
return "au"
case MutatingWebhookConfiguration:
return "mwc"
Related to my comments in controller/k8s/api.go -- I'm pretty sure you can revert all of the changes to this file. Since we're not exposing resources of type MWC through linkerd stat and linkerd tap, they don't need to be added here.
}
s.SyncAPI(nil)

s.SetLogLevel(log.StandardLogger().Level)
The logrus log level is global, so there's no need to re-set the level each time you import it. And at this point in your main func, the log level was already set by calling flags.ConfigureAndParse(). So I recommend removing this call and all of the SetLogLevel funcs and tests that you've added in the injector package.
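As a quick illustration of the global behavior this comment relies on (a minimal sketch, not code from this branch):

package main

import (
	log "github.com/sirupsen/logrus"
)

func main() {
	// Setting the level once on the standard logger applies process-wide:
	// every package that uses the package-level helpers (log.Debugf,
	// log.Infof, ...) sees the same level, with no per-struct re-setting.
	log.SetLevel(log.DebugLevel)

	log.Debugf("visible because the global level is now debug")
}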
if err != nil {
log.Fatal("failed to initialize the webhook server: ", err)
}
s.SyncAPI(nil)
We should pass in a ready channel to the sync call here so that we know when the caches are synced. That way we can also run an admin server and configure a readiness check that won't succeed until caches are synced.
If you add the following flag above:
metricsAddr := flag.String("metrics-addr", ":9993", "address to serve scrapable metrics on")
Then here you can do:
ready := make(chan struct{})
s.SyncAPI(ready)
go admin.StartServer(*metricsAddr, ready)
controller/proxy-injector/server.go
Outdated
type WebhookServer struct {
*http.Server
*Webhook
*log.Logger
Ahhhh, related to my other comment about logging -- I now see that these structs embed the logger, which is why you're explicitly setting the level. My preference would be to remove the embedding of Logger and rely on the global log instead.
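Roughly, the struct could drop the embedded logger and lean on the package-level helpers (a sketch, not the branch's actual code; the Webhook placeholder stands in for the real type):

package injector

import (
	"net/http"

	log "github.com/sirupsen/logrus"
)

type Webhook struct{} // placeholder for the branch's Webhook type

type WebhookServer struct {
	*http.Server
	*Webhook
}

func (s *WebhookServer) serve() {
	// the package-level logrus helpers pick up the globally configured
	// level, so no embedded *log.Logger or SetLogLevel plumbing is needed
	log.Infof("listening at %s", s.Server.Addr)
	if err := s.ListenAndServeTLS("", ""); err != nil {
		log.Error(err)
	}
}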
controller/ca/controller.go
Outdated
@@ -52,6 +53,13 @@ func NewCertificateController(controllerNamespace string, k8sAPI *k8s.API) (*Cer
},
)

k8sAPI.MWC().Informer().AddEventHandler(
I think we need to gate this behavior behind a flag to the ca service, contingent on whether or not linkerd was installed with --proxy-auto-inject set. Otherwise, all TLS deployments will fire up a watch on MutatingWebhookConfigurations, regardless of whether or not the admission controller is running.
Absolutely. I forgot about the scenario where --tls=optional is enabled but not --proxy-auto-inject.
@klingerf Thanks for the feedback.
No objections. That was my initial plan. I actually misread a comment in the other older PR, thinking that the intent was to run the injector as a container in the controller pod.
The MutatingWebhookConfiguration is created just once, when the injector starts up. It contains information used by the k8s API server to communicate with the admission webhook (which, conceptually, is made up of a TLS server and the actual webhook that does the spec mutation). The configuration has nothing to do with any other new "application" deployments. (Tangent: the only other time the MutatingWebhookConfiguration needs to be re-created is when the CA controller's trust-anchors.pem changes, because the configuration uses trust-anchors.pem as its CA bundle. This is how the k8s API server knows it can trust the cert presented by the webhook TLS server. Currently, I re-create the MutatingWebhookConfiguration manually.)
Nothing about how LD2 is currently doing automatic certs propagation has changed. If it has, it's unintentional and I screwed up! My changes to the CA controller are simply my way of telling the CA controller, "hey, I need a CA cert for my webhook's TLS server; btw, the API server will be talking to my TLS server as https://…" Open to more feedback and suggestions...
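To make the shape of that configuration concrete, here is a rough sketch of how such a MutatingWebhookConfiguration could be built with client-go, using the trust anchors as the CA bundle. The names, rules, and field values below are illustrative assumptions, not necessarily what this branch does:

package injector

import (
	arv1beta1 "k8s.io/api/admissionregistration/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func newWebhookConfig(trustAnchorsPEM []byte, namespace string) *arv1beta1.MutatingWebhookConfiguration {
	return &arv1beta1.MutatingWebhookConfiguration{
		ObjectMeta: metav1.ObjectMeta{Name: "linkerd-proxy-injector-webhook-config"},
		Webhooks: []arv1beta1.Webhook{{
			Name: "proxy-injector.linkerd.io",
			ClientConfig: arv1beta1.WebhookClientConfig{
				Service: &arv1beta1.ServiceReference{
					Name:      "proxy-injector",
					Namespace: namespace,
				},
				// the k8s API server verifies the cert presented by the
				// webhook's TLS server against this bundle
				CABundle: trustAnchorsPEM,
			},
			Rules: []arv1beta1.RuleWithOperations{{
				Operations: []arv1beta1.OperationType{arv1beta1.Create},
				Rule: arv1beta1.Rule{
					APIGroups:   []string{"extensions", "apps"},
					APIVersions: []string{"v1beta1"},
					Resources:   []string{"deployments"},
				},
			}},
		}},
	}
}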
@ihcsim Ok great, thanks for the additional feedback! I definitely wasn't understanding the intent of the ca change, but now I get it. I think the watch approach that you've taken works, but see my comment below.
Also, I haven't fully thought this through, but if we wanted to avoid setting up a watch on all MWCs and requiring the additional RBAC privileges, it seems possible that the ca service could provide an endpoint that the proxy-injector service could call on startup, prompting the ca service to create the secret on demand. And maybe that could be generalized for other use cases in the future... although I can't think of one off the top of my head.
@klingerf To clarify, the intent is to move the injector completely out of the control plane, right? So I'll have to roll a new Docker image for the injector, update the build scripts and move the code to the project root level?
@ihcsim Thanks for checking -- I just meant a separate pod definition in the install YAML. It's no problem for the pod to use the same controller image and rely on the existing build system. But we should add another service account for the new pod with the proper RBAC permissions. I think something like this in template.go would work:
diff --git a/cli/install/template.go b/cli/install/template.go
index e4956792..4bcefe55 100644
--- a/cli/install/template.go
+++ b/cli/install/template.go
@@ -229,7 +229,40 @@ spec:
path: /ready
port: 9998
failureThreshold: 7
- {{if and .EnableTLS .ProxyAutoInjectEnabled }}
+
+{{- if and .EnableTLS .ProxyAutoInjectEnabled }}
+### Service Account Proxy Injector ###
+---
+kind: ServiceAccount
+apiVersion: v1
+metadata:
+ name: linkerd-proxy-injector
+ namespace: {{.Namespace}}
+
+### Controller RBAC ###
+... RBAC for injector here
+
+---
+kind: Deployment
+apiVersion: extensions/v1beta1
+metadata:
+ name: proxy-injector
+ namespace: {{.Namespace}}
+ labels:
+ {{.ControllerComponentLabel}}: proxy-injector
+ annotations:
+ {{.CreatedByAnnotation}}: {{.CliVersion}}
+spec:
+ replicas: {{.ControllerReplicas}}
+ template:
+ metadata:
+ labels:
+ {{.ControllerComponentLabel}}: proxy-injector
+ annotations:
+ {{.CreatedByAnnotation}}: {{.CliVersion}}
+ spec:
+ serviceAccount: linkerd-proxy-injector
+ containers:
- name: proxy-injector
image: {{.ControllerImage}}
imagePullPolicy: {{.ImagePullPolicy}}
@@ -255,7 +288,6 @@ spec:
secret:
secretName: {{.ProxyInjectorTLSSecret}}
optional: true
- {{ end }}
### Web ###
---
And thanks for addressing my other feedback! I'm going to be offline for a few days, but will re-review as soon as I get back.
@klingerf I have moved the proxy-injector into a separate deployment in the control plane. The interesting thing now is that I see two TLS secrets being generated.
As previously mentioned, the … This is how the k8s API server communicates with the webhook. My preference is that …
Will chat more when you're back.
@ihcsim Thanks for making those updates! This is looking really good. I unfortunately didn't get all the way through reviewing again today, but I wanted to send a couple of comments before I come back to it tomorrow.
cli/install/template.go
Outdated
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: linkerd-{{.Namespace}}-controller-proxy-injector
This is super minor, but the convention for the rest of our cluster roles in this file is to use the name linkerd-{{.Namespace}}- + deployment name. And since the deployment name is "proxy-injector", I'd prefer for this role name to be:
name: linkerd-{{.Namespace}}-proxy-injector
Got it. Yeah, I had to make some assumptions about the naming convention.
cli/install/template.go
Outdated
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: linkerd-{{.Namespace}}-controller-proxy-injector
Same comment here as above:
name: linkerd-{{.Namespace}}-proxy-injector
cli/install/template.go
Outdated
apiGroup: ""
roleRef:
kind: ClusterRole
name: linkerd-{{.Namespace}}-controller-proxy-injector
And here:
name: linkerd-{{.Namespace}}-proxy-injector
if !exist {
log.Info("creating mutating webhook configuration")
webhookConfig, err := webhookConfig.Create()
In the event that the webhook config exists, how would you feel about updating it every time the proxy-injector process starts? I followed your gist and ran into this issue:
2018/10/01 04:09:54 http: TLS handshake error from 172.17.0.4:55728: remote error: tls: bad certificate
I think a possible workaround for that would be to force update the mwc each time this process starts. Something along the lines of:
diff --git a/controller/cmd/proxy-injector/main.go b/controller/cmd/proxy-injector/main.go
index a29310e6..342f3f75 100644
--- a/controller/cmd/proxy-injector/main.go
+++ b/controller/cmd/proxy-injector/main.go
@@ -44,19 +44,11 @@ func main() {
webhookConfig := injector.NewWebhookConfig(k8sClient, *controllerNamespace, *webhookServiceName, *trustAnchorsPath)
- exist, err := webhookConfig.Exist()
+ obj, err := webhookConfig.CreateOrUpdate()
if err != nil {
- log.Fatalf("failed to access the mutating webhook configurations resource: ", err)
- }
-
- if !exist {
- log.Info("creating mutating webhook configuration")
- webhookConfig, err := webhookConfig.Create()
- if err != nil {
- log.Fatal("faild to create the mutating webhook configuration: ", err)
- }
- log.Info("created mutating webhook configuration: ", webhookConfig.ObjectMeta.SelfLink)
+ log.Fatalf("failed to create or update mutating webhook configuration: %s", err)
}
+ log.Infof("created or updated mutating webhook configuration: ", obj.ObjectMeta.SelfLink)
log.Infof("waiting for the tls secrets to mount (wait at most %s)", volumeMountsWaitTime)
if err := waitForMounts(*volumeMountsWaitTime, *certFile, *keyFile); err != context.Canceled {
Yup, that will work. There is still a chance that the MWC's CA bundle will go out-of-sync with the proxy-injector's TLS secret, triggering the TLS handshake error. At least, with your suggested approach, we know the workaround to the TLS error is to delete the proxy-injector pod, and let it update the MWC's CA bundle 👍
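For reference, one plausible shape for the CreateOrUpdate helper assumed in the diff above -- a sketch assuming a plain clientset and the era's context-free client-go signatures; the function and parameter names are illustrative:

package injector

import (
	arv1beta1 "k8s.io/api/admissionregistration/v1beta1"
	kerrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createOrUpdateMWC creates the mutating webhook configuration if it's
// missing, and overwrites it otherwise, so that the CA bundle is
// refreshed on every start-up.
func createOrUpdateMWC(client kubernetes.Interface, desired *arv1beta1.MutatingWebhookConfiguration) (*arv1beta1.MutatingWebhookConfiguration, error) {
	mwcs := client.AdmissionregistrationV1beta1().MutatingWebhookConfigurations()

	existing, err := mwcs.Get(desired.Name, metav1.GetOptions{})
	if kerrors.IsNotFound(err) {
		return mwcs.Create(desired)
	}
	if err != nil {
		return nil, err
	}

	// updates must carry the current resource version
	desired.ResourceVersion = existing.ResourceVersion
	return mwcs.Update(desired)
}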
log.Info("creating mutating webhook configuration")
webhookConfig, err := webhookConfig.Create()
if err != nil {
log.Fatal("faild to create the mutating webhook configuration: ", err)
Typo here: faild => failed
controller/ca/controller.go
Outdated
mwc := obj.(*v1beta1.MutatingWebhookConfiguration)
log.Debugf("enqueuing secret write for mutating webhook configuration %q", mwc.ObjectMeta.Name)
for _, webhook := range mwc.Webhooks {
c.queue.Add(fmt.Sprintf("%s.%s.%s", webhook.ClientConfig.Service.Name, pkgK8s.Service, webhook.ClientConfig.Service.Namespace))
Related to my previous comment, I think we still need to check the name of the MWC here to make sure that the update pertains to the MWC that is being created by the proxy-injector process:
diff --git a/controller/ca/controller.go b/controller/ca/controller.go
index 34514f95..223c5ff9 100644
--- a/controller/ca/controller.go
+++ b/controller/ca/controller.go
@@ -186,9 +186,11 @@ func (c *CertificateController) handlePodUpdate(oldObj, newObj interface{}) {
func (c *CertificateController) handleMWCAdd(obj interface{}) {
mwc := obj.(*v1beta1.MutatingWebhookConfiguration)
- log.Debugf("enqueuing secret write for mutating webhook configuration %q", mwc.ObjectMeta.Name)
- for _, webhook := range mwc.Webhooks {
- c.queue.Add(fmt.Sprintf("%s.%s.%s", webhook.ClientConfig.Service.Name, pkgK8s.Service, webhook.ClientConfig.Service.Namespace))
+ if mwc.Name == pkgK8s.ProxyInjectorWebhookConfig {
+ log.Debugf("enqueuing secret write for mutating webhook configuration %q", mwc.ObjectMeta.Name)
+ for _, webhook := range mwc.Webhooks {
+ c.queue.Add(fmt.Sprintf("%s.%s.%s", webhook.ClientConfig.Service.Name, pkgK8s.Service, webhook.ClientConfig.Service.Namespace))
+ }
}
}
@ihcsim The latest set of updates looks great. And I had a chance to review the rest of the code in the injector package, and left some comments. I'm pretty sure we can run the proxy-injector process with fewer RBAC permissions, but would be interested to hear your feedback on it.
controller/cmd/ca/main.go
Outdated
@@ -30,9 +31,10 @@ func main() {
k8sClient,
k8s.Pod,
k8s.RS,
k8s.MWC,
Passing this as an argument to NewAPI will cause the watch on MWCs to be established automatically, so you only want to provide the MWC argument when proxyAutoInject is true. Something like:
var k8sAPI *k8s.API
if *proxyAutoInject {
k8sAPI = k8s.NewAPI(k8sClient, k8s.Pod, k8s.RS, k8s.MWC)
} else {
k8sAPI = k8s.NewAPI(k8sClient, k8s.Pod, k8s.RS)
}
controller/proxy-injector/webhook.go
Outdated
}, nil
}

// Mutate changes the given pod spec by injecting the proxy sidecar container into the spec. The admission review object returned contains the original request and the response with the mutated pod spec.
This is minor, but can I bug you to wrap all of your comments at 80 characters?
controller/proxy-injector/webhook.go
Outdated
fallthrough
case k8sPkg.ProxyAutoInjectCompleted:
return true
}
I think this would be clearer as:
switch status {
case k8sPkg.ProxyAutoInjectDisabled, k8sPkg.ProxyAutoInjectCompleted:
return true
}
controller/proxy-injector/webhook.go
Outdated
}

// check for known proxies and initContainers
// same logic as the checkSidecars() function in cli/cmd/inject.go
We should move checkSidecars to a public HasExistingSidecars function in a separate package that can be shared between the proxy-injector and the cli. How about creating a new file at pkg/healthcheck/sidecar.go? It's at least loosely related to health checking. And then here:
return healthcheck.HasExistingSidecars(&deployment.Spec.Template.Spec)
Sounds good.
Yeah, I was hesitant to change the CLI code too much, as I wasn't sure whether those changes would be necessary/accepted etc. So I decided to let you bring them up :-)
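A minimal sketch of what that shared helper could look like -- the authoritative set of container names lives in checkSidecars() in cli/cmd/inject.go, so the names below are only illustrative:

package healthcheck

import corev1 "k8s.io/api/core/v1"

// HasExistingSidecars reports whether the pod spec already carries a
// known proxy or proxy-init container, so that both the CLI inject and
// the webhook can skip re-injecting it.
func HasExistingSidecars(spec *corev1.PodSpec) bool {
	knownProxies := map[string]bool{
		"linkerd-proxy": true,
		"istio-proxy":   true,
	}
	knownInits := map[string]bool{
		"linkerd-init": true,
		"istio-init":   true,
	}

	for _, c := range spec.Containers {
		if knownProxies[c.Name] {
			return true
		}
	}
	for _, c := range spec.InitContainers {
		if knownInits[c.Name] {
			return true
		}
	}
	return false
}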
controller/proxy-injector/webhook.go
Outdated
}

func (w *Webhook) containersSpec(identity *k8sPkg.TLSIdentity) (*corev1.Container, *corev1.Container, error) {
configMap, err := w.k8sAPI.CM().Lister().ConfigMaps(identity.ControllerNamespace).Get(k8sPkg.ProxyInjectorSidecarConfig)
Rather than reading this configmap from the API each time we need to inject, couldn't we volume mount it into the proxy-injector container and read it from disk when the process starts up? That would remove the need to watch all configmaps in all namespaces, and it would allow us to scope the RBAC permissions for this process more tightly.
Love it! Must admit that I got a bit API happy there.
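A sketch of the volume-mount approach, assuming the sidecar configmap is mounted into the proxy-injector pod -- the mount path, file name, and function name here are illustrative:

package injector

import (
	"io/ioutil"

	"github.com/ghodss/yaml"
	corev1 "k8s.io/api/core/v1"
)

// loadContainerSpec reads a container spec from a file that was
// volume-mounted from the sidecar configmap, e.g.
// /var/linkerd-io/config/proxy.yaml, once at start-up, instead of
// fetching the configmap from the API on every injection.
func loadContainerSpec(path string) (*corev1.Container, error) {
	data, err := ioutil.ReadFile(path)
	if err != nil {
		return nil, err
	}

	var container corev1.Container
	if err := yaml.Unmarshal(data, &container); err != nil {
		return nil, err
	}
	return &container, nil
}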
controller/proxy-injector/webhook.go
Outdated
}
log.Infof("resource namespace: %s", ns)

namespace, err := w.k8sAPI.NS().Lister().Get(ns)
You're using the Kubernetes API here to retrieve the full namespace object given the namespace name, but the only field you're ever accessing on the object is the name, on line 125. So it seems like you don't need to use the Kubernetes API at all for this. I'd change as follows:
@@ -106,11 +107,6 @@ func (w *Webhook) inject(request *admissionv1beta1.AdmissionRequest) (*admission
}
log.Infof("resource namespace: %s", ns)
- namespace, err := w.k8sAPI.NS().Lister().Get(ns)
- if err != nil {
- return nil, err
- }
-
if w.ignore(&deployment) {
log.Infof("ignoring deployment %s", deployment.ObjectMeta.Name)
return &admissionv1beta1.AdmissionResponse{
@@ -122,7 +118,7 @@ func (w *Webhook) inject(request *admissionv1beta1.AdmissionRequest) (*admission
identity := &k8sPkg.TLSIdentity{
Name: deployment.ObjectMeta.Name,
Kind: strings.ToLower(request.Kind.Kind),
- Namespace: namespace.ObjectMeta.GetName(),
+ Namespace: ns,
ControllerNamespace: w.controllerNamespace,
}
proxy, proxyInit, err := w.containersSpec(identity)
And that would allow you to remove the "namespaces" RBAC privileges for this process.
controller/proxy-injector/webhook.go
Outdated
}

func (w *Webhook) volumesSpec(identity *k8sPkg.TLSIdentity) (*corev1.Volume, *corev1.Volume, error) {
configMap, err := w.k8sAPI.CM().Lister().ConfigMaps(identity.ControllerNamespace).Get(k8sPkg.ProxyInjectorSidecarConfig)
Same comment here as above -- since this process is already running in the controller namespace, it should be able to volume mount the ProxyInjectorSidecarConfig configmap and read it on startup, rather than needing to retrieve it from the API.
controller/proxy-injector/webhook.go
Outdated
return &Webhook{
deserializer: codecs.UniversalDeserializer(),
controllerNamespace: controllerNamespace,
k8sAPI: k8s.NewAPI(client, k8s.NS, k8s.CM),
I think if you make the configmap and namespace changes that I suggest in this file, then this process should no longer need access to the Kubernetes API, and you can remove this field.
verbs: ["create", "get", "watch"]
- apiGroups: [""]
resources: ["configmaps", "namespaces"]
verbs: ["get", "list", "watch"]
If you make the changes I suggest in webhook.go, I'm pretty sure you can remove the "configmaps" and "namespaces" RBAC rules here.
rules:
- apiGroups: ["admissionregistration.k8s.io"]
resources: ["mutatingwebhookconfigurations"]
verbs: ["create", "get", "watch"]
Based on the create or update changes that you made in your last commit, I think the new set of MWC verbs you're using is:
verbs: ["get", "create", "update"]
But I haven't tested those yet.
Ooopss... rebase fail.... works on my machine ™️
@ihcsim Thanks for making those updates! This branch is looking really really good. I left just a few additional small comments, mostly around the tests (yay tests btw!).
Once the comments are addressed I think it needs one more master rebase to resolve conflicts from #1731, but after that it should hopefully be good to go.
If the code looks good, I can help put together some documentation for the website; probably gonna re-use the examples in my gist with less text.
Updating the docs would be totally awesome. I've been using the gist you provided, and it's super helpful.
Thanks again for bearing with me and addressing all of my feedback on this branch. I'm excited to see this feature added!
}

// NewFakeClient returns a fake Kubernetes clientset.
func NewClient(kubeconfig string) (kubernetes.Interface, error) {
The function name and the comment here don't match -- the comment should be renamed from NewFakeClient to NewClient.
path: /metrics
port: 4191
initialDelaySeconds: 10
name: linkerd-proxy
I don't think we should do this as part of this pull request, but I wanted to note it while I'm thinking about it.
I'm worried that over time the inject YAML in this configmap will start to drift from the YAML that's produced by inject.go. I think a more future proof solution would be to somehow use the inject.go codepath to produce the YAML that goes into this configmap, rather than duplicating the logic from inject.go inside this template.
But like I said, that feels like something that can be done in a follow-up pull request.
Agreed. As a matter of fact, I took that approach when I first started this POC, where I tried to extract out the code in injectPodSpec() to be shared between install and inject. But the scope of the changes to the code and tests went beyond my comfort level. We can open a task for the follow-up PR.
}

// HTTPRequestBody returns the content of the specified file as a slice of
// bytes. If the file doesn't exist in the 'test/data' folder, an error will be
I think it should say 'fake/data' instead of a 'test/data' for the folder path in this comment. Same goes for the rest of the comments in this file.
// AdmissionReview type. An error will be returned if:
// i. the file doesn't exist in the 'test/data' folder or,
// ii. the file content isn't a valid JSON structure that can be unmarshalled
// into AdmissionReview type
This factory func uses yaml.Unmarshal, but the comment says that it requires valid JSON. I'd just change the comment to s/JSON/YAML/. Same goes for the rest of the factory funcs in this file.
t.Fatal("Unexpected error: ", err)
}

if !reflect.DeepEqual(actual, expected) {
It feels a bit brittle to compare JSON byte slices for an exact match, since two JSON objects could be effectively equal but with different key ordering. I'd be inclined to just remove the JSON marshaling from this test, and compare the object slices directly. Something like:
diff --git a/controller/proxy-injector/patch_test.go b/controller/proxy-injector/patch_test.go
index 5a47c22a..40026bd5 100644
--- a/controller/proxy-injector/patch_test.go
+++ b/controller/proxy-injector/patch_test.go
@@ -1,7 +1,6 @@
package injector
import (
- "encoding/json"
"reflect"
"testing"
@@ -53,11 +52,6 @@ func TestPatch(t *testing.T) {
k8sPkg.CreatedByAnnotation: createdBy,
})
- actual, err := json.Marshal(patch.patchOps)
- if err != nil {
- t.Fatal("Unexpected error: ", err)
- }
-
expectedOps := []*patchOp{
&patchOp{Op: "add", Path: patchPathContainer, Value: sidecar},
&patchOp{Op: "add", Path: patchPathInitContainerRoot, Value: []*v1.Container{}},
@@ -71,12 +65,8 @@ func TestPatch(t *testing.T) {
}},
&patchOp{Op: "add", Path: patchPathPodAnnotation, Value: map[string]string{k8sPkg.CreatedByAnnotation: createdBy}},
}
- expected, err := json.Marshal(expectedOps)
- if err != nil {
- t.Fatal("Unexpected error: ", err)
- }
- if !reflect.DeepEqual(actual, expected) {
- t.Errorf("Content mismatch\nExpected: %s\nActual: %s", expected, actual)
+ if !reflect.DeepEqual(expectedOps, patch.patchOps) {
+ t.Errorf("Content mismatch\nExpected: %s\nActual: %s", expectedOps, patch.patchOps)
}
}
func main() {
metricsAddr := flag.String("metrics-addr", ":9995", "address to serve scrapable metrics on")
port := flag.String("port", "443", "port that this webhook admission server listens on")
Super minor, but the rest of our controller mains accept an -addr flag instead of a -port flag. For consistency with those mains, I'd change this to:
addr := flag.String("addr", ":443", "address to serve on")
And then update NewWebhookServer accordingly.
"k8s.io/client-go/kubernetes" | ||
) | ||
|
||
const yamlIndent = 6 |
Looks like this const is unused -- let's remove it.
log.SetOutput(ioutil.Discard)
}

func TestMutate(t *testing.T) {
Totally optional, but you could condense this test a bit by using a table-driven test like you're doing in TestIgnore below.
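For instance, the general table-driven shape would be something like this -- the case names and fixture files are placeholders, not this branch's actual test data:

package injector

import "testing"

func TestMutate(t *testing.T) {
	testCases := []struct {
		name        string
		fixture     string
		expectPatch bool
	}{
		{name: "injectable deployment", fixture: "deployment-injectable.yaml", expectPatch: true},
		{name: "already injected", fixture: "deployment-injected.yaml", expectPatch: false},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			// load tc.fixture, run the mutation, and assert on the
			// presence or absence of the JSON patch here
		})
	}
}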
1. Add proxy-injector deployment spec to cli/install/template.go
2. Inject the Linkerd CA bundle into the MutatingWebhookConfiguration during the webhook's start-up process.
3. Add a new handler to the CA controller to create a new secret for the webhook when a new MutatingWebhookConfiguration is created.
4. Declare a config map to store the proxy and proxy-init container specs used during the auto-inject process.
5. Ignore namespaces and pods that are labeled with linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed.
6. Add a new flag to `linkerd install` to enable/disable proxy auto-injection.

Proposed implementation for #561.

Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
This ensures the webhook doesn't error out due to proxies that are injected using the command.

Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Since we started using healthcheck.HasExistingSidecars() to ensure pods with existing proxies aren't mutated, we don't need to use the auto-inject label as an indicator. This resolves a bug which happens with the kubectl run command, where the deployment is also assigned the auto-inject label. The mutation causes the pod auto-inject label to not match the deployment label, causing kubectl run to fail.

Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
Signed-off-by: ihcsim <ihcsim@gmail.com>
@klingerf I have updated all the tests per your feedback and rebased this branch with master. The config map is also updated to include the recent proxy resource requests options. Appreciate all your feedback!
@ihcsim Ok, great, thanks for making those updates. I opened #1748 to track using one code path for CLI inject and admission controller inject. And to that end, I did a bit more rigorous testing of all of the CLI options for configuring admission controller inject, and found a few discrepancies. They're called out in my comments below. Once these are fixed this branch should be good to go! Thanks again for bearing with me through all of this feedback.
cli/cmd/install.go
Outdated
InboundPort uint
OutboundPort uint
IgnoreInboundPorts []uint
IgnoreOutboundPorts []uint
IgnoreInboundPorts and IgnoreOutboundPorts need to be formatted as a list of comma-separated ints, but using a slice means that we'll end up printing [ and ] characters around the ports in the YAML, which won't work.
Here's one way to go about fixing it, but would also be open to other implementations:
diff --git a/cli/cmd/install.go b/cli/cmd/install.go
index b344441f..0df9c4fd 100644
--- a/cli/cmd/install.go
+++ b/cli/cmd/install.go
@@ -6,6 +6,7 @@ import (
"io"
"io/ioutil"
"os"
+ "strings"
"text/template"
"github.com/linkerd/linkerd2/cli/install"
@@ -41,8 +42,8 @@ type installConfig struct {
TLSIdentityVolumeSpecFileName string
InboundPort uint
OutboundPort uint
- IgnoreInboundPorts []uint
- IgnoreOutboundPorts []uint
+ IgnoreInboundPorts string
+ IgnoreOutboundPorts string
ProxyAutoInjectEnabled bool
ProxyAutoInjectLabel string
ProxyUID int64
@@ -111,6 +112,17 @@ func validateAndBuildConfig(options *installOptions) (*installConfig, error) {
if err := validate(options); err != nil {
return nil, err
}
+ ignoreInboundPorts := []string{
+ fmt.Sprintf("%d", options.proxyControlPort),
+ fmt.Sprintf("%d", options.proxyMetricsPort),
+ }
+ for _, p := range options.ignoreInboundPorts {
+ ignoreInboundPorts = append(ignoreInboundPorts, fmt.Sprintf("%d", p))
+ }
+ ignoreOutboundPorts := []string{}
+ for _, p := range options.ignoreOutboundPorts {
+ ignoreOutboundPorts = append(ignoreOutboundPorts, fmt.Sprintf("%d", p))
+ }
return &installConfig{
Namespace: controlPlaneNamespace,
ControllerImage: fmt.Sprintf("%s/controller:%s", options.dockerRegistry, options.linkerdVersion),
@@ -137,8 +149,8 @@ func validateAndBuildConfig(options *installOptions) (*installConfig, error) {
TLSIdentityVolumeSpecFileName: k8s.TLSIdentityVolumeSpecFileName,
InboundPort: options.inboundPort,
OutboundPort: options.outboundPort,
- IgnoreInboundPorts: options.ignoreInboundPorts,
- IgnoreOutboundPorts: options.ignoreOutboundPorts,
+ IgnoreInboundPorts: strings.Join(ignoreInboundPorts, ","),
+ IgnoreOutboundPorts: strings.Join(ignoreOutboundPorts, ","),
ProxyAutoInjectEnabled: options.proxyAutoInject,
ProxyAutoInjectLabel: k8s.ProxyAutoInjectLabel,
ProxyUID: options.proxyUID,
diff --git a/cli/cmd/install_test.go b/cli/cmd/install_test.go
index 9823e27f..4e8ed7bf 100644
--- a/cli/cmd/install_test.go
+++ b/cli/cmd/install_test.go
@@ -57,8 +57,8 @@ func TestRender(t *testing.T) {
ProxyInjectorSidecarConfig: "ProxyInjectorSidecarConfig",
ProxySpecFileName: "ProxySpecFileName",
ProxyInitSpecFileName: "ProxyInitSpecFileName",
- IgnoreInboundPorts: []uint{1, 2, 3},
- IgnoreOutboundPorts: []uint{2, 3, 4},
+ IgnoreInboundPorts: "4190,4191,1,2,3",
+ IgnoreOutboundPorts: "2,3,4",
ProxyResourceRequestCPU: "RequestCPU",
ProxyResourceRequestMemory: "RequestMemory",
}
diff --git a/cli/cmd/testdata/install_output.golden b/cli/cmd/testdata/install_output.golden
index 51e7b5ad..91c3fa5d 100644
--- a/cli/cmd/testdata/install_output.golden
+++ b/cli/cmd/testdata/install_output.golden
@@ -1226,9 +1226,9 @@ data:
- --proxy-uid
- 2102
- --inbound-ports-to-ignore
- - 4190,4191,[1 2 3]
+ - 4190,4191,1,2,3
- --outbound-ports-to-ignore
- - [2 3 4]
+ - 2,3,4
image: ProxyInitImage
imagePullPolicy: IfNotPresent
name: linkerd-init
diff --git a/cli/install/template.go b/cli/install/template.go
index 9b25ecaa..99e6e3b1 100644
--- a/cli/install/template.go
+++ b/cli/install/template.go
@@ -822,9 +822,11 @@ data:
- {{.OutboundPort}}
- --proxy-uid
- {{.ProxyUID}}
+ {{- if ne (len .IgnoreInboundPorts) 0}}
- --inbound-ports-to-ignore
- - {{.ProxyControlPort}},{{.ProxyMetricsPort}}{{ if .IgnoreInboundPorts }},{{.IgnoreInboundPorts}}{{end}}
- {{- if .IgnoreOutboundPorts}}
+ - {{.IgnoreInboundPorts}}
+ {{- end}}
+ {{- if ne (len .IgnoreOutboundPorts) 0}}
- --outbound-ports-to-ignore
- {{.IgnoreOutboundPorts}}
{{- end}}
- --inbound-ports-to-ignore
- 4190,4191,[1 2 3]
- --outbound-ports-to-ignore
- [2 3 4]
The brackets here for inbound and outbound ports to ignore aren't going to be parseable by the proxy-init go process, unfortunately. I think this needs to be:
- --inbound-ports-to-ignore
- 4190,4191,1,2,3
- --outbound-ports-to-ignore
- 2,3,4
cli/install/template.go
Outdated
- name: LINKERD2_PROXY_LOG
value: warn,linkerd2_proxy=info
- name: LINKERD2_PROXY_BIND_TIMEOUT
value: 10s
This timeout is configurable via the --proxy-bind-timeout flag, so we need to reflect the value of that flag when rendering this template. Something like:
diff --git a/cli/cmd/install.go b/cli/cmd/install.go
index 0df9c4fd..b77f2c77 100644
--- a/cli/cmd/install.go
+++ b/cli/cmd/install.go
@@ -57,6 +57,7 @@ type installConfig struct {
ProxyImage string
ProxyResourceRequestCPU string
ProxyResourceRequestMemory string
+ ProxyBindTimeout string
}
type installOptions struct {
@@ -164,6 +165,7 @@ func validateAndBuildConfig(options *installOptions) (*installConfig, error) {
ProxyImage: options.taggedProxyImage(),
ProxyResourceRequestCPU: options.proxyCpuRequest,
ProxyResourceRequestMemory: options.proxyMemoryRequest,
+ ProxyBindTimeout: options.proxyBindTimeout,
}, nil
}
diff --git a/cli/cmd/install_test.go b/cli/cmd/install_test.go
index 4e8ed7bf..9c9ead1c 100644
--- a/cli/cmd/install_test.go
+++ b/cli/cmd/install_test.go
@@ -61,6 +61,7 @@ func TestRender(t *testing.T) {
IgnoreOutboundPorts: "2,3,4",
ProxyResourceRequestCPU: "RequestCPU",
ProxyResourceRequestMemory: "RequestMemory",
+ ProxyBindTimeout: "1m",
}
testCases := []struct {
diff --git a/cli/cmd/testdata/install_output.golden b/cli/cmd/testdata/install_output.golden
index 91c3fa5d..537a6e23 100644
--- a/cli/cmd/testdata/install_output.golden
+++ b/cli/cmd/testdata/install_output.golden
@@ -1243,7 +1243,7 @@ data:
- name: LINKERD2_PROXY_LOG
value: warn,linkerd2_proxy=info
- name: LINKERD2_PROXY_BIND_TIMEOUT
- value: 10s
+ value: 1m
- name: LINKERD2_PROXY_CONTROL_URL
value: tcp://proxy-api.Namespace.svc.cluster.local:123
- name: LINKERD2_PROXY_CONTROL_LISTENER
diff --git a/cli/install/template.go b/cli/install/template.go
index 99e6e3b1..93712f3d 100644
--- a/cli/install/template.go
+++ b/cli/install/template.go
@@ -844,7 +844,7 @@ data:
- name: LINKERD2_PROXY_LOG
value: warn,linkerd2_proxy=info
- name: LINKERD2_PROXY_BIND_TIMEOUT
- value: 10s
+ value: {{.ProxyBindTimeout}}
- name: LINKERD2_PROXY_CONTROL_URL
value: tcp://proxy-api.{{.Namespace}}.svc.cluster.local:{{.ProxyAPIPort}}
- name: LINKERD2_PROXY_CONTROL_LISTENER
pkg/k8s/labels.go
Outdated
// ProxyInjectorWebhookConfig is the name of the mutating webhook
// configuration resource of the proxy-injector webhook.
ProxyInjectorWebhookConfig = "proxy-injector-webhook-config"
I missed this earlier, but my preference is to call this "linkerd-proxy-injector-webhook-config", since MWCs are not namespaced. Prefixing "linkerd-" is the approach that we take for ClusterRoles and ClusterRoleBindings, which are also not namespaced, so that cluster admins can easily identify that they were created by the linkerd install.
The ignore inbound and outbound ports are changed to string type to avoid broken YAML caused by the string conversion in the uint slice. Also, parameterized the proxy bind timeout option in template.go. Renamed the sidecar config map to 'linkerd-proxy-injector-webhook-config'.

Signed-off-by: ihcsim <ihcsim@gmail.com>
@klingerf Done! We'll probably also need another issue to extend the webhook to support other workloads like statefulsets, daemonsets and pods. WDYT?
⭐️ 🎉 Nice work!! Thanks for addressing all of my feedback. This looks good to me. I opened #1751 to track adding support for non-deployment workloads.
Thanks for the great work @ihcsim
Per discussion with @grampelberg, here's a PR that contains a proposed implementation for the proxy init and sidecar containers auto-injection feature.
Example usage can be found in this gist.
The following is a summary of the changes in this PR:
Control Plane

- Added a new proxy-injector pod to the control plane.
- proxy-injector creates the webhook's mutating webhook configuration by using the CA controller's trust anchor as its CA bundle.
- The webhook ignores namespaces that are labeled with linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed.
- The webhook is triggered by CREATE events of Deployments.
- The webhook ignores any Deployment with pods that are labelled with linkerd.io/auto-inject: disabled or linkerd.io/auto-inject: completed.
- A new secret named proxy-injector-service-tls-linkerd-io is created by the CA controller to enable TLS communication between the API server and the webhook.

CLI

- Updated cli/install/template.go with new Deployment, Service, RBAC etc. specs.
- Made the inject proxy options available to the install command, because cli/install/template.go now contains proxy container specs.
- Added a new install option to enable or disable proxy auto-injection. Auto-injection takes effect only when both the --tls=optional and proxy-auto-inject options are set.

CA Controller

- The Service name defined in the webhook configuration will be used as the TLS identity.

RBAC

- The ca controller's cluster role can now get, list and watch mutating webhook configurations.
- The proxy-injector's cluster role can create, get, list and watch mutating webhook configurations. This cluster role also has permission to get, list and watch config maps. (I tried to limit the config map permissions to namespace-scoped, but I think the usage of informers.Lister().Get() requires global scope?)

(I can submit a PR for usage documentation once this PR is reviewed.)