Update Helm chart to include pod mutating webhook for readiness gates #612
I have some additional work coming, so I'm converting this back to draft while I finalize.
…ipts and env variable for enabling
webhook-e2e-test-namespace := "webhook-e2e-test"
The webhook tests don't actually use this namespace. Instead they create new ones from within the tests.
@@ -0,0 +1,62 @@
{{ $tls := fromYaml ( include "aws-gateway-controller.webhookTLS" . ) }}
I learned through this process that the include here is evaluated each time it is invoked, not once for the overall helm install. So, for example, if the same include appears in multiple files, each file generates its own certificate data and the values will not match across files.
-p="[{'op': 'replace', 'path': '/webhooks/0/clientConfig/caBundle', 'value': '${CERT_B64}'}]"

rm $TEMP_KEY $TEMP_CERT
echo "Done"
Example output:
scripts/gen-webhook-secret.sh
Generating certificate for webhook
..+++++++++++++++++++++++++++++++++++++++*.+...............+.+......+...+..+.+............+++++++++++++++++++++++++++++++++++++++*...+......+................+........+......+.+.....+.+...........................+.....+......+....+...+..............+...+......+.+......+.........+.....+.+..................+..+...+....+......+......+...+.....+...+....+...+......+....................+.+.....+......+.......+..++++++
...........+.+.....+.........+......+.......+...............+..+......+....+........+++++++++++++++++++++++++++++++++++++++*....+...+.+......+..............+.+...........+.......+.....+.......+.....+...+.+..+...+.+.....+.+.....+....+......+...+.....+++++++++++++++++++++++++++++++++++++++*...+.+.....+.........+.+......+...+......................................+...+.+......+...............+.....+.........+....+..+.............+..+...+.+.....+.+...+....................+......+...+......+.+.....+......+.......+...+..+...+..................+.+...+.....+.+.........+..+...+.............+..+...............+.............+...+...........+...+..........++++++
-----
Recreating webhook secret
secret "webhook-cert" deleted
secret/webhook-cert created
Patching webhook with new cert
mutatingwebhookconfiguration.admissionregistration.k8s.io/aws-appnet-gwc-mutating-webhook patched
Done
Just to double-check: does gen-webhook-secret.sh work for both a controller deployed with deploy.yaml and a Helm-installed controller?
To double-check: is it fine to run kubectl delete secret webhook-cert while a pod is running with the following volume mounted?
volumes:
- name: webhook-cert
  secret:
    defaultMode: 420
    secretName: webhook-cert
Should we print a log message here telling the user to restart the controller to pick up the new webhook-cert content, and to set WEBHOOK_ENABLED==true so that the webhook works?
> Just to double-check: does gen-webhook-secret.sh work for both a controller deployed with deploy.yaml and a Helm-installed controller?
Correct.
> kubectl delete secret webhook-cert for a running pod
I have recreated the secret for a running pod before, and it does work. I haven't tested what happens if you only delete the secret without recreating it.
In terms of the prompt for WEBHOOK_ENABLED==true, I think it's easy enough to find in the docs.
yq -i -e '(.[] as $item | select(.metadata.name == "gateway-api-controller" and .kind == "Deployment") | .spec.template.spec.containers[] | select(.name == "manager") | .env[] | select(.name == "WEBHOOK_ENABLED") | .value) = "true"' $DEPLOY_YAML 2>&1

rm $TEMP_KEY $TEMP_CERT
echo "Done"
example output
Key not specified. Will generate...
Generating certificate for webhook
.+....+........+.+.....................+.....+..........+.....+.+........+++++++++++++++++++++++++++++++++++++++*........+....+...+++++++++++++++++++++++++++++++++++++++*..+......+...........................+...+.....+......+.+..............+......+.............+...........+....+.........+............+.....+.+...++++++
....+++++++++++++++++++++++++++++++++++++++*........+...+...+......+.........+.....+.+........+...+++++++++++++++++++++++++++++++++++++++*........+....+...+..+....+.........+...+..+...+............+...+....+...+...+..+......++++++
-----
Patching webhook secret
Patching webhook
Enabling webhook
Done
Example updates
...
apiVersion: v1
data:
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0...VElGSUNBVEUtLS0tLQo=
  tls.key: LS0tLS1CRUdJTiBQUklWQVRFI...ZsOEY0RVhjd084YVh1Zz09Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K
  ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0F...FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
kind: Secret
metadata:
  name: webhook-cert
  namespace: aws-application-networking-system
type: kubernetes.io/tls
...
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: webhook-service
      namespace: aws-application-networking-system
      path: /mutate-pod
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tL...kzaz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
...
env:
- name: WEBHOOK_ENABLED
  value: "true"
...
caCert: {{ .Values.webhookTLS.caCert }}
cert: {{ .Values.webhookTLS.cert }}
key: {{ .Values.webhookTLS.key }}
{{- else -}}
Do we need a keepTLSSecret option? I saw that the ALB controller has one: https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/3ad23913c416258f37da9b6e6e248aa984a9f183/helm/aws-load-balancer-controller/values.yaml#L220
It's something we can consider adding. For now, if folks use the auto-installed cert, the new one will be just as good. If they are using their own cert, the same values for ca/cert/key should work across installs.
zijun726911
left a comment
Thank you! Overall LGTM.
- name: Run test
  run: |
    make e2e-test
    make webhook-e2e-test
Did you test that the GitHub worker node can run make webhook-e2e-test successfully? Or is that hard to test until this code is merged to mainline?
I didn't. I think it does a Helm install before this so it should "just work".
scripts/gen-webhook-secret.sh
Outdated
-addext "subjectAltName = DNS:${HOST}, DNS:${HOST}.cluster.local"

export KEY_B64=`cat $TEMP_KEY | base64`
export CERT_B64=`cat $TEMP_CERT | base64`
It seems we don't need export here, and we don't need KEY_B64 at all.
Nice catch. Fixed.
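For reference, a minimal sketch of the point about export, not the script's actual final form: a plain assignment is enough because the value is only expanded by the same shell when the patch string is built. The temporary file and its contents are hypothetical stand-ins for the generated certificate, and the tr step is an assumption added to strip line wrapping that some base64 implementations emit.

```shell
set -eu

# Hypothetical stand-in for the certificate file the real script generates with openssl.
TEMP_CERT=$(mktemp)
printf 'dummy certificate contents\n' > "$TEMP_CERT"

# Plain assignment, no export: only this shell expands the variable below.
# tr removes any line wrapping so the value stays on a single line.
CERT_B64=$(base64 < "$TEMP_CERT" | tr -d '\n')

# Build the same patch string that appears in the diff; the shell substitutes
# ${CERT_B64} before the string would ever be handed to kubectl.
PATCH="[{'op': 'replace', 'path': '/webhooks/0/clientConfig/caBundle', 'value': '${CERT_B64}'}]"
echo "$PATCH"

rm -f "$TEMP_CERT"
```

export would only matter if a child process needed to read the variable from its environment, which is not the case when the value is interpolated into a command-line argument.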
What type of PR is this?
feature
Which issue does this PR fix:
#596
What does this PR do / Why do we need it:
scripts/ to provision webhook TLS certificate and enable the webhook
webhook-e2e-test in github workflow
Since the controller cannot start without a valid TLS secret in place AND we cannot provision a cert at build time without confusing github with "secret" material, we now disable the webhook by default for manual installs using deploy.yaml. Instead, I have created new scripts (gen-webhook-secret.sh and patch-deploy-yaml.sh) to help provision the certificate and configure the webhook. The webhook is still enabled by default in the Helm chart.
Otherwise, the github workflow looks like it's currently broken, so there is a small risk I may be breaking it more.
Testing done on this change:
Installed locally built Helm chart. Tested with and without the new environment variable set.
Ran webhook e2e-tests
Automation added to e2e:
n/a
Will this PR introduce any new dependencies?:
No
Will this break upgrades or downgrades. Has updating a running cluster been tested?:
By design, the webhook does not install to the controller namespace
If a namespace is tagged with application-networking.k8s.aws/pod-readiness-gate-inject=enabled and the webhook exists BUT the controller has been downgraded to a pre-webhook version, you will not be able to successfully create new pods. To work around this issue, you would need to remove the namespace tag or delete the webhook, then also re-launch any pods or manually set the readiness gate value.
Does this PR introduce any user-facing change?:
Yes, but it is covered in the main PR #606
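The two downgrade workarounds described above (removing the namespace tag or deleting the webhook) could look roughly like the sketch below. It assumes the "tag" is a Kubernetes label, takes the webhook configuration name from the example script output earlier in this conversation, and uses a hypothetical namespace name. It only prints the kubectl commands so they can be reviewed before being run against a cluster.

```shell
# Assumptions: NAMESPACE is a hypothetical placeholder; the tag is treated as a label;
# the webhook name is the one shown in the example script output in this conversation.
NAMESPACE="${NAMESPACE:-my-app-namespace}"
LABEL_KEY="application-networking.k8s.aws/pod-readiness-gate-inject"
WEBHOOK_NAME="aws-appnet-gwc-mutating-webhook"

# Option 1: a trailing "-" after the key tells kubectl label to remove that label.
remove_label_cmd() {
  printf 'kubectl label namespace %s %s-\n' "$1" "$LABEL_KEY"
}

# Option 2: delete the mutating webhook configuration itself.
delete_webhook_cmd() {
  printf 'kubectl delete mutatingwebhookconfiguration %s\n' "$WEBHOOK_NAME"
}

# Print the commands for review instead of executing them.
remove_label_cmd "$NAMESPACE"
delete_webhook_cmd
```

Either option still leaves the follow-up step from the description: re-launch the affected pods or manually set the readiness gate value.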
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.