New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MCO-605: MCO-550: Remove Certificates from MachineConfig #3787
Conversation
Skipping CI for Draft Pull Request. |
@cdoern: This pull request references MCO-605 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@cdoern: This pull request references MCO-605 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Some intersection with #3729 I think |
/retest-required |
/retest-required I think CI is fixed now |
/retest-required |
5a4eb7d
to
a5f9e7a
Compare
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.14.0-0.ci.test-2023-08-04-010040-ci-ln-41jwz7k-latest True False 164m Cluster version is 4.14.0-0.ci.test-2023-08-04-010040-ci-ln-41jwz7k-latest
# rhel node info
node -l node.openshift.io/os_id=rhel -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-55-41.us-east-2.compute.internal Ready worker 125m v1.27.3+e123787 10.0.55.41 <none> Red Hat Enterprise Linux 8.6 (Ootpa) 4.18.0-477.15.1.el8_8.x86_64 cri-o://1.27.1-4.rhaos4.14.gitab7845e.el8
ip-10-0-61-50.us-east-2.compute.internal Ready worker 125m v1.27.3+e123787 10.0.61.50 <none> Red Hat Enterprise Linux 8.6 (Ootpa) 4.18.0-477.15.1.el8_8.x86_64 cri-o://1.27.1-4.rhaos4.14.gitab7845e.el8 check mc content to verify following certs are removed: mc rendered-worker-a612a055640964b5467a71f7a1179e89 -o yaml | egrep "\.crt|\.pem"
path: /etc/kubernetes/ca.crt check above certs on rhel worker node: debug node/ip-10-0-55-41.us-east-2.compute.internal -- chroot /host stat /etc/kubernetes/kubelet-ca.crt /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
File: /etc/kubernetes/kubelet-ca.crt
Size: 8332 Blocks: 24 IO Block: 4096 regular file
Device: ca02h/51714d Inode: 109058491 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:kubernetes_file_t:s0
Access: 2023-08-04 05:02:04.493771340 +0000
Modify: 2023-08-04 05:02:04.491771358 +0000
Change: 2023-08-04 05:02:04.493771340 +0000
Birth: 2023-08-04 05:02:04.490771367 +0000
File: /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: ca02h/51714d Inode: 92342056 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:kubernetes_file_t:s0
Access: 2023-08-04 05:02:04.493771340 +0000
Modify: 2023-08-04 05:02:04.493771340 +0000
Change: 2023-08-04 05:02:04.494771331 +0000
Birth: 2023-08-04 05:02:04.493771340 +0000
File: /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: ca02h/51714d Inode: 25166966 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:cert_t:s0
Access: 2023-08-04 05:02:04.494771331 +0000
Modify: 2023-08-04 05:02:04.494771331 +0000
Change: 2023-08-04 05:02:04.495771322 +0000
Birth: 2023-08-04 05:02:04.494771331 +0000 check mcd log to see, it wrote cert on the node according to controllerconfig I0804 02:34:40.024993 5262 certificate_writer.go:161] Certificate was synced from controllerconfig resourceVersion 40676
I0804 02:35:37.104797 5262 certificate_writer.go:161] Certificate was synced from controllerconfig resourceVersion 41966
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.13.7 True False 40m Cluster version is 4.13.7
node -l node.openshift.io/os_id=rhel -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-0-53-230.us-east-2.compute.internal Ready worker 14m v1.26.6+73ac561 10.0.53.230 <none> Red Hat Enterprise Linux 8.6 (Ootpa) 4.18.0-477.15.1.el8_8.x86_64 cri-o://1.26.4-3.rhaos4.13.git615a02c.el8
ip-10-0-59-120.us-east-2.compute.internal Ready worker 14m v1.26.6+73ac561 10.0.59.120 <none> Red Hat Enterprise Linux 8.6 (Ootpa) 4.18.0-477.15.1.el8_8.x86_64 cri-o://1.26.4-3.rhaos4.13.git615a02c.el8 pre upgrade check above certs exist in mc mc rendered-worker-6700e4fc613c45609666b12da24ebfa4 -o yaml | egrep '\.crt|\.pem'
path: /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
path: /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
path: /etc/kubernetes/kubelet-ca.crt
path: /etc/kubernetes/ca.crt upgrade cluster to oc adm upgrade --to-image registry.build05.ci.openshift.org/ci-ln-41jwz7k/release:latest --force --allow-explicit-upgrade
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Requested update to release image registry.build05.ci.openshift.org/ci-ln-41jwz7k/release:latest
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.14.0-0.ci.test-2023-08-04-010040-ci-ln-41jwz7k-latest True False 7m14s Cluster version is 4.14.0-0.ci.test-2023-08-04-010040-ci-ln-41jwz7k-latest post upgrade check mc content to verify no above certs found: mc rendered-worker-e87b41a1570dbf0b781ff97342f309d9 -o yaml | egrep '\.crt|\.pem'
path: /etc/kubernetes/ca.crt check above certs on rhel worker node: debug node/ip-10-0-53-230.us-east-2.compute.internal -- chroot /host stat /etc/kubernetes/kubelet-ca.crt /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
File: /etc/kubernetes/kubelet-ca.crt
Size: 8332 Blocks: 24 IO Block: 4096 regular file
Device: ca02h/51714d Inode: 109107015 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:kubernetes_file_t:s0
Access: 2023-08-04 06:23:54.246234133 +0000
Modify: 2023-08-04 06:23:54.244234150 +0000
Change: 2023-08-04 06:23:54.246234133 +0000
Birth: 2023-08-04 06:23:54.244234150 +0000
File: /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: ca02h/51714d Inode: 117481765 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:kubernetes_file_t:s0
Access: 2023-08-04 06:23:54.246234133 +0000
Modify: 2023-08-04 06:23:54.246234133 +0000
Change: 2023-08-04 06:23:54.247234125 +0000
Birth: 2023-08-04 06:23:54.246234133 +0000
File: /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: ca02h/51714d Inode: 25166966 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:cert_t:s0
Access: 2023-08-04 06:23:54.247234125 +0000
Modify: 2023-08-04 06:23:54.247234125 +0000
Change: 2023-08-04 06:23:54.248234117 +0000
Birth: 2023-08-04 06:23:54.247234125 +0000 |
@cdoern @yuqi-zhang please confirm the test result. thanks |
/unhold |
/payload 4.14 ci blocking |
@cdoern: trigger 5 job(s) of type blocking for the ci release of OCP 4.14
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/42651f10-32ca-11ee-8506-9ea577b35548-0 trigger 8 job(s) of type blocking for the nightly release of OCP 4.14
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/42651f10-32ca-11ee-8506-9ea577b35548-1 |
Thanks for the tests @rioliu-rh ! I thought they would have failed since in ansible, we are using an oc get on the machineconfig: https://github.com/openshift/openshift-ansible/blob/2ba66d98edb01fb1ff1250053c09fddae0441b12/roles/openshift_node/tasks/apply_machine_config.yml#L23 which no longer contains certs, so the rhel worker node should have in theory gotten no certs to join the cluster with. To double check, is that the ansible script being used? Edit: upon further inspection, it looks like https://github.com/openshift/openshift-ansible/blob/2ba66d98edb01fb1ff1250053c09fddae0441b12/roles/openshift_node/tasks/config.yml#L179-L184 is what's actually doing the bootstrap once-from, so using ignition and not MC, which is expected to work as Rio tested. Sorry for the confusion |
/payload-job periodic-ci-openshift-hypershift-release-4.14-periodics-e2e-aws-ovn-conformance |
@cdoern: trigger 1 job(s) for the /payload-(job|aggregate) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ab6d7670-3544-11ee-85db-3c439caed440-0 |
/payload 4.14 ci blocking |
@cdoern: trigger 5 job(s) of type blocking for the ci release of OCP 4.14
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e7344320-35ec-11ee-9f15-e81b432aba7d-0 |
the tests failed last time due to what looked like infrastructure reasons, running payload again |
/payload 4.14 ci blocking |
@cdoern: trigger 5 job(s) of type blocking for the ci release of OCP 4.14
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fa5aa2b0-360e-11ee-9490-e5247871e92d-0 |
In order to remove certificates from MC, the MCS needs to read the controller config both in the bootstrap server and the cluster server. Once this is in place, removing the certs from machine configs consists of ensuring the rendered configs do not contain the kube data to write into the files of the machine config. Accomplish this by deleting cert templates Signed-off-by: Charlie Doern <cdoern@redhat.com>
/test e2e-gcp-op |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've done quite a bit of testing and checking against rhel8 workers, hypershift, IBM, windows nodes. The fix for windows is in flight and everything else should work. With that context, I'd like to get this in (windows is broken on ignition spec 3.4 anyways).
/lgtm
Thanks for all the hard work and verification!
/hold Upon further discussion, let's wait for the Windows fix to go through first |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cdoern, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@cdoern: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/hold cancel |
e386166
into
openshift:master
- What I did
In order to remove certificates from MC, the MCS needs to read the controller config both in the bootstrap server and the cluster server. Once this is in place, removing the certs from machine configs consists of ensuring the rendered configs do not contain the kube data to write into the files of the machine config. Accomplish this by deleting cert templates
- How to verify it
oc describe mc/xxxx
the raw config/storage/files should no longer contain the following certs:/etc/kubernetes/kubelet-ca.crt
/etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
/etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
- Description for the changelog
ensure the MCS properly writes certificate data directly to ignition while also ensuring that all MachineConfigs do not have certificates in them.