kubeadm: change SystemPrivilegedGroup in apiserver-kubelet-client.crt #121837

Member

@neolit123 neolit123 commented Nov 10, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

The component connection between kube-apiserver and kubelet does not
require the "O" field on the Subject to be set to the
"system:masters" privileged group. It can be a less
privileged group like "kubeadm:cluster-admins".

Change the group in the apiserver-kubelet-client
certificate specification. This cert is passed to
--kubelet-client-certificate.
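
To illustrate what the "O" field on the Subject means here, the following is a minimal Go sketch, not kubeadm's actual code. It builds a throwaway self-signed client certificate (the real certificate is signed by the cluster CA) and reads back the common name and organization that the API server's X.509 authentication would treat as user and groups. The helper name `newClientCert` is hypothetical, and the common name is used for illustration only.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newClientCert is a hypothetical helper: it creates a self-signed client
// certificate with the given common name and organizations, then parses it
// back so the Subject can be inspected.
func newClientCert(commonName string, groups []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		// CN is read as the user name, each O value as a group name.
		Subject:     pkix.Name{CommonName: commonName, Organization: groups},
		NotBefore:   time.Now().Add(-time.Minute),
		NotAfter:    time.Now().Add(time.Hour),
		KeyUsage:    x509.KeyUsageDigitalSignature,
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	}
	// Self-signed for simplicity: the template is also the parent.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := newClientCert("kube-apiserver-kubelet-client", []string{"kubeadm:cluster-admins"})
	if err != nil {
		panic(err)
	}
	fmt.Println("user:", cert.Subject.CommonName)
	fmt.Println("groups:", cert.Subject.Organization)
}
```

Swapping the organization between "system:masters" and "kubeadm:cluster-admins" in the call above is the whole of the change from the certificate's point of view; what each group is allowed to do is decided elsewhere, by the authorizer.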

Which issue(s) this PR fixes:

NONE

Special notes for your reviewer:

NONE

Does this PR introduce a user-facing change?

kubeadm: change the "system:masters" Group in the apiserver-kubelet-client.crt certificate Subject to be "kubeadm:cluster-admins" which is a less privileged Group.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 10, 2023
@neolit123
Member Author

/milestone v1.29
/traige accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Nov 10, 2023
@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 10, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2023
@SataQiu
Member

SataQiu commented Nov 10, 2023

Authentication and authorization are required when kube-apiserver connects to the kubelet.
Do commands such as kubectl logs/exec/port-forward still work without the system:masters group?

@neolit123
Member Author

> Authentication and authorization are required when kube-apiserver connects to the kubelet. Do commands such as kubectl logs/exec/port-forward still work without the system:masters group?

right, we need to migrate this to the new kubeadm:cluster-admins group

@SataQiu
Member

SataQiu commented Nov 10, 2023

FYI: https://kubernetes.io/docs/reference/access-authn-authz/kubelet-authn-authz/

When running in this mode, ensure the user identified by the --kubelet-client-certificate and --kubelet-client-key flags passed to the apiserver is authorized for the following attributes:
verb=*, resource=nodes, subresource=proxy
verb=*, resource=nodes, subresource=stats
verb=*, resource=nodes, subresource=log
verb=*, resource=nodes, subresource=spec
verb=*, resource=nodes, subresource=metrics
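
The attribute list above can be read as a checklist. The sketch below is a deliberately simplified model of rule matching in Go, not the real RBAC authorizer: the `rule` type and the function names are hypothetical, and only exact matches and the "*" wildcard are handled. It shows why a wildcard rule, such as the one a cluster-admin style role is assumed to carry, covers all five node subresources while a narrow rule does not.

```go
package main

import "fmt"

// rule is a simplified, hypothetical stand-in for an RBAC policy rule.
// Resources are written as "resource" or "resource/subresource".
type rule struct {
	verbs     []string
	resources []string
}

// matches reports whether want is in list, treating "*" as a wildcard.
func matches(list []string, want string) bool {
	for _, item := range list {
		if item == "*" || item == want {
			return true
		}
	}
	return false
}

// allowed reports whether any rule permits the verb on the (sub)resource.
func allowed(rules []rule, verb, resource, subresource string) bool {
	target := resource
	if subresource != "" {
		target = resource + "/" + subresource
	}
	for _, r := range rules {
		if matches(r.verbs, verb) && matches(r.resources, target) {
			return true
		}
	}
	return false
}

// coversKubeletAPI checks the node subresources listed in the kubelet
// authorization documentation quoted above.
func coversKubeletAPI(rules []rule) bool {
	subresources := []string{"proxy", "stats", "log", "spec", "metrics"}
	verbs := []string{"get", "create", "update", "delete"} // sample standing in for verb=*
	for _, sub := range subresources {
		for _, verb := range verbs {
			if !allowed(rules, verb, "nodes", sub) {
				return false
			}
		}
	}
	return true
}

func main() {
	wildcard := []rule{{verbs: []string{"*"}, resources: []string{"*"}}}
	narrow := []rule{{verbs: []string{"get"}, resources: []string{"nodes/proxy"}}}
	fmt.Println("wildcard rule covers kubelet API:", coversKubeletAPI(wildcard))
	fmt.Println("narrow rule covers kubelet API:", coversKubeletAPI(narrow))
}
```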

@neolit123 neolit123 force-pushed the 1.29-remove-system-masters-from-kubelet-client-cert branch from c45b355 to 2780060 Compare November 10, 2023 13:05
@neolit123
Member Author

> FYI: https://kubernetes.io/docs/reference/access-authn-authz/kubelet-authn-authz/
>
> When running in this mode, ensure the user identified by the --kubelet-client-certificate and --kubelet-client-key flags passed to the apiserver is authorized for the following attributes:
> verb=*, resource=nodes, subresource=proxy
> verb=*, resource=nodes, subresource=stats
> verb=*, resource=nodes, subresource=log
> verb=*, resource=nodes, subresource=spec
> verb=*, resource=nodes, subresource=metrics

yeah, i'm testing with kubeadm:cluster-admins now

@SataQiu
Member

SataQiu commented Nov 10, 2023

/cc @pacoxu

@neolit123 neolit123 changed the title from "kubeadm: remove SystemPrivilegedGroup from apiserve-kubelet-client.crt" to "kubeadm: change SystemPrivilegedGroup in apiserve-kubelet-client.crt" Nov 10, 2023
@neolit123
Member Author

> > FYI: https://kubernetes.io/docs/reference/access-authn-authz/kubelet-authn-authz/
> >
> > When running in this mode, ensure the user identified by the --kubelet-client-certificate and --kubelet-client-key flags passed to the apiserver is authorized for the following attributes:
> > verb=*, resource=nodes, subresource=proxy
> > verb=*, resource=nodes, subresource=stats
> > verb=*, resource=nodes, subresource=log
> > verb=*, resource=nodes, subresource=spec
> > verb=*, resource=nodes, subresource=metrics
>
> yeah, i'm testing with kubeadm:cluster-admins now

exec works as expected after using "kubeadm:cluster-admins".

$ kubectl exec -it kube-proxy-9htk2 -n kube-system sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
# ls
sh: 1: ls: not found
# 
# echo HELLO
HELLO

PR updated, release note, description as well.

@SataQiu
Member

SataQiu commented Nov 10, 2023

It seems that this change depends on the kubeadm:cluster-admins ClusterRoleBinding.
Can we ensure that the ClusterRoleBinding is always present when the certificate is updated or regenerated?

@neolit123
Member Author

> It seems that this change depends on the kubeadm:cluster-admins ClusterAdminsGroupAndClusterRoleBinding. Can we ensure that the ClusterAdminsGroupAndClusterRoleBinding is always present when the certificate is updated or regenerated?

i can add it as a task in our e2e test.
part of kubernetes/kubeadm#2414

updated the PR for k/website as well to use kubeadm:cluster-admins:
kubernetes/website#43870

@SataQiu
Member

SataQiu commented Nov 10, 2023

The following case may lead to trouble:

Regenerate the apiserver-kubelet-client cert for an old v1.28 cluster using the new v1.29 kubeadm

mv /etc/kubernetes/pki/apiserver-kubelet-client.crt /etc/kubernetes/pki/apiserver-kubelet-client.crt.bak
mv /etc/kubernetes/pki/apiserver-kubelet-client.key /etc/kubernetes/pki/apiserver-kubelet-client.key.bak
kubeadm init phase certs apiserver-kubelet-client

At this point, the O of the certificate has become kubeadm:cluster-admins, but the kubeadm:cluster-admins CRB is not generated.

kubectl get clusterrolebinding kubeadm:cluster-admins 
Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "kubeadm:cluster-admins" not found

Maybe we should keep both system:masters and kubeadm:cluster-admins in this release and remove system:masters in a future release. WDYT?
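
The mismatch described in this scenario could be detected with a check along the following lines. This is only a sketch in Go with hypothetical helper names: `groupsFromPEM` reads the Subject O values from a PEM-encoded certificate, and `unboundGroups` compares them against a set of groups that are assumed to have been collected from the cluster's ClusterRoleBindings by some other means (no API client is shown).

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// groupsFromPEM returns the Subject O values of the first certificate
// found in PEM-encoded data.
func groupsFromPEM(data []byte) ([]string, error) {
	block, _ := pem.Decode(data)
	if block == nil || block.Type != "CERTIFICATE" {
		return nil, errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return nil, err
	}
	return cert.Subject.Organization, nil
}

// unboundGroups returns the groups that do not appear in boundGroups.
func unboundGroups(groups []string, boundGroups map[string]bool) []string {
	var missing []string
	for _, g := range groups {
		if !boundGroups[g] {
			missing = append(missing, g)
		}
	}
	return missing
}

func main() {
	// Assumed state: a v1.28 cluster whose certificate was regenerated by
	// kubeadm v1.29, so the O value is new but no binding names the group.
	certGroups := []string{"kubeadm:cluster-admins"}
	bound := map[string]bool{}
	fmt.Println("groups without a binding:", unboundGroups(certGroups, bound))
}
```

In practice the input to `groupsFromPEM` would be the contents of /etc/kubernetes/pki/apiserver-kubelet-client.crt, read with `os.ReadFile`.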

@neolit123
Member Author

neolit123 commented Nov 10, 2023

> The following case may lead to trouble:
>
> Regenerate the apiserver-kubelet-client cert for an old v1.28 cluster using the new v1.29 kubeadm
>
> mv /etc/kubernetes/pki/apiserver-kubelet-client.crt /etc/kubernetes/pki/apiserver-kubelet-client.crt.bak
> mv /etc/kubernetes/pki/apiserver-kubelet-client.key /etc/kubernetes/pki/apiserver-kubelet-client.key.bak
> kubeadm init phase certs apiserver-kubelet-client
>
> At this point, the O of the certificate has become kubeadm:cluster-admins, but the kubeadm:cluster-admins CRB is not generated.
>
> kubectl get clusterrolebinding kubeadm:cluster-admins
> Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "kubeadm:cluster-admins" not found
>
> Maybe we should keep both system:masters and kubeadm:cluster-admins in this release and remove system:masters in a future release. WDYT?

i can see it being a problem case, but i wanted to clear the undesired usage of system:masters completely in 1.29.
if users do use the 1.29 binary to renew a 1.28 cert, they can ask us for support and we can explain that they should use kubeadm 1.28 for 1.28 certs and 1.29 for 1.29 certs. upgrade will handle it properly for them.

this is also a general problem if they use the 1.29 binary to renew 1.28 admin.conf.

EDIT: BTW, this is guaranteed by our support skew of kubeadm against kubeadm which is N-0 - i.e. only use kubeadm N for operations on a cluster that is version N (unless upgrade).

@neolit123
Member Author

e2e test update PR:
kubernetes/kubeadm#2960

@pacoxu
Member

pacoxu commented Nov 10, 2023

/triage accepted

> this is also a general problem if they use the 1.29 binary to renew 1.28 admin.conf.

Do we check the cluster version against the kubeadm version before renewal?

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 10, 2023
@neolit123
Member Author

neolit123 commented Nov 10, 2023

> Do we check the cluster version against the kubeadm version before renewal?

i don't believe we do.

i don't think it will help, because kubeadm can deploy N-1 control plane.
the only way to check if a cert originated from kubeadm version X is to use a certificate extension of sorts, where we can store a version inside the cert.

if kubeadm supported the kubeadm N-1 skew, it would mean we must store both the old and the new group in the Subject.

@pacoxu
Member

pacoxu commented Nov 10, 2023

Are there other cases that may cause a problem like this (using kubeadm N on an N-1 cluster)?

If the skew is N-0, we could add version checks where a mismatch may trigger a bad result.

@neolit123
Member Author

> Are there other cases that may cause a problem like this (using kubeadm N on an N-1 cluster)?

one that comes to mind is when the API version in kubeadm-config is changed to a new version, but an old version of kubeadm tries to join. it will not know how to decode the new API version in the config map.

> If the skew is N-0, we could add version checks where a mismatch may trigger a bad result.

adding such checks might be a bit tricky, especially for certs like i mentioned above.
but we can try in 1.30 or later.

it seems that users do not make the mistake of mixing old and new kubeadm versions, but we cannot exclude the possibility.
we might need to start planning the N-1 kubeadm support and adding such checks in the future.
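
A version check of the kind discussed here might look like the sketch below. It is written in Go under stated assumptions: both versions are available as strings of the form "vMAJOR.MINOR.PATCH", and the function names `majorMinor` and `checkSkew` are hypothetical rather than part of kubeadm. It enforces the N-0 skew by requiring the same major and minor version. As noted earlier in the thread, kubeadm can deploy an N-1 control plane, so the cluster version alone may not identify which kubeadm version generated a given certificate.

```go
package main

import "fmt"

// majorMinor extracts the major and minor numbers from a version string
// such as "v1.29.0". Anything after the minor number is ignored.
func majorMinor(version string) (int, int, error) {
	var major, minor int
	if _, err := fmt.Sscanf(version, "v%d.%d", &major, &minor); err != nil {
		return 0, 0, fmt.Errorf("parse version %q: %w", version, err)
	}
	return major, minor, nil
}

// checkSkew returns an error unless kubeadm and the cluster share the same
// major.minor version (the N-0 skew).
func checkSkew(kubeadmVersion, clusterVersion string) error {
	kMajor, kMinor, err := majorMinor(kubeadmVersion)
	if err != nil {
		return err
	}
	cMajor, cMinor, err := majorMinor(clusterVersion)
	if err != nil {
		return err
	}
	if kMajor != cMajor || kMinor != cMinor {
		return fmt.Errorf("kubeadm %s does not match cluster %s", kubeadmVersion, clusterVersion)
	}
	return nil
}

func main() {
	fmt.Println("1.29 kubeadm on 1.29 cluster:", checkSkew("v1.29.0", "v1.29.2"))
	fmt.Println("1.29 kubeadm on 1.28 cluster:", checkSkew("v1.29.0", "v1.28.4"))
}
```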

@SataQiu
Member

SataQiu commented Nov 10, 2023

/lgtm
/hold

we may need more review :)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 10, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2ab198fc4dc50934698a14cdd97a21297dbcefd3

@neolit123
Member Author

neolit123 commented Nov 10, 2023

> /lgtm /hold
>
> we may need more review :)

i think this would work fine, we just need to sort out the renewal as well, which you did in your follow-up PR:
#121841

TBH, https://kubernetes.io/docs/reference/access-authn-authz/kubelet-authn-authz/ tells us to sign a cert with a completely custom user/group RBAC, but that's yet more RBAC management and we can just try using the "kubeadm:cluster-admins" group.
admin.conf (kubeadm:cluster-admins in 1.29) is already stored on nodes, so this level of access is already exposed. "system:masters" on the other hand we shouldn't expose randomly after 1.29.

ca.key is there, but users can decide to use the external CA path and sign custom certs instead of relying on the KCM to sign certs for new kubelets that join the cluster - i.e. don't keep ca.key on nodes.

@pacoxu
Member

pacoxu commented Nov 10, 2023

/lgtm

@neolit123
Member Author

/hold cancel
so that we can test the follow up PR and run e2e tests later

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 10, 2023
@k8s-ci-robot k8s-ci-robot merged commit 1f3256b into kubernetes:master Nov 10, 2023
14 checks passed
@sftim
Contributor

sftim commented Nov 18, 2023

Changelog suggestion

kubeadm: changed the group that the `apiserver-kubelet-client.crt` X.509 certificate Subject (indirectly)
specifies. The new value, `kubeadm:cluster-admins`, is less privileged but still allows the API server to
perform all expected actions against nodes.
  • use past tense
  • use Markdown (backticks)
  • don't write Group in title case; we don't have a Group API, although kubeadm assumes you're using RBAC which kind of behaves in some places as if we do.
  • also, the certificate doesn't specify a group; instead, the API server maps the O (organization) value(s) in the certificate Subject to group names. This is a Kubernetes behavior and not an X.509 thing; in X.509, O names the subject's organization.
    Frame the changelog accordingly.

@sftim
Contributor

sftim commented Nov 18, 2023

Aside: readers might wonder why kubeadm doesn't make a dedicated group for the API server to act as when managing nodes. However, making that change would be a separate PR.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

5 participants