🌱 Support new control plane label and taint #5919

Merged

Member

@sbueringer sbueringer commented Jan 10, 2022

Signed-off-by: Stefan Büringer <buringerst@vmware.com>

What this PR does / why we need it:
This PR is required to be compatible with the latest changes in Kubernetes 1.24.

In Kubernetes 1.24 on control plane nodes kubeadm now:

  • sets both node-role.kubernetes.io/control-plane and node-role.kubernetes.io/master taints
  • doesn't set the node-role.kubernetes.io/master label anymore (only node-role.kubernetes.io/control-plane)

Thus:

  • the toleration node-role.kubernetes.io/control-plane is additionally added so our controllers can also run on 1.24 control-plane nodes.
  • KCP now considers nodes with the old or the new (or both) label(s) as control plane nodes. This way KCP now supports:
    • v1.19: where only the old label is set
    • v1.20-v1.23: where both labels are set
    • v1.24: where only the new label is set
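
The label check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual KCP code: `isControlPlaneNode` and the constant name `labelNodeRoleOldControlPlane` are assumptions made for this example, and plain maps stand in for the node's labels.

```go
package main

import "fmt"

const (
	// labelNodeRoleOldControlPlane is an assumed name for the old label key.
	labelNodeRoleOldControlPlane = "node-role.kubernetes.io/master"
	// labelNodeRoleControlPlane is the new label key.
	labelNodeRoleControlPlane = "node-role.kubernetes.io/control-plane"
)

// isControlPlaneNode is a hypothetical helper. It reports whether a label set
// carries the old label, the new label, or both. Only key presence matters,
// because kubeadm sets these labels with an empty value.
func isControlPlaneNode(labels map[string]string) bool {
	_, hasOld := labels[labelNodeRoleOldControlPlane]
	_, hasNew := labels[labelNodeRoleControlPlane]
	return hasOld || hasNew
}

func main() {
	cases := []struct {
		name   string
		labels map[string]string
	}{
		{"v1.19", map[string]string{labelNodeRoleOldControlPlane: ""}},
		{"v1.20-v1.23", map[string]string{labelNodeRoleOldControlPlane: "", labelNodeRoleControlPlane: ""}},
		{"v1.24", map[string]string{labelNodeRoleControlPlane: ""}},
		{"worker", map[string]string{"kubernetes.io/os": "linux"}},
	}
	for _, c := range cases {
		fmt.Printf("%-12s control plane: %t\n", c.name, isControlPlaneNode(c.labels))
	}
}
```

Checking for key presence rather than a specific value keeps the three version ranges on one code path.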

xref: corresponding kubeadm issue kubernetes/kubeadm#2200

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Partially implements #3279

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 10, 2022
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 10, 2022
Member

@vincepri vincepri left a comment

Minor nit, rest LGTM

controlplane/kubeadm/internal/workload_cluster.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 10, 2022
@@ -980,6 +980,9 @@ func fakeNode(name string, options ...fakeNodeOption) *corev1.Node {
	p := &corev1.Node{
		ObjectMeta: metav1.ObjectMeta{
			Name: name,
			Labels: map[string]string{
				labelNodeRoleControlPlane: "",
Member Author

This is now necessary because the old implementation of getControlPlaneNodes selected all nodes as control plane nodes regardless of the label when using fakeClient (I assume because ctrlclient.MatchingLabels(labels) is not implemented in the fake client).

Member Author

I'll take a closer look.

To be clear, with the new code everything works fine. I think it's now mostly a question of why we didn't have to set the control plane node label with the old code (i.e. is there something wrong with the fake client or our usage of it).

Member Author

Okay everything worked/works as expected.

In the relevant unit test we're injecting the list result:

injectClient: &fakeClient{
	list: &corev1.NodeList{
		Items: []corev1.Node{*fakeNode("n1")},
	},
},

Previously, we didn't have to set the control plane label as the result of the list was automatically the list of control plane nodes. As we're now iterating through the result and checking for the label it became relevant.

@k8s-ci-robot
Contributor

@sbueringer: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-test-mink8s-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-ipv6-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-workload-upgrade-1-23-latest-main
  • /test pull-cluster-api-make-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-ipv6-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-test-mink8s-main
  • pull-cluster-api-verify-main

In response to this:

/test
pull-cluster-api-verify-main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer
Member Author

/test pull-cluster-api-verify-main

@enxebre
Member

enxebre commented Jan 10, 2022

thanks @sbueringer!
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2022

// Use the control-plane label for Kubernetes version >= v1.20.0.
if utilversion.MustParseGeneric(serverVersion.String()).AtLeast(utilversion.MustParseGeneric("v1.20.0")) {
	workloadDeployment.Spec.Template.Spec.NodeSelector = map[string]string{"node-role.kubernetes.io/control-plane": ""}
Contributor

should the strings be constants here and below?

Member Author

Yup, done.
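
For reference, the version-gated selector with the label keys pulled into constants could look roughly like the sketch below. It is only an illustration: it uses a small stdlib-only `atLeast` helper instead of `utilversion`, the constant and function names are assumptions, and the fallback to the old label for versions below v1.20.0 is an assumption about what the surrounding code does.

```go
package main

import "fmt"

const (
	// Assumed constant names for the two label keys.
	labelNodeRoleOldControlPlane = "node-role.kubernetes.io/master"
	labelNodeRoleControlPlane    = "node-role.kubernetes.io/control-plane"
)

// atLeast (hypothetical) reports whether a version such as "v1.24.0" is
// greater than or equal to the given major.minor.
func atLeast(version string, major, minor int) (bool, error) {
	var maj, mnr, patch int
	if _, err := fmt.Sscanf(version, "v%d.%d.%d", &maj, &mnr, &patch); err != nil {
		return false, fmt.Errorf("parsing version %q: %w", version, err)
	}
	return maj > major || (maj == major && mnr >= minor), nil
}

// nodeSelectorFor (hypothetical) picks the node selector for a deployment
// that should land on control plane nodes of the given Kubernetes version.
func nodeSelectorFor(version string) (map[string]string, error) {
	useNew, err := atLeast(version, 1, 20)
	if err != nil {
		return nil, err
	}
	if useNew {
		return map[string]string{labelNodeRoleControlPlane: ""}, nil
	}
	// Assumption: older versions only have the old label.
	return map[string]string{labelNodeRoleOldControlPlane: ""}, nil
}

func main() {
	for _, v := range []string{"v1.19.16", "v1.20.0", "v1.24.0"} {
		sel, err := nodeSelectorFor(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(v, sel)
	}
}
```

Using constants means the selector, the tolerations, and the label check all refer to the same keys, so a typo in one place cannot silently diverge from the others.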

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 11, 2022
@enxebre
Member

enxebre commented Jan 11, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 11, 2022
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 13, 2022
@neolit123
Member

> In Kubernetes 1.24 on control plane nodes kubeadm now:
>
>   • uses the taint node-role.kubernetes.io/control-plane instead of node-role.kubernetes.io/master

both will be applied on nodes in 1.24. xref KEP:
https://github.com/neolit123/enhancements/tree/c33232cf66cc911ec736fcc80a08bd42189f8504/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint

>   • doesn't set the node-role.kubernetes.io/master label anymore (only node-role.kubernetes.io/control-plane)

yes, for new CP nodes (init or join) only the "control-plane" label will be added.

@sbueringer
Member Author

Thank you! Will update accordingly

@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 13, 2022
@sbueringer
Member Author

I've updated getControlPlaneNodes to use continue, so we're not missing any control plane nodes in large clusters.

@fabriziopandini ptal :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2022
@sbueringer
Member Author

Hey folks,
WDYT about merging this PR?

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2022
Signed-off-by: Stefan Büringer <buringerst@vmware.com>
Member

@vincepri vincepri left a comment

/approve
/lgtm

@vincepri
Member

vincepri commented Feb 1, 2022

Should we include this PR in the v1.1 release?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 1, 2022
@k8s-ci-robot k8s-ci-robot merged commit 5756be1 into kubernetes-sigs:main Feb 1, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.2 milestone Feb 1, 2022
@sbueringer
Member Author

sbueringer commented Feb 1, 2022

> Should we include this PR in the v1.1 release?

I'm not sure. On the one hand, we would need this PR to support v1.24; on the other hand, we will need more than that (see the release-specific issues in #5968).

So this PR alone is not enough to support the upcoming v1.24 release.

I'm also not sure if we want to use v1.24 support as a forcing function so folks are upgrading to CAPI v1.2 :)

Even if we don't cherry-pick now, once we know what we need to support v1.24.0 (after its release) we can still cherry-pick that for a new v1.1.x release if we have consensus that we want to cherry-pick 1.24 support into CAPI v1.1.x

/cc @fabriziopandini

@enxebre
Member

enxebre commented Feb 2, 2022

> So this PR alone is not enough to support the upcoming v1.24 release.

fwiw 1.24 is planned for Tuesday 19th April 2022

Given the timelines I'd be in favour of supporting 1.24 in 1.1.x.
+1 from me to cherry-pick this into 1.1 now: I'd rather cherry-pick smaller bits than one bigger change in one go, so we keep the potential breaking surface controlled.

For ref #5968

@sbueringer
Member Author

I don't have a strong opinion either way, but we should only consider cherry-picking it if we plan to support v1.24 in v1.1.x

@sbueringer
Member Author

sbueringer commented Feb 9, 2022

We'll bring it up in the office hours today, but we would want to support v1.24 in v1.1.x. (we'll wait with merging the cherry-pick until after the meeting)
/cherry-pick release-1.1

@k8s-infra-cherrypick-robot

@sbueringer: new pull request created: #6084

In response to this:

We'll bring it up in the office hours today, but we would want to support v1.24 in v1.1.x. (we'll wait with merging the cherry-pick)
/cherry-pick release-1.1

@kashifest
Contributor

@sbueringer Would it be ok to cherry-pick this to release-0.4? If not, v1a4 clusters won't be able to migrate to 1.24.x because of the missing node role label check.

@sbueringer
Member Author

sbueringer commented Jul 4, 2022

CAPI v0.4.x doesn't support Kubernetes 1.24: https://cluster-api.sigs.k8s.io/reference/versions.html

For v1.24 support you have to upgrade to CAPI v1.1.

Note: CAPI v0.4 is out of support since 2022-04-06 and there won't be another v0.4.x release.

P.S. This is not the only PR we would have to backport to make this happen

@kashifest
Contributor

> CAPI v0.4.x doesn't support Kubernetes 1.24: https://cluster-api.sigs.k8s.io/reference/versions.html
>
> For v1.24 support you have to upgrade to CAPI v1.1.
>
> Note: CAPI v0.4 is out of support since 2022-04-06 and there won't be another v0.4.x release.
>
> P.S. This is not the only PR we would have to backport to make this happen

Ah ok, missed that, thanks

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

8 participants