
kubeadm: prevent bootstrap of nodes with known names #81056

Merged
merged 1 commit into from Feb 1, 2020

Conversation

@neolit123 (Member) commented Aug 7, 2019

What this PR does / why we need it:
If a Node name in the cluster is already taken and that Node is Ready,
prevent TLS bootstrap on "kubeadm join" and exit early.

This change requires that a new ClusterRole is granted to the
"system:bootstrappers:kubeadm:default-node-token" group so that it is
able to get Nodes in the cluster. The same group already has access
to obtain objects such as the KubeletConfiguration and kubeadm's
ClusterConfiguration.
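In RBAC terms, the granted permission could look roughly like the following. This is a hedged sketch: the ClusterRole/ClusterRoleBinding names are illustrative, not necessarily the exact manifests kubeadm creates; only the group name and the get-Nodes permission come from the description above.

```yaml
# Illustrative sketch, not kubeadm's exact manifests.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeadm:get-nodes   # illustrative name
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeadm:get-nodes   # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubeadm:get-nodes
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:bootstrappers:kubeadm:default-node-token
```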

The motivation for this change is to prevent undefined behavior
and a potential control-plane breakdown if a cluster is left racing
to have two nodes with the same name for long periods of time.

The following values are validated, in order of precedence from
lowest to highest:

  • actual hostname
  • NodeRegistration.Name (or "--node-name") from JoinConfiguration
  • "--hostname-override" passed via kubeletExtraArgs

If the user decides not to let kubeadm know about a custom node name
and instead overrides the hostname from a kubelet systemd unit file,
kubeadm will not be able to detect the problem.
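The precedence above can be sketched as follows. This is a self-contained illustration, not kubeadm's actual helper; the function name is hypothetical, and the lowercasing reflects that node names must be valid DNS subdomains.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveNodeName is a simplified sketch (not kubeadm's real helper) of the
// precedence described above, applied from lowest to highest:
//  1. the machine's actual hostname
//  2. NodeRegistration.Name ("--node-name") from JoinConfiguration
//  3. "--hostname-override" passed via kubeletExtraArgs
func resolveNodeName(nodeRegistrationName, hostnameOverride string) (string, error) {
	name, err := os.Hostname() // lowest precedence: the actual hostname
	if err != nil {
		return "", err
	}
	name = strings.ToLower(name) // node names must be valid DNS subdomains
	if nodeRegistrationName != "" {
		name = nodeRegistrationName // JoinConfiguration overrides the hostname
	}
	if hostnameOverride != "" {
		name = hostnameOverride // kubeletExtraArgs override wins over both
	}
	return name, nil
}

func main() {
	name, _ := resolveNodeName("worker-1", "")
	fmt.Println(name) // worker-1
	name, _ = resolveNodeName("worker-1", "custom-host")
	fmt.Println(name) // custom-host
}
```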

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#1711

Special notes for your reviewer:
NONE

Does this PR introduce a user-facing change?:

kubeadm: reject a node joining the cluster if a node with the same name already exists

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/kind feature
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 7, 2019
@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2019
@liggitt (Member) commented Aug 7, 2019

what are the implications for cleaning up a node to set it up again? does this mean that kubeadm reset is no longer sufficient to allow kubeadm join to succeed again?

klog.V(1).Infof("[kubelet-start] Checking for an existing node with name %q in the cluster", nodeName)
nodes, err := bootstrapClient.CoreV1().Nodes().List(metav1.ListOptions{})
if err != nil {
return errors.Wrap(err, "cannot obtain the list of nodes in the cluster")
A Member commented on this diff:
should this be a skippable preflight check instead?

A Contributor replied:

+1 on that, we need to be able to skip this as I can see users running into problems with stale nodes, etc.

@neolit123 (Member, Author) replied on Aug 7, 2019:

not so sure about this; stale nodes are a subject for the administrator calling kubectl delete. kubeadm join failing during bootstrap would indicate that the cluster requires admin intervention (or a node rename).

if the user is allowed to skip this check, it can lead to a catastrophic outcome - e.g. imagine a case where the operator responsible for the worker join is not the cluster admin.

let's talk more about this during the office hours, possibly next week.

@rosti (Contributor) left a comment:

Thanks @neolit123 !
We may also try to nuke the Node object on reset, but permission wise IDK if this is feasible.

cmd/kubeadm/app/cmd/phases/join/kubelet.go (outdated review thread, resolved)
cmd/kubeadm/app/cmd/phases/join/kubelet.go (outdated review thread, resolved)
@liggitt (Member) commented Aug 7, 2019

We may also try to nuke the Node object on reset, but permission wise IDK if this is feasible.

if kubeadm reset is run with the API permissions of the kubelet, it will not have the ability to delete the Node API object (I didn't think it made any API calls today)

I suppose that Get is slightly more efficient, but IDK that for sure. Also, IDK what the difference is, from a security standpoint, in allowing folks with bootstrap tokens to list all nodes vs get nodes.
I'll leave that on the trusted advice of @liggitt .

Get is way more efficient on large clusters.

A bootstrap token can obtain a node client credential that allows get/list/watch of all nodes, so read-only access to Get a node is not concerning.

@neolit123 (Member, Author) commented:

Get is way more efficient on large clusters.

completely forgot about get in the early hours of the day.

We may also try to nuke the Node object on reset, but permission wise IDK if this is feasible.

possibly a separate PR, but also needs permission evaluation.
my initial reaction is to leave this to the admin group.

@neolit123 (Member, Author) commented Aug 7, 2019

what are the implications for cleaning up a node to set it up again? does this mean that kubeadm reset is no longer sufficient to allow kubeadm join to succeed again?

this is a problem that i thought about after sending the PR.

we either have to:

  • extend reset to delete nodes.
  • modify this check to only fail on existing Ready nodes.
  • EDIT: don't allow the node to re-join without admin deleting the stale node.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 7, 2019
@neolit123 neolit123 force-pushed the 1.16-kubeadm-node-names branch 3 times, most recently from 8759416 to 43b1316 Compare October 7, 2019 20:48
@neolit123 (Member, Author) commented Oct 7, 2019

@rosti @liggitt @SataQiu

i have updated this PR:

  • using "GET" for the Node object
  • use the same common utility for node-name precedence both where we write the kubelet flags and where new nodes join
  • still allow a new Node to join if a Node with the same name already exists in the cluster but its status is not Ready. this allows re-joining after a failure.
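The third bullet is the key behavioral nuance. A self-contained sketch of that logic follows; the types and function name here are local stand-ins and illustrative only, since the real code uses client-go (CoreV1().Nodes().Get(...)) and corev1 types.

```go
package main

import "fmt"

// Minimal stand-ins for the corev1 types involved.
type NodeCondition struct {
	Type   string // e.g. "Ready"
	Status string // "True", "False" or "Unknown"
}

type Node struct {
	Name       string
	Conditions []NodeCondition
}

// checkNodeNameAvailable sketches the join-time check described above: the
// join is rejected only when a node with the requested name already exists
// AND reports Ready; a missing node, or an existing one that is not Ready,
// still permits (re-)joining. "found" models the absence of a NotFound error
// from the GET call. The function name is illustrative, not kubeadm's.
func checkNodeNameAvailable(node *Node, found bool) error {
	if !found {
		return nil // no node with this name: safe to bootstrap
	}
	for _, c := range node.Conditions {
		if c.Type == "Ready" && c.Status == "True" {
			return fmt.Errorf("a Node with name %q and status Ready already exists in the cluster", node.Name)
		}
	}
	return nil // node exists but is not Ready: allow re-join after a failed join
}

func main() {
	taken := &Node{
		Name:       "worker-1",
		Conditions: []NodeCondition{{Type: "Ready", Status: "True"}},
	}
	fmt.Println(checkNodeNameAvailable(taken, true) != nil) // true: join rejected
	fmt.Println(checkNodeNameAvailable(nil, false) != nil)  // false: join allowed
}
```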

@neolit123 neolit123 force-pushed the 1.16-kubeadm-node-names branch 2 times, most recently from 27ea933 to e52a155 Compare October 7, 2019 21:00
@neolit123 (Member, Author) commented:

/retest

@SataQiu (Member) commented Oct 8, 2019

/test pull-kubernetes-kubemark-e2e-gce-big
/test pull-kubernetes-e2e-gce-100-performance

@fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2020
@neolit123 (Member, Author) commented:

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2020
@rosti (Contributor) left a comment:

Thanks @neolit123 !
Implementation wise, it looks good.
We may want to add a more thorough message to the user along with steps how to fix the problem.
/lgtm

cmd/kubeadm/app/cmd/phases/join/kubelet.go (outdated review thread, resolved)
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 8, 2020
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2020
@neolit123 (Member, Author) commented:

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 26, 2020
@neolit123 neolit123 requested a review from rosti January 26, 2020 16:52
@neolit123 (Member, Author) commented:

/retest

@rosti (Contributor) left a comment:

Thanks @neolit123 !
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 31, 2020
@fejta-bot commented:

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@rosti (Contributor) commented Jan 31, 2020

/test pull-kubernetes-e2e-gce


@k8s-ci-robot k8s-ci-robot merged commit f812429 into kubernetes:master Feb 1, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Feb 1, 2020
@kvaps (Member) commented Mar 26, 2020

Hi, this change caused two new issues: #89501, #89512

Successfully merging this pull request may close these issues.

reject a node joining the cluster if a node with the same name already exists
7 participants