Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent nodes from updating taints #63167

Merged
merged 1 commit into from May 16, 2018

Conversation

@liggitt
Copy link
Member

liggitt commented Apr 25, 2018

Prevents kubelets from modifying or removing taints on update.

Nodes can set taints when they register themselves, but do not update/remove those taints after creation (that is done by the node controller based on reported node conditions).

xref kubernetes/community#911 kubernetes/enhancements#279

/sig node
/sig auth
/sig scheduling

/assign @mikedanese @k82cn

The NodeRestriction admission plugin now prevents kubelets from modifying/removing taints applied to their Node API object.
@tallclair

This comment has been minimized.

Copy link
Member

tallclair commented Apr 25, 2018

/cc @vishh
xref #62747

@k8s-ci-robot k8s-ci-robot requested a review from vishh Apr 25, 2018

@vishh

This comment has been minimized.

Copy link
Member

vishh commented Apr 25, 2018

Aren't existing conditions expected to be transitioned to taints in which case k'let will place and remove taints?
I have use cases around bootstrapping and handling termination where the current plan is to re-use k'let's credentials.
Why is this restriction necessary especially given that taints are yet to be fully adopted even in core?

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Apr 25, 2018

Aren't existing conditions expected to be transitioned to taints

They already are, but are translated by the node controller, not set directly by the kubelet (which works really well to allow a limited set of conditions to affect taints without giving the kubelet full control over them)

Why is this restriction necessary especially given that taints are yet to be fully adopted even in core?

Because kubelets removing/modifying their own taints removes a primary tool an administrator has for preventing nodes from accepting certain workloads (imagine a "compromised" or "unverified" taint set on a node intended to keep the scheduler from sending workloads to it)

See kubernetes/community#911 for a fuller discussion of the issues with letting kubelets steer workloads via self-labeling or self-untainting

@liggitt

This comment has been minimized.

@k82cn

This comment has been minimized.

Copy link
Member

k82cn commented Apr 26, 2018

The diff is ok to me :) There's a discussion (#62311) on replacing node condition with taints, will we add white list for node conditions?

@aveshagarwal

This comment has been minimized.

Copy link
Member

aveshagarwal commented Apr 26, 2018

Would it impact these scenario (cloud specific taint being applied on the fly/runtime): https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L1777 ?

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Apr 26, 2018

Would it impact these scenario (cloud specific taint being applied on the fly/runtime): https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L1777 ?

No, that is only called from AttachDisk, which has been done by the attach/detach controller, not the kubelet, since 1.3

xref #55517

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Apr 26, 2018

There's a discussion (#62311) on replacing node condition with taints, will we add white list for node conditions?

If we want the kubelet to start setting/updating/removing taints on itself directly, we can discuss the pros/cons of whitelisting specific taints at that point. I could see whitelisting the set that is automatically translated from node-reported conditions->taints, but would want that to be an intentional decision.

@k82cn

This comment has been minimized.

Copy link
Member

k82cn commented Apr 26, 2018

There's a discussion (#62311) on replacing node condition with taints, will we add white list for node conditions?

If we want the kubelet to start setting/updating/removing taints on itself directly, we can discuss the pros/cons of whitelisting specific taints at that point. I could see whitelisting the set that is automatically translated from node-reported conditions->taints, but would want that to be an intentional decision.

That's ok to me, and PR is LGTM; waiting for mikedanese@'s comments if any :)

@mikedanese

This comment has been minimized.

Copy link
Member

mikedanese commented Apr 26, 2018

I would like to restrict this since nothing uses it, and loosen the restriction as needed going forward. @vishh has a reasonable usecase. How should a node agent out of core request evacuation? Does the node agent need an accompanying controller that understands a new node condition?

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented Apr 30, 2018

I would like to restrict this since nothing uses it, and loosen the restriction as needed going forward.

I agree.

@vishh has a reasonable usecase. How should a node agent out of core request evacuation? Does the node agent need an accompanying controller that understands a new node condition?

I could see any of the following (roughly in order of preference):

  • running a separate controller for managing a custom condition<->taint (part of the custom node agent setup)
  • deciding on a well-known key or prefix for taints that nodes would be allowed to manage themselves
  • options to the existing node controller to tell it additional custom conditions to translate into taints
  • options to the admission plugin to configure specific taints to allow the node to manage
@luxas

This comment has been minimized.

Copy link
Member

luxas commented May 1, 2018

@mikedanese

This comment has been minimized.

Copy link
Member

mikedanese commented May 11, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label May 11, 2018

@mikedanese

This comment has been minimized.

Copy link
Member

mikedanese commented May 11, 2018

Let's get this in and relax constraints when we need to.

@liggitt liggitt force-pushed the liggitt:taint-modification branch from 486d088 to 15bcfd5 May 15, 2018

@k8s-ci-robot k8s-ci-robot removed the lgtm label May 15, 2018

@liggitt

This comment has been minimized.

Copy link
Member Author

liggitt commented May 15, 2018

rebased

@ericchiang

This comment has been minimized.

Copy link
Member

ericchiang commented May 15, 2018

Download error for bazel build

W0515 17:57:42.467] ERROR: /bazel-scratch/.cache/bazel/_bazel_root/e9f728bbd90b3fba632eb31b20e1dacd/external/io_bazel_rules_go/BUILD.bazel:9:1: GoStdlib external/io_bazel_rules_go/linux_amd64_stripped/stdlib~/pkg failed: Failed to delete output files after incomplete download. Cannot continue with local execution.: /bazel-scratch/.cache/bazel/_bazel_root/e9f728bbd90b3fba632eb31b20e1dacd/execroot/main/bazel-out/host/bin/external/io_bazel_rules_go/linux_amd64_stripped/stdlib~/pkg (Directory not empty)

/retest

@neolit123

This comment has been minimized.

Copy link
Member

neolit123 commented May 15, 2018

/test pull-kubernetes-e2e-kops-aws

@k82cn

This comment has been minimized.

Copy link
Member

k82cn commented May 16, 2018

running a separate controller for managing a custom condition<->taint (part of the custom node agent setup)

Separate controller manager has race condition issue between condition <-> taints :( , well-known key or prefix for taints makes sense to me.

xref #63897

@k82cn

This comment has been minimized.

Copy link
Member

k82cn commented May 16, 2018

/lgtm

help to re-lgtm :)

@k8s-ci-robot k8s-ci-robot added the lgtm label May 16, 2018

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented May 16, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: k82cn, liggitt, mikedanese

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented May 16, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented May 16, 2018

Automatic merge from submit-queue (batch tested with PRs 63167, 63357). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit d723028 into kubernetes:master May 16, 2018

13 of 18 checks passed

Submit Queue Required Github CI test is not green: pull-kubernetes-e2e-gce
Details
pull-kubernetes-e2e-gce Job triggered.
Details
pull-kubernetes-e2e-gce-100-performance Job triggered.
Details
pull-kubernetes-e2e-kops-aws Job triggered.
Details
pull-kubernetes-kubemark-e2e-gce-big Job triggered.
Details
cla/linuxfoundation liggitt authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-local-e2e-containerized Skipped
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details

@liggitt liggitt deleted the liggitt:taint-modification branch May 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.