
design: reduce scope of node on node object #911

Merged: 2 commits into kubernetes:master from mikedanese:limit-node on Nov 12, 2018

Conversation

mikedanese
Member

Super mini design doc about centralizing reporting of some sensitive kubelet attributes.

@kubernetes/sig-auth-misc
@roberthbailey
@kubernetes/sig-cluster-lifecycle-misc as it relates to registration

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 14, 2017
[initializer](admission_control_extension.md) mechanism, a centralized
controller can register an initializer for the node object and build the
sensitive fields by consulting the machine database. The
`cloud-controller-manager` is an obvious candidate to house such a controller.
Contributor

Since those fields change over time I'm not even sure that initializers are required for anything except strong exclusion rules.

Member Author

I think you are right. I think we want strong exclusion of sensitive labels, taints, and addresses, though. I will keep the initializer and add a stanza that allows the central controller to reconcile objects as well.
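
For illustration, a minimal sketch of what that central reconcile step might look like, assuming a hypothetical `MachineRecord` type standing in for the machine database (all names here are illustrative, not part of the proposal):

```go
package nodesync

import (
	corev1 "k8s.io/api/core/v1"
)

// MachineRecord is a hypothetical view of one machine in the cloud
// provider's machine database; a real controller would use a
// provider-specific source of truth.
type MachineRecord struct {
	Zone      string
	Addresses []corev1.NodeAddress
	Taints    []corev1.Taint
}

// reconcileSensitiveFields overwrites the fields the kubelet
// self-reported with authoritative values. A controller in
// cloud-controller-manager could run this when initializing a newly
// registered Node and again on periodic resync, since these fields
// can change over time.
func reconcileSensitiveFields(node *corev1.Node, rec MachineRecord) {
	if node.Labels == nil {
		node.Labels = map[string]string{}
	}
	// Topology comes from the machine database, not the kubelet.
	node.Labels["failure-domain.beta.kubernetes.io/zone"] = rec.Zone
	node.Status.Addresses = rec.Addresses
	node.Spec.Taints = rec.Taints
}
```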

Member

"sensitive labels" is going to be really tough to pin down. Examples of node self-labeling I've seen just in the past two days are cpu policy and kernel version, both of which could be used for capability steering (a workload requiring a specific CPU policy or kernel version to run) or for security purposes ("find nodes running a kernel with a known vulnerability", "schedule pods to a known good kernel")

@smarterclayton
Copy link
Contributor

Has come up a lot in security auditing.

@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 15, 2017
dedicated nodes in the workload controller running `customer-info-app`.

Since nodes self-report labels upon registration, an intruder can easily
register a compromised node with the label `foo/dedicated=customer-info-app`. The
Contributor

The label part seems minor in comparison to the problem of compromised nodes registering themselves at all.

Contributor

Which is another way to say, if we can't trust a node to set appropriate labels then why are we trusting it at all?

Member

Labels allow steering workloads. We need to be able to set up labeled topologies of nodes to keep classes of workloads separate, to isolate a compromised node by tainting and unlabeling it, and to know that it didn't steer workloads to itself by adding to its own labels.
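
To make that isolation step concrete, a hedged sketch using client-go (the taint key and steering label are hypothetical examples from this thread):

```go
package quarantine

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// isolateNode taints a compromised node so existing pods are evicted
// and new pods are repelled, then strips the label that steered
// workloads to it. This is only effective if the node cannot undo
// either change, which is exactly what this proposal restricts.
func isolateNode(ctx context.Context, client kubernetes.Interface, name string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
		Key:    "quarantine",
		Value:  "true",
		Effect: corev1.TaintEffectNoExecute,
	})
	delete(node.Labels, "foo/dedicated") // the steering label from the example
	_, err = client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}
```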

Since nodes self-report labels upon registration, an intruder can easily
register a compromised node with the label `foo/dedicated=customer-info-app`. The
scheduler will then bind `customer-info-app` to the compromised node, potentially
giving the intruder easy access to the PII.
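
To make the attack concrete: registration is an ordinary Create call, and nothing in the API server (before the restrictions proposed here) validates the labels a node claims for itself. A hedged sketch using current client-go signatures and a made-up node name:

```go
package attack

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// registerCompromisedNode shows that a kubelet (or anything holding
// node credentials) can attach arbitrary labels to the Node object
// it creates at registration.
func registerCompromisedNode(ctx context.Context, client kubernetes.Interface) error {
	node := &corev1.Node{
		ObjectMeta: metav1.ObjectMeta{
			Name: "compromised-node",
			// Self-reported label that steers the sensitive workload here.
			Labels: map[string]string{"foo/dedicated": "customer-info-app"},
		},
	}
	_, err := client.CoreV1().Nodes().Create(ctx, node, metav1.CreateOptions{})
	return err
}
```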
Member

The same applies to allowing a node to remove taints (or to delete its own Node API object while tainted).

@liggitt
Member

liggitt commented Aug 24, 2017

cc @kubernetes/sig-node-proposals

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/design Categorizes issue or PR as related to design. labels Aug 24, 2017
@mikedanese
Member Author

This was discussed in sig-auth and the proposal needs some further consideration:

  • We need to dig into the label/taint whitelisting mechanism. We need to make it easy for kubelet to set some of these on itself. We need to make it clear that kubelet self-set labels cannot be used to implement strong exclusion.
  • We need to clarify what the expectations are for a deployment running without a cloud provider.
  • We need to decide whether a node should be able to delete itself, thereby potentially removing taints from itself.
  • We may need to consider how kubelet startup might need to be adjusted.

@liggitt
Member

liggitt commented Sep 22, 2017

This also came up in the GCE cloud provider, which tried to determine what zones exist by looking at the node API objects and trusting whatever zone they reported they were in (kubernetes/kubernetes#52322 (comment)).

@tallclair
Member

/cc @davidopp @mtaufen

@roberthbailey
Contributor

@mikedanese - are you aiming to get this merged and something implemented for 1.9? Or is this a longer term proposal?

@liggitt
Member

liggitt commented Oct 9, 2017

are you aiming to get this merged and something implemented for 1.9? Or is this a longer term proposal?

I'd like to see the labeling/tainting approach agreed on and implemented for 1.9

The node addresses are more involved and have more ties to cloud providers and variance between cloud/non-cloud environments.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2018
(e.g. `foo/dedicated=customer-info-app`) on the node and to select these
dedicated nodes in the workload controller running `customer-info-app`.

Since nodes self-report labels upon registration, an intruder can easily
Member

I'm all for belts and suspenders, but is "an intruder registers a node into our cluster" a high-prio attack vector?

Contributor

It turns "can launch a VM inside a given infrastructure account" into "can root the entire infrastructure account" when you take into account that the masters need certain privileges on the infrastructure account. So I would say yes.


```
kubernetes.io/hostname
failure-domain.[beta.]kubernetes.io/zone
```
Member

Maybe simpler to say:

```
kubernetes.io/hostname
kubernetes.io/os
kubernetes.io/arch
kubernetes.io/instance-type
[*.]beta.kubernetes.io/* (deprecated)
failure-domain.kubernetes.io/*
[*.]kubelet.kubernetes.io/*
[*.]node.kubernetes.io/*
```

Could we maybe argue to allow all of naked kubernetes.io?

Or at least make it clear that this list may change in the future. Concretely, we might add more top-level things, we might enable new prefixes, and we might even provide policy rules to users.

Member

liggitt commented Nov 12, 2018

Maybe simpler to say
...

Updated to just enumerate allowed labels

Could we maybe argue to allow all of naked kubernetes.io ?

I'd prefer to start with this specific set, and document that the allowed set may grow or shrink in the future.

Or at least make it clear that this list may change in the future. Concretely, we might add more top-level things, we might enable new prefixes, and we might even provide policy rules to users.

+1

```
[*.]node.kubernetes.io/*
```
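
A minimal sketch of the check this kind of allowlist implies (illustrative only; this is not the actual NodeRestriction plugin code, and the helper name is made up):

```go
package nodelabels

import "strings"

// kubeletAllowedLabel reports whether a kubelet may set the given
// label key on its own Node object, per the kind of allowlist
// discussed above. "[*.]prefix/*" admits both "prefix/<name>" and
// any subdomain such as "sub.prefix/<name>". Label keys contain at
// most one "/", which keeps the substring check below safe enough
// for a sketch.
func kubeletAllowedLabel(key string) bool {
	switch key {
	case "kubernetes.io/hostname",
		"kubernetes.io/os",
		"kubernetes.io/arch",
		"kubernetes.io/instance-type":
		return true
	}
	if strings.HasPrefix(key, "failure-domain.kubernetes.io/") {
		return true
	}
	for _, p := range []string{
		"beta.kubernetes.io/",
		"kubelet.kubernetes.io/",
		"node.kubernetes.io/",
	} {
		if strings.HasPrefix(key, p) || strings.Contains(key, "."+p) {
			return true
		}
	}
	return false
}
```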

2. Reserve/recommend the `node-restriction.kubernetes.io/*` label prefix for users
Member

Is this name already codified somewhere? I don't like the length or specificity of it.

spitballing:

```
protected.kubernetes.io
user.kubernetes.io
admin.kubernetes.io
local.kubernetes.io
site.kubernetes.io
my.kubernetes.io
x.kubernetes.io
```

Or maybe a distinct TLD?

```
[.]local/
[.].k8s/
```

Think through whether this prefix will be applicable in any other context (one advantage of node-restriction is that it is pretty clearly node-related).

Naming is hard, but I am OK with the rest of this proposal.

Member

Is this name already codified somewhere? I don't like the length or specificity of it.

No, it's new to this proposal. I agree on the length, but I like the specificity.

Think through whether this prefix will be applicable in any other context (one advantage of node-restriction is that it is pretty clearly node-related).

Having stewed on it for a few days, I actually like this prefix for this use:

  • it is clearly node-related
  • it only reserves a single specific label prefix
  • it connects the label to the admission plugin that enforces it
  • it connotes both a restriction on the node itself, and reads OK as a node selector (node-restriction.kubernetes.io/fips=true, node-restriction.kubernetes.io/pci-dss=true, etc.); see the sketch below
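
For instance, a workload pinned to administrator-labeled FIPS nodes might look like this hedged sketch (pod name and image are made up):

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// fipsPinnedPod schedules only onto nodes an administrator labeled
// with the node-restriction prefix; kubelets cannot set such labels
// on themselves, so a compromised node cannot attract this workload.
func fipsPinnedPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "payments"},
		Spec: corev1.PodSpec{
			NodeSelector: map[string]string{
				"node-restriction.kubernetes.io/fips": "true",
			},
			Containers: []corev1.Container{
				{Name: "app", Image: "registry.example.com/payments:1.0"},
			},
		},
	}
}
```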

@thockin
Member

thockin commented Nov 8, 2018

I'll LGTM for merge now, but would appreciate just a BIT more thought on naming and scope.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2018
@krmayankk

Who is working on the implementation?

@liggitt
Member

liggitt commented Nov 9, 2018

I am; I will update the PR today after thinking through Tim's last comments.

@thockin
Member

thockin commented Nov 9, 2018 via email

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2018
…within the kubernetes.io/k8s.io label namespace
@smarterclayton
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: smarterclayton, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2018
@liggitt
Member

liggitt commented Nov 12, 2018

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2018
@k8s-ci-robot k8s-ci-robot merged commit 732907c into kubernetes:master Nov 12, 2018
@thockin
Member

thockin commented Nov 19, 2018 via email

@mikedanese mikedanese deleted the limit-node branch November 19, 2018 20:14
justaugustus pushed a commit to justaugustus/community that referenced this pull request Dec 1, 2018
design: reduce scope of node on node object
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
design: reduce scope of node on node object
danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023