node label for policy and condition for CPUPressure #970

Closed

Conversation

sjenning (Contributor, author):

converted from ConnorDoyle#2

This is to address feedback from the Resource Management Working Group (RWMG) discussion about a month ago.

Adds information on a node label advertising the cpu-policy and a CPUPressure node condition.

@derekwaynecarr @ConnorDoyle

This policy will be reported in the Node resource as a label with key
`alpha.kubernetes.io/cpu-policy`. This allows users with a policy
preference to use a node selector to ensure the pod lands on a node with a
particular CPU manager policy enabled.
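
For illustration, a pod with a policy preference could pin itself with a selector like the sketch below; the label key comes from the text above, while the `static` value is an assumption:

```yaml
# Sketch of a pod requiring the (assumed) static CPU manager policy via
# the proposed node label; only the label key comes from the proposal.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-workload
spec:
  nodeSelector:
    alpha.kubernetes.io/cpu-policy: static   # value assumed for illustration
  containers:
  - name: app
    image: example.com/app:latest            # placeholder image
    resources:
      requests:
        cpu: "2"
        memory: 1Gi
      limits:
        cpu: "2"    # integer CPU request == limit, eligible for exclusive cores
        memory: 1Gi
```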
Member:

there are efforts in progress (#911) to remove or restrict the ability of nodes to label themselves, since that allows them to steer workloads to themselves.

sjenning (Contributor, author), Aug 23, 2017:

Wow ok. Didn't know about that.

I can either 1) remove this section completely or 2) s/will/can/ if a trusted node-label mechanism configures it independently, rather than the kubelet knowing its own config and reporting it through a label.

What will happen to all the node labels under kubernetes.io that nodes self-apply today?

Member:

we'll have to whitelist them (which we may do for things like os/arch/instance-type), or silently drop them, until we can remove them from the kubelet and get past the version-skew window for kubelets that try to set them.
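
For reference, these are the kinds of self-applied labels under discussion. A minimal sketch of a Node's metadata as a kubelet of that era would report it; the keys are recalled from the Kubernetes of the time, not quoted from this thread:

```yaml
# Sketch of kubelet self-applied Node labels circa 2017 (keys recalled
# from general knowledge, not from this PR; values are made up).
apiVersion: v1
kind: Node
metadata:
  name: node-1
  labels:
    kubernetes.io/hostname: node-1
    beta.kubernetes.io/os: linux
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: m4.large
    failure-domain.beta.kubernetes.io/zone: us-east-1a
```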

Member:

using labels for this could still work; I would just expect them to be applied to nodes, rather than having nodes self-report them.

Member:

I understand the spirit of the design, but it does mean that some labels the kubelet sets today based on its own introspection of the machine need to move to fields. Are we ok with the node controller then converting those fields to labels? Can the CPU manager policy be included in that initial whitelist?

Member:

@vishh - have you been tracking removing the ability for a node to self-label? Did you have other ideas for converting them to fields and then the node controller assigning them as labels/taints?

Contributor:

We rely on Node labels heavily to address scheduling constraints; fields in the node object cannot be used for scheduling purposes as of now.
Having users create labels that map to kubelet configuration is not a great user experience, though I can see the security concern with having nodes introduce arbitrary labels.
Perhaps a whitelist of label keys is a way to get around the security issue? Ideally such a whitelist would be configurable (ConfigMaps maybe?), which would give cluster admins the ability to choose what labels they'd like to expose and also extend Kubernetes easily.
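
Such a configurable whitelist might look like the following sketch; the ConfigMap name, namespace, and schema are entirely hypothetical, invented here to illustrate the idea:

```yaml
# Entirely hypothetical ConfigMap realizing the whitelist idea above;
# nothing like this existed at the time, and all names are invented.
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-label-whitelist    # hypothetical name
  namespace: kube-system
data:
  # Label keys a node would be allowed to set on its own Node object.
  allowed-keys: |
    beta.kubernetes.io/os
    beta.kubernetes.io/arch
    beta.kubernetes.io/instance-type
    alpha.kubernetes.io/cpu-policy
```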

Contributor:

@derekwaynecarr #911 is news to me. I wonder why that proposal hasn't been discussed in SIG-node yet.


NOTE: The decision to allow the shared pool to completely empty is
driven by a desire to keep the scheduler accounting simple. If a
--cpu-policy-static-min-shared-cpus flag were to exist, there is no simple way
to convey that information to the scheduler. The scheduler would then need to
know that some portion of the CPU Allocatable is "special", for use only by
non-exclusive containers.
Contributor:

nit: add backticks to the flag name to avoid soft-wrapping.

Contributor:

Possibly obvious, but consider also mentioning what would happen with a shared pool low-water-mark but without scheduler enhancements -- the kubelet would need to reject some Guaranteed (G) pods after scheduling, with high potential to confuse users.
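
To make the NOTE's accounting point concrete: Allocatable is a single aggregate CPU quantity in Node status, with no field that could mark a portion of it as shared-pool-only. An illustrative fragment (all values assumed):

```yaml
# Illustrative Node status fragment (values assumed). The scheduler sees
# one aggregate cpu quantity; there is no way here to express "N of these
# CPUs are reserved for non-exclusive containers only".
status:
  capacity:
    cpu: "16"
    memory: 64Gi
  allocatable:
    cpu: "15"     # capacity minus system and kube reservations
    memory: 62Gi
```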


1. _The shared pool becomes nonempty._
1. The CPU manager removes the node condition with effect NoSchedule,
NoExecute for BestEffort and Burstable QoS class pods.
1. The CPU manager sets a CPUPressure node condition to false that allows
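
The quoted passage is cut off above. Purely as an illustrative sketch, such a condition might appear in Node status as follows; every field value here is assumed, none comes from the proposal:

```yaml
# Illustrative sketch of the proposed CPUPressure condition in Node
# status; only the condition type appears in the proposal text above.
status:
  conditions:
  - type: CPUPressure
    status: "False"                 # shared pool is nonempty again
    reason: SharedPoolAvailable     # hypothetical reason string
    message: kubelet has CPUs available in the shared pool
    lastHeartbeatTime: "2017-08-23T00:00:00Z"
    lastTransitionTime: "2017-08-23T00:00:00Z"
```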
Contributor:

As I mentioned in the PR, this node condition isn't necessary if we reserve a core by default, which I feel will be users' expectation.

Contributor:

@derekwaynecarr

Member:

I spoke with @ConnorDoyle and agree we can drop the CPU pressure condition.

@sjenning -- can you remove the CPU pressure condition parts of this PR?

sjenning (Contributor, author):

@derekwaynecarr with the "no node self labeling" and dropping of the CPU Pressure condition, I'm not sure if any parts of this PR are needed anymore. Should I just close?

vishh (Contributor), Aug 24, 2017:

This PR can probably be closed, but we need to open a separate issue for #970 (comment)
