This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

chore: only label nodes with kubernetes.io/role if they need it #4827

Merged: 3 commits from label-nodes-restart-on-failure into Azure:master on Feb 11, 2022

Conversation

@jackfrancis (Member) commented Feb 9, 2022

Reason for Change:

This PR updates the way we idempotently enforce the kubernetes.io/role node labels. Instead of blindly re-applying them every minute forever, we only apply them when the required labels aren't present.

The key part of the change is how we filter the nodes (by label) that we select for labeling. As an example:

Was:

  • Filter by nodes with label kubernetes.azure.com/role!=master,kubernetes.io/role!=master
    • This filter always returns the full set of non-control-plane nodes, whether or not they already have the desired role labels.

Now:

  • Filter by nodes with label kubernetes.azure.com/role=agent,kubernetes.io/role!=agent,node-role.kubernetes.io/agent!=
    • This filter returns nodes that have the kubernetes.azure.com/role=agent label (delivered via kubelet config args when the node joins the cluster), but are missing (or have an incorrect value for) the kubernetes.io/role and node-role.kubernetes.io/agent labels.

The key point is that once we apply the desired labels to the set of nodes matched by the new filter, those recently labeled nodes no longer match, so they are excluded from the next query. This results in far fewer actual kubectl label nodes write operations, as the label selector we use will only match nodes when a new node has recently come online.
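
To make the difference concrete, here is a minimal sketch of the two approaches expressed as kubectl commands, built from the selectors quoted above (illustrative only; the actual script invocation may differ):

  # Was (illustrative): the selector matches every non-control-plane node, so
  # the role labels are rewritten every minute whether or not they are set.
  kubectl label nodes --overwrite \
    -l 'kubernetes.azure.com/role!=master,kubernetes.io/role!=master' \
    kubernetes.io/role=agent node-role.kubernetes.io/agent=

  # Now (illustrative): the selector only matches agent nodes that are missing
  # (or have the wrong value for) the role labels; once labeled, a node stops
  # matching, so later runs perform no writes until a new node joins.
  kubectl label nodes --overwrite \
    -l 'kubernetes.azure.com/role=agent,kubernetes.io/role!=agent,node-role.kubernetes.io/agent!=' \
    kubernetes.io/role=agent node-role.kubernetes.io/agent=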


@Michael-Sinz (Collaborator) commented:

I wonder if this is correct.
If these labels really are needed, we may have a case where, during scaling, new nodes show up without their labels set correctly.

This script just screams of a hack. Is there a reason the labels do not come up to start with? Why can we add them later?

This does not really work well for dynamic clusters, especially large ones. If we add 100 new nodes per minute for a while (during fast scaling periods), those nodes will need the labels before they are fully functional, and thus we have these blind spots (or maybe we don't need the labels at all, in which case we can just stop doing this).

I only noticed that this is running because of the amount of log output this script produces in syslog. It is rather voluminous, especially in the larger clusters. It also puts stress on the API server in those clusters where we already have API server stress due to our workloads.

@jackfrancis (Member, Author) commented:

I think this comment suggests we should not be doing this at all:

kubernetes/kubernetes#84912 (comment)

I will remove this entire node-labeling foo. I've confirmed that the nodes are automatically labeled as follows:

  • control plane node: kubernetes.azure.com/role=master
  • worker node: kubernetes.azure.com/role=agent

Unfortunately, this may break some existing customers who are relying upon these labels:

  • kubernetes.io/role=master
  • node-role.kubernetes.io/master=
  • kubernetes.io/role=agent
  • node-role.kubernetes.io/agent=
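
A quick, read-only way to check which of these role labels each node currently carries (an illustrative command, not part of this change):

  # Show the role-related labels as columns for every node.
  kubectl get nodes \
    -L kubernetes.azure.com/role \
    -L kubernetes.io/role \
    -L node-role.kubernetes.io/master \
    -L node-role.kubernetes.io/agent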

But, in advance of deprecating the project, we should finally do the right thing and stop breaking the rules (this has been a core principle of node labeling since 1.16).

cc @mboersma @devigned

@jackfrancis (Member, Author) commented:

Hmm, never mind, this label key is all over k/k and related code, including cloud-provider-azure:

https://github.com/kubernetes-sigs/cloud-provider-azure/blob/master/pkg/consts/consts.go#L86

AKS nodes have these labels, so I take back everything above; we do need them.

@jackfrancis (Member, Author) commented:

@Michael-Sinz read this thread:

eksctl-io/eksctl#2363 (comment)

@jackfrancis (Member, Author) commented:

O.K., so after going through the effort to re-acquaint myself with what's going on here:

  1. we definitely need to preserve this behavior
  2. understood that applying these role labels in the kubernetes-approved way (via a privileged k8s API operation, not via setting kubelet --node-labels configuration on the nodes themselves; see the sketch after this list) can definitely be optimized relative to the existing "run this script on all control plane nodes every minute forever" implementation, but that optimization work is out of scope for aks-engine given the project's deprecation
  3. we can do a better job of identifying which nodes are new (by their not having certain labels), and I will turn this PR into such an improvement
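
A rough sketch of the distinction in point 2, as I understand the kubelet label restrictions (the node name is a placeholder):

  # At node registration, the kubelet may self-assign vendor-prefixed labels,
  # which is how kubernetes.azure.com/role=agent arrives:
  #
  #   kubelet ... --node-labels=kubernetes.azure.com/role=agent
  #
  # Since the 1.16 node-labeling restrictions, a kubelet cannot self-assign the
  # kubernetes.io/role or node-role.kubernetes.io/* labels, so they have to be
  # applied afterwards by a privileged client, e.g.:
  kubectl label node <node-name> \
    kubernetes.io/role=agent \
    node-role.kubernetes.io/agent= \
    --overwrite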

@jackfrancis changed the title from "chore: only retry node label job if it fails" to "chore: only label nodes with kubernetes.io/role if they need it" on Feb 10, 2022
@Michael-Sinz (Collaborator) left a comment:

I like the concept - let me try this on one of my small test clusters to see what the behavior is.

@Michael-Sinz (Collaborator) left a comment:

This seems to work rather well. Thank you for your quick response.

@jackfrancis (Member, Author) commented Feb 10, 2022

@Michael-Sinz a possible edge case going forward with this solution is if the kubectl label command is not atomic and one of the two labels is applied but not the other. Because our new query looks for all of the included labels (it's an AND query), if a node is missing just one of them we won't select it for a follow-up kubectl label operation, and that missing label will stay missing.

I don't want to over-engineer this to death, so I'm not anticipating that edge case showing up in the real world, but if you observe something weird, like nodes not having the desired labels, let us know!
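
If that edge case ever did show up, one way to close it (hypothetical, not what this PR does) would be to issue one kubectl label call per label, each with its own selector, so a node missing only one of the two labels still gets selected:

  # Hypothetical per-label variant: each label has its own selector.
  kubectl label nodes --overwrite \
    -l 'kubernetes.azure.com/role=agent,kubernetes.io/role!=agent' \
    kubernetes.io/role=agent

  kubectl label nodes --overwrite \
    -l 'kubernetes.azure.com/role=agent,node-role.kubernetes.io/agent!=' \
    node-role.kubernetes.io/agent=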

@Michael-Sinz (Collaborator) commented:

(quoting the previous comment about the possible non-atomic kubectl label edge case)

I think you are fine - given the mechanism involved, I would expect this to be an all-or-nothing failure.

@jackfrancis jackfrancis merged commit fe9e4d4 into Azure:master Feb 11, 2022
@jackfrancis jackfrancis deleted the label-nodes-restart-on-failure branch February 11, 2022 00:05