Add AdditionalLabels to cloudprovider.InstanceMetadata #123223
Conversation
/assign @elmiko @andrewsykim
Ya, that's true, but this is all coming from the cloud providers, so it should be more tightly controlled.
Agreed that the risk is probably low because cloud providers should know about the other labels. But I do wonder if we should ignore or drop any.
Either way, can we update the API doc for AdditionalLabels to say it ignores (or overwrites) other node labels?
I think any of the following are reasonable:
With 4, we outlaw cloud providers from using those namespaces at all. Personally, I'm still inclined to just override and document. If cloud providers are overriding the k8s well-known labels, that sounds like a bug and seems very unlikely. This would be the simplest approach, and the easiest for cloud providers to reason about. It would potentially be confusing to have scenarios where they pass in a label and we ignore it. That being said, I don't think there's a wrong choice here. It's a new feature.
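For illustration only, a minimal Go sketch of what the "override and document" option would mean when merging provider-supplied labels into a node's existing labels; the helper name is hypothetical and this is not the code this PR ends up merging:

```go
// mergeAdditionalLabels shows "override" semantics: when a key exists in both
// maps, the cloud provider's value wins. Hypothetical helper for discussion,
// not the code merged in this PR.
func mergeAdditionalLabels(nodeLabels, additionalLabels map[string]string) map[string]string {
	merged := make(map[string]string, len(nodeLabels)+len(additionalLabels))
	for k, v := range nodeLabels {
		merged[k] = v
	}
	for k, v := range additionalLabels {
		merged[k] = v // provider-supplied value overrides any existing value
	}
	return merged
}
```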
Changelog suggestion:
-Added support for cloud providers to supply custom labels that will be applied to nodes by the node controller. Labels will only be applied if implemented by the cloud providers.
+Added support for cloud provider plugins to supply optional, per-node custom labels that will be applied to Nodes by the node controller. Extra labels will only be applied where the cloud provider integration implements this.
We're talking about allowing code you run to set these labels, rather than about granting access for a firm like Google or Microsoft or AWS to do so. If the right term isn't "plugins", please swap it for something more accurate.
The rule I'd like:
Updated the changelog per @sftim's suggestion, albeit with some tweaks. I'm not sure whether plugin is the right word, so I just went with something super generic.
No. What would be the use case for that? If there's a need, one may just need to run a patched version of k8s.
My use case is labels, but I could see use for adding support for annotations or taints. I assume it'd be similar, and I don't think it would impact the model. My preference would be to keep this change focused on labels, and if there's a desire to extend the functionality to annotations and/or taints, I would put up a follow-up PR for that as well.
Interesting idea. In that case, are you suggesting ignoring or overriding the original value? NVM, I see you weighed in here in support of allowing overriding.
/retest
I feel overriding is not fine per #123223 (comment). Not overriding them and logging seems better to me, because it is the cloud provider who owns the problem, it can fix it, and it is completely backwards compatible. If we break compatibility by overriding, I expect this to be feature flagged at least.
@mmerkes i'm a little curious about this
my concern is that we run the risk of breaking a cloud provider that might be expecting to use these labels. i'm curious if we, as a community, are moving towards ensuring that the kubernetes.io/k8s.io prefixes are only used with the well-known labels?
That is something I've seen contributors pushing. My take on what Kubernetes might do in this case is that people would see warnings if they use unrecognized labels / annotations, rather than anything stronger (if we wanted to enforce this, we'd likely use an actual API field). We also might opt to do nothing and silently accept the unrecognized keys (i.e., the current behavior). A final option is to have a warnlist of labels that we know are not registered, are not used by any in-project code, and that folks should stop using. Another approach is to warn for unrecognized taints only, but leave labels and annotations out. We have a much smaller set of official taints and we make changes less often.
Signed-off-by: Matt Merkes <merkes@amazon.com>

Emits event when overriding labels in node controller

Signed-off-by: Matt Merkes <merkes@amazon.com>

Discard kubernetes.io additional labels in node controller

Signed-off-by: Matt Merkes <merkes@amazon.com>

Exclude kubernetes reserved label namespaces
There have been a number of conversations on the right behavior here, so I'm going to try to summarize the topics to work toward resolving them.

Should we allow cloud provider implementations to use the kubernetes.io/k8s.io namespaces? No. There's been discussion that contributors are pushing to keep those namespaces restricted to k8s well-known labels. If we were to allow them, cloud provider implementations could potentially be setting a label that k8s/k8s will implement in the future, and allowing labels in those namespaces would mean that restricting them later would be backwards incompatible. IMO, we should recommend that cloud provider implementations use their own namespace.

How do we treat labels that are already set on the node? Labels that are already set on the node will be treated as immutable, and the labels provided by the cloud provider will be ignored in those cases. Per this discussion, overriding the labels could create a hot-loop situation where the controller and some other actor keep competing with each other to set the label.

How will users know that labels are not getting applied as expected? If the cloud provider supplies a label in a k8s/k8s namespace, or a label that's already applied to a node but with a different value, we'll discard those labels and log a warning. Cluster admins will need to use these logs to spot the potentially unexpected issue. When cloud provider implementations are setting new labels via this mechanism, I don't think it's unreasonable to expect that they actually do the bare minimum testing of those changes and verify that the labels get applied. A more confusing case might be where labels are passed in but never show up on the node.

What about taints and annotations? This PR is scoped to handling labels. If there's appetite for taints and annotations in the future, we can add support via the same mechanism.

If I missed any other open discussions, let me know. Otherwise, let me know if we're close to aligning on this. @dims @cartermckinnon @andrewsykim @aojea @sftim
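To make the summarized behavior concrete, here is a rough Go sketch of the kind of filtering the node controller could do: discard reserved-namespace labels, leave already-set labels untouched, and log a warning for anything dropped. Function names and log messages are illustrative, not the actual implementation merged in this PR:

```go
import (
	"strings"

	"k8s.io/klog/v2"
)

// filterAdditionalLabels drops provider-supplied labels that either use a
// Kubernetes-reserved namespace (kubernetes.io / k8s.io) or conflict with a
// label already set on the node, logging a warning for each dropped key.
func filterAdditionalLabels(nodeLabels, additionalLabels map[string]string) map[string]string {
	applied := map[string]string{}
	for key, value := range additionalLabels {
		if isReservedLabelNamespace(key) {
			klog.Warningf("Discarding additional label %q: kubernetes.io and k8s.io label namespaces are reserved", key)
			continue
		}
		if existing, ok := nodeLabels[key]; ok && existing != value {
			klog.Warningf("Discarding additional label %q: node already has value %q", key, existing)
			continue
		}
		applied[key] = value
	}
	return applied
}

// isReservedLabelNamespace reports whether the label key's prefix is
// kubernetes.io, k8s.io, or a subdomain of either (e.g. node.kubernetes.io).
func isReservedLabelNamespace(key string) bool {
	prefix := key
	if i := strings.IndexByte(key, '/'); i >= 0 {
		prefix = key[:i]
	}
	return prefix == "kubernetes.io" || prefix == "k8s.io" ||
		strings.HasSuffix(prefix, ".kubernetes.io") || strings.HasSuffix(prefix, ".k8s.io")
}
```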
What if a cloud provider integration wants to set, say, |
There's already a mechanism for that. Region and instance type are passed into the InstanceMetadata struct.
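For reference, a paraphrased sketch of what cloudprovider.InstanceMetadata looks like with this change; the field set matches the InstancesV2 metadata described in this PR, but the comments are approximations rather than the exact godoc:

```go
import v1 "k8s.io/api/core/v1"

// Paraphrased view of cloudprovider.InstanceMetadata with this PR's addition;
// field comments are approximations, not the exact godoc text.
type InstanceMetadata struct {
	// ProviderID uniquely identifies the instance at the cloud provider.
	ProviderID string
	// InstanceType backs the node.kubernetes.io/instance-type label.
	InstanceType string
	// NodeAddresses are the addresses assigned to the node.
	NodeAddresses []v1.NodeAddress
	// Zone and Region back the topology.kubernetes.io/zone and
	// topology.kubernetes.io/region labels.
	Zone   string
	Region string
	// AdditionalLabels (new in this PR) lets an InstancesV2 implementation
	// supply extra, provider-specific labels for the node controller to apply.
	AdditionalLabels map[string]string
}
```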
Thanks! Looks like everyone had a say :) and no new comments for a while. It looks sane to me. I am sure @mmerkes is happy to iterate as needed.

/approve
LGTM label has been added. Git tree hash: 99bcbe3662437eff362b5a421d57ca1339d75add
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dims, mmerkes. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/triage accepted
/retest |
The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass. This bot retests PRs for certain kubernetes repos according to the following rules:
You can:
/retest
lgtm |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds the AdditionalLabels field to the InstanceMetadata struct, which can be set by cloud provider implementations to provide custom labels that will be applied to nodes. There's a need in the AWS cloud provider to set labels specific to AWS on the nodes, and I assume that other cloud providers can put this to use as well. The change is backwards compatible because if the cloud providers never set the field, nothing will change. There's not a hook in the cloud provider implementations that allows them to customize node objects directly, so they either need to run additional controllers or provide the labels via the InstanceMetadata struct and instruct the node_controller on how to apply them to the node.

As a side note, this change also requires that cloud providers utilize InstancesV2, which gives them direct access to the instance metadata. The AWS cloud provider does not implement InstancesV2, so I will be doing that migration there directly.
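A hedged sketch of how an InstancesV2 implementation might populate the new field; the provider type, values, and label key below are placeholders for illustration, not the AWS implementation:

```go
import (
	"context"

	v1 "k8s.io/api/core/v1"
	cloudprovider "k8s.io/cloud-provider"
)

// exampleInstancesV2 is a stand-in provider type for illustration only.
type exampleInstancesV2 struct{}

// InstanceMetadata is the InstancesV2 hook where a provider can now return
// AdditionalLabels. All values and the label key below are placeholders.
func (e *exampleInstancesV2) InstanceMetadata(ctx context.Context, node *v1.Node) (*cloudprovider.InstanceMetadata, error) {
	return &cloudprovider.InstanceMetadata{
		ProviderID:   "example://" + node.Name,
		InstanceType: "example-type",
		Zone:         "example-zone-a",
		Region:       "example-region",
		// Provider-owned label namespace; the node controller applies these
		// as long as they stay out of kubernetes.io / k8s.io.
		AdditionalLabels: map[string]string{
			"node.example.cloud/placement-group": "group-1",
		},
	}, nil
}
```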
Which issue(s) this PR fixes:
N/A
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: