
[cloud-provider] only pretend that instances from the current "cloud" exist #9280

Closed
fxposter opened this issue Jan 19, 2024 · 29 comments

@fxposter

Is your feature request related to a problem? Please describe.
We're trying to use RKE2 in a somewhat special way: part of the cluster consists of VMs on bare-metal hardware, and additional nodes can be added on demand from AWS via Karpenter, cluster autoscaler, or something else. We set up Karpenter and got it working, but there is a small problem related to the cloud provider that RKE2 uses by default (and as far as I could find out, it is the one from this repo).
The problem we're facing now is that we can set providerID on those nodes manually, but unfortunately Karpenter also relies on the node.kubernetes.io/instance-type label to match the AWS instance type, and we cannot set that label via the --node-labels kubelet option because it is constantly overwritten by the value the cloud provider supplies (in our case it's always "rke").

Describe the solution you'd like
I think a good solution to the problem above, and a reasonable default behaviour, would be to change https://github.com/k3s-io/k3s/blob/master/pkg/cloudprovider/instances.go, specifically the InstanceExists and InstanceMetadata methods, to first check the providerID that is already set on the node (a rough sketch follows the list below):

  • if it is empty or starts with {version.Program}:// - then use the current behaviour
  • if it is something else (in our case version.Program is rke and providerIDs on AWS nodes start with aws://) - then InstanceExists would return false and InstanceMetadata would return nil, nil.
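
A rough sketch of what this could look like; stubCloud and the program constant are hypothetical stand-ins for the provider type and version.Program, not the actual K3s code:

package sketch

import (
	"context"
	"strings"

	corev1 "k8s.io/api/core/v1"
	cloudprovider "k8s.io/cloud-provider"
)

// program stands in for version.Program ("rke" here, "k3s" for K3s).
const program = "rke"

// stubCloud is a hypothetical stand-in for the built-in cloud provider type.
type stubCloud struct{}

// ownsNode reports whether this stub provider should manage the node:
// providerID is empty or uses our own URI scheme (e.g. "rke://node-name").
func ownsNode(node *corev1.Node) bool {
	id := node.Spec.ProviderID
	return id == "" || strings.HasPrefix(id, program+"://")
}

// InstanceExists keeps the current behaviour for our own nodes, but reports
// foreign nodes (e.g. providerID "aws://i-...") as nonexistent.
func (c stubCloud) InstanceExists(ctx context.Context, node *corev1.Node) (bool, error) {
	return ownsNode(node), nil
}

// InstanceMetadata returns nil for foreign nodes so their providerID and
// instance-type label are left alone.
func (c stubCloud) InstanceMetadata(ctx context.Context, node *corev1.Node) (*cloudprovider.InstanceMetadata, error) {
	if !ownsNode(node) {
		return nil, nil
	}
	return &cloudprovider.InstanceMetadata{
		ProviderID:   program + "://" + node.Name,
		InstanceType: program,
	}, nil
}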

Describe alternatives you've considered
none

Additional context
none

@fxposter
Author

cc @brandond as you were the one involved in this code

@fxposter
Author

PS: I'm happy to open the PR if I get approval.

@brandond
Contributor

I am not sure that this will get you what you want. Hybrid clusters (clusters with nodes whose provider IDs are set by different cloud controllers) are not supported by any cloud controller, as far as I know. The expectation is that all cluster members will have the same provider ID prefix. Some cloud controllers will throw errors; others will actively delete from the cluster any nodes that cannot be successfully looked up in the infrastructure provider - including nodes whose provider ID is set to an unrecognized format.

@fxposter
Author

If we ignore Karpenter (or cluster autoscaler), the cluster itself seems to work fine in that setup, but we would like to have a node autoscaler.
Deploying the AWS cloud controller doesn't make much sense because the control plane in our case has nothing to do with AWS, and Karpenter seems to manage those additional weird nodes fine up to the point of their termination, where it checks the instance type (for whatever reason) and it doesn't match.
But the k8s cloud-provider sets instance-type to rke, which IMHO is incorrect; my proposal is more obvious from a human perspective and is backwards-compatible for clusters whose nodes all come from a single provider.

@fxposter
Author

But from your perspective, is my proposal reasonable, or should cloud providers not really care about this at all?
I think we can try to play with --disable-cloud-controller (from https://docs.rke2.io/reference/server_config) on the RKE2 side, but we haven't tried it yet, and I thought it would be good to support more cases in RKE2's cloud provider out of the box.

@brandond
Contributor

brandond commented Jan 19, 2024

FWIW, something similar to this has been requested in the past, except in that case it was around node provider IDs instead of instance types.

I think there are a lot of sketchy edge cases to this ask, with regard to how different things expect provider IDs and instance types to be set. Since you need special behavior for your hybrid cloud setup, you might consider creating a custom cloud controller and disabling the one built in to k3s/rke2? The code here in K3s is probably a good starting point.

@brandond
Contributor

brandond commented Jan 19, 2024

If you disable the built-in cloud controller without providing an alternative, the nodes will remain tainted as uninitialized, and things like the external IPs (from --node-external-ip) won't get set properly. So you do need something.

@fxposter
Author

fxposter commented Jan 19, 2024

Sounds bad :(

except in that case it was around node provider IDs instead of instance types.

What we found out is that if we set provider-id explicitly in the kubelet arguments, it does get set and the one from the cloud controller is ignored. This is unfortunately not the case for instance-type.

@brandond
Contributor

brandond commented Jan 19, 2024

I guess that if you wanted to tackle just this specific use case, and allow overriding the instanceType returned by the cloud provider, you could add a new annotation, and have the cloud-provider look up that annotation instead of just hardcoding the program name.

InstanceType: version.Program,

You would also need a new flag (similar to --node-external-ip) to set the type when the node is registered, and wire it up in the node config to be set here:

k3s/pkg/agent/run.go

Lines 441 to 442 in 6d77b7a

// updateAddressAnnotations updates the node annotations with important information about IP addresses of the node
func updateAddressAnnotations(nodeConfig *daemonconfig.Node, nodeAnnotations map[string]string) (map[string]string, bool) {

The instanceType is unrelated to the prefix on the providerID; we just happen to use the program name as the providerID URI scheme as well because it seemed like a reasonable thing to do.
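
For illustration, a rough sketch of what the lookup side could look like; the annotation key and the program constant are invented here for the example, they are not actual K3s identifiers:

package sketch

import corev1 "k8s.io/api/core/v1"

// instanceTypeAnnotation is a hypothetical key that the agent would set at
// registration time from a new CLI flag (analogous to --node-external-ip).
const instanceTypeAnnotation = "k3s.io/node-instance-type"

// program stands in for version.Program.
const program = "k3s"

// instanceTypeFor returns what the cloud provider would report for the node:
// the agent-set annotation if present, otherwise the current hardcoded
// program name.
func instanceTypeFor(node *corev1.Node) string {
	if t, ok := node.Annotations[instanceTypeAnnotation]; ok && t != "" {
		return t
	}
	return program
}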

@fxposter
Author

I really wanted to avoid forking cloud providers and building our own :) Currently it all works fine, except that the kubelet for some reason treats providerID and the instance-type label differently: it prefers providerID from the command line but instance-type from the cloud provider, instead of preferring both from a single place :)

@brandond
Contributor

we can set providerID on those nodes manually, but unfortunately karpenter also relies on node.kubernetes.io/instance-type label to match the AWS instance type

If it is really just the instance type that you need control over, and you wanted to open a PR to change the behavior as suggested above, we would probably accept that.

@fxposter
Author

Mmm, do I understand your idea correctly: you propose changing https://github.com/k3s-io/k3s/blob/master/pkg/cloudprovider/instances.go#L74 so that if the node already has the relevant label, that value is used, and otherwise it falls back to the default node.kubernetes.io/instance-type value? We could actually do the same for providerID as well: if it's set on the Node object, use it; otherwise generate it from nodeName.

@brandond
Contributor

brandond commented Jan 19, 2024

No, I was suggesting plumbing the instanceType value through via an annotation set by the agent code, in the same way we set the node external IP. I don't think the instance-type label is one that nodes are allowed to set for themselves, so it needs to go through a node annotation and then be returned by the cloud provider instance info.

There are some things that the kubelet isn't allowed to set directly on its node object; that's the whole reason we have a stub cloud provider.

@fxposter
Author

Ah, okay, I got it.
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/ says it's actually allowed, but it also says it's an alpha feature, though it's not clear what exactly is alpha, because other labels work fine.

@fxposter
Author

I really wonder how you practically use annotations in this flow. I couldn't find a way to specify annotations on the node via kubelet flags, so when the node is created you cannot have custom annotations...

@brandond
Contributor

That's why I suggested adding a CLI flag that is used to set the annotation. I linked the place where we set other annotations from CLI flags like --node-external-ip in a previous comment.
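
To illustrate the wiring, a hypothetical helper in the spirit of the updateAddressAnnotations function shown earlier; the flag, config field, and annotation key are all invented for the example:

package sketch

// nodeConfig is a stand-in for the agent's node configuration; InstanceType
// would be populated from a hypothetical --node-instance-type CLI flag.
type nodeConfig struct {
	InstanceType string
}

// instanceTypeAnnotation is the same hypothetical key the cloud provider
// would read back (see the earlier sketch).
const instanceTypeAnnotation = "k3s.io/node-instance-type"

// updateInstanceTypeAnnotation copies the flag value into the node's
// annotations at registration time and reports whether anything changed.
func updateInstanceTypeAnnotation(cfg *nodeConfig, annotations map[string]string) (map[string]string, bool) {
	if cfg.InstanceType == "" || annotations[instanceTypeAnnotation] == cfg.InstanceType {
		return annotations, false
	}
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[instanceTypeAnnotation] = cfg.InstanceType
	return annotations, true
}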

@fxposter
Author

Hi. Can you please clarify what you are suggesting...
We have the kubelet, which doesn't have a way to configure annotations by itself. But you are using node annotations in the cloud controller where the node object is passed. So someone sets those annotations, and I am not sure who it is. :(

@brandond
Contributor

brandond commented Jan 22, 2024

someone sets those annotations, and I am not sure who it is

K3s does that. I linked the specific lines of code that do so in my comment: #9280 (comment)

@fxposter
Author

Does RKE2 use the same code?

@brandond
Contributor

brandond commented Jan 23, 2024

Yes. K3s is the core of RKE2. There are some unique bits for managing static pods on Linux or PEs on Windows, and controllers for hardening/RBAC, but other than that it's all just K3s.


github-actions bot commented Mar 8, 2024

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@brandond
Contributor

brandond commented Mar 8, 2024

I might take a shot at something in this space next month.

@brandond
Contributor

It looks like this is less complicated than I thought, since the kubelet sets things early enough that we can just reuse the provider-id and various labels in the cloud provider.
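
Roughly, the idea is something like this - prefer whatever the kubelet already registered (provider-id plus the instance-type/topology labels) over the stub defaults. An approximation for illustration, not the exact change that landed:

package sketch

import (
	corev1 "k8s.io/api/core/v1"
	cloudprovider "k8s.io/cloud-provider"
)

// program stands in for version.Program.
const program = "k3s"

// metadataFromNode builds instance metadata, preferring values the kubelet
// already set on the node (via kubelet-arg provider-id=... and --node-label)
// over the stub defaults.
func metadataFromNode(node *corev1.Node) *cloudprovider.InstanceMetadata {
	meta := &cloudprovider.InstanceMetadata{
		ProviderID:   program + "://" + node.Name,
		InstanceType: program,
	}
	if node.Spec.ProviderID != "" {
		meta.ProviderID = node.Spec.ProviderID // e.g. "aws://i-testprovider"
	}
	if t := node.Labels[corev1.LabelInstanceTypeStable]; t != "" {
		meta.InstanceType = t // node.kubernetes.io/instance-type
	}
	if r := node.Labels[corev1.LabelTopologyRegion]; r != "" {
		meta.Region = r // topology.kubernetes.io/region
	}
	if z := node.Labels[corev1.LabelTopologyZone]; z != "" {
		meta.Zone = z // topology.kubernetes.io/zone
	}
	return meta
}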

@mdrahman-suse

mdrahman-suse commented Apr 10, 2024

Validation on master with commit 4cc73b1

Environment and Config

Ubuntu 22.04
1 server
  • /etc/rancher/k3s/config.yaml
write-kubeconfig-mode: 644
cluster-init: true
token: summerheat
node-name: server1
kubelet-arg:
  - provider-id=aws://i-testprovider
node-label:
  - node.kubernetes.io/instance-type=c4.large
  - topology.kubernetes.io/region=us-east-2
  - topology.kubernetes.io/zone=us-east-2a

Testing Steps:

  • Install k3s using the config
  • Check that the provided args match the node info using kubectl get nodes -o yaml

Replication

v1.29.3+k3s1

$ kubectl get nodes -o yaml | grep "provider"
...
providerID: aws://i-testprovider

$ kubectl get nodes -o yaml
...
labels:
      beta.kubernetes.io/arch: amd64
      beta.kubernetes.io/instance-type: k3s
      beta.kubernetes.io/os: linux
      kubernetes.io/arch: amd64
      kubernetes.io/hostname: server1
      kubernetes.io/os: linux
      node-role.kubernetes.io/control-plane: "true"
      node-role.kubernetes.io/etcd: "true"
      node-role.kubernetes.io/master: "true"
      node.kubernetes.io/instance-type: k3s
      topology.kubernetes.io/region: us-east-2
      topology.kubernetes.io/zone: us-east-2a

Validation

$ k3s -v
k3s version v1.29.3+k3s-4cc73b1f (4cc73b1f)
go version go1.21.8

$ kubectl get nodes -o yaml | grep "provider"
...
providerID: aws://i-testprovider

$ kubectl get nodes -o yaml
...
labels:
      beta.kubernetes.io/arch: amd64
      beta.kubernetes.io/instance-type: c4.large
      beta.kubernetes.io/os: linux
      failure-domain.beta.kubernetes.io/region: us-east-2
      failure-domain.beta.kubernetes.io/zone: us-east-2a
      kubernetes.io/arch: amd64
      kubernetes.io/hostname: server1
      kubernetes.io/os: linux
      node-role.kubernetes.io/control-plane: "true"
      node-role.kubernetes.io/etcd: "true"
      node-role.kubernetes.io/master: "true"
      node.kubernetes.io/instance-type: c4.large
      topology.kubernetes.io/region: us-east-2
      topology.kubernetes.io/zone: us-east-2a

@mdrahman-suse

@brandond is this expected to work on upgrades?

@brandond
Contributor

brandond commented Apr 11, 2024

No, node-label only works when joining the cluster. Users can use kubectl label node to change them after the node has joined the cluster.

https://docs.k3s.io/cli/agent#node-labels-and-taints-for-agents

K3s agents can be configured with the options --node-label and --node-taint which adds a label and taint to the kubelet. The two options only add labels and/or taints at registration time, so they can only be added once and not changed after that again by running K3s commands.
If you want to change node labels and taints after node registration you should use kubectl. Refer to the official Kubernetes documentation for details on how to add taints and node labels.

ProviderID can only be set when joining the cluster; it cannot be changed afterwards.

@fxposter
Author

Thank you, we might consider using RKE2 again :)

@brandond
Contributor

brandond commented Apr 12, 2024

This will need to be pulled through to rke2 in an update to the rke2 cloud controller image; that has not been done yet.

@fxposter
Author

Yes, I understand, but it's just a matter of when it happens, not whether it happens, isn't it?
