
k3s CCM must allow for custom patching during node join #1644

Closed
sandys opened this issue Apr 16, 2020 · 13 comments

@sandys

sandys commented Apr 16, 2020

hi
we are attempting a production deploy of k3s on AWS, but with auto-scaling groups and spot instances.

To do this, we need to use cluster-autoscaler - which expects the ProviderID in a certain format (aws:///eu-west-3a/<EC2_INSTANCE_ID>). The issue is that the k3s built-in CCM sets the ProviderID in a different format (k3s://).

Now generally speaking, we don't need the AWS CCM. We are not doing much with it. All that we need is cluster-autoscaler. This is true for most people.
cluster-autoscaler will work if k3s is set up like this:

k3s server --disable-cloud-controller --kubelet-arg cloud-provider=aws
kubectl patch node <NODE_NAME> -p '{"spec":{"providerID":"aws://whatever"}}'

If the k3s CCM allows a custom providerID to be set during node join, everything should work just fine.
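
For illustration, a minimal sketch of the post-join patch in the format cluster-autoscaler expects, assuming the node can reach the EC2 instance metadata service (IMDSv1) and that NODE_NAME is supplied by the bootstrap:

# Build the AWS-format providerID from instance metadata and patch it in (sketch)
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
kubectl patch node "$NODE_NAME" -p "{\"spec\":{\"providerID\":\"aws:///${AZ}/${INSTANCE_ID}\"}}"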

@brandond
Member

If you disable the cloud controller, it shouldn't set the providerID at all. That's the behavior I see, at least - are you seeing something different? That should let you patch it out-of-band to whatever you need it to be.
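
For example, a quick way to check (a sketch; substitute your node name):

# Show the providerID, if any, currently set on the node
kubectl get node <NODE_NAME> -o jsonpath='{.spec.providerID}'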

@sandys
Author

sandys commented Apr 16, 2020 via email

@stale

stale bot commented Jul 31, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Jul 31, 2021
@jawabuu

jawabuu commented Aug 11, 2021

@sandys Did you get a workaround for this?

@stale stale bot removed the status/stale label Aug 11, 2021
@brandond
Member

I don't think it makes sense for the K3s cloud provider to contain functionality already covered by the AWS cloud-provider. If you want AWS ProviderIDs, disable the built-in k3s cloud-provider and install the out-of-tree aws cloud provider.

@jawabuu

jawabuu commented Aug 11, 2021

@brandond Understood.
In my case I'm actually deploying to Hetzner & Linode.
k3s works perfectly, including the CCM. What we're looking at is adding cluster-autoscaler abilities to k3s.
Unfortunately cluster-autoscaler has a hard requirement that the provider-id match the format for the cloud provider.
As @sandys commented, a feature request to override the provider-id on cluster creation would be very helpful in this case.
There are obviously manual workarounds, but these must run after creation, which impacts any bootstrapping process.
Since the k3s:// prefix does not really affect current functionality, I would ask that this be considered.

@brandond
Member

brandond commented Aug 11, 2021

cluster-autoscaler has a hard requirement that provider-id matches the format for the cloud provider.

So you want the K3s cloud provider to spoof other provider names so that the cluster autoscaler will think you're using the correct cloud provider? It seems like we'd need to also embed additional logic to set the provider ID; right now it just sets the node ID to the hostname, but other cloud providers do different things, like setting it to an instance ID or something. Would K3s be expected to do that too?

We recently made it possible to run the k3s cloud provider as a standalone pod - you might take a look at https://github.com/rancher/image-build-rke2-cloud-provider/blob/main/main.go and see if you can tweak the code to do what you want.

@jawabuu

jawabuu commented Aug 11, 2021

We recently made it possible to run the k3s cloud provider as a standalone pod - you might take a look at https://github.com/rancher/image-build-rke2-cloud-provider/blob/main/main.go and see if you can tweak the code to do what you want.

This is great news. I will definitely take a look.
In a simple case, the CCM could just honor such a flag; k3s should not be responsible for anything other than setting the ID.
Acquiring and formatting the ID should be the user's responsibility.
For example:

--kubelet-arg="provider-id=hcloud://$(curl -s http://169.254.169.254/hetzner/v1/metadata/instance-id)"
--kubelet-arg="provider-id=aws:///$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)/$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"

@sandys
Author

sandys commented Aug 12, 2021

@brandond I think the way this bug can be looked at is that it's not about the CCM. It's just that if you could allow us to set the providerID on node join, that would solve everything.

Other people have the same issue btw - https://liquidreply.net/scale-out-your-raspberry-pi-kubernetes-cluster-to-the-cloud?cookie-state-change=1628751131057

Now, someone may ask: if "kubectl patch" exists, why do we need this particular feature request?
Because the patch increases the complexity of the infrastructure massively. Since I need to patch the node post-join, I need an entire monitoring infrastructure that waits for a new node to come up and become healthy, and patches it only after that. This is a big, problematic thing to do (and is where we see frequent failures). If k3s allows the patch while joining, I can completely do away with this post-join monitoring infrastructure.
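
To make that concrete, a sketch of the kind of post-join watcher this currently forces on us (illustrative; NODE_NAME and PROVIDER_ID come from elsewhere in the bootstrap):

# Wait for the node to register and go Ready, then patch it - the extra
# moving part this feature request would eliminate (sketch)
until kubectl get node "$NODE_NAME" >/dev/null 2>&1; do sleep 5; done
kubectl wait --for=condition=Ready "node/$NODE_NAME" --timeout=600s
kubectl patch node "$NODE_NAME" -p "{\"spec\":{\"providerID\":\"${PROVIDER_ID}\"}}"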

There are lots of these restrictions when it comes to deploying k3s in the cloud.

e.g.

Make sure you set the hostname before attempting to bootstrap the Kubernetes cluster, or you’ll end up with nodes whose name in Kubernetes doesn’t match up, and you’ll see various “permission denied”/“unable to enumerate” errors in the logs. For what it’s worth, preliminary testing indicates that this step—setting the hostname to the FQDN—is necessary for Ubuntu but may not be needed for CentOS/RHEL.

It is a HUGE benefit.

So please don't look at this issue as a CCM customization request. Please look at it as a kubectl-patch-while-node-join request.

Also, the k3s agent config gives a lot of customization options - even "--node-name" and so on. Being able to set K3S_NODE_NAME="${EC2_INTERNAL_DNS}" is a godsend. There are a bunch of things that need to be done BEFORE join or else the cluster doesn't behave well. Some things are mentioned here - https://blog.scottlowe.org/2018/09/28/setting-up-the-kubernetes-aws-cloud-provider/
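
For example, a sketch of the pre-join setup we do today (assumes IMDSv1; local-hostname is the instance's internal DNS name):

# Set the node name to the EC2 internal DNS name before joining (sketch)
export K3S_NODE_NAME="$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)"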

In general, this is a popular request with any Kubernetes tool - e.g. kubernetes/kubeadm#202. Rancher has had similar requests as well: rancher/rancher#13076 and rancher/rancher#13835.

@brandond
Member

brandond commented Aug 12, 2021

--kubelet-arg="provider-id=hcloud://$(curl -s http://169.254.169.254/hetzner/v1/metadata/instance-id)"

If we did allow something like this, it would probably be called --provider-id, as --kubelet-arg is for passing args directly to the kubelet, which isn't how you would want to do this.
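
If such a flag existed, a join might look like this (purely hypothetical - no such flag exists at the time of writing):

# Hypothetical --provider-id flag on the agent (sketch)
k3s agent --server https://<SERVER_IP>:6443 --token <TOKEN> \
  --provider-id "hcloud://$(curl -s http://169.254.169.254/hetzner/v1/metadata/instance-id)"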

If you want to take a shot at a PR to do this, you can find the code here:
https://github.com/k3s-io/k3s/blob/master/pkg/cloudprovider/instances.go#L34-L44

Note that the agent and cloud provider communicate via annotations - the agent sets annotations on the node that declare its desired hostname and IP addresses, and the cloud provider reads and returns those when the cloud controller initializes the node. You'd need to add a new annotation for the desired providerid, and return that instead of the nodename. You might also need to do something with the InstanceType if you don't want them to come up with a k3s:// prefix on the providerid.
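
For anyone digging into that code, the annotations in question can be inspected on a live node (a sketch; it dumps all annotations rather than assuming specific keys):

# Show the annotations the agent set on its node at registration (sketch)
kubectl get node <NODE_NAME> -o jsonpath='{.metadata.annotations}'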

@jawabuu

jawabuu commented Aug 12, 2021

@brandond Thanks for this.
A --provider-id flag makes sense.

@sandys
Author

sandys commented Aug 13, 2021

Note that the agent and cloud provider communicate via annotations - the agent sets annotations on the node that declare its desired hostname and IP addresses, and the cloud provider reads and returns those when the cloud controller initializes the node. You'd need to add a new annotation for the desired providerid, and return that instead of the nodename. You might also need to do something with the InstanceType if you don't want them to come up with a k3s:// prefix on the providerid.

@brandond This is all the more reason why k3s must set things like the providerID, node name, etc. BEFORE joining. Once a node joins, there is a race condition between what the CCM (running on the existing k3s cluster) does and any kubectl patch that we apply manually.

That is why almost any documentation of node joins in a cloud scenario STRONGLY recommends doing all of this before joining, with strong disclaimers that setting the providerID, node name, etc. after joining may result in unknown situations.

@stale

stale bot commented Feb 9, 2022

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
