aws-node should untaint node #2808
Comments
@runningman84 the node should not be marked as "Ready" until the …
The VPC CNI plugin is not able to modify node taints, as that would be a sizable security risk.
Okay, what could be the reason for us seeing this behaviour?
Do you see the nodes as "Not Ready" during this time window? Do these application pods have their own tolerations?
I have double-checked that... (t = point in time; there can be seconds or even a minute between two of these points.) At t0 the node appears as not ready. It looks like the node becomes ready too fast, without waiting for the aws-node pod...
@runningman84 can you please share the node logs during this timeline? Mainly we would need to look at the CNI and IPAMD logs in …
I just sent the logs and we also have an open case id: 170893944201879 |
Thanks @runningman84, let's work through the support case, as the support team will triage and then bring in the service team if needed. |
Hi, any news on this one? We have the same issue.
The AWS support case did not really solve this; we got the suggestion to try prefix delegation mode or similar approaches to speed up IP allocation. The general question remains: should a node be unready until aws-node is up and running?
I am facing the same issue on EKS 1.28; kubelet shows Ready status on the node. It seems I will have to build my own workaround: monitoring new nodes myself, assigning a label aged=y after they have existed for a minute, and then giving all my pods a nodeAffinity that looks for that label. Ideally the AWS pods would add a label to the node themselves. Any ideas @jdn5126 ?
What would you like to be added:
In our EKS Bottlerocket use case we see Karpenter provisioning nodes which get the aws-node pod and some application pods upon start. Unfortunately, the aws-node pod takes several dozen seconds to become ready. The application pods try to start in the meantime and fail because they do not get an IP address.
Should we use Karpenter to taint the nodes until aws-node is ready? Is the CNI plugin able to remove a startup taint once it is ready?
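For reference, Karpenter does let you declare startup taints on provisioned nodes. A sketch, assuming Karpenter's v1beta1 NodePool API (the taint key below is illustrative, not a real VPC CNI convention), and noting that Karpenter expects some controller to remove the taint once the node is usable; per the maintainer comment in this thread, the VPC CNI itself will not do that:

```yaml
# Hypothetical NodePool with a startup taint. Pods without a matching
# toleration will not schedule until the taint is removed by some
# external controller (the VPC CNI will not remove it).
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      startupTaints:
        - key: example.com/cni-not-ready   # illustrative key
          effect: NoSchedule
```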
Why is this needed:
Pods constantly fail during startup due to missing IP addresses, because the aws-node pod is not yet ready.