aws-node should untaint node #2808

Open
runningman84 opened this issue Feb 23, 2024 · 11 comments

@runningman84

What would you like to be added:
In our EKS Bottlerocket use case we see Karpenter provisioning nodes which get the aws-node pod and some application pods upon start. Unfortunately, the aws-node pod takes several dozen seconds to become ready. The application pods try to start in the meantime and fail with errors because they do not get an IP address.

Should we use Karpenter to taint the nodes until aws-node is ready? Is the CNI plugin able to remove a startup taint once it is ready? (See the sketch below.)

Why is this needed:
Pods constantly fail during startup due to missing IP addresses because the aws-node pod is not ready.
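
To make the ask concrete, something like the following Karpenter NodePool snippet is what I have in mind (a minimal sketch showing only the relevant field, assuming the v1beta1 NodePool API; the taint key is made up, and today nothing would remove it, which is exactly the question):

```yaml
# Sketch only: shows just the startupTaints field of a Karpenter v1beta1
# NodePool. The taint key is hypothetical; something would still have to
# remove this taint once aws-node is ready.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      startupTaints:
        - key: example.com/cni-not-ready   # hypothetical key
          effect: NoSchedule
```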

@jdn5126
Contributor

jdn5126 commented Feb 23, 2024

@runningman84 the node should not be marked as "Ready" until the aws-node pod copies the CNI config to /etc/cni/net.d/, which it does after it finishes initialization. So the scheduler should not schedule application pods on a "Not Ready" node (unless those pods tolerate the node being not ready for some reason).
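
For example, a blanket toleration like the one below would let a pod land on a node that still carries the node.kubernetes.io/not-ready or node.kubernetes.io/network-unavailable taints (a generic sketch, not taken from any specific workload; the pod name and image are placeholders):

```yaml
# Sketch only: a pod with a broad toleration like this can be scheduled
# onto nodes that are still tainted node.kubernetes.io/not-ready or
# node.kubernetes.io/network-unavailable.
apiVersion: v1
kind: Pod
metadata:
  name: example-app   # hypothetical
spec:
  tolerations:
    - operator: Exists   # tolerates every taint, including not-ready
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:latest
      command: ["sleep", "3600"]
```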

@jdn5126
Contributor

jdn5126 commented Feb 23, 2024

The VPC CNI plugin is not able to modify node taints, as that would be a sizable security risk.
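
For context, removing a taint means patching the Node object, so every aws-node pod on every node would need cluster-wide RBAC along these lines (an illustrative sketch only, not the actual aws-node ClusterRole):

```yaml
# Illustrative sketch only: the kind of ClusterRole that taint removal
# would require. This is not the shipped aws-node ClusterRole.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cni-taint-remover   # hypothetical
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "patch", "update"]
```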

@runningman84
Author

Okay, what could be the reason for us seeing this behaviour?

@jdn5126
Contributor

jdn5126 commented Feb 23, 2024

> Okay, what could be the reason for us seeing this behaviour?

Do you see the nodes as "Not Ready" during this time window? Do these application pods have their own tolerations?

@runningman84
Author

I have double-checked that...

t = point in time... there can be seconds or even minutes between two numbers

t0 node appears as not ready
t1 daemon set pods are scheduled to it
t2 aws-node pod is in initializing
t2 kube-proxy pod is in initializing
t3 kube-proxy is ready
t4 node becomes ready (aws-node still not in running state)
t5 additional pods are scheduled to it
t6 all additional pods stay in state ContainerCreating (warnings due to "failed to assign IP address to container")
t7 aws-node pod is ready
t8 new pods get ready within seconds
t9 the additional pods from t6 start to get ready

It looks like the node becomes ready too fast, without waiting for the aws-node pod...

@jdn5126
Contributor

jdn5126 commented Feb 26, 2024

@runningman84 can you please share the node logs during this timeline? Mainly we would need to look at the CNI and IPAMD logs in /var/log/aws-routed-eni/. You can email them to k8s-awscni-triage@amazon.com and we can take a look.

@runningman84
Author

runningman84 commented Feb 27, 2024

I just sent the logs and we also have an open case id: 170893944201879

@jdn5126
Contributor

jdn5126 commented Feb 27, 2024

Thanks @runningman84, let's work through the support case, as the support team will triage and then bring in the service team if needed.

@mathieuherbert

Hi, any news on this one? We have the same issue.

@runningman84
Author

The AWS support case did not really solve that; we got the suggestion to try prefix delegation mode or similar to speed up IP allocation. The general question is: should a node be unready until aws-node is up and running?
I could imagine that that behaviour would also have downsides, because images would not be pulled before the node becomes ready. The current situation is that pods might not start due to the IP address issue, but at least all images are already pulled by the time they eventually start fine...
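
For reference, the suggestion boils down to setting an environment variable on the aws-node DaemonSet, roughly like this (a sketch showing only the relevant container env; ENABLE_PREFIX_DELEGATION and WARM_PREFIX_TARGET are the documented VPC CNI settings):

```yaml
# Sketch: only the relevant env entries of the aws-node container in the
# kube-system/aws-node DaemonSet are shown.
env:
  - name: ENABLE_PREFIX_DELEGATION
    value: "true"
  - name: WARM_PREFIX_TARGET
    value: "1"
```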

@tooptoop4

tooptoop4 commented May 18, 2024

I am facing the same issue on EKS 1.28.

kubelet shows Ready status on the node
pods start being scheduled onto that node
a few seconds later the node goes into NetworkNotReady state
the pods above get stuck forever
a few seconds later kubelet switches back to Ready state and new pods work, but the earlier ones don't

Seems I will have to make my own workaround by monitoring new nodes myself and assigning a label aged=y after they have been there for a minute.

Then make all my pods have a nodeAffinity looking for that label (see the sketch below).

Ideally the AWS pods would add such a label to the node themselves.
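
Something like this is what I mean (a sketch only; the aged=y label, pod name, and image are made up, and some external process would still have to add the label after a minute):

```yaml
# Sketch only: the label key/value (aged=y) and pod name are placeholders
# matching the workaround described above.
apiVersion: v1
kind: Pod
metadata:
  name: example-app   # hypothetical
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: aged
                operator: In
                values: ["y"]
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:latest
      command: ["sleep", "3600"]
```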

Any ideas @jdn5126?
