Cilium ENI max_allocate not considering the prefix delegation #32205
Labels
area/eni
Impacts ENI based IPAM.
kind/bug
This is a bug in the Cilium logic.
kind/community-report
This was reported by a user in the Cilium community, eg via Slack.
needs/triage
This issue requires triaging to establish severity and next steps.
sig/agent
Cilium agent related.
Is there an existing issue for this?
What happened?
Under the EKS with ENI mode. if you enable the prefix delegation with max_allocate. you will find that cilium operator will start to complain something like this
time="2024-04-26T19:34:26Z" level=warning msg="max-allocate (30) is higher than the instance type limits (15)" instanceID=i-0b6723de0b389b6d9 name=ip-192-168-172-141.ca-central-1.compute.internal subsys=ipam
However, I looked into the code and found it is because we do the math not considering prefix delegation and just return based on the instance type.
cilium/pkg/aws/eni/node.go
Lines 635 to 664 in fe46958
https://github.com/cilium/cilium/blob/main/pkg/aws/eni/limits/limits.go.
the reason I tried to set the max_allocate is because AWS has the capacity to limit how many pods you can schedule the node(https://docs.aws.amazon.com/eks/latest/userguide/cni-increase-ip-addresses.html). I think its better to set the same number with EKS it might save some IPs since the prefix delegation step is 16. I agree that it's ok to leave it as blank since EKS scheduler will check before cilium start to allocate IP, so the bug itself is not that easy to be triggered and max_allocate field can only be configured through a kubectl patch based on the fact we didn't copy it over from CNI (https://github.com/isovalent/cilium/blob/v1.12.14-cee.2/pkg/nodediscovery/nodediscovery.go#L574-L629).
I personally think we should read this number from k8s nodes resouce as the best pracitce/default config but we will need to add the logic for the prefix delegation to calculate the IP address.
update the ticket and remove the IP address wasting problem from this issue
Cilium Version
1.14.10
Kernel Version
sh-4.2$ uname -a
Linux ip-192-168-103-170.ca-central-1.compute.internal 5.10.213-201.855.amzn2.x86_64 #1 SMP Mon Mar 25 18:16:11 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
1.29
Regression
No response
Sysdump
cilium-sysdump-20240426-161439.zip
No response
Relevant log output
test.log
No response
Anything else?
No response
Cilium Users Document
Code of Conduct
The text was updated successfully, but these errors were encountered: