EKS IP-addresses limits #1366

Closed
okgolove opened this issue Oct 31, 2018 · 14 comments

@okgolove commented Oct 31, 2018

Hello. EKS uses the AWS VPC CNI to assign a private IP address to every pod: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI
So, if there are no free IP addresses, your pod won't be scheduled.

Can the autoscaler somehow implement autoscaling based on IP limits?
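
(For context, with the AWS VPC CNI the per-node pod capacity follows from the instance type's ENI limits. A rough sketch of the commonly cited derivation, assuming a t3.medium with 3 ENIs and 6 IPv4 addresses per ENI; one address per ENI is used by the ENI itself, and 2 is added for host-network pods:)

# sketch only: ENI and per-ENI IP counts are the usual EC2 figures for t3.medium,
# check the AWS ENI documentation for your instance type
enis=3
ips_per_eni=6
echo $(( enis * (ips_per_eni - 1) + 2 ))   # prints 17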

@aleksandra-malinowska (Contributor) commented Oct 31, 2018

It works differently for every cloud provider (e.g. on GCE, each node is assigned a range of IPs for pods). If I understand correctly, in this case IPs would be a cluster-level resource: they don't limit the number of nodes, and no matter how many nodes we add, pods may not be able to run. Currently there's no support for such resources at all. It could probably be implemented by injecting a new pod list processor, which would remove pods that won't be able to run anyway from scale-up calculations.

@johanneswuerbach (Contributor) commented Jan 7, 2019

I'm not entirely sure cluster-autoscaler needs to do anything here; your instances should actually be configured to allow only as many pods as there are assignable IP addresses, via the kubelet --max-pods option: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#options

A change for that was recently implemented in kops (kubernetes/kops#6058), but I don't know whether this is done in EKS by default.

The max-pods limit should also be recognised by the CA, but I'm not entirely sure whether it is. Maybe @aleksandra-malinowska knows more?
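
(For reference, --max-pods is the kubelet flag in question; a small hedged check to see whether a node's kubelet was started with it — run on the node itself, and note the value may instead be set in a kubelet config file:)

pgrep -a kubelet | grep -o -- '--max-pods=[0-9]*'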

@okgolove (Author) commented Jan 8, 2019

@johanneswuerbach hello, thank you for the feedback.
I have specified that option (--max-pods). It's exactly what I meant: once I exhaust the IP limit, Kubernetes can't schedule any new pods. It would be great if the CA could handle these situations.

@aleksandra-malinowska (Contributor) commented Jan 8, 2019

I believe the scheduler's predicates check the max-pods-per-node limit. If that's not the case, it's probably a bug.

I have specified that option (--max-pods). It's exactly what I meant.

Can you verify if your nodes indeed have this set? kubectl get node <node-name> -o yaml
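
(If it helps, a quick way to pull out just that field; plain kubectl jsonpath, nothing EKS-specific:)

kubectl get node <node-name> -o jsonpath='{.status.allocatable.pods}'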

@okgolove (Author) commented Jan 8, 2019

@aleksandra-malinowska

status:
  addresses:
  - address: 10.0.1.217
    type: InternalIP
  - address: ip-10-0-1-217.eu-west-1.compute.internal
    type: InternalDNS
  - address: ip-10-0-1-217.eu-west-1.compute.internal
    type: Hostname
  allocatable:
    cpu: "2"
    ephemeral-storage: "96625420948"
    hugepages-2Mi: "0"
    memory: 3937632Ki
    pods: "17"
  capacity:
    cpu: "2"
    ephemeral-storage: 104845292Ki
    hugepages-2Mi: "0"
    memory: 4040032Ki
    pods: "17"

Also I have the following option in kubelet-config.json:
"maxPods": 17

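(On stock EKS AMIs that file usually lives at /etc/kubernetes/kubelet/kubelet-config.json; a hedged way to confirm the value on the node itself — the path may differ on other AMIs:)

grep maxPods /etc/kubernetes/kubelet/kubelet-config.json   # expect "maxPods": 17
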
@aleksandra-malinowska (Contributor) commented Jan 8, 2019

Does CA ignore this (i.e. scale up assuming more than 17 pods will fit)? If so, any repro you have would be useful (sample pods etc.). The scheduler code looks fairly straightforward; not sure what may be wrong here :/

@okgolove (Author) commented Jan 9, 2019

Hm, it's strange: when I previously tried to deploy 100 pods to my EKS cluster, the CNI wasn't able to assign IPs to the pods, and those pods weren't in the "Pending" status (I don't remember what the status was).
But now when I deploy 100 pods I get a lot of pods in the "Pending" status and CA scales my nodes correctly:

Warning FailedScheduling 8s (x7 over 39s) default-scheduler 0/3 nodes are available: 3 Insufficient pods.

nginx-bucket-6f8b645d58-vg92h   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-vpkkk   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-w5px7   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-w6fv4   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-w8dxd   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-wbn5v   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-wkc26   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-wq926   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-ws7bz   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-x9k6d   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-xcgnf   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-xxp4b   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-zfcd7   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-zlpr4   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-zmsz6   0/1     Pending   0          4m
nginx-bucket-6f8b645d58-znjjt   0/1     Pending   0          4m

It seems CA works as expected.
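
(A minimal repro along these lines, assuming a plain nginx image and default node sizes; kubectl create deployment labels the pods app=nginx-bucket:)

kubectl create deployment nginx-bucket --image=nginx
kubectl scale deployment nginx-bucket --replicas=100
kubectl get pods -l app=nginx-bucket --watch   # watch for Pending pods and CA scale-up events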

@okgolove (Author) commented Jan 9, 2019

Oh, I've reproduced it!
The pod is stuck at 0/1 Running, with these events:

Warning  FailedCreatePodSandBox  58s (x12 over 70s)  kubelet, ip-10-0-2-196.eu-west-1.compute.internal  Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nginx-develop-7845f449bc-lnlqv_nginx-develop" network: add cmd: failed to assign an IP address to container
Normal   SandboxChanged          58s (x11 over 68s)  kubelet, ip-10-0-2-196.eu-west-1.compute.internal  Pod sandbox changed, it will be killed and re-created.
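
(When sandbox creation fails like this, the CNI side is worth checking; a hedged sketch, assuming the default EKS setup where the AWS VPC CNI runs as the aws-node daemonset in kube-system, with ipamd logs on the node under /var/log/aws-routed-eni/:)

kubectl -n kube-system logs daemonset/aws-node
# and, on the affected node itself:
sudo tail -n 100 /var/log/aws-routed-eni/ipamd.log
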
@aleksandra-malinowska (Contributor) commented Jan 9, 2019

CA only makes sure there are enough nodes to schedule pods on. In this case, it seems the pod was scheduled, but kubelet wasn't actually able to run it. I'd look for the scheduling constraints that were supposed to prevent this and ensure they're in place. Perhaps 17 pods per node is too many in this case, or there's some global limit on the number of pods?

@okgolove (Author) commented Jan 9, 2019

Thank you for your help.
I don't think this is a CA problem.
This issue may be closed if you think it should be.

@dkuida commented Mar 26, 2019

@okgolove did you manage to solve that? When I increase --max-pods, the IPs cannot be assigned, but my machine is clearly underutilized.

@okgolove (Author) commented Mar 26, 2019

@dkuida hi! I didn't.
I've decided to ignore it and just use kops in production :)

But you can subscribe to the CNI issue aws/amazon-vpc-cni-k8s#214.
I hope we will get the ability to choose the CNI plugin.

@mohag commented Oct 16, 2019

@dkuida When using the default EKS AWS VPC CNI, max-pods is set to the maximum number of IP addresses available for assignment to pods on that instance type.

Increasing max-pods without changing the CNI to something else won't work. (Changing the CNI is possible, but possibly unsupported.)

Some instance types have a higher pods-per-CPU/RAM ratio, which might help (e.g. t3.large can do 35 pods, while t3a.xlarge (double the size) and t3a.2xlarge (four times the size) can only do 58 pods).
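
(Those figures line up with the usual ENI-based formula, assuming t3.large has 3 ENIs x 12 IPv4 addresses each, and t3a.xlarge/t3a.2xlarge both have 4 ENIs x 15:)

echo "t3.large:    $(( 3 * (12 - 1) + 2 ))"   # 35
echo "t3a.xlarge:  $(( 4 * (15 - 1) + 2 ))"   # 58
echo "t3a.2xlarge: $(( 4 * (15 - 1) + 2 ))"   # 58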

@runningman84 commented Feb 6, 2020

I think this issue should not be closed... in my experience, if you cannot place a new pod due to the per-node pod limit, the cluster autoscaler should scale up and add nodes. Right now it says something like this:

I0206 14:08:34.906644       1 scale_down.go:706] No candidates for scale down
I0206 14:08:34.906918       1 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"external-dns-5fcb999649-59jdp", UID:"44a7afe3-48d4-11ea-ab32-0a5855b3f258", APIVersion:"v1", ResourceVersion:"25283822", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added): 
I0206 14:08:34.906956       1 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"k8s-spot-termination-handler-rtk4c", UID:"76b73040-48c6-11ea-ab32-0a5855b3f258", APIVersion:"v1", ResourceVersion:"25283856", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added): 
I0206 14:08:34.906976       1 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"k8s-spot-termination-handler-xmtv9", UID:"29ae5cfc-48d4-11ea-ab32-0a5855b3f258", APIVersion:"v1", ResourceVersion:"25283551", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added): 