
Only 5 SGs per ENI allowed #682

Closed
dokipen opened this issue Oct 18, 2018 · 36 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@dokipen

dokipen commented Oct 18, 2018

I got this error today: failed association of SecurityGroups due to failed to reconcile managed Instance securityGroup attachment due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.

using quay.io/coreos/alb-ingress-controller:1.0-beta.7

@M00nF1sh
Collaborator

M00nF1sh commented Oct 19, 2018

I wasn't aware of this issue... we create a managed securityGroup that will be attached to worker ENIs for every ingress by default.
For now,

  1. You can increase the limit of 5 securityGroups per ENI by opening a ticket with AWS support: https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-security-groups, which gives you a maximum of 15 (16 - 1) ingresses.
  2. Or you can manage the securityGroup yourself: create a securityGroup sg-ForLB, annotate every ingress with alb.ingress.kubernetes.io/security-groups: sg-ForLB, then add a rule to your worker node securityGroup allowing ingress traffic from sg-ForLB (a minimal example follows below).
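
For reference, here is a minimal sketch of what option 2 looks like on the Ingress side. The ingress name and backend service are placeholders, and sg-ForLB stands in for whatever pre-created group you use:

```yaml
# Hypothetical Ingress using a pre-created security group instead of a
# controller-managed one (option 2 above).
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress                                    # placeholder name
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/security-groups: sg-ForLB    # pre-created SG attached to the ALB
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: example-service                 # placeholder backend Service
              servicePort: 80
```

With the annotation present, the controller attaches only sg-ForLB to the ALB and skips the managed instance securityGroup, so the per-ENI group count on the worker nodes stays flat.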

@bigkraig I think ideally we could use a shared instance securityGroup for all ALBs, and add/delete rules inside that instance securityGroup 😄 I'll make a PR for it

@dokipen
Author

dokipen commented Oct 20, 2018

Thanks. That's perfect. Note that you also need fewer than (300 / $sg_eni_limit) rules in ALL of your security groups, which meant that in my case I could only increase the per-ENI limit to 8.

@vallurupallikhetan

Currently, I am facing the same issue with the same image quay.io/coreos/alb-ingress-controller:1.0-beta.7

@rmn36

rmn36 commented Oct 24, 2018

Also having this issue. In order to increase SGs per interface you need to decrease rules per SG to stay within the limit of 250 rules per interface (default hard limit per region: 5 security groups × 50 rules = 250), so we weren't able to increase the number of SGs.

@bigkraig

Unfortunately SG limitations and dealing with the repercussions of that are a fact of life in AWS. We had the same problems at Ticketmaster and were able to improve the situation by reusing a lot of crafted groups for specific use cases. We can try to find ways of making the controller more adaptable but in the end there isn't anything that can truly solve the problem.

@bincyber

bincyber commented Nov 16, 2018

I ran into this issue when target-type: instance and found the best course of action was to self-manage the security group using whatever IaC tool (CloudFormation/Terraform) was used to provision the VPC and EKS cluster. I have submitted a PR to update the documentation: #734

@Multiply

@M00nF1sh Anything we can do to help speed up the creation of a PR? :D

@M00nF1sh
Collaborator

M00nF1sh commented Dec 3, 2018

@Multiply umm.. @bigkraig didn't like the idea of a shared instance SG (less safe than individual SGs)... maybe this can be improved by a feature flag (though it's more complicated than simply changing it)?

@Multiply

Multiply commented Dec 3, 2018

I guess we need to balance the creation of rules between SGs depending on AWS limits, as these might differ between AWS accounts, but that also proves a bit difficult if your cluster spans multiple accounts.

There doesn't seem to be a simple fix, but having an option to allow merging certain SGs would be nice.

@encron

encron commented Feb 26, 2019

It would be nice if we could merge the SGs on the instance side, since they will always have the same ruleset anyway, allowing all traffic from the ALB to the worker nodes. That way we'd only have 1 SG per worker's ENI and wouldn't come near the limit. Is anyone already working on a PR to implement something similar?

@bendrucker

When this issue was originally filed, the SGs-per-ENI limit multiplied by the rules-per-SG limit had to come to less than 300. The multiplicative limit is now 1,000 (so, for example, 16 groups × 60 rules = 960 fits), which leaves a bit more headroom for raising the SGs-per-ENI limit.

@yifan-gu

yifan-gu commented Apr 3, 2019

Still, the limit is only 16:

To increase or decrease this limit, contact AWS Support. The maximum is 16.

If we could dynamically create new node SGs or merge rules into a single SG, that would mitigate the problem a lot.

Seems that @M00nF1sh already has something that's working (thank you:)), any idea when you would release the code?

@yifan-gu

yifan-gu commented Apr 5, 2019

fwiw, I made something that's working in my branch, let me know if you want to discuss this @M00nF1sh

@M00nF1sh
Collaborator

M00nF1sh commented Apr 5, 2019

@yifan-gu
Hi, thanks for providing a fix for this, really appreciate it.

I'm wondering whether we can fall back to a simpler version, as below:

  1. Don't create instance security groups at all. Just use the existing securityGroup on the worker nodes, and allow ingress traffic from the ALB securityGroup.
    (This is no less secure than the current model, since the ALB should be considered inside the trust boundary, and all traffic inside the VPC is secure.)
    The existing securityGroup on a worker node can be resolved using the same logic as Kubernetes core (this is how serviceType=LoadBalancer works too).

Actually I have implemented it in a new branch for ingress groups: https://github.com/M00nF1sh/aws-alb-ingress-controller/tree/ingress-group/

@monkeymon

Hit the same restriction from AWS today as well; I think we will be asking AWS to increase the limit while we look for a workaround using a CloudFormation script.

Really interested to see this go into stable so we can have a cleaner implementation.
I agree with @M00nF1sh that it's a simpler approach. How can I test your implementation in the ingress-group branch?

@Spareo

Spareo commented Apr 27, 2019

@M00nF1sh would it be possible to get an estimate of when the fix for this would be released?

@montanaflynn

Also hitting this issue. After a little confusion I finally checked the logs and found:

"Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached

@yifan-gu

yifan-gu commented May 10, 2019

(Quoting @M00nF1sh's proposal above: drop the managed instance securityGroup, reuse the existing worker node securityGroup, and allow ingress traffic from the ALB securityGroup.)

@M00nF1sh Sounds like a simpler approach, although the limit on rules per SG isn't very high either (60 by default, up to 100 [1]); maybe that's enough for most use cases.
[1] https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

@diclophis

@M00nF1sh is there anything we can do to help you get the fix PR that you have developed merged in?

@irlevesque

irlevesque commented Jun 20, 2019

It appears that no movement is being made on this issue. I'd like to comment that this is also affecting our organization. Perhaps we need to be more vocal about how we're being affected by this limitation? Is there something blocking progress?

@M00nF1sh
Collaborator

M00nF1sh commented Jun 20, 2019

@irlevesque @diclophis
Hi, the reason I'm hesitant to merge it is that we'll support a separate securityGroup per pod in the near future. I need to discuss with the team what the best approach for securityGroup handling should be.

For instance mode, since any pod can be addressed via the node (kube-proxy), there is no need to create a node securityGroup at all; we can just use the existing node securityGroup.
For ip mode, we might have to create a securityGroup and attach it to the pods that need it (there might be enterprise customers with concerns about a securityGroup being attached to all nodes/pods). There are two approaches to handle ip mode:

  1. Change it to be the same as instance mode and don't create a separate securityGroup at all. Once we actually support a securityGroup per pod, change it back to the current model (create a node securityGroup and attach it to ENIs as needed).
  2. Keep the current model (create a node securityGroup and attach it to ENIs as needed), and handle #824 (deregistration delay doesn't seem to have any effect, since the security group is detached from the ENI too early) correctly by checking the target deregistration status in the targetGroup and asynchronously updating the securityGroup attachment. (I haven't had enough time to work on this yet.)

@irlevesque This can be easily mitigated by manually creating a securityGroup, using alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> on the ingress, and then adding an inbound rule for sgCreatedForALB to the worker node securityGroup.
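
For anyone who wants a concrete starting point, here is a rough CloudFormation sketch of that mitigation (all IDs are placeholders, not values from this thread): a shared securityGroup for the ALBs, which is the one you reference in the alb.ingress.kubernetes.io/security-groups annotation, plus an inbound rule on the existing worker node securityGroup that allows traffic from it.

```yaml
# Hypothetical sketch of the manual mitigation; VPC ID and node SG ID are placeholders.
Resources:
  AlbSharedSecurityGroup:                        # reference this group in the ingress annotation
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Shared security group for ALBs, managed outside the controller
      VpcId: vpc-0123456789abcdef0               # placeholder VPC ID
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0                      # ALB listener traffic; tighten as needed

  NodeIngressFromAlb:                            # let the ALB group reach the worker nodes
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: sg-0123456789abcdef1              # placeholder: existing worker node SG
      IpProtocol: tcp
      FromPort: 0
      ToPort: 65535
      SourceSecurityGroupId: !Ref AlbSharedSecurityGroup
```

The same two resources can just as well be expressed in Terraform or created by hand; the only requirement is that the group referenced by the annotation is the source of an inbound rule on the node securityGroup.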

@encron

encron commented Jul 18, 2019

@M00nF1sh creating each security group manually isn't really workable in our case and this issue is a big blocker for using this nice ingress-controller :(

@anderm3

anderm3 commented Aug 2, 2019

This is impacting some projects of mine as well. One issue I see with the alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> option is that the setting would be per Kubernetes cluster, and we would have to track it across the various applications we deploy into the cluster. A more viable option would be to let alb-ingress-controller users specify that value at the Helm install level for the ingress controller. We would still have to create the SG manually, but since we'd make one per k8s cluster and set it cluster-wide at alb-ingress-controller install time, we would no longer need to manage it per deployed application.

@iliazlobin

The annotation alb.ingress.kubernetes.io/security-groups lets us specify a security group we've created in advance that will be attached to the application load balancer created by the ALB controller.

The behavior is pretty similar to what the Kubernetes AWS cloud provider offers (the Nginx ingress controller uses that library) with the analogous annotation service.beta.kubernetes.io/aws-load-balancer-security-groups on a Service resource. There's no mention of that annotation in the official documentation, but you can still explore the source code.

The difference is how they manage security groups for an EC2 instance.

Generally speaking, the traffic needs to be delivered to an instance. It goes the following way:
[LB SG] LB -> [EC2 INSTANCE SG] EC2 INSTANCE -> KUBE-PROXY -> ...
Just specifying LB security groups is not enough (unless, of course, you want the EC2 instance security groups open to the world), so different controller implementations solve the problem in different ways.

  1. AWS cloud provider

Once the load balancer is spun up and the security groups (specified or created) are attached to it, the controller searches for the security groups attached to the network interfaces (ENIs) of all the instances in the cluster. Once it has found the group(s), it adds a rule so that every network interface of every instance in the cluster permits traffic from the load balancer.

  2. ALB ingress controller

This one tries to be more precise in specifying rules, and only in the case when the annotation alb.ingress.kubernetes.io/security-groups is not specified. When it isn't, the controller creates a group for the instance with a predefined rule referencing the load balancer SG. Then it finds the specific ENI (in case there are several on an instance) among all of the instances and tries to attach the SG to it. And here we reach the limit pretty fast.

Is that the right logic?

I'm wondering why the ALB controller doesn't use the same logic as the AWS cloud provider. It seems pretty neat and correct. The ALB ingress controller in IP target-type mode (the alb.ingress.kubernetes.io/target-type annotation) already solves the problem of an extra hop by not forwarding the traffic through a proxy; in that case it goes directly to the recipient (the Pod's allocated IP). Why, then, do we need to make things more complicated over security concerns? We can still address the security issues in the native Kubernetes way, using Network Policies.

The workaround.

As has been said, the workaround is to manage the security groups externally. In our organization we used Terraform to spin up the EKS cluster, so the groups for the load balancers are managed from there.

I had some doubts about whether Terraform would conflict with the changes made by the AWS cloud provider (which adds rules to the existing SGs attached to an instance), but as it turns out it doesn't. When you apply a plan, Terraform only manages the rules in its own state and leaves alone any rule without a matching aws_security_group_rule resource.

@mmack-innio

any news on this?

@TheBrigandier

Hello,

While the alb.ingress.kubernetes.io/security-groups annotation works, is there any possibility of specifying a default for this and allowing users to override it with this annotation? At my company this is causing a lot of support requests as end users are not aware we are running up against SG limits, and alb-ingress doesn't put a note in the ingress object's event log as to that being the case.

Thanks!

@Alexx-G

Alexx-G commented Feb 10, 2020

Hi,

@M00nF1sh #1019 fixes this issue, doesn't it?
If alb-ingress-controller modifies the worker's security group and reconciles changes, then we shouldn't hit any limits on security groups. And it should work even with Managed Node Groups.

@ecout

ecout commented Mar 24, 2020

March 24, 2020.

SecurityGroupsPerInterfaceLimitExceeded was also thrown in my case.

@M00nF1sh
Is there any way the ingress controller can use node affinity to prevent pods from being scheduled onto nodes whose ENIs have reached the hard limit? If cluster autoscaling is managed properly, pods can be allocated to new nodes/ENIs, thus avoiding the hard limit, at least at the ENI level.
Here are the current limits:

https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

Thanks!

@ecout

ecout commented Mar 24, 2020

Hi,

@M00nF1sh #1019 fixes this issue, doesn't it?
If alb-ingress-controller modifies the worker's security group and reconciles changes, then we shouldn't hit any limits on security groups. And it should work even with Managed Node Groups.

@Alexx-G
+1. But actually, the security groups themselves have a hard limit of about 40 rules. So with your suggestion we can increase the number of ingresses in the cluster significantly, but not make it truly unlimited.

Edit:
The limits on the number of SGs and on SG rules have been increased across the board, but this is still a possibility unless Services stop pointing to nodes that can't take any more.
https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

@ecout

ecout commented Mar 24, 2020

Hello,

While the alb.ingress.kubernetes.io/security-groups annotation works, is there any possibility of specifying a default for this and allowing users to override it with this annotation? At my company this is causing a lot of support requests as end users are not aware we are running up against SG limits, and alb-ingress doesn't put a note in the ingress object's event log as to that being the case.

Thanks!

I don't understand what you mean. The logs are the reason I was able to find this issue:

kubebuilder/controller "msg"="Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.\n\tstatus code: 400, request id: ************"  "controller"="alb-ingress-controller" "request"={"Namespace":"******","Name":"*********"}

@ecout

ecout commented Mar 25, 2020

(Quoting @iliazlobin's comment above in full.)

Agreed.
Grouping ALBs by ingress policy can reduce the number of security groups needed per ALB, and hence the number of instance SGs that need to be attached. If the Ingress definition is left without an explicit SG, every time you create an ingress a new pair of SGs gets created, until you reach the hard limit on the ENI side.

Edit: just out of curiosity I took a look at the cloud provider and found this feature request. Interestingly enough, it seems they handle everything with a single SG, with no option for multiple-SG support.

kubernetes/cloud-provider-aws#81

@ecout

ecout commented Mar 25, 2020

Ticketmaster and were able to improve the situation by reusing a lot of crafted groups for specific use cases.

@bigkraig How about node affinity with autoscaling? More nodes, more ENIs. Pods belonging to an Ingress that can't get on an ENI could be moved to a new node.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 23, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 23, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
