
Only 5 SGs per ENI allowed #682

Closed
dokipen opened this issue Oct 18, 2018 · 36 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@dokipen

dokipen commented Oct 18, 2018

I got this error today: failed association of SecurityGroups due to failed to reconcile managed Instance securityGroup attachment due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.

using quay.io/coreos/alb-ingress-controller:1.0-beta.7

@M00nF1sh
Collaborator

M00nF1sh commented Oct 19, 2018

I wasn't aware of this issue... we create a managed securityGroup that will be attached to worker ENIs for every ingress by default.
For now,

  1. You can increase the limit of 5 securityGroups per ENI by opening a ticket with AWS support: https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-security-groups, which gives you a maximum of 15 (16 - 1) ingresses.
  2. Or you can manage the securityGroup yourself: create a securityGroup sg-ForLB, annotate every ingress with alb.ingress.kubernetes.io/security-groups: sg-ForLB, then add a rule to your worker node securityGroup allowing ingress traffic from sg-ForLB (a minimal example follows below).
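
For reference, here is a minimal sketch of what option 2 looks like on the Ingress side. The ingress name and backend service are placeholders, and sg-ForLB stands in for whatever pre-created group you use:

```yaml
# Hypothetical Ingress using a pre-created security group instead of a
# controller-managed one (option 2 above).
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress                                    # placeholder name
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/security-groups: sg-ForLB    # pre-created SG attached to the ALB
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: example-service                 # placeholder backend Service
              servicePort: 80
```

With the annotation present, the controller attaches only sg-ForLB to the ALB and skips the managed instance securityGroup, so the per-ENI group count on the worker nodes stays flat.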

@bigkraig I think ideally we could use a shared instance securityGroup for all ALBs, and add/delete rules inside that instance securityGroup 😄 I'll make a PR for it

@dokipen
Author

dokipen commented Oct 20, 2018

Thanks. That's perfect. Note that you also need fewer than (300 / $sg_eni_limit) rules in ALL of your security groups, which meant that in my case I could only increase the per-ENI limit to 8.

@vallurupallikhetan

Currently, I am facing the same issue with the same image quay.io/coreos/alb-ingress-controller:1.0-beta.7

@rmn36

rmn36 commented Oct 24, 2018

Also having this issue. In order to increase SGs per interface you need to decrease rules per SG to stay within the limit of 250 rules per interface (default hard limit per region: 5 security groups × 50 rules = 250), so we weren't able to increase the number of SGs.

@bigkraig

Unfortunately SG limitations and dealing with the repercussions of that are a fact of life in AWS. We had the same problems at Ticketmaster and were able to improve the situation by reusing a lot of crafted groups for specific use cases. We can try to find ways of making the controller more adaptable but in the end there isn't anything that can truly solve the problem.

@bincyber

bincyber commented Nov 16, 2018

I ran into this issue when target-type: instance and found the best course of action was to self-manage the security group using whatever IaC tool (CloudFormation/Terraform) was used to provision the VPC and EKS cluster. I have submitted a PR to update the documentation: #734

@Multiply

@M00nF1sh Anything we can do to help speed up the creation of a PR? :D

@M00nF1sh
Collaborator

M00nF1sh commented Dec 3, 2018

@Multiply umm.. @bigkraig didn't like the idea of a shared instance SG (less safe than individual SGs)... maybe this can be improved by a feature flag (though it's more complicated than simply changing it)?

@Multiply

Multiply commented Dec 3, 2018

I guess we need to balance the creation of rules between SGs depending on AWS limits, as these might differ between AWS accounts, but that also proves a bit difficult if your cluster spans multiple accounts.

There doesn't seem to be a simple fix, but having an option to allow merging certain SGs would be nice.

@encron

encron commented Feb 26, 2019

It would be nice if we could merge the SGs on the instance side, since they will always have the same ruleset anyway, allowing all traffic from the ALB to the worker nodes. That way we'd only have 1 SG per worker's ENI and wouldn't come near the limit. Is anyone already working on a PR to implement something similar?

@bendrucker

When this issue was originally filed, the SGs-per-ENI limit multiplied by the rules-per-SG limit had to come to less than 300. The multiplicative limit is now 1,000 (so, for example, 16 groups × 60 rules = 960 fits), which leaves a bit more headroom for raising the SGs-per-ENI limit.

@yifan-gu

yifan-gu commented Apr 3, 2019

Still, the limit is only 16:

To increase or decrease this limit, contact AWS Support. The maximum is 16.

If we could dynamically create new node SGs or merge rules into a single SG, that would mitigate the problem a lot.

Seems that @M00nF1sh already has something that's working (thank you:)), any idea when you would release the code?

@yifan-gu

yifan-gu commented Apr 5, 2019

fwiw, I made something that's working in my branch, let me know if you want to discuss this @M00nF1sh

@M00nF1sh
Collaborator

M00nF1sh commented Apr 5, 2019

@yifan-gu
Hi, thanks for providing a fix for this, really appreciate it.

I'm wondering whether we can fall back to a simpler version, as below:

  1. Don't create instance security groups at all. Just use the existing securityGroup on the worker nodes, and allow ingress traffic from the ALB securityGroup.
    (This is no less secure than the current model, since the ALB should be considered inside the trust boundary, and all traffic inside the VPC is secure.)
    The existing securityGroup on a worker node can be resolved using the same logic as Kubernetes core (this is how serviceType=LoadBalancer works too).

Actually I have implemented it in a new branch for ingress groups: https://github.com/M00nF1sh/aws-alb-ingress-controller/tree/ingress-group/

@monkeymon

Hit the same restriction from AWS today as well; I think we will be asking AWS to increase the limit while we look for a workaround using a CloudFormation script.

Really interested to see this go into stable so we can have a cleaner implementation.
I agree with @M00nF1sh that it's a simpler approach. How can I test your implementation in the ingress-group branch?

@Spareo

Spareo commented Apr 27, 2019

@M00nF1sh would it be possible to get an estimate of when the fix for this would be released?

@montanaflynn

Also hitting this issue. After a little confusion I finally checked the logs and found:

"Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached

@yifan-gu

yifan-gu commented May 10, 2019

(Quoting @M00nF1sh's proposal above: drop the managed instance securityGroup, reuse the existing worker node securityGroup, and allow ingress traffic from the ALB securityGroup.)

@M00nF1sh Sounds like a simpler approach, although the limit on rules per SG isn't very high either (60 by default, up to 100 [1]); maybe that's enough for most use cases.
[1] https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

@diclophis

@M00nF1sh is there anything we can do to help you get the fix PR that you have developed merged in?

@irlevesque

irlevesque commented Jun 20, 2019

It appears that no movement is being made on this issue. I'd like to comment that this is also affecting our organization. Perhaps we need to be more vocal about how we're being affected by this limitation? Is there something blocking progress?

@M00nF1sh
Collaborator

M00nF1sh commented Jun 20, 2019

@irlevesque @diclophis
Hi, the reason I'm hesitant to merge it is that we'll support a separate securityGroup per pod in the near future. I need to discuss with the team what the best approach for securityGroup handling should be.

For instance mode, since any pod can be addressed via the node (kube-proxy), there is no need to create a node securityGroup at all; we can just use the existing node securityGroup.
For ip mode, we might have to create a securityGroup and attach it to the pods that need it (there might be enterprise customers with concerns about a securityGroup being attached to all nodes/pods). There are two approaches to handle ip mode:

  1. Change it to be the same as instance mode and don't create a separate securityGroup at all. Once we actually support a securityGroup per pod, change it back to the current model (create a node securityGroup and attach it to ENIs as needed).
  2. Keep the current model (create a node securityGroup and attach it to ENIs as needed), and handle #824 (deregistration delay doesn't seem to have any effect, since the security group is detached from the ENI too early) correctly by checking the target deregistration status in the targetGroup and asynchronously updating the securityGroup attachment. (I haven't had enough time to work on this yet.)

@irlevesque This can be easily mitigated by manually creating a securityGroup, using alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> on the ingress, and then adding an inbound rule for sgCreatedForALB to the worker node securityGroup.
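
For anyone who wants a concrete starting point, here is a rough CloudFormation sketch of that mitigation (all IDs are placeholders, not values from this thread): a shared securityGroup for the ALBs, which is the one you reference in the alb.ingress.kubernetes.io/security-groups annotation, plus an inbound rule on the existing worker node securityGroup that allows traffic from it.

```yaml
# Hypothetical sketch of the manual mitigation; VPC ID and node SG ID are placeholders.
Resources:
  AlbSharedSecurityGroup:                        # reference this group in the ingress annotation
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Shared security group for ALBs, managed outside the controller
      VpcId: vpc-0123456789abcdef0               # placeholder VPC ID
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0                      # ALB listener traffic; tighten as needed

  NodeIngressFromAlb:                            # let the ALB group reach the worker nodes
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: sg-0123456789abcdef1              # placeholder: existing worker node SG
      IpProtocol: tcp
      FromPort: 0
      ToPort: 65535
      SourceSecurityGroupId: !Ref AlbSharedSecurityGroup
```

The same two resources can just as well be expressed in Terraform or created by hand; the only requirement is that the group referenced by the annotation is the source of an inbound rule on the node securityGroup.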

@encron

encron commented Jul 18, 2019

@M00nF1sh creating each security group manually isn't really workable in our case and this issue is a big blocker for using this nice ingress-controller :(

@anderm3

anderm3 commented Aug 2, 2019

This is impacting some projects of mine as well. One issue I see with the alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> option is that the setting would be per Kubernetes cluster, and we would have to track it across the various applications we deploy into the cluster. A more viable option would be to let alb-ingress-controller users specify that value at the Helm install level for the ingress controller. We would still have to create the SG manually, but since we'd make one per k8s cluster and set it cluster-wide at alb-ingress-controller install time, we would no longer need to manage it per deployed application.

@iliazlobin

The annotation alb.ingress.kubernetes.io/security-groups lets us specify a security group we've created in advance that will be attached to the application load balancer created by the ALB controller.

The behavior is pretty similar to what the Kubernetes AWS cloud provider offers (the Nginx ingress controller uses that library) with the analogous annotation service.beta.kubernetes.io/aws-load-balancer-security-groups on a Service resource. There's no mention of that annotation in the official documentation, but you can still explore the source code.

The difference is how they manage security groups for an EC2 instance.

Generally speaking, the traffic needs to be delivered to an instance. It goes the following way:
[LB SG] LB -> [EC2 INSTANCE SG] EC2 INSTANCE -> KUBE-PROXY -> ...
Just specifying LB security groups is not enough (unless, of course, you want the EC2 instance security groups open to the world), so different controller implementations solve the problem in different ways.

  1. AWS cloud provider

Once the load balancer is spun up and the security groups (specified or created) are attached to it, the controller searches for the security groups attached to the network interfaces (ENIs) of all the instances in the cluster. Once it has found the group(s), it adds a rule so that every network interface of every instance in the cluster permits traffic from the load balancer.

  2. ALB ingress controller

This one tries to be more precise in specifying rules, and only in the case when the annotation alb.ingress.kubernetes.io/security-groups is not specified. When it isn't, the controller creates a group for the instance with a predefined rule referencing the load balancer SG. Then it finds the specific ENI (in case there are several on an instance) among all of the instances and tries to attach the SG to it. And here we reach the limit pretty fast.

Is that the right logic?

I'm wondering why the ALB controller doesn't use the same logic as the AWS cloud provider. It seems pretty neat and correct. The ALB ingress controller in IP target-type mode (the alb.ingress.kubernetes.io/target-type annotation) already solves the problem of an extra hop by not forwarding the traffic through a proxy; in that case it goes directly to the recipient (the Pod's allocated IP). Why, then, do we need to make things more complicated over security concerns? We can still address the security issues in the native Kubernetes way, using Network Policies.

The workaround.

As has been said, the workaround is to manage the security groups externally. In our organization we used Terraform to spin up the EKS cluster, so the groups for the load balancers are managed from there.

I had some doubts about whether Terraform would conflict with the changes made by the AWS cloud provider (which adds rules to the existing SGs attached to an instance), but as it turns out it doesn't. When you apply a plan, Terraform only manages the rules in its own state and leaves alone any rule without a matching aws_security_group_rule resource.

@mmack-innio

any news on this?

@TheBrigandier

Hello,

While the alb.ingress.kubernetes.io/security-groups annotation works, is there any possibility of specifying a default for this and allowing users to override it with this annotation? At my company this is causing a lot of support requests as end users are not aware we are running up against SG limits, and alb-ingress doesn't put a note in the ingress object's event log as to that being the case.

Thanks!

@Alexx-G

Alexx-G commented Feb 10, 2020

Hi,

@M00nF1sh #1019 fixes this issue, doesn't it?
If alb-ingress-controller modifies the worker's security group and reconciles changes, then we shouldn't hit any limits on security groups. And it should work even with Managed Node Groups.

@ecout

ecout commented Mar 24, 2020

March 24, 2020.

SecurityGroupsPerInterfaceLimitExceeded was also thrown in my case.

@M00nF1sh
Is there any way the ingress controller can use node affinity to prevent pods from being scheduled onto nodes whose ENIs have reached the hard limit? If cluster autoscaling is managed properly, pods can be allocated to new nodes/ENIs, thus avoiding the hard limit, at least at the ENI level.
Here are the current limits:

https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

Thanks!

@ecout

ecout commented Mar 24, 2020

Hi,

@M00nF1sh #1019 fixes this issue, doesn't it?
If alb-ingress-controller modifies the worker's security group and reconciles changes, then we shouldn't hit any limits on security groups. And it should work even with Managed Node Groups.

@Alexx-G
+1. But actually, the security groups themselves have a hard limit of about 40 rules. So with your suggestion we can increase the number of ingresses in the cluster significantly, but not make it truly unlimited.

Edit:
The limits on the number of SGs and on SG rules have been increased across the board, but this is still a possibility unless Services stop pointing to nodes that can't take any more.
https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

@ecout

ecout commented Mar 24, 2020

Hello,

While the alb.ingress.kubernetes.io/security-groups annotation works, is there any possibility of specifying a default for this and allowing users to override it with this annotation? At my company this is causing a lot of support requests as end users are not aware we are running up against SG limits, and alb-ingress doesn't put a note in the ingress object's event log as to that being the case.

Thanks!

I don't understand what you mean. The logs are the reason I was able to find this issue:

kubebuilder/controller "msg"="Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.\n\tstatus code: 400, request id: ************"  "controller"="alb-ingress-controller" "request"={"Namespace":"******","Name":"*********"}

@ecout

ecout commented Mar 25, 2020

(Quoting @iliazlobin's comment above in full.)

Agreed.
Grouping ALBs by ingress policy can reduce the number of security groups needed per ALB, and hence the number of instance SGs that need to be attached. If the Ingress definition is left without an explicit SG, every time you create an ingress a new pair of SGs gets created, until you reach the hard limit on the ENI side.

Edit: just out of curiosity I took a look at the cloud provider and found this feature request. Interestingly enough, it seems they handle everything with a single SG, with no option for multiple-SG support.

kubernetes/cloud-provider-aws#81

@ecout

ecout commented Mar 25, 2020

Ticketmaster and were able to improve the situation by reusing a lot of crafted groups for specific use cases.

@bigkraig How about node affinity with autoscaling? More nodes, more ENIs. Pods belonging to an Ingress that can't get on an ENI could be moved to a new node.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 23, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 23, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
