
Allow NodeGroup/ASG targeting for ?LB instances #58

Closed
bassco opened this issue Sep 18, 2019 · 19 comments


bassco commented Sep 18, 2019

What would you like to be added:

Our cluster has been created with kops.

We operate multiple instance groups in our cluster to partition the different workloads we run.

I would like an annotation on the aws-load-balancer Service that uses a tag to select which instances are assigned to the ELB/ALB/NLB TargetGroup.

E.g.:

service.beta.kubernetes.io/aws-load-balancer-node-groups: "asg-node-group-comma-separated-list"

Where the annotation value is a comma-separated list of ASG groupName tags, used to filter which host instance IDs are added to the load balancer.

E.g. in our environment, our four node groups have the tag aws:autoscaling:groupName set to one of:

  • nodes
  • ml-cpu
  • ml-gpu
  • search

Using the ASG names above, to target only the nodes of the ml-* ASGs I would use the following annotation:

service.beta.kubernetes.io/aws-load-balancer-node-groups: "ml-cpu,ml-gpu"

Currently, I suspect that the tag k8s.io/role/node with a value of 1 is used to populate the instances on the LB.
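
For illustration, a Service using the proposed annotation might look like the sketch below; the annotation key and the node-group values are the hypothetical ones from this proposal, not an existing API, and the Service name and ports are made up:

apiVersion: v1
kind: Service
metadata:
  name: ml-inference
  annotations:
    # Proposed (hypothetical) annotation: register only instances whose
    # aws:autoscaling:groupName tag matches one of the listed ASG names.
    service.beta.kubernetes.io/aws-load-balancer-node-groups: "ml-cpu,ml-gpu"
spec:
  type: LoadBalancer
  selector:
    app: ml-inference
  ports:
    - port: 443
      targetPort: 8443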

Why is this needed:

When creating an ELB, ALB, or NLB, the complete list of node instances associated with the cluster is assigned to the TargetGroup. It is really inefficient to perform health checks against instances that will never host a Pod of the Service the load balancer is being created for.

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 18, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 16, 2020

bassco commented Jan 17, 2020

Anyone able to read through this and provide feedback?

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ari-becker

/reopen

Not sure why this was allowed to close - this is a must-have feature for large clusters where the number of nodes reaches the quota on the number of targets permitted per load balancer. Scaling further requires limiting the load balancer's targets to a specific subset of nodes and configuring the deployment/statefulset to schedule only on those nodes (see the sketch below). Currently the only workaround is to create the load balancer by hand, outside of Kubernetes.
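
For reference, the scheduling half of that workaround is just a nodeSelector (or node affinity) on the workload. A minimal sketch, assuming kops's kops.k8s.io/instancegroup node label and a made-up app name; the load-balancer half (limiting which nodes get registered as targets) is what this issue asks for:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: edge-proxy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: edge-proxy
  template:
    metadata:
      labels:
        app: edge-proxy
    spec:
      # Assumed label: kops labels nodes with their instance group name;
      # substitute whatever label your node groups actually carry.
      nodeSelector:
        kops.k8s.io/instancegroup: edge-nodes
      containers:
        - name: edge-proxy
          image: example.com/edge-proxy:latest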

@k8s-ci-robot

@ari-becker: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Not sure why this was allowed to close - this is a must-have feature for large clusters where the number of nodes in the cluster reaches the quota for the number of targets permitted per load balancer. Scaling further requires limiting the load balancer's targets to a specific subset of servers, and configuring the deployment/statefulset to only schedule on those nodes. Currently the only workaround is to create the loadbalancer by hand outside of Kubernetes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ari-becker

@bassco as the author, do you mind re-opening?

@leakingtapan

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Apr 21, 2020
@k8s-ci-robot

@leakingtapan: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ari-becker

An additional gotcha that popped up for me:

Let's say that you have 450 nodes and a service that you'd like to expose with an external load balancer.
If you expose a single port on the service - a load balancer is created that targets all 450 servers, and it works.
If you expose multiple additional ports - despite the number of servers not changing, each additional port registers the full set of instances again, so every instance/port pair counts as a separate target. So if you expose three ports you now have 3 * 450 = 1350 targets, which is above the limit, and AWS will simply refuse to add the listeners for the new ports, complaining about TooManyTargets.
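
As a concrete sketch (names and ports are made up), a Service like the following against a 450-node cluster would end up registering 3 * 450 = 1350 instance/port targets:

apiVersion: v1
kind: Service
metadata:
  name: big-service
spec:
  type: LoadBalancer
  selector:
    app: big-service
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: https
      port: 443
      targetPort: 8443
    - name: metrics
      port: 9090
      targetPort: 9090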

@foobarfran

This would be super useful.
I can contribute a pull request for this feature if it helps.


bassco commented May 6, 2020

That would be fantastic if you could, @foobarfran

@leakingtapan

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 8, 2020
@foobarfran

/assign @foobarfran

@foobarfran

The feature for this issue is already merged in kubernetes/kubernetes#90943

/close

@k8s-ci-robot

@foobarfran: You can't close an active issue/PR unless you authored it or you are a collaborator.

In response to this:

The feature for this issue is already merged in kubernetes/kubernetes#90943

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@leakingtapan

/close as the PR is merged in the legacy provider


bassco commented Jul 20, 2020

@foobarfran - legend!
For those who reach this comment and don't follow the PR: the feature will be released in v1.19.
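
As far as I can tell, the merged change filters targets by node label rather than by ASG tag, via the service.beta.kubernetes.io/aws-load-balancer-target-node-labels annotation. The sketch below assumes that key and a kops instance-group label, so please double-check the exact key and value format against kubernetes/kubernetes#90943 before relying on it:

apiVersion: v1
kind: Service
metadata:
  name: ml-inference
  annotations:
    # Register only nodes whose labels match this selector
    # (my understanding of the annotation added by kubernetes/kubernetes#90943;
    # verify against the merged PR).
    service.beta.kubernetes.io/aws-load-balancer-target-node-labels: "kops.k8s.io/instancegroup=ml-gpu"
spec:
  type: LoadBalancer
  selector:
    app: ml-inference
  ports:
    - port: 443
      targetPort: 8443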

@bassco bassco closed this as completed Jul 20, 2020
csrwng pushed a commit to csrwng/cloud-provider-aws that referenced this issue Dec 19, 2023
OCPBUGS-24135: Updating ose-aws-cloud-controller-manager-container image to be consistent with ART (…penshift-4.15-ose-aws-cloud-controller-manager)