
feat: Add WatchIPBlocks and WatchInstanceFilters to TargetGroupBinding #1865

Conversation

nicolasgarfinkiel

Introduction

This PR adds WatchIPBlocks and WatchInstanceFilters to TargetGroupBinding.

In our use case, our product is deployed across multiple EKS clusters that are load balanced through a common ALB. When a single target group is shared among many aws-load-balancer-controller instances, registration/deregistration by one controller affects the others. This PR lets a TargetGroupBinding care only about a specific instance group or address space and leave the rest unmanaged.

WatchIPBlocks

When using TargetType: ip, you can specify IP blocks in CIDR notation to list (from AWS) only the IP targets that fall within those ranges. This is useful when sharing a single target group among many aws-load-balancer-controller instances (for example across different clusters), so registration/deregistration by one controller does not affect the others.

apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-tgb
spec:
  watchIPBlocks:
    - cidr: 10.1.1.0/24
    - cidr: 10.1.2.0/24
  ...
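
For illustration, a TargetGroupBinding in a second cluster could point at the same target group but watch a disjoint block, so each controller reconciles only its own address space. A minimal sketch, assuming a second cluster whose pod CIDR is 10.2.1.0/24 (names and CIDRs are placeholders):

apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-tgb            # in the second cluster, bound to the same targetGroupARN
spec:
  watchIPBlocks:
    - cidr: 10.2.1.0/24   # placeholder: the second cluster's pod CIDR
  ...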

WatchInstanceFilters

When using TargetType: instance, you can specify filters to list (from AWS) only the instance targets that match those filters. This is useful when sharing a single target group among many aws-load-balancer-controller instances (for example across different clusters), so registration/deregistration by one controller does not affect the others.

Please read this reference for more information about these filters.

apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-tgb
spec:
  watchInstanceFilters:
    - name: "tag:someTag"
      values:
        - value1
        - value2
  ...

@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Mar 8, 2021
@k8s-ci-robot
Contributor

Welcome @nicolasgarfinkiel!

It looks like this is your first PR to kubernetes-sigs/aws-load-balancer-controller 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/aws-load-balancer-controller has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 8, 2021
@k8s-ci-robot
Contributor

Hi @nicolasgarfinkiel. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Mar 8, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nicolasgarfinkiel
To complete the pull request process, please assign m00nf1sh after the PR has been reviewed.
You can assign the PR to them by writing /assign @m00nf1sh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nicolasgarfinkiel
Author

/check-cla

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 9, 2021
@M00nF1sh
Collaborator

M00nF1sh commented Mar 11, 2021

@nicolasgarfinkiel
Hi, I think this works, but it's too hacky, to be honest.
Ideally, we should only support this if the ELBv2 API supports tags on registered targets. (I will follow up with the ELB PM on whether they have such a plan.)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2021
@nicolasgarfinkiel
Author

nicolasgarfinkiel commented Mar 11, 2021

Hi @M00nF1sh, thanks for your comments.

I think this PR opens the door to some configuration corner cases, but I don't see why you consider it hacky. Could you please elaborate?

We are narrowing the controller's area of responsibility using existing API constructs: for type: instance via AWS API filters, and for type: ip via the IPBlock type already present in the controller.

I originally added these as arguments to the controller's executable, but since TargetGroupBinding already had nodeSelector, I thought it was best to keep the filtering config in the same place. Would you consider that less hacky and more viable?

We need this feature for our use case, so if you don't accept this PR, what would you recommend for supporting it, other than maintaining our own fork? I would like to avoid maintaining a fork if at all possible.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2021
@M00nF1sh
Collaborator

M00nF1sh commented Mar 12, 2021

@nicolasgarfinkiel

I think it's hacky in how it subdivides a TargetGroup and how it treats IP/instance targets.

If we subdivide a targetGroup, ideally it should work such that a TGB manages only the targets it added to the TargetGroup. However, in this case we identify the targets added by a TGB using the IP filter and instance-tag filters, which are limited to this specific use case. (A sample case this cannot satisfy: sharing the same targetGroup across multiple services/TGBs within a single cluster.)
Ideally, it would work with tags/attributes associated with each target (like the Cloud Map API does: https://docs.aws.amazon.com/cloud-map/latest/api/API_RegisterInstance.html), but the ELB team doesn't have plans to support this.

Another approach: instead of subdividing a TargetGroup, we use weighted routing, so each set of targets gets its own TargetGroup. However, only ALB supports this currently, and NLB support is planned for 2022.
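
For illustration, weighted forwarding on ALB can already be expressed via the controller's action annotations. A minimal sketch, assuming two placeholder Services service-a and service-b behind one Ingress (networking.k8s.io/v1beta1, the Ingress API version current at the time):

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.class: alb
    # 50/50 split across two target groups; each Service gets its own TargetGroup
    alb.ingress.kubernetes.io/actions.weighted-routing: >
      {"type":"forward","forwardConfig":{"targetGroups":[
      {"serviceName":"service-a","servicePort":"80","weight":50},
      {"serviceName":"service-b","servicePort":"80","weight":50}]}}
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: weighted-routing   # resolves to the action annotation above
              servicePort: use-annotation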

So, personally, I'm opposed to adding these as fields in the TGB API, since they're not generic and add unnecessary burden for general users.
But I'm open to adding it as a flag to the controller binary for this specific use case/setup.

I'm wondering what other folks think. @kishorj

@kishorj
Collaborator

kishorj commented Mar 13, 2021

My concerns

  • Having multiple controllers across clusters modify the same target group will make it difficult to maintain consistency. For example, if the watch IP block gets misconfigured on a cluster, we might end up in a difficult-to-debug situation.
  • With the proposed changes we also assume that all of the clusters are at the same security level, and enforcing this is up to the end user. If clusters are at different security levels, we open the door to privilege escalation, where a controller at a lower security level can interfere with the targets registered by other clusters.
  • Even if we go the flag route, the CRDs will still have the fields. I'm assuming we would disable this flag by default; in that case, there is potential for confusion.

With the current controller, we aren't ready to support cross-cluster modification of resources in a reliable and secure way. While this use case is compelling, we should look at whether alternative solutions are possible.

@nicolasgarfinkiel
Author

nicolasgarfinkiel commented Mar 15, 2021

@M00nF1sh

I did consider what you describe as "a TGB only manages targets in the TargetGroup added by it", and concluded that filtering achieves the same result without actually managing state, which would require substantial work and also open a can of worms.

I like the idea of adding this feature as a flag to the controller binary (as originally intended), as this makes more sense at a cluster-wide level. I will update the PR accordingly.
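
A minimal sketch of how that could look in the controller Deployment; note that --watch-ip-blocks and --watch-instance-filters are hypothetical flag names for this proposal, not existing controller flags:

# Hypothetical flags only; these do not exist in the released controller.
spec:
  containers:
    - name: aws-load-balancer-controller
      args:
        - --cluster-name=my-cluster
        - --watch-ip-blocks=10.1.1.0/24,10.1.2.0/24
        - --watch-instance-filters=tag:someTag=value1,value2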

@kishorj

Thanks for your comments.

  • AFAICT, if you misconfigure IP blocks, you either get IP targets registered that never get removed (easily identifiable once they become unhealthy) or two controllers fighting each other over register/deregister (which is what would happen today if you pointed many TGBs at the same Target Group).
  • That is a very valid point, which IMO we can document as a caveat.
  • IMO there is no need to keep the fields in the CRDs; do you think we should?

Thanks.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2021
@k8s-ci-robot
Contributor

@nicolasgarfinkiel: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 19, 2021
@nicolasgarfinkiel
Author

@M00nF1sh @kishorj Hello. Can you please advise whether this PR will be accepted? If so, I'll rebase it to leave it ready to merge; otherwise I'll close it.

Thanks.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 12, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 11, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
