Reuse instance-groups for internal load balancers #313

Closed
wants to merge 1 commit

Conversation

Fedosin
Contributor

@Fedosin Fedosin commented Jan 26, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

Based on the docs for internal load balancers [1], backend services [2], and instances in instance groups [3], the following restrictions apply:

  • An internal LB can load balance to VMs in the same region, but in different subnets
  • Instance groups for the backend service must contain instances from the same subnet
  • An instance can only belong to one load-balanced instance group
  • It is a useful use case for the cluster's nodes to belong to more than one subnet, but the current setup fails to create an internal load balancer with nodes in multiple subnets:

I1023 22:05:24.070949       1 gce_loadbalancer_internal.go:478] ensureInternalInstanceGroup(k8s-ig--27083f8254ed83c2, us-west1-b): adding nodes: [jstuev-5hzjp-m-1.c.openshift-dev-installer.internal jstuev-5hzjp-w-b-54qkc.c.openshift-dev-installer.internal]
E1023 22:05:25.385077       1 gce_loadbalancer.go:156] Failed to EnsureLoadBalancer(jstuev-5hzjp, openshift-ingress, router-default, a79c0f796db9e4157af2e2658433b3f6, us-west1), err: googleapi: Error 400: Resource 'projects/openshift-dev-installer/zones/us-west1-b/instances/jstuev-5hzjp-w-b-54qkc' is expected to be in the subnetwork 'projects/openshift-dev-installer/regions/us-west1/subnetworks/jstubyo-master-subnet' but is in the subnetwork 'projects/openshift-dev-installer/regions/us-west1/subnetworks/jstubyo-worker-subnet'., wrongSubnetwork

The use case where some of these nodes (machines) are part of an internal load balancer that is not managed by Kubernetes is also valid:
for example, the machines hosting the control plane (kube-apiserver) may need to belong to a separate ILB that provides access to the apiserver through a load balancer not managed by a Kubernetes Service of type LoadBalancer.
The current setup fails to create an internal load balancer in that case:

r: failed to ensure load balancer: googleapi: Error 400: INSTANCE_IN_MULTIPLE_LOAD_BALANCED_IGS - Validation failed for instance 'projects/openshift-dev-installer/zones/us-west1-a/instances/jstuev-t285j-m-0': instance may belong to at most one load-balanced instance group.

Ideally the subnet limitation would be handled automatically by the k8s cloud provider, but for now, allowing users to pre-create the instance groups for instances that require this special setup should definitely help: the cloud provider can use those groups as-is, while still maintaining the membership and lifecycle of the groups it creates itself.
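
For illustration only (not part of this PR), the sketch below shows how an operator might pre-create such a per-subnet unmanaged instance group through the GCE API before the cloud provider runs. The project, zone, group, and instance names are made-up placeholders; the idea is that the cloud provider could then reuse a group like this as-is.

```go
package main

import (
	"context"
	"log"

	compute "google.golang.org/api/compute/v1"
)

func main() {
	ctx := context.Background()
	// Uses Application Default Credentials for the GCE project.
	svc, err := compute.NewService(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical names: one unmanaged instance group per subnet/zone.
	project, zone, group := "my-project", "us-west1-b", "worker-subnet-ig"

	// Create the empty unmanaged instance group in the zone.
	if _, err := svc.InstanceGroups.Insert(project, zone, &compute.InstanceGroup{Name: group}).Do(); err != nil {
		log.Fatal(err)
	}

	// Add only the instances that live in the worker subnet.
	req := &compute.InstanceGroupsAddInstancesRequest{
		Instances: []*compute.InstanceReference{
			{Instance: "projects/my-project/zones/us-west1-b/instances/worker-0"},
		},
	}
	if _, err := svc.InstanceGroups.AddInstances(project, zone, group, req).Do(); err != nil {
		log.Fatal(err)
	}
}
```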

This change finds pre-existing instance groups that ONLY contain instances belonging to the cluster and uses them for the backend service; it ensures provider-managed instance groups only for the remaining nodes.
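
As a rough sketch of the selection described above (illustrative only, not the PR's actual code; the function name is hypothetical and k8s.io/apimachinery string sets are assumed), the reuse decision could look like:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/sets"
)

// splitInstanceGroups returns the pre-existing groups whose members are all
// cluster nodes (safe to reuse for the backend service) and the nodes that
// still need a provider-managed instance group.
func splitInstanceGroups(existing map[string]sets.String, clusterNodes sets.String) (reuse []string, remaining sets.String) {
	remaining = sets.NewString(clusterNodes.UnsortedList()...)
	for name, members := range existing {
		// Reuse a group only if every instance in it belongs to the cluster.
		if members.Len() > 0 && clusterNodes.IsSuperset(members) {
			reuse = append(reuse, name)
			remaining = remaining.Difference(members)
		}
	}
	return reuse, remaining
}

func main() {
	existing := map[string]sets.String{
		"user-created-worker-ig": sets.NewString("worker-0", "worker-1"),
		"unrelated-ig":           sets.NewString("some-other-vm"),
	}
	nodes := sets.NewString("master-0", "worker-0", "worker-1")

	reuse, remaining := splitInstanceGroups(existing, nodes)
	fmt.Println("reuse as-is:", reuse)
	fmt.Println("still need a provider-managed group for:", remaining.List())
}
```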

Does this PR introduce a user-facing change?:

Reuse pre-existing instance groups that contain cluster nodes as backends for internal load balancers

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

The work is based on kubernetes/kubernetes#84466

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 26, 2022
@Fedosin Fedosin force-pushed the internal_lb branch 4 times, most recently from e556732 to 9f6ed06 Compare January 28, 2022 15:58
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 28, 2022
Based on the docs for internal load balancers [1], backend services [2]
and instances in instance groups [3], the following restrictions apply:

- An internal LB can load balance to VMs in the same region, but in
  different subnets
- Instance groups for the backend service must contain instances from
  the same subnet
- An instance can only belong to one load-balanced instance group
- It is a useful use case for the cluster's nodes to belong to more
  than one subnet, but the current setup fails to create an internal
  load balancer with nodes in multiple subnets.

This change finds pre-existing instance groups that ONLY contain
instances belonging to the cluster and uses them for the backend
service; it ensures provider-managed instance groups only for the
remaining nodes.

[1] https://cloud.google.com/load-balancing/docs/internal
[2] https://cloud.google.com/load-balancing/docs/backend-service#restrictions_and_guidance
[3] https://cloud.google.com/compute/docs/instance-groups/creating-groups-of-unmanaged-instances#addinstances

Co-authored-by: Abhinav Dahiya <abhinav.dahiya@redhat.com>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 28, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Fedosin
To complete the pull request process, please assign saad-ali after the PR has been reviewed.
You can assign the PR to them by writing /assign @saad-ali in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Fedosin Fedosin closed this Apr 19, 2022
@jprzychodzen
Contributor

@Fedosin Is it still needed? If yes, could you reopen?

@JoelSpeed

@Fedosin I believe this bug is still relevant? Can we get it reopened please?
