Container Load Balancer - Load Balancer #8872

Open
wants to merge 4 commits into base: master

Conversation

georgeedward2000

What type of PR is this?

/kind feature

What this PR does / why we need it:

The key changes:

  • NRP Service Operations - NRP service create/delete and hooking of the LB backend pool to the NRP service
  • LocationAndNrpServiceBatchUpdater - programming ServiceDto in NRP
  • Location Update API for inbound scenarios - programming of LocationDataDto in NRP (offline pod IP population in the backend pool)
  • Node add/remove to VMSS

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added the ability to set the backend pool type to PodIP.
This allows the load balancer attached to the service to route traffic directly to the pods hosting the service,
instead of routing to AKS nodes, which then forward the traffic to those pods.

@k8s-ci-robot added the kind/feature and release-note labels on Apr 14, 2025

linux-foundation-easycla bot commented Apr 14, 2025

CLA Not Signed

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: georgeedward2000
Once this PR has been reviewed and has the lgtm label, please assign feiskyer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot requested review from mainred and nilo19 on April 14, 2025 07:31
@k8s-ci-robot added the cncf-cla: no and needs-ok-to-test labels on Apr 14, 2025
@k8s-ci-robot
Contributor

Hi @georgeedward2000. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@github-actions added the tide/merge-method-squash label on Apr 14, 2025
@k8s-ci-robot added the size/L label on Apr 14, 2025
// Node additions and removals to VMSS (Virtual Machine Scale Sets) do not require reconcileLB operations.
// The changes will be propagated via Location Update API upon endpoint-slice events and backendpool will be updated by NRP.
if az.IsLBBackendPoolTypePodIP() && az.UseStandardV2LoadBalancer() {
isOperationSucceeded = true
Contributor

Can we move this to the beginning? I don't think we need to count a successful operation here as it is skipped.

Contributor

What about externalTrafficPolicy:cluster services?

With CLB, traffic is directed to the pod hosting the service, and iptables is not used to choose the pod. So I think CLB applies only to externalTrafficPolicy: Local services.

Contributor

Can we check and return an error in this podIP+Local case?

}
}
}

lb, err = az.reconcileBackendPoolHosts(ctx, lb, lbToReconcile, service, nodes, clusterName, vmSetName, lbBackendPoolIDs)
Contributor

Do we still need to reconcile the backend pool in this case, or does the batch updater do everything?

reconcileBackendPoolHosts calls EnsureHostsInPool, which is a no-op for the pod IP type backend pool. NRP adds the backend addresses to the LB backend pool, and the batch processor provides the pods associated with the service to NRP via the UpdateLocations API.
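
To make the no-op concrete, here is a minimal sketch, assuming a hypothetical receiver type and a simplified signature (the real EnsureHostsInPool takes more parameters):

```go
package sketch

import (
	"context"

	v1 "k8s.io/api/core/v1"
)

// backendPoolTypePodIP is a hypothetical stand-in for the pod-IP backend
// pool implementation discussed above.
type backendPoolTypePodIP struct{}

// EnsureHostsInPool is shown with a simplified signature. For a pod-IP
// backend pool there is no per-node work to do here: NRP populates the
// backend addresses itself from the locations reported by the batch updater.
func (bp *backendPoolTypePodIP) EnsureHostsInPool(ctx context.Context, service *v1.Service, nodes []*v1.Node, backendPoolID, vmSetName, clusterName string) error {
	return nil
}
```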

klog.V(2).Info("locationAndNRPServiceBatchUpdater.run: started")
for {
select {
case <-updater.channelUpdateTrigger:
Contributor

Where is the batching logic of this updater?

Author

DiffTracker keeps updating the K8s state as multiple cluster updates arrive (the batching). At some point the batch updater is triggered and uses the DiffTracker K8s state as the source of truth for updating NRP.

@@ -130,6 +130,8 @@ type Cloud struct {
eventRecorder record.EventRecorder
routeUpdater batchProcessor
backendPoolUpdater batchProcessor
// TBD (enechitoaia): Do we want to use the same interface (batchProcessor) even though we are not using an interval-based updater?

I thought we will update NRP at regular intervals, thus batching the updates. Any specific reason why we can't use the same interface?

},
)
az.localServiceNameToNRPServiceMap.Delete(serviceUID)
select {

I think the batchProcessor runs at periodic intervals and updates NRP based on the K8s state that's set here.

@kartickmsft commented Apr 16, 2025

We need to invoke the NRP UpdateService API here to remove the reference to the LB backend pool. reconcileLoadBalancer on line 504 will delete the LB/LB backend pool resource in NRP, and the batchProcessor will delete the NRP service after removing the service's reference from the Locations.

Author

So we should call the NRP API directly without queuing it to the batch updater?

Yes, for removing the reference of the backend pool from the NRP service. Otherwise NRP will throw a validation error while deleting the LB/LB backend pool resource.
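
For illustration, here is a self-contained sketch of this ordering; the NRP client and DTO below are hypothetical stand-ins, not the PR's actual types:

```go
package main

import (
	"context"
	"fmt"
)

// Hypothetical stand-ins for the NRP client and request DTO; only the call
// ordering matters for this sketch.
type removeBackendPoolRefRequest struct{ ServiceUID string }

type nrpClient struct{}

func (nrpClient) UpdateNRPServices(ctx context.Context, req removeBackendPoolRefRequest) error {
	fmt.Println("1) drop the LB backend pool reference from NRP service", req.ServiceUID)
	return nil
}

func deleteLoadBalancer(ctx context.Context, serviceUID string) error {
	fmt.Println("2) delete the LB / LB backend pool for", serviceUID)
	return nil
}

// ensureLoadBalancerDeletedSketch shows the ordering discussed above: the
// backend pool reference is removed from the NRP service first, otherwise
// NRP rejects the LB backend pool deletion with a validation error. The NRP
// service itself is cleaned up later by the batch updater, once no location
// references it.
func ensureLoadBalancerDeletedSketch(ctx context.Context, client nrpClient, serviceUID string) error {
	if err := client.UpdateNRPServices(ctx, removeBackendPoolRefRequest{ServiceUID: serviceUID}); err != nil {
		return err
	}
	return deleteLoadBalancer(ctx, serviceUID)
}

func main() {
	_ = ensureLoadBalancerDeletedSketch(context.Background(), nrpClient{}, "svc-1234")
}
```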

@@ -475,6 +482,25 @@ func (az *Cloud) EnsureLoadBalancerDeleted(ctx context.Context, clusterName stri
return err
}

serviceReconcileLock need not be acquired on line 420, as we have a CLB per service. A similar comment applies at the other places where we acquire serviceReconcileLock.

Contributor

This lock is only acquired when the service is being deleted.

I see it's acquired in functions like EnsureLoadBalancer, which is used to create/update the LB. Am I missing something?

@@ -1976,6 +2002,28 @@ func (az *Cloud) reconcileLoadBalancer(ctx context.Context, clusterName string,
if az.UseMultipleStandardLoadBalancers() {
lbToReconcile = existingLBs
}

if az.IsLBBackendPoolTypePodIP() && az.UseStandardV2LoadBalancer() && strings.EqualFold(string(service.Spec.ExternalTrafficPolicy), string(v1.ServiceExternalTrafficPolicyTypeLocal)) {

Do we throw an error and fail if ExternalTrafficPolicy is Cluster with a Standard V2 LB and pod IP backend pool type?

},
)
az.localServiceNameToNRPServiceMap.Store(serviceUID, struct{}{})
select {

NRP service creation and the service -> LB backend pool reference update could happen in the periodic batch processor.

}
}
}

lb, err = az.reconcileBackendPoolHosts(ctx, lb, lbToReconcile, service, nodes, clusterName, vmSetName, lbBackendPoolIDs)
if err != nil {

I think we need to modify getExpectedLBRules to pass an empty probe property, as we don't separately probe for CLB. This change was present in one of the earlier PRs; maybe it got missed in this PR.

Author

I think you are referring to the changes in this PR: #8218
These will eventually be merged together.

az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)
}

// TO BE DISCUSSED (enechitoaia): do we want to trigger the batch updater here too?

IMO, no.

if createServicesResponseDTO.Error == nil {
az.difftracker.UpdateNRPLoadBalancers(serviceLoadBalancerList.Additions)
} else {
// discuss action upon failure

I think we could log an event, as is done for the backend pool updater, and return. Continuing to the next set of operations may lead to NRP validation failures, causing NRP API failures.
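
A minimal sketch of that error handling, assuming a hypothetical sync function and event reason; client-go's FakeRecorder is used only to make the sketch runnable:

```go
package main

import (
	"errors"
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

// syncServicesToNRP sketches the suggested handling: if the NRP create call
// failed, record a warning event on the service and return, instead of
// continuing with follow-up operations that would only hit NRP validation
// failures.
func syncServicesToNRP(recorder record.EventRecorder, svc *v1.Service, createErr error) error {
	if createErr != nil {
		recorder.Eventf(svc, v1.EventTypeWarning, "SyncNRPServicesFailed",
			"failed to create NRP services: %v", createErr)
		return createErr
	}
	// ...continue with location updates only on success...
	return nil
}

func main() {
	recorder := record.NewFakeRecorder(10) // test recorder, just for the example
	_ = syncServicesToNRP(recorder, &v1.Service{}, errors.New("nrp unavailable"))
	fmt.Println(<-recorder.Events) // Warning SyncNRPServicesFailed failed to create NRP services: nrp unavailable
}
```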

// Update all locations and addresses
locationData := updater.az.difftracker.GetSyncLocationsAddresses()
locationDataRequestDTO := MapLocationDataToDTO(locationData)
locationDataResponseDTO := NRPAPIClient.UpdateNRPLocations(ctx, locationDataRequestDTO)

We should invoke the UpdateNRPLocations API only if locationDataRequestDTO has a non-empty set of locations to update.
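
A small sketch of such a guard, assuming a hypothetical Locations field on the DTO:

```go
package main

import (
	"context"
	"fmt"
)

// locationDataRequestDTO is a hypothetical shape; the real DTO lives in the PR.
type locationDataRequestDTO struct{ Locations []string }

func updateNRPLocations(ctx context.Context, req locationDataRequestDTO) {
	fmt.Println("UpdateNRPLocations called with", len(req.Locations), "locations")
}

// syncLocations only calls the NRP API when there is something to send,
// skipping the round trip for an empty update.
func syncLocations(ctx context.Context, req locationDataRequestDTO) {
	if len(req.Locations) == 0 {
		return // nothing to update
	}
	updateNRPLocations(ctx, req)
}

func main() {
	ctx := context.Background()
	syncLocations(ctx, locationDataRequestDTO{})                                  // skipped
	syncLocations(ctx, locationDataRequestDTO{Locations: []string{"location-a"}}) // sent
}
```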

Resource: serviceUID,
},
)
az.localServiceNameToNRPServiceMap.Store(serviceUID, struct{}{})

We are doing a LoadOrStore in the batchProcessor. I think that is the right place (more granular) to enable the informer code path to start updating the K8s state in the DiffTracker.

for _, serviceName := range serviceLoadBalancerList.Deletions {
updater.az.localServiceNameToNRPServiceMap.Delete(serviceName)

updateK8sEndpointsInputType := UpdateK8sEndpointsInputType{

When a service is being deleted, should we get the NRP state of the locations associated with the service from DiffTracker and update NRP to disassociate the service from those locations? I'm not sure whether NRP will throw validation failures if a new location is passed, based on the latest K8s state, for which NRP doesn't have the corresponding location -> service association. I think we need to check NRP's behavior.

NRP ignores addresses that are not present in its KV store, so the existing code changes are fine.

}

for _, serviceName := range serviceLoadBalancerList.Deletions {
updater.az.localServiceNameToNRPServiceMap.Delete(serviceName)

We are deleting from the map during deletion of the LB. I think that's the right place, because it prevents further endpoint-slice updates (K8s location state) in the DiffTracker for locations whose service association is about to be deleted.

Resource: serviceUID,
},
)
az.localServiceNameToNRPServiceMap.Delete(serviceUID)

I think it's better to move this to line 487, to stop further updates of locations/addresses in the K8s DiffTracker state a little earlier.

@@ -1976,6 +2002,28 @@ func (az *Cloud) reconcileLoadBalancer(ctx context.Context, clusterName string,
if az.UseMultipleStandardLoadBalancers() {
lbToReconcile = existingLBs
}

if az.IsLBBackendPoolTypePodIP() && az.UseStandardV2LoadBalancer() && strings.EqualFold(string(service.Spec.ExternalTrafficPolicy), string(v1.ServiceExternalTrafficPolicyTypeLocal)) {
Contributor

IsLBBackendPoolTypePodIP and UseStandardV2LoadBalancer always appear together. Can we merge them into a single func?
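
A sketch of the merged helper; the combined name matches the one that appears later in the diff (IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer), while the underlying checks here are stubbed assumptions:

```go
package main

import "fmt"

// Cloud is a minimal stand-in for the provider's Cloud type; the two
// predicates are stubbed for illustration.
type Cloud struct {
	backendPoolType string
	loadBalancerSKU string
}

func (az *Cloud) IsLBBackendPoolTypePodIP() bool  { return az.backendPoolType == "podIP" }
func (az *Cloud) UseStandardV2LoadBalancer() bool { return az.loadBalancerSKU == "standard_v2" }

// IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer merges the two checks
// that always appear together, as suggested above.
func (az *Cloud) IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer() bool {
	return az.IsLBBackendPoolTypePodIP() && az.UseStandardV2LoadBalancer()
}

func main() {
	az := &Cloud{backendPoolType: "podIP", loadBalancerSKU: "standard_v2"}
	fmt.Println(az.IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer()) // true
}
```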

}

func (updater *locationAndNRPServiceBatchUpdater) process(ctx context.Context) {
updater.lock.Lock()

process will be called from only one goroutine, so locking is not needed.

@nilo19
Contributor

nilo19 commented Apr 29, 2025

@georgeedward2000 could you please add a design doc here? This feature has many big changes, and clear documentation would help code review. Thanks.

- Update batch updater
- Handle error cases
- Refactor
klog.V(2).Info("az.locationAndNRPServiceBatchUpdater.channelUpdateTrigger is full. Batch update is already triggered.")
}
}

CLB (pod IP based backend pool with the SLB v2 SKU) allows multiple LBs (one LB per service), helping overcome the limitations that multi-SLB is used for. So I think it's better to ensure that an error is returned in reconcileMultipleStandardLoadBalancerConfigurations when multi-SLB is used together with CLB.

@@ -1976,6 +2006,33 @@ func (az *Cloud) reconcileLoadBalancer(ctx context.Context, clusterName string,
if az.UseMultipleStandardLoadBalancers() {
lbToReconcile = existingLBs
}

if az.IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer() {
if !strings.EqualFold(string(service.Spec.ExternalTrafficPolicy), string(v1.ServiceExternalTrafficPolicyTypeLocal)) {

I think the beginning of reconcileService is a better place to validate this config.
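
A sketch of that early validation, reusing the predicate shown in the diff; the surrounding types and function shape are assumptions:

```go
package main

import (
	"fmt"
	"strings"

	v1 "k8s.io/api/core/v1"
)

// cloud is a minimal stand-in; the real predicate comes from the PR.
type cloud struct{ podIPAndStandardV2 bool }

func (az *cloud) IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer() bool {
	return az.podIPAndStandardV2
}

// validateContainerLoadBalancerService sketches the early check suggested
// above: a container load balancer (pod IP backend pool + Standard V2 SKU)
// only supports externalTrafficPolicy: Local, so anything else is rejected
// up front, before any reconcile work is done.
func (az *cloud) validateContainerLoadBalancerService(service *v1.Service) error {
	if !az.IsLBBackendPoolTypePodIPAndUseStandardV2LoadBalancer() {
		return nil
	}
	if !strings.EqualFold(string(service.Spec.ExternalTrafficPolicy), string(v1.ServiceExternalTrafficPolicyTypeLocal)) {
		return fmt.Errorf("service %s/%s: pod IP backend pool with Standard V2 load balancer requires externalTrafficPolicy: Local, got %q",
			service.Namespace, service.Name, service.Spec.ExternalTrafficPolicy)
	}
	return nil
}

func main() {
	az := &cloud{podIPAndStandardV2: true}
	svc := &v1.Service{Spec: v1.ServiceSpec{ExternalTrafficPolicy: v1.ServiceExternalTrafficPolicyTypeCluster}}
	fmt.Println(az.validateContainerLoadBalancerService(svc))
}
```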

oldAddresses: nil,
newAddresses: az.getPodIPToNodeIPMapFromEndpointSlice(es, false),
}
az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)

Since we are not doing interval-based batching updates, I think we should trigger the batch updater from here.

az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)
}

// TO BE DISCUSSED (enechitoaia): do we want to trigger the batch updater here too?

Since we are not doing interval-based batching updates, I think we should trigger the batch updater from here.

az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)
}

// TO BE DISCUSSED (enechitoaia): do we want to trigger the batch updater here too?

Same comment as for Add/Update.

removeServicesResponseDTO := NRPAPIClient.UpdateNRPServices(ctx, removeServicesRequestDTO)

if removeServicesResponseDTO.Error == nil {
az.difftracker.UpdateNRPLoadBalancers(serviceLoadBalancerList.Additions)

Should this be Removals?
"https://github.com/kubernetes-sigs/cloud-provider-azure/pull/8464/files#:~:text=func%20(dt%20*DiffTracker)%20UpdateNRPLoadBalancers(syncServicesReturnType%20SyncServicesReturnType)%20%7B" takes SyncServicesReturnType, which has Additions and Removals.
Has the DiffTracker API changed, or am I missing something?

}

func (updater *locationAndNRPServiceBatchUpdater) addOperation(operation batchOperation) batchOperation {
// This is a no-op function. The actual processing is done in the run method.

Maybe we can mention that the add/remove operation is handled via the DiffTracker APIs.

@georgeedward2000
Author

Design Doc to lift review blockers

Background

In the following, User stands for human users or any cluster provisioning tool (e.g., AKS).

This design document describes the user-facing design and workflow of the Standard V2 LoadBalancer - Container Based Backendpool.

Design

  1. Current users do not need to take any action, and the ongoing changes will not affect them.

  2. New Users must create a Container Based Cluster.

  3. Then, they must create a Standard V2 SKU Load Balancer.

  4. After that, no further action is needed, as all networking resources will be automatically provisioned when new pods, LoadBalancer services, and egresses are created.

Workflow

  1. Introduces the DiffTracker API to keep track of the synchronization between the Kubernetes (K8s) cluster and its NRP resource structure.

  2. Introduces LocationAndNRPServiceBatchUpdater as a parallel thread worker used as the main engine for DiffTracker synchronization. It runs continuously, waiting for updates in the cluster on a boolean channel and triggering NRP API requests.

  3. All updates within the K8s cluster (regarding pods, services, egresses, etc.) are stored within DiffTracker.

  4. After each update, a boolean value of true is sent through the LocationAndNRPServiceBatchUpdater channel.

  5. The LocationAndNRPServiceBatchUpdater run method waits to consume booleans from the channel. When a value is consumed, it uses all the updates stored in DiffTracker to update NRP using the NRP API. Eventually, all successful NRP API calls are stored back into DiffTracker to assert the equivalence of the two structures (K8s cluster and Azure).

  6. DiffTracker also functions as a batch operations aggregator. When a cluster undergoes multiple rapid updates, DiffTracker's state is continuously updated. Meanwhile, the LocationAndNRPServiceBatchUpdater, running on a single thread, consumes updates from the channel. As a result, multiple updates can accumulate and be sent to NRP in a single batch; a minimal sketch of this loop follows below.
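
The following is a minimal, self-contained sketch of that trigger-and-drain loop; the tracker, updater, and NRP call here are hypothetical stand-ins for the PR's actual types:

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// diffTracker is a hypothetical stand-in: it accumulates desired K8s state
// and reports what still needs to be pushed to NRP.
type diffTracker struct {
	mu      sync.Mutex
	pending []string
}

func (d *diffTracker) RecordUpdate(item string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.pending = append(d.pending, item)
}

func (d *diffTracker) Drain() []string {
	d.mu.Lock()
	defer d.mu.Unlock()
	out := d.pending
	d.pending = nil
	return out
}

// locationAndNRPServiceBatchUpdater mirrors the design above: informer
// handlers record changes in the tracker and send true on a capacity-1
// channel; a single run loop drains the tracker and calls the NRP API.
type locationAndNRPServiceBatchUpdater struct {
	tracker *diffTracker
	trigger chan bool
}

func (u *locationAndNRPServiceBatchUpdater) notify(item string) {
	u.tracker.RecordUpdate(item)
	select {
	case u.trigger <- true:
	default: // an update is already pending; it will pick this change up too
	}
}

func (u *locationAndNRPServiceBatchUpdater) run(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-u.trigger:
			batch := u.tracker.Drain()
			// Stand-in for the UpdateNRPServices / UpdateNRPLocations calls.
			fmt.Println("syncing batch to NRP:", batch)
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	u := &locationAndNRPServiceBatchUpdater{tracker: &diffTracker{}, trigger: make(chan bool, 1)}
	go u.run(ctx)

	// Rapid cluster updates coalesce into one (or a few) NRP sync batches.
	for i := 0; i < 5; i++ {
		u.notify(fmt.Sprintf("endpoint-update-%d", i))
	}
	time.Sleep(100 * time.Millisecond) // let the run loop process
}
```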

az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)
}

select {

I think this should be within the 384-391 if block. I mean we should trigger update only when the corresponding NRP service has been created or is in the process of being created.

az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)
}

select {

Same comment as for UpdateFunc

az.difftracker.UpdateK8sEndpoints(updateK8sEndpointsInputType)
}

select {

Same comment as for UpdateFunc

@k8s-ci-robot added the needs-rebase label on Jun 6, 2025
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
