
fix: decouple vmss with 0 instance from lb when deleting the service #2489

Merged
merged 2 commits into kubernetes-sigs:master from nilo19:fix/zero-vmss on Oct 14, 2022

Conversation

nilo19
Contributor

@nilo19 nilo19 commented Oct 12, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

We parse the IDs of the ipConfigs in the LB backend pool to determine which VMSSes need to be decoupled from the LB. But a VMSS with 0 instances has no corresponding ipConfigs in the backend pool, so it never gets decoupled. This PR checks every cached VMSS and decouples any that is still bound to the LB.
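
A minimal sketch of the idea, using simplified stand-in types rather than the provider's real ones (the actual change works on the SDK structs and the VMSS caches in azure_vmss.go):

package main

import (
	"fmt"
	"strings"
)

// Simplified stand-in for the SDK's scale set type; illustrative only.
type cachedVMSS struct {
	Name         string
	BackendPools []string // backend pool IDs referenced by the VMSS network profile
}

// needsDecoupling reports whether a cached VMSS still references the backend
// pool, regardless of how many instances it currently has (0 included).
func needsDecoupling(vmss cachedVMSS, backendPoolID string) bool {
	for _, id := range vmss.BackendPools {
		if strings.EqualFold(id, backendPoolID) {
			return true
		}
	}
	return false
}

func main() {
	poolID := "/subscriptions/sub/resourceGroups/rg/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes"
	cached := []cachedVMSS{
		{Name: "vmss-with-0-instances", BackendPools: []string{poolID}},
		{Name: "vmss-detached"},
	}
	for _, vmss := range cached {
		if needsDecoupling(vmss, poolID) {
			fmt.Printf("decoupling %s from the load balancer\n", vmss.Name)
		}
	}
}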

Which issue(s) this PR fixes:

Fixes #2443

Special notes for your reviewer:

Does this PR introduce a user-facing change?

fix: decouple vmss with 0 instance from lb when deleting the service

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 12, 2022
@netlify

netlify bot commented Oct 12, 2022

Deploy Preview for kubernetes-sigs-cloud-provide-azure canceled.

🔨 Latest commit: 5f97ea9
🔍 Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-cloud-provide-azure/deploys/6348c8cc428856000824c748

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 12, 2022
lastUpdate: time.Now().UTC(),
})
}
localCache.Store(*scaleSet.Name, &vmssEntry{
Contributor Author

Cache all VMSSes instead of only uniform ones. I don't think this will break the VMSS Flex logic, but @zmyzheng can correct me.

Contributor

This will break the logic.

@@ -926,17 +926,7 @@ func (fs *FlexScaleSet) EnsureBackendPoolDeleted(service *v1.Service, backendPoo
}
}

// 1. Ensure the backendPoolID is deleted from the VMSS.
Contributor Author

Since we ensure all vmss are decoupled from the lb in azure_vmss.go, we don't need it here, right?

Contributor

I think we still need it. Per an AKS request, I will add another vmType == vmssflex for pure VMSS Flex clusters. In that case we skip the node type check and only initialize FlexScaleSet (similar to the pure standalone VM cluster).
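
For illustration only, a hedged sketch of the vmType dispatch described above; the constant names and constructor shape are hypothetical, not the provider's actual API:

package main

import "fmt"

const (
	vmTypeStandard = "standard"
	vmTypeVMSS     = "vmss"
	vmTypeVMSSFlex = "vmssflex" // proposed mode for pure VMSS Flex clusters
)

type vmSet interface {
	EnsureBackendPoolDeleted(backendPoolID string) error
}

type availabilitySet struct{}
type scaleSet struct{ flex *flexScaleSet }
type flexScaleSet struct{}

func (availabilitySet) EnsureBackendPoolDeleted(string) error { return nil }
func (scaleSet) EnsureBackendPoolDeleted(string) error        { return nil }
func (flexScaleSet) EnsureBackendPoolDeleted(string) error    { return nil }

// newVMSet picks the VM set implementation from the configured vmType.
func newVMSet(vmType string) vmSet {
	switch vmType {
	case vmTypeVMSSFlex:
		// Pure Flex cluster: skip the per-node type check, use FlexScaleSet directly.
		return flexScaleSet{}
	case vmTypeVMSS:
		// Uniform/mixed cluster: ScaleSet dispatches per node and wraps FlexScaleSet.
		return scaleSet{flex: &flexScaleSet{}}
	default:
		return availabilitySet{}
	}
}

func main() {
	fmt.Printf("%T\n", newVMSet(vmTypeVMSSFlex))
}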

@@ -92,13 +92,11 @@ func (ss *ScaleSet) newVMSSCache() (*azcache.TimedCache, error) {
klog.Warning("failed to get the name of VMSS")
continue
}
if scaleSet.OrchestrationMode == "" || scaleSet.OrchestrationMode == compute.OrchestrationModeUniform {
Member

this has changed the original vmss flex behavior cc @zmyzheng

Contributor

We should not save VMSS Flex into the vmssCache. vmssCache is only used for VMSS Uniform (although the name is a little misleading). The VMSS Flex cache is inside ss.flexScaleSet.vmssFlexCache.

Contributor

The main reason we need two different caches is that the VM list API is different between Uniform and Flex.
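
As a rough illustration of why the caches stay separate, a sketch with hypothetical list helpers standing in for the different SDK calls:

package main

import "fmt"

// Hypothetical list helpers; the real provider wraps different Azure SDK calls
// for the two orchestration modes, which is why the caches are kept separate.
func listUniformInstances(vmssName string) []string {
	// Uniform: instances are enumerated under the scale set resource itself.
	return []string{vmssName + "_0", vmssName + "_1"}
}

func listFlexVMs(vmssID string) []string {
	// Flex: members are standalone VMs associated with the scale set.
	return []string{"flex-vm-a", "flex-vm-b"}
}

func main() {
	// Two caches, two refresh paths, mirroring vmssCache (Uniform) and
	// ss.flexScaleSet.vmssFlexCache (Flex).
	vmssCache := map[string][]string{"vmss-uniform": listUniformInstances("vmss-uniform")}
	vmssFlexCache := map[string][]string{"vmss-flex-id": listFlexVMs("vmss-flex-id")}
	fmt.Println(len(vmssCache), len(vmssFlexCache))
}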

Contributor Author

Thanks for the review. I rolled back the change and use the flex cache to get all VMSSes. Could you review again?

@coveralls

coveralls commented Oct 13, 2022

Coverage Status

Coverage decreased (-0.05%) to 79.868% when pulling 5f97ea9 on nilo19:fix/zero-vmss into 8d0fb7f on kubernetes-sigs:master.

Member

@feiskyer feiskyer left a comment

lgtm

Member

@andyzhangx andyzhangx left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2022
@andyzhangx
Member

/hold
Hold a moment for other comments.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 13, 2022
klog.V(3).Infof("ensureBackendPoolDeletedFromVMSS: found vmss %s being deleted, skipping", to.String(vmss.Name))
return true
}
if vmss.VirtualMachineProfile.NetworkProfile.NetworkInterfaceConfigurations == nil {
Contributor

@zmyzheng zmyzheng Oct 13, 2022

When using VMSS Flex, it is possible that the VMSS Flex does not have a VM profile. We can skip a VMSS Flex that does not have a VM profile.

Contributor Author

Done
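
A minimal sketch of the resulting guard, with simplified stand-ins for the SDK types (the real check lives in ensureBackendPoolDeletedFromVMSS and may differ in detail):

package main

import "fmt"

// Simplified stand-ins for the SDK types; illustrative only.
type networkProfile struct {
	NetworkInterfaceConfigurations []string
}

type virtualMachineProfile struct {
	NetworkProfile *networkProfile
}

type virtualMachineScaleSet struct {
	Name                  string
	VirtualMachineProfile *virtualMachineProfile
}

// shouldSkip returns true when the scale set has no VM profile (possible for
// VMSS Flex) or no NIC configurations, so there is nothing to decouple.
func shouldSkip(vmss virtualMachineScaleSet) bool {
	return vmss.VirtualMachineProfile == nil ||
		vmss.VirtualMachineProfile.NetworkProfile == nil ||
		vmss.VirtualMachineProfile.NetworkProfile.NetworkInterfaceConfigurations == nil
}

func main() {
	flexWithoutProfile := virtualMachineScaleSet{Name: "vmss-flex"}
	fmt.Println(shouldSkip(flexWithoutProfile)) // true: skip it, nothing to decouple
}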

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, feiskyer, nilo19, zmyzheng

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [andyzhangx,feiskyer,nilo19]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zmyzheng
Contributor

/lgtm

@k8s-ci-robot
Contributor

@zmyzheng: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andyzhangx
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 14, 2022
@nilo19
Contributor Author

nilo19 commented Oct 14, 2022

Need an LGTM.

@feiskyer
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 14, 2022
@k8s-ci-robot k8s-ci-robot merged commit 905f6f0 into kubernetes-sigs:master Oct 14, 2022
@nilo19 nilo19 deleted the fix/zero-vmss branch October 15, 2022 00:14
@nilo19
Contributor Author

nilo19 commented Oct 15, 2022

/cherrypick release-1.25

@k8s-infra-cherrypick-robot

@nilo19: #2489 failed to apply on top of branch "release-1.25":

Applying: chore: update deploy-cluster.sh
Applying: fix: decouple vmss with 0 instance from lb when deleting the service
Using index info to reconstruct a base tree...
M	pkg/provider/azure_vmss.go
M	pkg/provider/azure_vmss_cache.go
M	pkg/provider/azure_vmss_test.go
M	pkg/provider/azure_vmssflex_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/provider/azure_vmssflex_test.go
Auto-merging pkg/provider/azure_vmss_test.go
Auto-merging pkg/provider/azure_vmss_cache.go
Auto-merging pkg/provider/azure_vmss.go
CONFLICT (content): Merge conflict in pkg/provider/azure_vmss.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 fix: decouple vmss with 0 instance from lb when deleting the service
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-1.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Successfully merging this pull request may close these issues.

Should remove LB backend pool reference from VMSS with 0 capacity when updating backend pool type to nodeIP