feat: reduce vmss cache refresh in parallel disk attach/detach #803

andyzhangx · 2021-09-12T11:49:18Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

feat: reduce vmss cache refresh in parallel disk attach/detach
This PR adds a lock around the get vmss operation. In a 1K disk attach/detach load test, it reduced vmss cache refresh operations by about 20% (67/338 ≈ 20%). It's not necessary to refresh the cache every time nodeName is not found in the vmss cache, especially in parallel workloads.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

/hold

Does this PR introduce a user-facing change?

feat: reduce vmss cache refresh in parallel disk attach/detach

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

feat: reduce vmss cache refresh in parallel disk attach/detach

use cacheKey when refresh add logging

coveralls · 2021-09-12T11:54:54Z

Coverage decreased (-0.008%) to 80.613% when pulling 7f18487 on andyzhangx:reduce-cache-refresh into 13c8062 on kubernetes-sigs:master.

feiskyer · 2021-09-13T00:37:08Z

It's not necessary to refresh cache every time when nodeName is not found in vmss cache, especially in parallel workloads.

If new nodes are created continuously by the autoscaler, then the cache needs to be refreshed. Otherwise, newly joined nodes may be deleted.

feiskyer · 2021-09-13T00:38:02Z

pkg/provider/azure_vmss.go

@@ -76,6 +76,8 @@ type ScaleSet struct {
 	vmssCache                 *azcache.TimedCache
 	vmssVMCache               *sync.Map // [resourcegroup/vmssname]*azcache.TimedCache
 	availabilitySetNodesCache *azcache.TimedCache
+	// lockMap in cache refresh
+	lockMap *lockMap


How is this map set and refreshed?

It's set at L98:

lockMap: newLockMap(),

andyzhangx · 2021-09-13T02:12:35Z

It's not necessary to refresh cache every time when nodeName is not found in vmss cache, especially in parallel workloads.

If new nodes are created continuously by the autoscaler, then the cache needs to be refreshed. Otherwise, newly joined nodes may be deleted.

@feiskyer this PR does not change the original logic. It serializes the vmss list operation with a lock and then takes a second look at the vmss cache, which improves the cache hit ratio when a large number of vmss list operations run in parallel.

feiskyer

/lgtm

k8s-ci-robot · 2021-09-13T07:25:47Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, feiskyer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [andyzhangx,feiskyer]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

feiskyer · 2021-09-13T07:26:00Z

@feiskyer this PR does not change the original logic. It serializes the vmss list operation with a lock and then takes a second look at the vmss cache, which improves the cache hit ratio when a large number of vmss list operations run in parallel.

Reasonable. LGTM.

andyzhangx · 2021-09-13T07:31:51Z

/hold cancel

feat: reduce vmss cache refresh

7f18487

use cacheKey when refresh add logging

k8s-ci-robot requested review from feiskyer and nilo19 September 12, 2021 11:49

k8s-ci-robot added the approved label Sep 12, 2021

andyzhangx changed the title ~~feat: reduce vmss cache refresh~~ feat: reduce vmss cache refresh in parallel disk attach/detach Sep 12, 2021

feiskyer reviewed Sep 13, 2021

feiskyer approved these changes Sep 13, 2021

k8s-ci-robot assigned feiskyer Sep 13, 2021

k8s-ci-robot added the lgtm label Sep 13, 2021

k8s-ci-robot removed the do-not-merge/hold label Sep 13, 2021

k8s-ci-robot merged commit 47c4f2e into kubernetes-sigs:master Sep 13, 2021
