feat: reduce vmss cache refresh in parallel disk attach/detach #803
use cacheKey when refresh; add logging
If new nodes are continuously created by the autoscaler, the cache needs to be refreshed; otherwise, newly joined nodes may be deleted.
@@ -76,6 +76,8 @@ type ScaleSet struct {
	vmssCache *azcache.TimedCache
	vmssVMCache *sync.Map // [resourcegroup/vmssname]*azcache.TimedCache
	availabilitySetNodesCache *azcache.TimedCache
	// lockMap in cache refresh
	lockMap *lockMap
how is this map set and refreshed?
it's in L98:
lockMap: newLockMap(),
@feiskyer this PR does not change the original logic. It serializes the vmss list operation with a lock and then takes a second look in the vmss cache, which improves the cache hit ratio when a large number of vmss list operations run in parallel.
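For readers following the thread, the lock-then-second-look idea can be sketched roughly as below. This is only an illustration, not the PR's code: `nodeCache`, `lockEntry`, `unlockEntry`, `get`, `countRefreshes`, and the `"rg/vmss"` cache key are hypothetical names, and a plain map stands in for `azcache.TimedCache`.

```go
package main

import (
	"fmt"
	"sync"
)

// lockMap hands out one mutex per key (hypothetical sketch).
type lockMap struct {
	mu      sync.Mutex
	entries map[string]*sync.Mutex
}

func newLockMap() *lockMap {
	return &lockMap{entries: make(map[string]*sync.Mutex)}
}

// lockEntry acquires the mutex for key, creating it on first use.
func (lm *lockMap) lockEntry(key string) {
	lm.mu.Lock()
	m, ok := lm.entries[key]
	if !ok {
		m = &sync.Mutex{}
		lm.entries[key] = m
	}
	lm.mu.Unlock()
	m.Lock()
}

// unlockEntry releases the mutex for key, if one exists.
func (lm *lockMap) unlockEntry(key string) {
	lm.mu.Lock()
	m := lm.entries[key]
	lm.mu.Unlock()
	if m != nil {
		m.Unlock()
	}
}

// nodeCache is a stand-in for the vmss VM cache (assumption: a plain map).
type nodeCache struct {
	mu        sync.RWMutex
	nodes     map[string]string // nodeName -> instance ID
	refreshes int               // number of simulated list calls
	locks     *lockMap
	list      func() map[string]string // simulates the expensive vmss list call
}

func (c *nodeCache) lookup(node string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	id, ok := c.nodes[node]
	return id, ok
}

// get refreshes the cache only if the node is still missing after the
// per-key lock has been acquired (the "second look").
func (c *nodeCache) get(cacheKey, node string) (string, bool) {
	if id, ok := c.lookup(node); ok {
		return id, true
	}
	c.locks.lockEntry(cacheKey)
	defer c.locks.unlockEntry(cacheKey)
	if id, ok := c.lookup(node); ok { // another caller already refreshed
		return id, true
	}
	fresh := c.list()
	c.mu.Lock()
	c.nodes = fresh
	c.refreshes++
	c.mu.Unlock()
	return c.lookup(node)
}

// countRefreshes runs `workers` parallel lookups of a node that is missing
// from the cache at first and reports how many refreshes happened.
func countRefreshes(workers int) int {
	c := &nodeCache{
		nodes: map[string]string{},
		locks: newLockMap(),
		list: func() map[string]string {
			return map[string]string{"node-1": "vmss_0"}
		},
	}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get("rg/vmss", "node-1")
		}()
	}
	wg.Wait()
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.refreshes
}

func main() {
	fmt.Println("refreshes:", countRefreshes(50)) // prints "refreshes: 1"
}
```

In this sketch, 50 parallel lookups of the same missing node trigger a single refresh, because every caller that waited on the lock finds the node on its second look. A node that is still absent after a refresh would still cause one refresh per caller, so the saving depends on how often waiting callers are satisfied by someone else's refresh.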
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andyzhangx, feiskyer The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Reasonable. LGTM.
/hold cancel
What type of PR is this?
/kind feature
What this PR does / why we need it:
feat: reduce vmss cache refresh in parallel disk attach/detach
This PR adds a lock to the get vmss operation. In a 1K disk attach/detach load test, it reduced vmss cache refresh operations by ~20% (67/338 ≈ 20%). It is not necessary to refresh the cache every time a nodeName is not found in the vmss cache, especially in parallel workloads.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
/hold
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: