feat: reduce vmss cache refresh in parallel disk attach/detach #803

Merged

Member

@andyzhangx andyzhangx commented Sep 12, 2021

What type of PR is this?

/kind feature

What this PR does / why we need it:

feat: reduce vmss cache refresh in parallel disk attach/detach
This PR adds a lock around the get vmss operation. In a 1K disk attach/detach load test, it could reduce vmss cache refresh operations by ~20% (67/338 ≈ 20%). It's not necessary to refresh cache every time when nodeName is not found in vmss cache, especially in parallel workloads.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

/hold

Does this PR introduce a user-facing change?

feat: reduce vmss cache refresh in parallel disk attach/detach

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

feat: reduce vmss cache refresh in parallel disk attach/detach

use cacheKey when refresh

add logging
@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 12, 2021
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 12, 2021
@coveralls

Coverage Status

Coverage decreased (-0.008%) to 80.613% when pulling 7f18487 on andyzhangx:reduce-cache-refresh into 13c8062 on kubernetes-sigs:master.

@andyzhangx andyzhangx changed the title from "feat: reduce vmss cache refresh" to "feat: reduce vmss cache refresh in parallel disk attach/detach" Sep 12, 2021
@feiskyer
Member

It's not necessary to refresh cache every time when nodeName is not found in vmss cache, especially in parallel workloads.

If new nodes are created continuously by the autoscaler, then the cache needs to be refreshed. Otherwise, newly joined nodes may be deleted.

@@ -76,6 +76,8 @@ type ScaleSet struct {
 	vmssCache                 *azcache.TimedCache
 	vmssVMCache               *sync.Map // [resourcegroup/vmssname]*azcache.TimedCache
 	availabilitySetNodesCache *azcache.TimedCache
+	// lockMap in cache refresh
+	lockMap *lockMap
Member

how is this map set and refreshed?

Member Author

it's in L98:

lockMap:         newLockMap(),
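
The definition of `lockMap` is not shown in this conversation. As a rough illustration only, a keyed lock map of this kind could look like the sketch below. The field names, the `LockEntry`/`UnlockEntry` methods, and the `countWithLock` helper are assumptions made for this example, not code taken from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// lockMap is a hypothetical keyed lock: one mutex per string key.
// It is an illustrative stand-in, not the type defined in the PR.
type lockMap struct {
	mu      sync.Mutex             // guards entries
	entries map[string]*sync.Mutex // per-key mutexes, created on first use
}

func newLockMap() *lockMap {
	return &lockMap{entries: make(map[string]*sync.Mutex)}
}

// LockEntry blocks until the mutex for key is held by the caller.
func (lm *lockMap) LockEntry(key string) {
	lm.mu.Lock()
	m, ok := lm.entries[key]
	if !ok {
		m = &sync.Mutex{}
		lm.entries[key] = m
	}
	lm.mu.Unlock()
	// Lock the per-key mutex outside lm.mu so other keys are not blocked.
	m.Lock()
}

// UnlockEntry releases the mutex for key; unknown keys are ignored.
func (lm *lockMap) UnlockEntry(key string) {
	lm.mu.Lock()
	m, ok := lm.entries[key]
	lm.mu.Unlock()
	if ok {
		m.Unlock()
	}
}

// countWithLock increments a shared counter from n goroutines,
// all serialized on the same (hypothetical) key.
func countWithLock(n int) int {
	lm := newLockMap()
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			lm.LockEntry("vmss")
			defer lm.UnlockEntry("vmss")
			counter++
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithLock(100)) // prints 100
}
```

Because the map is created once in the constructor and entries are added lazily, nothing needs to be refreshed: a key's mutex simply lives as long as the owning struct.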

@andyzhangx
Member Author

It's not necessary to refresh cache every time when nodeName is not found in vmss cache, especially in parallel workloads.

If new nodes are created continuously by the autoscaler, then the cache needs to be refreshed. Otherwise, newly joined nodes may be deleted.

@feiskyer this PR does not change the original logic. It serializes the vmss list operation with a lock and then takes a second look in the vmss cache, which improves the cache hit ratio when a large number of vmss list operations run in parallel.
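
The "lock, then look again" pattern described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `nodeCache`, its fields, the `list` callback, and `simulate` are hypothetical names, and a plain map stands in for the timed vmss cache.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeCache is a hypothetical stand-in for the vmss cache.
type nodeCache struct {
	mu        sync.RWMutex             // guards nodes and refreshes
	refreshMu sync.Mutex               // serializes refresh attempts
	nodes     map[string]string        // nodeName -> vmss name
	refreshes int                      // number of list calls performed
	list      func() map[string]string // the expensive list operation
}

// lookup reads the cache without triggering a refresh.
func (c *nodeCache) lookup(name string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.nodes[name]
	return v, ok
}

// get returns the vmss for name, refreshing the cache at most once
// per group of concurrent callers that missed on the same stale data.
func (c *nodeCache) get(name string) (string, bool) {
	if v, ok := c.lookup(name); ok {
		return v, true
	}
	c.refreshMu.Lock()
	defer c.refreshMu.Unlock()
	// Second look: another caller may have refreshed while we waited.
	if v, ok := c.lookup(name); ok {
		return v, true
	}
	// Still missing, so refresh as before.
	fresh := c.list()
	c.mu.Lock()
	c.nodes = fresh
	c.refreshes++
	c.mu.Unlock()
	return c.lookup(name)
}

// simulate starts the given number of goroutines that all ask for the
// same uncached node and reports how many refreshes were performed.
func simulate(workers int) int {
	c := &nodeCache{
		nodes: map[string]string{},
		list: func() map[string]string {
			return map[string]string{"node-1": "vmss-a"}
		},
	}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get("node-1")
		}()
	}
	wg.Wait()
	return c.refreshes
}

func main() {
	fmt.Println(simulate(50)) // prints 1
}
```

In this sketch a node that is still missing after the second look always triggers a refresh, so a newly joined node is still picked up. Only the redundant refreshes from callers that were waiting on the lock are skipped.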

Member

@feiskyer feiskyer left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 13, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, feiskyer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [andyzhangx,feiskyer]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@feiskyer
Member

@feiskyer this PR does not change the original logic. It serializes the vmss list operation with a lock and then takes a second look in the vmss cache, which improves the cache hit ratio when a large number of vmss list operations run in parallel.

Reasonable. LGTM.

@andyzhangx
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2021
@k8s-ci-robot k8s-ci-robot merged commit 47c4f2e into kubernetes-sigs:master Sep 13, 2021
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
4 participants