fix race condition when delete azure disk right after that attach azure disk #84917

Merged
merged 1 commit into from Nov 9, 2019

Conversation

andyzhangx
Member

What type of PR is this?
/kind bug

What this PR does / why we need it:
Fix a race condition when a disk is attached and deleted at the same time.
An attach and a delete can hit the same disk at the same time, and Azure CRP does not check for this race. This PR addresses it as follows:
Every disk attach/detach operation first inserts a record (keyed by diskURI) into diskAttachDetachMap and removes that record once the operation completes. The disk delete operation first checks whether diskAttachDetachMap holds a record for the disk and, if it does, returns an error.
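
The bookkeeping described above could be sketched roughly as follows. This is an illustration only, not the code from this PR: the `diskController` type, its `attachDisk` and `deleteDisk` methods, and the `op` callback are hypothetical names chosen for the sketch, and the real cloud calls are replaced by placeholders.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// diskController is a hypothetical stand-in for the Azure disk controller.
// Only the in-flight bookkeeping is modeled here.
type diskController struct {
	// diskAttachDetachMap holds one record per disk URI that currently has
	// an attach or detach operation in progress.
	diskAttachDetachMap sync.Map
}

// attachDisk records the disk as in flight, runs op (a placeholder for the
// real attach call), and removes the record when op returns.
func (c *diskController) attachDisk(diskURI string, op func() error) error {
	key := strings.ToLower(diskURI)
	c.diskAttachDetachMap.Store(key, "")
	defer c.diskAttachDetachMap.Delete(key)
	return op()
}

// deleteDisk returns an error if the disk has an attach or detach in flight.
func (c *diskController) deleteDisk(diskURI string) error {
	if _, ok := c.diskAttachDetachMap.Load(strings.ToLower(diskURI)); ok {
		return fmt.Errorf("failed to delete disk(%s) since it's in attaching or detaching state", diskURI)
	}
	// The real delete call would go here (omitted in this sketch).
	return nil
}

func main() {
	c := &diskController{}
	diskURI := "abc"

	// A delete issued while the attach is still running is rejected.
	_ = c.attachDisk(diskURI, func() error {
		fmt.Println("during attach:", c.deleteDisk(diskURI))
		return nil
	})

	// Once the attach has finished, the record is gone and delete proceeds.
	fmt.Println("after attach:", c.deleteDisk(diskURI))
}
```

Running the sketch prints the "failed to delete disk(abc)" error for the first call and `<nil>` for the second. Note that in this sketch the check in `deleteDisk` and the delete itself are not one atomic step, so the approach narrows the race window rather than closing it completely.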

Which issue(s) this PR fixes:

Fixes #82714

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fix race condition when attaching and deleting an Azure disk at the same time

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/kind bug
/assign @feiskyer @khenidak
/priority important-soon
/sig cloud-provider
/area provider/azure

/hold
Holding for a while for review and testing.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Nov 7, 2019
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. area/provider/azure Issues or PRs related to azure provider labels Nov 7, 2019
@andyzhangx
Member Author

cc @ritazh

@k8s-ci-robot k8s-ci-robot added area/cloudprovider approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Nov 7, 2019
@andyzhangx
Member Author

/test pull-kubernetes-e2e-aks-engine-azure

@andyzhangx
Member Author

/hold cancel
Tests pass on a local aks-engine cluster.

Below is sample code showing how this works with sync.Map:

package main

import (
	"fmt"
	"strings"
	"sync"
)

func main() {
	diskURI := "abc"
	var diskAttachDetachMap sync.Map

	// An attach/detach starts: record the disk as in flight.
	diskAttachDetachMap.Store(strings.ToLower(diskURI), "")
	// A delete at this point finds the record and is rejected.
	if _, ok := diskAttachDetachMap.Load(strings.ToLower(diskURI)); ok {
		fmt.Printf("1: failed to delete disk(%s) since it's in attaching or detaching state\n", diskURI)
	}

	// The attach/detach completes: remove the record.
	diskAttachDetachMap.Delete(strings.ToLower(diskURI))
	// A delete no longer finds the record, so nothing is printed here.
	if _, ok := diskAttachDetachMap.Load(strings.ToLower(diskURI)); ok {
		fmt.Printf("2: failed to delete disk(%s) since it's in attaching or detaching state\n", diskURI)
	}
}

expected output:
1: failed to delete disk(abc) since it's in attaching or detaching state

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 7, 2019
@andyzhangx
Member Author

/test pull-kubernetes-e2e-aks-engine-azure

@khenidak
Contributor

khenidak commented Nov 7, 2019

/hold until we understand more about the race condition.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 7, 2019
@khenidak
Contributor

khenidak commented Nov 7, 2019

/unhold Let us have this as a stop gap for now.

@andyzhangx andyzhangx changed the title fix race condition when attach/delete azure disk in same time fix race condition when attach azure disk and then delete disk right after that Nov 8, 2019
@andyzhangx andyzhangx changed the title fix race condition when attach azure disk and then delete disk right after that fix race condition when delete azure disk right after that attach azure disk Nov 8, 2019
Member

@feiskyer feiskyer left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx, feiskyer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@andyzhangx
Member Author

BTW, there is a separate ToBeDetached flag issue that will be addressed by #84958; that issue only affects k8s 1.16.x and later.

@andyzhangx
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 9, 2019
@andyzhangx
Member Author

/test pull-kubernetes-e2e-aks-engine-azure

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 91a53b6 into kubernetes:master Nov 9, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Nov 9, 2019
@k8s-ci-robot
Contributor

@andyzhangx: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-kubernetes-e2e-aks-engine-azure | b6afc86 | link | /test pull-kubernetes-e2e-aks-engine-azure |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot added a commit that referenced this pull request Nov 29, 2019
…4917-upstream-release-1.16

Automated cherry pick of #84917: fix race condition when attach/delete disk
k8s-ci-robot added a commit that referenced this pull request Nov 30, 2019
…4917-upstream-release-1.15

Automated cherry pick of #84917: fix race condition when attach/delete disk
k8s-ci-robot added a commit that referenced this pull request Dec 2, 2019
…4917-upstream-release-1.14

Automated cherry pick of #84917: fix race condition when attach/delete disk
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cloudprovider area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

should not delete an azure disk when that disk is being attached
8 participants