
pv controller processes same object multiple times for same delete event #88088

Closed · gnufied opened this issue Feb 12, 2020 · 1 comment · Fixed by #88146

Comments

gnufied (Member) commented Feb 12, 2020

The PV controller appears to be processing the same object multiple times for the same PVC delete event. The reason is mainly that when a PVC is deleted and the PV is synced (either because of an event or because of a resync), each update to the PV object causes another event, which gets processed until the PV object is removed from the informer/store and really marked as deleted.

This results in multiple delete requests being sent to the storage provider for the same volume. For example - https://gist.githubusercontent.com/gnufied/942552193cc5450f7401fe25f03937a8/raw/184215d23b2fd8ec0fb490d456fe8bee0e449197/code.txt

In the above gist you can see that volume deletion is performed more than once for the PV pvc-6a3d4163-698a-40f5-83ae-16a30724d2ca
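
For illustration, here is a minimal, self-contained sketch of that feedback loop (plain Go, no client-go dependency; all names are made up, not the actual controller code): each update the controller makes to the PV object lands back in the queue as another event, so the provider-side delete is attempted again until the object finally leaves the store.

```go
package main

import "fmt"

func main() {
	// The "informer store": true while the PV object still exists in it.
	inStore := true
	// The event queue; each PV update lands here as another sync event.
	events := []string{"pvc-deleted"}

	deleteCalls := 0
	for len(events) > 0 {
		ev := events[0]
		events = events[1:]

		if !inStore {
			continue // object finally gone from the store; nothing to sync
		}
		// The sync sees a released PV and asks the provider to delete it.
		deleteCalls++
		fmt.Printf("event %q -> provider delete call #%d\n", ev, deleteCalls)

		if deleteCalls < 3 {
			// Updating the PV (status, finalizers, ...) fires another
			// event before the object is actually removed.
			events = append(events, fmt.Sprintf("pv-update-%d", deleteCalls))
		} else {
			// Eventually the PV is removed from the store; the last
			// queued event is then ignored.
			inStore = false
			events = append(events, "pv-deleted")
		}
	}
	fmt.Printf("provider delete was called %d times for one PVC deletion\n", deleteCalls)
}
```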

/sig storage

cc @jsafrane @msau42

gnufied (Member, Author) commented Feb 12, 2020

It looks like this is to some extent by design, because events are pulled from the workqueue by one goroutine, which does some processing and then hands them off to another goroutine via the nestedpendingoperations thingy. The problem is that this is inherently "racy": the first goroutine might be pulling events that were triggered by the second goroutine, and hence processing events that are still in-progress.
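
A rough sketch of the shape of that race, with everything made up (the real controller hands work to nestedpendingoperations rather than a bare goroutine): the worker keeps pulling keys and launching operations without waiting, so an event emitted by an in-flight operation can start a second, concurrent operation for the same volume.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	queue := make(chan string, 10)
	queue <- "pvc-6a3d4163" // the initial delete event

	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight := 0

	// Worker: pulls a key and immediately hands it to a new goroutine,
	// without waiting for previous operations on the same key to finish.
	for i := 0; i < 2; i++ {
		key := <-queue
		wg.Add(1)
		go func(key string, n int) {
			defer wg.Done()
			mu.Lock()
			inFlight++
			fmt.Printf("operation %d for %s started (in flight: %d)\n", n, key, inFlight)
			mu.Unlock()

			if n == 1 {
				// The operation updates the PV object; the informer turns
				// that into another event for the same key.
				queue <- key
			}
			time.Sleep(50 * time.Millisecond) // the slow provider call

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(key, i+1)
	}
	wg.Wait()
}
```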

I take it we can't use workqueues with a fixed number of workers, because that would kinda limit the number of PVs we can provision concurrently, but can we put the item pulled from the queue directly into nestedpendingoperations so that we don't have this weird race condition?
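
A minimal sketch of that dedup idea: track in-flight volumes and reject (or requeue) an event for a volume whose operation is still running, which is roughly the "operation already pending" check that nestedpendingoperations itself performs. The types and names below are hypothetical, not the actual package API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errAlreadyPending = errors.New("operation already pending for volume")

type pendingOps struct {
	mu      sync.Mutex
	pending map[string]bool
}

// Run starts op for volume in its own goroutine, unless an operation
// for that volume is already in flight.
func (p *pendingOps) Run(volume string, op func()) error {
	p.mu.Lock()
	if p.pending[volume] {
		p.mu.Unlock()
		return errAlreadyPending
	}
	p.pending[volume] = true
	p.mu.Unlock()

	go func() {
		defer func() {
			p.mu.Lock()
			delete(p.pending, volume)
			p.mu.Unlock()
		}()
		op()
	}()
	return nil
}

func main() {
	ops := &pendingOps{pending: map[string]bool{}}
	started := make(chan struct{})
	finish := make(chan struct{})
	finished := make(chan struct{})

	// First delete event: starts the (slow) provider call.
	ops.Run("pvc-6a3d4163", func() {
		close(started)
		<-finish
		close(finished)
	})
	<-started

	// Duplicate event while the first operation is still in flight:
	// rejected, so the provider is not asked to delete the volume twice.
	if err := ops.Run("pvc-6a3d4163", func() {}); err != nil {
		fmt.Println("second event:", err)
	}
	close(finish)
	<-finished
}
```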
