pv controller processes same object multiple times for same delete event #88088
The PV controller appears to process the same object multiple times for a single PVC delete event. When a PVC is deleted and the PV is synced (either because of an event or because of a resync), each update to the PV object causes another event, and those events keep being processed until the PV object is removed from the informer/store and actually marked as deleted.
This results in multiple delete requests being sent to the storage provider for the same volume. For example: https://gist.githubusercontent.com/gnufied/942552193cc5450f7401fe25f03937a8/raw/184215d23b2fd8ec0fb490d456fe8bee0e449197/code.txt
In the above gist you can see that volume deletion is being done more than once for pv -
To some extent this looks like it is by design: events are pulled from the workqueue by one goroutine, which does some processing and then hands the work off to another goroutine via nestedpendingoperations. The problem is that this is inherently racy, because the first goroutine might be pulling events that were triggered by the second goroutine, and so it ends up processing events for an operation that is still in progress.
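To make the feedback loop concrete, here is a toy, single-goroutine model of it. This is not the PV controller code: `pv`, `fakeBackend`, `simulateDelete` and the `updatesBeforeRemoval` parameter are all made up for illustration, and the model simply assumes that every sync of the PV sends one delete to the backend and writes one update, and that every update enqueues one more event.

```go
package main

// pv is a hypothetical stand-in for a PersistentVolume held in an informer store.
type pv struct {
	name    string
	updates int // number of updates written to this object so far
}

// fakeBackend counts delete calls per volume, standing in for a storage provider.
type fakeBackend struct {
	deletes map[string]int
}

func (b *fakeBackend) deleteVolume(name string) { b.deletes[name]++ }

// simulateDelete replays the loop described above. Assumption: the PV only
// leaves the store after updatesBeforeRemoval updates have been written.
// It returns how many delete requests reached the backend.
func simulateDelete(name string, updatesBeforeRemoval int) int {
	backend := &fakeBackend{deletes: map[string]int{}}
	store := map[string]*pv{name: {name: name}}
	queue := []string{name} // the initial event caused by the PVC delete

	for len(queue) > 0 {
		key := queue[0]
		queue = queue[1:]

		vol, ok := store[key]
		if !ok {
			continue // PV is gone from the store, nothing left to sync
		}

		backend.deleteVolume(vol.name) // each sync issues another delete
		vol.updates++                  // each sync also updates the PV

		if vol.updates >= updatesBeforeRemoval {
			delete(store, key) // PV finally removed from the store
			continue
		}
		queue = append(queue, key) // the update itself triggers a new event
	}
	return backend.deletes[name]
}
```

With these assumptions, `simulateDelete("pv-1", 3)` sends three deletes for one PVC delete event, one per update that happens before the PV leaves the store.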
I take it we can't use workqueues with a fixed number of workers, because that would limit the number of PVs we can provision concurrently. But can we put the item pulled from the queue directly into nested_pendingoperations so that we don't have this race condition?
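A rough sketch of the idea, assuming a per-key in-flight set that is claimed by the goroutine draining the queue before any work is handed off. `inFlight`, `tryStart`, `finish` and `dispatch` are hypothetical names, not the real nestedpendingoperations API, and the `release` channel exists only to hold operations open so the example behaves deterministically.

```go
package main

import "sync"

// inFlight tracks keys whose operation has started and not yet finished.
type inFlight struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func newInFlight() *inFlight { return &inFlight{keys: map[string]struct{}{}} }

// tryStart claims key and reports whether the caller may start an operation.
func (f *inFlight) tryStart(key string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, busy := f.keys[key]; busy {
		return false
	}
	f.keys[key] = struct{}{}
	return true
}

// finish releases key so that a later event can start a new operation.
func (f *inFlight) finish(key string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.keys, key)
}

// dispatch plays the role of the goroutine that drains the queue. It claims
// each key at dequeue time and runs op in its own goroutine, so concurrency
// is not capped by a fixed worker count. Events for a key that is already
// in flight are skipped. It returns the number of operations started.
func dispatch(events []string, f *inFlight, op func(key string)) int {
	var wg sync.WaitGroup
	release := make(chan struct{})
	started := 0

	for _, key := range events {
		if !f.tryStart(key) {
			continue // an operation for this key is still in progress
		}
		started++
		wg.Add(1)
		go func(k string) {
			defer wg.Done()
			defer f.finish(k) // runs before wg.Done
			<-release         // keep the operation open while events are drained
			op(k)
		}(key)
	}

	close(release)
	wg.Wait()
	return started
}
```

For the events `pv-1, pv-1, pv-2, pv-1` this starts two operations, one per volume. Note that a guard like this only suppresses events that arrive while the operation is in flight; events that arrive after it finishes but before the PV leaves the store would still need separate handling.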