I ran into a situation where a Helm release got stuck in the pending-upgrade status.
Scenario:
I run helm upgrade and send it a SIGTERM signal while it is running. The release then stays stuck in pending-upgrade.
Suspicious path:
helm upgrade starts the upgrade in the function releasingUpgrade, and also calls a handleContext function.
In handleContext, when the context is done, it calls reportToPerformUpgrade. Following the code from there, the call chain is failRelease, recordRelease, then the storage update. Taking the Secret driver as an example (it is the one my environment uses), the chain continues into update and newSecretsObject. The Secret built there is not filled with a ResourceVersion, so the update does not appear to be guarded against a concurrent write, and a race condition may happen.
Now look at the releasingUpgrade function. Inside execHook, recordRelease is also called.
So the following interleaving is possible. While execHook is recording the hook status, it builds a Secret with status pending-upgrade and is about to submit the update. At this point the user sends SIGTERM and the context is cancelled. failRelease is called, it builds a Secret with status failed, and its update succeeds. Then the hook's Secret, which was just about to be submitted, is written last and turns the release status back to pending-upgrade. Nothing updates the status after that, so the release stays in pending-upgrade.
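To make the suspected interleaving concrete, here is a minimal sketch in Go. It is not Helm's code: the record and store types and the simulate function are hypothetical stand-ins, and the integer Version only imitates what a ResourceVersion check would do. It replays the sequence described above, once with unconditional writes and once with a version check.

```go
package main

import (
	"errors"
	"fmt"
)

// record is a toy stand-in for the stored release object (hypothetical, not a Helm type).
type record struct {
	Status  string
	Version int
}

var errConflict = errors.New("conflict: stale version")

// store is a toy in-memory backend holding a single release record.
type store struct{ cur record }

// blindUpdate overwrites whatever is stored, so the last writer wins.
func (s *store) blindUpdate(status string) {
	s.cur = record{Status: status, Version: s.cur.Version + 1}
}

// versionedUpdate succeeds only if the caller saw the latest version.
func (s *store) versionedUpdate(status string, seen int) error {
	if seen != s.cur.Version {
		return errConflict
	}
	s.cur = record{Status: status, Version: s.cur.Version + 1}
	return nil
}

// simulate replays the interleaving from the report and returns the final status.
func simulate(versioned bool) string {
	s := &store{cur: record{Status: "pending-upgrade", Version: 1}}

	// The hook path reads the release and prepares a pending-upgrade write.
	seenByHook := s.cur.Version
	// SIGTERM arrives: the signal path reads the release and writes "failed" first.
	seenBySignal := s.cur.Version

	if versioned {
		_ = s.versionedUpdate("failed", seenBySignal)
		// The hook's write is now stale and is rejected with errConflict.
		_ = s.versionedUpdate("pending-upgrade", seenByHook)
	} else {
		s.blindUpdate("failed")
		// The hook's delayed write silently overwrites "failed".
		s.blindUpdate("pending-upgrade")
	}
	return s.cur.Status
}

func main() {
	fmt.Println("blind:", simulate(false))    // blind: pending-upgrade
	fmt.Println("versioned:", simulate(true)) // versioned: failed
}
```

With unconditional writes the stale pending-upgrade record wins. With a version check the stale write is rejected and the release stays failed. Whether the real Secret driver behaves like the blind case depends on how the API server treats an update without a ResourceVersion, which I have only inferred from reading the code.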
Advice:
I think a mutex should be introduced to guard the release status. But right now I don't have a complete answer, and I need some input from the Helm team. If this is confirmed to be a bug, I'd like to fix it myself.
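One possible shape for the mutex idea, as a rough sketch only. The guardedStatus type and its set and get methods are hypothetical and do not exist in Helm. The assumption is that a lock alone only serializes the writes, so the sketch also remembers when a terminal status has been recorded and drops later non-terminal writes.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedStatus is a hypothetical in-process guard around the release status.
type guardedStatus struct {
	mu     sync.Mutex
	status string
	final  bool // true once a terminal status such as "failed" was recorded
}

// set records the status unless a terminal status was already recorded.
// It reports whether the write was applied.
func (g *guardedStatus) set(status string, terminal bool) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.final {
		return false
	}
	g.status = status
	g.final = terminal
	return true
}

// get returns the current status under the lock.
func (g *guardedStatus) get() string {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.status
}

func main() {
	g := &guardedStatus{status: "pending-upgrade"}

	// The signal path records the failure from another goroutine.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		g.set("failed", true)
	}()
	wg.Wait()

	// The hook path's delayed write arrives afterwards and is dropped.
	applied := g.set("pending-upgrade", false)
	fmt.Println(g.get(), applied) // failed false
}
```

In a real fix the storage write would have to happen while the lock is held, otherwise the same reordering could still occur between the in-memory check and the write to the cluster.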