fix(k8s): delete orphaned StatefulSets before recreating to avoid PVC mismatch #21786
Merged
jujubot merged 3 commits on Feb 18, 2026
… mismatch
When a Juju-managed StatefulSet has a different storage unique ID than expected (e.g. leftover from a force-removed deployment), delete and recreate it instead of attempting an in-place update. Kubernetes does not allow modifying volumeClaimTemplates on existing StatefulSets, which causes PVC name mismatches.
Refs: juju#21722
Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
wallyworld
reviewed
Feb 16, 2026
Member
wallyworld
left a comment
Thanks for submitting the patch, much appreciated.
I've suggested a couple of tweaks.
Replace the 3-second polling loop in waitForStatefulSetDeletion with an event-driven Kubernetes informer watcher. Extract orphan detection logic from Ensure into a new shouldDeleteExistingStatefulSet method, and add a watchStatefulSet helper for creating the underlying notify watcher. Update tests to use a mock watcher instead of clock advancement, and standardize context usage across production and test code.
Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
Contributor
Author
Thanks for the review, @wallyworld! I've addressed all your comments in the latest push. Appreciate the suggestions!
wallyworld
approved these changes
Feb 17, 2026
Member
wallyworld
left a comment
Thank you for implementing this
Skip the waitForStatefulSetDeletion call when the delete returned NotFound, avoiding unnecessary polling. Also rename UID to UUID for consistency and use the JujuFieldManager constant instead of a hardcoded "juju" string.
Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
adisazhar123
approved these changes
Feb 18, 2026
Member
adisazhar123
left a comment
Thank you for the fix. I have QAed it too and it looks good.
Member
/merge
When an application is force-removed with --force --no-wait, the StatefulSet may not be fully cleaned up. On redeployment, Juju reuses the orphaned StatefulSet via the update path, but Kubernetes does not allow modifying volumeClaimTemplates on an existing StatefulSet. Since PR #20795 changed the StorageUniqueID source of truth from the StatefulSet annotation to the application document, a new unique ID is generated on redeploy, causing a mismatch between the volume mount names and the existing PVC names. This prevents pods from starting.

The fix detects orphaned StatefulSets by comparing the app.juju.is/uuid annotation against the expected StorageUniqueID. When a mismatch is found and the StatefulSet is confirmed to be owned by Juju (via the app.kubernetes.io/managed-by label and the model.juju.is/id annotation), it is deleted and recreated with the correct volumeClaimTemplates.
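The orphan check described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's shouldDeleteExistingStatefulSet: the `statefulSetMeta` type and `isOrphaned` function are hypothetical, the managed-by value `"juju"` is assumed, and comparing the model.juju.is/id annotation against the current model ID is a guess at how ownership is confirmed. Only the label and annotation keys come from the description.

```go
package main

import "fmt"

// Keys named in the PR description.
const (
	labelManagedBy    = "app.kubernetes.io/managed-by"
	annotationModelID = "model.juju.is/id"
	annotationAppUUID = "app.juju.is/uuid"
)

// managedByJuju is an assumed label value; the PR text does not state it.
const managedByJuju = "juju"

// statefulSetMeta is a hypothetical stand-in for the labels and annotations
// of a Kubernetes StatefulSet.
type statefulSetMeta struct {
	Labels      map[string]string
	Annotations map[string]string
}

// isOrphaned reports whether an existing StatefulSet should be deleted and
// recreated: it must be owned by Juju for this model, and its recorded
// unique ID must differ from the one the application now expects.
func isOrphaned(sts statefulSetMeta, modelID, expectedStorageUniqueID string) bool {
	if sts.Labels[labelManagedBy] != managedByJuju {
		return false // not managed by Juju: never delete it
	}
	if sts.Annotations[annotationModelID] != modelID {
		return false // assumption: ownership means the model ID matches
	}
	existing, ok := sts.Annotations[annotationAppUUID]
	if !ok {
		return false // assumption: with no ID to compare, keep the update path
	}
	return existing != expectedStorageUniqueID
}

func main() {
	sts := statefulSetMeta{
		Labels: map[string]string{labelManagedBy: managedByJuju},
		Annotations: map[string]string{
			annotationModelID: "model-1",
			annotationAppUUID: "old-id",
		},
	}
	fmt.Println(isOrphaned(sts, "model-1", "new-id")) // true
	fmt.Println(isOrphaned(sts, "model-1", "old-id")) // false
}
```

Checking ownership before comparing IDs keeps the destructive path narrow: a StatefulSet that Juju did not create is never a deletion candidate, whatever its annotations say.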
QA steps
juju bootstrap microk8s micro
juju add-model test
juju deploy postgresql-k8s --trust --channel 16/edge --revision 726
juju remove-application postgresql-k8s --destroy-storage --force --no-prompt --no-wait
kubectl get statefulset -n test
juju deploy postgresql-k8s --trust --channel 16/edge --revision 726
Expected result: "deleting orphaned statefulset" is reported and the app reaches active.

Documentation changes
No user-facing workflow changes. The fix is transparent — orphaned StatefulSets are automatically cleaned up during redeployment.
Links
Issue: Fixes #21722.
Jira card: JUJU-9161