[job failed] 1.9-master upgrade|downgrade jobs #60764
/milestone v1.10
cc @kubernetes/sig-cluster-lifecycle-bugs
Downgrades are failing to delete the stateful set test's namespace because PVCs remain. All the other tests are actually passing.
@krousey do you know who I need to bug next?
In particular: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-master-new-downgrade-cluster-parallel/1910
@krzyzacy We should pull in someone from storage and maybe someone from apps.
I've created PR #61324 for the above.
After K8s 1.9 is upgraded to K8s 1.10, the finalizer [kubernetes.io/pvc-protection] is added to PVCs because the StorageObjectInUseProtection feature is enabled by default in K8s 1.10. However, when K8s 1.10 is downgraded to K8s 1.9, the finalizers remain on the PVCs. Because pvc-protection-controller is not started by default in K8s 1.9, the finalizers are not removed automatically from deleted PVCs, so deleted PVCs are not removed but remain in the Terminating phase. For this reason pvc-protection-controller is now always started: it removes finalizers from PVCs automatically when a PVC is not in active use by a pod. Related issue: kubernetes#60764
After K8s 1.9 is upgraded to K8s 1.10, the finalizer [kubernetes.io/pv-protection] is added to PVs because the StorageObjectInUseProtection feature is enabled by default in K8s 1.10. However, when K8s 1.10 is downgraded to K8s 1.9, the finalizers remain on the PVs. Because pv-protection-controller does not exist in K8s 1.9, the PV finalizers are not removed automatically from deleted PVs, so deleted PVs remain in the system. For this reason the finalizer-removing part of pv-protection-controller is backported from K8s 1.10, so that finalizers are removed automatically when a PV is deleted and is not Bound to a PVC. Related issue: kubernetes#60764 Related pv-protection-controller PR: kubernetes#58743
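For illustration, the finalizer-removal conditions described in the two commit messages above can be modeled in a few lines of Python. This is a simplified sketch, not the controllers' actual Go code: objects are plain dicts shaped like the API objects (`metadata.deletionTimestamp`, `metadata.finalizers`, `status.phase`), every helper name is hypothetical, and "in use" is reduced to "some pod in the same namespace references the claim".

```python
PVC_FINALIZER = "kubernetes.io/pvc-protection"
PV_FINALIZER = "kubernetes.io/pv-protection"


def _meta(obj):
    # Hypothetical helper: tolerate a missing or null metadata block.
    return obj.get("metadata") or {}


def is_being_deleted(obj):
    # A delete request on an object that still carries finalizers only sets
    # deletionTimestamp; the object stays until the finalizer list is empty.
    return _meta(obj).get("deletionTimestamp") is not None


def has_finalizer(obj, finalizer):
    return finalizer in (_meta(obj).get("finalizers") or [])


def pod_uses_pvc(pod, pvc):
    # Assumption (simplified): any pod in the PVC's namespace that references
    # the claim by name counts as actively using it.
    if _meta(pod).get("namespace") != _meta(pvc).get("namespace"):
        return False
    volumes = (pod.get("spec") or {}).get("volumes") or []
    claim = _meta(pvc).get("name")
    return any(
        (v.get("persistentVolumeClaim") or {}).get("claimName") == claim
        for v in volumes
    )


def should_remove_pvc_finalizer(pvc, pods):
    # PVC case: deleted, still protected, and no pod uses it.
    return (
        is_being_deleted(pvc)
        and has_finalizer(pvc, PVC_FINALIZER)
        and not any(pod_uses_pvc(p, pvc) for p in pods)
    )


def should_remove_pv_finalizer(pv):
    # PV case: deleted, still protected, and not Bound to a PVC.
    phase = (pv.get("status") or {}).get("phase")
    return (
        is_being_deleted(pv)
        and has_finalizer(pv, PV_FINALIZER)
        and phase != "Bound"
    )
```

In this model the downgrade failure corresponds to nothing ever evaluating these conditions: the finalizer stays in `metadata.finalizers`, so the object never leaves the deleting state.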
[MILESTONENOTIFIER] Milestone Issue: Up-to-date for process @krzyzacy @lukemarsden @luxas @roberthbailey
The 1.9 PR was merged in #61370 mid-day yesterday. Four green runs since then on https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-master-1.9-downgrade-cluster-parallel&sort-by-failures=
Some tests are still flaking in the job, but there are no solid red tests.
I think we can close this issue. I'll keep an eye on the job and open a few more flake issues.
After K8s 1.10 is upgraded to K8s 1.11, the finalizer [kubernetes.io/pvc-protection] is added to PVCs because the StorageObjectInUseProtection feature will be GA in K8s 1.11. However, when K8s 1.11 is downgraded to K8s 1.10 and the StorageObjectInUseProtection feature is disabled, the finalizers remain on the PVCs. Because pvc-protection-controller is not started in K8s 1.10 in that case, the finalizers are not removed automatically from deleted PVCs, so deleted PVCs are not removed from the system but remain in the Terminating phase. The same applies to pv-protection-controller and the [kubernetes.io/pv-protection] finalizer on PVs. For this reason pvc-protection-controller is now always started: it removes finalizers from PVCs automatically when a PVC is not in active use by a pod. Likewise, pv-protection-controller is always started to remove finalizers from PVs automatically when a PV is not Bound to a PVC. Related issue: kubernetes#60764
…nUseProtection-downgrade-issue

Automatic merge from submit-queue (batch tested with PRs 61324, 62880, 62765). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Always Start pvc-protection-controller and pv-protection-controller

**What this PR does / why we need it**:
After K8s 1.10 is upgraded to K8s 1.11, the finalizer `[kubernetes.io/pvc-protection]` is added to PVCs because the `StorageObjectInUseProtection` feature will be GA in K8s 1.11. However, when K8s 1.11 is downgraded to K8s 1.10 and the `StorageObjectInUseProtection` feature is disabled, the finalizers remain on the PVCs. Because `pvc-protection-controller` is not started in K8s 1.10 in that case, the finalizers are not removed automatically from deleted PVCs, so deleted PVCs are not removed from the system but remain in the `Terminating` phase. The same applies to `pv-protection-controller` and the `[kubernetes.io/pv-protection]` finalizer on PVs.

For this reason `pvc-protection-controller` is now always started: it removes finalizers from PVCs automatically when a PVC is not in active use by a pod. Likewise, `pv-protection-controller` is always started to remove finalizers from PVs automatically when a PV is not `Bound` to a PVC.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes N/A
Issue #60764 covers the downgrade from K8s 1.10 to K8s 1.9. This PR fixes the same problem for the downgrade from K8s 1.11 to K8s 1.10.

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
The StorageObjectInUseProtection feature is enabled by default in K8s 1.10+. Assume a K8s cluster is used with this feature enabled, i.e. finalizers are added to all PVs and PVCs. If the cluster admin then disables the StorageObjectInUseProtection feature and a user deletes a PVC that is not in active use by a pod, the PVC is not removed from the system because of the finalizer. The user therefore has to remove the finalizer manually to have the PVC removed from the system. Note: deleted PVs are not removed from the system either, also because of finalizers. For this reason pvc-protection-controller is now always started: it removes finalizers from PVCs automatically when a PVC is not in active use by a pod. Likewise, pv-protection-controller is always started to remove finalizers from PVs automatically when a PV is not Bound to a PVC. Related issue: kubernetes#60764 Related PRs: kubernetes#61370 kubernetes#61324
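As a rough sketch of the manual workaround mentioned above, the following Python builds the body of a JSON merge patch that drops one finalizer from an object's `metadata.finalizers` list. The function name and the sample object are hypothetical, and the snippet only constructs the patch; sending it to the API server is left to whatever client is in use.

```python
import json


def finalizer_removal_patch(obj, finalizer):
    """Return a JSON merge patch body that drops `finalizer`, or None if absent."""
    finalizers = (obj.get("metadata") or {}).get("finalizers") or []
    if finalizer not in finalizers:
        return None
    # A JSON merge patch replaces arrays wholesale, so send the full
    # remaining list rather than only the entry to delete.
    remaining = [f for f in finalizers if f != finalizer]
    return {"metadata": {"finalizers": remaining}}


# Hypothetical PVC stuck in Terminating after the feature was disabled.
stuck_pvc = {
    "metadata": {
        "name": "data",
        "namespace": "default",
        "deletionTimestamp": "2018-03-20T00:00:00Z",
        "finalizers": ["kubernetes.io/pvc-protection"],
    }
}

patch = finalizer_removal_patch(stuck_pvc, "kubernetes.io/pvc-protection")
print(json.dumps(patch))
```

Once the finalizer list of an object that is already being deleted becomes empty, the API server can finish removing it. Always starting the protection controllers makes this manual step unnecessary.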
…ction-downgrade-issue-cherry-pick-into-K8s-1.10

Automatic merge from submit-queue.

cherry-pick into K8s 1.10: Always Start pvc-protection-controller and pv-protection-controller

**What this PR does / why we need it**:
The StorageObjectInUseProtection feature is enabled by default in K8s 1.10+. Assume a K8s cluster is used with this feature enabled, i.e. finalizers are added to all PVs and PVCs. If the cluster admin then disables the StorageObjectInUseProtection feature and a user deletes a PVC that is not in active use by a pod, the PVC is not removed from the system because of the finalizer. The user therefore has to remove the finalizer manually to have the PVC removed from the system. Note: deleted PVs are not removed from the system either, also because of finalizers.

This problem was fixed in [K8s 1.9.6](https://github.com/kubernetes/kubernetes/releases/tag/v1.9.6) in PR #61370.
This problem is also fixed in K8s 1.11+ in PR #61324.
However, this problem is not fixed in K8s 1.10, so I've cherry-picked PR #61324 and propose merging it into K8s 1.10.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes N/A
Related issue: #60764

**Special notes for your reviewer**:

**Release note**:
```release-note
In case the StorageObjectInUseProtection feature is disabled and a Persistent Volume (PV) or Persistent Volume Claim (PVC) contains a finalizer and the PV or PVC is deleted, it is not automatically removed from the system. Now, it is automatically removed.
```
This is an umbrella issue for k8s-testgrid.appspot.com/sig-release-master-upgrade
There are multiple test failures, and I will open individual issues for each failing test.
/priority failing-test
/priority critical-urgent
/kind bug
/status approved-for-milestone
/sig cluster-lifecycle
/sig gcp
cc @jdumars @jberkus
and also cc @krousey who's our upgrade expert