Federation cluster scripts accidentally delete PVs. #46380
PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    federation.alpha.kubernetes.io/federation-name: e2e-f8n-agent-pr-93-0
    volume.alpha.kubernetes.io/storage-class: "yes"
  labels:
    app: federated-cluster
  name: e2e-f8n-agent-pr-93-0-apiserver-etcd-claim
  namespace: e2e-f8n-agent-pr-93-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
The PV got deleted somehow, so the PVC entered the Lost state. If we rule out that the PV was accidentally deleted (or that it wasn't deleted at all), then potentially there's an issue of the PV controller's cache getting out of sync with reality, or something of that nature:

I0523 22:08:01.361501 5 pv_controller_base.go:215] volume "pvc-4723f163-4004-11e7-a75a-42010a80000a" deleted

edit: clarification, the above line only proves it's deleting the volume from the internal cache, which it should only do if it observes a PV deletion event. Not trying to state the obvious, just thinking out loud...
I don't see "doDeleteVolume [pvc-4723f163-4004-11e7-a75a-42010a80000a" so I'm inclined to think the PV object was accidentally deleted by a user. ping @jsafrane since this involves PV controller |
@madhusudancs can you also attach apiserver logs?
@msau42 we don't have the API server logs for the attached controller manager logs unfortunately. That cluster has been torn down. I can attach the API server logs belonging to a different instance when this happens again.
@wongma7 is right, it seems that something other than the PV controller deleted the PV with name
Looking at the log, there are quite a lot of Lost PVCs... Some of the corresponding PVs are deleted very quickly after they're provisioned, some of them survived for a couple of minutes. They are deleted in batches. Here is a log from the controller where it provisioned 3 PVs over 3 minutes and they were all deleted at the same time by something:
Something must be watching the PVs periodically and deleting them. Do you have any 3rd party controllers / provisioners? Do you accidentally run a second controller-manager in parallel? Is the GCE PD deleted too, or just its Kubernetes PV? Watching the API server logs could help, especially if you could tell who deleted the PV object.
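For reference, a rough sketch of how to answer the "who deleted the PV object" question. The log path below is a placeholder, and this assumes the apiserver's request logging is verbose enough to record individual REST calls (an audit log, if enabled, would be even better):

# Look for DELETE requests against PV objects in the apiserver log
# (hypothetical path; adjust to wherever the apiserver logs on the master).
grep 'DELETE' /var/log/kube-apiserver.log | grep 'persistentvolumes'

# From the cluster side: current PV/PVC state and the events recorded on the claim.
kubectl get pv
kubectl describe pvc e2e-f8n-agent-pr-93-0-apiserver-etcd-claim -n <namespace>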
PV is a non-namespaced resource. Running `kubectl delete pv --all`, even with `--namespace`, is going to delete all the PVs in the cluster. This is a dangerous operation and PVs should not be deleted this way. Instead, we now retrieve the PVs bound to the PVCs in the namespace we are deleting and delete only those PVs (see the sketch after this comment). Fixes issue kubernetes#46380.
/assign
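For anyone following along, a minimal sketch of the namespace-scoped cleanup described above, assuming plain bash and kubectl; the variable names and jsonpath expression are illustrative, not a copy of the actual federation script change:

# Delete only the PVs bound to PVCs in one namespace, instead of
# `kubectl delete pv --all` (which wipes every PV in the cluster).
# NAMESPACE is a placeholder; adjust to the federation system namespace.
NAMESPACE="f8n-system-agent-pr-93-0"

# Collect the PV names referenced by the namespace's PVCs
# (spec.volumeName is set once a claim is bound).
PV_NAMES=$(kubectl get pvc -n "${NAMESPACE}" \
  -o jsonpath='{range .items[*]}{.spec.volumeName}{"\n"}{end}')

for pv in ${PV_NAMES}; do
  # Skip empty entries from claims that were never bound.
  [ -n "${pv}" ] && kubectl delete pv "${pv}"
done

The point of the sketch is simply to scope the deletion to PVs that belong to the namespace's claims rather than touching the cluster-wide PV list.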
Good debugging @jsafrane!
@madhusudancs Is there more to do with this issue that wasn't addressed by #46945?
Renamed and closed, as #46945 has merged. |
We have a PVC that uses the alpha annotation to dynamically provision a GCE PD/PV. Sometimes in our test environment, the dynamically provisioned PV gets automatically deleted by the PV controller without any of us deleting the PVC. I am attaching the controller manager logs here: kube-controller-manager.log-20170523-1495580401.gz
Interesting bits in the logs start at:
The namespaced name of the claim is
f8n-system-agent-pr-93-0/e2e-f8n-agent-pr-93-0-apiserver-etcd-claim
Here is the deployment yaml
cc @kubernetes/sig-storage-bugs @saad-ali @kubernetes/sig-federation-bugs