PVC Volume not detached if pod deleted via namespace deletion #29051

Closed
saad-ali opened this issue Jul 16, 2016 · 1 comment

@saad-ali
Member

Problem:
An attachable volume referenced by a pod via a PVC object (instead of directly) is not detached if the pod is deleted via namespace deletion (i.e. deleting the pod's namespace instead of the pod object directly).

Repro steps:

  1. Create a new volume
    • gcloud compute disks create --zone=us-central1-b test-0b
  2. Create a new namespace.
    • kubectl create ns testns
      • namespace "testns" created
  3. Create PV and PVC objects:
    • kubectl create -f volumetest_pvc-pv.yaml
      • persistentvolume "pv-test-detach" created
      • persistentvolumeclaim "claim-test-detach" created
  4. Verify PV/PVC objects are in bound state
    • kubectl get pv
      • pv-test-detach 50Gi RWO Bound testns/claim-test-detach 2m
  5. Create pod object that references the volume via PVC
    • kubectl create -f volumetest_pod_pvc.yaml
      • replicationcontroller "sleepypod" created
  6. Verify pod gets into running state without issue (i.e. volume is attached)
    • kubectl get pods --namespace testns
      • sleepypod-0b0eq 1/1 Running 0 3m
  7. Delete the namespace
    • kubectl delete ns testns
      • namespace "testns" deleted
  8. Wait for pod to terminate
    • kubectl get pods --namespace testns
  9. Check if volume remains attached after a few minutes
    • Expected: Volume is detached.
    • Actual: Volume remains attached indefinitely.
`volumetest_pvc-pv.yaml` is:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-test-detach
spec:
  claimRef:
    name: claim-test-detach
    namespace: testns
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 50Gi
  persistentVolumeReclaimPolicy: Retain
  gcePersistentDisk:
    fsType: ext4
    pdName: test-0b

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-test-detach
  namespace: testns
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

And `volumetest_pod_pvc.yaml` is:

apiVersion: v1
kind: ReplicationController
metadata:
  name: sleepypod
  namespace: testns
spec:
  replicas: 1
  selector:
    name: sleepy
  template:
    metadata:
      labels:
        name: sleepy
    spec:
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: claim-test-detach
      containers:
      - name: sleepycontainer1
        image: saadali/sleepy:v0.2
        env:
        - name: "FOO"
          value: " "
        resources:
          limits:
            cpu: "0.002"
            memory: "4Mi"
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false

Workarounds:

  1. Delete the pod object directly before deleting the namespace.
    • In step 7 above, run `kubectl delete -f volumetest_pod_pvc.yaml` instead of deleting the namespace.
    • If the attachable volume is referenced by a pod using a PVC object, and the pod object is deleted directly (instead of the namespace first), the volume is correctly detached.
  2. Reference volume directly in pod (without a PVC object).
    • If the attachable volume is referenced by a pod directly (without a PVC object), namespace deletion results in the volume being correctly detached.
@saad-ali saad-ali added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/storage Categorizes an issue or PR as relevant to SIG Storage. team/cluster labels Jul 16, 2016
@saad-ali saad-ali added this to the v1.3 milestone Jul 16, 2016
@saad-ali saad-ali self-assigned this Jul 16, 2016
@saad-ali
Member Author

Root Cause:

  1. Delete event for pod fails on Attach/Detach controller

    • When a pod that references a volume via a PVC is deleted through namespace deletion, the PV/PVC objects are deleted before the pod object. As a result, when the attach/detach controller processes the pod delete event, it fails because it can no longer dereference the PVC object.
    I0716 06:59:47.551514       5 attach_detach_controller.go:304] Error processing volume "data" for pod "testns"/"sleepypod-la9oa": error processing PVC "testns"/"claim-test-detach": failed to find PVC "testns/claim-test-detach" in PVCInformer cache.
    
    
  2. Code was added to ensure volumes are still detached even if a delete event is missed (or fails). However, that code has a bug: the value of `exists` is not checked, so a pod that is simply absent from the informer is treated as a lookup failure and skipped:

    informerPodObj, exists, err := dswp.podInformer.GetStore().GetByKey(dswPodKey)
    if err != nil || informerPodObj == nil {
        glog.Errorf("podInformer GetByKey failed for pod %q (UID %q) with %v", dswPodKey, dswPodUID, err)
        continue
    }

Proposed Fixes:

  1. Modify the attach/detach controller to cache PVC/PV objects so that, even if they are deleted, they can be recovered on pod deletion events.
  2. Fix the bug in `desired_state_of_the_world_populator.go` to check `exists` so that it can delete pods even if the delete event is missed (or fails).

k8s-github-robot pushed a commit that referenced this issue Jul 21, 2016
Automatic merge from submit-queue

Fix "PVC Volume not detached if pod deleted via namespace deletion" issue

Fixes #29051: "PVC Volume not detached if pod deleted via namespace deletion"

This PR:
* Fixes a bug in `desired_state_of_the_world_populator.go` to check the value of `exists` returned by the `podInformer` so that it can delete pods even if the delete event is missed (or fails).
* Reduces the desired state of the world populator's sleep period from 5 min to 1 min (reducing the amount of time a volume would remain attached if a volume delete event is missed or fails).