Ceph-CSI: Data is retained if PV is deleted #4651

Closed

ckotzbauer opened this issue Jan 10, 2020 · 13 comments

@ckotzbauer

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

  • The data in the Ceph storage is retained even if I delete the PV.

Expected behavior:

  • If I delete a PV which was provisioned by the Ceph-CSI driver, I expect the data to be deleted from the Ceph cluster.

How to reproduce it (minimal and precise):

  • Mount the Ceph cluster to a directory to watch the files stored by CSI.
  • Provision a PV with the Ceph-CSI driver.
  • Delete the PV.
  • The corresponding folder in the mounted directory is still there (see the sketch below).
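
A minimal sketch of these steps, assuming the rook-cephfs StorageClass below; the PVC name test-pvc is hypothetical:

# Provision a PVC; dynamic provisioning creates the PV behind it.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc          # hypothetical name for illustration
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: rook-cephfs
  resources:
    requests:
      storage: 1Gi
EOF

# Find the bound PV, then delete the PVC and the PV.
PV=$(kubectl get pvc test-pvc -o jsonpath='{.spec.volumeName}')
kubectl delete pvc test-pvc
kubectl delete pv "$PV"
# The csi-vol-* directory is still visible in the mounted CephFS tree.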

File(s) to submit:

  • cluster.yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2.5-20191210
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
    volumeClaimTemplate:
      spec:
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
  mgr:
    modules:
    - name: pg_autoscaler
      enabled: true
  dashboard:
    enabled: true
  monitoring:
    enabled: true
  priorityClassNames:
    all: infra-priority
  storage:
    topologyAware: true
    storageClassDeviceSets:
    - name: set1
      count: 1
      portable: false
      placement:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: storage-node
                operator: In
                values:
                - storage01
            topologyKey: kubernetes.io/hostname
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: standard
          volumeMode: Block
          accessModes:
            - ReadWriteOnce
    - name: set2
      count: 1
      portable: false
      placement:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: storage-node
                operator: In
                values:
                - storage02
            topologyKey: kubernetes.io/hostname
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: standard
          volumeMode: Block
          accessModes:
            - ReadWriteOnce
    - name: set3
      count: 1
      portable: false
      placement:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: storage-node
                operator: In
                values:
                - storage03
            topologyKey: kubernetes.io/hostname
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: standard
          volumeMode: Block
          accessModes:
            - ReadWriteOnce
---
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: ceph-fs
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
      failureDomain: host
  preservePoolsOnDelete: true
  metadataServer:
    activeCount: 2
    activeStandby: true
    placement:
       podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rook-ceph-mds
              # topologyKey: failure-domain.beta.kubernetes.io/zone can be used to spread MDS across different AZ
              # topologyKey: kubernetes.io/hostname will place MDS across different hosts
              topologyKey: kubernetes.io/hostname

  • rook-helm-chart.yaml
image:
  prefix: rook
  repository: rook/ceph
  tag: v1.2.1
  pullPolicy: Always

resources:
  limits:
    cpu: 500m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 256Mi

rbacEnable: true
pspEnable: true

csi:
  enableRbdDriver: false
  enableCephfsDriver: true
  enableGrpcMetrics: false
  enableSnapshotter: false
  cephFSPluginUpdateStrategy: OnDelete
  forceCephFSKernelClient: true

enableFlexDriver: false
enableDiscoveryDaemon: true

  • storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: ceph-fs
  pool: ceph-fs-data0
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Retain

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 18.04 LTS
  • Kernel (e.g. uname -a): 4.15.0-1044-gke
  • Cloud provider or hardware configuration: GKE
  • Rook version (use rook version inside of a Rook Pod): 1.2.1
  • Storage backend version (e.g. for ceph do ceph -v): ceph/ceph:v14.2.5-20191210
  • Kubernetes version (use kubectl version): v1.14.8-gke.17
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): GKE
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTHY
@ckotzbauer ckotzbauer added the bug label Jan 10, 2020
@Madhu-1
Member

Madhu-1 commented Jan 10, 2020

@code-chris have you created and deleted the PV or the PVC?

@ckotzbauer
Author

Both created and both deleted

@Madhu-1
Member

Madhu-1 commented Jan 10, 2020

@code-chris did you create the PV or the PVC?

@ckotzbauer
Author

  • I created a Deployment which triggered dynamic provisioning of a PV and a corresponding PVC.
  • I deleted the Deployment, the PVC, and the PV, in this order.

I did not create the PV or PVC manually; the CSI driver did.

@Madhu-1
Member

Madhu-1 commented Jan 10, 2020

Have you deleted the PV manually? If yes, it's not an issue; if no, this is a CephFS issue.

@ckotzbauer
Author

ckotzbauer commented Jan 10, 2020

I deleted the PV manually, as the reclaimPolicy of the StorageClass is Retain.
Why is this not an issue? The cluster stores data which can never be accessed again...

@Madhu-1
Member

Madhu-1 commented Jan 10, 2020

The user should not delete the PV object (the provisioner has to delete the PV object after deleting the backend image); check the provisioner logs.
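
A sketch of that check, assuming the Rook-created deployment name csi-cephfsplugin-provisioner and the csi-provisioner sidecar container (names may differ between Rook versions):

# Inspect the external provisioner sidecar of the CephFS CSI provisioner.
kubectl -n rook-ceph logs deploy/csi-cephfsplugin-provisioner -c csi-provisioner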

@ckotzbauer
Author

Yes, there are logs which indicate that the provisioner tries to sync the folders with PVs. I will test that.
When would the provisioner delete the PV and the stored data? When the PVC is deleted and the PV changes to the "Released" state? That would only be correct for the "Delete" reclaimPolicy, not for "Retain"...

@Madhu-1
Member

Madhu-1 commented Jan 10, 2020

Yeah, sorry, I didn't notice the reclaim policy. In that case, even if you delete the PV and the PVC, the admin needs to manually clean up the backend storage. This is not a bug; it is working as expected, see https://kubernetes.io/docs/concepts/storage/persistent-volumes/#retain
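
For comparison, a sketch of a Delete-policy variant of the storageclass.yaml above; with reclaimPolicy: Delete the provisioner removes the backend subvolume once the PVC is deleted and the PV is released (the name rook-cephfs-delete is hypothetical):

kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs-delete   # hypothetical name for illustration
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: ceph-fs
  pool: ceph-fs-data0
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete        # the only change vs. the Retain class above
EOF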

@ckotzbauer
Author

Ah ok, it seems I missed that. Thanks.
I thought that the PV and the backend storage are always deleted at the same time, and that the policy only decides whether this happens automatically or manually.
Then this is not an issue.

@ehassan1312

I made the same mistake and deleted the PV manually.

How can I clean up Ceph?

@Madhu-1
Member

Madhu-1 commented Oct 19, 2022

@ehassan1312 try https://www.mrajanna.com/tracking-pv-rados-omap-in-cephcsi/; it shows a way to track down the mapping between an RBD image in the pool and the PV. If the PV is not present, delete the RBD image and the RADOS object.
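
A sketch of that manual cleanup from the Rook toolbox, assuming an RBD pool named replicapool and placeholder <uuid>/<pv-name> values; per the linked post, Ceph-CSI keeps one omap key per volume in the csi.volumes.default object plus a per-volume csi.volume.<uuid> object:

rbd ls replicapool                                     # list the csi-vol-* images
rbd rm replicapool/csi-vol-<uuid>                      # delete the orphaned image
rados -p replicapool listomapvals csi.volumes.default  # inspect the PV <-> volume uuid mapping
rados -p replicapool rmomapkey csi.volumes.default csi.volume.<pv-name>
rados -p replicapool rm csi.volume.<uuid>              # per-volume metadata object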

@reefland

reefland commented Mar 3, 2024

I've written a script that cross-references your existing Rook-Ceph PVs with Ceph RBD images and lists out images that are stale/orphaned and can be removed:

https://github.com/reefland/find-orphaned-rbd-images

Such as:

--[ RBD Image has no Persistent Volume (PV) ]----------------------------
NAME                                          PROVISIONED  USED
csi-vol-cbaa0262-461e-4fe8-a8bb-07f655bb423f       50 GiB  872 MiB
size 50 GiB in 12800 objects
snapshot_count: 0
create_timestamp: Tue Feb 27 15:42:34 2024
access_timestamp: Tue Feb 27 15:42:34 2024
modify_timestamp: Tue Feb 27 15:42:34 2024
-------------------------------------------------------------------------
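
The core idea of such a cross-reference fits in a few lines; a sketch, assuming kubectl plus rbd access, a pool named replicapool, and PVs that carry the image name in spec.csi.volumeAttributes.imageName (adapt names to your cluster):

# Image names referenced by existing PVs...
kubectl get pv -o jsonpath='{range .items[*]}{.spec.csi.volumeAttributes.imageName}{"\n"}{end}' | sort > pv-images.txt
# ...versus the images that actually exist in the pool.
rbd ls replicapool | sort > pool-images.txt
comm -13 pv-images.txt pool-images.txt   # images with no PV: candidates for cleanup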
