Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DataMover: Cloned VolumeSnapshotContent remains in cluster when vsclass deletionPolicy set as Retain #6786

Closed
PrasadJoshi12 opened this issue Sep 7, 2023 · 10 comments
Assignees
Milestone

Comments

@PrasadJoshi12
Copy link

PrasadJoshi12 commented Sep 7, 2023

What steps did you take and what happened:
Created a backup with snapshotMoveData set as true. Backup was completed successfully.

$ oc get backup test-backup -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.26.7+0ef5eae
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "26"
  creationTimestamp: "2023-09-07T11:22:11Z"
  generation: 9
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test-backup
  namespace: openshift-adp
  resourceVersion: "159157"
  uid: c0a64106-724f-4453-ab5c-df6b8a92e5c9
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - ocp-django
  itemOperationTimeout: 4h0m0s
  snapshotMoveData: true
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
status:
  backupItemOperationsAttempted: 1
  backupItemOperationsCompleted: 1
  completionTimestamp: "2023-09-07T11:23:42Z"
  expiration: "2023-10-07T11:22:11Z"
  formatVersion: 1.1.0
  phase: Completed
  progress:
    itemsBackedUp: 96
    totalItems: 96
  startTimestamp: "2023-09-07T11:22:11Z"
  version: 1

Deleted backup resource.
$ velero delete backup test-backup -n openshift-adp
Cloned VolumesnapshotContent remains in cluster even though the backup is removed.

$ oc get backup -n openshift-adp
No resources found in openshift-adp namespace.
$ oc get vsc
NAME                READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                  VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT      VOLUMESNAPSHOTNAMESPACE   AGE
test-backup-6wgk5   true         1073741824    Retain           pd.csi.storage.gke.io   example-snapclass     test-backup-6wgk5   openshift-adp             4m56s

Vsclass yaml

apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Retain
driver: pd.csi.storage.gke.io
kind: VolumeSnapshotClass
metadata:
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
  labels:
    velero.io/csi-volumesnapshot-class: "true"
  name: example-snapclass

What did you expect to happen:
When backup is removed it should also remove all the associated volumesnapshotContents.

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle, and attach to this issue, more options please refer to velero debug --help

If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)

  • kubectl logs deployment/velero -n velero
  • velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
  • velero backup logs <backupname>
  • velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
  • velero restore logs <restorename>

Anything else you would like to add:

Environment:

  • Velero version (use velero version):
  • Velero features (use velero client config get features):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
@PrasadJoshi12
Copy link
Author

PrasadJoshi12 commented Sep 7, 2023

When VSclass deletionPolicy is set as Delete in that case volumesnapshotcontents gets removed during the backup process.

@Lyndon-Li
Copy link
Contributor

@PrasadJoshi12
This is as the original design expectation --- if the deletionPolicy is Retain, if means users want to keep the snapshot, so Velero data mover won't delete it explicitly.
Do you have any concern for this design behavior?

@PrasadJoshi12
Copy link
Author

@Lyndon-Li so users have to manually clean it up after backup is expired or removed ?

@Lyndon-Li
Copy link
Contributor

Velero will not use it after the backup completes, so users are free to keep or clean it.

@PrasadJoshi12
Copy link
Author

ack. Thank you so much.

@kaovilai
Copy link
Contributor

kaovilai commented Sep 8, 2023

I think it would make sense to label on backup deletion at least that the backup associated has already been deleted.
velero.io/backup-deleted: true ?

@Lyndon-Li Lyndon-Li self-assigned this Sep 12, 2023
@reasonerjt
Copy link
Contributor

I think it's more consistent that we remove this vsc when backup is deleted.
Before DM was introduced the vsc was retained intentionally to ensure the physical snapshot will be deleted when the backup is deleted, we should keep this behavior after DM is introduced.

@kaovilai
Copy link
Contributor

+1 prior behavior of velero cleaning up of resources resulting from a backup upon backup deletion is preferred

@Lyndon-Li
Copy link
Contributor

Lyndon-Li commented Sep 13, 2023

Then let's remove the VSC after backup completes all the time.
For data mover, the snapshot will never be used after backup completes, so if we want to ensure the snapshot to be deleted after backup is deleted, which implies we treat the snapshot as part of the backup all the time, we should delete as early as possible when it is not used.

@Lyndon-Li
Copy link
Contributor

Fixed in Main branch. Will cherry pick to 1.12 branch after 1.12.0 is release. So let's keep the issue open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants