[CephFS] Support Volume Snapshots #702

Closed

ajarr opened this issue Oct 30, 2019 · 11 comments
Labels
component/cephfs Issues related to CephFS

Comments

@ajarr
Contributor

ajarr commented Oct 30, 2019

Describe the feature you'd like to have

Add volume snapshot support for CephFS CSI driver.
https://kubernetes.io/docs/concepts/storage/volume-snapshots/

The CephFS CSI driver can issue the simple ceph fs subvolume snapshot CLI commands (provided by the Ceph mgr volumes module) to create/delete/list subvolume snapshots and support this feature.

ceph fs subvolume snapshot create <vol-name> <subvol-name> <snapshot-name> --group_name csi
ceph fs subvolume snapshot rm <vol-name> <subvol-name> <snapshot-name> --group_name csi
ceph fs subvolume snapshot ls <vol-name> <subvol-name> --group_name csi

https://docs.ceph.com/docs/master/cephfs/fs-volumes/#fs-subvolumes

Need Ceph version >= 14.2.5
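
To check whether a cluster meets this requirement, something like the following can be used (illustrative; the reported versions will of course differ per cluster):

ceph versions    # reports the Ceph release running on each daemon; all should be >= 14.2.5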

The other snapshot-related feature, creating a PVC from a snapshot, is tracked in #411.

@ajarr ajarr added the component/cephfs Issues related to CephFS label Oct 30, 2019
@dillaman

... would also need create volume from snapshot support, correct?

@ajarr
Contributor Author

ajarr commented Oct 30, 2019

... would also need create volume from snapshot support, correct?

There is a separate GitHub issue, with a lot of discussion, tracking snapshot restore:
#411

Is the CSI snapshot feature considered complete only if the driver can also support snapshot restore?
If so, then I can link the snapshot restore issue in the description of this one.

@dillaman

Separate is fine w/ me -- just wanted to make sure it wasn't dropped. #411 would then be dependent upon this issue.

@mykaul
Contributor

mykaul commented Oct 31, 2019

How is this different than #246 ?

@socketpair

@mykaul there is no difference; it's a duplicate.

@socketpair

With RBD you can actually make a writeable RBD volume based on a read-only image snapshot. In CephFS you can't.

CephFS is not able to make a writable FS subtree based on a subtree snapshot. In principle, it's possible to create an FS volume from a CephFS snapshot using OverlayFS (combining a read-only CephFS snapshot with a writable dir in CephFS).
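
As a rough sketch of that OverlayFS idea (the mount point, the upper/work directories and <snap-path> below are illustrative assumptions, not something the driver does today):

# a kernel CephFS mount is assumed at /mnt/cephfs; <snap-path> is the read-only snapshot directory
mkdir -p /mnt/cephfs/upper /mnt/cephfs/work /mnt/writable-view
mount -t overlay overlay \
    -o lowerdir=/mnt/cephfs/<snap-path>,upperdir=/mnt/cephfs/upper,workdir=/mnt/cephfs/work \
    /mnt/writable-view

Note that OverlayFS requires upperdir and workdir to be on the same filesystem, and whether CephFS is accepted as the upper layer depends on the kernel.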

@nixpanic nixpanic added this to TODO in Release v2.1 Jan 30, 2020
@nixpanic nixpanic removed this from TODO in Release v2.1 Apr 16, 2020
@nixpanic nixpanic added this to the release-3.0.0 milestone Apr 16, 2020
@nixpanic nixpanic added this to To do in Backup & Restore Apr 17, 2020
@ShyamsundarR
Contributor

ShyamsundarR commented Jun 3, 2020

To support snapshot decoupling from source volumes, CephFS subvolumes need a structural change.

The older CephFS subvolume directory structure was /volumes/<group-name>/<vol-name>, where user data is stored directly under <vol-name>.
Since the introduction of the subvolume clone feature, it is /volumes/<group-name>/<vol-name>/<UUID>, where user data is stored under the <UUID> directory.

The subvolume clone feature was introduced in CephFS in these Ceph releases:

  • 15.2.0 (Octopus)
  • 14.2.8 (Nautilus)
    NOTE: The clone feature works with the older subvolume structure as well, for backward compatibility.

Subvolumes created using Ceph versions equal to or greater than the above will automatically use the newer subvolume structure.

Decoupling CephFS subvolume snapshots from the subvolume requires the subvolume structure introduced with the cloning feature above. Older-format subvolumes will not be able to decouple snapshots from their source volume.
NOTE: This is because, in the older structure, snapshots exist at the same level as the user data; the decoupling design instead snapshots one directory level higher, which provides the required decoupling. That higher-level directory therefore needs a level of separation from the user data directory, which is introduced by the <UUID> directory at the leaf of the structure.
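
To illustrate the two layouts (a hedged example; the exact paths depend on the subvolume group and Ceph release):

ceph fs subvolume getpath <vol-name> <subvol-name> --group_name csi
# older structure resolves to something like:  /volumes/csi/<subvol-name>
# newer structure resolves to something like:  /volumes/csi/<subvol-name>/<UUID>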

The decoupling feature is tracked for CephFS here.
NOTE: An upgrade path from older subvolumes to the newer format is not planned as part of the above ticket; based on feasibility, it will be addressed subsequently.

Hence, for CSI the following options exist to work with the above versions.

Option 1: Older-style subvolumes fail the DeleteVolume call if snapshots exist for the subvolume

  • Older subvolumes will return an error during an rm operation, stating that the subvolume has existing snapshots (see the example after this list)
  • Even though the CSI DeleteVolume call will fail and be retried, it will not succeed till all snapshots referring to the volume are deleted
  • This should not be a Kubernetes user concern, as the PVC is deleted and the PV is retained, thus allowing the user to recreate the same PVC and internally get a new PV, and hence a new subvolume, assigned to it
    • This addresses any reuse concern at the PVC layer in Kubernetes
  • This also allows the CSI snapshot (and hence clone) feature to work with older-style subvolumes and, by design, with newer-style subvolumes as well, retaining a similar user experience
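
For illustration, the failing delete in option 1 would look roughly like this (error text is approximate and varies by Ceph release):

ceph fs subvolume rm <vol-name> <subvol-name> --group_name csi
# fails while snapshots exist, with an error along the lines of:
#   Error ENOTEMPTY: subvolume '<subvol-name>' has snapshots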

Option 2: CSI can prevent snapshots and clones from being created for older subvolumes

  • As CephFS-CSI snapshots are not yet in a CSI release, CSI could be coded to not allow older-style CephFS subvolumes to create snapshots
  • This is possibly a detrimental option, as there is no upgrade path from older-style subvolumes to newer subvolumes yet, and so users cannot snapshot/clone them into the newer subvolume format
  • If left at this option, it would require a pod-level copy from the older PVC to a newer PVC, which is a bigger lift in itself
  • It would also confuse users, as some CephFS subvolumes would allow snapshots (those created with the newer format) and some would not (older subvolumes)
  • In short, this option would be less desirable and probably a no-go

The snapshot decoupling feature would be functional by default, but to retain snapshots for a subvolume during a delete, an extra option (say, --retain-snaps) may need to be passed to the CLI. Without it, the current default errors would be returned, stating that the volume cannot be deleted as there are existing snapshots.
NOTE: As with other CLI options introduced in Ceph, to support older-version Ceph clusters CSI would need to invoke the command with the extra option, handle the error, and invoke the rm without the extra option (or other such patterns may need to be used, as currently followed in CSI).
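
A minimal sketch of that fallback pattern, assuming the proposed (not yet final) --retain-snaps option name; in ceph-csi this would be done in Go with proper error inspection rather than a blind retry:

# try the newer form first; if the cluster rejects the unknown option, fall back to the plain rm
ceph fs subvolume rm <vol-name> <subvol-name> --group_name csi --retain-snaps \
  || ceph fs subvolume rm <vol-name> <subvol-name> --group_name csi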

As it stands, if option 1 is chosen, ceph-csi development to integrate CephFS snapshots and clones need not be held up waiting for the snapshot decoupling feature in CephFS to land. The only change envisaged in the future is to pass the extra option discussed above, to gain the decoupling feature in ceph-csi.

CC: @joscollin @batrick @vshankar

@humblec
Collaborator

humblec commented Jun 18, 2020

The snapshot decoupling feature would be functional by default, but to retain snapshots for a subvolume during a delete, an extra option (say, --retain-snaps) may need to be passed to the CLI.

Can anyone share more details on which Ceph release this is planned for? Will it be backported to previous releases too? Is there any deadline available for this enhancement? In the absence of this, the RBD snapshot implementation and the CephFS implementation behave differently for a Kubernetes/OpenShift user, which may not be good from a user-experience point of view: in RBD the user is free to delete the parent volume even when snapshots exist, whereas in CephFS you get an error. Also, does snapshot existence have any effect on the scalability of CephFS snapshots per subvolume? Is this flag enabled by default, causing the parent volume to be deleted even when snapshots are present, or is it disabled by default?

@ShyamsundarR
Contributor

The snapshot decoupling feature would be functional by default, but to retain snapshots for a subvolume during a delete, an extra option (say, --retain-snaps) may need to be passed to the CLI.

Can anyone share more details on which Ceph release this is planned for?

Currently this is being worked on in the master branch; once complete, it will be released with the next available Ceph release.

Will it be backported to previous releases too?

The Ceph tracker has been marked as requiring backports to Octopus and Nautilus, and the plan is to backport this to all releases that support the subvolume clone feature.

Is there any deadline available for this enhancement?

No, work is in progress; there is currently no "deadline" as such. The estimate is that it would land in master in about 3 weeks' time.

Also, does snapshot existence have any effect on the scalability of CephFS snapshots per subvolume?

CephFS snapshot scale limits are discussed in #1133. If snapshots are preserved, then yes, it will impact the overall scale factor for CephFS snapshots.

Is this flag enabled by default, causing the parent volume to be deleted even when snapshots are present, or is it disabled by default?

The flag must be explicitly passed when deleting a subvolume to get the desired behavior, as mentioned in the comment above.

cc: @batrick for any additional comments

@Madhu-1
Collaborator

Madhu-1 commented Jul 24, 2020

Moving it to release-v3.1.0

@humblec
Collaborator

humblec commented Aug 10, 2020

This can be closed as #394 is merged now! 👍

@humblec humblec modified the milestone: release-3.1.0 Aug 10, 2020
@humblec humblec closed this as completed Aug 10, 2020
Backup & Restore automation moved this from To do to Done Aug 10, 2020