Skip to content

Commit 98fc279

Browse files
committed
Update volume group snapshot KEP
1 parent 61abddc commit 98fc279

File tree

2 files changed

+92
-51
lines changed

2 files changed

+92
-51
lines changed

keps/sig-storage/3476-volume-group-snapshot/README.md

Lines changed: 90 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,7 @@
4141
- [CreateVolumeGroupSnapshot](#createvolumegroupsnapshot)
4242
- [DeleteVolumeGroupSnapshot](#deletevolumegroupsnapshot)
4343
- [GetVolumeGroupSnapshot](#getvolumegroupsnapshot)
44+
- [GetSnapshot](#getsnapshot)
4445
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
4546
- [Feature enablement and rollback](#feature-enablement-and-rollback)
4647
- [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
@@ -82,7 +83,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
8283
- [x] (R) Production readiness review completed
8384
- [x] Production readiness review approved
8485
- [x] "Implementation History" section is up-to-date for milestone
85-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
86+
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
8687
- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
8788

8889
<!--
@@ -134,13 +135,16 @@ Note: In the following, we will use VolumeGroupSnapshot Controller to refer to t
134135
* Admin creates a VolumeGroupSnapshotClass.
135136
* User creates a VolumeGroupSnapshot with label selector that matches the label applied to all PVCs to be snapshotted together.
136137
* This will trigger the VolumeGroupSnapshot controller to create a VolumeGroupSnapshotContent API object. The group snapshot logic in the csi-snapshotter sidecar will call the CreateVolumeGroupSnapshot CSI function.
137-
* The group snapshot logic in csi-snapshotter will retrieve all volumeSnapshotHandles and their source volumeHandles in the Volume Group Snapshot from the CSI CreateVolumeGroupSnapshotResponse, and populate the VolumeSnapshotHandlePairList field in the VolumeGroupSnapshotContent status.
138+
* The group snapshot logic in csi-snapshotter will retrieve all volumeSnapshotHandles, their source volumeHandles, and other information in the Volume Group Snapshot from the CSI CreateVolumeGroupSnapshotResponse, and update the VolumeGroupSnapshotContent status.
139+
* In v1beta1, it populates the VolumeSnapshotHandlePairList field in the VolumeGroupSnapshotContent status.
140+
* In v1beta2, it populates the VolumeSnapshotInfoList field in the VolumeGroupSnapshotContent status for the new v1beta2 API objects. The existing v1beta1 API objects will be converted to the new v1beta2 API objects. The conversion logic will only populate the VolumeHandle and SnapshotHandle fields and will leave the remaining fields empty. A conversion webhook will be developed to make the conversion.
141+
* Note: We initially encountered an [issue](https://github.com/kubernetes-csi/external-snapshotter/issues/1271) where the restoreSize is not set for individual VolumeSnapshotContents and VolumeSnapshots if CSI driver does not implement ListSnapshots. We evaluated various options [here](https://docs.google.com/document/d/1LLBSHcnlLTaP6ZKjugtSGQHH2LGZPndyfnNqR1YvzS4/edit?tab=t.0). Making this v1beta2 API change is one option that can solve this problem. There is a second reason for making this v1beta2 API change. We need to make the v1beta2 API change for the pre-provisioned VolumeGroupSnapshot to automatically create individual VolumeSnapshots; otherwise, user has to manually create both group and individual snapshots API objects during static provisioning.
138142
* The VolumeGroupSnapshot controller will be watching the VolumeGroupSnapshotContent, and create VolumeSnapshotContents pointing to the volumeSnapshotHandles once they are available in the VolumeGroupSnapshotContent status. Then the controller will create VolumeSnapshots pointing to the VolumeSnapshotContents.
139143
* CreateVolumeGroupSnapshot CSI function response
140-
* The CreateVolumeGroupSnapshot CSI function should return a list of snapshots (Snapshot message defined in CSI Spec) in its response. The group snapshot logic in the csi-snapshotter sidecar will update the VolumeSnapshotHandlePairList field in the VolumeGroupSnapshotContent status based on the returned list of snapshots from the CSI call. The VolumeGroupSnapshot controller can use VolumeSnapshotHandles to construct corresponding individual VolumeSnapshotContents and VolumeSnapshots, wait for VolumeSnapshots and VolumeSnapshotContents to be bound, and update PVCVolumeSnapshotRefList in the VolumeGroupSnapshot Status and PVVolumeSnapshotContentList in the VolumeGroupSnapshotContent Status.
144+
* The CreateVolumeGroupSnapshot CSI function should return a list of snapshots (Snapshot message defined in CSI Spec) in its response. The group snapshot logic in the csi-snapshotter sidecar will update the VolumeSnapshotInfoList field in the VolumeGroupSnapshotContent status based on the returned list of snapshots from the CSI call. The VolumeGroupSnapshot controller can use VolumeSnapshotInfoList to construct corresponding individual VolumeSnapshotContents and VolumeSnapshots, wait for VolumeSnapshots and VolumeSnapshotContents to be bound.
141145
* Individual VolumeSnapshots will be named in this format:
142-
* <snap>-<hash of VolumeGroupSnapshot UUID+PVC UUID+timestamp>
143-
* A label with VolumeGroupSnapshot name will also be added to the VolumeSnapshot
146+
* <snap>-<hash of VolumeGroupSnapshot UUID + volume handle>
147+
* VolumeGroupSnapshot will also be added as an OwnerReference for the VolumeSnapshot
144148

145149
```
146150
apiVersion: snapshot.storage.k8s.io/v1
@@ -156,7 +160,7 @@ status:
156160
volumeGroupSnapshotName: groupSnapshot1
157161
```
158162

159-
* An admissions controller or finalizer should be added to prevent an individual snapshot from being deleted that belongs to a VolumeGroupSnapshot. Note that there is a [KEP](https://github.com/kubernetes/enhancements/pull/2840/files) that is proposing the Liens feature which could potentially be used for this purpose.
163+
* A finalizer should be added to prevent an individual snapshot from being deleted that belongs to a VolumeGroupSnapshot. Note that there is a [KEP](https://github.com/kubernetes/enhancements/pull/2840/files) that is proposing the Liens feature which could potentially be used for this purpose.
160164
* In the CSI spec, it is specified that it is required for individual snapshots to be returned along with the group snapshot.
161165

162166
#### Pre-provisioned VolumeGroupSnapshot
@@ -165,6 +169,8 @@ Admin can create a VolumeGroupSnapshotContent, specifying an existing VolumeGrou
165169

166170
The controller will call the CSI GetVolumeGroupSnapshot method to retrieve all volumeSnapshotHandles in the Volume Group Snapshot from the storage system, create VolumeSnapshotContents pointing to the volumeSnapshotHandles. Then the controller will create VolumeSnapshots pointing to the VolumeSnapshotContents.
167171

172+
Note: The automatic creation of individual VolumeSnapshots and VolumeSnapshotContents are not done in Beta. For now, the admin will need to manually construct these individual API objects. We plan to work on this before the feature moves to GA. We have information for all the individual snapshots from CSI [GetVolumeGroupSnapshot](https://github.com/kubernetes-csi/external-snapshotter/blob/release-8.2/pkg/sidecar-controller/groupsnapshot_helper.go#L781). We should be able to populate individual VolumeSnapshots and VolumeSnapshotContents based on this information.
173+
168174
### Delete VolumeGroupSnapshot
169175

170176
A VolumeGroupSnapshot can be deleted if the CSI driver supports the CREATE_DELETE_GET_VOLUME_GROUP_SNAPSHOT capability.
@@ -202,8 +208,9 @@ N/A
202208
Integration tests are not needed.
203209

204210
##### e2e tests
205-
* e2e tests for external volume group snapshot controller.
206-
* e2e tests for modified code path of external-snapshotter.
211+
* e2e tests for external volume group snapshot control logic in the modified code path of external-snapshotter.
212+
* https://github.com/kubernetes/kubernetes/blob/master/test/e2e/storage/testsuites/volume_group_snapshottable.go
213+
* https://testgrid.k8s.io/sig-storage-kubernetes#storage-kind-volume-group-snapshots
207214
* Add stress and scale tests before moving from beta to GA.
208215

209216
### Graduation Criteria
@@ -373,21 +380,6 @@ Type VolumeGroupSnapshotStatus struct {
373380
374381
// +optional
375382
Error *VolumeSnapshotError
376-
377-
// VolumeSnapshotRefList is the list of PVC and VolumeSnapshot pairs that
378-
// is part of this group snapshot.
379-
// The maximum number of allowed snapshots in the group is 100.
380-
// +optional
381-
PVCVolumeSnapshotRefList []PVCVolumeSnapshotPair
382-
}
383-
384-
// PVCVolumeSnapshotPair defines a pair of a PVC reference and a Volume Snapshot Reference
385-
type PVCVolumeSnapshotPair struct {
386-
// PersistentVolumeClaimRef is a reference to the PVC this pair is referring to
387-
PersistentVolumeClaimRef core_v1.LocalObjectReference
388-
389-
// VolumeSnapshotRef is a reference to the VolumeSnapshot this pair is referring to
390-
VolumeSnapshotRef core_v1.LocalObjectReference
391383
}
392384
```
393385

@@ -464,6 +456,8 @@ type GroupSnapshotHandles struct {
464456
VolumeSnapshotHandles []string
465457
}
466458
459+
// The VolumeSnapshotHandlePair struct is added in v1beta1 but removed in v1beta2
460+
// It is replaced by the VolumeSnapshotInfo struct
467461
// VolumeSnapshotHandlePair defines a pair of a source volume handle and a snapshot handle
468462
type VolumeSnapshotHandlePair struct {
469463
// VolumeHandle is a unique id returned by the CSI driver to identify a volume
@@ -475,6 +469,31 @@ type VolumeSnapshotHandlePair struct {
475469
SnapshotHandle string
476470
}
477471
472+
// The VolumeSnapshotInfo struct is added in v1beta2
473+
// VolumeSnapshotInfo contains information for a snapshot
474+
type VolumeSnapshotInfo struct {
475+
// VolumeHandle is a unique id returned by the CSI driver to identify a volume
476+
// on the storage system
477+
VolumeHandle string
478+
479+
// SnapshotHandle is a unique id returned by the CSI driver to identify a volume
480+
// snapshot on the storage system
481+
SnapshotHandle string
482+
483+
// creationTime is the timestamp when the point-in-time snapshot is taken
484+
// by the underlying storage system.
485+
// +optional
486+
CreationTime *int64
487+
488+
// ReadyToUse indicates if the snapshot is ready to be used to restore a volume.
489+
// +optional
490+
ReadyToUse *bool
491+
492+
// RestoreSize represents the complete size of the snapshot in bytes.
493+
// +optional
494+
RestoreSize *int64
495+
}
496+
478497
Type VolumeGroupSnapshotContentStatus struct {
479498
// VolumeGroupSnapshotHandle is a unique id returned by the CSI driver
480499
// to identify the VolumeGroupSnapshot on the storage system.
@@ -483,28 +502,32 @@ Type VolumeGroupSnapshotContentStatus struct {
483502
// +optional
484503
VolumeGroupSnapshotHandle *string
485504
505+
// This field is introduced in v1beta1 but removed in v1beta2
506+
// It is replaced by VolumeSnapshotInfoList
507+
// Information in this field from an existing v1beta1 API object
508+
// will be copied to VolumeSnapshotInfoList by the conversion logic
486509
// VolumeSnapshotHandlePairList is a list of CSI "volume_id" and "snapshot_id"
487510
// pair returned by the CSI driver to identify snapshots and their source volumes
488511
// on the storage system.
489512
// +optional
490-
VolumeSnapshotHandlePairList []VolumeSnapshotHandlePair
513+
// VolumeSnapshotHandlePairList []VolumeSnapshotHandlePair
514+
515+
// This field is introduced in v1beta2
516+
// It is replacing VolumeSnapshotHandlePairList
517+
// VolumeSnapshotInfoList is a list of snapshot information returned by
518+
// by the CSI driver to identify snapshots on the storage system.
519+
// +optional
520+
VolumeSnapshotInfoList []VolumeSnapshotInfo
491521
492522
// ReadyToUse becomes true when ReadyToUse on all individual snapshots become true
493523
// +optional
494524
ReadyToUse *bool
495525
496526
// +optional
497-
CreationTime *int64
527+
CreationTime *metav1.Time
498528
499529
// +optional
500530
Error *VolumeSnapshotError
501-
502-
// NOTE: We will consider removing this field after testing.
503-
// PVVolumeSnapshotContentList is the list of pairs of PV and
504-
// VolumeSnapshotContent for this group snapshot
505-
// The maximum number of allowed snapshots in the group is 100.
506-
// +optional
507-
PVVolumeSnapshotContentList []PVVolumeSnapshotContentPair
508531
}
509532
```
510533

@@ -569,24 +592,23 @@ A new group controller service will be added with a new controller capability CR
569592
Indicates that the controller plugin supports creating, deleting, and getting details of a snapshot of
570593
multiple volumes.
571594

595+
A new controller capability GET_SNAPSHOT will also be added. This indicates that the controller plugin supports getting details of a snapshot of multiple volumes.
596+
572597
#### CSI Group Controller RPC
573598

574599
```
575600
service Controller {
576601
577602
rpc CreateVolumeGroupSnapshot(CreateVolumeGroupSnapshotRequest)
578603
returns (CreateVolumeGroupSnapshotResponse) {
579-
option (alpha_method) = true;
580604
}
581605
582606
rpc DeleteVolumeGroupSnapshot(DeleteVolumeGroupSnapshotRequest)
583607
returns (DeleteVolumeGroupSnapshotResponse) {
584-
option (alpha_method) = true;
585608
}
586609
587610
rpc GetVolumeGroupSnapshot(GetVolumeGroupSnapshotRequest)
588611
returns (GetVolumeGroupSnapshotResponse) {
589-
option (alpha_method) = true;
590612
}
591613
592614
}
@@ -598,8 +620,6 @@ The purpose of this call is to request the creation of a multi-volume snapshot.
598620

599621
```
600622
message CreateVolumeGroupSnapshotRequest {
601-
option (alpha_message) = true;
602-
603623
// The suggested name for the group snapshot. This field is REQUIRED
604624
// for idempotency.
605625
// Any Unicode string that conforms to the length limit is allowed
@@ -627,16 +647,12 @@ message CreateVolumeGroupSnapshotRequest {
627647
}
628648
629649
message CreateVolumeGroupSnapshotResponse {
630-
option (alpha_message) = true;
631-
632650
// Contains all attributes of the newly created group snapshot.
633651
// This field is REQUIRED.
634652
VolumeGroupSnapshot group_snapshot = 1;
635653
}
636654
637655
message VolumeGroupSnapshot {
638-
option (alpha_message) = true;
639-
640656
// The identifier for this group snapshot, generated by the plugin.
641657
// This field MUST contain enough information to uniquely identify
642658
// this specific snapshot vs all other group snapshots supported by
@@ -673,8 +689,6 @@ message VolumeGroupSnapshot {
673689

674690
```
675691
message DeleteVolumeGroupSnapshotRequest {
676-
option (alpha_message) = true;
677-
678692
// The ID of the group snapshot to be deleted.
679693
// This field is REQUIRED.
680694
string group_snapshot_id = 1;
@@ -706,8 +720,6 @@ message DeleteVolumeGroupSnapshotResponse {
706720

707721
```
708722
message GetVolumeGroupSnapshotRequest {
709-
option (alpha_message) = true;
710-
711723
// The ID of the group snapshot to fetch current group snapshot
712724
// information for.
713725
// This field is REQUIRED.
@@ -725,13 +737,35 @@ message GetVolumeGroupSnapshotRequest {
725737
}
726738
727739
message GetVolumeGroupSnapshotResponse {
728-
option (alpha_message) = true;
729-
730740
// This field is REQUIRED
731741
VolumeGroupSnapshot group_snapshot = 1;
732742
}
733743
```
734744

745+
#### GetSnapshot
746+
747+
GetSnapshot is an optional controller capability that can help retrieve snapshot information. It can be used by Kubernetes to populate fields in individual VolumeSnapshotContents and VolumeSnapshots API objects that belong to a VolumeGroupSnapshot.
748+
749+
This is introduced because some CSI drivers cannot implement ListSnapshots RPC due to performance concerns. With the introduction of this new RPC, The CSI Snapshotter sidecar will call GetSnapshot if it is implemented. If GetSnapshot is not implemented, the snapshotter will fall back to the current behavior which is to call ListSnapshots instead. If neither GetSnapshot nor ListSnapshots is implemented, the snapshotter will fall back to the current behavior which is to assume the snapshot_id exists without being able to retrieve more information about this snapshot. Currently ListSnapshots is only called to retrieve status for pre-provisioned snapshots. The immediate use case of the new GetSnapshot RPC will also be for pre-provisioned snapshots. In the future, it is possible to use this API for other cases.
750+
751+
```
752+
message GetSnapshotRequest {
753+
// The ID of the snapshot to fetch current snapshot information for.
754+
// This field is REQUIRED.
755+
string snapshot_id = 1;
756+
757+
// Secrets required by plugin to complete GetSnapshot request.
758+
// This field is OPTIONAL. Refer to the `Secrets Requirements`
759+
// section on how to use this field.
760+
map<string, string> secrets = 2 [(csi_secret) = true];
761+
}
762+
763+
message GetSnapshotResponse {
764+
// This field is REQUIRED
765+
Snapshot snapshot = 1;
766+
}
767+
```
768+
735769
## Production Readiness Review Questionnaire
736770

737771
### Feature enablement and rollback
@@ -741,9 +775,11 @@ _This section must be completed when targeting alpha to a release._
741775
* **How can this feature be enabled / disabled in a live cluster?**
742776
- [x] Other
743777
- Describe the mechanism:
744-
We don't have a feature gate because this feature is out of tree.
745-
We will use a flag called enable-volume-group-snapshot to enable this
778+
This feature is out of tree.
779+
We initially used a flag called enable-volume-group-snapshot to enable this
746780
feature when the snapshot controller and csi-snapshotter sidecar are started.
781+
We changed this flag to a feature gate `VolumeGroupSnapshot` when moving it
782+
to Beta.
747783
- Will enabling / disabling the feature require downtime of the control
748784
plane?
749785
From the controller side, it only affects the external controller and sidecars.
@@ -943,6 +979,11 @@ provider?**
943979
the existing API objects?**
944980
Describe them, providing:
945981
- API type(s): A new string field is added to VolumeSnapshot and VolumeSnapshotContent respectively.
982+
Update from v1beta1 to v1beta2 API:
983+
In v1beta1, the VolumeGroupSnapshotContentStatus contains this field:
984+
VolumeSnapshotHandlePairList: [snapshotHandle, volumeHandle]
985+
In v1beta1, this is replaced with VolumeSnapshotInfoList: [snapshotHandle, volumeHandle, creationTime (*int64), readyToUse (*bool), restoreSize (*int64)]
986+
946987
- Estimated increase in size: (e.g., new annotation of size 32B):
947988
- Estimated amount of new objects: (e.g., new Object X for every existing Pod)
948989

keps/sig-storage/3476-volume-group-snapshot/kep.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -18,14 +18,14 @@ approvers:
1818
see-also:
1919
replaces:
2020

21-
latest-milestone: "v1.32"
21+
latest-milestone: "v1.34"
2222
stage: "beta"
2323
milestone:
2424
alpha: "v1.27"
2525
beta: "v1.32"
2626
stable:
2727

28-
feature-gates:
28+
feature-gates: VolumeGroupSnapshot (out-of-tree)
2929
disable-supported: true
3030

3131
metrics:

0 commit comments

Comments
 (0)