[FEATURE] Snapshot CRD #3144

HubbeKing · 2021-10-12T16:28:19Z

Is your feature request related to a problem? Please describe.
Snapshot RecurringJobs set to Retain X number of snapshots do not touch unrelated snapshots, so if one ever changes the name of the RecurringJob, the old snapshots will stick around forever. These then have to be manually deleted in the UI.
Having a CRD for snapshots would greatly simplify this, as one could prune snapshots using kubectl, much like how one can currently manage backups using kubectl due to the existence of the backups.longhorn.io CRD.

Describe the solution you'd like
A CRD for Snapshots. Something like snapshots.longhorn.io, and code in longhorn-manager to ensure that there exists a CRD object for each snapshot for each volume, and that they are in sync.
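
To illustrate the "in sync" requirement, here is a minimal Python sketch of one comparison pass between the snapshots that exist on each volume and the CR objects that describe them. It is not how longhorn-manager implements this: `plan_sync` and its arguments are hypothetical names, and it assumes snapshot names are unique within a volume.

```python
def plan_sync(engine_snapshots, snapshot_crs):
    """Compare snapshot names per volume and report what is out of sync.

    Both arguments map a volume name to a set of snapshot names:
    engine_snapshots is what actually exists on the volume,
    snapshot_crs is what the (hypothetical) CR objects describe.
    """
    missing_crs = {}   # snapshots that have no CR object yet
    orphaned_crs = {}  # CR objects whose snapshot does not exist
    for volume in sorted(set(engine_snapshots) | set(snapshot_crs)):
        actual = engine_snapshots.get(volume, set())
        declared = snapshot_crs.get(volume, set())
        if actual - declared:
            missing_crs[volume] = sorted(actual - declared)
        if declared - actual:
            orphaned_crs[volume] = sorted(declared - actual)
    return missing_crs, orphaned_crs
```

What to do with each difference (create a CR, create a snapshot, or delete something) is a design decision the sketch deliberately leaves open.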

Describe alternatives you've considered
I suppose a browser automation framework might also work for pruning large numbers of snapshots, but this feels janky as all hell.

Additional context
Screenshot of v1.2.2 UI showing leftover snapshots from a deleted RecurringJob, and a single snapshot from a new RecurringJob with retain set to 4:
https://i.imgur.com/SnlP23P.png

jenting · 2021-10-15T01:34:55Z

I remember we discussed internally that Longhorn needs the snapshot CR. cc @joshimoo @shuo-wu

joshimoo · 2021-10-21T19:31:59Z

Hey team! Please add your planning poker estimate with ZenHub @jenting @shuo-wu @PhanLe1010

20 points seems to be the max for the planning poker, but this requires quite a bit of work. (Engine, Replica, Manager modifications + refactoring along the way)

PhanLe1010 · 2022-05-13T06:00:12Z

Test Plan:

Init

Create volumes vol1 and vol2
Attach them and write some data to them

Create

Create new snapshot CR, volume attached
1. Deploy the YAML
```
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: snap001
  namespace: longhorn-system
spec:
  volume: vol1
  createSnapshot: true    
```
2. Verify that Longhorn provisions a new snapshot snap001, visible in the Longhorn UI
3. Run kubectl get snapshots -n longhorn-system snap001. Verify the status field of the snapshot CR is good (all fields make sense)
4. Run kubectl describe snapshots -n longhorn-system snap001. Verify that the related events sequence is good
Create new snapshot CR, volume detached
1. Detach vol1
2. Deploy the YAML
```
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: snap002
  namespace: longhorn-system
spec:
  volume: vol1
  createSnapshot: true    
```
3. Run kubectl get snapshots -n longhorn-system snap002. Verify that the status field contains the error message: cannot take snapshot because the volume engine vol1-e-cbc27184 is not running
4. Run kubectl describe snapshots -n longhorn-system snap002. Verify that the related events contain the same error
5. Attach vol1
6. Verify that the snapshot is created in the Longhorn UI and the snapshot CR status looks good
Create snapshot using UI
1. Create a snapshot using Longhorn UI
2. Verify that the snapshot CR is created with the correct status and related events
Create snapshot using recurring job
1. Set up a snapshot recurring job with retain count 5 and a period of every minute
2. Verify that recurring job is functioning correctly
3. Delete the recurring job
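
The manifests in the Create scenarios above differ only in the snapshot name and the volume. As a convenience for repeating these steps, a small helper could generate them. This is a sketch only: `snapshot_manifest` and the file name `make_snapshot.py` are made up for illustration, and the fields are copied from the YAML in this test plan.

```python
import json


def snapshot_manifest(name, volume, namespace="longhorn-system"):
    """Build the Snapshot manifest used in the Create scenarios as a dict."""
    return {
        "apiVersion": "longhorn.io/v1beta2",
        "kind": "Snapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"volume": volume, "createSnapshot": True},
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, so the output can be piped:
    #   python make_snapshot.py | kubectl apply -f -
    print(json.dumps(snapshot_manifest("snap001", "vol1"), indent=2))
```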

Create snapshot CR with duplicated name

Deploy the following YAML:

```
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: snap002
  namespace: longhorn-system
spec:
  volume: vol1
  createSnapshot: true
```

Verify that nothing changes

Deploy the following YAML:

```
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: snap002
  namespace: longhorn-system
spec:
  volume: vol2
  createSnapshot: true
```

Verify that the YAML cannot be deployed because spec.volume is immutable
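
The two expected outcomes in this scenario (re-applying an identical manifest changes nothing, while changing spec.volume is rejected) can be expressed as a check like the one below. This is only a sketch of the rule as this test plan describes it; `check_update` and `ImmutableFieldError` are hypothetical names, not the actual Longhorn validation code.

```python
class ImmutableFieldError(ValueError):
    """Raised when an update tries to change an immutable field."""


def check_update(old, new):
    """Return True if the update changes anything, False if it is a no-op.

    Raises ImmutableFieldError when spec.volume differs between the manifests.
    """
    old_volume = old["spec"]["volume"]
    new_volume = new["spec"]["volume"]
    if old_volume != new_volume:
        raise ImmutableFieldError(
            f"spec.volume is immutable: {old_volume!r} -> {new_volume!r}"
        )
    return old != new
```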

Create snapshot CR for the next step

Deploy the following YAML:

```
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: snap003
  namespace: longhorn-system
spec:
  volume: vol1
  createSnapshot: true
```

Delete

Delete snapshot CR, volume attached
1. Delete the snap001 CR
2. Verify the snap001 CR is deleted and the corresponding snapshot is deleted in Longhorn UI
Delete snapshot CR, volume detached
1. Detach vol1
2. Delete snap003 CR
3. Verify that the snap003 CR is stuck in deleting because the volume is not attached
4. Run kubectl describe snapshots -n longhorn-system snap003. Verify that the related events contain: cannot delete snapshot because the volume engine vol1-e-cbc27184 is not running
Delete snapshot CR that is next to volume-head
1. Attach vol1
2. Verify that snap003 is still stuck in deleting because it is the snapshot next to volume-head and thus cannot be removed
3. Create a new snapshot snap004 by applying:
```
apiVersion: longhorn.io/v1beta2
kind: Snapshot
metadata:
  name: snap004
  namespace: longhorn-system
spec:
  volume: vol1
  createSnapshot: true    
```
4. Verify that snap004 is created and snap003 is deleted in both kubectl and the Longhorn UI
Delete snapshot using UI
1. Create a new snapshot using Longhorn UI, assume that its name is abc
2. Delete snap004 using Longhorn UI
3. Verify that snap004 is removed in both kubectl and the UI
Delete snapshot that is next to volume-head using Longhorn UI
1. Delete snapshot abc
2. Verify that snapshot abc is stuck in deleting
3. Verify that the snapshot abc CR status is correct (e.g., status.markRemoved: true and status.readyToUse: false)
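
The status check in the last step can be scripted against the JSON form of the CR, for example the output of kubectl get snapshots -n longhorn-system abc -o json. The helper below is a sketch: `is_stuck_in_removal` is a hypothetical name, and it relies only on the two status fields named in this test plan.

```python
def is_stuck_in_removal(snapshot_cr):
    """True when the CR is marked for removal and is no longer ready to use."""
    status = snapshot_cr.get("status", {})
    return status.get("markRemoved") is True and status.get("readyToUse") is False
```

A CR with no status yet, or with either field missing, is reported as not stuck, so the check only passes when both fields are explicitly set as expected.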

Others

Deleting attached volume with existing snapshot CRs
1. Create a few more snapshots for vol1
2. Verify that the corresponding snapshot CRs are created
3. Delete vol1
4. Verify that vol1 is deleted together with all of its snapshots
Deleting detached volume with existing snapshot CRs
1. Attach vol2
2. Create a few more snapshots for vol2
3. Verify that the corresponding snapshot CRs are created
4. Detach vol2
5. Delete vol2
6. Verify that vol2 is deleted together with all of its snapshots
Uninstall Longhorn with existing snapshot CRs
1. Longhorn manager pods are still running
  1. Create a few volumes, attach them
  2. Create snapshots for them
  3. Detach some of them
  4. Uninstall Longhorn
  5. Verify that the uninstallation succeeds
2. Longhorn manager pods are dead
  1. Create a few volumes, attach them
  2. Create snapshots for them
  3. Detach some of them
  4. Delete the Longhorn manager daemonset
  5. Uninstall Longhorn
  6. Verify that the uninstallation succeeds

longhorn-io-github-bot · 2022-05-18T00:42:22Z

chriscchien · 2022-05-18T09:56:15Z

Hi @PhanLe1010,

I have a quick question about test plan step Create -> 2 -> iii: is the snapshot name snap001 a typo in the command kubectl get snapshots -n longhorn-system snap001?

Also, in step Delete -> 3 -> ii, snapshot snap002 is not next to volume-head, because we previously tested a recurring job with retain count 5, so there should be recurring job snapshots after snap002. Could you change the test steps so that they flow more smoothly? Thank you.

PhanLe1010 · 2022-05-18T18:41:30Z

> step Create -> 2 -> iii: is the snapshot name snap001 a typo in the command kubectl get snapshots -n longhorn-system snap001?

Yeah, it is a typo. Fixed it. Thank you

> In step Delete -> 3 -> ii, snapshot snap002 is not next to volume-head, because we previously tested a recurring job with retain count 5, so there should be recurring job snapshots after snap002. Could you change the test steps so that they flow more smoothly? Thank you.

Done. Thank you

chriscchien · 2022-05-19T03:03:14Z

HubbeKing added the kind/feature Feature request, new feature label Oct 12, 2021

jenting modified the milestones: Planning, v1.3.0 Oct 15, 2021

innobead added priority/0 Must be implement or fixed in this release (managed by PO) area/api Longhorn manager public API highlight Important feature/issue to highlight labels Oct 15, 2021

jenting mentioned this issue Oct 21, 2021

[FEATURE] Extend CSI Snapshot support to Longhorn snapshot #2534

Closed

joshimoo mentioned this issue Oct 21, 2021

[IMPROVEMENT] Pre-condition check required for the snapshot purge #2777

Open

innobead added the require/lep Require adding/updating enhancement proposal label Dec 20, 2021

innobead assigned PhanLe1010 Dec 20, 2021

PhanLe1010 mentioned this issue Apr 20, 2022

Support Longhorn snapshot CRD longhorn/longhorn-manager#1304

Merged

joshimoo mentioned this issue Apr 20, 2022

[BUG] PVC create from volumsanpshot stuck at pending status(type=snap) #3860

Closed

PhanLe1010 mentioned this issue Apr 20, 2022

[outdated] Add LEP for Longhorn snapshot CRD feature #3876

Closed

innobead assigned chriscchien May 13, 2022

innobead added the require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated label May 13, 2022

innobead mentioned this issue May 13, 2022

[TEST] Add a pipeline for snapshot resource scaling testing #3979

Open

2 tasks

This was referenced May 16, 2022

Update charts for Longhorn snapshot CRDs #3991

Merged

Keep the owner check inside the isResponsibleFor in the worker goroutine only longhorn/longhorn-manager#1334

Merged

innobead added the require/doc Require updating the longhorn.io documentation label May 17, 2022

PhanLe1010 added require/manual-test-plan Require adding/updating manual test cases if they can't be automated and removed require/doc Require updating the longhorn.io documentation labels May 18, 2022

PhanLe1010 mentioned this issue May 18, 2022

[IMPROVEMENT] Change Longhorn API to create/delete snapshot CRs instead of calling engine CLI #3995

Closed

chriscchien closed this as completed May 19, 2022

innobead changed the title ~~[FEATURE] CRD for snapshots~~ [FEATURE] Snapshot CRD Jun 15, 2022

PhanLe1010 mentioned this issue Oct 20, 2022

Add LEP for Longhorn snapshot CRD feature #4743

Merged
