[BUG] Volume UI displays only the last backup when using the recurring job #2997

Closed
mfbrj opened this issue Sep 13, 2021 · 26 comments
Labels
area/recurring-job Longhorn recurring job related area/snapshot Volume snapshot (in-cluster snapshot or external backup) component/longhorn-manager Longhorn manager (control plane) kind/bug priority/1 Highly recommended to fix in this release (managed by PO) require/doc Require updating the longhorn.io documentation

@mfbrj

mfbrj commented Sep 13, 2021

Describe the bug
Volume UI is displaying only the last backup

To Reproduce
Steps to reproduce the behavior:

  1. Create a backup recurring job with retain >= 2, and a group
  2. Associate this recurring job group to some volume
  3. Wait for 2 or 3 backups to be performed
  4. See that volume UI displays only the last backup. Screenshots below.

Expected behavior
Volume UI displays 2 or more backups, within retain range.

Log

Environment:

  • Longhorn version: 1.2.0
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher Catalog App
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE v1.20.9
    • Number of management nodes in the cluster: 3
    • Number of worker nodes in the cluster: 8
  • Node config
    • OS type and version: Ubuntu 20.04.3 LTS kernel 5.4.0-84-generic
    • CPU per node: 4
    • Memory per node: 16 GB
    • Disk type(e.g. SSD/NVMe): SSD
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): VMWare
  • Number of Longhorn volumes in the cluster: 23

Additional context
Wrong behaviour:
image

Wrong behaviour:
image

Right behaviour:
image

@mfbrj mfbrj added the kind/bug label Sep 13, 2021
@jenting jenting added this to New in Community Issue Review via automation Sep 13, 2021
@PhanLe1010 PhanLe1010 moved this from New to In progress in Community Issue Review Sep 14, 2021
@PhanLe1010 PhanLe1010 self-assigned this Sep 14, 2021
@PhanLe1010
Contributor

PhanLe1010 commented Sep 15, 2021

@smallteeths Do you know if the snapshot chain graph and the table of the snapshots are getting data from the same source? We are wondering if this problem is because of UI parsing the timestamp differently or if we need to fix it from the backend.

@mfbrj
Author

mfbrj commented Sep 15, 2021

@PhanLe1010 as far as I can tell, there is only one source, because I have only one minio backup server configured in the Longhorn settings.

@PhanLe1010
Contributor

Thanks @mfbrj! Sorry, my bad! The above comment was meant for issue #2994 😂

For this issue, we will first try to reproduce it in the lab.

@mfbrj
Author

mfbrj commented Sep 15, 2021

@PhanLe1010 ok. I suspected it was about the other issue, because of the mention of the timestamp.

@shuo-wu
Contributor

shuo-wu commented Sep 20, 2021

For each recurring backup job, Longhorn will retain the last snapshot created by this job after each backup creation is complete:
https://github.com/longhorn/longhorn-manager/blob/e4be7bebf46ce43e06e3d5ca882f6fda21588afd/app/recurring_job.go#L535-L543

This may work as expected. @PhanLe1010 Can you double-check this feature?
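
The retention described above can be illustrated with a minimal Go sketch. It is not the code at the linked lines of recurring_job.go: the `Snapshot` type, its fields, and the `snapshotsToPurge` function are hypothetical and only model the idea that, once a backup completes, the job keeps its newest snapshot and drops its older ones.

```go
package main

import (
	"fmt"
	"sort"
)

// Snapshot is a hypothetical, simplified stand-in for a volume snapshot.
type Snapshot struct {
	Name    string
	Job     string // recurring job that created it ("" for a manual snapshot)
	Created int64  // creation time as a Unix timestamp
}

// snapshotsToPurge returns the names of the snapshots created by job that
// would be removed if only the newest one is kept. Snapshots created by
// other jobs, or manually, are never returned.
func snapshotsToPurge(snaps []Snapshot, job string) []string {
	var mine []Snapshot
	for _, s := range snaps {
		if s.Job == job {
			mine = append(mine, s)
		}
	}
	if len(mine) <= 1 {
		return nil // nothing older than the newest snapshot
	}
	// Oldest first, so the last element is the snapshot to keep.
	sort.Slice(mine, func(i, j int) bool { return mine[i].Created < mine[j].Created })
	purge := make([]string, 0, len(mine)-1)
	for _, s := range mine[:len(mine)-1] {
		purge = append(purge, s.Name)
	}
	return purge
}

func main() {
	snaps := []Snapshot{
		{Name: "backup-1", Job: "nightly", Created: 100},
		{Name: "manual-1", Job: "", Created: 150},
		{Name: "backup-2", Job: "nightly", Created: 200},
		{Name: "backup-3", Job: "nightly", Created: 300},
	}
	fmt.Println(snapshotsToPurge(snaps, "nightly")) // prints [backup-1 backup-2]
}
```

Under this model, a job with retain >= 2 still leaves a single job-created snapshot on the volume, which matches what the reporter sees in the chain.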

@mfbrj
Author

mfbrj commented Sep 20, 2021

@shuo-wu
Why not get the backup info from the same source as the Backup tab? In the Backup tab, the info is consistent and correct.

@PhanLe1010
Contributor

PhanLe1010 commented Sep 20, 2021

Thanks @shuo-wu. I just tried the provided reproducing steps and looked at the code. Yeah, this is working as expected.

@mfbrj The snapshot chain in the volume detail page displays all existing snapshots of a volume (not all existing backups). If a snapshot has a corresponding backup, the snapshot is shown in green. Users need to go to the Backup tab to see all existing backups.
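
To make the distinction concrete, here is a small Go sketch of how a chain built from snapshots differs from the backup list. The `ChainEntry` and `Backup` types and the `buildChain` function are hypothetical illustrations, not the Longhorn UI or manager code. The assumption is only what the comment above states: the chain lists existing snapshots and marks those that have a backup.

```go
package main

import "fmt"

// Backup is a hypothetical record linking a backup to its source snapshot.
type Backup struct {
	Name     string
	Snapshot string
}

// ChainEntry is a hypothetical view-model for one node in the snapshot chain.
type ChainEntry struct {
	Snapshot  string
	HasBackup bool // a node with a backup is the one drawn in green
}

// buildChain returns one entry per existing snapshot, plus the names of the
// backups whose source snapshot no longer exists on the volume. Those
// backups cannot appear in the chain and are visible only in the backup list.
func buildChain(snapshots []string, backups []Backup) (chain []ChainEntry, orphaned []string) {
	backedUp := make(map[string]bool, len(backups))
	for _, b := range backups {
		backedUp[b.Snapshot] = true
	}
	exists := make(map[string]bool, len(snapshots))
	for _, s := range snapshots {
		exists[s] = true
		chain = append(chain, ChainEntry{Snapshot: s, HasBackup: backedUp[s]})
	}
	for _, b := range backups {
		if !exists[b.Snapshot] {
			orphaned = append(orphaned, b.Name)
		}
	}
	return chain, orphaned
}

func main() {
	// Only the newest job snapshot is still on the volume.
	snapshots := []string{"snap-manual", "snap-3"}
	backups := []Backup{
		{Name: "backup-1", Snapshot: "snap-1"},
		{Name: "backup-2", Snapshot: "snap-2"},
		{Name: "backup-3", Snapshot: "snap-3"},
	}
	chain, orphaned := buildChain(snapshots, backups)
	for _, e := range chain {
		fmt.Printf("%s green=%v\n", e.Snapshot, e.HasBackup)
	}
	fmt.Println("only in the backup list:", orphaned)
}
```

With three backups but only one surviving job snapshot, the chain has a single green node while the backup list still shows all three backups.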

@mfbrj
Author

mfbrj commented Sep 20, 2021

Ok @PhanLe1010, but why does the chain display only the last backup, even if there are other snapshots with a corresponding backup?

The chain should display all valid snapshots that have backups, to make the restore process in maintenance mode easy from the UI.

@jenting
Contributor

jenting commented Sep 22, 2021

Okay, after discussing internally, we should improve this to make the Snapshots and Backups chain inside the Volume page consistent with the Backup page.

@jenting
Contributor

jenting commented Nov 2, 2021

We'll fix it on the backend side.

@kaxing

kaxing commented Dec 3, 2021

Validation in progress, testing with 1.2.x-head images.

No problem when the snapshot is taken after a backup:
image

@kaxing

kaxing commented Dec 3, 2021

Validation - Passed

Verified with master-head images.

Behavior expectations:

  • A backup created from the Volume page is shown in the S/B (Snapshot and Backup) chain.
  • A backup deleted on the Backup page is removed from the S/B chain.
  • A snapshot taken on the volume detail page is displayed correctly.

Example of a volume with 4 backups:

  • The S/B chain:
    • image
  • The S/B list:
    • image
  • The backup volume page:
    • image

@kaxing kaxing closed this as completed Dec 3, 2021
@mfbrj
Author

mfbrj commented Dec 3, 2021

@kaxing Did you also test the backups done by Recurring Jobs?

@innobead innobead added area/snapshot Volume snapshot (in-cluster snapshot or external backup) component/longhorn-manager Longhorn manager (control plane) labels Aug 22, 2022
@innobead innobead added priority/1 Highly recommended to fix in this release (managed by PO) and removed priority/2 Nice to fix in this release (managed by PO) labels Nov 7, 2022
@innobead innobead modified the milestones: v1.4.0, v1.5.0 Nov 30, 2022
@innobead innobead modified the milestones: v1.5.0, v1.6.0 Apr 6, 2023
@innobead
Member

innobead commented Jun 8, 2023

@mantissahz saw you have the PR, but where is it?

@mantissahz
Contributor

just created it, longhorn/longhorn-manager#1965

@longhorn-io-github-bot

longhorn-io-github-bot commented Jun 8, 2023

Pre Ready-For-Testing Checklist

  • Where are the reproduce steps/test steps documented?
    The reproduce steps/test steps are:
  1. Install Longhorn.
  2. Disable the setting "auto-cleanup-recurring-job-backup-snapshot".
  3. Create a volume with a recurring backup job that retains 3 backups.
  4. After the recurring job has run, 3 snapshots are retained, one for each of the 3 backups.
  5. Enable the setting "auto-cleanup-recurring-job-backup-snapshot".
  6. After a new recurring job run completes, only one snapshot is retained, for the last backup.

@innobead
Member

This introduces a new setting and there is no immediate need to fix it in the maintained release branches, so we decided not to backport it to 1.5.x and 1.4.x.

@roger-ryao

Verified on master-head 20231213

The test steps

#2997 (comment)

  1. Install Longhorn.

  2. Disable the setting auto-cleanup-recurring-job-backup-snapshot.
    Screenshot_20231213_141401

  3. Create a volume with a recurring backup job that retains 3 backups.

  4. After the recurring job is completed, there are 3 snapshots for the 3 retained backups, which can be listed using the commands kubectl -n longhorn-system get backup and kubectl -n longhorn-system get snapshot.

  5. Enable the setting auto-cleanup-recurring-job-backup-snapshot.

  6. After a new recurring job is completed, there is only one snapshot, for the last retained backup.
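
The expected outcome of the steps above can be modelled with a short Go sketch. This is an illustration under stated assumptions, not Longhorn's implementation: `keptSnapshots` and its parameters are hypothetical names, and the only behavior encoded is what the steps describe (setting disabled: one snapshot per retained backup; setting enabled: only the snapshot of the last backup).

```go
package main

import "fmt"

// keptSnapshots models which of a recurring backup job's snapshots remain
// after a run. jobSnapshots lists the snapshots the job has created so far,
// oldest first. retain is the job's retain count. autoCleanup stands in for
// the auto-cleanup-recurring-job-backup-snapshot setting.
// The function name and signature are hypothetical, not Longhorn's API.
func keptSnapshots(jobSnapshots []string, retain int, autoCleanup bool) []string {
	n := len(jobSnapshots)
	if n == 0 || retain <= 0 {
		return nil
	}
	keep := retain
	if autoCleanup {
		keep = 1 // only the snapshot of the last backup survives
	}
	if keep > n {
		keep = n
	}
	return jobSnapshots[n-keep:]
}

func main() {
	runs := []string{"snap-1", "snap-2", "snap-3", "snap-4"}

	// Steps 2 to 4: setting disabled, three runs, retain = 3.
	fmt.Println(keptSnapshots(runs[:3], 3, false)) // prints [snap-1 snap-2 snap-3]

	// Steps 5 and 6: setting enabled, then a fourth run.
	fmt.Println(keptSnapshots(runs, 3, true)) // prints [snap-4]
}
```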

Result Passed

  1. Manual tests passed successfully.
  2. require/doc: doc(setting): keep snapshots for the backup website#730
