[BUG] NPE when listing backups of a backup volume #2683

Closed
shuo-wu opened this issue Jun 14, 2021 · 8 comments
Labels: area/volume-backup-restore, backport/1.1.2, kind/bug

@shuo-wu
Contributor

shuo-wu commented Jun 14, 2021

Describe the bug
While the first backup of a new backup volume is still in progress, listing that backup volume's backups via the UI triggers a nil pointer dereference (NPE).

To Reproduce
Steps to reproduce the behavior:

  1. Launch a volume with enough data so that the 1st backup creation will take some time
  2. Create the 1st backup for the volume
  3. Enter into the backup volume to list all backups on the Backup page

Expected behavior
The backup volume shows no backups yet, and no error is reported.

Log

error listing backups for volume 'vol': error listing backups: Failed to execute: /var/lib/longhorn/engine-binaries/shuowu-longhorn-engine-backup-test-3/longhorn [backup ls --volume vol s3://backupbucket@us-east-1/backupstore], output , stderr, panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x7f3c9a]

goroutine 66 [running]:
github.com/longhorn/backupstore.addListVolume(0x7ffe50f6b001, 0x3, 0x1093e40, 0xc00007d3b0, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/longhorn/backupstore/list.go:123 +0x6da
github.com/longhorn/backupstore.List.func1(0x108b160, 0xc000088c00, 0x1, 0xc000232d68, 0x8, 0x8)
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/longhorn/backupstore/list.go:154 +0x4d
github.com/honestbee/jobq.JobRunnerFunc.Run(0xc00016ab10, 0x108b160, 0xc000088c00, 0x4, 0x4, 0x0, 0x0)
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/honestbee/jobq/job.go:19 +0x3a
github.com/honestbee/jobq.newJob.func1.1.1(0x1076da0, 0xc00016ab10, 0xc0001ba0c0)
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/honestbee/jobq/job.go:154 +0x5f
created by github.com/honestbee/jobq.newJob.func1.1
    /go/src/github.com/longhorn/longhorn-engine/vendor/github.com/honestbee/jobq/job.go:153 +0x50
, error exit status 2

Environment:

  • Longhorn version: master - 06/14/2021

Additional context
Need to check the possible error here: https://github.com/longhorn/backupstore/blob/89bd8d273db76bd454ffe90d6c12f45354d4ed07/list.go#L121-L122
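
For readers unfamiliar with this class of bug, here is a minimal sketch of the general pattern the trace points to: a loader returns a nil pointer together with an error, and the caller dereferences the pointer without checking the error first. Every type and function name below (`BackupConfig`, `VolumeInfo`, `loadBackupConfig`, `listBackups`) is hypothetical and chosen for illustration only. This is not the backupstore code and not necessarily how the actual fix was written.

```go
package main

import (
	"errors"
	"fmt"
)

// BackupConfig and VolumeInfo are hypothetical stand-ins,
// not the real backupstore types.
type BackupConfig struct {
	Name string
}

type VolumeInfo struct {
	Name    string
	Backups map[string]*BackupConfig
}

// loadBackupConfig is a hypothetical loader. It returns (nil, error) when
// the config is not available yet, which is assumed here to be the case
// while the first backup is still being created.
func loadBackupConfig(store map[string]*BackupConfig, name string) (*BackupConfig, error) {
	cfg, ok := store[name]
	if !ok {
		return nil, errors.New("backup config not found: " + name)
	}
	return cfg, nil
}

// listBackups collects every backup that can be loaded and reports the
// names that could not be loaded instead of panicking on them.
func listBackups(store map[string]*BackupConfig, volume string, names []string) (*VolumeInfo, []string) {
	info := &VolumeInfo{Name: volume, Backups: map[string]*BackupConfig{}}
	var failed []string
	for _, name := range names {
		cfg, err := loadBackupConfig(store, name)
		if err != nil {
			// Without this check cfg is nil, and cfg.Name below
			// would panic with a nil pointer dereference.
			failed = append(failed, name)
			continue
		}
		info.Backups[cfg.Name] = cfg
	}
	return info, failed
}

func main() {
	store := map[string]*BackupConfig{"backup-a": {Name: "backup-a"}}
	info, failed := listBackups(store, "vol", []string{"backup-a", "backup-in-progress"})
	fmt.Println("loaded:", len(info.Backups), "skipped:", failed)
}
```

Whether a listing should skip an unreadable backup, report it alongside the others, or fail the whole call is a design decision; the sketch only shows that the error has to be inspected before the pointer is used.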

@shuo-wu
Contributor Author

shuo-wu commented Jun 14, 2021

cc @jenting

@AlexGustafsson

I've started getting this error as well. I've been running Longhorn for about a month without similar issues. I was previously able to create and view backups, but now, seemingly out of nowhere, it no longer works.

I've configured automatic backups to S3 (minio) daily and weekly for some volumes of around 250 MiB in size each. After getting this error I checked the S3 bucket and there's only about 80 MiB stored there. Could this be due to compression of near-empty volumes or have they not been created at all? Is there any way to easily check for existing backups without being able to access the backup page?

@shuo-wu
Contributor Author

shuo-wu commented Jun 17, 2021

> I've configured automatic backups to S3 (minio) daily and weekly for some volumes of around 250 MiB in size each. After getting this error I checked the S3 bucket and there's only about 80 MiB stored there. Could this be due to compression of near-empty volumes or have they not been created at all? Is there any way to easily check for existing backups without being able to access the backup page?

This error should happen only in the latest master version. Which longhorn version are you using?

You can get the backup list by checking your minio work directory, although it is not straightforward. For example:

# ls /storage/backupbucket/backupstore/backupstore/volumes/47/e3/vol/backups/
backup_backup-13ee055e2ccc45d7.cfg  backup_backup-4e264e73eb1f4820.cfg  backup_backup-74fa809a4bca4e3e.cfg  backup_backup-ed77850e8b3f4c76.cfg
backup_backup-3d6b4f569e7a4b71.cfg  backup_backup-50c2dcf332f14638.cfg  backup_backup-9f13ba7c0f044335.cfg
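
If the directory is large, the same check can be scripted. The sketch below assumes only what the listing above shows: that each backup appears as a file named `backup_<name>.cfg` inside the volume's `backups/` directory. The helper name `backupNames` is made up for this example, and the directory path has to be supplied by the user.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// backupNames extracts backup names from file names of the form
// "backup_<name>.cfg", the pattern seen in the directory listing.
// Files that do not match the pattern are ignored.
func backupNames(fileNames []string) []string {
	var names []string
	for _, f := range fileNames {
		if strings.HasPrefix(f, "backup_") && strings.HasSuffix(f, ".cfg") {
			name := strings.TrimSuffix(strings.TrimPrefix(f, "backup_"), ".cfg")
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Directory to scan; defaults to the current directory.
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read directory:", err)
		os.Exit(1)
	}
	var files []string
	for _, e := range entries {
		if !e.IsDir() {
			files = append(files, e.Name())
		}
	}
	for _, n := range backupNames(files) {
		fmt.Println(n)
	}
}
```

Run it with the path to the volume's `backups/` directory as the only argument. It reads file names only and does not inspect the contents of the `.cfg` files, so it cannot tell whether a listed backup is complete.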

@jenting
Contributor

jenting commented Jun 17, 2021

I think it will also happen with longhorn-engine v1.1.1.

@jenting
Contributor

jenting commented Jun 17, 2021

@innobead could we include it in v1.1.2? The fix is quick and easy.

@AlexGustafsson

AlexGustafsson commented Jun 17, 2021

I'm running v1.1.1. Thanks for the minio instructions.

@longhorn-io-github-bot

longhorn-io-github-bot commented Jun 17, 2021

Pre Ready-For-Testing Checklist

  • Are the reproduce steps/test steps documented?

  • Is there a workaround for the issue? If so, is it documented?

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, have both YAML file and Chart been updated in the PR?

  • Is the backend code merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed)?
    The PRs are backupstore#73 (Fix list in progress backup panic), longhorn-engine#627 (Bump longhorn/backupstore version to fix list in progress backup panic), and longhorn-engine#628 ([cherry-pick-v1.1.2] Bump longhorn/backupstore version to fix list in progress backup panic).

  • Which areas/issues this PR might have potential impacts on?
    Area List Backups
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed)?

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed)?

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? (including backport-needed)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed)?

  • If labeled: require/manual-test-plan Has the manual test plan been documented?

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?

@innobead innobead modified the milestones: v1.1.2, v1.2.0 Jun 17, 2021
@kaxing

kaxing commented Jun 21, 2021

Validation - Passed

Tested against 1.1.2-head (ui: 7fb9440) with the following steps:

  • Create a pod with longhorn volume
  • Setup recurring backup every minute and keep 1 backup
  • Refresh and watch the backup page for this volume

In addition to the original steps:

To Reproduce
Steps to reproduce the behavior:

  1. Launch a volume with enough data so that the 1st backup creation will take some time
  2. Create the 1st backup for the volume
  3. Enter into the backup volume to list all backups on the Backup page

No errors pop up anymore, and the backups always show up.

@kaxing kaxing closed this as completed Jun 21, 2021
@kaxing kaxing self-assigned this Jun 21, 2021
@innobead innobead changed the title [BUG]NPE when listing backups of a backup volume [BUG] NPE when listing backups of a backup volume Aug 31, 2021