Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Add Pagination support to the S3 client (currently we only list the first 1000 results) #1904

Closed
joshimoo opened this issue Oct 22, 2020 · 7 comments
Assignees
Labels
area/v1-data-engine v1 data engine (iSCSI tgt) area/volume-backup-restore Volume backup restore kind/bug priority/1 Highly recommended to fix in this release (managed by PO)
Milestone

Comments

@joshimoo
Copy link
Contributor

joshimoo commented Oct 22, 2020

We need to add pagination support to the S3 client, currently we only list the first 1000 results.
We use the list calls for checking the backups/volumes/blocks currently we get lucky that we
use partitions for volumes and blocks.

But even so it is still possible to have more then 1000 backup.cfs as well as blocks/volumes in the same partition.
Which would end up not showing up on the longhorn-ui or any logic that relies on this content.

To Reproduce

  1. Open browser developer tool.
  2. Create one snapshot by clicks LH GUI (It'll calls snapshotCreate and snapshotBackup APIs).
  3. Copy the snapshotBackup API call, right click Copy -> Copy as cURL.
  4. Run the curl command over 1k times in the Shell.
  5. On LH GUI, Clicks the Backup -> PVC Name to display total PVC backups, you could see that only 100 pages (10 backups/page) are available even we take over 1k backups.
@joshimoo joshimoo added area/v1-data-engine v1 data engine (iSCSI tgt) kind/bug area/volume-backup-restore Volume backup restore priority/1 Highly recommended to fix in this release (managed by PO) labels Oct 22, 2020
@joshimoo
Copy link
Contributor Author

Maybe cause of #1667

@yasker yasker added this to the v1.1.1 milestone Oct 22, 2020
@timmy59100
Copy link

Is this also the cause of this error in the backup UI?
error listing backups: error listing backup volumes: Timeout executing: /var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-v1.0.0/longhorn [backup ls --volume-only s3://PATH/], output , stderr, , error <nil>
Seems only the case if there's lots of data in the bucket, currently 2TB.

@yasker
Copy link
Member

yasker commented Nov 3, 2020

@timmy59100 Yes it's possible. This command needs to go to each level of the backup directory for iteration, so if there are lots of volumes, it's possible didn't get the response in time. Another potential issue is the S3 vendor might respond late. Which S3 vendor you're using?

@timmy59100
Copy link

@yasker we use Exoscale

Is there a way to increase the timeout?

@yasker
Copy link
Member

yasker commented Nov 4, 2020

@timmy59100 Not currently. Can you file another issue regarding the timeout? Also, how many volumes you have? We want to see what we can do to address it in the v1.1.0 release.

@longhorn-io-github-bot
Copy link

longhorn-io-github-bot commented Feb 1, 2021

Pre-merged Checklist

  • Is the reproduce steps/test steps documented?

  • Is there a workaround for the issue? If so, is it documented?

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, have both YAML file and Chart been updated in the PR?

  • Is the backend code merged (Manager, Engine, Instance Manager, BackupStore etc)?
    The PR is at

  • Which areas/issues this PR might have potential impacts on?
    Area: GUI -> List Backups
    Issues: List Backups timeout

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged?
    The Doc issue/PR is at

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case?
    The automation skeleton PR is at
    The automation test case PR is at

  • If labeled: require/automation-engine Has the engine integration test been merged?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces the code for backward compatibility Has an separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@yasker yasker removed the require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated label Feb 5, 2021
@khushboo-rancher khushboo-rancher self-assigned this Feb 17, 2021
@khushboo-rancher
Copy link
Contributor

Verified with the Longhorn Master - 02/18/2021

Validation - Pass

Validated with S3 backupstore, more than 1000 backups are getting listed. Also, verified with NFS backupstore, there is no impact.

The timeout issue during backup is being tracked and will be reproduced and validated as a part of #2218

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/v1-data-engine v1 data engine (iSCSI tgt) area/volume-backup-restore Volume backup restore kind/bug priority/1 Highly recommended to fix in this release (managed by PO)
Projects
None yet
Development

No branches or pull requests

7 participants