Host snapshots from a web server that supports HTTP byte ranges to enable download resuming #1552

Closed
eyeinsky opened this issue Nov 5, 2023 · 2 comments
Labels
bug Something isn't working


eyeinsky commented Nov 5, 2023

This may be a devops issue (where should I report that?), but the snapshots appear to be served by a web server that doesn't support the HTTP Range header, so an interrupted download can't be resumed. For files this large, resume support would be very helpful.

To reproduce, start a download:

curl -v https://update-cardano-mainnet.iohk.io/cardano-db-sync/13.1/db-sync-snapshot-schema-13.1-block-9492777-x86_64.tgz -o snapshot.tgz

Cancel the above and then run the following to try to resume:

curl -v -C - https://update-cardano-mainnet.iohk.io/cardano-db-sync/13.1/db-sync-snapshot-schema-13.1-block-9492777-x86_64.tgz -o snapshot.tgz

curl then exits with the error: curl: (33) HTTP server doesn't seem to support byte ranges. Cannot resume.
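
For context, `-C -` makes curl take the size of the partial output file as the resume offset and ask the server for the remaining bytes with a Range header. The sketch below only prints the header that would correspond to a given partial file; the function name `resume_range_header` is made up for illustration and is not part of curl.

```shell
#!/usr/bin/env bash
# Print the Range header that matches resuming a download into FILE.
# Hypothetical helper for illustration only; curl works this out itself.
resume_range_header() {
  local file=$1
  local size=0
  if [ -f "$file" ]; then
    # Arithmetic expansion strips the padding some wc implementations add.
    size=$(( $(wc -c < "$file") ))
  fi
  # With nothing downloaded yet there is nothing to resume, so print nothing.
  if [ "$size" -gt 0 ]; then
    printf 'Range: bytes=%d-\n' "$size"
  fi
}

# Usage (not run here): resume_range_header snapshot.tgz
```

A server that honours this header answers with 206 and only the missing bytes. Error 33 is what curl reports when the server does not support that.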

OS
Your OS: NixOS

Versions
The db-sync version (eg cardano-db-sync --version): N/A
PostgreSQL version: N/A

Build/Install Method
The method you use to build or install cardano-db-sync: N/A

Run method
The method you used to run cardano-db-sync (eg Nix/Docker/systemd/none): N/A


eyeinsky added the bug label Nov 5, 2023
Contributor

johnalotoski commented Dec 7, 2023

Under investigation.

Also a duplicate of #1230 and #1149.

In the meantime, have you considered using wget instead of curl? The benefits would be:

  • It is a more robust client for large file downloads over lossy links, so it is more likely to complete on the first try without needing a continue flag (which generates a range request).
  • It won't hard fail when the continue flag is used against an origin server that doesn't support byte-range GET requests. In the worst case it re-downloads the full file rather than potentially breaking scripts.

Example: wget [--debug] --continue $URL
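
For scripts that have to stay on curl, the same fall-back (restart from scratch when the range request is rejected) can be approximated with a small wrapper. This is a sketch only: the function name `fetch_resumable` is made up for illustration, and it assumes curl signals the unsupported-range case with exit code 33, as in the report above.

```shell
#!/usr/bin/env bash
# Try to resume a download; if the server rejects the range request
# (curl exit code 33), fall back to downloading the whole file again.
# Hypothetical helper, not an official tool.
fetch_resumable() {
  local url=$1
  local out=$2
  local status=0

  # -C - lets curl work out the resume offset from the existing file.
  curl -f -L -C - -o "$out" "$url" || status=$?

  if [ "$status" -eq 33 ]; then
    echo "Server does not support byte ranges; restarting full download" >&2
    status=0
    # Without -C, curl overwrites the partial file from the beginning.
    curl -f -L -o "$out" "$url" || status=$?
  fi

  return "$status"
}

# Usage (not run here): fetch_resumable "$URL" snapshot.tgz
```

Any other curl failure is passed straight back to the caller, so a surrounding retry loop can still tell a network error from a missing file.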

Another possibility, mentioned in one of the old duplicates, is to fetch directly from the S3 bucket URL, which supports byte ranges. For example:

wget --continue https://s3.ap-northeast-1.amazonaws.com/update.cardano-mainnet.iohk.io/cardano-db-sync/13.1/db-sync-snapshot-schema-13.1-block-9640454-x86_64.tgz

@johnalotoski
Contributor

Range requests are now supported for cardano-db-sync snapshot artifacts and properly return 206 with accept-ranges and content-range headers.

@eyeinsky, the curl resume command you pasted above, which previously hard failed, will now succeed.
Wget is still recommended where possible because it is more robust than curl for this kind of download.

curl -I --range 0-100 https://update-cardano-mainnet.iohk.io/cardano-db-sync/13.1/db-sync-snapshot-schema-13.1-block-9640454-x86_64.tgz
  HTTP/2 206 
  content-length: 101
  accept-ranges: bytes
  content-range: bytes 0-100/56721901324
  <...snip...>
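
The headers above can also be checked from a script, for example before deciding whether a resume is worth attempting. The following is a minimal sketch that reads `curl -I` output on stdin; the function names `is_partial_response` and `total_size_from_headers` are made up for illustration, and it assumes the status line comes first, as in the output shown.

```shell
#!/usr/bin/env bash
# Succeed if the first line of the response headers reports status 206.
# Hypothetical helper for illustration only.
is_partial_response() {
  tr -d '\r' | awk 'NR == 1 { ok = ($2 == "206") } END { exit (ok ? 0 : 1) }'
}

# Print the total size in bytes, i.e. the part after "/" in Content-Range.
# Hypothetical helper for illustration only.
total_size_from_headers() {
  tr -d '\r' | awk 'tolower($1) == "content-range:" {
    n = split($NF, parts, "/")
    print parts[n]
  }'
}

# Usage (not run here):
#   curl -sI --range 0-100 "$URL" | total_size_from_headers
```

Comparing that total with the size of the local file shows how much of the snapshot is still missing.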
