
Curl options --range and --continue-at clash without warning #15646

Closed
piru opened this issue Nov 26, 2024 · 7 comments

@piru

piru commented Nov 26, 2024

I did this

curl --continue-at 2 --range 0-1 https://curl.se/robots.txt

I expected the following

An error to be reported, since the two options clash (any --range option is quietly overridden by --continue-at).
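
For reference, before the fix the second command below quietly behaved like the first; the explicit range was dropped:

curl --continue-at 2 https://curl.se/robots.txt
curl --continue-at 2 --range 0-1 https://curl.se/robots.txt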

curl/libcurl version

curl 8.11.0

operating system

Any

@bagder bagder self-assigned this Dec 2, 2024
bagder added a commit that referenced this issue Dec 2, 2024
Allowing both just creates a transfer with behaviors no user can properly anticipate, so better to just deny the combo.

Fixes #15646
Reported-by: Harry Sintonen
@bagder bagder closed this as completed in fcb5953 Dec 2, 2024
@barkoder

barkoder commented Dec 16, 2024

@bagder

Could you please add a way for curl to work with both -C - and -r?

  1. Say I'm downloading a file: curl -s -r 11-20 https://curl.se/robots.txt -o test.file

  2. But the download is interrupted for whatever reason (or maybe I manually stopped it) after 5 bytes.

  3. There is also no way for me to use curl to just download and append byte range 16-20 to the same file.
    Running with continue, curl -r 10-20 -C - -s https://curl.se/robots.txt -o test.file, used to (before you solved this issue) just overwrite the file.

  4. What I expect to happen here is that curl recognizes the size of the downloaded file (test.file), knows that 5 bytes have already been downloaded to test.file, and correctly resumes a range-specified transfer where it left off (automatically transferring -r 16-20), instead of re-downloading the full range (-r 11-20).

Basically I'm requesting an override option, so that -C - is able to work with -r when -o test.file is specified.

Thanks!

@bagder
Member

bagder commented Dec 16, 2024

Step 4 is an incorrect assumption though. The downloaded 5 bytes from step 2 are stored in the output file, but there is no trace of which offset those five bytes came from. Trying to resume that file will make curl ask to continue from offset 5 onward and append to the already existing file, which produces a broken file.

I.e., -C can only continue and append to a transfer whose existing leading part is correct.
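
To illustrate with the steps above (assuming test.file holds the 5 bytes from the interrupted -r 11-20 transfer):

curl -C - -s https://curl.se/robots.txt -o test.file

-C - derives the resume offset from the size of test.file, so curl requests bytes 5- and appends them after data that actually came from offset 11, corrupting the file.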

@jay
Member

jay commented Dec 16, 2024

Basically I'm requesting an override option, so that -C - is able to work with -r when -o test.file is specified.

I don't see us adding support for that because it's something so niche that hardly anyone would use it.

@barkoder

barkoder commented Dec 16, 2024

I use curl in a script that splits up a file hosted on very unreliable servers, one that is only accessible via certain proxies (which are also sometimes unreliable).

Looks a bit like this.

while true ; do curl --retry-all-errors -H '<HEADERS>' -s -x socks5h://PROXY1 "$url" -r 0-15032385536 -o test.1 && break ; sleep 10 ; done &
while true ; do curl --retry-all-errors -H '<HEADERS>' -s -x socks5h://PROXY2 "$url" -r 15032385537-30064771072 -o test.2 && break ; sleep 10 ; done &
while true ; do curl --retry-all-errors -H '<HEADERS>' -s -x socks5h://PROXY3 "$url" -r 30064771073-45097156608 -o test.3 && break ; sleep 10 ; done &
while true ; do curl --retry-all-errors -H '<HEADERS>' -s -x socks5h://PROXY4 "$url" -r 45097156609- -o test.4 && break ; sleep 10 ; done &

Once the transfers are complete, I then cat the individual pieces back into the complete file.
This works: the checksum of the complete file checks out.
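
The reassembly is just something like:

cat test.1 test.2 test.3 test.4 > complete.file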

Except for when the server/proxy flakes out, at which point the script starts re-downloading the interrupted piece from scratch.
Suffice it to say that re-downloading a ~14 GiB piece at ~17 KiB/s when the transfer was interrupted 80% of the way through is... very painful.

Hence my request for that particular option.
If not -C -, then something else that allows transfer resumption with ranges when the output file exists.

If this is truly something you feel is so niche that hardly anyone would use it, fair enough.
But I was hoping this could at least be considered a 'low-priority' request?

Thanks!

@bagder
Member

bagder commented Dec 16, 2024

If not -C -, then something else that allows transfer resumption with ranges when the output file exists.

That seems like logic your script should have. When you download a file in chunks, each chunk has a start offset and a size. If an individual chunk fails to download completely, check how much of the chunk you got and issue a second range request with the start offset and size adjusted accordingly.
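
Roughly, per chunk, something like this (a sketch assuming GNU stat; BSD/macOS would use stat -f%z):

START=15032385537
END=30064771072
got=$(stat -c%s test.2 2>/dev/null || echo 0)  # bytes of this chunk already on disk
if [ "$got" -lt $((END - START + 1)) ]; then
  # fetch only the remainder and append it to the partial chunk
  curl -s -r "$((START + got))-$END" "$url" >> test.2
fi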

@bagder
Member

bagder commented Dec 16, 2024

This said, I've had this idea for a long while: use multiple parallel transfers for a single download.

@jay
Member

jay commented Dec 16, 2024

I think I'm already doing something like what he wants in #15333.
