Aws::S3::Client#get_object can use range requests to resume interrupted downloads #1535
Comments
Thanks for bringing up the idea, I appreciate it. I'd be happy to dig in more to see how we can scope this as a feature request or otherwise improve the user experience there. For the download, we release […] From my understanding of your proposal (feel free to correct me if I'm missing something : )), you are suggesting special handling for "basic" (no […]) […] Personally, I feel […]
Thanks for a thorough answer! I did consider the new multipart downloader; however, my use case is a bit different from a normal download. I actually need to stream the content (i.e. pass a block to […])
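To make the streaming use case concrete: `Aws::S3::Client#get_object` yields the response body in chunks when called with a block, instead of buffering the whole object. A minimal stand-in client is used in this sketch so it runs without `aws-sdk-s3` or network access; `StubS3Client`, the bucket, and the key are all hypothetical.

```ruby
# Sketch of block-streaming a download. StubS3Client mimics only the block
# form of Aws::S3::Client#get_object and is not part of the real SDK.
class StubS3Client
  def initialize(body, chunk_size: 8)
    @body = body
    @chunk_size = chunk_size
  end

  # Yields the body in fixed-size chunks, like the SDK's streaming mode.
  def get_object(bucket:, key:)
    @body.each_char.each_slice(@chunk_size) { |slice| yield slice.join }
  end
end

s3 = StubS3Client.new("streamed object body")
received = []
s3.get_object(bucket: "my-bucket", key: "my-key") { |chunk| received << chunk }
# received now holds the body split into 8-byte chunks
```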
@janko-m Thanks for the information, I see. Well, as that blog post suggests: […]
However, if partial retrieval is your main concern, I'm thinking perhaps we could add an extra […] Would that sound good to you? If so, I can put that in the feature backlog : )
I'm not sure if that will have all the benefits of using […]:

```ruby
io = Down::ChunkedIO.new(chunks: object.enum_for(:get))
io.read(1*1024*1024) # downloads and returns first ~1MB
io.close # `#get_object` request is terminated, and nothing more gets downloaded
```

Secondly, using the […] Thirdly, […], which should probably be discussed in a separate ticket specific to the […]
@janko-m Ah I see, thanks for the clarification. I agree it's a separate issue; tagging this as a feature request, happy to take a PR for review. I'll do some benchmarking/exploring myself : )
Tracked in backlog to be prioritized : )
This Amazon article describes various ways in which you can download objects from S3. The Seahorse client currently retries any failed request, which includes interrupted downloads, and it retries them like any other request: by re-issuing the same request. As a result, it's currently not possible for streaming downloads, i.e. `Aws::S3::Client#get_object` with a block, to be resilient to network errors: when `:response_target` is an IO object it can be truncated before retrying the request, but with a block (`BlockIO`) you cannot do that, because the chunks were already yielded.

I propose that for `Aws::S3::Client#get_object` requests, instead of retrying the entire download from the beginning, aws-sdk use the `Range` header to resume downloading from the last downloaded byte. That way retrying interrupted downloads will be faster, and it will also work for streaming downloads (where a block is passed to `Aws::S3::Client#get_object`).