GetObjectRequest in S3 should support final bytes as a Range header value #1551
Comments
Makes sense, we'd have to see if we can make this in a backwards compatible way. In the meantime, I think you should be able to work around this by doing something like the following:
GetObjectRequest req = new GetObjectRequest("bucket", "key");
req.putCustomRequestHeader("Range", "-500");
amazonS3.getObject(req);
|
The simplest way to be backwards compatible here would likely be to add a new method, perhaps something like
Thanks for the workaround, I will use that strategy for now, though I believe the call would need to be
|
Yes good catch. |
Using the suggested workaround results in the following error:
By default, when a
This update to the workaround allows it to work by setting the range (thus disabling the checksum check), then overwriting the Range header value with the custom header:
GetObjectRequest req = new GetObjectRequest("bucket", "key");
req.setRange(0);
req.putCustomRequestHeader("Range", "bytes=-500");
amazonS3.getObject(req);
Unfortunately, this relies on the assumption that the internal implementation will continue to override the Range value with the custom header. That does not seem like a good assumption to make.
|
You can disable md5 checks for GET requests using the System Property. Note: this will disable md5 checks for ALL GET requests. |
Thanks for the pointer @zoewangg. Unfortunately, the majority of requests I will be making are full-object requests, and I really do want md5 checks to occur for those transfers. I'm just looking for a way to disable the md5 checks specifically for Range-limited requests. |
This will also be useful for file formats like ORC and Parquet that want to read the file footer first. |
@omalley I have exactly this use case. Did you find an acceptable workaround? |
I was able to pull the content length from the header and then make a second call using that content length to pull the footer, although I am guessing this issue is about being able to do this without making two calls.
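The two-call approach described above comes down to a small piece of arithmetic: once the content length is known, the footer is the inclusive byte range ending at the last byte of the object. A minimal sketch follows, assuming a hypothetical helper class `FooterRange` (not part of the AWS SDK); the SDK calls appear only in comments, on the assumption that the content length comes from `getObjectMetadata` in the 1.x SDK.

```java
/** Hypothetical helper, not part of the AWS SDK: computes the byte range of an object's footer. */
final class FooterRange {
    private FooterRange() {}

    /**
     * Returns {start, end}, both inclusive, covering the last footerLength bytes
     * of an object that is contentLength bytes long. If the object is shorter
     * than footerLength, the range covers the whole object.
     */
    static long[] lastBytes(long contentLength, long footerLength) {
        if (contentLength <= 0 || footerLength <= 0) {
            throw new IllegalArgumentException("lengths must be positive");
        }
        long start = Math.max(0, contentLength - footerLength);
        long end = contentLength - 1; // HTTP byte ranges are inclusive
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        // Assumed flow with the AWS SDK for Java 1.x (first call):
        //   long contentLength = amazonS3.getObjectMetadata("bucket", "key").getContentLength();
        long contentLength = 1000; // stand-in value for illustration
        long[] range = lastBytes(contentLength, 500);
        // Second call:
        //   req.setRange(range[0], range[1]);
        //   amazonS3.getObject(req);
        System.out.println("bytes=" + range[0] + "-" + range[1]);
    }
}
```

Because this goes through setRange(long start, long end), it avoids the custom header override, at the cost of the extra metadata request.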
According to https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35, a Range header may include a single negative value to indicate that the last X bytes in a file should be retrieved. For example, bytes=-500 is a valid Range value for the final 500 bytes in a file.
This Range header option is currently supported by S3, as verified through the AWS S3 CLI.
Currently, GetObjectRequest includes setRange(long start) and setRange(long start, long end), which support Range values like bytes=100- and bytes=100-200. However, there is no way to provide a Range value in GetObjectRequest which results in "bytes=-100", despite the fact that this is a valid value which is already supported by S3.
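To make the three Range forms discussed here concrete, the sketch below builds each header value as a string. The class `ByteRanges` and its method names are hypothetical and are not part of the AWS SDK; the sketch only illustrates the header syntax, including the suffix form that GetObjectRequest cannot currently produce.

```java
/** Hypothetical helper, not part of the AWS SDK: formats HTTP Range header values. */
final class ByteRanges {
    private ByteRanges() {}

    /** From start to the end of the object, e.g. "bytes=100-". */
    static String from(long start) {
        if (start < 0) {
            throw new IllegalArgumentException("start must be >= 0");
        }
        return "bytes=" + start + "-";
    }

    /** From start to end, both inclusive, e.g. "bytes=100-200". */
    static String between(long start, long end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("require 0 <= start <= end");
        }
        return "bytes=" + start + "-" + end;
    }

    /** The final count bytes of the object, e.g. "bytes=-500". */
    static String last(long count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be > 0");
        }
        return "bytes=-" + count;
    }

    public static void main(String[] args) {
        System.out.println(from(100));         // form produced by setRange(long start)
        System.out.println(between(100, 200)); // form produced by setRange(long start, long end)
        System.out.println(last(500));         // suffix form requested in this issue
    }
}
```

Until the SDK supports the suffix form directly, a value like the one from last(500) could only be sent through the custom header workaround discussed in the comments.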