New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-42522: Implement exists() and size() for S3 presigned URLs #77
Conversation
Presigned S3 URLs are signed for a single method only, so you can't call HEAD on a URL signed for GET. Instead, we now emulate HEAD using a GET with a 0-byte Range header.
Like exists() in the previous commit, size() was relying on a HEAD call that isn't valid for presigned S3 URLs.
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #77 +/- ##
==========================================
+ Coverage 86.20% 86.45% +0.24%
==========================================
Files 27 27
Lines 4126 4193 +67
Branches 842 848 +6
==========================================
+ Hits 3557 3625 +68
+ Misses 449 447 -2
- Partials 120 121 +1 ☔ View full report in Codecov by Sentry. |
There are a few lines with missing coverage which were already missing coverage in the existing code. I manually verified that the uncovered lines still work. |
It turns out that HttpResourceHandle was also doing some parsing of the content-range header, so pull out a function that will work for both this and HttpResourcePath.size().
7fb06ee
to
06c7646
Compare
if total != "*" and end_pos >= int(total) - 1: | ||
self._eof = True | ||
else: | ||
self._log.warning("Requested byte range from server but instead got: %s", content_range) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a slight behavior change here. Previously if the units were anything but bytes
it would only log a warning. Now it throws an exception. This should never happen in real life, and if it does the returned data is almost certainly wrong so we want to bail anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great.
prefix = r"^\s*bytes\s+" | ||
|
||
# Content-Range: <unit> <range-start>-<range-end>/<size> | ||
if (case1 := re.match(prefix + r"(\d+)-(\d+)/(\d+)", header)) is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully three separate regexes won't turn into a hotspot.
Checklist
doc/changes