Make fetching work when package is in Amazon S3 #2843
When a package is hosted on Amazon S3 (sometimes the case when using Gemfury as a private repository), there might be a redirect from the Gemfury link to the S3 link. The S3 link will be different for HEAD and GET requests. For this reason, we need to use the original Gemfury link when passing arguments to the range reader.
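A minimal sketch of the failure mode described above, using a toy simulation rather than real Gemfury or S3 calls. Everything here is hypothetical: the hosts `index.example` and `storage.example`, the signing key, and the helper names are made up for illustration. The only assumption carried over from the description is that the redirect target is signed per HTTP method, so a URL obtained from a HEAD request is not valid for a later GET.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only
INDEX = "https://index.example/"      # stands in for the private index
STORAGE = "https://storage.example/"  # stands in for the object store


def sign(method: str, path: str) -> str:
    """Toy signature that covers the HTTP method as well as the path."""
    digest = hmac.new(SECRET, f"{method}:{path}".encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def index_redirect(method: str, path: str) -> str:
    """The index redirects to a storage URL signed for the given method."""
    return f"{STORAGE}{path}?sig={sign(method, path)}"


def storage_accepts(method: str, url: str) -> bool:
    """The storage host accepts a request only if the signature matches the method."""
    path, _, sig = url[len(STORAGE):].partition("?sig=")
    return hmac.compare_digest(sig, sign(method, path))


def range_get(url: str) -> bool:
    """Simulate a ranged GET; index URLs are redirected first, storage URLs are used as-is."""
    if url.startswith(INDEX):
        url = index_redirect("GET", url[len(INDEX):])
    return storage_accepts("GET", url)


original_url = INDEX + "pkg-1.0-py3-none-any.whl"
# URL seen in the response to the initial HEAD request (after the redirect).
head_response_url = index_redirect("HEAD", "pkg-1.0-py3-none-any.whl")

reuse_response_url = range_get(head_response_url)  # signed for HEAD, rejected for GET
reuse_request_url = range_get(original_url)        # redirected again, signed for GET

print(reuse_response_url, reuse_request_url)
```

In this toy model, reusing the HEAD response URL for range requests fails, while going back through the original URL succeeds because each GET is redirected to a URL signed for GET.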
I'll have to think about this some more; we've very explicitly used the response URL over the request URL in the past. I worry this could break other users' workflows without further consideration. cc @baszalmstra perhaps your team would find this an interesting async_http_range_reader edge case. Do you know how
Oh, but they definitely don't use the response URL from the
My read is that we should consider passing the URL explicitly to
Thanks for the ping. I think there is a case for both options. I guess if a server reports a different redirect URL based on the HTTP method, this could be problematic. I would be happy to accept a PR in async_http_range_reader, whatever you decide.
@charliermarsh @zanieb Just in case you didn't see it, I wrote about what led me to this PR in the comments of #2025. To sum up: I don't think this is the one-and-only right way to fix this; it's merely a proof of concept that seemed to get me past the problem 😊 Having said that, I thought the API of that range reader was a little bit strange. To me, the part where you give it the headers of the response so that it can determine whether the server supports range requests seems perfectly natural. But it's less clear to me that the URL used to make those range requests needs to be the URL in the response (and not the URL in the original request). I guess I'm somewhat biased, since I've made this PR that hacks around the fact that this URL can't be overridden 😊
Yeah, I’m fairly confident we should change it to reuse the original URL. We just need to verify that it won’t regress a few other cases where we explicitly need to use a response URL. (For example, if a registry returns relative paths, those need to be relative to the response URL in the event of a redirect. But that’s a different case than the range requests we’re doing here.)
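To illustrate the relative-path case mentioned above, here is a small sketch using Python's standard `urllib.parse.urljoin`. The hosts, paths, and the `href` value are hypothetical; the point is only that the same relative link resolves to different locations depending on which base URL is used.

```python
from urllib.parse import urljoin

# Hypothetical URLs: the index page was requested at one host and
# served, after a redirect, from another.
request_url = "https://index.example/simple/pkg/"
response_url = "https://mirror.example/pypi/simple/pkg/"

# A relative link as it might appear on the served index page.
href = "../../packages/pkg-1.0-py3-none-any.whl"

# Resolved against the response URL (where the page actually came from).
resolved_against_response = urljoin(response_url, href)

# Resolved against the request URL, which points somewhere else entirely.
resolved_against_request = urljoin(request_url, href)

print(resolved_against_response)
print(resolved_against_request)
```

Relative links only make sense relative to the page that contained them, which is why this case needs the response URL even if range requests go back to the original one.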
Started the upstream changes at prefix-dev/async_http_range_reader#11
Superseded by #3460.
Fixes #2025
Test Plan
Just tested locally where I had a repro of the issue.