GZip decoding regression #206
@schlamar Looks like this is related to d00d305, could you look into this please? @maxcountryman If you'd like to participate, a urllib3 test showing this failure (and making sure it doesn't regress in the future) would be helpful. :) |
@shazow Photobucket's API will yield the error. I'm assuming any GZipped response. Perhaps httpbin.org could assist with this? |
@maxcountryman For urllib3, we write tests that can be run without an internet connection. Take a look at some of the tests here for examples: https://github.com/shazow/urllib3/tree/master/test/with_dummyserver We already have several GZip-related tests which are passing. |
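For anyone picking this up, here is a minimal sketch of what an offline GZip test can look like. It uses only the Python standard library, not urllib3's dummyserver harness, and `decode_gzip_chunks` and `GzipRoundTripTest` are hypothetical names chosen for illustration. The compressed body is built in memory, so no network access is needed.

```python
import gzip
import unittest
import zlib


def decode_gzip_chunks(chunks):
    """Decode an iterable of gzip-compressed byte chunks (hypothetical helper)."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    out = b"".join(decoder.decompress(chunk) for chunk in chunks)
    return out + decoder.flush()


class GzipRoundTripTest(unittest.TestCase):
    def test_body_split_across_chunks(self):
        body = b"hello gzip " * 50
        compressed = gzip.compress(body)
        # Feed the compressed body in small pieces, as a socket read might.
        chunks = [compressed[i:i + 7] for i in range(0, len(compressed), 7)]
        self.assertEqual(decode_gzip_chunks(chunks), body)
```

Saved as a test module, this can be run with `python -m unittest`. A real regression test for this issue would additionally need a response body that reproduces the failure.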
@shazow probably don't have time for that, sorry. |
Nps. If anyone else wants to do this, help is welcome. :) |
@maxcountryman gzip on httpbin works fine for me with urllib3 latest master:
And requests 1.2.3:
Can you give me an exact URL which is failing? Preferably one that doesn't require registering for an API key. |
Nope: you'll need to register for Photobucket and set up an OAuth application in order to access the API. This is a regression from requests 1.2.0 as previously stated. |
Here's the response from curl, which is the best I can do for you. This exact URL yields the above error. To reproduce you may need to set up an application with Photobucket or model the response.
The same URL raises the reported exception with Requests 1.2.3; this is a regression from Requests 1.2.0 which does not raise the error against the URL.
>>> r = requests.get('http://api.photobucket.com/album/zentropa?<snip>')
>>> r.status_code
200
>>> requests.__version__
'1.2.0'
>>> r = requests.get('http://api.photobucket.com/album/zentropa?<snip>')
Traceback (most recent call last):
...
requests.packages.urllib3.exceptions.DecodeError: Received response with content-encoding: gzip, but failed to decode it.
>>> requests.__version__
'1.2.3' |
@maxcountryman Can you reproduce the issue on Edit: removed stuff about proxy. Only the redirect is HTTP/1.0. |
There is no proxy: photobucket redirects to silos. It's on the same machine and I don't have time to test the ping endpoint although I doubt it would reproduce it if it doesn't redirect. Now that's about all the help I can provide. I might suggest rolling back the commit that broke this in the meantime if the problem isn't obvious. Bear in mind that the browser and curl handle this same endpoint without issue so this is clearly a problem for any consumer expecting urllib3 to behave in a sane way. HTH. |
@t-8ch Could it be something to do with switching from using @maxcountryman I'd rather not start rolling back commits from previous releases until we fully understand the issue. If it seems like this is photobucket-specific and not a widespread issue, we could wait until someone else who has more time to investigate runs into the bug. It's important to remember that urllib3 grows thanks to community contributions; we don't have any corporate sponsors paying for developer time to satisfy specific consumers (though this can be arranged if your employer is interested in sponsoring some specific development). :) |
@shazow I am not sure, but I don't think so. @maxcountryman You could also remove the It would also be great, if you could use |
Looks like this issue stalled. Feel free to reopen if there are new developments. :) |
Moving the conversation here from https://github.com/Runscope/requests-runscope/issues/6#issuecomment-34945852. @maxcountryman wrote:
@maxcountryman It's not just a matter of "just revert this and everything is dandy."
@schlamar do you recall why we stopped using the |
Applied against master, this clearly illustrates the problem:
diff --git a/test/test_response.py b/test/test_response.py
index ecfcbee..2f95d40 100644
--- a/test/test_response.py
+++ b/test/test_response.py
@@ -8,20 +8,17 @@ from urllib3.exceptions import DecodeError
from base64 import b64decode
-# A known random (i.e, not-too-compressible) payload generated with:
-# "".join(random.choice(string.printable) for i in xrange(512))
-# .encode("zlib").encode("base64")
-# Randomness in tests == bad, and fixing a seed may not be sufficient.
+# zlib-failing payload
ZLIB_PAYLOAD = b64decode(b"""\
-eJwFweuaoQAAANDfineQhiKLUiaiCzvuTEmNNlJGiL5QhnGpZ99z8luQfe1AHoMioB+QSWHQu/L+
-lzd7W5CipqYmeVTBjdgSATdg4l4Z2zhikbuF+EKn69Q0DTpdmNJz8S33odfJoVEexw/l2SS9nFdi
-pis7KOwXzfSqarSo9uJYgbDGrs1VNnQpT9f8zAorhYCEZronZQF9DuDFfNK3Hecc+WHLnZLQptwk
-nufw8S9I43sEwxsT71BiqedHo0QeIrFE01F/4atVFXuJs2yxIOak3bvtXjUKAA6OKnQJ/nNvDGKZ
-Khe5TF36JbnKVjdcL1EUNpwrWVfQpFYJ/WWm2b74qNeSZeQv5/xBhRdOmKTJFYgO96PwrHBlsnLn
-a3l0LwJsloWpMbzByU5WLbRE6X5INFqjQOtIwYz5BAlhkn+kVqJvWM5vBlfrwP42ifonM5yF4ciJ
-auHVks62997mNGOsM7WXNG3P98dBHPo2NhbTvHleL0BI5dus2JY81MUOnK3SGWLH8HeWPa1t5KcW
-S5moAj5HexY/g/F8TctpxwsvyZp38dXeLDjSQvEQIkF7XR3YXbeZgKk3V34KGCPOAeeuQDIgyVhV
-nP4HF2uWHA==""")
+eyJzdGF0dXMiOiJFeGNlcHRpb24iLCJtZXNzYWdlIjoiUmVkaXJlY3QgcmVxdWlyZWQgdG8gYWNjZ
+XNzIHVzZXIgZGF0YSIsImNvZGUiOjMwMSwiY29udGVudCI6eyJzdWJkb21haW4iOiJodHRwOlwvXC
+9hcGkxMjYwLnBob3RvYnVja2V0LmNvbSIsInVybCI6Imh0dHA6XC9cL2FwaTEyNjAucGhvdG9idWN
+rZXQuY29tXC9hbGJ1bVwvemVudHJvcGE/b2F1dGhfbm9uY2U9MDM1ZTU1MWZmOTk1MDIyZTMwM2E5
+MGVmNWE5YWYxNDI5NDQwNzcyMSZmb3JtYXQ9anNvbiZvYXV0aF9jb25zdW1lcl9rZXk9MTQ5ODMyN
+DAwJm9hdXRoX3RpbWVzdGFtcD0xMzkyMjY0MjkwJm9hdXRoX3NpZ25hdHVyZV9tZXRob2Q9SE1BQy
+1TSEExJm9hdXRoX3ZlcnNpb249MS4wJm9hdXRoX3Rva2VuPTM0LjIyNDc1ODlfMTM5MjI2NDI4OSZ
+vYXV0aF9zaWduYXR1cmU9VGE2ZmRrUDljYlV6OWhLc05sczJJTko2NktFJTNEIn0sImZvcm1hdCI6
+Impzb24iLCJtZXRob2QiOiJHRVQiLCJ0aW1lc3RhbXAiOjEzOTIyNjQyOTB9""")
class TestLegacyResponse(unittest.TestCase): |
Looking at that, it seems like it's trying to decode non-compressed data, which leads me to believe that previously urllib3 was not considering this request's data as compressed and in later versions (e.g. now) it does. In fact, in Requests 1.2.0, this data exits the |
Can you expand on what you mean by that? urllib3 decides whether the data is compressed by simply looking at the Also I get the same error by replacing the zlib payload with any kind of garbage (e.g. |
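To make the point concrete, here is a minimal standalone sketch of header-driven decoding, using only the standard library. `decode_by_header` is a hypothetical helper written for illustration, not urllib3's actual code. The sample body is the first 28 base64 characters of the payload in the diff above, which decode to the start of a plain JSON document rather than compressed data.

```python
import zlib
from base64 import b64decode

# First 28 characters of the base64 payload from the diff above.
plain_body = b64decode(b"eyJzdGF0dXMiOiJFeGNlcHRpb24i")


def decode_by_header(headers, body):
    """Decode `body` based only on the Content-Encoding header (hypothetical helper)."""
    encoding = headers.get("content-encoding", "").lower()
    if encoding == "gzip":
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS)
    if encoding == "deflate":
        return zlib.decompress(body)
    return body


try:
    decode_by_header({"content-encoding": "gzip"}, plain_body)
except zlib.error as exc:
    # Plain JSON labelled as gzip cannot be decoded.
    print("decode failed:", exc)
```

Under these assumptions, any body that is not valid gzip fails the same way once the header claims gzip, which matches the observation that arbitrary garbage triggers the same error.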
At that time requests did the decompression on its own. Can someone spot any difference? |
@schlamar yes, the difference is that urllib3 falls out of the |
@maxcountryman Yes, that's obvious. However, requests at that time called the decompression method linked above. This method had a fallback to return the uncompressed data if decompression failed but this was flawed: https://github.com/kennethreitz/requests/issues/1249. So I assume that photobucket does return non-gzipped data while sending a Maybe you could provide a raw HTTP response (including headers) from photobucket API, that would be really helpful I guess. |
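For illustration, a fallback of the kind described could look roughly like the sketch below. This is an assumption about the general shape only, not the code requests actually shipped, and `decode_with_fallback` is a hypothetical name.

```python
import gzip
import zlib


def decode_with_fallback(body):
    """Try gzip decoding; return the body unchanged if it is not valid gzip."""
    try:
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS)
    except zlib.error:
        # Lenient path: hand back the raw bytes instead of raising.
        return body


# A correctly compressed body is decoded; a mislabelled plain body passes through.
decoded = decode_with_fallback(gzip.compress(b'{"ok": true}'))
passthrough = decode_with_fallback(b'{"ok": true}')
```

The trade-off is that a lenient decoder like this also hides truncated or corrupt gzip data, because the caller receives the raw bytes with no error.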
I have just seen the curl example above. Interestingly, curl doesn't get gzip as content-encoding. Maybe there is some notable difference in the headers sent on the request which might explain this behavior. |
Curl does not send an Alternatively, you can try to use a requests session and remove |
@maxcountryman Please check out https://github.com/kennethreitz/requests/pull/1944, this should resolve this issue. |
See this issue on Requests: https://github.com/kennethreitz/requests/issues/1472
This is the truncated traceback:
cc @schlamar