[MRG] Use body to choose response type after decompression content #2393
Codecov Report
@@            Coverage Diff            @@
##           master    #2393    +/-    ##
=========================================
- Coverage   84.16%   83.36%   -0.8%
=========================================
  Files         162      161      -1
  Lines        9079     8730    -349
  Branches     1346     1285     -61
=========================================
- Hits         7641     7278    -363
- Misses       1177     1201     +24
+ Partials      261      251     -10
This looks similar to #2001. Scrapy is quite inconsistent here: for example, the cache and the FTP handler still don't use the body to pick the response class. I guess part of the problem is that ResponseTypes.from_body is not based on any specification, so there was no strong desire to use the body consistently; not using it was not seen as a bug because from_body can look like a hack. The most up-to-date document seems to be https://mimesniff.spec.whatwg.org, especially https://mimesniff.spec.whatwg.org/#identifying-a-resource-with-an-unknown-mime-type. While this change looks fine, the fix is not complete, and it looks like part of a larger problem.
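For readers unfamiliar with body sniffing, here is a minimal sketch of the idea, loosely modeled on a few of the patterns in the WHATWG document linked above. It is not Scrapy's ResponseTypes.from_body and not a complete implementation of the spec: the function name `sniff_body`, the 512-byte window, and the reduced pattern list are all assumptions made for illustration.

```python
# Hypothetical helper for illustration only; not part of Scrapy.
# Whitespace bytes skipped before pattern matching (tab, LF, FF, CR, space).
_LEADING_WS = b"\t\n\x0c\r "

# A small, simplified subset of HTML-looking prefixes (compared lowercased).
_HTML_PREFIXES = (b"<!doctype html", b"<html", b"<head", b"<body", b"<script", b"<!--")

# Control bytes treated as a sign of binary data: 0x00-0x08, 0x0B,
# 0x0E-0x1A and 0x1C-0x1F.
_BINARY_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def sniff_body(body: bytes, sniff_len: int = 512) -> str:
    """Guess a MIME type from the first `sniff_len` bytes of `body`."""
    chunk = body[:sniff_len]
    head = chunk.lstrip(_LEADING_WS).lower()
    if head.startswith(_HTML_PREFIXES):
        return "text/html"
    if head.startswith(b"<?xml"):
        return "text/xml"
    if any(byte in _BINARY_BYTES for byte in chunk):
        return "application/octet-stream"
    return "text/plain"
```

Under these assumptions, `sniff_body(b"  <!DOCTYPE html>...")` would be classified as HTML regardless of what the headers claim, which is the kind of consistent, spec-driven behaviour the comment above is asking for.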
@kmike, what do you mean by the fix not being complete? The aim here was to fix an issue in HTTP decompression where it determines something other than the default
@redapple the patch looks good because it improves response handling in the decompression middleware, so I'm fine with merging it after a rebase. The logic the middleware uses to detect the response type is still different from what browsers do, and Scrapy is inconsistent in the MIME sniffing it performs. I should have opened another ticket for that, but I found it while reviewing this PR, so I wrote it in a comment :)
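As a rough illustration of the behaviour discussed in this thread (choosing the response type from the body once it has been decompressed), the sketch below uses only the standard library. It is not the code from this PR or from Scrapy's middleware: `decode_and_classify`, the plain-dict headers, and the very small HTML check are hypothetical and only show the order of operations.

```python
import gzip


def _looks_like_html(body: bytes) -> bool:
    # Deliberately tiny check, assumed for this example only.
    head = body[:512].lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


def decode_and_classify(raw_body: bytes, headers: dict) -> tuple:
    """Return (decoded_body, mime_type) for a possibly gzip-encoded response.

    `headers` is assumed to be a plain dict with string values.
    """
    body = raw_body
    if headers.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(raw_body)

    # Look at the decoded body first; fall back to the declared type.
    if _looks_like_html(body):
        return body, "text/html"
    declared = headers.get("Content-Type", "application/octet-stream")
    return body, declared.split(";")[0].strip()
```

The point of the ordering is that the type decision happens after decompression, so a gzip-encoded HTML page is classified by its decoded bytes rather than by whatever type the headers declare for the compressed payload.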
Maybe #2145 is so rare we don't need to care. (I have never seen it myself.)
Fixes #2145