[MRG] Use body to choose response type after decompression content #2393
Codecov Report
@@            Coverage Diff            @@
##           master    #2393     +/-   ##
=========================================
- Coverage   84.16%   83.36%   -0.8%
=========================================
  Files         162      161      -1
  Lines        9079     8730    -349
  Branches     1346     1285     -61
=========================================
- Hits         7641     7278    -363
- Misses       1177     1201     +24
+ Partials      261      251     -10
It looks similar to #2001. Scrapy is quite inconsistent here: e.g. the cache and the FTP handler still won't use the body to get the response class. I guess part of the problem is that ResponseTypes.from_body is not based on any specification, so there was no strong desire to use the body consistently; not using the body was not seen as a bug because from_body may look like a hack. The most up-to-date document seems to be https://mimesniff.spec.whatwg.org, especially https://mimesniff.spec.whatwg.org/#identifying-a-resource-with-an-unknown-mime-type. While this change looks fine, the fix is not complete, and it looks like part of a larger problem.
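To make the idea of choosing a type from the body concrete, here is a minimal sketch of byte-pattern sniffing. The function name `sniff_kind`, its labels, and the 1024-byte sample size are assumptions for illustration only; this is neither Scrapy's ResponseTypes.from_body nor the full WHATWG algorithm, which covers many more patterns.

```python
# Hypothetical helper for illustration: not Scrapy's ResponseTypes.from_body
# and not the complete WHATWG MIME sniffing algorithm.

def sniff_kind(body: bytes, sample_size: int = 1024) -> str:
    """Guess a coarse content kind from the first bytes of a body."""
    head = body[:sample_size]
    # Treat a NUL byte in the sample as a sign of binary data.
    if b"\x00" in head:
        return "binary"
    # Ignore leading ASCII whitespace and letter case before matching.
    stripped = head.lstrip().lower()
    if stripped.startswith(b"<?xml"):
        return "xml"
    if stripped.startswith((b"<!doctype html", b"<html")):
        return "html"
    # Nothing matched: fall back to plain text.
    return "text"
```

A rule set like this is easy to apply in one place and reuse everywhere a response is built, which is the consistency concern raised above.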
@kmike, what do you mean by the fix not being complete? The aim here was to fix an issue in HTTP decompression where it determines something other than the default
@redapple the patch looks good because it improves response handling in the decompression middleware, so I'm fine with merging it after a rebase. The logic the middleware uses to detect the response type is still different from what browsers do, and Scrapy is inconsistent in the MIME sniffing it performs. I should have opened another ticket for that, but I found it while reviewing this PR, so I wrote it in a comment :)
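The ordering this PR is about (decompress first, then look at the body) can be sketched with the standard library alone. The function `choose_kind`, the plain `dict` of headers, and the "html"/"unknown" labels are hypothetical and are not the middleware's real code; only gzip is handled here.

```python
import gzip

# Hypothetical sketch of "decompress, then inspect the body"; this is not
# the code of Scrapy's decompression middleware.

def choose_kind(headers: dict, body: bytes) -> tuple:
    """Return (kind, plain_body), sniffing only the decompressed bytes."""
    encoding = headers.get("Content-Encoding", "").lower()
    if encoding == "gzip":
        body = gzip.decompress(body)
    # Compressed bytes would never match a markup prefix, so the check
    # has to run on the decompressed body.
    start = body.lstrip().lower()
    is_html = start.startswith((b"<!doctype html", b"<html"))
    return ("html" if is_html else "unknown"), body
```

For example, a gzip-compressed HTML page passed with `{"Content-Encoding": "gzip"}` is reported as "html", while the same compressed bytes without that header stay "unknown".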
Maybe #2145 is so rare we don't need to care. (I have never seen it myself.)
626de8b to e42b846
Fixes #2145