[MRG+1] Is_gzipped for application/x-gzip;charset=utf-8 #2050
Current coverage is 83.34%
@@ -54,4 +54,4 @@ def gunzip(data):
 def is_gzipped(response):
     """Return True if the response is gzipped, or False otherwise"""
     ctype = response.headers.get('Content-Type', b'')
-    return ctype in (b'application/x-gzip', b'application/gzip')
+    return b'application/x-gzip' in ctype or b'application/gzip' in ctype
robsonpeixoto
Jun 12, 2016
Just a suggestion:
from six.moves import map # just if you would like a lazy solution
mimes = (b'application/x-gzip', b'application/gzip')
return any(map(lambda mime: mime in ctype, mimes))
Tethik
Jun 12, 2016
Author
Contributor
I did it as an any-statement first, but figured it would be overkill since there are only two comparisons. Another version:
any(mime in ctype for mime in (b'application/x-gzip', b'application/gzip'))
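For illustration, here is a minimal sketch of how the original exact-membership check and the substring check above behave on a header that carries a charset parameter. The helper names `exact_match` and `substring_match` are hypothetical and operate on the raw header bytes rather than on a Scrapy response.

```python
# Hypothetical helpers for comparison; they take the raw Content-Type bytes.
MIMES = (b'application/x-gzip', b'application/gzip')


def exact_match(ctype):
    # Original behaviour: the header must equal one of the MIME types exactly.
    return ctype in MIMES


def substring_match(ctype):
    # Proposed behaviour: any of the MIME types may appear inside the header.
    return any(mime in ctype for mime in MIMES)


header = b'application/x-gzip;charset=utf-8'
print(exact_match(header))       # False: the parameter breaks exact equality
print(substring_match(header))   # True

# The substring check also accepts values that merely start with a gzip type.
print(substring_match(b'application/gzipppp'))  # True
```

The last call shows the trade-off of a plain substring test: it fixes the charset case but does not reject look-alike types.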
    def test_is_gzipped(self):
        hdrs = Headers({"Content-Type": "application/x-gzip"})
        r1 = Response("http://www.example.com", headers=hdrs)
        self.assertTrue(is_gzipped(r1))
robsonpeixoto
Jun 12, 2016
I don't know the Scrapy code style, but IMO this should be 4 different tests.
According to RFC 7231, section-3.1.1.1,
so it's worth fixing that too. Suggestion:
@redapple Alright, I'll look into it. Good catch on the gzipppp
Done
 def is_gzipped(response):
     """Return True if the response is gzipped, or False otherwise"""
     ctype = response.headers.get('Content-Type', b'')
-    return ctype in (b'application/x-gzip', b'application/gzip')
+    return not _is_gzipped_re.search(ctype) is None
redapple
Jun 14, 2016
Contributor
I usually prefer return _is_gzipped_re.search(ctype) is not None
Tethik
Jun 14, 2016
Author
Contributor
Zzzz... fine ;)
redapple
Jun 14, 2016
Contributor
Sorry for nitpicking, but I first read (quickly) "not is_gzipped".
Tethik
Jun 14, 2016
Author
Contributor
Nahh it's fine. I agree is not looks better. I'm just lazy :)
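As a rough sketch of the regex-based check discussed in this thread: the definition of `_is_gzipped_re` is not shown in the diff, so the pattern below is an assumption, and `is_gzipped_ctype` is a hypothetical helper that takes the header bytes directly instead of a response.

```python
import re

# Assumed pattern (the PR's actual definition is not shown in this diff):
# anchor at the start, allow the optional "x-" prefix, and require a word
# boundary after "gzip" so that parameters are accepted but "gzipppp" is not.
_is_gzipped_re = re.compile(br'^application/(x-)?gzip\b', re.I)


def is_gzipped_ctype(ctype):
    """Return True if the Content-Type bytes denote a gzip media type."""
    # "is not None" reads more clearly than "not ... is None".
    return _is_gzipped_re.search(ctype) is not None


print(is_gzipped_ctype(b'application/x-gzip;charset=utf-8'))  # True
print(is_gzipped_ctype(b'application/gzipppp'))               # False
```

Under this assumed pattern the match is also case-insensitive, which fits the case-insensitivity of media type names described in the RFC section cited earlier.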
Looks good to me.
@@ -27,3 +28,39 @@ def test_gunzip_truncated_short(self):
         with open(join(SAMPLEDIR, 'truncated-crc-error-short.gz'), 'rb') as f:
             text = gunzip(f.read())
         assert text.endswith(b'</html>')
+
+    def test_is_gzipped_right(self):
robsonpeixoto
Jun 14, 2016
IMO this test should be split into two:
def test_is_gzipped_right(self):
    hdrs = Headers({"Content-Type": "application/gzip"})
    r1 = Response("http://www.example.com", headers=hdrs)
    self.assertTrue(is_gzipped(r1))
def test_is_x_gzipped_right(self):
    hdrs = Headers({"Content-Type": "application/x-gzip"})
    r1 = Response("http://www.example.com", headers=hdrs)
    self.assertTrue(is_gzipped(r1))
[backport][1.1] Is_gzipped for application/x-gzip;charset=utf-8 (PR #2050)
Fix for #2049