[MRG+1] [httpcompression] add support for br - brotli content encoding #2535

Merged
dangra merged 5 commits into scrapy:master from pawelmhm:brotli on Feb 20, 2017


@pawelmhm
Contributor

@pawelmhm pawelmhm commented Feb 6, 2017

Brotli encoding is used by multiple websites; for example, Amazon started to use it recently. This PR adds support to Scrapy via the Python library brotlipy: https://github.com/python-hyper/brotlipy/

@pawelmhm pawelmhm added the enhancement label Feb 6, 2017
@redapple
Contributor

@redapple redapple commented Feb 6, 2017

Does the library require anything special or additional to install properly?
Maybe the brotlipy import could be tested at init time or runtime, and support for br advertised in the headers only if it succeeds?

@pawelmhm
Contributor Author

@pawelmhm pawelmhm commented Feb 6, 2017

> Does the library require anything special/additional to get installed properly?

I didn't have to install anything, but some environments are failing to install it, so most likely some systems are missing required libraries. Testing for support at runtime may be much safer, yes. I'll add that, @redapple.
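The runtime check discussed above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: `ACCEPTED_ENCODINGS`, `accept_encoding_header` and `decode_body` are illustrative names, and it assumes the optional package exposes a `brotli` module with a `decompress` function.

```python
import zlib

# Encodings that work with the standard library alone.
ACCEPTED_ENCODINGS = [b"gzip", b"deflate"]

try:
    # Optional dependency (assumed to be installed via brotlipy).
    import brotli
    ACCEPTED_ENCODINGS.append(b"br")
except ImportError:
    # Without the module, "br" is never advertised to servers.
    brotli = None


def accept_encoding_header():
    """Build the Accept-Encoding value from the encodings we can decode."""
    return b", ".join(ACCEPTED_ENCODINGS)


def decode_body(body, encoding):
    """Decode a response body; unknown encodings pass through unchanged."""
    if encoding == b"gzip":
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS)
    if encoding == b"deflate":
        try:
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send raw deflate data without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    if encoding == b"br" and brotli is not None:
        return brotli.decompress(body)
    return body
```

Because the probe runs once at import time, a server only ever sees `br` in the request headers when the response can actually be decoded.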

@codecov-io

@codecov-io codecov-io commented Feb 6, 2017

Codecov Report

Merging #2535 into master will decrease coverage by 0.01%.

@@            Coverage Diff             @@
##           master    #2535      +/-   ##
==========================================
- Coverage   83.49%   83.49%   -0.01%     
==========================================
  Files         161      161              
  Lines        8787     8795       +8     
  Branches     1289     1290       +1     
==========================================
+ Hits         7337     7343       +6     
- Misses       1203     1205       +2     
  Partials      247      247
Impacted Files Coverage Δ
scrapy/downloadermiddlewares/httpcompression.py 88.88% <77.77%> (-3.01%)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b8e6d4...fb4ef21. Read the comment docs.

@pawelmhm pawelmhm force-pushed the pawelmhm:brotli branch from 4fddd39 to 974d3f0 Feb 6, 2017
@pawelmhm pawelmhm force-pushed the pawelmhm:brotli branch from 974d3f0 to 3daf473 Feb 6, 2017
@redapple
Contributor

@redapple redapple commented Feb 6, 2017

@pawelmhm, I do not see brotlipy being installed for the Python 3.x tests on Travis.

@pawelmhm
Contributor Author

@pawelmhm pawelmhm commented Feb 6, 2017

@redapple sorry, I missed that. Updated in af802ba.

@@ -1,23 +1,33 @@
import zlib



@kmike

kmike Feb 6, 2017
Member

PEP 8 says one blank line between import groups :)


@pawelmhm

pawelmhm Feb 7, 2017
Author Contributor

Good point. We could add flake8 style checks for PRs: flake8 could run on the diff and point out things like this.

@@ -6,6 +6,7 @@ pytest==2.9.2
pytest-twisted
pytest-cov==2.2.1
jmespath
brotlipy==0.6


@kmike

kmike Feb 6, 2017
Member

I'd prefer not to pin the brotlipy version here: users will install it with 'pip install brotlipy', and if we pin a version in tests we won't be able to detect when a brotlipy upgrade breaks Scrapy.


@pawelmhm

pawelmhm Feb 7, 2017
Author Contributor

Updated in fb4ef21.

@kmike
Member

@kmike kmike commented Feb 6, 2017

The PR looks good, apart from a couple of minor points 👍 A nice feature.
Have you compared gzip and brotli for Amazon? How do the download size and CPU usage compare?

@pawelmhm
Contributor Author

@pawelmhm pawelmhm commented Feb 7, 2017

@kmike regarding performance, there are good benchmarks and charts here: https://quixdb.github.io/squash-benchmark/#ratio-vs-compression

@kmike
Member

@kmike kmike commented Feb 7, 2017

@pawelmhm I was more worried about the speed of the Python wrappers.
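One way to check the wrapper-speed concern locally is a small timing script like the one below. It is only a sketch under stated assumptions: `compare`, `gzip_compress` and `gzip_decompress` are hypothetical helper names, the payload is whatever page body you choose to feed it, and the `br` entry appears only if a `brotli` module with `compress` and `decompress` functions is importable. No measurements from this thread are implied.

```python
import timeit
import zlib

try:
    import brotli  # optional; assumed to come from brotlipy
except ImportError:
    brotli = None


def gzip_compress(data):
    # wbits = 16 + MAX_WBITS makes zlib emit a gzip container.
    compressor = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()


def gzip_decompress(data):
    return zlib.decompress(data, 16 + zlib.MAX_WBITS)


def compare(payload, repeat=20):
    """Return compressed size and mean decompression time per codec."""
    codecs = {"gzip": (gzip_compress, gzip_decompress)}
    if brotli is not None:
        codecs["br"] = (brotli.compress, brotli.decompress)

    results = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(payload)
        # Sanity check: the round trip must reproduce the input.
        assert decompress(packed) == payload
        total = timeit.timeit(lambda: decompress(packed), number=repeat)
        results[name] = {
            "compressed_bytes": len(packed),
            "decompress_seconds": total / repeat,
        }
    return results
```

Only decompression is timed, because that is the work a crawler does for each response; compression happens on the server.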

@kmike kmike changed the title [httpcompression] add support for br - brotli content encoding [MRG+1] [httpcompression] add support for br - brotli content encoding Feb 7, 2017
@redapple redapple added this to the v1.4 milestone Feb 8, 2017
@dangra dangra merged commit 58a18e3 into scrapy:master Feb 20, 2017
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
5 participants