[MRG+1] [httpcompression] add support for br - brotli content encoding #2535

Merged
5 commits merged into scrapy:master on Feb 20, 2017

pawelmhm
Contributor

@pawelmhm pawelmhm commented Feb 6, 2017

Brotli encoding is used by a number of websites; Amazon, for example, started using it recently. This PR adds support in Scrapy via the Python library brotlipy: https://github.com/python-hyper/brotlipy/

@redapple
Contributor

redapple commented Feb 6, 2017

Does the library require anything special or additional to install properly?
Maybe importing brotlipy could be tested at init time or runtime, with support for br advertised in the headers only if the import succeeds?

@pawelmhm
Contributor Author

pawelmhm commented Feb 6, 2017

> Does the library require anything special/additional to get installed properly?

I didn't have to install anything extra, but the install seems to be failing in some environments, so most likely some systems are missing required libraries. Testing for support at runtime would be much safer, yes. I'll add that, @redapple.
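The runtime check discussed above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the helper names `brotli_available`, `detect_accepted_encodings` and `accept_encoding_header` are hypothetical, and the only assumption about the library is that brotlipy is importable as `brotli`.

```python
def brotli_available():
    """Return True if the optional brotli module can be imported."""
    try:
        import brotli  # noqa: F401  (provided by the brotlipy package)
    except ImportError:
        return False
    return True


def detect_accepted_encodings():
    """Return the content encodings this process can actually decode."""
    # gzip and deflate are always available through the standard library.
    encodings = [b"gzip", b"deflate"]
    # Advertise br only when the optional dependency is installed.
    if brotli_available():
        encodings.append(b"br")
    return encodings


def accept_encoding_header(encodings):
    """Build an Accept-Encoding header value from the supported encodings."""
    return b", ".join(encodings)
```

With this shape, a missing or broken brotlipy install only narrows the advertised encodings instead of breaking the import of the middleware.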

@codecov-io

codecov-io commented Feb 6, 2017

Codecov Report

Merging #2535 into master will decrease coverage by 0.01%.

@@            Coverage Diff             @@
##           master    #2535      +/-   ##
==========================================
- Coverage   83.49%   83.49%   -0.01%     
==========================================
  Files         161      161              
  Lines        8787     8795       +8     
  Branches     1289     1290       +1     
==========================================
+ Hits         7337     7343       +6     
- Misses       1203     1205       +2     
  Partials      247      247
| Impacted Files | Coverage Δ |
| --- | --- |
| scrapy/downloadermiddlewares/httpcompression.py | 88.88% <77.77%> (-3.01%) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3b8e6d4...fb4ef21.

@redapple
Contributor

redapple commented Feb 6, 2017

@pawelmhm, I do not see brotlipy being installed for the Python 3.x tests on Travis.

@pawelmhm
Contributor Author

pawelmhm commented Feb 6, 2017

@redapple sorry, I missed that. Updated in af802ba.

@@ -1,23 +1,33 @@
import zlib


Member

pep8 says 1 line between import groups :)

Contributor Author

Good point. We could add flake8 style checks for PRs; flake8 could run on the diff and flag things like this.

@@ -6,6 +6,7 @@ pytest==2.9.2
pytest-twisted
pytest-cov==2.2.1
jmespath
brotlipy==0.6
Member

I'd prefer not to pin brotlipy version here: users will be installing it with 'pip install brotlipy', and we won't be able to detect if a brotlipy upgrade broke scrapy if we pin a version in tests.

Contributor Author

Updated in fb4ef21.

@kmike
Member

kmike commented Feb 6, 2017

The PR looks good, apart from a couple of minor points 👍 A nice feature.
Have you compared gzip and brotli for Amazon? How do the download size and CPU usage compare?

@pawelmhm
Contributor Author

pawelmhm commented Feb 7, 2017

@kmike regarding performance, there are good benchmarks and charts here: https://quixdb.github.io/squash-benchmark/#ratio-vs-compression

@kmike
Member

kmike commented Feb 7, 2017

@pawelmhm I was more worried about the speed of the Python wrappers.
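One way to get a rough answer to the wrapper-speed question is to time decompression from Python directly. The sketch below is an illustration only, not a measurement from this thread: `time_decompress` and `compare` are hypothetical helpers, the brotli part runs only if the `brotli` module (as installed by brotlipy) is importable, and no results are implied.

```python
import gzip
import time


def time_decompress(decompress, payload, repeat=20):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        decompress(payload)
        best = min(best, time.perf_counter() - start)
    return best


def compare(body, repeat=20):
    """Map codec name to (compressed size in bytes, best decompression time)."""
    gzipped = gzip.compress(body)
    results = {
        "gzip": (len(gzipped), time_decompress(gzip.decompress, gzipped, repeat)),
    }
    try:
        import brotli  # optional dependency, provided by brotlipy
    except ImportError:
        return results  # brotli not installed: report gzip only
    packed = brotli.compress(body)
    results["br"] = (len(packed), time_decompress(brotli.decompress, packed, repeat))
    return results
```

For a meaningful comparison, `body` should be a saved response body from the site in question rather than synthetic data, since both compression ratio and decompression time depend heavily on the content.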

@kmike kmike changed the title [httpcompression] add support for br - brotli content encoding [MRG+1] [httpcompression] add support for br - brotli content encoding Feb 7, 2017
@redapple redapple added this to the v1.4 milestone Feb 8, 2017
@dangra dangra merged commit 58a18e3 into scrapy:master Feb 20, 2017