Add .dmg, .iso & .apk to ignored other extensions #4066

Merged
merged 6 commits into from Nov 14, 2019

Conversation

akhterwahab
Contributor

@akhterwahab akhterwahab commented Oct 7, 2019

I keep running into downloads of large files while scraping.

Fixes #1837, fixes #2067
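
To illustrate why these extensions matter, here is a minimal sketch of extension-based URL filtering. It is not Scrapy's actual implementation: the names `IGNORED_EXTENSIONS_SKETCH` and `has_ignored_extension` are hypothetical, and the list holds only the three extensions from this PR's title, whereas the real `IGNORED_EXTENSIONS` list in `scrapy/linkextractors/__init__.py` is much longer.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical subset for illustration only; the real list is
# IGNORED_EXTENSIONS in scrapy/linkextractors/__init__.py.
IGNORED_EXTENSIONS_SKETCH = ["dmg", "iso", "apk"]


def has_ignored_extension(url, extensions=IGNORED_EXTENSIONS_SKETCH):
    """Return True if the URL path ends in one of the given extensions."""
    # Look only at the path so query strings and fragments are ignored.
    path = urlparse(url).path
    # splitext returns the extension with its leading dot, e.g. ".iso".
    ext = posixpath.splitext(path)[1].lower()
    return ext.lstrip(".") in extensions


urls = [
    "https://example.com/docs/index.html",
    "https://example.com/files/installer.DMG",
    "https://example.com/images/disk.iso?mirror=1",
    "https://example.com/releases/app.apk",
]

# Keep only links that do not point at large binary downloads.
to_follow = [u for u in urls if not has_ignored_extension(u)]
print(to_follow)
```

In this sketch the comparison is case-insensitive and looks only at the URL path, so `installer.DMG` and `disk.iso?mirror=1` are both skipped.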

@akhterwahab akhterwahab changed the title Add dmg, iso & apk to ignored other extensions Add .dmg, .iso & .apk to ignored other extensions Oct 7, 2019
@codecov

codecov bot commented Oct 7, 2019

Codecov Report

Merging #4066 into master will decrease coverage by 2.28%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4066      +/-   ##
==========================================
- Coverage   85.68%   83.39%   -2.29%     
==========================================
  Files         165      165              
  Lines        9734     9801      +67     
  Branches     1463     1463              
==========================================
- Hits         8341     8174     -167     
- Misses       1136     1364     +228     
- Partials      257      263       +6
Impacted Files                         Coverage Δ
scrapy/linkextractors/__init__.py      96.66% <ø> (-3.34%) ⬇️
scrapy/linkextractors/sgml.py          0% <0%> (-96.81%) ⬇️
scrapy/linkextractors/regex.py         0% <0%> (-95.66%) ⬇️
scrapy/linkextractors/htmlparser.py    0% <0%> (-92.07%) ⬇️
scrapy/extensions/statsmailer.py       0% <0%> (-30.44%) ⬇️
scrapy/_monkeypatches.py               54.54% <0%> (-18.19%) ⬇️
scrapy/link.py                         86.36% <0%> (-13.64%) ⬇️
scrapy/utils/gz.py                     92.1% <0%> (-7.9%) ⬇️
scrapy/utils/reqser.py                 88.23% <0%> (-5.89%) ⬇️
scrapy/utils/python.py                 78.72% <0%> (-5.52%) ⬇️
... and 75 more

@kmike
Member

kmike commented Oct 7, 2019

Looks good to me, thanks @akhterwahab!

@akhterwahab akhterwahab requested a review from kmike October 9, 2019 11:08
@kmike
Member

kmike commented Oct 9, 2019

See also: #2067

@Gallaecio
Member

@akhterwahab Do you think you can add other extensions suggested in #2067? Otherwise, we can merge this as is and include additional extensions later.

@wRAR
Member

wRAR commented Nov 12, 2019

Ping @akhterwahab :)

@akhterwahab
Contributor Author

Hi @Gallaecio, maybe we can merge these and add the other extensions later in another PR.

@Gallaecio
Member

I’ve added the additional extensions myself using the GitHub editor since it was trivial to do.

Successfully merging this pull request may close these issues.

Add more extensions to IGNORED_EXTENSIONS
4 participants