Scrapy, a fast high-level web crawling & scraping framework for Python.





Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

For more information, including a list of features, check the Scrapy homepage at:


Requirements

  • Python 2.7 or Python 3.3+
  • Works on Linux, Windows, Mac OS X, BSD


Install

The quick way:

pip install scrapy

For more details see the install section in the documentation:


Releases

You can download the latest stable and development releases from:


Documentation

Documentation is available online and in the docs directory.

Community (blog, twitter, mail list, IRC)



Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. Please report unacceptable behavior to


Companies using Scrapy


Commercial Support