Scrapy, a fast high-level screen scraping and web crawling framework for Python.

artwork added artwork files properly now March 20, 2012
bin remove scrapyd, it was migrated to its own repository February 06, 2013
debian Added "six>=1.5.2" to requirements January 15, 2014
docs DOC use top-level shortcuts in docs April 15, 2014
extras remove references to deprecated scrapy-developers list February 16, 2014
scrapy Merge top-level-shortcuts April 23, 2014
sep sep 14 for #629 March 07, 2014
.coveragerc Added rules to Makefile.buildbot for generating coverage reports December 15, 2010
.gitignore Added request_fingerprint method to dupefilter classes so they could … January 15, 2014
.travis-workarounds.sh try to restore pypy tests March 28, 2014
.travis.yml New tox env: docs April 09, 2014
AUTHORS added Nicolas Ramirez to AUTHORS March 14, 2013
CONTRIBUTING.md renamed CONTRIBUTING to CONTRIBUTING.md so that links are rendered as… September 19, 2012
INSTALL fix link to online installation instructions October 02, 2012
LICENSE mv scrapy/trunk to root as part of svn2hg migration May 06, 2009
MANIFEST.in get scrapy version from package data February 06, 2013
Makefile.buildbot Fix permission and set umask before generating sdist tarball September 03, 2013
NEWS added NEWS file pointing to docs/news.rst April 28, 2012
README.rst Drop Python 2.6 support October 29, 2013
pytest.ini TST fix tests that became broken after adding top-level imports and s… April 15, 2014
requirements.txt test_command_deploy, test_contrib_linkextractors January 11, 2014
setup.cfg remove no longer existent examples from doc_files used in bdist_rpm. … October 08, 2013
setup.py Added "six>=1.5.2" to requirements January 15, 2014
tests-requirements.txt Run testsuite with py.test April 03, 2014
tox.ini New tox env: docs April 09, 2014
README.rst

Scrapy

Overview

Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

For more information, including a list of features, see the Scrapy homepage: http://scrapy.org

Requirements

  • Python 2.7
  • Works on Linux, Windows, Mac OS X, and BSD

Install

The quick way:

pip install scrapy

For more details see the install section in the documentation: http://doc.scrapy.org/en/latest/intro/install.html
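
To confirm that the install worked, one option is to ask the interpreter which Scrapy version it can see. The following is a minimal sketch, not part of Scrapy itself. It assumes a Python 3.8+ interpreter (newer than the Python 2.7 listed under Requirements) so that the standard-library importlib.metadata module is available, and the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version("Scrapy")
    if version is None:
        print("Scrapy is not installed; try: pip install scrapy")
    else:
        print("Scrapy %s is installed" % version)
```

If the script reports that Scrapy is missing, check that pip and the interpreter you ran belong to the same environment.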

Releases

You can download the latest stable and development releases from: http://scrapy.org/download/

Documentation

Documentation is available online at http://doc.scrapy.org/ and in the docs directory.

Community (blog, Twitter, mailing list, IRC)

See http://scrapy.org/community/

Contributing

See http://doc.scrapy.org/en/latest/contributing.html

Companies using Scrapy

See http://scrapy.org/companies/

Commercial Support

See http://scrapy.org/support/
