Scrapy, a fast, high-level screen scraping and web crawling framework for Python.


Scrapy


Overview

Scrapy is a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

For more information, including a list of features, see the Scrapy homepage at: http://scrapy.org

Requirements

  • Python 2.6 or later
  • Works on Linux, Windows, Mac OS X, and BSD

Install

The quick way:

pip install scrapy

For more details, see the installation section of the documentation: http://doc.scrapy.org/en/latest/intro/install.html

Releases

You can download the latest stable and development releases from: http://scrapy.org/download/

Documentation

Documentation is available online at http://doc.scrapy.org/ and in the docs directory.

Community (blog, Twitter, mailing list, IRC)

See http://scrapy.org/community/

Contributing

See http://doc.scrapy.org/en/latest/contributing.html

Companies using Scrapy

See http://scrapy.org/companies/

Commercial Support

See http://scrapy.org/support/