Scrapy, a fast high-level web crawling & scraping framework for Python.
Library to populate items using XPath and CSS with a convenient API
A CLI for benchmarking Scrapy.
Common interface for data container classes
The scrapy.org website
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Python library of web-related functions
[Archived] Library to populate Scrapy items using XPath and CSS with a convenient API
CSS Selectors for Python
A pure-Python robots.txt parser with support for modern conventions.
A service daemon to run Scrapy spiders
Collection of persistent (disk-based) queues
Command line client for Scrapyd server
A pure-Python HTML screen-scraping library
This is a sample Scrapy project for educational purposes
Fill HTML login forms automatically
A crawler for http://books.toscrape.com
Performance-focused replacement for Python urllib
url component from the Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url
base component from the Chromium source code, forked from https://chromium.googlesource.com/chromium/src/base/
Scrapy project to scrape public web directories (educational) [DEPRECATED]
Codespeed for scrapy-bench
A fork of http://pydispatcher.sourceforge.net/ with PyPy support
GSoC 2014: Scrapy integration tests project