Crawlera middleware for Scrapy
Scrapy schema validation pipeline and Item builder using JSON Schema
Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs
Scrapy extension to control spiders using JSON-RPC
A Scrapy extension to store request and response information in a storage service
Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls
A Scrapy pipeline to categorize items using MonkeyLearn
A Scrapy extension to sync the `.scrapy` folder to an S3 bucket
Scrapy middleware to add extra fields to items, such as a timestamp, response fields, and spider attributes
Scrapy spider middleware to split an item into multiple items using a multi-valued key
Scrapy spider middleware to clean up query parameters in request URLs
Scrapy extension to write scraped items using Django models
Scrapy pipeline for writing items to BigML datasets
Scrapy support for working with streamcorpus Stream Items
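
As an illustration of one idea from the list above (splitting an item into multiple items using a multi-valued key), here is a minimal sketch. It is not the code of the listed plugin: the names `split_item`, `SplitItemSketchMiddleware`, and `split_key` are hypothetical and chosen for this example only. The only Scrapy convention it relies on is the spider middleware hook `process_spider_output(response, result, spider)`, and it assumes items are plain dicts.

```python
from copy import deepcopy
from typing import Any, Dict, Iterator


def split_item(item: Dict[str, Any], key: str) -> Iterator[Dict[str, Any]]:
    """Yield one copy of `item` per value stored under `key` (hypothetical helper)."""
    values = item.get(key)
    # Items where the key is missing or holds a single value pass through unchanged.
    if not isinstance(values, (list, tuple)):
        yield item
        return
    # Note: an empty list yields nothing, so such an item is dropped.
    for value in values:
        new_item = deepcopy(item)
        new_item[key] = value
        yield new_item


class SplitItemSketchMiddleware:
    """Hypothetical spider middleware; not the actual class of any listed plugin."""

    split_key = "variants"  # assumed field name, for illustration only

    def process_spider_output(self, response, result, spider):
        for obj in result:
            if isinstance(obj, dict):
                # Dict items are expanded into one item per value.
                yield from split_item(obj, self.split_key)
            else:
                # Requests and other objects are passed along untouched.
                yield obj
```

The helper can be tried without Scrapy installed, for example `list(split_item({"name": "shirt", "size": ["S", "M"]}, "size"))` produces two items, one per size.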