Techcrunch Incremental Scrapy Spider With MongoDB
tc_scraper
.gitignore
README.md
requirements.txt
scrapy.cfg

This project accompanies the blog post "Incremental crawler with Scrapy and MongoDB".

Local setup

pip3 install -r requirements.txt

Set up a local MongoDB server. On Mac OS X:

brew install mongodb
brew services start mongodb
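
Before running the spider, it can help to confirm that the local server is accepting connections. The following is a minimal sketch, assuming MongoDB listens on its default port 27017 on localhost; the helper name `mongo_is_reachable` is hypothetical and not part of this repository. It only checks that the TCP port is open, not that the server is healthy.

```python
import socket


def mongo_is_reachable(host: str = "localhost", port: int = 27017, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Assumes the default MongoDB port (27017); adjust if your server differs.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("MongoDB reachable:", mongo_is_reachable())
```

If this prints `False`, the service is probably not running yet.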

Run

scrapy crawl techcrunch -a limit_pages=2
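
The `-a limit_pages=2` option is a Scrapy spider argument, and Scrapy hands such arguments to the spider as strings. The sketch below illustrates, under stated assumptions, how a page limit and an incremental check could work together: only a bounded number of listing pages is visited, and URLs already stored by an earlier run are skipped. The names `parse_limit_pages`, `crawl_incrementally`, and `fetch_page` are hypothetical, and the `seen_urls` set stands in for a lookup against the MongoDB collection; the actual spider in `tc_scraper` may be organised differently.

```python
from typing import Callable, Iterable, List, Set


def parse_limit_pages(raw: str) -> int:
    """Convert the string value of a `-a limit_pages=...` argument to an int."""
    limit = int(raw)
    if limit < 1:
        raise ValueError("limit_pages must be a positive integer")
    return limit


def crawl_incrementally(
    fetch_page: Callable[[int], Iterable[str]],
    seen_urls: Set[str],
    limit_pages: int,
) -> List[str]:
    """Return article URLs not seen before, visiting at most `limit_pages` pages.

    `seen_urls` stands in for a lookup against the MongoDB collection
    (an assumption for illustration; the real project may differ).
    """
    new_urls: List[str] = []
    for page_number in range(1, limit_pages + 1):
        for url in fetch_page(page_number):
            if url in seen_urls:
                continue  # already stored by a previous run
            seen_urls.add(url)
            new_urls.append(url)
    return new_urls


if __name__ == "__main__":
    # Fake listing pages keyed by page number, each listing article URLs.
    pages = {1: ["/a", "/b"], 2: ["/b", "/c"], 3: ["/d"]}
    stored = {"/a"}  # pretend "/a" was saved by an earlier crawl
    limit = parse_limit_pages("2")
    print(crawl_incrementally(lambda n: pages.get(n, []), stored, limit))
```

With the fake data above, page 3 is never visited because the limit is 2, and `/a` is skipped because it was already stored.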