Confoo.ca Spider

Supporting my talk at Confoo "Extracting data from the Internet with Scrapy".

Installation

Run pip install -r requirements.txt inside your virtualenv.

Usage

To run the spider: scrapy crawl confoo. The scraped data is stored in confs.db, which you can inspect with sqlite3 confs.db. A Jupyter notebook under notebooks/ shows some examples of drilling into the data.
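
If you would rather peek at the database from Python than from the sqlite3 shell, something like the following might do as a first look. It is only a sketch: the schema of confs.db is not described here, so it makes no assumption about table or column names and simply lists whatever tables it finds along with their row counts. The function name summarize_tables is made up for this example and is not part of the project.

```python
import sqlite3


def summarize_tables(db_path="confs.db"):
    """Return {table_name: row_count} for every user table in the database.

    Hypothetical helper for illustration; it reads table names from
    sqlite_master instead of assuming any particular schema.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Skip SQLite's internal bookkeeping tables (names starting with "sqlite").
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        # Table names come from the database itself, so quoting them is enough.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__":
    # Run from the repository root, after the spider has populated confs.db.
    for table, count in summarize_tables().items():
        print(f"{table}: {count} rows")
```

Note that sqlite3.connect creates an empty file if the path does not exist, so an empty result most likely means the script ran from the wrong directory or the spider has not been run yet.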