A minimal example of using RethinkDB from Python, together with the extraction module, to build a database of crawled HTML pages and the data extracted from them.


There isn't much here beyond syntax examples. To run them, first install RethinkDB (its installation instructions are straightforward), then do this:

git clone
cd rethinkdb-extraction
virtualenv .
. ./bin/activate
pip install -r requirements.txt
python rethinkdb_extraction/

That will crawl a few pages and load them into a local RethinkDB instance.
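
For readers who want a feel for the load step without reading the repository, here is a rough sketch of how a crawled page might be turned into a document and inserted into RethinkDB. It is not the code from this repository: the function names (`build_document`, `store_documents`), the database and table names (`crawl`, `pages`), and the document fields are all hypothetical, and a small standard-library title parser stands in for the extraction module. It also assumes a RethinkDB server on localhost at the default driver port 28015 and a Python driver recent enough to expose the `RethinkDB` class (older drivers use `import rethinkdb as r` instead).

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element.

    A stand-in for the extraction module, used here only so the
    sketch has no third-party dependency for the parsing step.
    """

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded; later ones are ignored.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_document(url, html):
    """Build the dict stored for one crawled page (hypothetical schema)."""
    parser = TitleParser()
    parser.feed(html)
    title = parser.title.strip() if parser.title else None
    return {"url": url, "html": html, "title": title}


def store_documents(documents, db_name="crawl", table_name="pages"):
    """Insert documents into a local RethinkDB instance.

    Assumes a server on localhost:28015; db_name and table_name
    are illustrative defaults, not names used by the repository.
    """
    # Imported here so the parsing code above works without the driver.
    from rethinkdb import RethinkDB

    r = RethinkDB()
    conn = r.connect(host="localhost", port=28015)
    try:
        # Create the database and table on first use.
        if db_name not in r.db_list().run(conn):
            r.db_create(db_name).run(conn)
        if table_name not in r.db(db_name).table_list().run(conn):
            r.db(db_name).table_create(table_name).run(conn)
        return r.db(db_name).table(table_name).insert(documents).run(conn)
    finally:
        conn.close()
```

With a server running, calling `store_documents([build_document(url, html)])` for each fetched page would load the results. Keeping the document-building step separate from the database call makes it possible to check the extracted fields without a running RethinkDB instance.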