simplistic crawler and serializer for linked data at dewey.info

README
dewey-crawler is a simplistic, single-threaded, possibly daft crawler for
Dewey Decimal Classification Summaries at http://dewey.info from the folks
at OCLC. The idea is to be able to pull down the summaries to make it 
easier to reference the resources in your linked data application.
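
As a rough illustration of what a single-threaded crawl looks like, here is a
minimal sketch. It is not the code in crawl.py: the crawl() function, the
fetch_links callable, the max_pages limit and the example.org URLs are all
hypothetical names made up for this example, and the link lookup is a toy
in-memory table rather than real requests to dewey.info.

```python
from collections import deque


def crawl(start_url, fetch_links, max_pages=1000):
    """Visit pages breadth-first, one at a time, and return URLs in visit order.

    fetch_links is a hypothetical callable: given a URL, it returns the URLs
    that page links to. A real crawler would fetch and parse the page here.
    """
    seen = {start_url}          # URLs already queued, so none is visited twice
    queue = deque([start_url])  # URLs waiting to be visited
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Toy link table standing in for the network (assumption, not real data).
TOY_LINKS = {
    "http://example.org/a": ["http://example.org/b", "http://example.org/c"],
    "http://example.org/b": ["http://example.org/a", "http://example.org/c"],
    "http://example.org/c": [],
}


def toy_fetch_links(url):
    """Look up outgoing links in the toy table instead of doing an HTTP request."""
    return TOY_LINKS.get(url, [])


if __name__ == "__main__":
    for page in crawl("http://example.org/a", toy_fetch_links):
        print(page)
```

Keeping the link lookup as a separate callable means the queue logic can be
tried out without touching the network.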

More information about the data at http://dewey.info can be found at:

  http://www.worldcat.org/devnet/wiki/DeweyInfoTechOverview

After a crawl you'll have an rdflib BerkeleyDB triple store on disk. You can
then run dump.py to generate dewey.rdf, dewey.ttl and dewey.json.

Usage:
./crawl.py
./dump.py

Dependencies:
rdflib 3.0

License:

This code is in the Public Domain.
http://creativecommons.org/licenses/publicdomain/ 

The data is governed by OCLC's use of the Attribution-Noncommercial-No
Derivative Works 3.0 Unported license:
http://creativecommons.org/licenses/by-nc-nd/3.0/

Improvements welcome at:
http://github.com/edsu/dewey-crawler

Comments, Questions, Complaints:
Ed Summers <ehs@pobox.com>