edsu/dewey-crawler
dewey-crawler is a simplistic, single-threaded, possibly daft crawler for the Dewey Decimal Classification Summaries published at http://dewey.info by the folks at OCLC. The idea is to pull down the summaries so that the resources are easier to reference in your linked data application. More information about the data at http://dewey.info can be found at:

    http://www.worldcat.org/devnet/wiki/DeweyInfoTechOverview

After a crawl you'll have an rdflib Berkeley DB triple store on disk. You can then run dump.py to generate dewey.rdf, dewey.ttl and dewey.json.

Usage:

    ./crawl.py
    ./dump.py

Dependencies:

    rdflib 3.0

License:

This code is in the Public Domain. http://creativecommons.org/licenses/publicdomain/
The data is governed by OCLC's use of the Attribution-Noncommercial-No Derivative Works 3.0 Unported license: http://creativecommons.org/licenses/by-nc-nd/3.0/

Improvements welcome at: http://github.com/edsu/dewey-crawler

Comments, Questions, Complaints:

    Ed Summers <ehs@pobox.com>
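For readers wondering what a "simplistic, single-threaded" crawl amounts to, here is a minimal sketch of the general idea. It is not the code in crawl.py. The function names, the `fetch_links` callable, the `limit` parameter and the example URLs are all made up for illustration, and the network is replaced by an in-memory link map so the sketch runs on its own.

```python
from collections import deque


def crawl(start_url, fetch_links, limit=1000):
    """Visit URLs breadth-first, one at a time, and return them in visit order.

    fetch_links is a hypothetical callable: given a URL, it returns the URLs
    linked from that resource. A real crawler would fetch and parse RDF here.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < limit:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            # Stay on the site being crawled and skip anything already queued.
            if link.startswith("http://dewey.info/") and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Stand-in for network access: a tiny in-memory link map (illustrative URLs).
FAKE_LINKS = {
    "http://dewey.info/class/0/": [
        "http://dewey.info/class/00/",
        "http://example.org/elsewhere",
    ],
    "http://dewey.info/class/00/": [
        "http://dewey.info/class/0/",
        "http://dewey.info/class/000/",
    ],
}


def fake_fetch(url):
    """Return the links recorded for url, or an empty list if none are known."""
    return FAKE_LINKS.get(url, [])


if __name__ == "__main__":
    for url in crawl("http://dewey.info/class/0/", fake_fetch):
        print(url)
```

One queue and one set of seen URLs are enough because nothing runs concurrently. The `limit` argument is only a safety cap for the sketch.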