Python wrappers for the HathiTrust APIs.

hathitrust-api

A simple interface for the HathiTrust APIs. The package contains basic classes and associated methods for querying the Bibliographic API, Data API, and the HTRC Solr Proxy.

The package is compatible with Python 2 and Python 3.

Installation

Clone and install from this repository:

git clone https://github.com/rlmv/hathitrust-api.git
cd hathitrust-api
python setup.py install

Or install directly using pip:

pip install hathitrust-api

DataAPI

The Data API retrieves public domain works from HathiTrust that were not digitized by Google.

An OAuth keyset from HathiTrust is required to use the Data API.

Example usage:

>>> from hathitrust_api import DataAPI
>>> data_api = DataAPI(your_oauth_key, your_oauth_secret)
>>> ocrtext = data_api.getpageocr('nyp.33433082228226', 120)
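
To fetch OCR for several pages of one volume, the single-page call above can be wrapped in a loop. The following is a minimal sketch, not part of the package: `collect_page_ocr` is a hypothetical helper, and the stand-in `fake_getpageocr` only mimics the `(volume_id, sequence_number)` call shape of `getpageocr` shown above so that the example runs without OAuth credentials. The actual return type of `getpageocr` is not assumed here; whatever it returns is stored as-is.

```python
def collect_page_ocr(fetch_page, volume_id, sequence_numbers):
    """Call fetch_page(volume_id, seq) for each page and return {seq: result}.

    Hypothetical helper: fetch_page is any callable with the same call shape
    as DataAPI.getpageocr in the example above.
    """
    pages = {}
    for seq in sequence_numbers:
        pages[seq] = fetch_page(volume_id, seq)
    return pages


# Stand-in for data_api.getpageocr so the sketch runs offline (assumption:
# it is only a placeholder and does not reflect the real response format).
def fake_getpageocr(volume_id, seq):
    return "OCR text for {} page {}".format(volume_id, seq)


pages = collect_page_ocr(fake_getpageocr, 'nyp.33433082228226', [120, 121])
```

With a real client, `data_api.getpageocr` would presumably be passed in place of `fake_getpageocr`.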

BibAPI

The Bibliographic API delivers HathiTrust bibliographic data and MARC records in JSON format.

Example:

>>> from hathitrust_api import BibAPI
>>> bib_api = BibAPI()
>>> bib_info = bib_api.get_single_record_json('htid', 'dul1.ark:/13960/t00z82c1q')
>>> bib_info.keys()
[u'records', u'items']
>>> bib_info['records']['010944133']['publishDates']
[u'1670']
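
The response is a nested dictionary, so fields can be pulled out with ordinary dictionary access. As a rough sketch, the hypothetical helper below (not part of the package) collects `publishDates` for every record. The sample dictionary copies only the keys and values visible in the session above; the empty `items` list and any other fields of a real response are assumptions.

```python
def publish_dates_by_record(bib_info):
    """Map each record ID to its list of publish dates (empty if absent)."""
    records = bib_info.get('records', {})
    return {record_id: record.get('publishDates', [])
            for record_id, record in records.items()}


# Sample shaped like the output shown above; a real response has more fields.
sample_bib_info = {
    'records': {'010944133': {'publishDates': ['1670']}},
    'items': [],  # assumption: actual contents are not shown in this README
}

dates = publish_dates_by_record(sample_bib_info)
```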

SolrAPI

The HTRC Solr Proxy is a search index over the public domain collection.

>>> from hathitrust_api import SolrAPI
>>> solr = SolrAPI()
>>> results = solr.query("new zealand", fields=['title'])
>>> results
{u'responseHeader': {u'status': 0, u'QTime': 19},
 u'response': {u'start': 0,
               u'numFound': 366613,
               u'docs': [{u'title': [u'The statues of New Zealand ...']},
                         {u'title': [u'New Zealand.']},
                         {u'title': [u"Wise's New Zealand index"]},
                         {u'title': [u'Palaeontological bulletin.']},
                         {u'title': [u'New Zealand,']},
                         {u'title': [u'The New Zealand official year-book.']},
                         {u'title': [u'The New Zealand official year-book.']},
                         {u'title': [u'The New Zealand official year-book.']},
                         {u'title': [u'The New Zealand official year-book.']},
                         {u'title': [u'The New Zealand official year-book.']}]}}
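
Each document in `results['response']['docs']` carries its `title` as a list, so a flat list of titles takes one extra step. The sketch below uses a hypothetical helper, `extract_titles`, which is not part of the package, and runs it on a shortened copy of the output shown above rather than on a live query.

```python
def extract_titles(results):
    """Flatten the per-document 'title' lists into one list of strings."""
    titles = []
    for doc in results.get('response', {}).get('docs', []):
        # Documents without a 'title' field contribute nothing.
        titles.extend(doc.get('title', []))
    return titles


# Shortened copy of the response shown above (two of the ten documents).
sample_results = {
    'responseHeader': {'status': 0, 'QTime': 19},
    'response': {
        'start': 0,
        'numFound': 366613,
        'docs': [
            {'title': ['New Zealand.']},
            {'title': ["Wise's New Zealand index"]},
        ],
    },
}

titles = extract_titles(sample_results)
```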

Needed:

  • Write test cases.