An HTTP interface to the Project Gutenberg corpus.

Gutenberg-HTTP

Overview

This project is an HTTP wrapper for the Python Gutenberg API. It lets you search for books, retrieve book metadata, and fetch the text of books through a small set of HTTP endpoints.

The API is implemented using the Sanic web framework and served from a Docker container. You can run the project locally using:

docker-compose up

This serves the API at http://localhost:8000. The first start-up takes a while because the Gutenberg metadata cache has to be populated.
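
Because the first start-up can be slow, a script that depends on the service may want to wait until it answers. The following is a minimal sketch, not part of this project: the helper name `wait_for_service`, its default timings, and the injectable `probe` argument are all hypothetical. It only checks that the server responds to HTTP requests, which is assumed (not confirmed by this README) to mean the cache is ready.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url, timeout=600.0, interval=5.0, probe=None):
    """Poll `url` until it answers or `timeout` seconds pass; return True on success."""
    if probe is None:
        def probe(target):
            try:
                urllib.request.urlopen(target, timeout=10).close()
                return True
            except urllib.error.HTTPError:
                # An HTTP error status still means the server is listening.
                return True
            except OSError:
                # Connection refused, DNS failure, timeout, and similar.
                return False

    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller could run `wait_for_service("http://localhost:8000/texts/2701")` after `docker-compose up` and proceed only when it returns `True`.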

Endpoints

Fetch all metadata for a book

# fetch all metadata for a book-id
curl 'http://localhost:8000/texts/2701'
{
  "metadata": {
    "title": ["Moby Dick; Or, The Whale"],
    "rights": ["Public domain in the USA."],
    "author": ["Melville, Herman"],
    "subject": [
      "Mentally ill -- Fiction",
      "Whaling -- Fiction",
      "Ship captains -- Fiction",
      "Sea stories",
      "Whaling ships -- Fiction",
      "Psychological fiction",
      "Ahab, Captain (Fictitious character) -- Fiction",
      "PS",
      "Whales -- Fiction",
      "Adventure stories"
    ],
    "language": ["en"]
  },
  "text_id": 2701
}
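
Every metadata field in the response above is a list, even when it holds a single value. The sketch below shows one way a client might unpack such a response; `first_value` is a hypothetical helper, and the sample is a trimmed copy of the JSON shown above rather than a live response.

```python
import json

# Trimmed copy of the response shown above; a real client would read it from the HTTP body.
SAMPLE_RESPONSE = """
{
  "metadata": {
    "title": ["Moby Dick; Or, The Whale"],
    "author": ["Melville, Herman"],
    "language": ["en"]
  },
  "text_id": 2701
}
"""


def first_value(response, field, default=None):
    """Return the first entry of a metadata field, or `default` if it is missing or empty."""
    values = response.get("metadata", {}).get(field) or []
    return values[0] if values else default


record = json.loads(SAMPLE_RESPONSE)
print(record["text_id"], first_value(record, "title"), first_value(record, "author"))
```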

Fetch specific metadata for a book

# fetch specific metadata for a book-id
curl 'http://localhost:8000/texts/2701?include=title,author'
{
  "metadata": {
    "author": ["Melville, Herman"],
    "title": ["Moby Dick; Or, The Whale"]
  },
  "text_id": 2701
}
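
The `include` parameter takes a comma-separated list of field names. A client could assemble such a URL as in this sketch; `metadata_url` and `BASE_URL` are illustrative names, and the base URL assumes the local docker-compose setup described earlier.

```python
BASE_URL = "http://localhost:8000"  # assumption: the local docker-compose default


def metadata_url(text_id, include=None, base_url=BASE_URL):
    """Build the metadata URL for a book, optionally restricted to some fields."""
    url = f"{base_url}/texts/{int(text_id)}"
    if include:
        # Field names are joined with commas, as in the curl example above.
        url += "?include=" + ",".join(include)
    return url


print(metadata_url(2701, ["title", "author"]))
```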

Fetch the text of a book

# fetch the text for a book-id
curl 'http://localhost:8000/texts/2701/body'

Simple search for books

# simple single-predicate query with field expansion
curl 'http://localhost:8000/search/title eq Moby Dick?include=author,rights,language'
{
  "texts": [
    {
      "author": ["Melville, Herman"],
      "language": ["en"],
      "text_id": 9147,
      "rights": ["Copyrighted. Read the copyright notice inside this book for details."]
    },
    {
      "author": ["Melville, Herman"],
      "language": ["en"],
      "text_id": 15,
      "rights": ["Public domain in the USA."]
    }
  ]
}
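
The query sits in the URL path and contains spaces, which many HTTP clients will not send unencoded. The sketch below percent-encodes the query before building the URL. `search_url` is a hypothetical helper, and it assumes the server decodes percent-encoded path segments, which this README does not state.

```python
from urllib.parse import quote


def search_url(query, include=None, base_url="http://localhost:8000"):
    """Build a search URL with the query percent-encoded into the path."""
    # safe="" also encodes "/" so the query stays a single path segment.
    url = f"{base_url}/search/{quote(query, safe='')}"
    if include:
        url += "?include=" + ",".join(include)
    return url


print(search_url("title eq Moby Dick", ["author", "rights", "language"]))
```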

Conjunctive query for books

# conjunctive query
curl 'http://localhost:8000/search/author eq "Melville, Herman" and rights eq "Public domain in the USA." and title eq "Moby Dick"'
{"texts": [{"text_id": 15}]}
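
A conjunctive query joins several `field eq "value"` predicates with `and`. The sketch below composes one from a list of pairs. `conjunction` is a hypothetical helper. Only the `eq` operator appears in this README, so the sketch supports nothing else, and because no escaping rule for embedded double quotes is documented, it rejects such values instead of guessing.

```python
from urllib.parse import quote


def conjunction(predicates):
    """Join (field, value) pairs into 'field eq "value" and ...'."""
    parts = []
    for field, value in predicates:
        if '"' in value:
            # How to escape an embedded double quote is not documented, so refuse.
            raise ValueError(f"cannot quote value: {value!r}")
        parts.append(f'{field} eq "{value}"')
    return " and ".join(parts)


query = conjunction([
    ("author", "Melville, Herman"),
    ("rights", "Public domain in the USA."),
    ("title", "Moby Dick"),
])
# Assumption: the server accepts the query percent-encoded in the path.
url = "http://localhost:8000/search/" + quote(query, safe="")
print(url)
```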