An HTTP interface to the Project Gutenberg corpus.

This project is an HTTP wrapper for the Python Gutenberg API. As such, it lets you search for books, retrieve information about books and get the text of books via a set of easy-to-use HTTP endpoints.

The API is implemented with the Sanic web framework and served from a Docker container. To run the project locally:

docker-compose up

This serves the API at http://localhost:8000. The first start takes a while because the Gutenberg metadata cache has to be populated.
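
Because the first start can be slow, a client script may want to wait until the service answers before sending real requests. The following is a minimal Python sketch of such a wait loop. The names `is_up` and `wait_until_ready`, the probe URL, and the retry settings are illustrative assumptions and not part of this project.

```python
import time
from urllib.request import urlopen

# Assumption: probe one of the endpoints documented in this README.
PROBE_URL = "http://localhost:8000/texts/2701"


def is_up(url, timeout=2.0):
    """Return True if the URL answers with a successful HTTP response."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except OSError:
        # URLError, HTTPError and connection errors all derive from OSError.
        return False


def wait_until_ready(url=PROBE_URL, attempts=60, delay=5.0, probe=is_up):
    """Poll the service until the probe succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_ready()` after `docker-compose up` blocks until a request succeeds and returns False if the service never answers within the configured attempts.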


Fetch all metadata for a book

# fetch all metadata for a book-id
curl 'http://localhost:8000/texts/2701'
{
  "metadata": {
    "title": ["Moby Dick; Or, The Whale"],
    "rights": ["Public domain in the USA."],
    "author": ["Melville, Herman"],
    "subject": [
      "Mentally ill -- Fiction",
      "Whaling -- Fiction",
      "Ship captains -- Fiction",
      "Sea stories",
      "Whaling ships -- Fiction",
      "Psychological fiction",
      "Ahab, Captain (Fictitious character) -- Fiction",
      "Whales -- Fiction",
      "Adventure stories"
    ],
    "language": ["en"]
  },
  "text_id": 2701
}

Fetch specific metadata for a book

# fetch specific metadata for a book-id
curl 'http://localhost:8000/texts/2701?include=title,author'
{
  "metadata": {
    "author": ["Melville, Herman"],
    "title": ["Moby Dick; Or, The Whale"]
  },
  "text_id": 2701
}

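The same metadata requests can be made from a script. The sketch below uses only the Python standard library. The helper names `metadata_url` and `fetch_metadata` are hypothetical, and the base URL assumes the local docker-compose setup described above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the service runs locally as described above.
BASE_URL = "http://localhost:8000"


def metadata_url(text_id, include=None, base_url=BASE_URL):
    """Build the /texts/<id> URL, optionally limited to some metadata fields."""
    url = f"{base_url}/texts/{int(text_id)}"
    if include:
        # The include parameter is a comma-separated list of field names.
        url += "?" + urlencode({"include": ",".join(include)}, safe=",")
    return url


def fetch_metadata(text_id, include=None, base_url=BASE_URL):
    """Request the metadata and decode the JSON response."""
    with urlopen(metadata_url(text_id, include, base_url)) as response:
        return json.load(response)
```

With the service running, `fetch_metadata(2701, include=["title", "author"])` should return a dictionary shaped like the response shown above.
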
Fetch the text of a book

# fetch the text for a book-id
curl 'http://localhost:8000/texts/2701/body'
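
Book texts can be large, so a script may prefer to stream the response to disk. The sketch below is one way to do that in Python. It makes no assumption about the response format and saves the raw bytes as received. The helper names `body_url` and `save_body` are hypothetical.

```python
import shutil
from urllib.request import urlopen

# Assumption: the service runs locally as described above.
BASE_URL = "http://localhost:8000"


def body_url(text_id, base_url=BASE_URL):
    """Build the /texts/<id>/body URL for a book id."""
    return f"{base_url}/texts/{int(text_id)}/body"


def save_body(text_id, destination, base_url=BASE_URL):
    """Stream the raw response for a book's text into a local file."""
    with urlopen(body_url(text_id, base_url)) as response:
        with open(destination, "wb") as out:
            # Copy in chunks so the whole text is never held in memory.
            shutil.copyfileobj(response, out)
```

For example, `save_body(2701, "moby_dick.out")` would write whatever the endpoint returns for book 2701 to `moby_dick.out`.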

Simple search for books

# simple single-predicate query with field expansion
curl 'http://localhost:8000/search/title eq Moby Dick?include=author,rights,language'
{
  "texts": [
    {
      "author": ["Melville, Herman"],
      "language": ["en"],
      "text_id": 9147,
      "rights": ["Copyrighted. Read the copyright notice inside this book for details."]
    },
    {
      "author": ["Melville, Herman"],
      "language": ["en"],
      "text_id": 15,
      "rights": ["Public domain in the USA."]
    }
  ]
}

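Search queries contain spaces and sometimes quotes, which some HTTP clients reject or mangle when they appear raw in a URL path. The Python sketch below percent-encodes the query before placing it in the path. It assumes the service decodes percent-encoded paths, as HTTP servers normally do. The helper name `search_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumption: the service runs locally as described above.
BASE_URL = "http://localhost:8000"


def search_url(query, include=None, base_url=BASE_URL):
    """Build a /search/<query> URL with the query percent-encoded."""
    # safe="" also encodes "/" so the query stays a single path segment.
    url = f"{base_url}/search/{quote(query, safe='')}"
    if include:
        # Field expansion uses the same comma-separated include parameter.
        url += "?" + urlencode({"include": ",".join(include)}, safe=",")
    return url
```

For the query above, `search_url("title eq Moby Dick", ["author", "rights", "language"])` yields a URL in which every space is encoded as `%20`.
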
Conjunctive query for books

# conjunctive query
curl 'http://localhost:8000/search/author eq "Melville, Herman" and rights eq "Public domain in the USA." and title eq "Moby Dick"'
{"texts": [{"text_id": 15}]}
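
Conjunctive queries like the one above can be assembled from field and value pairs. The following Python sketch builds the query string and extracts the ids from a search response. It uses only the `eq` operator and the `and` connective shown in this README. The helper names are hypothetical, and values containing double quotes are rejected because the escaping rules are not documented here.

```python
def conjunctive_query(predicates):
    """Join field/value pairs into 'field eq "value" and ...' form."""
    parts = []
    for field, value in predicates.items():
        if '"' in value:
            # Assumption: no known way to escape quotes inside a value.
            raise ValueError(f"cannot quote value for {field!r}: {value!r}")
        parts.append(f'{field} eq "{value}"')
    return " and ".join(parts)


def text_ids(search_response):
    """Collect the text_id of every hit in a decoded search response."""
    return [text["text_id"] for text in search_response["texts"]]
```

The resulting string is the part of the URL that follows `/search/`. Applied to the response above, `text_ids({"texts": [{"text_id": 15}]})` returns `[15]`.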