The DPLA ingestion system

The DPLA Ingestion System

Build Status


Please see the release notes regarding changes and upgrade steps.

Setting up the ingestion server:

Install Python 2.7 if not already installed;

Install PIP if not already installed;

Install the ingestion subsystem;

$ pip install --no-deps --ignore-installed -r requirements.txt

Configure an akara.ini file appropriately for your environment;

Port=<port for Akara to run on>
; Recommended LogLevel is one of DEBUG or INFO

ApiKey=<your Bing Maps API key>

Url=<URL to CouchDB instance, including trailing forward-slash>
Username=<CouchDB username>
Password=<CouchDB password>
SyncQAViews=<True or False; consider False on production>
; Recommended LogLevel is INFO for production; defaults to INFO if not set

Username=<Geonames username>
Token=<Geonames token>

Username=<Rackspace username>
ApiKey=<Rackspace API key>
DPLAContainer=<Rackspace container for bulk download data>
SitemapContainer=<Rackspace container for sitemap files>

NYPL=<Your NYPL API token>

SitemapURI=<Sitemap URI>
SitemapPath=<Path to local directory for sitemap files>

To=<Comma-separated email addresses to receive alert email>
From=<Email address to send alert email>
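Before merging the file, it can help to check that the settings you rely on are actually filled in. The following is a minimal sketch, not part of the ingestion system: it assumes the file is organized into INI sections, and the section names `Akara` and `CouchDb`, the `REQUIRED` table, and the `missing_settings` helper are all hypothetical names chosen for illustration. It is written for Python 3 as a standalone check, separate from the Python 2.7 environment the server itself uses.

```python
import configparser

# Hypothetical section and key layout, for illustration only.
# Adjust to match the sections in your own akara.ini.
REQUIRED = {
    "Akara": ["Port"],
    "CouchDb": ["Url", "Username", "Password"],
}

# Sample values are made up; 8889 matches the port used in the curl examples.
SAMPLE = """
[Akara]
Port=8889
; Recommended LogLevel is one of DEBUG or INFO

[CouchDb]
Url=http://localhost:5984/
Username=admin
Password=
"""


def missing_settings(ini_text, required=REQUIRED):
    """Return a sorted list of 'Section.Key' names that are absent or empty."""
    # interpolation=None so that '%' in passwords is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written
    parser.read_string(ini_text)
    missing = []
    for section, keys in required.items():
        for key in keys:
            value = parser.get(section, key, fallback="").strip()
            if not value:
                missing.append("{}.{}".format(section, key))
    return sorted(missing)
```

Calling `missing_settings(SAMPLE)` reports the empty `Password` entry. To check a real file, read it with `open("akara.ini").read()` and pass the text in.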


Merge the akara.conf.template and akara.ini file to create the akara.conf file;

$ python install 

Set up and start the Akara server;

$ akara -f akara.conf setup
$ akara -f akara.conf start

Build the database views;

$ python scripts/ dpla
$ python scripts/ dashboard
$ python scripts/ bulk_download
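Once Akara has been started, one quick way to confirm that it is accepting connections is to attempt a TCP connection to the configured port. This is a minimal sketch under stated assumptions: `port_is_open` is a hypothetical helper, not part of Akara or the ingestion system, and it only tells you that something is listening on the port, not that the service is healthy. It is written for Python 3.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("localhost", 8889)` checks the port used by the curl examples in the testing section; substitute whatever value you set for `Port` in akara.ini.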

Testing the ingestion server:

You can test it with this set description from Clemson;

$ curl "http://localhost:8889/oai.listrecords.json?endpoint=" 

If you have the endpoint URL but not a set id, there's a separate service for listing the sets;

$ curl "http://localhost:8889/oai.listsets.json?endpoint="
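If you script these checks, the request URL can be assembled programmatically. The sketch below is illustrative only: `service_url` is a hypothetical helper, the endpoint `http://example.org/oai` is a placeholder rather than a real provider, and percent-encoding the `endpoint` value is a general URL-handling choice, not something the service is known to require. Only the service names, the `endpoint` parameter, and the `localhost:8889` base come from the commands above. It is written for Python 3.

```python
from urllib.parse import urlencode


def service_url(service, endpoint, base="http://localhost:8889"):
    """Build a URL such as <base>/oai.listsets.json?endpoint=<encoded endpoint>."""
    # urlencode percent-encodes ':' and '/' in the endpoint value.
    query = urlencode({"endpoint": endpoint})
    return "{}/{}?{}".format(base, service, query)


if __name__ == "__main__":
    # Placeholder endpoint; replace with your provider's OAI endpoint.
    print(service_url("oai.listsets.json", "http://example.org/oai"))
```

The resulting string can be passed to curl in place of the hand-written URL.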

To run the ingest process, run the install step and initialize the database and database views if you have not already done so, then feed the ingest script a source profile (found in the profiles directory);

$ python install
$ python scripts/ dpla
$ python scripts/ dashboard
$ python scripts/ profiles/clemson.pjs


This application is released under an AGPLv3 license.

  • Copyright Digital Public Library of America, 2012 -- 2017