This repository has been archived by the owner. It is now read-only.



The DPLA Ingestion System

Build Status



Please see the release notes regarding changes and upgrade steps.

Setting up the ingestion server:

Install Python 2.7 if not already installed;

Install PIP;

Install the ingestion subsystem;

$ pip install --no-deps --ignore-installed -r requirements.txt

Configure an akara.ini file appropriately for your environment;

Port=<port for Akara to run on>
; Recommended LogLevel is one of DEBUG or INFO

Url=<URL to CouchDB instance, including trailing forward-slash>
Username=<CouchDB username>
Password=<CouchDB password>
SyncQAViews=<True or False; consider False on production>
; Recommended LogLevel is INFO for production; defaults to INFO if not set

BaseUrl=<URL to Twofishes server, required for geo-enrichments>

Username=<Rackspace username>
ApiKey=<Rackspace API key>
DPLAContainer=<Rackspace container for bulk download data>
SitemapContainer=<Rackspace container for sitemap files>

NYPL=<Your NYPL API token>

SitemapURI=<Sitemap URI>
SitemapPath=<Path to local directory for sitemap files>

To=<Comma-separated email addresses to receive alert email>
From=<Email address to send alert email>
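Before merging, it can help to check that the finished akara.ini parses and that the CouchDB URL ends with a forward slash, as the listing requires. The following is a minimal sketch, not part of the ingestion system. It assumes Python 3's standard configparser module (the ingestion system itself targets Python 2.7), and the section names `Akara` and `CouchDb`, the sample values, and the helper names are all hypothetical, since the listing above does not show section headers.

```python
import configparser

# Hypothetical sample file: the section names and values below are
# placeholders chosen for illustration, not taken from the project.
SAMPLE_INI = """
[Akara]
Port=8889
; Recommended LogLevel is one of DEBUG or INFO

[CouchDb]
Url=http://localhost:5984/
Username=admin
Password=secret
SyncQAViews=False
"""


def load_settings(text):
    """Parse INI text into a {section: {key: value}} dictionary."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case (Port, Url, ...) as written
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def has_trailing_slash(settings, section="CouchDb", key="Url"):
    """Return True if the configured URL ends with a forward slash."""
    return settings[section][key].endswith("/")


settings = load_settings(SAMPLE_INI)
if not has_trailing_slash(settings):
    print("CouchDB Url should include a trailing forward slash")
```

Lines beginning with `;` are treated as comments by configparser, so the LogLevel hints in the listing do not need to be removed first.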


Merge the akara.conf.template and akara.ini file to create the akara.conf file;

$ python install 

Set up and start the Akara server;

$ akara -f akara.conf setup
$ akara -f akara.conf start

Build the database views;

$ python scripts/ dpla
$ python scripts/ dashboard
$ python scripts/ bulk_download

Testing the ingestion server:

You can test it with this set description from Clemson;

$ curl "http://localhost:8889/oai.listrecords.json?endpoint=" 

If you have the endpoint URL but not a set id, there's a separate service for listing the sets;

$ curl "http://localhost:8889/oai.listsets.json?endpoint="
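Because the endpoint value is itself a URL, it is safer to percent-encode it when building these requests from a script. The sketch below shows one way to do that with Python 3's standard library. The helper name `service_url` and the endpoint `http://example.org/oai` are hypothetical placeholders, and the base address simply mirrors the `localhost:8889` used in the curl commands above.

```python
from urllib.parse import urlencode


def service_url(service, endpoint, base="http://localhost:8889"):
    """Build a request URL for one of the OAI services.

    `service` is the service path, e.g. "oai.listsets.json";
    `endpoint` is the OAI endpoint URL to pass through, percent-encoded.
    """
    query = urlencode({"endpoint": endpoint})
    return "{}/{}?{}".format(base, service, query)


# Placeholder endpoint for illustration only.
EXAMPLE_ENDPOINT = "http://example.org/oai"

list_sets_url = service_url("oai.listsets.json", EXAMPLE_ENDPOINT)
list_records_url = service_url("oai.listrecords.json", EXAMPLE_ENDPOINT)
print(list_sets_url)
```

The resulting string can be passed to curl in place of the hand-written URL.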

To run the ingest process, run the install script and initialize the database and database views, if you have not done so already, then feed the ingest script a source profile (found in the profiles directory);

$ python install
$ python scripts/ dpla
$ python scripts/ dashboard
$ python scripts/ profiles/clemson.pjs


This application is released under an AGPLv3 license.

  • Copyright Digital Public Library of America, 2012 -- 2017