
lobid-gnd: access GND+EntityFacts data as JSON-LD over HTTP.



sbt 0.13 or newer — download sbt

Elasticsearch 5.6.x (configured in application.conf)


Get the code, change into the project directory, and run the tests:

git clone ; cd lobid-gnd ; sbt test


There are three data sources involved:

EntityFacts (JSON-LD over HTTP), GND baseline (RDF-XML over HTTP), and GND updates (RDF-XML over OAI-PMH).


Set up a location for the EntityFacts input data:

mkdir entityfacts ; cd entityfacts

Get the EntityFacts data from the DNB – the 20200713 release is used here; find the current release at


Unpack the data:

gunzip < authorities_entityfacts_20200713.jsonld.gz > authorities_entityfacts_20200713.jsonld
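A truncated or corrupt download would otherwise only surface later, during indexing, so it can help to test the archive before unpacking it. The following is a minimal sketch and not part of the project: `unpack_checked` is a hypothetical helper name, and it relies only on the standard `gzip -t` integrity test.

```shell
# Hypothetical helper (not part of lobid-gnd): test a .gz archive, then unpack
# it next to the original while keeping the compressed file.
unpack_checked() {
  archive="$1"
  target="${archive%.gz}"
  # gzip -t exits non-zero if the archive is corrupt or truncated
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "corrupt or incomplete archive: $archive" >&2
    return 1
  fi
  gunzip < "$archive" > "$target"
}

# Usage, with the file name from the step above:
# unpack_checked authorities_entityfacts_20200713.jsonld.gz
```

If the check fails, no output file is written, so a later step cannot pick up a partial file by accident.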

Go back to the root directory:

cd ..

Set up the data location in ‘conf/application.conf’ (data.entityfacts):

data { ... entityfacts: "entityfacts/authorities_entityfacts_20200713.jsonld" ...

Set up the name for the index to create in ‘conf/application.conf’ (index.entityfacts.index):

index { ... entityfacts { index: "entityfacts_20200713" ...

Index the data:

sbt "runMain apps.Index entityfacts"

GND Baseline

Get the RDF data

Set up a location for the input data:

mkdir input_data; cd input_data

Set ‘data.rdfxml’ in ‘conf/application.conf’ to the ‘input_data’ location.

Get the GND RDF/XML source data from


This should give you 6 local files ending with ‘.rdf.gz’. Go back to the project root directory:

cd ..
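The expected file count can be checked with a few lines of shell before starting the conversion. This is a sketch under stated assumptions: `count_by_suffix` is a hypothetical helper, not something the project provides, and it only looks at regular files directly inside the given directory.

```shell
# Hypothetical helper (not part of lobid-gnd): count the regular files in a
# directory (not its subdirectories) whose names end with the given suffix.
count_by_suffix() {
  dir="$1"
  suffix="$2"
  find "$dir" -maxdepth 1 -type f -name "*$suffix" | wc -l | tr -d ' '
}

# Usage from the project root; 6 is the file count stated above:
# [ "$(count_by_suffix input_data .rdf.gz)" -eq 6 ] || echo "unexpected number of dump files" >&2
#
# The same call works for the converted output later on:
# count_by_suffix index_data .jsonl
```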

Convert RDF/XML to JSON

Set up a location for the index data:

mkdir index_data

Set ‘data.jsonlines’ in ‘conf/application.conf’ to the ‘index_data’ location.

Set ‘index.boot’ in ‘conf/application.conf’ to an existing index. This index will be used to get labels during the conversion process.

Set ‘’ in ‘conf/application.conf’ to a non-existing index. This index name will be used in the indexing data created during conversion.

Convert the data to JSON-LD lines, the index data format:

sbt "runMain apps.ConvertBaseline"

To be able to log out from the server while the conversion is running, we actually use:

setsid nohup sbt "runMain apps.ConvertBaseline" &

This should create 6 ‘*.jsonl’ files in ‘index_data’.

Index the JSON data

If the ‘’ configured in ‘application.conf’ does not exist, a new index will be created.

To start the indexing, run:

sbt "runMain apps.Index baseline"


Get and convert the updates

Updates are pulled via the DNB OAI-PMH interface.

Pass one argument, the date from which to get updates:

sbt "runMain apps.ConvertUpdates 2020-06-22"

The date of the most recent update is stored in ‘GND-lastSuccessfulUpdate.txt’ (can be changed in the config).
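The stored date could be fed back into the next run instead of typing it by hand. The sketch below is not part of the project, and it rests on an assumption worth checking first: that the first line of ‘GND-lastSuccessfulUpdate.txt’ holds a date in YYYY-MM-DD form. The helper name `last_update_date` is hypothetical.

```shell
# Hypothetical helper (not part of lobid-gnd). ASSUMPTION: the first line of
# the file holds a date in YYYY-MM-DD form; the actual file format may differ.
last_update_date() {
  file="$1"
  [ -r "$file" ] || return 1
  # Take the first line and strip surrounding whitespace
  date_line=$(head -n 1 "$file" | tr -d '[:space:]')
  case "$date_line" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) printf '%s\n' "$date_line" ;;
    *) echo "unexpected content in $file" >&2; return 1 ;;
  esac
}

# Usage from the project root:
# since=$(last_update_date GND-lastSuccessfulUpdate.txt) && sbt "runMain apps.ConvertUpdates $since"
```

The pattern check only guards against passing obviously malformed content to the converter; it does not verify that the value is a valid calendar date.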

The original downloaded data and the converted data are stored in separate files. To convert the data again without downloading it, use the steps described above under ‘Convert RDF/XML to JSON’ with the update RDF data.

Index the updates

To index the updates, run:

sbt "runMain apps.Index updates"

See ‘application.conf’ for details on the configured file names etc.
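For a recurring update, the two commands above can be chained so that indexing only runs after a successful conversion. This is a sketch, not the project's own tooling: `run_update` is a hypothetical wrapper, and it simply calls the two `sbt` commands shown in this section.

```shell
# Hypothetical wrapper (not part of lobid-gnd): convert the updates since the
# given date, then index them only if the conversion succeeded.
run_update() {
  since="$1"
  if [ -z "$since" ]; then
    echo "usage: run_update YYYY-MM-DD" >&2
    return 2
  fi
  sbt "runMain apps.ConvertUpdates $since" && sbt "runMain apps.Index updates"
}

# Usage from the project root:
# run_update 2020-06-22
```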


In ‘lobid-gnd’, run the web application:

sbt run

Open http://localhost:9000/gnd


To set up an Eclipse project, first generate the Eclipse config for your machine:

sbt "eclipse with-source=true"

Then import the project in Eclipse: “File” > “Import” > “Existing Projects into Workspace”.
