WSE-research/QADO-question-answering-dataset-RDFizer

QADO Question Answering Dataset RDFizer

This tool provides a web service to convert particular question answering datasets (represented in JSON format) into RDF Turtle. It uses the Question Answering Dataset Ontology (QADO) to represent the data in RDF.

Setup service

Configuration

This service needs a running instance of the QADO RML Applicator. The host of that instance (e.g., http://localhost:8000) has to be provided by setting the environment variable RML_APPLICATOR_HOST.

Using Gradle

To run the service locally, execute ./gradlew run.
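
Because the service reads RML_APPLICATOR_HOST from its environment, a local run usually sets that variable first. The following is a minimal sketch, assuming the RML Applicator is reachable at http://localhost:8000 (the example host from the Configuration section). The helper name check_rml_host is hypothetical and not part of this repository.

```shell
# Hypothetical helper: fail early when the required variable is missing.
check_rml_host() {
  if [ -z "${RML_APPLICATOR_HOST:-}" ]; then
    echo "RML_APPLICATOR_HOST is not set" >&2
    return 1
  fi
  echo "Using RML Applicator at $RML_APPLICATOR_HOST"
}

# Usage (assumes an RML Applicator instance on localhost:8000):
#   export RML_APPLICATOR_HOST="http://localhost:8000"
#   check_rml_host && ./gradlew run
```

Checking the variable before starting avoids a service that boots but cannot reach the RML Applicator.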

Using Docker

Alternatively, you can run the service with Docker by pulling the prepared image:

docker pull wseresearch/qado-rdfizer:latest

You can also build the Docker image from source. To start a Docker container, use the following command:

docker run -d --env RML_APPLICATOR_HOST="YOUR RML APPLICATOR HOST" -p "$EXTERNAL_PORT:8080" wseresearch/qado-rdfizer:latest

Accessing the service

Basic UI

This service provides a basic UI at $HOST:$PORT/ where you can transform a dataset and view the results directly in the web browser.

API endpoint

To transform a JSON file to RDF, send a POST request to $HOST:$PORT/json2rdf with a JSON payload of the following structure:

{
  "filePath": "URL of the JSON file",
  "format": "Mapping file name",
  "label": "Name for the generated RDF triples",
  "homepage": "URL of the data publisher",
  "language": "Language tag of the questions (required only for 'compositional_wikidata' format)"
}
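
As an illustration of the structure above, the sketch below prints a payload for the 'compositional_wikidata' format, which is the one case where the language field is required. The helper name make_payload and all argument values in the usage line are hypothetical placeholders. The helper does no JSON escaping, so it assumes the values contain no double quotes or backslashes.

```shell
# Hypothetical helper: print a request payload for the
# 'compositional_wikidata' format, including the "language" field.
# $1 = URL of the JSON file, $2 = label, $3 = homepage, $4 = language tag
make_payload() {
  cat <<EOF
{
  "filePath": "$1",
  "format": "compositional_wikidata",
  "label": "$2",
  "homepage": "$3",
  "language": "$4"
}
EOF
}

# Usage (placeholder values, not a real dataset location):
#   make_payload "https://example.org/dataset.json" "My dataset" "https://example.org" "en" > payload.json
```

The resulting file can be sent with curl's --data @payload.json option instead of an inline body.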

Supported Datasets

By default, the following datasets/formats are supported (using these RML mappings):

Supported output formats

You can choose the output format of the service by sending an Accept header. The following content types are supported:

Web Service Usage

The following cURL command can be used to convert a JSON file of the QALD benchmark into RDF using Turtle as the output format.

curl --location --request POST "http://$HOST:$PORT/json2rdf" \
--header 'Content-Type: application/json' \
--data-raw '{
    "filePath": "https://github.com/ag-sc/QALD/raw/master/6/data/qald-6-train-multilingual-raw.json",
    "format": "qald",
    "label": "QALD 6 train multilingual raw",
    "homepage": "https://github.com/ag-sc/QALD"
}'
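
To request a specific serialization, the same call can carry an Accept header. The sketch below wraps the QALD request in a hypothetical helper, convert_qald, that takes the service base URL and the desired media type. It assumes that text/turtle (the standard media type for Turtle) is among the supported content types and that the service listens on localhost:8080. Both are assumptions to verify against your deployment.

```shell
# Hypothetical helper: convert the QALD 6 training file and ask for a
# given RDF serialization via the Accept header.
# $1 = base URL of the service, $2 = media type for the Accept header
convert_qald() {
  curl --location --request POST "$1/json2rdf" \
    --header 'Content-Type: application/json' \
    --header "Accept: $2" \
    --data-raw '{
      "filePath": "https://github.com/ag-sc/QALD/raw/master/6/data/qald-6-train-multilingual-raw.json",
      "format": "qald",
      "label": "QALD 6 train multilingual raw",
      "homepage": "https://github.com/ag-sc/QALD"
    }'
}

# Usage (assumed host, port, and media type):
#   convert_qald "http://localhost:8080" "text/turtle" > qald-6-train.ttl
```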

Adding additional formats

To add new mapping rules, add a mapping file NAME.ttl to app/mappings, where NAME must be written in all caps. The mappings are written in RML. To use the file in the web service, pass its base file name as the format parameter.
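
The naming rule can be checked before copying a file into app/mappings. The following sketch uses a hypothetical helper, is_valid_mapping_name, and interprets "all caps" as "the part before .ttl is non-empty and contains no lowercase letters". That interpretation is an assumption, since the exact rule the service enforces is not spelled out here.

```shell
# Hypothetical helper: succeed only for names of the form NAME.ttl where
# NAME is non-empty and contains no lowercase letters (assumed reading
# of "all caps").
is_valid_mapping_name() {
  case "$1" in
    *.ttl) base=${1%.ttl} ;;   # strip the extension
    *) return 1 ;;             # wrong or missing extension
  esac
  case "$base" in
    ""|*[[:lower:]]*) return 1 ;;
  esac
  return 0
}

# Usage (MYDATASET.ttl is a placeholder file name):
#   is_valid_mapping_name "MYDATASET.ttl" && cp MYDATASET.ttl app/mappings/
```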

Statistics

The repository also contains a script for creating statistics about the generated datasets. See scripts/statistics for more details.