Scientific Data Management Lab

KG-Tools available @SDM

RDFizer - Transforms raw data to RDF.
SemEP - Community detection applied to knowledge graphs to find insights and patterns.
MULDER - Federated SPARQL Engine for distributed and autonomous SPARQL endpoints.
BOUNCER - Privacy-aware Query Processing Over Federations of RDF Datasets.
Ontario - Federated SPARQL Engine for Heterogeneous Data Sources in a Data Lake.

Running KG-Tools on Docker

1. Running RDFizer

$ docker run -d --name myrdfizer -p 4000:80 kemele/rdfizer:1.0

You will probably need to attach a volume to share data with the container, as follows:

$ docker run -d --name myrdfizer -p 4000:80 -v /path/to/data:/data kemele/rdfizer:1.0

Send a POST request to RDFizer, appending the path of the configuration file as seen inside the container (under /data) to the /graph_creation endpoint:

$ curl -X POST localhost:4000/graph_creation/data/path/to/config-file

The config file should look something like this:

[default]
main_directory: /data
[datasets]
number_of_datasets: 1
output_folder: ${default:main_directory}/output

[dataset1]
name: ADSampleDataWP4CO
format: csv
path: ${default:main_directory}/rawdata/myfile.csv
mapping: ${default:main_directory}/mappings/mymap.ttl
remove_duplicate_triples_in_memory: yes
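
The ${section:option} references in this file match the extended interpolation syntax of Python's configparser module. As a minimal sketch, assuming RDFizer resolves them the same way (an assumption, not something this README states), the following shows what the sample paths expand to. CONFIG_TEXT is a copy of the sample above, and load_config is a hypothetical helper name.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Copy of the sample config shown above.
CONFIG_TEXT = """
[default]
main_directory: /data
[datasets]
number_of_datasets: 1
output_folder: ${default:main_directory}/output

[dataset1]
name: ADSampleDataWP4CO
format: csv
path: ${default:main_directory}/rawdata/myfile.csv
mapping: ${default:main_directory}/mappings/mymap.ttl
remove_duplicate_triples_in_memory: yes
"""


def load_config(text):
    """Parse the config text and resolve ${section:option} references.

    Assumption: RDFizer uses the same interpolation rules as configparser.
    """
    parser = ConfigParser(interpolation=ExtendedInterpolation())
    parser.read_string(text)
    return parser


config = load_config(CONFIG_TEXT)
print(config["datasets"]["output_folder"])  # /data/output
print(config["dataset1"]["path"])           # /data/rawdata/myfile.csv
print(config["dataset1"]["mapping"])        # /data/mappings/mymap.ttl
```

Running a check like this before starting the container can help catch a misspelled section or option name, since every path is expressed relative to main_directory.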

Example:

Raw data is stored in /home/user/documents/mydata, which contains three folders: config, mappings, and rawdata. The config file config_file.ini is in the config folder, the RML mapping file mymap.ttl is in the mappings folder, and the raw data file myfile.csv is in the rawdata folder. Run the following:

$ docker run -d --name myrdfizer -p 4000:80 -v /home/user/documents/mydata:/data kemele/rdfizer:1.0

Then execute the following to start the transformation of raw data:

$ curl -X POST localhost:4000/graph_creation/data/config/config_file.ini
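
The same request can be built with Python's standard library. This is a sketch under the assumption, suggested by the curl commands above, that the endpoint expects the container-side path of the config file appended to /graph_creation. The helper names container_path and build_request are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.request import Request


def container_path(host_file, host_dir, mount_point="/data"):
    """Translate a file path on the host into its path inside the container."""
    relative = PurePosixPath(host_file).relative_to(PurePosixPath(host_dir))
    return str(PurePosixPath(mount_point) / relative)


def build_request(config_in_container, host="localhost", port=4000):
    """Build the POST request that the curl command above sends.

    Assumption: the container path is appended directly to /graph_creation.
    """
    url = f"http://{host}:{port}/graph_creation{config_in_container}"
    return Request(url, method="POST")


path = container_path(
    "/home/user/documents/mydata/config/config_file.ini",
    "/home/user/documents/mydata",
)
request = build_request(path)
print(request.get_method(), request.full_url)

# To actually send it (requires the container to be running):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       print(response.status)
```

Deriving the container path from the host path makes the effect of the -v mapping explicit: anything under /home/user/documents/mydata on the host appears under /data in the container.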

The output of the RDFization process is written as an N-Triples file to the folder specified by output_folder (in this case /data/output, which corresponds to /home/user/documents/mydata/output on the host).
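
A quick sanity check on the result is to count the triples in the output file. The sketch below relies only on the N-Triples rule that each statement occupies one line. The helper name count_triples and the sample data are made up for illustration, and the name of the file that RDFizer actually writes is not specified here, so the demonstration uses a temporary file instead.

```python
import tempfile
from pathlib import Path


def count_triples(nt_file):
    """Count statements in an N-Triples file: one per non-blank, non-comment line."""
    count = 0
    with open(nt_file, encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


# Made-up sample data, not real RDFizer output.
sample = (
    '<http://example.org/s> <http://example.org/p> "o" .\n'
    "# a comment line\n"
    "\n"
    "<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n"
)
with tempfile.TemporaryDirectory() as tmp:
    nt_path = Path(tmp) / "sample.nt"
    nt_path.write_text(sample, encoding="utf-8")
    triple_count = count_triples(nt_path)

print(triple_count)  # 2
```

In practice, the function would be pointed at the file found in the output folder on the host.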

2. Running SemEP

$ docker run -it --rm -v /path/to/graphandoutput/folder:/data kemele/semep-node:20-04-18 semEP-node <nodes> <similarity-matrix> <threshold>

The nodes and similarity-matrix files should be made available to the container through the attached volume.
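
When the command is launched from a script, assembling it as an argument list avoids shell quoting problems. The following is a minimal sketch: semep_command is a hypothetical helper, and the host folder, file names, and threshold in the usage lines are placeholders. It also assumes the files are referenced by their container paths under /data.

```python
import shlex


def semep_command(host_dir, nodes, similarity_matrix, threshold,
                  image="kemele/semep-node:20-04-18"):
    """Return the docker argument list for the SemEP invocation shown above."""
    return [
        "docker", "run", "-it", "--rm",
        "-v", f"{host_dir}:/data",       # share the input/output folder
        image,
        "semEP-node", nodes, similarity_matrix, str(threshold),
    ]


# Placeholder values; replace them with your own folder, files, and threshold.
command = semep_command(
    "/path/to/graphandoutput/folder",
    "/data/nodes.txt",      # hypothetical file name
    "/data/matrix.txt",     # hypothetical file name
    0.5,                    # hypothetical threshold
)
print(shlex.join(command))

# To run it (requires Docker and the image):
#   import subprocess
#   subprocess.run(command, check=True)
```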

3. Running MULDER

4. Running BOUNCER

5. Running Ontario

KG Creation and Management Pipeline

This pipeline is created by combining the tools described above with additional storage and communication components. One example of such a pipeline was created for IASIS-KG; it uses RDFizer, MULDER, and SemEP. In addition, the OpenLink Virtuoso triple store is used to store the transformed data so that it can be queried via SPARQL, and RabbitMQ is used for communication between components.
