Follow the instructions below to reproduce this experiment.
(sudo groupadd docker)
sudo usermod -aG docker $USER
docker run hello-world
Create a directory /mnt/data/starvers_eval and make sure that Docker can write to it by changing its permissions. This is the default host directory used by the docker-compose services. If you wish to change it, you can do so in the .env file.
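The directory setup can be sketched as follows. This is a minimal sketch and not part of the project: the function name `prepare_data_dir` is hypothetical, and the permissive mode `a+rwx` is only one assumed way of letting Docker write to the directory. Tighten it to match your own setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the host data directory and open up its permissions.
prepare_data_dir() {
    # Default is the host directory named in the text above.
    local dir="${1:-/mnt/data/starvers_eval}"
    mkdir -p "$dir" || return 1
    # Assumption: read/write/execute for everyone is acceptable on this machine.
    chmod a+rwx "$dir" || return 1
    # Print the prepared path so callers can log or reuse it.
    echo "$dir"
}
```

Creating directories under /mnt usually requires root, so the `mkdir` and `chmod` calls may have to be run with sudo. If you pick a different path, remember to set the same path in the .env file.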
Run the following command in the root directory of this project:
docker build -t starvers_eval .
The experiment can be run in full by executing the following 7 docker-compose services one by one. This process is not automated because some individual steps need a considerable amount of time to finish. Make sure that each service runs through to completion, and repeat any step that does not.
docker-compose run download: Downloads the BEAR datasets and query sets
docker-compose run clean_raw_datasets: Cleans the datasets by skolemizing blank nodes and commenting out invalid triples.
docker-compose run construct_datasets: For each raw dataset (BEARB_day, BEARB_hour, BEARC) it constructs the change sets, the StarVers RDF-star-based dataset, a dataset with all ICs stored into named graphs and a dataset with all change sets stored into named graphs. It also measures the execution time of the insert and outdate functions from the StarVers API.
docker-compose run ingest: Loads all 12 constructed datasets from the previous step into GraphDB and Jena TDB2.
docker-compose run construct_queries: Constructs the evaluation queries from the raw queries that have been downloaded in the first step. In total there should be 456,584 queries. This number is the sum of 4 policies/dataset variants x 89 versions x 82 raw queries for BEARB_day (29,192), 4 policies/dataset variants x 1299 versions x 82 raw queries for BEARB_hour (426,072) and 4 policies/dataset variants x 33 versions x 10 raw queries for BEARC (1,320).
docker-compose run evaluate: Runs the queries against the repositories and measures the execution time.
docker-compose run visualize: Creates 7 figures with plots for query performance, dataset sizes & ingestion and update performance. These 7 figures are provided in the output/figures directory of this project.
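The project leaves these steps manual on purpose. If you still want a guard that runs them in order and halts at the first failure, a wrapper along the following lines could be used. It is a sketch under stated assumptions, not something shipped with the project: the function name `run_services` and the `COMPOSE` override variable are hypothetical, and only the seven service names are taken from the list above.

```shell
#!/usr/bin/env bash
# Sketch: run the seven services in order and stop at the first failure.
# COMPOSE is an assumed override hook, e.g. COMPOSE="docker compose" on newer installs.
COMPOSE="${COMPOSE:-docker-compose}"

# Service names in the order given in the list above.
SERVICES="download clean_raw_datasets construct_datasets ingest construct_queries evaluate visualize"

run_services() {
    # Optional first argument: the service to resume from after a failed run.
    local start="${1:-download}"
    local started=0
    local service
    for service in $SERVICES; do
        if [ "$service" = "$start" ]; then started=1; fi
        # Skip everything before the requested starting service.
        if [ "$started" -eq 0 ]; then continue; fi
        echo ">>> $service"
        # $COMPOSE is intentionally unquoted so "docker compose" splits into two words.
        if ! $COMPOSE run "$service"; then
            echo "Service '$service' failed; rerun with: run_services $service" >&2
            return 1
        fi
    done
    # Reject a starting service that is not in the list.
    if [ "$started" -eq 0 ]; then
        echo "Unknown service: $start" >&2
        return 1
    fi
}
```

Calling `run_services` starts from download, while `run_services ingest` resumes from the ingest step. Because a zero exit code does not guarantee that a step produced complete output, it is still worth checking the results of each service before moving on.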