ldbc/ldbc_graphalytics_platforms_umbra

Postgres/Umbra Graphalytics implementation

Implementation of LDBC Graphalytics using PostgreSQL and Umbra.

Building the project and running the benchmark

  1. To initialize the benchmark package, run:

    scripts/init.sh ${GRAPHS_DIR}

    where GRAPHS_DIR is the directory containing the graphs and the validation data. The argument is optional and defaults to ~/graphs.

    This script creates a Maven package (graphalytics-${GRAPHALYTICS_VERSION}-umbra-${PROJECT_VERSION}.tar.gz). Then, it decompresses the package, initializes a configuration directory config (based on the content of the config-template directory) and sets the location of the graph directory.

    Note that the project uses the Build Number Maven plug-in to ensure reproducibility. Hence, the build fails if the local Git repository contains uncommitted changes. To build the package regardless (e.g. for testing), run:

    scripts/init-for-testing.sh ${GRAPHS_DIR}
  2. Navigate to the directory created by the init.sh script:

    cd graphalytics-*-umbra-*/
  3. Edit the configuration files in the config directory (e.g. to choose the graphs included in the benchmark). To conduct benchmark runs, edit config/benchmark.properties and replace the line include = benchmarks/custom.properties with the one for the dataset size you wish to use, e.g. include = benchmarks/xl.properties.

  4. To set up a Postgres instance, run e.g.:

    export POSTGRES_INPUT_DATA_DIR=~/graphs
    bin/scripts/start-postgres.sh
  5. Run the benchmark with the following command:

    bin/sh/run-benchmark.sh
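
The edit described in step 3 can also be scripted. The following is a minimal sketch, not part of the package: select_benchmark is a hypothetical helper, and it assumes the properties file contains a single line of the form include = benchmarks/<name>.properties, as described above.

```shell
# Hypothetical helper (not shipped with the package): point the "include"
# line of a properties file at a different benchmark definition.
select_benchmark() {
    props_file="$1"   # e.g. config/benchmark.properties
    size="$2"         # e.g. xl
    tmp_file="$(mktemp)" || return 1
    # Rewrite only lines of the form "include = benchmarks/<name>.properties";
    # all other lines are copied unchanged.
    sed -E "s|^include[[:space:]]*=[[:space:]]*benchmarks/[^[:space:]]*\.properties|include = benchmarks/${size}.properties|" \
        "$props_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write back in place so the original file keeps its permissions.
    cat "$tmp_file" > "$props_file"
    rm -f "$tmp_file"
}
```

For example, select_benchmark config/benchmark.properties xl would switch the include line to benchmarks/xl.properties. The sketch goes through a temporary file rather than sed -i because the in-place flag behaves differently across sed implementations.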

Testing the package

If you would like to initialize the benchmark and run it with the default configuration, run the following:

scripts/package-and-run-benchmark.sh

Numdiff

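For manual tests that compare floating-point results within a tolerance (epsilon matching), numdiff can be useful. For example, to compare an output file against the reference results with an absolute tolerance of 0.0001: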

numdiff --absolute-tolerance 0.0001 scratch/output-data/output.csv ~/graphs/pr-directed-test-PR
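
If numdiff is not installed, a similar check can be approximated with awk. This is only a sketch under stated assumptions: epsilon_diff is a hypothetical helper, both files are assumed to contain whitespace-separated "<vertex-id> <value>" lines with plain numeric values, and the vertices may appear in any order.

```shell
# Hypothetical fallback (not part of the package): compare two files of
# "<vertex-id> <value>" lines within an absolute tolerance.
# Returns 0 if all values match within the tolerance, 1 otherwise.
epsilon_diff() {
    expected="$1"; actual="$2"; tol="${3:-0.0001}"
    awk -v tol="$tol" '
        # First file: remember the expected value of each vertex.
        NR == FNR { want[$1] = $2; n++; next }
        # Second file: a vertex that was not expected is an error.
        !($1 in want) { print "unexpected vertex " $1; bad = 1; next }
        {
            d = $2 - want[$1]; if (d < 0) d = -d
            if (d > tol) {
                print "mismatch at vertex " $1 ": " want[$1] " vs " $2
                bad = 1
            }
            seen++
        }
        END {
            if (seen != n) { print "vertex count differs"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    ' "$expected" "$actual"
}
```

For example, epsilon_diff ~/graphs/pr-directed-test-PR scratch/output-data/output.csv 0.0001 mirrors the numdiff call above. Non-numeric values (such as infinity markers) are not handled by this sketch.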