uber/uber-graph-benchmark

Uber Graph Benchmark (UGB)

Slides: Uber Graph Benchmark Framework

Getting Started

  1. Check out the repo.

  2. Set up a database to benchmark. There is a README file under each binding directory. To list all modules, run:

    ./gradlew projects

  3. Run the benchmark against the database:

# generates and writes to redis db, then reads with subgraph queries
./gradlew execute -PmainArgs="-db com.uber.ugb.db.redis.RedisDB -w -g benchdata/graphs/trips -b benchdata/workloads/workloada -r"

# generates and writes to cassandra db, then reads with subgraph queries
./gradlew execute -PmainArgs="-db com.uber.ugb.db.cassandra.CassandraDB -w -g benchdata/graphs/trips -b benchdata/workloads/workloada -r"

# generates vertices and edges and writes them to a no-op db; used for measuring data generation performance
./gradlew execute -PmainArgs="-db com.uber.ugb.db.NoopDB -g benchdata/graphs/trips -b benchdata/workloads/workloada -w"

# generates vertices and edges and writes them as csv to System.out or a file
./gradlew execute -PmainArgs="-db com.uber.ugb.db.CsvOutputDB -g benchdata/graphs/trips -b benchdata/workloads/workloada -w"

Customization

Set environment variables in

benchdata/workloads/env.properties

To add a new DB implementation, consider inheriting from one of:

  • com.uber.ugb.db.KeyValueDB

    This stores the adjacency list in one blob.

  • com.uber.ugb.db.PrefixKeyValueDB

    This stores the entries of an adjacency list under keys that share the same prefix. Edge writes could be faster than with KeyValueDB.

  • com.uber.ugb.db.GremlinDB

    This processes gremlin queries directly.
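
The difference between the blob and prefix layouts can be sketched with plain in-memory maps. This is only an illustration, not UGB code: the class names `AdjacencyLayouts`, `BlobStore`, and `PrefixStore`, the `/` key separator, and the method names are all hypothetical, and the real `KeyValueDB` and `PrefixKeyValueDB` base classes may define different methods. The sketch also assumes vertex ids contain no `/`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these classes are part of UGB.
public class AdjacencyLayouts {

    /** Blob layout: one key per vertex; the value holds the whole adjacency list. */
    static class BlobStore {
        final Map<String, List<String>> kv = new TreeMap<>();

        void addEdge(String from, String to) {
            // Read-modify-write: fetch the full list, append, and store it back.
            List<String> neighbors = new ArrayList<>(kv.getOrDefault(from, new ArrayList<>()));
            neighbors.add(to);
            kv.put(from, neighbors);
        }

        List<String> neighbors(String from) {
            return kv.getOrDefault(from, new ArrayList<>());
        }
    }

    /** Prefix layout: one key per edge; all edges of a vertex share the prefix "<from>/". */
    static class PrefixStore {
        final TreeMap<String, String> kv = new TreeMap<>();

        void addEdge(String from, String to) {
            // A single small write; existing edges are never read.
            kv.put(from + "/" + to, to);
        }

        List<String> neighbors(String from) {
            String prefix = from + "/";
            List<String> out = new ArrayList<>();
            // Sorted keys with a common prefix are contiguous, so scan from the prefix onward.
            for (Map.Entry<String, String> e : kv.tailMap(prefix, true).entrySet()) {
                if (!e.getKey().startsWith(prefix)) {
                    break;
                }
                out.add(e.getValue());
            }
            return out;
        }
    }

    public static void main(String[] args) {
        BlobStore blob = new BlobStore();
        PrefixStore prefix = new PrefixStore();
        String[][] edges = {{"u1", "t1"}, {"u1", "t2"}, {"u2", "t3"}};
        for (String[] e : edges) {
            blob.addEdge(e[0], e[1]);
            prefix.addEdge(e[0], e[1]);
        }
        System.out.println("blob   u1 -> " + blob.neighbors("u1"));
        System.out.println("prefix u1 -> " + prefix.neighbors("u1"));
    }
}
```

In this sketch the blob layout has to read and rewrite the whole list on every edge insert, while the prefix layout issues one independent write per edge. That is one plausible reason edge writes could be faster with the prefix approach, at the cost of a range scan when reading neighbors.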

Build

  • Create the jar:
./gradlew jar
  • Build the fat jar for Spark:
./gradlew build

Run on Spark

Here is one example of how to run the benchmark on Spark:

#!/usr/bin/env bash

cd ugsb

YARN_CONF_DIR=/etc/hadoop/conf /home/spark-2.1.0/bin/spark-submit \
--class "com.uber.ugb.Benchmark" \
--master yarn \
--deploy-mode client \
--driver-memory 6G \
--executor-memory 6G \
--executor-cores 2 \
--driver-cores 2 \
--num-executors 10 \
--conf spark.yarn.executor.memoryOverhead=2048 \
--driver-class-path '/etc/hive/conf' \
build/libs/ugb-all-0.0.15.jar \
"-db com.uber.ugb.db.cassandra.CassandraDB -w -g benchdata/graphs/trips -b benchdata/workloads/workloada -r -s"

echo $?

About

A framework to benchmark different graph databases, based on generated data from customizable schema, distribution, and size.
