cloud-based query benchmarking suite

Eduardo Pareja Tobes edited this page Feb 14, 2014 · 2 revisions
  • difficulty medium
  • technologies scala, java, titan, neo4j, blueprints, gremlin, cypher, aws, biology, datavis

Bio4j has one of the most richly typed graph data models available, which makes it a good fit for testing and comparing the performance of different engines and technologies on real-world, biologically meaningful queries and traversals. In this project we will design and implement a set of queries that takes the specifics of each engine into account (at least Titan and Neo4j, using Blueprints, Gremlin, Cypher, etc.). An automated AWS-based testing system will then be built on top of the existing AWS deployment infrastructure, together with graphical output for the results.
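As a rough illustration of the measuring core such a testing system might have, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: the `QueryBenchmark` class, its `medianNanos` and `median` methods, and the placeholder workload are hypothetical and not part of Bio4j. A real run would wrap an actual Titan or Neo4j traversal in the `Supplier` instead.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical timing harness: not part of Bio4j or of any graph engine API.
public class QueryBenchmark {

    /** Median of the timings; for an even count, the lower of the two middle values. */
    public static long median(long[] timings) {
        long[] sorted = timings.clone();
        Arrays.sort(sorted);
        return sorted[(sorted.length - 1) / 2];
    }

    /**
     * Runs the query a few times unmeasured so the JVM can warm up,
     * then times each measured run and returns the median in nanoseconds.
     */
    public static <T> long medianNanos(Supplier<T> query, int warmupRuns, int measuredRuns) {
        if (measuredRuns < 1) {
            throw new IllegalArgumentException("measuredRuns must be at least 1");
        }
        for (int i = 0; i < warmupRuns; i++) {
            query.get();
        }
        long[] timings = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            query.get();
            timings[i] = System.nanoTime() - start;
        }
        return median(timings);
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a real graph traversal (assumption).
        Supplier<Long> placeholderQuery = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        };
        long nanos = medianNanos(placeholderQuery, 5, 20);
        System.out.println("median time (ns): " + nanos);
    }
}
```

Reporting the median rather than the mean keeps a single slow run (for example, one hit by garbage collection or a cold cache) from dominating the result. Whether that is the right statistic for comparing engines is a design choice the project would need to settle.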

expected outcome

An automated performance testing system that executes a set of biologically meaningful traversals/queries on Bio4j instances and presents the results graphically.

mentors

They both have 4+ years of experience with Neo4j, Blueprints and, more recently, Titan. @pablopareja in particular is a well-known, active participant in the graph database community.