
migrated all the wiki documentation into the project.

1 parent b77ace2 commit 287dd475f0954e26631ec5235ede346657e75dcd @okram okram committed Apr 12, 2011
Showing with 1,068 additions and 0 deletions.
  1. +9 −0 graphdb-bench/doc/graphdb-bench.wiki/Acknowledgements.textile
  2. +39 −0 graphdb-bench/doc/graphdb-bench.wiki/Benchmark.textile
  3. +16 −0 graphdb-bench/doc/graphdb-bench.wiki/Creating-Artificial-Graphs.textile
  4. +43 −0 graphdb-bench/doc/graphdb-bench.wiki/Evaluator.textile
  5. +1 −0 graphdb-bench/doc/graphdb-bench.wiki/Frequently-Asked-Questions.md
  6. +31 −0 graphdb-bench/doc/graphdb-bench.wiki/Home.textile
  7. +6 −0 graphdb-bench/doc/graphdb-bench.wiki/Introduction.textile
  8. +78 −0 graphdb-bench/doc/graphdb-bench.wiki/Operation.textile
  9. +98 −0 graphdb-bench/doc/graphdb-bench.wiki/OperationFactory.textile
  10. +26 −0 graphdb-bench/doc/graphdb-bench.wiki/Overview.textile
  11. +1 −0 graphdb-bench/doc/graphdb-bench.wiki/Plotting-Results.md
  12. +57 −0 graphdb-bench/doc/graphdb-bench.wiki/Published-Benchmark-Results.textile
  13. +33 −0 graphdb-bench/doc/graphdb-bench.wiki/Reading-Result-Logs.textile
  14. +46 −0 graphdb-bench/doc/graphdb-bench.wiki/Running-Benchmarks.textile
  15. +33 −0 lopsided/doc/lopsided.wiki/A-Tutorial-with-the-Villein-GUI.textile
  16. +23 −0 lopsided/doc/lopsided.wiki/Deploying-a-Farm.textile
  17. +174 −0 lopsided/doc/lopsided.wiki/Developing-a-Villein.textile
  18. +38 −0 lopsided/doc/lopsided.wiki/Home.textile
  19. +57 −0 lopsided/doc/lopsided.wiki/Introduction-to-Linked-Process.textile
  20. +19 −0 lopsided/doc/lopsided.wiki/Linked-Process-and-XMPP.textile
  21. +38 −0 lopsided/doc/lopsided.wiki/Managing-Farm-Permissions.textile
  22. +9 −0 lopsided/doc/lopsided.wiki/Use-Cases-for-Linked-Process.textile
  23. +94 −0 mutant/doc/mutant.wiki/Basic-Examples.textile
  24. +25 −0 mutant/doc/mutant.wiki/Home.textile
  25. +7 −0 mutant/doc/mutant.wiki/Introduction.textile
  26. +21 −0 mutant/doc/mutant.wiki/Mutant-Language.textile
  27. +15 −0 mutant/doc/mutant.wiki/Supported-Engines.textile
  28. +14 −0 wrender/doc/wrender.wiki/Home.textile
  29. +17 −0 wrender/doc/wrender.wiki/Introduction.textile
@@ -0,0 +1,9 @@
+This section provides a list of the people who have contributed in some way to the creation of GraphDB-Bench.
+
+# "Alex Averbuch":http://se.linkedin.com/in/alexaverbuch -- co-designed core concepts, main developer, and documenter of GraphDB-Bench.
+# "Marko A. Rodriguez":http://markorodriguez.com -- creator of GraphDB-Bench, and writer of initial code base and documentation.
+# "Martin Neumann":http://se.linkedin.com/pub/martin-neumann/19/5a6/553 -- co-designed core concepts, and helped develop initial version of current code base.
+
+Please review GraphDB-Bench's "pom.xml":http://github.com/tinkerpop/graphdb-bench/blob/master/pom.xml. GraphDB-Bench would not be possible without the work done by others to create these useful packages.
+
+For more information, join the Gremlin users group (GraphDB-Bench does not have its own users group at present) at "http://groups.google.com/group/gremlin-users":http://groups.google.com/group/gremlin-users.
@@ -0,0 +1,39 @@
+You've defined your "operations":http://github.com/tinkerpop/graphdb-bench/wiki/Operation and "factories":http://github.com/tinkerpop/graphdb-bench/wiki/OperationFactory. All that's left is to create your own @Benchmark@ definition that puts everything together. Fortunately, this is also the easiest part, as @Benchmark@ is simply a convenient place to specify what types of operations your benchmark comprises.
+
+h3. Create your own Benchmark
+
+To create your own @Benchmark@, only one method (@getOperationFactories()@) needs to be overridden. Note, however, that the constructor of the parent class @Benchmark@ must also be invoked (i.e. @super(log)@).
+
+bc. public class BenchmarkExample extends Benchmark {
+ // constructor
+ public BenchmarkExample(String log) { super(log); }
+ @Override
+ protected ArrayList<OperationFactory> getOperationFactories() { return new ArrayList<OperationFactory>(); }
+}
+
+Here's what the method does:
+* *getOperationFactories():* This method is automatically called when the benchmark is started. Nearly all of the logic associated with running a benchmark (including invocation of @BenchRunner@) is already contained within @Benchmark@; all you need to do is let @Benchmark@ know what types of operations are to be benchmarked - that is the sole purpose of this method. This is achieved by returning a collection of @OperationFactory@ instances, which are later passed to @BenchRunner@ for execution. The order of execution is well defined: @BenchRunner@ executes all @Operation@ instances from the first factory, then moves on to the next factory, until all factories have been exhausted.
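The execution-order guarantee can be sketched in plain Java. This is a toy stand-in (hypothetical names; real factories produce @Operation@ instances, not strings) that only illustrates the ordering:

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in: each "factory" yields a fixed list of operation names, and the
// runner drains the factories strictly in order - the first factory's
// operations all execute before any operation from the second, and so on.
public class FactoryOrderSketch {
    public static List<String> runAll(List<List<String>> factories) {
        List<String> executed = new ArrayList<>();
        for (List<String> factory : factories) {   // factories in the order returned
            for (String operation : factory) {     // exhaust this factory first
                executed.add(operation);           // stand-in for op.execute()
            }
        }
        return executed;
    }
}
```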
+
+h3. Example Benchmark implementation
+
+To support the explanations above, an example @Benchmark@ implementation follows:
+
+* This example creates a benchmark that's composed of the two operation types we introduced in [[Operation]] and [[OperationFactory]]. In this case, 10 instances of each operation will be created when the benchmark is run.
+
+bc. public class BenchmarkExample extends Benchmark {
+ //
+ private static String PROPERTY_KEY = "name";
+ private static int OP_COUNT_GET = 10;
+ private static int OP_COUNT_NEIGHBORS = 10;
+ //
+ public BenchmarkExample(String log) {
+ super(log);
+ }
+ @Override
+ protected ArrayList<OperationFactory> getOperationFactories() {
+ ArrayList<OperationFactory> operationFactories = new ArrayList<OperationFactory>();
+ operationFactories.add(new OperationFactoryGetVertex(OP_COUNT_GET, PROPERTY_KEY));
+ operationFactories.add(new OperationFactoryGremlinOutNeighbors(OP_COUNT_NEIGHBORS, PROPERTY_KEY));
+ return operationFactories;
+ }
+}
@@ -0,0 +1,16 @@
+Before a graph system can be tested, a graph dataset is required. With GraphDB-Bench it's possible to run benchmarks on your existing database (and dataset), and in many cases this is desirable. However, you may be using GraphDB-Bench to compare multiple different graph databases (@Graph@ implementations). Alternatively, your benchmark might involve writing to (and potentially destroying) your dataset. In such cases you'll need a way to import the *same* graph dataset multiple times, into each database.
+
+"Blueprints":http://github.com/tinkerpop/blueprints/wiki provides the functionality to populate a graph database by importing "GraphML":http://graphml.graphdrawing.org/ files. Given that GraphDB-Bench is built on top of Blueprints (and the rest of the "TinkerPop":http://github.com/tinkerpop stack), it's natural to take advantage of this ability.
+
+GraphDB-Bench comes with scripts for generating synthetic graphs of different topologies, and exporting those graphs to GraphML files. At present "R:Statistics":http://www.r-project.org/ and "Python":http://python.org/ scripts are provided, both using the "iGraph":http://igraph.sourceforge.net/ library's artificial graph generation functions. They can be found at:
+* @src/main/r/graph-creator.r@
+* @src/main/python/graph-creator.py@
+
+*Important:* If you intend to use these scripts, you will need to install iGraph. The "iGraph Ubuntu repository is here.":https://launchpad.net/~igraph/+archive/ppa
+
+Finally, when using @src/main/python/graph-creator.py@ note that it reads some of its parameters from @src/main/resources/com/tinkerpop/bench/bench.properties@. The relevant parts of this properties file are displayed below:
+
+bc. bench.datasets.directory=data/datasets/ <-- directory where .graphml file will be saved
+bench.graph.barabasi.file=barabasi.graphml <-- at present, this is the name of the generated file
+bench.graph.barabasi.vertices=1000 <-- number of vertices the graph will have
+bench.graph.barabasi.degree=5 <-- average degree (approximately) of vertices in the generated graph
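For illustration only, the same four keys can be parsed with the standard @java.util.Properties@ API (the actual consumer of this file is @graph-creator.py@; the class name below is made up):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustration only: parse the keys shown above with java.util.Properties.
// (graph-creator.py is the real consumer of bench.properties.)
public class BenchPropertiesSketch {
    public static String outputPath(String propsText) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(propsText));
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen for an in-memory reader
        }
        int vertices = Integer.parseInt(p.getProperty("bench.graph.barabasi.vertices"));
        int degree = Integer.parseInt(p.getProperty("bench.graph.barabasi.degree"));
        return p.getProperty("bench.datasets.directory")
             + p.getProperty("bench.graph.barabasi.file")
             + " (" + vertices + " vertices, ~avg degree " + degree + ")";
    }
}
```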
@@ -0,0 +1,43 @@
+As previously covered in the [[OperationFactory]] section, almost all operations will only compute on a subset of the total graph. They will generally start from some start vertex/vertices, then perform a "traversal":http://arxiv.org/abs/1004.1001 from there.
+The way these start vertices are selected can be interesting for many reasons. For example, to more closely model a real-world application you may want your benchmark to access certain vertices more frequently than others. Alternatively, you may want to reduce the impact of caching by accessing every vertex with equal frequency.
+The purpose of @Evaluator@ classes is to provide developers with an easy means of controlling how these start vertices are selected.
+
+h3. Random Vertex Sampling
+
+Retrieving a random vertex sample is straightforward: the @StatisticsHelper.getSampleVertexIds(Graph db, Evaluator evaluator, int sampleSize)@ helper method provides this functionality. However, the way it does so is the interesting part. The following flow of events provides a general explanation of the selection process:
+# Every vertex in the graph gets assigned some @score@ - a floating point number.
+# The sum of all vertices' @scores@ is calculated and stored in @scoreSum@.
+ ** A vertex's score denotes how much of the region [0,@scoreSum@] it "owns".
+# A floating point number is uniformly randomly generated in the range [0,@scoreSum@].
+# The vertex that owns the region in which the random number resides is selected.
+# Return to 3) and repeat until a large enough sample has been retrieved.
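The five steps above amount to classic roulette-wheel (fitness-proportionate) selection. A minimal standalone sketch, with hypothetical names rather than the actual @StatisticsHelper@ API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Roulette-wheel selection: scores[i] is vertex i's score; each vertex "owns"
// a slice of [0, scoreSum] proportional to its score, and uniform random
// draws in that range pick the owning vertex. Assumes scoreSum > 0.
public class SamplingSketch {
    public static List<Integer> sample(double[] scores, int sampleSize, Random rng) {
        double scoreSum = 0;                          // steps 1-2: sum all scores
        for (double s : scores) scoreSum += s;
        List<Integer> sampledIds = new ArrayList<>();
        while (sampledIds.size() < sampleSize) {      // step 5: repeat until full
            double r = rng.nextDouble() * scoreSum;   // step 3: uniform in [0, scoreSum)
            double cumulative = 0;
            for (int v = 0; v < scores.length; v++) { // step 4: find the owner
                cumulative += scores[v];
                if (r < cumulative) { sampledIds.add(v); break; }
            }
        }
        return sampledIds;
    }
}
```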
+
+h3. Standard Evaluator Implementations
+
+By taking an @Evaluator@ instance as one of its input parameters, the @getSampleVertexIds@ method gives the developer control over how vertex @scores@ are calculated. The following list presents some of the standard @Evaluator@ implementations:
+* *EvaluatorUniform:* All vertices are assigned the same @score@.
+* *EvaluatorDegree:* A vertex's score is proportional to its degree (the number of edges entering it and leaving it).
+* *EvaluatorInDegree:* A vertex's score is proportional to its in-degree (the number of edges entering it).
+* *EvaluatorOutDegree:* A vertex's score is proportional to its out-degree (the number of edges leaving it).
+* *EvaluatorProperty:* A vertex's score is retrieved from a specified property (e.g. age, weight, coordinates, etc).
+
+h3. An Example
+
+To better explain the concept, consider the example graph below:
+
+!=https://github.com/tinkerpop/graphdb-bench/raw/master/doc/images/graphdb-bench-evaluator-graph.png!
+
+If vertices were selected from this graph using an @EvaluatorUniform@, every vertex would be equally likely to be selected. In contrast, if we were to use @EvaluatorDegree@ then the vertices with higher degree would be selected with higher probability (and therefore higher frequency). The image below visually illustrates this:
+
+!=https://github.com/tinkerpop/graphdb-bench/raw/master/doc/images/graphdb-bench-evaluator-distribution.png!
+
+h3. Extending Evaluator
+
+Finally, although the standard @Evaluator@ implementations are sufficient for many (most?) benchmarks, there will always be those who want more. When the standard implementations don't do the trick, defining your own is easy. Here's what you need to do, in code-snippet form:
+
+bc. public class EvaluatorExample extends Evaluator {
+ @Override
+ public double evaluate(Vertex vertex) {
+ return 0.0; // <CALCULATE SCORE DETERMINISTICALLY>
+ }
+}
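As a concrete (hypothetical) example in the spirit of @EvaluatorProperty@, the sketch below reads the score from a named property; a plain @Map@ stands in for a Blueprints @Vertex@ so it runs without the graph API:

```java
import java.util.Map;

// Hypothetical analogue of EvaluatorProperty: the score comes from a named
// vertex property; a Map stands in for a Blueprints Vertex here. Missing or
// non-numeric properties score 0, and evaluation is deterministic, as
// Evaluator requires.
public class PropertyEvaluatorSketch {
    private final String propertyKey;

    public PropertyEvaluatorSketch(String propertyKey) {
        this.propertyKey = propertyKey;
    }

    public double evaluate(Map<String, Object> vertex) {
        Object value = vertex.get(propertyKey);
        return (value instanceof Number) ? ((Number) value).doubleValue() : 0.0;
    }
}
```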
@@ -0,0 +1 @@
+Put stuff here...
@@ -0,0 +1,31 @@
+!https://github.com/tinkerpop/graphdb-bench/raw/master/doc/images/graphdb-bench-logo.png!
+
+GraphDB-Bench is an extensible graph database benchmarking tool. Its goal is to provide an easy-to-use library for defining and running application/domain-specific benchmarks against different graph database implementations. To achieve this the core code-base has been kept relatively simple, through extensive use of lower layers in the "TinkerPop":http://github.com/tinkerpop stack.
+
+In addition to the benchmarking framework, GraphDB-Bench also includes a collection of artificial graph generation code.
+
+The documentation herein will provide all the information necessary for understanding how to define benchmarks, generate synthetic graphs, run benchmarks on these loaded graphs, and evaluate benchmark results.
+
+Finally, note that results are provided for each benchmark described in this documentation, for those who do not wish to run the benchmarks on their local machines.
+
+==<hr/>==
+
+# [[Introduction]]
+# [[Creating Artificial Graphs]]
+# Defining Benchmarks
+ ** [[Overview]]
+ ** Extending [[Operation]]
+ ** Extending [[OperationFactory]]
+ *** About [[Evaluator]]
+ ** Extending [[Benchmark]]
+# [[Running Benchmarks]]
+# Analyzing Benchmark Results
+ ** [[Reading Result Logs]]
+ ** [[Plotting Results]] *TODO*
+# [[Published Benchmark Results]]
+# [[Frequently Asked Questions]] *TODO*
+# [[Acknowledgements]]
+
+==<hr/>==
+
+For more details please ask questions on the "Gremlin mailing list":http://groups.google.com/group/gremlin-users/topics or git clone!
@@ -0,0 +1,6 @@
+GraphDB-Bench is a collection of benchmarks for graph database systems. There are four primary aspects to the benchmarking process:
+
+# *Generating Graph Datasets:* creates the datasets that are loaded into the various graph systems for analysis. For more on this see [[Creating Artificial Graphs]].
+# *Defining Benchmarks:* creation of a benchmark definition that contains a set of interesting algorithms/operations. For more on this start at [[Overview]].
+# *Generating Benchmark Results:* evaluates algorithms that test the performance of various graph systems. For more on this see [[Running Benchmarks]].
+# *Analyzing Benchmark Results:* creating informative plots/charts from the benchmark results. For more on this see [[Plotting Results]].
@@ -0,0 +1,78 @@
+As mentioned in the [[Overview]], operations are the actions/algorithms that execute on a database during benchmark runs. Clearly though, the definition of "operation" will differ greatly between domains and even between applications within the same domain. Because of this, GraphDB-Bench makes it very easy to create new operations that accurately reflect the usage of *your* application.
+
+h3. Create your own Operation
+
+To implement your own operation, simply extend the @Operation@ class, override two methods (@onInitialize(String[] args)@ and @onExecute()@), and you're done.
+
+bc. public class OperationExample extends Operation {
+ @Override
+ protected void onInitialize(String[] args) {}
+ @Override
+ protected boolean onExecute() { return true; }
+}
+
+Here's what these methods do:
+* *onInitialize(String[] args):* Has one input parameter, an array of arguments @String[] args@.
+The purpose of this method is to allow the operation to carry out any "setup" tasks. For example, if the operation "gets all neighbor vertices of a given start vertex" then it may be desirable to ignore how long it takes to retrieve the "start vertex". In this case the start vertex could be retrieved within @onInitialize(String[] args)@, and then stored in a class variable until @onExecute()@ needs it.
+Note, to access the @Graph@ database, @Operation@ provides the getter method @getGraph()@.
+* *onExecute():* Takes no input parameters. The code it contains (and only the code it contains) is what will be timed when the operation is executed during a benchmark. It's quite straightforward: @onExecute()@ should contain the calculation/algorithm/traversal/operation that you are interested in benchmarking.
+Note, (for debugging purposes only) it's possible to store the result of your operation by calling @setResult(Object result)@. When comparing the performance of two different databases this is useful, as it allows you to check if each database is returning the exact same results for the exact same operation (*if their results differ something is wrong*).
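The contract above - setup excluded from timing, only @onExecute()@ measured - can be sketched standalone (hypothetical classes, not the real @Operation@ base class):

```java
// Hypothetical stand-ins for the real Operation life cycle: onInitialize()
// runs outside the timed region, and only onExecute() is measured.
public class OperationTimingSketch {
    public abstract static class Op {
        private Object result;
        protected void setResult(Object r) { result = r; }   // debugging aid
        public Object getResult() { return result; }
        protected abstract void onInitialize(String[] args); // untimed setup
        protected abstract boolean onExecute();              // timed work
    }

    // Example op: parsing (setup) happens in onInitialize, summing in onExecute.
    public static class SumOp extends Op {
        private int[] values;
        protected void onInitialize(String[] args) {
            values = new int[args.length];
            for (int i = 0; i < args.length; i++) values[i] = Integer.parseInt(args[i]);
        }
        protected boolean onExecute() {
            int sum = 0;
            for (int v : values) sum += v;
            setResult(sum);
            return true;
        }
    }

    // Returns elapsed nanoseconds of onExecute() only, or -1 if it failed.
    public static long run(Op op, String[] args) {
        op.onInitialize(args);            // excluded from the measurement
        long start = System.nanoTime();
        boolean ok = op.onExecute();      // the only code that is timed
        return ok ? System.nanoTime() - start : -1;
    }
}
```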
+
+h3. Example Operation implementations
+
+To support the explanations above, a number of example @Operation@ implementations follow:
+
+* A basic operation that uses "Blueprints":http://github.com/tinkerpop/blueprints/wiki to perform a vertex lookup. It returns the name(s) of found vertex/vertices.
+
+bc. public class OperationIndexGetVertex extends Operation {
+ private String propertyKey = null;
+ private String propertyValue = null;
+ //
+ // args [0 -> property key, 1 -> property value]
+ //
+ @Override
+ protected void onInitialize(String[] args) {
+ this.propertyKey = args[0];
+ this.propertyValue = args[1];
+ }
+ @Override
+ protected boolean onExecute() {
+ try {
+ ArrayList<Element> elements = new ArrayList<Element>();
+ for (Element element : getGraph().getIndex().get(propertyKey, propertyValue))
+ elements.add(element);
+ setResult(elements);
+ } catch (Exception e) {
+ return false;
+ }
+ return true;
+ }
+}
+
+* A basic operation that uses "Gremlin":http://github.com/tinkerpop/gremlin/wiki to retrieve all neighbor vertices of a given start vertex. In this case there is only one input parameter - the Gremlin script. It returns the number of neighbors that were found.
+
+bc. public class OperationGremlinOutNeighbors extends Operation {
+ private String gremlinScript = null;
+ //
+ // args = [0 -> gremlinScript]
+ //
+ @Override
+ protected void onInitialize(String[] args) {
+ this.gremlinScript = args[0];
+ }
+ @Override
+ protected boolean onExecute() {
+ try {
+ int neighbors = 0;
+ Iterable<Object> resultVertices;
+ resultVertices = (Iterable<Object>) BenchRunner.getGremlinScriptEngine().eval(gremlinScript);
+ Object vertices = resultVertices.iterator().next();
+ for (Vertex vertex : (Iterable<Vertex>) vertices) {
+ neighbors++;
+ }
+ setResult(neighbors);
+ } catch (ScriptException e) {
+ return false;
+ }
+ return true;
+ }
+}
