staging for release.
spmallette committed Aug 4, 2012
1 parent 47bea22 commit 25908b0
Showing 35 changed files with 2,677 additions and 0 deletions.
15 changes: 15 additions & 0 deletions doc/wiki/Acknowledgments.textile
@@ -0,0 +1,15 @@
<img width="100" src="https://github.com/tinkerpop/blueprints/raw/master/doc/images/blueprints-character-2.png"/>

This section provides a list of the people that have contributed in some way to the creation of Blueprints.

# "Marko A. Rodriguez":http://markorodriguez.com -- designed, developed, tested, and documented Blueprints.
# "Luca Garulli":http://orientechnologies.com -- developed the OrientDB implementation (@OrientGraph@).
# "Joshua Shinavier":http://fortytwo.net -- developed Blueprints Sail (@GraphSail@).
# "Darrick Weibe":http://github.com/pangloss -- tests, bug fixes, and transaction work.
# "Stephen Mallette":http://stephen.genoprime.com -- develops @RexsterGraph@ and other components.
# "Sergio Gómez Villamor":http://github.com/sgomezvillamor -- developed the Dex implementation (@DexGraph@).
# "Pierre De Wilde":http://www.linkedin.com/in/pierredewilde -- designs and tests new features.
# "Ketrina Yim":http://www.ketrinayim.com/ -- designed the Blueprints logo.
# "Matthias Broecheler":http://www.matthiasb.com/ -- designed many of the TinkerPop 2 API changes.

Please review Blueprints' "pom.xml":http://github.com/tinkerpop/blueprints/blob/master/pom.xml. Blueprints would not be possible without the work done by others to create these useful packages.
82 changes: 82 additions & 0 deletions doc/wiki/Batch-Implementation.textile
@@ -0,0 +1,82 @@
[[http://www.stacymakescents.com/wp-content/uploads/haystack-clip-art.gif|width=250px]]

```xml
<dependency>
<groupId>com.tinkerpop.blueprints</groupId>
<artifactId>blueprints-core</artifactId>
<version>??</version>
</dependency>
```

@BatchGraph@ wraps any @TransactionalGraph@ to enable batch loading of a large number of edges and vertices. It chunks the entire load into smaller batches and maintains a memory-efficient vertex cache, so that intermediate transactional state can be flushed after each chunk is loaded to release memory.


@BatchGraph@ is *ONLY* meant for loading data and does not support any retrieval or removal operations. That is, BatchGraph only supports the following methods:
* @addVertex()@ for adding vertices
* @addEdge()@ for adding edges
* @getVertex()@ to be used when adding edges
* Property getter, setter and removal methods for vertices and edges as well as @getId()@

An important limitation of @BatchGraph@ is that edge properties can only be set immediately after the edge has been added. If other vertices or edges have been created in the meantime, setting, getting or removing properties will throw exceptions. This is done to avoid caching of edges which would require a great amount of memory.

@BatchGraph@ wraps @TransactionalGraph@. To wrap arbitrary graphs, use @BatchGraph.wrap()@ which will additionally wrap non-transactional graphs.

@BatchGraph@ can also automatically set the provided element ids as properties on the respective element. Use @setVertexIdKey()@ and @setEdgeIdKey()@ to set the keys for the vertex and edge properties respectively. This is useful when the graph implementation ignores supplied ids. Setting the vertex and edge id keys to @IdGraph.ID@ also makes the loaded graph compatible with later wrapping by @IdGraph@ (see [[Id Implementation]]).

As an example, suppose we are loading a large number of edges defined by a String array with four entries called _quads_:

# The out vertex id
# The in vertex id
# The label of the edge
# A string annotation for the edge, i.e. an edge property

Assuming this array is very large, loading all these edges in a single transaction is likely to exhaust main memory. Furthermore,
one would have to rely on the database indexes to retrieve previously created vertices for a given id. @BatchGraph@ addresses
both of these issues.

```java
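// Wrap the graph: string vertex ids, commit after every 1000 elements
BatchGraph bgraph = new BatchGraph(graph, BatchGraph.IdType.STRING, 1000);
for (String[] quad : quads) {
    // quad[0] is the out vertex id, quad[1] the in vertex id
    Vertex[] vertices = new Vertex[2];
    for (int i = 0; i < 2; i++) {
        // Reuse the vertex if it was already added, otherwise create it
        vertices[i] = bgraph.getVertex(quad[i]);
        if (vertices[i] == null) vertices[i] = bgraph.addVertex(quad[i]);
    }
    // quad[2] is the edge label, quad[3] the edge annotation
    Edge edge = bgraph.addEdge(null, vertices[0], vertices[1], quad[2]);
    // Edge properties must be set immediately after the edge is added
    edge.setProperty("annotation", quad[3]);
}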
```

First, a @BatchGraph@ _bgraph_ is created wrapping an existing _graph_ and setting the id type to @IdType.STRING@ and the batch size to 1000.
@BatchGraph@ maintains a mapping from the external vertex ids, in our example the first two entries in the String array describing the edge,
to the internal vertex ids assigned by the wrapped graph database. Since this mapping is maintained in memory, it is potentially much faster
than the database index. The specified @IdType@ lets @BatchGraph@ choose the most memory-efficient mapping data structure and apply compression
algorithms where possible. There are four different @IdTypes@:

* _OBJECT_ : For arbitrary object vertex ids. This is the most generic and least space efficient type.
* _STRING_ : For string vertex ids. Attempts to apply string compression and prefixing strategies to reduce the memory footprint.
* _URL_ : For string vertex ids that parse as URLs. Applies URL specific compression schemes that are more efficient than generic string compression.
* _NUMBER_ : For numeric vertex ids. Uses primitive data structures that require significantly less memory.

The last argument in the constructor is the batch size, that is, the number of vertices and edges to load before committing a transaction and starting a
new one.

The for-loop then iterates over all the quad String arrays and creates an edge for each by first retrieving or creating the vertex end points
and then creating the edge. Note that we set the edge property immediately after creating the edge. This is required because,
for efficiency reasons, edges are only kept in memory until the next edge is created.
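
The batching behaviour described above can be sketched without the Blueprints dependency. The following is a minimal, self-contained illustration and not the actual @BatchGraph@ implementation: @ChunkedLoaderSketch@ and all of its fields and methods are hypothetical names, internal ids are simply counted up, and a counter stands in for committing a transaction on the wrapped graph.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the batching idea; not part of Blueprints.
final class ChunkedLoaderSketch {
    private final int batchSize;
    // External id -> internal id, kept in memory so no database index lookup is needed.
    private final Map<String, Long> vertexCache = new HashMap<String, Long>();
    private long nextInternalId = 0;
    private int pendingMutations = 0;
    private int commits = 0;

    ChunkedLoaderSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Returns the internal id for an external id, creating the vertex on first use.
    long getOrAddVertex(String externalId) {
        Long internal = vertexCache.get(externalId);
        if (internal == null) {
            internal = nextInternalId++;
            vertexCache.put(externalId, internal);
            mutated();
        }
        return internal;
    }

    // Records an edge between two external ids; the edge itself is never cached.
    void addEdge(String outId, String inId) {
        getOrAddVertex(outId);
        getOrAddVertex(inId);
        mutated();
    }

    // Counts one mutation and "commits" once a full batch has accumulated.
    private void mutated() {
        if (++pendingMutations >= batchSize) {
            commits++;            // a real wrapper would commit the transaction here
            pendingMutations = 0;
        }
    }

    int commits() { return commits; }

    int vertexCount() { return vertexCache.size(); }

    public static void main(String[] args) {
        ChunkedLoaderSketch loader = new ChunkedLoaderSketch(3);
        String[][] quads = { {"a", "b"}, {"b", "c"}, {"a", "c"} };
        for (String[] quad : quads) {
            loader.addEdge(quad[0], quad[1]);
        }
        System.out.println(loader.vertexCount() + " vertices, " + loader.commits() + " commits");
    }
}
```

With a batch size of 3, the three edges above produce six mutations (three new vertices and three edges), so the sketch commits twice. Only the id mapping stays in memory between commits, which is the property that keeps the memory footprint bounded.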

h2. Incremental Loading

The above describes how @BatchGraph@ can be used to load data into a graph under the assumption that the wrapped graph is initially empty. @BatchGraph@ can also be used to incrementally batch load edges and vertices into a graph with existing data. In this case, vertices may already exist for given ids.

If the wrapped graph does not ignore ids, then enabling incremental batch loading is as simple as calling @setLoadingFromScratch(false)@, i.e. to disable the assumption that data is loaded into an empty graph. If the wrapped graph does ignore ids, then one has to tell @BatchGraph@ how to find existing vertices for a given id by specifying the vertex id key using @setVertexIdKey(uid)@ where _uid_ is some string for the property key. Also, uid must be "key indexed":https://github.com/tinkerpop/blueprints/wiki/Graph-Indices for this to work.

```java
graph.createKeyIndex("uid",Vertex.class);
BatchGraph bgraph = new BatchGraph(graph, BatchGraph.IdType.STRING, 1000);
bgraph.setVertexIdKey("uid");
bgraph.setLoadingFromScratch(false);
//Load data as shown above
```

Note that incremental batch loading is more expensive than loading from scratch because @BatchGraph@ has to call on the wrapped graph to determine whether a vertex exists for a given id.
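
To illustrate where the extra cost comes from, here is a minimal, self-contained sketch of a get-or-create lookup. It is an assumption-laden model rather than Blueprints code: @IncrementalLookupSketch@ is a hypothetical class, and a plain @Map@ stands in for the key-indexed lookup on the wrapped graph.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of get-or-create during incremental loading; not Blueprints code.
final class IncrementalLookupSketch {
    // Stands in for the wrapped graph's key index on the vertex id property.
    private final Map<String, Long> store;
    // In-memory cache of ids already seen during this load.
    private final Map<String, Long> cache = new HashMap<String, Long>();
    private final boolean loadingFromScratch;
    private int storeLookups = 0;

    IncrementalLookupSketch(Map<String, Long> store, boolean loadingFromScratch) {
        this.store = store;
        this.loadingFromScratch = loadingFromScratch;
    }

    long getOrAddVertex(String externalId) {
        Long internal = cache.get(externalId);
        if (internal == null && !loadingFromScratch) {
            storeLookups++;                 // the extra cost of incremental loading
            internal = store.get(externalId);
        }
        if (internal == null) {
            internal = (long) store.size(); // create a new vertex in the store
            store.put(externalId, internal);
        }
        cache.put(externalId, internal);
        return internal;
    }

    int storeLookups() { return storeLookups; }

    public static void main(String[] args) {
        Map<String, Long> store = new HashMap<String, Long>();
        store.put("a", 0L);                 // vertex that existed before this load
        IncrementalLookupSketch loader = new IncrementalLookupSketch(store, false);
        loader.getOrAddVertex("a");         // found in the store
        loader.getOrAddVertex("b");         // not found, so it is created
        loader.getOrAddVertex("a");         // served from the cache
        System.out.println("store lookups: " + loader.storeLookups());
    }
}
```

Each id costs at most one lookup against the store. Repeated ids are served from the in-memory cache, and when loading from scratch the store is never consulted at all.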
92 changes: 92 additions & 0 deletions doc/wiki/Code-Examples.textile
@@ -0,0 +1,92 @@
This section provides a collection of basic code examples that work with the Blueprints graph API. The in-memory [[TinkerGraph]] database is used throughout the examples. Please feel free to alter the graph constructor to work with different graph databases.

# "Create a Simple Graph":#create
# "Iterate through the Elements of a Graph":#elements
# "Iterate through the Edges of a Vertex":#edge

<a name="create"></a>

h2(#create). Create a Simple Graph

Create a graph. Add two vertices. Set the @name@ property of each vertex. Create a @knows@ edge between the two vertices. Print the components of the graph.

```java
import com.tinkerpop.blueprints.impls.tg.TinkerGraph;
import com.tinkerpop.blueprints.Graph;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.Edge;
import com.tinkerpop.blueprints.Direction;

Graph graph = new TinkerGraph();
Vertex a = graph.addVertex(null);
Vertex b = graph.addVertex(null);
a.setProperty("name", "marko");
b.setProperty("name", "peter");
Edge e = graph.addEdge(null, a, b, "knows");
System.out.println(e.getVertex(Direction.OUT).getProperty("name") + "--" + e.getLabel() + "-->" + e.getVertex(Direction.IN).getProperty("name"));
```

The @System.out@ after the code executes is:

bc. marko--knows-->peter

<a name="elements"></a>

h2(#elements). Iterate through the Elements of a Graph

Load the TinkerPop play graph diagrammed in [[Property Graph Model]]. Iterate through all the vertices and print them to @System.out@. Iterate through all the edges and print them to @System.out@.

```java
public void testIteratingGraph() {
    Graph graph = TinkerGraphFactory.createTinkerGraph();
    System.out.println("Vertices of " + graph);
    for (Vertex vertex : graph.getVertices()) {
        System.out.println(vertex);
    }
    System.out.println("Edges of " + graph);
    for (Edge edge : graph.getEdges()) {
        System.out.println(edge);
    }
}
```

The @System.out@ after the code executes is:

bc. Vertices of tinkergraph[vertices:6 edges:6]
v[3]
v[2]
v[1]
v[6]
v[5]
v[4]
Edges of tinkergraph[vertices:6 edges:6]
e[10][4-created->5]
e[7][1-knows->2]
e[9][1-created->3]
e[8][1-knows->4]
e[11][4-created->3]
e[12][6-created->3]

<a name="edge"></a>

h2(#edge). Iterate through the Edges of a Vertex

Load the TinkerPop play graph diagrammed in [[Property Graph Model]]. Get vertex @1@ from the graph by its @id@. Print some information about the vertex. Iterate through the outgoing edges of the vertex and print the edges.

```java
Graph graph = TinkerGraphFactory.createTinkerGraph();
Vertex a = graph.getVertex("1");
System.out.println("vertex " + a.getId() + " has name " + a.getProperty("name"));
for (Edge e : a.getEdges(Direction.OUT)) {
    System.out.println(e);
}
```

The @System.out@ after the code executes is:

bc. vertex 1 has name marko
e[7][1-knows->2]
e[9][1-created->3]
e[8][1-knows->4]

<a name="index"></a>
19 changes: 19 additions & 0 deletions doc/wiki/Desired-Implementations.textile
@@ -0,0 +1,19 @@
!https://github.com/tinkerpop/blueprints/raw/master/doc/images/blueprints-bob-the-builder.png!

Maintaining Blueprints implementations is a time-consuming process as graph database/framework versions and features change. Having many developers each focus on a particular implementation is an ideal way of ensuring that Blueprints has wide reach and stays consistent with the latest developments. If there is a graph database/framework that is currently not supported by Blueprints and you are an expert with that system, please contribute an implementation. To get a feel for what is required, see [[Property Graph Model]] and [[Property Graph Model Test Suite]].

Below is a list of desired implementations. This list is not intended to be exhaustive. Please feel free to add to the list.

* "db4o":http://developer.db4o.com
* "Versant Object Database":http://www.versant.com
* "InfoGrid":http://infogrid.org/
* "vertexdb":http://www.dekorte.com/projects/opensource/vertexdb/
* "Redis":http://code.google.com/p/redis/ - see "Blueredis":https://github.com/dmitriid/blueredis
* "AvocadoDB":http://www.avocadodb.org/
* "Lucene":http://lucene.apache.org/core/ - see "Lumeo":https://github.com/karussell/lumeo
* "Azure Table Storage":http://www.windowsazure.com/en-us/home/features/storage/
* Sail-based RDF Stores (very easy to do as only a @Sail@ constructor is needed)
** "4Store":http://4store.org/
** "AllegroGraph":http://www.franz.com/agraph/allegrograph/
** "OpenVirtuoso":http://virtuoso.openlinksw.com/
** "OWLim":http://www.ontotext.com/owlim/
109 changes: 109 additions & 0 deletions doc/wiki/Dex-Implementation.textile
@@ -0,0 +1,109 @@
[[http://www.sparsity-technologies.com/images/sparsity_logo_web.png]]

```xml
<dependency>
<groupId>com.tinkerpop.blueprints</groupId>
<artifactId>blueprints-dex-graph</artifactId>
<version>??</version>
</dependency>
```

```java
Graph graph = new DexGraph("/tmp/graph.dex");
```

"Dex":http://www.sparsity-technologies.com/dex is a graph database developed by "Sparsity Technologies":http://www.sparsity-technologies.com/. For a fine summary of the Dex graph database, please review the following "presentations":http://www.sparsity-technologies.com/dex_tutorials. The software can be downloaded from "here":http://www.sparsity-technologies.com/dex_downloads and be used with the default evaluation license (which restricts the amount of information Dex can deal with).

h2. Vertex label

Like edges, Dex vertices also have a label. Thus, when adding a vertex to the database, its label can be set as follows:

```java
((DexGraph)graph).label.set("people");
Vertex v = graph.addVertex(null);
assertTrue(v.getProperty(StringFactory.LABEL).equals("people"));
```

The @DexGraph#label@ property is also relevant for other methods. Go to the javadoc of each of the following methods to see how:
* @DexGraph#addVertex@
* @DexGraph#createKeyIndex(String, Class<T>)@
* @DexGraph#getEdges(String, Object)@
* @DexGraph#getVertices(String, Object)@

h2. Memory Configuration

Dex memory is not managed by the JVM heap, so a specific memory configuration must be set for Dex to cap the maximum amount of memory used by a Dex application.

Specifically, users should set @dex.io.cache.maxsize@ as explained [[here|http://www.sparsity-technologies.com/downloads/javadoc-java/com/sparsity/dex/gdb/DexConfig.html]].

h2. Management of @Iterable@ collections

As before, since Dex resources are not managed by the JVM heap, Dex-based Blueprints applications should take into account the management of @Iterable@ collections and explicitly close them in order to free native resources.

For example, if we execute a long traversal like this:

```java
for (final Vertex vertex : graph.getVertices()) {
    for (final Edge edge : vertex.getEdges(Direction.OUT)) {
        final Vertex vertex2 = edge.getVertex(Direction.IN);
        for (final Edge edge2 : vertex2.getEdges(Direction.OUT)) {
            ...
        }
    }
}
```

all retrieved collections won't be closed until the graph database is stopped. Keeping this many resources active will have a negative impact on performance.

To avoid this, all collections returned by methods in the Dex implementation implement @CloseableIterable@. Thus, we could implement the previous traversal as follows:

```java
CloseableIterable<Vertex> vv = (CloseableIterable<Vertex>) graph.getVertices();
for (final Vertex vertex : vv) {
    CloseableIterable<Edge> ee = (CloseableIterable<Edge>) vertex.getEdges(Direction.OUT);
    for (final Edge edge : ee) {
        final Vertex vertex2 = edge.getVertex(Direction.IN);
        CloseableIterable<Edge> ee2 = (CloseableIterable<Edge>) vertex2.getEdges(Direction.OUT);
        for (final Edge edge2 : ee2) {
            ...
        }
        ee2.close();
    }
    ee.close();
}
vv.close();
```
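
One caveat with explicit @close()@ calls placed after a loop is that an exception thrown inside the loop skips them. Wrapping the iteration in try/finally avoids this. The sketch below is self-contained and illustrative only: it declares its own local @CloseableIterable@ interface as a stand-in for the Blueprints one (assumed to expose just a @close()@ method, as used above), and @TrackingIterable@ and @countAndClose@ are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of closing an iterable in a finally block; not Dex or Blueprints code.
final class CloseSketch {
    // Stand-in for a closeable iterable, declared locally so the example compiles on its own.
    interface CloseableIterable<T> extends Iterable<T> {
        void close();
    }

    // Simple implementation that records whether close() was called.
    static final class TrackingIterable<T> implements CloseableIterable<T> {
        private final List<T> items;
        boolean closed = false;

        TrackingIterable(List<T> items) { this.items = items; }

        public Iterator<T> iterator() { return items.iterator(); }

        public void close() { closed = true; }
    }

    // Iterates the collection and always releases it, even if the loop body throws.
    static <T> int countAndClose(CloseableIterable<T> iterable) {
        int count = 0;
        try {
            for (T item : iterable) {
                if (item == null) {
                    throw new IllegalStateException("unexpected null element");
                }
                count++;
            }
        } finally {
            iterable.close();
        }
        return count;
    }

    public static void main(String[] args) {
        TrackingIterable<String> vertices =
                new TrackingIterable<String>(Arrays.asList("v1", "v2"));
        System.out.println(countAndClose(vertices) + " items, closed=" + vertices.closed);
    }
}
```

The same shape applies to nested traversals: each inner collection gets its own try/finally so that it is released before the outer loop continues.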

h2. DexGraph Feature List

```
supportsDuplicateEdges = true;
supportsSelfLoops = true;
isPersistent = true;
isRDFModel = false;
supportsVertexIteration = true;
supportsEdgeIteration = true;
supportsVertexIndex = false;
supportsEdgeIndex = false;
ignoresSuppliedIds = true;
supportsTransactions = false;
supportsIndices = false;

supportsSerializableObjectProperty = false;
supportsBooleanProperty = true;
supportsDoubleProperty = true;
supportsFloatProperty = true;
supportsIntegerProperty = true;
supportsPrimitiveArrayProperty = false;
supportsUniformListProperty = false;
supportsMixedListProperty = false;
supportsLongProperty = false;
supportsMapProperty = false;
supportsStringProperty = true;

isWrapper = false;
supportsKeyIndices = true;
supportsVertexKeyIndex = true;
supportsEdgeKeyIndex = true;
supportsThreadedTransactions = false;
```
52 changes: 52 additions & 0 deletions doc/wiki/Event-Implementation.textile
@@ -0,0 +1,52 @@
```xml
<dependency>
<groupId>com.tinkerpop.blueprints</groupId>
<artifactId>blueprints-core</artifactId>
<version>??</version>
</dependency>
```

@EventGraph@ and @EventIndexableGraph@ wrap any @Graph@ or @IndexableGraph@, respectively. The purpose of an @EventGraph@ is to raise events to one or more @GraphChangedListener@ as changes to the underlying @Graph@ occur. The obvious limitation is that events will only be raised to listeners if the changes to the @Graph@ occur within the same process.

@EventTransactionalGraph@ and @EventTransactionalIndexableGraph@ wrap any @TransactionalGraph@. These wrappers behave in the same fashion as the aforementioned @EventGraph@ and @EventIndexableGraph@, but respect the concept of transactions, such that the events that are triggered during a transaction are queued until the transaction is successfully committed. Once committed, the events will fire in the order that they were queued. If the transaction is rolled back the event queue is reset.
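
The queue-until-commit behaviour can be modelled in a few lines. This is a minimal sketch under simplifying assumptions, not the Blueprints implementation: @QueuedEventsSketch@ is a hypothetical class, events are plain strings, and a list of delivered events stands in for the registered listeners.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of transaction-aware event delivery; not the Blueprints classes.
final class QueuedEventsSketch {
    // Events raised inside the current transaction, in the order they occurred.
    private final List<String> queue = new ArrayList<String>();
    // Events that have actually been fired to listeners.
    private final List<String> delivered = new ArrayList<String>();

    // Called for every change made inside the current transaction.
    void record(String event) {
        queue.add(event);
    }

    // On success, fire the queued events in the order they were recorded.
    void commit() {
        delivered.addAll(queue);
        queue.clear();
    }

    // On failure, discard everything queued since the last commit.
    void rollback() {
        queue.clear();
    }

    List<String> delivered() { return delivered; }

    public static void main(String[] args) {
        QueuedEventsSketch events = new QueuedEventsSketch();
        events.record("vertex added");
        events.record("vertex property changed");
        events.commit();                 // both events fire, in order
        events.record("edge added");
        events.rollback();               // the edge event is dropped
        System.out.println(events.delivered());
    }
}
```

Listeners therefore never observe changes from a transaction that was rolled back, and they see committed changes in the order in which they were made.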

The following events are raised:

* New vertex
* New edge
* Vertex property changed
* Edge property changed
* Vertex property removed
* Edge property removed
* Vertex removed
* Edge removed
* Graph cleared

To start processing events from a @Graph@, first implement the @GraphChangedListener@ interface. An example of such an implementation is the @ConsoleGraphChangedListener@, which writes output to the console for each event.

To add a listener to the @EventGraph@:

```java
EventGraph graph = new EventGraph(TinkerGraphFactory.createTinkerGraph());
graph.addListener(new ConsoleGraphChangedListener(graph));

Vertex v = graph.addVertex(100);
v.setProperty("name", "noname");

for (Edge edge : graph.getEdges()) {
edge.removeProperty("weight");
}
```

The following output would appear in the console:

```text
Vertex [v[100]] added to graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Vertex [v[4]] property [name] set to value of [noname] in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Edge [e[10][4-created->5]] property [weight] with value of [1.0] removed in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Edge [e[7][1-knows->2]] property [weight] with value of [0.5] removed in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Edge [e[9][1-created->3]] property [weight] with value of [0.4] removed in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Edge [e[8][1-knows->4]] property [weight] with value of [1.0] removed in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Edge [e[11][4-created->3]] property [weight] with value of [0.4] removed in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
Edge [e[12][6-created->3]] property [weight] with value of [0.2] removed in graph [eventgraph[tinkergraph[vertices:6 edges:6]]]
```