Sparksee Implementation

jbmusso edited this page Jul 12, 2016 · 4 revisions

Attention: this Wiki hosts an outdated version of the TinkerPop framework and Gremlin language documentation.

Please visit the Apache TinkerPop website and latest documentation.


<dependency>
   <groupId>com.tinkerpop.blueprints</groupId>
   <artifactId>blueprints-sparksee-graph</artifactId>
   <version>??</version>
</dependency>
Graph graph = new SparkseeGraph("/tmp/graph.gdb");

Sparksee is a graph database developed by Sparsity Technologies. For a summary of the Sparksee graph database, please read the documentation section. The software can be downloaded from here and used with the default evaluation license (which restricts the amount of information Sparksee can handle).

Vertex label

As with edges, Sparksee vertices have a label too. When adding a vertex to the database, its label can be set as follows:

((SparkseeGraph) graph).label.set("people");
Vertex v = graph.addVertex(null);
assertTrue(v.getProperty(StringFactory.LABEL).equals("people"));

The SparkseeGraph#label property is also relevant for the following methods:

  • SparkseeGraph#addVertex
  • SparkseeGraph#createKeyIndex(String, Class<T>)
  • SparkseeGraph#getEdges(String, Object)
  • SparkseeGraph#getVertices(String, Object)

Property scope

SparkseeGraph has two property scope models, selected by the thread-local variable SparkseeGraph#typeScope.

By default it follows the Blueprints property graph model, where a property applies to all vertices or to all edges. However, for compatibility with previous versions and with databases created outside Blueprints, you can switch to the mode where properties are restricted to a specific vertex or edge type.

Note that the two attribute modes are mutually exclusive: for example, you cannot see the attributes of specific vertex or edge types while working with the pure Blueprints property graph model. Thus, attributes in the two modes can share the same name but contain different values.

((SparkseeGraph) graph).label.set("people");
Vertex v = graph.addVertex(null);
v.setProperty("name", "foo");

// Create a type-specific attribute
((SparkseeGraph) graph).typeScope.set(true);
// This creates the attribute name restricted to the type people.
// It does not overwrite the attribute value foo of the Vertex attribute also called name.
v.setProperty("name", "boo");
// Restore the normal property graph behaviour
((SparkseeGraph) graph).typeScope.set(false);
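
Because the scope flag is thread-local, a value set on one thread is not seen by other threads, and a flag left enabled by mistake affects every later call on the same thread. The following is a minimal sketch of one way to guard against that, using a plain java.lang.ThreadLocal as a stand-in. The class TypeScopeSketch and its methods are hypothetical and are not part of Blueprints or Sparksee.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the thread-local flag described above. It is NOT
// the real SparkseeGraph class and needs no Sparksee library to run.
final class TypeScopeSketch {
    // The default of false mirrors the pure Blueprints property graph mode.
    static final ThreadLocal<Boolean> typeScope =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    // Runs the action with type scope enabled on the current thread and
    // always restores the previous value, even if the action throws.
    static <T> T withTypeScope(Supplier<T> action) {
        final boolean previous = typeScope.get();
        typeScope.set(true);
        try {
            return action.get();
        } finally {
            typeScope.set(previous);
        }
    }

    // Reads the flag from a freshly started thread to show that a value set
    // on one thread is not visible on another.
    static boolean readFromOtherThread() {
        final boolean[] seen = new boolean[1];
        final Thread reader = new Thread(() -> seen[0] = typeScope.get());
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        final boolean inside = withTypeScope(() -> typeScope.get());
        final boolean other = withTypeScope(TypeScopeSketch::readFromOtherThread);
        // prints: inside=true otherThread=false after=false
        System.out.println("inside=" + inside + " otherThread=" + other
                + " after=" + typeScope.get());
    }
}
```

With the real graph, the same try/finally shape would wrap the typeScope.set(true) and typeScope.set(false) calls shown above.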

Memory Configuration

Sparksee memory is not managed by the JVM heap, so a specific memory configuration must be set for Sparksee in order to limit the maximum amount of memory used by a Sparksee application.

Specifically, users should set sparksee.io.cache.maxsize as explained in the Configuration chapter of the Sparksee User Manual.
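
As a rough illustration, the sketch below writes a one-line configuration file containing that setting, using only the Java standard library. It makes several assumptions: the file uses a plain key=value layout, the value is expressed in megabytes, and 2048 is an arbitrary example value. The Sparksee User Manual remains the authority on format and units. The class name SparkseeCacheConfigSketch is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

// Hypothetical helper, not part of Sparksee or Blueprints.
final class SparkseeCacheConfigSketch {
    // Builds the configuration line for the cache limit.
    // Assumption: key=value syntax with the size given in megabytes.
    static String cacheLine(long maxSize) {
        if (maxSize < 0) {
            throw new IllegalArgumentException("maxSize must not be negative");
        }
        return "sparksee.io.cache.maxsize=" + maxSize;
    }

    public static void main(String[] args) throws IOException {
        // A temporary file keeps the example side-effect free; a real
        // application would write to its own configuration location.
        final Path file = Files.createTempFile("sparksee", ".cfg");
        Files.write(file, Collections.singletonList(cacheLine(2048)),
                StandardCharsets.UTF_8);
        System.out.println(file + ": "
                + Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

The resulting file can then be handed to the graph, for instance through the blueprints.sparksee.config property described under GraphFactory Settings.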

Management of Iterable collections

Since Sparksee resources are not managed by the JVM heap, Sparksee-based Blueprints applications should manage their Iterable collections carefully and explicitly close them in order to free native resources.

For example, if we execute a long traversal like this:

for (final Vertex vertex : graph.getVertices()) {
    for (final Edge edge : vertex.getOutEdges()) {
        final Vertex vertex2 = edge.getInVertex();
        for (final Edge edge2 : vertex2.getOutEdges()) {
            ...
        }
    }
}

none of the retrieved collections will be closed until the graph database is stopped. Keeping this many resources active has a negative impact on performance.

To avoid this, every collection returned by a method of the Sparksee implementation implements CloseableIterable. Thus, the previous traversal could be written as follows:

CloseableIterable<Vertex> vv = (CloseableIterable<Vertex>)graph.getVertices();
for (final Vertex vertex : vv) {
    CloseableIterable<Edge> ee = (CloseableIterable<Edge>)vertex.getOutEdges();
    for (final Edge edge : ee) {
        final Vertex vertex2 = edge.getInVertex();
        CloseableIterable<Edge> ee2 = (CloseableIterable<Edge>)vertex2.getOutEdges();
        for (final Edge edge2 : ee2) {
            ...
        }
        ee2.close();
    }
    ee.close();
}
vv.close();
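
In the traversal above, an exception thrown inside a loop body would skip the close() calls. A possible refinement is to close each collection in a finally block. The sketch below shows that pattern in isolation. To stay runnable without Blueprints on the classpath, it declares its own CloseableIterable stand-in, assumed to mirror the shape of the Blueprints interface (an Iterable with a close() method). The Tracked class and the forEachAndClose helper are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class CloseInFinallySketch {
    // Stand-in assumed to mirror Blueprints' CloseableIterable.
    interface CloseableIterable<T> extends Iterable<T> {
        void close();
    }

    // Test double that counts close() calls instead of freeing native memory.
    static final class Tracked<T> implements CloseableIterable<T> {
        private final List<T> items;
        int closeCalls = 0;

        Tracked(List<T> items) {
            this.items = items;
        }

        @Override
        public Iterator<T> iterator() {
            return items.iterator();
        }

        @Override
        public void close() {
            closeCalls++;
        }
    }

    // Iterates over the source and closes it even if the body throws.
    static <T> void forEachAndClose(CloseableIterable<T> source,
                                    Consumer<? super T> body) {
        try {
            for (final T item : source) {
                body.accept(item);
            }
        } finally {
            source.close();
        }
    }

    public static void main(String[] args) {
        final Tracked<String> names = new Tracked<>(Arrays.asList("a", "b", "c"));
        try {
            forEachAndClose(names, n -> {
                if (n.equals("b")) {
                    throw new IllegalStateException("traversal aborted at " + n);
                }
            });
        } catch (IllegalStateException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
        // prints "close calls: 1" although the loop body threw
        System.out.println("close calls: " + names.closeCalls);
    }
}
```

Nested traversals can call the helper recursively from the loop body, so that each inner collection is closed before the outer loop advances.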

GraphFactory Settings

If using GraphFactory to instantiate a SparkseeGraph, the following properties will apply:

key                            description
blueprints.graph               com.tinkerpop.blueprints.impls.sparksee.SparkseeGraph
blueprints.sparksee.directory  The directory of the SparkseeGraph instance.
blueprints.sparksee.config     Location of the Sparksee configuration file.
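
As an illustration of how these keys fit together, the sketch below assembles them into a map using only the Java standard library. The class SparkseeFactoryConfigSketch and the example path /tmp/sparksee.cfg are hypothetical. Passing the map to GraphFactory.open is an assumption about the Blueprints API and is therefore left as a comment rather than executed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Blueprints.
final class SparkseeFactoryConfigSketch {
    // Collects the three settings listed in the table above.
    static Map<String, Object> sparkseeSettings(String directory, String configFile) {
        final Map<String, Object> conf = new LinkedHashMap<>();
        conf.put("blueprints.graph",
                "com.tinkerpop.blueprints.impls.sparksee.SparkseeGraph");
        conf.put("blueprints.sparksee.directory", directory);
        conf.put("blueprints.sparksee.config", configFile);
        return conf;
    }

    public static void main(String[] args) {
        final Map<String, Object> conf =
                sparkseeSettings("/tmp/graph.gdb", "/tmp/sparksee.cfg");
        for (final Map.Entry<String, Object> entry : conf.entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
        // With Blueprints on the classpath, the map could presumably be
        // passed on like this (assumption, not executed here):
        // Graph graph = GraphFactory.open(conf);
    }
}
```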

SparkseeGraph Feature List

supportsDuplicateEdges = true;
supportsSelfLoops = true;
isPersistent = true;
isRDFModel = false;
supportsVertexIteration = true;
supportsEdgeIteration = true;
supportsVertexIndex = false;
supportsEdgeIndex = false;
ignoresSuppliedIds = true;
supportsTransactions = true;
supportsIndices = false;

supportsSerializableObjectProperty = false;
supportsBooleanProperty = true;
supportsDoubleProperty = true;
supportsFloatProperty = true;
supportsIntegerProperty = true;
supportsPrimitiveArrayProperty = false;
supportsUniformListProperty = false;
supportsMixedListProperty = false;
supportsLongProperty = true;
supportsMapProperty = false;
supportsStringProperty = true;

isWrapper = false;
supportsKeyIndices = true;
supportsVertexKeyIndex = true;
supportsEdgeKeyIndex = true;
supportsThreadedTransactions = false;
supportsThreadIsolatedTransactions = false;