
Unary Graph Collection Operators

Timo edited this page Aug 14, 2019 · 9 revisions

This section provides an overview of unary graph collection operators, which operate on a single GraphCollection and return a subset or transformation of the input collection.

Unary Graph Collection Operators
Selection
Matching
Limit
Distinct

Selection

Given a collection, the select operator filters the contained graphs based on their associated graph heads and returns a collection of the logical graphs that fulfill the predicate.

Method | Expects | Returns
select | Predicate function (FilterFunction<GraphHead>) for a graph head | GraphCollection which contains all logical graphs that fulfill the predicate

Frequently used filter expressions are predefined in Gradoop. Make sure to check whether a given filter is applicable to graph heads (G).

Selection example

Consider the Social Network example graph, which contains multiple logical graphs with different relations. This example takes a single graph collection and selects a subset of logical graphs that fulfill the predicate function.
In order to select all graphs that contain edges labeled 'hasMember', we would do the following:

  1. aggregate the property hasEdgeLabel_hasMember for each graph in the collection
  2. use the select method to retrieve graphs for which the newly set property is true
FlinkAsciiGraphLoader loader = getSocialNetworkLoader();
GraphCollection collection = loader.getGraphCollectionByVariables("g0", "g1", "g2", "g3");

// create the function that is used both as aggregate function and as filter function
HasEdgeLabel hasLabelHasMember = new HasEdgeLabel("hasMember");

// apply aggregation with aggregation function
collection = collection.apply(new ApplyAggregation<>(hasLabelHasMember));

// select graphs with filter function
GraphCollection result = collection.select(hasLabelHasMember);

The graph collection result now contains only the logical graph g3:Forum, since it is the only one that contains edges labeled 'hasMember'. More precisely, it is the only graph whose property 'hasEdgeLabel_hasMember' is set to true after the aggregation was applied.
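The two steps can also be followed without a Flink runtime. The block below is a minimal plain-Java model of "aggregate, then select", not Gradoop code: SelectionSketch, GraphHeadModel, aggregateHasEdgeLabel, select and selectedIds are hypothetical names, and the edge labels assigned to g0 to g3 are illustrative assumptions rather than the actual Social Network data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of "aggregate, then select". This is NOT the Gradoop API;
// every class and method name below is hypothetical.
class SelectionSketch {

  // Stand-in for a graph head: an id, the edge labels found in its graph,
  // and a property map that the aggregation step writes to.
  static class GraphHeadModel {
    final String id;
    final List<String> edgeLabels;
    final Map<String, Object> properties = new HashMap<>();

    GraphHeadModel(String id, String... edgeLabels) {
      this.id = id;
      this.edgeLabels = Arrays.asList(edgeLabels);
    }
  }

  // Step 1: store a boolean property on every graph head.
  static void aggregateHasEdgeLabel(List<GraphHeadModel> heads, String label) {
    for (GraphHeadModel head : heads) {
      head.properties.put("hasEdgeLabel_" + label, head.edgeLabels.contains(label));
    }
  }

  // Step 2: keep only the graph heads that fulfill the predicate.
  static List<GraphHeadModel> select(List<GraphHeadModel> heads,
                                     Predicate<GraphHeadModel> predicate) {
    List<GraphHeadModel> result = new ArrayList<>();
    for (GraphHeadModel head : heads) {
      if (predicate.test(head)) {
        result.add(head);
      }
    }
    return result;
  }

  // Runs both steps on four illustrative graphs and returns the selected ids.
  static List<String> selectedIds() {
    List<GraphHeadModel> heads = Arrays.asList(
        new GraphHeadModel("g0", "knows"),
        new GraphHeadModel("g1", "knows"),
        new GraphHeadModel("g2", "knows"),
        new GraphHeadModel("g3", "hasModerator", "hasMember"));
    aggregateHasEdgeLabel(heads, "hasMember");
    List<String> ids = new ArrayList<>();
    for (GraphHeadModel head : select(heads,
        h -> Boolean.TRUE.equals(h.properties.get("hasEdgeLabel_hasMember")))) {
      ids.add(head.id);
    }
    return ids;
  }
}
```

The predicate only reads the property written in step 1, which is why the selection depends on the aggregation having run first.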

Matching

The match operator matches a given pattern on the GraphCollection. The method can behave in two different ways:

  1. Return the input GraphCollection with a new property contains pattern set on each graph head
  2. Return a new collection consisting of the constructed embeddings

Method | Expects | Returns
match | (1) GDL-conformant construction pattern; (2) pattern matching algorithm (such as DepthSearchMatching); (3) Boolean returnEmbeddings, which indicates whether the embeddings are returned as a new collection (true) or the input collection is returned with new properties (false) | A graph collection containing either the embeddings or the input graphs with a new property contains pattern

Matching example

In this example we explain both ways the match method can behave when applied to a GraphCollection. Which pattern matching algorithm is best suited to a given graph is outside the scope of this page; for this example, we use the DepthSearchMatching algorithm. Consider the Social Network example graph. To find out which of the graphs in the collection embed a pattern such as (v0)-[:hasModerator]->(v1), we would do the following:

FlinkAsciiGraphLoader loader = getSocialNetworkLoader();
GraphCollection collection = loader.getGraphCollectionByVariables("g0", "g1", "g2", "g3");

// define match pattern
String pattern = "(v0)-[:hasModerator]->(v1)";

GraphCollection containsPatternCollection = collection.query(pattern, new DepthSearchMatching(), false);
GraphCollection containsEmbeddingsCollection = collection.query(pattern, new DepthSearchMatching(), true);

Calling print() on the resulting graph collections would give us the following output (simplified):

containsPatternCollection
g0:Community {contains pattern:false,interest:"Hadoop",vertexCount:3}[ . . . ]
g1:Forum {contains pattern:true}[ . . . ]
g2:Community {contains pattern:false,interest:"Graphs",vertexCount:4}[ . . . ]
g3:Community {contains pattern:false,interest:"Databases",vertexCount:3}[ . . . ]
  
containsEmbeddingsCollection
g0: {}[
  // vertices
  (v0:Forum {title:"Graph Processing"})
  (v1:Person {gender:"m",city:"Dresden",name:"Dave",age:40})
  // edges
  (v0)-[e0:hasModerator{since:2013}]->(v1)
]
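
The difference between the two return modes can be modelled in a few lines of plain Java. This is only a sketch under strong assumptions: it does not parse GDL and does not implement DepthSearchMatching; it handles nothing but single-edge patterns of the form (a)-[:label]->(b). MatchModesSketch, Edge, containsPattern, embeddings and sampleGraphs are hypothetical names, and the sample edges are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the two return modes. NOT the Gradoop API; it only
// supports single-edge patterns that are identified by their edge label.
class MatchModesSketch {

  static class Edge {
    final String source;
    final String label;
    final String target;

    Edge(String source, String label, String target) {
      this.source = source;
      this.label = label;
      this.target = target;
    }

    @Override
    public String toString() {
      return "(" + source + ")-[:" + label + "]->(" + target + ")";
    }
  }

  // Models returnEmbeddings = false: one boolean flag per input graph.
  static Map<String, Boolean> containsPattern(Map<String, List<Edge>> graphs,
                                              String edgeLabel) {
    Map<String, Boolean> flags = new LinkedHashMap<>();
    for (Map.Entry<String, List<Edge>> graph : graphs.entrySet()) {
      boolean found = false;
      for (Edge edge : graph.getValue()) {
        if (edge.label.equals(edgeLabel)) {
          found = true;
          break;
        }
      }
      flags.put(graph.getKey(), found);
    }
    return flags;
  }

  // Models returnEmbeddings = true: one result graph per embedding found.
  static List<List<Edge>> embeddings(Map<String, List<Edge>> graphs,
                                     String edgeLabel) {
    List<List<Edge>> result = new ArrayList<>();
    for (List<Edge> edges : graphs.values()) {
      for (Edge edge : edges) {
        if (edge.label.equals(edgeLabel)) {
          result.add(Collections.singletonList(edge));
        }
      }
    }
    return result;
  }

  // Two illustrative input graphs, keyed by a made-up graph id.
  static Map<String, List<Edge>> sampleGraphs() {
    Map<String, List<Edge>> graphs = new LinkedHashMap<>();
    graphs.put("g0", Arrays.asList(new Edge("alice", "knows", "bob")));
    graphs.put("g1", Arrays.asList(
        new Edge("forum", "hasModerator", "dave"),
        new Edge("forum", "hasMember", "alice")));
    return graphs;
  }
}
```

In the first mode the result has as many entries as the input has graphs; in the second mode its size depends on the number of embeddings, so it can be smaller or larger than the input.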
  

Limit

Given a collection, the limit operator returns n arbitrary logical graphs contained in that collection.

Method Expects Returns
Method | Expects | Returns
limit | Positive integer limit which defines the number of graphs to return from the collection | Subset of the graph collection

Limit example

Using the limit operator is fairly straightforward. We define a limit n, the number of graphs the method returns. If n is greater than or equal to the total number of graphs in the initial collection, every graph is contained in the resulting GraphCollection. Using the Social Network example graph, one could apply the limit operator as follows:

FlinkAsciiGraphLoader loader = getSocialNetworkLoader();
GraphCollection collection = loader.getGraphCollectionByVariables("g0", "g1", "g2", "g3");

// define some limit
int limit = 2;

// apply limit
GraphCollection result = collection.limit(limit);

The resulting graph collection result contains two arbitrary logical graphs from the input collection, here g0 and g1.
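
The boundary behaviour described above (n at least as large as the collection returns everything) can be checked with a small plain-Java model. LimitSketch and its limit method are hypothetical and not part of Gradoop. One caveat: a Java list has a fixed order, so this model always returns the same prefix, whereas the operator described here only promises n arbitrary graphs.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the limit semantics. NOT the Gradoop API.
class LimitSketch {

  // Returns at most n elements; returns a copy of the whole list
  // when n is greater than or equal to its size.
  static <T> List<T> limit(List<T> graphs, int n) {
    if (n <= 0) {
      throw new IllegalArgumentException("limit must be a positive integer");
    }
    int end = Math.min(n, graphs.size());
    return new ArrayList<>(graphs.subList(0, end));
  }
}
```

For a list of the four ids g0 to g3, limit(ids, 2) yields two elements and limit(ids, 10) yields all four.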

Distinct

Given a graph collection, the distinct operators return a collection of distinct logical graphs. Because graph equality can be defined either by graph id or by graph isomorphism, there are two implementations.

Method | Returns
distinctById | Distinct GraphCollection
distinctByIsomorphism | Distinct GraphCollection
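
To illustrate why the two notions of equality give different results, here is a plain-Java sketch that is independent of Gradoop. It makes simplifying assumptions: graphs are small directed graphs stored as adjacency matrices, labels and properties are ignored, and isomorphism is decided by brute-force backtracking, which is only practical for a handful of vertices. DistinctSketch, GraphModel and all method names are hypothetical; how Gradoop itself decides isomorphism is not shown here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of distinct-by-id versus distinct-by-isomorphism.
// NOT the Gradoop API; labels and properties are ignored (assumption).
class DistinctSketch {

  static class GraphModel {
    final String id;
    final boolean[][] adjacency; // adjacency[i][j] is true for an edge i -> j

    GraphModel(String id, int vertexCount, int[][] edges) {
      this.id = id;
      this.adjacency = new boolean[vertexCount][vertexCount];
      for (int[] edge : edges) {
        adjacency[edge[0]][edge[1]] = true;
      }
    }
  }

  // Keeps the first graph seen for every id.
  static List<GraphModel> distinctById(List<GraphModel> graphs) {
    Set<String> seen = new HashSet<>();
    List<GraphModel> result = new ArrayList<>();
    for (GraphModel graph : graphs) {
      if (seen.add(graph.id)) {
        result.add(graph);
      }
    }
    return result;
  }

  // Keeps a graph only if it is not isomorphic to an already kept graph.
  static List<GraphModel> distinctByIsomorphism(List<GraphModel> graphs) {
    List<GraphModel> result = new ArrayList<>();
    for (GraphModel candidate : graphs) {
      boolean duplicate = false;
      for (GraphModel kept : result) {
        if (isomorphic(candidate, kept)) {
          duplicate = true;
          break;
        }
      }
      if (!duplicate) {
        result.add(candidate);
      }
    }
    return result;
  }

  static boolean isomorphic(GraphModel a, GraphModel b) {
    int n = a.adjacency.length;
    if (n != b.adjacency.length) {
      return false;
    }
    return extend(a, b, new int[n], new boolean[n], 0);
  }

  // Backtracking: try to map vertex `next` of a to each unused vertex of b.
  private static boolean extend(GraphModel a, GraphModel b, int[] mapping,
                                boolean[] used, int next) {
    int n = mapping.length;
    if (next == n) {
      return true; // every vertex pair was checked on the way down
    }
    for (int target = 0; target < n; target++) {
      if (used[target]) {
        continue;
      }
      mapping[next] = target;
      if (consistent(a, b, mapping, next)) {
        used[target] = true;
        if (extend(a, b, mapping, used, next + 1)) {
          return true;
        }
        used[target] = false;
      }
    }
    return false;
  }

  // Compares all edges between vertex `last` and the vertices mapped so far.
  private static boolean consistent(GraphModel a, GraphModel b, int[] mapping,
                                    int last) {
    for (int i = 0; i <= last; i++) {
      if (a.adjacency[i][last] != b.adjacency[mapping[i]][mapping[last]]
          || a.adjacency[last][i] != b.adjacency[mapping[last]][mapping[i]]) {
        return false;
      }
    }
    return true;
  }
}
```

Two graphs with different ids but the same shape survive distinctById and collapse under distinctByIsomorphism; two graphs that share an id but differ in shape behave the other way round.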