Adding Neighbors stored procedures. #932

Merged
merged 2 commits into from Nov 8, 2018
adding neighbors

maxdemarzi committed Sep 30, 2018
commit fb604f011ce73270a9141387a9d8b6a87ea13ff6
@@ -80,6 +80,8 @@ dependencies {

compile 'org.jsoup:jsoup:1.11.3'

compile group: 'org.roaringbitmap', name: 'RoaringBitmap', version: '0.7.17'

testCompile 'net.sourceforge.jexcelapi:jxl:2.6.12'

compileOnly group: 'com.amazonaws', name: 'aws-java-sdk-s3', version: '1.11.270'
@@ -0,0 +1,69 @@
== Node Neighbors

These procedures quickly find the distinct nodes connected to a starting node within "n" hops.

You can use '>' or '<' for all outgoing or incoming relationships, or specify the types you are interested in.

[cols="1m,5"]
|===
| apoc.neighbors(node, rel-direction-pattern, distance) | returns distinct nodes of the given relationships in the pattern up to a certain distance
| apoc.neighbors.count(node, rel-direction-pattern, distance) | returns the count of distinct nodes of the given relationships in the pattern up to a certain distance
| apoc.neighbors.byhop(node, rel-direction-pattern, distance) | returns distinct nodes of the given relationships in the pattern grouped by distance
| apoc.neighbors.count.byhop(node, rel-direction-pattern, distance) | returns the count of distinct nodes of the given relationships in the pattern grouped by distance
|===


[NOTE]
====
For graphs containing more than 2,147,483,647 nodes, use the _.large_ procedures:
====

[cols="1m,5"]
|===
| apoc.neighbors.large(node, rel-direction-pattern, distance) | returns distinct nodes of the given relationships in the pattern up to a certain distance
| apoc.neighbors.large.count(node, rel-direction-pattern, distance) | returns the count of distinct nodes of the given relationships in the pattern up to a certain distance
| apoc.neighbors.large.byhop(node, rel-direction-pattern, distance) | returns distinct nodes of the given relationships in the pattern grouped by distance
| apoc.neighbors.large.count.byhop(node, rel-direction-pattern, distance) | returns the count of distinct nodes of the given relationships in the pattern grouped by distance
|===



=== Example

.Graph Setup
[source,cypher]
----
CREATE (a:First), (b:Neighbor), (c:Neighbor), (d:Neighbor),
(a)-[:KNOWS]->(b), (b)-[:KNOWS]->(a),
(b)-[:KNOWS]->(c), (c)-[:KNOWS]->(d)
----

[source,cypher]
----
MATCH (n:First) WITH n
CALL apoc.neighbors(n,'KNOWS>', 3) YIELD node AS neighbor
RETURN neighbor
----

[source,cypher]
----
MATCH (n:First) WITH n
CALL apoc.neighbors.count(n,'KNOWS>', 3) YIELD value AS number
RETURN number
----

[source,cypher]
----
MATCH (n:First) WITH n
CALL apoc.neighbors.byhop(n,'KNOWS>', 3) YIELD nodes AS neighbors
RETURN neighbors
----

[source,cypher]
----
MATCH (n:First) WITH n
CALL apoc.neighbors.count.byhop(n,'KNOWS>', 3) YIELD value AS numbers
RETURN numbers
----
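To make the expected results of the queries above easier to reason about, here is a minimal plain-Java sketch of a by-hop expansion over the example graph. It is not the code from this PR: the class `ByHopSketch`, the method `byHop`, and the string node names are hypothetical, an in-memory adjacency map stands in for the Neo4j store, and only outgoing edges are followed (as with `'KNOWS>'`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ByHopSketch {
    // Hypothetical helper: returns, for each hop 1..distance, the nodes first
    // reached at exactly that hop. `out` maps a node to its outgoing neighbors.
    static List<Set<String>> byHop(Map<String, List<String>> out, String start, int distance) {
        List<Set<String>> hops = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        seen.add(start);                       // the start node is never reported
        Set<String> frontier = new TreeSet<>();
        frontier.add(start);
        for (int i = 0; i < distance; i++) {
            Set<String> next = new TreeSet<>();
            for (String n : frontier) {
                for (String m : out.getOrDefault(n, Collections.emptyList())) {
                    if (!seen.contains(m)) next.add(m);   // keep only nodes not reached earlier
                }
            }
            seen.addAll(next);
            hops.add(next);
            frontier = next;
        }
        return hops;
    }

    public static void main(String[] args) {
        // Same shape as the Graph Setup above: a->b, b->a, b->c, c->d
        Map<String, List<String>> out = new HashMap<>();
        out.put("a", Arrays.asList("b"));
        out.put("b", Arrays.asList("a", "c"));
        out.put("c", Arrays.asList("d"));
        System.out.println(byHop(out, "a", 3)); // prints [[b], [c], [d]]
    }
}
```

Under these assumptions the union of the three hops is {b, c, d}, so a count over the same pattern and distance would be 3, and the per-hop counts would be 1, 1, 1.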
@@ -900,6 +900,16 @@ Sometimes type information gets lost, these functions help you to coerce an "Any

Example: `'FRIEND|MENTORS>|<REPORTS_TO'` will match to :FRIEND relationships in either direction, outgoing :MENTORS relationships, and incoming :REPORTS_TO relationships.
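The pattern syntax in the example above can be illustrated with a small plain-Java sketch. This is not APOC's `RelationshipTypeAndDirections.parse`; the class `PatternSketch`, its `Dir` enum, and the `TYPE:DIRECTION` string output are hypothetical and only mirror the behaviour described in the sentence above.

```java
import java.util.ArrayList;
import java.util.List;

final class PatternSketch {
    enum Dir { OUTGOING, INCOMING, BOTH }

    // Hypothetical parser: splits on '|', reads '<' as incoming and '>' as
    // outgoing, and treats a part with no marker as both directions.
    static List<String> parse(String pattern) {
        List<String> result = new ArrayList<>();
        for (String part : pattern.split("\\|")) {
            Dir dir = part.contains("<") ? Dir.INCOMING
                    : part.contains(">") ? Dir.OUTGOING
                    : Dir.BOTH;
            String type = part.replace("<", "").replace(">", "").trim();
            // "*" is this sketch's own placeholder for "any relationship type"
            result.add((type.isEmpty() ? "*" : type) + ":" + dir);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("FRIEND|MENTORS>|<REPORTS_TO"));
        // prints [FRIEND:BOTH, MENTORS:OUTGOING, REPORTS_TO:INCOMING]
    }
}
```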

=== Neighbor Functions

[cols="1m,5"]
|===
| apoc.neighbors(node, rel-direction-pattern, distance) | returns distinct nodes of the given relationships in the pattern up to a certain distance
| apoc.neighbors.count(node, rel-direction-pattern, distance) | returns the count of distinct nodes of the given relationships in the pattern up to a certain distance
| apoc.neighbors.byhop(node, rel-direction-pattern, distance) | returns distinct nodes of the given relationships in the pattern grouped by distance
| apoc.neighbors.count.byhop(node, rel-direction-pattern, distance) | returns the count of distinct nodes of the given relationships in the pattern grouped by distance
|===


=== Math Functions

@@ -0,0 +1,240 @@
package apoc.neighbors;

import apoc.result.*;
import org.neo4j.graphdb.*;
import org.neo4j.helpers.collection.Pair;
import org.neo4j.procedure.*;
import org.roaringbitmap.RoaringBitmap;

import java.util.*;
import java.util.stream.*;

import static apoc.path.RelationshipTypeAndDirections.parse;

public class Neighbors {

@Context
public GraphDatabaseService db;

@Procedure("apoc.neighbors")
@Description("apoc.neighbors(node, rel-direction-pattern, distance) - returns distinct nodes of the given relationships in the pattern up to a certain distance, can use '>' or '<' for all outgoing or incoming relationships")
public Stream<NodeResult> neighbors(@Name("node") Node node, @Name(value = "types", defaultValue = "") String types, @Name(value="distance", defaultValue = "1") Long distance) {
if (distance < 1) return Stream.empty();
if (types==null || types.isEmpty()) return Stream.empty();

// Initialize bitmaps for iteration
RoaringBitmap seen = new RoaringBitmap();
RoaringBitmap nextA = new RoaringBitmap();
RoaringBitmap nextB = new RoaringBitmap();
int nodeId = (int) node.getId();


@sarmbruster

sarmbruster Oct 1, 2018

Member

I think that's an unsafe operation when working on a graph with more than 2bn nodes, which is possible even with the standard store format. Shouldn't we either abort the operation with an exception or somehow deal gracefully with larger node ids?


@maxdemarzi

maxdemarzi Oct 1, 2018

Contributor

There is NeighborsLarge for those rare folks.


@maxdemarzi

maxdemarzi Oct 3, 2018

Contributor

Do we want to just drop the regular and rename the .large so it won't be an issue?


@sarmbruster

sarmbruster Oct 3, 2018

Member

Maybe leave both as they are. But the int version should throw an exception in case you're using node ids > 2bn. This way we have both: the efficiency of the int variant and the scale of the long variant. Just my 2ct.
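One way the check suggested in this thread could look is sketched below. This is only an illustration, not code from the PR: the class `NodeIdGuard` and the method `toBitmapId` are hypothetical names, and the error message is made up.

```java
final class NodeIdGuard {
    // Hypothetical helper: narrows a long node id to the int range used by the
    // casts in the int variant, failing loudly instead of silently wrapping.
    static int toBitmapId(long nodeId) {
        if (nodeId < 0 || nodeId > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                    "node id " + nodeId + " does not fit in an int; use the .large procedures");
        }
        return (int) nodeId;
    }

    public static void main(String[] args) {
        System.out.println(toBitmapId(42L));   // prints 42
        long big = 3_000_000_000L;
        System.out.println((int) big);         // a plain cast wraps: prints -1294967296
        try {
            toBitmapId(big);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Calling such a helper wherever the int variant currently casts with `(int)` would keep the compact int bitmaps for ordinary graphs while turning an out-of-range id into an explicit error.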

seen.add(nodeId);
Iterator<Integer> iterator;

// First Hop
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : node.getRelationships(pair.first(), pair.other())) {
nextB.add((int) r.getOtherNodeId(node.getId()));
}
}

for(int i = 1; i < distance; i++) {
// next even Hop
nextB.andNot(seen);
seen.or(nextB);
nextA.clear();
iterator = nextB.iterator();
while (iterator.hasNext()) {
nodeId = iterator.next();
// use a separate variable so that `node` keeps referring to the starting node
Node current = db.getNodeById((long) nodeId);
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : current.getRelationships(pair.first(), pair.other())) {
nextA.add((int) r.getOtherNodeId((long) nodeId));
}
}
}

i++;
if (i < distance) {
// next odd Hop
nextA.andNot(seen);
seen.or(nextA);
nextB.clear();
iterator = nextA.iterator();
while (iterator.hasNext()) {
nodeId = iterator.next();
// use a separate variable so that `node` keeps referring to the starting node
Node current = db.getNodeById((long) nodeId);
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : current.getRelationships(pair.first(), pair.other())) {
nextB.add((int) r.getOtherNodeId((long) nodeId));
}
}
}
}
}
if((distance % 2) == 0) {
seen.or(nextA);
} else {
seen.or(nextB);
}
// remove starting node
seen.remove((int)node.getId());

return StreamSupport.stream(seen.spliterator(), false).map(x -> new NodeResult(db.getNodeById(x)));
}

@Procedure("apoc.neighbors.count")
@Description("apoc.neighbors.count(node, rel-direction-pattern, distance) - returns distinct count of nodes of the given relationships in the pattern up to a certain distance, can use '>' or '<' for all outgoing or incoming relationships")
public Stream<LongResult> neighborsCount(@Name("node") Node node, @Name(value = "types", defaultValue = "") String types, @Name(value="distance", defaultValue = "1") Long distance) {
if (distance < 1) return Stream.empty();
if (types==null || types.isEmpty()) return Stream.empty();

// Initialize bitmaps for iteration
RoaringBitmap seen = new RoaringBitmap();
RoaringBitmap nextA = new RoaringBitmap();
RoaringBitmap nextB = new RoaringBitmap();
int nodeId = (int) node.getId();
seen.add(nodeId);
Iterator<Integer> iterator;

// First Hop
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : node.getRelationships(pair.first(), pair.other())) {
nextB.add((int) r.getOtherNodeId(node.getId()));
}
}

for(int i = 1; i < distance; i++) {
// next even Hop
nextB.andNot(seen);
seen.or(nextB);
nextA.clear();
iterator = nextB.iterator();
while (iterator.hasNext()) {
nodeId = iterator.next();
// use a separate variable so that `node` keeps referring to the starting node
Node current = db.getNodeById((long) nodeId);
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : current.getRelationships(pair.first(), pair.other())) {
nextA.add((int) r.getOtherNodeId(nodeId));
}
}
}

i++;
if (i < distance) {
// next odd Hop
nextA.andNot(seen);
seen.or(nextA);
nextB.clear();
iterator = nextA.iterator();
while (iterator.hasNext()) {
nodeId = iterator.next();
// use a separate variable so that `node` keeps referring to the starting node
Node current = db.getNodeById((long) nodeId);
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : current.getRelationships(pair.first(), pair.other())) {
nextB.add((int) r.getOtherNodeId(nodeId));
}
}
}
}
}
if((distance % 2) == 0) {
seen.or(nextA);
} else {
seen.or(nextB);
}
// remove starting node
seen.remove((int)node.getId());

return Stream.of(new LongResult((long)seen.getCardinality()));
}

@Procedure("apoc.neighbors.byhop")
@Description("apoc.neighbors.byhop(node, rel-direction-pattern, distance) - returns distinct nodes of the given relationships in the pattern at each distance, can use '>' or '<' for all outgoing or incoming relationships")
public Stream<NodeListResult> neighborsByHop(@Name("node") Node node, @Name(value = "types", defaultValue = "") String types, @Name(value="distance", defaultValue = "1") Long distance) {
if (distance < 1) return Stream.empty();
if (types==null || types.isEmpty()) return Stream.empty();

// Initialize bitmaps for iteration
RoaringBitmap[] seen = new RoaringBitmap[distance.intValue()];
for(int i = 0; i < distance; i++) {
seen[i] = new RoaringBitmap();
}
int nodeId = (int) node.getId();

Iterator<Integer> iterator;

// First Hop
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : node.getRelationships(pair.first(), pair.other())) {
seen[0].add((int) r.getOtherNodeId(node.getId()));
}
}

for(int i = 1; i < distance; i++) {
iterator = seen[i-1].iterator();
while (iterator.hasNext()) {
node = db.getNodeById((long) iterator.next());
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : node.getRelationships(pair.first(), pair.other())) {
// take the node at the other end so incoming and undirected patterns also advance
seen[i].add((int) r.getOtherNodeId(node.getId()));
}
}
}
for(int j = 0; j < i; j++){
seen[i].andNot(seen[j]);
seen[i].remove(nodeId);
}
}

return Arrays.stream(seen).map(x -> new NodeListResult(
StreamSupport.stream(x.spliterator(), false)
.map(y -> db.getNodeById((long) y))
.collect(Collectors.toList())));
}

@Procedure("apoc.neighbors.count.byhop")
@Description("apoc.neighbors.count.byhop(node, rel-direction-pattern, distance) - returns the count of distinct nodes of the given relationships in the pattern at each distance, can use '>' or '<' for all outgoing or incoming relationships")
public Stream<ListResult> neighborsCountByHop(@Name("node") Node node, @Name(value = "types", defaultValue = "") String types, @Name(value="distance", defaultValue = "1") Long distance) {
if (distance < 1) return Stream.empty();
if (types==null || types.isEmpty()) return Stream.empty();

// Initialize bitmaps for iteration
RoaringBitmap[] seen = new RoaringBitmap[distance.intValue()];
for(int i = 0; i < distance; i++) {
seen[i] = new RoaringBitmap();
}
int nodeId = (int) node.getId();

Iterator<Integer> iterator;

// First Hop
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : node.getRelationships(pair.first(), pair.other())) {
seen[0].add((int) r.getOtherNodeId(node.getId()));
}
}

for(int i = 1; i < distance; i++) {
iterator = seen[i-1].iterator();
while (iterator.hasNext()) {
node = db.getNodeById((long) iterator.next());
for (Pair<RelationshipType, Direction> pair : parse(types)) {
for (Relationship r : node.getRelationships(pair.first(), pair.other())) {
// take the node at the other end so incoming and undirected patterns also advance
seen[i].add((int) r.getOtherNodeId(node.getId()));
}
}
}
for(int j = 0; j < i; j++){
seen[i].andNot(seen[j]);
seen[i].remove(nodeId);
}
}

List<Object> counts = new ArrayList<>();
for(int i = 0; i < distance; i++) {
counts.add(seen[i].getCardinality());
}

return Stream.of(new ListResult(counts));
}
}