Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graph feature requests #6399

Open
kengz opened this issue Feb 10, 2016 · 10 comments
Open

Graph feature requests #6399

kengz opened this issue Feb 10, 2016 · 10 comments

Comments

@kengz
Copy link

kengz commented Feb 10, 2016

Adding more methods for retrieving theoretical graph properties would allow us to tap into the power of graphs. Some of them may be advanced, but are powerful in revealing additional insights about a graph. Some of them are already useful in the theoretical and applied domains.

Wondering if these are being considered, here's a quick list:

  • node neighborhood N[n, k], where n is the node and k the radius. (This can be done by cypher but it'd be nice to have a single method).
  • node in/out degree d(n), (add a method for it instead of writing a full cypher query and counting).
  • weighted edges (default to unit length); this would add a lot of flexibility to the graph model.
  • cycle detection
  • Dijkstra's, Prim's, Kruskal, Belford, basic vital algorithms
  • graph (minimum) spanning tree, to study relationship in a graph
  • (maximum) independent set, allows us to study the independence among nodes
  • vertex cover

Lastly, it'd be nice to have a roadmap for Neo4j up publicly.

@jexp
Copy link
Member

jexp commented Feb 11, 2016

Hey, really good ideas.

We're adding the capability for stored-procedures in Neo4j 3.0 which would nicely fit the bill here.
See this example: https://github.com/neo4j-examples/neo4j-procedure-template

If you are willing to do some java-programming we would love to see some of these as contributions from you as you probably already have some good ideas on how to implement them.

What graph sizes are you looking at? For large graphs it would be interesting to parallelize the work internally, and potentially also use some of the internal APIs to save on object allocation.

@spacecowboy
Copy link
Contributor

There several graph algorithms included, for example Dijsktra's. See the graphalgo package
https://github.com/neo4j/neo4j/tree/db02b00ad70f097588e1c92a278c164bde69b02f/community/graph-algo/src/main/java/org/neo4j/graphalgo/impl/shortestpath

@kengz
Copy link
Author

kengz commented Feb 11, 2016

I see. @jexp are stored-procedures like Cypher shortcut commands? For properties not accessible from Cypher can one use the stored-procedures, and in the similar way as the package @spacecowboy showed?

Next, I'm interested in seeing them work with Cypher, since most of the properties can readily be obtained there. The motive is to make shortcut commands for them. There's also a question of designing the proper syntax.
e.g. getting degree

# pure cypher
MATCH (n {name:'alice'})<-[e]-() RETURN count(DISTINCT(e))
# operator
MATCH (n {name:'alice'}) RETURN IN_DEGREE(n)

Instead of Cypher matching the subgraph and passing it to the next operator (IN_DEGREE in this case), the operator should extend the graph as needed, as will make sense next:
e.g. to get a neighborhood

# pure cypher
MATCH (n {name:'alice'})<-[l*0..2]-(k) RETURN DISTINCT(k)
# operator
MATCH (n {name:'alice'}) RETURN NEIGH(n, 2)

It'd be better to keep the front part of normal Cypher matching MATCH ..., and use these operators with the subgraph passed to them, alias (n) as entry points. For the more advanced graph computation I imagine:

# not doable in Cypher now; operators
# check for cycle in this graph, useful for causal-relationship
MATCH (n {name:'alice'}) RETURN HAS_CYCLE(n)
# Get results of after running Prim's algo on (n)
MATCH (n {name:'alice'}) RETURN PRIM(n)
# minimum spanning tree
MATCH (n {name:'alice'}) RETURN MST(n)

@kengz
Copy link
Author

kengz commented Feb 12, 2016

One thing worth mentioned since @jexp mentioned the graph size. Normal use cases don't go beyond billions of nodes.

While most graph algorithms are easy and quick to implement, I'm sure you guys know that some of the classic graph problems are NP-hard. In fact, if not careful, these may lead to runaway processes. Vertex cover for example takes very long to compute for a graph with over 100 nodes.

Moreover, the best solution algorithms for these are very complicated. So, implementing them will take considerable effort and time.

Examples are independent set and vertex cover, and they have vital applications, e.g. finding the optimal "covering set" for a traffic/computer/biological network.

I'll look into the stored-procedures soon and see if I can do anything :)

@jexp
Copy link
Member

jexp commented Feb 12, 2016

I'd be happy to get your input and contributions on this.

What you suggest for the combination of cypher + procedures is planned next (@boggle work's on that),
currently we're working on a plain CALL package.procedure(params);

What I would like to see is an efficient, multi-threaded implementation of those algorithms on Neo4j in a standalone java package which then can be exposed both as Neo4j Server Extension via REST, and via procedures too.

Some of those are already easily possible, for others there might be some things that make it more efficient. Esp. if you take readOps.nodeDegree() and parallel, linear reads over the page-cache structures (rel-store) into account.

@jexp
Copy link
Member

jexp commented Mar 13, 2016

@kengz if you want to collaborate with us on this, please let us know. Drop me an email to michael at neo4j.com

@jexp
Copy link
Member

jexp commented Mar 13, 2016

Btw. for DEGREE you can use this as an efficient means already today:

MATCH (n)
RETURN n, size( (n)-->() ) as out_degree, size( (n)<--() ) as in_degree, size( (n)-[:TYPE]->() ) as out_TYPE_degree;

@kengz
Copy link
Author

kengz commented Mar 13, 2016

@jexp Definitely! Though I was recently bogged down by work and other side projects, I may start on it soon together with my side project to build a graph-based brain for an AI. The implementation will be in Java and neo4j, and the required algorithms would include degrees, spanning tree etc.; that's a place to start, with applications in mind.

I brought up the brain project to Ryan from neo4j recently, just to see if there's any interest from neo4j, but that's unrelated to this.

@igorborojevic
Copy link
Contributor

Graph brain.. can it be thought to play Go ?

@kengz
Copy link
Author

kengz commented Mar 13, 2016

hahh in theory yes... in practice, no idea. But in all seriousness a graph-based brain would be a more powerful encoding of knowledge (with lower Kolmogorov complexity), and will have drastically lower complexity in terms of operations in the brain. That should render more power accessible to the AI. oops getting off topic here. Maybe need a new thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants