Support for Cypher DSL #110

andreasronge opened this Issue Dec 15, 2011 · 14 comments


5 participants


Design Goals

  • Should be easy to understand how the DSL maps to the Cypher language
  • Should make it easy to use prefixed Neo4j.rb relationships and lucene indices
  • Should have a Rubyish syntax

Using Strings

The DSL should support using strings that are sent directly to the Cypher Java API.

Neo4j.query do
  START "foo=node(1)"
  MATCH "foo-[foo_bar]->bar"
  RETURN "bar, foo_bar"
end

It might be a bit controversial to have method names with capital letters like START, but I think it's easier to read.
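To make the string form concrete, here is a minimal sketch of how such a block could be collected into one Cypher string. The class name CypherSketch and its query method are made up for illustration and are not part of Neo4j.rb; the sketch only builds the string and does not talk to the Cypher Java API.

```ruby
# Hypothetical sketch: collects clauses inside a block and joins them
# into a single Cypher string. Not the actual Neo4j.rb implementation.
class CypherSketch
  def initialize
    @clauses = []
  end

  # Upper-case method names are legal in Ruby when called with an argument;
  # without one they would be parsed as constants.
  def START(expr)
    @clauses << "START #{expr}"
  end

  def MATCH(expr)
    @clauses << "MATCH #{expr}"
  end

  def RETURN(expr)
    @clauses << "RETURN #{expr}"
  end

  def to_s
    @clauses.join(' ')
  end

  # Evaluates the block with a fresh builder as self and returns the string.
  def self.query(&block)
    dsl = new
    dsl.instance_eval(&block)
    dsl.to_s
  end
end

cypher = CypherSketch.query do
  START "foo=node(1)"
  MATCH "foo-[foo_bar]->bar"
  RETURN "bar, foo_bar"
end
puts cypher
```

The instance_eval call is what lets START, MATCH and RETURN be written without a receiver inside the block.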

Using Hashes and Bindings

The DSL also supports hashes instead of strings, and the two styles can be mixed.

Neo4j.query do
  START :foo => node(1)
  # following is same as MATCH "foo-[foo_bar?]->bar"
  RETURN bar, foo_bar

The symbols above define instance variables that can be used later.
For example, START :foo => node(1) creates the instance variable foo.
This means that we can actually check the syntax of the DSL and make sure that all variables are defined.
The MATCH on the second line uses the instance variable foo and
defines two new instance variables, foo_bar and bar, which are used in the RETURN method.
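A rough sketch of how the variable check could work, assuming each bound symbol becomes a method on the DSL object so that an unbound name fails with Ruby's normal NameError. BindingSketch, Var and the lower-case start/returns names are placeholders for illustration only.

```ruby
# Hypothetical sketch: binding a symbol defines a reader method on the DSL
# object, so using a name that was never bound raises NameError.
class BindingSketch
  Var = Struct.new(:name) do
    def to_s
      name.to_s
    end
  end

  def initialize
    @clauses = []
  end

  def node(*ids)
    "node(#{ids.join(',')})"
  end

  # start :foo => node(1) records the clause and makes `foo` available.
  def start(bindings)
    bindings.each do |name, expr|
      @clauses << "START #{name}=#{expr}"
      var = Var.new(name)
      define_singleton_method(name) { var }
    end
  end

  def returns(*vars)
    @clauses << "RETURN #{vars.map(&:to_s).join(', ')}"
  end

  def to_s
    @clauses.join(' ')
  end
end

q = BindingSketch.new
q.instance_eval do
  start :foo => node(1)
  returns foo
end
puts q.to_s

# `bar` was never bound, so the DSL rejects it.
error = begin
  BindingSketch.new.instance_eval { returns bar }
  nil
rescue NameError => e
  e
end
puts error.class
```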

Using Method Chaining

For simpler queries method chaining is supported.
Example of using method chaining instead of the block above:

Neo4j.query.start{:foo => node(1)}.match{}.to(:bar)}.returns{bar, foo_bar}

Starting from any node

Method chaining will probably be more used when you already have the start node since the query will be simpler.

node = Person.find(:name => 'andreas')
node.query.match{friends}.to(:bar)}.returns{friends.since, bar}

We know the declared relationships on the node, so we don't have to specify outgoing but can instead use friends, as shown in the example above.
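As an illustration of chaining from a known node, here is a small sketch in which every step returns the builder itself. ChainSketch and PersonSketch are invented names, the start variable n is an assumption, and the pattern is passed as a plain string because the block syntax is still undecided.

```ruby
# Hypothetical sketch of method chaining: each call appends a clause
# and returns self so that calls can be strung together.
class ChainSketch
  def initialize(start_id)
    @parts = ["START n=node(#{start_id})"]
  end

  def match(pattern)
    @parts << "MATCH #{pattern}"
    self
  end

  def returns(*names)
    @parts << "RETURN #{names.join(', ')}"
    self
  end

  def to_s
    @parts.join(' ')
  end
end

# Stand-in for a node wrapper that knows its own id.
PersonSketch = Struct.new(:neo_id) do
  def query
    ChainSketch.new(neo_id)
  end
end

node = PersonSketch.new(42)
cypher = node.query.match("n-[:friends]->bar").returns(:bar).to_s
puts cypher
```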

Optional Relationships

Neo4j.query do
  START "a=node(2)"
  MATCH "a-[?]->x"
  RETURN "a,x"
end

Same as

Neo4j.query do
  START :a=>node(2)
  MATCH a.outgoing?.to(:x)
  RETURN a, x
end

With types

Neo4j.query do
  START :a=>node(2)
  MATCH a.outgoing?.as(:foo)
  RETURN a, foo
end

Using Lucene index

Neo4j.query do
  START :user => index(Person, :name => "Anakin"),
  # etc ...

We will then know which type of index should be used (node/relationship and exact or fulltext).
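A sketch of how the index lookup could be turned into a START expression. It assumes the index name is the class name plus an _exact or _fulltext suffix, as in the Person_exact string used elsewhere in this issue; the helper index_start and its parameters are hypothetical.

```ruby
# Hypothetical helper: builds a Cypher START expression for an index lookup.
# Assumes index names of the form "<Class>_<type>", e.g. Person_exact.
def index_start(var, model_name, key, value, type: :exact, entity: :node)
  "#{var}=#{entity}:#{model_name}_#{type}(#{key} = '#{value}')"
end

exact = index_start(:john, "Person", :name, "John")
fulltext = index_start(:john, "Person", :name, "John", type: :fulltext)
puts exact
puts fulltext
```

In a real implementation the type and entity would come from the index declaration on the class instead of being passed in.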

Using Relationship

Neo4j.rb can have prefixed relationships, for example:

class Actor
  include Neo4j::NodeMixin

This example will prefix the acted_in relationship as Actor#acted_in.

Neo4j.query do
  START :actor => :Actor(:name => "Anakin")
  MATCH actor.outgoing(Actor.acted_in).as(:foo_bar).to(:bar)  
  # or maybe this since we know it is an Actor

Multiple Relationships

The following query:

Neo4j.query do
  START "john=node:Person_exact(name = 'John')"
  MATCH "john-[:friend]->()-[:friend]->fof"
  RETURN "john, fof"
end

can also be written as this:

Neo4j.query do
  START :john => index(Person, :name => 'John')
  MATCH john.outgoing(:friends).to().outgoing(:friends).as(:fof)
  RETURN john, fof
end


The following query

Neo4j.query do 
  START "user=node(5,4,1,2,3)"
  MATCH "user-[:friend]->follower"
  WHERE " =~ /S.*/"
  RETURN "user,"

can also be written like this

Neo4j.query do 
  START :user=>node(5,4,1,2,3)
  MATCH user.outgoing(:friend).to(:follower)
  WHERE == /S.*/
  RETURN "user,"

Yes, this is actually possible but maybe a bit crazy.
The follower object implements the method_missing method and returns an object which overloads
the == operator on the object.

This also means that we get validation of the RegExp (but it might have a different syntax in Java, so we should also allow a String as a regexp somehow).
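Roughly how the method_missing trick could look. NodeVar and Property are invented class names, and the property name used below (name) is only an assumption for the example.

```ruby
# Hypothetical sketch: a query variable turns unknown method calls into
# property references, and the property overloads == to emit Cypher text.
class NodeVar
  def initialize(name)
    @name = name
  end

  def method_missing(prop, *args)
    return super unless args.empty?
    Property.new("#{@name}.#{prop}")
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

class Property
  def initialize(path)
    @path = path
  end

  # Builds a WHERE fragment instead of comparing the two objects.
  def ==(other)
    case other
    when Regexp then "#{@path} =~ /#{other.source}/"
    when String then "#{@path} = \"#{other}\""
    else "#{@path} = #{other}"
    end
  end
end

follower = NodeVar.new(:follower)
clause = follower.name == /S.*/
puts "WHERE #{clause}"
```

Because the right-hand side is a real Ruby Regexp literal, a malformed pattern is rejected by Ruby before any Cypher is generated.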

Setting depth

Neo4j.query do
  START :actor => :Actor(:name => "Anakin")
  MATCH actor.outgoing(Actor.acted_in).as(:foo_bar).to(:bar).depth(:any)
  # or 
  #  MATCH

This is just an early draft. Feedback is very welcome!

systay commented Dec 15, 2011

Looks really cool. Great job!

dnagir commented Dec 16, 2011

What I'm missing now is ability to start the query from known node.

So instead of Neo4j.query{ START :user => node(user.neo_id) } I'd rather write user.query....

Another thing is that this DSL seems to be too verbose. Cypher is a one-liner most of the time.

This DSL advocates multiple lines even for the simplest queries.

I would prefer something like blog.query.match { posts(:p).comments(:c) }.return { c }.


Yes, we should support starting from any node as well. I've updated the issue above.
I'm not sure that Cypher queries will be one-liners most of the time.
Here is an example of a typical cypher query I use in a project:

  START n=node:admin_Facility_exact("name:*")
  MATCH (n)<-[:subfacilities*0..3]-()<-[:uses]-(p), (n)<-[:uses]-(o)<-[:member_of]-(p2)
  WHERE p._classname = "Person" AND p.number != p2.number

This also brings up another issue: WHERE p._classname = "Person" is probably going to be a common thing to do.
We should make it easier to express that.

I think we want to chain methods when you already have a start node, but use the block DSL when you don't have a start node or need to write more complex queries.


It should also be easier to express things like the line below and avoid using lucene index to find all instances of a class.

START n=node:admin_Facility_exact("name:*")

Instead we can use our class node (rule node) and traverse to find all instances.

Maybe it can be expressed like this instead:

dnagir commented Dec 17, 2011

Ok. It all makes sense. But my 2 cents:

  • WHERE p._classname = "Person": I'm also pretty sure it will be a common use case.
  • We should NOT use ALL UPPER CASE, as it is supposed to be a constant in Ruby. 100% against it. It doesn't make sense to go against the language, as Cypher itself is not case-sensitive (at least you can write lowercase start, where, return etc).
  • Facility.query_all {...} - I think we should provide the entry point into the DSL the same way no matter where you start. I would think Neo4j.query..., my_facility.query..., Facility.query... etc would be reasonable.

Also my 2 cents re DSL:

  • ...match{} I would rather write ...match{ foo.foo_bar } (the foo_bar indicates outgoing relationship by default, but we could change it to incoming: foo.foo_bar(:in)).

The foo_bar should be generated based on the existing relationship on the Model. So if you misspell it, it would raise.

But we probably shouldn't try to make it perfect the first time.
We'll see more common patterns only once we start using it (I don't have enough of those yet).


dnagir commented Dec 17, 2011

Another thing to remember is that we should be able to pass in parameters natively: blog.query.where(since) { |since| created_at >= since }

There's a lot to learn for the similar API from squeel.


Yes, that would be nice.
But it requires a lot of operator overloading, which is a bit limited (not all Ruby operators can be overloaded).
{ created_at >= since } should be translated into a string "something.created_at >= since".
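A minimal sketch of that translation, assuming a small expression class (here called Expr, a made-up name) that overloads the comparison operators. Since && and || cannot be redefined in Ruby, the sketch uses & as a stand-in for AND.

```ruby
# Hypothetical sketch: comparison operators build Cypher text instead of
# evaluating to true or false.
class Expr
  def initialize(text)
    @text = text
  end

  %w[> >= < <=].each do |op|
    define_method(op) do |other|
      Expr.new("#{@text} #{op} #{Expr.literal(other)}")
    end
  end

  # && and || cannot be overloaded, so & is used for AND in this sketch.
  def &(other)
    Expr.new("(#{@text}) AND (#{other})")
  end

  def to_s
    @text
  end

  # Quotes strings and leaves everything else as it prints.
  def self.literal(value)
    value.is_a?(String) ? "\"#{value}\"" : value.to_s
  end
end

created_at = Expr.new("something.created_at")
since = Expr.new("since")

simple = (created_at >= since).to_s
combined = ((created_at >= since) & (created_at < 20120101)).to_s
puts simple
puts combined
```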

Have to do some more thinking and study the squeel syntax.

Regarding .match{}: that means traverse any outgoing relationship, not just the foo_bar relationship type.
But we should support foo.foo_bar as well.

Yes, I know having method names in upper case is controversial. But we sort of build our own language and I just thought it was easier to read. But I'm willing to change to lower case anyway, since I know it might upset people :-)


Regarding Facility.query_all - we can instead add a query method on traversals, e.g. Facility.all.query, which means traverse from the Rule/Class node with the _all relationship type.
If you create your own rules (like scope in active record) it's possible to combine rules with Cypher.

For example

class Facility < Neo4j::Model
  rule(:ready) { ...}
  rule(:used) { !used_by.empty?}
end

To query all Facilities that are both ready and used we can do this:
Facility.ready.used.query{ ...}
This will also narrow down the scope of the traversals and make them faster.


How about using the >>, << and <=> operators ?

Neo4j.query do
  # Same as: start "foo=node(1)"
  start(:foo) = node(1)

  # Related nodes: match (n) -- (x) 
  match foo <=> x

  # match "foo-[:foo_bar]->bar"
  match foo >> [:foo_bar] >> :bar # or foo >> ":foo_bar" >> bar

  # match "foo-[foo_bar]->bar"
  match foo >> [any, :foo_bar] >> :bar  # or foo >> "foo_bar" >> bar

  # match "foo-[*]->bar"
  match foo >> [any] >> :bar # same as foo >> [] >> :bar, or foo >> bar

  # Variable length relationships
  # match a-[:KNOWS*1..3]->x
  match a >> [:knows, 1..3] >> x # or a >> ":knows*1..3" 

  # MATCH (n)-[r:friends]->()
  match n >> [:r, :friends] # (bind outgoing friends to variable r)

  # match (a)-[:KNOWS]->(b)-[:KNOWS]->(c)
  match a >> [:knows] >> b >> [:knows] >> c
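
The >> form could be prototyped along these lines. PathSketch is an invented name, and only the [:type] and [:type, range] forms are handled; the any keyword, variable binding and the <=> operator are left out of this sketch.

```ruby
# Hypothetical sketch: >> builds up a Cypher MATCH pattern as a string.
class PathSketch
  def initialize(text, open = false)
    @text = text
    @open = open # true when the pattern ends with an arrow awaiting a node
  end

  def >>(other)
    if other.is_a?(Array)
      PathSketch.new("#{@text}-[#{PathSketch.rel(other)}]->", true)
    else
      arrow = @open ? "" : "-->"
      PathSketch.new("#{@text}#{arrow}#{other}")
    end
  end

  def to_s
    @text
  end

  # [:knows] becomes ":knows", [:knows, 1..3] becomes ":knows*1..3".
  def self.rel(spec)
    type, range = spec
    range ? ":#{type}*#{range.first}..#{range.last}" : ":#{type}"
  end
end

a = PathSketch.new("a")
b = PathSketch.new("b")

two_hops = (a >> [:knows] >> b >> [:knows] >> :c).to_s
variable = (a >> [:knows, 1..3] >> :x).to_s
puts two_hops
puts variable
```

Since >> is left-associative, each step only needs to know whether the previous one ended on a relationship or on a node.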

dnagir commented Jan 16, 2012

I like it. We could also allow simple strings and relations:

match "a->[:plain_cypher_string] ->b"
match a >> Person.friends >> x

But we can also use > together with >> to convey a slightly different meaning. (Maybe > would mean a depth of 1, while >> would mean unlimited, or similar.)

Not thinking about the details right now, just the DSL.

Also something like this would be nice:

# self is a Company model

def all_companies
  query do
    start(:u) # The value should be implied as `self`
    match c >> Company.groups >> UserGroup.participations >> User.participations >> u
    return distinct c

This is actually what I currently have (the ugly):

def all_companies
  res = Neo4j.query("""
      START u=node({s})
      MATCH c-[:`#{Company.groups}`]->()-[:`#{UserGroup.participations}`]->()-[:`#{User.participations}`]->u
      RETURN distinct c
    """, 's'=>neo_id){|r| r['c'].wrapper }

pehrlich commented Mar 7, 2012

Hello! I've been building up my own cypher library which begins to implement many of the things here. It also has some pretty cool innovations, which I'll show.

You can see the source, here:

# post.rb
def comments 

    # posts_controller.rb
    def comments
      render json: Post.comments.paginate(pagination)

    # application controller (inspired by GitHub)
    def pagination
      out = {}
      out[:per_page] = params[:per_page] if params[:per_page]
      out[:page] = params[:page] if params[:page]
      out[:skip] = params[:skip] if params[:skip] # unused

chained methods

Allows match, where, limit, etc. Doesn't yet support duplicates. Allows order, limit, returning, and so on to be applied in any order.

Starts at self

I found myself repeating this pattern a lot: "self = node(#{}". So I moved cypher to belong to model, and made that an assumed default if the start method is not called.

returning content

  • Everything comes out as hashes of ruby objects with symbolized keys. No bombarding beginners with java objects.
  • There are two ways to get content out.
    • The simpler is the #mapped method, which accepts symbols and returns all the results
      as hashes with those symbols as keys. This would replace #returning in the above.
      • returning..paginate: With #returning, a query object is returned rather than data, allowing lazy querying.

Next Up

  • I'm not happy with how data is returned from the query. It is possible to return multiple objects from the query. The current implementation is a hack for prefetching data (i.e., what Arel's :include parameter does right). I'm going to explore two solutions:
  • The good one: storing the data behind the scenes, so that if you set returning(:comments, :post), then when called, the post would already be fetched. I just thought of this this morning and haven't checked to see if this is already implemented.
    • The above implementation could be quite complicated under the hood, as to be perfect it would need to detect whether two differently worded queries would deliver the same results. I don't know how easy this is.
  • The ok one: Allow #returning to receive a block. This block receives a hash of the returned row, and is used to format that row in to a desirable shape. I'm thinking something like this:
# return a comment for rendering by the view.  A more proper use might be instantiating a 
# container class with the comment and rel.
.returning(:comment, :rel) do |comment, rel|
    comment.to_json[:voted] = rel.voted?
  • Named scopes would fit very nicely, and probably be easy to make.
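
To illustrate the block idea together with lazy querying, here is a sketch in which nothing runs until the result is enumerated. LazyResult and fake_rows are invented for this example; the lambda stands in for actually executing a Cypher query, and rows are assumed to be hashes with symbol keys.

```ruby
# Hypothetical sketch: a lazy result that formats each row through a block.
class LazyResult
  include Enumerable

  # rows_source is any callable returning an array of row hashes.
  def initialize(keys, rows_source, &formatter)
    @keys = keys
    @rows_source = rows_source
    @formatter = formatter
  end

  def each
    return to_enum(:each) unless block_given?
    @rows_source.call.each do |row|
      values = row.values_at(*@keys)
      yield(@formatter ? @formatter.call(*values) : Hash[@keys.zip(values)])
    end
  end
end

executed = 0
fake_rows = lambda do
  executed += 1
  [{ :comment => "nice post", :rel => { :voted => true } }]
end

result = LazyResult.new([:comment, :rel], fake_rows) do |comment, rel|
  { :text => comment, :voted => rel[:voted] }
end

executed_before = executed # the query has not run yet
formatted = result.to_a    # enumeration triggers the query
puts formatted.inspect
```

Without a block, each row comes back as a hash keyed by the requested symbols.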

Regarding other stuff: I'm not so sure I'm a fan of replacing Cypher syntax with a Ruby equivalent. ALL the learning materials on the web are currently in Cypher, and as a language it is not so bad to learn or read. With a few simple methods like I've shown, it's easy to remove some of the boilerplate, leaving the user to focus on the most expressive bits. Changing these, I fear, would limit the usage of the language. It is a WIP itself and rapidly changing, and not supporting everything is as good as supporting nothing; having it forces another decision to be made and another syntax to learn for anyone starting with neo4j. But.. prove me wrong!


I've started to implement the DSL, see README
I think it will be great.

dre-hh commented Nov 29, 2013

it is grreat!


