Cypher- Supporting Union #125

lassewesth opened this Issue Nov 12, 2012 · 12 comments



lassewesth commented Nov 12, 2012

@luanne: 'There are some queries which require traversing the graph in more than one direction, with some of those directions possibly optional. Trying to combine all these in a single query results in messy handling of the optional directions, and all the results appear as columns with possibly null values in a single row. It would be great if each direction could be represented as a simple, single query, with the queries union-ed together.
Example: Find people that match the following criteria:
a) People I know or work with
b) People who know or work with my friends (may or may not have friends)
c) People who have tagged me that live in the same city as me (optional)

For the first 2:

start n=node(13)
match (person)<-[:knows|works_with]-(n)-[?:friend]-(friend)-[?:knows|works_with]-(friendProfile)

the results returned are:
PersonA, PersonB (single row)

Adding the third type of match requires manipulation of the existing match which is not always straightforward.
start n=node(13)
match (person)<-[:knows|works_with]-(n)
return as name
union
start n=node(13)
return as name
union
start n=node(13)
match (n)<-[:tagged_by]-(tagger)-[:lives_in]->(city)<-[:lives_in]-(n)
return as name

Would be brilliant if the union could also take in something to specify whether to return distinct results or not.'


lassewesth commented Nov 12, 2012

@systay: Thanks Luanne.

Any ideas around the distinct? If we allow UNION DISTINCT, what would this mean?


lassewesth commented Nov 12, 2012

@luanne: If my SQL memory is right, then the precedence of set operations is left to
right unless there is grouping by parentheses.
So in the case below:

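A UNION DISTINCT B UNION C is evaluated as (A UNION DISTINCT B) UNION C: first A UNION DISTINCT B = {1,2,3,4}, then the plain UNION appends C = {3,4,5} without removing duplicates, giving {1,2,3,4,3,4,5}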

Does that make sense?


lassewesth commented Nov 12, 2012

@systay: Makes sense. So we also need to support parentheses.

SQL union works on column order - the first result set names the columns, and the rest of the results are merged into that. Should Cypher do the same and go strictly by order, or should we ignore order and only look at column names?

And if the column names don't match, or if the sets have different numbers of columns, we should throw an exception, right?


lassewesth commented Nov 12, 2012

@luanne: I think we should go strictly by order: less cause for confusion. If the sets have different numbers of columns, or there is a datatype mismatch, then definitely throw an exception.

About the column names: personally, I would be fine if an exception was thrown when the names did not match. I would always use an AS in case they're different. ORDER BY also follows nicely then.
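
To illustrate the AS approach, here is a rough sketch in which the two parts return different properties but expose them under one shared column name. The `name` and `nickname` properties and the placement of the `union` keyword are assumptions made for illustration only, not settled syntax.

```cypher
// Hypothetical properties: person.name and tagger.nickname.
// Both parts alias their value to `name`, so the columns line up by name.
start n=node(13)
match (person)<-[:knows|works_with]-(n)
return person.name as name
union
start n=node(13)
match (n)<-[:tagged_by]-(tagger)
return tagger.nickname as name
```

With matching aliases, a check on column names would pass even though the underlying properties differ.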

However, many may not be happy with that, so maybe you can match by order and have the column names of the first result set be the column names in the final result. In that case, an ORDER BY would have to use the resulting column name from the first result set, unless you want to support order by column position too :-)


lassewesth commented Nov 12, 2012

@systay: I'd rather start by using the column names. I'd rather not make column ordering a big deal.

And, ORDER BY is a good point - should the individual queries be allowed to use order by/skip/limit, or is it just the last one, which encompasses the full result?

(Thanks for helping iron out the details - this is really important work!)


lassewesth commented Nov 12, 2012

@luanne: Column names sound fine.

For the order by/skip/limit, I suppose it could be applied individually as well as to the whole, as long as the individual queries are enclosed in parentheses:
(A order by col1) UNION (B order by col2) UNION C order by col3
Here the trailing order by col3 would just reorder the entire thing by col3.

Honestly though, if you didn't support individual ordering and just applied it to the final result to start with, it wouldn't make us love Cypher any less :-)
Pretty sure that sooner or later you'll be talking about the WHERE clause too being applied to the entire result :-)

This would be a super cool feature for easy generation of edge lists in FoF queries.

mhluongo commented Mar 1, 2013

👍 Any movement on this?


systay commented Mar 1, 2013

Hi there,

It's actually done in the 2.0 branch already. Missed updating this issue about it.
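
For readers landing here later, this is a minimal sketch of the first and third queries from the original request, joined with the union syntax in Cypher 2.0, where `UNION` drops duplicate rows and `UNION ALL` keeps them. The `name` property is an assumption (the original example does not say which value is returned); node id 13 is taken from the example above.

```cypher
// Assumption: nodes carry a `name` property.
// UNION returns each distinct name once; use UNION ALL instead to keep duplicates.
START n=node(13)
MATCH (person)<-[:knows|works_with]-(n)
RETURN person.name AS name
UNION
START n=node(13)
MATCH (n)<-[:tagged_by]-(tagger)-[:lives_in]->(city)<-[:lives_in]-(n)
RETURN tagger.name AS name
```

Both parts return a single column called `name`, which is what lets the results be combined by column name.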

@systay systay closed this Mar 1, 2013
