New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the nested, updating, and chained subqueries CIP #100

Open
wants to merge 27 commits into
base: master
from

Conversation

Projects
None yet
9 participants
@petraselmer
Contributor

petraselmer commented Jun 22, 2016

This is part of the redesign of Cypher for adding support for working with multiple graphs that targets Cypher 10.

Rendered:

https://github.com/petraselmer/openCypher/blob/CIP-nested-subqueries/cip/1.accepted/CIP2016-06-22-nested-updating-and-chained-subqueries.adoc

@aseemk

This comment has been minimized.

aseemk commented Jun 23, 2016

I'm not informed enough to offer meaningful feedback here, but as a user of Neo4j, this looks great! Thank you for putting this together. =)

@aseemk

This comment has been minimized.

aseemk commented Jun 23, 2016

The only thing that made me go "hmm, interesting" was that subqueries' variables (variable bindings) persist into the outer query. How does that work if you have multiple independent subqueries that happen to (coincidentally) share a variable name? (E.g. something generic like user or tweet).

It would feel a bit more intuitive to me if only the RETURN variables persisted. Then, you could naturally rename those with WITH before running another subquery, etc.

But again, I'm not informed enough to know if this feedback is reasonable or flawed.

@cleishm

This comment has been minimized.

Contributor

cleishm commented Jun 23, 2016

@aseemk Based on the examples, I'm guessing that it is only the RETURN values that are made available to the outer query. It's just that the language isn't very clear when it says "...Any new variable bindings produced by evaluating the subquery will augment the variable bindings of the initial record...".

@boggle

This comment has been minimized.

Contributor

boggle commented Jun 24, 2016

Even more so, the intent is to not allow any shadowing, i.e. you cannot rebind variable from the outer query in the RETURN of the nested subquery. This is in line with CALL ... YIELD... and UNWIND ... AS ... semantics which also do not allow to do that. I think it really helps reduce confusion and hopefully nudges the users towards picking more meaningful (distinguishing) names.

@zazi

This comment has been minimized.

zazi commented Jun 27, 2016

+1
(but I guess in SPARQL I can do the same (i.e. I doubt a bit the SPARQL comparison in your proposal ;) ))

@thobe

This comment has been minimized.

Contributor

thobe commented Jul 12, 2016

I like the simplicity of this proposal, but I wonder how well it scales to other things we would/might want to do with/as subqueries in the future.

I was actually hoping we would be able to replace union with subqueries, rather than just allowing post-union processing by using union in a subquery, because I find union-queries hard to read with the constituent parts not being strongly delineated.

I've also held some hope that we would be able to use sub-queries as parameters to procedures, and there might be other use cases we would want to use sub-queries for that I can't think of right now.

@petraselmer

This comment has been minimized.

Contributor

petraselmer commented Jul 20, 2016

I have added the 'NOT READY FOR REVIEW' label only because we still need to consider path bindings for subqueries. We'll add this in a fortnight or so, when @boggle and I are back from leave.

In the meantime, much content regarding the new DO syntax has been added, so comments welcome.

@petraselmer

This comment has been minimized.

Contributor

petraselmer commented Jul 20, 2016

Hi @zazi

Thanks for your comment. I was wondering if you could provide an example (or link) in which correlated subqueries are supported in SPARQL? According to the W3C link in the CIP, it appears to be the case that 'vanilla' SPARQL cannot support correlated subqueries, but it may very well be the case that a vendor-specific implementation does. If this is the case, I would love to know, and, of course, this shall be added to the CIP.

@zazi

This comment has been minimized.

zazi commented Jul 24, 2016

@petraselmer I'm not 100% an expert in this field. Nevertheless, (from my understanding) I'm supporting the answer to this question http://answers.semanticweb.com/questions/24508/does-sparql-11-support-correlated-subqueries ;)

Owing to the bottom-up nature of SPARQL query evaluation, the subqueries are evaluated logically first, and the results are projected up to the outer query.
Only variables projected out of the subquery will be visible, or in scope, to the outer query.

This comment has been minimized.

@zazi

zazi Jul 24, 2016

I would tend to say that this is not a disadvantage, but rather then a (good) feature.

@zazi

This comment has been minimized.

zazi commented Jul 24, 2016

For readability issues, I would recommend to add all queries in plain, natural language sentences as well.

MERGE (c:Child {id: x})
MERGE (r)-[:PARENT]->(c)
}
----

This comment has been minimized.

@cleishm

cleishm Jul 25, 2016

Contributor

Might be nice to explain the difference between the preceding example and what it would be without the DO and just having the MERGE calls following UNWIND. I assume it's simply keeping the same cardinality (i.e. the DO itself has a cardinality of 1) and not introducing any new identifiers, but it might be helpful to be explicit about that and why it's helpful.

@zazi

This comment has been minimized.

zazi commented Jul 26, 2016

Furthermore, some example data that illustrate the queries might be helpful (i.e. a small data set and the result sets of the different queries).

@Mats-SX

This comment has been minimized.

Member

Mats-SX commented Jul 28, 2016

@zazi

Furthermore, some example data that illustrate the queries might be helpful (i.e. a small data set and the result sets of the different queries).

This would be solved by the addition of TCK scenarios for this feature (which is planned). Do you agree?

@boggle

This comment has been minimized.

Contributor

boggle commented Sep 26, 2016

@thobe passing subqueries as arguments to procedures is out of scope for this CIP.

@thobe How do you envision replacing UNION with subqueries? I'm not seeing that. Besides having UNION is very common for people with a SQL background and opens up a natural trajectory for adding other set operations (INTERSECT SET DIFFERENCE)

@Tin-Nguyen

This comment has been minimized.

Tin-Nguyen commented Sep 30, 2016

+1, I'm waiting for this

@ric81

This comment has been minimized.

ric81 commented Dec 13, 2016

In which Neo4j release is/will this change be included ?

@Mats-SX

This comment has been minimized.

Member

Mats-SX commented Dec 14, 2016

@ric81 That is currently not known. Do note that this repository manages the specification and standardisation of Cypher as a language, which is separate from Neo4j the graph database (although this decoupling is fairly new and not yet fully mature).

The inner mandatory query is any inner match query.
Moreover, any inner match query from which the trailing final `RETURN` clause has been omitted may also be used as an inner mandatory query.

This comment has been minimized.

@thobe

thobe Mar 30, 2017

Contributor

I think this needs examples to demonstrate the use of these. As well as the difference between having a RETURN clause and leaving it out.

In my understanding MANDATORY without RETURN would be like EXISTS, but as a clause with the capability of failing the query rather than a boolean expression. Interestingly that use of MANDATORY would not affect the cardinality of the query, whereas MANDATORY with RETURN could potentially increase the cardinality of the query (although never decrease it, since that would mean that no matches were found).

This comment has been minimized.

@boggle

boggle Apr 13, 2017

Contributor

We're actually thinking about making RETURN mandatory on MANDATORY match based on the reason that this will be the 95% use case and having to name in the remaining 5% isn't a big deal.

This comment has been minimized.

@boggle

boggle Apr 13, 2017

Contributor

The benefit would be less special casing. Then again, maybe someone might find it very natural to leave off RETURN.

@Mats-SX

This comment has been minimized.

Member

Mats-SX commented Apr 24, 2017

This CIP doesn't contain any formal syntax description in EBNF format.

@boggle boggle changed the title from Added the nested subqueries CIP to Added the Nested, updating, and chained subqueries CIP Oct 16, 2017

@boggle boggle changed the title from Added the Nested, updating, and chained subqueries CIP to Added the nested, updating, and chained subqueries CIP Oct 17, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment