Querying and constructing multiple graphs #241

Open
wants to merge 83 commits into
base: master
from

4 participants
@boggle
Contributor

boggle commented Jul 2, 2017

This is a proposal for making Cypher work with multiple graphs.

It is part of the redesign of Cypher, targeting Cypher 10, that adds support for working with multiple graphs.

View latest version of CIP from associated branch

First early draft for Cypher support for working with multiple graphs
This covers a lot of ground:

* Data model
* Language execution model
* Working with named graphs
* Declarative Graph Construction
* Graph composition
* New Patterns: Optional Copy Patterns
* New Patterns: Merge Patterns
* Create, update, modify persistent graphs
@systay

Collaborator

systay commented on cip/CIP2017-06-18-multiple-graphs.adoc in 56f2d90 Jun 27, 2017

I don't think it's obvious what pipelining means in this context.

@boggle boggle changed the title from CIP2017-06-18 Multiple Graphs to CIP2017-06-18: Multiple Graphs Jul 2, 2017

=== (Property) Graph
_Definition_ A *property graph* is a set of labeled nodes and typed relationships both together with their properties (a property is a tuple of a named key and a value).
Graphs may be updatable, i.e. the set of contained nodes and relationships may change during the lifetime of the graph.

@Mats-SX

Mats-SX Jul 3, 2017

Member

This section should probably link to the PGM spec in our repo.

It is an error to attempt to update a read-only graph.
The same node or relationship may be part of many graphs.
A relationship may only be part of a graph if it's start node and it's end node are both also part of the same graph.

@Mats-SX

Mats-SX Jul 3, 2017

Member

it's -> its

or rephrased:

if its source and target nodes are both also ...

@boggle

boggle Jul 3, 2017

Contributor

That always trips me up :)

@Mats-SX

Mats-SX Jul 3, 2017

Member

Yeah, used to trip me up too, but then I learned that it's == it is, so in case you're unsure, just spell it out and it'll become apparent :)

The same node or relationship may be part of many graphs.
A relationship may only be part of a graph if it's start node and it's end node are both also part of the same graph.
Therefore removing a node from a graph may require removing some of it's relationships from the graph, too.
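The membership rule quoted above can be illustrated with a minimal Python sketch. It is not part of the CIP: the `Graph` and `Relationship` classes and their method names are hypothetical, and nodes are represented only by integer ids. The sketch assumes the stricter reading, where removing a node removes every relationship attached to it in that graph.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    rel_id: int
    start: int  # id of the start node
    end: int    # id of the end node


@dataclass
class Graph:
    # Hypothetical in-memory graph: a set of node ids and a set of relationships.
    nodes: set = field(default_factory=set)
    rels: set = field(default_factory=set)

    def add_relationship(self, rel: Relationship) -> None:
        # A relationship may only join the graph if both endpoints are members.
        if rel.start not in self.nodes or rel.end not in self.nodes:
            raise ValueError("both endpoints must be part of the graph")
        self.rels.add(rel)

    def remove_node(self, node_id: int) -> None:
        # Drop every relationship touching the node first, so the
        # membership rule still holds once the node is gone.
        self.rels = {r for r in self.rels if node_id not in (r.start, r.end)}
        self.nodes.discard(node_id)


g = Graph(nodes={1, 2, 3})
g.add_relationship(Relationship(10, 1, 2))
g.add_relationship(Relationship(11, 2, 3))
g.add_relationship(Relationship(12, 1, 3))
g.remove_node(2)  # relationships 10 and 11 leave the graph with node 2
```

Because the same node may be part of many graphs, removal here only affects membership in `g`; other `Graph` instances holding the same ids are untouched.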

@Mats-SX

Mats-SX Jul 3, 2017

Member

it's -> its

It not only may, it will require removing all of them. Or rephrased:

Thus, removing a node from a graph will require removing all of its relationships from that graph, too.

Graphs do not expose an identity like nodes or relationships do.
Graphs may be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _graph URL_ for referencing and loading it).

@Mats-SX

Mats-SX Jul 3, 2017

Member

I suggest unwrapping the example from parentheses.

With this terminology in place, execution of a parameterized Cypher query in the single graph execution model can be described as executing within (and operating on) a given execution context and an initial query context and finally returning the query context produced as output for the top-most `RETURN` clause.
Note: This formulation is introduced to describe a high-level model for the execution of queries; A real world implementation is free to choose any other internal representation (e.g. based on an algebra) as long as it does not violate the specified semantics.

@Mats-SX

Mats-SX Jul 3, 2017

Member

A -> a (not capitalised)

* `<graph-specifier-list>`: A comma separated list of `<graph-specifier>` that are to be passed on
* `*`: All named graphs are to be passed on
* `*, <graph-specifier-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-specifier-list>`
* `-`: No named graphs are to be passed on
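As a rough illustration of the four forms listed above, the following Python sketch models the set of named graphs as a dictionary from name to graph. It is not taken from the CIP: the function `pass_on_graphs` is hypothetical, and it assumes that a `<graph-specifier>` is just a graph name, which is simpler than the specifiers the proposal allows.

```python
def pass_on_graphs(available, spec, newly_bound=None):
    """Return the named graphs passed on for a GRAPHS specification.

    available   -- dict of named graphs currently in scope
    spec        -- the text after GRAPHS, e.g. "a, b", "*", "*, c" or "-"
    newly_bound -- dict of graphs newly bound by the specifier list (assumed)
    """
    newly_bound = newly_bound or {}
    in_scope = {**available, **newly_bound}
    items = [s.strip() for s in spec.split(",") if s.strip()]

    if items == ["-"]:
        # `-`: no named graphs are passed on.
        return {}
    if items and items[0] == "*":
        # `*` or `*, <list>`: all named graphs, plus any listed additions.
        result = dict(available)
        names = items[1:]
    else:
        # `<list>`: only the listed graphs.
        result = {}
        names = items
    for name in names:
        result[name] = in_scope[name]
    return result


available = {"a": "graph A", "b": "graph B"}
only_a = pass_on_graphs(available, "a")
everything = pass_on_graphs(available, "*")
with_new = pass_on_graphs(available, "*, c", {"c": "graph C"})
nothing = pass_on_graphs(available, "-")
```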

@Mats-SX

Mats-SX Jul 3, 2017

Member

I'm interpreting that GRAPHS is optional (which I support). What is the point of GRAPHS - if we can just leave it out?

This in essence mirrors the semantics for tabular data returned by Cypher.
Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.

@Mats-SX

Mats-SX Jul 3, 2017

Member

What is the point of having a long-form GRAPHS - as the normal form, and calling its omission syntactic sugar? Why not say that leaving it out is the normal form, and that the other forms modify that?

@boggle

boggle Jul 3, 2017

Contributor

I think the procedures CIP (?) added - for procedures not returning any columns. Similarly, researchers have suggested that it is an omission on the part of SQL to be unable to return no columns (more so for Cypher, where the single-row, field-less table plays a special role in starting off queries). In light of this, I added it for no reason other than consistency with these other decisions.

@Mats-SX

Mats-SX Jul 3, 2017

Member

But for procedures there was actually a need for YIELD -, in order to avoid implicit conflicts with variables that were in scope, as I recall. My personal preference would be to use the empty string to denote the intention in this proposal.

Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.
To even further simplify, it is additionally proposed that `WITH|RETURN <return-items> INPUT GRAPHS <graph-return-items>` is to be syntactic sugar for `WITH|RETURN <return-items> GRAPHS <graph-return-items>, SOURCE GRAPH, TARGET GRAPH`.
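The two rewrites proposed in the quoted text can be sketched as a naive text transformation in Python. This is only an illustration under stated assumptions: the function `desugar` is hypothetical, it operates on the clause text rather than a parsed query, and it would be misled by the word `GRAPHS` appearing inside a string literal or identifier.

```python
import re


def desugar(clause):
    """Expand the proposed sugar for a single WITH or RETURN clause."""
    m = re.fullmatch(r"(WITH|RETURN)\s+(.*)", clause.strip(), flags=re.DOTALL)
    if m is None:
        raise ValueError("expected a WITH or RETURN clause")
    keyword, rest = m.group(1), m.group(2)

    # `... INPUT GRAPHS <items>` becomes
    # `... GRAPHS <items>, SOURCE GRAPH, TARGET GRAPH`.
    sugar = re.fullmatch(r"(.*?)\s+INPUT GRAPHS\s+(.*)", rest, flags=re.DOTALL)
    if sugar:
        return (f"{keyword} {sugar.group(1)} GRAPHS {sugar.group(2)}, "
                "SOURCE GRAPH, TARGET GRAPH")

    # An explicit GRAPHS part is already in long form.
    if re.search(r"\bGRAPHS\b", rest):
        return f"{keyword} {rest}"

    # A regular clause passes on no named graphs.
    return f"{keyword} {rest} GRAPHS -"


plain = desugar("WITH a, b")
inputs = desugar("RETURN n INPUT GRAPHS g")
explicit = desugar("RETURN n GRAPHS *")
```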

@Mats-SX

Mats-SX Jul 3, 2017

Member

I'm not convinced of the usefulness of this syntactic sugar -- I find that it is hard to know what kind of queries will be prominent in this new model. In general, I think that it would be useful to have a little less focus on the syntactic sugar bits, and more on the core model. Syntactic sugar additions could always follow later.

@boggle

boggle Jul 3, 2017

Contributor

Let's revisit how default graphs are handled as a group first - this may very well remove the need for this. In short I added this as a simple way for a query to say: "I'm ok to run on any incoming graphs and am happy to pass those on, just give 'em some names for me". Without this sugar, expressing this becomes rather verbose.

@Mats-SX

Mats-SX Jul 3, 2017

Member

I'm not against the sugar per se, I just find it difficult to assess whether a particular piece of sugar is valuable this early in the process of defining these very new concepts, and so I'm leaning towards skepticism in general. I find it is peripheral to the contents of the CIP anyway.

=== Discarding available tabular data
It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are syntactic sugar for `WITH - GRAPHS <graph-return-items>` (and `RETURN - GRAPHS <graph-return-items>` respectively).

@Mats-SX

Mats-SX Jul 3, 2017

Member

I feel the same about this as about GRAPHS -: I prefer the absence of - to its presence in this context.

However, the change has been carefully designed to not change the semantics of existing queries.
== Alternatives

@Mats-SX

Mats-SX Jul 3, 2017

Member

I think this and subsequent sections are superfluous since the introduction of CIRs. We should modify our template.

@Mats-SX

Member

Mats-SX commented Jul 3, 2017

Great work putting these concepts into spec!

@boggle boggle changed the title from CIP2017-06-18: Multiple Graphs to Querying and constructing multiple graphs May 8, 2018
