Querying and constructing multiple graphs #241

Open
wants to merge 83 commits into
base: master
from

@boggle
Contributor

@boggle boggle commented Jul 2, 2017

This is a proposal for making Cypher work with multiple graphs.

It is part of the redesign of Cypher to support working with multiple graphs, which targets Cypher 10.

View latest version of CIP from associated branch

This covers a lot of ground:

* Data model
* Language execution model
* Working with named graphs
* Declarative Graph Construction
* Graph composition
* New Patterns: Optional Copy Patterns
* New Patterns: Merge Patterns
* Create, update, modify persistent graphs

@systay
Collaborator

@systay systay commented on cip/CIP2017-06-18-multiple-graphs.adoc in 56f2d90 Jun 27, 2017

I don't think it's obvious what pipelining means in this context.

@boggle boggle changed the title CIP2017-06-18 Multiple Graphs CIP2017-06-18: Multiple Graphs Jul 2, 2017
@boggle boggle force-pushed the boggle:CIP2017-06-18-multiple-graphs branch 3 times, most recently from 2498907 to 7332c02 Jul 2, 2017
@boggle boggle force-pushed the boggle:CIP2017-06-18-multiple-graphs branch from 7332c02 to 8459014 Jul 3, 2017
@boggle boggle force-pushed the boggle:CIP2017-06-18-multiple-graphs branch from 8459014 to 4714ca6 Jul 3, 2017
=== (Property) Graph

_Definition_ A *property graph* is a set of labeled nodes and typed relationships both together with their properties (a property is a tuple of a named key and a value).
Graphs may be updatable, i.e. the set of contained nodes and relationships may change during the lifetime of the graph.

@Mats-SX

Mats-SX Jul 3, 2017
Member

This section should probably link to the PGM spec in our repo.

It is an error to attempt to update a read-only graph.

The same node or relationship may be part of many graphs.
A relationship may only be part of a graph if it's start node and it's end node are both also part of the same graph.

@Mats-SX

Mats-SX Jul 3, 2017
Member

it's -> its

or rephrased:

if its source and target nodes are both also ...

@boggle

boggle Jul 3, 2017
Author Contributor

That always trips me up :)

@Mats-SX

Mats-SX Jul 3, 2017
Member

Yeah, used to trip me up too, but then I learned that it's == it is, so in case you're unsure, just spell it out and it'll become apparent :)


The same node or relationship may be part of many graphs.
A relationship may only be part of a graph if it's start node and it's end node are both also part of the same graph.
Therefore removing a node from a graph may require removing some of it's relationships from the graph, too.

@Mats-SX

Mats-SX Jul 3, 2017
Member

it's -> its

It not only may, it will require removing all of them. Or rephrased:

Thus, removing a node from a graph will require removing all of its relationships from that graph, too.
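
As a rough illustration of the invariant discussed in this thread (not part of the CIP), the sketch below models a graph as a set of node ids plus a map of relationships. All names here (`Graph`, `Relationship`, `remove_node`, the `read_only` flag) are hypothetical and chosen only for this example; properties and labels are left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    rel_id: int
    rel_type: str
    start: int  # id of the start node
    end: int    # id of the end node


@dataclass
class Graph:
    nodes: set = field(default_factory=set)            # node ids
    relationships: dict = field(default_factory=dict)  # rel_id -> Relationship
    read_only: bool = False

    def _check_updatable(self):
        # "It is an error to attempt to update a read-only graph."
        if self.read_only:
            raise PermissionError("cannot update a read-only graph")

    def add_node(self, node_id):
        self._check_updatable()
        self.nodes.add(node_id)

    def add_relationship(self, rel):
        self._check_updatable()
        # A relationship may only be part of a graph if both of its
        # endpoints are part of the same graph.
        if rel.start not in self.nodes or rel.end not in self.nodes:
            raise ValueError("both endpoints must be part of the graph")
        self.relationships[rel.rel_id] = rel

    def remove_node(self, node_id):
        self._check_updatable()
        self.nodes.discard(node_id)
        # Removing a node drops every relationship attached to it,
        # but only from this graph.
        self.relationships = {
            rel_id: rel
            for rel_id, rel in self.relationships.items()
            if node_id not in (rel.start, rel.end)
        }


# The same nodes and relationship are part of two graphs.
g1, g2 = Graph(), Graph()
knows = Relationship(rel_id=10, rel_type="KNOWS", start=1, end=2)
for g in (g1, g2):
    g.add_node(1)
    g.add_node(2)
    g.add_relationship(knows)

g1.remove_node(2)  # affects g1 only; g2 still contains node 2 and `knows`
```

In this sketch the removal is local to one graph, which matches the statement that the same node or relationship may be part of many graphs.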


Graphs do not expose an identity like nodes or relationships do.

Graphs may be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _graph URL_ for referencing and loading it).

@Mats-SX

Mats-SX Jul 3, 2017
Member

I suggest unwrapping the example from parentheses.


With this terminology in place, execution of a parameterized Cypher query in the single graph execution model can be described as executing within (and operating on) a given execution context and an initial query context and finally returning the query context produced as output for the top-most `RETURN` clause.

Note: This formulation is introduced to describe a high-level model for the execution of queries; A real world implementation is free to choose any other internal representation (e.g. based on an algebra) as long as it does not violate the specified semantics.

@Mats-SX

Mats-SX Jul 3, 2017
Member

A -> a (not capitalised)

* `<graph-specifier-list>`: A comma separated list of `<graph-specifier>` that are to be passed on
* `*`: All named graphs are to be passed on
* `*, <graph-specifier-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-specifier-list>`
* `-`: No named graphs are to be passed on

@Mats-SX

Mats-SX Jul 3, 2017
Member

My reading is that GRAPHS is optional (which I support). What is the point of `GRAPHS -` if we can just leave it out?

This in essence mirrors the semantics for tabular data returned by Cypher.

Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.

@Mats-SX

Mats-SX Jul 3, 2017
Member

What is the point of having the long form `GRAPHS -` as the normal form, and calling its omission syntactic sugar? Why not say that leaving it out is the normal form, and that the other forms modify that?

@boggle

boggle Jul 3, 2017
Author Contributor

The procedures CIP (?), I think, added `-` for procedures not returning any columns. Similarly, researchers have suggested that it is an omission on the part of SQL not to be able to return no columns (more so for Cypher, where the single-row, fieldless table plays a special role in starting off queries). In light of this, I added it for no reason other than consistency with those decisions.

@Mats-SX

Mats-SX Jul 3, 2017
Member

But for procedures there was actually a need for `YIELD -`, in order not to cause implicit conflicts with variables that were in scope, as I recall. My personal preference would be to use the empty string to denote the intention in this proposal.

Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.

To even further simplify, it is additionally proposed that `WITH|RETURN <return-items> INPUT GRAPHS <graph-return-items>` is to be syntactic sugar for `WITH|RETURN <return-items> GRAPHS <graph-return-items>, SOURCE GRAPH, TARGET GRAPH`.

@Mats-SX

Mats-SX Jul 3, 2017
Member

I'm not convinced of the usefulness of this syntactic sugar -- I find that it is hard to know what kind of queries will be prominent in this new model. In general, I think that it would be useful to have a little less focus on the syntactic sugar bits, and more on the core model. Syntactic sugar additions could always follow later.

@boggle

boggle Jul 3, 2017
Author Contributor

Let's revisit how default graphs are handled as a group first - this may very well remove the need for this. In short I added this as a simple way for a query to say: "I'm ok to run on any incoming graphs and am happy to pass those on, just give 'em some names for me". Without this sugar, expressing this becomes rather verbose.

@Mats-SX

Mats-SX Jul 3, 2017
Member

I'm not against the sugar per se, I just find it difficult to assess whether a particular piece of sugar is valuable this early in the process of defining these very new concepts, and so I'm leaning towards skepticism in general. I find it is peripheral to the contents of the CIP anyway.


=== Discarding available tabular data

It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are syntactic sugar for `WITH - GRAPHS <graph-return-items>` (and `RETURN - GRAPHS <graph-return-items>` respectively).
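
To make the sugar rules quoted in this thread easier to compare, here is a minimal sketch that expands each short form into the long form `<clause> <return-items> GRAPHS <graph-return-items>`. The function `desugar` and its parameters are hypothetical and only restate the three rewrites as proposed in the quoted text; it works on plain strings and does not parse Cypher.

```python
def desugar(clause, items=None, graphs=None, input_graphs=None):
    """Expand the proposed short forms of WITH/RETURN into the long form.

    clause        -- "WITH" or "RETURN"
    items         -- text of <return-items>, or None if omitted
    graphs        -- text after GRAPHS, or None if omitted
    input_graphs  -- text after INPUT GRAPHS, or None if omitted
    """
    if clause not in ("WITH", "RETURN"):
        raise ValueError("clause must be WITH or RETURN")
    if graphs is not None and input_graphs is not None:
        raise ValueError("use either GRAPHS or INPUT GRAPHS, not both")
    if items is None and graphs is None and input_graphs is None:
        # The quoted text does not describe a clause with nothing at all.
        raise ValueError("nothing to pass on or return")

    if input_graphs is not None:
        # INPUT GRAPHS <g>  ==>  GRAPHS <g>, SOURCE GRAPH, TARGET GRAPH
        graphs = f"{input_graphs}, SOURCE GRAPH, TARGET GRAPH"

    # A missing part is written as "-" in the long form.
    items = "-" if items is None else items
    graphs = "-" if graphs is None else graphs
    return f"{clause} {items} GRAPHS {graphs}"


print(desugar("WITH", items="a, b"))
print(desugar("RETURN", graphs="foo"))
print(desugar("RETURN", items="a", input_graphs="foo"))
```

Seen this way, the review comments about `-` amount to asking which side of each rewrite should be treated as the normal form.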

@Mats-SX

Mats-SX Jul 3, 2017
Member

I feel the same about this as about `GRAPHS -`: I prefer the absence of `-` to its presence in this context.


However, the change has been carefully designed to not change the semantics of existing queries.

== Alternatives

@Mats-SX

Mats-SX Jul 3, 2017
Member

I think this and subsequent sections are superfluous since the introduction of CIRs. We should modify our template.

@Mats-SX
Member

@Mats-SX Mats-SX commented Jul 3, 2017

Great work putting these concepts into spec!

boggle and others added 8 commits Aug 3, 2017
- Homogenized graph specifier syntax
- Added DEFAULT GRAPH
- WITH, RETURN can also return comma separated list of graphs without
  leading `GRAPHS` if bound graphs are prefixed with `GRAPH`,
  i.e. RETURN a, b, c COPY OF GRAPH foo is possible
- COPY .. TO ..
- Allow FROM <name> AS <new-name> (wo leading GRAPH)
- Allow INTO <name> AS <new-name> (wo leading GRAPH)
@petraselmer petraselmer force-pushed the boggle:CIP2017-06-18-multiple-graphs branch 2 times, most recently from 54286fa to b402f1d Aug 4, 2017
- The jpg files ought to be moved elsewhere at a later stage
@petraselmer petraselmer force-pushed the boggle:CIP2017-06-18-multiple-graphs branch from b402f1d to f8fcc8e Aug 4, 2017
@boggle boggle force-pushed the boggle:CIP2017-06-18-multiple-graphs branch from 7f258be to c5b8e42 May 8, 2018
@boggle boggle changed the title CIP2017-06-18: Multiple Graphs Querying and constructing multiple graphs May 8, 2018
@linsimiao

@linsimiao linsimiao commented Dec 26, 2019

Hi all, I have read your documentation and found it easy to confuse CONSTRUCT with UPDATE. I wonder whether the following two Cypher queries mean the same thing.

FROM xxx match (a:Person) UPDATE GRAPH merge (b:Student{name:a.name})

and

FROM xxx match (a:Person) CONSTRUCT merge (b:Student{name:a.name})

Suppose the working graph is yyy: will both queries create nodes with the label Student in graph yyy?

Thanks for your reply.

5 participants