Creating and managing graphs and views #314

284 changes: 284 additions & 0 deletions cip/1.accepted/CIP2018-05-03-catalog-administration.adoc
= CIP2018-05-03 Creating and managing graphs and views
:numbered:
:toc:
:toc-placement: macro
:source-highlighter: codemirror

*Author:* Stefan Plantikow <stefan.plantikow@neo4j.com>, Andres Taylor <andres.taylor@neo4j.com>, Petra Selmer <petra.selmer@neo4j.com>

This material is based on internal contributions from Alastair Green <alastair.green@neo4j.com>, Mats Rydberg <mats.rydberg@neo4j.com>, Martin Junghanns <martin.junghanns@neo4j.com>, Tobias Lindaaker <tobias.lindaaker@neo4j.com>

[abstract]
.Abstract
--
This CIP extends Cypher with support for creating and administering graphs and views in the catalog.

This CIP has been created in tandem with `CIP2017-06-18` for adding support for working with multiple graphs to Cypher and `CIP2016-06-22` for nested subqueries and relies on the material introduced in these proposals.
Therefore this CIP is based on the assumption that `CIP2017-06-18` and `CIP2016-06-22` will be accepted.
--

toc::[]



== Introduction

`CIP2017-06-18` introduces the notion of a catalog and adds support to Cypher for working with multiple graphs that may be either obtained from the catalog by name or constructed dynamically by a query.

This proposal adds three capabilities: managing graphs in the catalog, managing views in the catalog and using them in a query, and query-local declarations.


=== Related work

This CIP has been developed in tandem with the following CIPs; as such, it is recommended to read all four CIPs in conjunction with each other.

* `CIP2016-06-22`: Nested subqueries
* `CIP2018-05-04`: Equivalence operators, copy pattern, and related auxiliary functions
* `CIP2017-06-18`: Multiple graphs


== Managing graphs in the catalog


=== Creating graphs

Creating new graphs in the catalog is done via the new schema command `CREATE GRAPH`.

There are four forms of `CREATE GRAPH`:

1. `CREATE GRAPH <graph name>` creates an empty graph in the catalog with the name `<graph name>`.

2. `CREATE GRAPH <graph name> { <nested subquery> }` creates a new graph in the catalog that is a copy of the graph that was returned by `<nested subquery>`.
An error is raised if `<nested subquery>` does not return a graph.

3. `CREATE GRAPH myProc(...) AS <graph name>` creates a new graph in the catalog that is a copy of the graph that was returned by the call to `myProc`.
An error is raised if `myProc` does not return a graph.

4. Multiple graphs may be created with one statement by separating multiple `GRAPH` subclauses of the forms 1, 2, or 3 with `,` (e.g. `CREATE GRAPH foo, GRAPH bar { ... }`).

In all forms, an error is raised if a graph is created with the name `<graph name>` and a graph with that name already exists in the catalog.
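
The forms above can be sketched as follows. This is only an illustration of the proposed syntax: the graph names `people`, `friends` and `imported` are hypothetical, and the subquery body and procedure arguments are left as the placeholders used in this CIP.

[source, cypher]
----
// Form 1: create an empty graph named people
CREATE GRAPH people

// Form 2: create friends as a copy of the graph returned by the nested subquery
CREATE GRAPH friends { <nested subquery> }

// Form 3: create imported as a copy of the graph returned by the procedure call
CREATE GRAPH myProc(...) AS imported

// Form 4: several GRAPH subclauses in one statement
CREATE GRAPH foo, GRAPH bar { ... }

// Raises an error: people already exists in the catalog
CREATE GRAPH people
----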


=== Delete graph

The catalog command `DELETE GRAPH <graph name>` deletes the graph with the name `<graph name>` from the catalog.
If `<graph name>` is not the name of a graph that already exists in the catalog, an error is raised.


=== Copy graph

The catalog command `COPY <old name> TO <new name>` copies the content and the schema of the graph with the name `<old name>` in the catalog to a new graph with the name `<new name>` in the catalog.
If `<old name>` is not the name of a graph that already exists in the catalog, an error is raised.
If `<new name>` is the name of a graph that already exists in the catalog, an error is raised.


=== Rename graph

The catalog command `RENAME <old name> TO <new name>` removes the graph with the name `<old name>` from the catalog and adds it as a new graph with the name `<new name>` in the catalog.
If `<old name>` is not the name of a graph that already exists in the catalog, an error is raised.
If `<new name>` is the name of a graph that already exists in the catalog, an error is raised.


=== Truncating graphs

The catalog command `TRUNCATE <graph name>` truncates the graph with the name `<graph name>` in the catalog.

Truncating a graph deletes all its nodes and relationships but retains any additional schema information like constraints.
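
A possible sequence of the catalog commands described above is sketched below. The graph names are hypothetical, and each command is assumed to be issued as a separate statement against a catalog that initially contains only `people`.

[source, cypher]
----
// Copy content and schema of people into a new graph
COPY people TO peopleBackup

// Remove peopleBackup from the catalog and re-add it under a new name
RENAME peopleBackup TO peopleArchive

// Delete all nodes and relationships of people, keeping constraints
TRUNCATE people

// Remove peopleArchive from the catalog
DELETE GRAPH peopleArchive

// Raises an error: peopleArchive no longer exists in the catalog
DELETE GRAPH peopleArchive
----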


=== Determining the name of a graph

The `catalog()` function returns the catalog name for the working graph or `NULL` if the working graph is a dynamically constructed graph.

The `catalog(g)` function returns the catalog name for the graph identity `g` or `NULL` if `g` is a dynamically constructed graph.

Note:: `toString(graph())` may be used to generate a human-readable name for the working graph.
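
As a minimal sketch, the two functions might be used as follows. The column aliases are hypothetical, and the `graph()` function is assumed to be available as implied by the note above.

[source, cypher]
----
// Catalog name of the working graph, or NULL for a dynamically constructed graph
RETURN catalog() AS catalogName

// Catalog name for an explicit graph identity (here: the working graph)
RETURN catalog(graph()) AS sameName

// Human-readable name, also usable for dynamically constructed graphs
RETURN toString(graph()) AS displayName
----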


== Views

A view is a query that is stored in the catalog (in the same way as graphs) but is re-evaluated whenever it is referenced.
Each such re-evaluation is called an activation.


=== Creating views in the catalog

Views are created in the catalog using the syntax `CREATE QUERY <view name> { <nested subquery> }`.

Multiple views and graphs may be created by one `CREATE GRAPH/QUERY` schema command.


=== Managing views in the catalog

Views in the catalog can be managed with `COPY`, `RENAME`, and `DELETE QUERY` in the same way as graphs.

Reviewer comment:: Why do `COPY` and `RENAME` not require an additional keyword naming what is copied or renamed, while `DELETE` does?


An error is raised when attempting to delete a graph using `DELETE QUERY` or a view using `DELETE GRAPH`.
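
The view commands can be sketched in the same way as the graph commands. The view names below are hypothetical, and the view body is left as the placeholder used in this CIP.

[source, cypher]
----
// Store a view in the catalog; its query is re-evaluated on each activation
CREATE QUERY recentFriends { <nested subquery> }

// COPY and RENAME work on views as they do on graphs
COPY recentFriends TO recentFriendsDraft
RENAME recentFriendsDraft TO recentFriendsV2

// Raises an error: recentFriendsV2 is a view, not a graph
DELETE GRAPH recentFriendsV2

// Removes the view from the catalog
DELETE QUERY recentFriendsV2
----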


== View activation

_Definition_: A view is activated whenever it is referenced from within a reading or updating statement.

View activation executes the query that was associated with the view and returns the graph as the query result for actual use.

The following forms of view activation currently exist in Cypher:

1. `FROM <view name>`
2. `UPDATE <view name>`
3. `RETURN CALL <view name>`
4. `RETURN GRAPH <view name>`
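
As an illustration of the first and last forms, a view might be activated as shown below. The view name `recentFriends` is hypothetical, and the clauses following `FROM` are assumed to behave as described in `CIP2017-06-18`.

[source, cypher]
----
// Activating the view via FROM: its query is executed and the resulting
// graph is used for the following pattern match
FROM recentFriends
MATCH (a)-[:KNOWS]->(b)
RETURN a, b

// Activating the view and returning the resulting graph
RETURN GRAPH recentFriends
----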


== Local declarations

Reviewer comment:: This whole section is not so clear to me.


Both graphs and views may be declared at the start of a composite statement.

The syntax for local graph declarations is

[source, cypher]
----
GRAPH < local graph name > { <nested subquery> }
GRAPH < graph or view name > AS < local graph name >
GRAPH myProc(...) AS < local graph name >
----

The syntax for local view declarations is

[source, cypher]
----
QUERY < local name > { < composite statement > }
----

A `<local name>` is an identifier that starts with `_`.

Semantics:

1. `<composite statement>` must not be a correlated subquery.

2. An error is raised if a local declaration would shadow an already existing local declaration.

Note:: Restriction 1 is likely going to be lifted in the future.
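
A composite statement with local declarations might look like the following sketch. All names are hypothetical, the declaration bodies are left as placeholders, and the use of `FROM` with a local graph name is an assumption based on `CIP2017-06-18`.

[source, cypher]
----
// Local graph built from a nested subquery
GRAPH _friends { <nested subquery> }

// Local alias for a graph or view from the catalog
GRAPH socialNetwork AS _sn

// Local view; its body must not be a correlated subquery
QUERY _recent { <composite statement> }

// A second declaration of _friends here would shadow the first one
// and would therefore raise an error

// Remainder of the composite statement, using a local declaration
FROM _friends
MATCH (a)-[:KNOWS]->(b)
RETURN a, b
----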


== Parameterized views

Both views stored in the catalog and locally declared views may be parameterized with view arguments:

[source, cypher]
----
QUERY _myView(<view argument>, ...) {
<composite statement>
}
----

Activation of a parameterized view requires providing view arguments to the activation.

1. View arguments use the same namespace as parameters.

2. View arguments may be evaluated from any valid constant expression, i.e. an expression that only references literals or parameters in scope.
However, grouped nested subqueries may be used to make additional parameters available inside a subquery.

3. An error is raised if a local view declares a view argument that is already bound (either passed as a parameter or via a grouped nested subquery).

It is recommended that a warning is raised if a catalog view references a parameter that is not an explicitly bound view argument.
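
The declaration and activation of a parameterized view can be sketched as follows. The view name `_olderThan` and the argument `$minAge` are hypothetical, and the activation form `FROM <view name>(<view arguments>)` is assumed from the activation forms and grammar given in this CIP.

[source, cypher]
----
// Local view with one view argument; $minAge shares the parameter namespace
QUERY _olderThan($minAge) {
<composite statement>
}

// Activation with a constant expression as the view argument
FROM _olderThan(30)
MATCH (p)
RETURN p

// Activation with an expression over a parameter that is in scope
FROM _olderThan($threshold + 1)
MATCH (p)
RETURN p
----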

Furthermore, views may express expectations on the passed bindings:

[source, cypher]
----
QUERY _myView(args) {
WITH a, b
// Reviewer comment: What is the difference between binding expectations and parameters?
<composite statement>
}
----

This alternative form of argument passing is needed for grouped nested subqueries in order to distinguish between arguments that are evaluated over parameters and the grouping key and variable bindings available in all records for the same grouping key.

Activation of a view with binding expectations may provide those bindings via renaming.

[source, cypher]
----
QUERY _myView($constant) {
WITH a, b
<composite statement>
}
MATCH (x)-[r:KNOWS]->(y)
CALL PER since _myView(5 WITH x AS a, y AS b) YIELD ...
// Reviewer comment: Very weird syntax IMHO.
...
----

If binding expectations are just passed through by the inner query, they are _not_ added as additional bindings in the result records.


== Grammar

[source, ebnf]
----
<catalog command> ::= CREATE < catalog item list >
| COPY <catalog name> TO <catalog name>
| RENAME <catalog name > TO <catalog name>
| TRUNCATE <catalog name>
| DELETE GRAPH <catalog name>
;

<catalog item list> ::= < local declaration > [ { `,` < local declaration > } ... ] ;

<local declaration> ::= QUERY < local name > { < composite statement > }
| GRAPH < local name > [ { < composite statement > } ]
| GRAPH < invocation > AS < local name >
;

<activation> ::= < view name > [ `(` < view arguments > `)` ] ;

<view arguments> ::= < expr > [ { `,` < expr > } ... ] [ < table arguments > ] ;

<table arguments> ::= WITH < item > [ { `,` < item > } ... ] ;

<item> ::= < expr > AS < identifier > ;

<view name> ::= < catalog name > | < local name > ;

-- no leading _ allowed
<catalog name> ::= identifier [ { `.` identifier } ... ] ;

<local name> ::= `_` identifier ;
----

== Considerations

This CIP aims to bring together different concepts and syntactic ideas introduced in the design of Cypher for multiple graphs and the CIP for nested subqueries.

It therefore tries to respect the guiding principles already expressed in those CIPs and other related proposals.


=== Interaction with existing features

None known.


=== Alternatives

Instead of adding additional clauses, the major part of the proposed functionality could be expressed using procedures.
However, catalog management was felt central enough to warrant proper inclusion into the language.


=== Syntax variations

* `DROP GRAPH` instead of `DELETE GRAPH`


=== What others do

SQL has followed a similar approach in that it allows both views and tables to be registered in a global catalog.


=== Benefits to this proposal

Catalog management can be expressed using the Cypher language (instead of having to rely on implementation-specific means).


=== Caveats to this proposal

The size of the language is increased.
This makes it harder to learn Cypher.
However, the chosen syntax is quite intuitive, which is expected to at least reduce the impact of this change on readability.