Configurable pattern matching semantics in response to #174 #175

Open
wants to merge 38 commits into
base: master
from

6 participants
@Mats-SX

I like the direction of this.

The default uniqueness mode used by `MATCH` (without a further specification of the preferred uniqueness mode) is relationship-unique matching.
`MATCH ALL` does not reject any paths - not even paths containing cycles - and hence can lead to infinite result sets for the whole query.
It is recommended that implementations generate at least a warning when static analysis is not able to proof query termination due to the chosen uniqueness mode.

@Mats-SX

Mats-SX Jan 26, 2017

Member

proof -> prove

=== Proposal: Default uniqueness mode
Additionally, it is proposed that a conforming implementation should provide a pre-parser option for defining a default uniqueness level for use with regular pattern matching.

@Mats-SX

Mats-SX Jan 26, 2017

Member

I'm not convinced this kind of recommendation belongs in a CIP. Is it not well understood that an implementing system would provide ways of changing defaults?

* `closed(p)`: true if the start and the end node of `p` are the same node
* `trail(p)`: true if `p` contains no duplicate relationships
* `simple(p)`: true if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node
* `trek(p)`: true if `p` contains two identical consecutive relationships

@Mats-SX

Mats-SX Jan 26, 2017

Member

What does identical mean here? Same rel-type? Same type and properties? Equal?

* `trail(p)`: true if `p` contains no duplicate relationships
* `simple(p)`: true if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node
* `trek(p)`: true if `p` contains two identical consecutive relationships
* `repetetive(p)`: true if `p` contains any closed subpath `q` of `size > 1` that is immediately repeated after itself in `p`

@Mats-SX

Mats-SX Jan 26, 2017

Member

repetitive
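
For reference while reviewing the predicate definitions quoted above, here is a minimal Python sketch of how they could be evaluated on a concrete path. It is not part of the CIP: the `Path` class, the function names, and the use of plain string identifiers for nodes and relationships are assumptions made for illustration. It further assumes that "identical" means "the same relationship" (the open question raised above), that `size` counts relationships, and it reads the `simple(p)` definition literally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    """Hypothetical path: nodes[i] and nodes[i + 1] are joined by rels[i]."""
    nodes: List[str]
    rels: List[str]


def closed(p: Path) -> bool:
    # Start node and end node are the same node.
    return p.nodes[0] == p.nodes[-1]


def trail(p: Path) -> bool:
    # No relationship occurs more than once.
    return len(set(p.rels)) == len(p.rels)


def simple(p: Path) -> bool:
    # Literal reading of the quoted text: no duplicate relationships, and
    # either no duplicate nodes at all or start node == end node.
    no_duplicate_nodes = len(set(p.nodes)) == len(p.nodes)
    return trail(p) and (no_duplicate_nodes or closed(p))


def trek(p: Path) -> bool:
    # As quoted: true if two consecutive relationships are identical
    # (assumed here to mean the same relationship identifier).
    return any(a == b for a, b in zip(p.rels, p.rels[1:]))


def repetitive(p: Path) -> bool:
    # Some closed subpath with more than one relationship is immediately
    # followed by a repetition of itself.
    n = len(p.rels)
    for size in range(2, n // 2 + 1):
        for i in range(n - 2 * size + 1):
            same_rels = p.rels[i:i + size] == p.rels[i + size:i + 2 * size]
            if same_rels and p.nodes[i] == p.nodes[i + size]:
                return True
    return False


# Two small example paths (made-up identifiers).
triangle = Path(["a", "b", "c", "a"], ["r1", "r2", "r3"])
back_and_forth = Path(["a", "b", "a", "b", "a"], ["r1", "r2", "r1", "r2"])
```

Under these assumptions `triangle` is closed, a trail, and simple, while `back_and_forth` is closed and repetitive but not a trail. A different answer to the "identical" question would change `trek` and `repetitive` only.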

RETURN p
----
Note that these functions naturally extend to lists.

@Mats-SX

Mats-SX Jan 26, 2017

Member

Do you mean lists generally, or lists containing only nodes and relationships? I'm not sure I follow; what does trail(list) yielding true mean? That the list is a trail?

Changing the uniqueness mode of a sub query recursively changes the default uniqueness mode for all contained `MATCH` clauses unless it is overridden again. Examples:
* `MATCH <uniqueness-modes> { MATCH ... } ...`
* `DO <uniqueness-modes> { MATCH ... } ...`

@Mats-SX

Mats-SX Jan 26, 2017

Member

Are MATCH and DO (this is the first time it appears on this repo I think) the two cases where you'd be able to supply these modes? What about MERGE?

@boggle boggle changed the title from Add proposal for isomorphic pattern matching in response to #174 to Configurable pattern matching semantics in response to #174 Jul 17, 2017

== Motivation
Currently Cypher uses pattern matching semantics that treats all patterns that occur in a `MATCH` clause as a unit (called a *uniqueness scope*) and only considers pattern instances that bind different relationships to each fixed length relationship pattern variable and to each element of a variable length relationship pattern variable.
This has come to be called *cyphermorphism* informally and is a variation of edge isomorphism.

@Mats-SX

Mats-SX Jul 21, 2017

Member

I thought these two were synonymous; what is the variation?

@boggle

boggle Jul 23, 2017

Contributor

'Academic' edge isomorphism only talks about a single, connected candidate walk while cyphermorphism considers all relationships bound by any pattern in the same match (even relationships bound by different, disconnected walks) for uniqueness.

@Mats-SX

Mats-SX Jul 24, 2017

Member

Aha! Thanks for the clarification.

@alastai

alastai Jul 26, 2017

I don't think that this is a difference of "morphism". If one followed strict isomorphism ("path isomorphism" in Walks, Trails, Paths terms, no repeated vertices, and therefore also no repeated edges), then Cypher's current "pattern gluing" rules would apply (unless we change those rules), and we would end up evaluating matches against the compound, glued pattern, but using isomorphic semantics. Gluing may be syntactic salt, but is orthogonal to "morphism". Cyphermorphism, in my view, is no different to "Trail morphism", or "edge isomorphism".

@romanskas

romanskas commented Aug 1, 2017

Just for completeness: there is a fourth option (injective vertices, non-injective edges): (a)-[e1]->(b), (a)-[e2]->(b). In this case, a and b have to be distinct, but e1 and e2 can match the same edge.
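
To make the four combinations concrete, here is a minimal Python sketch that checks one candidate binding of the pattern `(a)-[e1]->(b), (a)-[e2]->(b)` against each of them. The `admissible` function, its flag names, and the identifiers `n1`, `n2`, `r1` are assumptions for illustration and are not taken from the CIP. The binding covers both pattern parts at once, in line with the uniqueness scope spanning the whole `MATCH`.

```python
from typing import Dict


def admissible(node_binding: Dict[str, str],
               rel_binding: Dict[str, str],
               different_nodes: bool,
               different_rels: bool) -> bool:
    """Check a candidate binding against the requested injectivity rules."""

    def injective(binding: Dict[str, str]) -> bool:
        # Injective: no two variables are bound to the same graph element.
        return len(set(binding.values())) == len(binding)

    if different_nodes and not injective(node_binding):
        return False
    if different_rels and not injective(rel_binding):
        return False
    return True


# Candidate binding: a and b are distinct nodes, e1 and e2 hit the same edge.
nodes = {"a": "n1", "b": "n2"}
rels = {"e1": "r1", "e2": "r1"}

results = {
    "homomorphism": admissible(nodes, rels, False, False),
    "edge-injective": admissible(nodes, rels, False, True),
    "node-injective only": admissible(nodes, rels, True, False),
    "node- and edge-injective": admissible(nodes, rels, True, True),
}
```

With this binding, the sketch accepts the match under homomorphism and under the node-injective-only option described in the comment, and rejects it whenever relationships must be different.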

| 'DIFFERENT', ('RELATIONSHIPS' | 'EDGES'), [ VariableList ]
PatternMorphism = 'DIFFERENT', ('NODES' | 'VERTICES')
| 'DIFFERENT', ('RELATIONSHIPS' | 'EDGES')
| 'DIFFERENT', [ VariableList ]

@thobe

thobe Mar 27, 2018

Contributor

Optional VariableList? Is that really right?

As we can see above, patterns in Cypher consist of a comma-separated list of _pattern parts_, where a pattern part is exemplified by `p = (e:Employee)-[:REPORTS_TO*1..3]->(m:Manager)`.
PathClass = 'WALK'
| 'TRAIL'
| [ 'SIMPLE' ], 'PATH'

@thobe

thobe Mar 27, 2018

Contributor

Do TRAIL, PATH, and SIMPLE PATH really encode three different classes? If not, I wonder why synonyms are allowed. (A WALK is obviously different from those three.)

@boggle boggle closed this May 7, 2018

@boggle boggle deleted the boggle:isomatch branch May 7, 2018

@boggle boggle restored the boggle:isomatch branch May 7, 2018

@petraselmer petraselmer reopened this May 31, 2018

@boggle

Contributor

boggle commented May 31, 2018

Note that this CIP is in a heavy state of flux in order to allow for alignment with ongoing discussions.
