Support "unique" constraint on relationships #173

Open
IanRogers opened this issue Jan 22, 2017 · 12 comments
@IanRogers

IanRogers commented Jan 22, 2017

CIR-2017-173

Related to #172

There are some graphs where you want to ensure there are no duplicate relationships.
CREATE UNIQUE (http://neo4j.com/docs/developer-manual/current/cypher/clauses/create-unique/) does exist, but it relies on every developer remembering to use it!

It would be great to be able to say

create constraint on ()-[r:R]-() assert unique(r)

and have cypher throw an error if a duplicate is attempted.

@thobe
Contributor

thobe commented Jan 23, 2017

Could you elaborate on what you mean by unique for relationships? It seems like you have something useful in mind, but I fail to understand exactly what the scope of the uniqueness should be. Should there only be one relationship with label R in the entire dataset? Should there only be one relationship with label R and the same combination of properties? Or should there only be one relationship with label R and the same source and target node?

The intent of the Cypher specification around constraints has been to provide only the syntax for defining constraints and an understanding of their semantics, not to define which constraints an implementation should (or should not) support. Coming up with a good understanding of the use case, and then a clear way of expressing it, is still very valuable here.

@IanRogers-LShift

Unique as in the same source and target node, i.e. what CREATE UNIQUE currently does, but enforced.

So if someone uses CREATE UNIQUE to add a relationship and a copy already exists then do nothing as now.

But if someone uses a bare CREATE to create a relationship where one already exists with the same source, target and label then throw an error.
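To make the requested behaviour concrete, here is a minimal Python sketch of those two rules. It is only a toy model, not how Cypher or Neo4j implement anything: `RelStore`, `add_unique_constraint`, `create`, `create_unique` and `DuplicateRelationshipError` are hypothetical names made up for this illustration, and a relationship is assumed to be identified by the directed triple (source, type, target).

```python
from collections import Counter


class DuplicateRelationshipError(Exception):
    """Raised when a bare create would duplicate a constrained relationship."""


class RelStore:
    """Toy in-memory relationship store (hypothetical, for illustration only)."""

    def __init__(self):
        # Number of stored relationships per (source, type, target) triple.
        self._counts = Counter()
        # Relationship types that carry the uniqueness constraint.
        self._unique_types = set()

    def add_unique_constraint(self, rel_type):
        self._unique_types.add(rel_type)

    def count(self, source, rel_type, target):
        return self._counts[(source, rel_type, target)]

    def create(self, source, rel_type, target):
        """Bare CREATE: fails if the constraint applies and a copy exists."""
        key = (source, rel_type, target)
        if rel_type in self._unique_types and self._counts[key] > 0:
            raise DuplicateRelationshipError(key)
        self._counts[key] += 1

    def create_unique(self, source, rel_type, target):
        """CREATE UNIQUE-like: does nothing if a copy already exists."""
        key = (source, rel_type, target)
        if self._counts[key] > 0:
            return False
        self._counts[key] = 1
        return True
```

In this model the reverse direction (b to a) is a different relationship, and types without the constraint can still be duplicated by a bare `create`.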

@IanRogers-LShift

IanRogers-LShift commented Jan 23, 2017

Like my comment on #173 the syntax could be simplified to

CREATE CONSTRAINT [:R] ASSERT UNIQUE

and the word "ASSERT" isn't really needed either.

This is mostly because what I really want is to enforce a strict tree in a graph for a certain relationship label by specifying

CREATE CONSTRAINT [:R] ASSERT ACYCLIC, UNIQUE

@IanRogers-LShift

Ah, now I've read #166 I guess the syntax could be

CREATE CONSTRAINT Rdag FOR [r:R] REQUIRE ACYCLIC r

@Mats-SX
Member

Mats-SX commented Jan 26, 2017

> with the same source, target and label

So relationships of another type between the same nodes are fine?

This is essentially a cardinality constraint, and could be expressed a bit more generally like this:

CREATE CONSTRAINT only_one_R_between_nodes
FOR (n)
REQUIRE size((n)-[:R]-()) < 2

@IanRogers-LShift

Nearly. I'm looking for there to be only one link of type R between any pair of nodes; a node may be the start or end of any number of R links. So, perhaps:

CREATE CONSTRAINT enforce_dag_unique_for_R_links
FOR p=(a)-[:R]-(b)
REQUIRE size(p) < 2

I'm not that bothered by the syntax though :-) - with this, and #172, I just want to be able to enforce a DAG.

@Mats-SX
Member

Mats-SX commented Jan 27, 2017

Ah, yes, you're right.

petraselmer changed the title from 'feature request: Support "unique" constraint on relationships' to 'Support "unique" constraint on relationships' on Feb 3, 2017
@petraselmer

Updated the original comment to include the name of the CIR

Mats-SX added a commit to Mats-SX/openCypher that referenced this issue Mar 1, 2017
@Mats-SX
Member

Mats-SX commented Mar 1, 2017

I've tried to capture this CIR in #166.

@Mats-SX
Member

Mats-SX commented Mar 6, 2017

The example above using size() won't work, as size() is not defined over paths. I think the below example would give the requested semantics, but it is not very elegant:

CREATE CONSTRAINT enforce_dag_unique_for_R_links
FOR p = (a)-[:R]-(b)
REQUIRE size(collect(DISTINCT p)) < 2

@IanRogers
Author

IanRogers commented Mar 6, 2017

How about we start simple (as I started in the beginning :) and just do:

CREATE CONSTRAINT enforce_dag_unique_for_R_links
FOR p = (a)-[:R]-(b)
REQUIRE UNIQUE p

as I'd suggest that covers a large set of the use cases (enforcing a DAG like a PERT chart) and it already has a precedent in the CREATE UNIQUE example.

To enforce a tree (the remaining set of use cases I think) would need:

CREATE CONSTRAINT enforce_tree_for_T_links
FOR p = (a)-[:T*]->(b)
REQUIRE UNIQUE p

I think (i.e. the multiple length and added direction) plus something to enforce a unique root - but I'd be happy to leave that out of this discussion.

If someone wants to make a case for the general "size" form then perhaps do so in a different ticket.
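For readers trying to pin down what the variable-length form would have to check, here is a small Python sketch of one possible reading: at most one directed path of T links between any ordered pair of nodes, where a path may not reuse a relationship. That reading is an assumption, not something the thread settles, and `paths_are_unique` is a hypothetical helper, not part of any Cypher implementation.

```python
from collections import defaultdict


def paths_are_unique(edges):
    """Return True if no ordered node pair is joined by more than one path.

    `edges` is a list of (source, target) pairs for a single relationship
    type; a pair listed twice counts as two parallel relationships.
    """
    outgoing = defaultdict(list)
    for index, (source, target) in enumerate(edges):
        outgoing[source].append((index, target))

    def walk(node, used, seen_targets):
        # Extend the current path by every relationship not yet used on it.
        for index, target in outgoing[node]:
            if index in used:
                continue
            if target in seen_targets:
                # A second distinct path from the start node reaches `target`.
                return False
            seen_targets.add(target)
            if not walk(target, used | {index}, seen_targets):
                return False
        return True

    # Snapshot the start nodes, since lookups on leaf nodes add empty entries.
    return all(walk(start, frozenset(), set()) for start in list(outgoing))
```

Under this reading a diamond (two routes to the same node) and a duplicated link are both rejected, while a pure cycle passes, so acyclicity and a unique root would still need separate checks.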

@kant111

kant111 commented Jul 15, 2019

I am also looking for something similar. I would like a unique constraint on relationship properties.

For example, I create the graph incrementally (say in batches from Kafka by repeatedly applying the below operation for every batch):

CALL apoc.periodic.iterate("
MATCH (a:Dense1)
WHERE a.id <> '1'
MATCH (b:Dense1)
WHERE b.id = '1' AND a.key = b.key
RETURN a, b",
"CREATE (a)-[:PARENT_OF {edge_id: a.uid + ',' + b.uid}]->(b)",
{}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages

The above will create duplicates. I understand that I could use MERGE instead of CREATE, but that reduces write performance by a significant margin because of double locking.

For my use case, I would like Neo4j to do nothing if the "same relationship" (as defined by the uniqueness constraint below) already exists or, even better, to throw an exception so the application can use that information to do something else. This way the double locking can be avoided and I can finally build the graph incrementally as fast as possible.

CREATE CONSTRAINT ON ()-[r:PARENT_OF]-() ASSERT r.edge_id IS UNIQUE;
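The application-side flow this would enable can be sketched in Python as follows. This is a toy in-memory model under stated assumptions: `PropertyConstrainedStore`, `UniquePropertyViolation` and `load_batch` are hypothetical names, the constraint is modelled as a plain dictionary lookup on `edge_id`, and nothing here says anything about how Neo4j actually locks or indexes relationships.

```python
class UniquePropertyViolation(Exception):
    """Raised when a constrained property value is already in use."""


class PropertyConstrainedStore:
    """Toy store enforcing uniqueness of one property on one relationship type."""

    def __init__(self, rel_type, prop):
        self.rel_type = rel_type
        self.prop = prop
        self._index = {}          # property value -> (source, target)
        self.relationships = []   # every stored relationship

    def create(self, source, rel_type, target, **props):
        # Only relationships of the constrained type carrying the property
        # are checked; everything else is stored unconditionally.
        if rel_type == self.rel_type and self.prop in props:
            value = props[self.prop]
            if value in self._index:
                raise UniquePropertyViolation(value)
            self._index[value] = (source, target)
        self.relationships.append((source, rel_type, target, props))


def load_batch(store, pairs):
    """Create PARENT_OF links for a batch, skipping ones that already exist."""
    created, skipped = 0, 0
    for a_uid, b_uid in pairs:
        try:
            store.create(a_uid, "PARENT_OF", b_uid, edge_id=f"{a_uid},{b_uid}")
            created += 1
        except UniquePropertyViolation:
            # The application decides what to do with the duplicate.
            skipped += 1
    return created, skipped
```

Replaying an overlapping batch then reports the duplicates to the caller instead of silently storing them.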
