# About

This notebook will introduce constraints in Neo4j as well as the `MERGE` Cypher keyword.

In [1]:
from neo4j import GraphDatabase, Record, ResultSummary, EagerResult
from neo4j.time import Date

import pandas as pd
pd.set_option('display.max_colwidth', 100)

import os 
import sys
from dotenv import load_dotenv 
load_dotenv()

# Add the utils directory to sys.path
sys.path.append(os.path.abspath("../utils"))

from Neo4jParser import Neo4jParser


NEO4J_URI = os.getenv("NEO4J_URI")
NEO4J_USERNAME = os.getenv("NEO4J_USERNAME")
NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD")

driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

## Constraints
* Constraints help ensure two things:
    1. **Data integrity or governance**
    2. **Data consistency**
    * For example, create a `Person` label that represents people and require every `Person` to have a name. A `Person` without a name then cannot be added to the graph.
        * <u>*Advanced Tip:*</u> Use a Pydantic model to define the entity you're uploading to the graph, so the data behind every node of that type has the same schema. Although being schema-optional is one of Neo4j's advantages, each entity type should still have a schema, and every instance of it should behave the same way.
* With a uniqueness constraint in place, you can still remove the constrained property from a node, but you cannot create or update data in a way that violates the constraint.
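The Advanced Tip above can be sketched without touching the database. This is a minimal illustration that uses a standard-library dataclass as a stand-in for a Pydantic model; the `PersonNode` class, its `born` field, and the `to_properties` helper are hypothetical names chosen for this example, not part of the notebook's utilities.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class PersonNode:
    """Hypothetical schema for a Person node; `name` is mandatory."""
    name: str
    born: Optional[int] = None  # assumed optional property, for illustration only

    def __post_init__(self):
        # Reject missing or blank names before anything reaches the database.
        if not isinstance(self.name, str) or not self.name.strip():
            raise ValueError("A Person must have a non-empty name")

    def to_properties(self) -> dict:
        # Drop unset optional fields so they are not sent as null properties.
        return {k: v for k, v in asdict(self).items() if v is not None}


tom = PersonNode(name="Tom Hanks", born=1956)
print(tom.to_properties())  # {'name': 'Tom Hanks', 'born': 1956}
```

The resulting dictionary could then be passed as query parameters, so every `Person` written to the graph goes through the same validation first.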

In [2]:
# Create a constraint on the Person node
result = driver.execute_query(
    """ 
    CREATE CONSTRAINT name_constraint
    FOR (p:Person) REQUIRE p.name IS UNIQUE
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

Started streaming 0 records after 77 ms and completed after 77 ms.

Query executed against database: 'neo4j':  
    CREATE CONSTRAINT name_constraint
    FOR (p:Person) REQUIRE p.name IS UNIQUE
    


{}

In [4]:
# Now, try to upload another "Tom Hanks" Person node
result = driver.execute_query(
    """ 
    CREATE (p:Person {name: "Tom Hanks"}) RETURN p
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

ConstraintError: {code: Neo.ClientError.Schema.ConstraintValidationFailed} {message: Node(159) already exists with label `Person` and property `name` = 'Tom Hanks'}

**NOTE:**

This is one of the few times we as developers are excited to see an error! The constraint worked: we cannot upload another `Person` node with the name "Tom Hanks". Neo4j will also throw an error if you try to define another constraint with the same name.
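If re-running the notebook makes the "constraint already exists" error a nuisance, one option is Cypher's `IF NOT EXISTS` clause. The sketch below assumes `driver` is the Neo4j driver created at the top of the notebook; the helper name `ensure_name_constraint` is hypothetical.

```python
# Same constraint as above, but safe to run more than once.
CREATE_NAME_CONSTRAINT = """
CREATE CONSTRAINT name_constraint IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE
"""


def ensure_name_constraint(driver, database="neo4j"):
    """Create the uniqueness constraint unless it already exists.

    `driver` is assumed to expose `execute_query`, as the neo4j driver does.
    """
    return driver.execute_query(CREATE_NAME_CONSTRAINT, database_=database)
```

Calling `ensure_name_constraint(driver)` a second time should then be a no-op rather than an error.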

In [None]:
# Create a constraint requiring every ACTED_IN relationship to have a "roles" property
result = driver.execute_query(
    """ 
    CREATE CONSTRAINT acted_in_relationship
    FOR ()-[r:ACTED_IN]-() REQUIRE r.roles IS NOT NULL
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

Started streaming 0 records after 26 ms and completed after 27 ms.

Query executed against database: 'neo4j':  
    CREATE CONSTRAINT acted_in_relationship
    FOR ()-[r:ACTED_IN]-() REQUIRE r.roles IS NOT NULL
    


{}

In [15]:
# Show all constraints in the graph
result = driver.execute_query(
    """ 
    SHOW CONSTRAINT
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, True)
data

Started streaming 2 records after 1 ms and completed after 2 ms.

Query executed against database: 'neo4j':  
    SHOW CONSTRAINT
    


| | id | labelsOrTypes | properties | propertyType | entityType | name | type | ownedIndex |
|---|---|---|---|---|---|---|---|---|
| 0 | 3 | [ACTED_IN] | [roles] | | RELATIONSHIP | acted_in_relationship | RELATIONSHIP_PROPERTY_EXISTENCE | |
| 1 | 2 | [Person] | [name] | | NODE | name_constraint | UNIQUENESS | name_constraint |


In [None]:
# Let's drop a specific constraint: "acted_in_relationship"
result = driver.execute_query(
    """ 
    DROP CONSTRAINT acted_in_relationship;
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, True)
data

Started streaming 0 records after 1 ms and completed after 1 ms.

Query executed against database: 'neo4j':  
    DROP CONSTRAINT acted_in_relationship;
    


## `MERGE`

* `MERGE` combines `MATCH` and `CREATE`: it looks for the given pattern and creates it only if no match is found.
    * After discovering `MERGE`, I've personally never used `CREATE` again. In a way, `MERGE` adds an intrinsic constraint when creating nodes/relationships with a schema.
    * The caveat: if you `MERGE` a node using only some of its properties, no new node is created. `MERGE` matches the existing node, including the properties you did not specify.
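The match-or-create behavior and its caveat can be illustrated with a toy, in-memory model. This is only a sketch of the semantics described above, not how Neo4j implements `MERGE`; the `merge` function and the list-of-dicts "graph" are hypothetical, and the toy assumes one label per node and returns only the first match.

```python
def merge(nodes, label, **props):
    """Toy MERGE: return (node, created) for the given label and properties."""
    for node in nodes:
        # A node matches when it has the label and every requested property value.
        if node["label"] == label and all(
            node["properties"].get(k) == v for k, v in props.items()
        ):
            return node, False
    # Nothing matched the whole pattern, so create it.
    node = {"label": label, "properties": dict(props)}
    nodes.append(node)
    return node, True


graph = [{"label": "Movie", "properties": {"title": "Cast Away", "released": 2000}}]

# Partial properties: the existing node matches, so nothing is created.
node, created = merge(graph, "Movie", title="Cast Away")
print(created, len(graph))  # False 1

# One differing property: no node matches the full pattern, so a new one is created.
node, created = merge(graph, "Movie", title="Cast Away", released=1999)
print(created, len(graph))  # True 2
```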

In [25]:
# First, look up the existing "Cast Away" Movie node with MATCH
result = driver.execute_query(
    """ 
    MATCH (m:Movie {title: "Cast Away"})
    RETURN m
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

Started streaming 1 records after 35 ms and completed after 36 ms.

Query executed against database: 'neo4j':  
    MATCH (m:Movie {title: "Cast Away"})
    RETURN m
    


{'m': [{'elementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:138',
   'labels': frozenset({'Movie'}),
   'properties': {'tagline': 'At the edge of the world, his journey begins.',
    'title': 'Cast Away',
    'released': 2000}}]}

In [26]:
# Let's merge a Movie node using only its title
result = driver.execute_query(
    """ 
    MERGE (m:Movie {title: "Cast Away"})
    RETURN m
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

Started streaming 1 records after 40 ms and completed after 40 ms.

Query executed against database: 'neo4j':  
    MERGE (m:Movie {title: "Cast Away"})
    RETURN m
    


{'m': [{'elementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:138',
   'labels': frozenset({'Movie'}),
   'properties': {'tagline': 'At the edge of the world, his journey begins.',
    'title': 'Cast Away',
    'released': 2000}}]}

**NOTE:** Notice that `MERGE` did not create a new node. It matched the existing "Cast Away" `Movie` node in the graph. If you actually want a new node that has only the `title` property, use `CREATE`. Still, this is **EXCELLENT** native behavior in Neo4j because it supports the concept of ontologies, which you can read more about [here](https://en.wikipedia.org/wiki/Ontology_(information_science)).

In [None]:
# Let's merge a Person node
result = driver.execute_query(
    """ 
    MERGE (p:Person {name: "Tom Hanks"})
    RETURN p
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

*Scenario:* Search for a person named "Cameron" who is 34 years old. If the node exists, return it. Otherwise, create it.

In [27]:
result = driver.execute_query(
    """ 
    MERGE (p:Person {name: "Cameron", age:34})
    RETURN p
    """,
    database_="neo4j"
)

data = Neo4jParser.parse(result, True, False)
data

Started streaming 1 records after 29 ms and completed after 37 ms.

Query executed against database: 'neo4j':  
    MERGE (p:Person {name: "Cameron", age:34})
    RETURN p
    


{'p': [{'elementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:171',
   'labels': frozenset({'Person'}),
   'properties': {'name': 'Cameron', 'age': 34}}]}
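Because the scenario above merges on both `name` and `age`, a `Person` named "Cameron" with a different age would not match the full pattern, and the attempted create could then collide with a uniqueness constraint on `name` if one is still in place. One common alternative is to merge on the identifying property only and set the rest with `ON CREATE SET`. The sketch below assumes `driver` is the driver created at the top of the notebook; `merge_person` is a hypothetical helper name.

```python
# Match on the unique key only; set other properties when the node is first created.
MERGE_PERSON = """
MERGE (p:Person {name: $name})
ON CREATE SET p.age = $age
RETURN p
"""


def merge_person(driver, name, age, database="neo4j"):
    """Find the Person by name, or create it with the given age."""
    return driver.execute_query(
        MERGE_PERSON,
        parameters_={"name": name, "age": age},  # query parameters instead of string literals
        database_=database,
    )
```

A call such as `merge_person(driver, "Cameron", 34)` would return the existing node unchanged if "Cameron" is already in the graph, and create it with `age` 34 otherwise.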