<img src="https://datascientest.fr/train/assets/logo_datascientest.png" style="height:150px">

<hr style="border-width:2px;border-color:#75DFC1">
<center><h1>Neo4J</h1></center>
<center><h2>Basics of Cypher</h2></center>
<hr style="border-width:2px;border-color:#75DFC1">


<blockquote>
<center><h3>The Seven Bridges of Königsberg</h3></center>

In this exercise, you will create a very simple graph as a tribute to Leonhard Euler's solution to the problem of the <a href="https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg">Seven Bridges of Königsberg</a>. His solution is regarded as the starting point of graph theory: 

<center><img src="./koenigsb.gif"></center>
<i>The Pregel river flows through the city of Königsberg (now Kaliningrad, Russia), dividing it into three separate banks and an island. Seven bridges link the different parts of the city: in 1735, Euler showed that it is impossible to take a walk that crosses every bridge exactly once.
    </i>
We are going to create this graph.
</blockquote>
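
Euler's argument comes down to counting, for each land mass, how many bridges touch it: in a connected graph, a walk that uses every bridge exactly once can only exist if zero or two land masses have an odd count. Here is a minimal Python sketch of that check, assuming the bridge layout built in this exercise. The `BRIDGES` dictionary and the function names are illustrative and not part of the course material.

```python
from collections import Counter

# Bridge name -> the two land masses it connects (layout assumed from this exercise).
BRIDGES = {
    "a": ("Kneiphof", "South"),
    "b": ("Kneiphof", "South"),
    "c": ("Kneiphof", "North"),
    "d": ("Kneiphof", "North"),
    "e": ("Kneiphof", "East"),
    "f": ("East", "South"),
    "g": ("North", "East"),
}


def degrees(bridges):
    """Count how many bridges touch each land mass."""
    counts = Counter()
    for start, stop in bridges.values():
        counts[start] += 1
        counts[stop] += 1
    return counts


def has_euler_walk(bridges):
    """Degree test for a connected graph: 0 or 2 odd-degree nodes are required."""
    odd = [node for node, degree in degrees(bridges).items() if degree % 2 == 1]
    return len(odd) in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_euler_walk(BRIDGES))
```

All four land masses touch an odd number of bridges, so the check fails, which matches Euler's conclusion.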

* run the following cell to launch the container and create a Neo4J driver

In [1]:
# Run this cell 

import pprint
# Importing the class Neo4JForDockerDriver
from neo_utils import Neo4JForDockerDriver
# Instantiating the class Neo4JForDockerDriver
neo4j = Neo4JForDockerDriver()
# starting the container
neo4j.launch_container()
# creating the driver
driver = neo4j.create_driver()

<center><h3>Node creation</h3></center>
<blockquote>
To create nodes, the Cypher syntax is as follows: 
    
```cypher
CREATE (n)
```


 The parentheses represent a node. `n` is a variable that refers to the node within the query. You can attach properties to the node with a JSON-like map: 

    
```cypher 
CREATE (n {name: "name of the node", attribute1: "attribute1_value"})
```
 With Cypher, we can label nodes: for example, in a graph representing a family tree, we could label men and women differently. 

    
```cypher
CREATE (n:Label {name: "name of the node", attribute1: "attribute1_value"})
```

 
 A node can carry several labels, so we can add more general labels alongside specific ones (the principle is loosely similar to superclasses in object-oriented programming, although Cypher defines no hierarchy between labels).
```cypher 
CREATE (n:SuperLabel:Label {name: "name of the node", attribute1: "attribute1_value"})
```

</blockquote> 
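
When property values come from Python variables, one common approach is to pass them as query parameters instead of pasting them into the query string. Labels cannot be parameterized in Cypher, so they still have to be written into the query text. The sketch below only builds the query and its parameters; `build_create_node` is a hypothetical helper, not part of `neo_utils` or of the Neo4j driver.

```python
def build_create_node(labels, properties):
    """Return a Cypher CREATE statement and its parameter dictionary.

    Labels are written into the query text; properties travel
    separately through the $props parameter.
    """
    for label in labels:
        # Labels end up in the query text, so only accept plain identifiers.
        if not label.isidentifier():
            raise ValueError(f"Unsafe label: {label!r}")
    label_part = "".join(f":{label}" for label in labels)
    query = f"CREATE (n{label_part} $props)"
    return query, {"props": dict(properties)}


query, params = build_create_node(
    ["SuperLabel", "Label"], {"name": "name of the node"}
)
print(query)   # CREATE (n:SuperLabel:Label $props)
print(params)  # {'props': {'name': 'name of the node'}}
```

With the driver created in the first cell, the pair could then be passed to `session.run(query, params)`; that call is left out here because it needs a running database.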



* using this syntax, create a node whose name is `"North"` and whose label is `Bank`

In [2]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)


In [3]:
# Insert your code here

query = """
CREATE (n:Bank {name: 'North'})
"""

with driver.session() as session:
    results = session.run(query)


<blockquote>
In <b>Cypher</b>, a statement can span multiple lines. Moreover, you can run several statements at once by separating them with <b>;</b>, provided this is enabled in the settings of <b>Neo4J</b>. Finally, Cypher keywords are not case-sensitive, but labels, relationship types, property names and variables are. For the sake of clarity, we will write Cypher keywords in upper case. 
</blockquote>

* Write two queries that create two nodes with label <code>"Bank"</code>, named <code>"South"</code> and <code>"East"</code> respectively (here Neo4J is not set to accept multiple statements at once).

In [4]:
# Insert your code here

queries = [
    """RETURN 'Insert your code in this string'""",
    """RETURN 'Insert your code in this string'"""
]           

with driver.session() as session:
    for q in queries: 
        results = session.run(q)


In [5]:
# Insert your code here

queries = [
    """CREATE (n:Bank {name: "South"})""",
    """CREATE (n:Bank {name: "East"})"""
]

with driver.session() as session:
    for q in queries: 
        results = session.run(q)


> The last node of our graph is the central island named Kneiphof. We are going to give it both the label `Bank` and the label `Island`.

* create a node with labels `Bank` and `Island` and name `"Kneiphof"`

In [6]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)


In [7]:
# Insert your code here

query = """
CREATE (n:Bank:Island {name: 'Kneiphof'})
"""

with driver.session() as session:
    results = session.run(query)


<center><h3> Simple queries </h3></center>

<blockquote>
In the next chapter, we will see how to write more complex queries over a graph with Cypher. For now, we only need very simple matching queries. The following query returns all the nodes of a graph.
<br>
<br>
    
```cypher
MATCH (n) 
RETURN n
```
<br>
<br>
It has two parts: 
    <ul>
        <li>A <code>MATCH</code> part specifying conditions on the nodes to return </li>
        <li>A <code>RETURN</code> part specifying what to return </li>
    </ul>
    
</blockquote>

* run the following cell to return all the nodes in our graph

In [8]:
# Insert your code here

query = """
MATCH (n) RETURN n
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{'n': <Node id=0 labels={'Bank'} properties={'name': 'North'}>},
 {'n': <Node id=20 labels={'Bank'} properties={'name': 'South'}>},
 {'n': <Node id=21 labels={'Bank'} properties={'name': 'East'}>},
 {'n': <Node id=22 labels={'Bank', 'Island'} properties={'name': 'Kneiphof'}>}]


<blockquote>
You can see that each record is a dictionary mapping the returned variable <code>n</code> to a node, which carries its own id, its labels and its properties. 

Neo4J provides a user interface to visualize our graph. We cannot access it here but the same query will return something similar to this: 
<center>
    
```cypher 
MATCH (n) RETURN n 
```
</center>   

<center> <img src="./Pictures/neo4j_02_01.png"> </center>

If we want to return only a specific property of the nodes, we can specify it in the <code>RETURN</code> statement: 

```cypher
MATCH (n) 
RETURN n.property
```
</blockquote>

* make a query to return the <code>name</code> of every node

In [9]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [10]:
# Insert your code here

query = """
MATCH (n) RETURN n.name
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'n.name': 'North'},
 {'n.name': 'South'},
 {'n.name': 'East'},
 {'n.name': 'Kneiphof'}]


<blockquote>
To return specifically the ID or the labels of the nodes, we can use the <code>id()</code> and <code>labels()</code> functions: 

<br>
<br>

```cypher 
MATCH (n) 
RETURN id(n)
```

<br>
<br>

```cypher
MATCH(n)
RETURN labels(n)
```

<br>
<br>


Finally, if we want to put a condition on the nodes to return, we can do it in the <code>MATCH</code> statement. For example, if we want to return all the nodes with <code>property1</code> set to <code>value1</code>, we can do the following: 

<br>
<br>

```cypher 
MATCH (n {property1: 'value1'})
RETURN n
```

<br>
<br>

There are other ways to do the same thing: for example with a <code>WHERE</code> clause: 

<br>
<br>

```cypher 
MATCH (n)
WHERE n.property1 = 'value1'
RETURN n
```

<br>
<br>
This is all we need for this lesson regarding queries.

</blockquote>

* Make a query returning the labels of the nodes whose property <code>name</code> is set to <code>'Kneiphof'</code> 

In [11]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [12]:
# Insert your code here

query = """
MATCH (n {name: 'Kneiphof'})
RETURN labels(n)
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'labels(n)': ['Bank', 'Island']}]


<center><h3>Relationship creation</h3></center>

<blockquote>

In Neo4J, relationships are always directed: if you want to model an undirected relationship, you need to create two directed relationships, one in each direction. Each relationship has exactly one type and can have properties, which is useful for building weighted graphs. Think of a train network: you may want to link two cities and attach to this link the length of the track, the average running time, the average price, and so on. 

To create a relationship between two nodes, we first need to match those nodes and then use a <code>CREATE</code> clause to create the relationship. For example, if we want to create a relationship of type <code>RELATIONSHIP_1</code> between a node with <code>name</code> set to <code>'name1'</code> and another node with <code>name</code> set to <code>'name2'</code>, we would use the following syntax:  

<br>
<br>

```cypher
MATCH (p {name: 'name1'})
MATCH (q {name: 'name2'})
CREATE (p)-[:RELATIONSHIP_1]->(q)
```

<br>
<br>

If we want to create the same relationship in both directions, we can do it in a single query:
    
<br>
<br>
    
```cypher
MATCH (p {name: 'name1'})
MATCH (q {name: 'name2'})
CREATE (p)-[:RELATIONSHIP_1]->(q)
CREATE (q)-[:RELATIONSHIP_1]->(p)
```

<br>
<br>
    

If we want to add properties to this relationship, the syntax is similar to the one used when we created nodes: 

<br>
<br>

```cypher
MATCH (p {name: 'name1'})
MATCH (q {name: 'name2'})
CREATE (p)-[:RELATIONSHIP_1 {property1: 'value1', property2: 'value2'}]->(q)
```

<br>
<br>

Note that if several nodes satisfy the criteria of the <code>MATCH</code> clauses, a relationship is created for every pair of matches. 

We will represent the bridges of Königsberg with relationships of type <code>BRIDGE</code> whose <code>name</code> property corresponds to the letters in the first figure of this lesson.
</blockquote>
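
Writing both directions by hand for every bridge is repetitive. One possible way to cut the repetition is to generate the two `CREATE` clauses from Python. In the sketch below, `undirected_bridge` is a hypothetical helper and the bridge name `'x'` is a placeholder. It assumes that names are short, trusted literals, since they are pasted directly into the query text.

```python
def undirected_bridge(var_a, var_b, name):
    """Return the two CREATE clauses for one bridge, one per direction."""
    forward = f"CREATE ({var_a})-[:BRIDGE {{name: '{name}'}}]->({var_b})"
    backward = f"CREATE ({var_b})-[:BRIDGE {{name: '{name}'}}]->({var_a})"
    return [forward, backward]


# Match the two end nodes once, then append the generated clauses.
lines = [
    "MATCH (p {name: 'name1'})",
    "MATCH (q {name: 'name2'})",
]
lines += undirected_bridge("p", "q", "x")
query = "\n".join(lines)
print(query)
```

The resulting string has the same shape as the hand-written queries in this section and can be passed to `session.run` in the same way.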

* create a query to build an undirected relationship <code>BRIDGE</code> with name <code>'g'</code> between the <code>North</code> and <code>East</code> banks

In [13]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)


In [14]:
# Insert your code here

query = """
MATCH (p {name: 'North'})
MATCH (q {name: 'East'})
CREATE (p)-[:BRIDGE {name: 'g'}]->(q)
CREATE (q)-[:BRIDGE {name: 'g'}]->(p)
"""

with driver.session() as session:
    results = session.run(query)


<blockquote>There are two bridges between the island of <code>Kneiphof</code> and the <code>North</code> bank, named <code>'c'</code> and <code>'d'</code>.</blockquote>

* make a single query to create those bridges

In [15]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)


In [16]:
# Insert your code here

query = """
MATCH (island {name: 'Kneiphof'})
MATCH (north_bank {name: 'North'})
CREATE (island)-[:BRIDGE {name: 'c'}]->(north_bank)
CREATE (island)-[:BRIDGE {name: 'd'}]->(north_bank)
CREATE (north_bank)-[:BRIDGE {name: 'c'}]->(island)
CREATE (north_bank)-[:BRIDGE {name: 'd'}]->(island)
"""

with driver.session() as session:
    results = session.run(query)


<blockquote>As you can see, we can create many relationships in a single query: this allows us to avoid repeating <code>MATCH</code> statements. We still need to create the remaining relationships. </blockquote>

* Try making a query to get all the nodes and create all the remaining relationships at once (see the first figure for names)


In [17]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)


In [18]:
# Insert your code here

query = """
MATCH (island {name: 'Kneiphof'})
MATCH (north_bank {name: 'North'})
MATCH (south_bank {name: 'South'})
MATCH (east_bank {name: 'East'})
// island to south bank
CREATE (island)-[:BRIDGE {name: 'a'}]->(south_bank)
CREATE (island)-[:BRIDGE {name: 'b'}]->(south_bank)
CREATE (south_bank)-[:BRIDGE {name: 'a'}]->(island)
CREATE (south_bank)-[:BRIDGE {name: 'b'}]->(island)
// island to east bank
CREATE (island)-[:BRIDGE {name: 'e'}]->(east_bank)
CREATE (east_bank)-[:BRIDGE {name: 'e'}]->(island)
//east to south 
CREATE (east_bank)-[:BRIDGE {name: 'f'}]->(south_bank)
CREATE (south_bank)-[:BRIDGE {name: 'f'}]->(east_bank)
"""

with driver.session() as session:
    results = session.run(query)


<blockquote>Our graph is now complete. We are going to run a simple query to check that the relationships have actually been created.</blockquote>

* run the following cell

In [19]:
# Run the following cell

query = """
MATCH (p)-[rel]->(q)
RETURN p.name AS Start, rel.name AS Bridge, q.name AS Stop ORDER BY Bridge"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'Bridge': 'a', 'Start': 'South', 'Stop': 'Kneiphof'},
 {'Bridge': 'a', 'Start': 'Kneiphof', 'Stop': 'South'},
 {'Bridge': 'b', 'Start': 'South', 'Stop': 'Kneiphof'},
 {'Bridge': 'b', 'Start': 'Kneiphof', 'Stop': 'South'},
 {'Bridge': 'c', 'Start': 'North', 'Stop': 'Kneiphof'},
 {'Bridge': 'c', 'Start': 'Kneiphof', 'Stop': 'North'},
 {'Bridge': 'd', 'Start': 'North', 'Stop': 'Kneiphof'},
 {'Bridge': 'd', 'Start': 'Kneiphof', 'Stop': 'North'},
 {'Bridge': 'e', 'Start': 'East', 'Stop': 'Kneiphof'},
 {'Bridge': 'e', 'Start': 'Kneiphof', 'Stop': 'East'},
 {'Bridge': 'f', 'Start': 'South', 'Stop': 'East'},
 {'Bridge': 'f', 'Start': 'East', 'Stop': 'South'},
 {'Bridge': 'g', 'Start': 'North', 'Stop': 'East'},
 {'Bridge': 'g', 'Start': 'East', 'Stop': 'North'}]


<blockquote>
If we run the query to get all the nodes and visualize the result, we get something similar to this: 

<br>
<br>
<center>
    
```cypher
MATCH (n) RETURN n
```
</center>
<br>
<br>
    
<center> <img src="./Pictures/neo4j_02_02.png"> </center>

<i>Notice that we have been using the <code>CREATE</code> keyword, but we could have used the <code>MERGE</code> keyword. The difference is subtle: <code>MERGE</code> first looks for the pattern we are trying to create and, if it already exists, reuses it as a <code>MATCH</code> would; otherwise it creates it. <code>CREATE</code> does not check: if the pattern already exists, a duplicate is created anyway. 

</i>
Congratulations on building this graph! Of course, this graph is quite small, but the principle remains the same when dealing with larger graphs. 
    
We still need to know how to modify this data. 
</blockquote>
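
To make the difference between `CREATE` and `MERGE` concrete, here is a toy in-memory model in plain Python. It is only an analogy, not how Neo4J works internally: the `create` and `merge` functions are hypothetical, and the toy compares whole dictionaries, whereas Neo4J looks for the pattern given in the query.

```python
def create(graph, node):
    """Mimic CREATE: always add a new node, even if an identical one exists."""
    graph.append(dict(node))
    return graph[-1]


def merge(graph, node):
    """Mimic MERGE: reuse an identical existing node, otherwise add it."""
    for existing in graph:
        if existing == node:
            return existing
    return create(graph, node)


created, merged = [], []
for _ in range(2):
    # Run the same "statement" twice against each toy graph.
    create(created, {"label": "Bank", "name": "North"})
    merge(merged, {"label": "Bank", "name": "North"})

print(len(created))  # 2: CREATE duplicated the node
print(len(merged))   # 1: MERGE reused the existing node
```

This is why running a `CREATE` cell twice leaves duplicate nodes in the graph, while a `MERGE` with the same pattern would not.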

<center><h3>Property modification</h3></center>

<blockquote>To create properties on existing entities, we can use the <code>SET</code> clause, which also relies on a <code>MATCH</code> statement. For example, if we want to create a new property called <code>new_property</code> with value <code>'new_value'</code> on the nodes with <code>property1</code> set to <code>'value1'</code>, we can do: 

<br>
<br>

```cypher 
MATCH (n {property1: 'value1'})
SET n.new_property = 'new_value'
```

<br>
<br>

If the property does not exist, it is created; if it already exists, its value is overwritten. The same can be done with a relationship: 

<br>
<br>

```cypher 
MATCH (p {property1: 'value1'})
MATCH (q {property1: 'value2'})
MATCH (p)-[rel]->(q)
SET rel.new_property = 'new_value'
```

<br>
<br>

Or 

```cypher 
MATCH ()-[rel {property1: 'value1'}]->()
SET rel.new_property = 'new_value'
```

<br>
<br>
    
<i> Do not worry too much about <code>MATCH</code> statements: we will explore them further in the next lesson.</i>
    
</blockquote>

* add a property <code>'famous_places'</code> to the node named <code>'Kneiphof'</code> whose value is a list containing the <code>Cathedral</code> and the <code>Town Hall</code>


In [20]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)


In [21]:
# Insert your code here

query = """
MATCH (n {name: 'Kneiphof'})
SET n.famous_places = ['Cathedral', 'Town Hall']
"""

with driver.session() as session:
    results = session.run(query)


* create a query to return the node named <code>'Kneiphof'</code>

In [22]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [23]:
# Insert your code here

query = """
MATCH (n {name: 'Kneiphof'})
RETURN n"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'n': <Node id=22 labels={'Bank', 'Island'} properties={'name': 'Kneiphof', 'famous_places': ['Cathedral', 'Town Hall']}>}]


* add a property <code>'tarif'</code> to the bridge <code>'a'</code> with a value of <code>25</code>

In [24]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)

In [25]:
# Insert your code here

query = """
MATCH ()-[rel { name: 'a'}]->()
SET rel.tarif = 25
"""

with driver.session() as session:
    results = session.run(query)

* create a query to get all the information about the bridge named <code>'a'</code>

In [26]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [27]:
# Insert your code here

query = """
MATCH ()-[rel { name: 'a'}]->()
RETURN rel
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{'rel': <Relationship id=24 nodes=(<Node id=22 labels=set() properties={}>, <Node id=20 labels=set() properties={}>) type='BRIDGE' properties={'name': 'a', 'tarif': 25}>},
 {'rel': <Relationship id=26 nodes=(<Node id=20 labels=set() properties={}>, <Node id=22 labels=set() properties={}>) type='BRIDGE' properties={'name': 'a', 'tarif': 25}>}]


<blockquote>
Finally, to master all the possibilities of graph creation, we need to see how to delete nodes or relationships.
</blockquote>

<center><h3>Deleting nodes and relationships</h3></center>

<blockquote>
Once again, the principle is the same as before: 
<ul>
    <li>a <code>MATCH</code> statement </li>
    <li>a <code>DELETE</code> statement </li>
</ul>
     
For example, to delete all nodes with property <code>property1</code> set to <code>'value1'</code>, we can do the following: 

<br>
<br>

```cypher 
MATCH (n {property1: 'value1'})
DELETE n
```

<br>
<br>

The issue with this syntax is that Neo4J throws an error if the node still has any relationship. There are two ways to overcome this issue: 
<ul> 
    <li> Deleting the relationships of this particular node before deleting the node itself. </li>
    <li> Using the clause <code>DETACH DELETE</code>. </li>
</ul>

For the first solution, we can do the following:

<br>
<br>

```cypher
// The undirected pattern matches incoming and outgoing relationships alike
MATCH (n {property1: 'value1'})-[rel]-()
DELETE rel
```
<br>
<br>

Then run the plain <code>DELETE</code> query shown above. 
<br>
For the second solution, we can do the following: 

<br>
<br>

```cypher
MATCH (n {property1: 'value1'})
DETACH DELETE n
```

</blockquote>
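
The difference between the two clauses can be mimicked with a toy in-memory graph in plain Python. This is only an illustration of the behavior described above, not Neo4J's implementation; the `ToyGraph` class and its methods are hypothetical.

```python
class ToyGraph:
    """A tiny in-memory stand-in for a graph, for illustration only."""

    def __init__(self):
        self.nodes = set()
        self.relationships = []  # (start, name, stop) tuples

    def delete(self, node):
        """Mimic DELETE: refuse to remove a node that still has relationships."""
        if any(node in (start, stop) for start, _, stop in self.relationships):
            raise ValueError(f"{node} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        """Mimic DETACH DELETE: drop the node's relationships, then the node."""
        self.relationships = [
            rel for rel in self.relationships if node not in (rel[0], rel[2])
        ]
        self.delete(node)


graph = ToyGraph()
graph.nodes.update({"p", "q"})
graph.relationships.append(("p", "r1", "q"))

try:
    graph.delete("p")  # fails: "p" is still attached to "q"
except ValueError as error:
    print(error)

graph.detach_delete("p")  # removes the relationship first, then the node
print(graph.nodes, graph.relationships)
```

The plain delete is rejected while the relationship exists, whereas the detaching variant removes the relationship and the node in one step.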

* create a query to delete the node named <code>'Kneiphof'</code>

In [28]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)

In [29]:
# Insert your code here

query = """
MATCH (n {name: 'Kneiphof'})
DETACH DELETE n
"""

with driver.session() as session:
    results = session.run(query)

<blockquote> 
    In this lesson, we have seen some of the main aspects of the creation, modification and deletion of nodes, relationships and properties. The syntax almost always follows the same pattern: we find the data with a <code>MATCH</code> statement and change it with <code>CREATE</code>, <code>SET</code> or <code>DELETE</code>.
    <br>
    We have only scratched the surface of what <code>MATCH</code> clauses can do; we will explore them in detail in the next chapter. 
</blockquote>