<img src="https://datascientest.fr/train/assets/logo_datascientest.png" style="height:150px">

<hr style="border-width:2px;border-color:#75DFC1">
<center><h1>Neo4J</h1></center>
<center><h2>Basics of Cypher</h2></center>
<hr style="border-width:2px;border-color:#75DFC1">


<blockquote>
<center><h3>Our dataset</h3></center>


In this exercise, we will work with a dataset containing information about <a href="https://en.wikipedia.org/wiki/Marvel_Cinematic_Universe">Marvel Cinematic Universe</a> movies. It is derived from the <a href="https://www.imdb.com/interfaces/">IMDB datasets</a>. The dataset contains <code>Actors</code>, <code>Characters</code> and <code>Movies</code>. <code>Actors</code> can <code>PLAY</code> a <code>Character</code>. <code>Characters</code> can <code>APPEAR_IN</code> a <code>Movie</code>. An <code>Actor</code> is also a <code>Person</code>. 

An <code>Actor</code> can have properties such as a <code>name</code>, a <code>birth_year</code>, a <code>death_year</code>, a list of <code>profession</code> values, ... A <code>Character</code> has only one attribute: its <code>name</code>. A <code>Movie</code> can have a <code>title</code>, a <code>year</code>, a <code>runtime</code> and a list of <code>genres</code>.

<center><img src="./demo_mcu.png"></center>
    
    
<div class="alert alert-info"><i class='fa fa-exclamation-circle'></i> &emsp; Notice that we use <code>UPPER CASE</code> and <code>underscores</code> for relationship types and <code>Capitalized</code> names for node labels. This is only a convention.</div>
    

In this lesson, we will learn how to make complicated queries and also how to load data from a <code>csv</code> file.
    
</blockquote>

* run the following cell to launch the container with the dataset already loaded

In [1]:
from neo4j import GraphDatabase

In [4]:
import pprint
driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'password'))

<center><h3>Querying nodes</h3></center>

<blockquote>

In the previous lecture, we saw how to build simple queries from: 
    <ul>
        <li>a <code>MATCH</code> statement</li>
        <li>a <code>RETURN</code> statement</li>
    </ul>

    
For example, to return every node with a property <code>property1</code> set to value <code>"value1"</code>, you can use either of the following syntaxes: 
    
<br>
<br>    
    
```cypher
    MATCH (n {property1: "value1"})
    RETURN n
```

<br>
<br> 
    
```cypher
    MATCH (n)
    WHERE n.property1 = "value1"
    RETURN n
``` 
    
<br>
<br>
    
To add conditions on a specific label, for example, <code>Label1</code>, we can use the following:
    
<br>
<br>

```cypher
MATCH (n:Label1)
RETURN n
```
    
<br>
<br>
    
</blockquote>
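Values do not have to be hard-coded in the query string: the Python driver also accepts query parameters, which are referenced in Cypher with a <code>$</code> prefix. The sketch below is only an illustration. The helper name <code>movies_released_in</code> and the parameter name <code>year</code> are assumptions made for this example, and the function expects the <code>driver</code> object created at the top of this notebook.

```python
def movies_released_in(driver, year):
    """Return the titles of Movie nodes whose `year` property equals `year`.

    `driver` is assumed to be the neo4j driver created earlier in the notebook.
    """
    query = """
    MATCH (m:Movie)
    WHERE m.year = $year
    RETURN m.title AS title
    """
    with driver.session() as session:
        # Keyword arguments passed to session.run() become Cypher parameters
        results = session.run(query, year=year)
        return results.data()
```

It could then be called as <code>movies_released_in(driver, 2008)</code>. Passing values as parameters avoids building query strings by hand.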


* Make a query to return all the <code>Person</code> nodes with property <code>birth_year</code> set to <code>1975</code>

In [5]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [6]:
# Insert your code here

query = """
MATCH (n:Person {birth_year: 1975})
RETURN n
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'n': <Node id=480 labels={'Person'} properties={'profession': ['producer', 'writer', 'miscellaneous'], 'name': 'Drew Pearce', 'id': 'nm1510800', 'birth_year': 1975}>},
 {'n': <Node id=485 labels={'Person'} properties={'profession': ['writer', 'actor', 'producer'], 'name': 'C. Robert Cargill', 'id': 'nm1803036', 'birth_year': 1975}>},
 {'n': <Node id=834 labels={'Person'} properties={'profession': ['producer', 'actor', 'director'], 'name': 'Taika Waititi', 'id': 'nm0169806', 'birth_year': 1975}>},
 {'n': <Node id=835 labels={'Actor', 'Person'} properties={'profession': ['actor', 'producer', 'soundtrack'], 'name': 'Bradley Cooper', 'id': 'nm0177896', 'birth_year': 1975}>}]


<blockquote>
To return only part of the information, such as a single attribute, we can use the following syntax. 

<br>
<br>

```cypher 
MATCH (n)
RETURN n.attribute
```

<br>
<br>

Some special attributes are accessed through dedicated functions: for example, to access the <code>ID</code> of a node or its <code>labels</code>, we can use the <code>id</code> and <code>labels</code> functions.

<br>
<br>

```cypher
MATCH (n)
RETURN labels(n), id(n)
```
    
<br>
<br>
    
</blockquote>

* Write a query to return the <code>year</code> attribute of each node with label <code>Movie</code>, together with its <code>id</code>

In [7]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [8]:
# Insert your code here

query = """
MATCH (m:Movie)
RETURN m.year, id(m)
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'id(m)': 513, 'm.year': 2008},
 {'id(m)': 514, 'm.year': 2011},
 {'id(m)': 515, 'm.year': 2015},
 {'id(m)': 516, 'm.year': 2008},
 {'id(m)': 517, 'm.year': 2011},
 {'id(m)': 518, 'm.year': 2012},
 {'id(m)': 519, 'm.year': 2016},
 {'id(m)': 520, 'm.year': 2010},
 {'id(m)': 521, 'm.year': 2013},
 {'id(m)': 533, 'm.year': 2018},
 {'id(m)': 534, 'm.year': 2014},
 {'id(m)': 535, 'm.year': 2013},
 {'id(m)': 536, 'm.year': 2014},
 {'id(m)': 537, 'm.year': 2017},
 {'id(m)': 538, 'm.year': 2015},
 {'id(m)': 539, 'm.year': 2013},
 {'id(m)': 540, 'm.year': 2016},
 {'id(m)': 541, 'm.year': 2017},
 {'id(m)': 542, 'm.year': 2017},
 {'id(m)': 543, 'm.year': 2019},
 {'id(m)': 544, 'm.year': 2018},
 {'id(m)': 545, 'm.year': 2019},
 {'id(m)': 546, 'm.year': 2018}]


<blockquote>
Note that you can alias the results of your queries using the keyword <code>AS</code> as follows: 

<br>
<br>

```cypher
MATCH (n)
RETURN n.attribute AS my_attribute
```
</blockquote>

* rewrite the previous query to alias the returned values as <code>id</code> and <code>start_year</code>

In [9]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [10]:
# Insert your code here

query = """
MATCH (m:Movie)
RETURN m.year AS start_year, id(m) AS id
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'id': 513, 'start_year': 2008},
 {'id': 514, 'start_year': 2011},
 {'id': 515, 'start_year': 2015},
 {'id': 516, 'start_year': 2008},
 {'id': 517, 'start_year': 2011},
 {'id': 518, 'start_year': 2012},
 {'id': 519, 'start_year': 2016},
 {'id': 520, 'start_year': 2010},
 {'id': 521, 'start_year': 2013},
 {'id': 533, 'start_year': 2018},
 {'id': 534, 'start_year': 2014},
 {'id': 535, 'start_year': 2013},
 {'id': 536, 'start_year': 2014},
 {'id': 537, 'start_year': 2017},
 {'id': 538, 'start_year': 2015},
 {'id': 539, 'start_year': 2013},
 {'id': 540, 'start_year': 2016},
 {'id': 541, 'start_year': 2017},
 {'id': 542, 'start_year': 2017},
 {'id': 543, 'start_year': 2019},
 {'id': 544, 'start_year': 2018},
 {'id': 545, 'start_year': 2019},
 {'id': 546, 'start_year': 2018}]
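As the output above shows, the aliases chosen with <code>AS</code> become the keys of the dictionaries returned by <code>results.data()</code>, which makes the rows easier to post-process in Python. A minimal sketch, using the first three rows of that output (the variable names <code>rows</code> and <code>year_by_id</code> are chosen for this example only):

```python
# First three rows of the aliased query result, copied from the output above
rows = [
    {"id": 513, "start_year": 2008},
    {"id": 514, "start_year": 2011},
    {"id": 515, "start_year": 2015},
]

# The aliases are used directly as dictionary keys
year_by_id = {row["id"]: row["start_year"] for row in rows}
```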


<center><h3>Advanced conditions</h3></center>


<blockquote>
<h4>Mathematical operators</h4>
Cypher has the classical mathematical operators: 
<ul>
    <li><code>&lt;&gt;</code> is used to mean <i>different from</i></li>
    <li><code>&lt;</code> and <code>&gt;</code> are used for strict comparison </li>
    <li><code>&lt;=</code> and <code>&gt;=</code> are used for non-strict comparison </li>
</ul>

<h4>Logical operators</h4>

We can combine multiple conditions using <code>OR</code>, <code>AND</code>, <code>XOR</code>, <code>NOT</code>...

For example, to get the ids of the nodes that have <code>attribute1</code> different from <code>value1</code> and <code>attribute2</code> strictly greater than <code>value2</code>, we can do 

<br>
<br>

```cypher
MATCH (n) 
WHERE n.attribute1 <> value1
AND n.attribute2 > value2 
RETURN id(n) AS id
```

<br>
<br>

    
To check that a value belongs to a list of values, you can use the <code>IN</code> keyword: 
    
<br>
<br>
    
```cypher
MATCH (n)
WHERE n.attribute1 IN ['value1', 'value2']
RETURN n
```
    
<br>
<br>

<h4>String operators</h4>
    
We can also match strings against a given substring or pattern: 
<ul>
    <li><code>STARTS WITH "substring"</code> checks if the string starts with a given substring</li>
    <li><code>ENDS WITH "substring"</code> checks if the string ends with a given substring</li>
    <li><code>CONTAINS "substring"</code> checks if the string contains a given substring</li>
    <li><code>=~ "regular expression"</code> checks if the string matches a given regular expression</li>
</ul>
    
For example, to get all the nodes whose attribute <code>attribute1</code> starts with <code>"string_start"</code>, we can use the following syntax: 
    
<br>
<br>
    
```cypher
MATCH (n) 
WHERE n.attribute1 STARTS WITH "string_start"
RETURN n
```

<br>
<br>

</blockquote>
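To build an intuition for the string operators, here is a rough pure-Python analogy on a few names taken from this dataset. It is only a sketch: Cypher evaluates regular expressions with Java's regex engine, whose syntax differs from Python's <code>re</code> module in some details. The important point it illustrates is that <code>=~</code> must match the whole string, which is why <code>re.fullmatch</code> is used here.

```python
import re

names = ["Robert Downey Jr.", "Robert Redford", "Chris Evans", "Mark Ruffalo"]

# WHERE n.name STARTS WITH "Robert"
starts = [n for n in names if n.startswith("Robert")]

# WHERE n.name ENDS WITH "o"
ends = [n for n in names if n.endswith("o")]

# WHERE n.name CONTAINS "Evans"
contains = [n for n in names if "Evans" in n]

# WHERE n.name =~ "Robert.*"  (the pattern has to cover the whole string)
regex = [n for n in names if re.fullmatch(r"Robert.*", n)]

# WHERE n.name =~ "Robert"  (no name is exactly "Robert", so nothing matches)
exact = [n for n in names if re.fullmatch(r"Robert", n)]
```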

* write a query to get <code>Actor</code> nodes with attribute <code>name</code> containing <code>"Robert"</code>

In [11]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [12]:
# Insert your code here

query = """
MATCH (a:Actor)
WHERE a.name CONTAINS "Robert"
RETURN a
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'a': <Node id=816 labels={'Actor', 'Person'} properties={'profession': ['actor', 'producer', 'soundtrack'], 'name': 'Robert Downey Jr.', 'id': 'nm0000375', 'birth_year': 1965}>},
 {'a': <Node id=820 labels={'Actor', 'Person'} properties={'profession': ['producer', 'actor', 'director'], 'name': 'Robert Redford', 'id': 'nm0000602', 'birth_year': 1936}>}]


<center><h3>Querying relationships</h3></center>

<blockquote>
We can also query relationships with the same principle: relationships are represented by the following syntax: 
<ul>
    <li><code>()-[relationship]-()</code> if the direction has no importance</li>
    <li><code>()-[relationship]->()</code> if the direction is important</li>
</ul>

<i>If nodes are not important to this query, we do not need to name them but they still have to be represented by <code>()</code>.</i>

We have seen that relationships also have a type: it is written with the same colon syntax as node labels: 


<br>
<br>

```cypher 
MATCH ()-[rel:RELATIONSHIP_LABEL]-()
RETURN rel
```
    
<br>
<br>
    
</blockquote>

* write a query to return all the relationships of type <code>"PLAY"</code>. Limit the results to the first 10 using <code>LIMIT 10</code> after the <code>RETURN</code> statement.

In [13]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [14]:
# Insert your code here

query = """
MATCH ()-[rel:PLAY]-()
RETURN rel LIMIT 10
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'rel': <Relationship id=34641 nodes=(<Node id=169 labels=set() properties={}>, <Node id=782 labels=set() properties={}>) type='PLAY' properties={}>},
 {'rel': <Relationship id=34667 nodes=(<Node id=171 labels=set() properties={}>, <Node id=795 labels=set() properties={}>) type='PLAY' properties={}>},
 {'rel': <Relationship id=34622 nodes=(<Node id=171 labels=set() properties={}>, <Node id=764 labels=set() properties={}>) type='PLAY' properties={}>},
 {'rel': <Relationship id=34655 nodes=(<Node id=175 labels=set() properties={}>, <Node id=789 labels=set() properties={}>) type='PLAY' properties={}>},
 {'rel': <Relationship id=34607 nodes=(<Node id=175 labels=set() properties={}>, <Node id=750 labels=set() properties={}>) type='PLAY' properties={}>},
 {'rel': <Relationship id=34645 nodes=(<Node id=177 labels=set() properties={}>, <Node id=786 labels=set() properties={}>) type='PLAY' properties={}>},
 {'rel': <Relationship id=34293 nodes=(<Node id=177 labels=set() properties={}>, <Node i

<blockquote>

We can access the attributes of relationships in the same way as with nodes, add conditions, ...

As with nodes, some special attributes are accessed through dedicated functions:
<ul>
    <li><code>startNode</code> returns the starting node of the relationship</li>
    <li><code>endNode</code> returns the ending node of the relationship</li>
    <li><code>type</code> returns the type of the relationship</li>
    <li><code>id</code> returns the id of the relationship</li>
</ul>
</blockquote>

* make a query returning the nodes of the relationships and the type of the relationships. Limit the results to the first 10 using <code>LIMIT 10</code> after the return statement

In [15]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [16]:
# Insert your code here

query = """
MATCH ()-[rel]-()
RETURN startNode(rel), endNode(rel), type(rel) LIMIT 10
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'endNode(rel)': <Node id=782 labels={'Character'} properties={'name': 'Luis'}>,
  'startNode(rel)': <Node id=169 labels={'Actor', 'Person'} properties={'profession': ['actor', 'producer'], 'name': 'Michael Peña', 'id': 'nm0671567', 'birth_year': 1976}>,
  'type(rel)': 'PLAY'},
 {'endNode(rel)': <Node id=795 labels={'Character'} properties={'name': 'Star-Lord'}>,
  'startNode(rel)': <Node id=171 labels={'Actor', 'Person'} properties={'profession': ['actor', 'soundtrack', 'producer'], 'name': 'Chris Pratt', 'id': 'nm0695435', 'birth_year': 1979}>,
  'type(rel)': 'PLAY'},
 {'endNode(rel)': <Node id=764 labels={'Character'} properties={'name': 'Peter Quill'}>,
  'startNode(rel)': <Node id=171 labels={'Actor', 'Person'} properties={'profession': ['actor', 'soundtrack', 'producer'], 'name': 'Chris Pratt', 'id': 'nm0695435', 'birth_year': 1979}>,
  'type(rel)': 'PLAY'},
 {'endNode(rel)': <Node id=789 labels={'Character'} properties={'name': 'Hawkeye'}>,
  'startNode(rel)': <Node id=175 labe

* rewrite the previous query to get only the <code>name</code> of the node at each end and the relationship type  

In [15]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [17]:
# Insert your code here

query = """
MATCH ()-[rel]-()
RETURN startNode(rel).name AS start_name, 
endNode(rel).name AS end_name, 
type(rel) AS relation LIMIT 10
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'end_name': 'Luis', 'relation': 'PLAY', 'start_name': 'Michael Peña'},
 {'end_name': 'Star-Lord', 'relation': 'PLAY', 'start_name': 'Chris Pratt'},
 {'end_name': 'Peter Quill', 'relation': 'PLAY', 'start_name': 'Chris Pratt'},
 {'end_name': 'Hawkeye', 'relation': 'PLAY', 'start_name': 'Jeremy Renner'},
 {'end_name': 'Clint Barton',
  'relation': 'PLAY',
  'start_name': 'Jeremy Renner'},
 {'end_name': 'Ant-Man', 'relation': 'PLAY', 'start_name': 'Paul Rudd'},
 {'end_name': 'Scott Lang', 'relation': 'PLAY', 'start_name': 'Paul Rudd'},
 {'end_name': 'Bruce Banner', 'relation': 'PLAY', 'start_name': 'Mark Ruffalo'},
 {'end_name': 'Hulk', 'relation': 'PLAY', 'start_name': 'Mark Ruffalo'},
 {'end_name': 'Gamora', 'relation': 'PLAY', 'start_name': 'Zoe Saldana'}]


<center><h3>Complex queries</h3></center>

<blockquote>
A database like Neo4J is interesting because of the links between nodes. We can combine queries on nodes and on relationships to get useful information. 

For example, to get all the relationships going from the nodes whose <code>attribute1</code> is set to <code>"value1"</code> to other nodes, we can do the following: 

<br>
<br>

```cypher
MATCH (n {attribute1: "value1"})
MATCH (n)-[rel]->()
RETURN rel
```

<br>
<br>

Moreover, if we want to query only the relationships from this first node to a second node whose attribute <code>attribute2</code> is set to <code>"value2"</code>, we can do the following: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n2 {attribute2: "value2"})
MATCH (n1)-[rel]->(n2)
RETURN rel
```

<br>
<br>

If we do not care about the direction of the relationship between those nodes: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n2 {attribute2: "value2"})
MATCH (n1)-[rel]-(n2)
RETURN rel
```

<br>
<br>

Finally, if we want only the neighbour nodes of our first node: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[rel]-(n2)
RETURN n2
```

<br>
<br>

</blockquote>
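The separate <code>MATCH</code> clauses shown above can also be merged into a single pattern, with labels and property filters written directly on the nodes. The sketch below illustrates this on the dataset. The helper name <code>characters_played_by</code> and the parameter name <code>actor_name</code> are assumptions made for this example, and the function expects the <code>driver</code> object created at the top of this notebook.

```python
def characters_played_by(driver, actor_name):
    """Return the names of the Character nodes played by the given Actor.

    `driver` is assumed to be the neo4j driver created earlier in the notebook.
    """
    # Node filters and the relationship are expressed in one pattern
    query = """
    MATCH (a:Actor {name: $actor_name})-[:PLAY]->(c:Character)
    RETURN c.name AS character
    """
    with driver.session() as session:
        return session.run(query, actor_name=actor_name).data()
```

For instance, <code>characters_played_by(driver, "Chris Pratt")</code> would list the characters linked to that actor by a <code>PLAY</code> relationship.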

* write a query to return all the <code>Actor</code> nodes that played the <code>Character</code> whose <code>name</code> is <code>"Bruce Banner"</code>.

In [18]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [19]:
# Insert your code here

query = """
MATCH (a:Actor)
MATCH (c:Character {name: "Bruce Banner"})
MATCH (a)-[rel:PLAY]->(c)
RETURN a
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'a': <Node id=178 labels={'Actor', 'Person'} properties={'profession': ['actor', 'producer', 'director'], 'name': 'Mark Ruffalo', 'id': 'nm0749263', 'birth_year': 1967}>},
 {'a': <Node id=826 labels={'Actor', 'Person'} properties={'profession': ['actor', 'producer', 'writer'], 'name': 'Edward Norton', 'id': 'nm0001570', 'birth_year': 1969}>}]


* write a query to get all the <code>Character</code> nodes that <code>APPEAR_IN</code> the movies whose <code>title</code> begins with <code>"Iron Man"</code>

In [20]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [21]:
# Insert your code here

query = """
MATCH (m:Movie)
WHERE m.title STARTS WITH "Iron Man"
MATCH (c:Character)
MATCH (c)-[rel:APPEAR_IN]->(m)
RETURN c.name AS name
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'name': 'Obadiah Stane'},
 {'name': "Lt. Col. James 'Rhodey' Rhodes"},
 {'name': 'Iron Man'},
 {'name': 'Tony Stark'},
 {'name': 'Pepper Potts'},
 {'name': "Lt. Col. James 'Rhodey' Rhodes"},
 {'name': 'Pepper Potts'},
 {'name': 'Ivan Vanko'},
 {'name': 'Tony Stark'},
 {'name': 'Tony Stark'},
 {'name': 'Colonel James Rhodes'},
 {'name': 'Pepper Potts'},
 {'name': 'Aldrich Killian'},
 {'name': 'Shalchi'},
 {'name': 'Shahram'},
 {'name': 'The shop worker'},
 {'name': 'The Children Parents'}]


* write a query to get all the <code>Movie</code> nodes reached by an <code>APPEAR_IN</code> relationship from the <code>Character</code> whose <code>name</code> is <code>"Iron Man"</code>

In [22]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [23]:
# Insert your code here

query = """
MATCH (c:Character {name: "Iron Man"})
MATCH (m:Movie)
MATCH (c)-[rel:APPEAR_IN]->(m)
RETURN m
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'m': <Node id=518 labels={'Movie'} properties={'runtime': 143, 'id': 'tt0848228', 'title': 'The Avengers', 'year': 2012, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'m': <Node id=540 labels={'Movie'} properties={'runtime': 147, 'id': 'tt3498820', 'title': 'Captain America: Civil War', 'year': 2016, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'m': <Node id=537 labels={'Movie'} properties={'runtime': 133, 'id': 'tt2250912', 'title': 'Spider-Man: Homecoming', 'year': 2017, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'m': <Node id=544 labels={'Movie'} properties={'runtime': 149, 'id': 'tt4154756', 'title': 'Avengers: Infinity War', 'year': 2018, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'m': <Node id=538 labels={'Movie'} properties={'runtime': 141, 'id': 'tt2395427', 'title': 'Avengers: Age of Ultron', 'year': 2015, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'m': <Node id=513 labels={'Movie'} properties={'runtime': 126, 'id': 'tt0371746', 'title': 'Iron M

<blockquote>
We can of course combine multiple relationship statements: 

For example, to get all the nodes that are two relationships away from a node whose <code>attribute1</code> is set to <code>"value1"</code>, we can do the following: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[]->()-[]->(n2)
RETURN n2
```
<br>
<br>

<i>Remember that we do not need to name the relationships or the nodes if they are not used anymore.</i>
</blockquote>

* write a query to get the <code>name</code> of every <code>Actor</code> that <code>PLAY</code> a <code>Character</code> in the <code>Movie</code> whose <code>title</code> is <code>'Avengers: Endgame'</code>.

In [24]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [25]:
# Insert your code here

query = """
MATCH (m:Movie {title: 'Avengers: Endgame'})
MATCH (a:Actor)-[:PLAY]->()-[:APPEAR_IN]->(m)
RETURN a.name as name
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'name': 'Chris Evans'},
 {'name': 'Chris Hemsworth'},
 {'name': 'Mark Ruffalo'},
 {'name': 'Edward Norton'},
 {'name': 'Robert Downey Jr.'},
 {'name': 'Chris Evans'},
 {'name': 'Mark Ruffalo'},
 {'name': 'Robert Downey Jr.'}]


<blockquote>
We can also use the directed relationships in opposite directions. For example if we want to get all the nodes that have a common neighbour with a node whose <code>attribute1</code> is set to value <code>"value1"</code> but that are not necessarily reachable from it, we can do the following:
    
<br>
<br>
    
    
```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[rel1]->(p)<-[rel2]-(n2)
RETURN n2
    
```
                                    
<br>
<br>

</blockquote>
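To see what this pattern matches, here is a pure-Python sketch on a small made-up graph (the node names and relationship ids are hypothetical, not taken from the dataset). It also mimics one rule of Cypher pattern matching: within a single pattern, the same relationship is never used twice, so <code>rel1</code> and <code>rel2</code> are always different relationships.

```python
# Toy directed graph: (relationship id, start node, end node)
edges = [
    (1, "a", "p"),
    (2, "b", "p"),
    (3, "c", "q"),
    (4, "a", "q"),
]

def common_neighbour_matches(edges, start):
    """Mimic MATCH (n1)-[rel1]->(p)<-[rel2]-(n2) RETURN n2, with n1 = start."""
    matches = []
    for rel1, src1, p in edges:
        if src1 != start:
            continue
        for rel2, n2, target in edges:
            # rel2 must point at the same node p and be a different relationship
            if target == p and rel2 != rel1:
                matches.append(n2)
    return matches

result = common_neighbour_matches(edges, "a")
```

Here <code>"a"</code> does not appear in its own result because it has only one relationship to each neighbour. A node can match itself when it reaches the shared neighbour through two different relationships.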

* write a query to get the <code>name</code> of every <code>Actor</code> that <code>PLAY</code> a <code>Character</code> appearing in the same <code>Movie</code> as the actor named <code>"Robert Downey Jr."</code>. You can return only unique values by using the <code>DISTINCT</code> keyword right after <code>RETURN</code>.

In [26]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [27]:
# Insert your code here

query = """
MATCH (a1:Actor {name: 'Robert Downey Jr.'})
MATCH (m:Movie)
MATCH (a1)-[:PLAY]->()-[:APPEAR_IN]->(m)<-[:APPEAR_IN]-()<-[:PLAY]-(a2)
RETURN DISTINCT a2.name as name
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'name': 'Don Cheadle'},
 {'name': 'Gwyneth Paltrow'},
 {'name': 'Guy Pearce'},
 {'name': 'Robert Downey Jr.'},
 {'name': 'Chris Evans'},
 {'name': 'Sebastian Stan'},
 {'name': 'Scarlett Johansson'},
 {'name': 'Tom Holland'},
 {'name': 'Michael Keaton'},
 {'name': 'Marisa Tomei'},
 {'name': 'Chris Hemsworth'},
 {'name': 'Mark Ruffalo'},
 {'name': 'Edward Norton'},
 {'name': 'Jeremy Renner'},
 {'name': 'Jeff Bridges'},
 {'name': 'Terrence Howard'},
 {'name': 'Mickey Rourke'}]


<blockquote>

Rather than using very complicated patterns, we can simply specify a number of relationships separating two nodes using <code>*number_of_relationships</code>.
For example, if we want to get all the nodes 5 relationships from a node with attribute <code>attribute1</code> set to <code>"value1"</code>, we can use: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[*5]-(n2)
RETURN n2
```

<br>
<br>

If we want only one direction or one type of relationship, we can combine this with the usual syntax. The relationship type is written before the <code>*</code>:

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[*5]->(n2)
RETURN n2
```


<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[:RELATIONSHIP_TYPE*5]-(n2)
RETURN n2
```

<br>
<br>

</blockquote>

* rewrite the previous query using this syntax

In [28]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [29]:
# Insert your code here

query = """
MATCH (a1:Actor {name: 'Robert Downey Jr.'})
MATCH (a2:Actor)
MATCH (a1)-[*4]-(a2)
RETURN DISTINCT a2.name as name
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'name': 'Don Cheadle'},
 {'name': 'Gwyneth Paltrow'},
 {'name': 'Guy Pearce'},
 {'name': 'Robert Downey Jr.'},
 {'name': 'Chris Evans'},
 {'name': 'Sebastian Stan'},
 {'name': 'Scarlett Johansson'},
 {'name': 'Tom Holland'},
 {'name': 'Michael Keaton'},
 {'name': 'Marisa Tomei'},
 {'name': 'Chris Hemsworth'},
 {'name': 'Mark Ruffalo'},
 {'name': 'Edward Norton'},
 {'name': 'Jeremy Renner'},
 {'name': 'Jeff Bridges'},
 {'name': 'Terrence Howard'},
 {'name': 'Mickey Rourke'}]


<blockquote>

We can also make queries with a given range of path lengths. For example, to get all the nodes that are 3 to 5 relationships apart from a node with <code>attribute1</code> set to <code>"value1"</code>: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[*3..5]->(n2)
RETURN n2
```

<br>
<br>

To get a minimum of 3 relationships: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[*3..]->(n2)
RETURN n2
```

<br>
<br>

To get a maximum of 5 relationships:

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[*..5]->(n2)
RETURN n2
```

<br>
<br>

To get all the nodes that are reachable from a given node: 

<br>
<br>

```cypher
MATCH (n1 {attribute1: "value1"})
MATCH (n1)-[*]->(n2)
RETURN n2
```

</blockquote>
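The behaviour of these variable-length patterns can be sketched in pure Python on a small made-up graph (node names and relationship ids are hypothetical). The function below is not how Neo4j is implemented. It only mimics two properties of the result: one row is produced per matching path rather than per node, and a relationship is never reused inside a path.

```python
def variable_length_ends(edges, start, min_hops=1, max_hops=None):
    """End nodes of directed paths from `start` with min_hops..max_hops relationships.

    Mimics MATCH (n1)-[*min..max]->(n2) RETURN n2: one entry per path.
    """
    ends = []

    def walk(node, used):
        depth = len(used)
        if depth >= min_hops:
            ends.append(node)
        if max_hops is not None and depth >= max_hops:
            return
        for rel_id, src, dst in edges:
            # Follow outgoing relationships that this path has not used yet
            if src == node and rel_id not in used:
                walk(dst, used | {rel_id})

    walk(start, frozenset())
    return ends

# Toy directed graph: (relationship id, start node, end node)
edges = [(1, "a", "b"), (2, "b", "c"), (3, "c", "d"), (4, "a", "c")]

exactly_two = variable_length_ends(edges, "a", min_hops=2, max_hops=2)  # [*2]
at_most_two = variable_length_ends(edges, "a", max_hops=2)              # [*..2]
any_length = variable_length_ends(edges, "a")                           # [*]
```

In <code>at_most_two</code>, the node <code>"c"</code> appears twice because it is reached by two different paths, which is the situation where <code>DISTINCT</code> is useful.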

* make a query to get all the nodes that are reachable from <code>"Robert Downey Jr."</code> within a maximum of 3 relationships.

In [30]:
# Insert your code here

query = """
RETURN 'Insert your code in this string'
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())

[{"'Insert your code in this string'": 'Insert your code in this string'}]


In [31]:
# Insert your code here

query = """
MATCH (a1:Actor {name: 'Robert Downey Jr.'})
MATCH (a1)-[*..3]->(n2)
RETURN n2
"""

with driver.session() as session:
    results = session.run(query)
    pprint.pprint(results.data())


[{'n2': <Node id=784 labels={'Character'} properties={'name': 'Iron Man'}>},
 {'n2': <Node id=545 labels={'Movie'} properties={'runtime': 181, 'id': 'tt4154796', 'title': 'Avengers: Endgame', 'year': 2019, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'n2': <Node id=513 labels={'Movie'} properties={'runtime': 126, 'id': 'tt0371746', 'title': 'Iron Man', 'year': 2008, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'n2': <Node id=538 labels={'Movie'} properties={'runtime': 141, 'id': 'tt2395427', 'title': 'Avengers: Age of Ultron', 'year': 2015, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'n2': <Node id=544 labels={'Movie'} properties={'runtime': 149, 'id': 'tt4154756', 'title': 'Avengers: Infinity War', 'year': 2018, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'n2': <Node id=537 labels={'Movie'} properties={'runtime': 133, 'id': 'tt2250912', 'title': 'Spider-Man: Homecoming', 'year': 2017, 'genres': ['Action', 'Adventure', 'Sci-Fi']}>},
 {'n2': <Node id=540 labels={'M

