# Usability Study

In the following, you'll be asked some questions about the provided pieces of code.

In [1]:
from dtgraph import Neo4jGraph, Rule, Transformation
hostname = "localhost"
password = ""
uri = f"bolt://{hostname}:7687"
graph = Neo4jGraph(uri, database="neo4j", username="", password=password)

We will use the popular movies database offered by Neo4j.

In [2]:
from dtgraph.scenarios.movies import Movies
Movies.load(graph)

Flushed database: Deleted 174 nodes, deleted 259 relationships, completed after 2 ms.
Load scenario: Added 171 labels, created 171 nodes, set 564 properties, created 253 relationships, completed after 1 ms.


# Dataset

We will explore transformations on this dataset, which has the following schema:

![Schema](./../images/tuto-basics-schema.svg)

## Understandability of openCypher scripts

Let us consider the following queries:

In [3]:
graph.query("MATCH (n:Person)-[:DIRECTED]->(:Movie) CREATE (:Director {name: n.name, born: n.born})")
graph.query("MATCH (n:Person)-[:ACTED_IN]->(:Movie) CREATE (:Actor {name: n.name, born: n.born})")

(0, 0)

Could you explain the output? Specifically:
* **(Question 1)** Does the first query create as many `Director` nodes as there are `Person` nodes that have an outgoing relationship of type `DIRECTED` to a `Movie` node?
* **(Question 2)** Do these two queries create nodes with both `Director` and `Actor` labels?

We now execute the following query after having executed the two previous ones:

In [4]:
graph.query("MATCH (n:Person)-[:DIRECTED]->(m:Movie) MERGE (:Director {name: n.name, born: n.born})-[:DIRECTOR_OF]-(:Film {title: m.title})")

(0, 0)

Could you explain the output? Specifically:
* **(Question 3)** Executed after the previous two queries, does this one create new `Director` nodes?
* **(Question 4)** Will two persons having co-directed the same movie be connected to the same `Film` node?

**Reloading data**

In [5]:
from dtgraph.scenarios.movies import Movies
Movies.load(graph)

Flushed database: Deleted 475 nodes, deleted 297 relationships, completed after 3 ms.
Load scenario: Added 171 labels, created 171 nodes, set 564 properties, created 253 relationships, completed after 0 ms.


## Understandability of Property graph transformations

Let us consider the following transformation rules:

In [6]:
generate_Actors = Rule('''
    MATCH (n:Person)-[:ACTED_IN]->(:Movie)
    GENERATE 
    ((n):Actor {name = n.name, born = n.born})
''')
generate_Directors = Rule('''
    MATCH (n:Person)-[:DIRECTED]->(:Movie)
    GENERATE
    ((n):Director {name = n.name, born = n.born})
''')

In [7]:
transform = Transformation([generate_Actors, generate_Directors])
transform.apply_on(graph)

Index: Added 0 index, completed after 2 ms.
Rule: Added 204 labels, created 102 nodes, set 446 properties, created 0 relationships, completed after 14 ms.
Rule: Added 51 labels, created 23 nodes, set 111 properties, created 0 relationships, completed after 5 ms.


19

Could you explain the output? Specifically:
* **(Question 5)** Does this transformation generate as many `Director` nodes as there are `Person` nodes that have an outgoing relationship of type `DIRECTED` to a `Movie` node?
* **(Question 6)** Does this transformation generate nodes with both `Director` and `Actor` labels?

We now add a new rule to this transformation:

In [8]:
generate_Connections = Rule('''
    MATCH (n:Person)-[:DIRECTED]->(m:Movie)
    GENERATE
    ((n):Director {name = n.name, born = n.born})-[():DIRECTOR_OF]->((m):Film {title = m.title})
''')

In [9]:
transform.add(generate_Connections)

Rule: Added 76 labels, created 38 nodes, set 214 properties, created 44 relationships, completed after 11 ms.


* **(Question 7)** Applied after the previous two rules, does this rule create new `Director` nodes?
* **(Question 8)** Will two persons having co-directed the same movie be connected to the same `Film` node?

**Reloading data**

In [10]:
from dtgraph.scenarios.movies import Movies
Movies.load(graph)

Flushed database: Deleted 334 nodes, deleted 297 relationships, completed after 2 ms.
Load scenario: Added 171 labels, created 171 nodes, set 564 properties, created 253 relationships, completed after 0 ms.


## Handwritten openCypher scripts

Please write a single openCypher script that produces the same output as the previous three-rule transformation.

In [11]:
# write a single openCypher query









## Converting openCypher scripts into a Property graph transformation

Please write a property graph transformation that is equivalent to executing the three previous openCypher queries in the same order.

In [12]:
# write a single transformation that may have multiple rules









## Dealing with ambiguities in openCypher and with Property graph transformations

Let us consider the following openCypher script and transformation rule, which are almost equivalent:

In [13]:
graph.query('''
MATCH (n:Person)-[:DIRECTED]->(m:Movie)<-[:DIRECTED]-(o:Person) 
MERGE (x:Director {name: n.name, born: n.born})
MERGE (y:Director {name: o.name, born: o.born})
MERGE (x)-[d:COLLEAGUE]->(y)
SET d.movie = m.title''')

(0, 0)

**Reloading data**

In [14]:
from dtgraph.scenarios.movies import Movies
Movies.load(graph)

Flushed database: Deleted 174 nodes, deleted 259 relationships, completed after 1 ms.
Load scenario: Added 171 labels, created 171 nodes, set 564 properties, created 253 relationships, completed after 1 ms.


In [15]:
generate_colleagues = Rule('''
MATCH (n:Person)-[:DIRECTED]->(m:Movie)<-[:DIRECTED]-(o:Person)
GENERATE (x = (n):Director {name = n.name, born = n.born})-[():COLLEAGUE { movie = m.title }]->(y = (o):Director {name = n.name, born = n.born})''')
transform = Transformation([generate_colleagues])
transform.apply_on(graph)

Index: Added 0 index, completed after 1 ms.
Rule: Added 9 labels, created 3 nodes, set 87 properties, created 6 relationships, completed after 14 ms.


14

* The value of the *movie* attribute of the `COLLEAGUE` relationships may not be well-defined.
  How would you modify both the script and the transformation to account for this? (*Hint:* you want to have as many relationships as there are possible values for it.)
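
To make the ambiguity concrete before you attempt a fix, here is a toy model in plain Python. It does not use Neo4j or dtgraph; the director names and film titles are made-up illustration data, and `merge_then_set` is a hypothetical helper that only mimics "reuse the relationship if it exists, then overwrite the property". It assumes that two directors share more than one film and that the order of the matched rows is not guaranteed.

```python
# Toy model only: plain Python, no Neo4j or dtgraph involved.
# The names and titles below are made-up illustration data.
rows = [
    # (n.name, o.name, m.title): one tuple per row returned by MATCH
    ("Director A", "Director B", "Film 1"),
    ("Director A", "Director B", "Film 2"),
]


def merge_then_set(matched_rows):
    """Mimic MERGE (x)-[d:COLLEAGUE]->(y) followed by SET d.movie = m.title."""
    relationships = {}  # keyed by (x, y): at most one COLLEAGUE edge per ordered pair
    for x, y, title in matched_rows:
        edge = relationships.setdefault((x, y), {})  # MERGE: reuse the edge or create it
        edge["movie"] = title  # SET: overwrite whatever was stored before
    return relationships


# Same rows, processed in two different orders.
forward = merge_then_set(rows)
backward = merge_then_set(list(reversed(rows)))

print(forward)
print(backward)
```

In this model the single relationship keeps whichever title was written last, so the stored value depends on the processing order.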

In [16]:
# write the correct version of the openCypher query









In [17]:
# write the correct version of the transformation









* Can you adapt your two solutions to the case where the conflicting attribute is not stored, but you still want to create as many `COLLEAGUE` relationships as there are films that the two directors co-directed?

In [18]:
# write the updated version of the openCypher query









In [19]:
# write the updated version of the transformation







