# Tutorial: An introduction to property graph transformations

In this tutorial, you will learn the key concepts behind our property graph transformations:
* How to specify transformations of property graphs with this framework (i.e., how we use Skolem functions to construct new property graphs).
* How the rules in a single transformation interact with each other (i.e., how the content of output elements can be jointly specified by several rules).
* What *conflicts* are, and how to deal with them.
* How this framework integrates with openCypher (i.e., how the rules are compiled into openCypher scripts and in which context they are executed).
* What property graph transformations are capable of (i.e., the kinds of constructs that can be expressed).

## Part 1: Preliminaries


By default, this notebook is configured to connect to a local Neo4j instance running inside a Docker container. This [notebook](./Tutorial_Connecting_Neo4j_Docker.ipynb) will guide you through the process of setting up a local Docker container and connecting to it.

In [1]:
from dtgraph import Neo4jGraph, Rule, Transformation
hostname = "localhost"
password = ""
uri = f"bolt://{hostname}:7687"
graph = Neo4jGraph(uri, database="neo4j", username="", password=password)

For this tutorial, we will use a stripped-down version of the Movies dataset from Neo4j.

In [2]:
from dtgraph.scenarios.movies import Movies
Movies.load(graph)

Flushed database: Deleted 273 nodes, deleted 253 relationships, completed after 4 ms.
Load scenario: Added 171 labels, created 171 nodes, set 564 properties, created 253 relationships, completed after 1 ms.


## Part 2: Transformation rules

This dataset contains information about **Movie** nodes and the **Person** nodes related to them.
A person may have **:ACTED_IN**, **:DIRECTED**, or **:PRODUCED** a movie. With this schema, whether a person is an actor, a director, or a producer is not recorded on the node itself; it can only be inferred from the relationships.
Let's build a new graph that makes this information explicit. We start by introducing the new label **Actor**.

We will do so with a transformation rule:

In [3]:
generate_actors = Rule('''
        MATCH (n:Person)-[:ACTED_IN]->(m:Movie)
        => 
        (x = (n) : Actor {
            name = n.name,
            born = n.born,
            source = "movie dataset"
        })
        ''')

In [4]:
my_transform = Transformation([generate_actors])
my_transform.apply_on(graph)

Index: Added 0 index, completed after 1 ms.
Rule: Added 204 labels, created 102 nodes, set 618 properties, created 0 relationships, completed after 28 ms.


The above rule consists of three parts:
- `MATCH (n:Person)-[:ACTED_IN]->(m:Movie)` which is an openCypher query to retrieve the relevant information from the input graph.
  This Cypher query should bind its exported variables only to graph elements such as nodes and relationships.
- `(x = (n) : Actor { name = n.name, born = n.born, source = "movie dataset" })` which is a node constructor, composed of the following elements:
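  - `x = (n)` names the new node `x` and derives its identity from `n` through a Skolem function: each distinct `n` yields exactly one output node, no matter how many times it is matched.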
  - A set of labels (here this is only one label, `Actor`) which specifies the labels for the new elements.
  - A list of properties `{ name = n.name, born = n.born, source = "movie dataset" }` for the new elements. Values from the initial graph can be retrieved using access keys such as `n.born`, fixed constants can be used such as `"movie dataset"`.
- `=>` or `GENERATE` to connect the above two parts.
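
To make the node constructor described above more concrete, here is a minimal plain-Python sketch of the idea, not the framework's implementation. It assumes that `x = (n)` makes the identity of the output node a function of `n` (the Skolem-function idea mentioned in the introduction). The data in `matches` and `persons` and the helper `skolem_id` are hypothetical and exist only for illustration.

```python
# Illustrative only: plain Python, not the dtgraph API.
# Each tuple plays the role of one MATCH result: (person id, movie id).
matches = [
    ("p1", "m1"),
    ("p1", "m2"),  # the same person acted in a second movie
    ("p2", "m1"),
]

# Hypothetical input properties, keyed by person id.
persons = {
    "p1": {"name": "Alice", "born": 1970},
    "p2": {"name": "Bob", "born": 1980},
}


def skolem_id(*args):
    """Build a deterministic identifier from the constructor's arguments."""
    return "(" + ",".join(args) + ")"


output_nodes = {}
for n, m in matches:
    key = skolem_id(n)  # x = (n): the identity depends on n only, not on m
    node = output_nodes.setdefault(key, {"labels": set(), "props": {}})
    node["labels"].add("Actor")
    node["props"].update(
        name=persons[n]["name"],
        born=persons[n]["born"],
        source="movie dataset",
    )

# Three matches, but only two distinct values of n, hence two output nodes.
print(len(output_nodes))
```

In this sketch, a person matched through several movies still produces a single output node, because the movie variable `m` is not an argument of the constructor.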

Now that we have created and executed the transformation, we can inspect the openCypher script that the rule compiles to:

In [5]:
generate_actors._compile()
print(generate_actors._compiled)

MATCH (n:Person)-[:ACTED_IN]->(m:Movie)
MERGE (x:_dummy {
    _id: "(" + elementID(n) + ")" 
})
ON CREATE
    SET x:Actor,
        x.name = n.name,
        x.born = n.born,
        x.source = "movie dataset"
ON MATCH
    SET x:Actor,
        x.name = 
        CASE
            WHEN x.name <> n.name THEN
                "Conflict Detected!"
            ELSE
                n.name
        END,
        x.born = 
        CASE
            WHEN x.born <> n.born THEN
                "Conflict Detected!"
            ELSE
                n.born
        END,
        x.source = 
        CASE
            WHEN x.source <> "movie dataset" THEN
                "Conflict Detected!"
            ELSE
                "movie dataset"
        END
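
The `ON CREATE` / `ON MATCH` logic of the compiled script above can be mimicked in plain Python to see when the `"Conflict Detected!"` marker appears. This is a rough sketch under simplifying assumptions: the function `merge_node` and the dictionary `store` are hypothetical stand-ins for the database, and Cypher's `null` comparison semantics are not modeled.

```python
CONFLICT = "Conflict Detected!"


def merge_node(store, node_id, label, props):
    """Mimic MERGE ... ON CREATE / ON MATCH for a single output node."""
    node = store.get(node_id)
    if node is None:
        # ON CREATE: first time this identifier is seen, set everything.
        store[node_id] = {"labels": {label}, "props": dict(props)}
        return
    # ON MATCH: the node already exists, so compare property by property.
    node["labels"].add(label)
    for key, value in props.items():
        if key in node["props"] and node["props"][key] != value:
            # The stored value disagrees with the new one: flag a conflict.
            node["props"][key] = CONFLICT
        else:
            node["props"][key] = value


store = {}
merge_node(store, "(p1)", "Actor", {"name": "Alice", "born": 1970})
# A second match for the same identifier with a different birth year.
merge_node(store, "(p1)", "Actor", {"name": "Alice", "born": 1971})
print(store["(p1)"]["props"])
```

Here the two calls agree on `name` but disagree on `born`, so only `born` ends up holding the conflict marker. Once a property is flagged, later matches keep it flagged, since the marker differs from any ordinary value.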

