# DTGraph Introduction

This notebook walks through a simple example of using the library.
It is primarily intended to check that everything works on a particular backend.

## Part I: Establishing a connection

In [1]:
from dtgraph import Neo4jGraph, Rule, Transformation

In the following, we will connect to a [Neo4j Sandbox](https://sandbox.neo4j.com/).
After creating your instance, retrieve its *IP address* and *password* (they look like `54.159.136.111` and `analyzer-information-tails`) and substitute your own values in the next cell.

In [2]:
hostname = "54.197.198.93"
password = "inquiries-boxcars-child"

In [3]:
uri = f"bolt://{hostname}:7687"
graph = Neo4jGraph(uri, database="neo4j", username="neo4j", password=password)
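
Hardcoded sandbox credentials are easy to leak when a notebook is shared. As a minimal sketch of an alternative, the connection settings could be read from environment variables. The variable names `NEO4J_HOSTNAME` and `NEO4J_PASSWORD` and the helpers `bolt_uri` and `sandbox_settings` are hypothetical and not part of the library; only the `Neo4jGraph(...)` call in the final comment mirrors the cell above.

```python
import os
from typing import Mapping


def bolt_uri(hostname: str, port: int = 7687) -> str:
    """Build a Bolt URI of the same shape as the one used in the notebook."""
    return f"bolt://{hostname}:{port}"


def sandbox_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings from environment variables.

    NEO4J_HOSTNAME and NEO4J_PASSWORD are hypothetical names chosen for
    this sketch; use whatever names fit your setup.
    """
    hostname = env.get("NEO4J_HOSTNAME", "localhost")
    password = env.get("NEO4J_PASSWORD", "")
    return {
        "uri": bolt_uri(hostname),
        "database": "neo4j",
        "username": "neo4j",
        "password": password,
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = sandbox_settings(
    {"NEO4J_HOSTNAME": "54.197.198.93", "NEO4J_PASSWORD": "inquiries-boxcars-child"}
)
print(settings["uri"])

# The settings can then be passed on exactly as in the cell above:
# graph = Neo4jGraph(settings["uri"], database=settings["database"],
#                    username=settings["username"], password=settings["password"])
```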

If you selected the Movie dataset when creating the sandbox, you can check that the database has some content:

In [4]:
graph.output_all_nodes()

Info: There is currently 296 node(s) in the database.


## Part II: (Optional) Loading a curated scenario

If your database is empty or does not contain the *Movie dataset*, you can import it with the following command (as the log below shows, this first flushes the existing content):

In [5]:
from dtgraph.scenarios.movies import Movies
Movies.load(graph)

Flushed database: Deleted 296 nodes, deleted 1221 relationships, completed after 56 ms.
Load scenario: Added 171 labels, created 171 nodes, set 564 properties, created 253 relationships, completed after 3603 ms.


## Part III: Defining and executing a rule

Then you can write your own transformation rule:

In [6]:
my_rule = Rule('''
        MATCH (n:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(o:Person)
        WHERE n.name < o.name
        => 
        (x = (n) : Actor {
            name = n.name,
            born = n.born
        })-[(m) : COLLEAGUE {
            movie = m.title
        }]->(y = (o) : Actor {
            name = o.name,
            born = o.born
        })
''')
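
The `WHERE n.name < o.name` condition in the rule above keeps only one orientation of each pair of co-actors. The following plain-Python sketch illustrates the effect on toy data; it does not use the library, and the function `colleague_pairs` and the cast list are made up for illustration.

```python
from itertools import permutations

# Toy data: one movie and three people who acted in it.
cast_by_movie = {
    "The Matrix": ["Keanu Reeves", "Carrie-Anne Moss", "Hugo Weaving"],
}


def colleague_pairs(cast_by_movie, ordered=True):
    """Enumerate (n, o, movie) triples for people who acted in the same movie.

    With ordered=True, only pairs with n < o are kept, mimicking the
    `WHERE n.name < o.name` filter of the rule.
    """
    pairs = []
    for movie, actors in cast_by_movie.items():
        for n, o in permutations(actors, 2):
            if not ordered or n < o:
                pairs.append((n, o, movie))
    return pairs


print(len(colleague_pairs(cast_by_movie, ordered=False)))  # both orientations
print(len(colleague_pairs(cast_by_movie, ordered=True)))   # one orientation per pair
```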

Then execute the rule after wrapping it in a `Transformation` object:

In [7]:
my_transform = Transformation([my_rule])
my_transform.apply_on(graph)

Index: Added 1 index, completed after 4 ms.
Rule: Added 204 labels, created 102 nodes, set 2406 properties, created 384 relationships, completed after 1809 ms.


Yay, 102 nodes and 204 labels have been created! You can inspect the result (alongside the initial dataset, for now) by running the following query in your Neo4j browser:
```
MATCH (n)
RETURN n
```

The line `Index: Added 1 index, completed after 4 ms.` indicates that an index has been added. The library uses it internally to speed up the computation of the output graph.

We can now add a new rule to this transformation:

In [8]:
my_second_rule = Rule('''
    MATCH (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(a:Person)
    =>
    (x = (d) : Director {
        name = d.name,
        born = d.born
    })-[(m) : SUPERVISED {
        movie = m.title
    }]->(y = (a) : Actor {
        name = a.name,
        born = a.born
    })
''')
my_transform.add(my_second_rule)

Rule: Added 51 labels, created 23 nodes, set 1223 properties, created 200 relationships, completed after 898 ms.


We see that 23 nodes and 51 labels have been added.

Let us now investigate the output of the transformation. We have two rules, and both extract `Person` nodes. The first rule only creates nodes labeled `Actor`, while the second one also creates nodes labeled `Director`.
With the following query, we can confirm that a person who both acted in and directed some movies is represented in the output by a single node carrying both labels:

```
MATCH (n)
WHERE n:Actor AND n:Director
RETURN n
```

This query should return the following output:
```
╒══════════════════════════════════════════════════════════════════════╕
│n                                                                     │
╞══════════════════════════════════════════════════════════════════════╡
│(:Actor:Director:_dummy {born: 1967,name: "James Marshall",_id: "(4:7f│
│732a8b-14ba-4846-8477-f326f7a1b5d0:2469)"})                           │
├──────────────────────────────────────────────────────────────────────┤
│(:Actor:Director:_dummy {born: 1956,name: "Tom Hanks",_id: "(4:7f732a8│
│b-14ba-4846-8477-f326f7a1b5d0:2515)"})                                │
├──────────────────────────────────────────────────────────────────────┤
│(:Actor:Director:_dummy {born: 1930,name: "Clint Eastwood",_id: "(4:7f│
│732a8b-14ba-4846-8477-f326f7a1b5d0:2543)"})                           │
├──────────────────────────────────────────────────────────────────────┤
│(:Actor:Director:_dummy {born: 1944,name: "Danny DeVito",_id: "(4:7f73│
│2a8b-14ba-4846-8477-f326f7a1b5d0:2586)"})                             │
├──────────────────────────────────────────────────────────────────────┤
│(:Actor:Director:_dummy {born: 1942,name: "Werner Herzog",_id: "(4:7f7│
│32a8b-14ba-4846-8477-f326f7a1b5d0:2503)"})                            │
└──────────────────────────────────────────────────────────────────────┘
```

This is the desired behavior: it lets you define the content of new elements across multiple independent rules.
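
The merging behavior can be pictured with a small plain-Python sketch. It assumes, based on the `_id` property visible in the output above, that output nodes are identified by the source node they were built from; the function `merge_rule_outputs`, the numeric source ids, and the second person are hypothetical, and this is not how the library is actually implemented.

```python
def merge_rule_outputs(matches):
    """Merge (source_id, label, properties) triples produced by several rules.

    Triples sharing a source_id contribute to the same output node:
    labels accumulate in a set and properties are merged.
    """
    output = {}
    for source_id, label, props in matches:
        node = output.setdefault(source_id, {"labels": set(), "props": {}})
        node["labels"].add(label)
        node["props"].update(props)
    return output


matches = [
    # Produced by the first rule (actors only).
    (1, "Actor", {"name": "Tom Hanks", "born": 1956}),
    (2, "Actor", {"name": "Example Person", "born": 1970}),
    # Produced by the second rule (directors and the actors they supervised).
    (1, "Director", {"name": "Tom Hanks", "born": 1956}),
]

output = merge_rule_outputs(matches)
print(sorted(output[1]["labels"]))
print(sorted(output[2]["labels"]))
```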

Finally, `.eject()` removes the internal bookkeeping data and leaves the output of the transformation alongside the initial data.

In [9]:
my_transform.eject()

Index: Removed 1 index, completed after 4 ms.
Eject: Removed 125 labels, erased 709 properties, completed after 1038 ms.


Note that the index has been removed from the database.
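
To make "internal bookkeeping" concrete, here is a plain-Python sketch of the cleanup on a single node. It assumes that the bookkeeping consists of the underscore-prefixed `_dummy` label and `_id` property seen in the earlier query output; the helper `strip_bookkeeping` is hypothetical and only mimics the effect, not the library's implementation. The counts in the log are consistent with this reading (125 labels removed for the 102 + 23 created nodes, and 709 = 125 + 384 + 200 properties, one per created node and relationship), although the library's output does not state this explicitly.

```python
def strip_bookkeeping(node):
    """Return a copy of the node without underscore-prefixed labels and properties."""
    return {
        "labels": {label for label in node["labels"] if not label.startswith("_")},
        "props": {key: value for key, value in node["props"].items()
                  if not key.startswith("_")},
    }


# A node shaped like one of the rows returned by the earlier query.
before = {
    "labels": {"Actor", "Director", "_dummy"},
    "props": {
        "name": "Tom Hanks",
        "born": 1956,
        "_id": "(4:7f732a8b-14ba-4846-8477-f326f7a1b5d0:2515)",
    },
}

after = strip_bookkeeping(before)
print(sorted(after["labels"]))
print(sorted(after["props"]))
```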