# Demo: iBench Amalgam1ToAmalgam3

In the following, we will connect to a [Neo4j Community Edition](https://neo4j.com/product/neo4j-graph-database/) instance running inside a Docker container.
To start a local Neo4j Community Edition instance, run:
```
sudo docker run --name neo4jDemo -p 7687:7687 -p 7474:7474 -v ~/research/DTGraph/output-ibench-data:/var/lib/neo4j/import --env=NEO4J_AUTH=none neo4j:5.16.0-community
```
This requires [Docker](https://docs.docker.com/engine/install/) to be installed on your system. Docker pulls the image on first use and then starts the container. Once it is up, the [Neo4j browser](http://localhost:7474/browser/) should be reachable on your computer. The `--env=NEO4J_AUTH=none` option disables authentication, which matches the empty username and password used in the connection code below.

Replace `~/research/DTGraph` with the DTGraph installation path on your computer. The volume mount exposes the `output-ibench-data` directory as the container's `/var/lib/neo4j/import` directory, which the import scripts need in order to read the data.

*Note:* This framework has been tested specifically with Neo4j Community Edition 5.16.0, which was the latest version at the time of writing this guide.
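
Before connecting, it can help to confirm that the two ports published by the `docker run` command (7474 for the browser, 7687 for Bolt) actually accept connections. The following is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical and not part of DTGraph or Neo4j.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Ports taken from the `-p 7687:7687 -p 7474:7474` options of the command above.
for name, port in (("Neo4j browser (HTTP)", 7474), ("Bolt", 7687)):
    state = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```

An open port only shows that something is listening; the database may still need a few seconds after container start before it answers queries.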

In [1]:
from dtgraph import Neo4jGraph, Rule, Transformation
hostname = "localhost"
password = ""
uri = f"bolt://{hostname}:7687"
graph = Neo4jGraph(uri, database="neo4j", username="", password=password)

For this tutorial, we will use the [Amalgam1ToAmalgam3](https://github.com/yannramusat/TPG/tree/main/input-ibench-config/a1ta3) data integration scenario from [iBench](https://github.com/RJMillerLab/ibench), which can be loaded into the database using the following command.

In [2]:
from dtgraph.scenarios.ibench_a1ta3 import iBenchAmalgam1ToAmalgam3
iBenchAmalgam1ToAmalgam3.load(graph, size = 1_000)

Flushed database: Deleted 296 nodes, deleted 260 relationships, completed after 248 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 12000 properties, created 0 relationships, completed after 1174 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 12000 properties, created 0 relationships, completed after 154 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 12000 properties, created 0 relationships, completed after 112 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 12000 properties, created 0 relationships, completed after 117 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 12000 properties, created 0 relationships, completed after 110 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 13000 properties, created 0 relationships, completed after 120 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 12000 properties, created 0 relationships, completed after 119 ms.
CSV:    Added 1000 labels, created 1000 nodes, set 2000 properties, created 0 relatio

In [3]:
rule1 = Rule('''
MATCH (pub:InProcPublished)
MATCH (ip:InProceedings)
WHERE pub.inproc = ip.inprocid
MATCH (a:Author)
WHERE pub.auth = a.authid 
GENERATE
(x = (ip):TArticle {
    articleid = "SK1(" + ip.inprocid + ")",
    title = ip.title,
    vol = ip.vol,
    num = ip.num,
    pages = ip.pages,
    month = ip.month,
    year = ip.year,
    refkey = "SK2(" + ip.inprocid + ")",
    note = ip.note,
    remarks = "SK3(" + ip.inprocid + ")",
    refs = "SK4(" + ip.inprocid + ")",
    xxxrefs = "SK5(" + ip.inprocid + ")",
    fullxxxrefs = "SK6(" + ip.inprocid + ")",
    oldkey = "SK7(" + ip.inprocid + ")",
    abstract = "SK8(" + ip.inprocid + ")",
    preliminary = "SK9(" + ip.inprocid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule2 = Rule('''
MATCH (ap:ArticlePublished)
MATCH (art:Article)
WHERE ap.article = art.articleid
MATCH (a:Author)
WHERE ap.auth = a.authid 
GENERATE
(x = (art):TArticle {
    articleid = "SK11(" + art.articleid + ")",
    title = art.title,
    vol = art.vol,
    num = art.num,
    pages = art.pages,
    month = art.month,
    year = art.year,
    refkey = "SK12(" + art.articleid + ")",
    note = art.note,
    remarks = "SK13(" + art.articleid + ")",
    refs = "SK14(" + art.articleid + ")",
    xxxrefs = "SK15(" + art.articleid + ")",
    fullxxxrefs = "SK16(" + art.articleid + ")",
    oldkey = "SK17(" + art.articleid + ")",
    abstract = "SK18(" + art.articleid + ")",
    preliminary = "SK19(" + art.articleid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule3 = Rule('''
MATCH (tp:TechPublished)
MATCH (t:TechReport)
WHERE tp.tech = t.techid
MATCH (a:Author)
WHERE tp.auth = a.authid 
GENERATE
(x = (t):TArticle {
    articleid = "SK21(" + t.techid + ")",
    title = t.title,
    vol = t.vol,
    num = t.num,
    pages = t.pages,
    month = t.month,
    year = t.year,
    refkey = "SK22(" + t.techid + ")",
    note = t.note,
    remarks = "SK23(" + t.techid + ")",
    refs = "SK24(" + t.techid + ")",
    xxxrefs = "SK25(" + t.techid + ")",
    fullxxxrefs = "SK26(" + t.techid + ")",
    oldkey = "SK27(" + t.techid + ")",
    abstract = "SK28(" + t.techid + ")",
    preliminary = "SK29(" + t.techid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule4 = Rule('''
MATCH (bp:BookPublished)
MATCH (b:Book)
WHERE bp.book = b.bookid
MATCH (a:Author)
WHERE bp.auth = a.authid 
GENERATE
(x = (b):TArticle {
    articleid = "SK31(" + b.bookid + ")",
    title = b.title,
    vol = b.vol,
    num = b.num,
    pages = b.pages,
    month = b.month,
    year = b.year,
    refkey = "SK32(" + b.bookid + ")",
    note = b.note,
    remarks = "SK33(" + b.bookid + ")",
    refs = "SK34(" + b.bookid + ")",
    xxxrefs = "SK35(" + b.bookid + ")",
    fullxxxrefs = "SK36(" + b.bookid + ")",
    oldkey = "SK37(" + b.bookid + ")",
    abstract = "SK38(" + b.bookid + ")",
    preliminary = "SK39(" + b.bookid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule5 = Rule('''
MATCH (icp:InCollPublished)
MATCH (i:InCollection)
WHERE icp.col = i.colid
MATCH (a:Author)
WHERE icp.auth = a.authid 
GENERATE
(x = (i):TArticle {
    articleid = "SK41(" + i.colid + ")",
    title = i.title,
    vol = i.vol,
    num = i.num,
    pages = i.pages,
    month = i.month,
    year = i.year,
    refkey = "SK42(" + i.colid + ")",
    note = i.note,
    remarks = "SK43(" + i.colid + ")",
    refs = "SK44(" + i.colid + ")",
    xxxrefs = "SK45(" + i.colid + ")",
    fullxxxrefs = "SK46(" + i.colid + ")",
    oldkey = "SK47(" + i.colid + ")",
    abstract = "SK48(" + i.colid + ")",
    preliminary = "SK49(" + i.colid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule6 = Rule('''
MATCH (mp:MiscPublished)
MATCH (m:Misc)
WHERE mp.misc = m.miscid
MATCH (a:Author)
WHERE mp.auth = a.authid 
GENERATE
(x = (m):TArticle {
    articleid = "SK51(" + m.miscid + ")",
    title = m.title,
    vol = m.vol,
    num = m.num,
    pages = m.pages,
    month = m.month,
    year = m.year,
    refkey = "SK52(" + m.miscid + ")",
    note = m.note,
    remarks = "SK53(" + m.miscid + ")",
    refs = "SK54(" + m.miscid + ")",
    xxxrefs = "SK55(" + m.miscid + ")",
    fullxxxrefs = "SK56(" + m.miscid + ")",
    oldkey = "SK57(" + m.miscid + ")",
    abstract = "SK58(" + m.miscid + ")",
    preliminary = "SK59(" + m.miscid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule7 = Rule('''
MATCH (mp:ManualPublished)
MATCH (m:Manual)
WHERE mp.manual = m.manid
MATCH (a:Author)
WHERE mp.auth = a.authid 
GENERATE
(x = (m):TArticle {
    articleid = "SK61(" + m.manid + ")",
    title = m.title,
    vol = m.vol,
    num = m.num,
    pages = m.pages,
    month = m.month,
    year = m.year,
    refkey = "SK62(" + m.manid + ")",
    note = m.note,
    remarks = "SK63(" + m.manid + ")",
    refs = "SK64(" + m.manid + ")",
    xxxrefs = "SK65(" + m.manid + ")",
    fullxxxrefs = "SK66(" + m.manid + ")",
    oldkey = "SK67(" + m.manid + ")",
    abstract = "SK68(" + m.manid + ")",
    preliminary = "SK69(" + m.manid + ")"
})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')

rule8 = Rule('''
MATCH (a:Author)
GENERATE
(y = (a.authid):Auth {
    authorid = a.authid,
    name = a.name
})
''')
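
Rules 1 to 7 above share one shape. Only the two source labels, the join properties, and the numeric offset of the `SK…` strings change (rule *k* uses `SK(10(k-1)+1)` through `SK(10(k-1)+9)`). As a sketch of how the same rule texts could be produced from a table, the snippet below builds the seven strings with plain Python string formatting. The names `SOURCES`, `FIELDS`, `COPIED`, and `rule_text` are hypothetical, and the generic variable names `l` and `e` replace the per-rule names (`pub`/`ip`, `ap`/`art`, and so on), on the assumption that renaming variables does not change what a rule means.

```python
# (link label, entity label, link property, entity key) for rules 1-7,
# copied from the rule definitions above.
SOURCES = [
    ("InProcPublished", "InProceedings", "inproc", "inprocid"),
    ("ArticlePublished", "Article", "article", "articleid"),
    ("TechPublished", "TechReport", "tech", "techid"),
    ("BookPublished", "Book", "book", "bookid"),
    ("InCollPublished", "InCollection", "col", "colid"),
    ("MiscPublished", "Misc", "misc", "miscid"),
    ("ManualPublished", "Manual", "manual", "manid"),
]

# Target properties in the order used above; those in COPIED are taken from
# the source node, the others receive an "SKn(<key>)" string.
FIELDS = ["articleid", "title", "vol", "num", "pages", "month", "year",
          "refkey", "note", "remarks", "refs", "xxxrefs", "fullxxxrefs",
          "oldkey", "abstract", "preliminary"]
COPIED = {"title", "vol", "num", "pages", "month", "year", "note"}

def rule_text(index, link_label, entity_label, link_prop, key_prop):
    """Build the text of rule number `index` (1-based) from the shared template."""
    base = 10 * (index - 1)   # rule 1 -> SK1..SK9, rule 2 -> SK11..SK19, ...
    sk = 0
    lines = []
    for field in FIELDS:
        if field in COPIED:
            lines.append(f"    {field} = e.{field}")
        else:
            sk += 1
            lines.append(f'    {field} = "SK{base + sk}(" + e.{key_prop} + ")"')
    props = ",\n".join(lines)
    return f"""
MATCH (l:{link_label})
MATCH (e:{entity_label})
WHERE l.{link_prop} = e.{key_prop}
MATCH (a:Author)
WHERE l.auth = a.authid
GENERATE
(x = (e):TArticle {{
{props}
}})-[():ARTICLE_PUBLISHED]->(y = (a.authid):Auth {{
    authorid = a.authid,
    name = a.name
}})
"""

rule_texts = [rule_text(i, *src) for i, src in enumerate(SOURCES, start=1)]
print(rule_texts[0])
```

Each string in `rule_texts` could then be passed to `Rule(...)` exactly as the hand-written texts are. Writing the rules out in full, as done above, keeps the correspondence with the iBench scenario easier to inspect.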

In [4]:
a1ta3_transform = Transformation([rule1, rule2, rule3, rule4, rule5, rule6, rule7, rule8], with_diagnose = False)
a1ta3_transform.apply_on(graph)

Index: Added 0 index, completed after 40 ms.
Rule: Added 3316 labels, created 1658 nodes, set 20658 properties, created 1000 relationships, completed after 440 ms.
Rule: Added 2838 labels, created 1208 nodes, set 20208 properties, created 1000 relationships, completed after 164 ms.
Rule: Added 2317 labels, created 1079 nodes, set 20079 properties, created 1000 relationships, completed after 191 ms.
Rule: Added 2114 labels, created 1033 nodes, set 20033 properties, created 1000 relationships, completed after 146 ms.
Rule: Added 2050 labels, created 1015 nodes, set 20015 properties, created 1000 relationships, completed after 143 ms.
Rule: Added 2013 labels, created 1003 nodes, set 20003 properties, created 1000 relationships, completed after 136 ms.
Rule: Added 2007 labels, created 1003 nodes, set 20003 properties, created 1000 relationships, completed after 138 ms.
Rule: Added 3 labels, created 1 nodes, set 2001 properties, created 0 relationships, completed after 27 ms.


1385
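
The trailing value `1385` equals the sum of the eight per-rule times reported above, without the 40 ms spent on the index. This is an observation from this particular output, not documented behavior of `apply_on`. The short sketch below re-derives the totals from the log lines, copied verbatim from the output above; the regular expression and variable names are illustrative only.

```python
import re

# Output of cell [4], copied verbatim from above.
LOG = """\
Index: Added 0 index, completed after 40 ms.
Rule: Added 3316 labels, created 1658 nodes, set 20658 properties, created 1000 relationships, completed after 440 ms.
Rule: Added 2838 labels, created 1208 nodes, set 20208 properties, created 1000 relationships, completed after 164 ms.
Rule: Added 2317 labels, created 1079 nodes, set 20079 properties, created 1000 relationships, completed after 191 ms.
Rule: Added 2114 labels, created 1033 nodes, set 20033 properties, created 1000 relationships, completed after 146 ms.
Rule: Added 2050 labels, created 1015 nodes, set 20015 properties, created 1000 relationships, completed after 143 ms.
Rule: Added 2013 labels, created 1003 nodes, set 20003 properties, created 1000 relationships, completed after 136 ms.
Rule: Added 2007 labels, created 1003 nodes, set 20003 properties, created 1000 relationships, completed after 138 ms.
Rule: Added 3 labels, created 1 nodes, set 2001 properties, created 0 relationships, completed after 27 ms.
"""

# Only lines starting with "Rule:" are counted; the "Index:" line is skipped.
pattern = re.compile(r"^Rule: .*created (\d+) nodes, .*completed after (\d+) ms\.$")

total_nodes = 0
total_ms = 0
for line in LOG.splitlines():
    match = pattern.match(line)
    if match:
        total_nodes += int(match.group(1))
        total_ms += int(match.group(2))

print(f"nodes created by rules: {total_nodes}")  # 8000
print(f"time spent in rules:    {total_ms} ms")  # 1385 ms
```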