# Querying Graph Databases

### Data Representation

<img src="../img/na7.png" width="600">
<img src="../img/na8.png" width="600">

## Graph Query Language

- Retrieving data from a graph database requires a query language.
- No single language has yet been adopted as universally as SQL was for relational databases.
- Standardization efforts include Gremlin, SPARQL, and Cypher.

### Cypher (.cql, .cyp, .cypher)
<img src="../img/na10.png" width="1000">
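Cypher writes graph patterns in an ASCII-art style: nodes go in parentheses, relationships in square brackets, and an arrow gives the direction. As a rough sketch, the snippet below assembles such a pattern from its parts. The helper `cypher_pattern` is hypothetical and not part of any Neo4j library, and the relationship type `INTERACTS` is an assumption used only for illustration.

```python
def cypher_pattern(src_var, src_label, rel_type, dst_var, dst_label):
    """Build a directed pattern such as (a:Label)-[:TYPE]->(b:Label).

    Hypothetical helper, for illustration only.
    """
    source = f"({src_var}:{src_label})"   # node: variable and label in parentheses
    relationship = f"-[:{rel_type}]->"    # relationship: type in brackets, arrow sets direction
    target = f"({dst_var}:{dst_label})"
    return source + relationship + target


# "INTERACTS" is an assumed relationship type.
pattern = cypher_pattern("c", "Character", "INTERACTS", "s", "Character")
query = f"MATCH {pattern} RETURN c.name, s.name LIMIT 5"
print(query)
```

Note the colon before the relationship type: without it, the name inside the brackets is read as a variable rather than as a type filter.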

# neo4j.com/sandbox-v2

# py2neo

`py2neo` is a community-maintained Python client library for Neo4j. It offers a fully-featured interface for interacting with your data in Neo4j. Install it from a notebook cell with:
`!pip install py2neo`

Connect to Neo4j with the `Graph` class.

In [1]:
from py2neo import Graph
import pandas as pd

In [3]:
graph = Graph(host="18.234.106.185",
               password="november-uncertainties-defeat",
               port="37113",
               scheme='http',
               user='neo4j')

In [4]:
graph

<Graph database=<Database uri='http://18.234.106.185:37113' secure=False user_agent='py2neo/4.2.0 urllib3/1.24.1 Python/3.7.3-final-0 (darwin)'> name='data'>

In [5]:
def query2table(graph, query):
    return pd.DataFrame(graph.run(query).data())

In [6]:
query_1 = """
MATCH (c:Character)-->()
WITH c, count(*) AS num
RETURN min(num) AS min, max(num) AS max, avg(num) AS avg_characters, stdev(num) AS stdev
"""

In [7]:
query2table(graph, query_1)

|   | avg_characters | max | min | stdev |
|---|---|---|---|---|
| 0 | 6.578856 | 148 | 1 | 14.160456 |
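The query above aggregates in two steps: `WITH c, count(*) AS num` counts outgoing relationships per character, and `RETURN` then summarizes those counts. A minimal pure-Python sketch of the same idea follows, assuming a small made-up edge list (the names `edges`, `out_counts`, and `summary` are illustrative, not part of the dataset or of `py2neo`).

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical directed edges (source, target); not taken from the dataset.
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"),
    ("C", "A"), ("C", "D"),
]

# Step 1 (WITH c, count(*) AS num): outgoing relationships per source node.
out_counts = Counter(source for source, _ in edges)

# Step 2 (RETURN min, max, avg, stdev): aggregate over the per-node counts.
nums = list(out_counts.values())
summary = {
    "min": min(nums),
    "max": max(nums),
    "avg": mean(nums),
    "stdev": stdev(nums),  # sample standard deviation (N - 1 denominator)
}
print(summary)
```

Node `D` has no outgoing edge, so it never appears in `out_counts`. The same applies to the `MATCH` pattern, which is why the reported minimum is 1 rather than 0.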


In [8]:
query_2 = """
MATCH (c:Character)-[r]->(s:Character)
WHERE r.weight > 20
RETURN c.name AS source, s.name AS target
"""

In [9]:
query2table(graph, query_2).head()

|   | source | target |
|---|---|---|
| 0 | Aemon-Targaryen-(Maester-Aemon) | Jon-Snow |
| 1 | Aemon-Targaryen-(Maester-Aemon) | Samwell-Tarly |
| 2 | Aemon-Targaryen-(Maester-Aemon) | Samwell-Tarly |
| 3 | Aemon-Targaryen-(Maester-Aemon) | Jon-Snow |
| 4 | Aemon-Targaryen-(Maester-Aemon) | Jon-Snow |


In [10]:
query_3 = """
MATCH (c:Character)-[r]->()
WITH r.book AS book, c, count(*) AS num
RETURN book, min(num) AS min, max(num) AS max, avg(num) AS avg_characters, stdev(num) AS stdev
ORDER BY book
"""

In [11]:
query2table(graph, query_3).head()

|   | avg_characters | book | max | min | stdev |
|---|---|---|---|---|---|
| 0 | 4.920863 | 1 | 51 | 1 | 7.096707 |
| 1 | 4.015544 | 2 | 37 | 1 | 5.360423 |
| 2 | 4.289362 | 3 | 36 | 1 | 5.335866 |
| 3 | 3.691667 | 45 | 57 | 1 | 5.636326 |
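In the per-book query, every non-aggregated expression acts as a grouping key: the `WITH` clause groups by the pair (`book`, `c`), and the `RETURN` clause regroups by `book` alone. The sketch below mimics that regrouping in plain Python, assuming a made-up list of `(source, book)` pairs; all names in it are illustrative.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical relationships as (source, book) pairs; not taken from the dataset.
rels = [("A", 1), ("A", 1), ("B", 1), ("A", 2), ("C", 2), ("C", 2), ("C", 2)]

# WITH r.book AS book, c, count(*) AS num: group by the (book, source) pair.
per_book_char = Counter((book, source) for source, book in rels)

# RETURN book, min(num), ...: regroup the per-character counts by book alone.
by_book = defaultdict(list)
for (book, _source), num in per_book_char.items():
    by_book[book].append(num)

for book in sorted(by_book):  # ORDER BY book
    nums = by_book[book]
    print(book, min(nums), max(nums), mean(nums))
```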


In [12]:
query_4 = """
MATCH (c:Character)-[]-()
RETURN c.name AS character, count(*) AS degree ORDER BY degree DESC LIMIT 50
"""

In [13]:
query2table(graph, query_4).head()

|   | character | degree |
|---|---|---|
| 0 | Tyrion-Lannister | 210 |
| 1 | Jon-Snow | 182 |
| 2 | Cersei-Lannister | 177 |
| 3 | Jaime-Lannister | 162 |
| 4 | Joffrey-Baratheon | 140 |
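The undirected pattern `(c)-[]-()` matches each relationship once from each of its endpoints, so `count(*)` per character is that character's degree, with parallel relationships counted separately. A small sketch of the same count in plain Python, assuming a made-up edge list:

```python
from collections import Counter

# Hypothetical relationships (source, target); not taken from the dataset.
# ("A", "C") and ("C", "A") are two distinct relationships between the same pair.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]

degree = Counter()
for source, target in edges:
    # Ignoring direction: each relationship adds one to both endpoints.
    degree[source] += 1
    degree[target] += 1

# Equivalent of ORDER BY degree DESC LIMIT 2.
top = degree.most_common(2)
print(top)
```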


## jgraph

In [14]:
import jgraph as jg

In [15]:
def query2tuples(graph, query):
    return [tuple(x) for x in graph.run(query)]

In [16]:
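# Note: [INTERACTS1] has no leading colon, so it names a variable and matches any relationship type; [:INTERACTS1] would filter by type.
query = """MATCH (n)-[INTERACTS1]->(m) RETURN n.name, m.name LIMIT 50"""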

In [17]:
query_tuples = query2tuples(graph, query)

In [18]:
jg.draw(query_tuples, directed=False, shader="lambert",
        default_node_color=0x383294, z=200, size=(800, 600))

In [19]:
generated = jg.generate(query_tuples)

In [20]:
colors = ['Arianne-Martell','Aegon-V-Targaryen','Catelyn-Stark','Arya-Stark']

In [21]:
for k,v in generated['nodes'].items():
    if k in colors:
        v.update({'color': 0xffaaaa})
    else:
        v.update({'color': 0x2222ff})

In [22]:
jg.draw(generated, directed=False, 
        shader="lambert", default_node_color=0x383294, z=200, size=(800, 600))

### Integration

In [23]:
import networkx as nx
import matplotlib.pyplot as plt

In [24]:
g = nx.Graph()
g.add_edges_from(query_tuples)

In [25]:
def measure2table(measure):
    table = pd.DataFrame.from_dict(measure, orient='index').reset_index()
    table.columns = ['nodes','score']
    return table.sort_values('score', ascending=False)

In [26]:
dc = nx.degree_centrality(g)

In [27]:
centrality = measure2table(dc)

In [28]:
centrality.head()

|   | nodes | score |
|---|---|---|
| 6 | Arya-Stark | 0.257143 |
| 33 | Catelyn-Stark | 0.2 |
| 24 | Bran-Stark | 0.142857 |
| 14 | Balon-Greyjoy | 0.114286 |
| 19 | Barristan-Selmy | 0.085714 |
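`nx.degree_centrality` normalizes each node's degree by the largest degree a node could have in a simple graph, `n - 1`, where `n` is the number of nodes. The sketch below reproduces that formula by hand on a made-up edge list. It assumes an undirected graph without self-loops, and the function name `degree_centrality_sketch` is hypothetical.

```python
def degree_centrality_sketch(edge_list):
    """Return degree / (n - 1) per node for an undirected simple graph."""
    neighbors = {}
    for u, v in edge_list:
        # Sets collapse duplicate edges, as nx.Graph does.
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n <= 1:
        return {node: 1.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical edges; the repeated ("A", "B") pair is counted once.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("A", "B")]
scores = degree_centrality_sketch(toy_edges)
print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

Read this way, the top score of 0.257143 in the table is consistent with a node connected to 9 of 35 other nodes, since 9/35 ≈ 0.257143.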
