## Neo4j via Python

First, we need to import the driver package we want to use. 

Neo4j provides an official driver, called neo4j, but py2neo offers a higher-level API that is much easier to use. 


In [60]:
!pip install py2neo
import py2neo



Launch your database in Neo4j Desktop, and figure out what port it's running on. 

Right now, mine is running at localhost:7687.

Import the Graph module from py2neo to connect to our graph.

In [61]:
from py2neo import Graph

Establish a connection with the graph using the localhost port your graph is running on. 

Specify your username and password. The default username is "neo4j"; the password is whatever you set when you created the database (mine is "launch").

In [79]:
graph = Graph("bolt://localhost:7687", auth=("neo4j", "launch"))

In [80]:
type(graph)

py2neo.database.Graph
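Hard-coding a password in a notebook makes it easy to leak when the notebook is shared. Here is a minimal sketch of one alternative, assuming hypothetical environment variables NEO4J_URI, NEO4J_USER and NEO4J_PASSWORD. These names are not defined by py2neo, so pick whatever suits your setup. The fallbacks simply mirror the values used in the cell above.

```python
import os

def connection_settings(env=os.environ):
    """Return (uri, (user, password)) read from environment variables.

    The variable names are hypothetical; the fallbacks mirror the notebook's values.
    """
    uri = env.get("NEO4J_URI", "bolt://localhost:7687")
    user = env.get("NEO4J_USER", "neo4j")
    password = env.get("NEO4J_PASSWORD", "launch")
    return uri, (user, password)

uri, auth = connection_settings()
# With Graph imported from py2neo as above, the connection would then be:
# graph = Graph(uri, auth=auth)
```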

## Method 1: Explicit cypher execution

Now we can use graph.run to run Cypher queries directly from Python!

In [81]:
graph.run("CREATE (n:Person {name:'Ben', grad_year:2020})")

<py2neo.database.Cursor at 0x11fdce4e0>

In [82]:
#Delete all nodes (DETACH also removes their relationships)
graph.run("MATCH (n) DETACH DELETE n")

<py2neo.database.Cursor at 0x11fdd7b00>

In [83]:
graph.run("MATCH (n) RETURN n")

<py2neo.database.Cursor at 0x11fdd7048>
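graph.run returns a Cursor rather than the matched rows, which is why the cells above only show an object address. Here is a small sketch of pulling the results out with the cursor's data() method, which returns a list of dictionaries. fetch_all is a hypothetical helper name, and graph is assumed to be the connection created earlier.

```python
def fetch_all(graph, query):
    """Run a Cypher query and return every record as a dict keyed by RETURN alias."""
    cursor = graph.run(query)
    return cursor.data()

# Usage, with the py2neo graph object from above:
# fetch_all(graph, "MATCH (n) RETURN n.name AS name")
```

If your py2neo version provides it, the cursor's to_data_frame() method is another option when you want the results back in pandas.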

In [84]:
import pandas as pd
names = ['Daniel Willson', "Andy Page", "Kaleigh Watson", "Amanda Coombs"]
titles = ['VPP', "ED", "LPD", "COO"]
schools = ['UVA', 'UVA', 'UVA', 'UVA']
workplaces = ['Forge', 'Forge', 'Forge', 'Forge']

people = pd.DataFrame({'name':names, 'title':titles, 'school':schools, 'workplace':workplaces})

In [85]:
school_name = ['UVA', 'VT']
school_type= ['Public', 'Public']
school_size = [16000, 30000]

company_name = ['Forge', 'Astraea']
company_type = ['501(c)(3)', 'For-profit startup']


schools = pd.DataFrame({'name':school_name, 'type':school_type, 'size':school_size})
companies = pd.DataFrame({'name':company_name, 'type':company_type})

In [86]:
schools

| | name | type | size |
|---|---|---|---|
| 0 | UVA | Public | 16000 |
| 1 | VT | Public | 30000 |


Create *what will be referenced* before you create *what will reference it.*

In [87]:
#Creating schools: 

for i in range(len(schools)):
    print(schools.iloc[i])

name       UVA
type    Public
size     16000
Name: 0, dtype: object
name        VT
type    Public
size     30000
Name: 1, dtype: object


In [88]:
str(16000)

'16000'

We want our create string to be: 

CREATE (n:SCHOOL {name:"UVA", size:16000, type:"Public"})

In [89]:
#Let's work with this, constructing a cypher statement for each row. 

for i in range(len(schools)):
    row = schools.iloc[i]
    print("CREATE (:SCHOOL {name:" + row['name'] + ",size:" + str(row['size']) + ",type:" + row.type)

CREATE (:SCHOOL {name:UVA,size:16000,type:Public
CREATE (:SCHOOL {name:VT,size:30000,type:Public


In [90]:
#Turns out we need to add quotes where we want them (but not where we don't!)
#Note: use row['size'], not row.size. row.size is the number of elements in the row (3), not the 'size' column.

for i in range(len(schools)):
    row = schools.iloc[i]
    print("CREATE (:SCHOOL {name:'" + row['name'] + "',size:" + str(row['size']) + ",type:'" + row['type'] + "'})")

CREATE (:SCHOOL {name:'UVA',size:16000,type:'Public'})
CREATE (:SCHOOL {name:'VT',size:30000,type:'Public'})


Much better! Now we're able to "run" this into the graph

In [91]:
for i in range(len(schools)):
    row = schools.iloc[i]
    #row['size'] reads the 'size' column; row.size would return the length of the row instead
    graph.run("CREATE (:SCHOOL {name:'" + row['name'] + "',size:" + str(row['size']) + ",type:'" + row['type'] + "'})")

Similarly for companies:

In [92]:
for i in range(len(companies)):
    row = companies.iloc[i]
    graph.run("CREATE (:COMPANY {name:'" + row['name'] + "',type:'" + row.type+"'})")
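Deciding by hand which values get quotes is easy to get wrong. As a sketch, the quoting rule used in the loops above can be moved into a small helper. Both function names here are hypothetical, and the escaping covers only backslashes and single quotes.

```python
import numbers

def cypher_literal(value):
    """Format a Python value as a Cypher literal: strings quoted, numbers bare."""
    if isinstance(value, bool):
        return "true" if value else "false"
    # numbers.Number covers plain Python ints and floats (NumPy scalars register with it too)
    if isinstance(value, numbers.Number):
        return str(value)
    # Escape backslashes first, then single quotes, so names like O'Brien stay valid
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"

def create_statement(label, properties):
    """Build a CREATE statement for one node from a dict of properties."""
    body = ",".join(f"{key}:{cypher_literal(val)}" for key, val in properties.items())
    return f"CREATE (:{label} {{{body}}})"

print(create_statement("SCHOOL", {"name": "UVA", "size": 16000, "type": "Public"}))
```

The printed statement can be passed to graph.run just like the hand-built strings.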

Reading in people, and creating some links.

In [93]:
for i in range(len(people)):
    row = people.iloc[i]
    #First step: Create the person nodes, with the properties that won't be used to create relationships.
    graph.run("CREATE (:PERSON {name:'" + row['name'] + "'})")
    
    #What would creating relationships to companies and schools look like? Let's go to the cypher browser
    graph.run("MATCH (p:PERSON {name:'" + row['name'] +"'}), (c:COMPANY {name:'" + row['workplace'] + "'}) CREATE (p)-[:WORKS_AT]->(c)")
    
    #Same (similar) deal for schools
    graph.run("MATCH (p:PERSON {name:'" + row['name'] +"'}), (s:SCHOOL {name:'" + row['school'] + "'}) CREATE (p)-[:ATTENDED]->(s)")
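Concatenating values into the query breaks as soon as a name contains a quote, and it leaves the query open to injection. graph.run also accepts query parameters as keyword arguments, which Cypher references as $name. Here is a sketch of the same loop body using parameters. add_person is a hypothetical helper, and graph is assumed to be the connection from above.

```python
def add_person(graph, name, workplace, school):
    """Create a PERSON node and link it to an existing COMPANY and SCHOOL."""
    graph.run("CREATE (:PERSON {name: $name})", name=name)
    graph.run(
        "MATCH (p:PERSON {name: $name}), (c:COMPANY {name: $workplace}) "
        "CREATE (p)-[:WORKS_AT]->(c)",
        name=name, workplace=workplace,
    )
    graph.run(
        "MATCH (p:PERSON {name: $name}), (s:SCHOOL {name: $school}) "
        "CREATE (p)-[:ATTENDED]->(s)",
        name=name, school=school,
    )

# Usage inside the loop above:
# add_person(graph, row['name'], row['workplace'], row['school'])
```

The query text stays constant and only the parameter values change, so no manual quoting is needed.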

In [94]:
graph

<Graph database=<Database uri='bolt://localhost:7687' secure=False user_agent='py2neo/4.3.0 neobolt/1.7.17 Python/3.7.3-final-0 (darwin)'> name='data'>

In [None]:
#Template strings (f-strings) make this drastically less hideous.

In [None]:
for i in range(len(people)):
    row = people.iloc[i]
    name, workplace, school = row['name'], row['workplace'], row['school']
    #Literal Cypher braces are doubled ({{ and }}) inside an f-string.
    #First step: Create the person nodes, with the properties that won't be used to create relationships.
    graph.run(f"CREATE (:PERSON {{name:'{name}'}})")
    
    #Relationships to companies
    graph.run(f"MATCH (p:PERSON {{name:'{name}'}}), (c:COMPANY {{name:'{workplace}'}}) CREATE (p)-[:WORKS_AT]->(c)")
    
    #Same (similar) deal for schools
    graph.run(f"MATCH (p:PERSON {{name:'{name}'}}), (s:SCHOOL {{name:'{school}'}}) CREATE (p)-[:ATTENDED]->(s)")

## Method 2: Using actual py2neo datatypes!
A node is built as Node("Label", property="value", other_property="value"): the label comes first, followed by the properties as keyword arguments.

In [103]:
graph.run("MATCH (n) DETACH DELETE n")
from py2neo import Node
kaleigh = Node("Person", name = "Kaleigh Watson", coworkers = ['Andy', 'Amanda', 'Daniel'])

In [104]:
graph.create(kaleigh)

In [108]:
from py2neo import Relationship
WorksFor = Relationship.type("WORKS_FOR")
Attended = Relationship.type("ATTENDED")

In [154]:
forge = Node("Company", name = "Forge", type="501c3")
uva = Node("School", name = "UVA", type = "Public")

In [112]:
#Create the relationship between kaleigh and forge. 
graph.create(WorksFor(kaleigh, forge))


In [138]:
from py2neo import Graph, NodeMatcher
matcher = NodeMatcher(graph)

#The older, .match("Type", attr="value", attr="value") syntax appears broken. We'll use this instead:
matcher.match("Company").where("_.name = 'Forge'").first()

(_11:Company {name: 'Forge', type: '501c3'})

In [149]:
graph.run("Match (n) detach delete n")

<py2neo.database.Cursor at 0x11fdda1d0>

Per this GitHub issue (https://github.com/technige/py2neo/issues/777), it appears that this kind of functionality (connecting nodes that already exist in the graph) is broken or just not supported: the "create" call ignores both existing nodes and creates new ones instead. 

If you can figure out how to make it work, definitely poke around and give it a try! 

We'll use this workaround for now: 

- Read in each individual node type
- Connect the nodes based on properties 
- Delete the redundant properties from each node.

In [157]:
graph.run("Match (n) detach delete n")

<py2neo.database.Cursor at 0x1202ad2e8>

### 1: Read in each individual node type using this easier API method

In [160]:
#People
for i in range(len(people)):
    row = people.iloc[i]
    person = Node("Person", name = row['name'], workplace=row['workplace'], school = row['school'])
    graph.create(person)
#Schools
for i in range(len(schools)):
    row = schools.iloc[i]
    #int() converts the pandas/NumPy integer to a plain Python int, so size stays numeric in the graph
    school = Node("School", name = row['name'], type=row['type'], size = int(row['size']))
    graph.create(school)
#Companies
for i in range(len(companies)):
    row = companies.iloc[i]
    company = Node("Company", name = row['name'], type = row['type'])
    graph.create(company)

### 2: Use raw cypher queries to connect the nodes based on properties

In [163]:
#People ATTENDED Schools
graph.run("MATCH (n:Person), (m:School) WHERE n.school = m.name CREATE (n)-[:ATTENDED]->(m)")

#People WORKS_AT Companies
graph.run("MATCH (n:Person), (m:Company) WHERE n.workplace = m.name CREATE (n)-[:WORKS_AT]->(m)")

<py2neo.database.Cursor at 0x1202c8fd0>

### 3: Eliminate redundant properties
We want to get rid of the extra properties on our Person nodes. The relationships now record which school each person attended and where they work, so the school and workplace properties are no longer needed. 

In [167]:
graph.run("MATCH (n:Person) REMOVE n.workplace, n.school")

<py2neo.database.Cursor at 0x1202b8828>
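Since graph.run only hands back a Cursor, it is worth checking that the three steps did what we intended. Here is a sketch of two checks, assuming graph is the connection from above; both helper names are hypothetical. With the sample data, and a graph that was empty beforehand, one would expect four ATTENDED and four WORKS_AT relationships and no leftover properties.

```python
def relationship_counts(graph):
    """Return {relationship type: count} for every relationship in the graph."""
    records = graph.run(
        "MATCH ()-[r]->() RETURN type(r) AS rel, count(r) AS n"
    ).data()
    return {record["rel"]: record["n"] for record in records}

def leftover_properties(graph):
    """Count Person nodes that still carry the helper properties."""
    query = (
        "MATCH (p:Person) WHERE p.workplace IS NOT NULL OR p.school IS NOT NULL "
        "RETURN count(p) AS n"
    )
    # evaluate() returns the first value of the first record
    return graph.run(query).evaluate()

# Usage:
# relationship_counts(graph)
# leftover_properties(graph)
```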