# Red Hook Initiative (RHI) network analysis using py2neo and neo4j
This notebook is connected to a Neo4j project and database, so we can create, manage, and query components of the database from the notebook and then visualize them in Neo4j. Queries are written in Cypher, Neo4j's graph query language, which lets users store and retrieve data from the graph database.

- [Download neo4j](https://neo4j.com/download-neo4j-now/?utm_source=google&utm_medium=ppc&utm_campaign=*NA%20-%20Search%20-%20Branded&utm_adgroup=*NA%20-%20Search%20-%20Branded%20-%20Neo4j%20-%20Download%20-%20Exact&utm_term=neo4j%20download&gclid=Cj0KCQjwi43oBRDBARIsAExSRQFFn8nIJ4rzDGV0kdXlvF7nt1BAZY_z0URrTSJ_7-CHhCPHqXqZ9ksaAjiTEALw_wcB)
- [Install py2neo](https://py2neo.org/v4/)
- [Cypher documentation](https://neo4j.com/developer/cypher/)

[**Red Hook Initiative (RHI)**](https://rhicenter.org/) is a dominant community organization in Red Hook, Brooklyn, NY. Its goal is to support youth development through community-building efforts. Since 2006, RHI's programs have grown and its roots in the community have deepened; today it reaches over 5,000 Red Hook residents annually.

# 0. Connecting to the Neo4j project

In [1]:
from py2neo import Graph
g = Graph(password="draw1234") # connecting to the neo4j database
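The connection above hard-codes the password in the notebook. A minimal sketch of one alternative is to read the settings from environment variables. The variable names `NEO4J_URI`, `NEO4J_USER` and `NEO4J_PASSWORD` and the helper `connection_settings` are assumptions made for this example, not part of py2neo or of this project.

```python
import os


def connection_settings(env=os.environ):
    """Collect connection settings for py2neo's Graph from a mapping.

    The variable names are hypothetical and chosen only for this sketch.
    """
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise KeyError("NEO4J_PASSWORD is not set")
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "auth": (env.get("NEO4J_USER", "neo4j"), password),
    }


# Usage in the notebook (needs py2neo and a running database):
#   from py2neo import Graph
#   settings = connection_settings()
#   g = Graph(settings["uri"], auth=settings["auth"])
```

The helper only builds the settings, so it can be checked without a database; the commented lines show where `Graph` would consume them.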

# 1. Creating the database
1. Creating constraints
2. Creating nodes from .csv files
3. Creating relationships between nodes

In [50]:
# creating uniqueness constraints on the property that each MERGE below matches on
g.run("CREATE CONSTRAINT ON (s:Staff) ASSERT s.name IS UNIQUE")
g.run("CREATE CONSTRAINT ON (p:Program) ASSERT p.name IS UNIQUE")
g.run("CREATE CONSTRAINT ON (b:SubProgram) ASSERT b.name IS UNIQUE")
g.run("CREATE CONSTRAINT ON (i:Initiative) ASSERT i.name IS UNIQUE")
g.run("CREATE CONSTRAINT ON (l:Location) ASSERT l.name IS UNIQUE")
g.run("CREATE CONSTRAINT ON (h:Hours) ASSERT h.hours IS UNIQUE")

<py2neo.database.Cursor at 0x1098ece10>
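A uniqueness constraint only protects a `MERGE` when it names the same property that the `MERGE` matches on. One way to keep the two aligned is to generate the statements from a single mapping of label to key property. The sketch below is illustrative: `constraint_statements` and `UNIQUE_KEYS` are hypothetical names, and the mapping lists only two of the labels used in this notebook.

```python
def constraint_statements(unique_keys):
    """Return one CREATE CONSTRAINT statement per (label, property) pair."""
    statements = []
    for label, prop in unique_keys.items():
        var = label[0].lower()  # short variable name, e.g. "s" for Staff
        statements.append(
            f"CREATE CONSTRAINT ON ({var}:{label}) ASSERT {var}.{prop} IS UNIQUE"
        )
    return statements


# Hypothetical mapping: label -> property used as the MERGE key.
UNIQUE_KEYS = {"Staff": "name", "Hours": "hours"}

for statement in constraint_statements(UNIQUE_KEYS):
    print(statement)  # in the notebook this would be g.run(statement)
```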

In [53]:
# # code for deleting all nodes and relationships
# # uncomment to run
# g.run("MATCH (n) DETACH DELETE n")

<py2neo.database.Cursor at 0x1098f86a0>

## 1.1 Nodes
### Staff Nodes

In [54]:
g.run("""
    LOAD CSV WITH HEADERS FROM "file:///Staff.csv" AS row 
    MERGE (s:Staff {name: row.NAME})
    ON CREATE SET s.program = row.PROGRAM, s.role = row.ROLE, 
    s.team = row.TEAM, s.location = row.LOCATION
    """)

<py2neo.database.Cursor at 0x1098ec550>
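`MERGE ... ON CREATE SET` creates a node only when no node with that key exists yet, and sets the extra properties only at creation time. If a name appears on several CSV rows, the first row's values are kept. The pure-Python sketch below mimics that behaviour on plain dictionaries; the rows are made up for illustration and are not taken from `Staff.csv`.

```python
import csv
import io


def merge_nodes(rows, key_column, property_columns):
    """Mimic MERGE (n {name: row[key]}) ON CREATE SET ... using dicts."""
    nodes = {}
    for row in rows:
        key = row[key_column]
        if key not in nodes:  # ON CREATE: runs only for a new node
            nodes[key] = {"name": key}
            for column in property_columns:
                nodes[key][column.lower()] = row[column]
    return nodes


# Made-up sample data, for illustration only.
sample = io.StringIO(
    "NAME,PROGRAM,ROLE\n"
    "Alex,Program A,Coordinator\n"
    "Alex,Program A,Director\n"
    "Sam,Program B,Manager\n"
)
staff = merge_nodes(csv.DictReader(sample), "NAME", ["PROGRAM", "ROLE"])
print(len(staff))             # 2
print(staff["Alex"]["role"])  # Coordinator
```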

### Program Nodes

In [55]:
g.run("""
    LOAD CSV WITH HEADERS FROM "file:///Programs.csv" AS row 
    MERGE (p:Program {name: row.PROGRAM})
    ON CREATE SET p.goal = row.GOAL
    """)

<py2neo.database.Cursor at 0x1098f8dd8>

### Sub-Program Nodes

In [56]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///SubPrograms.csv' AS row
    MERGE (b:SubProgram {name: row.SUBPROGRAM})
    ON CREATE SET b.program = row.PROGRAM, b.goal = row.GOAL,
    b.location = row.LOCATION, b.hours = row.HOURS
    """)

<py2neo.database.Cursor at 0x1098fe198>

### Initiatives Nodes

In [57]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Initiatives.csv' AS row
    MERGE (i:Initiative {name: row.INITIATIVE})
    ON CREATE SET i.program = row.PROGRAM, i.subprogram = row.SUBPROGRAM,
    i.location = row.LOCATION, i.hours = row.HOURS
    """)

<py2neo.database.Cursor at 0x1098f8a58>

### Location Nodes

In [58]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Locations.csv' AS row
    MERGE (l:Location {name: row.LOCATION})
    ON CREATE SET l.spaces = row.SPACE_TYPES
    """)

<py2neo.database.Cursor at 0x1098fe898>

### Hours Nodes

In [59]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Initiatives.csv' AS row
    MERGE (h:Hours {hours: row.HOURS})
    """)

g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///SubPrograms.csv' AS row
    MERGE (h:Hours {hours: row.HOURS})
    """)

<py2neo.database.Cursor at 0x1098fe940>

## 1.2 Connections
### Connection: STAFF > PROGRAM and/or SUB PROGRAM

In [60]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Staff.csv' AS row
    MATCH (s:Staff {name: row.NAME})
    MATCH (p:Program {name: row.PROGRAM})
    MERGE (s)-[:WORKS_IN]->(p)
    """)

g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Staff.csv' AS row
    MATCH (s:Staff {name: row.NAME})
    MATCH (b:SubProgram {name: row.PROGRAM})
    MERGE (s)-[:WORKS_IN]->(b)
    """)

<py2neo.database.Cursor at 0x1098fedd8>
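In the `MATCH`, `MATCH`, `MERGE` pattern used above, a row produces a relationship only when both nodes are found; rows where either `MATCH` fails are skipped without an error, and `MERGE` does not duplicate an existing relationship. The sketch below imitates this with sets so that skipped rows become visible. All names and rows are made up for illustration.

```python
def link_rows(rows, source_names, target_names, source_col, target_col):
    """Return (linked, skipped) pairs, mimicking MATCH, MATCH, MERGE."""
    linked, skipped = set(), []
    for row in rows:
        source, target = row[source_col], row[target_col]
        if source in source_names and target in target_names:
            linked.add((source, target))      # MERGE never duplicates a pair
        else:
            skipped.append((source, target))  # a failed MATCH yields no row
    return linked, skipped


# Made-up rows and node names, for illustration only.
rows = [
    {"NAME": "Alex", "PROGRAM": "Program A"},
    {"NAME": "Alex", "PROGRAM": "Program A"},     # repeated row
    {"NAME": "Sam", "PROGRAM": "Sub-Program X"},  # names a sub-program
]
staff = {"Alex", "Sam"}
programs = {"Program A"}

linked, skipped = link_rows(rows, staff, programs, "NAME", "PROGRAM")
print(linked)   # {('Alex', 'Program A')}
print(skipped)  # [('Sam', 'Sub-Program X')]
```

Running the same rows against the set of sub-program names would pick up the skipped pair, which is what the second query in the cell does for staff whose `PROGRAM` value names a sub-program.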

### Connection: SUB PROGRAM > PROGRAM

In [61]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///SubPrograms.csv' AS row
    MATCH (b:SubProgram {name: row.SUBPROGRAM})
    MATCH (p:Program {name: row.PROGRAM})
    MERGE (b)-[:IS_PART_OF]->(p)
    """)

<py2neo.database.Cursor at 0x1098fe780>

### Connection: INITIATIVE > PROGRAM, SUB PROGRAM, LOCATION and HOURS

In [63]:
# initiative IS_PART_OF program
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Initiatives.csv' AS row
    MATCH (i:Initiative {name: row.INITIATIVE})
    MATCH (p:Program {name: row.PROGRAM})
    MERGE (i)-[:IS_PART_OF]->(p)
    """)

# initiative IS_PART_OF subprogram
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Initiatives.csv' AS row
    MATCH (i:Initiative {name: row.INITIATIVE})
    MATCH (b:SubProgram {name: row.SUBPROGRAM})
    MERGE (i)-[:IS_PART_OF]->(b)
    """)

# initiative AT_LOCATION location
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Initiatives.csv' AS row
    MATCH (i:Initiative {name: row.INITIATIVE})
    MATCH (l:Location {name: row.LOCATION})
    MERGE (i)-[:AT_LOCATION]->(l)
    """)

# initiative ON_TIME hours
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///Initiatives.csv' AS row
    MATCH (i:Initiative {name: row.INITIATIVE})
    MATCH (h:Hours {hours: row.HOURS})
    MERGE (i)-[:ON_TIME]->(h)
    """)

<py2neo.database.Cursor at 0x1098fe470>
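The relationship queries in this section share one shape and differ only in the CSV file, the two node patterns and the relationship type. As a sketch, the query text could be produced by a small helper; `relationship_query` is a hypothetical function written for this example and only builds a string.

```python
def relationship_query(csv_file, source, target, rel_type):
    """Build a LOAD CSV / MATCH / MATCH / MERGE query.

    source and target are (variable, label, key property, CSV column) tuples.
    """
    (sv, sl, sk, sc), (tv, tl, tk, tc) = source, target
    return (
        f"LOAD CSV WITH HEADERS FROM 'file:///{csv_file}' AS row\n"
        f"MATCH ({sv}:{sl} {{{sk}: row.{sc}}})\n"
        f"MATCH ({tv}:{tl} {{{tk}: row.{tc}}})\n"
        f"MERGE ({sv})-[:{rel_type}]->({tv})"
    )


query = relationship_query(
    "Initiatives.csv",
    ("i", "Initiative", "name", "INITIATIVE"),
    ("l", "Location", "name", "LOCATION"),
    "AT_LOCATION",
)
print(query)  # in the notebook this would be g.run(query)
```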

## Degree Centrality

In [7]:
g.run("""
MATCH (n)
WHERE NOT (n:Program)
RETURN DISTINCT n.name, n.last, n.email, size((n)<--()) AS ego
ORDER BY ego DESC
LIMIT 5
""").data()

[{'n.name': 'Conor',
  'n.last': 'Cunningham',
  'n.email': 'cc2697@cornell.edu',
  'ego': 9},
 {'n.name': 'Gabrielle',
  'n.last': 'Zandi',
  'n.email': 'gpz4@cornell.edu',
  'ego': 9},
 {'n.name': 'Jo-Anne',
  'n.last': 'Loh',
  'n.email': 'jl3839@cornell.edu',
  'ego': 8},
 {'n.name': 'Po Yen',
  'n.last': 'Tseng',
  'n.email': 'pt382@cornell.edu',
  'ego': 7},
 {'n.name': 'Marisha',
  'n.last': 'Thakker',
  'n.email': 'mbt53@cornell.edu',
  'ego': 7}]
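The expression `size((n)<--())` counts the relationships that point at `n`, which is the node's in-degree. The same measure on a plain edge list can be sketched as follows; the edges are made up for illustration.

```python
from collections import Counter


def in_degree(edges):
    """Count incoming edges per target node from (source, target) pairs."""
    return Counter(target for _source, target in edges)


# Made-up directed edges, for illustration only.
edges = [("a", "c"), ("b", "c"), ("d", "c"), ("a", "d"), ("b", "a")]
degrees = in_degree(edges)
print(degrees.most_common(1))  # [('c', 3)]
```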

## Betweenness Centrality

In [8]:
g.run("""
MATCH (node:Student)
WITH collect(node) AS nodes
CALL apoc.algo.betweenness(['TEAMING'],nodes,'INCOMING') YIELD node, score
RETURN node.name, node.last, node.email, score
ORDER BY score DESC
LIMIT 5
""").data()

[{'node.name': 'Renee',
  'node.last': 'Zacharowicz',
  'node.email': 'rz336@cornell.edu',
  'score': 5214.8912698412705},
 {'node.name': 'Conor',
  'node.last': 'Cunningham',
  'node.email': 'cc2697@cornell.edu',
  'score': 4308.464285714284},
 {'node.name': 'Steffen',
  'node.last': 'Baumgarten',
  'node.email': 'stb92@cornell.edu',
  'score': 3921.6833333333325},
 {'node.name': 'Charles',
  'node.last': 'Kuang',
  'node.email': 'ck742@cornell.edu',
  'score': 3874.683333333332},
 {'node.name': 'Iman',
  'node.last': 'Diarra',
  'node.email': 'iman@newschool.edu',
  'score': 3627.5376984126992}]
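Betweenness centrality scores a node by how many shortest paths between other nodes pass through it. The sketch below implements Brandes' algorithm for an unweighted, directed graph in plain Python. It is meant to illustrate the measure on a toy graph and is not claimed to reproduce the direction handling or scaling of `apoc.algo.betweenness`.

```python
from collections import deque


def betweenness(nodes, edges):
    """Brandes' algorithm for unweighted, directed betweenness centrality."""
    successors = {n: [] for n in nodes}
    for source, target in edges:
        successors[source].append(target)

    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {n: [] for n in nodes}
        sigma = {n: 0 for n in nodes}
        dist = {n: -1 for n in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in successors[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v is on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Toy diamond graph: two equally short routes from "a" to "d".
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(betweenness(nodes, edges))
```

In the diamond, `b` and `c` each carry half of the shortest paths from `a` to `d`, so each scores 0.5 while `a` and `d` score 0.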

## PageRank

In [9]:
g.run("""
MATCH (node:Student)
WITH collect(node) AS nodes
CALL apoc.algo.pageRank(nodes) YIELD node, score
RETURN node.name, node.last, node.email, score
ORDER BY score DESC
LIMIT 5
""").data()

[{'node.name': 'Gabrielle',
  'node.last': 'Zandi',
  'node.email': 'gpz4@cornell.edu',
  'score': 1.06157},
 {'node.name': 'Iman',
  'node.last': 'Diarra',
  'node.email': 'iman@newschool.edu',
  'score': 0.77762},
 {'node.name': 'Ben',
  'node.last': 'Zelditch',
  'node.email': 'bz87@cornell.edu',
  'score': 0.73736},
 {'node.name': 'Saniya',
  'node.last': 'Shah',
  'node.email': 'ss3734@cornell.edu',
  'score': 0.7159},
 {'node.name': 'Christopher',
  'node.last': 'Caulfield',
  'node.email': 'ctc98@cornell.edu',
  'score': 0.70883}]
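PageRank gives a node a high score when it is pointed to by nodes that themselves score highly. A minimal power-iteration sketch in plain Python is shown below. The damping factor of 0.85 and the fixed iteration count are assumptions for this example, and the scores are normalized to sum to 1, so they are not on the same scale as the `apoc.algo.pageRank` output above.

```python
def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed edge list."""
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share.
        new_rank = {node: (1.0 - damping) / count for node in nodes}
        for node in nodes:
            # A node without outgoing links spreads its rank evenly.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph: "c" is pointed to by both "a" and "b".
nodes = ["a", "b", "c"]
edges = [("a", "c"), ("b", "c"), ("c", "a")]
ranks = pagerank(nodes, edges)
print(sorted(ranks, key=ranks.get, reverse=True))  # ['c', 'a', 'b']
```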

## Skills

In [10]:
g.run("CREATE CONSTRAINT ON (s:Skills_topic) ASSERT s.topic IS UNIQUE")

<py2neo.database.Cursor at 0x111713ac8>

In [11]:
g.run("""
    LOAD CSV WITH HEADERS FROM "file:///superpowers-clean-4.csv" AS row 
    MERGE (s:Skills_topic {topic: row.topic_skills})
    """)

<py2neo.database.Cursor at 0x1119ccac8>

## Connection: STUDENT > SKILLS

In [12]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///superpowers-clean-4.csv' AS row
    MATCH (s:Student {email: row.email})
    MATCH (p:Skills_topic {topic: row.topic_skills})
    MERGE (s)-[:HAVE_SKILL]->(p)
    """)

<py2neo.database.Cursor at 0x1100f2940>

In [13]:
#MATCH (p:Program)<-[:STUDING]-(s:Student)-[r:HAVE_SKILL]->(T:Skills_topic) 
#RETURN r,s,T,p

In [14]:
g.run("""
    MATCH (s:Skills_topic {topic:"0"})
    SET s.description = "digit, intern, compani, build, comput, digit market, market, databas"
    """)
g.run("""
    MATCH (s:Skills_topic {topic:"1"})
    SET s.description = "product, design, market, product manag, manag, ux, brand, present"
    """)
g.run("""
    MATCH (s:Skills_topic {topic:"2"})
    SET s.description = "work, team, legal, experi, technic, plan, public, python"
    """)
g.run("""
    MATCH (s:Skills_topic {topic:"3"})
    SET s.description = "java, statist, analysi, project, data, js, project manag, data analysi"
    """)
g.run("""
    MATCH (s:Skills_topic {topic:"4"})
    SET s.description = "c, web, develop, stack, full, full stack, web develop, python"
    """)
g.run("""
    MATCH (s:Skills_topic {topic:"5"})
    SET s.description = "learn, machin, machin learn, think, model, strategi, speak, public speak"
    """)

<py2neo.database.Cursor at 0x111a31f28>
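The cell above repeats the same statement once per topic. As a sketch, the topic texts could live in a dictionary and be sent through one query with Cypher parameters, assuming the server accepts the `$name` parameter syntax. `set_descriptions` is a hypothetical helper; it takes the query runner as an argument (in the notebook that would be `g.run`, which accepts a parameter dictionary as its second argument), so it can be tried here with a stand-in that only records the calls. The property name is also an argument, so the helper can target whichever property the database already uses.

```python
def set_descriptions(run, label, key, prop, descriptions):
    """Call run(query, parameters) once per topic and return the query text.

    Label, key and property names cannot be Cypher parameters, so they are
    formatted into the query; the topic id and text are passed as parameters.
    """
    query = f"MATCH (t:{label} {{{key}: $topic}}) SET t.{prop} = $text"
    for topic, text in descriptions.items():
        run(query, {"topic": topic, "text": text})
    return query


# Stand-in for g.run that records what would be sent to the database.
calls = []


def fake_run(query, parameters):
    calls.append((query, parameters))


# Placeholder texts; the real keyword lists are the ones in the cell above.
set_descriptions(
    fake_run, "Skills_topic", "topic", "description",
    {"0": "first topic keywords", "1": "second topic keywords"},
)
print(calls[0][0])
# MATCH (t:Skills_topic {topic: $topic}) SET t.description = $text
```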

## Passions

In [15]:
g.run("CREATE CONSTRAINT ON (p:Passions_topic) ASSERT p.topic_passions IS UNIQUE")

<py2neo.database.Cursor at 0x1119ccc18>

In [16]:
g.run("""
    LOAD CSV WITH HEADERS FROM "file:///superpowers-clean-4.csv" AS row 
    MERGE (s:Passions_topic {topic_passions: row.topic_passions})
    """)

<py2neo.database.Cursor at 0x110192ef0>

In [17]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///superpowers-clean-4.csv' AS row
    MATCH (s:Student {email: row.email})
    MATCH (p:Passions_topic {topic_passions: row.topic_passions})
    MERGE (s)-[:HAVE_PASSION]->(p)
    """)

<py2neo.database.Cursor at 0x1119cc940>

In [18]:
g.run("""
    MATCH (s:Passions_topic {topic_passions:"0"})
    SET s.description = "travel, learn, new, food, cultur, citi"
    """)
g.run("""
    MATCH (s:Passions_topic {topic_passions:"1"})
    SET s.description = "travel, design, food, game, movi, card"
    """)
g.run("""
    MATCH (s:Passions_topic {topic_passions:"2"})
    SET s.description = "fashion, develop, design, love, travel, talk"
    """)
g.run("""
    MATCH (s:Passions_topic {topic_passions:"3"})
    SET s.description = "tech, polit, energi, econom, blockchain, healthcar"
    """)
g.run("""
    MATCH (s:Passions_topic {topic_passions:"4"})
    SET s.description = "data, health, scienc, data scienc, tech, law"
    """)
g.run("""
    MATCH (s:Passions_topic {topic_passions:"5"})
    SET s.description = "music, game, video game, video, movi, food"
    """)

<py2neo.database.Cursor at 0x111a3d4a8>

In [23]:
#MATCH (p:Program)<-[:STUDING]-(s:Student)-[r:HAVE_PASSION]->(t:Passions_topic) 
#RETURN p,s,r,t

In [24]:
#MATCH (p:Program {name:"MBA"})<-[:STUDING]-(s:Student)-[r:HAVE_PASSION]->(t:Passions_topic {topic_passions:"5"}) 
#RETURN p,s,r,t

## Experience

In [25]:
g.run("CREATE CONSTRAINT ON (p:Experience_topic) ASSERT p.topic_experience IS UNIQUE")

<py2neo.database.Cursor at 0x111a319e8>

In [26]:
g.run("""
    LOAD CSV WITH HEADERS FROM "file:///superpowers-clean-4.csv" AS row 
    MERGE (s:Experience_topic {topic_experience: row.topic_experience})
    """)

<py2neo.database.Cursor at 0x111a31908>

In [27]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///superpowers-clean-4.csv' AS row
    MATCH (s:Student {email: row.email})
    MATCH (p:Experience_topic {topic_experience: row.topic_experience})
    MERGE (s)-[:HAVE_EXPERIENCE]->(p)
    """)

<py2neo.database.Cursor at 0x111a316d8>

In [28]:
g.run("""
    MATCH (s:Experience_topic {topic_experience:"0"})
    SET s.description = "consult, design, product, data, project, product manag, experi"
    """)
g.run("""
    MATCH (s:Experience_topic {topic_experience:"1"})
    SET s.description = "engin, softwar, tech, compani, develop, product, intern"
    """)


<py2neo.database.Cursor at 0x111a3dcf8>

In [29]:
#MATCH (p:Program)<-[:STUDING]-(s:Student)-[r:HAVE_EXPERIENCE]->(t:Experience_topic) 
#RETURN p,s,r,t

## Complementary Skills

In [30]:
g.run("CREATE CONSTRAINT ON (p:Complementary_skills_topic) ASSERT p.complementary_skills IS UNIQUE")

<py2neo.database.Cursor at 0x111a3de10>

In [31]:
g.run("""
    LOAD CSV WITH HEADERS FROM "file:///superpowers-clean-4.csv" AS row 
    MERGE (s:Complementary_skills_topic {complementary_skills: row.topic_clomplentary_skills})
    """)

<py2neo.database.Cursor at 0x111a31e10>

In [32]:
g.run("""
    LOAD CSV WITH HEADERS FROM 'file:///superpowers-clean-4.csv' AS row
    MATCH (s:Student {email: row.email})
    MATCH (p:Complementary_skills_topic {complementary_skills: row.topic_clomplentary_skills})
    MERGE (s)-[:COMPLEMENTARY_SKILL]->(p)
    """)

<py2neo.database.Cursor at 0x111a3deb8>

In [33]:
g.run("""
    MATCH (s:Complementary_skills_topic {complementary_skills:"0"})
    SET s.description = "develop, busi, sale, learn"
    """)
g.run("""
    MATCH (s:Complementary_skills_topic {complementary_skills:"1"})
    SET s.description = "public, speak, public speak, design"
    """)
g.run("""
    MATCH (s:Complementary_skills_topic {complementary_skills:"2"})
    SET s.description = "skill, technic, code, technic skill"
    """)


<py2neo.database.Cursor at 0x111a4d320>

In [34]:
#MATCH (p:Program)<-[:STUDING]-(s:Student)-[r:COMPLEMENTARY_SKILL]->(t:Complementary_skills_topic) 
#RETURN p,s,r,t

In [35]:
#MATCH (p:Program {name:"CS"})<-[:STUDING]-(s:Student {gender: "1"})-[r:HAVE_PASSION]->(t:Passions_topic {topic_passions:"5"}) 
#RETURN s.name, s.email

In [None]:
#MATCH (p:Program {name:"MBA"})<-[:STUDING]-(s:Student {gender: "1"})-[r:HAVE_PASSION]->(t:Passions_topic) 
#WHERE s.passions CONTAINS "marketing"
#RETURN s

In [None]:
#MATCH (p:Program {name:"CS"})<-[:STUDING]-(s:Student {gender: "1"})-[r:HAVE_PASSION]->(t:Passions_topic {topic_passions:"5"}) 
#WHERE s.passions CONTAINS "game"
#RETURN s