# Neo4j

This notebook explores a very basic dataset of movies to better understand the property graph data model and the Cypher query language.

* [Neo4j Cheat Sheet](https://quickref.me/neo4j)
* [Cypher Reference Card](https://neo4j.com/docs/cypher-cheat-sheet/5/auradb-enterprise/)
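
As a rough illustration of the property graph data model, the sketch below represents nodes (with labels and properties) and relationships (with a type and a direction) as plain Python objects. The `Node` and `Relationship` classes and the `label_counts` helper are hypothetical names invented for this example; they are not part of Neo4j or py2neo, and the two nodes are taken from the query results shown later in this notebook.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carries a set of labels and a dict of properties."""
    labels: frozenset
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A relationship is directed (start -> end) and has exactly one type."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Two nodes and one relationship from the Movies graph.
hanks = Node(frozenset({"Person"}), {"name": "Tom Hanks"})
cast_away = Node(frozenset({"Movie"}), {"title": "Cast Away"})
acted = Relationship(hanks, "ACTED_IN", cast_away)

nodes = [hanks, cast_away]
relationships = [acted]


def label_counts(nodes):
    """Count nodes per label combination, similar in spirit to
    MATCH (n) RETURN DISTINCT labels(n), count(n)."""
    return Counter(tuple(sorted(node.labels)) for node in nodes)


print(label_counts(nodes))
```

Cypher patterns such as `(:Person)-[:ACTED_IN]->(:Movie)` describe exactly this shape: a labelled start node, a typed and directed relationship, and a labelled end node.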

## Create a Neo4j sandbox database instance

To create an instance, go to the [Neo4j Sandbox](https://sandbox.neo4j.com/), log in, and click "New Project". Then select the Movies graph and click "Create".

TODO: Run Neo4j in a Docker container loaded with my customer service dataset or the Enron emails. The Docker environment is not working yet.


## Connect to the database

The Bolt URL, username, and password are available under the sandbox's "Connection Details" tab.

In [1]:
bolt_url = "bolt://18.205.185.18:7687"
username = "neo4j"
pwd = "insertion-gums-comparison"

In [None]:
! pip install py2neo

In [3]:
from py2neo import Graph
conn = Graph(bolt_url, auth=(username, pwd))
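
Sandbox credentials expire, but for longer-lived instances it may be preferable to keep the password out of the notebook. The sketch below reads the connection settings from environment variables. The variable names `NEO4J_URL`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD`, the `load_connection_settings` helper, and the localhost default are all assumptions made for this example, not conventions required by py2neo.

```python
import os


def load_connection_settings(env=os.environ):
    """Return (url, username, password) from a mapping of environment variables.

    The variable names are hypothetical; adapt them to your own setup.
    """
    url = env.get("NEO4J_URL", "bolt://localhost:7687")
    username = env.get("NEO4J_USERNAME", "neo4j")
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("Set NEO4J_PASSWORD before connecting")
    return url, username, password


# Usage, assuming the variables are set and py2neo is installed:
# bolt_url, username, pwd = load_connection_settings()
# conn = Graph(bolt_url, auth=(username, pwd))
```

Passing the mapping as an argument keeps the helper easy to try out with a plain dict instead of the real environment.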

### Get all labels and their node count

In [4]:
query = """MATCH (n) RETURN distinct labels(n), count(n)"""
result = conn.query(query)
result



labels(n),count(n)
['Movie'],38
['Person'],133


### Get outgoing relations of "Person" nodes

In [5]:
outgoing_relations_query = """MATCH (:Person)-[r]->(n) RETURN distinct type(r), labels(n)"""
result = conn.query(outgoing_relations_query).data()
result

[{'type(r)': 'ACTED_IN', 'labels(n)': ['Movie']},
 {'type(r)': 'DIRECTED', 'labels(n)': ['Movie']},
 {'type(r)': 'PRODUCED', 'labels(n)': ['Movie']},
 {'type(r)': 'WROTE', 'labels(n)': ['Movie']},
 {'type(r)': 'FOLLOWS', 'labels(n)': ['Person']},
 {'type(r)': 'REVIEWED', 'labels(n)': ['Movie']}]
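
Because `.data()` returns a plain list of dicts, the rows can be reshaped with ordinary Python. The sketch below groups the rows by relationship type; `summarize_relations` is a hypothetical helper written for this example, and the sample rows are copied from the output above.

```python
from collections import defaultdict


def summarize_relations(rows):
    """Map each relationship type to the sorted labels of the nodes it points at."""
    summary = defaultdict(set)
    for row in rows:
        summary[row["type(r)"]].update(row["labels(n)"])
    return {rel_type: sorted(labels) for rel_type, labels in summary.items()}


# A few rows in the shape returned by conn.query(...).data()
rows = [
    {"type(r)": "ACTED_IN", "labels(n)": ["Movie"]},
    {"type(r)": "FOLLOWS", "labels(n)": ["Person"]},
    {"type(r)": "REVIEWED", "labels(n)": ["Movie"]},
]

print(summarize_relations(rows))
```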

### Get incoming relations of "Movie" nodes

In [6]:

query = """MATCH (:Movie)<-[r]-(n) RETURN distinct type(r), labels(n)"""
result = conn.query(query).data()
result

[{'type(r)': 'ACTED_IN', 'labels(n)': ['Person']},
 {'type(r)': 'PRODUCED', 'labels(n)': ['Person']},
 {'type(r)': 'DIRECTED', 'labels(n)': ['Person']},
 {'type(r)': 'WROTE', 'labels(n)': ['Person']},
 {'type(r)': 'REVIEWED', 'labels(n)': ['Person']}]

### Get node properties per label

In [7]:
query = """CALL db.schema.nodeTypeProperties()"""
result = conn.query(query).data()
result



[{'nodeType': ':`Movie`',
  'nodeLabels': ['Movie'],
  'propertyName': 'title',
  'propertyTypes': ['String'],
  'mandatory': True},
 {'nodeType': ':`Movie`',
  'nodeLabels': ['Movie'],
  'propertyName': 'released',
  'propertyTypes': ['Long'],
  'mandatory': True},
 {'nodeType': ':`Movie`',
  'nodeLabels': ['Movie'],
  'propertyName': 'tagline',
  'propertyTypes': ['String'],
  'mandatory': False},
 {'nodeType': ':`Person`',
  'nodeLabels': ['Person'],
  'propertyName': 'name',
  'propertyTypes': ['String'],
  'mandatory': True},
 {'nodeType': ':`Person`',
  'nodeLabels': ['Person'],
  'propertyName': 'born',
  'propertyTypes': ['Long'],
  'mandatory': False}]
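
The flat list above can be easier to read when nested by label. The following sketch folds the rows into a `label -> property -> details` dictionary; `properties_by_label` is a hypothetical helper for this example, and the sample rows are a subset of the output above.

```python
def properties_by_label(rows):
    """Nest db.schema.nodeTypeProperties() rows as label -> property -> details."""
    schema = {}
    for row in rows:
        for label in row["nodeLabels"]:
            schema.setdefault(label, {})[row["propertyName"]] = {
                "types": row["propertyTypes"],
                "mandatory": row["mandatory"],
            }
    return schema


# A subset of the rows shown above
rows = [
    {"nodeType": ":`Movie`", "nodeLabels": ["Movie"], "propertyName": "title",
     "propertyTypes": ["String"], "mandatory": True},
    {"nodeType": ":`Movie`", "nodeLabels": ["Movie"], "propertyName": "tagline",
     "propertyTypes": ["String"], "mandatory": False},
    {"nodeType": ":`Person`", "nodeLabels": ["Person"], "propertyName": "born",
     "propertyTypes": ["Long"], "mandatory": False},
]

schema = properties_by_label(rows)
print(sorted(schema["Movie"]))
```

In this output, `mandatory: False` indicates that at least one node with that label lacks the property, as is the case for `tagline` and `born` here.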

## Querying the data

### Find all the movies Tom Hanks acted in

In [8]:
query = """MATCH (n:Person {name:"Tom Hanks"})-[r:ACTED_IN]->(m:Movie)
RETURN m.title"""
result = conn.query(query).data()
result

[{'m.title': 'Apollo 13'},
 {'m.title': "You've Got Mail"},
 {'m.title': 'A League of Their Own'},
 {'m.title': 'Joe Versus the Volcano'},
 {'m.title': 'That Thing You Do'},
 {'m.title': 'The Da Vinci Code'},
 {'m.title': 'Cloud Atlas'},
 {'m.title': 'Cast Away'},
 {'m.title': 'The Green Mile'},
 {'m.title': 'Sleepless in Seattle'},
 {'m.title': 'The Polar Express'},
 {'m.title': "Charlie Wilson's War"}]
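
To reuse this query for other people, the name can be passed as a Cypher parameter (`$name`) instead of being written into the query string. The sketch below assumes that `conn.query` accepts a parameters dictionary as its second argument, as recent py2neo versions do; `movies_acted_in` is a hypothetical helper name.

```python
def movies_acted_in(conn, name):
    """Return the titles of the movies the named person acted in.

    `conn` is expected to behave like the py2neo Graph created earlier.
    """
    query = """MATCH (p:Person {name: $name})-[:ACTED_IN]->(m:Movie)
RETURN m.title AS title"""
    # Assumption: query() takes the parameters dict as its second argument.
    rows = conn.query(query, {"name": name}).data()
    return [row["title"] for row in rows]


# Usage, assuming `conn` is the Graph created earlier:
# movies_acted_in(conn, "Tom Hanks")
```

Parameters avoid quoting problems with names such as "Jerry O'Connell" and let the database reuse the query plan across calls.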

### Find all the movies Tom Hanks acted in AND directed

In [9]:
query = """MATCH (n:Person {name:"Tom Hanks"})-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(n)
RETURN m.title"""
result = conn.query(query).data()
result

[{'m.title': 'That Thing You Do'}]
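
Reusing the variable `n` at both ends of the pattern forces the actor and the director to be the same person, which amounts to intersecting two sets of movies. The sketch below mirrors that logic on a handful of in-memory triples; the helper names are hypothetical, and the triples include only relationships confirmed by the outputs above.

```python
# (start node, relationship type, end node) triples taken from the results above
relationships = [
    ("Tom Hanks", "ACTED_IN", "Cast Away"),
    ("Tom Hanks", "ACTED_IN", "Apollo 13"),
    ("Tom Hanks", "ACTED_IN", "That Thing You Do"),
    ("Tom Hanks", "DIRECTED", "That Thing You Do"),
]


def titles(person, rel_type, rels):
    """Movies reachable from `person` over relationships of type `rel_type`."""
    return {end for start, kind, end in rels if start == person and kind == rel_type}


def acted_and_directed(person, rels):
    """Movies the same person both acted in and directed."""
    return titles(person, "ACTED_IN", rels) & titles(person, "DIRECTED", rels)


print(acted_and_directed("Tom Hanks", relationships))
```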

### Find the persons who have not directed a movie

In [10]:
query = """MATCH (n:Person) WHERE NOT (n)-[:DIRECTED]->() RETURN n.name"""
result = conn.query(query).data()
result

[{'n.name': 'Keanu Reeves'},
 {'n.name': 'Carrie-Anne Moss'},
 {'n.name': 'Laurence Fishburne'},
 {'n.name': 'Hugo Weaving'},
 {'n.name': 'Joel Silver'},
 {'n.name': 'Emil Eifrem'},
 {'n.name': 'Charlize Theron'},
 {'n.name': 'Al Pacino'},
 {'n.name': 'Tom Cruise'},
 {'n.name': 'Jack Nicholson'},
 {'n.name': 'Demi Moore'},
 {'n.name': 'Kevin Bacon'},
 {'n.name': 'Kiefer Sutherland'},
 {'n.name': 'Noah Wyle'},
 {'n.name': 'Cuba Gooding Jr.'},
 {'n.name': 'Kevin Pollak'},
 {'n.name': 'J.T. Walsh'},
 {'n.name': 'Christopher Guest'},
 {'n.name': 'Aaron Sorkin'},
 {'n.name': 'Kelly McGillis'},
 {'n.name': 'Val Kilmer'},
 {'n.name': 'Anthony Edwards'},
 {'n.name': 'Tom Skerritt'},
 {'n.name': 'Meg Ryan'},
 {'n.name': 'Jim Cash'},
 {'n.name': 'Renee Zellweger'},
 {'n.name': 'Kelly Preston'},
 {'n.name': "Jerry O'Connell"},
 {'n.name': 'Jay Mohr'},
 {'n.name': 'Bonnie Hunt'},
 {'n.name': 'Regina King'},
 {'n.name': 'Jonathan Lipnicki'},
 {'n.name': 'River Phoenix'},
 {'n.name': 'Corey Feldman'},
 ...]

### Find the movies that have been reviewed

In [11]:
query = """MATCH (n:Movie) WHERE ()-[:REVIEWED]->(n) return n.title"""
result = conn.query(query).data()
result

[{'n.title': 'Jerry Maguire'},
 {'n.title': 'The Replacements'},
 {'n.title': 'The Birdcage'},
 {'n.title': 'Unforgiven'},
 {'n.title': 'Cloud Atlas'},
 {'n.title': 'The Da Vinci Code'}]