<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>

# Lab 3.1.5
# *Neo4j and Python*

## Introduction

Neo4j is one of the most popular graph databases. Free versions include the Desktop (Developer) edition and the Community Server edition (which we can drive from Python).

We will begin this lab by working through the tutorial embedded in the Neo4j *start* page to learn about graph database structures and the Cypher query language. We will then see how to integrate a Neo4j database with a Python program.

The Community Server version can be downloaded here: https://neo4j.com/deployment-center/#releases
* Link for additional assistance: [How To Neo4j Community Install On Windows 2019 Server For External Access](https://www.youtube.com/watch?v=JaWGwnPkOYA&t=105s)

If that installation is unsuccessful, try the Desktop edition: https://neo4j.com/download/
* Link for additional assistance: [Neo4j (Graph Database) Crash Course](https://www.youtube.com/watch?v=8jNPelugC2s&t=597s)

- Go through the *Concepts* tutorial.
- At the end, click *Intro* under the *Keep getting started* heading and go through the tutorial.
- At the end, click *Cypher* under the *Keep getting started* heading and go through the tutorial.
- At the end, click *The Movie Graph* under the *Jump into code* heading and go through the tutorial.


In [None]:
#### If you are using the Community edition, create, start, and connect to the database from the Neo4j browser on localhost first; otherwise your queries won't work.

## Driving Neo4j from Python

There are a variety of Python libraries for Neo4j, some of which provide more compact (and simpler) ways of executing commands. To avoid having to learn too many different ways of doing the same thing, however, we will use the official one, which is based on the syntax of the Cypher query language.

The ***Neo4j Bolt Driver for Python*** is documented at https://neo4j.com/docs/api/python-driver/current/.

In [None]:
#!pip install neo4j

In [None]:
# To return to the Neo4j browser later, log in at this local address:
# http://localhost:7474/browser/

In [None]:
#from neo4j.v1 import GraphDatabase
from neo4j import GraphDatabase

#uri = "bolt://localhost:7687" ----> This should also work
uri = "neo4j://localhost:7687"

username = "neo4j"
password = "SuperPassword"  # replace with your own password; the default one must be changed at first login

# How to change Neo4j default password?
# After going to localhost:7474 in your browser, you will be prompted to log in.
# Use the default login info:
    # User: neo4j
    # Password: neo4j
# You’ll then be asked to change your password.

In [None]:
driver = GraphDatabase.driver(uri, auth=(username, password))

To execute a query against a database using this driver, we need to wrap the Cypher query string in a function definition and pass the function to the `execute_read` method of the `session` object (called `read_transaction` in driver versions before 5.0). Our query function then has access to the `tx` object.

Here is a function that finds all the movies that the requested `Person` acted in:

In [None]:
def print_movies_by(tx, name):
    for record in tx.run("MATCH (a:Person)-[:ACTED_IN]->(anyMovies) "
                         "WHERE a.name = $name "
                         "RETURN anyMovies", name = name):
        print(record["anyMovies"])

**Note**: For the following queries to work, the Movie Graph data must already be loaded into your Neo4j database.

Here is how to use it to list Tom Hanks' movies:

In [None]:
with driver.session() as session:
    session.execute_read(print_movies_by, "Tom Hanks")

<Node element_id='4:8ac45bf2-ab8d-4c4a-bb5b-6cb557f63360:144' labels=frozenset({'Movie'}) properties={'tagline': 'Houston, we have a problem.', 'title': 'Apollo 13', 'released': 1995}>
<Node element_id='4:8ac45bf2-ab8d-4c4a-bb5b-6cb557f63360:67' labels=frozenset({'Movie'}) properties={'tagline': 'At odds in life... in love on-line.', 'title': "You've Got Mail", 'released': 1998}>
<Node element_id='4:8ac45bf2-ab8d-4c4a-bb5b-6cb557f63360:162' labels=frozenset({'Movie'}) properties={'tagline': 'Once in a lifetime you get a chance to do something different.', 'title': 'A League of Their Own', 'released': 1992}>
<Node element_id='4:8ac45bf2-ab8d-4c4a-bb5b-6cb557f63360:78' labels=frozenset({'Movie'}) properties={'tagline': 'A story of love, lava and burning desire.', 'title': 'Joe Versus the Volcano', 'released': 1990}>
<Node element_id='4:8ac45bf2-ab8d-4c4a-bb5b-6cb557f63360:85' labels=frozenset({'Movie'}) properties={'tagline': 'In every life there comes a time when that thing you dream be

Clearly, some further wrangling is required to produce neat output. (Read the documentation before you attempt this.)

In fact, both the method of using the Neo4j Bolt Driver and the data returned by it are unwieldy. This is typical of low-level drivers.
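
One way to tame the raw `Node` output is to have the transaction function return plain dictionaries instead of printing. The sketch below is only an illustration: the helper name `movies_as_dicts` is hypothetical, and it assumes each returned node exposes its properties through `items()`, as the driver's `Node` objects do.

```python
def movies_as_dicts(tx, name):
    """Return the movies a person acted in as a list of plain dictionaries."""
    result = tx.run(
        "MATCH (a:Person)-[:ACTED_IN]->(movie:Movie) "
        "WHERE a.name = $name "
        "RETURN movie",
        name=name,
    )
    # Build the list inside the transaction function, so the result
    # stream is fully consumed before the transaction ends.
    return [dict(record["movie"].items()) for record in result]


# Usage against a live database (assumes `driver` from the cells above):
# with driver.session() as session:
#     movies = session.execute_read(movies_as_dicts, "Tom Hanks")
# for movie in sorted(movies, key=lambda m: m["released"]):
#     print(movie["released"], movie["title"])
```

Returning data rather than printing keeps the query logic separate from the presentation, so the same function can feed a table, a DataFrame, or a plain loop.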

Try building and running some more queries based on the example queries in The Movie Graph tutorial.

In [None]:
# Prints a more readable output
def print_movies_by(tx, name):
    for record in tx.run("MATCH (a:Person)-[:ACTED_IN]->(movie:Movie) "
                         "WHERE a.name = $name "
                         "RETURN movie.title AS title, movie.tagline AS tagline, movie.released AS released", name=name):
        title = record["title"]
        tagline = record["tagline"]
        released = record["released"]
        print(f"Title: {title}")
        print(f"Tagline: {tagline}")
        print(f"Released: {released}")
        print("=" * 50)

with driver.session() as session:
    session.execute_read(print_movies_by, "Tom Hanks")


Title: Apollo 13
Tagline: Houston, we have a problem.
Released: 1995
Title: You've Got Mail
Tagline: At odds in life... in love on-line.
Released: 1998
Title: A League of Their Own
Tagline: Once in a lifetime you get a chance to do something different.
Released: 1992
Title: Joe Versus the Volcano
Tagline: A story of love, lava and burning desire.
Released: 1990
Title: That Thing You Do
Tagline: In every life there comes a time when that thing you dream becomes that thing you do
Released: 1996
Title: The Da Vinci Code
Tagline: Break The Codes
Released: 2006
Title: Cloud Atlas
Tagline: Everything is connected
Released: 2012
Title: Cast Away
Tagline: At the edge of the world, his journey begins.
Released: 2000
Title: The Green Mile
Tagline: Walk a mile you'll never forget.
Released: 1999
Title: Sleepless in Seattle
Tagline: What if someone you never met, someone you never saw, someone you never knew was the only someone for you?
Released: 1993
Title: The Polar Express
Tagline: This Holida

## 1. Find actors who have acted in more than a certain number of movies:

In [None]:
def find_actors_with_movies(tx, threshold):
    query = (
        "MATCH (a:Person)-[:ACTED_IN]->(m:Movie) "
        "WITH a, count(m) AS movieCount "
        "WHERE movieCount > $threshold "
        "RETURN a.name AS actorName, movieCount"
    )
    result = tx.run(query, threshold=threshold)
    for record in result:
        print(record["actorName"], "acted in", record["movieCount"], "movies")


In [None]:
# Usage:
# The driver created earlier is reused here; uncomment only if you need a new connection.
# driver = GraphDatabase.driver("bolt://localhost:7687", auth=(username, password))
with driver.session() as session:
    session.execute_read(find_actors_with_movies, threshold=5)

Keanu Reeves acted in 7 movies
Tom Hanks acted in 12 movies


## 2. Identify co-actors of a specific actor

In [None]:
def find_co_actors(tx, actor_name):
    query = (
        "MATCH (a1:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(a2:Person) "
        "WHERE a1.name = $actorName "
        "RETURN a2.name AS coActorName, collect(m.title) AS movies"
    )
    result = tx.run(query, actorName=actor_name)
    data = [(record["coActorName"], ", ".join(record["movies"])) for record in result]
    return data

In [None]:
from tabulate import tabulate
# Usage:
# driver = GraphDatabase.driver("bolt://localhost:7687", auth=(username, password))
with driver.session() as session:
    data = session.execute_read(???)

print(tabulate(data, headers=["Co-Actor Name", "Movies"]))


Co-Actor Name           Movies
----------------------  -------------------------------------------------------------
Ed Harris               Apollo 13
Gary Sinise             Apollo 13, The Green Mile
Kevin Bacon             Apollo 13
Bill Paxton             Apollo 13, A League of Their Own
Parker Posey            You've Got Mail
Greg Kinnear            You've Got Mail
Meg Ryan                You've Got Mail, Joe Versus the Volcano, Sleepless in Seattle
Steve Zahn              You've Got Mail
Dave Chappelle          You've Got Mail
Madonna                 A League of Their Own
Rosie O'Donnell         A League of Their Own, Sleepless in Seattle
Geena Davis             A League of Their Own
Lori Petty              A League of Their Own
Nathan Lane             Joe Versus the Volcano
Liv Tyler               That Thing You Do
Charlize Theron         That Thing You Do
Ian McKellen            The Da Vinci Code
Audrey Tautou           The Da Vinci Code
Paul Bettany            The Da Vinci Code

## 3. Find movies released after a certain year:

In [None]:
def find_movies_after_year(tx, year):
    query = (
        "MATCH (m:Movie) "
        "WHERE m.released > $year "
        "RETURN m.title AS movieTitle, m.released AS releaseYear"
    )
    result = tx.run(query, year=year)
    for record in result:
        print(record["movieTitle"], "was released in", record["releaseYear"])


In [None]:
# Usage:
#driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    ???

The Matrix Reloaded was released in 2003
The Matrix Revolutions was released in 2003
Something's Gotta Give was released in 2003
The Polar Express was released in 2004
RescueDawn was released in 2006
The Da Vinci Code was released in 2006
V for Vendetta was released in 2006
Charlie Wilson's War was released in 2007
Speed Racer was released in 2008
Frost/Nixon was released in 2008
Ninja Assassin was released in 2009
Cloud Atlas was released in 2012


## 4. Find movies in a particular genre:

In [None]:
def find_movies_in_genre(tx, genre):
    query = (
        "MATCH ???"
        "WHERE g.name = $genre "
        "RETURN m.title AS movieTitle"
    )
    result = tx.run(query, genre=genre)
    for record in result:
        print(record["movieTitle"], "is in the", genre, "genre")

In [None]:
with driver.session() as session:
    session.execute_read(find_movies_in_genre, genre="Action")
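
A genre query prints nothing if the database holds no genre nodes, and the stock Movie Graph may not define any. The following sketch lists the labels that actually exist, using the built-in `db.labels()` procedure. The helper names `list_labels` and `has_label` are hypothetical, and the label name `Genre` in the usage comment is only an assumption about how genre data would be modelled.

```python
def list_labels(tx):
    """Return the node labels defined in the current database, sorted."""
    result = tx.run("CALL db.labels() YIELD label RETURN label")
    return sorted(record["label"] for record in result)


def has_label(tx, label):
    """Report whether the given label exists in the database."""
    return label in list_labels(tx)


# Usage against a live database (assumes `driver` from the cells above):
# with driver.session() as session:
#     print(session.execute_read(list_labels))
#     print(session.execute_read(has_label, "Genre"))  # "Genre" is an assumed label name
```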

## 5. Find movies directed by a specific director:

In [None]:
def find_movies_by_director(tx, director):
    query = (
        "MATCH ???"
        "WHERE d.name = $director "
        "RETURN m.title AS movieTitle"
    )
    result = tx.run(query, director=director)
    for record in result:
        print(record["movieTitle"], "was directed by", director)

In [None]:
# Usage:
#driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.execute_read(find_movies_by_director, director="Steven Spielberg")

## 6. Find movies where a specific actor both acted in and directed:

In [None]:
def find_movies_actor_directed(tx, actor_name):
    query = (
        "MATCH ???"
        "WHERE a.name = $actorName "
        "RETURN m.title AS movieTitle"
    )
    result = tx.run(query, actorName=actor_name)
    for record in result:
        print(actor_name, "acted in and directed", record["movieTitle"])

In [None]:
# Usage:
#driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.execute_read(find_movies_actor_directed, actor_name="Tom Hanks")

Tom Hanks acted in and directed That Thing You Do


## 7. Find actors who have worked in a specific genre:

In [None]:
def find_actors_in_genre(tx, genre):
    query = (
        "MATCH ???"
        "WHERE g.name = $genre "
        "RETURN DISTINCT a.name AS actorName"
    )
    result = tx.run(query, genre=genre)
    for record in result:
        print(record["actorName"], "has worked in the", genre, "genre")


In [None]:
# Usage:
#driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.execute_read(find_actors_in_genre, genre="Comedy")


In [None]:
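session.close()  # harmless here: the last `with` block has already closed this session
driver.close()   # releases the connections held by the driver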
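
The session handling repeated throughout this lab could also be folded into one small helper. This is a minimal sketch, not part of the driver: `run_read` is a hypothetical name, and it relies only on `driver.session()` and `session.execute_read()` as used in the cells above.

```python
def run_read(driver, work, *args, **kwargs):
    """Open a session, run `work` in a read transaction, then close the session."""
    with driver.session() as session:
        # Extra positional and keyword arguments are forwarded to `work`
        # after the transaction object.
        return session.execute_read(work, *args, **kwargs)


# Usage against a live database (assumes the names defined in the cells above).
# The driver itself can be used as a context manager, which closes it on exit:
# with GraphDatabase.driver(uri, auth=(username, password)) as driver:
#     co_actors = run_read(driver, find_co_actors, "Tom Hanks")
```

With this pattern the explicit `close()` calls become unnecessary, because both context managers clean up when their blocks end.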

### Additional Reference
* [Build applications with Neo4j and Python](https://neo4j.com/docs/python-manual/current/)

## - END -



---

© 2024 Institute of Data

---