<h1 style="text-align: center;">Neo4j - Graph Database</h1>

___

## Overview

### Properties / features

Neo4j is a graph database: data is stored as nodes and relationships. Nodes hold the basic data items, and relationships connect them. So instead of tables with rows and columns, Neo4j uses a graph. Relational databases work best for tabular data with few interconnections, whereas graph databases such as Neo4j are designed for highly connected data.
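
To make the node/relationship model concrete, here is a minimal in-memory sketch in plain Python. The `Node` and `Relationship` classes are hypothetical illustrations, not part of the Neo4j driver, and the sample data is made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node: a set of labels plus key/value properties."""
    labels: frozenset
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Two nodes and one relationship, i.e. (alice)-[:KNOWS]->(bob)
alice = Node(frozenset({"Person"}), {"name": "Alice"})
bob = Node(frozenset({"Person"}), {"name": "Bob"})
knows = Relationship(alice, "KNOWS", bob, {"since": 2020})

# Following a relationship is a direct reference lookup, not a table join.
print(knows.start.properties["name"], knows.type, knows.end.properties["name"])
```

In a relational schema the same information would typically need a person table plus a separate join table for the `KNOWS` connections.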

### Typical application areas

* Real-time recommendations
* Fraud detection
* Knowledge Management / Graph-Based Search
* Master data management
* Logistics
* Social Networks

### Advantages & disadvantages

#### Advantages

* **Simplicity:** The graph representation makes Neo4j descriptive and intuitive. A graph drawn on a whiteboard can usually be transferred 1:1 into the database. The query language Cypher stays readable thanks to its ASCII-art style, in which a pattern such as `(a)-[:KNOWS]->(b)` looks like the graph it describes.

* **Performance:** Both simple and complex queries can typically be answered in real time. Query time is proportional to the size of the result set rather than to the total amount of data, so the same query stays similarly fast as the database grows. This gives consistently fast queries, avoids expensive JOIN operations, and means that only part of the data has to be processed.

* **Agility:** New structures can be easily integrated into the graph and existing data can be extended without affecting existing applications. For example, it is possible to add labels and properties to a node or to remove them. With restrictions, this is also possible for relationships: properties can be added or removed, but a relationship always has exactly one type, which cannot be changed afterwards.
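
The performance point can be illustrated with a small sketch that contrasts scanning a join-style table with following adjacency lists. It is plain Python with made-up data and only an analogy; it does not show how Neo4j's storage engine is actually implemented.

```python
from collections import defaultdict

# Relationship data as (person, friend) rows, like a relational join table.
rows = [("franz", "hans"), ("dean", "sam"), ("dean", "john")]
# Pad the table with unrelated rows to mimic a large data set.
rows += [(f"user{i}", f"user{i + 1}") for i in range(10_000)]


def friends_by_scan(person):
    """Look at every row: the work grows with the size of the table."""
    return [friend for source, friend in rows if source == person]


# Graph-style storage: every node keeps direct references to its neighbours.
adjacency = defaultdict(list)
for source, friend in rows:
    adjacency[source].append(friend)


def friends_by_traversal(person):
    """Touch only the neighbours of one node: the work grows with the result."""
    return adjacency.get(person, [])


print(friends_by_scan("dean"), friends_by_traversal("dean"))
```

Both functions return the same friends, but the traversal version never looks at the padded rows.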

#### Disadvantages

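* **Scalability**: Neo4j scales out less easily than many other databases, since it was originally designed around a single-server architecture and highly connected graphs are hard to partition across machines (a challenge shared by graph databases in general).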

* **Language**: There is no uniform query language across graph databases. Neo4j uses Cypher, while other systems use languages such as Gremlin or SPARQL.

___

## Official documentation

### Website & accounts

* Official Neo4j website: https://neo4j.com/
* Official Neo4j GitHub account: https://github.com/neo4j
* Official Neo4j social media accounts: 
    * YouTube: https://www.youtube.com/channel/UCvze3hU6OZBkB1vkhH2lH9Q
    * Twitter: https://twitter.com/neo4j
    * Discord: https://discord.com/invite/neo4j

### General documentation

Official Neo4j documentation: https://neo4j.com/docs/getting-started/current/

### Download site

Official Neo4j Download site (for Neo4j Desktop): https://neo4j.com/download/?ref=get-started-dropdown-cta

___

## Installation

Neo4j can be downloaded and run locally as Neo4j Desktop (see the download link above) or used as a hosted cloud database. The examples below connect to a local instance from Python.

In [147]:
# install the neo4j Python driver (uncomment on first use) and import the driver factory

# !pip install neo4j
from neo4j import GraphDatabase

In [148]:
# Use the bolt protocol. With bolt:// the driver opens a single point-to-point
# connection to one server instance, whereas neo4j:// may route requests to
# different members of a cluster.
# https://neo4j.com/docs/driver-manual/4.0/client-applications/#driver-configuration-examples

url = "bolt://localhost:7687"
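driver = GraphDatabase.driver(
    uri=url,
    auth=("neo4j", "password"),  # user name and password of the local instance
    database="linked",           # name of the database to query
    encrypted=False,             # no TLS, acceptable for a local test instance
)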
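
The difference between the two URI schemes can be made explicit with a small helper. This is a sketch in plain Python; `connection_kind` is a hypothetical name and not part of the driver. The scheme names themselves are the ones accepted by the 4.x and later drivers.

```python
from urllib.parse import urlparse

# The "+s" and "+ssc" suffixes enable encryption
# (CA-signed and self-signed certificates respectively).
DIRECT_SCHEMES = {"bolt", "bolt+s", "bolt+ssc"}
ROUTING_SCHEMES = {"neo4j", "neo4j+s", "neo4j+ssc"}


def connection_kind(uri):
    """Return 'direct' or 'routing' for a Neo4j connection URI."""
    scheme = urlparse(uri).scheme
    if scheme in DIRECT_SCHEMES:
        return "direct"
    if scheme in ROUTING_SCHEMES:
        return "routing"
    raise ValueError(f"unsupported URI scheme: {scheme!r}")


print(connection_kind("bolt://localhost:7687"))
print(connection_kind("neo4j://localhost:7687"))
```

For a single local instance, as in this notebook, a direct `bolt://` connection is sufficient.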

In [149]:
# confirm creation
driver

<neo4j._sync.driver.BoltDriver at 0x227eedd4eb0>

In [150]:
# create session
session = driver.session()
session

<neo4j._sync.work.session.Session at 0x227eedd4460>
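
Sessions hold server resources, so it can help to open them only for as long as a query runs. The helper below is a sketch, not part of the driver: `run_query` is a hypothetical name, and it assumes a `driver` object like the one created above (any object with a compatible `session()` method works). It also passes values as query parameters rather than embedding them in the Cypher string.

```python
def run_query(driver, query, **parameters):
    """Run one Cypher query and return its records as plain dicts.

    `driver` is assumed to behave like the object returned by
    neo4j.GraphDatabase.driver(): driver.session() is a context manager
    and session.run() yields records that offer a .data() method.
    """
    with driver.session() as session:
        result = session.run(query, parameters)
        # Consume the result inside the block, before the session closes.
        return [record.data() for record in result]


# Usage with a real driver (requires a running database):
# rows = run_query(driver, "MATCH (p:Person {name: $name}) RETURN p", name="Sam")
```

When all work is done, `driver.close()` releases the underlying connections.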

___

## Node creation

Graph databases consist of two main components: <strong>nodes</strong> and <strong>relationships</strong> that connect nodes. Nodes can have properties and labels. Labels can be seen as custom node types. For example, when modelling a movie database, you would have nodes labelled <i>actor</i>, <i>movie</i> and <i>director</i>.

In [151]:
# create nodes of type "Person". Some people have additional roles
# such as recruiters or premium users. Each user has a name
create_query = """
create (franz:Person:Recruiter {name: "Franz"})
create (hans:Person:Premium {name: "Hans"})
create (john:Person {name:"John"})
create (dean:Person:Recruiter {name:"Dean"})
create (sam:Person {name:"Sam"})
"""

# create nodes of type "Company". Each company has a name, a founding year and an industry
create_query += """
create (microsoft:Company {name: "Microsoft Corporation", established: 1975, industry: "Technology & Innovation"})
create (apple:Company {name: "Apple", established: 1976, industry: "Technology & Innovation"})
create (google:Company {name: "Google LLC", established: 1998, industry: "Technology & Innovation"})
create (voestalpine:Company {name: "Vöstalpine AG", established: 1938, industry: "Steel / metal industry"})
"""

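String concatenation works for a fixed data set, but values that come from user input are usually better passed as query parameters. The following sketch builds a query and a parameter dictionary for a batch of `Person` nodes. `build_person_batch` is a hypothetical helper name, the sketch only covers nodes with the plain `Person` label, and the result would be passed to `session.run(query, parameters)`.

```python
def build_person_batch(names):
    """Return (query, parameters) creating one Person node per name."""
    query = (
        "UNWIND $people AS person "
        "CREATE (p:Person) "
        "SET p = person "
        "RETURN p.name AS name"
    )
    parameters = {"people": [{"name": name} for name in names]}
    return query, parameters


query, parameters = build_person_batch(["Franz", "Hans", "John"])
print(query)
print(parameters)
```

Because the names travel separately from the query text, quotes or other special characters in a name cannot change the meaning of the statement.
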
___

## Relationship creation

Relationships are connections between nodes that can contain additional information.

In [152]:
# create "worked_at" relationships between nodes. Each node was given a variable name above,
# and all statements are sent as one query, so the nodes can be referenced through those variables.
create_query += """
create (franz) -[:worked_at {
    from: date("2018-01-01"), until: date("2020-03-31"), position: "Recruiting Officer"
}]-> (google)
create (franz) -[:worked_at {
    from: date("2020-09-01"), until: null, position: "Recruiting Manager"
}] -> (microsoft)
create (hans) - [:worked_at {
    from: date("2022-01-01"), until: null, position: "Software Developer"
}] -> (apple)
create (john) - [:worked_at {
    from: date("2020-05-01"), until: date("2023-01-31"), position: "Scrum Master"
}] -> (voestalpine)
create (john) - [:worked_at {
    from: date("2018-01-01"), until: date("2020-03-31"), position: "Project Manager"
}] -> (microsoft)
create (dean) - [:worked_at {
    from: date("2022-01-01"), until: date("2022-07-31"), position: "Recruiting Greenhorn"
}] -> (microsoft)
create (dean) - [:worked_at {
    from: date("2022-08-01"), until: date("2023-12-31"), position: "Recruiting"
}] -> (microsoft)
create (dean) - [:worked_at {
    from: date("2023-01-01"), until: null, position: "Recruiting Officer"
}] -> (google)
create (sam) - [:worked_at {
    from: date("2022-08-01"), until: null, position: "Data Scientist"
}] -> (apple)
"""

# create "is_friends_with" relationships between "person"-nodes
create_query += """
create (franz) - [:is_friends_with] -> (hans)
create (franz) <- [:is_friends_with] - (hans)

create (dean) - [:is_friends_with] -> (sam)
create (dean) <- [:is_friends_with] - (sam)

create (dean) - [:is_friends_with] -> (john)
create (dean) <- [:is_friends_with] - (john)
"""

# Use the run() method to execute the Cypher statement
session.run(create_query)

<neo4j._sync.work.result.Result at 0x227eeda1d60>
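
Relationships in Neo4j are always stored with a direction, but a `match` pattern may leave the direction out, e.g. `(a)-[:is_friends_with]-(b)`, in which case it matches either way. The plain-Python sketch below mimics that behaviour on a list of directed edges to show that a single stored edge per friendship can be enough. The data structure and function name are illustrative assumptions, not driver API.

```python
# Directed edges as (start, type, end); one edge per friendship.
edges = [
    ("Franz", "is_friends_with", "Hans"),
    ("Dean", "is_friends_with", "Sam"),
    ("Dean", "is_friends_with", "John"),
]


def friends_of(name):
    """Mimic an undirected match: follow edges in either direction."""
    found = set()
    for start, rel_type, end in edges:
        if rel_type != "is_friends_with":
            continue
        if start == name:
            found.add(end)
        elif end == name:
            found.add(start)
    return sorted(found)


print(friends_of("Sam"))
print(friends_of("Dean"))
```

Storing both directions, as done in this notebook, is still valid; it simply allows directed patterns to find the friendship from either side.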

___

## Selecting all nodes

In [173]:
# match is the query clause; the returned records are stored in the "nodes" variable
select_query = "MATCH (n) RETURN n"
nodes = session.run(select_query)

# Use a for loop to print each record
for n in nodes:
    print(n.data())

# The graph can also be visualised in the Neo4j Browser at http://localhost:7474
# by running: match (n) return n

{'n': {'name': 'John'}}
{'n': {'name': 'Dean'}}
{'n': {'name': 'Sam'}}
{'n': {'established': 1975, 'name': 'Microsoft Corporation', 'industry': 'Technology & Innovation'}}
{'n': {'established': 1976, 'name': 'Apple Inc.', 'industry': 'Technology & Innovation'}}
{'n': {'established': 1998, 'name': 'Google LLC', 'industry': 'Technology & Innovation'}}
{'n': {'established': 1938, 'name': 'Vöstalpine AG', 'industry': 'Steel / metal industry'}}
{'n': {'name': 'Franz'}}
{'n': {'name': 'Hans'}}
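
Note that `data()` returns only the properties, so the labels (`Person`, `Recruiter`, ...) are missing from the printed output. A sketch of a helper that keeps them is shown below. It assumes the value passed in behaves like the driver's node objects, i.e. it exposes a `labels` attribute and can be converted with `dict()`. `describe_node` is a hypothetical name.

```python
def describe_node(node):
    """Return the labels and properties of a node as a plain dict."""
    return {
        "labels": sorted(node.labels),  # labels come as an unordered set
        "properties": dict(node),       # nodes behave like read-only mappings
    }


# Usage with a real session (requires a running database):
# for record in session.run("MATCH (n) RETURN n"):
#     print(describe_node(record["n"]))
```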


![](first_creation.png)

___

## Selects
### Selecting nodes and creating a new relationship

In [155]:
create_query = """
match (sam:Person {name: "Sam"}), (franz:Person {name: "Franz"})
create (sam) -[:is_friends_with]->(franz)
create (sam) <-[:is_friends_with]-(franz)
"""

session.run(create_query)

<neo4j._sync.work.result.Result at 0x227eede2d30>

![](add_sam_franz_relationship.png)

### Updating information

In [158]:
# a "where" clause filters the matched nodes (regular expressions can be used with the =~ operator)
update_query = """
match (apple:Company)
where apple.name =~ "App.+"
set apple.name = "Apple Inc."
return apple
"""

nodes = session.run(update_query)

for n in nodes:
    print(n.data())

{'apple': {'established': 1976, 'name': 'Apple Inc.', 'industry': 'Technology & Innovation'}}
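
The `=~` operator has to match the whole string, not just a part of it, which is why the pattern above ends in `.+`. In Python terms this corresponds to `re.fullmatch` rather than `re.search`. The snippet below is only an analogy in Python's regex dialect; Cypher uses Java regular expressions, which agree with Python for a simple pattern like this one. The list of names is made up for illustration.

```python
import re

pattern = "App.+"
names = ["Apple", "Apple Inc.", "App", "WebApps"]

# fullmatch mirrors the whole-string semantics of Cypher's =~ operator.
whole_string = [name for name in names if re.fullmatch(pattern, name)]
# search would also accept names that merely contain a match.
substring = [name for name in names if re.search(pattern, name)]

print(whole_string)
print(substring)
```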


### Select previous workplaces of a certain person not including the current one

In [159]:
select_query = """
match (dean:Person {name: "Dean"}) -[worked:worked_at]-> (company:Company)
where worked.until is not null
return distinct company
"""

nodes = session.run(select_query)

for n in nodes:
    print(n.data())

{'company': {'established': 1975, 'name': 'Microsoft Corporation', 'industry': 'Technology & Innovation'}}


### Get all friends of Franz

In [145]:
select_query = """
match (franz:Person {name: "Franz"}) -[:is_friends_with]-> (person2:Person)
return franz, person2
"""

nodes = session.run(select_query)

for n in nodes:
    print(n.data())

{'franz': {'name': 'Franz'}, 'person2': {'name': 'Sam'}}
{'franz': {'name': 'Franz'}, 'person2': {'name': 'Hans'}}


### Get all relationships of Sam

In [186]:
select_query = """
match relationship = (:Person {name: "Sam"}) -[:is_friends_with|worked_at]-> (n)
return relationship
"""

nodes = session.run(select_query)

for n in nodes:
    print(n.data())

{'relationship': [{'name': 'Sam'}, 'is_friends_with', {'name': 'Franz'}]}
{'relationship': [{'name': 'Sam'}, 'is_friends_with', {'name': 'Dean'}]}
{'relationship': [{'name': 'Sam'}, 'worked_at', {'established': 1976, 'name': 'Apple Inc.', 'industry': 'Technology & Innovation'}]}


![](all_relationships_of_sam.png)

### Get all job titles of recruiters ordered by starting date

Here we can see that the recruiters' positions changed for the better over time, i.e. they were promoted.

In [198]:
select_query = """
match (recruiter:Recruiter) -[worked:worked_at]-> (company:Company)
return worked.position, recruiter.name
order by recruiter.name, worked.from
"""

nodes = session.run(select_query)

for n in nodes:
    print(n.data())


{'worked.position': 'Recruiting Greenhorn', 'recruiter.name': 'Dean'}
{'worked.position': 'Recruiting', 'recruiter.name': 'Dean'}
{'worked.position': 'Recruiting Officer', 'recruiter.name': 'Dean'}
{'worked.position': 'Recruiting Officer', 'recruiter.name': 'Franz'}
{'worked.position': 'Recruiting Manager', 'recruiter.name': 'Franz'}
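
Because the rows are already ordered by recruiter and start date, the career path of each recruiter can be assembled on the Python side with `itertools.groupby`. The sketch below uses the rows printed above as literal data, so it runs without a database connection.

```python
from itertools import groupby

rows = [
    {"worked.position": "Recruiting Greenhorn", "recruiter.name": "Dean"},
    {"worked.position": "Recruiting", "recruiter.name": "Dean"},
    {"worked.position": "Recruiting Officer", "recruiter.name": "Dean"},
    {"worked.position": "Recruiting Officer", "recruiter.name": "Franz"},
    {"worked.position": "Recruiting Manager", "recruiter.name": "Franz"},
]

# groupby only merges adjacent rows, which is fine here because the
# query already sorts by recruiter name.
careers = {
    name: [row["worked.position"] for row in group]
    for name, group in groupby(rows, key=lambda row: row["recruiter.name"])
}

for name, positions in careers.items():
    print(name, "->", " > ".join(positions))
```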


### Get possible friend recommendations for users that worked_at the same company and aren't friends

In [196]:

select_query = """
match (person:Person) -[:worked_at]->(company:Company)
match (person2:Person) -[:worked_at]->(company)
where person <> person2
    and not (person) -[:is_friends_with]-> (person2)
    and not (person2) -[:is_friends_with]-> (person)
return distinct person, person2
order by person, person2
"""

nodes = session.run(select_query)

for n in nodes:
    print(n.data())

{'person': {'name': 'John'}, 'person2': {'name': 'Franz'}}
{'person': {'name': 'Dean'}, 'person2': {'name': 'Franz'}}
{'person': {'name': 'Sam'}, 'person2': {'name': 'Hans'}}
{'person': {'name': 'Franz'}, 'person2': {'name': 'John'}}
{'person': {'name': 'Franz'}, 'person2': {'name': 'Dean'}}
{'person': {'name': 'Hans'}, 'person2': {'name': 'Sam'}}
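
Each recommendation appears twice in the output, once per direction. If one row per pair is enough, the duplicates can be removed in Python by treating each pair as an unordered set. The sketch below uses the names from the output above as literal data and assumes that names are unique.

```python
rows = [
    ("John", "Franz"),
    ("Dean", "Franz"),
    ("Sam", "Hans"),
    ("Franz", "John"),
    ("Franz", "Dean"),
    ("Hans", "Sam"),
]

seen = set()
unique_pairs = []
for person, person2 in rows:
    pair = frozenset((person, person2))  # order-independent key
    if pair not in seen:
        seen.add(pair)
        unique_pairs.append((person, person2))

print(unique_pairs)
```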


___

## Delete all nodes including their relationships

In [146]:
#Delete all nodes and relationships
delete_query = """
match (n) detach delete n
"""

session.run(delete_query)

<neo4j._sync.work.result.Result at 0x227eedd4b20>