# Graph Database Connection

The following cells show how to connect to Neo4j and query its data from Python.

**IMPORTANT NOTE**

This notebook requires access to a running Neo4j instance. To install Neo4j locally, see the Neo4j download page (https://neo4j.com/download/) or use the official Docker image (https://hub.docker.com/_/neo4j). The `graphdatascience` cells additionally require the Graph Data Science plugin to be installed on the server.

In [1]:
with open("./movieCreationQuery.txt", "rb") as fid:
    lines = fid.readlines()

In [2]:
query = " ".join([line.decode("utf-8").replace("\n", "") for line in lines])
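The two cells above read the script as bytes and join its lines into one query string. A minimal sketch of the same idea as a single helper follows; `load_cypher` and the toy script are hypothetical names introduced here for illustration, not part of the notebook. Note that it collapses every run of whitespace, including any inside string literals, so it is only suitable for scripts where that is harmless.

```python
import tempfile
from pathlib import Path


def load_cypher(path):
    """Read a Cypher script as UTF-8 text and collapse whitespace runs to single spaces."""
    text = Path(path).read_text(encoding="utf-8")
    # str.split() with no arguments splits on any whitespace, including "\r\n".
    return " ".join(text.split())


# Toy demonstration with a throwaway file (hypothetical content, Windows line ending included).
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.cypher"
    demo_file.write_bytes(b"CREATE (a:Person {name: 'Ann'})\r\nRETURN a\n")
    demo_query = load_cypher(demo_file)

print(demo_query)
```

With the notebook's file, the call would be `query = load_cypher("./movieCreationQuery.txt")`.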

In [3]:
from neo4j import GraphDatabase

In [4]:
import os
host = os.environ.get("NEO4J_HOST", "localhost")

uri = f"neo4j://{host}:7687"
driver = GraphDatabase.driver(uri, auth=("neo4j", "neo5j"))
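This notebook connects with the `neo4j://` scheme here and with `bolt://` in the `graphdatascience` section. `neo4j://` enables client-side routing, whereas `bolt://` opens a direct connection to a single server. The sketch below builds either URI and reads the credentials from the environment; the helper names and the `NEO4J_USER` and `NEO4J_PASSWORD` variables are assumptions made for this example (only `NEO4J_HOST` appears in the notebook).

```python
import os


def neo4j_uri(scheme="neo4j", port=7687, env=None):
    """Build a connection URI, taking the host from NEO4J_HOST (default: localhost)."""
    env = os.environ if env is None else env
    host = env.get("NEO4J_HOST", "localhost")
    return f"{scheme}://{host}:{port}"


def neo4j_auth(env=None):
    """Return (user, password); NEO4J_USER / NEO4J_PASSWORD are hypothetical variable names."""
    env = os.environ if env is None else env
    # Defaults mirror the credentials hard-coded in this notebook.
    return (env.get("NEO4J_USER", "neo4j"), env.get("NEO4J_PASSWORD", "neo5j"))


print(neo4j_uri())                # routing scheme, as used for GraphDatabase.driver
print(neo4j_uri(scheme="bolt"))   # direct scheme, as used for GraphDataScience
```

The driver could then be created with `GraphDatabase.driver(neo4j_uri(), auth=neo4j_auth())`.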

In [5]:
def run_query(tx, query):
    return list(tx.run(query))
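`run_query` takes a fixed query string. When a query depends on user-supplied values, passing them as Cypher parameters avoids string concatenation. The following is a minimal sketch of such a variant; `run_query_with_params` is a hypothetical helper, and the `name` property in the usage comment is an assumption about the loaded data.

```python
def run_query_with_params(tx, query, parameters=None):
    """Run a query inside a transaction, forwarding an optional parameter dict.

    Materializing the result with list() inside the transaction function keeps
    the records usable after the transaction has closed.
    """
    return list(tx.run(query, parameters or {}))


# Hypothetical usage with the 5.x driver (`read_transaction` in older versions):
# with driver.session() as session:
#     rows = session.execute_read(
#         run_query_with_params,
#         "MATCH (n) WHERE n.name = $name RETURN n",
#         {"name": "some value"},
#     )
```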

In [6]:
with driver.session() as session:
    # execute_write replaces the deprecated write_transaction in the 5.x driver
    session.execute_write(run_query, query)


### Query

In [7]:
query = "MATCH (n) RETURN count(*)"

In [8]:
with driver.session() as session:
    # execute_read replaces the deprecated read_transaction in the 5.x driver
    result = session.execute_read(run_query, query)
[r for r in result]


[<Record count(*)=171>]

### Using `graphdatascience`

In [9]:
import graphdatascience


In [10]:
from graphdatascience import GraphDataScience

import os
host = os.environ.get("NEO4J_HOST", "localhost")

uri = f"bolt://{host}:7687"
gds = GraphDataScience(uri, auth=("neo4j", "neo5j"))

In [11]:
gds.run_cypher("MATCH (n) RETURN count(*);") 

Unnamed: 0,count(*)
0,171


### Using the analytics capabilities of `graphdatascience`

In [12]:
G = gds.graph.load_cora()

In [13]:
gds.graph.list()

Unnamed: 0,degreeDistribution,graphName,database,databaseLocation,memoryUsage,sizeInBytes,nodeCount,relationshipCount,configuration,density,creationTime,modificationTime,schema,schemaWithOrientation
0,"{'min': 0, 'max': 166, 'p90': 5, 'p999': 74, '...",cora,neo4j,local,34 MiB,35685078,2708,5429,"{'readConcurrency': 4, 'undirectedRelationship...",0.000741,2025-02-23T17:46:21.127305017+00:00,2025-02-23T17:46:21.127305017+00:00,"{'graphProperties': {}, 'nodes': {'Paper': {'s...","{'graphProperties': {}, 'nodes': {'Paper': {'s..."
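The `density` value in the listing can be reproduced from `nodeCount` and `relationshipCount`. The figure shown matches the directed-graph formula m / (n (n - 1)); that this is the formula used is an inference from the numbers above, and `directed_density` is a hypothetical helper name.

```python
def directed_density(node_count, relationship_count):
    """Share of the n * (n - 1) possible directed relationships that actually exist."""
    return relationship_count / (node_count * (node_count - 1))


# Counts taken from the gds.graph.list() output above.
density = directed_density(2708, 5429)
print(f"{density:.6f}")  # prints 0.000741
```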


In [14]:
G = gds.graph.get("cora")

In [15]:
G.node_properties()

Paper    [subject, features]
dtype: object

In [16]:
gds.graph.nodeProperty.stream(gds.graph.get("cora"), node_property="subject")

Unnamed: 0,nodeId,propertyValue,nodeLabels
0,31336,0,[]
1,1061127,1,[]
2,1106406,2,[]
3,13195,2,[]
4,37879,3,[]
...,...,...,...
2703,1128975,5,[]
2704,1128977,5,[]
2705,1128978,5,[]
2706,117328,6,[]


In [17]:
pr_result = gds.pageRank.mutate(G, mutateProperty="pagerank")

In [18]:
print(f"Compute millis: {pr_result['computeMillis']}")
print(f"Node properties written: {pr_result['nodePropertiesWritten']}")
print(f"Centrality distribution: {pr_result['centralityDistribution']}")

Compute millis: 23
Node properties written: 2708
Centrality distribution: {'min': 0.14999961853027344, 'max': 3.5378417968749996, 'p90': 0.4555196762084961, 'p999': 2.6002798080444336, 'p99': 1.5071401596069336, 'p50': 0.21511173248291016, 'p75': 0.3093576431274414, 'p95': 0.6003026962280273, 'mean': 0.2869838661069884}
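The minimum score of about 0.15 in the distribution above is consistent with a non-normalized PageRank in which every node receives a base score of 1 - d, assuming a damping factor d = 0.85. The sketch below illustrates that formulation in plain Python on a three-node toy graph. It is not the GDS implementation; the function name, the fixed iteration count, and the toy edges are assumptions chosen for the example, and score from nodes without outgoing relationships is simply dropped.

```python
def pagerank(nodes, edges, damping=0.85, iterations=20):
    """Non-normalized PageRank: score(v) = (1 - d) + d * sum(score(u) / out_degree(u))."""
    out_degree = {n: 0 for n in nodes}
    for src, _ in edges:
        out_degree[src] += 1

    # Start every node at the base score.
    scores = {n: 1.0 - damping for n in nodes}
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for src, dst in edges:
            # Each node splits its current score evenly among its outgoing edges.
            incoming[dst] += scores[src] / out_degree[src]
        scores = {n: (1.0 - damping) + damping * incoming[n] for n in nodes}
    return scores


# Toy graph: "a" links to "b" and "c", "b" links to "c".
toy_scores = pagerank(["a", "b", "c"], [("a", "b"), ("a", "c"), ("b", "c")])
print(toy_scores)
```

Node `a` has no incoming relationships, so it keeps the base score 1 - d, which mirrors the `min` entry reported for Cora.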


In [19]:
G.node_properties()

Paper    [pagerank, subject, features]
dtype: object

In [20]:
gds.graph.nodeProperties.stream(G, ["pagerank"], separate_property_columns=True)
# gds.graph.nodeProperties.write(G, ["pagerank"])

Unnamed: 0,nodeId,pagerank
0,35,0.203022
1,40,0.168341
2,114,0.150000
3,117,0.150000
4,128,0.184487
...,...,...
2703,1154500,0.161591
2704,1154520,0.168214
2705,1154524,0.307409
2706,1154525,0.248215


### Delete datasets

In [21]:
gds.graph.drop("cora")

graphName                                                             cora
database                                                             neo4j
databaseLocation                                                     local
memoryUsage                                                               
sizeInBytes                                                             -1
nodeCount                                                             2708
relationshipCount                                                     5429
configuration            {'readConcurrency': 4, 'undirectedRelationship...
density                                                           0.000741
creationTime                           2025-02-23T17:46:21.127305017+00:00
modificationTime                       2025-02-23T17:46:21.352154457+00:00
schema                   {'graphProperties': {}, 'nodes': {'Paper': {'p...
schemaWithOrientation    {'graphProperties': {}, 'nodes': {'Paper': {'p...
Name: 0, dtype: object

In [22]:
with driver.session() as session:
    # DETACH DELETE also removes isolated nodes, which the (n)-[e]-() pattern would skip
    result = session.execute_write(run_query, "MATCH (n) DETACH DELETE n")

