# Introduction

Nowadays, traditional databases often struggle with highly connected data, leading to slow and complex queries. Neo4j, a graph database released in 2007, solves this problem by storing data as nodes and relationships instead of tables. This approach makes it faster and easier to explore connections between entities.

Neo4j is widely used in areas like social networks, recommendation systems, and knowledge graphs—where relationships matter most. In this tutorial, we will explore its capabilities by analyzing a startup ecosystem, using Cypher queries, PageRank, and Louvain community detection to uncover key insights.

# Comparison with Relational Databases 

* Advantages of Neo4j over relational databases

* Drawbacks of Neo4j compared to relational databases

* Example Cypher query to illustrate a key difference with SQL

* Key takeaways

# Installation & configuration

## Installing Neo4J
If you don't have Docker installed, you can install it from [here](https://www.docker.com/). 

First, in the terminal, pull the Neo4J image from Docker:

`docker pull neo4j`

Now, create a Neo4J instance (thanks to the docker-compose.yml file).

`docker compose up -d`

You can now access the Neo4J browser by going to [http://localhost:7474](http://localhost:7474). We will use this to visualize the graph.

The default username is `neo4j` and the default password is `password`.



In [1]:
# Installation of Neo4j 
!pip install neo4j
!pip install yfiles_jupyter_graphs_for_neo4j



In [2]:
# Loading the libraries
from neo4j import GraphDatabase
from yfiles_jupyter_graphs_for_neo4j import Neo4jGraphWidget

# Connecting to the Neo4j database
driver = GraphDatabase.driver(uri="bolt://localhost:7687", auth=("neo4j", "password"))
session = driver.session()
g = Neo4jGraphWidget(driver)

# Dataset

First, we clear the existing database

In [3]:
session.run("""
    MATCH (n)
    DETACH DELETE n
""")

<neo4j._sync.work.result.Result at 0x21ad44efad0>

Secondly, we create two JSON datasets: one for startups and the other for investors. PS: We used chatgpt to help us generating realistic data.

In [4]:
# Here we load the dataset into the Neo4j database

# Clear existing database
session.run("""
    CREATE 
    // AI Startups (Community 1)
    (s1:Startup {name: 'OpenAI', communityId: 1, country: 'USA', technology: 'AI'}),
    (s2:Startup {name: 'DeepMind', communityId: 1, country: 'UK', technology: 'AI'}),
    (s3:Startup {name: 'Anthropic', communityId: 1, country: 'USA', technology: 'AI'}),
    (s4:Startup {name: 'Cohere', communityId: 1, country: 'Canada', technology: 'AI'}),
    (s5:Startup {name: 'Adept AI', communityId: 1, country: 'USA', technology: 'AI'}),
    (s6:Startup {name: 'Stability AI', communityId: 1, country: 'UK', technology: 'AI'}),

    // Aerospace Startups (Community 2)
    (s7:Startup {name: 'SpaceX', communityId: 2, country: 'USA', technology: 'Aerospace'}),
    (s8:Startup {name: 'Blue Origin', communityId: 2, country: 'USA', technology: 'Aerospace'}),
    (s9:Startup {name: 'Rocket Lab', communityId: 2, country: 'New Zealand', technology: 'Aerospace'}),
    (s10:Startup {name: 'Relativity Space', communityId: 2, country: 'USA', technology: 'Aerospace'}),
    (s11:Startup {name: 'Virgin Galactic', communityId: 2, country: 'USA', technology: 'Aerospace'}),
    (s12:Startup {name: 'Firefly Aerospace', communityId: 2, country: 'USA', technology: 'Aerospace'}),

    // FinTech Startups (Community 3)
    (s13:Startup {name: 'Stripe', communityId: 3, country: 'USA', technology: 'FinTech'}),
    (s14:Startup {name: 'Revolut', communityId: 3, country: 'UK', technology: 'FinTech'}),
    (s15:Startup {name: 'Klarna', communityId: 3, country: 'Sweden', technology: 'FinTech'}),
    (s16:Startup {name: 'Brex', communityId: 3, country: 'USA', technology: 'FinTech'}),
    (s17:Startup {name: 'Chime', communityId: 3, country: 'USA', technology: 'FinTech'}),
    (s18:Startup {name: 'Plaid', communityId: 3, country: 'USA', technology: 'FinTech'}),

    // Electric Vehicle Startups (Community 4)
    (s19:Startup {name: 'Tesla', communityId: 4, country: 'USA', technology: 'EV'}),
    (s20:Startup {name: 'Rivian', communityId: 4, country: 'USA', technology: 'EV'}),
    (s21:Startup {name: 'Lucid Motors', communityId: 4, country: 'USA', technology: 'EV'}),
    (s22:Startup {name: 'Nio', communityId: 4, country: 'China', technology: 'EV'}),
    (s23:Startup {name: 'Xpeng', communityId: 4, country: 'China', technology: 'EV'}),
    (s24:Startup {name: 'Fisker', communityId: 4, country: 'USA', technology: 'EV'}),

    // Blockchain Startups (Community 5)
    (s25:Startup {name: 'Binance', communityId: 5, country: 'Malta', technology: 'Blockchain'}),
    (s26:Startup {name: 'Coinbase', communityId: 5, country: 'USA', technology: 'Blockchain'}),
    (s27:Startup {name: 'Chainalysis', communityId: 5, country: 'USA', technology: 'Blockchain'}),
    (s28:Startup {name: 'Ledger', communityId: 5, country: 'France', technology: 'Blockchain'}),
    (s29:Startup {name: 'Kraken', communityId: 5, country: 'USA', technology: 'Blockchain'}),
    (s30:Startup {name: 'Uniswap', communityId: 5, country: 'Global', technology: 'Blockchain'})
""")



session.run("""
    CREATE
    (i1:Investor {name: 'Elon Musk', sector: 'AI, Aerospace, EV'}),
    (i2:Investor {name: 'Andreessen Horowitz', sector: 'FinTech, Blockchain'}),
    (i3:Investor {name: 'Sequoia Capital', sector: 'FinTech'}),
    (i4:Investor {name: 'Tim Draper', sector: 'EV, Blockchain'}),
    (i5:Investor {name: 'Binance Labs', sector: 'Blockchain'}),
    (i6:Investor {name: 'Y Combinator', sector: 'AI, FinTech'}),
    (i7:Investor {name: 'SoftBank', sector: 'AI, EV'}),
    (i8:Investor {name: 'Peter Thiel', sector: 'Aerospace, AI'}),
    (i9:Investor {name: 'Tiger Global', sector: 'FinTech'}),
    (i10:Investor {name: 'Cathie Wood', sector: 'EV, Blockchain'}),
    (i11:Investor {name: 'Lightspeed Ventures', sector: 'AI, FinTech'}),
    (i12:Investor {name: 'General Catalyst', sector: 'FinTech'}),
    (i13:Investor {name: 'Khosla Ventures', sector: 'AI, Aerospace'}),
    (i14:Investor {name: 'Founders Fund', sector: 'Aerospace, Blockchain'}),
    (i15:Investor {name: 'Coinbase Ventures', sector: 'Blockchain'}),
    (i16:Investor {name: 'Google Ventures', sector: 'AI, FinTech'}),
    (i17:Investor {name: 'Accel Partners', sector: 'FinTech'}),
    (i18:Investor {name: 'Bessemer Venture Partners', sector: 'FinTech'}),
    (i19:Investor {name: 'Benchmark', sector: 'EV, FinTech'}),
    (i20:Investor {name: 'Union Square Ventures', sector: 'Blockchain'})
""")





<neo4j._sync.work.result.Result at 0x21ad4500bd0>

Creation of the relationships between investors and startups

In [5]:
# AI Sector Investments
session.run("""
    MATCH (i1:Investor {name: 'Elon Musk'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Anthropic'}),
          (s3:Startup {name: 'Adept AI'}), (s4:Startup {name: 'DeepMind'})
    CREATE (i1)-[:INVESTS_IN]->(s1),
           (i1)-[:INVESTS_IN]->(s2),
           (i1)-[:INVESTS_IN]->(s3),
           (i1)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i2:Investor {name: 'Andreessen Horowitz'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Cohere'}),
          (s3:Startup {name: 'Hugging Face'}), (s4:Startup {name: 'Stability AI'})
    CREATE (i2)-[:INVESTS_IN]->(s1),
           (i2)-[:INVESTS_IN]->(s2),
           (i2)-[:INVESTS_IN]->(s3),
           (i2)-[:INVESTS_IN]->(s4)
""")

# Aerospace Sector Investments
session.run("""
    MATCH (i7:Investor {name: 'SoftBank'}), (s1:Startup {name: 'SpaceX'}), (s2:Startup {name: 'Blue Origin'}),
          (s3:Startup {name: 'Rocket Lab'}), (s4:Startup {name: 'Relativity Space'})
    CREATE (i7)-[:INVESTS_IN]->(s1),
           (i7)-[:INVESTS_IN]->(s2),
           (i7)-[:INVESTS_IN]->(s3),
           (i7)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i8:Investor {name: 'Peter Thiel'}), (s1:Startup {name: 'SpaceX'}), (s2:Startup {name: 'Rocket Lab'})
    CREATE (i8)-[:INVESTS_IN]->(s1),
           (i8)-[:INVESTS_IN]->(s2)
""")

session.run("""
    MATCH (i7:Investor {name: 'SoftBank'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'SpaceX'}),
          (s3:Startup {name: 'Tesla'}), (s4:Startup {name: 'Revolut'})
    CREATE (i7)-[:INVESTS_IN]->(s1),
           (i7)-[:INVESTS_IN]->(s2),
           (i7)-[:INVESTS_IN]->(s3),
           (i7)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i2:Investor {name: 'Andreessen Horowitz'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Stripe'}),
          (s3:Startup {name: 'Coinbase'}), (s4:Startup {name: 'Tesla'})
    CREATE (i2)-[:INVESTS_IN]->(s1),
           (i2)-[:INVESTS_IN]->(s2),
           (i2)-[:INVESTS_IN]->(s3),
           (i2)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i9:Investor {name: 'Tiger Global'}), (s1:Startup {name: 'Stripe'}), (s2:Startup {name: 'Binance'}),
          (s3:Startup {name: 'Tesla'}), (s4:Startup {name: 'Hugging Face'})
    CREATE (i9)-[:INVESTS_IN]->(s1),
           (i9)-[:INVESTS_IN]->(s2),
           (i9)-[:INVESTS_IN]->(s3),
           (i9)-[:INVESTS_IN]->(s4)
""")


# FinTech Sector Investments
session.run("""
    MATCH (i3:Investor {name: 'Sequoia Capital'}), (s1:Startup {name: 'Stripe'}), (s2:Startup {name: 'Revolut'}),
          (s3:Startup {name: 'Klarna'}), (s4:Startup {name: 'Brex'})
    CREATE (i3)-[:INVESTS_IN]->(s1),
           (i3)-[:INVESTS_IN]->(s2),
           (i3)-[:INVESTS_IN]->(s3),
           (i3)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i9:Investor {name: 'Tiger Global'}), (s1:Startup {name: 'Stripe'}), (s2:Startup {name: 'Klarna'}),
          (s3:Startup {name: 'Brex'})
    CREATE (i9)-[:INVESTS_IN]->(s1),
           (i9)-[:INVESTS_IN]->(s2),
           (i9)-[:INVESTS_IN]->(s3)
""")

# Electric Vehicle Sector Investments
session.run("""
    MATCH (i10:Investor {name: 'Cathie Wood'}), (s1:Startup {name: 'Tesla'}), (s2:Startup {name: 'Nio'}),
          (s3:Startup {name: 'Rivian'})
    CREATE (i10)-[:INVESTS_IN]->(s1),
           (i10)-[:INVESTS_IN]->(s2),
           (i10)-[:INVESTS_IN]->(s3)
""")

session.run("""
    MATCH (i11:Investor {name: 'Mark Cuban'}), (s1:Startup {name: 'Tesla'}), (s2:Startup {name: 'Lucid Motors'})
    CREATE (i11)-[:INVESTS_IN]->(s1),
           (i11)-[:INVESTS_IN]->(s2)
""")

# Blockchain Sector Investments
session.run("""
    MATCH (i5:Investor {name: 'Binance Labs'}), (s1:Startup {name: 'Binance'}), (s2:Startup {name: 'Ledger'})
    CREATE (i5)-[:INVESTS_IN]->(s1),
           (i5)-[:INVESTS_IN]->(s2)
""")

session.run("""
    MATCH (i12:Investor {name: 'Accel Partners'}), (s1:Startup {name: 'Chainalysis'}), (s2:Startup {name: 'Coinbase'})
    CREATE (i12)-[:INVESTS_IN]->(s1),
           (i12)-[:INVESTS_IN]->(s2)
""")



<neo4j._sync.work.result.Result at 0x21ad452d610>

Relationships startup-startup

In [6]:
session.run("""
    MATCH (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Tesla'})
    CREATE (s1)-[:COLLABORATES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'Revolut'}), (s2:Startup {name: 'Stripe'})
    CREATE (s1)-[:COLLABORATES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'Binance'}), (s2:Startup {name: 'Coinbase'})
    CREATE (s1)-[:COMPETES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'Tesla'}), (s2:Startup {name: 'Lucid Motors'})
    CREATE (s1)-[:COMPETES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'SpaceX'}), (s2:Startup {name: 'Blue Origin'})
    CREATE (s1)-[:COMPETES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'DeepMind'}), (s2:Startup {name: 'Mistral AI'})
    CREATE (s1)-[:PARTNERS_WITH]->(s2)
""")


<neo4j._sync.work.result.Result at 0x21ad44edd50>

In [7]:
# Define the Cypher query to visualize Startups and Investors

g.show_cypher("MATCH (s)-[r]->(t) RETURN s, r, t")


GraphWidget(layout=Layout(height='790px', width='100%'))

# PageRank algorithm

The PageRank algorithm ranks the nodes of a graph based on their influence. It’s a recursive algorithm where a node’s score depends on the scores of the nodes linking to it, as well as how many other nodes those linking nodes connect to. This algorithm is implemented in the "graph-data-science plugin", and we’ll break down its core functionalities.

The first step is to create an in-memory projection of the graph. The primary goal of doing this projection is to streamline the graph and isolate it from live data. This permits faster execution times as we work on a simplified version of the graph. To create such a projection, we first need to retrieve the labels of our nodes as well as the possible relationships between them.

In [8]:
labelsResponse = session.run("""CALL db.labels()""")
labels = labelsResponse.data()
print(labels)

relationshipResponse = session.run("CALL db.relationshipTypes()")
relationships = relationshipResponse.data()
print(relationships)


[{'label': 'Startup'}, {'label': 'Investor'}]
[{'relationshipType': 'INVESTS_IN'}, {'relationshipType': 'COLLABORATES_WITH'}, {'relationshipType': 'COMPETES_WITH'}]


Then, by using the labels and relationships type, we can create the projection.

In [9]:
projection_query = """
CALL gds.graph.project(
  'generalProjection', 
  ['Startup', 'Investor'], 
  {
    INVESTS_IN: {},
    COLLABORATES_WITH: {},
    COMPETES_WITH: {}
  }
)
"""
session.run(projection_query)

<neo4j._sync.work.result.Result at 0x21ad453d4d0>

Now we can run the PageRank algorithm using the stream mode. In stream mode, the algorithm computes a score for every node, which allows us to post-process the results without affecting the underlying data. Additionally, we limit the query to return only the top 10 nodes.

In [10]:
pagerankGeneralQuery = """
CALL gds.pageRank.stream('generalProjection')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
LIMIT 10
"""
pagerankGeneralRes = session.run(pagerankGeneralQuery)
for record in pagerankGeneralRes:
    print(f"Name: {record['name']}, Score: {record['score']}")

Name: Lucid Motors, Score: 0.52021484375
Name: Tesla, Score: 0.43554687500000006
Name: Coinbase, Score: 0.42731250000000004
Name: Stripe, Score: 0.424390625
Name: Blue Origin, Score: 0.37471875000000004
Name: SpaceX, Score: 0.24562500000000004
Name: OpenAI, Score: 0.22968750000000004
Name: Rocket Lab, Score: 0.22968750000000004
Name: Klarna, Score: 0.22437500000000005
Name: Brex, Score: 0.22437500000000005


Note that we can filter the type of relationship on the projection. For exemple, if we want to find the most influent node with regard to only the INVESTS_IN relationships, we can do the following projection.

In [11]:
filteredProjectionQuery = """
CALL gds.graph.project(
  'filteredProjection',
  ['Startup', 'Investor'],
  {
    INVESTS_IN: {}
  }
)"""

pagerankFilteredQuery = """
CALL gds.pageRank.stream('filteredProjection')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
LIMIT 10
"""

session.run(filteredProjectionQuery)

pagerankFilteredRes = session.run(pagerankFilteredQuery)
for record in pagerankFilteredRes:
    print(f"Name: {record['name']}, Score: {record['score']}")


Name: Stripe, Score: 0.25625000000000003
Name: SpaceX, Score: 0.24562500000000004
Name: Coinbase, Score: 0.24562500000000004
Name: Tesla, Score: 0.24031250000000004
Name: Rocket Lab, Score: 0.22968750000000004
Name: OpenAI, Score: 0.22968750000000004
Name: Brex, Score: 0.22437500000000005
Name: Klarna, Score: 0.22437500000000005
Name: Binance, Score: 0.21375000000000002
Name: Chainalysis, Score: 0.21375000000000002


In a similar fashion, we can filter nodes based on their labels. For example, if we want to focus solely on nodes with the label Startup, we can create a projection like this:

In [12]:
filteredProjectionQuery2 = """
CALL gds.graph.project(
  'filteredProjection2', 
  ['Startup'], 
  {
    INVESTS_IN: {},
    COLLABORATES_WITH: {},
    COMPETES_WITH: {}
  }
)
"""

pagerankFilteredQuery2 = """
CALL gds.pageRank.stream('filteredProjection2')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
LIMIT 10
"""

session.run(filteredProjectionQuery2)

pagerankFilteredRes2 = session.run(pagerankFilteredQuery2)
for record in pagerankFilteredRes2:
    print(f"Name: {record['name']}, Score: {record['score']}")


Name: Lucid Motors, Score: 0.385875
Name: Stripe, Score: 0.2775
Name: Blue Origin, Score: 0.2775
Name: Coinbase, Score: 0.2775
Name: Tesla, Score: 0.2775
Name: SpaceX, Score: 0.15000000000000002
Name: Stability AI, Score: 0.15000000000000002
Name: Anthropic, Score: 0.15000000000000002
Name: Relativity Space, Score: 0.15000000000000002
Name: Adept AI, Score: 0.15000000000000002


There are numerous options available to fine-tune and extend the PageRank algorithm. You can read the Neo4j documentation for more details (https://neo4j.com/docs/graph-data-science/current/algorithms/page-rank/). However, to keep this tutorial straightforward, we'll focus solely on estimating the memory cost of running the algorithm on our projections. Note that we must specify the execution mode; here, we are using stream mode.

In [13]:
estimateQuery = """
CALL gds.louvain.stream.estimate('generalProjection', {})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
"""

estimateRes = session.run(estimateQuery)
for record in estimateRes.data():
    for key, value in record.items():
        print(f"{key}: {value}")

nodeCount: 50
relationshipCount: 37
bytesMin: 8497
bytesMax: 568584
requiredMemory: [8497 Bytes ... 555 KiB]


# Louvain algorithm

The Louvain algorithm is used for community detection in graphs by maximizing a metric called modularity. It works iteratively to group nodes into communities such that nodes within the same community are densely connected, while connections between different communities are sparser. This algorithm is also implemented in the "graph-data-science plugin," and we’ll break down its core functionalities.

The Louvain algorithm also operates on an in-memory graph projection. In this tutorial, we'll use the same projection that we created for the PageRank algorithm. Like PageRank, Louvain offers various execution modes, and we'll use stream mode to stay consistent with our previous approach.

We create a query that orders the communities by the number of nodes they contain. The query also list the nodes in each community and limit the output to only the top 5 communities.

In [14]:
louvainGeneralQuery = """
CALL gds.louvain.stream('generalProjection')
YIELD nodeId, communityId
WITH communityId, 
     collect(gds.util.asNode(nodeId).name) AS nodes, 
     count(*) AS communitySize
ORDER BY communitySize DESC
LIMIT 5
RETURN communityId AS community, communitySize, nodes
"""

louvainGeneralRes = session.run(louvainGeneralQuery)
for record in louvainGeneralRes:
    print(f"community: {record['community']}, communitySize: {record['communitySize']}, nodes: {record['nodes']}")


community: 25, communitySize: 13, nodes: ['OpenAI', 'Tesla', 'Rivian', 'Lucid Motors', 'Nio', 'Binance', 'Coinbase', 'Chainalysis', 'Ledger', 'Andreessen Horowitz', 'Binance Labs', 'Cathie Wood', 'Accel Partners']
community: 15, communitySize: 6, nodes: ['Stripe', 'Revolut', 'Klarna', 'Brex', 'Sequoia Capital', 'Tiger Global']
community: 9, communitySize: 6, nodes: ['SpaceX', 'Blue Origin', 'Rocket Lab', 'Relativity Space', 'SoftBank', 'Peter Thiel']
community: 4, communitySize: 4, nodes: ['DeepMind', 'Anthropic', 'Adept AI', 'Elon Musk']
community: 3, communitySize: 1, nodes: ['Cohere']


In [15]:
louvainFilteredQuery = """
CALL gds.louvain.stream('filteredProjection')
YIELD nodeId, communityId
WITH communityId, 
     collect(gds.util.asNode(nodeId).name) AS nodes, 
     count(*) AS communitySize
ORDER BY communitySize DESC
LIMIT 5
RETURN communityId AS community, communitySize, nodes
"""

louvainFilteredRes = session.run(louvainFilteredQuery)
for record in louvainFilteredRes:
    print(f"community: {record['community']}, communitySize: {record['communitySize']}, nodes: {record['nodes']}")

community: 21, communitySize: 9, nodes: ['OpenAI', 'Tesla', 'Rivian', 'Nio', 'Coinbase', 'Chainalysis', 'Andreessen Horowitz', 'Cathie Wood', 'Accel Partners']
community: 15, communitySize: 6, nodes: ['Stripe', 'Revolut', 'Klarna', 'Brex', 'Sequoia Capital', 'Tiger Global']
community: 9, communitySize: 6, nodes: ['SpaceX', 'Blue Origin', 'Rocket Lab', 'Relativity Space', 'SoftBank', 'Peter Thiel']
community: 4, communitySize: 4, nodes: ['DeepMind', 'Anthropic', 'Adept AI', 'Elon Musk']
community: 27, communitySize: 3, nodes: ['Binance', 'Ledger', 'Binance Labs']


In [16]:
louvainFilteredQuery2 = """
CALL gds.louvain.stream('filteredProjection2')
YIELD nodeId, communityId
WITH communityId, 
     collect(gds.util.asNode(nodeId).name) AS nodes, 
     count(*) AS communitySize
ORDER BY communitySize DESC
LIMIT 5
RETURN communityId AS community, communitySize, nodes
"""

louvainFilteredRes2 = session.run(louvainFilteredQuery2)
for record in louvainFilteredRes2:
    print(f"community: {record['community']}, communitySize: {record['communitySize']}, nodes: {record['nodes']}")

community: 20, communitySize: 3, nodes: ['OpenAI', 'Tesla', 'Lucid Motors']
community: 25, communitySize: 2, nodes: ['Binance', 'Coinbase']
community: 12, communitySize: 2, nodes: ['Stripe', 'Revolut']
community: 7, communitySize: 2, nodes: ['SpaceX', 'Blue Origin']
community: 2, communitySize: 1, nodes: ['Anthropic']


There are also numerous options available for configuring the Louvain algorithm. You can find more details in the Neo4j Graph Data Science documentation at https://neo4j.com/docs/graph-data-science/current/algorithms/louvain/. The memory cost estimation for running this algorithm is similiar as how we porceeded for pageRank.

In [17]:
estimateQuery = """
CALL gds.louvain.stream.estimate('generalProjection', {})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
"""

estimateRes = session.run(estimateQuery)
for record in estimateRes.data():
    for key, value in record.items():
        print(f"{key}: {value}")

nodeCount: 50
relationshipCount: 37
bytesMin: 8497
bytesMax: 568584
requiredMemory: [8497 Bytes ... 555 KiB]


# Cross-Analyzing PageRank & Communities 

In [18]:
# Code for the cross-analyzing PageRank & Communities 

#  Real-World Use Cases 

Describe and/or cite real-world examples of how the database technology is used in different industries and applications.

# Conclusion

Conclude the tutorial with a summary of the main points and the benefits (and drawbacks) of using Neo4j for graph databases.

# References

1. [NODES 2024 – Advanced Graph Visualizations in Jupyter Notebooks](https://neo4j.com/videos/nodes-2024-advanced-graph-visualizations-in-jupyter-notebooks/)