# Introduction

Traditional relational databases often struggle with highly connected data, because following many relationships leads to slow and complex queries. Neo4j, a graph database released in 2007, addresses this problem by storing data as nodes and relationships instead of tables. This approach makes it faster and easier to explore connections between entities.

Neo4j is widely used in areas like social networks, recommendation systems, and knowledge graphs, where relationships matter most. In this tutorial, we will explore its capabilities by analyzing a startup ecosystem: loading data from a JSON file, querying it with Cypher, and applying PageRank and Louvain community detection to uncover key insights.

# Comparison with Relational Databases 

To compare Neo4j and Cypher with traditional relational databases, we will explore the advantages and drawbacks of using a graph-based approach over a tabular one.

### Advantages of Neo4j and Cypher over relational SQL
One of the main advantages of Neo4j and its query language, Cypher, is that data is stored in a graph structure composed of nodes and relationships rather than in tables. This allows for the efficient traversal of relationships between entities. In Neo4j, navigating from one node to another via a relationship can be done in constant time, regardless of the graph's size.

In contrast, relational databases require joins between tables to establish relationships, which can become computationally expensive, especially as the number of joins increases or when dealing with deep or complex relationships.

This is particularly true for relationship-heavy queries, where SQL databases often need recursive queries or iterative joins to explore multi-level relationships. Neo4j, by comparison, leverages its graph-native structure so that query time grows with the number of relationships actually traversed rather than with the overall size of the dataset.
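The difference between the two access patterns can be illustrated with a small, self-contained Python sketch. It is only an analogy, not how Neo4j or a SQL engine is implemented (a relational engine would normally use an index rather than a full scan), and the names `edges`, `adjacency`, `neighbors_via_scan`, `neighbors_via_adjacency`, and `reachable_within` are invented for this illustration.

```python
from collections import defaultdict

# A tiny "Friendship" table: one (person, friend) row per relationship.
edges = [
    ("Alice", "Bob"),
    ("Bob", "Carol"),
    ("Carol", "Dave"),
    ("Alice", "Eve"),
]


def neighbors_via_scan(person):
    """Table-style lookup: inspect every row to find the matching ones."""
    return [dst for src, dst in edges if src == person]


# Graph-style storage: every node keeps direct references to its neighbors.
adjacency = defaultdict(list)
for src, dst in edges:
    adjacency[src].append(dst)


def neighbors_via_adjacency(person):
    """Adjacency lookup: the cost depends on the node's degree, not on len(edges)."""
    return adjacency.get(person, [])


def reachable_within(start, hops):
    """Collect the nodes reachable from `start` in at most `hops` steps."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        # Expand only the nodes discovered in the previous step.
        frontier = {n for node in frontier for n in neighbors_via_adjacency(node)} - seen
        seen |= frontier
    return seen - {start}


print(sorted(reachable_within("Alice", 2)))
```

Each hop only touches the neighbors of the nodes already reached, which is the intuition behind multi-hop traversals staying cheap in a graph store.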

Another key advantage of Neo4j and Cypher is the **simplicity** and **readability** of queries when working with graph data. 
For example, if we want to find the names of people who are friends with someone named Alice, we can write the following Cypher query:

```cypher
MATCH (p:Person)-[:FRIEND_OF]->(:Person {name: 'Alice'})
RETURN p.name
```

In relational SQL, achieving the same result would require:

```sql
SELECT p1.name
FROM Person p1
JOIN Friendship f ON p1.id = f.person_id
JOIN Person p2 ON f.friend_id = p2.id
WHERE p2.name = 'Alice';
```

As shown above, relationship queries are significantly more intuitive in Cypher. 
The graph pattern-matching syntax closely reflects the actual structure of the data, making queries easier to write, read, and reason about.

The third advantage of Neo4j is its **schema-free data model**. In Neo4j, we are not required to define all possible relationships or properties for each node type in advance. This allows the data model to handle irregular or evolving data naturally, offering greater flexibility as your application and data grow over time.

In contrast, SQL databases rely on rigid schemas, where structural modifications often require **schema migrations**, making them less adaptable to dynamic or semi-structured data.

Finally, as you will see throughout this tutorial, Neo4j is both well-suited and optimized for implementing graph algorithms. By using in-memory graph projections, we can efficiently run algorithms to uncover patterns such as communities, influence, shortest paths, and more. This makes Neo4j not just a data store, but a powerful analytical tool.

### Drawbacks of Neo4j and Cypher compared with relational SQL
Interestingly, some of Neo4j’s strengths can also become limitations depending on the use case. 

1. **Transactional Workloads**

    If your data is **highly structured** and the focus is on **transactions** rather than relationships, Neo4j may be less efficient than relational databases. Traditional SQL databases typically perform better for workflows centered around aggregations, batch updates, or structured reporting.

2. **Real-Time Data Challenges**

    Neo4j is not ideal for real-time analysis on rapidly changing data. As you will see in this tutorial, running graph algorithms often requires creating **in-memory projections**, which work best with relatively stable snapshots of the data rather than constantly updating streams.

3. **Higher Memory Overhead**

    Finally, graph databases tend to have a higher memory overhead compared to relational SQL databases. This is due to how they store nodes and relationships in memory to enable **fast traversal**, which can increase resource usage—especially for large-scale datasets.

### Conclusion

As with any decision involving database management systems, it's important to carefully analyze your **use cases** before choosing the right tool. Neo4j excels in certain scenarios, but may not be the best fit for others.

A simple roadmap of requirements that might indicate Neo4j is a good choice includes:

- **Unstructured or evolving data**: When the data model is flexible and may change over time.
- **Relationship-focused queries**: When your workload involves exploring or analyzing relationships and traversals (e.g., social networks, recommendations, graphs of dependencies).
- **Stable datasets or snapshot-based analysis**: When the data is relatively stable, or when it's acceptable to analyze it using periodic snapshots rather than in real-time.


# Installation & configuration

## Installing Neo4j
If you don't have Docker installed, you can install it from [here](https://www.docker.com/). 

First, in the terminal, pull the Neo4j image from Docker:

`docker pull neo4j`

Now, create a Neo4j instance using the provided docker-compose.yml file:

`docker compose up -d`

You can now access the Neo4j browser by going to [http://localhost:7474](http://localhost:7474).

The connection code in this tutorial uses the username `neo4j` and the password `password`, so the container must be configured with these credentials.



In [None]:
# Install the Neo4j Python driver and the graph visualization widget
!pip install neo4j
!pip install yfiles_jupyter_graphs_for_neo4j

In [None]:
# Loading the libraries
from neo4j import GraphDatabase

# Library to visualize the graph
from yfiles_jupyter_graphs_for_neo4j import Neo4jGraphWidget

# Library to load the data from the JSON file
import json

# Connecting to the Neo4j database
driver = GraphDatabase.driver(uri="bolt://localhost:7687", auth=("neo4j", "password"))
session = driver.session()

# Creating the graph instance in order to visualize the graph
g = Neo4jGraphWidget(driver)
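The cell above keeps a single session open for the whole notebook. As a hedged alternative, the sketch below wraps each query in a short-lived session that is closed automatically. The helper name `run_query` is hypothetical; it relies only on `driver.session()`, `session.run()`, and `record.data()` from the Neo4j Python driver.

```python
def run_query(driver, query, parameters=None):
    """Run one Cypher query in a short-lived session and return plain dicts."""
    # The `with` block closes the session even if the query raises.
    with driver.session() as session:
        result = session.run(query, parameters or {})
        # Materialize the records before the session is closed.
        return [record.data() for record in result]
```

With the `driver` created above, a call such as `run_query(driver, "RETURN 1 AS ok")` would return a list containing one dictionary. When the notebook is finished, calling `driver.close()` releases the connection pool.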

# Dataset

First, let's clear the existing database.

In [None]:
session.run("""
    MATCH (n)
    DETACH DELETE n
""")

In Cypher, ***MATCH (n)*** selects all nodes (n) in the database. Since some nodes may have relationships, we use ***DETACH DELETE n***, which removes a node's relationships before deleting the node itself.

Next, we load the data from an external JSON file named ***startups.json***. This file contains structured data that we will use to populate our database. *JSON* (JavaScript Object Notation) is a widely used format for storing and exchanging data (especially in APIs) due to its simplicity. In this case, the data was generated with the help of ChatGPT [<sup>2</sup>](#chatgpt) to create realistic but fictional startup and investor information. You can learn more about the JSON format here [<sup>1</sup>](#reference-json).
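The file itself is not reproduced here, but its layout can be inferred from the loading code that follows: a `technologies` object whose entries each hold a `startups` list, and an `investors` list whose items carry a `name` and a list of `sectors`. The sketch below uses a made-up one-startup sample (the country and sector values are assumptions, not taken from the real file) and a hypothetical `check_layout` helper to verify that shape before loading.

```python
import json

# Hypothetical sample with the same layout the loading code relies on.
sample_text = """
{
  "technologies": {
    "AI": {"startups": [{"name": "OpenAI", "country": "USA"}]}
  },
  "investors": [
    {"name": "Sequoia Capital", "sectors": ["FinTech", "AI"]}
  ]
}
"""


def check_layout(data):
    """Count startups and investors, failing early if a required key is missing."""
    startup_count = 0
    for tech_name, tech_data in data["technologies"].items():
        for startup in tech_data["startups"]:
            for key in ("name", "country"):
                if key not in startup:
                    raise KeyError(f"startup in {tech_name!r} lacks {key!r}")
            startup_count += 1
    for investor in data["investors"]:
        if "name" not in investor or not isinstance(investor.get("sectors"), list):
            raise KeyError("investor needs a 'name' and a list of 'sectors'")
    return startup_count, len(data["investors"])


print(check_layout(json.loads(sample_text)))
```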


In [None]:
with open('startups.json', 'r') as file:
    # Parse the JSON file into a Python dictionary
    data = json.load(file)
    
    # Create startups with their technology
    for tech_name, tech_data in data['technologies'].items():
        for startup in tech_data['startups']:
            session.run("""
                CREATE (s:Startup {
                    name: $name,
                    country: $country,
                    technology: $technology
                })
            """, {
                'name': startup['name'],
                'country': startup['country'],
                'technology': tech_name
            })
    
    # Create investors with their sectors
    for investor in data['investors']:
        session.run("""
            CREATE (i:Investor {
                name: $name,
                sector: $sector
            })
        """, {
            'name': investor['name'],
            'sector': ', '.join(investor['sectors'])
        })
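The loop above sends one query per startup. For larger files, a common alternative is to send all rows at once and let Cypher's `UNWIND` clause iterate over them. The following is a minimal sketch of that idea, assuming the same JSON layout; `startup_rows`, `CREATE_STARTUPS`, and `load_startups` are hypothetical names introduced for illustration.

```python
def startup_rows(data):
    """Flatten the nested JSON into one dict per startup."""
    return [
        {"name": s["name"], "country": s["country"], "technology": tech_name}
        for tech_name, tech_data in data["technologies"].items()
        for s in tech_data["startups"]
    ]


# One query for all startups: UNWIND turns the $rows list into one row per item.
CREATE_STARTUPS = """
    UNWIND $rows AS row
    CREATE (s:Startup {
        name: row.name,
        country: row.country,
        technology: row.technology
    })
"""


def load_startups(session, data):
    """Send every startup to the database in a single query."""
    rows = startup_rows(data)
    session.run(CREATE_STARTUPS, {"rows": rows})
    return len(rows)
```

Calling `load_startups(session, data)` would replace the startup loop; the investor loop could be rewritten the same way.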


### Investment relationships between investors and startups across various sectors

In [None]:
# AI Sector Investments
session.run("""
    MATCH (i1:Investor {name: 'Elon Musk'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Anthropic'}),
          (s3:Startup {name: 'Adept AI'}), (s4:Startup {name: 'DeepMind'})
    CREATE (i1)-[:INVESTS_IN]->(s1),
           (i1)-[:INVESTS_IN]->(s2),
           (i1)-[:INVESTS_IN]->(s3),
           (i1)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i2:Investor {name: 'Andreessen Horowitz'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Cohere'}),
          (s3:Startup {name: 'Hugging Face'}), (s4:Startup {name: 'Stability AI'})
    CREATE (i2)-[:INVESTS_IN]->(s1),
           (i2)-[:INVESTS_IN]->(s2),
           (i2)-[:INVESTS_IN]->(s3),
           (i2)-[:INVESTS_IN]->(s4)
""")

# Aerospace Sector Investments
session.run("""
    MATCH (i7:Investor {name: 'SoftBank'}), (s1:Startup {name: 'SpaceX'}), (s2:Startup {name: 'Blue Origin'}),
          (s3:Startup {name: 'Rocket Lab'}), (s4:Startup {name: 'Relativity Space'})
    CREATE (i7)-[:INVESTS_IN]->(s1),
           (i7)-[:INVESTS_IN]->(s2),
           (i7)-[:INVESTS_IN]->(s3),
           (i7)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i8:Investor {name: 'Peter Thiel'}), (s1:Startup {name: 'SpaceX'}), (s2:Startup {name: 'Rocket Lab'})
    CREATE (i8)-[:INVESTS_IN]->(s1),
           (i8)-[:INVESTS_IN]->(s2)
""")

# Cross-sector investments
# MERGE is used instead of CREATE because some pairs below (SoftBank -> SpaceX,
# Andreessen Horowitz -> OpenAI) already exist; CREATE would duplicate them.
session.run("""
    MATCH (i7:Investor {name: 'SoftBank'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'SpaceX'}),
          (s3:Startup {name: 'Tesla'}), (s4:Startup {name: 'Revolut'})
    MERGE (i7)-[:INVESTS_IN]->(s1)
    MERGE (i7)-[:INVESTS_IN]->(s2)
    MERGE (i7)-[:INVESTS_IN]->(s3)
    MERGE (i7)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i2:Investor {name: 'Andreessen Horowitz'}), (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Stripe'}),
          (s3:Startup {name: 'Coinbase'}), (s4:Startup {name: 'Tesla'})
    MERGE (i2)-[:INVESTS_IN]->(s1)
    MERGE (i2)-[:INVESTS_IN]->(s2)
    MERGE (i2)-[:INVESTS_IN]->(s3)
    MERGE (i2)-[:INVESTS_IN]->(s4)
""")

session.run("""
    MATCH (i9:Investor {name: 'Tiger Global'}), (s1:Startup {name: 'Stripe'}), (s2:Startup {name: 'Binance'}),
          (s3:Startup {name: 'Tesla'}), (s4:Startup {name: 'Hugging Face'})
    CREATE (i9)-[:INVESTS_IN]->(s1),
           (i9)-[:INVESTS_IN]->(s2),
           (i9)-[:INVESTS_IN]->(s3),
           (i9)-[:INVESTS_IN]->(s4)
""")


# FinTech Sector Investments
session.run("""
    MATCH (i3:Investor {name: 'Sequoia Capital'}), (s1:Startup {name: 'Stripe'}), (s2:Startup {name: 'Revolut'}),
          (s3:Startup {name: 'Klarna'}), (s4:Startup {name: 'Brex'})
    CREATE (i3)-[:INVESTS_IN]->(s1),
           (i3)-[:INVESTS_IN]->(s2),
           (i3)-[:INVESTS_IN]->(s3),
           (i3)-[:INVESTS_IN]->(s4)
""")

# MERGE avoids duplicating the Tiger Global -> Stripe relationship created above.
session.run("""
    MATCH (i9:Investor {name: 'Tiger Global'}), (s1:Startup {name: 'Stripe'}), (s2:Startup {name: 'Klarna'}),
          (s3:Startup {name: 'Brex'})
    MERGE (i9)-[:INVESTS_IN]->(s1)
    MERGE (i9)-[:INVESTS_IN]->(s2)
    MERGE (i9)-[:INVESTS_IN]->(s3)
""")

# Electric Vehicle Sector Investments
session.run("""
    MATCH (i10:Investor {name: 'Cathie Wood'}), (s1:Startup {name: 'Tesla'}), (s2:Startup {name: 'Nio'}),
          (s3:Startup {name: 'Rivian'})
    CREATE (i10)-[:INVESTS_IN]->(s1),
           (i10)-[:INVESTS_IN]->(s2),
           (i10)-[:INVESTS_IN]->(s3)
""")

session.run("""
    MATCH (i11:Investor {name: 'Mark Cuban'}), (s1:Startup {name: 'Tesla'}), (s2:Startup {name: 'Lucid Motors'})
    CREATE (i11)-[:INVESTS_IN]->(s1),
           (i11)-[:INVESTS_IN]->(s2)
""")

# Blockchain Sector Investments
session.run("""
    MATCH (i5:Investor {name: 'Binance Labs'}), (s1:Startup {name: 'Binance'}), (s2:Startup {name: 'Ledger'})
    CREATE (i5)-[:INVESTS_IN]->(s1),
           (i5)-[:INVESTS_IN]->(s2)
""")

session.run("""
    MATCH (i12:Investor {name: 'Accel Partners'}), (s1:Startup {name: 'Chainalysis'}), (s2:Startup {name: 'Coinbase'})
    CREATE (i12)-[:INVESTS_IN]->(s1),
           (i12)-[:INVESTS_IN]->(s2)
""")



### Establishing relationships between startups: collaboration, competition, and partnerships

In [None]:
session.run("""
    MATCH (s1:Startup {name: 'OpenAI'}), (s2:Startup {name: 'Tesla'})
    CREATE (s1)-[:COLLABORATES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'Revolut'}), (s2:Startup {name: 'Stripe'})
    CREATE (s1)-[:COLLABORATES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'Binance'}), (s2:Startup {name: 'Coinbase'})
    CREATE (s1)-[:COMPETES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'Tesla'}), (s2:Startup {name: 'Lucid Motors'})
    CREATE (s1)-[:COMPETES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'SpaceX'}), (s2:Startup {name: 'Blue Origin'})
    CREATE (s1)-[:COMPETES_WITH]->(s2)
""")

session.run("""
    MATCH (s1:Startup {name: 'DeepMind'}), (s2:Startup {name: 'Mistral AI'})
    CREATE (s1)-[:PARTNERS_WITH]->(s2)
""")


### Cypher query to visualize Startups and Investors [<sup>3</sup>](#vizualisation)

In [None]:
g.show_cypher("MATCH (s)-[r]->(t) RETURN s, r, t")
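The relationship-creation cells in this section all repeat the same query shape. One possible refactoring, sketched below under the assumption that investor and startup names are unique, stores the investments as data and sends them in one parameterized query. `MERGE` only creates a relationship that does not exist yet, so a pair listed twice (such as Tiger Global and Stripe) does not produce a parallel relationship. The names `INVESTMENTS`, `investment_pairs`, `MERGE_INVESTMENTS`, and `create_investments` are hypothetical, and only a subset of the investors is shown.

```python
# Subset of the investments from this section, written as data instead of queries.
INVESTMENTS = {
    "Peter Thiel": ["SpaceX", "Rocket Lab"],
    "Tiger Global": ["Stripe", "Binance", "Tesla", "Hugging Face", "Stripe", "Klarna", "Brex"],
    "Mark Cuban": ["Tesla", "Lucid Motors"],
}


def investment_pairs(investments):
    """Return unique investor/startup rows, preserving first-seen order."""
    seen, rows = set(), []
    for investor, startups in investments.items():
        for startup in startups:
            if (investor, startup) not in seen:
                seen.add((investor, startup))
                rows.append({"investor": investor, "startup": startup})
    return rows


# MATCH finds the two existing nodes; MERGE links them only if no link exists yet.
MERGE_INVESTMENTS = """
    UNWIND $rows AS row
    MATCH (i:Investor {name: row.investor}), (s:Startup {name: row.startup})
    MERGE (i)-[:INVESTS_IN]->(s)
"""


def create_investments(session, investments):
    """Create all INVESTS_IN relationships with a single query."""
    rows = investment_pairs(investments)
    session.run(MERGE_INVESTMENTS, {"rows": rows})
    return len(rows)
```

Note that a row whose investor or startup name matches no node is skipped silently by `MATCH`, so comparing the returned row count with the number of relationships in the database is a useful sanity check.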

# PageRank algorithm

The PageRank algorithm ranks the nodes in a graph based on their influence. It is a recursive algorithm in which a node’s score depends on the scores of the nodes linking to it, as well as the number of other nodes those linking nodes connect to. This algorithm is implemented in the ***graph-data-science plugin***, and we will break down its core functionalities.
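To make the idea concrete, here is a minimal pure-Python sketch of PageRank by power iteration. It is a textbook-style variant written for illustration, not the plugin's implementation, so the scores returned by the plugin may be scaled differently. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph with its node names are all assumptions made for this example.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping each node to its targets."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(scores[node] for node in nodes if not out_links.get(node))
        new_scores = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in out_links.items():
            if not targets:
                continue
            # A node splits its current score evenly among the nodes it links to.
            share = damping * scores[source] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical toy graph: two investors pointing at two startups.
toy = {
    "Investor A": ["Startup X", "Startup Y"],
    "Investor B": ["Startup X"],
    "Startup X": [],
    "Startup Y": [],
}

ranking = sorted(pagerank(toy).items(), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy graph the startup with two incoming links ranks first, and the investors, which receive no links, rank last.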

The first step is to create an **in-memory projection** of the graph. The primary goal of this projection is to streamline the graph and isolate it from live data, allowing for faster execution times by working on a simplified version of the graph. To achieve this, we first need to retrieve the labels of our nodes and the possible relationships between them.

In [None]:
labelsResponse = session.run("""CALL db.labels()""")
labels = labelsResponse.data()
print(labels)

relationshipResponse = session.run("CALL db.relationshipTypes()")
relationships = relationshipResponse.data()
print(relationships)

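Then, using the labels and relationship types, we can create the projection. Note that the projection below includes `INVESTS_IN`, `COLLABORATES_WITH`, and `COMPETES_WITH`, but not `PARTNERS_WITH`.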

In [None]:
projection_query = """
CALL gds.graph.project(
  'generalProjection', 
  ['Startup', 'Investor'], 
  {
    INVESTS_IN: {},
    COLLABORATES_WITH: {},
    COMPETES_WITH: {}
  }
)
"""
session.run(projection_query)
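Running the projection cell a second time raises an error, because a projection named `generalProjection` already exists in memory. A small helper such as the hypothetical `drop_projection` below can be called before projecting; it uses the `gds.graph.drop` procedure with its second argument (`failIfMissing`) set to `false`, so that a missing projection is not treated as an error.

```python
def drop_projection(session, graph_name):
    """Remove an in-memory projection if it exists; do nothing otherwise."""
    # The second argument (failIfMissing = false) suppresses the error that
    # would otherwise be raised when no projection has this name.
    result = session.run(
        "CALL gds.graph.drop($graphName, false) YIELD graphName RETURN graphName",
        {"graphName": graph_name},
    )
    # Returns the names of the dropped projections (empty if none existed).
    return [record["graphName"] for record in result]
```

For example, `drop_projection(session, 'generalProjection')` placed before `session.run(projection_query)` makes the cell safe to re-run.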

Now, we can run the PageRank algorithm using stream mode. In this mode, the algorithm computes a score for each node, allowing us to post-process the results without modifying the underlying data. Additionally, we limit the query to return only the top 10 nodes.

In [None]:
pagerankGeneralQuery = """
    CALL gds.pageRank.stream('generalProjection')
    YIELD nodeId, score
    RETURN gds.util.asNode(nodeId).name AS name, score
    ORDER BY score DESC
    LIMIT 10
"""
pagerankGeneralRes = session.run(pagerankGeneralQuery)
for record in pagerankGeneralRes:
    print(f"Name: {record['name']}, Score: {record['score']}")

Note that we can filter the type of relationship in the projection. For example, if we want to find the most influential node with regard to only the INVESTS_IN relationships, we can use the following projection.

In [None]:
filteredProjectionQuery = """
  CALL gds.graph.project(
    'filteredProjection',
    ['Startup', 'Investor'],
    {
      INVESTS_IN: {}
    }
  )
"""

pagerankFilteredQuery = """
  CALL gds.pageRank.stream('filteredProjection')
  YIELD nodeId, score
  RETURN gds.util.asNode(nodeId).name AS name, score
  ORDER BY score DESC
  LIMIT 10
"""

session.run(filteredProjectionQuery)

pagerankFilteredRes = session.run(pagerankFilteredQuery)
for record in pagerankFilteredRes:
    print(f"Name: {record['name']}, Score: {record['score']}")


Similarly, we can filter nodes based on their labels. For example, to focus only on nodes with the label `Startup`, we can create the projection below. Since investors are excluded, only relationships between two startups remain in the projected graph.

In [None]:
filteredProjectionQuery2 = """
  CALL gds.graph.project(
    'filteredProjection2', 
    ['Startup'], 
    {
      INVESTS_IN: {},
      COLLABORATES_WITH: {},
      COMPETES_WITH: {}
    }
  )
"""

pagerankFilteredQuery2 = """
  CALL gds.pageRank.stream('filteredProjection2')
  YIELD nodeId, score
  RETURN gds.util.asNode(nodeId).name AS name, score
  ORDER BY score DESC
  LIMIT 10
"""

session.run(filteredProjectionQuery2)

pagerankFilteredRes2 = session.run(pagerankFilteredQuery2)
for record in pagerankFilteredRes2:
    print(f"Name: {record['name']}, Score: {record['score']}")


There are numerous options available to fine-tune and extend the PageRank algorithm. For more details, you can refer to the Neo4j documentation [<sup>9</sup>](#docpr). However, to keep this tutorial straightforward, we will focus only on estimating the **memory cost** of running the algorithm on our projections. Note that we must specify the execution mode; here, we are using stream mode.

In [None]:
estimateQuery = """
    CALL gds.pageRank.stream.estimate('generalProjection', {})
    YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
"""

estimateRes = session.run(estimateQuery)
for record in estimateRes.data():
    for key, value in record.items():
        print(f"{key}: {value}")

# Louvain algorithm

The Louvain algorithm is used for community detection in graphs by maximizing a metric called modularity. It iteratively groups nodes into communities, ensuring that nodes within the same community are densely connected, while connections between different communities remain sparse. This algorithm is also implemented in the ***graph-data-science plugin***.
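Modularity compares the share of edges that fall inside communities with the share expected if edges were placed at random. The sketch below computes it for an undirected, unweighted graph with the standard formula $Q = \sum_c \left[ L_c / m - (d_c / 2m)^2 \right]$, where $m$ is the number of edges, $L_c$ the number of edges inside community $c$, and $d_c$ the sum of the degrees of its nodes. It only evaluates a given partition and does not perform the Louvain optimization itself; the function name `modularity` and the toy graph are illustrative assumptions.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}  # community -> number of edges with both ends inside it
    degree = {}    # community -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Two triangles joined by a single edge (c-d).
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(modularity(toy_edges, two_groups))
print(modularity(toy_edges, one_group))
```

Splitting the graph into its two triangles scores higher than putting every node in a single community, which is the kind of improvement Louvain searches for at each step.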

The Louvain algorithm also operates on an **in-memory graph projection**. In this tutorial, we reuse the three projections created for the PageRank algorithm. Like PageRank, Louvain offers several execution modes; for consistency with the previous section, we use stream mode.

We create a query that orders the communities by the number of nodes they contain. The query also lists the nodes in each community and limits the output to the top 5 communities.

In [None]:
louvainGeneralQuery = """
     CALL gds.louvain.stream('generalProjection')
     YIELD nodeId, communityId
     WITH communityId, 
          collect(gds.util.asNode(nodeId).name) AS nodes, 
          count(*) AS communitySize
     ORDER BY communitySize DESC
     LIMIT 5
     RETURN communityId AS community, communitySize, nodes
"""

louvainGeneralRes = session.run(louvainGeneralQuery)
for record in louvainGeneralRes:
    print(f"community: {record['community']}, communitySize: {record['communitySize']}, nodes: {record['nodes']}")


In [None]:
louvainFilteredQuery = """
     CALL gds.louvain.stream('filteredProjection')
     YIELD nodeId, communityId
     WITH communityId, 
          collect(gds.util.asNode(nodeId).name) AS nodes, 
          count(*) AS communitySize
     ORDER BY communitySize DESC
     LIMIT 5
     RETURN communityId AS community, communitySize, nodes
"""

louvainFilteredRes = session.run(louvainFilteredQuery)
for record in louvainFilteredRes:
    print(f"community: {record['community']}, communitySize: {record['communitySize']}, nodes: {record['nodes']}")

In [None]:
louvainFilteredQuery2 = """
     CALL gds.louvain.stream('filteredProjection2')
     YIELD nodeId, communityId
     WITH communityId, 
          collect(gds.util.asNode(nodeId).name) AS nodes, 
          count(*) AS communitySize
     ORDER BY communitySize DESC
     LIMIT 5
     RETURN communityId AS community, communitySize, nodes
"""

louvainFilteredRes2 = session.run(louvainFilteredQuery2)
for record in louvainFilteredRes2:
    print(f"community: {record['community']}, communitySize: {record['communitySize']}, nodes: {record['nodes']}")
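The three cells above differ only in the projection name. As a hedged refactoring, the query can be written once with the projection name and the limit passed as Cypher parameters. `LOUVAIN_TOP_COMMUNITIES` and `top_communities` are hypothetical names introduced for this sketch.

```python
# Same query as above, with the projection name and limit as parameters.
LOUVAIN_TOP_COMMUNITIES = """
    CALL gds.louvain.stream($graphName)
    YIELD nodeId, communityId
    WITH communityId,
         collect(gds.util.asNode(nodeId).name) AS nodes,
         count(*) AS communitySize
    ORDER BY communitySize DESC
    LIMIT $limit
    RETURN communityId AS community, communitySize, nodes
"""


def top_communities(session, graph_name, limit=5):
    """Return (community, size, node names) tuples for the largest communities."""
    result = session.run(
        LOUVAIN_TOP_COMMUNITIES,
        {"graphName": graph_name, "limit": limit},
    )
    return [
        (record["community"], record["communitySize"], record["nodes"])
        for record in result
    ]
```

A loop over `['generalProjection', 'filteredProjection', 'filteredProjection2']` calling `top_communities(session, name)` would then reproduce the three outputs.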

There are also numerous options available for configuring the Louvain algorithm. You can find more details in the Neo4j Graph Data Science documentation [<sup>10</sup>](#doclouvain). The memory cost of running this algorithm is estimated in the same way as for PageRank.

In [None]:
estimateQuery = """
    CALL gds.louvain.stream.estimate('generalProjection', {})
    YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
"""

estimateRes = session.run(estimateQuery)
for record in estimateRes.data():
    for key, value in record.items():
        print(f"{key}: {value}")

# Real-World Use Cases [<sup>11</sup>](#usecases)

## Common applications

Neo4j, and graph databases more broadly, are commonly used in the following fields.
Many of these applications suit Neo4j because they involve highly connected data, which graph databases handle efficiently.

### 1. Knowledge graphs

With Neo4j, building knowledge graphs is straightforward because real-world relationships map directly onto the data model. The graph also takes few lines of code to build and is quick to update thanks to its flexible schema.

### 2. Generative AI

Generative AI models (such as large language models) can struggle with reasoning. Knowledge graphs built with Neo4j help overcome this by storing relevant and contextual data more efficiently.

### 3. Fraud detection & analytics

Since fraudsters often operate in rings, graph databases can make fraud-pattern recognition more efficient than SQL databases. The relationships between transactions and individuals can be mapped onto a graph, which can then be analyzed to detect suspicious activity.

### 4. Identity & access management

Managing roles and access rights to different sets of data is well suited to graph databases, since relationships between entities are represented as graph edges, which are easily added or removed using the Cypher query language.

### 5. Master data management

Neo4j offers a master data management solution that unifies your data into a single "*360° view*". This means that customer, product, supplier, and logistics data can be brought together to leverage insights across datasets.

### 6. Network and IT operations

Graph databases are well suited to correlating network and IT assets for troubleshooting or network analysis. With this kind of database, different monitoring tools can be connected, allowing users to better manage their networks.

### 7. Real-time recommendations

Graph databases traverse relationships quickly, which makes it practical to compute recommendations in real time.

### 8. Data privacy, risk and compliance

Neo4j's [*Privacy Shield*](https://neo4j.com/use-cases/gdpr-compliance/?ref=web-solutions-privacy-risk-compliance) enables companies to comply with the EU's [General Data Protection Regulation (GDPR)](https://europa.eu/youreurope/business/dealing-with-customers/data-protection/data-protection-gdpr/index_en.htm), which requires companies to control and manage customer data.

It connects personal data and tracks:
- The location of private information;
- Which systems and apps use the data;
- How and when personal data is used;
- Who looks at and uses the data;
- What permissions you have to use the data, and when and how they were obtained;
- Where personal data moves.

Regulations from other jurisdictions (such as the [California Consumer Privacy Act](https://oag.ca.gov/privacy/ccpa)) are also covered, but the GDPR is the most relevant for our purposes.

### 9. Supply chain management

Supply chains can be efficiently represented as graphs, making graph databases particularly well-suited for handling them. When using graph databases, it is then possible to anticipate shifts in demand, predict product handling cost, and adapt to new compliance standards.


## Industry use cases

Many industry uses of Neo4j are a combination of the common uses mentioned previously.

Industries use Neo4j for the following applications:

1. Financial services:
    - Detect fraud rings; 
    - Model ever-changing complex assets;
    - Connect disparate systems and data sources;
    - Oversee identity & access management.
    
    Notable users: [UBS](https://www.ubs.com/us/en.html), [Cerved](https://www.cerved.com/en), [Royal Bank of Scotland](https://www.rbs.co.uk/), [MITRE](https://www.mitre.org/).

2. Government:
    - Connect different data records in criminal investigations;
    - Manage the equipment of governmental staff;
    - Detect failure causes;
    - Analyze and make decisions.
    
    Notable users: [US Army](https://www.army.mil/), [IQT](https://www.iqt.org/), [MITRE](https://www.mitre.org/), [Lockheed Martin Space](https://www.lockheedmartin.com/), [NASA](https://www.nasa.gov/).

3. Healthcare & life sciences:
    - Model connections between genes, proteins, cells and tissues;
    - Model molecules;
    - Map patients' journeys.
    
    Notable users: [Novartis](https://www.novartis.com/), [Boston Scientific](https://www.bostonscientific.com/en-US/home.html), [ChemAxon](https://chemaxon.com/), [Bayer (Monsanto)](https://www.bayer.com/en/).

4. Retail:
    - Provide real-time product recommendations;
    - Change prices dynamically;
    - Optimize delivery routing.
    
    Notable users: [Walmart](https://www.walmart.com/), [eBay](https://www.ebay.com/), [Adidas](https://www.adidas-group.com/en/), [Transparency-One](https://www.transparency-one.com/).

5. Telecommunications:
    - Model customer-related graphs;
    - Model communication networks.
    
    Notable users: [Comcast](https://corporate.comcast.com/), [Telenor](https://www.telenor.com/), [Cisco](https://www.cisco.com/).


## Cloud partners

Neo4j is also available through the cloud platforms of some of the world's largest technology companies, namely [Amazon](https://neo4j.com/cloud/aura-aws/), [Microsoft](https://neo4j.com/partners/microsoft/), and [Google](https://neo4j.com/partners/google/).

# Conclusion

Neo4j is a graph database that specializes in handling highly connected data. Its native graph data representation is flexible and its custom query language, Cypher, is intuitive, making it easy to learn. Its use cases are broad and include recommendation systems, fraud detection, social networks, knowledge graphs, and network analysis, where relationships between data points are critical.

However, since Neo4j is designed for highly connected data, it may be less efficient for workloads that involve rigidly structured data and focus on transactions. It is also quite resource intensive, requiring more memory than traditional SQL databases.


# References

1. <a id="reference-json"></a> [Wikipedia - JSON](https://en.wikipedia.org/wiki/JSON)

2. <a id="chatgpt"></a> [ChatGPT](https://chatgpt.com/)

3. <a id="vizualisation"></a> [Nodes 2024 – Advanced Graph Visualizations in Jupyter Notebooks](https://neo4j.com/videos/nodes-2024-advanced-graph-visualizations-in-jupyter-notebooks/)

4. <a id="neo4jVSsql"></a> [Quora 2023 Sumit Sutariya - Pros and Cons of using Graph Databases compared to Traditional Relational Databases](https://www.quora.com/What-are-the-pros-and-cons-of-using-graph-databases-compared-to-traditional-relational-databases-in-modern-web-development)

5. <a id="diffgraphandrelationaldb"></a> [Amazon - Difference between Graph and Relational Database](https://aws.amazon.com/fr/compare/the-difference-between-graph-and-relational-database/)

6. <a id="graphdbscalelimitation"></a> [Thatdot Rob Malnati - Scale Limitations of Graph Databases](https://www.thatdot.com/blog/understanding-the-scale-limitations-of-graph-databases/)

7. <a id="whygraphdb"></a> [NEBULAGRAPH 2023 Min.WU](https://www.nebula-graph.io/posts/why-use-graph-databases)

8. <a id="Neo4Jdoc"></a> [Neo4j - Official documentation](https://neo4j.com/docs/getting-started/)

9. <a id="docpr"></a> [Page Rank - Official documentation](https://neo4j.com/docs/graph-data-science/current/algorithms/page-rank/)

10. <a id="doclouvain"></a> [Louvain - Official documentation](https://neo4j.com/docs/graph-data-science/current/algorithms/louvain/)

11. <a id="usecases"></a> [Neo4j's use cases (and their respective white papers)](https://neo4j.com/use-cases/)
