# Graph Analytics using Neo4j
## Introduction
This notebook will guide you through setting up **Neo4j locally** in Google Colab, creating a large graph dataset, and performing **Graph Analytics** using **Cypher Query Language**.

---

## 1. Install and Configure Neo4j in Google Colab

In [2]:
!wget -O - https://debian.neo4j.com/neotechnology.gpg.key | sudo apt-key add -
!echo 'deb https://debian.neo4j.com stable 4.4' | sudo tee -a /etc/apt/sources.list.d/neo4j.list
!sudo apt update
!sudo apt install -y neo4j

--2025-02-14 20:28:08--  https://debian.neo4j.com/neotechnology.gpg.key
Resolving debian.neo4j.com (debian.neo4j.com)... 18.65.229.44, 18.65.229.93, 18.65.229.53, ...
Connecting to debian.neo4j.com (debian.neo4j.com)|18.65.229.44|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3905 (3.8K) [application/pgp-keys]
Saving to: ‘STDOUT’


2025-02-14 20:28:08 (76.1 MB/s) - written to stdout [3905/3905]

OK
deb https://debian.neo4j.com stable 4.4
Get:1 https://debian.neo4j.com stable InRelease [44.2 kB]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Get:3 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:5 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease [1,581 B]
Get:8 https://de

### 2. Start Neo4j Locally

In [3]:
!sudo service neo4j start
!sudo service neo4j status
!echo -e "neo4j\nneo4j" | sudo neo4j-admin set-initial-password neo4j

Directories in use:
home:         /var/lib/neo4j
config:       /etc/neo4j
logs:         /var/log/neo4j
plugins:      /var/lib/neo4j/plugins
import:       /var/lib/neo4j/import
data:         /var/lib/neo4j/data
certificates: /var/lib/neo4j/certificates
licenses:     /var/lib/neo4j/licenses
run:          /var/lib/neo4j/run
Starting Neo4j.
Started neo4j (pid:1806). It is available at http://localhost:7474
There may be a short delay until the server is ready.
 * neo4j is not running
Selecting JVM - Version:11.0.26+4-post-Ubuntu-1ubuntu122.04, Name:OpenJDK 64-Bit Server VM, Vendor:Ubuntu
Changed password for user 'neo4j'. IMPORTANT: this change will only take effect if performed before the database is started for the first time.


### 3. Install Python Driver for Neo4j

In [4]:
!pip install neo4j

Collecting neo4j
  Downloading neo4j-5.28.1-py3-none-any.whl.metadata (5.9 kB)
Downloading neo4j-5.28.1-py3-none-any.whl (312 kB)
[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/312.3 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m [32m307.2/312.3 kB[0m [31m10.0 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m312.3/312.3 kB[0m [31m7.2 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: neo4j
Successfully installed neo4j-5.28.1


### 4. Connect to Local Neo4j Database

In [5]:
from neo4j import GraphDatabase

URI = "bolt://localhost:7687"
USER = "neo4j"
PASSWORD = "neo4j"

driver = GraphDatabase.driver(URI, auth=(USER, PASSWORD))

def run_query(query):
    with driver.session() as session:
        result = session.run(query)
        return [record for record in result]

print("Connected to local Neo4j instance!")



Connected to local Neo4j instance!


### 5. Create a Large Graph Dataset

In [6]:
create_large_iot_graph_query = """
CREATE (temp1:Device {name:'Temp Sensor A', type:'Sensor', location:'Factory Floor'}),
       (temp2:Device {name:'Temp Sensor B', type:'Sensor', location:'Warehouse'}),
       (humid1:Device {name:'Humidity Sensor', type:'Sensor', location:'Factory Floor'}),
       (camera1:Device {name:'Security Camera A', type:'Camera', location:'Factory Entrance'}),
       (camera2:Device {name:'Security Camera B', type:'Camera', location:'Warehouse'}),
       (robot1:Device {name:'Industrial Robot 1', type:'Actuator', location:'Assembly Line'}),
       (robot2:Device {name:'Industrial Robot 2', type:'Actuator', location:'Warehouse'}),
       (gateway1:Gateway {name:'Edge Gateway 1', type:'Gateway'}),
       (gateway2:Gateway {name:'Edge Gateway 2', type:'Gateway'}),
       (cloud1:Cloud {name:'AWS IoT Cloud', provider:'AWS'}),
       (cloud2:Cloud {name:'Azure IoT Hub', provider:'Microsoft'}),
       (admin1:User {name:'Admin 1', role:'IoT Manager'}),
       (admin2:User {name:'Admin 2', role:'Security Manager'})

CREATE (temp1)-[:CONNECTED_TO]->(gateway1),
       (temp2)-[:CONNECTED_TO]->(gateway2),
       (humid1)-[:CONNECTED_TO]->(gateway1),
       (camera1)-[:CONNECTED_TO]->(gateway1),
       (camera2)-[:CONNECTED_TO]->(gateway2),
       (robot1)-[:CONNECTED_TO]->(gateway1),
       (robot2)-[:CONNECTED_TO]->(gateway2),

       (gateway1)-[:CONNECTED_TO]->(cloud1),
       (gateway2)-[:CONNECTED_TO]->(cloud2),

       (admin1)-[:MANAGES]->(temp1),
       (admin1)-[:MANAGES]->(humid1),
       (admin1)-[:MANAGES]->(robot1),
       (admin2)-[:MANAGES]->(camera1),
       (admin2)-[:MANAGES]->(camera2),
       (admin1)-[:MANAGES]->(gateway1),
       (admin2)-[:MANAGES]->(gateway2);
"""

run_query(create_large_iot_graph_query)
print("IoT Graph Database Created!")


IoT Graph Database Created!


### 6. Query the Graph

In [11]:
query1 = """
MATCH (d:Device)
RETURN d.name AS Device, d.type AS Type, d.location AS Location;
"""
print("\nQuery 1: List of IoT Devices")
results = run_query(query1)
for record in results:
    print(record)


Query 1: List of IoT Devices
<Record Device='Temp Sensor A' Type='Sensor' Location='Factory Floor'>
<Record Device='Temp Sensor B' Type='Sensor' Location='Warehouse'>
<Record Device='Humidity Sensor' Type='Sensor' Location='Factory Floor'>
<Record Device='Security Camera A' Type='Camera' Location='Factory Entrance'>
<Record Device='Security Camera B' Type='Camera' Location='Warehouse'>
<Record Device='Industrial Robot 1' Type='Actuator' Location='Assembly Line'>
<Record Device='Industrial Robot 2' Type='Actuator' Location='Warehouse'>


### 7. Graph Analytics Exercises

In [12]:
# Query 2: Find all devices managed by the admin
query2 = """
MATCH (u:User {name:'Admin'})-[:MANAGES]->(d)
RETURN d.name AS ManagedDevice;
"""
print("\nQuery 2: Devices Managed by Admin")
results = run_query(query2)
for record in results:
    print(record)


Query 2: Devices Managed by Admin


In [14]:
# Query 3: Find all devices connected to the edge gateway
query3 = """
MATCH (g:Gateway {name:'Edge Gateway'})<-[:CONNECTED_TO]-(d)
RETURN d.name AS ConnecteddDevice;
"""
print("\nQuery 3: Devices connected to the Edge Gateway")
results = run_query(query3)
for record in results:
    print(record)


Query 3: Devices connected to the Edge Gateway


In [15]:
# Query 4: Find the data flow path from a sensor to the cloud
query4 = """
MATCH path = (s:Device)-[:CONNECTED_TO*]->(c:Cloud)
RETURN path;
"""
print("\nQuery 4: Data flow path from sensors to cloud")
results = run_query(query4)
print(results)


Query 4: Data flow path from sensors to cloud
[<Record path=<Path start=<Node element_id='0' labels=frozenset({'Device'}) properties={'name': 'Temp Sensor A', 'location': 'Factory Floor', 'type': 'Sensor'}> end=<Node element_id='9' labels=frozenset({'Cloud'}) properties={'provider': 'AWS', 'name': 'AWS IoT Cloud'}> size=2>>, <Record path=<Path start=<Node element_id='2' labels=frozenset({'Device'}) properties={'name': 'Humidity Sensor', 'location': 'Factory Floor', 'type': 'Sensor'}> end=<Node element_id='9' labels=frozenset({'Cloud'}) properties={'provider': 'AWS', 'name': 'AWS IoT Cloud'}> size=2>>, <Record path=<Path start=<Node element_id='3' labels=frozenset({'Device'}) properties={'name': 'Security Camera A', 'location': 'Factory Entrance', 'type': 'Camera'}> end=<Node element_id='9' labels=frozenset({'Cloud'}) properties={'provider': 'AWS', 'name': 'AWS IoT Cloud'}> size=2>>, <Record path=<Path start=<Node element_id='5' labels=frozenset({'Device'}) properties={'name': 'Industr

In [16]:
# Query 5: Find all nodes directly connected to the admin
query5 = """
MATCH (u:User {name:'Admin'})-[:MANAGES]->(n)
RETURN n.name AS ConnectedNode;
"""
print("\nQuery 5: Nodes Managed by Admin")
results = run_query(query5)
for record in results:
    print(record)


Query 5: Nodes Managed by Admin


In [17]:
# ===========================
# 6. IoT Graph Analytics Queries (Complex)
# ===========================

# Query 1: Find the shortest path from a device to the cloud
query1 = """
MATCH path = shortestPath((d:Device)-[:CONNECTED_TO*]->(c:Cloud))
RETURN path;
"""
print("\nQuery 1: Shortest path from a device to the cloud")
results = run_query(query1)
print(results)




Query 1: Shortest path from a device to the cloud
[<Record path=<Path start=<Node element_id='0' labels=frozenset({'Device'}) properties={'name': 'Temp Sensor A', 'location': 'Factory Floor', 'type': 'Sensor'}> end=<Node element_id='9' labels=frozenset({'Cloud'}) properties={'provider': 'AWS', 'name': 'AWS IoT Cloud'}> size=2>>, <Record path=<Path start=<Node element_id='1' labels=frozenset({'Device'}) properties={'name': 'Temp Sensor B', 'location': 'Warehouse', 'type': 'Sensor'}> end=<Node element_id='10' labels=frozenset({'Cloud'}) properties={'provider': 'Microsoft', 'name': 'Azure IoT Hub'}> size=2>>, <Record path=<Path start=<Node element_id='2' labels=frozenset({'Device'}) properties={'name': 'Humidity Sensor', 'location': 'Factory Floor', 'type': 'Sensor'}> end=<Node element_id='9' labels=frozenset({'Cloud'}) properties={'provider': 'AWS', 'name': 'AWS IoT Cloud'}> size=2>>, <Record path=<Path start=<Node element_id='3' labels=frozenset({'Device'}) properties={'name': 'Securit

In [18]:
# Query 2: Find the most connected IoT device (degree centrality)
query2 = """
MATCH (d:Device)-[:CONNECTED_TO*]->()
RETURN d.name AS Device, count(*) AS Connections
ORDER BY Connections DESC LIMIT 1;
"""
print("\nQuery 2: Most connected IoT device (influencer)")
results = run_query(query2)
for record in results:
    print(record)


Query 2: Most connected IoT device (influencer)
<Record Device='Temp Sensor A' Connections=2>


In [25]:
# Query 3: Identify potential security vulnerabilities (devices with no admin)
query3 = """
MATCH (d:Device)
WHERE NOT EXISTS { MATCH (u:User)-[:MANAGES]->(d) }
RETURN d.name AS UnmanagedDevice;
"""
print("\nQuery 3: Unmanaged IOT devices(potential security risks)")
results = run_query(query3)
for record in results:
    print(record)


Query 3: Unmanaged IOT devices(potential security risks)
<Record UnmanagedDevice='Temp Sensor B'>
<Record UnmanagedDevice='Industrial Robot 2'>


In [27]:
# Query 4: Detect impact of gateway failure (devices affected)
query4 = """
MATCH (g:Gateway)<-[:CONNECTED_TO]-(d)
RETURN d.name AS AffectedDevice;
"""
print("\nQuery 4: Impact of Edge Gateway 1 failure")
results = run_query(query4)
for record in results:
    print(record)


Query 4: Impact of Edge Gateway 1 failure
<Record AffectedDevice='Industrial Robot 1'>
<Record AffectedDevice='Security Camera A'>
<Record AffectedDevice='Humidity Sensor'>
<Record AffectedDevice='Temp Sensor A'>
<Record AffectedDevice='Industrial Robot 2'>
<Record AffectedDevice='Security Camera B'>
<Record AffectedDevice='Temp Sensor B'>


In [28]:
# ===========================
# 7. Close Neo4j Connection
# ===========================


driver.close()
print("\nAdvanced Graph Analytics on IOT Database Completed!")


Advanced Graph Analytics on IOT Database Completed!
