# 1. Get To Know Your Graph

In this notebook, you'll connect to your Neo4j database, establish a driver, and perform basic exploratory data analysis (EDA) to understand the structure and content of your knowledge graph.

---

## Table of Contents
1. Connect to Neo4j
2. Basic Graph Statistics
3. EDA on Source Documents


## 1. Connect to Neo4j
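Set `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` as environment variables (for example, by loading your .env file before starting the notebook), or edit the fallback values in the cell below. Note that `os.getenv` only reads variables already present in the environment; it does not parse a .env file by itself.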


In [ ]:
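import os

from neo4j import GraphDatabase

# Read connection settings from the environment, falling back to local defaults
NEO4J_URI = os.getenv('NEO4J_URI', 'bolt://localhost:7687')
NEO4J_USER = os.getenv('NEO4J_USERNAME', 'neo4j')
NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD', 'your_password')

driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))
driver.verify_connectivity()  # raises if a connection cannot be established

def run_query(query, parameters=None):
    """Run a Cypher query and return all result records as a list."""
    with driver.session() as session:
        return list(session.run(query, parameters or {}))

print('Connected to Neo4j!')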
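
Before creating the driver, it can help to sanity-check the connection settings. The following is a minimal sketch, not part of the notebook's setup: `check_neo4j_settings` is a hypothetical helper, and the placeholder check assumes the `'your_password'` fallback used in the cell above. The scheme set lists the URI schemes the Neo4j Python driver accepts.

```python
from urllib.parse import urlparse

# URI schemes accepted by the Neo4j Python driver
VALID_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}


def check_neo4j_settings(uri, user, password):
    """Return a list of problems; an empty list means the settings look plausible."""
    problems = []
    scheme = urlparse(uri).scheme
    if scheme not in VALID_SCHEMES:
        problems.append(f"Unsupported URI scheme: {scheme!r}")
    if not user:
        problems.append("Username is empty")
    if not password or password == "your_password":
        problems.append("Password is empty or still the placeholder")
    return problems


# Hypothetical values for illustration: the placeholder password is flagged
print(check_neo4j_settings("bolt://localhost:7687", "neo4j", "your_password"))
```

A check like this only catches obvious mistakes early; it does not replace an actual connectivity test against the server.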

## 2. Basic Graph Statistics
Let's get a sense of the structure of your graph by counting nodes per label and relationships per type.


In [ ]:
# Node label counts (a node with several labels is counted once per label).
# Cypher has no GROUP BY clause: aggregates are grouped implicitly by the
# non-aggregated columns in RETURN.
for record in run_query('MATCH (n) UNWIND labels(n) AS label RETURN label, count(*) AS count ORDER BY count DESC'):
    print(record)

# Relationship type counts
for record in run_query('MATCH ()-[r]->() RETURN type(r) AS rel_type, count(*) AS count ORDER BY count DESC'):
    print(record)
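
Raw records are hard to compare at a glance. As a small sketch, the counts can be turned into shares of the total. `summarize_counts` is a hypothetical helper, and the sample rows are made-up stand-ins for what `run_query` returns (each neo4j `Record` supports key lookup like a dict). Because a node with several labels is counted once per label, the label shares are relative to label assignments, not to distinct nodes.

```python
def summarize_counts(rows, key, count_key="count"):
    """Return (name, count, percent) tuples sorted by count, largest first."""
    total = sum(row[count_key] for row in rows)
    summary = []
    for row in sorted(rows, key=lambda r: r[count_key], reverse=True):
        percent = 100.0 * row[count_key] / total if total else 0.0
        summary.append((row[key], row[count_key], round(percent, 1)))
    return summary


# Made-up sample rows standing in for query results
sample = [
    {"label": "Document", "count": 30},
    {"label": "Chunk", "count": 150},
    {"label": "Entity", "count": 20},
]
for name, count, percent in summarize_counts(sample, key="label"):
    print(f"{name:<12} {count:>6} {percent:>5.1f}%")
```

The same helper works for the relationship counts by passing `key="rel_type"`.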


## 3. EDA on Source Documents
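Explore basic properties of your source documents, such as text length. The queries below assume that source documents are stored as `Document` nodes with `text` and `title` properties; adjust the label and property names to match your graph.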


In [ ]:
import numpy as np

# Preview a few documents
for record in run_query('MATCH (d:Document) RETURN d.text AS text, d.title AS title LIMIT 3'):
    text = record['text'] or ''  # d.text may be missing on some nodes
    print(f"Title: {record['title']}\nText (first 200 chars): {text[:200]}\n---")

# Text length stats (documents without a text property are skipped)
lengths = [record['length'] for record in run_query('MATCH (d:Document) WHERE d.text IS NOT NULL RETURN size(d.text) AS length')]
if lengths:
    print(f'Mean length: {np.mean(lengths):.1f}')
    print(f'Max length: {np.max(lengths)}')
    print(f'Min length: {np.min(lengths)}')
else:
    print('No Document nodes with text found.')
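
Mean, max, and min can hide a skewed distribution, which is common for document lengths. As a sketch using only the standard library, the median and quartiles give a fuller picture. `describe_lengths` is a hypothetical helper, and the sample values are made up, standing in for the `lengths` list built in the cell above.

```python
import statistics


def describe_lengths(lengths):
    """Return the quartiles (including the median) of a list of text lengths."""
    if len(lengths) < 2:
        raise ValueError("need at least two lengths to compute quartiles")
    # method="inclusive" treats the data as the full population, not a sample
    q1, median, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3}


# Made-up sample values for illustration
sample_lengths = [100, 200, 300, 400, 500]
print(describe_lengths(sample_lengths))
```

A large gap between the median and the mean suggests a few very long documents, which may matter later when choosing a chunk size for retrieval.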


---
You now have a basic understanding of your graph's structure and your source documents.
You're ready for deeper analytics and retrieval!
