# Chapter 12: Graph RAG - Section 12.2

## Introduction

This notebook provides hands-on experience with knowledge graph fundamentals using Neo4j. You'll learn to:
- Set up Neo4j (using Neo4j Aura free tier)
- Understand entities, relationships, and properties
- Write basic Cypher queries
- Create and visualize simple knowledge graphs
- Connect Python applications to Neo4j


## Prerequisites and Setup


In [None]:
# Install required libraries
!pip install neo4j pandas matplotlib networkx plotly

# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx
from neo4j import GraphDatabase
import plotly.graph_objects as go
import plotly.express as px
from typing import List, Dict, Any
import json

## Part 1: Setting Up Neo4j Connection


In [22]:
### Option 1: Using Neo4j Aura (Recommended)
# Visit https://console.neo4j.io/ to create a free Neo4j Aura instance
# You'll get credentials like these (replace with your actual credentials):

# ⚠️ IMPORTANT: Replace these with your actual Neo4j Aura credentials
NEO4J_URI = "neo4j+s://your-instance.databases.neo4j.io"  # Your Neo4j URI
NEO4J_USERNAME = "neo4j"                                   # Usually 'neo4j'
NEO4J_PASSWORD = "your-password-here"                      # Your generated password

# Uncomment and set your credentials
# NEO4J_URI = "your-uri-here"
# NEO4J_USERNAME = "neo4j"
# NEO4J_PASSWORD = "your-password-here"

# For demonstration purposes, we'll use a mock connection class
USE_REAL_NEO4J = False  # Set to True when you have real credentials

### Neo4j Connection Class

In [None]:
class Neo4jConnection:
    """A connection class for Neo4j database operations."""

    def __init__(self, uri, user, password):
        if USE_REAL_NEO4J:
            self.driver = GraphDatabase.driver(uri, auth=(user, password))
        else:
            print("⚠️ Using mock connection. Set USE_REAL_NEO4J=True for real database.")
            self.driver = None

    def query(self, query, parameters=None):
        """Execute a Cypher query and return results."""
        if USE_REAL_NEO4J and self.driver:
            with self.driver.session() as session:
                result = session.run(query, parameters or {})
                # Convert each Record to a plain dict so real and mock results share one shape
                return [record.data() for record in result]
        else:
            # Mock response for demonstration
            print(f"Mock Query Executed: {query}")
            if parameters:
                print(f"Parameters: {parameters}")
            return self._mock_response(query)

    def _mock_response(self, query):
        """Generate mock responses for common queries."""
        if "CREATE" in query.upper():
            return [{"message": "Node/relationship created"}]
        elif "MATCH" in query.upper() and "RETURN" in query.upper():
            return [
                {"name": "Sample Node 1", "type": "Person"},
                {"name": "Sample Node 2", "type": "Organization"}
            ]
        else:
            return [{"status": "Query executed"}]

    def close(self):
        """Close the database connection."""
        if self.driver:
            self.driver.close()
        print("Connection closed")

# Create connection instance
if 'NEO4J_URI' in globals() and NEO4J_URI != "neo4j+s://your-instance.databases.neo4j.io":
    neo4j_conn = Neo4jConnection(NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD)
else:
    neo4j_conn = Neo4jConnection("mock://localhost", "user", "pass")

# Test connection
result = neo4j_conn.query("MATCH (n) RETURN count(n) as node_count")
print("Connection test result:", result)
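
Hardcoded credentials are easy to leak when a notebook is shared. One possible alternative is to read them from environment variables and derive the real-versus-mock switch from what is found. The sketch below is an assumption rather than part of the original setup: the helper `load_neo4j_config` is hypothetical, and the environment variable names simply reuse the notebook's `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD`.

```python
import os

# Placeholder URI used in the cell above; it signals "not configured yet"
PLACEHOLDER_URI = "neo4j+s://your-instance.databases.neo4j.io"


def load_neo4j_config(env=None):
    """Read Neo4j settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", PLACEHOLDER_URI)
    user = env.get("NEO4J_USERNAME", "neo4j")
    password = env.get("NEO4J_PASSWORD", "")
    # Only switch to the real driver when a non-placeholder URI and a password exist
    use_real = uri != PLACEHOLDER_URI and bool(password)
    return {"uri": uri, "user": user, "password": password, "use_real": use_real}


# Example with an explicit mapping (hypothetical values) instead of the real environment
config = load_neo4j_config({
    "NEO4J_URI": "neo4j+s://abc123.databases.neo4j.io",
    "NEO4J_PASSWORD": "secret",
})
print(config["use_real"])
```

With a dictionary like this, `USE_REAL_NEO4J` could be set from `config["use_real"]` instead of being edited by hand.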

## Part 2: Understanding Graph Data Model

In [None]:
### Creating Entities, Relationships, and Properties

# Let's create a simple academic knowledge graph
# We'll model: Researchers, Papers, Institutions, and Topics

def create_sample_knowledge_graph():
    """Create a sample academic knowledge graph."""

    # Clear existing data (be careful with this in production!)
    clear_query = "MATCH (n) DETACH DELETE n"
    neo4j_conn.query(clear_query)
    print("Database cleared")

    # Create researcher nodes
    researchers = [
        {"name": "Geoffrey Hinton", "field": "Deep Learning", "h_index": 175},
        {"name": "Yann LeCun", "field": "Computer Vision", "h_index": 169},
        {"name": "Yoshua Bengio", "field": "Machine Learning", "h_index": 155},
        {"name": "Andrew Ng", "field": "Machine Learning", "h_index": 89}
    ]

    for researcher in researchers:
        query = """
        CREATE (r:Researcher {
            name: $name,
            field: $field,
            h_index: $h_index
        })
        """
        neo4j_conn.query(query, researcher)

    print(f"Created {len(researchers)} researcher nodes")

    # Create institution nodes
    institutions = [
        {"name": "University of Toronto", "country": "Canada", "founded": 1827},
        {"name": "New York University", "country": "USA", "founded": 1831},
        {"name": "Stanford University", "country": "USA", "founded": 1885}
    ]

    for institution in institutions:
        query = """
        CREATE (i:Institution {
            name: $name,
            country: $country,
            founded: $founded
        })
        """
        neo4j_conn.query(query, institution)

    print(f"Created {len(institutions)} institution nodes")

    # Create paper nodes
    papers = [
        {
            "title": "Deep Learning",
            "year": 2015,
            "citations": 45000,
            "venue": "Nature"
        },
        {
            "title": "ImageNet Classification with Deep Convolutional Neural Networks",
            "year": 2012,
            "citations": 85000,
            "venue": "NIPS"
        },
        {
            "title": "Attention Is All You Need",
            "year": 2017,
            "citations": 65000,
            "venue": "NIPS"
        }
    ]

    for paper in papers:
        query = """
        CREATE (p:Paper {
            title: $title,
            year: $year,
            citations: $citations,
            venue: $venue
        })
        """
        neo4j_conn.query(query, paper)

    print(f"Created {len(papers)} paper nodes")

    # Create relationships
    relationships = [
        # Researcher affiliations
        ("Geoffrey Hinton", "AFFILIATED_WITH", "University of Toronto", {"since": 1987, "role": "Professor"}),
        ("Yann LeCun", "AFFILIATED_WITH", "New York University", {"since": 2003, "role": "Professor"}),
        ("Andrew Ng", "AFFILIATED_WITH", "Stanford University", {"since": 2002, "role": "Professor"}),

        # Paper authorships
        ("Geoffrey Hinton", "AUTHORED", "Deep Learning", {"contribution": "senior_author"}),
        ("Yoshua Bengio", "AUTHORED", "Deep Learning", {"contribution": "co_author"}),
        ("Yann LeCun", "AUTHORED", "Deep Learning", {"contribution": "co_author"}),

        # Paper citations (simplified)
        ("Attention Is All You Need", "CITES", "ImageNet Classification with Deep Convolutional Neural Networks", {"context": "comparison"})
    ]

    for source, rel_type, target, properties in relationships:
        # Determine node types based on the names/titles
        if source in [r["name"] for r in researchers]:
            source_label = "Researcher"
            source_prop = "name"
        else:
            source_label = "Paper"
            source_prop = "title"

        if target in [i["name"] for i in institutions]:
            target_label = "Institution"
            target_prop = "name"
        elif target in [p["title"] for p in papers]:
            target_label = "Paper"
            target_prop = "title"
        else:
            target_label = "Researcher"
            target_prop = "name"

        query = f"""
        MATCH (s:{source_label} {{{source_prop}: $source}})
        MATCH (t:{target_label} {{{target_prop}: $target}})
        CREATE (s)-[r:{rel_type}]->(t)
        SET r += $properties
        """

        neo4j_conn.query(query, {
            "source": source,
            "target": target,
            "properties": properties
        })

    print(f"Created {len(relationships)} relationships")
    print("✅ Sample knowledge graph created successfully!")

# Create the sample graph
create_sample_knowledge_graph()
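
Cypher parameters can stand in for values, but not for node labels or relationship types, which is why the relationship loop above builds its query with an f-string. When names are interpolated this way, it may be worth validating them first. The following is a minimal sketch under that assumption; `safe_identifier` and `build_relationship_query` are hypothetical helpers, not part of the Neo4j driver.

```python
import re

# Accept only plain identifiers such as Researcher, AUTHORED, or h_index
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def safe_identifier(name):
    """Return name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"Unsafe Cypher identifier: {name!r}")
    return name


def build_relationship_query(source_label, source_prop, rel_type, target_label, target_prop):
    """Build the MATCH/CREATE statement used above after validating each interpolated name."""
    s_label, s_prop, rel, t_label, t_prop = [
        safe_identifier(part)
        for part in (source_label, source_prop, rel_type, target_label, target_prop)
    ]
    return (
        f"MATCH (s:{s_label} {{{s_prop}: $source}})\n"
        f"MATCH (t:{t_label} {{{t_prop}: $target}})\n"
        f"CREATE (s)-[r:{rel}]->(t)\n"
        "SET r += $properties"
    )


query = build_relationship_query("Researcher", "name", "AUTHORED", "Paper", "title")
print(query)
```

The values themselves (`$source`, `$target`, `$properties`) still travel as parameters, so only the structural names need this check.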


## Part 3: Basic Cypher Queries

In [None]:
### Essential Query Patterns

def demonstrate_cypher_queries():
    """Demonstrate essential Cypher query patterns."""

    print("=== CYPHER QUERY DEMONSTRATIONS ===\n")

    # 1. Simple node retrieval
    print("1. Find all researchers:")
    query = "MATCH (r:Researcher) RETURN r.name, r.field, r.h_index"
    results = neo4j_conn.query(query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*50 + "\n")

    # 2. Filtering with WHERE clause
    print("2. Find high-impact researchers (h-index > 100):")
    query = """
    MATCH (r:Researcher)
    WHERE r.h_index > 100
    RETURN r.name, r.h_index
    ORDER BY r.h_index DESC
    """
    results = neo4j_conn.query(query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*50 + "\n")

    # 3. Finding relationships
    print("3. Find researcher affiliations:")
    query = """
    MATCH (r:Researcher)-[rel:AFFILIATED_WITH]->(i:Institution)
    RETURN r.name, i.name, rel.since, rel.role
    """
    results = neo4j_conn.query(query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*50 + "\n")

    # 4. Multi-hop traversal
    print("4. Find papers and their authors' institutions:")
    query = """
    MATCH (r:Researcher)-[:AUTHORED]->(p:Paper)
    MATCH (r)-[:AFFILIATED_WITH]->(i:Institution)
    RETURN p.title, r.name, i.name
    """
    results = neo4j_conn.query(query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*50 + "\n")

    # 5. Aggregation queries
    print("5. Count papers by institution:")
    query = """
    MATCH (r:Researcher)-[:AUTHORED]->(p:Paper)
    MATCH (r)-[:AFFILIATED_WITH]->(i:Institution)
    RETURN i.name, count(p) as paper_count
    ORDER BY paper_count DESC
    """
    results = neo4j_conn.query(query)
    for record in results:
        print(f"   {record}")

# Run the demonstrations
demonstrate_cypher_queries()
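
In mock mode the queries above return placeholder rows, so it can help to see what the filtering, joining, and aggregation steps should produce on the sample data. The sketch below re-expresses queries 2, 4, and 5 in plain Python over the same researchers and relationships. It is an illustration only, and it assumes one affiliation per researcher for simplicity.

```python
from collections import Counter

researchers = [
    {"name": "Geoffrey Hinton", "h_index": 175},
    {"name": "Yann LeCun", "h_index": 169},
    {"name": "Yoshua Bengio", "h_index": 155},
    {"name": "Andrew Ng", "h_index": 89},
]
# AFFILIATED_WITH relationships from the sample graph (one per researcher here)
affiliated_with = {
    "Geoffrey Hinton": "University of Toronto",
    "Yann LeCun": "New York University",
    "Andrew Ng": "Stanford University",
}
# AUTHORED relationships from the sample graph
authored = [
    ("Geoffrey Hinton", "Deep Learning"),
    ("Yoshua Bengio", "Deep Learning"),
    ("Yann LeCun", "Deep Learning"),
]

# Query 2: WHERE r.h_index > 100 ... ORDER BY r.h_index DESC
high_impact = sorted(
    ((r["name"], r["h_index"]) for r in researchers if r["h_index"] > 100),
    key=lambda row: row[1],
    reverse=True,
)

# Query 4: both MATCH clauses must succeed, so authors without an affiliation drop out
paper_author_institution = [
    (paper, author, affiliated_with[author])
    for author, paper in authored
    if author in affiliated_with
]

# Query 5: count(p) grouped by institution
papers_per_institution = Counter(inst for _, _, inst in paper_author_institution)

print(high_impact)
print(paper_author_institution)
print(dict(papers_per_institution))
```

Yoshua Bengio has no `AFFILIATED_WITH` relationship in the sample graph, so he is missing from the results of queries 4 and 5. Using `OPTIONAL MATCH` for the affiliation would keep him, with a null institution.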

## Part 4: Graph Visualization

In [None]:
### Visualizing Knowledge Graphs with NetworkX and Plotly

def get_graph_data():
    """Extract graph data for visualization."""

    # Get all nodes (elementId() replaces the id() function, which is deprecated in Neo4j 5)
    nodes_query = """
    MATCH (n)
    RETURN elementId(n) as id, labels(n) as labels, properties(n) as properties
    """
    nodes_result = neo4j_conn.query(nodes_query)

    # Get all relationships
    rels_query = """
    MATCH (s)-[r]->(t)
    RETURN elementId(s) as source, elementId(t) as target, type(r) as relationship, properties(r) as properties
    """
    rels_result = neo4j_conn.query(rels_query)

    return nodes_result, rels_result

def create_network_visualization():
    """Create a network visualization of the knowledge graph."""

    # For demonstration, we'll create a mock graph structure
    # In real implementation, this would use get_graph_data()

    # Create a NetworkX graph
    G = nx.Graph()

    # Add nodes with mock data
    researchers = ["Geoffrey Hinton", "Yann LeCun", "Yoshua Bengio", "Andrew Ng"]
    institutions = ["University of Toronto", "NYU", "Stanford"]
    papers = ["Deep Learning", "ImageNet CNN", "Attention Paper"]

    # Add nodes with different colors for different types
    node_colors = []
    node_labels = {}

    for researcher in researchers:
        G.add_node(researcher, type='researcher')
        node_colors.append('lightblue')
        node_labels[researcher] = researcher

    for institution in institutions:
        G.add_node(institution, type='institution')
        node_colors.append('lightgreen')
        node_labels[institution] = institution

    for paper in papers:
        G.add_node(paper, type='paper')
        node_colors.append('lightcoral')
        node_labels[paper] = paper[:15] + "..." if len(paper) > 15 else paper

    # Add edges (relationships)
    relationships = [
        ("Geoffrey Hinton", "University of Toronto"),
        ("Yann LeCun", "NYU"),
        ("Andrew Ng", "Stanford"),
        ("Geoffrey Hinton", "Deep Learning"),
        ("Yann LeCun", "Deep Learning"),
        ("Yoshua Bengio", "Deep Learning"),
        ("Attention Paper", "ImageNet CNN"),  # Citation, matching the CITES relationship in Part 2
    ]

    G.add_edges_from(relationships)

    # Create visualization
    plt.figure(figsize=(12, 8))
    pos = nx.spring_layout(G, k=3, iterations=50)

    nx.draw(G, pos,
            node_color=node_colors,
            node_size=2000,
            font_size=8,
            font_weight='bold',
            edge_color='gray',
            width=2,
            with_labels=True,
            labels=node_labels)

    # Add legend
    from matplotlib.patches import Patch
    legend_elements = [
        Patch(facecolor='lightblue', label='Researchers'),
        Patch(facecolor='lightgreen', label='Institutions'),
        Patch(facecolor='lightcoral', label='Papers')
    ]
    plt.legend(handles=legend_elements, loc='upper right')

    plt.title("Academic Knowledge Graph Visualization", size=16, weight='bold')
    plt.axis('off')
    plt.tight_layout()
    plt.show()

# Create the visualization
create_network_visualization()
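
The visualization above uses hand-written mock data rather than the output of `get_graph_data()`. A minimal sketch of the missing conversion step follows, assuming each row can be read by key with the column names that `get_graph_data()` requests (`id`, `labels`, `properties`, `source`, `target`, `relationship`). The function `records_to_graph` is hypothetical, and the sample rows are hand-written because the mock connection does not return rows of this shape.

```python
def records_to_graph(node_records, rel_records):
    """Turn node and relationship rows into a node dict and an edge list."""
    nodes = {}
    for rec in node_records:
        props = rec["properties"]
        nodes[rec["id"]] = {
            # Use the first label as the node type, as the analytics queries do
            "type": rec["labels"][0] if rec["labels"] else "Unknown",
            # Researchers and institutions have a name, papers have a title
            "label": props.get("name") or props.get("title") or str(rec["id"]),
        }
    # Keep only edges whose endpoints are known nodes
    edges = [
        (rec["source"], rec["target"], rec["relationship"])
        for rec in rel_records
        if rec["source"] in nodes and rec["target"] in nodes
    ]
    return nodes, edges


# Hand-written sample rows in the assumed shape
sample_nodes = [
    {"id": 0, "labels": ["Researcher"], "properties": {"name": "Geoffrey Hinton"}},
    {"id": 1, "labels": ["Paper"], "properties": {"title": "Deep Learning"}},
]
sample_rels = [
    {"source": 0, "target": 1, "relationship": "AUTHORED", "properties": {}},
    {"source": 0, "target": 99, "relationship": "CITES", "properties": {}},
]

nodes, edges = records_to_graph(sample_nodes, sample_rels)
print(nodes)
print(edges)
```

The resulting `nodes` and `edges` could then be passed to `G.add_node` and `G.add_edge` in place of the mock lists.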

### Interactive Plotly Visualization

In [None]:
def create_interactive_graph():
    """Create an interactive graph visualization using Plotly."""

    # Mock data for demonstration
    nodes = [
        {"id": "hinton", "label": "Geoffrey Hinton", "type": "researcher", "x": 0, "y": 0},
        {"id": "lecun", "label": "Yann LeCun", "type": "researcher", "x": 2, "y": 1},
        {"id": "bengio", "label": "Yoshua Bengio", "type": "researcher", "x": 1, "y": 2},
        {"id": "uoft", "label": "University of Toronto", "type": "institution", "x": -1, "y": -1},
        {"id": "nyu", "label": "NYU", "type": "institution", "x": 3, "y": 0},
        {"id": "paper1", "label": "Deep Learning", "type": "paper", "x": 1, "y": 1}
    ]

    edges = [
        {"source": "hinton", "target": "uoft", "relationship": "AFFILIATED_WITH"},
        {"source": "lecun", "target": "nyu", "relationship": "AFFILIATED_WITH"},
        {"source": "hinton", "target": "paper1", "relationship": "AUTHORED"},
        {"source": "lecun", "target": "paper1", "relationship": "AUTHORED"},
        {"source": "bengio", "target": "paper1", "relationship": "AUTHORED"}
    ]

    # Create edge traces
    edge_x = []
    edge_y = []
    edge_info = []

    for edge in edges:
        source_node = next(n for n in nodes if n["id"] == edge["source"])
        target_node = next(n for n in nodes if n["id"] == edge["target"])

        edge_x.extend([source_node["x"], target_node["x"], None])
        edge_y.extend([source_node["y"], target_node["y"], None])
        edge_info.append(edge["relationship"])

    edge_trace = go.Scatter(
        x=edge_x, y=edge_y,
        line=dict(width=2, color='gray'),
        hoverinfo='none',
        mode='lines'
    )

    # Create node traces by type
    researchers = [n for n in nodes if n["type"] == "researcher"]
    institutions = [n for n in nodes if n["type"] == "institution"]
    papers = [n for n in nodes if n["type"] == "paper"]

    researcher_trace = go.Scatter(
        x=[n["x"] for n in researchers],
        y=[n["y"] for n in researchers],
        mode='markers+text',
        hoverinfo='text',
        text=[n["label"] for n in researchers],
        textposition="middle center",
        hovertext=[f"Researcher: {n['label']}" for n in researchers],
        marker=dict(size=30, color='lightblue', line=dict(width=2, color='darkblue')),
        name='Researchers'
    )

    institution_trace = go.Scatter(
        x=[n["x"] for n in institutions],
        y=[n["y"] for n in institutions],
        mode='markers+text',
        hoverinfo='text',
        text=[n["label"] for n in institutions],
        textposition="middle center",
        hovertext=[f"Institution: {n['label']}" for n in institutions],
        marker=dict(size=25, color='lightgreen', line=dict(width=2, color='darkgreen')),
        name='Institutions'
    )

    paper_trace = go.Scatter(
        x=[n["x"] for n in papers],
        y=[n["y"] for n in papers],
        mode='markers+text',
        hoverinfo='text',
        text=[n["label"] for n in papers],
        textposition="middle center",
        hovertext=[f"Paper: {n['label']}" for n in papers],
        marker=dict(size=20, color='lightcoral', line=dict(width=2, color='darkred')),
        name='Papers'
    )

    # Create the figure
    fig = go.Figure(data=[edge_trace, researcher_trace, institution_trace, paper_trace],
                   layout=go.Layout(
                       title=dict(text='Interactive Knowledge Graph', font=dict(size=16)),
                       showlegend=True,
                       hovermode='closest',
                       margin=dict(b=20,l=5,r=5,t=40),
                       annotations=[ dict(
                           text="Hover over nodes for details",
                           showarrow=False,
                           xref="paper", yref="paper",
                           x=0.005, y=-0.002,
                           xanchor="left", yanchor="bottom",
                           font=dict(color="gray", size=12)
                       )],
                       xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                       yaxis=dict(showgrid=False, zeroline=False, showticklabels=False))
                   )

    fig.show()

# Create interactive visualization
create_interactive_graph()
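
The Plotly example hard-codes an `x` and `y` for every node, which stops being practical once nodes come from a database. One simple option, sketched here as an assumption rather than the notebook's method, is to place nodes evenly on a circle. The helper `circular_layout` is hypothetical.

```python
import math


def circular_layout(node_ids, radius=2.0):
    """Place nodes evenly on a circle and return {node_id: (x, y)}."""
    positions = {}
    count = len(node_ids)
    for index, node_id in enumerate(node_ids):
        angle = 2 * math.pi * index / count
        positions[node_id] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


node_ids = ["hinton", "lecun", "bengio", "uoft", "nyu", "paper1"]
positions = circular_layout(node_ids)
for node_id, (x, y) in positions.items():
    print(f"{node_id}: ({x:.2f}, {y:.2f})")
```

Each node dictionary could then take its `x` and `y` from `positions[node["id"]]`. A force-directed layout such as the `nx.spring_layout` call used earlier usually gives more readable results on larger graphs.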

## Part 5: Graph Analytics and Insights

In [None]:
### Computing Graph Metrics

def analyze_graph_structure():
    """Analyze the structure and properties of our knowledge graph."""

    print("=== GRAPH ANALYTICS ===\n")

    # 1. Node counts by type
    print("1. Node counts by type:")
    node_count_query = """
    MATCH (n)
    RETURN labels(n)[0] as node_type, count(n) as count
    ORDER BY count DESC
    """
    results = neo4j_conn.query(node_count_query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*30 + "\n")

    # 2. Relationship counts by type
    print("2. Relationship counts by type:")
    rel_count_query = """
    MATCH ()-[r]->()
    RETURN type(r) as relationship_type, count(r) as count
    ORDER BY count DESC
    """
    results = neo4j_conn.query(rel_count_query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*30 + "\n")

    # 3. Node degree analysis (most connected nodes)
    print("3. Most connected nodes:")
    degree_query = """
    MATCH (n)
    OPTIONAL MATCH (n)-[r]-()
    RETURN labels(n)[0] as type,
           coalesce(n.name, n.title) as name,
           count(r) as degree
    ORDER BY degree DESC
    LIMIT 5
    """
    results = neo4j_conn.query(degree_query)
    for record in results:
        print(f"   {record}")

    print("\n" + "="*30 + "\n")

    # 4. Find collaboration patterns
    print("4. Research collaboration patterns:")
    collab_query = """
    MATCH (r1:Researcher)-[:AUTHORED]->(p:Paper)<-[:AUTHORED]-(r2:Researcher)
    WHERE r1 <> r2
    RETURN r1.name, r2.name, count(p) as collaborations
    ORDER BY collaborations DESC
    """
    results = neo4j_conn.query(collab_query)
    for record in results:
        print(f"   {record}")

# Run graph analytics
analyze_graph_structure()
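
To make the degree and collaboration queries concrete, the same metrics can be computed in plain Python from the relationship list created in Part 2. This is an illustrative sketch over that sample data, not a replacement for running the Cypher queries.

```python
from collections import Counter
from itertools import combinations

# (source, relationship type, target) triples from the sample graph
edges = [
    ("Geoffrey Hinton", "AFFILIATED_WITH", "University of Toronto"),
    ("Yann LeCun", "AFFILIATED_WITH", "New York University"),
    ("Andrew Ng", "AFFILIATED_WITH", "Stanford University"),
    ("Geoffrey Hinton", "AUTHORED", "Deep Learning"),
    ("Yoshua Bengio", "AUTHORED", "Deep Learning"),
    ("Yann LeCun", "AUTHORED", "Deep Learning"),
    ("Attention Is All You Need", "CITES",
     "ImageNet Classification with Deep Convolutional Neural Networks"),
]

# Degree: count every relationship touching a node, ignoring direction
degree = Counter()
for source, _rel, target in edges:
    degree[source] += 1
    degree[target] += 1

# Collaborations: group authors by paper, then count each unordered pair once
authors_by_paper = {}
for source, rel, target in edges:
    if rel == "AUTHORED":
        authors_by_paper.setdefault(target, []).append(source)

collaborations = Counter()
for authors in authors_by_paper.values():
    for pair in combinations(sorted(authors), 2):
        collaborations[pair] += 1

print(degree.most_common(3))
print(dict(collaborations))
```

Note that this sketch counts each pair of co-authors once, whereas the Cypher pattern as written can match every pair in both orders. Replacing `r1 <> r2` with a comparison such as `r1.name < r2.name` is one way to report each pair a single time.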

## Part 6: Practical Exercises

In [None]:
### Exercise 1: Extend the Knowledge Graph

def exercise_extend_graph():
    """Exercise: Extend the knowledge graph with new entities and relationships."""

    print("=== EXERCISE: Extend the Knowledge Graph ===")
    print("Your task: Add new researchers, institutions, and papers to the graph")
    print("\nTasks:")
    print("1. Add 2 new researchers with their properties")
    print("2. Add 1 new institution")
    print("3. Add 2 new papers")
    print("4. Create appropriate relationships")
    print("5. Query the extended graph")
    print("\nSample solution below:")

    # Sample solution
    new_researchers = [
        {"name": "Fei-Fei Li", "field": "Computer Vision", "h_index": 95},
        {"name": "Ian Goodfellow", "field": "Generative AI", "h_index": 78}
    ]

    for researcher in new_researchers:
        query = """
        CREATE (r:Researcher {
            name: $name,
            field: $field,
            h_index: $h_index
        })
        """
        neo4j_conn.query(query, researcher)

    print("✅ Exercise completed! New researchers added.")

    # Verify the addition
    verify_query = "MATCH (r:Researcher) RETURN r.name, r.field ORDER BY r.name"
    results = neo4j_conn.query(verify_query)
    print("\nAll researchers in the graph:")
    for record in results:
        print(f"   {record}")

# Run the exercise
exercise_extend_graph()

### Exercise 2: Complex Query Writing

In [None]:
def exercise_complex_queries():
    """Exercise: Write complex Cypher queries."""

    print("=== EXERCISE: Complex Query Writing ===")
    print("Practice writing queries to answer these questions:")
    print()

    questions_and_queries = [
        {
            "question": "Find all researchers who have authored papers with more than 50,000 citations",
            "query": """
            MATCH (r:Researcher)-[:AUTHORED]->(p:Paper)
            WHERE p.citations > 50000
            RETURN DISTINCT r.name, p.title, p.citations
            ORDER BY p.citations DESC
            """
        },
        {
            "question": "Find institutions with the most productive researchers (by paper count)",
            "query": """
            MATCH (r:Researcher)-[:AFFILIATED_WITH]->(i:Institution)
            MATCH (r)-[:AUTHORED]->(p:Paper)
            RETURN i.name, count(DISTINCT r) as researcher_count, count(p) as paper_count
            ORDER BY paper_count DESC
            """
        },
        {
            "question": "Find the shortest path between any two researchers",
            "query": """
            MATCH path = shortestPath((r1:Researcher)-[*]-(r2:Researcher))
            WHERE r1.name = 'Geoffrey Hinton' AND r2.name = 'Andrew Ng'
            RETURN path, length(path) as path_length
            """
        }
    ]

    for i, item in enumerate(questions_and_queries, 1):
        print(f"{i}. Question: {item['question']}")
        print(f"   Query: {item['query']}")
        print(f"   Results:")
        results = neo4j_conn.query(item['query'])
        for record in results[:3]:  # Show first 3 results
            print(f"      {record}")
        print()

# Run complex query exercises
exercise_complex_queries()
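
The shortest-path question is easier to reason about with a small breadth-first search over the sample graph. The sketch below treats relationships as undirected, like the `-[*]-` pattern in the query. The functions `build_adjacency` and `shortest_path` are hypothetical helpers written for this illustration.

```python
from collections import deque


def build_adjacency(pairs):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        for neighbor in adjacency.get(path[-1], ()):
            if neighbor in visited:
                continue
            if neighbor == goal:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


pairs = [
    ("Geoffrey Hinton", "University of Toronto"),
    ("Yann LeCun", "New York University"),
    ("Andrew Ng", "Stanford University"),
    ("Geoffrey Hinton", "Deep Learning"),
    ("Yoshua Bengio", "Deep Learning"),
    ("Yann LeCun", "Deep Learning"),
]
adjacency = build_adjacency(pairs)

hinton_to_lecun = shortest_path(adjacency, "Geoffrey Hinton", "Yann LeCun")
hinton_to_ng = shortest_path(adjacency, "Geoffrey Hinton", "Andrew Ng")
print(hinton_to_lecun)
print(hinton_to_ng)
```

On the sample graph Andrew Ng is connected only to Stanford University, so no path links him to Geoffrey Hinton, and the exercise query would be expected to return no rows against a real database. Hinton and LeCun are two relationships apart through the "Deep Learning" paper, which corresponds to `length(path) = 2` in Cypher.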

## Part 7: Connection to Graph RAG

In [None]:
### Preparing for Graph RAG Implementation

def prepare_for_graph_rag():
    """Demonstrate how this knowledge graph prepares us for Graph RAG."""

    print("=== PREPARING FOR GRAPH RAG ===")
    print()
    print("The knowledge graph we've built provides the foundation for Graph RAG:")
    print()

    # 1. Show how entities can be retrieved for questions
    print("1. Entity-based retrieval:")
    sample_question = "Tell me about Geoffrey Hinton's research"
    print(f"   Question: '{sample_question}'")
    print("   Graph RAG approach:")

    # Find information about Geoffrey Hinton
    hinton_query = """
    MATCH (r:Researcher {name: 'Geoffrey Hinton'})
    OPTIONAL MATCH (r)-[:AUTHORED]->(p:Paper)
    OPTIONAL MATCH (r)-[:AFFILIATED_WITH]->(i:Institution)
    RETURN r.name, r.field, r.h_index,
           collect(DISTINCT p.title) as papers,
           collect(DISTINCT i.name) as institutions
    """
    results = neo4j_conn.query(hinton_query)
    for record in results:
        print(f"      Retrieved: {record}")

    print()

    # 2. Show relationship traversal
    print("2. Relationship-based reasoning:")
    sample_question2 = "Who has collaborated with Geoffrey Hinton?"
    print(f"   Question: '{sample_question2}'")
    print("   Multi-hop traversal:")

    collab_query = """
    MATCH (r1:Researcher {name: 'Geoffrey Hinton'})-[:AUTHORED]->(p:Paper)<-[:AUTHORED]-(r2:Researcher)
    WHERE r1 <> r2
    RETURN DISTINCT r2.name, p.title
    """
    results = neo4j_conn.query(collab_query)
    for record in results:
        print(f"      Found collaboration: {record}")

    print()
    print("3. Next steps for Graph RAG:")
    print("   - Extract entities and relationships from documents (Chapter 12.3)")
    print("   - Implement graph-based retrieval mechanisms (Chapter 12.4)")
    print("   - Combine with LLMs for natural language responses (Chapter 12.5)")

# Prepare for Graph RAG
prepare_for_graph_rag()
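
The step from a retrieved graph record to an LLM prompt can be sketched in a few lines. This is a minimal illustration under stated assumptions: the functions `format_context` and `build_prompt` are hypothetical, and the record uses simplified keys (`name`, `field`, and so on), as if the query had aliased its columns, rather than the `r.name` style keys the query above would return.

```python
def format_context(record):
    """Turn one retrieved record into a short plain-text context passage."""
    papers = ", ".join(record.get("papers") or []) or "none found"
    institutions = ", ".join(record.get("institutions") or []) or "none found"
    return (
        f"{record['name']} works on {record['field']} (h-index {record['h_index']}). "
        f"Papers: {papers}. Institutions: {institutions}."
    )


def build_prompt(question, record):
    """Combine graph-derived context with the user's question."""
    return f"Context:\n{format_context(record)}\n\nQuestion: {question}\nAnswer:"


# Hand-written record mirroring the sample graph
record = {
    "name": "Geoffrey Hinton",
    "field": "Deep Learning",
    "h_index": 175,
    "papers": ["Deep Learning"],
    "institutions": ["University of Toronto"],
}
prompt = build_prompt("Tell me about Geoffrey Hinton's research", record)
print(prompt)
```

The prompt string would then be sent to whichever LLM the later sections use. Keeping the context grounded in graph facts is what distinguishes this from retrieving free-text chunks.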

## Summary and Next Steps

In [None]:
def chapter_summary():
    """Summarize what we've learned and preview next steps."""

    print("=== CHAPTER 12.2 SUMMARY ===")
    print()
    print("✅ What you've learned:")
    print("   • Knowledge graph fundamentals (entities, relationships, properties)")
    print("   • Neo4j setup and connection management")
    print("   • Essential Cypher query patterns")
    print("   • Graph visualization techniques")
    print("   • Graph analytics and insights")
    print("   • Foundation for Graph RAG systems")
    print()
    print("🚀 Next steps:")
    print("   • Chapter 12.3: Extract entities/relationships from documents")
    print("   • Chapter 12.4: Implement graph-enhanced retrieval")
    print("   • Chapter 12.5: Build end-to-end Graph RAG systems")
    print()
    print("💡 Key takeaways:")
    print("   • Graphs naturally represent connected knowledge")
    print("   • Cypher makes complex relationship queries intuitive")
    print("   • Graph structure enables multi-hop reasoning")
    print("   • Visualization helps understand graph structure")

# Run summary
chapter_summary()

# Clean up
neo4j_conn.close()

## Additional Resources

In [None]:
print("=== ADDITIONAL RESOURCES ===")
print()
print("📚 Documentation:")
print("   • Neo4j Documentation: https://neo4j.com/docs/")
print("   • Cypher Manual: https://neo4j.com/docs/cypher-manual/current/")
print("   • Neo4j Python Driver: https://neo4j.com/docs/python-manual/current/")
print()
print("🛠️ Tools:")
print("   • Neo4j Browser: Interactive query interface")
print("   • Neo4j Bloom: Graph visualization tool")
print("   • APOC Library: Extended procedures for Neo4j")
print()
print("🎓 Learning:")
print("   • Neo4j GraphAcademy: https://graphacademy.neo4j.com/")
print("   • Cypher Query Language: Interactive tutorials")
print("   • Graph Algorithms: Network analysis techniques")

# This notebook provided hands-on experience with knowledge graph fundamentals using Neo4j. You've learned to create, query, and visualize knowledge graphs, the foundation for building powerful Graph RAG systems.


## Troubleshooting Guide

In [None]:
def troubleshooting_guide():
    """Common issues and solutions when working with Neo4j and knowledge graphs."""

    print("=== TROUBLESHOOTING GUIDE ===")
    print()

    issues_and_solutions = [
        {
            "issue": "Connection timeout or authentication failed",
            "solutions": [
                "Verify your Neo4j Aura credentials are correct",
                "Check if your Neo4j instance is running",
                "Ensure network connectivity (firewall/proxy issues)",
                "Try recreating your Neo4j Aura instance"
            ]
        },
        {
            "issue": "Cypher syntax errors",
            "solutions": [
                "Check parentheses and bracket matching",
                "Verify node labels and property names are correct",
                "Use Neo4j Browser to test queries interactively",
                "Check for missing RETURN clauses"
            ]
        },
        {
            "issue": "Query performance issues",
            "solutions": [
                "Add indexes on frequently queried properties",
                "Use PROFILE or EXPLAIN to analyze query plans",
                "Limit result sets with LIMIT clause",
                "Consider using parameters instead of string concatenation"
            ]
        },
        {
            "issue": "Memory issues with large graphs",
            "solutions": [
                "Process data in batches",
                "Use CALL { ... } IN TRANSACTIONS for bulk imports (it replaces USING PERIODIC COMMIT in Neo4j 5)",
                "Increase Neo4j heap size if using local instance",
                "Consider upgrading to larger Neo4j Aura tier"
            ]
        }
    ]

    for item in issues_and_solutions:
        print(f"❌ Issue: {item['issue']}")
        print("   Solutions:")
        for solution in item['solutions']:
            print(f"   • {solution}")
        print()

# Run troubleshooting guide
troubleshooting_guide()

## Performance Tips

In [None]:
def performance_tips():
    """Best practices for optimal Neo4j performance."""

    print("=== PERFORMANCE OPTIMIZATION TIPS ===")
    print()

    tips = [
        {
            "category": "Query Optimization",
            "tips": [
                "Use PROFILE to identify slow query parts",
                "Create indexes on frequently filtered properties",
                "Use specific node labels in MATCH clauses",
                "Avoid Cartesian products (unconnected patterns)"
            ]
        },
        {
            "category": "Data Modeling",
            "tips": [
                "Choose appropriate relationship directions",
                "Avoid excessive node properties",
                "Use specific relationship types",
                "Consider denormalizing for read performance"
            ]
        },
        {
            "category": "Bulk Operations",
            "tips": [
                "Use UNWIND for batch processing",
                "Use CALL { ... } IN TRANSACTIONS for large imports",
                "Process data in smaller chunks",
                "Use parameters to avoid query recompilation"
            ]
        }
    ]

    for category in tips:
        print(f"📊 {category['category']}:")
        for tip in category['tips']:
            print(f"   • {tip}")
        print()

# Show performance tips
performance_tips()
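
The bulk-operation tips (batching with `UNWIND` and processing data in smaller chunks) can be combined in a short sketch. The helpers `chunked` and `build_batches` are hypothetical, the batch size is an arbitrary assumption, and the `Researcher` label is borrowed from the sample graph. Each resulting pair could be passed to `neo4j_conn.query(query, params)`.

```python
def chunked(rows, size):
    """Yield consecutive slices of rows with at most size items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_batches(rows, size=1000):
    """Pair one UNWIND query with a parameter dict per chunk of rows."""
    query = "UNWIND $batch AS row CREATE (r:Researcher) SET r = row"
    return [(query, {"batch": batch}) for batch in chunked(rows, size)]


# Five hypothetical rows split into batches of two
rows = [{"name": f"Researcher {i}"} for i in range(5)]
batches = build_batches(rows, size=2)
for query, params in batches:
    print(len(params["batch"]), "rows ->", query)
```

Sending one query per chunk keeps each transaction small while still avoiding a separate round trip for every node.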

## Practice Exercises

In [None]:
def practice_exercises():
    """Additional practice exercises for reinforcement."""

    print("=== PRACTICE EXERCISES ===")
    print()
    print("Try these exercises to reinforce your learning:")
    print()

    exercises = [
        {
            "level": "Beginner",
            "exercise": "Create a simple social network graph with friends relationships",
            "hint": "Use Person nodes and FRIENDS_WITH relationships"
        },
        {
            "level": "Intermediate",
            "exercise": "Model a movie database with actors, directors, and genres",
            "hint": "Consider ACTED_IN, DIRECTED, HAS_GENRE relationships"
        },
        {
            "level": "Advanced",
            "exercise": "Build a recommendation system using collaborative filtering",
            "hint": "Find users with similar preferences through graph traversal"
        },
        {
            "level": "Expert",
            "exercise": "Implement a knowledge graph for scientific papers with citations",
            "hint": "Model papers, authors, topics, and citation networks"
        }
    ]

    for i, exercise in enumerate(exercises, 1):
        print(f"{i}. {exercise['level']}: {exercise['exercise']}")
        print(f"   Hint: {exercise['hint']}")
        print()

# Show practice exercises
practice_exercises()

## Code Templates for Common Patterns

In [None]:
def code_templates():
    """Reusable code templates for common graph operations."""

    print("=== REUSABLE CODE TEMPLATES ===")
    print()

    templates = {
        "bulk_insert": '''
# Template: Bulk insert nodes
def bulk_insert_nodes(neo4j_conn, node_data, node_label):
    query = f"""
    UNWIND $batch as row
    CREATE (n:{node_label})
    SET n = row
    """
    neo4j_conn.query(query, {"batch": node_data})
        ''',

        "find_shortest_path": '''
# Template: Find shortest path between nodes
def find_shortest_path(neo4j_conn, start_name, end_name):
    query = """
    MATCH (start {name: $start_name}), (end {name: $end_name})
    MATCH path = shortestPath((start)-[*]-(end))
    RETURN path, length(path) as distance
    """
    return neo4j_conn.query(query, {"start_name": start_name, "end_name": end_name})
        ''',

        "recommend_by_similarity": '''
# Template: Recommend items based on similarity
def recommend_similar_items(neo4j_conn, item_name, limit=5):
    query = """
    MATCH (item {name: $item_name})-[:SIMILAR_TO]-(similar)
    RETURN similar.name, similar.score
    ORDER BY similar.score DESC
    LIMIT $limit
    """
    return neo4j_conn.query(query, {"item_name": item_name, "limit": limit})
        '''
    }

    for name, template in templates.items():
        print(f"📝 {name.replace('_', ' ').title()}:")
        print(template)
        print()

# Show code templates
code_templates()

## Final Notes

This notebook has provided you with:

- ✅ **Solid Foundation**: Understanding of knowledge graphs and Neo4j basics
- ✅ **Practical Skills**: Hands-on experience with Cypher queries and graph operations
- ✅ **Visualization Tools**: Methods to explore and understand graph structures
- ✅ **Real Examples**: Academic knowledge graph that demonstrates key concepts
- ✅ **Best Practices**: Performance tips and common patterns

**What's Next?**

In the upcoming notebooks, you'll build upon this foundation to:
- Extract entities and relationships from real documents and construct knowledge graphs automatically (Notebook 12.3)
- Implement sophisticated retrieval mechanisms (Notebook 12.4)
- Build complete Graph RAG systems (Notebook 12.5)

**Remember**: The key to mastering Graph RAG is understanding that knowledge is inherently connected. Traditional RAG treats information as isolated chunks, but Graph RAG recognizes and leverages the web of relationships that make knowledge meaningful.

Keep practicing with different domains and datasets. The patterns you've learned here will serve as building blocks for increasingly sophisticated Graph RAG applications.

Happy graphing! 🚀📊