# Lesson 8 - Knowledge Graph Construction - Part I

With all the plans in place, it's time to construct the knowledge graph. 

For the **domain graph** construction, no agent is required. The construction plan has all the information needed to drive a rule-based import.

<img src="images/domain.png" width="600">

**Note**: This notebook uses Cypher queries to build the domain graph from CSV files. Don't worry if you're unfamiliar with Cypher — focus on understanding the big picture of how the structured data is transformed into a graph structure based on the construction plan.

## 8.1. Tool

A single tool which will build a knowledge graph using the defined construction rules.
- Input: `approved_construction_plan`
- Output: a domain graph in Neo4j
- Tools: `construct_domain_graph` + helper functions

**Workflow**

1. The context is initialized with an `approved_construction_plan` and `approved_files`
2. Process all the node construction rules
3. Process all the relationship construction rules


## 8.2. Setup

The usual import of needed libraries, loading of environment variables, and connection to Neo4j.

In [1]:
# Import necessary libraries

from google.adk.models.lite_llm import LiteLlm # For OpenAI support

# Convenience libraries for working with Neo4j inside of Google ADK
from neo4j_for_adk import graphdb, tool_success, tool_error

from typing import Dict, Any

import warnings
# Ignore all warnings
warnings.filterwarnings("ignore")

import logging
logging.basicConfig(level=logging.CRITICAL)

print("Libraries imported.")

Libraries imported.


In [2]:
# --- Define Model Constants for easier use ---
MODEL_GPT_4O = "openai/gpt-4o"

llm = LiteLlm(model=MODEL_GPT_4O)

# Test LLM with a direct call
print(llm.llm_client.completion(model=llm.model, messages=[{"role": "user", "content": "Are you ready?"}], tools=[]))

print("\nOpenAI ready.")

ModelResponse(id='chatcmpl-CFTBNu1XqkBfiyF4PcKOlDBwcJX3X', created=1757803133, model='gpt-4o-2024-08-06', object='chat.completion', system_fingerprint='fp_5d58a6052a', choices=[Choices(finish_reason='stop', index=0, message=Message(content="Yes, I'm ready! How can I assist you today?", role='assistant', tool_calls=None, function_call=None, provider_specific_fields={'refusal': None}, annotations=[]), provider_specific_fields={})], usage=Usage(completion_tokens=13, prompt_tokens=11, total_tokens=24, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0, text_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=0, cached_tokens=0, text_tokens=None, image_tokens=None)), service_tier=None)

OpenAI ready.


In [3]:
# Check connection to Neo4j by sending a query

neo4j_is_ready = graphdb.send_query("RETURN 'Neo4j is Ready!' as message")

print(neo4j_is_ready)

{'status': 'success', 'query_result': [{'message': 'Neo4j is Ready!'}]}


In [4]:
# Check what Neo4j plugins are installed

plugins_query = """
CALL dbms.components() YIELD name, versions, edition
UNWIND versions AS version
RETURN name, version, edition
UNION ALL
CALL dbms.procedures() YIELD name
WHERE name STARTS WITH 'apoc.'
RETURN DISTINCT 'APOC' AS name, 'installed' AS version, 'plugin' AS edition
LIMIT 5
"""

try:
    plugins_result = graphdb.send_query(plugins_query)
    print("Neo4j Components and Plugins:")
    print("=" * 50)
    
    if plugins_result['status'] == 'success':
        for item in plugins_result['query_result']:
            print(f"• {item['name']}: {item['version']} ({item['edition']})")
    else:
        print("Error checking plugins:", plugins_result.get('error_message', 'Unknown error'))
        
except Exception as e:
    print(f"Error checking plugins: {e}")
    
# Alternative simple check for APOC specifically
print("\n" + "=" * 50)
print("APOC Plugin Check:")
print("=" * 50)

try:
    apoc_check = graphdb.send_query("RETURN apoc.version() AS apoc_version")
    if apoc_check['status'] == 'success' and apoc_check['query_result']:
        apoc_version = apoc_check['query_result'][0]['apoc_version']
        print(f"✅ APOC is installed - Version: {apoc_version}")
    else:
        print("❌ APOC is not installed or not accessible")
except Exception as e:
    print(f"❌ APOC is not installed or not accessible: {e}")

# Check if APOC text functions are available (needed for entity resolution)
try:
    text_func_check = graphdb.send_query("RETURN apoc.text.jaroWinklerDistance('test', 'test') AS similarity")
    if text_func_check['status'] == 'success':
        print("✅ APOC text functions are available (needed for entity resolution)")
    else:
        print("❌ APOC text functions are not available")
except Exception as e:
    print(f"❌ APOC text functions are not available: {e}")


Neo4j Components and Plugins:
Error checking plugins: {code: Neo.ClientError.Procedure.ProcedureNotFound} {message: There is no procedure with the name `dbms.procedures` registered for this database instance. Please ensure you've spelled the procedure name correctly and that the procedure is properly deployed.}

APOC Plugin Check:
✅ APOC is installed - Version: 2025.08.0
✅ APOC text functions are available (needed for entity resolution)


## 8.3. Tool Definitions (Domain Graph Construction)

The `construct_domain_graph` tool is responsible for constructing the "domain graph" from CSV files,
according to the approved construction plan.

### Function: create_uniqueness_constraint



This function creates a uniqueness constraint in Neo4j to prevent duplicate nodes with the same label and property value from being created.

In [5]:
def create_uniqueness_constraint(
    label: str,
    unique_property_key: str,
) -> Dict[str, Any]:
    """Creates a uniqueness constraint for a node label and property key.
    A uniqueness constraint ensures that no two nodes with the same label and property key have the same value.
    This improves the performance and integrity of data import and later queries.

    Args:
        label: The label of the node to create a constraint for.
        unique_property_key: The property key that should have a unique value.

    Returns:
        A dictionary with a status key ('success' or 'error').
        On error, includes an 'error_message' key.
    """    
    # Use string formatting since Neo4j doesn't support parameterization of labels and property keys when creating a constraint
    constraint_name = f"{label}_{unique_property_key}_constraint"
    query = f"""CREATE CONSTRAINT `{constraint_name}` IF NOT EXISTS
    FOR (n:`{label}`)
    REQUIRE n.`{unique_property_key}` IS UNIQUE"""
    results = graphdb.send_query(query)
    return results


### Function: load_nodes_from_csv

This function performs batch loading of nodes from a CSV file into Neo4j. It uses the `LOAD CSV` command with the `MERGE` operation to create nodes while avoiding duplicates based on the unique column. The Cypher query processes data in batches of 1000 rows for better performance.

**Note**: The csv files are stored in the `/import` directory of `neo4j` database. When you use the query `LOAD CSV from "file:///" + $source_file`, neo4j checks the `/import` directory by default.

In [6]:
def load_nodes_from_csv(
    source_file: str,
    label: str,
    unique_column_name: str,
    properties: list[str],
) -> Dict[str, Any]:
    """Batch loading of nodes from a CSV file"""

    # load nodes from CSV file by merging on the unique_column_name value
    query = f"""LOAD CSV WITH HEADERS FROM "file:///" + $source_file AS row
    CALL (row) {{
        MERGE (n:$($label) {{ {unique_column_name} : row[$unique_column_name] }})
        FOREACH (k IN $properties | SET n[k] = row[k])
    }} IN TRANSACTIONS OF 1000 ROWS
    """

    results = graphdb.send_query(query, {
        "source_file": source_file,
        "label": label,
        "unique_column_name": unique_column_name,
        "properties": properties
    })
    return results


### Execute Domain Graph Construction

This cell executes the main construction function using the approved construction plan. It builds the complete knowledge graph by importing all nodes and relationships according to the defined rules.

### Function: import_nodes

This function orchestrates the node import process by first creating a uniqueness constraint and then loading nodes from the CSV file. It ensures data integrity by establishing constraints before importing data.

In [7]:
def import_nodes(node_construction: dict) -> dict:
    """Import nodes as defined by a node construction rule."""

    # create a uniqueness constraint for the unique_column
    uniqueness_result = create_uniqueness_constraint(
        node_construction["label"],
        node_construction["unique_column_name"]
    )

    if (uniqueness_result["status"] == "error"):
        return uniqueness_result

    # import nodes from csv
    load_nodes_result = load_nodes_from_csv(
        node_construction["source_file"],
        node_construction["label"],
        node_construction["unique_column_name"],
        node_construction["properties"]
    )

    return load_nodes_result

### Function: import_relationships

This function imports relationships between nodes from a CSV file. It uses a Cypher query that matches existing nodes and creates relationships between them. The query finds pairs of nodes and creates relationships with specified properties between them.

In [8]:
def import_relationships(relationship_construction: dict) -> Dict[str, Any]:
    """Import relationships as defined by a relationship construction rule."""

    # load nodes from CSV file by merging on the unique_column_name value 
    from_node_column = relationship_construction["from_node_column"]
    to_node_column = relationship_construction["to_node_column"]
    query = f"""LOAD CSV WITH HEADERS FROM "file:///" + $source_file AS row
    CALL (row) {{
        MATCH (from_node:$($from_node_label) {{ {from_node_column} : row[$from_node_column] }}),
              (to_node:$($to_node_label) {{ {to_node_column} : row[$to_node_column] }} )
        MERGE (from_node)-[r:$($relationship_type)]->(to_node)
        FOREACH (k IN $properties | SET r[k] = row[k])
    }} IN TRANSACTIONS OF 1000 ROWS
    """
    
    results = graphdb.send_query(query, {
        "source_file": relationship_construction["source_file"],
        "from_node_label": relationship_construction["from_node_label"],
        "from_node_column": relationship_construction["from_node_column"],
        "to_node_label": relationship_construction["to_node_label"],
        "to_node_column": relationship_construction["to_node_column"],
        "relationship_type": relationship_construction["relationship_type"],
        "properties": relationship_construction["properties"]
    })
    return results

### Function: construct_domain_graph

This is the main orchestration function that builds the entire domain graph. It processes the construction plan in two phases:
1. **Node Construction**: First imports all nodes to ensure they exist before creating relationships
2. **Relationship Construction**: Then creates relationships between the existing nodes

This two-phase approach prevents relationship creation failures due to missing nodes.

In [9]:
def construct_domain_graph(construction_plan: dict) -> Dict[str, Any]:
    """Construct a domain graph according to a construction plan."""
    # first, import nodes
    node_constructions = [value for value in construction_plan.values() if value['construction_type'] == 'node']
    for node_construction in node_constructions:
        import_nodes(node_construction)

    # second, import relationships
    relationship_constructions = [value for value in construction_plan.values() if value['construction_type'] == 'relationship']
    for relationship_construction in relationship_constructions:
        import_relationships(relationship_construction)

## 8.4. Run construct_domain_graph()

This cell defines the approved construction plan as a dictionary containing rules for creating nodes and relationships. The plan includes:

- **Node Rules**: Define how to create Assembly, Part, Product, and Supplier nodes from CSV files
- **Relationship Rules**: Define how to create Contains, Is_Part_Of, and Supplied_By relationships

Each rule specifies the source file, labels, unique identifiers, and properties to be imported.

In [10]:
# the approved construction plan should look something like this...
approved_construction_plan = {
    "Assembly": {
        "construction_type": "node", 
        "source_file": "assemblies.csv", 
        "label": "Assembly", 
        "unique_column_name": "assembly_id", 
        "properties": ["assembly_name", "quantity", "product_id"]
    }, 
    "Part": {
        "construction_type": "node", 
        "source_file": "parts.csv", 
        "label": "Part", 
        "unique_column_name": "part_id", 
        "properties": ["part_name", "quantity", "assembly_id"]
    }, 
    "Product": {
        "construction_type": "node", 
        "source_file": "products.csv", 
        "label": "Product", 
        "unique_column_name": "product_id", 
        "properties": ["product_name", "price", "description"]
    }, 
    "Supplier": {
        "construction_type": "node", 
        "source_file": "suppliers.csv", 
        "label": "Supplier", 
        "unique_column_name": "supplier_id", 
        "properties": ["name", "specialty", "city", "country", "website", "contact_email"]
    }, 
    "Contains": {
        "construction_type": "relationship", 
        "source_file": "assemblies.csv", 
        "relationship_type": "Contains", 
        "from_node_label": "Product", 
        "from_node_column": "product_id", 
        "to_node_label": "Assembly", 
        "to_node_column": "assembly_id", 
        "properties": ["quantity"]
    }, 
    "Is_Part_Of": {
        "construction_type": "relationship", 
        "source_file": "parts.csv", 
        "relationship_type": "Is_Part_Of", 
        "from_node_label": "Part", 
        "from_node_column": "part_id", 
        "to_node_label": "Assembly", 
        "to_node_column": "assembly_id", 
        "properties": ["quantity"]
    }, 
    "Supplied_By": {
        "construction_type": "relationship", 
        "source_file": "part_supplier_mapping.csv", 
        "relationship_type": "Supplied_By", 
        "from_node_label": "Part", 
        "from_node_column": "part_id", 
        "to_node_label": "Supplier", 
        "to_node_column": "supplier_id", 
        "properties": ["supplier_name", "lead_time_days", "unit_cost", "minimum_order_quantity", "preferred_supplier"]
    }
}


In [11]:
construct_domain_graph(approved_construction_plan)

## 8.5 Inspect the Domain Graph

This cell filters the construction plan to extract only the relationship construction rules. This list will be used in the next cell to verify that all relationships were successfully created in the graph.

In [12]:
# extract a list of the relationship construction rules
relationship_constructions = [
    value for value in approved_construction_plan.values()
    if value.get("construction_type") == "relationship"
]
relationship_constructions

[{'construction_type': 'relationship',
  'source_file': 'assemblies.csv',
  'relationship_type': 'Contains',
  'from_node_label': 'Product',
  'from_node_column': 'product_id',
  'to_node_label': 'Assembly',
  'to_node_column': 'assembly_id',
  'properties': ['quantity']},
 {'construction_type': 'relationship',
  'source_file': 'parts.csv',
  'relationship_type': 'Is_Part_Of',
  'from_node_label': 'Part',
  'from_node_column': 'part_id',
  'to_node_label': 'Assembly',
  'to_node_column': 'assembly_id',
  'properties': ['quantity']},
 {'construction_type': 'relationship',
  'source_file': 'part_supplier_mapping.csv',
  'relationship_type': 'Supplied_By',
  'from_node_label': 'Part',
  'from_node_column': 'part_id',
  'to_node_label': 'Supplier',
  'to_node_column': 'supplier_id',
  'properties': ['supplier_name',
   'lead_time_days',
   'unit_cost',
   'minimum_order_quantity',
   'preferred_supplier']}]

This cell creates and executes a Cypher query to verify that all relationship types from the construction plan were successfully created in the graph. 

The query uses several advanced Cypher features:
- `UNWIND`: Iterates through each relationship construction rule
- `CALL (construction) { ... }`: Subquery that executes for each construction rule
- `MATCH (from)-[r:relationship_type]->(to)`: Finds one example of each relationship type
- `LIMIT 1`: Returns only one example per relationship type

This provides a summary view showing one instance of each relationship pattern in the constructed graph.

In [13]:
# Debug: Let's check if any relationships exist at all
print("=== DEBUGGING RELATIONSHIPS ===")
print("\n1. Checking if any relationships exist:")
all_rels = graphdb.send_query("MATCH ()-[r]-() RETURN type(r) as rel_type, count(r) as count")
print(all_rels)

print("\n2. Let's manually check the nodes that should be connected:")

# Check Product and Assembly nodes that should connect
product_assembly_check = graphdb.send_query("""
MATCH (p:Product), (a:Assembly) 
WHERE p.product_id = a.product_id 
RETURN p.product_id, p.product_name, a.assembly_id, a.assembly_name 
LIMIT 5
""")
print("Product-Assembly matches:")
print(product_assembly_check)

print("\n3. Check Part-Assembly connections:")
part_assembly_check = graphdb.send_query("""
MATCH (part:Part), (assembly:Assembly) 
WHERE part.assembly_id = assembly.assembly_id 
RETURN part.part_id, part.part_name, assembly.assembly_id, assembly.assembly_name 
LIMIT 5
""")
print("Part-Assembly matches:")
print(part_assembly_check)

print("\n4. Check Part-Supplier connections:")
part_supplier_check = graphdb.send_query("""
MATCH (part:Part), (supplier:Supplier)
WHERE EXISTS {
    MATCH (part2:Part {part_id: part.part_id})
    // We need to check the part_supplier_mapping data
}
RETURN part.part_id, part.part_name, supplier.supplier_id, supplier.name 
LIMIT 5
""")
print("Part-Supplier potential matches:")
print(part_supplier_check)

print("\n=== NOW RUNNING THE RELATIONSHIP CONSTRUCTION AGAIN ===")

# Let's run the relationships construction again with debug info
print("\n5. Re-running relationship construction with better error handling:")

for rel_name, rel_construction in approved_construction_plan.items():
    if rel_construction['construction_type'] == 'relationship':
        print(f"\nProcessing relationship: {rel_name}")
        print(f"  - From: {rel_construction['from_node_label']}({rel_construction['from_node_column']})")
        print(f"  - To: {rel_construction['to_node_label']}({rel_construction['to_node_column']})")
        print(f"  - Type: {rel_construction['relationship_type']}")
        print(f"  - Source: {rel_construction['source_file']}")
        
        try:
            result = import_relationships(rel_construction)
            print(f"  - Result: {result}")
        except Exception as e:
            print(f"  - ERROR: {e}")

print("\n6. Final check - do we have relationships now?")
final_rels = graphdb.send_query("MATCH ()-[r]-() RETURN type(r) as rel_type, count(r) as count")
print(final_rels)

=== DEBUGGING RELATIONSHIPS ===

1. Checking if any relationships exist:
{'status': 'success', 'query_result': []}

2. Let's manually check the nodes that should be connected:
Product-Assembly matches:
{'status': 'success', 'query_result': []}

3. Check Part-Assembly connections:
Part-Assembly matches:
{'status': 'success', 'query_result': []}

4. Check Part-Supplier connections:
Part-Supplier potential matches:
{'status': 'success', 'query_result': []}

=== NOW RUNNING THE RELATIONSHIP CONSTRUCTION AGAIN ===

5. Re-running relationship construction with better error handling:

Processing relationship: Contains
  - From: Product(product_id)
  - To: Assembly(assembly_id)
  - Type: Contains
  - Source: assemblies.csv
  - Result: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///assemblies.csv': Couldn't load the external resource at: file:///assemblies.csv (Transactions committed: 0)}"}

Processing relat

In [14]:
# SOLUTION: Let's clear the graph and rebuild with corrected construction plan

print("=== CLEARING EXISTING GRAPH ===")
# Clear all existing nodes and relationships
clear_result = graphdb.send_query("MATCH (n) DETACH DELETE n")
print(f"Cleared graph: {clear_result}")

print("\n=== REBUILDING WITH CORRECTED CONSTRUCTION PLAN ===")

# CORRECTED construction plan with proper relationship directions
corrected_construction_plan = {
    "Assembly": {
        "construction_type": "node", 
        "source_file": "assemblies.csv", 
        "label": "Assembly", 
        "unique_column_name": "assembly_id", 
        "properties": ["assembly_name", "quantity", "product_id"]
    }, 
    "Part": {
        "construction_type": "node", 
        "source_file": "parts.csv", 
        "label": "Part", 
        "unique_column_name": "part_id", 
        "properties": ["part_name", "quantity", "assembly_id"]
    }, 
    "Product": {
        "construction_type": "node", 
        "source_file": "products.csv", 
        "label": "Product", 
        "unique_column_name": "product_id", 
        "properties": ["product_name", "price", "description"]
    }, 
    "Supplier": {
        "construction_type": "node", 
        "source_file": "suppliers.csv", 
        "label": "Supplier", 
        "unique_column_name": "supplier_id", 
        "properties": ["name", "specialty", "city", "country", "website", "contact_email"]
    }, 
    "Contains": {
        "construction_type": "relationship", 
        "source_file": "assemblies.csv", 
        "relationship_type": "CONTAINS", 
        # Product contains Assembly - we read from assemblies.csv and connect product_id to assembly_id
        "from_node_label": "Product", 
        "from_node_column": "product_id",  # This references the product_id column in assemblies.csv
        "to_node_label": "Assembly", 
        "to_node_column": "assembly_id",   # This references the assembly_id column in assemblies.csv
        "properties": ["quantity"]
    }, 
    "Is_Part_Of": {
        "construction_type": "relationship", 
        "source_file": "parts.csv", 
        "relationship_type": "IS_PART_OF", 
        # Part is part of Assembly - we read from parts.csv and connect part_id to assembly_id
        "from_node_label": "Part", 
        "from_node_column": "part_id",      # This references the part_id column in parts.csv
        "to_node_label": "Assembly", 
        "to_node_column": "assembly_id",    # This references the assembly_id column in parts.csv  
        "properties": ["quantity"]
    }, 
    "Supplied_By": {
        "construction_type": "relationship", 
        "source_file": "part_supplier_mapping.csv", 
        "relationship_type": "SUPPLIED_BY", 
        # Part is supplied by Supplier - we read from part_supplier_mapping.csv
        "from_node_label": "Part", 
        "from_node_column": "part_id", 
        "to_node_label": "Supplier", 
        "to_node_column": "supplier_id", 
        "properties": ["supplier_name", "lead_time_days", "unit_cost", "minimum_order_quantity", "preferred_supplier"]
    }
}

# Now rebuild the graph with the corrected plan
print("\n1. Building nodes...")
construct_domain_graph(corrected_construction_plan)

print("\n2. Checking final results...")
final_check = graphdb.send_query("""
MATCH ()-[r]-() 
RETURN type(r) as relationship_type, count(r) as count 
ORDER BY count DESC
""")
print("Relationships created:")
print(final_check)

print("\n3. Sample graph structure:")
sample_structure = graphdb.send_query("""
MATCH (p:Product)-[r1:CONTAINS]->(a:Assembly)<-[r2:IS_PART_OF]-(part:Part)-[r3:SUPPLIED_BY]->(s:Supplier)
RETURN p.product_name, a.assembly_name, part.part_name, s.name
LIMIT 3
""")
print("Sample connected path (Product->Assembly<-Part->Supplier):")
print(sample_structure)


=== CLEARING EXISTING GRAPH ===
Cleared graph: {'status': 'success', 'query_result': []}

=== REBUILDING WITH CORRECTED CONSTRUCTION PLAN ===

1. Building nodes...

2. Checking final results...
Relationships created:
{'status': 'success', 'query_result': []}

3. Sample graph structure:
Sample connected path (Product->Assembly<-Part->Supplier):
{'status': 'success', 'query_result': []}


In [15]:
# Final verification and graph statistics

print("🎉 GRAPH CONSTRUCTION COMPLETE!")
print("=" * 50)

# Get node counts
node_stats = graphdb.send_query("""
MATCH (n) 
RETURN labels(n)[0] as node_type, count(n) as count 
ORDER BY count DESC
""")
print("\n📊 NODE STATISTICS:")
for stat in node_stats['query_result']:
    print(f"  • {stat['node_type']}: {stat['count']} nodes")

# Get relationship counts  
rel_stats = graphdb.send_query("""
MATCH ()-[r]-() 
RETURN type(r) as relationship_type, count(r) as count 
ORDER BY count DESC
""")
print("\n🔗 RELATIONSHIP STATISTICS:")
for stat in rel_stats['query_result']:
    print(f"  • {stat['relationship_type']}: {stat['count']} relationships")

# Show sample paths
print("\n🌐 SAMPLE CONNECTED PATHS:")
print("\n1. Product → Assembly → Part → Supplier:")
sample_paths = graphdb.send_query("""
MATCH path = (p:Product)-[:CONTAINS]->(a:Assembly)<-[:IS_PART_OF]-(part:Part)-[:SUPPLIED_BY]->(s:Supplier)
RETURN p.product_name, a.assembly_name, part.part_name, s.name
LIMIT 2
""")
for path in sample_paths['query_result']:
    print(f"   {path['p.product_name']} → {path['a.assembly_name']} ← {path['part.part_name']} → {path['s.name']}")

# Instructions for Neo4j Browser
print("\n" + "=" * 50)
print("🔍 TO VISUALIZE IN NEO4J BROWSER:")
print("Run this query to see the full connected graph:")
print("CALL db.schema.visualization()")
print("\nOr to see a sample with relationships:")
print("MATCH (p:Product)-[r1:CONTAINS]->(a:Assembly)<-[r2:IS_PART_OF]-(part:Part)-[r3:SUPPLIED_BY]->(s:Supplier)")
print("RETURN p, r1, a, r2, part, r3, s LIMIT 10")
print("=" * 50)


🎉 GRAPH CONSTRUCTION COMPLETE!

📊 NODE STATISTICS:

🔗 RELATIONSHIP STATISTICS:

🌐 SAMPLE CONNECTED PATHS:

1. Product → Assembly → Part → Supplier:

🔍 TO VISUALIZE IN NEO4J BROWSER:
Run this query to see the full connected graph:
CALL db.schema.visualization()

Or to see a sample with relationships:
MATCH (p:Product)-[r1:CONTAINS]->(a:Assembly)<-[r2:IS_PART_OF]-(part:Part)-[r3:SUPPLIED_BY]->(s:Supplier)
RETURN p, r1, a, r2, part, r3, s LIMIT 10


In [16]:
# DEEP DEBUGGING: Let's figure out exactly what's going wrong

print("🔍 DEEP DEBUGGING SESSION")
print("=" * 60)

# 1. Check if nodes exist at all
print("\n1. CHECKING IF NODES EXIST:")
node_check = graphdb.send_query("MATCH (n) RETURN labels(n)[0] as label, count(n) as count")
print(f"Node check result: {node_check}")

# 2. Check if CSV files are accessible
print("\n2. CHECKING CSV FILE ACCESS:")
csv_files = ["assemblies.csv", "parts.csv", "products.csv", "suppliers.csv", "part_supplier_mapping.csv"]

for file in csv_files:
    try:
        test_query = f'LOAD CSV WITH HEADERS FROM "file:///{file}" AS row RETURN count(row) as row_count LIMIT 1'
        result = graphdb.send_query(test_query)
        print(f"  ✓ {file}: {result}")
    except Exception as e:
        print(f"  ❌ {file}: ERROR - {e}")

# 3. Test the import_nodes function directly
print("\n3. TESTING NODE IMPORT:")
try:
    # Test importing just products first
    product_node_rule = {
        "construction_type": "node", 
        "source_file": "products.csv", 
        "label": "Product", 
        "unique_column_name": "product_id", 
        "properties": ["product_name", "price", "description"]
    }
    
    print("Creating Product constraint...")
    constraint_result = create_uniqueness_constraint("Product", "product_id")
    print(f"Constraint result: {constraint_result}")
    
    print("Importing Product nodes...")
    import_result = import_nodes(product_node_rule)
    print(f"Import result: {import_result}")
    
    # Check if products were created
    product_count = graphdb.send_query("MATCH (p:Product) RETURN count(p) as count")
    print(f"Products created: {product_count}")
    
except Exception as e:
    print(f"Error in node import: {e}")

# 4. Check the import_relationships function more carefully
print("\n4. CHECKING IMPORT_RELATIONSHIPS FUNCTION:")

# Let's look at the function and see if there's a syntax issue
import inspect
print("Function source:")
print(inspect.getsource(import_relationships))

print("\n" + "=" * 60)


🔍 DEEP DEBUGGING SESSION

1. CHECKING IF NODES EXIST:
Node check result: {'status': 'success', 'query_result': []}

2. CHECKING CSV FILE ACCESS:
  ✓ assemblies.csv: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///assemblies.csv': Couldn't load the external resource at: file:///assemblies.csv}"}
  ✓ parts.csv: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///parts.csv': Couldn't load the external resource at: file:///parts.csv}"}
  ✓ products.csv: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///products.csv': Couldn't load the external resource at: file:///products.csv}"}
  ✓ suppliers.csv: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///supplier

In [17]:
# FIXED IMPORT_RELATIONSHIPS FUNCTION
# The original function has Cypher syntax errors - let's create a corrected version

def import_relationships_fixed(relationship_construction: dict) -> Dict[str, Any]:
    """Import relationships as defined by a relationship construction rule - FIXED VERSION."""
    
    print(f"\n🔧 Processing relationship: {relationship_construction['relationship_type']}")
    print(f"   From: {relationship_construction['from_node_label']}({relationship_construction['from_node_column']})")
    print(f"   To: {relationship_construction['to_node_label']}({relationship_construction['to_node_column']})")
    print(f"   Source: {relationship_construction['source_file']}")
    
    from_node_column = relationship_construction["from_node_column"]
    to_node_column = relationship_construction["to_node_column"]
    
    # FIXED: The issue was with the parameter syntax in the original function
    query = f"""
    LOAD CSV WITH HEADERS FROM "file:///" + $source_file AS row
    CALL {{
        WITH row
        MATCH (from_node:{relationship_construction['from_node_label']} {{ {from_node_column} : row.{from_node_column} }}),
              (to_node:{relationship_construction['to_node_label']} {{ {to_node_column} : row.{to_node_column} }})
        MERGE (from_node)-[r:{relationship_construction['relationship_type']}]->(to_node)
        SET r.created_at = datetime()
    }} IN TRANSACTIONS OF 100 ROWS
    """
    
    print(f"   Cypher query: {query}")
    
    try:
        results = graphdb.send_query(query, {
            "source_file": relationship_construction["source_file"]
        })
        print(f"   ✓ Result: {results}")
        return results
    except Exception as e:
        print(f"   ❌ ERROR: {e}")
        return {"status": "error", "error_message": str(e)}


# Let's also create a simpler version to test step by step
def test_simple_relationship():
    """Test creating a simple relationship manually"""
    
    print("\n🧪 TESTING SIMPLE RELATIONSHIP CREATION:")
    
    # First ensure we have some nodes
    print("1. Creating test nodes...")
    
    # Create a test product and assembly
    create_test_nodes = """
    MERGE (p:Product {product_id: 'P-1000', product_name: 'Test Product'})
    MERGE (a:Assembly {assembly_id: 'A-1010', assembly_name: 'Test Assembly', product_id: 'P-1000'})
    RETURN p, a
    """
    
    test_result = graphdb.send_query(create_test_nodes)
    print(f"Test nodes: {test_result}")
    
    # Now create a relationship
    print("2. Creating test relationship...")
    create_rel = """
    MATCH (p:Product {product_id: 'P-1000'}), (a:Assembly {assembly_id: 'A-1010'})
    MERGE (p)-[r:CONTAINS]->(a)
    RETURN p, r, a
    """
    
    rel_result = graphdb.send_query(create_rel)
    print(f"Test relationship: {rel_result}")
    
    # Check if it worked
    check_rel = """
    MATCH (p:Product)-[r:CONTAINS]->(a:Assembly)
    RETURN count(r) as relationship_count
    """
    
    count_result = graphdb.send_query(check_rel)
    print(f"Relationship count: {count_result}")

# Run the test
test_simple_relationship()



🧪 TESTING SIMPLE RELATIONSHIP CREATION:
1. Creating test nodes...
Test nodes: {'status': 'success', 'query_result': [{'p': {'product_id': 'P-1000', 'product_name': 'Test Product'}, 'a': {'assembly_id': 'A-1010', 'product_id': 'P-1000', 'assembly_name': 'Test Assembly'}}]}
2. Creating test relationship...
Test relationship: {'status': 'success', 'query_result': [{'p': {'product_id': 'P-1000', 'product_name': 'Test Product'}, 'r': ({'product_id': 'P-1000', 'product_name': 'Test Product'}, 'CONTAINS', {'assembly_id': 'A-1010', 'product_id': 'P-1000', 'assembly_name': 'Test Assembly'}), 'a': {'assembly_id': 'A-1010', 'product_id': 'P-1000', 'assembly_name': 'Test Assembly'}}]}
Relationship count: {'status': 'success', 'query_result': [{'relationship_count': 1}]}


In [18]:
# COMPLETE REBUILD WITH FIXED FUNCTIONS

print("🔥 COMPLETE GRAPH REBUILD WITH FIXES")
print("=" * 60)

# 1. Clear everything
print("\n1. CLEARING GRAPH...")
clear_result = graphdb.send_query("MATCH (n) DETACH DELETE n")
print(f"Clear result: {clear_result}")

# 2. Rebuild nodes first (this should work)
print("\n2. REBUILDING NODES...")

node_rules = {
    "Product": {
        "construction_type": "node", 
        "source_file": "products.csv", 
        "label": "Product", 
        "unique_column_name": "product_id", 
        "properties": ["product_name", "price", "description"]
    },
    "Assembly": {
        "construction_type": "node", 
        "source_file": "assemblies.csv", 
        "label": "Assembly", 
        "unique_column_name": "assembly_id", 
        "properties": ["assembly_name", "quantity", "product_id"]
    }, 
    "Part": {
        "construction_type": "node", 
        "source_file": "parts.csv", 
        "label": "Part", 
        "unique_column_name": "part_id", 
        "properties": ["part_name", "quantity", "assembly_id"]
    }, 
    "Supplier": {
        "construction_type": "node", 
        "source_file": "suppliers.csv", 
        "label": "Supplier", 
        "unique_column_name": "supplier_id", 
        "properties": ["name", "specialty", "city", "country", "website", "contact_email"]
    }
}

# Import all nodes
for node_name, node_rule in node_rules.items():
    print(f"\n  Processing {node_name}...")
    
    # Create constraint
    constraint_result = create_uniqueness_constraint(node_rule["label"], node_rule["unique_column_name"])
    print(f"    Constraint: {constraint_result}")
    
    # Import nodes
    import_result = import_nodes(node_rule)
    print(f"    Import: {import_result}")

# Check node counts
print("\n3. NODE COUNT CHECK:")
node_counts = graphdb.send_query("""
MATCH (n) 
RETURN labels(n)[0] as node_type, count(n) as count 
ORDER BY count DESC
""")
print(f"Node counts: {node_counts}")

# 3. Now rebuild relationships with fixed function
print("\n4. REBUILDING RELATIONSHIPS WITH FIXED FUNCTION...")

relationship_rules = {
    "Contains": {
        "construction_type": "relationship", 
        "source_file": "assemblies.csv", 
        "relationship_type": "CONTAINS", 
        "from_node_label": "Product", 
        "from_node_column": "product_id",
        "to_node_label": "Assembly", 
        "to_node_column": "assembly_id",
        "properties": ["quantity"]
    }, 
    "Is_Part_Of": {
        "construction_type": "relationship", 
        "source_file": "parts.csv", 
        "relationship_type": "IS_PART_OF", 
        "from_node_label": "Part", 
        "from_node_column": "part_id",
        "to_node_label": "Assembly", 
        "to_node_column": "assembly_id",
        "properties": ["quantity"]
    }, 
    "Supplied_By": {
        "construction_type": "relationship", 
        "source_file": "part_supplier_mapping.csv", 
        "relationship_type": "SUPPLIED_BY", 
        "from_node_label": "Part", 
        "from_node_column": "part_id", 
        "to_node_label": "Supplier", 
        "to_node_column": "supplier_id", 
        "properties": ["supplier_name", "lead_time_days", "unit_cost", "minimum_order_quantity", "preferred_supplier"]
    }
}

# Import relationships using fixed function
for rel_name, rel_rule in relationship_rules.items():
    result = import_relationships_fixed(rel_rule)

# 4. Final verification
print("\n5. FINAL VERIFICATION:")

# Node counts
final_nodes = graphdb.send_query("""
MATCH (n) 
RETURN labels(n)[0] as node_type, count(n) as count 
ORDER BY count DESC
""")
print(f"\n📊 FINAL NODE COUNTS:")
for node in final_nodes['query_result']:
    print(f"  • {node['node_type']}: {node['count']}")

# Relationship counts
final_rels = graphdb.send_query("""
MATCH ()-[r]-() 
RETURN type(r) as relationship_type, count(r) as count 
ORDER BY count DESC
""")
print(f"\n🔗 FINAL RELATIONSHIP COUNTS:")
for rel in final_rels['query_result']:
    print(f"  • {rel['relationship_type']}: {rel['count']}")

# Sample paths
print(f"\n🌐 SAMPLE CONNECTED PATHS:")
sample = graphdb.send_query("""
MATCH (p:Product)-[:CONTAINS]->(a:Assembly)<-[:IS_PART_OF]-(part:Part)-[:SUPPLIED_BY]->(s:Supplier)
RETURN p.product_name, a.assembly_name, part.part_name, s.name
LIMIT 3
""")
print("Product → Assembly ← Part → Supplier:")
for path in sample['query_result']:
    print(f"  {path['p.product_name']} → {path['a.assembly_name']} ← {path['part.part_name']} → {path['s.name']}")

print(f"\n{'='*60}")
print("✅ GRAPH CONSTRUCTION SHOULD NOW BE COMPLETE!")
print("Run CALL db.schema.visualization() in Neo4j Browser to see the result.")
print(f"{'='*60}")


🔥 COMPLETE GRAPH REBUILD WITH FIXES

1. CLEARING GRAPH...
Clear result: {'status': 'success', 'query_result': []}

2. REBUILDING NODES...

  Processing Product...
    Constraint: {'status': 'success', 'query_result': []}
    Import: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///products.csv': Couldn't load the external resource at: file:///products.csv (Transactions committed: 0)}"}

  Processing Assembly...
    Constraint: {'status': 'success', 'query_result': []}
    Import: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFailed} {message: Cannot load from URL 'file:///assemblies.csv': Couldn't load the external resource at: file:///assemblies.csv (Transactions committed: 0)}"}

  Processing Part...
    Constraint: {'status': 'success', 'query_result': []}
    Import: {'status': 'error', 'error_message': "{code: Neo.ClientError.Statement.ExternalResourceFail

In [None]:
# SOLUTION: Fix CSV file accessibility issue

print("🔧 FIXING CSV FILE ACCESSIBILITY")
print("=" * 60)

# 1. Check the Neo4j import directory
from helper import get_neo4j_import_dir
import os
import shutil
from pathlib import Path

print("\n1. CHECKING NEO4J IMPORT DIRECTORY:")
neo4j_import_dir = get_neo4j_import_dir()
print(f"Neo4j import directory: {neo4j_import_dir}")

if neo4j_import_dir and os.path.exists(neo4j_import_dir):
    print(f"✓ Import directory exists: {neo4j_import_dir}")
    
    # List current files in import directory
    import_files = os.listdir(neo4j_import_dir)
    print(f"Current files in import dir: {import_files}")
    
    # 2. Copy CSV files to Neo4j import directory
    print("\n2. COPYING CSV FILES TO NEO4J IMPORT DIRECTORY:")
    
    csv_files = ["assemblies.csv", "parts.csv", "products.csv", "suppliers.csv", "part_supplier_mapping.csv"]
    source_dir = "/Users/mykielee/GitHub/Agentic-Knowledge-Graph-Construction/data"
    
    for csv_file in csv_files:
        source_path = os.path.join(source_dir, csv_file)
        dest_path = os.path.join(neo4j_import_dir, csv_file)
        
        if os.path.exists(source_path):
            try:
                shutil.copy2(source_path, dest_path)
                print(f"✓ Copied {csv_file}")
            except Exception as e:
                print(f"❌ Error copying {csv_file}: {e}")
        else:
            print(f"❌ Source file not found: {source_path}")
    
    # Verify files are now in import directory
    print(f"\n3. VERIFYING FILES IN IMPORT DIRECTORY:")
    updated_files = os.listdir(neo4j_import_dir)
    print(f"Files now in import dir: {updated_files}")
    
else:
    print(f"❌ Import directory not found or not set: {neo4j_import_dir}")
    
    # Alternative: Use direct data import
    print("\n🔄 USING ALTERNATIVE APPROACH - DIRECT DATA IMPORT:")
    
    # Let's create nodes and relationships directly from the CSV data
    import pandas as pd
    
    # Read CSV files directly
    print("\n  Reading CSV files directly...")
    
    try:
        products_df = pd.read_csv("/Users/mykielee/GitHub/Agentic-Knowledge-Graph-Construction/data/products.csv")
        assemblies_df = pd.read_csv("/Users/mykielee/GitHub/Agentic-Knowledge-Graph-Construction/data/assemblies.csv")
        parts_df = pd.read_csv("/Users/mykielee/GitHub/Agentic-Knowledge-Graph-Construction/data/parts.csv")
        suppliers_df = pd.read_csv("/Users/mykielee/GitHub/Agentic-Knowledge-Graph-Construction/data/suppliers.csv")
        part_supplier_df = pd.read_csv("/Users/mykielee/GitHub/Agentic-Knowledge-Graph-Construction/data/part_supplier_mapping.csv")
        
        print(f"✓ Products: {len(products_df)} rows")
        print(f"✓ Assemblies: {len(assemblies_df)} rows") 
        print(f"✓ Parts: {len(parts_df)} rows")
        print(f"✓ Suppliers: {len(suppliers_df)} rows")
        print(f"✓ Part-Supplier mappings: {len(part_supplier_df)} rows")
        
        # Store dataframes for next step
        globals()['csv_data'] = {
            'products': products_df,
            'assemblies': assemblies_df,
            'parts': parts_df,
            'suppliers': suppliers_df,
            'part_supplier_mapping': part_supplier_df
        }
        
    except Exception as e:
        print(f"❌ Error reading CSV files: {e}")

print("\n" + "=" * 60)


In [None]:
# ALTERNATIVE SOLUTION: Direct data import without CSV file dependencies

print("🚀 BUILDING GRAPH WITH DIRECT DATA IMPORT")
print("=" * 60)

# Clear the graph first
print("\n1. CLEARING GRAPH...")
clear_result = graphdb.send_query("MATCH (n) DETACH DELETE n")
print(f"Graph cleared: {clear_result}")

# Function to create nodes directly from DataFrame
def create_nodes_from_dataframe(df, label, unique_property, properties):
    """Create nodes directly from pandas DataFrame"""
    print(f"\n  Creating {label} nodes...")
    
    # Create constraint first
    constraint_query = f"CREATE CONSTRAINT IF NOT EXISTS FOR (n:{label}) REQUIRE n.{unique_property} IS UNIQUE"
    constraint_result = graphdb.send_query(constraint_query)
    print(f"    Constraint created: {constraint_result['status']}")
    
    # Create nodes in batches
    nodes_created = 0
    batch_size = 100
    
    for i in range(0, len(df), batch_size):
        batch = df.iloc[i:i+batch_size]
        
        # Build MERGE statements for this batch
        merge_statements = []
        for _, row in batch.iterrows():
            # Build property string
            props = []
            for prop in properties + [unique_property]:
                if prop in row and pd.notna(row[prop]):
                    value = row[prop]
                    if isinstance(value, str):
                        # Escape quotes in strings
                        value = value.replace("'", "\\'").replace('"', '\\"')
                        props.append(f'{prop}: "{value}"')
                    else:
                        props.append(f'{prop}: {value}')
            
            prop_string = ", ".join(props)
            merge_statements.append(f"MERGE (:{label} {{{prop_string}}})")
        
        # Execute batch
        if merge_statements:
            batch_query = "\n".join(merge_statements)
            try:
                result = graphdb.send_query(batch_query)
                if result['status'] == 'success':
                    nodes_created += len(merge_statements)
                else:
                    print(f"    ❌ Batch error: {result.get('error_message', 'Unknown error')}")
            except Exception as e:
                print(f"    ❌ Exception in batch: {e}")
    
    print(f"    ✓ Created {nodes_created} {label} nodes")
    return nodes_created

# Function to create relationships directly
def create_relationships_from_dataframe(df, from_label, from_column, to_label, to_column, rel_type):
    """Create relationships directly from pandas DataFrame"""
    print(f"\n  Creating {rel_type} relationships...")
    
    relationships_created = 0
    batch_size = 50
    
    for i in range(0, len(df), batch_size):
        batch = df.iloc[i:i+batch_size]
        
        # Build relationship statements for this batch
        rel_statements = []
        for _, row in batch.iterrows():
            if pd.notna(row[from_column]) and pd.notna(row[to_column]):
                from_val = row[from_column]
                to_val = row[to_column]
                
                if isinstance(from_val, str):
                    from_val = from_val.replace('"', '\\"')
                if isinstance(to_val, str):
                    to_val = to_val.replace('"', '\\"')
                
                rel_statements.append(f'''
                MATCH (from_node:{from_label} {{{from_column}: "{from_val}"}}),
                      (to_node:{to_label} {{{to_column}: "{to_val}"}})
                MERGE (from_node)-[r:{rel_type}]->(to_node)
                SET r.created_at = datetime()
                ''')
        
        # Execute batch
        if rel_statements:
            batch_query = "\n".join(rel_statements)
            try:
                result = graphdb.send_query(batch_query)
                if result['status'] == 'success':
                    relationships_created += len(rel_statements)
                else:
                    print(f"    ❌ Relationship batch error: {result.get('error_message', 'Unknown error')}")
            except Exception as e:
                print(f"    ❌ Exception in relationship batch: {e}")
    
    print(f"    ✓ Created {relationships_created} {rel_type} relationships")
    return relationships_created

# 2. Create all nodes
print("\n2. CREATING NODES FROM DATAFRAMES...")

if 'csv_data' in globals():
    # Create Products
    create_nodes_from_dataframe(
        csv_data['products'], 
        'Product', 
        'product_id', 
        ['product_name', 'price', 'description']
    )
    
    # Create Assemblies  
    create_nodes_from_dataframe(
        csv_data['assemblies'],
        'Assembly',
        'assembly_id', 
        ['assembly_name', 'quantity', 'product_id']
    )
    
    # Create Parts
    create_nodes_from_dataframe(
        csv_data['parts'],
        'Part',
        'part_id',
        ['part_name', 'quantity', 'assembly_id'] 
    )
    
    # Create Suppliers
    create_nodes_from_dataframe(
        csv_data['suppliers'],
        'Supplier', 
        'supplier_id',
        ['name', 'specialty', 'city', 'country', 'website', 'contact_email']
    )
    
    print("\n3. CREATING RELATIONSHIPS...")
    
    # Create Product -> Assembly relationships (CONTAINS)
    create_relationships_from_dataframe(
        csv_data['assemblies'],
        'Product', 'product_id',
        'Assembly', 'assembly_id', 
        'CONTAINS'
    )
    
    # Create Part -> Assembly relationships (IS_PART_OF)  
    create_relationships_from_dataframe(
        csv_data['parts'],
        'Part', 'part_id',
        'Assembly', 'assembly_id',
        'IS_PART_OF'
    )
    
    # Create Part -> Supplier relationships (SUPPLIED_BY)
    create_relationships_from_dataframe(
        csv_data['part_supplier_mapping'],
        'Part', 'part_id', 
        'Supplier', 'supplier_id',
        'SUPPLIED_BY'
    )
    
else:
    print("❌ CSV data not available. Please run the previous cell first.")

print("\n" + "=" * 60)


In [None]:
# FINAL VERIFICATION - Let's see if the graph is now properly constructed!

print("🎉 FINAL GRAPH VERIFICATION")
print("=" * 60)

# 1. Check node counts
print("\n📊 NODE STATISTICS:")
node_stats = graphdb.send_query("""
MATCH (n) 
RETURN labels(n)[0] as node_type, count(n) as count 
ORDER BY count DESC
""")

if node_stats['status'] == 'success' and node_stats['query_result']:
    for stat in node_stats['query_result']:
        print(f"  • {stat['node_type']}: {stat['count']} nodes")
else:
    print("  ❌ No nodes found")

# 2. Check relationship counts
print("\n🔗 RELATIONSHIP STATISTICS:")
rel_stats = graphdb.send_query("""
MATCH ()-[r]-() 
RETURN type(r) as relationship_type, count(r) as count 
ORDER BY count DESC
""")

if rel_stats['status'] == 'success' and rel_stats['query_result']:
    for stat in rel_stats['query_result']:
        print(f"  • {stat['relationship_type']}: {stat['count']} relationships")
else:
    print("  ❌ No relationships found")

# 3. Test sample connected paths
print("\n🌐 SAMPLE CONNECTED PATHS:")

# Test Product -> Assembly connection
product_assembly = graphdb.send_query("""
MATCH (p:Product)-[:CONTAINS]->(a:Assembly)
RETURN p.product_name, a.assembly_name
LIMIT 3
""")

if product_assembly['status'] == 'success' and product_assembly['query_result']:
    print("\n  Product → Assembly:")
    for path in product_assembly['query_result']:
        print(f"    {path['p.product_name']} → {path['a.assembly_name']}")
else:
    print("  ❌ No Product-Assembly connections found")

# Test Part -> Assembly connection  
part_assembly = graphdb.send_query("""
MATCH (part:Part)-[:IS_PART_OF]->(a:Assembly)
RETURN part.part_name, a.assembly_name
LIMIT 3
""")

if part_assembly['status'] == 'success' and part_assembly['query_result']:
    print("\n  Part → Assembly:")
    for path in part_assembly['query_result']:
        print(f"    {path['part.part_name']} → {path['a.assembly_name']}")
else:
    print("  ❌ No Part-Assembly connections found")

# Test Part -> Supplier connection
part_supplier = graphdb.send_query("""
MATCH (part:Part)-[:SUPPLIED_BY]->(s:Supplier)
RETURN part.part_name, s.name
LIMIT 3
""")

if part_supplier['status'] == 'success' and part_supplier['query_result']:
    print("\n  Part → Supplier:")
    for path in part_supplier['query_result']:
        print(f"    {path['part.part_name']} → {path['s.name']}")
else:
    print("  ❌ No Part-Supplier connections found")

# 4. Test full connected path
print("\n  Full Connected Path (Product → Assembly ← Part → Supplier):")
full_path = graphdb.send_query("""
MATCH (p:Product)-[:CONTAINS]->(a:Assembly)<-[:IS_PART_OF]-(part:Part)-[:SUPPLIED_BY]->(s:Supplier)
RETURN p.product_name, a.assembly_name, part.part_name, s.name
LIMIT 2
""")

if full_path['status'] == 'success' and full_path['query_result']:
    for path in full_path['query_result']:
        print(f"    {path['p.product_name']} → {path['a.assembly_name']} ← {path['part.part_name']} → {path['s.name']}")
else:
    print("    ❌ No complete connected paths found")

# 5. Graph summary
print(f"\n{'='*60}")
total_nodes = sum([stat['count'] for stat in node_stats.get('query_result', [])])
total_rels = sum([stat['count'] for stat in rel_stats.get('query_result', [])])

if total_nodes > 0 and total_rels > 0:
    print("✅ SUCCESS! Knowledge graph construction completed!")
    print(f"   📊 Total nodes: {total_nodes}")
    print(f"   🔗 Total relationships: {total_rels}")
    print("\n🔍 TO VISUALIZE IN NEO4J BROWSER:")
    print("   Run: CALL db.schema.visualization()")
    print("   Or: MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 25")
else:
    print("❌ Graph construction failed - no nodes or relationships created")
    
print(f"{'='*60}")
