# NetBot API Demo

This notebook demonstrates the complete NetBot API for diagram processing, embedding management, and graph querying.

## Prerequisites

1. **Environment Variables**: Set up your `.env` file with:
   - `GEMINI_API_KEY=your-gemini-api-key`
   - `NEO4J_URI=bolt://localhost:7687`
   - `NEO4J_USER=neo4j`
   - `NEO4J_PASSWORD=your-password`

2. **Sample Images**: Place some diagram images in `data/examples/`

3. **Neo4j Running**: Make sure Neo4j is running and accessible
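
Before running the cells below, it can help to confirm that the four variables listed above are actually visible to the Python process. The following is a minimal sketch, not part of NetBot: `missing_env_vars` is a hypothetical helper, and it only inspects the process environment, so it assumes something (NetBot itself or a dotenv loader) has already loaded your `.env` file.

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = ("GEMINI_API_KEY", "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for this demo; pass a dict to check something
    other than the live process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

An empty string counts as missing here, since an empty API key or password is unlikely to be usable.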

In [1]:
# Import required libraries
import sys
import os
from pathlib import Path

# Import NetBot directly (we're now in the root directory)
from client import NetBot

print(f"Working directory: {os.getcwd()}")
print("NetBot imported successfully!")

Working directory: /Users/qiyao/Code/netbot-v2
NetBot imported successfully!


## 1. Initialize NetBot

NetBot reads credentials from environment variables automatically; you can also pass them explicitly.

In [None]:
# Initialize NetBot (reads from .env file)
try:
    netbot = NetBot()
    print("✅ NetBot initialized successfully!")
    print(f"Neo4j URI: {netbot.neo4j_uri}")
    print(f"Neo4j User: {netbot.neo4j_user}")
    print(f"Gemini API Key: {'Set' if netbot.gemini_api_key else 'Missing'}")
except Exception as e:
    print(f"❌ Failed to initialize NetBot: {e}")
    print("Make sure your .env file is configured correctly")

✅ NetBot initialized successfully!
Neo4j URI: bolt://localhost:7687
Neo4j User: neo4j
Gemini API Key: Set


## 2. Single Diagram Processing

Process a single diagram from image to knowledge graph.

In [21]:
# Set sample image manually
sample_image = "data/examples/IMG_1531.jpg" 
sample_image_name = Path(sample_image).name
print(f"Using image: {sample_image_name}")

# Set query manually
query = "What WIP options do I have?"
query1 = 'find network devices'
query2 = 'find connections'
query3 = 'find security components'
queries = [query1, query2, query3]

Using image: IMG_1531.jpg


In [6]:
# Point the Google Cloud client at the service-account key used for the OCR phase
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'diagram_processing/credentials/gen-lang-client-0343135558-4997ba4c0c1f.json'

# Process a single diagram
print(f"🔄 Processing diagram: {sample_image_name}")
result = netbot.process_diagram(
    image_path=sample_image,
    output_dir="data/processed",
    force_reprocess=False, 
)
     
print(f"\n📊 Processing Result:")
print(f"Status: {result.get('status')}")
if result.get('status') == 'success':
    print(f"Nodes: {len(result.get('nodes', []))}")
    print(f"Relationships: {len(result.get('relationships', []))}")
    print(f"Neo4j stored: {result.get('neo4j_stored')}")
        
    # Show first few nodes
    nodes = result.get('nodes', [])
    if nodes:
        print(f"\n📋 Sample Nodes:")
        for i, node in enumerate(nodes[:3]):
            print(f"  {i+1}. {node.get('label', 'Unknown')} ({node.get('type', 'Unknown')})")
else:
    print(f"Full result: {result}")

🔄 Processing diagram: IMG_1531.jpg
Starting diagram-to-graph processing for: data/examples/IMG_1531.jpg
Phase 1: Image preprocessing & OCR...
Google Cloud Vision client initialized successfully
Detected flowchart diagram: 196 text elements, 51 shapes
Phase 2: Generating complete graph with Gemini 2.5 Pro...
Phase 3: Generating CSV files...

📊 Processing Result:
Status: success
Nodes: 11
Relationships: 11
Neo4j stored: True

📋 Sample Nodes:
  1. WIP (process)
  2. Design Options (process)
  3. Active - Active (process)


## 3. Embedding Management

Add semantic embeddings to enable vector search.
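
To make "vector search" concrete: each node gets an embedding vector, the query is embedded the same way, and the nodes whose vectors point in the most similar direction are returned. The sketch below illustrates that idea with plain cosine similarity over toy vectors. It is not NetBot's implementation; `cosine_similarity`, `top_k_similar`, and the two-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, node_vecs, k=5):
    """Return up to k (node_id, score) pairs, most similar first."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real sentence embeddings have hundreds of dimensions.
toy_nodes = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for node_id, score in top_k_similar([1.0, 0.0], toy_nodes, k=2):
    print(f"{node_id}: {score:.3f}")
```

The `top_k` argument passed to `netbot.search` later in this notebook plays a role comparable to `k` here, although how NetBot ranks and expands results internally may differ.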

In [12]:
# Add embeddings to the processed diagram
diagram_id = os.path.splitext(sample_image_name)[0]

print(f"🧠 Adding embeddings to diagram: {diagram_id}")

success = netbot.add_embeddings(diagram_id, batch_size=50)

if success:
    print("✅ Embeddings added successfully!")
else:
    print("⚠️ Embeddings already exist or failed to add")

🧠 Adding embeddings to diagram: IMG_1531
✅ Embedding model loaded: all-MiniLM-L6-v2
🧠 Adding embeddings to existing diagram: IMG_1531
✅ Connected to Neo4j
✅ Added embeddings to 11 nodes
🔌 Neo4j connection closed
✅ Embeddings added successfully!
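
The cell above derives `diagram_id` by stripping the file extension from the image name. An equivalent using `pathlib` is sketched below. `diagram_id_from_path` is a hypothetical helper, and it assumes NetBot's own ID generation matches this simple rule, which the logs above suggest (`IMG_1531.jpg` becomes `IMG_1531`) but do not guarantee for every file name.

```python
from pathlib import Path


def diagram_id_from_path(image_path):
    """Return the file name without its final extension.

    Hypothetical helper mirroring os.path.splitext(name)[0] from the cell above.
    """
    return Path(image_path).stem


print(diagram_id_from_path("data/examples/IMG_1531.jpg"))
```

Only the last suffix is removed, so a name such as `scan.v2.png` would map to `scan.v2`.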


In [13]:
# Check which diagrams have embeddings
print("🔍 Checking for diagrams without embeddings...")
missing_embeddings = netbot.get_diagrams_without_embeddings()

print(f"Found {len(missing_embeddings)} diagrams without embeddings:")
for diagram in missing_embeddings[:6]:  # Show first 6
    print(f"  - {diagram}")

if len(missing_embeddings) > 6:
    print(f"  ... and {len(missing_embeddings) - 6} more")

🔍 Checking for diagrams without embeddings...
✅ Embedding model loaded: all-MiniLM-L6-v2
✅ Connected to Neo4j
🔌 Neo4j connection closed
Found 5 diagrams without embeddings:
  - IMG_1530
  - IMG_1532
  - Net_accs_transp_detail
  - f0022-sample-network-diagram
  - futureinternet-11-00152-g001


In [None]:
# Try a direct vector search to surface any specific error
import traceback

try:
    result = netbot.search(
        query=query,
        diagram_id=diagram_id,
        method="vector",  # Force vector search
        top_k=5
    )
    print("Vector search successful!")
    # Print nodes with their properties (embeddings are not shown)
    if 'nodes' in result:
        print(f"\nFound {len(result['nodes'])} nodes:")
        for i, node in enumerate(result['nodes'][:3]):  # Show first 3
            if hasattr(node, 'label'):
                print(f"  {i+1}. {node.label} ({node.type})")
                # Show properties only when the node carries any
                if getattr(node, 'properties', None):
                    print(f"     Properties: {node.properties}")
            else:
                # Fall back to the raw value for nodes without a label attribute
                print(f"  {i+1}. {node}")
except Exception as e:
    print(f"Vector search error: {e}")
    traceback.print_exc()

✅ Embedding model loaded: all-MiniLM-L6-v2
✅ VectorSearch: EmbeddingEncoder initialized successfully
🔍 Searching: What WIP options do I have?
✅ Connected to Neo4j
🔍 Embeddings check for IMG_1531: True
Starting semantic subgraph search for: What WIP options do I have?
Loading embeddings cache for diagram: IMG_1531
Cached 11 nodes
Vector search found 5 similar nodes
Phase 1: Found 5 quality seed nodes
Found 3 orphaned nodes, searching for multi-hop paths...
Built hybrid subgraph: 10 nodes (5 intermediate), 12 relationships
🔌 Neo4j connection closed
Vector search successful!

Found 10 nodes:
  1. Not Supported WIP Options (process)
     Properties: {'description': "1. In Regional Proximity setup, if Application is not hosted across regions, User connecting from non application hosted region can't be redirected to any preferred regions, they will be Round-Robin across the region.\n2. No Other Monitor solution will be supported expect mentioned above\n3. Any advanced policy based routing\n4

## 4. Single Diagram Search

Search within a specific diagram using natural language.

In [22]:
# Search the processed diagram
for query in queries:
    print(f"\n🔍 Searching: '{query}'")
    
    results = netbot.search(
        query=query,
        diagram_id=diagram_id,
        method="auto",
        top_k=5
    )
    
    if results.get('error'):
        print(f"❌ Search failed: {results['error']}")
    elif results.get('nodes'):
        nodes = results['nodes']
        relationships = results.get('relationships', [])
        print(f"✅ Found {len(nodes)} nodes, {len(relationships)} relationships")
        
        # Show sample results
        for i, node in enumerate(nodes[:3]):
            label = node.get('label', 'Unknown') if isinstance(node, dict) else getattr(node, 'label', 'Unknown')
            node_type = node.get('type', 'Unknown') if isinstance(node, dict) else getattr(node, 'type', 'Unknown')
            print(f"  {i+1}. {label} ({node_type})")
    else:
        print("❌ No results found")
        
    # Break after first successful search to save time
    if results.get('nodes'):
        break



🔍 Searching: 'find network devices'
✅ Embedding model loaded: all-MiniLM-L6-v2
✅ VectorSearch: EmbeddingEncoder initialized successfully
🔍 Searching: find network devices
✅ Connected to Neo4j
🔍 Embeddings check for IMG_1531: True
Starting semantic subgraph search for: find network devices
Loading embeddings cache for diagram: IMG_1531
Cached 11 nodes
Vector search found 5 similar nodes
Phase 1: Found 5 quality seed nodes
Found 5 orphaned nodes, searching for multi-hop paths...
Built hybrid subgraph: 6 nodes (1 intermediate), 4 relationships
🔌 Neo4j connection closed
✅ Found 6 nodes, 4 relationships
  1. WIP (process)
  2. With in Region DC Proximity (process)
  3. Regional Proximity (process)
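
The search cell above reads `label` and `type` from each node twice over, once for dict-style nodes and once for object-style nodes. That pattern can be factored into a small accessor, sketched here. `node_field` and `format_node` are hypothetical helpers, not part of the NetBot API.

```python
from types import SimpleNamespace


def node_field(node, name, default="Unknown"):
    """Read a field from a node that may be a dict or an object with attributes."""
    if isinstance(node, dict):
        return node.get(name, default)
    return getattr(node, name, default)


def format_node(index, node):
    """Format one result line the same way the search cell above does."""
    return f"  {index}. {node_field(node, 'label')} ({node_field(node, 'type')})"


# Works for both shapes of node.
print(format_node(1, {"label": "WIP", "type": "process"}))
print(format_node(2, SimpleNamespace(label="Regional Proximity", type="process")))
```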


## 5. Query and Visualization

Search with automatic visualization and AI explanation.
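
Graphviz renders graphs described in its DOT language, so a subgraph has to be turned into DOT text at some point before an image can be produced. The sketch below shows one way that conversion could look. It is an illustration only, not NetBot's code: `to_dot` is hypothetical, and the dict keys `id`, `label`, `source`, `target`, and `type` are assumed shapes for nodes and relationships.

```python
def to_dot(nodes, relationships, graph_name="diagram"):
    """Build a DOT digraph string from node and relationship dicts.

    Assumed (hypothetical) shapes:
      node:         {"id": ..., "label": ...}
      relationship: {"source": ..., "target": ..., "type": ...}
    """

    def quote(text):
        # DOT double-quoted strings need backslashes and quotes escaped.
        escaped = str(text).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    lines = [f"digraph {quote(graph_name)} {{"]
    for node in nodes:
        lines.append(f"  {quote(node['id'])} [label={quote(node['label'])}];")
    for rel in relationships:
        lines.append(
            f"  {quote(rel['source'])} -> {quote(rel['target'])} "
            f"[label={quote(rel['type'])}];"
        )
    lines.append("}")
    return "\n".join(lines)


example_nodes = [
    {"id": "n1", "label": "WIP"},
    {"id": "n2", "label": "Design Options"},
]
example_rels = [{"source": "n1", "target": "n2", "type": "LEADS_TO"}]
print(to_dot(example_nodes, example_rels))
```

The resulting text can be saved to a `.dot` file and rendered with the Graphviz `dot` command-line tool.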

In [24]:
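# Query with visualization and explanation (Graphviz only)
# Note: `query` still holds the last value assigned by the search loop above ('find network devices')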
print(f"🎨 Creating visualization for diagram: {diagram_id}")

# GraphViz visualization (saved locally)
print("\n📊 Creating GraphViz visualization (saved to file)...")
viz_results = netbot.query_and_visualize(
    query=query,
    diagram_id=diagram_id,
    backend="graphviz",
    include_explanation=True,
    detailed_explanation=True
)

if viz_results.get('error'):
    print(f"❌ Visualization failed: {viz_results['error']}")
else:
    print("✅ Visualization created successfully!")
    if viz_results.get('image_path'):
        print(f"📊 Image saved: {viz_results['image_path']}")

    if viz_results.get('explanation'):
        print(f"\n💡 AI Explanation:")
        print(viz_results['explanation'])  # Full explanation, no truncation

    nodes = viz_results.get('nodes', [])
    relationships = viz_results.get('relationships', [])
    print(f"\n📊 Visualization contains: {len(nodes)} nodes, {len(relationships)} relationships")

🎨 Creating visualization for diagram: IMG_1531

📊 Creating GraphViz visualization (saved to file)...
✅ Embedding model loaded: all-MiniLM-L6-v2
✅ VectorSearch: EmbeddingEncoder initialized successfully
🔍 Searching: find network devices
✅ Connected to Neo4j
🔍 Embeddings check for IMG_1531: True
Starting semantic subgraph search for: find network devices
Loading embeddings cache for diagram: IMG_1531
Cached 11 nodes
Vector search found 8 similar nodes
Phase 1: Found 8 quality seed nodes
Found 1 orphaned nodes, searching for multi-hop paths...
Built hybrid subgraph: 8 nodes (0 intermediate), 12 relationships
Auto-selected visualization backend: graphviz
Switched to graphviz backend


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


✅ Graphviz graph saved to: data/visualizations/IMG_1531_subgraph_9f15a17e_20250821_215035.png
🔌 Neo4j connection closed
✅ Visualization created successfully!
📊 Image saved: data/visualizations/IMG_1531_subgraph_9f15a17e_20250821_215035.png

💡 AI Explanation:
This flowchart describes a process for selecting and configuring system deployment designs, particularly focusing on load balancing and proximity-based routing. The initial query, "find network devices," appears to be a mismatch for the content of the diagram, which is clearly focused on architectural deployment choices. I will analyze the diagram based on its presented content regarding deployment configuration.

---

### Detailed Flowchart Analysis: "Deployment Design and Configuration"

**1. Overview of the Process Steps and Decision Points**

The process begins at a "Work In Progress" (WIP) state and leads into a central decision point for "Design Options."

*   **WIP (Work In Progress):** This is the starting point of the proc

## 6. Quickstart Workflow

Complete end-to-end workflow: process + embeddings + search + visualization.
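
For comparison, the same steps can be chained by hand using the calls shown in the earlier sections. The sketch below is an approximation of such a workflow, not the actual body of `quickstart`. `manual_quickstart` is a hypothetical function, and it assumes any status other than `'success'` means processing failed, which may be too strict if NetBot reports skipped duplicates with a different status.

```python
from pathlib import Path


def manual_quickstart(netbot, image_path, query, output_dir="data/processed"):
    """Process an image, add embeddings, then search and visualize.

    `netbot` is any object exposing process_diagram, add_embeddings and
    query_and_visualize with the keyword arguments used earlier in this notebook.
    """
    result = netbot.process_diagram(
        image_path=image_path,
        output_dir=output_dir,
        force_reprocess=False,
    )
    # Assumption: only 'success' counts as a usable result.
    if result.get("status") != "success":
        return {"error": f"processing failed: {result}"}

    # Assumption: the diagram ID is the file name without its extension.
    diagram_id = Path(image_path).stem
    netbot.add_embeddings(diagram_id, batch_size=50)

    return netbot.query_and_visualize(
        query=query,
        diagram_id=diagram_id,
        backend="graphviz",
        include_explanation=True,
        detailed_explanation=False,
    )
```

With an initialized client, this would be called as `manual_quickstart(netbot, "data/examples/IMG_1531.jpg", "find network infrastructure")`.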

In [None]:
# Quickstart workflow - get available images first
examples_dir = Path("data/examples")
if examples_dir.exists():
    image_files = list(examples_dir.glob("*.png")) + list(examples_dir.glob("*.jpg")) + list(examples_dir.glob("*.jpeg"))
else:
    image_files = []

if sample_image and len(image_files) > 1:
    # Use a different image for quickstart demo
    quickstart_image = str(image_files[1])

    print(f"🚀 Running quickstart workflow on: {Path(quickstart_image).name}")

    quickstart_results = netbot.quickstart(
        image_path=quickstart_image,
        query="find network infrastructure",
        explanation_detail="basic"
    )

    if quickstart_results.get('error'):
        print(f"❌ Quickstart failed: {quickstart_results['error']}")
    else:
        print("✅ Quickstart completed successfully!")

        # Show results summary
        nodes = quickstart_results.get('nodes', [])
        relationships = quickstart_results.get('relationships', [])
        print(f"📊 Results: {len(nodes)} nodes, {len(relationships)} relationships")

        if quickstart_results.get('image_path'):
            print(f"🖼️ Visualization: {quickstart_results['image_path']}")

        if quickstart_results.get('explanation'):
            print(f"\n💡 Explanation:")
            print(quickstart_results['explanation'])  # Full explanation
elif sample_image:
    print("ℹ️ Only one sample image available - skipping separate quickstart demo")
else:
    print("⏭️ Skipping quickstart - no sample images available")

## 7. Bulk Operations

Demonstrate bulk processing and embedding management.

In [None]:
# Check if we have multiple images for bulk processing
examples_dir = Path("data/examples")
if examples_dir.exists():
    image_files = list(examples_dir.glob("*.png")) + list(examples_dir.glob("*.jpg")) + list(examples_dir.glob("*.jpeg"))
else:
    image_files = []

if len(image_files) >= 2:
    print(f"📁 Found {len(image_files)} images - demonstrating bulk operations")

    # Create a small test directory
    test_bulk_dir = Path("data/test_bulk")
    test_bulk_dir.mkdir(exist_ok=True)

    # Copy 2-3 images to test directory
    import shutil
    test_images = image_files[:min(3, len(image_files))]

    for img in test_images:
        dest = test_bulk_dir / f"bulk_{img.name}"
        if not dest.exists():
            shutil.copy2(img, dest)

    print(f"📋 Created test directory with {len(test_images)} images")

    # For bulk operations, we'll demonstrate the concept but not actually run it
    # since it would process multiple images which takes time
    print("\n💡 Bulk operations example (not executed for demo purposes):")
    print("```python")
    print("bulk_results = netbot.bulk_quickstart(")
    print("    image_directory='data/test_bulk',")
    print("    query='find servers'")
    print(")")
    print("```")
    print("This would process all images in the directory and add embeddings automatically.")

else:
    print("ℹ️ Need at least 2 images for bulk operations demo")
    print("📁 Add more images to data/examples/ to test bulk functionality")
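
The image-discovery logic (three `glob` calls joined together) appears in both the quickstart and bulk cells. A small helper can replace it, as sketched below. `list_images` is hypothetical and differs slightly from the original: it matches suffixes case-insensitively (so `.JPG` is included) and returns the paths sorted by name.

```python
from pathlib import Path

# Same three extensions the cells above search for.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_images(directory):
    """Return sorted image paths directly inside `directory` (empty list if it is missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


image_files = list_images("data/examples")
print(f"Found {len(image_files)} images")
```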

In [None]:
# Demonstrate bulk embedding operations
print("\n🧠 Testing bulk embedding operations...")

# Get all diagrams without embeddings
missing = netbot.get_diagrams_without_embeddings()
print(f"Found {len(missing)} diagrams without embeddings")

if missing:
    # Test bulk check (limit to first 3 for demo)
    sample_diagrams = missing[:3]
    print(f"\n🔍 Checking embedding status for {len(sample_diagrams)} diagrams...")

    status_results = netbot.bulk_check_embeddings(sample_diagrams)

    for diagram_id, has_embeddings in status_results.items():
        status = "✅ Has embeddings" if has_embeddings else "❌ No embeddings"
        print(f"  {diagram_id}: {status}")

    # For demo purposes, show what bulk_add_embeddings would do without actually running it
    diagrams_to_process = [d for d, has in status_results.items() if not has][:2]

    if diagrams_to_process:
        print(f"\n💡 Bulk embedding example (not executed for demo purposes):")
        print("```python")
        print(f"add_results = netbot.bulk_add_embeddings({diagrams_to_process}, batch_size=50)")
        print("```")
        print(f"This would add embeddings to {len(diagrams_to_process)} diagrams: {', '.join(diagrams_to_process)}")
        print("Each diagram would get semantic embeddings for vector search capabilities.")
    else:
        print("ℹ️ All checked diagrams already have embeddings")
else:
    print("✅ All diagrams already have embeddings!")
    print("💡 You can add more images to data/examples/ to test bulk embedding operations")

## 8. API Summary

Complete overview of the NetBot API.

In [None]:
print("📚 NetBot API Summary")
print("=" * 50)

api_methods = [
    {
        "category": "Single Diagram Operations",
        "methods": [
            "process_diagram(image_path, force_reprocess=False) - Convert image to knowledge graph",
            "add_embeddings(diagram_id) - Add semantic embeddings for vector search",
            "search(query, diagram_id, method='auto') - Natural language search",
            "query_and_visualize(query, diagram_id, backend='graphviz') - Search + visualization + explanation",
            "quickstart(image_path, query, explanation_detail='basic') - Complete end-to-end workflow"
        ]
    },
    {
        "category": "Bulk Operations",
        "methods": [
            "bulk_quickstart(image_directory, query) - Process entire directory + embeddings",
            "bulk_add_embeddings(diagram_ids) - Add embeddings to multiple diagrams",
            "bulk_check_embeddings(diagram_ids) - Check embedding status for multiple diagrams",
            "get_diagrams_without_embeddings() - Find diagrams that need embeddings"
        ]
    },
    {
        "category": "Control Options",
        "methods": [
            "force_reprocess=True - Reprocess existing diagrams (default: skip if exists)",
            "method='vector'|'cypher'|'auto' - Search method selection",
            "explanation_detail='none'|'basic'|'detailed' - AI explanation level",
            "backend='graphviz'|'networkx' - Visualization backend choice"
        ]
    }
]

for category in api_methods:
    print(f"\n📋 {category['category']}:")
    for method in category['methods']:
        print(f"  • {method}")

print("\n🎯 Key Design Principles:")
print("  • Automatic diagram_id generation from filename")
print("  • Smart duplicate detection and skipping")
print("  • Force reprocess option for pipeline updates")
print("  • Auto method selection (vector when embeddings available)")
print("  • Clean resource management with proper connection cleanup")

print("\n✅ NetBot API Demo Complete!")

## 9. Cleanup (Optional)

Clean up test data if desired.

In [None]:
# Cleanup test bulk directory (optional)
test_bulk_dir = Path("data/test_bulk")
if test_bulk_dir.exists():
    print(f"🗑️ Test directory exists: {test_bulk_dir}")
    print("You can manually delete it if no longer needed.")
    # Uncomment to auto-delete:
    # import shutil
    # shutil.rmtree(test_bulk_dir)
    # print("✅ Test directory cleaned up")

print("\n🎉 Demo notebook completed successfully!")
print("\n📖 What you learned:")
print("  • Process diagrams with automatic diagram_id generation")
print("  • Control reprocessing with force_reprocess parameter")
print("  • Add embeddings for semantic vector search")
print("  • Search with auto method selection (vector/cypher)")
print("  • Visualize with GraphViz backend")
print("  • Generate detailed AI explanations")
print("  • Handle existing diagrams gracefully")
print("  • Bulk operations for processing multiple diagrams")
print("  • Check embedding status across multiple diagrams")
print("\n🚀 You now have a complete understanding of the NetBot API!")