# Docling-Graph: Getting Started Tutorial

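This notebook walks through the **docling-graph API**: defining Pydantic templates, converting model instances to a knowledge graph, and then visualizing, exporting, and querying the result. The API emphasizes type safety and a modular design.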

**What's new:**
- Modular imports from `docling_graph.core`
- Configuration through a `GraphConfig` object
- Conversion metadata with node and edge counts, per-type counts, and a creation timestamp
- Class-based exporters and visualizers

## Installation & Setup

In [None]:
# Editable install from the repository root, if needed
# !pip install -e .

# Standard library and Pydantic imports
from pathlib import Path
from typing import Optional

from pydantic import BaseModel, Field

# docling-graph imports
from docling_graph.core import (
    GraphConverter,
    GraphConfig,
    InteractiveVisualizer,
    StaticVisualizer,
    ReportGenerator,
    CSVExporter,
    Edge,  # not used in this notebook
)

## Step 1: Define Your Pydantic Template

In [None]:
class Person(BaseModel):
    """Person entity"""
    name: str = Field(..., description="Full name")
    age: Optional[int] = None
    email: Optional[str] = None
    
    model_config = {
        "graph_id_fields": ["name"]  # Use name as unique identifier
    }

class Company(BaseModel):
    """Company entity"""
    name: str
    industry: Optional[str] = None
    
    model_config = {
        "graph_id_fields": ["name"]
    }

class Invoice(BaseModel):
    """Invoice document template"""
    invoice_number: str
    date: Optional[str] = None
    amount: Optional[float] = None
    
    # Relationships (implicit edges)
    customer: Optional[Person] = None
    vendor: Optional[Company] = None
    
    model_config = {
        "graph_id_fields": ["invoice_number"]
    }
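
The `graph_id_fields` entries above name the fields that identify a node. As a rough illustration of that idea, the sketch below builds a deterministic identifier from a label and the chosen fields of a plain dict (for example, the output of `model_dump()`). The `make_node_id` helper and the `Label_hash` ID format are hypothetical and are not taken from docling-graph.

```python
import hashlib
import json


def make_node_id(label: str, data: dict, id_fields: list[str]) -> str:
    """Build a deterministic node ID from the fields listed in id_fields.

    Hypothetical helper: the ID format is an assumption for illustration,
    not the format docling-graph produces.
    """
    missing = [f for f in id_fields if data.get(f) is None]
    if missing:
        raise ValueError(f"Missing identifier fields for {label}: {missing}")
    # Serialize only the identifying fields, with sorted keys for stability.
    key = json.dumps({f: data[f] for f in id_fields}, sort_keys=True, default=str)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return f"{label}_{digest}"


alice_a = make_node_id("Person", {"name": "Alice Dupont", "age": 35}, ["name"])
alice_b = make_node_id("Person", {"name": "Alice Dupont", "age": 36}, ["name"])
acme = make_node_id("Company", {"name": "ACME Corp"}, ["name"])
print(alice_a == alice_b)  # same name, so same ID regardless of age
```

Because only the identifier fields feed the hash, two records that differ in non-identifying fields such as `age` map to the same node in this sketch.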

## Step 2: Create Sample Data

In [None]:
# Create instances
customer = Person(name="Alice Dupont", age=35, email="alice@example.com")
vendor = Company(name="ACME Corp", industry="Technology")

invoice = Invoice(
    invoice_number="INV-2024-001",
    date="2024-10-25",
    amount=1500.00,
    customer=customer,
    vendor=vendor
)

print(f"Created invoice: {invoice.invoice_number}")
print(f"Customer: {invoice.customer.name}")
print(f"Vendor: {invoice.vendor.name}")

## Step 3: Convert to Knowledge Graph

**Key changes:**
- Use `GraphConfig` for configuration
- `pydantic_list_to_graph` returns a tuple: `(graph, metadata)`
- Read graph statistics (node and edge counts, per-type counts, creation time) from `metadata`

In [None]:
# NEW: Create configuration
config = GraphConfig(add_reverse_edges=False)

# NEW: Initialize converter with config
converter = GraphConverter(config=config)

# NEW: Convert returns TUPLE (graph, metadata)
graph, metadata = converter.pydantic_list_to_graph([invoice])

# NEW: Use metadata for statistics
print("Graph Statistics:")
print(f"  Nodes: {metadata.node_count}")
print(f"  Edges: {metadata.edge_count}")
print(f"  Created: {metadata.created_at}")
print("\nNode types:")
for node_type, count in metadata.node_types.items():
    print(f"  {node_type}: {count}")
print("\nEdge types:")
for edge_type, count in metadata.edge_types.items():
    print(f"  {edge_type}: {count}")
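
To make the node and edge counts above less opaque, here is a minimal sketch of how nested models could be flattened into nodes and edges, with per-type counts similar to those in the metadata. It works on plain dicts tagged with hypothetical `_label` and `_id` keys. It is an assumption about the general idea, not the `GraphConverter` implementation.

```python
from collections import Counter


def flatten(entity: dict, nodes: list, edges: list) -> str:
    """Recursively collect nodes and edges from a nested dict.

    Hypothetical sketch: nested dicts become child nodes, and the field
    name that holds them becomes the edge label.
    """
    label = entity["_label"]
    node_id = f"{label}:{entity['_id']}"
    props = {}
    for key, value in entity.items():
        if key in ("_label", "_id"):
            continue
        if isinstance(value, dict):
            # A nested entity: recurse, then link parent -> child.
            child_id = flatten(value, nodes, edges)
            edges.append((node_id, child_id, key))
        elif value is not None:
            props[key] = value
    nodes.append((node_id, label, props))
    return node_id


invoice_data = {
    "_label": "Invoice", "_id": "INV-2024-001", "amount": 1500.0,
    "customer": {"_label": "Person", "_id": "Alice Dupont", "age": 35},
    "vendor": {"_label": "Company", "_id": "ACME Corp", "industry": "Technology"},
}

nodes, edges = [], []
flatten(invoice_data, nodes, edges)
node_types = Counter(label for _, label, _ in nodes)
edge_types = Counter(label for _, _, label in edges)
print(len(nodes), len(edges))
print(dict(node_types))
print(dict(edge_types))
```

For this sample data the sketch yields three nodes and two edges, one labeled `customer` and one labeled `vendor`.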

## Step 4: Visualize the Graph

**Key changes:**
- Use the `InteractiveVisualizer` class instead of a function
- Use `StaticVisualizer` for static PNG, SVG, or PDF output
- Use `ReportGenerator` for Markdown reports

In [None]:
# Create output directory
output_path = Path("outputs/notebook_example")
output_path.mkdir(parents=True, exist_ok=True)

# NEW: Use InteractiveVisualizer class
interactive_viz = InteractiveVisualizer()
interactive_viz.visualize(graph, output_path / "graph_interactive")
print(f"Interactive graph saved to: {output_path}/graph_interactive.html")

# NEW: Use StaticVisualizer for PNG
static_viz = StaticVisualizer()
static_viz.visualize(graph, output_path / "graph_static", format='png')
print(f"Static graph saved to: {output_path}/graph_static.png")

# NEW: Use ReportGenerator
report_gen = ReportGenerator()
report_gen.visualize(graph, output_path / "report", source_model_count=1)
print(f"Markdown report saved to: {output_path}/report.md")

print("\nOpen the HTML file in your browser to explore!")

## Step 5: Export Graph

**Key changes:**
- Use exporter classes instead of functions
- Available exporters: `CSVExporter`, `CypherExporter`, `JSONExporter`
- JSON export is now available

In [None]:
# NEW: Use CSVExporter class
csv_exporter = CSVExporter()
csv_exporter.export(graph, output_path)
print(f"CSV exported to: {output_path}/nodes.csv, {output_path}/edges.csv")

# NEW: JSON export (wasn't available before!)
from docling_graph.core import JSONExporter
json_exporter = JSONExporter()
json_exporter.export(graph, output_path / "graph.json")
print(f"JSON exported to: {output_path}/graph.json")
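
After exporting, it can be useful to confirm that the files were actually written. The sketch below checks for the file names printed in the cell above (`nodes.csv`, `edges.csv`, `graph.json`). The `missing_outputs` helper is hypothetical and uses only `pathlib`; it makes no assumption about the contents of the files.

```python
from pathlib import Path


def missing_outputs(directory: Path, expected: list[str]) -> list[str]:
    """Return the expected file names that are absent or empty in directory."""
    missing = []
    for name in expected:
        path = directory / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


expected_files = ["nodes.csv", "edges.csv", "graph.json"]
problems = missing_outputs(Path("outputs/notebook_example"), expected_files)
print("All exports present" if not problems else f"Missing or empty: {problems}")
```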

## Step 6: Query the Graph

In [None]:
# Find all Person nodes
persons = [(n, d) for n, d in graph.nodes(data=True) if d.get('label') == 'Person']
print(f"Found {len(persons)} Person nodes:")
for node_id, data in persons:
    print(f"  - {data.get('name')} (age: {data.get('age', 'N/A')})")

# List all edges with their labels
print("\nGraph edges:")
for u, v, data in graph.edges(data=True):
    print(f"  {u} --[{data.get('label')}]--> {v}")
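
A common follow-up query is to group edges by relationship type. The sketch below assumes edges arrive as `(source, target, data)` triples with a `'label'` entry in `data`, which is the shape the loop above iterates over. The `group_edges_by_label` helper and the sample node IDs are hypothetical.

```python
from collections import defaultdict


def group_edges_by_label(edges):
    """Group (source, target, data) triples by the 'label' entry in data."""
    grouped = defaultdict(list)
    for source, target, data in edges:
        grouped[data.get("label", "UNLABELED")].append((source, target))
    return dict(grouped)


# Stand-in data with made-up node IDs; in the notebook you would pass
# graph.edges(data=True) instead.
sample_edges = [
    ("invoice_1", "person_1", {"label": "customer"}),
    ("invoice_1", "company_1", {"label": "vendor"}),
    ("invoice_2", "person_1", {"label": "customer"}),
]

for label, pairs in group_edges_by_label(sample_edges).items():
    print(f"{label}: {pairs}")
```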