# Day 1, Session 3: Building Agent Workflows with LangGraph

## From Simple Chains to Complex Workflows

### The Evolution of Agent Architecture

We've seen how ReAct agents work - they reason and act in a loop. But production systems need more:

**Simple ReAct (Session 2):**
```python
while not done:
    thought = llm.think("What next?")
    action = execute_tool(thought.action)
    observation = observe(action)
```

**Production Workflows (This Session):**
```python
# Complex state management
state = WorkflowState(
    invoice_data={...},
    validation_results={...},
    business_context={...}
)

# Parallel execution
async def validate_invoice(state):
    vat_check, risk_check, terms_check = await asyncio.gather(
        validate_vat(state),
        check_vendor_risk(state), 
        verify_payment_terms(state)
    )
    return combine_results(vat_check, risk_check, terms_check)
```

### What LangGraph Brings to the Table

**State Management:**
- Typed state schema (`TypedDict`) shared by all nodes
- Node return values are merged into the state; per-key reducers control how
- Errors and metadata travel in the state alongside the business data

**Graph Architecture:**
- Visual workflow definition
- Conditional routing based on state
- Parallel execution of independent steps

**Production Features:**
- Checkpointing for long-running workflows  
- State history for auditing
- Resumable workflows after interruption

### The Business Case

**Traditional Document Processing:**
```
Sequential: Extract → Validate → Check → Approve
Time: 15 seconds per invoice
Throughput: 4 invoices/minute
```

**LangGraph Workflow:**
```
Parallel: Extract → [Validate ∥ Check ∥ Verify] → Approve  
Time: 6 seconds per invoice
Throughput: 10 invoices/minute
```

Let's build this system step by step!

In [None]:
# Download real invoice and receipt images first
import requests
import zipfile
import io
import os

# Dropbox shared link for the folder
dropbox_url = "https://www.dropbox.com/scl/fo/m9hyfmvi78snwv0nh34mo/AMEXxwXMLAOeve-_yj12ck8?rlkey=urinkikgiuven0fro7r4x5rcu&st=hv3of7g7&dl=1"

print(f"Downloading real invoice data from: {dropbox_url}")

try:
    response = requests.get(dropbox_url, timeout=60)
    response.raise_for_status()

    # Read the content as a zip file
    with zipfile.ZipFile(io.BytesIO(response.content)) as z:
        # Extract all contents to a directory named 'downloaded_images'
        z.extractall("downloaded_images")

    print("✅ Downloaded and extracted images to 'downloaded_images' folder.")
    
    # List downloaded files
    for root, dirs, files in os.walk("downloaded_images"):
        for file in files:
            print(f"  📄 {os.path.join(root, file)}")

except Exception as e:
    print(f"❌ Error downloading images: {e}")

# Install required packages
!pip install -q langgraph langchain langchain-community

# Configuration for course LLM server
OLLAMA_URL = "http://XX.XX.XX.XX"  # Instructor provides
API_TOKEN = "YOUR_TOKEN_HERE"
MODEL = "qwen3:8b"

## Step 1: State Design - The Foundation of Workflows

### Understanding State in Agent Systems

State is the data that flows through your workflow. Good state design is crucial:

**Anti-Pattern - Global Variables:**
```python
# Bad: Global state is fragile
vendor_name = None
amount = None
approval_status = None

def process_invoice():
    global vendor_name, amount, approval_status
    # Dangerous: race conditions, hard to debug
```

**Pattern - Typed State Objects:**
```python
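# Good: typed and traceable (pass frozen=True to also make it immutable)
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass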
class InvoiceState:
    invoice_id: str
    vendor_name: Optional[str] = None
    amount: Optional[float] = None
    errors: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)
```
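A plain `@dataclass` is still mutable. As a minimal sketch of the immutable variant, the example below uses `frozen=True` and creates modified copies with `dataclasses.replace`. The class name `FrozenInvoiceState` and the tuple-typed `errors` field are illustrative choices, not part of the course code.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Optional, Tuple

@dataclass(frozen=True)
class FrozenInvoiceState:
    invoice_id: str
    vendor_name: Optional[str] = None
    amount: Optional[float] = None
    errors: Tuple[str, ...] = ()  # tuple instead of list so the field cannot be appended to

original = FrozenInvoiceState(invoice_id="INV-001")

# replace() builds a new object; the original is left untouched
updated = replace(original, vendor_name="TechSupplies Co.", amount=15000.0)

try:
    original.amount = 1.0  # assignment on a frozen dataclass raises
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

print(original)
print(updated)
```

Every change produces a new object, so earlier states remain available for tracing.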

### LangGraph State Architecture

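LangGraph uses `TypedDict` for state definition. Each node's return value is merged into the state key by key: by default a new value overwrites the old one, and a key annotated with a reducer (for example `operator.add`) combines the old and new values instead:

```python
import operator
from typing import Annotated, Dict, List, Optional, TypedDict

class MyState(TypedDict):
    # Required fields
    invoice_id: str
    
    # Optional fields (TypedDict has no default values; None must be set explicitly)
    amount: Optional[float]
    errors: Annotated[List[str], operator.add]  # Reducer: updates are appended
    
    # Nested structures
    validation_results: Dict[str, bool]
```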

**How State Flows:**
```
Node A (input_state) → output_state_A
                    ↓
Node B (input_state + output_state_A) → output_state_B
                                      ↓
Node C (combined_state) → final_state
```
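The merge rule behind this flow can be modelled in a few lines of plain Python. This is a simplified illustration, not LangGraph's implementation: `REDUCERS` and `merge_update` are hypothetical names. Keys with a reducer combine the old and new values; all other keys are overwritten.

```python
import operator
from typing import Any, Callable, Dict

# Hypothetical reducer table: keys listed here are combined, others are overwritten
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {
    "errors": operator.add,  # list + list -> concatenated list
}

def merge_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new state with `update` merged in (simplified model)."""
    merged = dict(state)
    for key, value in update.items():
        if key in REDUCERS and key in merged:
            merged[key] = REDUCERS[key](merged[key], value)
        else:
            merged[key] = value
    return merged

state = {"invoice_id": "INV-001", "errors": []}
state = merge_update(state, {"vendor_name": "TechCorp"})
state = merge_update(state, {"errors": ["VAT number missing"]})
state = merge_update(state, {"errors": ["Unknown currency"]})
print(state)
```

Without the reducer entry, the second error would replace the first instead of being appended.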

### Designing for Invoice Processing

Our state needs to handle:
- **Input data**: What we start with
- **Extracted data**: What we discover
- **Validation results**: What we verify
- **Metadata**: How we got there

Let's design state that supports complex workflows:

In [None]:
from typing import TypedDict, List, Optional, Dict, Any
from dataclasses import dataclass
from datetime import datetime, timedelta
import json

# Define our workflow state
class InvoiceState(TypedDict):
    # Input
    invoice_id: str
    raw_text: Optional[str]
    
    # Extracted data
    vendor_name: Optional[str]
    amount: Optional[float]
    currency: Optional[str]
    invoice_date: Optional[str]
    due_date: Optional[str]
    payment_terms: Optional[str]
    vat_number: Optional[str]
    line_items: Optional[List[Dict]]
    
    # Validation results
    vat_valid: Optional[bool]
    vendor_risk_score: Optional[float]
    payment_terms_approved: Optional[bool]
    
    # Workflow metadata
    errors: List[str]
    warnings: List[str]
    approval_status: Optional[str]  # 'approved', 'rejected', 'manual_review'
    processing_time: Optional[float]
    steps_executed: List[str]

print("✅ State structure defined")
print("State tracks:")
print("- Invoice data (vendor, amount, dates)")
print("- Validation results (VAT, risk, terms)")
print("- Workflow metadata (errors, approval status)")

## Step 2: Node Functions - Building Workflow Components

### The Node Pattern in LangGraph

Nodes are functions that take the current state and return an update. They follow a consistent pattern (the notebook cells in this step mutate and return the whole state for brevity, which only works when nodes run one at a time):

```python
def my_node(state: MyState) -> dict:
    """
    Clear description of what this node does
    
    Args:
        state: Current workflow state
        
    Returns:
        Partial state update containing only the changed fields
    """
    # 1. Extract needed data from state
    invoice_id = state["invoice_id"]
    
    # 2. Perform the work (do_processing is a placeholder)
    result = do_processing(invoice_id)
    
    # 3. Build the update (don't mutate the input, return new values)
    update = {"new_field": result, "last_node": "my_node"}
    
    # 4. Return the update; LangGraph merges it into the state
    return update
```

### Node Design Principles

**Single Responsibility:**
```python
# Good: Each node has one clear purpose
def extract_vendor_info(state): ...
def validate_vat_number(state): ...
def check_payment_terms(state): ...

# Bad: Node doing too much
def process_everything(state): ...
```

**Error Handling:**
```python
def robust_node(state: MyState) -> MyState:
    try:
        # Core processing
        result = risky_operation(state["data"])
        state["result"] = result
    except ValidationError as e:
        state["errors"].append(f"Validation failed: {str(e)}")
    except TimeoutError:
        state["warnings"].append("Service timeout, using cached data")
        state["result"] = get_cached_data(state["data"])
    except Exception as e:
        state["errors"].append(f"Unexpected error: {str(e)}")
        
    return state
```

### Parallel-Safe Nodes

When nodes run in parallel, they must be independent:

```python
# ✅ Good: Independent operations
def validate_vat(state):
    # Only reads: vat_number
    # Only writes: vat_valid, vat_errors
    pass

def check_vendor_risk(state):
    # Only reads: vendor_name  
    # Only writes: risk_score, risk_factors
    pass

# ❌ Bad: Conflicting writes
def bad_node_a(state):
    state["status"] = "processing_a"  # Conflict!
    
def bad_node_b(state):
    state["status"] = "processing_b"  # Conflict!
```
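One way to check independence before wiring nodes in parallel is to compare the keys each node writes. The sketch below is a hypothetical helper (`find_write_conflicts` is not part of LangGraph) that takes the partial updates returned by a set of nodes and reports every key written by more than one of them.

```python
from collections import defaultdict
from typing import Any, Dict, List

def find_write_conflicts(updates: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each key written by more than one node to the names of those nodes."""
    writers: Dict[str, List[str]] = defaultdict(list)
    for node_name, update in updates.items():
        for key in update:
            writers[key].append(node_name)
    return {key: nodes for key, nodes in writers.items() if len(nodes) > 1}

safe = find_write_conflicts({
    "validate_vat": {"vat_valid": True},
    "check_vendor_risk": {"vendor_risk_score": 0.2},
})
unsafe = find_write_conflicts({
    "bad_node_a": {"status": "processing_a"},
    "bad_node_b": {"status": "processing_b"},
})
print(safe)
print(unsafe)
```

A key that shows up in the result either needs a reducer or should be written by only one of the parallel nodes.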

Let's build our invoice processing nodes:

In [None]:
import random
import time

# Node 1: Extract invoice data
def extract_invoice_data(state: InvoiceState) -> InvoiceState:
    """Extract structured data from invoice"""
    print("📄 Extracting invoice data...")
    
    # Simulate extraction (in production, would use OCR/NLP)
    if state["invoice_id"] == "INV-2024-001":
        state["vendor_name"] = "TechSupplies Co."
        state["amount"] = 15000.00
        state["currency"] = "USD"
        state["invoice_date"] = "2024-01-15"
        state["payment_terms"] = "Net 30"
        state["vat_number"] = "GB123456789"
        state["line_items"] = [
            {"description": "Laptops", "quantity": 5, "unit_price": 2000},
            {"description": "Software Licenses", "quantity": 10, "unit_price": 500}
        ]
    else:
        state["errors"].append(f"Invoice {state['invoice_id']} not found")
    
    state["steps_executed"].append("extract_data")
    return state

# Node 2: Validate VAT
def validate_vat(state: InvoiceState) -> InvoiceState:
    """Validate VAT number"""
    print("🔍 Validating VAT number...")
    
    if state["vat_number"]:
        # Simple validation (in production, use VIES API)
        if state["vat_number"].startswith("GB") and len(state["vat_number"]) == 11:
            state["vat_valid"] = True
        else:
            state["vat_valid"] = False
            state["warnings"].append("VAT number format appears invalid")
    
    state["steps_executed"].append("validate_vat")
    return state

# Node 3: Check vendor risk
def check_vendor_risk(state: InvoiceState) -> InvoiceState:
    """Assess vendor risk score"""
    print("📊 Checking vendor risk...")
    
    # Simulate risk scoring
    vendor_scores = {
        "TechSupplies Co.": 0.2,  # Low risk
        "NewVendor Inc.": 0.7,    # High risk
        "Unknown": 0.9            # Very high risk
    }
    
    state["vendor_risk_score"] = vendor_scores.get(
        state["vendor_name"], 
        0.5  # Default medium risk
    )
    
    if state["vendor_risk_score"] > 0.6:
        state["warnings"].append(f"High risk vendor (score: {state['vendor_risk_score']})")
    
    state["steps_executed"].append("check_vendor_risk")
    return state

# Node 4: Verify payment terms
def verify_payment_terms(state: InvoiceState) -> InvoiceState:
    """Check if payment terms align with policy"""
    print("💰 Verifying payment terms...")
    
    # Business rules
    if state["payment_terms"] == "Net 30":
        state["payment_terms_approved"] = True
    elif state["payment_terms"] == "Net 60" and state["amount"] > 10000:
        state["payment_terms_approved"] = True
    elif state["payment_terms"] == "Net 90":
        state["payment_terms_approved"] = False
        state["warnings"].append("Net 90 requires CFO approval")
    else:
        state["payment_terms_approved"] = False
    
    # Calculate due date (only for terms of the form "Net <days>")
    terms = (state.get("payment_terms") or "").split()
    if state.get("invoice_date") and len(terms) == 2 and terms[1].isdigit():
        invoice_date = datetime.strptime(state["invoice_date"], "%Y-%m-%d")
        due_date = invoice_date + timedelta(days=int(terms[1]))
        state["due_date"] = due_date.strftime("%Y-%m-%d")
    
    state["steps_executed"].append("verify_payment_terms")
    return state

# Node 5: Make approval decision
def make_approval_decision(state: InvoiceState) -> InvoiceState:
    """Decide whether to approve, reject, or flag for review"""
    print("✅ Making approval decision...")
    
    # Decision logic
    if state["errors"]:
        state["approval_status"] = "rejected"
    elif (state.get("vendor_risk_score") or 0) > 0.8:
        state["approval_status"] = "manual_review"
    elif not state.get("vat_valid"):
        state["approval_status"] = "manual_review"
    elif not state.get("payment_terms_approved"):
        state["approval_status"] = "manual_review"
    else:
        state["approval_status"] = "approved"
    
    state["steps_executed"].append("make_decision")
    return state

print("✅ All node functions created")
print("Nodes: extract → validate (parallel) → decision")

## Step 3: Graph Construction - Connecting the Workflow

### Graph Theory for Workflows

LangGraph is based on directed graphs where:
- **Nodes** = Processing steps
- **Edges** = Data flow between steps
- **Conditional Edges** = Dynamic routing based on state

```python
# Simple linear flow
A → B → C → END

# Parallel processing
     B
   ↗   ↘
A       D → END
   ↘   ↗
     C

# Conditional routing
A → condition → {success: B, error: C} → END
```

### Edge Types in LangGraph

**Regular Edges:**
```python
# Always go from node A to node B
workflow.add_edge("node_a", "node_b")
```

**Conditional Edges:**
```python
# Route based on state
def router(state):
    if state["errors"]:
        return "error_handler"
    elif state["amount"] > 10000:
        return "high_value_processor" 
    else:
        return "standard_processor"

workflow.add_conditional_edges(
    "decision_point",
    router,
    {
        "error_handler": "handle_errors",
        "high_value_processor": "process_high_value",
        "standard_processor": "process_standard"
    }
)
```
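Because a router is an ordinary function of the state, it can be unit-tested without building a graph. The sketch below repeats the router with defensive `.get` lookups, assuming that keys such as `amount` may be missing or `None` early in the workflow.

```python
from typing import Any, Dict

def router(state: Dict[str, Any]) -> str:
    """Pick the next branch label from the current state."""
    if state.get("errors"):
        return "error_handler"
    # Treat a missing or None amount as 0 so the comparison cannot fail
    if (state.get("amount") or 0) > 10000:
        return "high_value_processor"
    return "standard_processor"

error_route = router({"errors": ["Invoice not found"]})
high_value_route = router({"errors": [], "amount": 15000.0})
default_route = router({"errors": []})
print(error_route, high_value_route, default_route)
```

The returned labels are the keys of the mapping passed to `add_conditional_edges`, so every label the router can return needs an entry there.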

### Advanced Routing Patterns

**Multi-condition Routing:**
```python
def complex_router(state):
    # Multiple conditions
    if state["errors"]:
        return "error"
    elif state["manual_review_required"]:
        return "human_review" 
    elif state["risk_score"] > 0.8:
        return "risk_assessment"
    else:
        return "auto_approve"
```

**Fan-out/Fan-in:**
```python
# Fan-out: One node triggers multiple parallel nodes
workflow.add_edge("extract", "validate_vat")
workflow.add_edge("extract", "check_vendor")
workflow.add_edge("extract", "verify_terms")

# Fan-in: Multiple nodes converge to one
workflow.add_edge("validate_vat", "make_decision")
workflow.add_edge("check_vendor", "make_decision")
workflow.add_edge("verify_terms", "make_decision")
```

### Graph Compilation and Optimization

```python
# Compile the graph into a runnable app
app = workflow.compile()

# Optional compile arguments (memory_saver is a checkpointer instance, e.g. MemorySaver())
app = workflow.compile(
    checkpointer=memory_saver,       # State persistence
    interrupt_before=["human_step"], # Pause before this node for human input
    interrupt_after=["extract"]      # Pause after extraction so the state can be inspected
)
```

Let's build our invoice processing graph:

In [None]:
from langgraph.graph import StateGraph, END

# Create the graph
workflow = StateGraph(InvoiceState)

# Add nodes
workflow.add_node("extract", extract_invoice_data)
workflow.add_node("validate_vat", validate_vat)
workflow.add_node("check_risk", check_vendor_risk)
workflow.add_node("verify_terms", verify_payment_terms)
workflow.add_node("decide", make_approval_decision)

# Define conditional routing
def route_after_extraction(state: InvoiceState) -> str:
    """Route based on extraction results"""
    if state["errors"]:
        return "decide"  # Skip validation if extraction failed
    return "continue"

# Set entry point
workflow.set_entry_point("extract")

# Add edges (connections between nodes)
workflow.add_conditional_edges(
    "extract",
    route_after_extraction,
    {
        "continue": "validate_vat",
        "decide": "decide"
    }
)

# Validation steps run one after another here (Step 7 builds the parallel version)
workflow.add_edge("validate_vat", "check_risk")
workflow.add_edge("check_risk", "verify_terms")
workflow.add_edge("verify_terms", "decide")

# End after decision
workflow.add_edge("decide", END)

# Compile the graph
app = workflow.compile()

print("✅ Workflow graph compiled")
print("Flow: Extract → Validate (3 checks) → Decision → End")

## Step 4: Workflow Visualization - Understanding Your Graph

### The Power of Visual Workflows

Unlike control flow buried in nested conditionals, a LangGraph workflow is an explicit graph that can be rendered as a diagram. This helps with:

**Development:**
```python
# Code is hard to understand
if extraction_successful:
    if parallel_validation_mode:
        run_vat_validation()
        run_risk_check()
        run_terms_verification()
    else:
        run_sequential_validation()
    
    if all_validations_passed():
        approve()
    else:
        manual_review()
```

**Visual Graph:**
```
┌─────────┐    ┌─────────┐
│ Extract │────│Condition│
└─────────┘    └────┬────┘
                    │
              ┌─────▼─────┐
              │  Parallel │
              │Validation │
              └─────┬─────┘
                    │
              ┌─────▼─────┐
              │  Decision │
              └───────────┘
```

### Graph Analysis Benefits

**Performance Optimization:**
- Identify bottlenecks visually
- See parallel execution opportunities
- Understand data dependencies

**Debugging:**
- Trace execution paths
- Identify where workflows fail
- Understand state flow

**Documentation:**
- Workflows become self-documenting
- Business stakeholders can understand flow
- Compliance audits are easier

### Graph Metrics

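The compiled app exposes its graph structure through `get_graph()`. Metrics beyond node and edge counts are not built in; `count_parallel_paths`, `calculate_parallel_speedup` and `find_slowest_node` below are placeholders for helpers you would write yourself:

```python
# Get graph statistics
graph_info = app.get_graph()
print(f"Nodes: {len(graph_info.nodes)}")
print(f"Edges: {len(graph_info.edges)}")
print(f"Parallel paths: {count_parallel_paths(graph_info)}")  # placeholder helper

# Execution statistics (placeholder helpers)
execution_stats = {
    "total_steps": len(result["steps_executed"]),
    "parallel_efficiency": calculate_parallel_speedup(result),
    "bottleneck_node": find_slowest_node(result)
}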
```
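As one possible way to implement a helper like `count_parallel_paths`, the sketch below finds fan-out points in a plain list of `(source, target)` edges. The function name `fan_out_nodes` and the tuple-based edge representation are assumptions made for illustration; they are not LangGraph APIs.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

def fan_out_nodes(edges: List[Edge]) -> Dict[str, List[str]]:
    """Return nodes with more than one outgoing edge, mapped to their targets."""
    targets: Dict[str, List[str]] = defaultdict(list)
    for source, target in edges:
        targets[source].append(target)
    return {node: outs for node, outs in targets.items() if len(outs) > 1}

edges = [
    ("extract", "validate_vat"),
    ("extract", "check_risk"),
    ("extract", "verify_terms"),
    ("validate_vat", "decide"),
    ("check_risk", "decide"),
    ("verify_terms", "decide"),
]
branches = fan_out_nodes(edges)
print(branches)
```

Each entry in the result is a node whose successors can run in the same step.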

Let's visualize our workflow:

In [None]:
# Visualize the graph (draw_png() needs the pygraphviz package)
try:
    from IPython.display import Image, display
    display(Image(app.get_graph().draw_png()))
except Exception:
    print("Graphviz not available. Here's the text representation:")
    print("\n📊 Workflow Structure:")
    print("┌─────────┐")
    print("│ Extract │")
    print("└────┬────┘")
    print("     │ (if success)")
    print("┌────▼────┐")
    print("│Validate │")
    print("│  VAT    │")
    print("└────┬────┘")
    print("┌────▼────┐")
    print("│ Check   │")
    print("│  Risk   │")
    print("└────┬────┘")
    print("┌────▼────┐")
    print("│ Verify  │")
    print("│ Terms   │")
    print("└────┬────┘")
    print("┌────▼────┐")
    print("│ Decide  │")
    print("└────┬────┘")
    print("     │")
    print("    END")

## Step 5: Execution and State Flow

### Understanding Workflow Execution

When you run a LangGraph workflow, here's what happens:

**Execution Model:**
```python
# 1. Initialize state
initial_state = {"invoice_id": "INV-001", "errors": []}

# 2. Execute graph
for node in execution_order:
    state = node(state)  # State accumulates data
    
# 3. State evolution
# Step 1: {"invoice_id": "INV-001", "errors": []}
# Step 2: {"invoice_id": "INV-001", "errors": [], "vendor": "TechCorp"}  
# Step 3: {"invoice_id": "INV-001", "errors": [], "vendor": "TechCorp", "amount": 15000}
```

**State Merging:**
```python
# LangGraph automatically merges state updates
def node_a(state):
    return {"field_a": "value_a"}  # Only returns what changed

def node_b(state):  
    return {"field_b": "value_b"}  # Adds to existing state

# Final state: {"field_a": "value_a", "field_b": "value_b", ...}
```

### Execution Patterns

**Sequential Execution:**
```python
# Traditional: One after another
result_1 = step_1(input)
result_2 = step_2(result_1)  
result_3 = step_3(result_2)
# Total time: T1 + T2 + T3
```

**Parallel Execution:**
```python
# LangGraph: Independent steps run together
result_1, result_2, result_3 = await asyncio.gather(
    step_1(input),
    step_2(input),
    step_3(input)
)
# Total time: max(T1, T2, T3)
```
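The two timing formulas can be checked with a small, self-contained experiment. This is a sketch under stated assumptions: `step` stands in for an I/O-bound call by sleeping, and the durations in `DURATIONS` are made up for the demonstration.

```python
import asyncio
import time

# Made-up durations (seconds) standing in for three independent I/O-bound checks
DURATIONS = {"vat": 0.2, "risk": 0.3, "terms": 0.1}

async def step(name: str, seconds: float) -> str:
    """Stand-in for an I/O-bound validation call."""
    await asyncio.sleep(seconds)
    return name

async def run_sequential() -> float:
    start = time.perf_counter()
    for name, seconds in DURATIONS.items():
        await step(name, seconds)  # each step waits for the previous one
    return time.perf_counter() - start

async def run_parallel() -> float:
    start = time.perf_counter()
    await asyncio.gather(*(step(n, s) for n, s in DURATIONS.items()))
    return time.perf_counter() - start

sequential_time = asyncio.run(run_sequential())
parallel_time = asyncio.run(run_parallel())
print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Inside a notebook, where an event loop is already running, call `await run_sequential()` and `await run_parallel()` directly instead of using `asyncio.run`. The gain comes from overlapping waits, so CPU-bound steps would not speed up this way.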

### Error Propagation

**Error Handling Strategy:**
```python
def error_aware_node(state):
    if state["errors"]:
        # Skip processing if previous errors
        return state
    
    try:
        # Normal processing
        result = process_data(state["data"])
        state["result"] = result
    except Exception as e:
        # Add error but continue workflow
        state["errors"].append(str(e))
        state["result"] = None
    
    return state
```

**Error Recovery Patterns:**
```python
def resilient_workflow(state):
    # Attempt primary processing
    if not state["errors"]:
        try:
            return primary_processor(state)
        except ServiceUnavailable:
            state["warnings"].append("Primary service down, using fallback")
            return fallback_processor(state)
    
    # Error path
    return error_handler(state)
```

Let's see this in action:

In [None]:
# Process a valid invoice
print("=" * 60)
print("PROCESSING INVOICE: INV-2024-001")
print("=" * 60)

# Initialize state
initial_state = {
    "invoice_id": "INV-2024-001",
    "errors": [],
    "warnings": [],
    "steps_executed": []
}

# Track execution time
start_time = time.time()

# Run the workflow
result = app.invoke(initial_state)

# Calculate processing time
result["processing_time"] = time.time() - start_time

# Display results
print("\n📋 WORKFLOW RESULTS:")
print("-" * 40)
print(f"Invoice ID: {result['invoice_id']}")
print(f"Vendor: {result.get('vendor_name', 'N/A')}")
print(f"Amount: ${result.get('amount', 0):,.2f} {result.get('currency', '')}")
print(f"Payment Terms: {result.get('payment_terms', 'N/A')}")
print(f"Due Date: {result.get('due_date', 'N/A')}")

print("\n✅ VALIDATION RESULTS:")
print(f"VAT Valid: {result.get('vat_valid', False)}")
print(f"Vendor Risk Score: {result.get('vendor_risk_score', 'N/A')}")
print(f"Payment Terms Approved: {result.get('payment_terms_approved', False)}")

print("\n🎯 FINAL DECISION:")
print(f"Status: {result.get('approval_status', 'Unknown').upper()}")

if result["warnings"]:
    print("\n⚠️ WARNINGS:")
    for warning in result["warnings"]:
        print(f"  - {warning}")

print("\n⏱️ PERFORMANCE:")
print(f"Processing Time: {result['processing_time']:.2f} seconds")
print(f"Steps Executed: {', '.join(result['steps_executed'])}")

## Step 6: Error Handling and Edge Cases

### Production-Grade Error Handling

Real workflows need to handle failures gracefully. Here's how:

**Graceful Degradation:**
```python
def robust_validation_node(state):
    # Try primary validation
    try:
        primary_result = validate_with_external_api(state["data"])
        state["validation_result"] = primary_result
        state["validation_confidence"] = "high"
    except APITimeoutError:
        # Fallback to local validation
        fallback_result = validate_locally(state["data"])
        state["validation_result"] = fallback_result
        state["validation_confidence"] = "medium"
        state["warnings"].append("External API timeout, used local validation")
    except APIRateLimitError:
        # Queue for later processing
        state["validation_result"] = None
        state["manual_review_required"] = True
        state["warnings"].append("API rate limit, flagged for manual review")
    
    return state
```

**Circuit Breaker Pattern:**
```python
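import time

class ServiceUnavailableError(Exception):
    """Raised when the circuit is open and calls are being rejected."""

class ServiceCircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=300):
        self.failure_count = 0
        self.failure_threshold = failure_threshold
        self.last_failure_time = None
        self.timeout = timeout  # Seconds to stay open before allowing a retry
    
    def is_circuit_open(self):
        if self.failure_count < self.failure_threshold:
            return False
        # Open until `timeout` seconds have passed since the last failure
        return time.monotonic() - self.last_failure_time < self.timeout
    
    def on_success(self):
        self.failure_count = 0
    
    def on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.monotonic()
    
    def call_service(self, func, *args, **kwargs):
        if self.is_circuit_open():
            raise ServiceUnavailableError("Circuit breaker open")
        
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.on_failure()
            raise
        self.on_success()
        return result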
```

### Error Classification

**Error Severity Levels:**
```python
def classify_error(error, state):
    if isinstance(error, CriticalDataError):
        # Stop processing immediately
        state["approval_status"] = "rejected"
        state["errors"].append(f"Critical: {str(error)}")
        return "stop_processing"
    
    elif isinstance(error, ValidationWarning):
        # Continue but flag for review
        state["warnings"].append(f"Warning: {str(error)}")
        state["manual_review_required"] = True
        return "continue_with_flag"
    
    elif isinstance(error, ServiceTemporarilyUnavailable):
        # Retry or use fallback
        state["warnings"].append(f"Service issue: {str(error)}")
        return "use_fallback"
    
    else:
        # Unknown error - be conservative
        state["errors"].append(f"Unknown error: {str(error)}")
        return "manual_review"
```

### Retry and Backoff Strategies

**Exponential Backoff:**
```python
import asyncio
from functools import wraps

class RetryableError(Exception):
    """Raised by a service call that is worth retrying."""

def retry_with_backoff(max_retries=3, base_delay=1, backoff_factor=2):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return await func(*args, **kwargs)
                except RetryableError as e:
                    if attempt == max_retries - 1:
                        raise
                    
                    delay = base_delay * (backoff_factor ** attempt)
                    await asyncio.sleep(delay)
            
        return wrapper
    return decorator

@retry_with_backoff(max_retries=3)
async def call_external_service(data):
    # Service call that might fail
    pass
```
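To see which waits the decorator produces, the delay formula can be evaluated on its own. The helper `backoff_delays` is a hypothetical name introduced here; it mirrors the `base_delay * (backoff_factor ** attempt)` expression and the fact that no sleep follows the final attempt.

```python
from typing import List

def backoff_delays(max_retries: int = 3, base_delay: float = 1,
                   backoff_factor: float = 2) -> List[float]:
    """Delays slept between attempts; the last attempt re-raises instead of sleeping."""
    return [base_delay * (backoff_factor ** attempt)
            for attempt in range(max_retries - 1)]

default_schedule = backoff_delays()
longer_schedule = backoff_delays(max_retries=5)
print(default_schedule)  # [1, 2]
print(longer_schedule)   # [1, 2, 4, 8]
```

With the defaults, a call that keeps failing is attempted three times and waits three seconds in total before the error is raised.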

Let's test our error handling:

In [None]:
# Test with non-existent invoice
print("=" * 60)
print("TESTING ERROR HANDLING: Non-existent Invoice")
print("=" * 60)

error_state = {
    "invoice_id": "INV-INVALID-999",
    "errors": [],
    "warnings": [],
    "steps_executed": []
}

error_result = app.invoke(error_state)

print("\n🔍 ERROR HANDLING RESULTS:")
print(f"Approval Status: {error_result.get('approval_status', 'Unknown')}")
print(f"Errors: {error_result['errors']}")
print(f"Steps Executed: {error_result['steps_executed']}")
print("\nNote: Workflow correctly skipped validation when extraction failed!")

# Test with high-risk vendor
print("\n" + "=" * 60)
print("TESTING EDGE CASE: High-Risk Vendor")
print("=" * 60)

# Replace the extract node to simulate a high-risk vendor
def high_risk_extract(state):
    state["vendor_name"] = "Unknown"  # Matches the 0.9 entry in vendor_scores
    state["amount"] = 50000
    state["invoice_date"] = "2024-01-15"  # Read by verify_payment_terms
    state["payment_terms"] = "Net 90"
    state["vat_number"] = "INVALID123"
    state["steps_executed"].append("extract_data")
    return state

# Create modified workflow for testing
test_workflow = StateGraph(InvoiceState)
test_workflow.add_node("extract", high_risk_extract)
test_workflow.add_node("validate_vat", validate_vat)
test_workflow.add_node("check_risk", check_vendor_risk)
test_workflow.add_node("verify_terms", verify_payment_terms)
test_workflow.add_node("decide", make_approval_decision)

test_workflow.set_entry_point("extract")
test_workflow.add_edge("extract", "validate_vat")
test_workflow.add_edge("validate_vat", "check_risk")
test_workflow.add_edge("check_risk", "verify_terms")
test_workflow.add_edge("verify_terms", "decide")
test_workflow.add_edge("decide", END)

test_app = test_workflow.compile()

risk_state = {
    "invoice_id": "INV-RISKY-001",
    "errors": [],
    "warnings": [],
    "steps_executed": []
}

risk_result = test_app.invoke(risk_state)

print("\n🚨 HIGH-RISK RESULTS:")
print(f"Vendor: {risk_result.get('vendor_name')}")
print(f"Amount: ${risk_result.get('amount', 0):,.2f}")
print(f"Risk Score: {risk_result.get('vendor_risk_score')}")
print(f"Approval Status: {risk_result.get('approval_status').upper()}")
print(f"Warnings: {risk_result['warnings']}")

## Step 7: Parallel Execution - Performance Optimization

### Understanding Parallel Processing

Traditional sequential processing is a bottleneck:

**Sequential Bottleneck:**
```python
# Each step waits for the previous one
def sequential_validation(invoice_data):
    vat_result = validate_vat(invoice_data)      # 2 seconds
    risk_result = check_vendor_risk(invoice_data) # 3 seconds  
    terms_result = verify_terms(invoice_data)     # 1 second
    return combine_results(vat_result, risk_result, terms_result)
    # Total time: 6 seconds
```

**Parallel Optimization:**
```python
# Independent validations run simultaneously
async def parallel_validation(invoice_data):
    vat_task = asyncio.create_task(validate_vat(invoice_data))
    risk_task = asyncio.create_task(check_vendor_risk(invoice_data))
    terms_task = asyncio.create_task(verify_terms(invoice_data))
    
    vat_result, risk_result, terms_result = await asyncio.gather(
        vat_task, risk_task, terms_task
    )
    return combine_results(vat_result, risk_result, terms_result)
    # Total time: 3 seconds (limited by slowest task)
```

### LangGraph Parallel Patterns

**Fan-out/Fan-in Architecture:**
```python
# One node triggers multiple parallel paths
workflow.add_edge("extract_data", "validate_vat")
workflow.add_edge("extract_data", "check_risk")  
workflow.add_edge("extract_data", "verify_terms")

# All paths converge to decision node
workflow.add_edge("validate_vat", "make_decision")
workflow.add_edge("check_risk", "make_decision")
workflow.add_edge("verify_terms", "make_decision")

# LangGraph runs the three branches in the same step; any state key written
# by more than one branch needs a reducer
```

### Performance Considerations

**When Parallel Processing Helps:**
```python
# ✅ Good: Independent operations
def validate_vat(state):
    # Only needs: vat_number
    # External API call: 2 seconds
    pass

def check_vendor_risk(state):  
    # Only needs: vendor_name
    # Database lookup: 1 second
    pass

# ❌ Bad: Dependent operations  
def step_a(state):
    state["intermediate_result"] = process_data()
    
def step_b(state):
    # Needs intermediate_result from step_a
    return analyze(state["intermediate_result"])
```

**Resource Management:**
```python
# Monitor resource usage in parallel execution
import concurrent.futures
import psutil

def monitor_parallel_execution(state):
    cpu_before = psutil.cpu_percent()
    memory_before = psutil.virtual_memory().percent
    
    # Run parallel tasks
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        futures = [
            executor.submit(validate_vat, state),
            executor.submit(check_vendor_risk, state),
            executor.submit(verify_payment_terms, state)
        ]
        results = [future.result() for future in futures]
    
    cpu_after = psutil.cpu_percent()
    memory_after = psutil.virtual_memory().percent
    
    print(f"CPU increase: {cpu_after - cpu_before}%")
    print(f"Memory increase: {memory_after - memory_before}%")
```

### Parallel Execution Best Practices

**State Isolation:**
```python
# Each parallel node should write to its own part of the state
def parallel_safe_node(state):
    # ✅ Good: Read shared data, write to own namespace
    shared_data = state["invoice_data"]
    my_result = process(shared_data)
    
    # Return only the isolated key instead of the whole state
    return {"my_node_result": my_result}
```

Let's implement high-performance parallel processing:

In [None]:
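from langgraph.graph import StateGraph, END
from typing import Annotated
import operator

# Parallel branches all append to these lists in the same step, so the keys
# need a reducer that concatenates the updates from each branch
class ParallelInvoiceState(InvoiceState):
    errors: Annotated[List[str], operator.add]
    warnings: Annotated[List[str], operator.add]
    steps_executed: Annotated[List[str], operator.add]

LIST_KEYS = ("errors", "warnings", "steps_executed")

def as_partial_update(node):
    """Wrap a whole-state node so it returns only the keys it changed."""
    def wrapped(state):
        before = dict(state)
        # Copy the lists so the node's appends do not touch the shared state
        working = {k: (list(v) if k in LIST_KEYS else v) for k, v in before.items()}
        after = node(working)
        update = {}
        for key, value in after.items():
            if key in LIST_KEYS:
                update[key] = value[len(before.get(key, [])):]  # New items only
            elif key not in before or before[key] != value:
                update[key] = value
        return update
    return wrapped

# Create parallel validation workflow
parallel_workflow = StateGraph(ParallelInvoiceState)

# Add nodes (wrapped so each returns a partial update)
parallel_workflow.add_node("extract", as_partial_update(extract_invoice_data))
parallel_workflow.add_node("validate_vat", as_partial_update(validate_vat))
parallel_workflow.add_node("check_risk", as_partial_update(check_vendor_risk))
parallel_workflow.add_node("verify_terms", as_partial_update(verify_payment_terms))
parallel_workflow.add_node("decide", as_partial_update(make_approval_decision))

# Set entry point
parallel_workflow.set_entry_point("extract")

# Fan-out: all three validation nodes run in the same step after extraction
parallel_workflow.add_edge("extract", "validate_vat")
parallel_workflow.add_edge("extract", "check_risk")
parallel_workflow.add_edge("extract", "verify_terms")

# Fan-in: all validation nodes lead to decision
parallel_workflow.add_edge("validate_vat", "decide")
parallel_workflow.add_edge("check_risk", "decide")
parallel_workflow.add_edge("verify_terms", "decide")

parallel_workflow.add_edge("decide", END)

# Compile parallel workflow
parallel_app = parallel_workflow.compile()

print("=" * 60)
print("PARALLEL EXECUTION DEMO")
print("=" * 60)

# Run parallel workflow
parallel_state = {
    "invoice_id": "INV-2024-001",
    "errors": [],
    "warnings": [],
    "steps_executed": []
}

print("\n🚀 Running validations in PARALLEL...")
start = time.time()
parallel_result = parallel_app.invoke(parallel_state)
parallel_time = time.time() - start

print("\n⚡ PARALLEL EXECUTION COMPLETE!")
print(f"Time: {parallel_time:.2f} seconds")
print(f"Steps executed: {parallel_result['steps_executed']}")
print("\nNote: The three validations ran in the same graph step.")
print("The speedup shows up when nodes wait on I/O such as external API calls.")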

## Step 8: State Persistence and Checkpointing

### Why Persistence Matters

Production workflows need to handle interruptions:

**Common Scenarios:**
```python
# Long-running workflow scenarios
scenarios = [
    "Human approval step takes 2 hours",
    "External service is down for maintenance", 
    "System restart during processing",
    "Workflow needs debugging at specific step",
    "Compliance audit requires state history"
]
```

**Without Persistence:**
```python
# Bad: Lost work on interruption
def fragile_workflow():
    step1_result = expensive_computation()    # 10 minutes
    step2_result = external_api_call()        # 5 minutes
    # System crash here = lose 15 minutes of work
    step3_result = final_processing()
```

**With LangGraph Checkpointing:**
```python
# Good: Resumable workflows
from langgraph.checkpoint.memory import MemorySaver

memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

# State is saved automatically after each step, keyed by thread_id
config = {"configurable": {"thread_id": "invoice-001"}}
result = app.invoke(initial_state, config)

# If the run was interrupted, passing None resumes from the last checkpoint
resumed_result = app.invoke(None, config)
```

### Checkpoint Strategies

**Step-by-Step Checkpointing:**
```python
# With a checkpointer attached, state is saved after every step completes
workflow.add_node("extract", extract_data)        # Checkpoint 1
workflow.add_node("validate", validate_data)      # Checkpoint 2  
workflow.add_node("approve", approval_decision)   # Checkpoint 3

# If failure at step 3, resume from checkpoint 2
```
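
The resume-from-last-checkpoint idea can be sketched without LangGraph. The snippet below is a simplified illustration under stated assumptions: `checkpoints`, `run_with_checkpoints`, and the three step functions are hypothetical, and the in-memory dict stands in for a real checkpointer.

```python
import copy

checkpoints = {}          # thread_id -> (index of next step, saved state)
service = {"up": False}   # simulates an external dependency that is down at first
calls = []                # records which steps actually executed

def run_with_checkpoints(steps, initial_state, thread_id):
    """Run steps in order, saving state after each one; resume if a checkpoint exists."""
    next_index, state = checkpoints.get(thread_id, (0, copy.deepcopy(initial_state)))
    for index in range(next_index, len(steps)):
        state = steps[index](state)
        checkpoints[thread_id] = (index + 1, copy.deepcopy(state))
    return state

def extract(state):
    calls.append("extract")
    return {**state, "amount": 1200}

def validate(state):
    calls.append("validate")
    return {**state, "valid": True}

def approve(state):
    calls.append("approve")
    if not service["up"]:
        raise RuntimeError("approval service unavailable")
    return {**state, "approval_status": "approved"}

steps = [extract, validate, approve]
initial = {"invoice_id": "INV-2024-001"}

try:
    run_with_checkpoints(steps, initial, "invoice-001")
except RuntimeError:
    pass  # steps 1 and 2 are already checkpointed

service["up"] = True
result = run_with_checkpoints(steps, initial, "invoice-001")
print(calls)  # ['extract', 'validate', 'approve', 'approve']
```

On the second call only `approve` runs again: the work done by `extract` and `validate` is restored from the saved state rather than recomputed.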

**Conditional Checkpointing:**
```python
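# LangGraph saves a checkpoint after every step once a checkpointer is attached;
# the compiled app has no manual save call. To persist selectively, decide per
# run whether to use the checkpointed app.
def should_checkpoint(state):
    return bool(
        state.get("requires_human_approval") or
        state.get("expensive_operation_next") or
        state.get("external_dependency")
    )

persistent_app = workflow.compile(checkpointer=memory)
ephemeral_app = workflow.compile()

if should_checkpoint(state):
    result = persistent_app.invoke(state, {"configurable": {"thread_id": thread_id}})
else:
    result = ephemeral_app.invoke(state)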
```

### State History and Auditing

**Audit Trail Generation:**
```python
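def generate_audit_trail(thread_id):
    """Create a compliance audit trail, oldest checkpoint first"""
    config = {"configurable": {"thread_id": thread_id}}
    # get_state_history yields snapshots newest-first, so reverse them
    history = reversed(list(app.get_state_history(config)))
    
    audit_trail = []
    previous_values = {}
    for i, state_snapshot in enumerate(history):
        metadata = state_snapshot.metadata or {}
        audit_trail.append({
            "step": i + 1,
            "timestamp": state_snapshot.created_at,
            # "source" describes how the checkpoint was created, not a node name
            "source": metadata.get("source", "unknown"),
            "next_nodes": state_snapshot.next,
            # calculate_diff is a helper you supply (not part of LangGraph)
            "data_changes": calculate_diff(previous_values, state_snapshot.values),
            # "user" is a custom key; present only if your app records it
            "user": metadata.get("user", "system"),
            "approval_status": state_snapshot.values.get("approval_status")
        })
        previous_values = state_snapshot.values
    
    return audit_trail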
```
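
The audit trail above calls `calculate_diff`, which is not a LangGraph function and is not defined in this notebook. One possible implementation, assuming state values are plain dictionaries with comparable values, is sketched here:

```python
def calculate_diff(previous, current):
    """Return keys that were added, removed, or changed between two state dicts."""
    diff = {}
    for key in previous.keys() | current.keys():
        if key not in previous:
            diff[key] = {"added": current[key]}
        elif key not in current:
            diff[key] = {"removed": previous[key]}
        elif previous[key] != current[key]:
            diff[key] = {"from": previous[key], "to": current[key]}
    return diff

before = {"invoice_id": "INV-2024-001", "steps_executed": ["extract"]}
after = {
    "invoice_id": "INV-2024-001",
    "steps_executed": ["extract", "validate_vat"],
    "approval_status": "pending",
}
changes = calculate_diff(before, after)
print(changes)
```

Unchanged keys such as `invoice_id` are omitted, which keeps each audit entry focused on what a step actually modified.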

### Production Persistence Patterns

**Database Checkpointing:**
```python
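import sqlite3
# In recent releases this lives in the separate langgraph-checkpoint-sqlite package
from langgraph.checkpoint.sqlite import SqliteSaver

# SQLite suits local development; for production, use a server database
# such as PostgreSQL
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
sqlite_saver = SqliteSaver(conn)
app = workflow.compile(checkpointer=sqlite_saver)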

# Automatic persistence to database
config = {"configurable": {"thread_id": "invoice-001"}}
app.invoke(state, config)  # Saved to database
```

**Human-in-the-Loop Workflows:**
```python
# Pause before human approval steps
app = workflow.compile(
    checkpointer=memory,
    interrupt_before=["human_approval"]  # Pause here
)

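# Run until the graph pauses before "human_approval"
result = app.invoke(state, config)
print("Waiting for human approval...")

# Human provides input (get_human_approval is a placeholder for your own UI or API)
human_decision = get_human_approval()

# Record the decision in the saved state, then resume by passing None.
# Passing a state dict here would be treated as new input and restart the
# graph from its entry point instead of continuing.
app.update_state(config, {"human_decision": human_decision})
final_result = app.invoke(None, config)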
```

Let's implement persistent workflows:

In [None]:
from langgraph.checkpoint.memory import MemorySaver

# Create workflow with checkpointing
memory = MemorySaver()

# Compile with checkpointer
checkpointed_app = workflow.compile(checkpointer=memory)

print("=" * 60)
print("CHECKPOINTING DEMO")
print("=" * 60)

# Run with thread ID for tracking
config = {"configurable": {"thread_id": "invoice-001"}}

checkpoint_state = {
    "invoice_id": "INV-2024-001",
    "errors": [],
    "warnings": [],
    "steps_executed": []
}

# Run workflow
print("\n📁 Running workflow with checkpointing...")
result_with_checkpoint = checkpointed_app.invoke(checkpoint_state, config)

# Get state history
print("\n📜 State History:")
for state in checkpointed_app.get_state_history(config):
    if state.values:
        print(f"  Steps: {state.values.get('steps_executed', [])}")
        print(f"  Status: {state.values.get('approval_status', 'processing')}")
    break  # Just show latest for demo

print("\n✅ Workflow state saved and can be resumed!")
print("This is useful for:")
print("- Long-running workflows")
print("- Human-in-the-loop approval steps")
print("- Debugging and auditing")

## Key Learnings

### The LangGraph Revolution

**From Code to Graphs:**
```python
# Traditional: Imperative programming
def process_invoice(invoice_id):
    data = extract(invoice_id)
    if data.valid:
        parallel_validate(data)
        if all_checks_pass():
            return approve()
    return reject()

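# LangGraph: Declarative workflow
workflow = StateGraph(InvoiceState)
workflow.add_node("extract", extract_data)
workflow.add_node("decide", make_approval_decision)
workflow.set_entry_point("extract")
# route_validation returns the name(s) of the validator node(s) to run next
workflow.add_conditional_edges("extract", route_validation)
# Register the parallel validators and join them at "decide"
for name, fn in [("validate_vat", validate_vat), ("check_risk", check_vendor_risk),
                 ("verify_terms", verify_payment_terms)]:
    workflow.add_node(name, fn)
    workflow.add_edge(name, "decide")
workflow.add_edge("decide", END)
app = workflow.compile()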
```

### Core LangGraph Concepts

**State Management:**
- TypedDict defines data flow structure
- Automatic state merging across nodes
- Nodes return partial updates that reducers merge, avoiding conflicting writes from parallel branches
- Built-in error and metadata tracking

**Graph Architecture:**
- Visual workflow representation
- Conditional routing based on state
- Parallel execution for independent steps
- Fan-out/fan-in patterns for complex flows

**Production Features:**
- Checkpointing for workflow persistence
- State history for audit trails
- Human-in-the-loop capabilities
- Error recovery and graceful degradation

### Performance Benefits

**Parallel Processing:**
```python
# Sequential: 6 seconds total
validate_vat()      # 2 seconds
check_risk()        # 3 seconds  
verify_terms()      # 1 second

# Parallel: 3 seconds total (max of all)
await asyncio.gather(
    validate_vat(),
    check_risk(),
    verify_terms()
)
# 50% lower latency (a 2x speedup), bounded by the slowest check
```
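
The comparison above can be measured with a small, self-contained script. This is a sketch under stated assumptions: `mock_check` is a hypothetical stand-in for an I/O-bound validation call, and the durations are scaled down tenfold from the 2 s / 3 s / 1 s figures so the demo finishes quickly. In a notebook, where an event loop is already running, replace `asyncio.run(...)` with a direct `await`.

```python
import asyncio
import time

async def mock_check(name, seconds):
    """Stand-in for an I/O-bound validation call."""
    await asyncio.sleep(seconds)
    return name

# (name, duration in seconds), scaled down 10x from the figures above
CHECKS = [("validate_vat", 0.2), ("check_risk", 0.3), ("verify_terms", 0.1)]

async def run_sequential():
    # Each check waits for the previous one to finish.
    return [await mock_check(name, secs) for name, secs in CHECKS]

async def run_parallel():
    # All checks are scheduled at once; total time is roughly the slowest check.
    return await asyncio.gather(*(mock_check(name, secs) for name, secs in CHECKS))

def timed(coro_fn):
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

seq_result, seq_time = timed(run_sequential)
par_result, par_time = timed(run_parallel)
print(f"Sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

The gain comes from overlapping waiting time, so it applies to I/O-bound checks such as API calls; CPU-bound work in a single Python process would not speed up the same way.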

**Resource Optimization:**
- CPU cores utilized efficiently
- I/O operations don't block each other
- Memory usage optimized through state management
- Bottleneck identification through graph analysis

### When to Use LangGraph

**Perfect For:**
- Multi-step workflows with dependencies
- Processes requiring human approval
- Long-running operations needing persistence
- Complex business logic with conditional paths
- Systems requiring audit trails

**Consider Alternatives For:**
- Simple linear transformations
- Real-time sub-second operations  
- Stateless functions
- Workflows that never need interruption

### Architecture Patterns

**Fan-out/Fan-in:**
```
Extract → [Validate ∥ Check ∥ Verify] → Decide
```

**Conditional Routing:**
```
Extract → Condition → {High-Value, Standard, Error} → Process
```
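
A routing function for this pattern can be as simple as a function that inspects the state and returns a branch label. The sketch below is illustrative: `route_invoice`, the branch names, and the `HIGH_VALUE_THRESHOLD` cutoff are assumptions, not values taken from the workflow built earlier.

```python
HIGH_VALUE_THRESHOLD = 10_000  # hypothetical cutoff for extra scrutiny

def route_invoice(state):
    """Return the label of the next branch based on the current state."""
    if state.get("errors"):
        return "error"
    total = state.get("invoice_data", {}).get("total", 0)
    if total >= HIGH_VALUE_THRESHOLD:
        return "high_value"
    return "standard"

print(route_invoice({"invoice_data": {"total": 25_000}, "errors": []}))        # high_value
print(route_invoice({"invoice_data": {"total": 300}, "errors": []}))           # standard
print(route_invoice({"invoice_data": {}, "errors": ["missing total"]}))        # error
```

A function like this can be passed to `workflow.add_conditional_edges("extract", route_invoice, {...})`, where the dictionary maps each returned label to the name of the node that handles it.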

**Human-in-the-Loop:**
```
Auto-Process → Human-Review → Continue/Reject → Complete
```

### Production Checklist

✅ **State Design**: Type-safe, comprehensive state schema  
✅ **Error Handling**: Graceful degradation and recovery  
✅ **Performance**: Parallel execution where possible  
✅ **Persistence**: Checkpointing for long operations  
✅ **Monitoring**: State history and audit trails  
✅ **Security**: Input validation and authorization  

### Next Steps: Integration and Deployment

In the next sessions, we'll explore:
- **Real OCR Integration**: Replace mock data with actual document processing
- **LLM-Powered Decisions**: Add intelligent reasoning to approval logic
- **API Integration**: Connect to external systems and databases
- **Production Deployment**: Scale workflows for enterprise use
- **Monitoring and Observability**: Track performance and errors in production

The foundation we've built with LangGraph will support all these advanced capabilities!