# Week 1: LangGraph Fundamentals & State Management

**Course:** LangGraph for Complex Workflows  
**Week Focus:** Master state graphs, conditional routing, and multi-step workflows to build production-grade AI systems.

---

## 🎯 Learning Objectives

By the end of this week, you will:
- Understand why graphs are superior to linear chains for complex workflows
- Design and implement StateGraph with typed state schemas
- Build nodes (workflow steps) and edges (transitions)
- Implement conditional routing based on state
- Handle errors gracefully in graph execution
- Visualize and debug graph workflows
- Build a real-world document processing pipeline

## 📊 Real-World Context

**The Challenge:** Your content platform receives 10,000+ user submissions daily:
- 📄 Blog posts, comments, product reviews
- 🎭 Mix of legitimate content and spam/toxic material
- 🌍 Multiple languages requiring classification
- ⚖️ Need to moderate without a human bottleneck

**Linear Chain Limitations:**
```python
# ❌ This doesn't work well:
chain = classify | moderate | summarize | publish
# Problem: What if we need to:
# - Route spam to deletion (skip summarization)
# - Send toxic content to human review
# - Handle multiple languages differently
# - Retry failed steps
```

**The Solution:** A content moderation graph that:
1. **Classifies** content type (article/comment/review/spam)
2. **Detects** language and toxicity
3. **Routes** based on results:
   - Spam → Auto-reject
   - Toxic → Human review queue
   - Clean → Extract key info
4. **Summarizes** approved content
5. **Publishes** or routes for approval

**Business Impact (illustrative targets for this scenario):**
- 🚀 Process 10K submissions/day (up from 500 manual reviews)
- ⏱️ Reduce moderation time from 4 hours to 2 minutes
- 🎯 95% accuracy with 10% human review (high-risk items)
- 💰 Save $240K/year in moderation costs

Platforms that host user-generated content at scale, such as **Reddit, Medium, and Substack**, face this same routing problem: most submissions can be handled automatically, while a small high-risk share needs a human.

In [None]:
from IPython.display import HTML
HTML('''
<style>
.jp-RenderedHTMLCommon h2 {
    color: #2c3e50;
    border-bottom: 2px solid #3498db;
    padding-bottom: 10px;
    margin-top: 30px;
}
.jp-RenderedHTMLCommon h3 {
    color: #34495e;
    margin-top: 20px;
}
.jp-RenderedHTMLCommon code {
    background-color: #f8f9fa;
    padding: 2px 6px;
    border-radius: 3px;
    font-family: 'Courier New', monospace;
}
.jp-RenderedHTMLCommon pre {
    background-color: #f8f9fa;
    border-left: 4px solid #3498db;
    padding: 15px;
    border-radius: 5px;
}
.exercise-box {
    background-color: #fff3cd;
    border-left: 5px solid #ffc107;
    padding: 15px;
    margin: 20px 0;
    border-radius: 5px;
}
.scenario-box {
    background-color: #d1ecf1;
    border-left: 5px solid #17a2b8;
    padding: 15px;
    margin: 20px 0;
    border-radius: 5px;
}
.graph-box {
    background-color: #e7f3e7;
    border-left: 5px solid #28a745;
    padding: 15px;
    margin: 20px 0;
    border-radius: 5px;
    font-family: monospace;
}
</style>
''')

## 🔍 Part 1: Why Graphs? (Linear Chains vs State Graphs)

### The Problem with Linear Chains

LangChain chains work well for simple, fixed sequences of steps, but they become awkward once a workflow needs branching, shared state, or retries:

In [None]:
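# ❌ Linear Chain - Can't handle branching logic

from langchain_core.prompts import ChatPromptTemplate
# FakeListLLM is imported from langchain_core; the older langchain.llms.fake
# path may be unavailable in recent releases.
from langchain_core.language_models.fake import FakeListLLM
from langchain_core.output_parsers import StrOutputParser

print("❌ LINEAR CHAIN LIMITATIONS:")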
print()

# Example: Content moderation
prompt = ChatPromptTemplate.from_template("Moderate this content: {content}")
llm = FakeListLLM(responses=["Content is spam"])
parser = StrOutputParser()

chain = prompt | llm | parser

print("Problem 1: No conditional routing")
print("  If content is spam, we should STOP here.")
print("  But a pipe chain runs every step, in order, every time.")
print()

print("Problem 2: No shared state")
print("  Each step only sees the previous step's output.")
print("  The original input and intermediate results must be passed along by hand.")
print()

print("Problem 3: No cycles/loops")
print("  A chain is a one-way sequence, so it cannot loop back to an earlier step.")
print("  'Try again until success' logic has to live outside the chain.")
print()

print("Problem 4: Hard to debug")
print("  Intermediate values are not collected in one place for inspection.")
print("  The control flow is implicit in the pipe expression.")
print()

print("💡 Solution: Use LangGraph!")

### The LangGraph Way (State Graphs)

LangGraph introduces:
1. **State**: Shared context accessible by all nodes
2. **Nodes**: Functions that read/write state
3. **Conditional Edges**: Dynamic routing based on state
4. **Cycles**: Loops and retries
5. **Visualization**: See your workflow as a graph

In [None]:
# ✅ State Graph - Handles complexity elegantly

from typing import TypedDict, Literal
from langgraph.graph import StateGraph, END

# 1. Define State Schema
class ContentState(TypedDict):
    """Shared state across all nodes."""
    content: str
    content_type: str  # "spam", "toxic", "clean"
    decision: str       # "rejected", "needs_review", "approved"
    summary: str

# 2. Define Node Functions
def classify_content(state: ContentState) -> ContentState:
    """Node 1: Classify content."""
    # In real app: use LLM to classify
    if "buy now" in state["content"].lower():
        state["content_type"] = "spam"
    elif "hate" in state["content"].lower():
        state["content_type"] = "toxic"
    else:
        state["content_type"] = "clean"
    return state

def route_decision(state: ContentState) -> Literal["reject", "review", "approve"]:
    """Conditional router: decide next step based on state."""
    if state["content_type"] == "spam":
        return "reject"
    elif state["content_type"] == "toxic":
        return "review"
    else:
        return "approve"

def reject_content(state: ContentState) -> ContentState:
    """Node 2a: Auto-reject spam."""
    state["decision"] = "rejected"
    state["summary"] = "Spam detected - auto-rejected"
    return state

def queue_for_review(state: ContentState) -> ContentState:
    """Node 2b: Queue for human review."""
    state["decision"] = "needs_review"
    state["summary"] = "Toxic content - queued for human review"
    return state

def approve_content(state: ContentState) -> ContentState:
    """Node 2c: Auto-approve clean content."""
    state["decision"] = "approved"
    state["summary"] = "Clean content - approved for publication"
    return state

# 3. Build the Graph
workflow = StateGraph(ContentState)

# Add nodes
workflow.add_node("classify", classify_content)
workflow.add_node("reject", reject_content)
workflow.add_node("review", queue_for_review)
workflow.add_node("approve", approve_content)

# Set entry point
workflow.set_entry_point("classify")

# Add conditional edges (routing logic)
workflow.add_conditional_edges(
    "classify",
    route_decision,
    {
        "reject": "reject",
        "review": "review",
        "approve": "approve"
    }
)

# All paths end after their respective actions
workflow.add_edge("reject", END)
workflow.add_edge("review", END)
workflow.add_edge("approve", END)

# 4. Compile the graph
app = workflow.compile()

print("✅ Graph created successfully!")
print("\n📊 GRAPH STRUCTURE:")
print("""
    START
      |
      v
  [classify]
      |
   <router>
   /  |  \\
  /   |   \\
spam toxic clean
 |    |     |
 v    v     v
[reject] [review] [approve]
 |    |     |
 v    v     v
    END
""")

### Test the Graph with Different Inputs

In [None]:
# Test 1: Spam content
print("🧪 TEST 1: Spam Content")
print("=" * 60)
result1 = app.invoke({"content": "BUY NOW! Limited time offer!!!"})
print(f"Input: {result1['content']}")
print(f"Type: {result1['content_type']}")
print(f"Decision: {result1['decision']}")
print(f"Summary: {result1['summary']}")
print()

# Test 2: Toxic content
print("🧪 TEST 2: Toxic Content")
print("=" * 60)
result2 = app.invoke({"content": "I hate this product and everyone who uses it!"})
print(f"Input: {result2['content']}")
print(f"Type: {result2['content_type']}")
print(f"Decision: {result2['decision']}")
print(f"Summary: {result2['summary']}")
print()

# Test 3: Clean content
print("🧪 TEST 3: Clean Content")
print("=" * 60)
result3 = app.invoke({"content": "This is a helpful tutorial on Python programming."})
print(f"Input: {result3['content']}")
print(f"Type: {result3['content_type']}")
print(f"Decision: {result3['decision']}")
print(f"Summary: {result3['summary']}")
print()

print("✅ Notice how each input takes a DIFFERENT path through the graph!")

## 📚 Part 2: Core Concepts Deep Dive

### 2.1 State Schemas: The Heart of LangGraph

**State** is a shared dictionary that flows through the graph. Every node can:
- **Read** any field of the state
- **Write** to state (the values a node returns are merged into the existing state)
- **Access** everything earlier nodes have written, not just the previous node's output

**Best Practices:**
1. Use TypedDict for type safety
2. Document each field
3. Keep state flat (avoid deep nesting)
4. Use Optional for fields set later
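
To make "updates are merged" concrete, here is a plain-Python model of the merge step. It is a sketch, not LangGraph's implementation: `apply_update` and the `reducers` mapping are names invented for this illustration. The behavior it models is that a key returned by a node overwrites the old value unless a reducer (such as `operator.add` for lists) has been registered for that key.

```python
import operator
from typing import Any, Callable, Dict, Optional

Reducer = Callable[[Any, Any], Any]


def apply_update(
    state: Dict[str, Any],
    update: Dict[str, Any],
    reducers: Optional[Dict[str, Reducer]] = None,
) -> Dict[str, Any]:
    """Return a new state with `update` merged in (hypothetical helper)."""
    reducers = reducers or {}
    merged = dict(state)  # copy so the previous state stays untouched
    for key, value in update.items():
        if key in reducers and key in merged:
            # A reducer combines the old and new values instead of overwriting
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"document_id": "DOC-001", "summary": None, "errors": ["bad date"]}
update = {"summary": "Invoice for $1,500", "errors": ["missing total"]}

merged_state = apply_update(state, update, reducers={"errors": operator.add})
print(merged_state)
```

Here `summary` is overwritten while `errors` accumulates, which is the distinction to keep in mind when deciding whether a field should hold a single value or a growing list.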

In [None]:
from typing import TypedDict, Optional, List
from datetime import datetime

# ✅ Good State Design
class DocumentProcessingState(TypedDict):
    """State for multi-step document processing workflow."""
    
    # Input (set at start)
    document_text: str
    document_id: str
    
    # Classification results (set by classify node)
    document_type: Optional[str]  # "invoice", "contract", "report"
    language: Optional[str]        # "en", "es", "fr"
    confidence: Optional[float]    # 0.0-1.0
    
    # Extraction results (set by extract node)
    entities: Optional[List[dict]]  # [{"type": "person", "value": "John"}]
    key_dates: Optional[List[str]]  # ["2024-01-15", "2024-02-01"]
    amounts: Optional[List[float]]  # [1500.00, 2300.50]
    
    # Summary (set by summarize node)
    summary: Optional[str]
    
    # Routing decision (set by router)
    next_step: Optional[str]  # "approve", "reject", "review"
    
    # Metadata
    processed_at: Optional[str]
    errors: Optional[List[str]]

print("✅ Well-designed state schema!")
print("\nKey features:")
print("1. Clear input vs output fields")
print("2. Optional fields for values set later")
print("3. Specific types (List[dict], float, etc.)")
print("4. A docstring and per-field comments for clarity")
print("5. Error tracking built-in")

### 2.2 Nodes: The Workflow Steps

**Nodes** are functions that:
- Take state as input
- Perform work (call LLM, API, database, etc.)
- Return the updated state (LangGraph also accepts a dict containing only the changed keys)

**Node Types:**
1. **Processing nodes**: Transform data (classify, extract, summarize)
2. **Decision nodes**: Analyze state and set routing flags
3. **Integration nodes**: Call external APIs/databases
4. **Validation nodes**: Check data quality

In [None]:
from typing import TypedDict
from datetime import datetime

class DocState(TypedDict):
    text: str
    doc_type: str
    entities: list
    summary: str
    error: str
    timestamp: str

# Example 1: Processing Node
def classify_document(state: DocState) -> DocState:
    """Classify document type using keyword matching."""
    text_lower = state["text"].lower()
    
    if "invoice" in text_lower or "payment" in text_lower:
        state["doc_type"] = "invoice"
    elif "agreement" in text_lower or "contract" in text_lower:
        state["doc_type"] = "contract"
    else:
        state["doc_type"] = "report"
    
    state["timestamp"] = datetime.now().isoformat()
    return state

# Example 2: Extraction Node
def extract_entities(state: DocState) -> DocState:
    """Extract key entities from document."""
    # In production: use NER model or LLM
    entities = []
    
    # Simple extraction example
    if "$" in state["text"]:
        entities.append({"type": "amount", "value": "$1,500"})
    
    state["entities"] = entities
    return state

# Example 3: Summarization Node (with LLM)
def summarize_document(state: DocState) -> DocState:
    """Generate concise summary."""
    # In production: use actual LLM
    doc_type = state.get("doc_type", "document")
    state["summary"] = f"This is a {doc_type} containing {len(state['text'])} characters."
    return state

# Example 4: Error Handling Node
def validate_document(state: DocState) -> DocState:
    """Validate document before processing."""
    if not state.get("text"):
        state["error"] = "Empty document"
    elif len(state["text"]) < 10:
        state["error"] = "Document too short"
    else:
        state["error"] = ""  # No error
    return state

print("✅ Node functions created!")
print("\n💡 Node Best Practices:")
print("1. Single responsibility (do ONE thing well)")
print("2. Always return state (even if unchanged)")
print("3. Handle errors gracefully (don't crash)")
print("4. Add logging for debugging")
print("5. Keep nodes pure (no hidden side effects)")
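
Best practice 4 ("add logging for debugging") can be applied without touching the node bodies. The following is a minimal sketch using only the standard library; `logged_node` is a hypothetical helper, not part of LangGraph, and the `validate_document` shown here is a simplified stand-in for the validation node above.

```python
import logging
from functools import wraps
from typing import Any, Callable, Dict

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("nodes")

State = Dict[str, Any]


def logged_node(node_fn: Callable[[State], State]) -> Callable[[State], State]:
    """Log which state keys a node changed (hypothetical helper)."""
    @wraps(node_fn)
    def wrapper(state: State) -> State:
        before = dict(state)  # snapshot, because the node may mutate state in place
        result = node_fn(state)
        changed = sorted(k for k, v in result.items() if before.get(k) != v)
        logger.info("%s changed keys: %s", node_fn.__name__, changed)
        return result
    return wrapper


@logged_node
def validate_document(state: State) -> State:
    """Simplified validation rule: flag texts shorter than 10 characters."""
    state["error"] = "" if len(state.get("text", "")) >= 10 else "Document too short"
    return state


checked = validate_document({"text": "short"})
print(checked)
```

Because the wrapper keeps the node's name via `functools.wraps`, the log lines stay readable when the same decorator is applied to every node in a graph.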

### 2.3 Edges: Connecting the Workflow

**Edge Types:**

1. **Normal Edges**: Always go from A → B
   ```python
   workflow.add_edge("node_a", "node_b")
   ```

2. **Conditional Edges**: Route based on state
   ```python
   workflow.add_conditional_edges(
       "router_node",
       routing_function,
       {"option1": "node_a", "option2": "node_b"}
   )
   ```

3. **Entry Point**: Where execution starts
   ```python
   workflow.set_entry_point("first_node")
   ```

4. **End**: Terminal node (no outgoing edges)
   ```python
   workflow.add_edge("final_node", END)
   ```

In [None]:
from langgraph.graph import StateGraph, END
from typing import TypedDict, Literal

class SimpleState(TypedDict):
    value: int
    path_taken: str

# Example: Conditional routing based on value
def check_value(state: SimpleState) -> Literal["low", "high"]:
    """Router: decide path based on value."""
    return "low" if state["value"] < 50 else "high"

def process_low(state: SimpleState) -> SimpleState:
    state["path_taken"] = "LOW path"
    return state

def process_high(state: SimpleState) -> SimpleState:
    state["path_taken"] = "HIGH path"
    return state

# Build graph with conditional routing
graph = StateGraph(SimpleState)
graph.add_node("process_low", process_low)
graph.add_node("process_high", process_high)

graph.set_conditional_entry_point(
    check_value,
    {"low": "process_low", "high": "process_high"}
)

graph.add_edge("process_low", END)
graph.add_edge("process_high", END)

app = graph.compile()

# Test routing
print("🧪 Test conditional routing:")
print()
result1 = app.invoke({"value": 25})
print(f"Value=25 → {result1['path_taken']}")

result2 = app.invoke({"value": 75})
print(f"Value=75 → {result2['path_taken']}")

print("\n✅ Routing works! Different inputs take different paths.")
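
It can help to see what normal and conditional edges amount to once the library is taken away. The toy runner below is a conceptual model only, written in plain Python; `run_graph`, the `END` sentinel, and the `max_steps` guard are assumptions of this sketch and do not reflect how LangGraph is implemented internally.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
END = "__end__"  # stand-in sentinel for this sketch


def run_graph(
    nodes: Dict[str, Callable[[State], State]],
    edges: Dict[str, str],
    conditional_edges: Dict[str, Callable[[State], str]],
    entry: str,
    state: State,
    max_steps: int = 25,
) -> State:
    """Walk from `entry` until END, picking each next node from the edges."""
    current = entry
    steps = 0
    while current != END:
        if steps >= max_steps:
            raise RuntimeError(f"END not reached within {max_steps} steps")
        state = nodes[current](state)
        if current in conditional_edges:
            current = conditional_edges[current](state)  # router picks the next node
        else:
            current = edges[current]  # fixed edge: always the same successor
        steps += 1
    return state


nodes = {
    "check": lambda s: s,  # pass-through node that only exists to host the router
    "process_low": lambda s: {**s, "path_taken": "LOW path"},
    "process_high": lambda s: {**s, "path_taken": "HIGH path"},
}
edges = {"process_low": END, "process_high": END}
conditional_edges = {
    "check": lambda s: "process_low" if s["value"] < 50 else "process_high",
}

low = run_graph(nodes, edges, conditional_edges, "check", {"value": 25})
high = run_graph(nodes, edges, conditional_edges, "check", {"value": 75})
print(low["path_taken"], "/", high["path_taken"])
```

The only difference between the two edge kinds is where the name of the next node comes from: a lookup table for normal edges, a function of the current state for conditional ones.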

## 🛠️ Part 3: Building a Real Document Processing Pipeline

<div class="scenario-box">
<strong>📌 Scenario:</strong> Build an intelligent document processor for a financial services company:
<ol>
<li><strong>Classify</strong> document type (invoice, contract, report, form)</li>
<li><strong>Extract</strong> key information (dates, amounts, parties)</li>
<li><strong>Validate</strong> extracted data</li>
<li><strong>Summarize</strong> document content</li>
<li><strong>Route</strong> for appropriate action:
  <ul>
    <li>Invoice → Accounting system</li>
    <li>Contract → Legal review</li>
    <li>Report → Management dashboard</li>
    <li>Unknown → Human review</li>
  </ul>
</li>
</ol>
</div>

### Step 1: Define Comprehensive State

In [None]:
from typing import TypedDict, Optional, List, Dict
from datetime import datetime

class DocumentState(TypedDict):
    """State for document processing workflow."""
    
    # Input
    document_id: str
    document_text: str
    source: str  # "email", "upload", "scan"
    
    # Classification
    document_type: Optional[str]  # "invoice", "contract", "report", "form", "unknown"
    classification_confidence: Optional[float]
    language: Optional[str]
    
    # Extraction
    entities: Optional[List[Dict[str, str]]]  # [{"type": "amount", "value": "$1500"}]
    dates: Optional[List[str]]
    amounts: Optional[List[float]]
    parties: Optional[List[str]]  # People/companies mentioned
    
    # Validation
    is_valid: Optional[bool]
    validation_errors: Optional[List[str]]
    
    # Summary
    summary: Optional[str]
    key_points: Optional[List[str]]
    
    # Routing
    routing_decision: Optional[str]  # "accounting", "legal", "management", "review"
    priority: Optional[str]  # "low", "medium", "high", "urgent"
    
    # Metadata
    processing_started: Optional[str]
    processing_completed: Optional[str]
    errors: Optional[List[str]]

print("✅ DocumentState schema defined")
print(f"\nTotal fields: {len(DocumentState.__annotations__)}")
print("Input fields: 3")
print("Processing fields: 13")
print("Metadata fields: 3")
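
`TypedDict` is not validated at runtime, so a plain input dict simply lacks every key that nobody has set yet, and `state["dates"]` on such a dict raises `KeyError`. One way to avoid that is to build the initial state from the schema itself. This is a sketch under stated assumptions: `make_initial_state` is a hypothetical helper and `MiniDocumentState` is a cut-down stand-in for the schema above.

```python
from typing import Any, Dict, List, Optional, TypedDict


class MiniDocumentState(TypedDict):
    """Cut-down stand-in for DocumentState (illustration only)."""
    document_id: str
    document_text: str
    document_type: Optional[str]
    errors: Optional[List[str]]


def make_initial_state(schema: type, **inputs: Any) -> Dict[str, Any]:
    """Build a state dict in which every schema key is present (hypothetical helper)."""
    unknown = set(inputs) - set(schema.__annotations__)
    if unknown:
        raise KeyError(f"not in schema: {sorted(unknown)}")
    # Start with None everywhere, then fill in the values the caller supplied
    state: Dict[str, Any] = {key: None for key in schema.__annotations__}
    state.update(inputs)
    return state


initial = make_initial_state(
    MiniDocumentState,
    document_id="DOC-001",
    document_text="INVOICE #1",
)
print(initial)
```

Rejecting unknown keys up front also catches the "typo in a state key" class of bugs before the graph runs.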

### Step 2: Implement Processing Nodes

In [None]:
import re
from datetime import datetime
# The nodes below use only the standard library. In production, the keyword
# rules would be replaced by LLM calls.

# Node 1: Classify Document
def classify_document(state: DocumentState) -> DocumentState:
    """Classify document type using keyword analysis."""
    text_lower = state["document_text"].lower()
    
    # Classification logic (in production: use LLM)
    if any(word in text_lower for word in ["invoice", "payment", "bill", "amount due"]):
        state["document_type"] = "invoice"
        state["classification_confidence"] = 0.92
    elif any(word in text_lower for word in ["agreement", "contract", "hereby agree"]):
        state["document_type"] = "contract"
        state["classification_confidence"] = 0.88
    elif any(word in text_lower for word in ["report", "analysis", "findings"]):
        state["document_type"] = "report"
        state["classification_confidence"] = 0.85
    else:
        state["document_type"] = "unknown"
        state["classification_confidence"] = 0.40
    
    # Detect language (simplified)
    state["language"] = "en"  # Default to English
    
    state["processing_started"] = datetime.now().isoformat()
    return state

# Node 2: Extract Information
def extract_information(state: DocumentState) -> DocumentState:
    """Extract key entities, dates, amounts from document."""
    text = state["document_text"]
    
    # Extract dates (simple regex)
    date_pattern = r'\d{1,2}[-/]\d{1,2}[-/]\d{2,4}|\d{4}[-/]\d{1,2}[-/]\d{1,2}'
    dates = re.findall(date_pattern, text)
    state["dates"] = dates if dates else []
    
    # Extract amounts (simple regex; requires a leading digit so a stray "$," is ignored)
    amount_pattern = r'\$\s*(\d[\d,]*(?:\.\d+)?)'
    amounts_str = re.findall(amount_pattern, text)
    state["amounts"] = [float(amt.replace(',', '')) for amt in amounts_str]
    
    # Extract entities
    entities = []
    if state["amounts"]:
        entities.append({"type": "monetary_amount", "value": f"${state['amounts'][0]}"})
    if state["dates"]:
        entities.append({"type": "date", "value": state['dates'][0]})
    state["entities"] = entities
    
    # Extract parties (simplified - just pairs of capitalized words)
    parties = re.findall(r'\b[A-Z][a-z]+ [A-Z][a-z]+\b', text)
    state["parties"] = list(dict.fromkeys(parties))[:5]  # First 5 unique, in order of appearance
    
    return state

# Node 3: Validate Data
def validate_extraction(state: DocumentState) -> DocumentState:
    """Validate extracted information."""
    errors = []
    
    # Validation rules
    if state["document_type"] == "invoice":
        if not state.get("amounts"):
            errors.append("Invoice missing amount")
        if not state.get("dates"):
            errors.append("Invoice missing date")
    
    if state["document_type"] == "contract":
        if not state.get("parties") or len(state.get("parties", [])) < 2:
            errors.append("Contract missing parties")
    
    if state["classification_confidence"] < 0.7:
        errors.append("Low classification confidence")
    
    state["is_valid"] = len(errors) == 0
    state["validation_errors"] = errors
    
    return state

# Node 4: Summarize Document
def summarize_document(state: DocumentState) -> DocumentState:
    """Generate summary and key points."""
    doc_type = state["document_type"]
    
    # Generate summary based on type
    if doc_type == "invoice":
        amount = state["amounts"][0] if state.get("amounts") else "unknown"
        date = state["dates"][0] if state.get("dates") else "unknown"
        state["summary"] = f"Invoice for ${amount} dated {date}"
        state["key_points"] = [
            f"Amount due: ${amount}",
            f"Date: {date}",
            f"Entities extracted: {len(state.get('entities', []))}"
        ]
    elif doc_type == "contract":
        parties = state.get("parties", [])
        state["summary"] = f"Contract agreement between {len(parties)} parties"
        state["key_points"] = [
            f"Parties: {', '.join(parties[:3])}",
            f"Dates mentioned: {len(state.get('dates', []))}"
        ]
    else:
        state["summary"] = f"{doc_type.title()} document with {len(state['document_text'])} characters"
        state["key_points"] = [f"Type: {doc_type}", f"Language: {state.get('language', 'unknown')}"]
    
    return state

# Node 5: Route Document
def route_document(state: DocumentState) -> DocumentState:
    """Determine routing and priority."""
    doc_type = state["document_type"]
    is_valid = state.get("is_valid", False)
    
    # Routing logic
    if not is_valid:
        state["routing_decision"] = "review"
        state["priority"] = "high"
    elif doc_type == "invoice":
        state["routing_decision"] = "accounting"
        # Check if urgent (amount > $10,000)
        amounts = state.get("amounts", [])
        state["priority"] = "urgent" if (amounts and amounts[0] > 10000) else "medium"
    elif doc_type == "contract":
        state["routing_decision"] = "legal"
        state["priority"] = "high"
    elif doc_type == "report":
        state["routing_decision"] = "management"
        state["priority"] = "low"
    else:
        state["routing_decision"] = "review"
        state["priority"] = "medium"
    
    state["processing_completed"] = datetime.now().isoformat()
    return state

print("✅ All node functions implemented!")
print("\nNodes created:")
print("  1. classify_document")
print("  2. extract_information")
print("  3. validate_extraction")
print("  4. summarize_document")
print("  5. route_document")

### Step 3: Build the Complete Graph

In [None]:
from langgraph.graph import StateGraph, END

# Create the graph
workflow = StateGraph(DocumentState)

# Add all nodes
workflow.add_node("classify", classify_document)
workflow.add_node("extract", extract_information)
workflow.add_node("validate", validate_extraction)
workflow.add_node("summarize", summarize_document)
workflow.add_node("route", route_document)

# Define the workflow
workflow.set_entry_point("classify")
workflow.add_edge("classify", "extract")
workflow.add_edge("extract", "validate")
workflow.add_edge("validate", "summarize")
workflow.add_edge("summarize", "route")
workflow.add_edge("route", END)

# Compile the graph
doc_processor = workflow.compile()

print("✅ Document processing graph compiled!")
print("\n📊 WORKFLOW:")
print("""
    START
      |
      v
  [classify] ← Determine document type
      |
      v
  [extract] ← Extract dates, amounts, entities
      |
      v
  [validate] ← Check data quality
      |
      v
  [summarize] ← Generate summary
      |
      v
  [route] ← Determine destination & priority
      |
      v
    END
""")

### Step 4: Test with Real-World Documents

In [None]:
# Test Document 1: Invoice
invoice_doc = {
    "document_id": "DOC-001",
    "source": "email",
    "document_text": """
    INVOICE #INV-2024-001
    
    Date: 01/15/2024
    Due Date: 02/15/2024
    
    Bill To: Acme Corporation
    From: Tech Solutions Inc
    
    Services Rendered:
    - Software Development: $15,000.00
    - Cloud Hosting (Jan): $2,500.00
    
    Total Amount Due: $17,500.00
    
    Payment Terms: Net 30
    """
}

print("🧪 TEST 1: Processing Invoice")
print("=" * 70)
result = doc_processor.invoke(invoice_doc)

print(f"Document ID: {result['document_id']}")
print(f"Source: {result['source']}")
print()
print("📋 Classification:")
print(f"  Type: {result['document_type']}")
print(f"  Confidence: {result['classification_confidence']:.0%}")
print(f"  Language: {result['language']}")
print()
print("📊 Extraction:")
print(f"  Dates found: {len(result['dates'])} → {result['dates']}")
print(f"  Amounts found: {len(result['amounts'])} → {', '.join(f'${a:,.2f}' for a in result['amounts'])}")
print(f"  Entities: {len(result['entities'])}")
for entity in result['entities']:
    print(f"    - {entity['type']}: {entity['value']}")
print()
print("✓ Validation:")
print(f"  Valid: {result['is_valid']}")
if result['validation_errors']:
    print(f"  Errors: {result['validation_errors']}")
print()
print("📝 Summary:")
print(f"  {result['summary']}")
print("  Key Points:")
for point in result['key_points']:
    print(f"    • {point}")
print()
print("🎯 Routing:")
print(f"  Destination: {result['routing_decision'].upper()}")
print(f"  Priority: {result['priority'].upper()}")
print()
print(f"⏱️ Processing Time: {result['processing_started'][:19]} → {result['processing_completed'][:19]}")
print("=" * 70)
print()

In [None]:
# Test Document 2: Contract
contract_doc = {
    "document_id": "DOC-002",
    "source": "upload",
    "document_text": """
    SERVICE AGREEMENT
    
    This agreement is entered into on 2024-01-10 between:
    
    Party A: John Smith, representing Smith Enterprises LLC
    Party B: Jane Doe, representing Doe Consulting Inc
    
    The parties hereby agree to the following terms:
    
    1. Services: Consulting services for digital transformation
    2. Duration: 6 months starting 2024-02-01
    3. Compensation: $5,000.00 per month
    4. Termination: Either party may terminate with 30 days notice
    
    Signed on 2024-01-10
    """
}

print("🧪 TEST 2: Processing Contract")
print("=" * 70)
result = doc_processor.invoke(contract_doc)

print(f"📋 Classification: {result['document_type']} ({result['classification_confidence']:.0%})")
print(f"📊 Parties Identified: {', '.join(result['parties'])}")
print(f"✓ Valid: {result['is_valid']}")
print(f"📝 Summary: {result['summary']}")
print(f"🎯 Route to: {result['routing_decision'].upper()} (Priority: {result['priority'].upper()})")
print("=" * 70)
print()

In [None]:
# Test Document 3: Unknown/Ambiguous
unknown_doc = {
    "document_id": "DOC-003",
    "source": "scan",
    "document_text": "Hello, this is a short note. Thanks!"
}

print("🧪 TEST 3: Processing Unknown Document")
print("=" * 70)
result = doc_processor.invoke(unknown_doc)

print(f"📋 Classification: {result['document_type']} ({result['classification_confidence']:.0%})")
print(f"✓ Valid: {result['is_valid']}")
if result['validation_errors']:
    print("⚠️ Validation Errors:")
    for error in result['validation_errors']:
        print(f"  • {error}")
print(f"🎯 Route to: {result['routing_decision'].upper()} (Priority: {result['priority'].upper()})")
print("\n✅ Notice: Low confidence → routed to human review!")
print("=" * 70)

## ✍️ Hands-On Exercises

<div class="exercise-box">
<strong>🎯 Exercise 1: Add Error Handling</strong>
<br><br>
Enhance the document processor with robust error handling:
<ol>
<li>Add a <code>try/except</code> wrapper to each node</li>
<li>If a node fails, log the error to <code>state["errors"]</code></li>
<li>Add an error recovery node that handles failures</li>
<li>Test with malformed input (empty text, None values)</li>
</ol>
<br>
<strong>Hint:</strong> Create a wrapper function:
<pre><code>def safe_node(node_fn):
    def wrapper(state):
        try:
            return node_fn(state)
        except Exception as e:
            state["errors"] = (state.get("errors") or []) + [str(e)]
            return state
    return wrapper
</code></pre>
</div>
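
Before wiring the wrapper into the graph, it can be exercised on its own. The sketch below is one possible variant of the hint, not the reference solution: it prefixes each message with the node name and also copes with an `errors` key that is present but `None`. The `count_characters` node is a toy example invented for the test.

```python
from functools import wraps
from typing import Any, Callable, Dict

State = Dict[str, Any]


def safe_node(node_fn: Callable[[State], State]) -> Callable[[State], State]:
    """Catch exceptions raised by a node and record them in state["errors"]."""
    @wraps(node_fn)
    def wrapper(state: State) -> State:
        try:
            return node_fn(state)
        except Exception as exc:
            # `or []` covers both a missing key and an explicit None
            errors = (state.get("errors") or []) + [f"{node_fn.__name__}: {exc}"]
            return {**state, "errors": errors}
    return wrapper


@safe_node
def count_characters(state: State) -> State:
    """Toy node that raises TypeError when document_text is None."""
    return {**state, "length": len(state["document_text"])}


ok = count_characters({"document_text": "hello"})
failed = count_characters({"document_text": None, "errors": None})
print(ok)
print(failed)
```

Note that a wrapped node returns the state unchanged apart from `errors`, so downstream nodes still need to check that list (or a routing function needs to) before relying on fields the failed node was supposed to set.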

In [None]:
# Your solution here!

# TODO: Create safe_node wrapper
# TODO: Wrap all nodes with error handling
# TODO: Add error recovery node
# TODO: Test with bad inputs

<div class="exercise-box">
<strong>🎯 Exercise 2: Add Conditional Routing</strong>
<br><br>
Modify the graph to skip summarization for invalid documents:
<ol>
<li>After <code>validate</code> node, add conditional routing</li>
<li>If <code>is_valid == True</code> → go to <code>summarize</code></li>
<li>If <code>is_valid == False</code> → skip directly to <code>route</code></li>
<li>Test with both valid and invalid documents</li>
</ol>
<br>
<strong>Hint:</strong> Use <code>add_conditional_edges</code>:
<pre><code>def route_after_validation(state):
    return "summarize" if state["is_valid"] else "route"

workflow.add_conditional_edges(
    "validate",
    route_after_validation,
    {"summarize": "summarize", "route": "route"}
)
</code></pre>
</div>

In [None]:
# Your solution here!

# TODO: Define routing function
# TODO: Rebuild graph with conditional edges
# TODO: Test with valid and invalid docs

<div class="exercise-box">
<strong>🎯 Exercise 3: Add Parallel Processing</strong>
<br><br>
Some operations can run in parallel. Modify the graph to:
<ol>
<li>After <code>classify</code>, run <code>extract</code> AND <code>detect_language</code> in parallel</li>
<li>Create a new <code>detect_language</code> node (use simple heuristics)</li>
<li>Both should finish before moving to <code>validate</code></li>
</ol>
<br>
<strong>Challenge:</strong> Research how to add parallel branches in LangGraph!
</div>

In [None]:
# Your solution here!

# TODO: Create detect_language node
# TODO: Add parallel branches
# TODO: Test and measure performance improvement

## 📝 Week 1 Project: Content Moderation Pipeline

**Build a complete content moderation system for a social media platform.**

### Requirements:

**Input:** User-generated content (posts, comments)

**Workflow:**
1. **Classify** content type (text, spam, promotional, news)
2. **Detect** toxicity level (clean, mild, toxic, severe)
3. **Check** for policy violations (hate speech, misinformation, etc.)
4. **Route** based on results:
   - Clean → Auto-approve
   - Mild → Add warning label
   - Toxic → Queue for review
   - Severe → Auto-reject + alert moderators
5. **Log** all decisions for audit trail

### State Schema:
```python
class ModerationState(TypedDict):
    content_id: str
    content_text: str
    author_id: str
    
    content_type: Optional[str]
    toxicity_score: Optional[float]  # 0.0-1.0
    policy_violations: Optional[List[str]]
    
    decision: Optional[str]  # "approve", "warn", "review", "reject"
    reason: Optional[str]
    
    processed_at: Optional[str]
```
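
One way to turn `toxicity_score` and `policy_violations` into the four `decision` values is a small pure function that a router can call. Everything numeric here is an assumption made for illustration: the three thresholds are arbitrary starting points, and sending any policy violation to review is a design choice of this sketch rather than a project requirement.

```python
from typing import List

# Assumed cut-offs for illustration; tune them against labelled examples
MILD_THRESHOLD = 0.3
TOXIC_THRESHOLD = 0.6
SEVERE_THRESHOLD = 0.85


def decide(toxicity_score: float, policy_violations: List[str]) -> str:
    """Map a toxicity score and violations to approve/warn/review/reject."""
    if toxicity_score >= SEVERE_THRESHOLD:
        return "reject"
    if toxicity_score >= TOXIC_THRESHOLD or policy_violations:
        return "review"  # assumption: any policy violation needs a human
    if toxicity_score >= MILD_THRESHOLD:
        return "warn"
    return "approve"


for score, violations in [(0.1, []), (0.4, []), (0.7, []), (0.95, []), (0.1, ["hate speech"])]:
    print(score, violations, "->", decide(score, violations))
```

Keeping the thresholds as named constants makes the edge cases in your test set (sarcasm, ambiguous posts) easy to re-run after each adjustment.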

### Deliverables:
1. Complete state schema
2. 5+ node functions (classify, detect, check, route, log)
3. Graph with conditional routing
4. Test with 5 different content examples:
   - Clean post
   - Spam
   - Mild toxicity
   - Severe violation
   - Edge case (sarcasm, ambiguous)
5. ASCII diagram of your graph

### Bonus Challenges:
- Add retry logic for failed API calls
- Implement appeal mechanism (human override)
- Add metrics tracking (approval rate, false positives)
- Support multiple languages

### Starter Code:

In [None]:
# Content Moderation Project Starter

from typing import TypedDict, Optional, List
from langgraph.graph import StateGraph, END

# TODO: Define ModerationState
class ModerationState(TypedDict):
    pass  # Your state schema here

# TODO: Implement nodes
def classify_content(state: ModerationState) -> ModerationState:
    pass  # Your implementation

def detect_toxicity(state: ModerationState) -> ModerationState:
    pass  # Your implementation

# TODO: Build graph
# TODO: Test with examples

# Test cases
test_cases = [
    {"content_id": "1", "author_id": "user123", "content_text": "Great product! Highly recommend."},
    {"content_id": "2", "author_id": "user456", "content_text": "BUY NOW!!! 50% OFF CLICK HERE!!!"},
    {"content_id": "3", "author_id": "user789", "content_text": "This is stupid and annoying."},
    {"content_id": "4", "author_id": "user000", "content_text": "I hate you all! Worst people ever!"},
    {"content_id": "5", "author_id": "user111", "content_text": "Yeah right, like that's gonna work... 🙄"},
]

## 🎓 Key Takeaways

**What you learned this week:**

✅ **State Graphs > Linear Chains:**
- Shared state accessible by all nodes
- Conditional routing based on runtime values
- Support for cycles and parallel execution
- Better debugging and visualization

✅ **Core Components:**
- **State**: TypedDict with all workflow data
- **Nodes**: Functions that transform state
- **Edges**: Normal (fixed) or conditional (dynamic)
- **Entry/Exit**: START and END points

✅ **Real-World Application:**
- Built document processing pipeline
- Implemented classification, extraction, validation
- Added routing logic for different document types
- Routed invalid and low-confidence documents to human review

✅ **Best Practices:**
- Design state schema first (types matter!)
- Keep nodes small and focused (single responsibility)
- Use conditional routing for branching logic
- Always handle errors gracefully
- Test with diverse inputs (happy path + edge cases)

## 🔜 Next Week: Complex Workflows

In Week 2, we'll build advanced workflows with:
- **Subgraphs**: Nested workflows for modularity
- **Cycles**: Retry logic and iterative refinement
- **Parallel Execution**: Speed up independent tasks
- **Dynamic Routing**: Complex multi-branch decisions
- **Real Project**: Customer onboarding system (50+ steps)

**Preview question:** How would you implement a "retry failed step up to 3 times" logic in a graph?

## 📚 Additional Resources

- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [State Management Guide](https://langchain-ai.github.io/langgraph/concepts/low_level/#state)
- [Conditional Edges Examples](https://langchain-ai.github.io/langgraph/how-tos/branching/)
- [Graph Visualization Tools](https://langchain-ai.github.io/langgraph/how-tos/visualization/)

## 🐛 Debugging Tips

**Common Issues:**

1. **State not updating?**
   - Make sure nodes RETURN the updated state
   - Check for typos in state keys

2. **Conditional routing not working?**
   - Verify router function returns exact strings from edge map
   - Print state values before routing

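3. **Graph hangs/infinite loop?**
   - Check that all paths eventually reach END
   - Look for cycles without exit conditions
   - LangGraph aborts a run with a `GraphRecursionError` once it exceeds the configured `recursion_limit`, so that error usually points to a cycle that never exits

4. **`KeyError` or type errors?**
   - TypedDict is not enforced at runtime: a field no node has written is simply missing, so read it with `state.get("field")` or initialize Optional fields with `state["field"] = None`
   - Check TypedDict annotations match actual usage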
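
The "cycles without exit conditions" tip is easiest to apply when the exit condition lives in the state itself. The following plain-Python sketch shows the shape of such a cycle: a node that counts its attempts and a router that stops after a fixed budget. `MAX_ATTEMPTS`, `call_flaky_service`, and the `succeeds_on_attempt` field are hypothetical names used only to simulate a failing step.

```python
from typing import Any, Dict

State = Dict[str, Any]
MAX_ATTEMPTS = 3  # assumed retry budget


def call_flaky_service(state: State) -> State:
    """Simulated node: fails until the attempt number given in the state."""
    attempts = state.get("attempts", 0) + 1
    succeeded = attempts >= state["succeeds_on_attempt"]
    return {**state, "attempts": attempts, "ok": succeeded}


def route_after_call(state: State) -> str:
    """Router with an explicit exit condition for the retry cycle."""
    if state["ok"]:
        return "done"
    if state["attempts"] >= MAX_ATTEMPTS:
        return "give_up"
    return "retry"


def run(state: State) -> State:
    """Drive the node/router pair the way a cyclic graph would."""
    while True:
        state = call_flaky_service(state)
        outcome = route_after_call(state)
        if outcome != "retry":
            return {**state, "outcome": outcome}


recovered = run({"succeeds_on_attempt": 2})
exhausted = run({"succeeds_on_attempt": 10})
print(recovered["outcome"], recovered["attempts"])
print(exhausted["outcome"], exhausted["attempts"])
```

In a graph, `route_after_call` would be the function passed to `add_conditional_edges`, with `"retry"` mapped back to the calling node and the other two outcomes mapped to nodes that lead to END.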

---

**Congratulations on completing Week 1!** You now know how to build sophisticated, production-ready workflows with LangGraph. See you next week!