# LangGraph Advanced Multi-Agent Supervisor
## Customer Support Operations Center

In our introductory LangGraph tutorial, we built a basic supervisor that routed between a researcher and writer. That supervisor used simple string matching on LLM output to decide routing—functional but fragile.

In this notebook, we level up significantly:

| Feature | Basic Supervisor | Advanced Supervisor |
|---------|-----------------|-------------------|
| Routing | String matching on LLM text | **Structured output** via Pydantic models |
| State | Messages and `next` only | Messages + **scratchpad** + metadata |
| Loop safety | None | **Iteration guards** with configurable limits |
| Re-routing | One agent per query | Dynamic **re-routing** based on intermediate results |
| Quality | None | **Quality check** node that can send work back |

### What We're Building

A customer support operations center where:
- An **intake classifier** categorizes incoming requests
- A **supervisor** routes to specialized agents using structured decisions
- **Billing**, **tech support**, and **escalation** agents handle their domains
- A **quality check** validates resolution quality and can re-route
- A **shared scratchpad** lets agents build on each other's findings
- **Iteration guards** prevent infinite loops

### Architecture
```
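START → intake_classifier → supervisor → {billing_agent, tech_support_agent, escalation_agent} → quality_check
quality_check → END (approved) | supervisor (anything else)
supervisor → quality_check (direct review) | END (FINISH or iteration limit reached)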
```

In [None]:
# Install dependencies (skip if already installed)
%pip install -q langgraph langchain langchain-openai langchain-community

In [None]:
import os
import getpass
from typing import Annotated, TypedDict, Literal
from datetime import datetime

from pydantic import BaseModel, Field

from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

# API Key setup
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
print("Setup complete!")

## 1. Designing Rich State

In the basic tutorial, our state was just `messages` and `next`. For a production-grade supervisor, we need much more:

- **`scratchpad`** — A shared dictionary where agents can store structured findings for other agents to reference
- **`iteration_count`** — Guards against infinite routing loops  
- **`customer_sentiment`** — Metadata that influences routing decisions
- **`resolution_status`** — Tracks whether the issue is resolved, pending, or escalated

### Custom Reducer for Scratchpad

The scratchpad uses a **merge reducer** — when an agent returns `{"scratchpad": {"billing_info": {...}}}`, it merges into the existing scratchpad rather than replacing it. This lets agents incrementally build shared context.

In [None]:
def merge_scratchpad(existing: dict, update: dict) -> dict:
    """Custom reducer: merge new scratchpad entries into existing ones."""
    merged = existing.copy()
    merged.update(update)
    return merged

class SupervisorState(TypedDict):
    messages: Annotated[list, add_messages]
    next: str
    scratchpad: Annotated[dict, merge_scratchpad]
    iteration_count: int
    customer_sentiment: str
    resolution_status: str

print("SupervisorState defined with custom scratchpad reducer")
print(f"Fields: {list(SupervisorState.__annotations__.keys())}")
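
Before moving on, it helps to see exactly what the reducer does with repeated keys. The sketch below repeats `merge_scratchpad` so it runs on its own, and adds a `deep_merge` variant that is purely hypothetical (it is not used anywhere in this notebook) to show one possible alternative if nested entries should accumulate instead of being replaced.

```python
def merge_scratchpad(existing: dict, update: dict) -> dict:
    """Same shallow merge as the reducer above, repeated so this cell runs alone."""
    merged = existing.copy()
    merged.update(update)
    return merged


pad = {}
pad = merge_scratchpad(pad, {"intake_classification": {"category": "billing"}})
pad = merge_scratchpad(pad, {"billing_findings": "Payment TXN-1002 failed"})
print(sorted(pad))  # different top-level keys accumulate side by side

# Same top-level key: the new value replaces the old one wholesale,
# so the nested "category" entry is gone after this update.
pad = merge_scratchpad(pad, {"intake_classification": {"sentiment": "angry"}})
print(pad["intake_classification"])


def deep_merge(existing: dict, update: dict) -> dict:
    """Hypothetical alternative: recurse into nested dicts instead of replacing them."""
    merged = dict(existing)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


print(deep_merge({"intake": {"category": "billing"}}, {"intake": {"sentiment": "angry"}}))
```

In this notebook the shallow behavior is what we want: each agent owns one top-level key (`billing_findings`, `tech_findings`, and so on), and a second run of the same agent simply replaces its earlier finding.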

## 2. Structured Supervisor Routing with Pydantic

The biggest upgrade from our basic supervisor: instead of parsing free-text LLM output, we use **`with_structured_output()`** to force the LLM to return a Pydantic model.

This gives us:
- **Type-safe routing**: the `next` field is constrained to valid agent names
- **Reasoning transparency**: the LLM must explain its routing decision
- **Confidence scoring**: every decision carries a score; this notebook only logs it, but downstream code could use it to trigger fallbacks
- **No free-text parsing**: the `if "researcher" in response.lower()` fragility goes away, although a response that fails schema validation will still raise an error
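
To make the contrast concrete, here is a dependency-free sketch of the two approaches. It does not call an LLM or Pydantic: `route_by_substring` and `route_by_field` are hypothetical helpers written for this illustration, and the dict passed to `route_by_field` stands in for the structured object the model would return.

```python
from typing import Literal, get_args

Route = Literal["billing_agent", "tech_support_agent", "escalation_agent", "quality_check", "FINISH"]
VALID_ROUTES = set(get_args(Route))


def route_by_substring(text: str) -> str:
    """Fragile approach: the first agent keyword found in free text wins."""
    lowered = text.lower()
    if "billing" in lowered:
        return "billing_agent"
    if "tech" in lowered:
        return "tech_support_agent"
    return "FINISH"


def route_by_field(payload: dict) -> str:
    """Structured approach: read one field and reject anything outside the allowed set."""
    value = payload.get("next")
    if value not in VALID_ROUTES:
        raise ValueError(f"invalid route: {value!r}")
    return value


reply = "This is not a billing problem; send it to tech support."
print(route_by_substring(reply))                       # picks billing_agent, which is wrong
print(route_by_field({"next": "tech_support_agent"}))  # reads the intended route
```

The substring version misroutes because the word "billing" appears in a negated sentence. The field-based version never looks at prose, and an unknown route name fails loudly instead of silently falling through.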

In [None]:
class RouteDecision(BaseModel):
    """Structured routing decision from the supervisor."""
    next: Literal["billing_agent", "tech_support_agent", "escalation_agent", "quality_check", "FINISH"] = Field(
        description="The next agent to route to, or FINISH if the task is complete"
    )
    reasoning: str = Field(
        description="Brief explanation of why this routing decision was made"
    )
    confidence: float = Field(
        description="Confidence in this routing decision from 0.0 to 1.0",
        ge=0.0, le=1.0
    )

# Create a structured LLM that always returns RouteDecision
structured_llm = llm.with_structured_output(RouteDecision)

# Quick test
test_decision = structured_llm.invoke(
    "A customer is asking about a charge on their credit card. Route this to the right agent."
)
print(f"Decision: {test_decision}")
print(f"Next: {test_decision.next}")
print(f"Reasoning: {test_decision.reasoning}")
print(f"Confidence: {test_decision.confidence}")
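
The `confidence` field is only printed above. One way it could drive behavior is a simple threshold policy, sketched below. Everything here is an assumption made for illustration: `Decision` is a plain dataclass standing in for `RouteDecision` so the cell runs without an LLM call, and `CONFIDENCE_FLOOR` and `apply_confidence_policy` are hypothetical names, not part of the graph built in this notebook.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Plain stand-in for RouteDecision so the sketch runs without an LLM call."""
    next: str
    reasoning: str
    confidence: float


CONFIDENCE_FLOOR = 0.6  # assumed threshold; tune it against real traffic


def apply_confidence_policy(decision: Decision, floor: float = CONFIDENCE_FLOOR) -> str:
    """Return the node to run next, escalating when the router is unsure."""
    if decision.next == "FINISH":
        return "FINISH"  # leave finished conversations alone
    if decision.confidence < floor:
        return "escalation_agent"
    return decision.next


print(apply_confidence_policy(Decision("billing_agent", "charge dispute", 0.92)))
print(apply_confidence_policy(Decision("tech_support_agent", "unclear request", 0.35)))
```

A policy like this would sit between the structured LLM call and the returned `next` value in the supervisor node. Whether low confidence should escalate, ask a clarifying question, or trigger a second opinion is a product decision rather than a LangGraph one.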

## 3. Mock Tools for Agent Capabilities

Each agent has domain-specific tools. In production, these would connect to real databases and APIs. Here we use mock implementations that return realistic data.

In [None]:
@tool
def lookup_billing_info(customer_id: str) -> str:
    """Look up billing information for a customer."""
    billing_data = {
        "CUST-001": "Plan: Enterprise ($299/mo), Last payment: 2024-01-15, Status: Active, Balance: $0.00",
        "CUST-002": "Plan: Pro ($49/mo), Last payment: 2024-01-10, Status: Active, Balance: $49.00 (overdue)",
        "CUST-003": "Plan: Starter ($19/mo), Last payment: 2023-12-01, Status: Suspended, Balance: $57.00",
    }
    return billing_data.get(customer_id, f"No billing record found for {customer_id}")

@tool
def check_payment_status(transaction_id: str) -> str:
    """Check the status of a specific payment transaction."""
    transactions = {
        "TXN-1001": "Amount: $299.00, Date: 2024-01-15, Status: Completed, Method: Credit Card ending 4242",
        "TXN-1002": "Amount: $49.00, Date: 2024-01-10, Status: Failed, Method: Credit Card ending 1234, Error: Insufficient funds",
        "TXN-1003": "Amount: $19.00, Date: 2023-12-01, Status: Refunded, Method: PayPal",
    }
    return transactions.get(transaction_id, f"Transaction {transaction_id} not found")

@tool
def search_knowledge_base(query: str) -> str:
    """Search the technical knowledge base for solutions."""
    kb = {
        "login": "Common login issues: 1) Clear browser cache, 2) Reset password via /forgot-password, 3) Check if account is locked after 5 failed attempts",
        "api": "API troubleshooting: 1) Verify API key in Settings > API, 2) Check rate limits (100 req/min for Pro, 1000 for Enterprise), 3) Ensure correct base URL: api.example.com/v2",
        "integration": "Integration setup: 1) OAuth2 flow required, 2) Callback URL must be HTTPS, 3) Scopes needed: read, write, admin",
        "performance": "Performance issues: 1) Check system status at status.example.com, 2) Clear local cache, 3) Try incognito mode, 4) Check browser compatibility",
    }
    for key, value in kb.items():
        if key in query.lower():
            return value
    return f"No knowledge base articles found for: {query}"

@tool
def get_product_docs(feature: str) -> str:
    """Retrieve product documentation for a specific feature."""
    docs = {
        "billing": "Billing docs: Invoices generated on 1st of month. Payment methods: Credit card, PayPal, Wire transfer. Refund policy: Full refund within 30 days.",
        "api": "API docs: RESTful API v2. Auth: Bearer token. Rate limits vary by plan. Webhooks available for event notifications.",
        "dashboard": "Dashboard docs: Real-time analytics, custom report builder, data export (CSV/JSON), role-based access control.",
    }
    return docs.get(feature.lower(), f"No documentation found for: {feature}")

@tool
def create_ticket(priority: str, category: str, description: str) -> str:
    """Create an escalation ticket in the ticketing system."""
    ticket_id = f"TKT-{datetime.now().strftime('%Y%m%d%H%M%S')}"
    return f"Ticket created: {ticket_id} | Priority: {priority} | Category: {category} | Description: {description}"

print("5 tools defined: lookup_billing_info, check_payment_status, search_knowledge_base, get_product_docs, create_ticket")

## 4. Intake Classifier Node

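The first node in our graph classifies the incoming customer request. It records a suggested first agent, the customer's sentiment, and a one-sentence summary in state. The supervisor still makes the actual routing decision, using the classification stored in the scratchpad as context.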

In [None]:
class IntakeClassification(BaseModel):
    """Classification of incoming customer request."""
    category: Literal["billing", "technical", "escalation", "general"] = Field(
        description="The category of the customer's request"
    )
    sentiment: Literal["positive", "neutral", "frustrated", "angry"] = Field(
        description="The customer's emotional tone"
    )
    summary: str = Field(
        description="One-sentence summary of the customer's issue"
    )

intake_llm = llm.with_structured_output(IntakeClassification)

def intake_classifier(state: SupervisorState) -> dict:
    """Classify the incoming customer request."""
    classification = intake_llm.invoke([
        SystemMessage(content="""You are a customer support intake classifier. 
Analyze the customer's message and classify it by category and sentiment.
Categories: billing (payments, charges, invoices), technical (bugs, API, integrations), 
escalation (angry customers, repeated issues, legal threats), general (other)."""),
        *state["messages"]
    ])
    
    # Suggest a first agent (the supervisor overwrites `next` with its own decision)
    category_to_agent = {
        "billing": "billing_agent",
        "technical": "tech_support_agent",
        "escalation": "escalation_agent",
        "general": "tech_support_agent",
    }
    
    return {
        "next": category_to_agent[classification.category],
        "customer_sentiment": classification.sentiment,
        "scratchpad": {
            "intake_classification": {
                "category": classification.category,
                "sentiment": classification.sentiment,
                "summary": classification.summary,
            }
        },
        "iteration_count": 0,
        "resolution_status": "in_progress",
    }

print("Intake classifier defined — classifies by category + sentiment")

## 5. The Supervisor Node

The supervisor is the brain of our system. It:
1. Reviews all messages and scratchpad data
2. Makes a **structured routing decision** using `RouteDecision`
3. Enforces an **iteration guard** to prevent infinite loops
4. Can route to quality check or finish the conversation

In [None]:
MAX_ITERATIONS = 5

SUPERVISOR_PROMPT = """You are the supervisor of a customer support operations center.

Your team:
- billing_agent: Handles payment issues, charges, refunds, account billing
- tech_support_agent: Handles technical problems, API issues, integrations, bugs
- escalation_agent: Handles angry customers, complex issues requiring human handoff, legal concerns

Current scratchpad (shared context from agents):
{scratchpad}

Current iteration: {iteration} of {max_iterations}
Customer sentiment: {sentiment}
Resolution status: {resolution_status}

Based on the conversation so far, decide the next action:
- Route to an agent if more work is needed
- Route to "quality_check" if an agent has provided a response that should be validated
- Route to "FINISH" if the customer's issue is fully resolved

If we're approaching the iteration limit, prefer wrapping up or escalating."""

def supervisor_node(state: SupervisorState) -> dict:
    """Supervisor makes structured routing decisions."""
    iteration = state.get("iteration_count", 0)
    
    # Iteration guard
    if iteration >= MAX_ITERATIONS:
        return {
            "next": "FINISH",
            "messages": [AIMessage(content="[Supervisor] Maximum iterations reached. Wrapping up the conversation.")],
            "resolution_status": "max_iterations_reached",
        }
    
    prompt = SUPERVISOR_PROMPT.format(
        scratchpad=state.get("scratchpad", {}),
        iteration=iteration,
        max_iterations=MAX_ITERATIONS,
        sentiment=state.get("customer_sentiment", "unknown"),
        resolution_status=state.get("resolution_status", "unknown"),
    )
    
    decision = structured_llm.invoke([
        SystemMessage(content=prompt),
        *state["messages"],
    ])
    
    return {
        "next": decision.next,
        "iteration_count": iteration + 1,
        "messages": [AIMessage(content=f"[Supervisor] Routing to {decision.next} (confidence: {decision.confidence:.0%}). Reason: {decision.reasoning}")],
        "scratchpad": {"last_routing_decision": {
            "next": decision.next,
            "reasoning": decision.reasoning,
            "confidence": decision.confidence,
            "iteration": iteration + 1,
        }},
    }

print(f"Supervisor node defined with {MAX_ITERATIONS}-iteration guard")
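
How many node executions can the guard allow in the worst case? The sketch below walks the loop by hand, assuming the least favorable path: the supervisor always picks an agent and the quality check never approves. `count_steps` is a hypothetical helper for this arithmetic only; it does not touch the graph or the LLM.

```python
MAX_ITERATIONS = 5


def count_steps(max_iterations: int) -> tuple[int, int]:
    """Walk the worst-case loop and return (supervisor decisions, total node runs)."""
    iteration = 0
    node_runs = 1  # intake_classifier runs once
    while True:
        node_runs += 1  # the supervisor runs
        if iteration >= max_iterations:
            break  # guard fires: the supervisor returns FINISH
        iteration += 1
        node_runs += 2  # an agent, then quality_check, which never approves
    return iteration, node_runs


decisions, runs = count_steps(MAX_ITERATIONS)
print(decisions, runs)  # 5 17
```

The total matters because LangGraph enforces its own `recursion_limit` on graph steps (25 by default as far as we know; check your installed version). With a limit of 5 the worst case is 17 node runs, but by the same arithmetic a limit of 8 would need 26, so raising `MAX_ITERATIONS` much further may also require passing a larger `recursion_limit` in the config given to `invoke`.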

## 6. Specialized Agent Nodes

Each agent has:
- A focused **system prompt** defining its expertise
- Access to **domain-specific tools**
- The ability to read from and write to the **shared scratchpad**

In [None]:
# Billing Agent
billing_tools = [lookup_billing_info, check_payment_status]
billing_llm = llm.bind_tools(billing_tools)

def billing_agent(state: SupervisorState) -> dict:
    """Billing specialist agent."""
    scratchpad = state.get("scratchpad", {})
    response = billing_llm.invoke([
        SystemMessage(content=f"""You are a billing support specialist. You help customers with:
- Payment issues and failed transactions
- Billing inquiries and invoice questions  
- Refund requests and credit adjustments
- Account plan changes

Shared context from other agents: {scratchpad}

Use your tools to look up billing information. Provide clear, helpful responses.
Always reference specific account details when available.
If the issue requires technical support or is beyond billing scope, say so clearly."""),
        *state["messages"],
    ])
    
    # If the LLM made tool calls, execute them
    if response.tool_calls:
        from langgraph.prebuilt import ToolNode
        tool_node = ToolNode(billing_tools)
        tool_results = tool_node.invoke({"messages": [response]})
        
        # Make a follow-up call with tool results
        follow_up = llm.invoke([
            SystemMessage(content=f"""You are a billing support specialist. 
Summarize the billing information you found and provide a helpful response to the customer.
Shared context: {scratchpad}"""),
            *state["messages"],
            response,
            *tool_results["messages"],
        ])
        
        return {
            "messages": [AIMessage(content=f"[Billing Agent] {follow_up.content}")],
            "scratchpad": {"billing_findings": follow_up.content},
        }
    
    return {
        "messages": [AIMessage(content=f"[Billing Agent] {response.content}")],
        "scratchpad": {"billing_findings": response.content},
    }

# Tech Support Agent
tech_tools = [search_knowledge_base, get_product_docs]
tech_llm = llm.bind_tools(tech_tools)

def tech_support_agent(state: SupervisorState) -> dict:
    """Technical support specialist agent."""
    scratchpad = state.get("scratchpad", {})
    response = tech_llm.invoke([
        SystemMessage(content=f"""You are a technical support specialist. You help customers with:
- Software bugs and error messages
- API integration issues  
- Performance problems
- Feature questions and how-to guidance

Shared context from other agents: {scratchpad}

Use your tools to search the knowledge base and documentation.
Provide step-by-step solutions when possible.
If the issue requires billing support or escalation, say so clearly."""),
        *state["messages"],
    ])
    
    if response.tool_calls:
        from langgraph.prebuilt import ToolNode
        tool_node = ToolNode(tech_tools)
        tool_results = tool_node.invoke({"messages": [response]})
        
        follow_up = llm.invoke([
            SystemMessage(content=f"""You are a technical support specialist.
Based on the knowledge base results, provide a clear solution to the customer.
Shared context: {scratchpad}"""),
            *state["messages"],
            response,
            *tool_results["messages"],
        ])
        
        return {
            "messages": [AIMessage(content=f"[Tech Support] {follow_up.content}")],
            "scratchpad": {"tech_findings": follow_up.content},
        }
    
    return {
        "messages": [AIMessage(content=f"[Tech Support] {response.content}")],
        "scratchpad": {"tech_findings": response.content},
    }

# Escalation Agent
escalation_tools = [create_ticket]
escalation_llm = llm.bind_tools(escalation_tools)

def escalation_agent(state: SupervisorState) -> dict:
    """Escalation specialist for complex or sensitive issues."""
    scratchpad = state.get("scratchpad", {})
    response = escalation_llm.invoke([
        SystemMessage(content=f"""You are an escalation specialist. You handle:
- Angry or frustrated customers needing de-escalation
- Complex issues that span multiple departments
- Legal or compliance concerns
- Issues requiring human manager review

Shared context from other agents: {scratchpad}

Your approach:
1. Acknowledge the customer's frustration
2. Summarize what has been tried so far
3. Create an escalation ticket if needed
4. Provide a clear next-steps timeline"""),
        *state["messages"],
    ])
    
    if response.tool_calls:
        from langgraph.prebuilt import ToolNode
        tool_node = ToolNode(escalation_tools)
        tool_results = tool_node.invoke({"messages": [response]})
        
        follow_up = llm.invoke([
            SystemMessage(content=f"""You are an escalation specialist.
A ticket has been created. Inform the customer about the escalation and next steps.
Shared context: {scratchpad}"""),
            *state["messages"],
            response,
            *tool_results["messages"],
        ])
        
        return {
            "messages": [AIMessage(content=f"[Escalation] {follow_up.content}")],
            "scratchpad": {"escalation_findings": follow_up.content},
            "resolution_status": "escalated",
        }
    
    return {
        "messages": [AIMessage(content=f"[Escalation] {response.content}")],
        "scratchpad": {"escalation_findings": response.content},
        "resolution_status": "escalated",
    }

print("3 agent nodes defined: billing_agent, tech_support_agent, escalation_agent")

## 7. Quality Check Node

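The quality check is the review step of our advanced supervisor. After an agent responds, this node evaluates whether the response actually resolves the customer's issue and recommends one of three actions:
- **Approve** the response, which ends the run
- **Re-route** to a different agent if the response is insufficient
- **Request refinement** by sending the work back to the same agent

Only approval is acted on directly. For the other two, the recommendation is written to the message history and the scratchpad, and control returns to the supervisor, which makes the final routing call. The resulting feedback loop gives weak answers a second pass before they reach the customer.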

In [None]:
class QualityAssessment(BaseModel):
    """Quality assessment of an agent's response."""
    is_resolved: bool = Field(description="Whether the customer's issue appears resolved")
    quality_score: float = Field(description="Quality score from 0.0 to 1.0", ge=0.0, le=1.0)
    recommendation: Literal["approve", "reroute_billing", "reroute_tech", "reroute_escalation", "needs_refinement"] = Field(
        description="What to do next with this conversation"
    )
    feedback: str = Field(description="Specific feedback about the response quality")

quality_llm = llm.with_structured_output(QualityAssessment)

def quality_check(state: SupervisorState) -> dict:
    """Evaluate the quality of the agent's response."""
    assessment = quality_llm.invoke([
        SystemMessage(content="""You are a quality assurance reviewer for customer support.
        
Evaluate the last agent response. Consider:
1. Did it directly address the customer's question?
2. Was the information accurate and specific?
3. Was the tone appropriate for the customer's sentiment?
4. Are there any loose ends or unanswered aspects?

Recommend:
- "approve" if the response is good and the issue seems resolved
- "reroute_billing/tech/escalation" if a different specialist is needed
- "needs_refinement" if the current agent should try again with more detail"""),
        *state["messages"],
    ])
    
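    recommendation_to_next = {
        "approve": "FINISH",
        "reroute_billing": "billing_agent",
        "reroute_tech": "tech_support_agent",
        "reroute_escalation": "escalation_agent",
        # state["next"] is the supervisor's last choice: normally the agent that just ran,
        # but "quality_check" itself if the supervisor routed here directly.
        "needs_refinement": state.get("next", "tech_support_agent"),
    }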
    
    next_node = recommendation_to_next[assessment.recommendation]
    
    return {
        "next": next_node,
        "messages": [AIMessage(content=f"[Quality Check] Score: {assessment.quality_score:.0%} | Action: {assessment.recommendation} | {assessment.feedback}")],
        "scratchpad": {"quality_assessment": {
            "score": assessment.quality_score,
            "recommendation": assessment.recommendation,
            "feedback": assessment.feedback,
        }},
        "resolution_status": "resolved" if assessment.is_resolved else "in_progress",
    }

print("Quality check node defined — evaluates and can re-route responses")
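
The `needs_refinement` branch relies on `state["next"]` still naming the agent that just ran. That holds when an agent hands off to the quality check, but not when the supervisor routes to `quality_check` directly. If a stricter answer is needed, one option is to read the bracketed prefixes the agents put on their replies. The sketch below assumes those prefixes stay exactly as written in this notebook; `TAG_TO_AGENT` and `last_agent` are hypothetical names, and plain strings stand in for message objects.

```python
from typing import Optional

# Prefixes used by the agent nodes in this notebook, mapped back to node names.
TAG_TO_AGENT = {
    "[Billing Agent]": "billing_agent",
    "[Tech Support]": "tech_support_agent",
    "[Escalation]": "escalation_agent",
}


def last_agent(contents: list[str]) -> Optional[str]:
    """Return the agent that produced the most recent tagged reply, if any."""
    for content in reversed(contents):
        for tag, agent in TAG_TO_AGENT.items():
            if content.startswith(tag):
                return agent
    return None


history = [
    "My payment failed.",
    "[Supervisor] Routing to billing_agent ...",
    "[Billing Agent] Transaction TXN-1002 failed: insufficient funds.",
    "[Supervisor] Routing to quality_check ...",
]
print(last_agent(history))  # billing_agent
```

Parsing prefixes is a stopgap. A more robust design would store the last agent in a dedicated state field, but that changes the state schema and is left out here.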

## 8. Graph Assembly

Now we wire everything together. The key routing logic:
1. **intake_classifier** always routes to **supervisor**
2. **supervisor** routes to an agent, quality_check, or FINISH based on structured output
3. **Agent nodes** always route to **quality_check** 
4. **quality_check** ends the run when it approves the response; otherwise it routes back to **supervisor**, which then decides next steps

In [None]:
def route_from_supervisor(state: SupervisorState) -> str:
    """Route based on supervisor's structured decision."""
    next_node = state.get("next", "FINISH")
    if next_node == "FINISH":
        return END
    return next_node

def route_from_quality(state: SupervisorState) -> str:
    """Route based on quality check assessment."""
    next_node = state.get("next", "FINISH")
    if next_node == "FINISH":
        return END
    return "supervisor"  # Always go back through supervisor for re-routing

# Build the graph
graph = StateGraph(SupervisorState)

# Add all nodes
graph.add_node("intake_classifier", intake_classifier)
graph.add_node("supervisor", supervisor_node)
graph.add_node("billing_agent", billing_agent)
graph.add_node("tech_support_agent", tech_support_agent)
graph.add_node("escalation_agent", escalation_agent)
graph.add_node("quality_check", quality_check)

# Entry point
graph.add_edge(START, "intake_classifier")
graph.add_edge("intake_classifier", "supervisor")

# Supervisor routes to agents or finish
graph.add_conditional_edges("supervisor", route_from_supervisor, {
    "billing_agent": "billing_agent",
    "tech_support_agent": "tech_support_agent",
    "escalation_agent": "escalation_agent",
    "quality_check": "quality_check",
    END: END,
})

# All agents route to quality check
graph.add_edge("billing_agent", "quality_check")
graph.add_edge("tech_support_agent", "quality_check")
graph.add_edge("escalation_agent", "quality_check")

# Quality check routes back to supervisor or finish
graph.add_conditional_edges("quality_check", route_from_quality, {
    "supervisor": "supervisor",
    END: END,
})

# Compile
app = graph.compile()
print("Graph compiled successfully!")
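
Because the two routing functions only read `state["next"]`, they can be checked with plain dicts, without compiling a graph or calling an LLM. The sketch below redefines them in condensed form so it is self-contained; `END` is replaced by a string stand-in, which is an assumption made to avoid importing LangGraph here.

```python
END = "__end__"  # stand-in for langgraph.graph.END so the sketch has no dependencies


def route_from_supervisor(state: dict) -> str:
    """Follow the supervisor's decision, translating FINISH into END."""
    next_node = state.get("next", "FINISH")
    return END if next_node == "FINISH" else next_node


def route_from_quality(state: dict) -> str:
    """End on approval; otherwise hand control back to the supervisor."""
    next_node = state.get("next", "FINISH")
    return END if next_node == "FINISH" else "supervisor"


for nxt in ["billing_agent", "quality_check", "FINISH"]:
    from_supervisor = route_from_supervisor({"next": nxt})
    from_quality = route_from_quality({"next": nxt})
    print(f"{nxt:>15} | supervisor -> {from_supervisor:<15} | quality_check -> {from_quality}")
```

Note the asymmetry: the supervisor's router passes the node name through, while the quality check's router collapses every non-FINISH value to `supervisor`. A missing `next` key ends the run in both cases.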

In [None]:
# Visualize the graph
from IPython.display import display, Image

try:
    display(Image(app.get_graph().draw_mermaid_png()))
except Exception:
    # Fallback to text representation
    print(app.get_graph().draw_mermaid())

## 9. Scenario 1: Straightforward Billing Inquiry

Let's start with a simple case — a customer asking about their bill. This should flow cleanly through intake → supervisor → billing_agent → quality_check → done.

In [None]:
result = app.invoke({
    "messages": [HumanMessage(content="Hi, I'm customer CUST-002 and my last payment of $49 seems to have failed. Can you check what happened with transaction TXN-1002?")],
    "scratchpad": {},
    "iteration_count": 0,
    "customer_sentiment": "",
    "resolution_status": "",
    "next": "",
})

print("=" * 80)
print("SCENARIO 1: Straightforward Billing Inquiry")
print("=" * 80)
for msg in result["messages"]:
    if isinstance(msg, HumanMessage):
        print(f"\n{'CUSTOMER':>15}: {msg.content[:200]}")
    elif isinstance(msg, AIMessage):
        print(f"\n{'AGENT':>15}: {msg.content[:300]}")
print(f"\n{'RESOLUTION':>15}: {result.get('resolution_status', 'unknown')}")
print(f"{'ITERATIONS':>15}: {result.get('iteration_count', 0)}")

## 10. Scenario 2: Tech Issue That Needs Escalation

This scenario tests **dynamic re-routing**. The request starts as a technical issue, but its urgency and business impact mean it may need escalation.

In [None]:
result2 = app.invoke({
    "messages": [HumanMessage(content="Our entire team of 50 people has been locked out of the platform for 3 hours now. We're losing thousands of dollars. Our API integrations are all failing too. This is completely unacceptable and we need this fixed NOW. Customer ID: CUST-001.")],
    "scratchpad": {},
    "iteration_count": 0,
    "customer_sentiment": "",
    "resolution_status": "",
    "next": "",
})

print("=" * 80)
print("SCENARIO 2: Tech Issue Requiring Escalation")
print("=" * 80)
for msg in result2["messages"]:
    if isinstance(msg, HumanMessage):
        print(f"\n{'CUSTOMER':>15}: {msg.content[:200]}")
    elif isinstance(msg, AIMessage):
        print(f"\n{'AGENT':>15}: {msg.content[:300]}")
print(f"\n{'RESOLUTION':>15}: {result2.get('resolution_status', 'unknown')}")
print(f"{'ITERATIONS':>15}: {result2.get('iteration_count', 0)}")
print(f"{'SENTIMENT':>15}: {result2.get('customer_sentiment', 'unknown')}")

## 11. Scenario 3: Ambiguous Multi-Department Issue

The most interesting case — a request that touches both billing AND technical domains, requiring the supervisor to coordinate multiple agents.

In [None]:
result3 = app.invoke({
    "messages": [HumanMessage(content="I upgraded from Pro to Enterprise last week, but my API rate limits haven't changed — I'm still capped at 100 req/min instead of the 1000 I should have. Also, I was charged $299 but my old $49 charge went through too. Can you sort both issues out? Customer CUST-001, transaction TXN-1001.")],
    "scratchpad": {},
    "iteration_count": 0,
    "customer_sentiment": "",
    "resolution_status": "",
    "next": "",
})

print("=" * 80)
print("SCENARIO 3: Multi-Department Issue (Billing + Tech)")
print("=" * 80)
for msg in result3["messages"]:
    if isinstance(msg, HumanMessage):
        print(f"\n{'CUSTOMER':>15}: {msg.content[:200]}")
    elif isinstance(msg, AIMessage):
        print(f"\n{'AGENT':>15}: {msg.content[:300]}")
print(f"\n{'RESOLUTION':>15}: {result3.get('resolution_status', 'unknown')}")
print(f"{'ITERATIONS':>15}: {result3.get('iteration_count', 0)}")

## 12. Scratchpad Deep Dive

The shared scratchpad is one of the most powerful patterns in this system. Let's examine how agents built on each other's findings across our scenarios.

In [None]:
import json

print("=" * 80)
print("SCRATCHPAD ANALYSIS")
print("=" * 80)

for name, result_data in [("Scenario 1 (Billing)", result), 
                           ("Scenario 2 (Escalation)", result2),
                           ("Scenario 3 (Multi-dept)", result3)]:
    print(f"\n{'─' * 60}")
    print(f"  {name}")
    print(f"{'─' * 60}")
    scratchpad = result_data.get("scratchpad", {})
    for key, value in scratchpad.items():
        if isinstance(value, dict):
            print(f"\n  [{key}]")
            for k, v in value.items():
                v_str = str(v)[:100] + "..." if len(str(v)) > 100 else str(v)
                print(f"    {k}: {v_str}")
        else:
            v_str = str(value)[:100] + "..." if len(str(value)) > 100 else str(value)
            print(f"\n  [{key}]: {v_str}")
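
The truncation expression appears twice in the loop above. A small helper keeps the display logic in one place and also renders nested values as JSON instead of Python `repr`. This is only a sketch: `preview` is a hypothetical name, and the sample scratchpad is made-up data in the shape our agents produce.

```python
import json


def preview(value, limit: int = 100) -> str:
    """Render any scratchpad value on one line, truncated to `limit` characters."""
    text = value if isinstance(value, str) else json.dumps(value, default=str, sort_keys=True)
    return text if len(text) <= limit else text[:limit] + "..."


# Made-up scratchpad in the same shape the agents write.
sample = {
    "intake_classification": {"category": "billing", "sentiment": "neutral"},
    "billing_findings": "Transaction TXN-1002 failed because of insufficient funds. " * 3,
}

for key, value in sample.items():
    print(f"[{key}] {preview(value, limit=60)}")
```

The same helper could be used when formatting the scratchpad into agent prompts, where long findings otherwise grow the prompt on every iteration.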

## 13. Infinite Loop Prevention Demo

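What happens when the supervisor and quality check keep bouncing back and forth? Our iteration guard kicks in. Let's demonstrate with a deliberately ambiguous request and a lower limit. Because routing is LLM-driven, the run may still finish before the limit is reached; when it does not, the guard forces completion.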

In [None]:
# Create a version with a very low iteration limit for demo
LOW_LIMIT = 3

def supervisor_node_low_limit(state: SupervisorState) -> dict:
    """Supervisor with low iteration limit for demo."""
    iteration = state.get("iteration_count", 0)
    
    if iteration >= LOW_LIMIT:
        return {
            "next": "FINISH",
            "messages": [AIMessage(content=f"[Supervisor] Hit iteration limit ({LOW_LIMIT}). Forcing completion to prevent infinite loop.")],
            "resolution_status": "forced_completion",
        }
    
    prompt = SUPERVISOR_PROMPT.format(
        scratchpad=state.get("scratchpad", {}),
        iteration=iteration,
        max_iterations=LOW_LIMIT,
        sentiment=state.get("customer_sentiment", "unknown"),
        resolution_status=state.get("resolution_status", "unknown"),
    )
    
    decision = structured_llm.invoke([
        SystemMessage(content=prompt),
        *state["messages"],
    ])
    
    return {
        "next": decision.next,
        "iteration_count": iteration + 1,
        "messages": [AIMessage(content=f"[Supervisor] Iteration {iteration + 1}/{LOW_LIMIT} → {decision.next}")],
    }

# Build a graph with low limit
demo_graph = StateGraph(SupervisorState)
demo_graph.add_node("intake_classifier", intake_classifier)
demo_graph.add_node("supervisor", supervisor_node_low_limit)
demo_graph.add_node("billing_agent", billing_agent)
demo_graph.add_node("tech_support_agent", tech_support_agent)
demo_graph.add_node("escalation_agent", escalation_agent)
demo_graph.add_node("quality_check", quality_check)

demo_graph.add_edge(START, "intake_classifier")
demo_graph.add_edge("intake_classifier", "supervisor")
demo_graph.add_conditional_edges("supervisor", route_from_supervisor, {
    "billing_agent": "billing_agent",
    "tech_support_agent": "tech_support_agent",
    "escalation_agent": "escalation_agent",
    "quality_check": "quality_check",
    END: END,
})
demo_graph.add_edge("billing_agent", "quality_check")
demo_graph.add_edge("tech_support_agent", "quality_check")
demo_graph.add_edge("escalation_agent", "quality_check")
demo_graph.add_conditional_edges("quality_check", route_from_quality, {
    "supervisor": "supervisor",
    END: END,
})
demo_app = demo_graph.compile()

result_loop = demo_app.invoke({
    "messages": [HumanMessage(content="I have a vague concern about my account. Something feels off but I can't quite pinpoint it. Maybe billing? Maybe a bug? I'm not really sure what's wrong.")],
    "scratchpad": {},
    "iteration_count": 0,
    "customer_sentiment": "",
    "resolution_status": "",
    "next": "",
})

print("=" * 80)
print(f"LOOP PREVENTION DEMO (limit: {LOW_LIMIT} iterations)")
print("=" * 80)
for msg in result_loop["messages"]:
    if isinstance(msg, AIMessage):
        print(f"  {msg.content[:200]}")
print(f"\nFinal status: {result_loop.get('resolution_status', 'unknown')}")
print(f"Iterations used: {result_loop.get('iteration_count', 0)}")
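
The demo above copies the whole supervisor just to change one number. A closure can parameterize the limit instead. The sketch below shows the idea under simplifying assumptions: `make_supervisor` is a hypothetical factory, `decide` stands in for the structured LLM call, and the returned updates omit the messages and scratchpad entries that the real node writes.

```python
from typing import Callable


def make_supervisor(limit: int, decide: Callable[[dict], str]) -> Callable[[dict], dict]:
    """Build a supervisor node with its own iteration limit.

    `decide` is a hypothetical stand-in for the structured LLM call.
    """
    def supervisor(state: dict) -> dict:
        iteration = state.get("iteration_count", 0)
        if iteration >= limit:
            return {"next": "FINISH", "resolution_status": "forced_completion"}
        return {"next": decide(state), "iteration_count": iteration + 1}

    return supervisor


def always_billing(state: dict) -> str:
    """Stub router that never finishes on its own."""
    return "billing_agent"


demo_supervisor = make_supervisor(3, always_billing)

state = {"iteration_count": 0}
trace = []
while state.get("next") != "FINISH":
    state.update(demo_supervisor(state))  # apply the node's partial update
    trace.append(state["next"])
print(trace)  # ['billing_agent', 'billing_agent', 'billing_agent', 'FINISH']
```

With a factory like this, the main graph and the demo graph could share one implementation and differ only in the `limit` argument.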

## 14. Key Takeaways

### What We Built
A customer support system built on production-oriented patterns (the tools are still mocks):
- **Structured routing** via Pydantic models (no more string parsing)
- **Shared scratchpad** for inter-agent communication  
- **Quality feedback loops** that improve response quality
- **Iteration guards** that prevent runaway execution
- **Dynamic re-routing** when initial classification is wrong

### Key LangGraph Patterns

1. **`with_structured_output()`** — Force LLMs to return typed, validated data structures
2. **Custom reducers** — The `merge_scratchpad` function shows how to control state merging behavior
3. **Conditional edges with 3+ targets** — Real systems need more than binary routing
4. **Loop guards** — Always cap iterations in cyclic graphs
5. **Quality check pattern**: a reviewer node that can send work back creates a review-and-retry loop

### When to Use This Pattern

| Scenario | Use Advanced Supervisor? |
|----------|------------------------|
| Simple Q&A with 2 agents | No — basic supervisor is fine |
| Customer support with 3+ specialists | **Yes** |
| Any system needing quality assurance | **Yes** |
| Workflows with potential infinite loops | **Yes** — iteration guards are essential |
| Systems where agents need shared context | **Yes** — scratchpad pattern |

### Next Steps
- **Hierarchical Teams** → When your supervisor manages too many agents, split into sub-teams
- **Collaboration Patterns** → When agents need to debate, vote, or work in parallel
- **Custom State Machines** → When you need explicit lifecycle stages with retry logic