[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain-academy/blob/main/module-4/sub-graph-exercise.ipynb) [![Open in LangChain Academy](https://cdn.prod.website-files.com/65b8cd72835ceeacd4449a53/66e9eba12c7b7688aa3dbb5e_LCA-badge-green.svg)](https://academy.langchain.com/courses/take/intro-to-langgraph/lessons/58239937-lesson-2-sub-graphs)

# Sub-graphs Exercise: Building Multi-Agent Systems

## Learning Objectives

By the end of this exercise, you will be able to:

1. **Design and implement sub-graphs** with different state schemas for specialized tasks
2. **Manage state communication** between parent graphs and sub-graphs using overlapping keys
3. **Create output schemas** to control what data flows between graph components
4. **Build multi-agent systems** where different agents work on specialized sub-tasks
5. **Handle parallel execution** of sub-graphs with proper state management

## Scenario: Customer Support Analytics System

You'll build a customer support analytics system that processes customer tickets through specialized sub-graphs:
- **Sentiment Analysis Agent**: Analyzes customer sentiment and urgency
- **Resolution Tracker Agent**: Tracks resolution patterns and success metrics
- **Knowledge Base Agent**: Identifies knowledge gaps and suggests improvements

This mirrors real-world multi-agent systems where different specialized agents handle specific aspects of a complex workflow.

In [None]:
%%capture --no-stderr
%pip install -U langgraph

## Setup and Environment

We'll use [LangSmith](https://docs.smith.langchain.com/) for [tracing](https://docs.smith.langchain.com/concepts/tracing) to visualize our sub-graph execution.

In [None]:
import os, getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("LANGSMITH_API_KEY")
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langchain-academy"

In [None]:
from operator import add
from typing_extensions import TypedDict
from typing import List, Optional, Annotated
from IPython.display import Image, display
from langgraph.graph import StateGraph, START, END

## Exercise 1: Understanding Customer Support Data Structure

First, let's define the data structure for our customer support tickets. This will be the foundation for our sub-graph communication.

### Task 1.1: Complete the CustomerTicket Schema

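Complete the `CustomerTicket` TypedDict with the missing fields. Fields that the sub-graphs populate later are typed `Optional` in the solution, so a raw ticket can start with them unset. The ticket should contain: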
- `ticket_id`: Unique identifier
- `customer_message`: The customer's original message
- `category`: Support category (e.g., "technical", "billing", "general")
- `sentiment_score`: Float between -1.0 (negative) and 1.0 (positive)
- `urgency_level`: Integer from 1 (low) to 5 (critical)
- `resolution_time`: Time to resolve in hours (optional)
- `resolved`: Boolean indicating if ticket is resolved
- `knowledge_gap`: String describing any identified knowledge gaps (optional)

In [None]:
# TODO: Complete the CustomerTicket schema
class CustomerTicket(TypedDict):
    ticket_id: str
    customer_message: str
    category: str
    # TODO: Add the missing fields here
    # sentiment_score: ?
    # urgency_level: ?
    # resolution_time: ?
    # resolved: ?
    # knowledge_gap: ?

<details>
<summary>💡 Solution for Task 1.1</summary>

```python
class CustomerTicket(TypedDict):
    ticket_id: str
    customer_message: str
    category: str
    sentiment_score: Optional[float]
    urgency_level: Optional[int]
    resolution_time: Optional[float]
    resolved: Optional[bool]
    knowledge_gap: Optional[str]
```
</details>
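
A `TypedDict` only describes the expected shape for type checkers; at runtime a ticket is an ordinary `dict`, and nothing verifies its keys or value types. The sketch below illustrates this with the solution schema. The `make_ticket` helper is hypothetical and not part of the exercise.

```python
from typing import Optional, TypedDict, get_type_hints


class CustomerTicket(TypedDict):
    ticket_id: str
    customer_message: str
    category: str
    sentiment_score: Optional[float]
    urgency_level: Optional[int]
    resolution_time: Optional[float]
    resolved: Optional[bool]
    knowledge_gap: Optional[str]


def make_ticket(ticket_id: str, message: str, category: str) -> CustomerTicket:
    """Hypothetical helper: build a raw ticket with the analysis fields unset."""
    return CustomerTicket(
        ticket_id=ticket_id,
        customer_message=message,
        category=category,
        sentiment_score=None,
        urgency_level=None,
        resolution_time=None,
        resolved=None,
        knowledge_gap=None,
    )


ticket = make_ticket("CS-001", "The app is broken", "technical")
print(type(ticket).__name__)                   # a TypedDict instance is a plain dict
print(sorted(get_type_hints(CustomerTicket)))  # the declared field names
```

Because nothing is enforced at runtime, any validation you want (for example, rejecting tickets without a `ticket_id`) has to be written explicitly.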

## Exercise 2: Building Your First Sub-graph - Sentiment Analysis Agent

Now let's create our first sub-graph that specializes in sentiment analysis. This agent will analyze customer messages and determine both sentiment and urgency.

### Key Concepts:
- **Specialized State**: Sub-graphs can have their own state schema
- **Input Data**: Sub-graphs receive data through overlapping keys with the parent
- **Output Schema**: Controls what data the sub-graph returns to the parent

### Task 2.1: Define the Sentiment Analysis State Schema

Create the state schema for the sentiment analysis sub-graph. It should include:
- `tickets`: List of tickets to analyze (input from parent)
- `high_priority_tickets`: List of tickets identified as high priority
- `sentiment_summary`: String summary of overall sentiment patterns
- `analysis_logs`: List of analysis steps performed

In [None]:
# TODO: Complete the SentimentAnalysisState schema
class SentimentAnalysisState(TypedDict):
    # TODO: Add the required fields
    pass

# TODO: Define the output schema - what should this sub-graph return?
class SentimentAnalysisOutputState(TypedDict):
    # TODO: Add the output fields
    pass

<details>
<summary>💡 Solution for Task 2.1</summary>

```python
class SentimentAnalysisState(TypedDict):
    tickets: List[CustomerTicket]
    high_priority_tickets: List[CustomerTicket]
    sentiment_summary: str
    analysis_logs: List[str]

class SentimentAnalysisOutputState(TypedDict):
    sentiment_summary: str
    analysis_logs: List[str]
```
</details>
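
To build intuition for what an output schema does, the following plain-Python sketch filters a sub-graph's final state down to the keys its output schema declares. It illustrates the idea only and is not LangGraph's implementation; `project_to_schema` is a hypothetical helper.

```python
from typing import Any, Dict, List, TypedDict


class SentimentAnalysisOutputState(TypedDict):
    sentiment_summary: str
    analysis_logs: List[str]


def project_to_schema(state: Dict[str, Any], schema: type) -> Dict[str, Any]:
    """Keep only the keys that the schema declares (illustration only)."""
    allowed = schema.__annotations__.keys()
    return {key: value for key, value in state.items() if key in allowed}


# Example final state of the sentiment sub-graph (made-up values)
final_state = {
    "tickets": [{"ticket_id": "CS-001"}],
    "high_priority_tickets": [{"ticket_id": "CS-001"}],
    "sentiment_summary": "1 negative ticket",
    "analysis_logs": ["Analyzed 1 tickets"],
}

returned = project_to_schema(final_state, SentimentAnalysisOutputState)
print(returned)
```

Here `tickets` and `high_priority_tickets` stay internal to the sub-graph, which is what lets two sub-graphs share the `tickets` input without both writing it back.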

### Task 2.2: Implement Sentiment Analysis Functions

Implement the core functions for sentiment analysis. For this exercise, we'll use simplified logic (in real systems, you'd use ML models).

In [None]:
def analyze_sentiment(state):
    """Analyze sentiment and urgency of customer tickets"""
    tickets = state["tickets"]
    
    # TODO: Implement sentiment analysis logic
    # For each ticket:
    # 1. Analyze sentiment based on keywords in customer_message
    # 2. Set sentiment_score (-1.0 to 1.0)
    # 3. Determine urgency_level (1-5) based on keywords like "urgent", "broken", "critical"
    
    # Hint: Look for negative words like "terrible", "broken", "frustrated"
    # Hint: Look for urgent words like "urgent", "immediately", "critical"
    
    analyzed_tickets = []
    for ticket in tickets:
        # TODO: Implement your sentiment analysis here
        updated_ticket = ticket.copy()
        
        # Placeholder implementation - replace with your logic
        updated_ticket["sentiment_score"] = 0.0
        updated_ticket["urgency_level"] = 1
        
        analyzed_tickets.append(updated_ticket)
    
    return {"tickets": analyzed_tickets}

def identify_high_priority(state):
    """Identify high priority tickets based on sentiment and urgency"""
    tickets = state["tickets"]
    
    # TODO: Implement logic to identify high priority tickets
    # Consider tickets with:
    # - Very negative sentiment (< -0.5)
    # - High urgency (>= 4)
    # - OR combination of moderate negative sentiment + moderate urgency
    
    high_priority = []
    
    # TODO: Your implementation here
    
    return {"high_priority_tickets": high_priority}

def generate_sentiment_summary(state):
    """Generate a summary of sentiment analysis results"""
    tickets = state["tickets"]
    high_priority = state["high_priority_tickets"]
    
    # TODO: Generate a meaningful summary
    # Include information about:
    # - Overall sentiment distribution
    # - Number of high priority tickets
    # - Common patterns in negative feedback
    
    summary = "Sentiment analysis complete."  # TODO: Replace with your implementation
    
    analysis_logs = [
        f"Analyzed {len(tickets)} tickets",
        f"Identified {len(high_priority)} high priority tickets"
    ]
    
    return {
        "sentiment_summary": summary,
        "analysis_logs": analysis_logs
    }

<details>
<summary>💡 Solution for Task 2.2</summary>

```python
def analyze_sentiment(state):
    """Analyze sentiment and urgency of customer tickets"""
    tickets = state["tickets"]
    
    analyzed_tickets = []
    for ticket in tickets:
        updated_ticket = ticket.copy()
        message = ticket["customer_message"].lower()
        
        # Simple sentiment analysis based on keywords
        negative_words = ["terrible", "awful", "broken", "frustrated", "angry", "hate", "worst"]
        positive_words = ["great", "excellent", "love", "amazing", "perfect", "wonderful"]
        
        negative_count = sum(1 for word in negative_words if word in message)
        positive_count = sum(1 for word in positive_words if word in message)
        
        # Calculate sentiment score (-1.0 to 1.0)
        if negative_count > positive_count:
            updated_ticket["sentiment_score"] = -0.8 if negative_count >= 2 else -0.4
        elif positive_count > negative_count:
            updated_ticket["sentiment_score"] = 0.8 if positive_count >= 2 else 0.4
        else:
            updated_ticket["sentiment_score"] = 0.0
        
        # Determine urgency based on keywords
        urgent_words = ["urgent", "immediately", "critical", "emergency", "asap", "broken"]
        urgency_count = sum(1 for word in urgent_words if word in message)
        
        if urgency_count >= 2:
            updated_ticket["urgency_level"] = 5
        elif urgency_count == 1:
            updated_ticket["urgency_level"] = 4
        elif "help" in message or "issue" in message:
            updated_ticket["urgency_level"] = 3
        else:
            updated_ticket["urgency_level"] = 2
        
        analyzed_tickets.append(updated_ticket)
    
    return {"tickets": analyzed_tickets}

def identify_high_priority(state):
    """Identify high priority tickets based on sentiment and urgency"""
    tickets = state["tickets"]
    
    high_priority = []
    for ticket in tickets:
        sentiment = ticket.get("sentiment_score", 0)
        urgency = ticket.get("urgency_level", 1)
        
        # High priority conditions
        if (sentiment < -0.5 or urgency >= 4 or 
            (sentiment < -0.2 and urgency >= 3)):
            high_priority.append(ticket)
    
    return {"high_priority_tickets": high_priority}

def generate_sentiment_summary(state):
    """Generate a summary of sentiment analysis results"""
    tickets = state["tickets"]
    high_priority = state["high_priority_tickets"]
    
    # Calculate sentiment distribution
    total_tickets = len(tickets)
    negative_tickets = sum(1 for t in tickets if t.get("sentiment_score", 0) < -0.2)
    positive_tickets = sum(1 for t in tickets if t.get("sentiment_score", 0) > 0.2)
    neutral_tickets = total_tickets - negative_tickets - positive_tickets
    
    summary = f"Sentiment Analysis: {negative_tickets} negative, {neutral_tickets} neutral, {positive_tickets} positive tickets. {len(high_priority)} high priority tickets identified."
    
    analysis_logs = [
        f"Analyzed {len(tickets)} tickets",
        f"Identified {len(high_priority)} high priority tickets",
        f"Sentiment distribution: {negative_tickets}N, {neutral_tickets}Neu, {positive_tickets}P"
    ]
    
    return {
        "sentiment_summary": summary,
        "analysis_logs": analysis_logs
    }
```
</details>
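
One caveat of the keyword approach in the solution: `word in message` is a substring test, so a keyword can match inside an unrelated word. The sketch below compares it with a whole-word variant; `count_word_hits` is a hypothetical alternative, not part of the exercise.

```python
import re
from typing import Iterable


def count_substring_hits(message: str, keywords: Iterable[str]) -> int:
    """Substring matching, as used in the solution above."""
    text = message.lower()
    return sum(1 for word in keywords if word in text)


def count_word_hits(message: str, keywords: Iterable[str]) -> int:
    """Hypothetical variant: count keywords that appear as whole words."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    return sum(1 for word in keywords if word in words)


negative_words = ["terrible", "awful", "broken", "frustrated", "angry", "hate", "worst"]
message = "Whatever plan works for you is fine with me."

print(count_substring_hits(message, negative_words))  # "hate" hides inside "whatever"
print(count_word_hits(message, negative_words))
```

The trade-off is that whole-word matching no longer catches inflected forms such as "hated", so a real system would likely need stemming or a proper sentiment model.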

### Task 2.3: Build the Sentiment Analysis Sub-graph

Now create the complete sub-graph by connecting the nodes in the right sequence.

In [None]:
# TODO: Build the sentiment analysis sub-graph
# The flow should be: analyze_sentiment -> identify_high_priority -> generate_sentiment_summary

sentiment_builder = StateGraph(state_schema=SentimentAnalysisState, output_schema=SentimentAnalysisOutputState)

# TODO: Add nodes
# TODO: Add edges

sentiment_graph = sentiment_builder.compile()

# Visualize the sub-graph
display(Image(sentiment_graph.get_graph().draw_mermaid_png()))

<details>
<summary>💡 Solution for Task 2.3</summary>

```python
sentiment_builder = StateGraph(state_schema=SentimentAnalysisState, output_schema=SentimentAnalysisOutputState)

sentiment_builder.add_node("analyze_sentiment", analyze_sentiment)
sentiment_builder.add_node("identify_high_priority", identify_high_priority)
sentiment_builder.add_node("generate_sentiment_summary", generate_sentiment_summary)

sentiment_builder.add_edge(START, "analyze_sentiment")
sentiment_builder.add_edge("analyze_sentiment", "identify_high_priority")
sentiment_builder.add_edge("identify_high_priority", "generate_sentiment_summary")
sentiment_builder.add_edge("generate_sentiment_summary", END)

sentiment_graph = sentiment_builder.compile()
```
</details>

## Exercise 3: Building the Resolution Tracker Sub-graph

Create a second sub-graph that tracks resolution patterns and metrics. This demonstrates how different agents can specialize in different aspects of the same data.

### Task 3.1: Design the Resolution Tracker State and Functions

In [None]:
# TODO: Define the Resolution Tracker state schemas
class ResolutionTrackerState(TypedDict):
    # TODO: Define fields for resolution tracking
    # Should include: tickets, resolved_tickets, avg_resolution_time, resolution_report, tracking_logs
    pass

class ResolutionTrackerOutputState(TypedDict):
    # TODO: Define what this sub-graph should output
    pass

def analyze_resolution_patterns(state):
    """Analyze which tickets are resolved and resolution times"""
    tickets = state["tickets"]
    
    # TODO: Implement logic to:
    # 1. Identify resolved tickets (simulate: tickets with "thank" or "solved" in messages are resolved)
    # 2. Assign random resolution times between 1-48 hours for resolved tickets (requires `import random`)
    # 3. Mark tickets as resolved=True or False
    
    resolved_tickets = []
    updated_tickets = []
    
    # TODO: Your implementation here
    
    return {"tickets": updated_tickets, "resolved_tickets": resolved_tickets}

def calculate_metrics(state):
    """Calculate resolution metrics"""
    resolved_tickets = state["resolved_tickets"]
    
    # TODO: Calculate average resolution time
    # Handle case where no tickets are resolved
    
    avg_resolution_time = 0.0  # TODO: Calculate this
    
    return {"avg_resolution_time": avg_resolution_time}

def generate_resolution_report(state):
    """Generate a resolution tracking report"""
    tickets = state["tickets"]
    resolved_tickets = state["resolved_tickets"]
    avg_time = state["avg_resolution_time"]
    
    # TODO: Create a comprehensive report
    report = "Resolution tracking complete."  # TODO: Make this meaningful
    
    tracking_logs = [
        f"Processed {len(tickets)} tickets",
        f"Found {len(resolved_tickets)} resolved tickets"
    ]
    
    return {
        "resolution_report": report,
        "tracking_logs": tracking_logs
    }
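
If you get stuck, here is one possible way to fill in the first two functions, shown as a plain-Python sketch that runs without LangGraph. The module-level `RNG` with a fixed seed is an assumption added for reproducibility; the keyword rule and the 1-48 hour range come from the TODO comments above.

```python
import random
from typing import Any, Dict, List

Ticket = Dict[str, Any]
RNG = random.Random(42)  # assumption: seeded so repeated runs give the same times


def analyze_resolution_patterns(state: Dict[str, Any]) -> Dict[str, List[Ticket]]:
    """Mark a ticket resolved when its message contains "thank" or "solved"."""
    resolved_tickets: List[Ticket] = []
    updated_tickets: List[Ticket] = []
    for ticket in state["tickets"]:
        updated = dict(ticket)
        message = ticket["customer_message"].lower()
        updated["resolved"] = "thank" in message or "solved" in message
        if updated["resolved"]:
            # Simulated value: a random resolution time between 1 and 48 hours
            updated["resolution_time"] = round(RNG.uniform(1, 48), 1)
            resolved_tickets.append(updated)
        updated_tickets.append(updated)
    return {"tickets": updated_tickets, "resolved_tickets": resolved_tickets}


def calculate_metrics(state: Dict[str, Any]) -> Dict[str, float]:
    """Average the resolution time, returning 0.0 when nothing is resolved."""
    resolved = state["resolved_tickets"]
    if not resolved:
        return {"avg_resolution_time": 0.0}
    total = sum(t["resolution_time"] for t in resolved)
    return {"avg_resolution_time": total / len(resolved)}


state: Dict[str, Any] = {"tickets": [
    {"ticket_id": "CS-003", "customer_message": "Thank you, the issue is resolved."},
    {"ticket_id": "CS-004", "customer_message": "Still waiting for a reply."},
]}
state.update(analyze_resolution_patterns(state))
state.update(calculate_metrics(state))
print(len(state["resolved_tickets"]), state["avg_resolution_time"])
```

The empty-list guard in `calculate_metrics` avoids a division by zero when no ticket is resolved.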

### Task 3.2: Build the Resolution Tracker Sub-graph

In [None]:
# TODO: Build the resolution tracker sub-graph
resolution_builder = StateGraph(state_schema=ResolutionTrackerState, output_schema=ResolutionTrackerOutputState)

# TODO: Add nodes and edges

resolution_graph = resolution_builder.compile()

# Visualize the sub-graph
display(Image(resolution_graph.get_graph().draw_mermaid_png()))

## Exercise 4: Creating the Parent Graph with Sub-graph Integration

Now comes the most important part: integrating your sub-graphs into a parent graph that orchestrates the entire workflow.

### Key Concepts for Sub-graph Integration:
1. **Overlapping Keys**: Sub-graphs access parent data through keys with the same name
2. **Output Schemas**: Control which keys each sub-graph returns, so that parallel sub-graphs do not both write a shared key such as `tickets`
3. **Parallel Execution**: Sub-graphs can run in parallel for efficiency
4. **State Reducers**: Handle merging data when multiple sub-graphs output to the same key

### Task 4.1: Define the Parent Graph State

In [None]:
# TODO: Define the main customer support analytics state
class CustomerSupportAnalyticsState(TypedDict):
    # Input data
    raw_tickets: List[CustomerTicket]
    tickets: List[CustomerTicket]  # This will be shared with sub-graphs
    
    # Outputs from sentiment analysis sub-graph
    sentiment_summary: str
    
    # Outputs from resolution tracker sub-graph  
    resolution_report: str
    
    # Combined logs from both sub-graphs (this needs a reducer!)
    # TODO: What annotation should this have since both sub-graphs output logs?
    combined_logs: List[str]

<details>
<summary>💡 Hint for combined_logs</summary>

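Since both sub-graphs output logs and run in parallel, you need a reducer to combine the lists. Remember the `add` operator from the imports? Keep in mind that a sub-graph's output reaches the parent only through keys the two schemas share: as written, the sub-graphs return `analysis_logs` and `tracking_logs`, so their output schemas need to expose the logs under `combined_logs` for them to land here.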

```python
combined_logs: Annotated[List[str], add]
```
</details>
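
The effect of the `add` reducer can be sketched in plain Python. This is a simplified model of how updates are merged, not LangGraph's actual code; `reducers` and `apply_update` are hypothetical names.

```python
from operator import add
from typing import Any, Callable, Dict

# Hypothetical stand-in for the reducers declared via Annotated[...] in the state
reducers: Dict[str, Callable[[Any, Any], Any]] = {"combined_logs": add}


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge an update into state, using a reducer where one is registered."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value  # no reducer: the new value replaces the old one
    return merged


state: Dict[str, Any] = {"combined_logs": []}
state = apply_update(state, {"combined_logs": ["Analyzed 5 tickets"]})
state = apply_update(state, {"combined_logs": ["Processed 5 tickets"]})
print(state["combined_logs"])  # ['Analyzed 5 tickets', 'Processed 5 tickets']
```

One difference from this sketch: when two nodes write the same key in the same step and no reducer is declared, LangGraph raises an error rather than letting one write silently replace the other.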

### Task 4.2: Implement Data Preprocessing Function

In [None]:
def preprocess_tickets(state):
    """Preprocess raw tickets before sending to sub-graphs"""
    raw_tickets = state["raw_tickets"]
    
    # TODO: Implement preprocessing logic
    # This could include:
    # - Data validation
    # - Cleaning customer messages
    # - Adding default values for optional fields
    # - Filtering out invalid tickets
    
    processed_tickets = []
    
    for ticket in raw_tickets:
        # TODO: Add your preprocessing logic here
        # For now, just copy the ticket and give optional fields default values
        processed_ticket = ticket.copy()
        
        # Ensure optional fields have default values
        for field in ("sentiment_score", "urgency_level", "resolution_time",
                      "resolved", "knowledge_gap"):
            processed_ticket.setdefault(field, None)
            
        processed_tickets.append(processed_ticket)
    
    return {"tickets": processed_tickets}
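
The validation and cleaning steps listed in the TODO could look something like the sketch below. The specific rules (which fields count as required, lower-casing the category) are assumptions chosen for illustration, and `clean_tickets` is a hypothetical helper you could call from `preprocess_tickets`.

```python
from typing import Any, Dict, List

# Assumption: these three fields must be present and non-empty
REQUIRED_FIELDS = ("ticket_id", "customer_message", "category")


def clean_tickets(raw_tickets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop tickets missing required fields and normalize text fields."""
    cleaned = []
    for ticket in raw_tickets:
        if any(not ticket.get(field) for field in REQUIRED_FIELDS):
            continue  # skip invalid tickets
        processed = dict(ticket)
        # Collapse runs of whitespace (including newlines) into single spaces
        processed["customer_message"] = " ".join(ticket["customer_message"].split())
        processed["category"] = ticket["category"].strip().lower()
        cleaned.append(processed)
    return cleaned


raw = [
    {"ticket_id": "CS-001", "customer_message": "  Login   fails\n", "category": "Technical"},
    {"ticket_id": "CS-002", "customer_message": "", "category": "billing"},
]
print(clean_tickets(raw))
```

The second ticket is dropped because its message is empty; whether to drop or flag such tickets is a design choice for your own pipeline.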

### Task 4.3: Build the Complete Multi-Agent System

Now integrate everything into a parent graph that orchestrates the sub-graphs.

In [None]:
# TODO: Build the parent graph that integrates the sub-graphs
main_builder = StateGraph(CustomerSupportAnalyticsState)

# TODO: Add the preprocessing node

# TODO: Add the sub-graphs as nodes
# Hint: Use sentiment_graph and resolution_graph as nodes

# TODO: Add edges to create the workflow
# The flow should be:
# 1. START -> preprocess_tickets
# 2. preprocess_tickets -> both sub-graphs (parallel execution)
# 3. both sub-graphs -> END

# Build the complete system
main_graph = main_builder.compile()

# Visualize with xray=1 to see sub-graph internals
display(Image(main_graph.get_graph(xray=1).draw_mermaid_png()))

<details>
<summary>💡 Solution for Task 4.3</summary>

```python
main_builder = StateGraph(CustomerSupportAnalyticsState)

# Add preprocessing node
main_builder.add_node("preprocess_tickets", preprocess_tickets)

# Add sub-graphs as nodes
main_builder.add_node("sentiment_analysis", sentiment_graph)
main_builder.add_node("resolution_tracking", resolution_graph)

# Add edges for the workflow
main_builder.add_edge(START, "preprocess_tickets")
main_builder.add_edge("preprocess_tickets", "sentiment_analysis")
main_builder.add_edge("preprocess_tickets", "resolution_tracking")
main_builder.add_edge("sentiment_analysis", END)
main_builder.add_edge("resolution_tracking", END)

main_graph = main_builder.compile()
```
</details>

## Exercise 5: Testing Your Multi-Agent System

Let's test your complete system with realistic customer support data.

### Task 5.1: Create Test Data

In [None]:
# Create realistic test tickets
test_tickets = [
    {
        "ticket_id": "CS-001",
        "customer_message": "Your software is completely broken! I've been trying to log in for hours and it keeps failing. This is urgent!",
        "category": "technical"
    },
    {
        "ticket_id": "CS-002", 
        "customer_message": "Hi, I have a question about my billing. Could you help me understand the charges?",
        "category": "billing"
    },
    {
        "ticket_id": "CS-003",
        "customer_message": "Thank you so much for the quick fix! The issue is completely resolved and everything works perfectly now.",
        "category": "technical"
    },
    {
        "ticket_id": "CS-004",
        "customer_message": "This is terrible service. I've been waiting for 3 days and no one has responded. I'm extremely frustrated!",
        "category": "general"
    },
    {
        "ticket_id": "CS-005",
        "customer_message": "The tutorial was very helpful. I managed to solve my problem. Thanks for the great documentation!",
        "category": "general"
    }
]

### Task 5.2: Run the Complete System

In [None]:
# Test your complete multi-agent system
result = main_graph.invoke({"raw_tickets": test_tickets})

# Display results
print("=== CUSTOMER SUPPORT ANALYTICS RESULTS ===")
print(f"\nProcessed {len(result['tickets'])} tickets")
print(f"\n📊 SENTIMENT ANALYSIS:\n{result['sentiment_summary']}")
print(f"\n⏱️ RESOLUTION TRACKING:\n{result['resolution_report']}")
print(f"\n📝 COMBINED LOGS:")
for log in result['combined_logs']:
    print(f"  - {log}")

### Task 5.3: Analyze the Results

Examine the output and answer these questions:

1. **State Communication**: How did the `tickets` data flow from the parent to both sub-graphs?
2. **Parallel Execution**: Can you see evidence that both sub-graphs ran in parallel?
3. **Output Schemas**: How did the output schemas prevent conflicts between sub-graphs?
4. **Data Merging**: How were the logs from both sub-graphs combined?

**Write your analysis here:**

TODO: Write your analysis of the results here. Consider:
- How the sentiment analysis performed
- Whether high priority tickets were correctly identified  
- How the resolution tracking worked
- The effectiveness of the overall system architecture

## Exercise 6: Advanced Challenge - Adding a Third Sub-graph

For an advanced challenge, add a third sub-graph that identifies knowledge gaps and suggests documentation improvements.

### Task 6.1: Design the Knowledge Base Agent

In [None]:
# TODO: Design and implement a Knowledge Base sub-graph
# This sub-graph should:
# 1. Analyze unresolved tickets to identify knowledge gaps
# 2. Categorize gaps by topic (technical, billing, general)
# 3. Suggest documentation improvements
# 4. Generate a knowledge gap report

class KnowledgeBaseState(TypedDict):
    # TODO: Define the state schema
    pass

class KnowledgeBaseOutputState(TypedDict):
    # TODO: Define the output schema
    pass

# TODO: Implement the functions for this sub-graph

# TODO: Build the sub-graph

# TODO: Integrate it into the main graph
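
As a starting point for the categorization step, the sketch below groups unresolved tickets by category in plain Python. `group_gaps_by_category` is a hypothetical helper, and treating a missing or `None` `resolved` flag as unresolved is an assumption. Note also that the `resolved` flags are set inside the resolution tracker, so a knowledge base sub-graph running in parallel with it would not see them unless you run it afterwards and pass the updated tickets along, or recompute the flags.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_gaps_by_category(tickets: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Collect the IDs of unresolved tickets under each support category."""
    gaps: Dict[str, List[str]] = defaultdict(list)
    for ticket in tickets:
        # Assumption: a missing or None flag counts as unresolved
        if not ticket.get("resolved"):
            gaps[ticket["category"]].append(ticket["ticket_id"])
    return dict(gaps)


tickets = [
    {"ticket_id": "CS-001", "category": "technical", "resolved": False},
    {"ticket_id": "CS-003", "category": "technical", "resolved": True},
    {"ticket_id": "CS-004", "category": "general", "resolved": None},
]
print(group_gaps_by_category(tickets))
# {'technical': ['CS-001'], 'general': ['CS-004']}
```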

## Exercise 7: Reflection and Best Practices

### Task 7.1: System Architecture Review

Based on your implementation, document the following:

1. **Sub-graph Design Principles**: What makes a good sub-graph?
2. **State Management**: How should data flow between parent and sub-graphs?
3. **Output Schema Strategy**: When and why should you use output schemas?
4. **Parallel vs Sequential**: When should sub-graphs run in parallel vs sequence?

**Write your insights here:**

TODO: Document your insights about sub-graph architecture and best practices

### Task 7.2: Real-World Applications

Think of three real-world scenarios where sub-graphs would be beneficial:

1. **Scenario 1**: [Describe a business problem that could benefit from specialized sub-graphs]
2. **Scenario 2**: [Describe another use case]
3. **Scenario 3**: [Describe a third use case]

For each scenario, explain:
- What would each sub-graph specialize in?
- How would data flow between components?
- What are the benefits of this architecture?

## Summary and Key Takeaways

Congratulations! You've built a complete multi-agent system using LangGraph sub-graphs. Here's what you've learned:

### Core Concepts Mastered:

1. **Sub-graph Architecture**: Creating specialized graphs for specific tasks
2. **State Communication**: Using overlapping keys to share data between parent and sub-graphs
3. **Output Schemas**: Controlling what data flows between components
4. **Parallel Execution**: Running multiple specialized agents simultaneously
5. **State Reducers**: Merging data from multiple sources

### Best Practices You've Applied:

- **Separation of Concerns**: Each sub-graph handles a specific domain
- **Clean Interfaces**: Output schemas provide clear contracts between components
- **Scalable Design**: New sub-graphs can be added without breaking existing functionality
- **Efficient Execution**: Independent sub-graphs run in parallel instead of waiting on each other

### Next Steps:

- Experiment with more complex state schemas
- Try conditional routing between sub-graphs
- Implement error handling and recovery mechanisms
- Scale to larger multi-agent systems

You're now ready to build sophisticated multi-agent systems that can handle complex, real-world workflows!