


### **Complete LangGraph Workflow Structure:**
- **9 nodes** showing the full research pipeline
- **Linear edges** connecting each step
- **State management** flowing data through the workflow
- **Placeholder functions** ready for real implementation

### **Key LangGraph Components You Can See:**

1. **`ResearchState` TypedDict** - Defines what data flows through the workflow
2. **Node Functions** - Each processing step with clear placeholders
3. **StateGraph** - The main orchestrator that connects everything
4. **Edges** - Linear flow from one node to the next
5. **Workflow Compilation** - Converting the graph into an executable agent

### **What You Can Learn From This Scaffold:**

- **State Flow** - See how data moves from `goal` → `search_strategy` → `search_results` → etc.
- **Node Structure** - Each function takes state, processes it, returns updated state
- **Workflow Patterns** - Linear processing with clear stages
- **Error Handling** - Basic structure for tracking errors and processing time
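
To make the state flow concrete, here is a minimal sketch in plain Python (no LangGraph import) that mimics a linear run. Each node receives the current state and returns only the keys it changed, and a small loop merges those updates. `MiniState`, `interpret`, `plan_queries`, and `run_linear` are hypothetical names used for illustration; they are stand-ins for what the compiled graph does, not LangGraph APIs.

```python
from typing import Callable, Dict, List, TypedDict


class MiniState(TypedDict, total=False):
    goal: Dict[str, str]
    stage: str
    search_strategy: List[str]


def interpret(state: MiniState) -> MiniState:
    # Return only the keys this node changes
    return {"stage": "goal_interpreted"}


def plan_queries(state: MiniState) -> MiniState:
    topic = state["goal"]["objective"]
    return {"search_strategy": [f"{topic} trends"], "stage": "strategy_created"}


def run_linear(state: MiniState, nodes: List[Callable[[MiniState], MiniState]]) -> MiniState:
    """Hypothetical stand-in for a compiled linear graph: merge each node's update."""
    current: MiniState = dict(state)  # copy so the caller's dict is untouched
    for node in nodes:
        current.update(node(current))
    return current


final = run_linear({"goal": {"objective": "AI"}}, [interpret, plan_queries])
print(final["stage"], final["search_strategy"])
```

Tracing `final` after each node is a quick way to see which step owns which field.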

### **Next Steps You Can Take:**

1. **Study the structure** - Look at how each node transforms the state
2. **Modify placeholders** - Replace mock data with real functionality
3. **Add conditional logic** - What if search fails? What if no trends found?
4. **Experiment with edges** - Add parallel processing or conditional routing

**Want to explore any specific part of the scaffold?** You can:
- Modify the state schema
- Change the workflow structure
- Add new nodes
- Experiment with different data flows

The scaffold gives you a complete view of LangGraph's core concepts!

# Research Agent Scaffold

In [None]:
"""
Research Agent Scaffold - LangGraph Learning Project
This demonstrates a complete LangGraph workflow with placeholder functions.
Focus: Understanding LangGraph components and workflow structure.
"""

from langgraph.graph import StateGraph, END
from typing import Any, Dict, List, Literal, TypedDict
import time

# ============================================================================
# 1. STATE DEFINITION
# ============================================================================

class ResearchState(TypedDict):
    """State schema for our research agent workflow"""

    # Input
    goal: Dict[str, str]  # Research objective and parameters

    # Processing stages
    stage: str  # Current processing stage

    # Search phase
    search_strategy: List[str]  # Generated search queries
    search_results: List[Dict]  # Raw search results

    # Data phase
    gathered_sources: List[Dict]  # Processed source data
    compiled_info: Dict[str, Any]  # Organized information (typing.Any, not the builtin any())

    # Report phase
    draft_report: str
    review_suggestions: List[str]
    final_report: str

    # Metadata
    processing_time: float
    errors: List[str]

# ============================================================================
# 2. NODE FUNCTIONS (Placeholder implementations)
# ============================================================================

def interpret_goal(state: ResearchState) -> ResearchState:
    """Node 1: Parse and structure the research goal"""
    print(f"🎯 Interpreting goal: {state['goal']['objective']}")

    # PLACEHOLDER: Parse goal and extract key requirements
    time.sleep(0.1)  # Simulate processing

    # Update state
    state["stage"] = "goal_interpreted"
    state["processing_time"] = 0.1

    print(f"   ✅ Goal interpreted, stage: {state['stage']}")
    return state

def create_search_strategy(state: ResearchState) -> ResearchState:
    """Node 2: Generate search strategy based on goal"""
    print(f"🔍 Creating search strategy...")

    # PLACEHOLDER: Generate search queries based on goal
    time.sleep(0.2)

    # Mock search strategy
    state["search_strategy"] = [
        "AI trends 2024",
        "emerging AI technologies",
        "AI industry developments",
        "artificial intelligence news"
    ]
    state["stage"] = "strategy_created"
    state["processing_time"] += 0.2

    print(f"   ✅ Search strategy created: {len(state['search_strategy'])} queries")
    return state

def execute_parallel_search(state: ResearchState) -> ResearchState:
    """Node 3: Execute web searches in parallel"""
    print(f"🌐 Executing parallel web searches...")

    # PLACEHOLDER: Execute actual web searches
    time.sleep(0.3)

    # Mock search results
    state["search_results"] = [
        {"query": "AI trends 2024", "sources": 5, "status": "success"},
        {"query": "emerging AI technologies", "sources": 4, "status": "success"},
        {"query": "AI industry developments", "sources": 6, "status": "success"},
        {"query": "artificial intelligence news", "sources": 3, "status": "success"}
    ]
    state["stage"] = "search_completed"
    state["processing_time"] += 0.3

    print(f"   ✅ Search completed: {sum(r['sources'] for r in state['search_results'])} total sources")
    return state

def gather_source_data(state: ResearchState) -> ResearchState:
    """Node 4: Collect and organize source data"""
    print(f"📚 Gathering source data...")

    # PLACEHOLDER: Process search results and extract content
    time.sleep(0.4)

    # Mock gathered sources
    state["gathered_sources"] = [
        {"title": "AI Trend 1", "source": "TechNews", "content": "Sample content...", "relevance": "high"},
        {"title": "AI Trend 2", "source": "AIWeekly", "content": "Sample content...", "relevance": "high"},
        {"title": "AI Trend 3", "source": "IndustryReport", "content": "Sample content...", "relevance": "medium"},
        {"title": "AI Trend 4", "source": "TechBlog", "content": "Sample content...", "relevance": "high"},
        {"title": "AI Trend 5", "source": "NewsSite", "content": "Sample content...", "relevance": "medium"}
    ]
    state["stage"] = "data_gathered"
    state["processing_time"] += 0.4

    print(f"   ✅ Data gathered: {len(state['gathered_sources'])} sources processed")
    return state

def compile_information(state: ResearchState) -> ResearchState:
    """Node 5: Process and organize gathered information"""
    print(f"📊 Compiling information...")

    # PLACEHOLDER: Analyze and organize information
    time.sleep(0.3)

    # Mock compiled information
    state["compiled_info"] = {
        "trends": ["Trend 1", "Trend 2", "Trend 3", "Trend 4", "Trend 5"],
        "key_insights": ["Insight 1", "Insight 2", "Insight 3"],
        "source_summary": f"Analyzed {len(state['gathered_sources'])} sources",
        "confidence": "high"
    }
    state["stage"] = "info_compiled"
    state["processing_time"] += 0.3

    print(f"   ✅ Information compiled: {len(state['compiled_info']['trends'])} trends identified")
    return state

def write_draft_report(state: ResearchState) -> ResearchState:
    """Node 6: Create initial report draft"""
    print(f"📝 Writing draft report...")

    # PLACEHOLDER: Generate report using LLM
    time.sleep(0.5)

    # Mock draft report
    state["draft_report"] = f"""
    EXECUTIVE SUMMARY

    Based on analysis of {len(state['gathered_sources'])} sources, we have identified
    {len(state['compiled_info']['trends'])} major AI trends emerging in the industry:

    1. {state['compiled_info']['trends'][0]}
    2. {state['compiled_info']['trends'][1]}
    3. {state['compiled_info']['trends'][2]}
    4. {state['compiled_info']['trends'][3]}
    5. {state['compiled_info']['trends'][4]}

    These trends represent significant opportunities for investment and strategic planning.
    """
    state["stage"] = "draft_completed"
    state["processing_time"] += 0.5

    print(f"   ✅ Draft report completed ({len(state['draft_report'])} characters)")
    return state

def review_and_edit(state: ResearchState) -> ResearchState:
    """Node 7: Review draft and suggest improvements"""
    print(f"🔍 Reviewing draft report...")

    # PLACEHOLDER: LLM review of draft
    time.sleep(0.3)

    # Mock review suggestions
    state["review_suggestions"] = [
        "Add more specific examples for Trend 1",
        "Include market size estimates",
        "Strengthen conclusion section",
        "Add risk assessment"
    ]
    state["stage"] = "review_completed"
    state["processing_time"] += 0.3

    print(f"   ✅ Review completed: {len(state['review_suggestions'])} suggestions")
    return state

def write_final_report(state: ResearchState) -> ResearchState:
    """Node 8: Create polished final report"""
    print(f"✨ Writing final report...")

    # PLACEHOLDER: Incorporate suggestions and create final report
    time.sleep(0.4)

    # Mock final report
    state["final_report"] = f"""
    {state['draft_report']}

    ADDITIONAL ANALYSIS:
    - Market size estimates included
    - Risk assessment added
    - Specific examples provided
    - Conclusion strengthened

    This report incorporates {len(state['review_suggestions'])} review suggestions
    for enhanced quality and completeness.
    """
    state["stage"] = "final_completed"
    state["processing_time"] += 0.4

    print(f"   ✅ Final report completed ({len(state['final_report'])} characters)")
    return state

def validate_report(state: ResearchState) -> ResearchState:
    """Node 9: Final validation against original goal"""
    print(f"✅ Validating final report...")

    # PLACEHOLDER: Check if report meets original goal
    time.sleep(0.2)

    # Mock validation
    goal_met = len(state['compiled_info']['trends']) >= 5
    state["stage"] = "validated" if goal_met else "validation_failed"
    state["processing_time"] += 0.2

    if goal_met:
        print(f"   ✅ Report validation passed - goal achieved!")
    else:
        print(f"   ❌ Report validation failed - goal not met")

    return state

# ============================================================================
# 3. WORKFLOW CONSTRUCTION
# ============================================================================

def create_research_agent():
    """Create the research agent workflow"""
    print("🏗️  Building Research Agent Workflow...")

    # Create the workflow
    workflow = StateGraph(ResearchState)

    # Add nodes (processing units)
    workflow.add_node("interpret_goal", interpret_goal)
    workflow.add_node("create_strategy", create_search_strategy)
    workflow.add_node("execute_search", execute_parallel_search)
    workflow.add_node("gather_data", gather_source_data)
    workflow.add_node("compile_info", compile_information)
    workflow.add_node("write_draft", write_draft_report)
    workflow.add_node("review_edit", review_and_edit)
    workflow.add_node("write_final", write_final_report)
    workflow.add_node("validate", validate_report)

    # Add edges (linear flow)
    workflow.add_edge("interpret_goal", "create_strategy")
    workflow.add_edge("create_strategy", "execute_search")
    workflow.add_edge("execute_search", "gather_data")
    workflow.add_edge("gather_data", "compile_info")
    workflow.add_edge("compile_info", "write_draft")
    workflow.add_edge("write_draft", "review_edit")
    workflow.add_edge("review_edit", "write_final")
    workflow.add_edge("write_final", "validate")
    workflow.add_edge("validate", END)

    # Set entry point
    workflow.set_entry_point("interpret_goal")

    # Compile the workflow
    app = workflow.compile()

    print("✅ Research Agent workflow compiled successfully!")
    return app

# ============================================================================
# 4. TESTING AND DEMONSTRATION
# ============================================================================

def test_research_agent():
    """Test the research agent with sample goal"""
    print("\n" + "="*60)
    print("🧪 TESTING RESEARCH AGENT")
    print("="*60)

    # Create the agent
    agent = create_research_agent()

    # Sample goal
    sample_goal = {
        "objective": "Identify major AI trends emerging in the industry today",
        "scope": "Industry trends, not academic research",
        "output_format": "Industry standard report",
        "target_audience": "Business professionals",
        "depth": "Comprehensive overview with key insights",
        "sources": "Web-based industry sources, recent articles",
        "success_criteria": "Clear identification of top 5-7 trends with supporting evidence"
    }

    # Initial state
    initial_state = {
        "goal": sample_goal,
        "stage": "started",
        "search_strategy": [],
        "search_results": [],
        "gathered_sources": [],
        "compiled_info": {},
        "draft_report": "",
        "review_suggestions": [],
        "final_report": "",
        "processing_time": 0.0,
        "errors": []
    }

    print(f"\n📋 Research Goal: {sample_goal['objective']}")
    print("-" * 60)

    try:
        result = agent.invoke(initial_state)

        print(f"\n📊 Final Results:")
        print(f"   Final Stage: {result['stage']}")
        print(f"   Search Queries: {len(result['search_strategy'])}")
        print(f"   Sources Gathered: {len(result['gathered_sources'])}")
        print(f"   Trends Identified: {len(result['compiled_info'].get('trends', []))}")
        print(f"   Review Suggestions: {len(result['review_suggestions'])}")
        print(f"   Total Processing Time: {result['processing_time']:.2f}s")
        print(f"   Final Report Length: {len(result['final_report'])} characters")

        if result['errors']:
            print(f"   Errors: {result['errors']}")

    except Exception as e:
        print(f"❌ Test failed: {str(e)}")

def visualize_workflow():
    """Visualize the workflow structure"""
    print("\n" + "="*60)
    print("📊 RESEARCH AGENT WORKFLOW VISUALIZATION")
    print("="*60)

    print("\n🔄 Workflow Flow:")
    print("""
    START
      ↓
    interpret_goal (LLM)
      ↓
    create_strategy (LLM)
      ↓
    execute_search (Python)
      ↓
    gather_data (Python)
      ↓
    compile_info (LLM)
      ↓
    write_draft (LLM)
      ↓
    review_edit (LLM)
      ↓
    write_final (LLM)
      ↓
    validate (LLM)
      ↓
     END
    """)

    print("\n📋 Node Details:")
    nodes = [
        ("interpret_goal", "Parse and structure research goal"),
        ("create_strategy", "Generate search queries and approach"),
        ("execute_search", "Run web searches in parallel"),
        ("gather_data", "Collect and organize source data"),
        ("compile_info", "Process and structure information"),
        ("write_draft", "Create initial report draft"),
        ("review_edit", "Review draft and suggest improvements"),
        ("write_final", "Create polished final report"),
        ("validate", "Final validation against original goal")
    ]

    for node, description in nodes:
        print(f"   • {node}: {description}")

if __name__ == "__main__":
    print("🚀 Research Agent Scaffold - LangGraph Learning")
    print("Focus: Understanding LangGraph workflow structure")
    print("="*60)

    # Visualize the workflow
    visualize_workflow()

    # Test the agent
    test_research_agent()

    print("\n" + "="*60)
    print("🎓 Scaffold Complete!")
    print("Key LangGraph Concepts Demonstrated:")
    print("• StateGraph - Main workflow orchestrator")
    print("• TypedDict - State schema definition")
    print("• Nodes - Individual processing functions")
    print("• Edges - Linear workflow connections")
    print("• State Management - Data flow through workflow")
    print("• Workflow Compilation - Creating executable agent")
    print("="*60)




## 🏗️ **The Universal LangGraph Pattern:**

```python
def create_any_agent():
    # 1. State - Define your data structure
    workflow = StateGraph(YourState)
    
    # 2. add_node - Add processing units
    workflow.add_node("node1", function1)
    workflow.add_node("node2", function2)
    
    # 3. add_edge - Connect the nodes
    workflow.add_edge("node1", "node2")
    
    # 4. set_entry_point - Where to start
    workflow.set_entry_point("node1")
    
    # 5. compile - Make it executable
    app = workflow.compile()
    
    return app
```

**This is the DNA of every LangGraph agent!** Everything else is just variations on this theme.

## 🔄 **Pattern Variations:**

### **Simple Linear (Your Current Pattern):**
```python
# A → B → C → D
workflow.add_edge("A", "B")
workflow.add_edge("B", "C")
workflow.add_edge("C", "D")
```

### **Conditional Branching:**
```python
# A → B → (C or D)
workflow.add_conditional_edges("B", decision_function, {"C": "C", "D": "D"})
```
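
The snippet above leaves `decision_function` undefined. A minimal sketch of what it could look like, assuming the state carries a `search_results` list (that field name is borrowed from the scaffold; the labels `"C"` and `"D"` follow the snippet): the function only returns a label, and the mapping passed to `add_conditional_edges` translates that label into a node name.

```python
from typing import Dict, List


def decision_function(state: Dict[str, List[dict]]) -> str:
    """Return the label of the next node: "C" when there is data, else "D"."""
    return "C" if state.get("search_results") else "D"


print(decision_function({"search_results": [{"query": "AI"}]}))
print(decision_function({"search_results": []}))
```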

### **Parallel Processing:**
```python
# A → (B + C) → D
workflow.add_edge("A", "B")
workflow.add_edge("A", "C")
workflow.add_edge("B", "D")
workflow.add_edge("C", "D")
```
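
One caveat for the fan-out above: when B and C run in the same step and both write to the same state key, LangGraph needs a rule for combining the writes. The documented mechanism is to annotate the key with a reducer such as `operator.add`. The sketch below declares such a schema and then emulates the merge by hand in plain Python; `ParallelState`, `findings`, `node_b`, and `node_c` are hypothetical names, and the manual loop only illustrates what the reducer does.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints


class ParallelState(TypedDict):
    # The second Annotated argument is the reducer applied to concurrent writes
    findings: Annotated[List[str], operator.add]


def node_b(state: ParallelState) -> dict:
    return {"findings": ["result from B"]}


def node_c(state: ParallelState) -> dict:
    return {"findings": ["result from C"]}


# Emulate the merge by hand: look up the reducer and fold both updates in
hints = get_type_hints(ParallelState, include_extras=True)
reducer = hints["findings"].__metadata__[0]

state: ParallelState = {"findings": []}
for update in (node_b(state), node_c(state)):
    state["findings"] = reducer(state["findings"], update["findings"])

print(state["findings"])
```

Without a reducer, two nodes writing the same key in one step is typically rejected as a conflicting update, so it is worth deciding per field whether writes should replace or accumulate.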

### **Loops:**
```python
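# A → B → C → (B or END): retry loop with an exit condition
workflow.add_edge("A", "B")
workflow.add_edge("B", "C")
# An unconditional C → B edge would cycle until the recursion limit is hit,
# so route out of C conditionally (should_retry is a hypothetical router)
workflow.add_conditional_edges("C", should_retry, {"retry": "B", "done": END})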
```

## 🎯 **What Changes Between Agents:**

### **1. State Schema:**
```python
# Research Agent
class ResearchState(TypedDict):
    goal: Dict[str, str]
    search_strategy: List[str]

# Customer Service Agent
class CustomerState(TypedDict):
    customer_id: str
    issue: str
    resolution: str
```

### **2. Node Functions:**
```python
# Research Agent
def interpret_goal(state: ResearchState) -> ResearchState:
    # Research logic
    pass

# Customer Service Agent
def analyze_issue(state: CustomerState) -> CustomerState:
    # Customer service logic
    pass
```

### **3. Edge Patterns:**
```python
# Research Agent: Linear flow
workflow.add_edge("A", "B")

# Customer Service: Conditional flow
workflow.add_conditional_edges("A", route_function, {"B": "B", "C": "C"})
```

## 🚀 **The Power of This Pattern:**

### **1. Reusability:**
```python
# Same pattern, different domains
research_agent = create_research_agent()
customer_agent = create_customer_agent()
content_agent = create_content_agent()
```

### **2. Composability:**
```python
# Combine agents
def create_mega_agent():
    workflow = StateGraph(MegaState)
    
    # Include research sub-agent (a compiled graph can serve as a node
    # when its state shares keys with MegaState)
    workflow.add_node("research", research_agent)
    
    # Include customer service sub-agent
    workflow.add_node("support", customer_agent)
    
    # Connect them
    workflow.add_edge("research", "support")
    
    return workflow.compile()
```

### **3. Testability:**
```python
# Test individual nodes
def test_interpret_goal():
    state = {"goal": {"objective": "test"}}
    result = interpret_goal(state)
    assert result["stage"] == "goal_interpreted"

# Test entire workflow
def test_research_agent():
    agent = create_research_agent()
    result = agent.invoke(initial_state)
    assert result["stage"] == "validated"
```
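
The tests above rely on an `initial_state` that the snippet does not define. One way to supply it, sketched here under the assumption that the state has the same fields as the scaffold's `ResearchState`, is a small factory that returns a fresh dict per test so mutable lists are never shared. `make_initial_state` and `mark_interpreted` are hypothetical helpers; the latter stands in for `interpret_goal` without the sleep and prints.

```python
from typing import Any, Dict


def make_initial_state(objective: str) -> Dict[str, Any]:
    """Build a fresh state dict so tests never share mutable lists."""
    return {
        "goal": {"objective": objective},
        "stage": "started",
        "search_strategy": [],
        "search_results": [],
        "gathered_sources": [],
        "compiled_info": {},
        "draft_report": "",
        "review_suggestions": [],
        "final_report": "",
        "processing_time": 0.0,
        "errors": [],
    }


def mark_interpreted(state: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for interpret_goal
    state["stage"] = "goal_interpreted"
    return state


def test_states_are_independent() -> None:
    first = make_initial_state("test")
    second = make_initial_state("test")
    first["errors"].append("boom")
    assert second["errors"] == []  # no list is shared between states
    assert mark_interpreted(first)["stage"] == "goal_interpreted"


test_states_are_independent()
```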

## 💡 **Advanced Patterns (Still Same Core):**

### **1. Human-in-the-Loop:**
```python
# Same 5 steps, but add human interaction
workflow.add_node("human_review", human_review_function)
workflow.add_edge("draft", "human_review")
workflow.add_edge("human_review", "final")
```

### **2. Dynamic Workflows:**
```python
# Same 5 steps, but edges change based on data
def dynamic_router(state: DynamicState) -> str:
    if state["complexity"] == "high":
        return "detailed_analysis"
    else:
        return "simple_analysis"
```

### **3. Multi-Agent Systems:**
```python
# Same 5 steps, but nodes are other agents
workflow.add_node("researcher", research_agent)
workflow.add_node("writer", writing_agent)
workflow.add_node("reviewer", review_agent)
```

## 🎯 **Your Insight is Spot-On:**

**The core is always the same 5 steps!** Everything else is just:
- **Different state schemas** for different domains
- **Different node functions** for different tasks
- **Different edge patterns** for different flows
- **Different business logic** for different use cases

**This is why the pattern is worth learning** - once you understand it, most agent workflows reduce to choosing a state schema, a set of nodes, and the edges between them. The framework runs the nodes and passes state along, and you focus on the business logic.



Let me break down the most important LangGraph concepts you should focus on, using your scaffold as the learning vehicle:

## 🎯 **Core LangGraph Concepts to Master:**

### **1. State Management (Most Critical)**
```python
class ResearchState(TypedDict):
    goal: Dict[str, str]
    stage: str
    search_strategy: List[str]
    # ... more fields
```
**Why it matters:** State is the "memory" of your agent - everything flows through it
**Focus on:** How data transforms as it moves through nodes

### **2. Node Functions (Processing Units)**
```python
def interpret_goal(state: ResearchState) -> ResearchState:
    # Takes state, processes it, returns updated state
    state["stage"] = "goal_interpreted"
    return state
```
**Why it matters:** Each node is a focused processing unit
**Focus on:** Single responsibility, clear input/output

### **3. Graph Construction (Orchestration)**
```python
workflow = StateGraph(ResearchState)
workflow.add_node("interpret_goal", interpret_goal)
workflow.add_edge("interpret_goal", "create_strategy")
```
**Why it matters:** This is how you connect your processing units
**Focus on:** Understanding the flow, adding conditional logic

## 🔥 **Advanced Patterns to Learn Next:**

### **4. Conditional Edges (Decision Making)**
```python
def route_after_validation(state: ResearchState) -> str:
    if state["stage"] == "validated":
        return "continue"
    else:
        return "retry"

workflow.add_conditional_edges(
    "validate",
    route_after_validation,
    {"continue": "next_node", "retry": "retry_node"}
)
```
**Why it matters:** Real agents need to make decisions
**Focus on:** When to use conditions vs linear flow

### **5. Parallel Processing**
```python
workflow.add_node("search_1", search_source_1)
workflow.add_node("search_2", search_source_2)
workflow.add_edge("start", "search_1")
workflow.add_edge("start", "search_2")
```
**Why it matters:** Speed up your agent by doing multiple things at once
**Focus on:** When tasks can run independently

### **6. Error Handling & Retry Logic**
```python
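def handle_error(state: ResearchState) -> ResearchState:
    # Assumes ResearchState declares a `retry_count: int` field
    state["retry_count"] = state.get("retry_count", 0) + 1
    if state["retry_count"] < 3:
        state["stage"] = "retry"
    else:
        state["stage"] = "failed"
    return state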
```
**Why it matters:** Real-world agents need to handle failures gracefully
**Focus on:** Graceful degradation, retry strategies

## 📚 **Learning Path I Recommend:**

### **Phase 1: Master the Basics (Current)**
- ✅ State schema design
- ✅ Linear node flow
- ✅ Basic state transformations

### **Phase 2: Add Intelligence**
- **Conditional routing** - What if search fails?
- **Parallel processing** - Multiple searches at once
- **Error handling** - Retry logic, fallbacks

### **Phase 3: Advanced Patterns**
- **Human-in-the-loop** - Pause for user input
- **Dynamic workflows** - Add nodes based on data
- **Memory persistence** - Save state between runs

## 🎯 **What to Focus on Right Now:**

### **1. State Flow Understanding**
- Trace how data moves from `goal` → `search_strategy` → `search_results`
- Understand what each node adds to the state
- See how state accumulates information

### **2. Node Design Patterns**
- Each node should do ONE thing well
- Clear input/output contracts
- Idempotent operations (can run multiple times safely)
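
Idempotency is easiest to see by contrast. In this sketch, `add_elapsed` accumulates a running total, so calling it twice changes the result, while `count_sources` derives its output from the input, so repeating it is harmless. Both function names and the `source_count` field are hypothetical; the other field names follow the scaffold.

```python
def add_elapsed(state: dict) -> dict:
    # Not idempotent: every call adds to the running total
    return {**state, "processing_time": state["processing_time"] + 0.2}


def count_sources(state: dict) -> dict:
    # Idempotent: the result depends only on the input, so repeats are harmless
    return {**state, "source_count": len(state["gathered_sources"])}


start = {"processing_time": 0.0, "gathered_sources": [{"title": "AI Trend 1"}]}

once, twice = add_elapsed(start), add_elapsed(add_elapsed(start))
print(once["processing_time"] == twice["processing_time"])

print(count_sources(count_sources(start)) == count_sources(start))
```

This matters most once retry loops are added, because a retried node runs more than once on the same state.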

### **3. Workflow Visualization**
- Draw your workflow on paper
- Understand the linear flow
- Identify where you might need branches

## 🚀 **Next Learning Steps:**

1. **Add conditional logic** - What if no trends found?
2. **Implement parallel search** - Multiple sources at once
3. **Add error handling** - Retry failed searches
4. **Create real LLM nodes** - Replace placeholders with actual AI calls

**The scaffold is perfect for this learning!** You can modify one concept at a time without breaking the whole system.



Let me clarify the difference between nodes and tools, and where personas fit in:

## 🔧 **Nodes vs Tools - Key Difference:**

### **Nodes = Processing Steps**
```python
def interpret_goal(state: ResearchState) -> ResearchState:
    # This is a NODE - it processes state
    state["stage"] = "goal_interpreted"
    return state
```

### **Tools = External Functions**
```python
def web_search(query: str) -> List[Dict]:
    # This is a TOOL - it does external work
    return search_api(query)

def write_report(content: str) -> str:
    # This is a TOOL - it calls an LLM
    return llm.generate(content)
```

**Nodes use tools, but nodes aren't tools themselves!**

## 🏗️ **LangGraph Architecture:**

### **LangGraph Handles:**
- **State management** - Moving data between nodes
- **Workflow orchestration** - Deciding which node runs next
- **Error propagation** - Surfacing node failures to the caller (recovery logic is still yours)
- **Parallel execution** - Running multiple nodes at once

### **You Handle:**
- **Node logic** - What each processing step does
- **Tool integration** - How nodes call external services
- **Persona implementation** - How to give nodes "personality"

## 👤 **Where Personas Live:**

### **Option 1: In Node Functions (Current Approach)**
```python
def interpret_goal(state: ResearchState) -> ResearchState:
    # Persona is embedded in the node logic
    persona_prompt = """
    You are a senior research analyst at Goldman Sachs.
    Your job is to interpret research goals for investment analysis.
    Focus on trends that matter to institutional investors.
    """
    
    # Use persona in LLM call
    result = llm.invoke(persona_prompt + state["goal"]["objective"])
    state["stage"] = "goal_interpreted"
    return state
```

### **Option 2: In State Schema**
```python
class ResearchState(TypedDict):
    # ... other fields
    persona: Dict[str, str]  # Store persona info in state
    current_persona: str     # Which persona is active
```

### **Option 3: Separate Persona Registry**
```python
PERSONAS = {
    "goldman_analyst": {
        "role": "Senior Research Analyst",
        "firm": "Goldman Sachs",
        "expertise": "Technology sector analysis",
        "prompt": "You are a senior research analyst..."
    },
    "tech_reporter": {
        "role": "Technology Reporter",
        "publication": "TechCrunch",
        "expertise": "Breaking tech news",
        "prompt": "You are a technology reporter..."
    }
}

def interpret_goal(state: ResearchState) -> ResearchState:
    persona = PERSONAS[state.get("current_persona", "goldman_analyst")]
    prompt = persona["prompt"] + state["goal"]["objective"]
    # ... rest of logic
```
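
The `# ... rest of logic` step above can be sketched as a small prompt builder. This is one possible shape, not the notebook's implementation: `build_prompt` and `DEFAULT_PERSONA` are hypothetical names, and the persona texts are shortened placeholders. It falls back to a default persona when the key is unknown and puts a separator between the persona text and the objective, which plain string concatenation would run together.

```python
from typing import Dict

PERSONAS: Dict[str, Dict[str, str]] = {
    "goldman_analyst": {"prompt": "You are a senior research analyst."},
    "tech_reporter": {"prompt": "You are a technology reporter."},
}
DEFAULT_PERSONA = "goldman_analyst"


def build_prompt(state: dict) -> str:
    """Combine the active persona's prompt with the research objective."""
    key = state.get("current_persona", DEFAULT_PERSONA)
    persona = PERSONAS.get(key, PERSONAS[DEFAULT_PERSONA])  # fall back on unknown keys
    # A blank line keeps the persona text from running into the objective
    return f"{persona['prompt']}\n\nResearch goal: {state['goal']['objective']}"


prompt = build_prompt(
    {"goal": {"objective": "Identify AI trends"}, "current_persona": "tech_reporter"}
)
print(prompt)
```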

## 🎯 **Tool Registry Pattern:**

### **Simple Tool Registry:**
```python
TOOLS = {
    "web_search": web_search_function,
    "llm_generate": llm_generate_function,
    "data_processor": data_processor_function
}

def execute_search(state: ResearchState) -> ResearchState:
    # Node uses tools from registry
    search_tool = TOOLS["web_search"]
    results = search_tool(state["search_strategy"])
    state["search_results"] = results
    return state
```
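
A practical benefit of the registry is that a node can be exercised without network access by swapping in a stub. The sketch below assumes the same `TOOLS["web_search"]` lookup as above; `fake_web_search` is a hypothetical stub, and the node returns a new dict rather than mutating its input.

```python
from typing import Callable, Dict, List


def fake_web_search(queries: List[str]) -> List[dict]:
    """Hypothetical stub that returns one canned result per query."""
    return [{"query": q, "sources": 1, "status": "success"} for q in queries]


TOOLS: Dict[str, Callable[[List[str]], List[dict]]] = {"web_search": fake_web_search}


def execute_search(state: dict) -> dict:
    search_tool = TOOLS["web_search"]  # looked up at call time, so it can be swapped
    return {**state, "search_results": search_tool(state["search_strategy"])}


result = execute_search({"search_strategy": ["AI trends 2024", "AI news"]})
print(len(result["search_results"]))
```

Replacing the registry entry with a real search function later does not require touching `execute_search`.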

### **LangChain Tool Integration:**
```python
from langchain.tools import Tool

search_tool = Tool(
    name="web_search",
    description="Search the web for information",
    func=web_search_function
)

def execute_search(state: ResearchState) -> ResearchState:
    # Tool.run takes a single input, so call it once per query
    results = [search_tool.run(query) for query in state["search_strategy"]]
    state["search_results"] = results
    return state
```

## 🔄 **Complete Pattern Example:**

```python
# 1. Define tools
def web_search(queries: List[str]) -> List[Dict]:
    # External API calls
    pass

def llm_analyze(content: str, persona: str) -> str:
    # LLM with persona
    pass

# 2. Define nodes that use tools
def execute_search(state: ResearchState) -> ResearchState:
    # Node uses web_search tool
    results = web_search(state["search_strategy"])
    state["search_results"] = results
    return state

def compile_information(state: ResearchState) -> ResearchState:
    # Node uses llm_analyze tool with persona
    persona = "goldman_analyst"
    analysis = llm_analyze(state["gathered_sources"], persona)
    state["compiled_info"] = analysis
    return state
```

## 💡 **Best Practices:**

### **1. Separate Concerns**
- **Nodes** = Workflow logic
- **Tools** = External functionality
- **Personas** = Context/personality

### **2. Tool Registry**
- Keep tools in a central registry
- Easy to swap implementations
- Reusable across nodes

### **3. Persona Management**
- Store personas in state or registry
- Consistent personality across nodes
- Easy to change personas

**So to answer your questions:**
1. **Nodes are processing steps, not tools**
2. **LangGraph handles orchestration, you handle tools**
3. **Personas can live in nodes, state, or separate registry**



You're right to be confused - let me clarify the difference between what you define and what LangGraph handles automatically.

## 🔄 **Workflow Orchestration - What You Define vs What LangGraph Does:**

### **What You Define (Edges):**
```python
# You explicitly define the flow
workflow.add_edge("interpret_goal", "create_strategy")
workflow.add_edge("create_strategy", "execute_search")
workflow.add_edge("execute_search", "gather_data")
```

### **What LangGraph Does Automatically:**
```python
# LangGraph handles the execution order
# 1. Runs "interpret_goal" first
# 2. Waits for it to complete
# 3. Runs "create_strategy" next
# 4. Waits for it to complete
# 5. Runs "execute_search" next
# ... and so on
```

**You define the "what" (which nodes connect), LangGraph handles the "when" (execution order).**

## 🚨 **Error Handling - What LangGraph Covers vs What You Need:**

### **What LangGraph Handles Automatically:**
```python
# 1. Node execution failures
def bad_node(state: ResearchState) -> ResearchState:
    # The exception propagates out of invoke() and the run stops,
    # unless you catch it yourself or configure retries
    raise Exception("Something went wrong!")

# 2. State serialization errors
# Only relevant when a checkpointer persists state; failures surface as exceptions

# 3. Graph compilation errors
# compile() validates the graph, e.g. edges that point to unknown nodes.
# Cycles are allowed; they are how retry loops are built
```

### **What You Need to Handle:**
```python
# 1. Business logic errors
def execute_search(state: ResearchState) -> ResearchState:
    try:
        results = web_search(state["search_strategy"])
        if len(results) == 0:
            # This is YOUR error to handle
            state["errors"].append("No search results found")
            state["stage"] = "search_failed"
        else:
            state["search_results"] = results
            state["stage"] = "search_completed"
    except Exception as e:
        # This is YOUR error to handle
        state["errors"].append(f"Search failed: {str(e)}")
        state["stage"] = "search_failed"
    
    return state
```

## 🔀 **Conditional Routing - Where LangGraph Gets Smart:**

### **Simple Linear Flow (What You Have Now):**
```python
# Always goes: A → B → C → D
workflow.add_edge("A", "B")
workflow.add_edge("B", "C")
workflow.add_edge("C", "D")
```

### **Conditional Flow (What You Can Add):**
```python
def route_after_search(state: ResearchState) -> str:
    if state["stage"] == "search_completed":
        return "continue"
    elif state["stage"] == "search_failed":
        return "retry"
    else:
        return "end"

# LangGraph decides which path to take
workflow.add_conditional_edges(
    "execute_search",
    route_after_search,
    {
        "continue": "gather_data",
        "retry": "execute_search",  # Loop back (bound this with a retry counter)
        "end": END
    }
)
```

**Now the graph can branch at runtime!** LangGraph calls your routing function with the current state and follows the label it returns.

## 🛡️ **Error Handling Levels:**

### **Level 1: LangGraph Automatic (You Get This Free)**
```python
# LangGraph surfaces these without extra code from you:
# - Node crashes (the exception propagates and the run stops)
# - Invalid state updates
# - Graph validation errors at compile time
```

### **Level 2: Business Logic Errors (You Handle)**
```python
def execute_search(state: ResearchState) -> ResearchState:
    # Check if search strategy exists
    if not state["search_strategy"]:
        state["errors"].append("No search strategy defined")
        state["stage"] = "strategy_missing"
        return state
    
    # Try to execute search
    try:
        results = web_search(state["search_strategy"])
        state["search_results"] = results
        state["stage"] = "search_completed"
    except Exception as e:
        state["errors"].append(f"Search failed: {str(e)}")
        state["stage"] = "search_failed"
    
    return state
```

### **Level 3: Retry Logic (You Design)**
```python
def handle_search_error(state: ResearchState) -> ResearchState:
    state["retry_count"] = state.get("retry_count", 0) + 1
    
    if state["retry_count"] < 3:
        print(f"Retrying search (attempt {state['retry_count']})")
        state["stage"] = "retry_search"
    else:
        print("Max retries exceeded")
        state["stage"] = "search_failed_permanently"
    
    return state
```

## 🔄 **Complete Error Handling Example:**

```python
# Add error handling node
workflow.add_node("handle_error", handle_search_error)

# Add conditional routing
def route_after_search(state: ResearchState) -> str:
    if state["stage"] == "search_completed":
        return "continue"
    elif state["stage"] == "search_failed":
        return "handle_error"
    else:
        return "end"

workflow.add_conditional_edges(
    "execute_search",
    route_after_search,
    {
        "continue": "gather_data",
        "handle_error": "handle_error",
        "end": END
    }
)

# Add retry loop
def route_after_error(state: ResearchState) -> str:
    if state["stage"] == "retry_search":
        return "execute_search"  # Loop back
    else:
        return "end"

workflow.add_conditional_edges(
    "handle_error",
    route_after_error,
    {
        "execute_search": "execute_search",
        "end": END
    }
)
```
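
To check that the retry loop above terminates, this sketch replays the same routing in plain Python with a search that always fails. `always_failing_search` and the `run` driver are hypothetical stand-ins for `execute_search` and the compiled graph. Note that `retry_count` is not part of the scaffold's `ResearchState`; it is safest to declare it there, since keys missing from the schema may not be carried between nodes.

```python
def always_failing_search(state: dict) -> dict:
    # Hypothetical stand-in for execute_search when the search API is down
    return {
        **state,
        "stage": "search_failed",
        "errors": state["errors"] + ["Search failed: timeout"],
    }


def handle_search_error(state: dict) -> dict:
    retries = state.get("retry_count", 0) + 1
    stage = "retry_search" if retries < 3 else "search_failed_permanently"
    return {**state, "retry_count": retries, "stage": stage}


def run(state: dict) -> dict:
    """Replay execute_search -> handle_error -> (execute_search | END)."""
    while True:
        state = always_failing_search(state)
        state = handle_search_error(state)
        if state["stage"] != "retry_search":
            return state


final = run({"stage": "started", "errors": []})
print(final["stage"], final["retry_count"], len(final["errors"]))
```

The loop stops after the third failed attempt because the counter lives in the state and the router checks the stage that `handle_search_error` sets.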

## 💡 **Key Takeaways:**

1. **Edges define the "map"** - which nodes can connect
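2. **LangGraph handles the "navigation"** - it runs nodes in order and follows the path your edges and routing functions select
3. **You handle business logic errors** - unhandled exceptions propagate out of the graph and stop the run
4. **Conditional edges make it smart** - your routing functions make the decisions, LangGraph executes them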
5. **Error handling is layered** - LangGraph + your business logic

