# Level 3: Advanced - LangGraph Deep Dive

This notebook covers advanced LangGraph concepts for building production-ready agent systems.

## Learning Objectives
- Persistence and checkpointing (memory across sessions)
- Human-in-the-loop with interrupts
- Streaming events and tokens from graphs
- Subgraphs for modular agent architectures
- Dynamic state management and reducers
- Error handling and retry strategies

## Prerequisites
- Completed Notebooks 01 and 02

---

**References:**
- [LangGraph Persistence](https://langchain-ai.github.io/langgraph/how-tos/persistence/)
- [LangGraph Human-in-the-Loop](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/)
- [LangGraph Streaming](https://langchain-ai.github.io/langgraph/how-tos/streaming/)

## 1. Setup

In [1]:
# Import required libraries
import os
import sys
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Add parent directory to path for shared config
sys.path.append('..')

# Import global model configuration
from config import (
    GPT_MODEL, GEMINI_MODEL,
    GPT_MODEL_NAME, GEMINI_MODEL_NAME,
    get_model, list_available_models,
)

print(f"Using GPT model:    {GPT_MODEL_NAME}")
print(f"Using Gemini model: {GEMINI_MODEL_NAME}")
print()
list_available_models()

OpenAI client initialized  -> model: gpt-4o-mini
Google client initialized  -> model: gemini-3-flash-preview
Using GPT model:    gpt-4o-mini
Using Gemini model: gemini-3-flash-preview

Available Models:
-------------------------------------------------------
  gpt-4o-mini          -> ChatOpenAI(gpt-4o-mini)
  gemini-3-flash-preview -> ChatGoogleGenerativeAI(gemini-3-flash-preview)
-------------------------------------------------------


## 2. Persistence and Checkpointing

Persistence lets your graph **remember** state across invocations. This is essential for:
- Multi-turn conversations
- Long-running workflows
- Crash recovery

LangGraph uses **checkpointers** to save and restore graph state.

In [2]:
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.memory import MemorySaver

# Define state with message history
class ChatState(TypedDict):
    messages: Annotated[list, add_messages]

# Build a simple chatbot graph with memory
def chatbot_node(state: ChatState):
    return {"messages": [GPT_MODEL.invoke(state["messages"])]}

graph_builder = StateGraph(ChatState)
graph_builder.add_node("chatbot", chatbot_node)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

# Compile with memory checkpointer
memory = MemorySaver()
chatbot_with_memory = graph_builder.compile(checkpointer=memory)

# First conversation turn
config = {"configurable": {"thread_id": "demo-thread-1"}}
result = chatbot_with_memory.invoke(
    {"messages": [{"role": "user", "content": "My name is Alice. I love hiking."}]},
    config=config,
)
print("Turn 1:", result["messages"][-1].content)

# Second turn - the bot should remember the name
result = chatbot_with_memory.invoke(
    {"messages": [{"role": "user", "content": "What is my name and what do I like?"}]},
    config=config,
)
print("\nTurn 2:", result["messages"][-1].content)

Turn 1: Hi Alice! It's great to meet you. Hiking is such a wonderful way to connect with nature and stay active. Do you have any favorite trails or hiking spots?

Turn 2: Your name is Alice, and you love hiking!


In [3]:
# Verify persistence: use a different thread to show isolation
config2 = {"configurable": {"thread_id": "demo-thread-2"}}
result = chatbot_with_memory.invoke(
    {"messages": [{"role": "user", "content": "What is my name?"}]},
    config=config2,
)
print("Thread 2 (no prior context):", result["messages"][-1].content)

# Go back to thread 1 - memory should be intact
result = chatbot_with_memory.invoke(
    {"messages": [{"role": "user", "content": "Can you remind me what my name is?"}]},
    config=config,
)
print("\nThread 1 (has context):", result["messages"][-1].content)

Thread 2 (no prior context): I'm sorry, but I don't have access to personal information about you unless you share it with me. How can I assist you today?

Thread 1 (has context): Your name is Alice!
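
To make the mechanism concrete without calling a model, the sketch below imitates what a checkpointer does conceptually: it keeps one message history per `thread_id`, restores it before a node runs, and saves the result afterwards. `ToyCheckpointer`, `echo_node`, and `invoke` are hypothetical names used only for illustration; this is a simplified model and not how `MemorySaver` is implemented.

```python
from collections import defaultdict


class ToyCheckpointer:
    """Hypothetical stand-in for a checkpointer: one saved state per thread_id."""

    def __init__(self):
        self._store = defaultdict(lambda: {"messages": []})

    def load(self, thread_id: str) -> dict:
        # Return a copy so callers cannot mutate the saved checkpoint by accident.
        return {"messages": list(self._store[thread_id]["messages"])}

    def save(self, thread_id: str, state: dict) -> None:
        self._store[thread_id] = {"messages": list(state["messages"])}


def echo_node(state: dict) -> dict:
    """Fake 'chatbot' node: reports how many user messages it can see."""
    seen = sum(1 for m in state["messages"] if m["role"] == "user")
    reply = {"role": "assistant", "content": f"I have seen {seen} user message(s)."}
    return {"messages": [reply]}


def invoke(checkpointer: ToyCheckpointer, user_text: str, thread_id: str) -> dict:
    state = checkpointer.load(thread_id)                               # 1. restore
    state["messages"].append({"role": "user", "content": user_text})   # 2. add input
    state["messages"].extend(echo_node(state)["messages"])             # 3. run node
    checkpointer.save(thread_id, state)                                # 4. persist
    return state


cp = ToyCheckpointer()
invoke(cp, "My name is Alice.", "demo-thread-1")
thread1 = invoke(cp, "What is my name?", "demo-thread-1")
thread2 = invoke(cp, "What is my name?", "demo-thread-2")

print(thread1["messages"][-1]["content"])  # thread 1 sees both of its turns
print(thread2["messages"][-1]["content"])  # thread 2 starts from an empty history
```

The second thread only sees its own single message, which mirrors the isolation shown in the cell above.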


## 3. Human-in-the-Loop with Interrupts

Interrupts allow you to **pause** a graph at any node, inspect the state, optionally modify it, and then **resume** execution. This is critical for:
- Approval workflows
- Content moderation
- Manual data correction

In [4]:
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

# Define tools for a research assistant
@tool
def search_web(query: str) -> str:
    """Search the web for information (mock)."""
    mock_results = {
        "langchain": "LangChain is a framework for building LLM applications.",
        "langgraph": "LangGraph provides graph-based workflows for agents.",
        "python": "Python is a versatile programming language.",
    }
    for key, val in mock_results.items():
        if key in query.lower():
            return val
    return f"No results found for: {query}"

@tool
def save_note(note: str) -> str:
    """Save a research note for later reference."""
    print(f"  [NOTE SAVED]: {note}")
    return f"Note saved: {note}"

# Create agent that pauses before the "tools" node, so every tool call
# (search_web as well as save_note) requires approval
research_tools = [search_web, save_note]
memory = MemorySaver()

research_agent = create_react_agent(
    GPT_MODEL,
    research_tools,
    checkpointer=memory,
    interrupt_before=["tools"],  # Pause before any tool execution
)

# Start the agent
config = {"configurable": {"thread_id": "research-1"}}
result = research_agent.invoke(
    {"messages": [{"role": "user", "content": "Search for information about LangGraph and save a summary note."}]},
    config=config,
)

# The agent is now paused before tool execution
print("Agent paused. Current state:")
snapshot = research_agent.get_state(config)
print(f"  Next node: {snapshot.next}")
if snapshot.values.get("messages"):
    last_msg = snapshot.values["messages"][-1]
    if hasattr(last_msg, "tool_calls") and last_msg.tool_calls:
        for tc in last_msg.tool_calls:
            print(f"  Pending tool call: {tc['name']}({tc['args']})")

C:\Users\Yunpeng.Cheng\AppData\Local\Temp\ipykernel_25260\3528947716.py:29: LangGraphDeprecatedSinceV10: create_react_agent has been moved to `langchain.agents`. Please update your import to `from langchain.agents import create_agent`. Deprecated in LangGraph V1.0 to be removed in V2.0.
  research_agent = create_react_agent(


Agent paused. Current state:
  Next node: ('tools',)
  Pending tool call: search_web({'query': 'LangGraph'})


In [5]:
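# Resume execution: approve the pending tool call and let the agent continue.
# In production, you'd show the pending tool call to a human for approval.
print("Approving tool execution and resuming...\n")

# Invoking with None continues from the last checkpoint. Because
# interrupt_before=["tools"] fires before every tool call, the run pauses again
# if the model requests another tool; call invoke(None, ...) again to continue.
result = research_agent.invoke(None, config=config)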

# Print the results
for msg in result["messages"]:
    role = msg.type if hasattr(msg, 'type') else 'unknown'
    content = msg.content if hasattr(msg, 'content') else str(msg)
    if content:
        print(f"[{role}] {content[:200]}")

Approving tool execution and resuming...

[human] Search for information about LangGraph and save a summary note.
[tool] LangGraph provides graph-based workflows for agents.
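
In a real approval workflow, the pause is the point where a reviewer decides which pending calls may run. The sketch below shows one way such a gate could look, assuming tool calls are dicts with `name`, `args`, and `id` keys as in the snapshot printed earlier. `AUTO_APPROVED`, `review_tool_calls`, and the `ask_human` callback are hypothetical names, not part of LangGraph.

```python
from typing import Callable

# Assumption: read-only tools are safe to run without review.
AUTO_APPROVED = {"search_web"}


def review_tool_calls(
    tool_calls: list[dict],
    ask_human: Callable[[dict], bool],
) -> tuple[list[dict], list[dict]]:
    """Split pending tool calls into (approved, rejected)."""
    approved, rejected = [], []
    for call in tool_calls:
        # Auto-approved tools skip the human; everything else needs a "yes".
        if call["name"] in AUTO_APPROVED or ask_human(call):
            approved.append(call)
        else:
            rejected.append(call)
    return approved, rejected


pending = [
    {"name": "search_web", "args": {"query": "LangGraph"}, "id": "call_1"},
    {"name": "save_note", "args": {"note": "LangGraph summary"}, "id": "call_2"},
]

# Stand-in for a real prompt (e.g. input() or a web form): reject everything.
approved, rejected = review_tool_calls(pending, ask_human=lambda call: False)
print("approved:", [c["name"] for c in approved])
print("rejected:", [c["name"] for c in rejected])
```

If every pending call is approved, the graph can be resumed with `invoke(None, config=config)` as in the cell above; how to handle a rejection is application-specific.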


## 4. Streaming Events from Graphs

Streaming lets you observe a graph's execution in real time. With `stream_mode="updates"`, each event is a dict that maps a node name to the state update that node just returned.

In [6]:
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_core.tools import tool

@tool
def lookup_info(topic: str) -> str:
    """Look up information about a topic."""
    data = {
        "python": "Python is a high-level programming language known for readability.",
        "javascript": "JavaScript is a dynamic language used for web development.",
        "rust": "Rust is a systems language focused on safety and performance.",
    }
    return data.get(topic.lower(), f"No info on {topic}")

tools = [lookup_info]

# Build a simple tool-calling agent for streaming demo
class StreamState(TypedDict):
    messages: Annotated[list, add_messages]

model_with_tools = GPT_MODEL.bind_tools(tools)

def agent(state):
    return {"messages": [model_with_tools.invoke(state["messages"])]}

def execute_tools(state):
    from langchain_core.messages import ToolMessage
    tools_by_name = {t.name: t for t in tools}  # build the lookup once, not per call
    outputs = []
    for tc in state["messages"][-1].tool_calls:
        result = tools_by_name[tc["name"]].invoke(tc["args"])
        outputs.append(ToolMessage(content=str(result), tool_call_id=tc["id"], name=tc["name"]))
    return {"messages": outputs}

def route(state):
    last = state["messages"][-1]
    if hasattr(last, "tool_calls") and last.tool_calls:
        return "tools"
    return END

g = StateGraph(StreamState)
g.add_node("agent", agent)
g.add_node("tools", execute_tools)
g.add_edge(START, "agent")
g.add_conditional_edges("agent", route, {"tools": "tools", END: END})
g.add_edge("tools", "agent")
stream_graph = g.compile()

# Stream events from the graph
print("Streaming graph events:\n")
for event in stream_graph.stream(
    {"messages": [{"role": "user", "content": "Tell me about Python and Rust."}]},
    stream_mode="updates",
):
    for node_name, node_output in event.items():
        print(f"--- Node: {node_name} ---")
        if "messages" in node_output:
            for msg in node_output["messages"]:
                content = msg.content if hasattr(msg, "content") else str(msg)
                if content:
                    print(f"  {msg.type}: {content[:150]}")
                if hasattr(msg, "tool_calls") and msg.tool_calls:
                    for tc in msg.tool_calls:
                        print(f"  Tool call: {tc['name']}({tc['args']})")
    print()

Streaming graph events:

--- Node: agent ---
  Tool call: lookup_info({'topic': 'Python programming language'})
  Tool call: lookup_info({'topic': 'Rust programming language'})

--- Node: tools ---
  tool: No info on Python programming language
  tool: No info on Rust programming language

--- Node: agent ---
  ai: It seems that I couldn't retrieve specific information about Python and Rust at the moment. However, I can provide a brief overview based on my knowle
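
In the run above the model passed `'Python programming language'` as the topic, so the exact-match `data.get(...)` lookup found nothing. One possible remedy, sketched here as a plain function, is to match known keys as substrings of the topic. `lookup_info_fuzzy` and `DATA` are hypothetical names, and the `@tool` decorator is left off so the snippet runs without LangChain.

```python
DATA = {
    "python": "Python is a high-level programming language known for readability.",
    "javascript": "JavaScript is a dynamic language used for web development.",
    "rust": "Rust is a systems language focused on safety and performance.",
}


def lookup_info_fuzzy(topic: str) -> str:
    """Return the entry whose key appears anywhere in the topic (case-insensitive)."""
    normalized = topic.lower()
    for key, info in DATA.items():
        if key in normalized:
            return info
    return f"No info on {topic}"


print(lookup_info_fuzzy("Python programming language"))
print(lookup_info_fuzzy("Rust programming language"))
print(lookup_info_fuzzy("Haskell"))
```

Substring matching is deliberately crude: a topic such as "trust" would also match `rust`, so a real tool might prefer word-level matching.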



## 5. Subgraphs for Modular Architectures

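Subgraphs let you compose smaller graphs into larger ones, promoting reuse and modularity. A compiled graph can be passed to `add_node` directly when it shares state keys (here, `messages`) with the parent graph.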

In [7]:
from langchain_core.messages import SystemMessage

# Build a "summarizer" subgraph
class SummarizerState(TypedDict):
    messages: Annotated[list, add_messages]

def summarize_node(state: SummarizerState):
    msgs = [SystemMessage(content="Summarize the conversation so far in 2 sentences.")] + state["messages"]
    return {"messages": [GEMINI_MODEL.invoke(msgs)]}

summarizer = StateGraph(SummarizerState)
summarizer.add_node("summarize", summarize_node)
summarizer.add_edge(START, "summarize")
summarizer.add_edge("summarize", END)
summarizer_graph = summarizer.compile()

# Build a "responder" subgraph
def responder_node(state: SummarizerState):
    msgs = [SystemMessage(content="You are a helpful assistant. Respond to the user.")] + state["messages"]
    return {"messages": [GPT_MODEL.invoke(msgs)]}

responder = StateGraph(SummarizerState)
responder.add_node("respond", responder_node)
responder.add_edge(START, "respond")
responder.add_edge("respond", END)
responder_graph = responder.compile()

# Build the parent graph that uses both subgraphs
class ParentState(TypedDict):
    messages: Annotated[list, add_messages]

parent = StateGraph(ParentState)
parent.add_node("responder", responder_graph)
parent.add_node("summarizer", summarizer_graph)

parent.add_edge(START, "responder")
parent.add_edge("responder", "summarizer")
parent.add_edge("summarizer", END)

parent_graph = parent.compile()

result = parent_graph.invoke({
    "messages": [{"role": "user", "content": "Explain the difference between LangChain and LangGraph."}]
})

print("Response:")
print(result["messages"][-2].content[:300])
print("\nSummary:")
print(result["messages"][-1].content)

Response:
LangChain and LangGraph are both frameworks designed to facilitate the development of applications that utilize language models, but they serve different purposes and have distinct features.

### LangChain
- **Purpose**: LangChain is primarily focused on building applications that leverage large lan

Summary:
[]


## 6. Custom State Reducers

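Reducers control **how** state updates are merged. Without a reducer, an update overwrites the existing value; `add_messages` appends new messages to the list (and replaces a message whose ID already exists). You can write custom reducers for any other behavior.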

In [9]:
import operator
from typing import Annotated

# Helper to extract content from model response (handles both OpenAI and Gemini formats)
def get_response_content(response) -> str:
    """Extract text content from model response, handling different formats."""
    content = getattr(response, "content", response)
    # Gemini sometimes returns content as a list of blocks. Blocks are assumed
    # to be plain strings or dicts carrying their text under a "text" key.
    if isinstance(content, list):
        parts = [
            block.get("text", "") if isinstance(block, dict) else str(block)
            for block in content
        ]
        return "".join(parts)
    return str(content)

# Custom reducer: accumulate a running total
def add_to_total(current: int, update: int) -> int:
    """Custom reducer that adds values."""
    return current + update

class ScoreState(TypedDict):
    task: str
    scores: Annotated[list, operator.add]  # List concatenation reducer
    total: Annotated[int, add_to_total]    # Custom sum reducer

def judge_a(state: ScoreState):
    """Judge A scores the task."""
    response = GPT_MODEL.invoke(f"Rate this task on a scale of 1-10 (respond with just the number): {state['task']}")
    content = get_response_content(response)
    try:
        score = int(content.strip())
    except ValueError:
        score = 5
    return {"scores": [{"judge": "GPT", "score": score}], "total": score}

def judge_b(state: ScoreState):
    """Judge B scores the task."""
    response = GEMINI_MODEL.invoke(f"Rate this task on a scale of 1-10 (respond with just the number): {state['task']}")
    content = get_response_content(response)
    try:
        score = int(content.strip())
    except ValueError:
        score = 5
    return {"scores": [{"judge": "Gemini", "score": score}], "total": score}

def final_verdict(state: ScoreState):
    """Compute average score."""
    avg = state["total"] / len(state["scores"]) if state["scores"] else 0
    print(f"\nScores: {state['scores']}")
    print(f"Total: {state['total']}")
    print(f"Average: {avg:.1f}")
    return {}

scoring_graph = StateGraph(ScoreState)
scoring_graph.add_node("judge_a", judge_a)
scoring_graph.add_node("judge_b", judge_b)
scoring_graph.add_node("verdict", final_verdict)

scoring_graph.add_edge(START, "judge_a")
scoring_graph.add_edge("judge_a", "judge_b")
scoring_graph.add_edge("judge_b", "verdict")
scoring_graph.add_edge("verdict", END)

scoring_compiled = scoring_graph.compile()

result = scoring_compiled.invoke({
    "task": "Write a Python function to reverse a string",
    "scores": [],
    "total": 0,
})


Scores: [{'judge': 'GPT', 'score': 8}, {'judge': 'Gemini', 'score': 5}]
Total: 13
Average: 6.5
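
The merge rule can be modelled in a few lines of plain Python. The sketch below reads the reducer from each `Annotated` field of `ScoreState` and applies it to an update; keys without a reducer are simply overwritten. `merge_update` is a hypothetical helper written for illustration and is not LangGraph's actual implementation.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


def add_to_total(current: int, update: int) -> int:
    """Custom reducer that adds values."""
    return current + update


class ScoreState(TypedDict):
    task: str                                # no reducer: overwrite
    scores: Annotated[list, operator.add]    # list concatenation
    total: Annotated[int, add_to_total]      # running sum


def merge_update(schema, state: dict, update: dict) -> dict:
    """Apply one node's update to the state, honoring per-key reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] hints carry the reducer as their first metadata item.
        metadata = getattr(hints.get(key), "__metadata__", ())
        reducer = metadata[0] if metadata else None
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"task": "reverse a string", "scores": [], "total": 0}
state = merge_update(ScoreState, state, {"scores": [{"judge": "GPT", "score": 8}], "total": 8})
state = merge_update(ScoreState, state, {"scores": [{"judge": "Gemini", "score": 5}], "total": 5})
state = merge_update(ScoreState, state, {"task": "updated task"})

print(state["scores"])
print(state["total"])
print(state["task"])
```

Both judges return only their own score, yet the state ends up with two entries in `scores` and a `total` of 13, because each key's reducer combines the old value with the update.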


## 7. Error Handling in Graphs

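Robust agents need to handle failures gracefully. The pattern below catches exceptions inside the node, records the error and a retry count in state, retries the primary model until three attempts have failed, and then routes to a fallback model.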

In [10]:
from langchain_core.messages import SystemMessage, HumanMessage

class ErrorHandlingState(TypedDict):
    messages: Annotated[list, add_messages]
    error: str
    retry_count: int

def safe_llm_call(state: ErrorHandlingState):
    """LLM call with error handling."""
    try:
        response = GPT_MODEL.invoke(state["messages"])
        return {"messages": [response], "error": ""}
    except Exception as e:
        return {
            "error": str(e),
            "retry_count": state.get("retry_count", 0) + 1,
        }

def check_error(state: ErrorHandlingState) -> str:
    """Route based on whether there was an error."""
    if state.get("error") and state.get("retry_count", 0) < 3:
        return "retry"
    elif state.get("error"):
        return "fallback"
    return END

def fallback_node(state: ErrorHandlingState):
    """Use Gemini as fallback."""
    from langchain_core.messages import AIMessage
    try:
        response = GEMINI_MODEL.invoke(state["messages"])
        return {"messages": [response], "error": ""}
    except Exception as e:
        # Report the failure as an assistant message rather than a user message
        return {"messages": [AIMessage(content=f"All models failed: {e}")]}

error_graph = StateGraph(ErrorHandlingState)
error_graph.add_node("llm", safe_llm_call)
error_graph.add_node("fallback", fallback_node)

error_graph.add_edge(START, "llm")
error_graph.add_conditional_edges("llm", check_error, {
    "retry": "llm",
    "fallback": "fallback",
    END: END,
})
error_graph.add_edge("fallback", END)

error_compiled = error_graph.compile()

# Test normal operation
result = error_compiled.invoke({
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "error": "",
    "retry_count": 0,
})
print("Result:", result["messages"][-1].content)

Result: 2 + 2 equals 4.
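
The cell above only exercises the success path. To see the retry and fallback routes without forcing a real API failure, the sketch below drives the same routing rule in plain Python with a deliberately failing primary call. `END` here is a local stand-in string, and `flaky_primary`, `fallback`, and `run` are hypothetical helpers, not LangGraph APIs.

```python
END = "__end__"  # local stand-in for langgraph.graph.END
MAX_RETRIES = 3


def check_error(state: dict) -> str:
    """Same routing rule as in the graph: retry, then fall back, else finish."""
    if state.get("error") and state.get("retry_count", 0) < MAX_RETRIES:
        return "retry"
    elif state.get("error"):
        return "fallback"
    return END


def flaky_primary(state: dict) -> dict:
    """Simulated primary model call that always fails."""
    try:
        raise TimeoutError("primary model unavailable")
    except Exception as e:
        return {"error": str(e), "retry_count": state.get("retry_count", 0) + 1}


def fallback(state: dict) -> dict:
    """Simulated fallback model that always succeeds."""
    return {"answer": "4 (from fallback)", "error": ""}


def run(state: dict) -> tuple[dict, list[str]]:
    """Tiny driver loop that follows the routes and records them."""
    trace = []
    while True:
        state = {**state, **flaky_primary(state)}
        route = check_error(state)
        trace.append(route)
        if route == "retry":
            continue
        if route == "fallback":
            state = {**state, **fallback(state)}
        return state, trace


final_state, trace = run({"error": "", "retry_count": 0})
print(trace)
print(final_state["retry_count"], final_state["answer"])
```

With a limit of 3, the primary call is attempted three times in total (two `retry` routes followed by `fallback`) before the fallback runs.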


## Summary

| Concept | What You Learned |
|---------|------------------|
| Persistence | MemorySaver for multi-turn conversation memory |
| Thread Isolation | Separate memory per thread_id |
| Human-in-the-Loop | interrupt_before to pause and resume graphs |
| Streaming | stream_mode="updates" for real-time events |
| Subgraphs | Compose smaller graphs into larger workflows |
| Custom Reducers | Control how state updates are merged |
| Error Handling | Retry logic and model fallbacks in graphs |

**Next:** [04 - Specialty: RAG, Multi-Agent & Production](./04_specialty_rag_and_multi_agent.ipynb)