# Lesson 8.6: Debugging and Deploying LangGraph

---

In previous lessons, we built complex Agents and stateful workflows using **LangGraph**. As these systems become more sophisticated, **debugging** and effectively **deploying** them become critically important skills. This lesson will focus on debugging techniques for LangGraph graphs, how to leverage **LangSmith** for visualization, and key considerations when deploying LangGraph applications.

## 1. Debugging Techniques for LangGraph Graphs

Debugging stateful and complex control flow graphs like those in LangGraph can be challenging. Here are some useful techniques:

* **Tracing State and Execution Flow:**
    * **`print` statements:** The simplest way is to add `print` statements at the beginning and end of each Node, as well as printing important values within the graph's **State**. This helps you track how data changes through each step.
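    * **`verbose=True` / `debug=True`:** `verbose=True` is an option of LangChain's `AgentExecutor`, where it prints the Agent's reasoning steps, actions, and tool results. For a LangGraph graph, the closest equivalent is to pass `debug=True` to `compile()` or to iterate over `app.stream(...)`, which reports the output of each Node as it runs and gives an overview of the execution flow.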
    * **Inspecting `state` at each Node:** Within each Node function, you can print the entire `state` object to see exactly what data that Node received and returned. This is especially useful when debugging state merging issues.

* **Using Python Debugging Tools:**
    * **`pdb` (Python Debugger):** You can insert `import pdb; pdb.set_trace()` anywhere in your Node code to pause execution and inspect variables, stepping through the code.
    * **IDE Debugger:** IDEs like VS Code or PyCharm have powerful integrated debuggers that allow you to set breakpoints, inspect variables, and step through the LangGraph graph.
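
The state-inspection technique above can be kept out of the Node bodies by wrapping each Node in a small logging decorator. The following is a minimal sketch, assuming a Node is a plain function that takes a state dict and returns a partial update; `trace_node` and `count_words` are hypothetical names used for illustration, not part of LangGraph.

```python
import functools
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("graph_debug")

State = Dict[str, Any]


def trace_node(func: Callable[[State], State]) -> Callable[[State], State]:
    """Log the state a node receives and the update it returns."""

    @functools.wraps(func)
    def wrapper(state: State) -> State:
        logger.debug("enter %s state=%r", func.__name__, state)
        update = func(state)
        logger.debug("exit %s update=%r", func.__name__, update)
        return update

    return wrapper


@trace_node
def count_words(state: State) -> State:
    # Hypothetical node: returns only the keys it wants to update.
    return {"word_count": len(state["text"].split())}


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(count_words({"text": "debugging stateful graphs"}))
```

Because the decorator uses the `logging` module rather than `print`, the output can be silenced in production by raising the log level instead of editing the Nodes.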




---

## 2. Using LangSmith for Visualization and Tracing LangGraph Flow

**LangSmith** is a platform developed by LangChain to help developers debug, test, evaluate, and monitor LLM applications. For LangGraph, LangSmith is particularly useful.

* **Understanding Agent Steps:**
    * LangSmith provides an intuitive user interface to view each execution step of your LangGraph graph. You can clearly see which Nodes were activated, the input/output data of each Node, and the edges that were followed.
    * It displays conversation history, LLM calls, tool calls, and intermediate steps (`agent_scratchpad`), making it easy to trace the Agent's reasoning process.
* **Performance and Error Analysis:**
    * You can view the execution time of each Node and the overall graph, helping to identify performance bottlenecks.
    * LangSmith logs errors and exceptions, allowing you to quickly pinpoint the root cause of issues.
    * You can compare different runs to observe changes in Agent behavior.
* **LangSmith Setup:**
    1.  **Install:** `pip install langsmith`
    2.  **Set Environment Variables:**
        ```bash
        export LANGCHAIN_TRACING_V2="true"
        export LANGCHAIN_API_KEY="<your-langsmith-api-key>"
        export LANGCHAIN_PROJECT="<your-project-name>" # Example: "LangGraph Debugging"
        ```
        Or in Python code:
        ```python
        import os
        os.environ["LANGCHAIN_TRACING_V2"] = "true"
        os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"
        os.environ["LANGCHAIN_PROJECT"] = "<your-project-name>"
        ```
    3.  **View Results:** After running your LangGraph application, visit the LangSmith dashboard (smith.langchain.com) and open your project to view the traces.
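
If `LANGCHAIN_TRACING_V2` is unset, tracing is simply off, so a typo in a variable name is easy to miss. A small pre-flight check can catch this before a long run. The sketch below is an illustration only: `missing_tracing_vars` is a hypothetical helper, not part of LangSmith, and it treats `LANGCHAIN_PROJECT` as required purely to match the setup in this lesson.

```python
import os
from typing import List, Mapping

# Variable names taken from the setup steps in this lesson.
REQUIRED_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")


def missing_tracing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of tracing variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_tracing_vars()
    if missing:
        print("Tracing is not fully configured; missing:", ", ".join(missing))
    else:
        print("Tracing variables are set for project", os.environ["LANGCHAIN_PROJECT"])
```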




---

## 3. Storing and Loading LangGraph Graphs for Reuse

Once you've built and tested a LangGraph graph, you might want to save it for reuse without recompiling it from scratch, or for sharing.

* **Serialization:**
    * A compiled LangGraph graph (the result of `compile()`) is an ordinary Python object, so in principle it can be serialized to a file with the `pickle` or `dill` library. In practice this can fail or be fragile, because the graph holds references to Node functions, LLM clients, and tools that may not be picklable.
    * **Note:** Serializing objects containing LLMs or tools with API connections can be complex. Ensure that API keys or other sensitive information are not stored in the serialized file. The more robust approach is to store the graph structure and parameters, then re-initialize the LLMs and tools upon loading.
    * **Example (conceptual):**
        ```python
        import pickle
        # Assuming 'app' is your compiled graph
        with open("my_langgraph_app.pkl", "wb") as f:
            pickle.dump(app, f)
        print("Graph saved to my_langgraph_app.pkl")
        ```
* **Deserialization:**
    * To load the graph, read the serialized file back. Only unpickle files you created or fully trust, because loading a pickle can execute arbitrary code.
    * **Example (conceptual):**
        ```python
        import pickle
        with open("my_langgraph_app.pkl", "rb") as f:
            loaded_app = pickle.load(f)
        print("Graph loaded from my_langgraph_app.pkl")
        # You can then use loaded_app.invoke(...)
        ```
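
As the note above suggests, an alternative to pickling the compiled object is to persist only plain parameters and rebuild the graph from them at load time. The sketch below assumes a hypothetical `GraphConfig` dataclass and `build_app` factory; the body of `build_app` is a placeholder marking where the real `StateGraph` construction and `compile()` call would go.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Any, Dict


@dataclass
class GraphConfig:
    """Plain, JSON-safe parameters; no API keys or live clients."""
    model: str = "gpt-3.5-turbo"
    temperature: float = 0.0
    max_retries: int = 2


def save_config(config: GraphConfig, path: Path) -> None:
    path.write_text(json.dumps(asdict(config), indent=2), encoding="utf-8")


def load_config(path: Path) -> GraphConfig:
    return GraphConfig(**json.loads(path.read_text(encoding="utf-8")))


def build_app(config: GraphConfig) -> Dict[str, Any]:
    # Placeholder: a real factory would create the LLM and tools here
    # (reading secrets from the environment) and return workflow.compile().
    return {"model": config.model, "max_retries": config.max_retries}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "graph_config.json"
        save_config(GraphConfig(max_retries=3), path)
        print(build_app(load_config(path)))
```

The saved file is human-readable, contains no secrets, and stays valid across library upgrades that would break a pickle.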


---

## 4. Considerations for Deploying LangGraph Applications

Deploying LangGraph applications has its own considerations, especially due to their stateful nature and complex reasoning capabilities.

* **Performance:**
    * **Latency:** Agentic loops can increase overall latency. Optimize prompts, choose faster LLMs, and reduce unnecessary iterative steps.
    * **Throughput:** The number of requests that can be processed per second. Use scalable LLM APIs, optimize Node code, and parallelize when possible.
* **Scalability:**
    * **Horizontal Scaling:** Deploy multiple instances of your LangGraph application.
    * **Distributed State Management:** If you use in-memory state, horizontal scaling will be difficult as each instance will have its own state. A distributed state backend is needed.
* **State Management:**
    * **In-memory State:** Simple for development and testing, but not suitable for multi-instance production environments.
    * **External State Backends:** For production applications, you need to store LangGraph's state in an external system:
        * **Redis:** Good for low-latency and ephemeral state.
        * **Firestore / DynamoDB:** NoSQL databases suitable for persistent and structured state.
        * **PostgreSQL / MongoDB:** Can also be used.
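    * LangGraph persists state through checkpointers, which are passed to `compile(checkpointer=...)`. `MemorySaver` covers the in-memory case; database-backed savers such as `SqliteSaver` and `PostgresSaver`, and a Redis-backed saver (e.g., `RedisSaver`), are available as separate integration packages.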
* **Monitoring:**
    * In addition to LangSmith, integrate application performance monitoring (APM) tools like Prometheus, Grafana, Datadog to track resource usage, errors, and latency.
* **Secrets Management:** Store API keys in your cloud provider's dedicated secrets management service (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault) rather than in code or configuration files.
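
To make the difference between in-memory and external state concrete, the sketch below persists per-conversation state in SQLite using only the standard library. It illustrates the idea and is not LangGraph's checkpointer API: `SqliteStateStore`, its methods, and the `thread_id` key are hypothetical names, and the state is assumed to be JSON-serializable. In a real deployment a LangGraph checkpointer would play this role.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


class SqliteStateStore:
    """Keep the latest state per conversation in a SQLite table."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS graph_state ("
            "thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self._conn.commit()

    def save(self, thread_id: str, state: Dict[str, Any]) -> None:
        # INSERT OR REPLACE overwrites the previous state for this thread.
        self._conn.execute(
            "INSERT OR REPLACE INTO graph_state (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self._conn.commit()

    def load(self, thread_id: str) -> Optional[Dict[str, Any]]:
        row = self._conn.execute(
            "SELECT state FROM graph_state WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


if __name__ == "__main__":
    store = SqliteStateStore()
    store.save("user-42", {"retry_count": 1, "chat_history": ["hi"]})
    print(store.load("user-42"))
```

Replacing the SQLite connection with a shared database such as PostgreSQL or Redis gives every application instance access to the same state, which is what horizontal scaling requires.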




---

## 5. Practical Example: Debugging a Complex Graph Using Learned Techniques and Understanding How LangSmith Helps

We will reuse the self-correcting Agent from Lesson 8.5 and add debugging techniques, while discussing how LangSmith visualizes this flow.

**Preparation:**
* Ensure you have the necessary libraries installed: `langchain`, `langchain-community`, `langchain-openai`, `google-search-results`, `numexpr`, `langgraph`, `langsmith`.
* Set the `OPENAI_API_KEY`, `SERPAPI_API_KEY`, `LANGCHAIN_API_KEY`, `LANGCHAIN_TRACING_V2="true"`, `LANGCHAIN_PROJECT="LangGraph Debugging Example"` environment variables.

In [None]:
# Install libraries if not already installed
# pip install langchain langchain-community langchain-openai openai google-search-results numexpr langgraph langsmith

import os
from typing import TypedDict, Annotated, List, Union, Dict, Any
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, ToolMessage
from langgraph.graph import StateGraph, END
import operator
import numexpr
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_community.utilities import SerpAPIWrapper
from langchain.tools import Tool
from langchain_core.agents import AgentFinish, AgentAction # To parse LLM output

# Set environment variables for OpenAI and SerpAPI keys
# os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
# os.environ["SERPAPI_API_KEY"] = "YOUR_SERPAPI_API_KEY"

# Set up LangSmith (setdefault keeps any value already exported in the shell)
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")
os.environ.setdefault("LANGCHAIN_PROJECT", "LangGraph Debugging Example") # Set your project name
# os.environ["LANGCHAIN_API_KEY"] = "YOUR_LANGSMITH_API_KEY" # Replace with your LangSmith API key

# --- 1. Define the Agent Graph's State Type ---
class AgentState(TypedDict):
    chat_history: Annotated[List[BaseMessage], operator.add]
    # AgentFinish is included because call_llm_node appends it when the LLM gives a final answer
    intermediate_steps: Annotated[List[Union[AgentAction, AgentFinish, ToolMessage]], operator.add]
    retry_count: int

# --- 2. Initialize LLM and Tools ---
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

search_tool = Tool(
    name="Google Search",
    func=SerpAPIWrapper().run,
    description="Useful when you need to search for information on Google about current events or factual data."
)

# Calculator tool built on numexpr. It stands in for the original `Calculator` import,
# which does not appear to be available in langchain_community.
def evaluate_expression(expression: str) -> str:
    # Empty dicts keep numexpr from resolving names against local or global variables
    return str(numexpr.evaluate(expression.strip(), global_dict={}, local_dict={}).item())

calculator_tool = Tool(
    name="Calculator",
    func=evaluate_expression,
    description="Useful for evaluating a numeric expression, for example '12.5 * 2'."
)
tools = [search_tool, calculator_tool]

# --- 3. Define the Agent Prompt ---
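# The system prompt spells out the "Action:" / "Action Input:" / "Final Answer:" markers
# that call_llm_node looks for when it parses the LLM response.
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. You have access to the following tools:\n{tools}\n"
               "Use them to answer user questions. To use a tool, reply with exactly:\n"
               "Thought: <your reasoning>\nAction: <tool name>\nAction Input: <tool input>\n"
               "When you have a final answer, reply with:\nFinal Answer: <your answer>\n"
               "If you encounter an error with a tool, try again or try a different approach."),
    MessagesPlaceholder(variable_name="chat_history"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])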

# --- 4. Define the Graph Nodes ---

def format_scratchpad(steps: List[Any]) -> List[BaseMessage]:
    # MessagesPlaceholder only accepts messages, so convert the recorded steps:
    # the LLM's own action text becomes an AIMessage, tool results become observations.
    messages: List[BaseMessage] = []
    for step in steps:
        if isinstance(step, AgentAction):
            messages.append(AIMessage(content=step.log))
        elif isinstance(step, ToolMessage):
            messages.append(HumanMessage(content=f"Observation: {step.content}"))
    return messages


def call_llm_node(state: AgentState) -> Dict[str, Any]:
    print("\n--- Node: Call LLM (Reasoning/Acting) ---")
    # Print the input state of the Node for debugging
    print(f"  LLM Input State: {state}")

    messages = agent_prompt.format_messages(
        # Render tools as "name: description" lines instead of passing the tool objects
        tools="\n".join(f"{t.name}: {t.description}" for t in tools),
        chat_history=state["chat_history"],
        agent_scratchpad=format_scratchpad(state["intermediate_steps"])
    )

    response = llm.invoke(messages)
    content = response.content

    if "Final Answer:" in content:
        final_answer_content = content.split("Final Answer:", 1)[1].strip()
        print(f"  LLM provides final answer: {final_answer_content}")
        return {"chat_history": [AIMessage(content=final_answer_content)], "intermediate_steps": [AgentFinish(return_values={"output": final_answer_content}, log=content)]}

    if "Action:" not in content:
        # No tool call and no "Final Answer:" marker: treat the whole reply as the answer
        # instead of building an AgentAction with an empty tool name.
        print(f"  LLM replies without an action: {content}")
        return {"chat_history": [AIMessage(content=content)], "intermediate_steps": [AgentFinish(return_values={"output": content}, log=content)]}

    # Expected format: "Action: <tool name>" followed by "Action Input: <tool input>"
    parts = content.split("Action:", 1)[1].split("Action Input:", 1)
    action_part = parts[0].strip()
    action_input_part = parts[1].strip() if len(parts) > 1 else ""

    action = AgentAction(tool=action_part, tool_input=action_input_part, log=content)
    print(f"  LLM decides to act: Tool='{action.tool}', Input='{action.tool_input}'")
    return {"intermediate_steps": [action]}


def call_tool_node(state: AgentState) -> Dict[str, Any]:
    print("--- Node: Call Tool (Execute tool) ---")
    # Print the input state of the Node for debugging
    print(f"  Tool Input State: {state}")

    last_action = state["intermediate_steps"][-1]
    
    tool_name = last_action.tool
    tool_input = last_action.tool_input

    selected_tool = next((t for t in tools if t.name == tool_name), None)
    if selected_tool:
        try:
            # Convert to str so slicing and ToolMessage work even if the tool returns a non-string
            tool_output = str(selected_tool.run(tool_input))
            print(f"  Tool '{tool_name}' returns: {tool_output[:100]}...")
            return {"intermediate_steps": [ToolMessage(content=tool_output, tool_call_id=last_action.tool)], "retry_count": 0}
        except Exception as e:
            error_message = f"Error executing Tool '{tool_name}' with input '{tool_input}': {e}"
            print(f"  {error_message}")
            return {"intermediate_steps": [ToolMessage(content=error_message, tool_call_id=last_action.tool)], "retry_count": state.get("retry_count", 0) + 1}
    else:
        error_message = f"Error: Tool with name '{tool_name}' not found. Please check tool name again."
        print(f"  {error_message}")
        return {"intermediate_steps": [ToolMessage(content=error_message, tool_call_id="unknown_tool")], "retry_count": state.get("retry_count", 0) + 1}

# --- 5. Define the Conditional Edge function ---
def should_continue(state: AgentState) -> str:
    print("--- Router: Should Continue (Decide flow) ---")
    # Print the input state for debugging
    print(f"  Should Continue Input State: {state}")

    last_step = state["intermediate_steps"][-1]
    retry_count = state.get("retry_count", 0)

    MAX_RETRIES = 2

    if isinstance(last_step, AgentFinish):
        print("--- Decision: END (AgentFinish) ---")
        return "end"
    elif retry_count >= MAX_RETRIES:
        # call_tool_node increments retry_count on each failed tool call and resets it on success.
        # This router runs after call_llm, so the limit is checked here, before the next tool call.
        print(f"--- Max retries reached ({retry_count}). END ---")
        return "end"
    elif isinstance(last_step, AgentAction):
        if retry_count > 0:
            print(f"--- Decision: SELF-CORRECT (retry attempt {retry_count}) ---")
        else:
            print("--- Decision: CONTINUE (AgentAction) ---")
        return "continue"
    elif isinstance(last_step, ToolMessage):
        # Not reached with the current edges (call_tool always routes to call_llm); kept as a safeguard
        print("--- Decision: CONTINUE (ToolMessage, back to LLM) ---")
        return "call_llm"
    else:
        print(f"--- Decision: ERROR/UNKNOWN (Unexpected type: {type(last_step)}) ---")
        return "end"

# --- 6. Build the Agent Graph ---
workflow = StateGraph(AgentState)

workflow.add_node("call_llm", call_llm_node)
workflow.add_node("call_tool", call_tool_node)

workflow.set_entry_point("call_llm")

workflow.add_conditional_edges(
    "call_llm",
    should_continue,
    {
        "continue": "call_tool",
        "end": END,
        "call_llm": "call_llm"
    }
)

workflow.add_edge("call_tool", "call_llm")

app = workflow.compile()

print("\n--- Starting LangGraph Debugging and Deployment Practical ---")

# --- Scenario 1: Question requires search and calculation (normal) ---
print("\n--- Scenario 1: Question requires search and calculation (normal) ---")
initial_state_1 = {"chat_history": [HumanMessage(content="What is the weather like today in London in Celsius? Then multiply the result by 2.")], "intermediate_steps": [], "retry_count": 0}
final_state_1 = app.invoke(initial_state_1)
print(f"\nFinal response:")
for message in final_state_1["chat_history"]:
    print(f"{message.type.capitalize()}: {message.content}")

# --- Scenario 2: LLM attempts to call a non-existent tool (self-correction) ---
print("\n--- Scenario 2: LLM attempts to call a non-existent tool (self-correction) ---")
# To simulate, we temporarily change the prompt to make the LLM intentionally try a wrong tool.
# call_llm_node looks up the module-level `agent_prompt` on every call, so reassigning it is enough;
# the compiled graph does not need to be rebuilt.
original_prompt = agent_prompt
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. You have access to the following tools: {tools}. Use them to answer user questions. "
               "To use a tool, reply with 'Action: <tool name>' and 'Action Input: <tool input>' on separate lines; "
               "when you have a final answer, reply with 'Final Answer: <your answer>'. "
               "Try to use the tool 'NonExistentTool' if you are unsure."),
    MessagesPlaceholder(variable_name="chat_history"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

initial_state_2 = {"chat_history": [HumanMessage(content="Tell me about LangChain.")], "intermediate_steps": [], "retry_count": 0}
final_state_2 = app.invoke(initial_state_2)
print(f"\nFinal response:")
for message in final_state_2["chat_history"]:
    print(f"{message.type.capitalize()}: {message.content}")

# Restore the original prompt; call_llm_node reads `agent_prompt` at call time, so no rebuild is needed
agent_prompt = original_prompt

print("\n--- End of LangGraph Debugging and Deployment Practical ---")
