# MLflow ResponsesAgent with LangGraph Integration

This notebook demonstrates how to wrap a LangGraph agent in MLflow's `ResponsesAgent` so it can be logged, traced, and deployed through MLflow.

## Table of Contents
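1. Why Use ResponsesAgent with LangGraph?
2. Building a Basic LangGraph Agent
3. Wrapping LangGraph with ResponsesAgent
4. Testing the Wrapped Agent
5. Creating Utility Functions
6. Complete Model-from-Code File
7. Logging the Agent to MLflow
8. Loading and Testing the Logged Model
9. Viewing Traces in MLflow UI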

## Setup

In [None]:
import os
from dotenv import load_dotenv
import mlflow
from typing import TypedDict, Annotated, Generator

# LangGraph imports
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.graph.state import CompiledStateGraph

# LangChain imports
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

# MLflow ResponseAgent imports
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)

# Load environment variables
load_dotenv()

# Verify OpenAI API key
assert "OPENAI_API_KEY" in os.environ, "Please set OPENAI_API_KEY in .env file"

# Set MLflow experiment
mlflow.set_experiment("LangGraph_ResponseAgent")

print("✅ Setup complete!")

## 1. Why Use ResponsesAgent with LangGraph?

### LangGraph Challenges:
- **Complex Deployment**: Compiled LangGraph graphs are not easily serializable
- **No Standard API**: Each graph defines its own input/output format
- **Observability**: Multi-step agent execution is difficult to trace
- **Version Control**: Graph changes are hard to track and version

### ResponsesAgent Solutions:
✅ **Standard Interface**: Converts LangGraph I/O to an OpenAI Responses API-compatible format

✅ **Easy Deployment**: Log the agent once and serve it through MLflow's standard model tooling

✅ **Built-in Tracing**: MLflow Tracing records each agent step

✅ **Version Management**: Each logged agent is tracked as an MLflow run

✅ **Framework Independence**: Swap the underlying framework without changing the deployment interface

## 2. Building a Basic LangGraph Agent

First, let's create a simple LangGraph chatbot agent.

In [None]:
# Define the state for our graph
class State(TypedDict):
    messages: Annotated[list, add_messages]


def create_simple_chatbot() -> CompiledStateGraph:
    """
    Create a simple LangGraph chatbot that uses gpt-4o-mini.
    
    This is the basic LangGraph pattern:
    1. Define state
    2. Create nodes (functions that process state)
    3. Build graph with edges
    4. Compile graph
    """
    # Initialize the LLM
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
    
    # Define the chatbot node
    def chatbot(state: State):
        """Process messages and return LLM response."""
        return {"messages": [llm.invoke(state["messages"])]}
    
    # Build the graph
    graph_builder = StateGraph(State)
    
    # Add nodes
    graph_builder.add_node("chatbot", chatbot)
    
    # Add edges
    graph_builder.add_edge(START, "chatbot")
    graph_builder.add_edge("chatbot", END)
    
    # Compile and return
    return graph_builder.compile()


# Test the basic LangGraph agent
print("Testing basic LangGraph agent...\n")
graph = create_simple_chatbot()

# Test invocation
response = graph.invoke({
    "messages": [HumanMessage(content="What is LangGraph?")]
})

print(f"LangGraph Response: {response['messages'][-1].content}")
print("\n✅ Basic LangGraph agent working!")
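
The `Annotated[list, add_messages]` annotation in `State` is what makes each node's returned `messages` get merged into the existing history instead of overwriting it. The sketch below illustrates that idea with plain dictionaries. `append_messages` is a hypothetical, simplified stand-in written for this notebook, not LangGraph's implementation; the real `add_messages` also coerces inputs to message objects and assigns missing IDs.

```python
from typing import Any

Message = dict[str, Any]


def append_messages(existing: list[Message], update: list[Message]) -> list[Message]:
    """Simplified reducer: replace a message with a matching id, otherwise append."""
    merged = list(existing)
    # Remember where each id lives so updates can replace in place.
    index_by_id = {
        m["id"]: i for i, m in enumerate(merged) if m.get("id") is not None
    }
    for msg in update:
        msg_id = msg.get("id")
        if msg_id is not None and msg_id in index_by_id:
            merged[index_by_id[msg_id]] = msg
        else:
            if msg_id is not None:
                index_by_id[msg_id] = len(merged)
            merged.append(msg)
    return merged


# The state before the node runs (illustrative data only).
state = {"messages": [{"id": "1", "role": "user", "content": "What is LangGraph?"}]}

# What a node such as `chatbot` returns: only the new message.
node_output = {
    "messages": [{"id": "2", "role": "assistant", "content": "A graph-based agent framework."}]
}

# The reducer combines the old history with the node's update.
state["messages"] = append_messages(state["messages"], node_output["messages"])
print([m["role"] for m in state["messages"]])
```

This is why the `chatbot` node can return a one-element list and the final `response['messages']` still contains the full conversation.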

## 3. Wrapping LangGraph with ResponsesAgent

Now let's wrap our LangGraph agent in a `ResponsesAgent` so MLflow can log, trace, and serve it.

In [None]:
from mlflow.entities.span import SpanType


class LangGraphResponsesAgent(ResponsesAgent):
    """
    Wrapper for LangGraph agents using the ResponsesAgent interface.
    
    This adapter:
    1. Converts ResponsesAgentRequest to LangGraph format
    2. Invokes the LangGraph agent
    3. Converts LangGraph output to ResponsesAgentResponse
    """
    
    def __init__(self, agent: CompiledStateGraph):
        """Initialize with a compiled LangGraph agent."""
        self.agent = agent
    
    @mlflow.trace(span_type=SpanType.AGENT)
    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        """
        Main prediction method - non-streaming version.
        
        Collects all stream events and returns final response.
        """
        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        
        return ResponsesAgentResponse(
            output=outputs,
            custom_outputs=request.custom_inputs  # Echo custom_inputs back to the caller
        )
    
    @mlflow.trace(span_type=SpanType.AGENT)
    def predict_stream(
        self,
        request: ResponsesAgentRequest,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        """
        Streaming prediction method.
        
        This is where the magic happens:
        1. Convert ResponsesAgent input to ChatCompletions format
        2. Stream through LangGraph
        3. Convert each update to ResponsesAgent stream events
        """
        # Convert input format
        cc_msgs = to_chat_completions_input(
            [i.model_dump() for i in request.input]
        )
        
        # Stream through LangGraph
        for _, events in self.agent.stream(
            {"messages": cc_msgs}, 
            stream_mode=["updates"]
        ):
            # Convert each node's output to ResponsesAgent format
            for node_data in events.values():
                yield from output_to_responses_items_stream(
                    node_data["messages"]
                )


print("✅ LangGraphResponsesAgent class defined!")
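
The `predict` method above builds its response by draining `predict_stream` and keeping only the events whose type is `"response.output_item.done"`. The following sketch isolates that collection pattern. `FakeStreamEvent` and `fake_predict_stream` are hypothetical stand-ins invented for illustration, not MLflow types, and the item payloads are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass
class FakeStreamEvent:
    """Hypothetical stand-in for a stream event; not an MLflow class."""

    type: str
    item: Any = None


def fake_predict_stream() -> Iterator[FakeStreamEvent]:
    """Yield a mix of partial and completed events, as a streaming agent might."""
    yield FakeStreamEvent(type="response.output_text.delta")
    yield FakeStreamEvent(type="response.output_text.delta")
    yield FakeStreamEvent(
        type="response.output_item.done",
        item={"type": "message", "text": "Hello from the graph"},
    )


def collect_done_items(events: Iterable[FakeStreamEvent]) -> list[Any]:
    """Keep only completed output items, mirroring the filter used in predict()."""
    return [e.item for e in events if e.type == "response.output_item.done"]


items = collect_done_items(fake_predict_stream())
print(items)
```

Implementing `predict` on top of `predict_stream` keeps a single code path for both modes, so streaming and non-streaming callers see the same output items.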

## 4. Testing the Wrapped Agent

In [None]:
# Create and wrap the agent
langgraph_agent = create_simple_chatbot()
wrapped_agent = LangGraphResponsesAgent(langgraph_agent)

# Test with ResponsesAgent format
test_request = {
    "input": [
        {"role": "user", "content": "Explain MLflow in one sentence."}
    ],
    # custom_inputs is echoed back as custom_outputs by predict()
    "custom_inputs": {"user_id": "test_user", "session_id": "session_456"},
}

# Get response
response = wrapped_agent.predict(test_request)

print("Response from wrapped LangGraph agent:")
print(f"Output: {response.output[0]}")
print(f"\nCustom outputs: {response.custom_outputs}")

## 5. Creating Utility Functions

Let's create helper functions for converting LangGraph messages into the role/content dictionaries used with `ResponsesAgent`.

In [None]:
%%writefile langgraph_utils.py
"""Utility functions for LangGraph + MLflow ResponseAgent integration."""

from typing import Union

from langchain_core.messages import BaseMessage


def _langgraph_message_to_mlflow_message(
    langgraph_message: BaseMessage,
) -> dict:
    """
    Convert a LangGraph (LangChain) message object to MLflow format.
    
    Maps:
    - human -> user
    - ai -> assistant
    - system -> system
    """
    langgraph_type_to_mlflow_role = {
        "human": "user",
        "ai": "assistant",
        "system": "system",
    }
    
    if type_clean := langgraph_type_to_mlflow_role.get(langgraph_message.type):
        return {"role": type_clean, "content": langgraph_message.content}
    else:
        raise ValueError(f"Unsupported message type: {langgraph_message.type}")


def get_most_recent_message(response: dict) -> str:
    """
    Extract the most recent message content from a LangGraph response.
    
    `response` is the state dictionary returned by `graph.invoke(...)`.
    """
    most_recent_message = response["messages"][-1]
    return _langgraph_message_to_mlflow_message(most_recent_message)["content"]


def increment_message_history(
    response: dict,
    new_message: Union[dict, BaseMessage]
) -> list[dict]:
    """
    Add a new message to the conversation history.
    
    Useful for maintaining context across multiple turns.
    """
    # Message objects are converted; plain dicts are assumed to be in MLflow format already
    if isinstance(new_message, BaseMessage):
        new_message = _langgraph_message_to_mlflow_message(new_message)
    
    message_history = [
        _langgraph_message_to_mlflow_message(message)
        for message in response["messages"]
    ]
    
    return message_history + [new_message]

## 6. Complete Model-from-Code File

Now let's create a complete, deployable agent file.

In [None]:
%%writefile langgraph_agent.py
"""Complete LangGraph agent wrapped in ResponseAgent for MLflow."""

from typing import TypedDict, Annotated, Generator

import mlflow
from mlflow.entities.span import SpanType
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import (
    ResponsesAgentRequest,
    ResponsesAgentResponse,
    ResponsesAgentStreamEvent,
    output_to_responses_items_stream,
    to_chat_completions_input,
)

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.graph.state import CompiledStateGraph
from langchain_openai import ChatOpenAI


# Define state
class State(TypedDict):
    messages: Annotated[list, add_messages]


def create_chatbot_graph() -> CompiledStateGraph:
    """Create and compile the LangGraph chatbot."""
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
    
    def chatbot(state: State):
        return {"messages": [llm.invoke(state["messages"])]}
    
    graph_builder = StateGraph(State)
    graph_builder.add_node("chatbot", chatbot)
    graph_builder.add_edge(START, "chatbot")
    graph_builder.add_edge("chatbot", END)
    
    return graph_builder.compile()


class LangGraphResponsesAgent(ResponsesAgent):
    """ResponsesAgent wrapper for LangGraph."""
    
    def __init__(self, agent: CompiledStateGraph):
        self.agent = agent
    
    @mlflow.trace(span_type=SpanType.AGENT)
    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        outputs = [
            event.item
            for event in self.predict_stream(request)
            if event.type == "response.output_item.done"
        ]
        return ResponsesAgentResponse(
            output=outputs,
            custom_outputs=request.custom_inputs
        )
    
    @mlflow.trace(span_type=SpanType.AGENT)
    def predict_stream(
        self,
        request: ResponsesAgentRequest,
    ) -> Generator[ResponsesAgentStreamEvent, None, None]:
        cc_msgs = to_chat_completions_input(
            [i.model_dump() for i in request.input]
        )
        
        for _, events in self.agent.stream(
            {"messages": cc_msgs},
            stream_mode=["updates"]
        ):
            for node_data in events.values():
                yield from output_to_responses_items_stream(
                    node_data["messages"]
                )


# Enable auto-tracing for LangChain
mlflow.langchain.autolog()

# Create and set the model
graph = create_chatbot_graph()
agent = LangGraphResponsesAgent(graph)
mlflow.models.set_model(agent)

## 7. Logging the Agent to MLflow

In [None]:
# Enable tracing
mlflow.langchain.autolog()

# Log the model
with mlflow.start_run(run_name="langgraph_chatbot_agent") as run:
    model_info = mlflow.pyfunc.log_model(
        python_model="langgraph_agent.py",
        artifact_path="agent",
        # Dependencies are auto-inferred, but you can specify:
        pip_requirements=[
            "mlflow",
            "pydantic>=2.0.0",
            "langgraph>=0.2.27",
            "langchain>=0.3.0",
            "langchain-openai>=0.2.0",
            "openai",
        ],
    )
    
    print("✅ Model logged successfully!")
    print(f"Run ID: {run.info.run_id}")
    print(f"Model URI: {model_info.model_uri}")
    print(f"\nTracking URI: {mlflow.get_tracking_uri()}")

## 8. Loading and Testing the Logged Model

In [None]:
# Load the model
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

# Test conversation
print("Testing multi-turn conversation...\n")

# Turn 1
response1 = loaded_model.predict({
    "input": [{"role": "user", "content": "What is machine learning?"}],
})
print("Turn 1:")
print("User: What is machine learning?")
print(f"Agent: {response1['output'][0]['content'][0]['text'][:100]}...")

# Turn 2 - with history
response2 = loaded_model.predict({
    "input": [
        {"role": "user", "content": "What is machine learning?"},
        {"role": "assistant", "content": response1['output'][0]['content'][0]['text']},
        {"role": "user", "content": "Can you give me an example?"},
    ],
})
print("\nTurn 2:")
print("User: Can you give me an example?")
print(f"Agent: {response2['output'][0]['content'][0]['text'][:100]}...")

print("\n✅ Multi-turn conversation working!")
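
The second turn above rebuilds the full history by hand. A small helper can make that less error-prone. The sketch below assumes the response is a dictionary shaped like the ones indexed above (`output[0].content[0].text`); `extract_text` and `next_turn_input` are hypothetical helper names introduced here, and `fake_response` is hand-written sample data, not real model output.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Return the text of the first output item, using the same indexing as above."""
    return response["output"][0]["content"][0]["text"]


def next_turn_input(
    history: list[dict[str, str]],
    response: dict[str, Any],
    user_message: str,
) -> list[dict[str, str]]:
    """Append the assistant's reply and the next user message to the history."""
    return history + [
        {"role": "assistant", "content": extract_text(response)},
        {"role": "user", "content": user_message},
    ]


# Hand-written sample shaped like a loaded model's response (illustrative only).
fake_response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Machine learning is ..."}],
        }
    ]
}

history = [{"role": "user", "content": "What is machine learning?"}]
turn2_input = next_turn_input(history, fake_response, "Can you give me an example?")
print([m["role"] for m in turn2_input])
```

The resulting list could then be passed as `{"input": turn2_input}` to `loaded_model.predict`. The helper returns a new list and leaves `history` unchanged, so earlier turns can be reused safely.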

## 9. Viewing Traces in MLflow UI

To view detailed traces:

1. Start MLflow UI (if not running):
   ```bash
   mlflow ui
   ```

2. Navigate to http://localhost:5000

3. Click on your experiment: "LangGraph_ResponseAgent"

4. Click the "Traces" tab

5. View detailed execution traces including:
   - Input/output for each node
   - Token usage
   - Execution time
   - Error tracking

## Summary

### What We Accomplished:

1. ✅ Built a basic LangGraph chatbot
2. ✅ Wrapped it with ResponsesAgent for a standard interface
3. ✅ Logged the agent using Models-from-Code
4. ✅ Loaded and tested the logged model
5. ✅ Enabled tracing with MLflow

### Key Benefits:

- **Standardization**: OpenAI Responses API-compatible interface
- **Deployment**: Easy serving with MLflow
- **Observability**: Full execution traces
- **Flexibility**: Switch frameworks without changing deployment

### Next Steps:
- Add tool calling to your LangGraph agent
- Implement more complex multi-agent workflows
- Deploy to production endpoints
- Explore streaming responses