# Understanding RunnableConfig in MLflow: A Complete Guide to Dynamic Agent Configuration

## Introduction

When building production-ready AI agents with LangChain and MLflow, one of the most powerful yet often misunderstood concepts is `RunnableConfig`. If you've ever wondered how to make your agent behave differently for various use cases without rewriting code, or how MLflow tracks and manages agent execution, `RunnableConfig` is the key.

In this article, we'll demystify `RunnableConfig` by exploring what it is, why it matters, and how to use it to create flexible, multi-purpose AI agents. We'll use a practical tool-calling agent example to demonstrate these concepts.

## What is RunnableConfig?

`RunnableConfig` is a configuration object in LangChain that carries runtime parameters through your agent's execution chain. Think of it as a "metadata backpack" that travels with your request, containing instructions on *how* to process the data, not *what* data to process.

### The Core Purpose

While your messages contain the *content* (what the user asked), `RunnableConfig` contains the *context* (how to process that request):

- **Model parameters**: Temperature, max tokens, etc.
- **Execution metadata**: User session, use case type
- **Runtime overrides**: Dynamic configuration changes
- **Tracing context**: For MLflow observability
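
Because `RunnableConfig` is a `TypedDict`, a config is an ordinary dictionary at runtime. The sketch below uses only plain dicts to show the two keys this article relies on, `configurable` and `metadata`. The helper name `make_config` and its parameters are hypothetical, chosen for illustration.

```python
from typing import Any


def make_config(use_case: str, user_id: str, **model_params: Any) -> dict[str, Any]:
    """Build a config-shaped dict (hypothetical helper, not a LangChain API).

    Model parameters go under "configurable"; descriptive context that
    describes the request goes under "metadata".
    """
    return {
        "configurable": dict(model_params),
        "metadata": {"use_case": use_case, "user_id": user_id},
    }


config = make_config("analytics", "user_123", temperature=0.1, max_tokens=500)
print(config["configurable"])  # the "how" of the request
print(config["metadata"])      # who and why, useful for tracing and logging
```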

## RunnableConfig in Action: A Simple Example

Let's start with a basic example to see `RunnableConfig` in action:

```python
from langchain_core.runnables import RunnableConfig
from databricks_langchain import ChatDatabricks

# Initialize your LLM
llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")

# Create a configuration
config = RunnableConfig(
    configurable={
        "temperature": 0.1,
        "max_tokens": 500
    },
    metadata={
        "use_case": "analytics",
        "user_id": "user_123"
    }
)

# Use the config when invoking
response = llm.invoke(
    "Explain machine learning in simple terms",
    config=config
)
```

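In this example, `config` travels with your request through the call. One caveat: LangChain applies `configurable` values only to fields that a runnable has declared configurable (for example via `configurable_fields`). Without that declaration, the `temperature` and `max_tokens` entries are carried along but do not change the model's behavior, while `metadata` is still passed to callbacks such as tracers.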
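
To make the role of declared fields concrete, here is a plain-Python model of the rule that only declared configurable fields accept overrides from `configurable`. It does not call LangChain; `resolve_params`, `DEFAULTS`, and `DECLARED` are hypothetical names for this sketch, not library APIs.

```python
from typing import Any

# Parameters the model was constructed with (illustrative values).
DEFAULTS = {"temperature": 0.7, "max_tokens": 1000}

# Fields that were declared configurable; only these accept overrides.
DECLARED = {"temperature"}


def resolve_params(config: dict[str, Any]) -> dict[str, Any]:
    """Return the parameters that would actually be used for one call."""
    overrides = config.get("configurable", {})
    params = dict(DEFAULTS)
    for key, value in overrides.items():
        if key in DECLARED:  # undeclared keys are ignored
            params[key] = value
    return params


config = {"configurable": {"temperature": 0.1, "max_tokens": 500}}
print(resolve_params(config))
```

In this model the `max_tokens` override is dropped because it was never declared. That is one reason the agent classes later in this article also construct the model with the desired parameters instead of relying on `configurable` alone.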

## Why MLflow Agents Need RunnableConfig

When building MLflow-compatible agents, `RunnableConfig` becomes essential. Let's understand why by comparing two implementations.

### Non-Compatible Version (Without RunnableConfig)

```python
def tool_calling_llm(state: State) -> State:
    """Simple function without config support"""
    current_state = state["messages"]
    return {"messages": [llm_with_tools.invoke(current_state)]}
```

**Limitations:**
- No runtime configuration
- No MLflow tracing context
- Can't dynamically adjust behavior
- Harder to reuse across deployments, because settings are baked into the code

### MLflow-Compatible Version (With RunnableConfig)

```python
def tool_calling_llm(state: ChatAgentState, config: RunnableConfig):
    """MLflow-compatible function with config support"""
    response = model_runnable.invoke(state, config)
    return {"messages": [response]}
```

**Benefits:**
- Supports runtime configuration
- Carries MLflow tracing context
- Enables dynamic behavior changes
- Easier to reuse across deployments, because settings travel with the request

## How RunnableConfig Flows Through Your Agent

Understanding the flow of `RunnableConfig` is crucial. Let's trace it through a complete agent execution.

### The Agent Architecture

```python
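from langchain_core.runnables import RunnableConfig, RunnableLambda
from langgraph.graph import END, START, StateGraph
from langgraph.prebuilt import tools_condition
from mlflow.langchain.chat_agent_langgraph import ChatAgentState, ChatAgentToolNode

def create_tool_calling_agent(model, tools):
    # Bind tools to the model
    llm_with_tools = model.bind_tools(tools=tools)

    # Create a preprocessing chain that feeds messages to the tool-bound model
    preprocessor = RunnableLambda(lambda state: state["messages"])
    model_runnable = preprocessor | llm_with_tools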
    
    # Define the LLM node with config support
    def tool_calling_llm(state: ChatAgentState, config: RunnableConfig):
        response = model_runnable.invoke(state, config)
        return {"messages": [response]}
    
    # Build the graph
    builder = StateGraph(ChatAgentState)
    builder.add_node("tool_calling_llm", RunnableLambda(tool_calling_llm))
    builder.add_node("tools", ChatAgentToolNode(tools=tools))
    builder.add_edge(START, "tool_calling_llm")
    builder.add_conditional_edges("tool_calling_llm", tools_condition, ["tools", END])
    builder.add_edge("tools", "tool_calling_llm")
    
    return builder.compile()
```

### The Config Flow Journey

When you invoke the agent, here's how `RunnableConfig` travels:

```
User Request + Config
        ↓
    agent.invoke(state, config)
        ↓
    tool_calling_llm(state, config)
        ↓
    model_runnable.invoke(state, config)
        ↓
    preprocessor.invoke(state, config) → extracts messages
        ↓
    model.invoke(messages, config) → uses config parameters
        ↓
    Response
```

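The beauty is that once `config` is handed to a chain, LangChain passes it to every step in that chain, which keeps behavior and tracing context consistent. Inside your own node functions you still need to forward it explicitly, as `tool_calling_llm` does.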
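
As a rough mental model of this propagation (not LangChain's actual implementation), the sketch below chains two plain functions and hands the same config object to each. `run_pipeline`, `extract_messages`, and `fake_model` are hypothetical stand-ins for the preprocessor and model steps.

```python
from typing import Any, Callable

Step = Callable[[Any, dict[str, Any]], Any]


def run_pipeline(steps: list[Step], value: Any, config: dict[str, Any]) -> Any:
    """Call each step in order, passing the same config to every one."""
    for step in steps:
        value = step(value, config)
    return value


def extract_messages(state: dict[str, Any], config: dict[str, Any]) -> list[str]:
    # Stand-in for the preprocessor: pull the messages out of the state.
    return state["messages"]


def fake_model(messages: list[str], config: dict[str, Any]) -> str:
    # Stand-in for the model: report which use case it was called for.
    use_case = config.get("metadata", {}).get("use_case", "general")
    return f"[{use_case}] replying to {len(messages)} message(s)"


config = {"metadata": {"use_case": "analytics"}}
state = {"messages": ["Explain machine learning"]}
result = run_pipeline([extract_messages, fake_model], state, config)
print(result)
```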

## Building a Multi-Purpose Agent with RunnableConfig

Now let's create a practical agent that can serve different use cases by leveraging `RunnableConfig`.

### Step 1: Define Use Case Configurations

```python
# Analytics: Precise, factual, detailed
analytics_config = {
    "endpoint_name": "databricks-meta-llama-3-3-70b-instruct",
    "temperature": 0.1,
    "max_tokens": 2000,
    "system_prompt": "You are a data analyst. Provide precise, fact-based analysis.",
    "use_case": "analytics"
}

# Storytelling: Creative, engaging, moderate length
storytelling_config = {
    "endpoint_name": "databricks-meta-llama-3-3-70b-instruct",
    "temperature": 0.8,
    "max_tokens": 1500,
    "system_prompt": "You are a creative storyteller. Craft engaging narratives.",
    "use_case": "storytelling"
}

# Technical Support: Helpful, concise, solution-focused
support_config = {
    "endpoint_name": "databricks-meta-llama-3-3-70b-instruct",
    "temperature": 0.3,
    "max_tokens": 800,
    "system_prompt": "You are a helpful support agent. Be empathetic and solution-focused.",
    "use_case": "support"
}
```
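
Since these dictionaries drive model construction, a small check before building the agent can catch typos early. The following is a minimal sketch; `validate_agent_config` and the specific bounds are assumptions for illustration, not constraints imposed by MLflow or Databricks.

```python
from typing import Any

REQUIRED_KEYS = {"endpoint_name", "temperature", "max_tokens", "system_prompt", "use_case"}


def validate_agent_config(config: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the config looks usable."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(config))]
    temperature = config.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        problems.append(f"temperature out of assumed range [0, 2]: {temperature}")
    max_tokens = config.get("max_tokens")
    if max_tokens is not None and max_tokens <= 0:
        problems.append(f"max_tokens must be positive: {max_tokens}")
    return problems


example = {
    "endpoint_name": "databricks-meta-llama-3-3-70b-instruct",
    "temperature": 0.1,
    "max_tokens": 2000,
    "system_prompt": "You are a data analyst. Provide precise, fact-based analysis.",
    "use_case": "analytics",
}
print(validate_agent_config(example))
print(validate_agent_config({"temperature": 3.5}))
```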

### Step 2: Create the Dynamic Agent

```python
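from typing import Any, Optional

from databricks_langchain import ChatDatabricks
from langchain_core.runnables import RunnableConfig
from mlflow.models import ModelConfig
from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse, ChatContext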

class DynamicDocsAgent(ChatAgent):
    def __init__(self, config, tools):
        self.base_config = config
        self.config = ModelConfig(development_config=config)
        self.tools = tools
        self.agent = self._build_agent_from_config()

    def _build_agent_from_config(self):
        llm = ChatDatabricks(
            endpoint=self.config.get("endpoint_name"),
            temperature=self.config.get("temperature"),
            max_tokens=self.config.get("max_tokens"),
        )
        agent = create_tool_calling_agent(llm, tools=self.tools)
        return agent

    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse:
        
        # Convert messages
        request = {"messages": self._convert_messages_to_dict(messages)}
        
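        # Create RunnableConfig from agent configuration. The model was already
        # built with these values in _build_agent_from_config; repeating them here
        # records them with the call and only changes model behavior if the
        # fields are declared configurable.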
        runtime_config = RunnableConfig(
            configurable={
                "temperature": self.config.get("temperature"),
                "max_tokens": self.config.get("max_tokens"),
            },
            metadata={
                "use_case": self.config.get("use_case"),
                "system_prompt": self.config.get("system_prompt"),
            }
        )
        
        # Invoke agent with config
        output = self.agent.invoke(request, config=runtime_config)
        
        return ChatAgentResponse(**output)
```

### Step 3: Use the Same Agent for Different Purposes

```python
from databricks_langchain import UCFunctionToolkit

# Initialize tools once
catalog = "agentic_ai"
schema = "databricks"
uc_tool_names = [f"{catalog}.{schema}.search_web"]
uc_toolkit = UCFunctionToolkit(function_names=uc_tool_names)
tools = [*uc_toolkit.tools]

# Create different agent instances for different use cases
analytics_agent = DynamicDocsAgent(analytics_config, tools)
storytelling_agent = DynamicDocsAgent(storytelling_config, tools)
support_agent = DynamicDocsAgent(support_config, tools)

# Same query, different behaviors
query = "Explain artificial intelligence"

# Analytics response - precise and detailed
analytics_response = analytics_agent.predict([
    {"role": "user", "content": f"{query} with technical accuracy"}
])
print("Analytics Response:")
print(analytics_response.messages[-1].content)

# Storytelling response - creative and engaging
storytelling_response = storytelling_agent.predict([
    {"role": "user", "content": f"Tell me a story about {query}"}
])
print("\nStorytelling Response:")
print(storytelling_response.messages[-1].content)

# Support response - helpful and concise
support_response = support_agent.predict([
    {"role": "user", "content": f"I need help understanding {query}"}
])
print("\nSupport Response:")
print(support_response.messages[-1].content)
```

## Dynamic Configuration Updates

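One of the most useful patterns is updating the configuration on an existing agent object instead of constructing a new one for each use case. In the implementation below, each update rebuilds the underlying model and graph, so the saving is in bookkeeping rather than in build cost. This is particularly handy when you want to switch between use cases or adjust behavior between requests.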

### Complete Configuration Update

You can completely replace the agent's configuration to switch between use cases:

```python
class ConfigurableDocsAgent(ChatAgent):
    def __init__(self, base_config, tools):
        self.base_config = base_config
        self.tools = tools
        self.agent = None
        self._build_agent_from_config()

    def _build_agent_from_config(self):
        """Build agent with current configuration"""
        llm = ChatDatabricks(
            endpoint=self.base_config.get("endpoint_name"),
            temperature=self.base_config.get("temperature", 0.7),
            max_tokens=self.base_config.get("max_tokens", 1000),
        )
        self.agent = create_tool_calling_agent(llm, tools=self.tools)

    def update_config(self, new_config):
        """
        Completely update configuration and rebuild agent
        Use this to switch between different use cases
        """
        self.base_config = new_config
        self._build_agent_from_config()
        print(f"✅ Agent updated to use case: {new_config.get('use_case', 'general')}")

    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse:
        
        request = {"messages": self._convert_messages_to_dict(messages)}
        
        runtime_config = RunnableConfig(
            configurable={
                "temperature": self.base_config.get("temperature"),
                "max_tokens": self.base_config.get("max_tokens"),
            },
            metadata={
                "use_case": self.base_config.get("use_case", "general"),
            }
        )
        
        output = self.agent.invoke(request, config=runtime_config)
        return ChatAgentResponse(**output)
```

### Using Complete Configuration Update

```python
# Initialize tools
tools = [*uc_toolkit.tools]

# Start with analytics configuration
agent = ConfigurableDocsAgent(analytics_config, tools)

# Use for analytics
response = agent.predict([
    {"role": "user", "content": "Analyze the AI market trends"}
])
print("Analytics mode:", response.messages[-1].content[:100])

# Switch to storytelling mode
agent.update_config(storytelling_config)

# Now the same agent behaves differently
response = agent.predict([
    {"role": "user", "content": "Tell me about AI market trends"}
])
print("Storytelling mode:", response.messages[-1].content[:100])

# Switch to support mode
agent.update_config(support_config)

# Again, different behavior
response = agent.predict([
    {"role": "user", "content": "I need help understanding AI market trends"}
])
print("Support mode:", response.messages[-1].content[:100])
```

### Partial Configuration Update

Sometimes you don't want to replace the entire configuration; you just want to tweak a few parameters. Partial updates allow you to modify specific settings while keeping others intact:

```python
class ConfigurableDocsAgent(ChatAgent):
    # ... (previous code remains the same)
    
    def partial_update_config(self, config_updates):
        """
        Partially update configuration without replacing everything
        Use this to fine-tune specific parameters
        """
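        # Merge updates into a new dict so the caller's original config
        # (for example, a shared analytics_config) is not mutated
        self.base_config = {**self.base_config, **config_updates}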
        self._build_agent_from_config()
        
        updated_keys = ", ".join(config_updates.keys())
        print(f"✅ Updated configuration: {updated_keys}")
```
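
One detail to watch when merging partial updates is whether the merge mutates the dictionary the caller passed in. If the agent holds a reference to a shared dict such as `analytics_config`, an in-place `dict.update` would change that shared dict for every later user. The sketch below shows a copying merge; `merge_config` is a hypothetical helper name.

```python
from typing import Any


def merge_config(base: dict[str, Any], updates: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with updates applied, leaving base untouched."""
    return {**base, **updates}


shared_config = {"temperature": 0.1, "max_tokens": 2000, "use_case": "analytics"}
tuned = merge_config(shared_config, {"temperature": 0.3})

print(tuned["temperature"])          # updated value in the new dict
print(shared_config["temperature"])  # the shared dict keeps its original value
```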

### Using Partial Configuration Update

```python
# Start with analytics configuration
agent = ConfigurableDocsAgent(analytics_config, tools)
print(f"Initial config - Temperature: {agent.base_config['temperature']}, "
      f"Max tokens: {agent.base_config['max_tokens']}")

# Test with initial config
response = agent.predict([
    {"role": "user", "content": "Explain quantum computing"}
])
print("Initial response length:", len(response.messages[-1].content))

# Partial update: Just increase creativity slightly
agent.partial_update_config({
    "temperature": 0.3
})
print(f"After partial update - Temperature: {agent.base_config['temperature']}, "
      f"Max tokens: {agent.base_config['max_tokens']}")  # max_tokens unchanged

response = agent.predict([
    {"role": "user", "content": "Explain quantum computing"}
])
print("Updated response length:", len(response.messages[-1].content))

# Another partial update: Reduce response length
agent.partial_update_config({
    "max_tokens": 500
})
print(f"After second update - Temperature: {agent.base_config['temperature']}, "
      f"Max tokens: {agent.base_config['max_tokens']}")

# Partial update multiple parameters at once
agent.partial_update_config({
    "temperature": 0.5,
    "max_tokens": 1200,
    "system_prompt": "You are a balanced assistant providing both analytical and creative insights."
})
print(f"After multi-param update - Temperature: {agent.base_config['temperature']}, "
      f"Max tokens: {agent.base_config['max_tokens']}")
```

### Configuration Update Comparison

Here's when to use each approach:

```python
# Scenario 1: Switching between completely different use cases
# Use: update_config()
agent = ConfigurableDocsAgent(analytics_config, tools)
agent.update_config(storytelling_config)  # Complete switch

# Scenario 2: Fine-tuning for a specific request
# Use: partial_update_config()
agent = ConfigurableDocsAgent(analytics_config, tools)
agent.partial_update_config({"temperature": 0.2})  # Just tweak temperature

# Scenario 3: A/B testing different parameters
# Use: partial_update_config()
agent = ConfigurableDocsAgent(analytics_config, tools)

# Test version A
agent.partial_update_config({"temperature": 0.1, "max_tokens": 1000})
response_a = agent.predict([{"role": "user", "content": "Test query"}])

# Test version B
agent.partial_update_config({"temperature": 0.3, "max_tokens": 1500})
response_b = agent.predict([{"role": "user", "content": "Test query"}])

# Compare results
print(f"Version A length: {len(response_a.messages[-1].content)}")
print(f"Version B length: {len(response_b.messages[-1].content)}")
```

### Real-World Example: Adaptive Agent

Here's a practical example of an agent that adapts its configuration based on the query complexity:

```python
class AdaptiveAgent(ConfigurableDocsAgent):
    def smart_predict(self, messages: list[dict]) -> ChatAgentResponse:
        """
        Automatically adjust configuration based on query characteristics
        """
        # Inspect the most recent message; callers here pass plain dicts
        query = messages[-1].get("content", "")
        
        # Analyze query complexity
        word_count = len(query.split())
        has_technical_terms = any(term in query.lower() 
                                  for term in ["algorithm", "architecture", "implementation"])
        
        # Adjust configuration based on query
        if word_count > 50 or has_technical_terms:
            # Complex query: more detailed response
            self.partial_update_config({
                "temperature": 0.2,
                "max_tokens": 2000
            })
            print("📊 Detected complex query - using detailed configuration")
        else:
            # Simple query: concise response
            self.partial_update_config({
                "temperature": 0.4,
                "max_tokens": 800
            })
            print("💬 Detected simple query - using concise configuration")
        
        # Make prediction with adjusted config
        return self.predict(messages)

# Usage
adaptive_agent = AdaptiveAgent(analytics_config, tools)

# Simple query - automatically uses concise config
response1 = adaptive_agent.smart_predict([
    {"role": "user", "content": "What is AI?"}
])

# Complex query - automatically uses detailed config
response2 = adaptive_agent.smart_predict([
    {"role": "user", "content": "Explain the transformer architecture and its implementation details in modern large language models"}
])
```
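
The decision logic can also be factored into a pure function, which makes it easy to test without calling a model. This is a sketch using the same heuristics as above (word count and a short keyword list); `choose_overrides` is a hypothetical name.

```python
from typing import Any

TECHNICAL_TERMS = ("algorithm", "architecture", "implementation")


def choose_overrides(query: str) -> dict[str, Any]:
    """Pick temperature and max_tokens overrides from simple query heuristics."""
    word_count = len(query.split())
    has_technical_terms = any(term in query.lower() for term in TECHNICAL_TERMS)
    if word_count > 50 or has_technical_terms:
        return {"temperature": 0.2, "max_tokens": 2000}  # detailed answer
    return {"temperature": 0.4, "max_tokens": 800}  # concise answer


simple = choose_overrides("What is AI?")
detailed = choose_overrides(
    "Explain the transformer architecture and its implementation details"
)
print(simple)
print(detailed)
```

The returned dict can then be passed straight to `partial_update_config`, keeping the heuristics separate from the agent class.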

### Benefits of Dynamic Configuration Updates

**1. Resource Efficiency**
```python
# Instead of creating multiple agent instances
analytics_agent = DynamicDocsAgent(analytics_config, tools)
storytelling_agent = DynamicDocsAgent(storytelling_config, tools)
support_agent = DynamicDocsAgent(support_config, tools)

# Use one agent with dynamic updates
agent = ConfigurableDocsAgent(analytics_config, tools)
agent.update_config(storytelling_config)  # Switch on demand
agent.update_config(support_config)       # Switch again
```

**2. Easy A/B Testing**
```python
agent = ConfigurableDocsAgent(analytics_config, tools)

# Test different temperature values
for temp in [0.1, 0.3, 0.5, 0.7]:
    agent.partial_update_config({"temperature": temp})
    response = agent.predict([{"role": "user", "content": "Test query"}])
    print(f"Temperature {temp}: {len(response.messages[-1].content)} chars")
```

**3. Real-time Optimization**
```python
agent = ConfigurableDocsAgent(analytics_config, tools)

# Start with default settings
response = agent.predict([{"role": "user", "content": "Explain machine learning"}])

# If response is too short, increase max_tokens
if len(response.messages[-1].content) < 500:
    agent.partial_update_config({"max_tokens": 1500})
    response = agent.predict([{"role": "user", "content": "Explain machine learning"}])
```

## How RunnableConfig Enables Dynamic Behavior

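The key insight is that **the same agent graph** can produce different outputs depending on the `RunnableConfig` it receives, provided the relevant model parameters are declared configurable or your nodes read the config explicitly. Let's see how: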

### Temperature Impact

```python
# Low temperature (0.1) for analytics
config_precise = RunnableConfig(configurable={"temperature": 0.1})
# Output: "Artificial Intelligence (AI) refers to systems designed to 
#          perform tasks that typically require human intelligence..."

# High temperature (0.8) for storytelling  
config_creative = RunnableConfig(configurable={"temperature": 0.8})
# Output: "Imagine a world where machines dream, think, and learn just 
#          like we do! This magical realm is called Artificial Intelligence..."
```

### Max Tokens Impact

```python
# Longer responses for detailed analysis
config_detailed = RunnableConfig(configurable={"max_tokens": 2000})

# Concise responses for quick support
config_concise = RunnableConfig(configurable={"max_tokens": 500})
```

### Metadata for Business Logic

```python
config_with_metadata = RunnableConfig(
    configurable={"temperature": 0.5},
    metadata={
        "use_case": "customer_support",
        "priority": "high",
        "language": "en"
    }
)

# Later in your code, you can access this metadata
# to make decisions about post-processing, logging, etc.
```
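
For instance, a post-processing step might branch on the metadata. The sketch below treats the config as a plain dict, which is what a `RunnableConfig` is at runtime; `route_response` and its routing rules are hypothetical.

```python
from typing import Any


def route_response(text: str, config: dict[str, Any]) -> dict[str, Any]:
    """Attach handling hints to a response based on config metadata."""
    metadata = config.get("metadata", {})
    return {
        "text": text,
        "queue": "escalation" if metadata.get("priority") == "high" else "standard",
        "language": metadata.get("language", "en"),
    }


config = {
    "configurable": {"temperature": 0.5},
    "metadata": {"use_case": "customer_support", "priority": "high", "language": "en"},
}
routed = route_response("Here is how to reset your password.", config)
print(routed["queue"])
```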

## RunnableConfig and MLflow Tracing

One of the most useful aspects of `RunnableConfig` is how it works with MLflow tracing. The config carries the callbacks that tracing relies on, and the metadata you attach can be recorded alongside each trace, which helps with debugging and analysis.

### Tracing Configuration in Action

```python
import mlflow

# Enable tracing
mlflow.langchain.autolog()
mlflow.set_experiment("Agent_Configuration_Demo")

# Execute with different configs
with mlflow.start_run(run_name="analytics_execution"):
    config = RunnableConfig(
        configurable={"temperature": 0.1, "max_tokens": 2000},
        metadata={"use_case": "analytics"}
    )
    result = agent.invoke({"messages": [...]}, config=config)

# The resulting trace can be inspected for:
# - Any metadata recorded with the call (for example, use_case)
# - Inputs and outputs of each step
# - Latency of each step
```

### What Gets Traced

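The exact attributes depend on your MLflow version and model provider, so treat the following as an illustration of the kind of information a trace can carry, not as a fixed schema:

```python
# Illustrative trace attributes (key names are not guaranteed):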
{
    "mlflow.chat.temperature": 0.1,
    "mlflow.chat.max_tokens": 2000,
    "use_case": "analytics",
    "execution_time_ms": 2450,
    "total_tokens": 1235
}
```

This allows you to:
- Compare performance across different configurations
- Identify which settings work best for each use case
- Debug configuration-related issues
- Optimize costs based on token usage patterns
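
Once traces carry a `use_case` tag, comparing configurations is mostly an aggregation problem. The sketch below groups a few hand-written records by use case and averages their token counts. The records are made-up placeholders, not measurements, and `summarize_by_use_case` is a hypothetical helper; in practice the rows would come from your own MLflow traces.

```python
from collections import defaultdict
from statistics import mean
from typing import Any


def summarize_by_use_case(records: list[dict[str, Any]]) -> dict[str, float]:
    """Average total_tokens per use_case across trace-like records."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for record in records:
        grouped[record["use_case"]].append(record["total_tokens"])
    return {use_case: mean(tokens) for use_case, tokens in grouped.items()}


# Placeholder records for illustration only.
records = [
    {"use_case": "analytics", "total_tokens": 1200},
    {"use_case": "analytics", "total_tokens": 1400},
    {"use_case": "support", "total_tokens": 500},
]
summary = summarize_by_use_case(records)
print(summary)
```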

## Best Practices for Using RunnableConfig

### 1. Separate Configuration from Logic

```python
# Good: Configuration is external
config = {
    "temperature": 0.1,
    "max_tokens": 1000
}
agent = DynamicDocsAgent(config, tools)

# Bad: Hardcoded values
def agent_function(state):
    llm = ChatDatabricks(temperature=0.1)  # Hardcoded!
```

### 2. Use Meaningful Metadata

```python
# Good: Descriptive metadata
config = RunnableConfig(
    metadata={
        "use_case": "customer_analytics",
        "user_segment": "enterprise",
        "session_id": "session_123"
    }
)

# Bad: Unclear metadata
config = RunnableConfig(
    metadata={"type": "a", "id": "123"}
)
```

### 3. Keep Configurations Organized

```python
# Create a configuration registry
AGENT_CONFIGS = {
    "analytics": analytics_config,
    "storytelling": storytelling_config,
    "support": support_config
}

def get_agent_for_use_case(use_case: str, tools: list):
    config = AGENT_CONFIGS.get(use_case)
    if not config:
        raise ValueError(f"Unknown use case: {use_case}")
    return DynamicDocsAgent(config, tools)
```

### 4. Choose the Right Update Method

```python
# Use update_config() for complete use case switches
agent.update_config(storytelling_config)  # Completely different behavior

# Use partial_update_config() for parameter tweaking
agent.partial_update_config({"temperature": 0.3})  # Fine-tuning
```

### 5. Document Configuration Changes

```python
class DocumentedAgent(ConfigurableDocsAgent):
    def __init__(self, base_config, tools):
        super().__init__(base_config, tools)
        self.config_history = [base_config.copy()]
    
    def partial_update_config(self, config_updates):
        """Update config with history tracking"""
        super().partial_update_config(config_updates)
        self.config_history.append(self.base_config.copy())
        
    def get_config_history(self):
        """View all configuration changes"""
        for i, config in enumerate(self.config_history):
            print(f"Config {i}: temp={config.get('temperature')}, "
                  f"max_tokens={config.get('max_tokens')}")
```

## Common Pitfalls to Avoid

### Pitfall 1: Forgetting to Pass Config

```python
# Wrong: Config not passed through
def tool_calling_llm(state: ChatAgentState, config: RunnableConfig):
    # Config is received but not used!
    response = model_runnable.invoke(state)  # Missing config!
    return {"messages": [response]}

# Correct: Config passed through
def tool_calling_llm(state: ChatAgentState, config: RunnableConfig):
    response = model_runnable.invoke(state, config)  # Config passed!
    return {"messages": [response]}
```

### Pitfall 2: Mixing State and Config

```python
# Wrong: Putting configuration in state
state = {
    "messages": [...],
    "temperature": 0.1  # Don't do this!
}

# Correct: Configuration in RunnableConfig
config = RunnableConfig(configurable={"temperature": 0.1})
state = {"messages": [...]}
```

### Pitfall 3: Ignoring Config in Manual Tracing

```python
# Wrong: Manual span without config context
with mlflow.start_span(name="llm_call") as span:
    response = model.invoke(messages)  # No config!

# Correct: Include config information
with mlflow.start_span(name="llm_call") as span:
    # RunnableConfig is a TypedDict, so read it with dict access
    configurable = config.get("configurable", {})
    span.set_attributes({
        "temperature": configurable.get("temperature"),
        "max_tokens": configurable.get("max_tokens")
    })
    response = model.invoke(messages, config=config)
```

### Pitfall 4: Updating Config Too Frequently

```python
# Wrong: Unnecessary rebuilds
for message in messages:
    agent.partial_update_config({"temperature": 0.1})  # Rebuilds agent every time!
    agent.predict([message])  # predict expects a list of messages

# Correct: Update once, use multiple times
agent.partial_update_config({"temperature": 0.1})
for message in messages:
    agent.predict([message])
```
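
A related safeguard is to skip the rebuild entirely when an update would not change anything. A minimal sketch, with `needs_rebuild` as a hypothetical helper:

```python
from typing import Any


def needs_rebuild(current: dict[str, Any], updates: dict[str, Any]) -> bool:
    """Return True only if an update adds a key or changes an existing value."""
    return any(
        key not in current or current[key] != value
        for key, value in updates.items()
    )


current = {"temperature": 0.1, "max_tokens": 2000}
print(needs_rebuild(current, {"temperature": 0.1}))  # same value, no rebuild needed
print(needs_rebuild(current, {"temperature": 0.3}))  # changed value, rebuild
```

An update method could call this check first and return early when it yields `False`.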

## Conclusion

`RunnableConfig` is the bridge between flexible configuration and production-ready AI agents. By understanding how it flows through your agent's execution chain and how to update it dynamically, you can:

1. **Build once, configure many times**: Create a single agent that serves multiple use cases
2. **Enable dynamic behavior**: Adjust agent behavior at runtime without code changes
3. **Support MLflow observability**: Automatically track how configuration affects performance
4. **Simplify deployment**: Deploy one agent with different configurations per environment
5. **Optimize on the fly**: Use configuration updates for A/B testing and real-time optimization
6. **Efficient resource usage**: Switch between use cases without creating multiple agent instances

The key takeaway is that `RunnableConfig` separates *what* your agent does (the logic) from *how* it does it (the configuration), making your agents more maintainable, flexible, and observable. Combined with dynamic configuration updates, you have a powerful system that can adapt to changing requirements without code modifications.

As you build production AI agents with LangChain and MLflow, embrace `RunnableConfig` and its dynamic update capabilities as your tools for creating adaptable, well-instrumented systems that can evolve with your needs.

---

## Quick Reference

```python
# Basic RunnableConfig structure
config = RunnableConfig(
    configurable={
        # Model parameters
        "temperature": 0.1,
        "max_tokens": 1000,
    },
    metadata={
        # Business context
        "use_case": "analytics",
        "user_id": "user_123",
    }
)

# Using config with your agent
output = agent.invoke(request, config=config)

# Config flows automatically through chains
model_runnable = preprocessor | model
response = model_runnable.invoke(state, config)  # Config propagates!

# Complete configuration update
agent.update_config(new_config)  # Replace entire config

# Partial configuration update
agent.partial_update_config({
    "temperature": 0.3,
    "max_tokens": 1200
})  # Update specific parameters
```

Happy building! 🚀