
# MLflow Tracing in the Databricks Agent: Complete Deep Dive

Let me explain how MLflow tracing is integrated and used throughout this Databricks agent code, as it's crucial for observability and debugging in production AI systems.

## MLflow Trace Integration Points

In the code, the one explicit MLflow tracing call is the decorator on the retrieval tool:

```python
@tool
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5) -> list[dict[str, Any]]:
    """gets relevant documents for the query"""
    # ... function implementation
```

---

## 1. The `@mlflow.trace` Decorator Explained

### Basic Syntax Breakdown

```python
@mlflow.trace(
    name="LittleIndex",                           # Custom span name
    span_type=mlflow.entities.SpanType.RETRIEVER  # Semantic span type
)
def find_relevant_documents(...):
```

### What Each Parameter Does

**`name="LittleIndex"`**
- **Purpose**: Custom identifier for this operation in trace logs
- **Visibility**: Shows up in MLflow UI as "LittleIndex" instead of function name
- **Naming Convention**: Descriptive name that indicates what this component does
- **Alternative**: If omitted, would default to function name `find_relevant_documents`

**`span_type=mlflow.entities.SpanType.RETRIEVER`**
- **Purpose**: Categorizes this operation semantically for better observability
- **Built-in Types**: MLflow provides predefined span types:
  ```python
  mlflow.entities.SpanType.RETRIEVER    # For document/data retrieval
  mlflow.entities.SpanType.LLM          # For language model calls
  mlflow.entities.SpanType.CHAIN        # For workflow chains
  mlflow.entities.SpanType.TOOL         # For tool executions
  mlflow.entities.SpanType.AGENT        # For agent operations
  ```
- **Benefits**: Enables filtering and analysis by operation type in MLflow UI
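
To make the two parameters concrete, here is a minimal sketch of a tracing decorator written from scratch. It is not MLflow's implementation: the `trace` function, the `SPANS` list, and the stub retrieval function are all illustrative stand-ins. It only shows the behavior described above, namely that the span name falls back to the function name and that the span type is a label attached to the recorded span.

```python
import functools
import inspect
import time

SPANS = []  # hypothetical in-memory sink standing in for a trace backend


def trace(name=None, span_type="UNKNOWN"):
    """Toy stand-in for a tracing decorator; not MLflow's implementation."""
    def decorator(func):
        span_name = name or func.__name__  # fall back to the function name

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind arguments to parameter names (with defaults filled in)
            # so the inputs are recorded as a dict.
            bound = inspect.signature(func).bind(*args, **kwargs)
            bound.apply_defaults()
            span = {
                "name": span_name,
                "span_type": span_type,
                "inputs": dict(bound.arguments),
                "status": "OK",
            }
            start = time.perf_counter()
            try:
                span["outputs"] = func(*args, **kwargs)
                return span["outputs"]
            except Exception as exc:
                span["status"] = "ERROR"
                span["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call is recorded
                span["duration_ms"] = (time.perf_counter() - start) * 1000
                SPANS.append(span)
        return wrapper
    return decorator


@trace(name="LittleIndex", span_type="RETRIEVER")
def find_relevant_documents(query, top_n=5):
    # Stub body; the real function runs a TF-IDF lookup
    return [{"page_content": f"stub result for {query}"}][:top_n]


@trace()  # no name given: the span is named after the function
def unnamed_step(x):
    return x + 1


find_relevant_documents("create delta table")
unnamed_step(1)
print([(s["name"], s["span_type"]) for s in SPANS])
```

The first recorded span carries the custom name and type, while the second uses the function name and the default type.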

---

## 2. What Gets Traced Automatically

### Function Execution Metrics

```python
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5) -> list[dict[str, Any]]:
    # MLflow automatically captures:
    # - Start timestamp
    # - End timestamp  
    # - Duration
    # - Input parameters: query="create delta table", top_n=5
    # - Return value: [{"page_content": "...", "metadata": {...}}, ...]
    # - Any exceptions raised
    
    query_tfidf = doc_vectorizer.transform([query])
    similarities = (tfidf_matrix @ query_tfidf.T).toarray().flatten()
    # ... rest of implementation
    
    return result  # This return value is captured in the trace
```

### Automatic Trace Data Collection

**Input Parameters:**
```json
{
  "inputs": {
    "query": "How do I create a Delta table?",
    "top_n": 5
  }
}
```

**Output Data:**
```json
{
  "outputs": [
    {
      "page_content": "Delta Lake is an open-source storage framework...",
      "metadata": {
        "doc_uri": "https://docs.databricks.com/delta/...",
        "score": 0.85
      }
    }
  ]
}
```

**Performance Metrics:**
```json
{
  "start_time": "2024-01-15T10:30:45.123Z",
  "end_time": "2024-01-15T10:30:45.445Z", 
  "duration_ms": 322,
  "status": "OK"
}
```
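
The duration is simply the end timestamp minus the start timestamp. A small worked example using the two timestamps shown above (the `duration_ms` helper is illustrative, and the timestamp format is assumed from the JSON sample):

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"  # format of the sample timestamps above


def duration_ms(start_iso: str, end_iso: str) -> int:
    """Whole milliseconds elapsed between two timestamps."""
    start = datetime.strptime(start_iso, FMT)
    end = datetime.strptime(end_iso, FMT)
    # Integer division of timedeltas avoids floating point rounding
    return (end - start) // timedelta(milliseconds=1)


elapsed = duration_ms("2024-01-15T10:30:45.123Z", "2024-01-15T10:30:45.445Z")
print(elapsed)  # 322
```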

---

## 3. Nested Tracing in the Agent Workflow

### Complete Trace Hierarchy

When a user asks a question, the trace hierarchy looks like this:

```
Agent Execution (Root Span)
├── call_model (LLM Span)
│   ├── Preprocessor (Processing)
│   └── ChatDatabricks LLM Call (LLM Span)
│       ├── Input: [system_prompt, user_message]
│       ├── Output: Response with tool_calls
│       └── Token Usage: input=150, output=75
├── Tool Execution (Tool Span)
│   └── LittleIndex (RETRIEVER Span) ← Our custom trace
│       ├── Input: query="Delta table", top_n=5
│       ├── TF-IDF Computation
│       ├── Similarity Calculation
│       └── Output: 5 relevant documents
└── call_model (Second LLM Span)
    ├── Input: [conversation + tool_results]
    └── Output: Final synthesized answer
```

### Visual Representation in MLflow UI

```
🔄 Agent Conversation                                    [2.3s]
  ├── 🤖 LLM Call (Initial)                            [0.8s]
  ├── 🔧 Tool Execution                                 [0.3s]
  │   └── 📚 LittleIndex (RETRIEVER)                   [0.3s]
  │       ├── Query: "Delta table creation"
  │       ├── Retrieved: 5 documents
  │       └── Best Match Score: 0.89
  ├── 🤖 LLM Call (Synthesis)                          [1.2s]
  └── ✅ Final Response                                 [Complete]
```

---

## 4. Integration with Other MLflow Features

### Model Registry Integration

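```python
# At the end of the notebook:
AGENT = DocsAgent(baseline_config, tools)
mlflow.models.set_model(AGENT)  # Declares AGENT as the model object for models-from-code logging
```

`mlflow.models.set_model` does not register a model or link traces by itself. It tells MLflow which object to load when this file is logged as a model from code. Once that logged model is registered and deployed, the following become possible:
- **Model Versioning**: Traces can be attributed to the model version that produced them
- **Performance Tracking**: Trace latency and quality can be compared across model versions
- **Deployment Monitoring**: Production traces can be tied back to the deployed version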

### Experiment Tracking

```python
# Select the experiment by name. start_run(experiment_id=...) expects an
# experiment ID, not a name. On Databricks, experiment names are workspace
# paths such as /Users/<you>/databricks_agent_experiment.
mlflow.set_experiment("databricks_agent_experiment")

with mlflow.start_run():
    # Traces produced inside this block are associated with the run
    agent_response = AGENT.predict(messages)

    # Additional custom metrics can be logged
    mlflow.log_metric("retrieval_count", 5)
    mlflow.log_metric("response_quality_score", 0.92)
```

---

## 5. Custom Trace Enhancement

### Adding Manual Trace Points

You can enhance the existing tracing with custom spans:

```python
@tool
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5) -> list[dict[str, Any]]:
    # Start a nested span for TF-IDF computation
    with mlflow.start_span(name="TFIDF_Computation") as span:
        span.set_inputs({"query": query, "vocabulary_size": len(doc_vectorizer.vocabulary_)})
        
        query_tfidf = doc_vectorizer.transform([query])
        span.set_outputs({"query_vector_shape": query_tfidf.shape})
    
    # Another span for similarity calculation
    with mlflow.start_span(name="Similarity_Calculation") as span:
        span.set_inputs({"documents_count": tfidf_matrix.shape[0]})
        
        similarities = (tfidf_matrix @ query_tfidf.T).toarray().flatten()
        ranked_docs = sorted(enumerate(similarities), key=lambda x: x[1], reverse=True)
        
        span.set_outputs({
            "max_similarity": float(similarities.max()),
            "min_similarity": float(similarities.min()),
            "avg_similarity": float(similarities.mean())
        })
    
    # Document formatting span
    with mlflow.start_span(name="Document_Formatting") as span:
        result = []
        for idx, score in ranked_docs[:top_n]:
            row = documents.iloc[idx]
            content = row["content"]
            doc_entry = {
                "page_content": content,
                "metadata": {
                    "doc_uri": row["doc_uri"],
                    "score": score,
                },
            }
            result.append(doc_entry)
            
        span.set_outputs({"formatted_documents": len(result)})
    
    return result
```

### Enhanced Trace Hierarchy

```
📚 LittleIndex (RETRIEVER)                              [322ms]
├── 🧮 TFIDF_Computation                               [45ms]
│   ├── Input: query="Delta table", vocabulary_size=15420
│   └── Output: query_vector_shape=(1, 15420)
├── 🔍 Similarity_Calculation                          [267ms]
│   ├── Input: documents_count=5824
│   └── Output: max=0.89, min=0.02, avg=0.15
└── 📝 Document_Formatting                             [10ms]
    └── Output: formatted_documents=5
```

---

## 6. Production Monitoring with Traces

### Real-Time Performance Monitoring

```python
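import logging

import mlflow
import pandas as pd
from mlflow import MlflowClient
from mlflow.entities import SpanType

logger = logging.getLogger(__name__)


class AgentPerformanceMonitor:
    def __init__(self, experiment_name="databricks_agent_experiment"):
        self.experiment_name = experiment_name
        self.trace_data = []

    def analyze_traces(self, max_results=100):
        # MlflowClient.search_traces returns Trace objects (the fluent
        # mlflow.search_traces returns a pandas DataFrame by default)
        experiment = mlflow.get_experiment_by_name(self.experiment_name)
        traces = MlflowClient().search_traces(
            experiment_ids=[experiment.experiment_id],
            max_results=max_results,
        )

        for trace in traces:
            # A trace contains many spans; keep only the retriever spans
            for span in trace.data.spans:
                if span.span_type != SpanType.RETRIEVER:
                    continue
                query = (span.inputs or {}).get("query", "")
                scores = [doc["metadata"]["score"] for doc in span.outputs or []]
                self.trace_data.append({
                    "timestamp_ms": span.start_time_ns // 1_000_000,
                    "duration_ms": (span.end_time_ns - span.start_time_ns) / 1e6,
                    "query_length": len(query),
                    "retrieval_quality": max(scores, default=0),
                })

    def detect_performance_issues(self):
        if not self.trace_data:
            return  # Nothing collected yet
        df = pd.DataFrame(self.trace_data)

        # Warn on slow retrievals
        slow_queries = df[df["duration_ms"] > 500]
        if len(slow_queries) > 0:
            logger.warning("Found %d slow retrieval operations", len(slow_queries))

        # Warn on low quality retrievals
        low_quality = df[df["retrieval_quality"] < 0.3]
        if len(low_quality) > 0:
            logger.warning("Found %d low quality retrievals", len(low_quality))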
```
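
The threshold checks themselves can be exercised without MLflow or pandas. Here is a minimal sketch that returns the alert messages instead of sending them, assuming records shaped like the dictionaries collected above. The `detect_issues` name, its default thresholds (taken from the code above), and the sample records are illustrative.

```python
def detect_issues(records, max_duration_ms=500, min_quality=0.3):
    """Return alert messages for slow or low-quality retrievals."""
    slow = [r for r in records if r["duration_ms"] > max_duration_ms]
    low = [r for r in records if r["retrieval_quality"] < min_quality]

    messages = []
    if slow:
        messages.append(f"Found {len(slow)} slow retrieval operations")
    if low:
        messages.append(f"Found {len(low)} low quality retrievals")
    return messages


# Made-up sample records for illustration
records = [
    {"duration_ms": 322, "retrieval_quality": 0.85},
    {"duration_ms": 740, "retrieval_quality": 0.62},  # slow
    {"duration_ms": 210, "retrieval_quality": 0.12},  # low quality
]
messages = detect_issues(records)
print(messages)
```

Returning the messages keeps the decision logic separate from the delivery mechanism, so the same function can feed a logger, a pager, or a test.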

### Automated Quality Monitoring

```python
import mlflow
from mlflow import MlflowClient
from mlflow.entities import SpanType


def monitor_retrieval_quality(experiment_id: str, max_results: int = 50):
    recent_traces = MlflowClient().search_traces(
        experiment_ids=[experiment_id],
        max_results=max_results,
    )

    quality_scores = []
    for trace in recent_traces:
        for span in trace.data.spans:
            # Skip non-retriever spans and spans without outputs (e.g. failures)
            if span.span_type != SpanType.RETRIEVER or not span.outputs:
                continue
            quality_scores.append(max(doc["metadata"]["score"] for doc in span.outputs))

    if not quality_scores:
        return None  # Avoid dividing by zero when nothing was retrieved

    avg_quality = sum(quality_scores) / len(quality_scores)

    # Log quality metric
    mlflow.log_metric("avg_retrieval_quality", avg_quality)

    # Alert if quality drops
    if avg_quality < 0.5:  # Threshold
        # send_alert is a placeholder for your own notification hook
        send_alert(f"Retrieval quality dropped to {avg_quality:.2f}")
    return avg_quality
```

---

## 7. Debugging with Traces

### Identifying Retrieval Issues

```python
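from mlflow import MlflowClient
from mlflow.entities import SpanType


def debug_poor_retrieval(query: str, experiment_id: str):
    # Fetch recent traces, then match retriever spans on their recorded query.
    # The match is done in Python because search filters do not cover
    # span inputs in every MLflow version.
    traces = MlflowClient().search_traces(experiment_ids=[experiment_id])

    for trace in traces:
        for span in trace.data.spans:
            if span.span_type != SpanType.RETRIEVER:
                continue
            span_query = str((span.inputs or {}).get("query", ""))
            if query.lower() not in span_query.lower():
                continue

            duration_ms = (span.end_time_ns - span.start_time_ns) / 1e6
            print(f"Query: {span_query}")
            print(f"Duration: {duration_ms:.0f}ms")
            print(f"Status: {span.status.status_code}")

            if span.outputs:
                scores = [doc["metadata"]["score"] for doc in span.outputs]
                print(f"Similarity Scores: {scores}")
                print(f"Best Match: {max(scores):.3f}")

                # Analyze retrieved content
                for i, doc in enumerate(span.outputs[:3]):
                    print(f"Doc {i+1} (score={doc['metadata']['score']:.3f}):")
                    print(f"  Content: {doc['page_content'][:100]}...")
                    print(f"  Source: {doc['metadata']['doc_uri']}")
            print("-" * 50)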
```

### Performance Analysis

```python
import time

import numpy as np
from mlflow import MlflowClient
from mlflow.entities import SpanType


def analyze_retrieval_performance(experiment_id: str):
    # Keep retriever spans that started within the last 24 hours.
    # Filtering happens in Python because span-level search filters
    # are not available in every MLflow version.
    cutoff_ns = int((time.time() - 24 * 3600) * 1e9)

    rows = []  # (duration_ms, query) pairs
    for trace in MlflowClient().search_traces(experiment_ids=[experiment_id]):
        for span in trace.data.spans:
            if span.span_type != SpanType.RETRIEVER or span.start_time_ns < cutoff_ns:
                continue
            duration_ms = (span.end_time_ns - span.start_time_ns) / 1e6
            rows.append((duration_ms, str((span.inputs or {}).get("query", "Unknown"))))

    if not rows:
        print("No retrieval spans found in the last 24 hours")
        return

    # Performance statistics
    durations = [duration for duration, _ in rows]

    print("Retrieval Performance Analysis:")
    print(f"Total Retrievals: {len(durations)}")
    print(f"Average Duration: {np.mean(durations):.1f}ms")
    print(f"95th Percentile: {np.percentile(durations, 95):.1f}ms")
    print(f"Slowest Query: {max(durations):.1f}ms")

    # Find slowest queries
    print("\nSlowest Queries:")
    for duration, query in sorted(rows, reverse=True)[:5]:
        print(f"  {duration:.1f}ms: {query[:50]}...")
```
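
If the 95th percentile is unfamiliar, it is the duration below which 95% of the retrievals fall. Here is a small dependency-free sketch using the nearest-rank definition. Note this is an assumption on my part about which definition to show: NumPy's `percentile` interpolates linearly by default, so its result can differ slightly. The function name and sample durations are illustrative.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up retrieval durations in milliseconds
durations = [120, 95, 322, 410, 88, 150, 1200, 101, 99, 130]

p50 = percentile_nearest_rank(durations, 50)
p95 = percentile_nearest_rank(durations, 95)
print(p50, p95)  # 120 1200
```

With only ten samples, the 95th percentile is the single slowest call, which is why tail percentiles need a reasonably large window to be meaningful.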

---

## 8. MLflow UI Navigation

### Accessing Traces in Databricks

**In Databricks Workspace:**
1. Navigate to **Experiments** in the left sidebar
2. Open your experiment (created automatically or explicitly)
3. Go to the **Traces** tab on the experiment page
4. Click a trace to open its span tree (traces logged inside a run are also reachable from that run)

**Trace View Features:**
- **Timeline View**: See execution flow and timing
- **Tree View**: Hierarchical span structure  
- **Inputs/Outputs**: Detailed parameter and return value inspection
- **Performance Metrics**: Duration, token usage, success rates
- **Error Analysis**: Exception details and stack traces

### Filtering and Searching

```python
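from mlflow import MlflowClient
from mlflow.entities import SpanStatusCode, SpanType

# Collect RETRIEVER spans, then apply span-level conditions in Python
# (the fields accepted by server-side filters vary by MLflow version).
# experiment_id: ID of the experiment that holds the agent's traces
spans = [
    span
    for trace in MlflowClient().search_traces(experiment_ids=[experiment_id])
    for span in trace.data.spans
    if span.span_type == SpanType.RETRIEVER
]

slow_retrievals = [s for s in spans if (s.end_time_ns - s.start_time_ns) / 1e6 > 1000]

failed_retrievals = [s for s in spans if s.status.status_code == SpanStatusCode.ERROR]

high_quality_retrievals = [
    s for s in spans
    if any(doc["metadata"]["score"] >= 0.9 for doc in s.outputs or [])
]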
```
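
The same kind of filtering can be tried on plain dictionaries, which is handy for unit tests or for data exported from traces. A minimal sketch, assuming each span has been flattened to a dict with `span_type`, `status`, and `duration_ms` keys (that shape, the `filter_spans` helper, and the sample data are illustrative):

```python
def filter_spans(spans, span_type=None, status=None, min_duration_ms=None):
    """Return the spans matching every condition that was provided."""
    selected = []
    for span in spans:
        if span_type is not None and span["span_type"] != span_type:
            continue
        if status is not None and span["status"] != status:
            continue
        if min_duration_ms is not None and span["duration_ms"] <= min_duration_ms:
            continue
        selected.append(span)
    return selected


# Made-up sample spans
spans = [
    {"name": "LittleIndex", "span_type": "RETRIEVER", "status": "OK", "duration_ms": 322},
    {"name": "LittleIndex", "span_type": "RETRIEVER", "status": "ERROR", "duration_ms": 40},
    {"name": "LittleIndex", "span_type": "RETRIEVER", "status": "OK", "duration_ms": 1450},
    {"name": "ChatDatabricks", "span_type": "LLM", "status": "OK", "duration_ms": 1200},
]

slow = filter_spans(spans, span_type="RETRIEVER", min_duration_ms=1000)
failed = filter_spans(spans, span_type="RETRIEVER", status="ERROR")
print(len(slow), len(failed))  # 1 1
```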

---

## 9. Integration with Agent Deployment

### Model Serving Integration

```python
# With mlflow.langchain.autolog() enabled in the agent code, each request
# handled by the deployed agent produces a trace
class DocsAgent(ChatAgent):
    def predict(self, messages, context=None, custom_inputs=None):
        # self.agent.invoke() is the traced entry point: the LangGraph run
        # and everything nested inside it (LLM calls, tool runs, and the
        # LittleIndex span) are captured in one trace per request

        request = {"messages": self._convert_messages_to_dict(messages)}
        output = self.agent.invoke(request)
        return ChatAgentResponse(**output)
```

**Production Trace Benefits:**
- **Request Tracing**: Every API call to your deployed agent creates traces
- **Performance SLAs**: Monitor if responses meet latency requirements  
- **Quality Monitoring**: Track retrieval quality in production
- **Error Detection**: Failed requests are recorded with an error status and exception details, which alerting can be built on
- **Usage Analytics**: Understand how users interact with your agent

### Continuous Improvement Loop

```python
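from mlflow import MlflowClient
from mlflow.entities import SpanType


def improve_agent_based_on_traces(experiment_id: str, max_score: float = 0.4):
    # Analyze production traces: collect queries whose best match scored
    # below max_score (scores are compared numerically, not by string match)
    poor_queries = []
    for trace in MlflowClient().search_traces(experiment_ids=[experiment_id]):
        for span in trace.data.spans:
            if span.span_type != SpanType.RETRIEVER:
                continue
            scores = [doc["metadata"]["score"] for doc in span.outputs or []]
            if max(scores, default=0.0) < max_score:
                poor_queries.append((span.inputs or {}).get("query", ""))

    # analyze_query_patterns, expand_documentation, and
    # update_tfidf_preprocessing are placeholders for your own helpers
    query_analysis = analyze_query_patterns(poor_queries)

    # Improve documentation corpus or retrieval algorithm
    if query_analysis["missing_topics"]:
        expand_documentation(query_analysis["missing_topics"])

    if query_analysis["synonym_issues"]:
        update_tfidf_preprocessing(query_analysis["synonym_issues"])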
```
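
The pattern-analysis step is left open above. One plausible ingredient, sketched here under the assumption that recurring terms in poorly served queries hint at missing topics, is a simple term count. The `frequent_terms` helper and the sample queries are illustrative, not part of the agent code.

```python
import re
from collections import Counter


def frequent_terms(queries, min_count=2):
    """Terms that appear in at least min_count distinct queries."""
    counts = Counter()
    for query in queries:
        # Count each term once per query so long queries do not dominate
        counts.update(set(re.findall(r"[a-z0-9]+", query.lower())))
    # Sort by descending count, then alphabetically for a stable order
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(term, n) for term, n in ranked if n >= min_count]


# Made-up examples of queries that retrieved poorly
poor_queries = [
    "unity catalog lineage",
    "Unity Catalog permissions",
    "catalog lineage export",
]
terms = frequent_terms(poor_queries)
print(terms)  # [('catalog', 3), ('lineage', 2), ('unity', 2)]
```

Terms that keep showing up in low-scoring queries are candidates for new documentation or for synonym handling in the TF-IDF preprocessing.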

---

## Summary: MLflow Tracing Value

The MLflow tracing in this Databricks agent provides:

### 1. **Development Benefits**
- **Debugging**: See exactly what happens during retrieval
- **Performance Optimization**: Identify bottlenecks in the pipeline
- **Quality Analysis**: Understand when and why retrieval fails

### 2. **Production Benefits**  
- **Monitoring**: Real-time visibility into agent performance
- **Alerting**: Automatic detection of issues
- **Analytics**: Usage patterns and performance trends

### 3. **Business Benefits**
- **SLA Compliance**: Ensure response times meet requirements
- **Quality Assurance**: Maintain consistent user experience
- **Continuous Improvement**: Data-driven agent optimization

The `@mlflow.trace` decorator on the `find_relevant_documents` function is a small addition that provides enormous value for operating AI agents at scale. It transforms a black-box function into a fully observable, debuggable, and optimizable component of your AI system.

### MLflow Tracing Propagation Explained

Good question. We only explicitly decorated `find_relevant_documents` with `@mlflow.trace`, but MLflow actually traces much more. Let me explain how this works.

## 1. Automatic Tracing vs Manual Tracing

### What We Manually Traced
```python
@tool
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5):
    # Only THIS function is manually traced
```

### What Gets Automatically Traced

MLflow can **trace many components automatically** once `mlflow.langchain.autolog()` is enabled, especially when they're part of LangChain workflows:

```python
# These get traced automatically (no decorator needed):
# - ChatDatabricks LLM calls
# - LangChain Runnable executions
# - Tool invocations
# - Agent workflow steps
# - StateGraph node executions
```

---

## 2. LangChain Integration Auto-Tracing

### Automatic LLM Tracing

```python
llm = ChatDatabricks(
    endpoint=self.config.get("endpoint_name"),
    temperature=self.config.get("temperature"),
    max_tokens=self.config.get("max_tokens"),
)
# ↑ This automatically gets traced when called!
```

**When `call_model` executes:**
```python
def call_model(state: ChatAgentState, config: RunnableConfig):
    response = model_runnable.invoke(state, config)  # ← Auto-traced!
    return {"messages": [response]}
```

MLflow automatically captures:
- **Span Type**: `mlflow.entities.SpanType.CHAT_MODEL` for chat models such as `ChatDatabricks` (plain completion models use `SpanType.LLM`)
- **Inputs**: The messages sent to the LLM
- **Outputs**: The LLM's response (including tool calls)
- **Metadata**: Token usage, model endpoint, parameters

### Automatic Tool Execution Tracing

```python
@tool  # ← Makes the function a LangChain tool, whose runs autologging can trace
def find_relevant_documents(query: str, top_n: int = 5):
    # Even without @mlflow.trace, invocations of this tool are traced
    # when mlflow.langchain.autolog() is enabled
```

The `@tool` decorator itself does not call MLflow. It wraps the function in a LangChain tool, and LangChain reports each tool run to its registered callbacks. `mlflow.langchain.autolog()` registers the callback that turns those runs into spans:
- **Span Type**: `mlflow.entities.SpanType.TOOL`
- **Tool Name**: Function name or explicit name
- **Inputs/Outputs**: Parameters and return values
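
Callback-based tracing can be illustrated with a toy version. The sketch below is not LangChain's or MLflow's code; `Tool`, `tool`, and `TracingCallback` are simplified stand-ins invented for this example. The point it makes is that the decorated tool only announces its start and end, and a span exists only if a tracing callback is listening.

```python
class TracingCallback:
    """Toy callback that records one span per tool run."""

    def __init__(self):
        self.spans = []
        self._open = []  # spans that have started but not finished

    def on_start(self, kind, name, inputs):
        parent = self._open[-1]["name"] if self._open else None
        span = {"name": name, "span_type": kind, "inputs": inputs, "parent": parent}
        self._open.append(span)
        self.spans.append(span)

    def on_end(self, outputs):
        self._open.pop()["outputs"] = outputs


class Tool:
    """Toy tool wrapper that notifies callbacks around each call."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def invoke(self, inputs, callbacks=()):
        for cb in callbacks:
            cb.on_start("TOOL", self.name, inputs)
        result = self.func(**inputs)
        for cb in callbacks:
            cb.on_end(result)
        return result


def tool(func):
    return Tool(func)


@tool
def find_relevant_documents(query, top_n=5):
    return [f"doc about {query}"][:top_n]


tracer = TracingCallback()
traced = find_relevant_documents.invoke({"query": "delta"}, callbacks=[tracer])
untraced = find_relevant_documents.invoke({"query": "delta"})  # no callback, no span
print(tracer.spans)
```

Both calls return the same result, but only the first leaves a span behind. That is roughly the role autologging plays: it supplies the listening callback so you do not have to pass one yourself.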

---

## 3. StateGraph Workflow Auto-Tracing

### Agent Node Tracing

```python
workflow.add_node("agent", RunnableLambda(call_model))
```

When this node executes:
```python
# MLflow automatically creates a span for:
# - Node name: "agent" 
# - Node type: LangGraph node execution
# - Nested spans for everything inside call_model
```

### Tools Node Tracing  

```python
workflow.add_node("tools", ChatAgentToolNode(tools))
```

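`ChatAgentToolNode` is the tool node MLflow provides for LangGraph-based `ChatAgent` implementations. With autologging enabled, its execution shows up in the trace like any other graph node:
```python
# Spans you can expect under the "tools" node:
# - The node execution itself
# - Each individual tool execution
# - Errors raised by a tool, recorded on the failing span
```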

---

## 4. Trace Hierarchy Creation

Here's how the complete trace hierarchy gets built:

### Level 1: Agent Invocation (Auto-traced)
```python
# When you call:
agent.invoke({"messages": [...]}, config)

# MLflow automatically creates root span:
# Name: "Agent Execution" or similar
# Type: AGENT or CHAIN
```

### Level 2: Workflow Nodes (Auto-traced)
```python
# Each StateGraph node gets its own span:

# Agent Node Span (Auto)
workflow.add_node("agent", RunnableLambda(call_model))
# ↓ Creates span with name="agent"

# Tools Node Span (Auto)
workflow.add_node("tools", ChatAgentToolNode(tools))
# ↓ Creates span with name="tools"
```

### Level 3: Model Calls (Auto-traced)
```python
# Inside call_model:
def call_model(state, config):
    response = model_runnable.invoke(state, config)
    # ↑ ChatDatabricks automatically traced
    return {"messages": [response]}
```

### Level 4: Tool Executions
```python
# Our manual trace (Enhanced):
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(...):
    # Custom span with our chosen name and type

# Without @mlflow.trace, would still be traced as:
# Name: "find_relevant_documents" (function name)
# Type: TOOL (from @tool decorator)
```

---

## 5. Why We Added Custom Tracing

### Without Custom Tracing
```python
@tool
def find_relevant_documents(query: str, top_n: int = 5):
    # Would appear in traces as:
    # Name: "find_relevant_documents" 
    # Type: TOOL
    # Generic tool execution span
```

### With Custom Tracing  
```python
@tool
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5):
    # Appears in traces as:
    # Name: "LittleIndex" (more descriptive)
    # Type: RETRIEVER (semantic meaning)
    # Enhanced observability
```

---

## 6. Demonstration: What Actually Gets Traced

Let me show you the complete trace that gets generated:

```
# User query triggers this trace hierarchy:

🔄 Agent.invoke() [AUTO-TRACED]                          [2.3s]
├── 🏗️  StateGraph Execution [AUTO-TRACED]               [2.3s]
│   ├── 🤖 Node: "agent" [AUTO-TRACED]                  [0.8s]
│   │   └── 📞 call_model [AUTO-TRACED]                 [0.8s]
│   │       ├── 🔧 RunnableLambda(preprocessor) [AUTO]  [0.01s]
│   │       └── 🦙 ChatDatabricks LLM [AUTO-TRACED]     [0.79s]
│   │           ├── Input: [system_msg, user_msg]
│   │           ├── Output: {tool_calls: [...]}
│   │           └── Tokens: in=150, out=45
│   ├── 🛠️  Node: "tools" [AUTO-TRACED]                  [0.3s]
│   │   └── 🔧 ChatAgentToolNode [AUTO-TRACED]          [0.3s]
│   │       └── 📚 LittleIndex [MANUAL TRACE]           [0.3s] ← Our custom trace
│   │           ├── Input: query="Delta table", top_n=5
│   │           ├── TF-IDF processing...
│   │           └── Output: 5 documents with scores
│   └── 🤖 Node: "agent" [AUTO-TRACED]                  [1.2s]
│       └── 📞 call_model [AUTO-TRACED]                 [1.2s]
│           └── 🦙 ChatDatabricks LLM [AUTO-TRACED]     [1.2s]
│               ├── Input: [conversation + tool_results]
│               └── Output: Final synthesized answer
```

---

## 7. MLflow's Built-in Integrations

### LangChain Components with Auto-Tracing

```python
# These are traced when mlflow.langchain.autolog() is enabled:
from langchain_core.runnables import RunnableLambda     # ✅ Auto-traced
from langchain_core.language_models import BaseChatModel # ✅ Auto-traced
from langchain_core.tools import BaseTool               # ✅ Auto-traced
from langgraph.graph import StateGraph                  # ✅ Auto-traced
from databricks_langchain import ChatDatabricks         # ✅ Auto-traced
```

### How to Verify Auto-Tracing

```python
# You can check if a component supports auto-tracing:
import mlflow

# Enable tracing (usually on by default in Databricks)
mlflow.langchain.autolog()

# Now all LangChain components get traced automatically
llm = ChatDatabricks(endpoint="llama-3-70b")
response = llm.invoke("Hello")  # ← This creates a trace span
```

---

## 8. Configuration-Level Tracing Control

### Enabling/Disabling Auto-Tracing

```python
# Enable automatic tracing for all LangChain components
mlflow.langchain.autolog(
    log_traces=True,  # Emit traces for LangChain runs (the default)
    disable=False,    # Set to True to turn autologging off
)
```

### Trace Configuration in RunnableConfig

```python
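from langchain_core.runnables import RunnableConfig

# No explicit MLflow callback is needed here: autolog() registers its own.
# RunnableConfig carries per-request context through the graph. The values
# below are illustrative labels, not switches that MLflow interprets.
config = RunnableConfig(
    run_name="docs_agent_request",
    tags=["docs-agent"],
    metadata={"request_source": "notebook"},
)

# The config is passed down to every component the agent invokes
agent.invoke({"messages": messages}, config=config)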
```

---

## 9. Custom vs Automatic Tracing Comparison

### What You Get with Just Auto-Tracing

```python
@tool
def find_relevant_documents(query: str, top_n: int = 5):
    # Auto-trace provides:
    # ✅ Function name as span name
    # ✅ Input parameters
    # ✅ Return values
    # ✅ Execution time
    # ✅ Success/failure status
    # ❌ Custom span name
    # ❌ Semantic span type (RETRIEVER)
    # ❌ Enhanced metadata
```

### What You Get with Custom Tracing

```python
@tool
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5):
    # Custom trace provides everything above PLUS:
    # ✅ Descriptive span name ("LittleIndex")
    # ✅ Semantic categorization (RETRIEVER)
    # ✅ Better filtering in MLflow UI
    # ✅ More readable traces
    # ✅ Type-specific analytics
```

---

## 10. Practical Example: Trace Without Custom Decorator

Let me show you what would happen if we removed our custom trace:

### Original Code (With Custom Trace)
```python
@tool
@mlflow.trace(name="LittleIndex", span_type=mlflow.entities.SpanType.RETRIEVER)
def find_relevant_documents(query: str, top_n: int = 5):
    # Implementation...
```

### Without Custom Trace
```python
@tool  # Only @tool decorator - still gets traced!
def find_relevant_documents(query: str, top_n: int = 5):  
    # Implementation...
```

### Resulting Trace Difference

**With Custom Trace:**
```
📚 LittleIndex (RETRIEVER)                              [322ms]
├── Input: {query: "Delta table", top_n: 5}
└── Output: [{page_content: "...", metadata: {...}}, ...]
```

**Without Custom Trace (Auto-trace only):**
```
🔧 find_relevant_documents (TOOL)                       [322ms]
├── Input: {query: "Delta table", top_n: 5}
└── Output: [{page_content: "...", metadata: {...}}, ...]
```

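**Key Differences:**
- **Name**: "LittleIndex" vs "find_relevant_documents"
- **Type**: "RETRIEVER" vs "TOOL"
- **Semantics**: Clear purpose vs generic tool
- **Nesting**: With autologging enabled, the custom span may appear as a child of the auto-generated TOOL span rather than in place of it, so check the trace tree for both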

---

## 11. How to Verify All Tracing Works

### Check MLflow Traces Programmatically

```python
import mlflow
from mlflow import MlflowClient


def verify_tracing_coverage():
    # Make a test call to the agent
    test_messages = [{"role": "user", "content": "What is Databricks?"}]

    # Start an explicit run so the resulting trace can be looked up by run ID
    with mlflow.start_run() as run:
        AGENT.predict(test_messages)

    # One agent call yields one trace; the individual steps are its spans
    traces = MlflowClient().search_traces(
        experiment_ids=[run.info.experiment_id],
        run_id=run.info.run_id,
    )

    print(f"Found {len(traces)} trace(s)")
    for trace in traces:
        for span in trace.data.spans:
            duration_ms = (span.end_time_ns - span.start_time_ns) / 1e6
            status = span.status.status_code
            print(f"- {span.span_type} - {span.name}")
            print(f"  Duration: {duration_ms:.0f}ms")
            print(f"  Status: {getattr(status, 'value', status)}")
            if isinstance(span.inputs, dict):
                print(f"  Inputs: {list(span.inputs.keys())}")
            print(f"  Outputs available: {span.outputs is not None}")
            print()

# Run verification
verify_tracing_coverage()
```

### Expected Output (illustrative)

The listing below is abbreviated to four spans. Actual span names, types, and counts depend on the MLflow and LangChain versions in use.
```
Found 1 trace(s)
- AGENT - Agent Execution
  Duration: 2300ms
  Status: OK
  Inputs: ['messages']
  Outputs available: True

- CHAT_MODEL - ChatDatabricks
  Duration: 800ms
  Status: OK
  Inputs: ['messages']
  Outputs available: True

- RETRIEVER - LittleIndex
  Duration: 322ms
  Status: OK
  Inputs: ['query', 'top_n']
  Outputs available: True

- CHAT_MODEL - ChatDatabricks
  Duration: 1200ms
  Status: OK
  Inputs: ['messages']
  Outputs available: True
```

---

## Summary: The Complete Tracing Picture

You're absolutely correct that we only manually added one trace decorator. Here's why we get comprehensive tracing:

### 1. **MLflow Auto-Integration**
- `mlflow.langchain.autolog()` hooks into LangChain's callback system
- `ChatDatabricks` calls, `StateGraph` runs, and tool executions are traced without decorators

### 2. **Framework-Level Tracing**
- LangGraph workflow execution is traced through the same callbacks
- Each node execution gets its own span

### 3. **Tool Ecosystem Integration**
- `@tool` turns the function into a LangChain tool, so autologging can trace its runs
- `ChatAgentToolNode` executions show up as spans like any other node

### 4. **Our Custom Enhancement**
- `@mlflow.trace` on `find_relevant_documents` provides:
  - Better naming ("LittleIndex" vs function name)
  - Semantic categorization (RETRIEVER vs generic TOOL)
  - Enhanced filtering and analytics capabilities

The beauty of this architecture is that you get **comprehensive observability by default**, and you can **enhance specific components** with custom tracing where it adds value. Our single `@mlflow.trace` decorator makes the retrieval component more professional and easier to monitor, while the framework handles tracing everything else automatically.