# MLflow 07: Tool-Calling Agents with LangGraph, Ollama, and MLflow

Welcome to the seventh notebook in our MLflow series! We've journeyed through MLflow basics, HPO, model registry, RAG, fine-tuning, and LLM evaluation. Now, we're stepping into the dynamic world of **AI Agents** capable of using tools to interact with their environment and solve complex tasks.

In this notebook, we will:
- Introduce **LangGraph**, a library for building stateful, multi-actor applications with LLMs, perfect for creating agentic workflows.
- Build an agent that can decide which tools to call based on a user's query.
- Utilize a locally running LLM (e.g., Qwen3-0.6B, available in Ollama as `qwen3:0.6b`) via **Ollama** to power our agent's reasoning.
- Define custom tools for our agent (e.g., a simple calculator, a mock weather service).
- Integrate **MLflow Tracing** to capture and visualize the intricate steps, decisions, and tool invocations within our LangGraph agent.

![LangGraph Concept](https://img1.daumcdn.net/thumb/R800x0/?scode=mtistory2&fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FyaplV%2FbtsLG5bkLRl%2FI1KEK6mAuSiqfmOWPxy9I0%2Fimg.png)

Building agents that can intelligently use tools opens up a vast array of possibilities, from simple task automation to complex problem-solving. Let's explore how LangGraph, Ollama, and MLflow work together to make this happen!

---

## Table of Contents

1. Introduction to AI Agents and Tool Use
2. What is LangGraph?
3. Setting Up the Agent Environment
    - Installing Libraries
    - Setting up Ollama and an LLM
    - Configuring MLflow (with Tracing)
4. Defining Tools for Our Agent
    - Mock Weather Tool
    - Simple Calculator Tool
5. Building the Agent with LangGraph
    - Defining Agent State
    - Creating Agent Nodes (LLM Call, Tool Execution)
    - Constructing the Graph and Conditional Edges
    - Compiling and Running the Agent
6. MLflow Tracing for LangGraph Agents
    - How Autologging Works with LangGraph
    - Inspecting Traces in the MLflow UI
7. Interacting with the Tool-Calling Agent
8. Key Takeaways for Building and Tracing Agents
9. Engaging Resources and Further Reading

---

## 1. Introduction to AI Agents and Tool Use

An **AI Agent** is a system that can perceive its environment, make decisions, and take actions to achieve specific goals. In the context of LLMs, agents often leverage the language understanding and reasoning capabilities of an LLM to:
- **Understand User Intent:** Interpret complex requests or queries.
- **Plan Steps:** Break down a problem into smaller, manageable tasks.
- **Use Tools:** Interact with external systems, APIs, or functions to gather information or perform actions that the LLM itself cannot (e.g., browse the web, access a database, perform calculations, call a specific API).
- **Maintain State/Memory:** Keep track of past interactions and information to inform future decisions.

**Tool use** is a cornerstone of modern LLM-based agents. By giving an LLM access to tools, we extend its capabilities far beyond text generation, allowing it to ground its responses in real-world data or perform concrete actions.
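
To make the idea concrete, here is a minimal, library-free sketch of tool use: the model emits a structured request naming a tool and its arguments, and ordinary code looks the tool up and runs it. The tool names (`add`, `shout`), the `dispatch` helper, and the JSON shape are assumptions made for illustration only; real frameworks define their own request formats.

```python
import json

# Hypothetical tools for illustration; any plain function can play this role.
def add(a: float, b: float) -> float:
    return a + b

def shout(text: str) -> str:
    return text.upper()

TOOLS = {"add": add, "shout": shout}

def dispatch(tool_request_json: str) -> str:
    """Run the tool named in a model-style JSON request and return its result as text."""
    request = json.loads(tool_request_json)
    name, args = request["name"], request.get("args", {})
    if name not in TOOLS:
        return f"Error: unknown tool '{name}'"
    return str(TOOLS[name](**args))

# Stand-in for what an LLM might emit when it decides to use a tool.
model_output = '{"name": "add", "args": {"a": 2, "b": 3}}'
print(dispatch(model_output))  # prints 5
```

The returned text would normally be fed back to the model so it can phrase a final answer.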

---

## 2. What is LangGraph?

**LangGraph** is a library built by LangChain for creating stateful, multi-actor applications with LLMs. It allows you to build agentic systems as **graphs**, where nodes represent computation steps (e.g., calling an LLM, executing a tool) and edges define the flow of execution, including conditional logic.

![MLFlow logo](https://www.the-odd-dataguy.com/images/posts/20191113/cover.jpg)

**Key Concepts in LangGraph:**
- **State Graph (`StateGraph`):** The core of a LangGraph application. It's a directed graph where nodes operate on a shared `AgentState` object.
- **Agent State (`TypedDict`):** A dictionary-like object that holds the current state of the agent (e.g., input query, chat history, intermediate steps, tool calls, tool outputs). This state is passed between nodes and updated by them.
- **Nodes:** Python functions or callables that represent a unit of work. Each node receives the current agent state and returns an update to the state.
- **Edges:** Define the flow of control between nodes. 
    - **Standard Edges:** Always transition from one node to another.
    - **Conditional Edges:** Route the execution to different nodes based on the current agent state (e.g., if an LLM decided to call a tool, go to the tool execution node; otherwise, go to the response generation node).
- **Entry and Finish Points:** Define where the graph execution starts and ends.
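
The concepts above can be sketched without LangGraph at all. The toy runner below is not LangGraph's API or implementation; `run_graph`, `route`, the `END` sentinel, and the counter state are all hypothetical names chosen to show how nodes return state updates and how a conditional edge picks the next node or stops.

```python
from typing import Callable, Dict

State = Dict[str, int]
END = "__end__"  # sentinel chosen for this sketch

def increment(state: State) -> State:
    """Node: return an update that bumps the counter by one."""
    return {"count": state["count"] + 1}

def route(state: State) -> str:
    """Conditional edge: loop until the counter reaches the target."""
    return "increment" if state["count"] < state["target"] else END

def run_graph(nodes: Dict[str, Callable[[State], State]],
              router: Callable[[State], str],
              entry: str, state: State, max_steps: int = 25) -> State:
    """Run nodes one at a time, merging each update into the shared state."""
    current = entry
    for _ in range(max_steps):
        state = {**state, **nodes[current](state)}
        current = router(state)
        if current == END:
            return state
    raise RuntimeError("step limit reached without hitting END")

final = run_graph({"increment": increment}, route, "increment", {"count": 0, "target": 3})
print(final)  # {'count': 3, 'target': 3}
```

The `max_steps` guard plays the same role as a recursion limit: a cyclic graph needs some bound so that a router that never returns `END` cannot loop forever.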

**Why LangGraph?**
- **Control & Flexibility:** Offers a lower-level, more explicit way to define agent logic compared to some higher-level agent frameworks.
- **State Management:** Explicitly manages state throughout the agent's execution flow.
- **Cyclical Computations:** Easily create loops, allowing agents to iteratively refine answers or call tools multiple times.
- **Human-in-the-Loop:** Can be designed to pause and wait for human input at any step.
- **Parallelism:** LangGraph supports running tool nodes concurrently, which can speed up execution when multiple independent tools need to be called.

---

## 3. Setting Up the Agent Environment

### Installing Libraries
We'll need `mlflow`, the LangChain packages (`langchain`, `langchain_core`, `langchain_community`, `langchain_ollama`), `langgraph`, and `tiktoken` (often a dependency for token counting).

In [None]:
!pip install --quiet mlflow langchain langgraph langchain_community langchain_core langchain_ollama tiktoken

import importlib.metadata
import mlflow
import os
import operator
from typing import TypedDict, Annotated, List, Union

from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, ToolMessage
from langchain_core.tools import tool
from langchain_ollama.chat_models import ChatOllama # For local LLM via Ollama
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver # For potential in-memory checkpoints

print(f"MLflow Version: {mlflow.__version__}")
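# Read installed versions from package metadata so each package reports its own version
print(f"LangChain version: {importlib.metadata.version('langchain')}")
print(f"LangChain Core version: {importlib.metadata.version('langchain-core')}")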

import langgraph
langgraph_version = importlib.metadata.version("langgraph")
print(f"Langgraph version: {langgraph_version}")

### Setting up Ollama and an LLM
Ensure Ollama is installed and running. We'll use `qwen3:0.6b` for this demo because it is small enough to run quickly on most machines. If tool calls turn out to be unreliable, a larger Qwen3 variant is worth trying.

1.  Download and install Ollama from [ollama.com](https://ollama.com/).
2.  Start the Ollama application/server.
3.  Pull the desired model via your terminal: `ollama pull qwen3:0.6b`. Larger variants (1.7B, 4B, 8B, 14B, 30B, 32B, 235B) follow the same pattern with a different tag, e.g. `ollama pull qwen3:8b`.
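
Before running the notebook cells, it can help to confirm that the server is reachable and the model has been pulled. The sketch below assumes Ollama's default address (`http://localhost:11434`) and its `/api/tags` endpoint, which lists local models; the helper name `list_ollama_models` is hypothetical.

```python
import json
import urllib.request

def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the names of locally pulled models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # URLError subclasses OSError; ValueError covers malformed JSON.
        return None
    return [m.get("name") for m in payload.get("models", [])]

models = list_ollama_models()
if models is None:
    print("Ollama does not appear to be running on the default port.")
else:
    print("Pulled models:", models)
```

If `qwen3:0.6b` is missing from the list, pull it before continuing.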

In [None]:
ollama_model_name = "qwen3:0.6b"

try:
    # Initialize ChatOllama for our agent
    # Specify format='json' if you want the LLM to reliably output JSON for tool calls
    # and ensure your prompt instructs it to do so.
    llm = ChatOllama(
        model=ollama_model_name, 
        temperature=0, # Lower temperature for more deterministic tool use decisions
        # format="json", # Enable JSON mode if LLM output for tool calls needs to be strict JSON
        keep_alive="5m" # Keep model loaded in Ollama for 5 mins to speed up subsequent calls
    )
    # Test the LLM connection
    print(f"Testing Ollama with model: {ollama_model_name}")
    response_test = llm.invoke("Hello! How are you?")
    print(f"Ollama test response: {response_test.content[:50]}...")
    print(f"Successfully connected to Ollama with model {ollama_model_name}.")
except Exception as e:
    print(f"Error connecting to Ollama or model {ollama_model_name}: {e}")
    print("Please ensure Ollama is running and the model is pulled (e.g., 'ollama pull qwen3:0.6b' or 'ollama pull qwen3:8b').")
    # In a real scenario, you might want to stop or handle this gracefully.
    llm = None # Set llm to None if connection fails

### Configuring MLflow (with Tracing)
MLflow can automatically trace LangChain (and thus LangGraph) executions. We'll also set up an experiment.

In [None]:
mlflow.set_tracking_uri('mlruns')
experiment_name = "LangGraph_ToolCalling_Agent_Ollama"
mlflow.set_experiment(experiment_name)

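# Enable MLflow autologging for LangChain.
# Tracing is on by default, so LangGraph runs are traced, including LLM calls and tool inputs/outputs if structured correctly.
# Note: the model-logging arguments below are accepted by MLflow 2.x. If your MLflow version rejects any of them,
# calling mlflow.langchain.autolog() with no arguments should still enable tracing.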
mlflow.langchain.autolog(
    log_models=True, # Set to True if you want model artifacts, signatures, and input/output examples
    log_input_examples=True, # This logs input examples, relies on log_models=True
    log_model_signatures=True, # This logs model signatures, relies on log_models=True
    extra_tags={"agent_framework": "LangGraph"}
)


print(f"MLflow Experiment set to: {experiment_name}")
print("MLflow autologging for LangChain enabled.")

---

## 4. Defining Tools for Our Agent
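We'll create a couple of simple tools that our agent can use. LangChain's `@tool` decorator turns a plain function into a tool: the function name becomes the tool name, the type hints define its arguments, and the docstring becomes the description the LLM reads when deciding whether to call it.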

### Mock Weather Tool

In [None]:
@tool
def get_current_weather(city: str) -> str:
    """Gets the current weather for a specified city. Returns a mock forecast."""
    print(f"--- Tool Called: get_current_weather(city='{city}') ---")
    city_lower = city.lower()
    if "london" in city_lower:
        return "The weather in London is cloudy with a chance of rain. Temperature is 15°C."
    elif "paris" in city_lower:
        return "Paris is sunny with a temperature of 22°C."
    elif "tokyo" in city_lower:
        return "Tokyo is experiencing light showers. Temperature is 18°C."
    else:
        return f"Sorry, I don't have weather information for {city}. I can provide it for London, Paris, or Tokyo."

print("Weather tool defined.")

### Simple Calculator Tool

In [None]:
@tool
def simple_calculator(expression: str) -> str:
    """
    Evaluates a simple mathematical expression involving addition, subtraction, multiplication, or division.
    Example: simple_calculator(expression='2+2*5') or simple_calculator(expression='10 / (2+3)')
    IMPORTANT: This tool uses eval() and is NOT safe for untrusted input in production environments.
    """
    print(f"--- Tool Called: simple_calculator(expression='{expression}') ---")
    try:
        # WARNING: eval() is insecure with untrusted input! For demo purposes only.
        # In a real app, use a safe math expression parser like `asteval` or `numexpr`.
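        allowed_chars = "0123456789+-*/(). " # Basic character whitelist
        # The whitelist alone would still let '**' through, so reject exponentiation explicitly.
        if not all(char in allowed_chars for char in expression) or "**" in expression:
            return "Error: Expression contains unsupported characters or operators."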
        
        result = eval(expression)
        return f"The result of the calculation '{expression}' is {result}."
    except Exception as e:
        return f"Error evaluating expression '{expression}': {str(e)}"

print("Calculator tool defined.")

tools = [get_current_weather, simple_calculator]

# Bind these tools to our LLM. This allows the LLM to see the tool descriptions and decide when to call them.
if llm:
    llm_with_tools = llm.bind_tools(tools)
    print("LLM bound with tools.")
else:
    print("LLM not initialized, cannot bind tools.")
    llm_with_tools = None # Ensure it's defined
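
The calculator above warns that `eval()` is unsafe and suggests a dedicated expression parser. As one possible standard-library alternative (a sketch, not a replacement for a vetted library), the evaluator below walks the parsed syntax tree and accepts only numeric literals, `+`, `-`, `*`, `/`, and parentheses. The name `safe_eval` is hypothetical.

```python
import ast
import operator

# Only these operators are accepted; anything else raises ValueError.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / and parentheses over numeric literals without eval()."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")
    return walk(ast.parse(expression.strip(), mode="eval"))

print(safe_eval("10 / (2+3)"))   # 2.0
print(safe_eval("2+2*5"))        # 12
```

Function calls, names, and exponentiation raise `ValueError`; malformed input raises `SyntaxError` and division by zero raises `ZeroDivisionError`, so a tool wrapping this would still want a `try`/`except` around it.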

---

## 5. Building the Agent with LangGraph
We'll create a graph where the agent can: 
1. Receive a user query.
2. Call the LLM to decide if a tool is needed, or if it can answer directly.
3. If a tool is needed, call the tool executor.
4. Feed the tool's response back to the LLM for a final answer.
5. Repeat tool calls if necessary (though our simple example might not require complex iteration).

### Defining Agent State
The state will primarily consist of a list of messages, tracking the conversation history including tool calls and responses.

In [None]:
class AgentState(TypedDict):
    messages: Annotated[List[BaseMessage], operator.add]

print("AgentState defined.")
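
The `Annotated[..., operator.add]` annotation tells LangGraph how to combine a node's returned value with the existing one: new messages are appended rather than replacing the list. The sketch below imitates that merge rule in plain Python to show the effect. It is a toy model of the semantics, not LangGraph's internals, and `DemoState` and `apply_update` are hypothetical names.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints

class DemoState(TypedDict):
    messages: Annotated[List[str], operator.add]

def apply_update(state: dict, update: dict) -> dict:
    """Merge an update into state, using each key's annotated reducer when present."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducers = getattr(hints.get(key), "__metadata__", ())
        if reducers and key in merged:
            merged[key] = reducers[0](merged[key], value)  # e.g. list + list
        else:
            merged[key] = value  # no reducer: overwrite
    return merged

state = {"messages": ["user: hi"]}
state = apply_update(state, {"messages": ["ai: hello"]})
print(state["messages"])  # ['user: hi', 'ai: hello']
```

This is why each node in this notebook returns only the new messages in a one-element (or short) list: the reducer takes care of appending them to the history.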

### Creating Agent Nodes (LLM Call, Tool Execution)

In [None]:
# Node 1: The Agent - Calls the LLM to decide an action or respond
def call_agent_llm(state: AgentState):
    """Calls the LLM with the current conversation history (messages) and tools."""
    print("--- Node: call_agent_llm ---")
    if not llm_with_tools:
        print("LLM with tools not available. Cannot call agent.")
        # Append an error message or handle appropriately
        error_msg = HumanMessage(content="Error: LLM with tools not initialized.")
        return {"messages": [error_msg]}
        
    messages = state["messages"]
    print(f"  Input messages to LLM: {messages}")
    response = llm_with_tools.invoke(messages)
    print(f"  LLM Response: {response}")
    # The response will be an AIMessage, possibly with tool_calls attribute
    return {"messages": [response]} 

# Node 2: Tool Executor - Executes tools called by the LLM
def execute_tools_node(state: AgentState):
    """Checks the last message for tool calls and executes them."""
    print("--- Node: execute_tools_node ---")
    last_message = state["messages"][-1]
    if not isinstance(last_message, AIMessage) or not hasattr(last_message, 'tool_calls') or not last_message.tool_calls:
        print("  No tool calls found in the last message.")
        return {"messages": []} # Nothing to execute: return an empty update so the state is left unchanged

    tool_invocation_messages = []
    for tool_call in last_message.tool_calls:
        tool_name = tool_call["name"]
        tool_args = tool_call["args"]
        print(f"  Executing tool: {tool_name} with args: {tool_args}")
        
        selected_tool = None
        for t in tools:
            if t.name == tool_name:
                selected_tool = t
                break
        
        if selected_tool:
            try:
                # The tool execution might be synchronous or asynchronous depending on the tool definition
                # For simple @tool decorated functions, it's usually synchronous.
                observation = selected_tool.invoke(tool_args)
            except Exception as e:
                observation = f"Error executing tool {tool_name}: {str(e)}"
        else:
            observation = f"Error: Tool '{tool_name}' not found."
        
        print(f"  Tool Observation: {observation}")
        tool_invocation_messages.append(
            ToolMessage(content=str(observation), tool_call_id=tool_call["id"])
        )
    
    return {"messages": tool_invocation_messages}

print("Agent nodes defined.")

### Constructing the Graph and Conditional Edges
We define the workflow: after the LLM call, if there are tool calls, execute them; otherwise, the process ends. After tools execute, their output goes back to the LLM.

In [None]:
# Define the conditional logic: Should we continue or end?
def should_continue_or_end(state: AgentState):
    """Determines whether to continue with tool execution or end."""
    print("--- Conditional Edge: should_continue_or_end ---")
    last_message = state["messages"][-1]
    if isinstance(last_message, AIMessage) and hasattr(last_message, 'tool_calls') and last_message.tool_calls:
        print("  Decision: Continue to tool execution.")
        return "continue_to_tools" # Route to tool executor node
    else:
        print("  Decision: End execution (LLM provided a direct answer or no more tools).")
        return END # End the graph execution

# Create the StateGraph
workflow = StateGraph(AgentState)

# Add nodes to the graph
workflow.add_node("agent_llm", call_agent_llm)
workflow.add_node("tool_executor", execute_tools_node)

# Set the entry point
workflow.set_entry_point("agent_llm")

# Add conditional edges
workflow.add_conditional_edges(
    "agent_llm", # Source node
    should_continue_or_end, # Function to decide the route
    path_map={
        "continue_to_tools": "tool_executor", # If condition returns "continue_to_tools", go to tool_executor
        END: END  # If condition returns END, finish the graph
    }
)

# Add an edge from the tool_executor back to the agent_llm to process tool results
workflow.add_edge("tool_executor", "agent_llm")


print("LangGraph workflow defined.")

### Compiling and Running the Agent
Compile the graph to create a runnable application. We can also add memory for checkpoints if needed, but for this demo, a simple compile is fine.

In [None]:
# Compile the graph
if llm: # Only compile if LLM was initialized
    # memory = MemorySaver() # Optional: for saving/resuming graph state, not strictly needed for this demo
    # app = workflow.compile(checkpointer=memory)
    app = workflow.compile()
    print("LangGraph app compiled.")
    
    # You can visualize the graph as a Mermaid diagram. By default, draw_mermaid_png()
    # renders through the mermaid.ink web service, so it needs network access rather than graphviz:
    # try:
    #     from IPython.display import Image, display
    #     display(Image(app.get_graph().draw_mermaid_png()))
    # except Exception as e:
    #     print(f"Could not draw graph (the rendering service may be unreachable): {e}")
else:
    print("LLM not initialized. Skipping graph compilation.")
    app = None

![MLFlow Workflow](https://mlflow.org/docs/latest/assets/images/learn-core-components-b2c38671f104ca6466f105a92ed5aa68.png)

---

## 6. MLflow Tracing for LangGraph Agents

With `mlflow.langchain.autolog()` enabled, interactions within our LangGraph app should be automatically traced when we invoke the compiled `app`. LangGraph is a separate library, but it is built on LangChain's core components, so the LangChain autologger covers it.

### How Autologging Works with LangGraph
When you run the LangGraph `app.invoke(...)` or `app.stream(...)`:
- MLflow's LangChain autologger intercepts calls to LangChain components, including LLMs (like `ChatOllama`) and potentially tools if they are wrapped as LangChain tools.
- It creates a **trace** for each invocation of the graph. A trace is a hierarchical view of the operations performed.
- **Spans** within the trace represent individual operations: LLM calls, tool executions, agent steps.
- Inputs, outputs, parameters, and errors for each span are logged.

This provides a detailed, visual record of your agent's decision-making process.
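
To illustrate what a trace made of nested spans looks like as data, here is a toy recorder built only on the standard library. It is not MLflow's API or implementation; the `span` context manager and the record fields are assumptions chosen to mirror the description above (one parent span for the graph invocation, child spans for the LLM call and the tool execution).

```python
import time
from contextlib import contextmanager

trace = []      # finished spans, in completion order
_stack = []     # names of spans currently open (gives us the parent)

@contextmanager
def span(name: str, **inputs):
    """Record one operation: its parent, inputs, output holder, and duration."""
    record = {"name": name, "parent": _stack[-1] if _stack else None,
              "inputs": inputs, "output": None}
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        _stack.pop()
        trace.append(record)

with span("graph_invoke", query="weather in Paris?"):
    with span("agent_llm") as s:
        s["output"] = "tool_call: get_current_weather"
    with span("tool_executor", city="Paris") as s:
        s["output"] = "Paris is sunny"

for rec in trace:
    print(f'{rec["name"]} (parent={rec["parent"]})')
```

With autologging enabled, MLflow builds this kind of hierarchy for you and adds richer metadata to each span.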

### Inspecting Traces in the MLflow UI
After running some interactions with the agent (next section):
1. Run `mlflow ui` in your terminal (from the directory containing `mlruns`).
2. Navigate to the `LangGraph_ToolCalling_Agent_Ollama` experiment.
3. You should see one run per `run_agent_query(...)` call, since each graph invocation is wrapped in `mlflow.start_run(...)`.
4. Open the **Traces** tab on the experiment page (depending on your MLflow version, traces may also be reachable from the run's own page) and click a trace to see the execution tree:
    - You can see the sequence of LLM calls and tool executions.
    - Click on individual spans (e.g., an LLM call span) to see its inputs (prompt), outputs (response, tool calls), configuration (model name, temperature), and duration.
    - Tool execution spans will show the tool name, input arguments, and the observed output.

![MLFlow Tracking](https://mlflow.org/docs/latest/assets/images/tracking-setup-local-server-cd51180e89bfd0a18c52f5b33e0f188d.png)

This visual debugging and inspection capability is invaluable for understanding and refining complex agent behavior.

---

## 7. Interacting with the Tool-Calling Agent
Let's send some queries to our agent and observe its behavior. Each `app.invoke()` call will generate a trace in MLflow.

In [None]:
def run_agent_query(query_text):
    if not app:
        print("LangGraph app not compiled. Cannot run query.")
        return None
        
    print(f"\n--- Running Agent for Query: '{query_text}' ---")
    inputs = {"messages": [HumanMessage(content=query_text)]}
    
    # Each invoke call should be captured by MLflow autologging as a new run/trace
    # We can explicitly start a parent run for each query if we want to add more metadata
    # around the LangGraph invocation itself.
    with mlflow.start_run(run_name=f"AgentQuery_{query_text[:30].replace(' ','_')}") as run:
        mlflow.log_param("user_query", query_text)
        mlflow.log_param("ollama_model_used", ollama_model_name)
        
        try:
            final_state = app.invoke(inputs, config={"recursion_limit": 10}) # Add recursion limit
            final_response_message = final_state["messages"][-1]
            
            if isinstance(final_response_message, AIMessage):
                final_answer = final_response_message.content
            elif isinstance(final_response_message, HumanMessage): # Could happen if LLM fails
                final_answer = f"Agent ended on HumanMessage: {final_response_message.content}"
            else:
                final_answer = str(final_response_message) # Fallback
                
            print(f"\nFinal Agent Response: {final_answer}")
            mlflow.log_text(final_answer, "final_agent_response.txt")
            mlflow.set_tag("agent_outcome", "success")
            return final_answer
        except Exception as e:
            print(f"Error invoking agent: {e}")
            mlflow.log_text(str(e), "agent_error.txt")
            mlflow.set_tag("agent_outcome", "error")
            return None

# Test Queries
query1 = "What is the weather like in Paris today?"
query2 = "What is 250 + 750 / 3?"
query3 = "Can you tell me the weather in London and also calculate 5 * (10 - 3)?"
query4 = "What is the capital of France?" # Should be answered directly by LLM (no tool)

if app: # Only run queries if app was compiled
    run_agent_query(query1)
    print("\n------------------------------------\n")
    run_agent_query(query2)
    print("\n------------------------------------\n")
    # Query 3 needs two tools. The LLM may request both in a single response (parallel tool calls)
    # or one at a time across successive turns. Both cases work: execute_tools_node runs every
    # entry in tool_calls, and the tool_executor -> agent_llm edge loops until no calls remain.
    run_agent_query(query3)
    print("\n------------------------------------\n")
    run_agent_query(query4)
else:
    print("Skipping agent queries as the app was not compiled (likely due to LLM initialization issue).")

After running these, go to the MLflow UI and inspect the traces for each query. You should see how the agent decided which tools to call (or not to call) and the flow of information.
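
For a query like Query 3, the model may return several tool calls in one response. The executor node in this notebook runs them one after another; when the calls are independent, they could also run concurrently. The sketch below shows that idea with plain dictionaries shaped like the `name` / `args` / `id` entries used earlier. The tools (`fake_weather`, `multiply`), the `REGISTRY`, and `run_tool_calls` are hypothetical stand-ins, not part of LangGraph.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the notebook's tools.
def fake_weather(city: str) -> str:
    return f"{city}: cloudy"

def multiply(a: float, b: float) -> str:
    return str(a * b)

REGISTRY = {"fake_weather": fake_weather, "multiply": multiply}

def run_tool_calls(tool_calls: list) -> list:
    """Execute every requested call and return one result per call id, in request order."""
    def run_one(call: dict) -> dict:
        func = REGISTRY.get(call["name"])
        content = func(**call["args"]) if func else f"Error: unknown tool {call['name']}"
        return {"tool_call_id": call["id"], "content": content}
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_one, tool_calls))  # map preserves input order

calls = [
    {"id": "call_1", "name": "fake_weather", "args": {"city": "London"}},
    {"id": "call_2", "name": "multiply", "args": {"a": 5, "b": 7}},
]
for result in run_tool_calls(calls):
    print(result)
```

Keeping each result paired with its `tool_call_id` is what lets the model match an observation to the request that produced it.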

---

## 8. Key Takeaways for Building and Tracing Agents

This notebook introduced you to the powerful combination of LangGraph, Ollama, and MLflow for building and observing tool-calling agents:

- **LangGraph for Agent Logic:** Provides a flexible, graph-based approach to define complex agent behaviors, state management, and conditional tool use.
- **Local LLMs with Ollama:** Enables development and experimentation with powerful open-source LLMs running entirely on your local machine, enhancing privacy and reducing reliance on cloud APIs.
- **Tool Definition:** Custom tools extend the agent's capabilities beyond the LLM's inherent knowledge.
- **MLflow for Agent Tracing:** `mlflow.langchain.autolog()` is invaluable for capturing the detailed execution flow of LangGraph agents, including LLM decisions, tool inputs/outputs, and the overall state evolution. This is crucial for debugging, understanding, and improving agent performance.
- **Iterative Development:** The ability to trace and inspect agent behavior allows for rapid iteration and refinement of prompts, tool definitions, and agent logic.

Building robust agents often involves careful prompt engineering to guide the LLM in tool selection and response generation, as well as thoughtful tool design.

---

## 9. Engaging Resources and Further Reading

To explore further into agents, LangGraph, and MLflow tracing:

- **LangGraph & LangChain Documentation:**
    - [LangGraph Documentation](https://python.langchain.com/docs/langgraph)
    - [LangChain Tool Use](https://python.langchain.com/docs/modules/agents/tools/)
    - [LangChain Expression Language (LCEL)](https://python.langchain.com/docs/expression_language/)
- **MLflow Tracing:**
    - [MLflow Tracing Documentation](https://mlflow.org/docs/latest/tracing/index.html)
    - [MLflow LangChain Integration (includes LangGraph)](https://mlflow.org/docs/latest/tracing/integrations/langchain.html)
    - [MLflow Ollama Tracing (via OpenAI SDK compatibility)](https://mlflow.org/docs/latest/tracing/integrations/ollama.html)
- **Ollama:**
    - [Ollama Official Website](https://ollama.com/)
    - [Ollama GitHub](https://github.com/ollama/ollama)
- **Community Examples and Tutorials:**
    - [Pinecone Blog: Llama 3.1 Agent using LangGraph and Ollama](https://www.pinecone.io/learn/langgraph-ollama-llama/)
    - YouTube tutorials like "Local LangGraph Agents with Llama 3.1 + Ollama" by James Briggs.
    - Prabhat Pankaj's blog on Dynamic, Parallel Tool-Calling Agent with LangGraph.

---

Congratulations on building and tracing your first tool-calling agent! This is a foundational skill for creating more sophisticated and interactive AI applications.

**Coming Up Next (Notebook 8):** We'll delve deeper into advanced agentic patterns, exploring more complex function-calling scenarios and potentially looking at agent-to-agent communication protocols, all while keeping MLflow in the loop.

![Keep Learning](https://memento.epfl.ch/image/23136/1440x810.jpg)