# Build AI Agents with DSPy and MLflow on Databricks

What a mouthful of a title, isn't it? 😜

This is a Databricks notebook version of [Build AI Agents with DSPy](https://dspy.ai/tutorials/customer_service_agent/).

It walks you through the following:

1. How to build an AI agent with DSPy
1. How a popular agent architecture called **ReAct** (**Re**asoning and **Act**ing) works: it gives the LM a task description along with a list of tools, then lets the LM decide whether to call tools for more observations or to generate the final output.


## Before We Go: Building a (boring) chat-BI Agent

	
In [Building a (boring) chat-BI Agent](https://open.substack.com/pub/juhache/p/building-a-boring-chat-bi-agent), Julien Hurault wrote:

> The idea [of MCP] is straightforward: expose external tools to an LLM through the new MCP protocol.
>
> It's a great way to give an LLM access to remote capabilities—emails, Slack, databases—and it's already widely supported across major LLM providers.
>
> This works very well... 🔥🔥 but it requires you to host an MCP server. 🔥🔥 (fires mine)

And then, this gem came:

> LLM tools are another way to give models additional capabilities.
>
> You provide the model with a description of each tool: its purpose, parameters, and usage guidelines.
>
> The LLM then takes the user prompt and returns a list of tools to execute, along with the associated parameters.
>
> Contrary to the MCP approach, 🔥🔥 tool execution happens directly in the environment that is calling the LLM, not through a remote server. 🔥🔥 (fires mine)
>
> Local coding agents use this pattern extensively.
>
> Claude Code, [Cursor, Windsurf], is essentially an LLM combined with a set of tools for searching, reading, and editing files.
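
To make the quoted pattern concrete, here is a minimal sketch of local tool execution. Everything in it is hypothetical: the tools (`add`, `shout`), the `LOCAL_TOOLS` registry, and the JSON shape of `model_response` are assumptions for illustration, since each LLM provider defines its own tool-call format. The point is only that the tools run in the same process that talks to the model.

```python
import json


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


def shout(text: str) -> str:
    """Upper-case a piece of text."""
    return text.upper()


# Hypothetical registry: tool name -> callable living in *this* process.
LOCAL_TOOLS = {"add": add, "shout": shout}


def describe_tools() -> str:
    """Render the tool descriptions that would accompany the user prompt."""
    return "\n".join(f"{name}: {fn.__doc__}" for name, fn in LOCAL_TOOLS.items())


def execute_tool_calls(tool_calls_json: str) -> list:
    """Run each requested tool locally and collect the results in order."""
    results = []
    for call in json.loads(tool_calls_json):
        fn = LOCAL_TOOLS[call["name"]]
        results.append(fn(**call["arguments"]))
    return results


# Stand-in for a model response (assumed format): tools to run plus their parameters.
model_response = (
    '[{"name": "add", "arguments": {"a": 2, "b": 3}},'
    ' {"name": "shout", "arguments": {"text": "done"}}]'
)
results = execute_tool_calls(model_response)
print(describe_tools())
print(results)
```

No server is involved: the "model" only names the tools, and the calling environment executes them.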

## Requirements

In [0]:
%pip install -qU "dspy>=3.0.4" "mlflow>=3.7.0"
dbutils.library.restartPython()

In [0]:
import mlflow
print(f"MLflow version: {mlflow.__version__}")

In [0]:
import dspy
print(f"DSPy version: {dspy.__version__}")


## MLflow DSPy Integration

MLflow lets you visualize prompts and optimization progress as traces so you can better understand DSPy's behavior.

[MLflow DSPy Flavor](https://mlflow.org/docs/latest/genai/flavors/dspy/)

In [0]:
import mlflow
mlflow.dspy.autolog()

# The name of the MLflow experiment is the title of this notebook
# unless set explicitly with `mlflow.set_experiment`
# mlflow.set_experiment("DSPy")

## Define Tools

[Define Tools](https://dspy.ai/tutorials/customer_service_agent/#define-tools)

In [0]:
from pydantic import BaseModel

class Date(BaseModel):
    # LLMs are often bad at filling in `datetime.datetime`, so
    # we define a custom class to represent the date.
    year: int
    month: int
    day: int
    hour: int

class UserProfile(BaseModel):
    user_id: str
    name: str
    email: str

class Flight(BaseModel):
    flight_id: str
    date_time: Date
    origin: str
    destination: str
    duration: float
    price: float

class Itinerary(BaseModel):
    confirmation_number: str
    user_profile: UserProfile
    flight: Flight

class Ticket(BaseModel):
    user_request: str
    user_profile: UserProfile

In [0]:
user_database = {
    "Adam": UserProfile(user_id="1", name="Adam", email="adam@gmail.com"),
    "Bob": UserProfile(user_id="2", name="Bob", email="bob@gmail.com"),
    "Chelsie": UserProfile(user_id="3", name="Chelsie", email="chelsie@gmail.com"),
    "David": UserProfile(user_id="4", name="David", email="david@gmail.com"),
}

flight_database = {
    "DA123": Flight(
        flight_id="DA123",  # DSPy Airline 123
        origin="SFO",
        destination="JFK",
        date_time=Date(year=2025, month=9, day=1, hour=1),
        duration=3,
        price=200,
    ),
    "DA125": Flight(
        flight_id="DA125",
        origin="SFO",
        destination="JFK",
        date_time=Date(year=2025, month=9, day=1, hour=7),
        duration=9,
        price=500,
    ),
    "DA456": Flight(
        flight_id="DA456",
        origin="SFO",
        destination="SNA",
        date_time=Date(year=2025, month=10, day=1, hour=1),
        duration=2,
        price=100,
    ),
    "DA460": Flight(
        flight_id="DA460",
        origin="SFO",
        destination="SNA",
        date_time=Date(year=2025, month=10, day=1, hour=9),
        duration=2,
        price=120,
    ),
}

itinery_database = {}
ticket_database = {}

### dspy.ReAct

For `dspy.ReAct` to use a function as a tool, the function should:

1. Have a docstring that describes what the tool does. If the function name is self-explanatory, you can leave the docstring empty.
1. Have type hints for the arguments, which the LM needs in order to generate the arguments in the right format.
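
As a rough illustration of why both pieces matter, the sketch below uses only the standard library to collect the metadata a tool-calling framework could pass to an LM. This is not DSPy's actual implementation; `describe_tool` and `lookup_user` are hypothetical names made up for this example.

```python
import inspect
from typing import get_type_hints


def lookup_user(name: str):
    """Fetch the user profile from database with given name."""
    return {"name": name}


def describe_tool(fn) -> dict:
    """Collect the name, docstring, and argument types of a function."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only the arguments matter for calling the tool
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "arguments": {arg: getattr(t, "__name__", str(t)) for arg, t in hints.items()},
    }


tool_description = describe_tool(lookup_user)
print(tool_description)
```

Without the docstring, the description would be empty; without the type hint, the `arguments` mapping would be empty and the LM would have to guess what `name` should look like.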

In [0]:
import random
import string


def fetch_flight_info(date: Date, origin: str, destination: str):
    """Fetch flight information from origin to destination on the given date"""
    flights = []

    for flight_id, flight in flight_database.items():
        if (
            flight.date_time.year == date.year
            and flight.date_time.month == date.month
            and flight.date_time.day == date.day
            and flight.origin == origin
            and flight.destination == destination
        ):
            flights.append(flight)
    if len(flights) == 0:
        raise ValueError("No matching flight found!")
    return flights


def fetch_itinerary(confirmation_number: str):
    """Fetch a booked itinerary information from database"""
    return itinery_database.get(confirmation_number)


def pick_flight(flights: list[Flight]):
    """Pick the best flight for the user's request: the shortest one, and the cheaper one on ties."""
    sorted_flights = sorted(
        flights,
        key=lambda x: (
            x.get("duration") if isinstance(x, dict) else x.duration,
            x.get("price") if isinstance(x, dict) else x.price,
        ),
    )
    return sorted_flights[0]


def _generate_id(length=8):
    chars = string.ascii_lowercase + string.digits
    return "".join(random.choices(chars, k=length))


def book_flight(flight: Flight, user_profile: UserProfile):
    """Book a flight on behalf of the user."""
    confirmation_number = _generate_id()
    while confirmation_number in itinery_database:
        confirmation_number = _generate_id()
    itinery_database[confirmation_number] = Itinerary(
        confirmation_number=confirmation_number,
        user_profile=user_profile,
        flight=flight,
    )
    return confirmation_number, itinery_database[confirmation_number]


def cancel_itinerary(confirmation_number: str, user_profile: UserProfile):
    """Cancel an itinerary on behalf of the user."""
    if confirmation_number in itinery_database:
        del itinery_database[confirmation_number]
        return
    raise ValueError("Cannot find the itinerary, please check your confirmation number.")


def get_user_info(name: str):
    """Fetch the user profile from database with given name."""
    return user_database.get(name)


def file_ticket(user_request: str, user_profile: UserProfile):
    """File a customer support ticket if this is something the agent cannot handle."""
    ticket_id = _generate_id(length=6)
    ticket_database[ticket_id] = Ticket(
        user_request=user_request,
        user_profile=user_profile,
    )
    return ticket_id



## Create ReAct agent

Provide a signature to `dspy.ReAct` to define the task along with the inputs and outputs of the agent (_service_).

In [0]:
import dspy

class DSPyAirlineCustomerService(dspy.Signature):
    """You are an airline customer service agent that helps users book and manage flights.

    You are given a list of tools to handle user requests, and you should decide the right tool to use in order to
    fulfill users' requests."""

    user_request: str = dspy.InputField()
    process_result: str = dspy.OutputField(
        desc=(
            "Message that summarizes the process result, and the information users need, e.g., the "
            "confirmation_number if a new flight is booked."
        )
    )


### Agents and Tools

Finally, you tell the `dspy.ReAct` agent about the tools it can access (if it ever wants to).

⚠️ This _if it ever wants to_ is key to understanding the agent's autonomy.
It is at the agent's discretion whether or not to call any of the tools.

In [0]:
agent = dspy.ReAct(
    DSPyAirlineCustomerService,
    tools = [
        fetch_flight_info,
        fetch_itinerary,
        pick_flight,
        book_flight,
        cancel_itinerary,
        get_user_info,
        file_ticket,
    ]
)

In [0]:
type(agent)

In [0]:
dir(agent)

## Use the Agent

[Use the Agent](https://dspy.ai/tutorials/customer_service_agent/#use-the-agent)

The `agent` is callable (it defines the `__call__` dunder method), so you can invoke it like a function! 💡

[Python's .\_\_call__() Method: Creating Callable Instances](https://realpython.com/python-callable-instances/)
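
A minimal sketch of a callable instance in plain Python follows. The `Greeter` class is a made-up example and has nothing to do with DSPy's internals; it only shows the mechanism.

```python
class Greeter:
    """A tiny class whose instances can be called like functions."""

    def __init__(self, greeting: str):
        self.greeting = greeting

    def __call__(self, name: str) -> str:
        # Python runs this method when the instance is invoked as `instance(...)`.
        return f"{self.greeting}, {name}!"


greet = Greeter("Hello")
message = greet(name="Adam")  # equivalent to greet.__call__(name="Adam")
print(callable(greet), message)
```

The instance keeps its own state (`greeting`) between calls, which a plain function cannot do without closures or globals.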

In [0]:
import dspy

lm = dspy.LM(model="databricks/databricks-gpt-5-mini")
dspy.configure(lm=lm)

In [0]:
result = agent(user_request="please help me book a flight from SFO to JFK on 09/01/2025, my name is Adam")
print(result.process_result)

In [0]:
import rich

rich.print(itinery_database)


## Interpret the Result

Behind the scenes, `dspy.ReAct` runs a loop that accumulates tool call information along with the task description and sends it to the LM until it hits `max_iters` or the LM decides to wrap up.

You can use `dspy.inspect_history()` to see what's happening inside each step.

In each LM call, the user message includes the information of previous tool calls, along with the task description.
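
The loop described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not DSPy's actual implementation: `scripted_lm` is a hypothetical stand-in for a real LM, `lookup_user` and `TOY_TOOLS` are toy names, and the `"finish"` marker and trajectory format are invented for this sketch.

```python
def lookup_user(name: str) -> dict:
    """Toy tool: look up a user by name."""
    return {"user_id": "1", "name": name}


# Hypothetical tool registry for the sketch.
TOY_TOOLS = {"lookup_user": lookup_user}


def scripted_lm(task: str, trajectory: list) -> dict:
    """Stand-in for an LM: request one tool call, then decide to wrap up."""
    if not trajectory:
        return {"tool": "lookup_user", "args": {"name": "Adam"}}
    return {"tool": "finish", "args": {}}


def react_loop(task: str, max_iters: int = 5) -> list:
    """Accumulate (tool, args, observation) steps until 'finish' or max_iters."""
    trajectory = []
    for _ in range(max_iters):
        # The "LM" sees the task description plus every previous step.
        decision = scripted_lm(task, trajectory)
        if decision["tool"] == "finish":
            break
        try:
            observation = TOY_TOOLS[decision["tool"]](**decision["args"])
        except Exception as err:
            # Assumption of this sketch: failures are reported back as observations.
            observation = f"Error: {err}"
        trajectory.append((decision["tool"], decision["args"], observation))
    return trajectory


steps = react_loop("Find Adam's profile")
print(steps)
```

Each iteration grows the trajectory, which is why later LM calls in the inspected history carry the results of the earlier tool calls.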

In [0]:
dspy.inspect_history(n=10)

In [0]:
# Replace with the confirmation number printed from itinery_database above
# (it is randomly generated on each run)
confirmation_number = "g2owaya6"

result = agent(user_request=f"i want to take DA125 instead on 09/01, please help me modify my itinerary {confirmation_number}")
print(result.process_result)

## Conclusions

With `dspy.ReAct`, you can:

1. Define tools as Python functions (with docstrings and type hints).
1. Define the task with its inputs and outputs (_signature_).
1. Let `dspy.ReAct` run the reasoning and acting loop behind the scenes.