# How to build an AI Agent with DSPy and the ClickHouse MCP Server

In this notebook we'll build a [DSPy](https://dspy.ai/tutorials/mcp/) AI agent that can interact with [ClickHouse's SQL playground](https://sql.clickhouse.com/) using [ClickHouse's MCP Server](https://github.com/ClickHouse/mcp-clickhouse).


## Install libraries
We need to install the DSPy library and the MCP Python SDK.

In [None]:
!pip install -q --upgrade pip

In [1]:
!pip install -q dspy
!pip install -q mcp

## Setup credentials
Let's provide our Anthropic API key.

In [15]:
import os, getpass

In [16]:
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter Anthropic API Key:")

Enter Anthropic API Key: ········


We'll also define the credentials to connect to the ClickHouse SQL playground:

In [3]:
env = {
    "CLICKHOUSE_HOST": "sql-clickhouse.clickhouse.com",
    "CLICKHOUSE_PORT": "8443",
    "CLICKHOUSE_USER": "demo",
    "CLICKHOUSE_PASSWORD": "",
    "CLICKHOUSE_SECURE": "true"
}
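If you later want to point the agent at your own ClickHouse service instead of the playground, one option is to read these settings from the process environment and fall back to the playground values. The following is a minimal sketch using only the standard library; the helper name `clickhouse_env` is hypothetical and not part of `mcp-clickhouse`.

```python
import os

# Playground defaults, copied from the cell above.
PLAYGROUND_DEFAULTS = {
    "CLICKHOUSE_HOST": "sql-clickhouse.clickhouse.com",
    "CLICKHOUSE_PORT": "8443",
    "CLICKHOUSE_USER": "demo",
    "CLICKHOUSE_PASSWORD": "",
    "CLICKHOUSE_SECURE": "true",
}


def clickhouse_env(source=None):
    """Hypothetical helper: build the server's env dict.

    Each key is looked up in `source` (the process environment when
    `source` is None) and falls back to the playground default.
    """
    if source is None:
        source = os.environ
    return {key: source.get(key, default) for key, default in PLAYGROUND_DEFAULTS.items()}


env = clickhouse_env()
```

With no `CLICKHOUSE_*` variables set, `env` is identical to the dictionary defined above, so the rest of the notebook behaves the same.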

## Initialize MCP Server

Let's configure the ClickHouse MCP Server to point at the ClickHouse SQL playground.

In [18]:
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import dspy

In [29]:
server_parameters = StdioServerParameters(
    command="uv",
    args=[
        'run',
        '--with', 'mcp-clickhouse',
        '--python', '3.13',
        'mcp-clickhouse'
    ],
    env=env
)
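The parameters above start the server as a subprocess through `uv`, so `uv` needs to be available on the `PATH` of the notebook's kernel. A quick pre-flight check can surface that problem before the agent runs. This is a sketch using only the standard library; `find_launcher` is a hypothetical helper name.

```python
import shutil


def find_launcher(command="uv"):
    """Return the absolute path of `command`, or None if it is not on PATH."""
    return shutil.which(command)


launcher = find_launcher()
if launcher is None:
    print("uv was not found on PATH; install it before starting the MCP server")
else:
    print(f"uv found at {launcher}")
```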

## Initialize LLM
Next, let's initialize our Claude Sonnet model.

In [11]:
dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4-20250514"))

## Run agent
Finally, we'll initialize and run the agent:

In [19]:
import dspy

class DataAnalyst(dspy.Signature):
    """You are a data analyst. You'll be asked questions and you need to try to answer them using the tools you have access to. """

    user_request: str = dspy.InputField()
    process_result: str = dspy.OutputField(
        desc=(
            "Answer to the query"
        )
    )

In [17]:
async with stdio_client(server_parameters) as (read, write):
    async with ClientSession(read, write) as session:
        # Initialize the connection
        await session.initialize()
        # List available tools
        tools = await session.list_tools()

        # Convert MCP tools to DSPy tools
        dspy_tools = []
        for tool in tools.tools:
            dspy_tools.append(dspy.Tool.from_mcp_tool(session, tool))

        print("Tools", dspy_tools)

        react = dspy.ReAct(DataAnalyst, tools=dspy_tools)
        result = await react.acall(user_request="What's the most popular Amazon product category?")
        print(result)

3
[Tool(name=list_databases, desc=List available ClickHouse databases, args={'args': {}, 'kwargs': {}}), Tool(name=list_tables, desc=List available ClickHouse tables in a database, including schema, comment,
row count, and column count., args={'database': {'title': 'Database', 'type': 'string'}, 'like': {'default': None, 'title': 'Like', 'type': 'string'}}), Tool(name=run_select_query, desc=Run a SELECT query in a ClickHouse database, args={'query': {'title': 'Query', 'type': 'string'}})]
{'args': {}, 'kwargs': {}}
Prediction(
    trajectory={'thought_0': "I need to find information about Amazon product categories and determine which one is most popular. First, I should explore what databases are available to see if there's any Amazon-related data.", 'tool_name_0': 'list_databases', 'tool_args_0': {}, 'observation_0': 'amazon\nbluesky\ncountry\ncovid\ndefault\ndns\nenvironmental\nfood\nforex\ngeo\ngit\ngithub\nhackernews\nimdb\nlogs\nmetrica\nmgbench\nmta\nnoaa\nnyc_taxi\nnypd\nontime\