# How to build an AI Agent with LlamaIndex and ClickHouse MCP Server

In this notebook, we'll build a [LlamaIndex](https://github.com/run-llama/llama_index) AI agent that can query [ClickHouse's SQL playground](https://sql.clickhouse.com/) through the ClickHouse MCP server.

## Install libraries
We need to install LlamaIndex, the ClickHouse connector library (`clickhouse-connect`), the LlamaIndex Anthropic integration (Claude will be our LLM), and the LlamaIndex MCP tools package.

In [78]:
!pip install -q --upgrade pip

In [3]:
!pip install -q llama-index
!pip install -q clickhouse-connect
!pip install -q llama-index-llms-anthropic
!pip install -q llama-index-tools-mcp

## Set up credentials
Let's provide our Anthropic API key.

In [14]:
import getpass
import os

In [15]:
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter Anthropic API Key:")

Enter Anthropic API Key: ········
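
If you rerun the notebook often, you may not want to be prompted every time. The following is a minimal sketch, assuming you are happy to reuse a key that is already present in the environment; `ensure_api_key` is a hypothetical helper name, not part of LlamaIndex or the Anthropic SDK.

```python
import getpass
import os


def ensure_api_key(var_name="ANTHROPIC_API_KEY", prompt=getpass.getpass):
    """Return the key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; `prompt` is injectable so the
    function can be exercised without an interactive terminal.
    """
    key = os.environ.get(var_name)
    if not key:
        # Nothing set (or set to an empty string): ask interactively
        # and store the answer so later cells can read it.
        key = prompt(f"Enter {var_name}: ")
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` in place of the `getpass` cell above keeps the same behaviour on a fresh kernel and skips the prompt when the variable is already exported.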


## Initialize LLM
Next, we initialize the Claude Sonnet 4 model (`claude-sonnet-4-0`).

In [16]:
from llama_index.llms.anthropic import Anthropic

In [17]:
llm = Anthropic(model="claude-sonnet-4-0")

## Configure MCP client
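Next, we're going to create an MCP client that connects to the ClickHouse SQL Playground via the ClickHouse MCP server. The client launches the server locally with `uv run` and passes the playground's connection settings as environment variables.

The MCP server exposes its functionality as tools, which `McpToolSpec` converts into LlamaIndex tools: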

In [4]:
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

In [20]:
mcp_client = BasicMCPClient(
    "uv",  # launch the MCP server as a local subprocess via uv
    args=[
        "run",
        "--with", "mcp-clickhouse",
        "--python", "3.13",
        "mcp-clickhouse",
    ],
    # Connection settings for the public ClickHouse SQL playground
    env={
        "CLICKHOUSE_HOST": "sql-clickhouse.clickhouse.com",
        "CLICKHOUSE_PORT": "8443",
        "CLICKHOUSE_USER": "demo",
        "CLICKHOUSE_PASSWORD": "",
        "CLICKHOUSE_SECURE": "true",
    },
)

mcp_tool_spec = McpToolSpec(
    client=mcp_client,
)
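
To point the same client at a different ClickHouse instance, it can help to build the `env` mapping in one place. This is a small sketch under stated assumptions: `clickhouse_env` is a hypothetical helper, the defaults mirror the playground settings used above, and `"false"` as the value for disabling TLS is an assumption rather than something shown in this notebook.

```python
def clickhouse_env(host, port=8443, user="demo", password="", secure=True):
    """Build the environment mapping passed to the MCP server process.

    Hypothetical helper. Environment variables must be strings, so the
    port number and the boolean flag are converted explicitly.
    """
    return {
        "CLICKHOUSE_HOST": host,
        "CLICKHOUSE_PORT": str(port),
        "CLICKHOUSE_USER": user,
        "CLICKHOUSE_PASSWORD": password,
        # "true" matches the value used above; "false" is an assumption.
        "CLICKHOUSE_SECURE": "true" if secure else "false",
    }


# Reproduces the playground settings from the cell above.
playground_env = clickhouse_env("sql-clickhouse.clickhouse.com")
```

The resulting dictionary can be passed as `env=playground_env` when constructing `BasicMCPClient`.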

In [21]:
tools = await mcp_tool_spec.to_tool_list_async()
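
Before handing the tools to an agent, it is worth checking what the server actually exposed. The sketch below assumes each tool carries a `metadata` object with `name` and `description` attributes, as LlamaIndex tools do; `describe_tools` itself is a hypothetical helper.

```python
def describe_tools(tools):
    """Return one 'name: first line of description' string per tool.

    Hypothetical helper; it only relies on `tool.metadata.name` and
    `tool.metadata.description` being present.
    """
    lines = []
    for tool in tools:
        meta = tool.metadata
        # Descriptions can span several lines (or be missing); keep the first.
        description_lines = (meta.description or "").strip().splitlines()
        first_line = description_lines[0] if description_lines else ""
        lines.append(f"{meta.name}: {first_line}")
    return lines
```

In the notebook, `print("\n".join(describe_tools(tools)))` would list the tools, which should include the `list_databases` and `list_tables` functions the agent calls later on.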

## Creating the agent
Now we're ready to create an agent that has access to those tools. We cap the number of tool calls in a single run at 10 via `max_function_calls`; raise or lower it as needed.

In [22]:
from llama_index.core.agent import AgentRunner, FunctionCallingAgentWorker

In [23]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    tools=tools,
    llm=llm,
    verbose=True,  # print each LLM response and tool call
    max_function_calls=10,  # cap on tool calls per run
)
agent = AgentRunner(agent_worker)
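
To make the idea of a tool-call cap concrete, here is a conceptual sketch of a capped tool-calling loop in plain Python. It is not LlamaIndex's implementation: `run_with_cap` and `next_step` are hypothetical names, and `next_step` stands in for the LLM deciding whether to call another tool or to answer.

```python
def run_with_cap(next_step, max_function_calls=10):
    """Drive a tool-calling loop until an answer arrives or the budget is spent.

    `next_step(results)` is a hypothetical stand-in for the LLM. It returns
    ("call", fn) to request a tool call or ("answer", text) to finish.
    Returns (answer_or_None, number_of_tool_calls_made).
    """
    results = []
    calls = 0
    while True:
        kind, payload = next_step(results)
        if kind == "answer":
            return payload, calls
        if calls >= max_function_calls:
            # Budget exhausted: stop instead of calling another tool.
            return None, calls
        results.append(payload())
        calls += 1


def always_call(results):
    # A model that never stops asking for tools.
    return ("call", lambda: "row")


answer, calls_made = run_with_cap(always_call, max_function_calls=10)
```

With a model that keeps requesting tools, the loop stops after exactly 10 calls. A question that needs to explore several databases and tables can use up a small budget quickly, which is why the cap is worth tuning.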

## Running the agent
Finally, we can ask the agent a question:

In [24]:
response = agent.query("What's the most popular repository?")

Added user message to memory: What's the most popular repository?
=== LLM Response ===
I'll help you find the most popular repository. Let me first explore the available databases and tables to understand the data structure.
=== Calling Function ===
Calling function: list_databases with args: {}
=== Function Output ===
meta=None content=[TextContent(type='text', text='amazon\nbluesky\ncountry\ncovid\ndefault\ndns\nenvironmental\nfood\nforex\ngeo\ngit\ngithub\nhackernews\nimdb\nlogs\nmetrica\nmgbench\nmta\nnoaa\nnyc_taxi\nnypd\nontime\nopensky\notel\notel_v2\npypi\nrandom\nreddit\nrubygems\nstackoverflow\nstar_schema\nstock\nsystem\ntw_weather\ntwitter\nuk\nwiki\nwords\nyoutube', annotations=None)] isError=False
=== LLM Response ===
I can see there's a `github` database which likely contains repository data. Let me explore the tables in that database.
=== Calling Function ===
Calling function: list_tables with args: {"database": "github"}
=== Function Output ===
meta=None content=[TextC