In [1]:
import stackprinter  # type: ignore
import jupyter_black  # type: ignore
from dotenv import load_dotenv  # type: ignore

from baml_agents import init_logging, with_model
from baml_client.sync_client import b
from baml_client.type_builder import TypeBuilder as TB
from baml_client import types as T

init_logging()
stackprinter.set_excepthook()
load_dotenv()
jupyter_black.load()

b = with_model(b, "gpt-4.1-nano")

## Using Model Context Protocol (MCP) Tools with BAML

## What is Model Context Protocol (MCP)?

Model Context Protocol (MCP) lets you plug third-party resources into AI systems through an interface that LLMs can easily understand and use.

- **Tip:** A list of popular MCP servers is available at [punkpeye/awesome-mcp-servers](https://github.com/punkpeye/awesome-mcp-servers).

MCP offers more than basic tool invocation, but this example focuses on one feature: integrating MCP tools into the BAML structured generation framework.

### Prerequisites

We'll use the `mcpt` command-line interface to interact with MCP servers. It functions much like `curl`, allowing us to query the servers easily.

- Install it from here: [github.com/f/mcptools](https://github.com/f/mcptools)

This approach is inefficient because the server starts and stops on every request, but it is a convenient way to test the integration. In a production environment, you would keep the server running so it is always available for requests.

Note: While this example uses local servers (via stdio), you can also specify any URL as the server.

### Exploring the MCP server

Let's list all the tools available on the `uvx mcp-server-calculator` [MCP server](https://github.com/githejie/mcp-server-calculator). `uvx` downloads the server as a Python package and runs it locally until we get the data we need.

In [2]:
# Show all the tools in the server
!mcpt tools uvx mcp-server-calculator

[1m[36mcalculate[0m([32mexpression:str[0m)
     [37mCalculates/evaluates the given expression.[0m



Run the `calculate` tool:

In [3]:
# Use the calculator tool
!mcpt call calculate -p '{"expression": "123.4 * 2"}' uvx mcp-server-calculator

246.8

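The same call can be issued from Python. This is a minimal sketch, assuming `mcpt` is installed and on `PATH`; `build_mcpt_call` and `call_tool` are hypothetical helper names, not part of `mcptools` or `baml_agents`. The argument order simply mirrors the shell command above.

```python
import json
import shlex
import subprocess


def build_mcpt_call(tool: str, params: dict, server_cmd: str) -> list[str]:
    """Build the argv for `mcpt call <tool> -p <json> <server command...>`."""
    return ["mcpt", "call", tool, "-p", json.dumps(params), *shlex.split(server_cmd)]


def call_tool(tool: str, params: dict, server_cmd: str) -> str:
    """Run the command and return its stdout (requires mcpt to be installed)."""
    argv = build_mcpt_call(tool, params, server_cmd)
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# Only build and print the command here; call_tool() would actually start the server.
argv = build_mcpt_call("calculate", {"expression": "123.4 * 2"}, "uvx mcp-server-calculator")
print(shlex.join(argv))
```

Passing the arguments as a list avoids shell quoting problems with the JSON payload. As in the notebook cells, each `call_tool` invocation would start and stop the server once.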

### Calling tools from the MCP server with BAML

Let's see the actual prompt that we will send to the LLM:

In [4]:
from baml_agents import ActionRunner, display_prompt

calculator_mcp = "uvx mcp-server-calculator"
r = ActionRunner(TB, b=b, cache=True)
r.add_from_mcp_server(calculator_mcp)

question = "What's the sum of the first 5 prime numbers?"
request = r.b.request.BamlCustomTools_GetNextAction(question)
display_prompt(request)

[system]
Please select the next best action to achieve the goal.

‹goal›
What's the sum of the first 5 prime numbers?
‹/goal›

Answer in JSON format with exactly one next best action:
{
  chosen_action: {
    // Calculates/evaluates the given expression.
    action_id: "calculate",
    expression: string,
  },
}
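
To make the link between a tool definition and the schema in the prompt concrete, here is a rough sketch of how such a block could be rendered. MCP tools describe their parameters with a JSON Schema in the `inputSchema` field; the `render_action` function and the `calculator` dictionary below are illustrative assumptions, not the code path that `baml_agents` or BAML actually uses, and the schema is not copied from the server.

```python
def render_action(name: str, description: str, input_schema: dict) -> str:
    """Render one tool as a field block resembling the prompt above."""
    required = set(input_schema.get("required", []))
    lines = ["chosen_action: {"]
    # The tool description becomes a comment above the action id.
    lines += [f"  // {line}" for line in description.splitlines()]
    lines.append(f'  action_id: "{name}",')
    for field, spec in input_schema.get("properties", {}).items():
        if "description" in spec:
            lines.append(f"  // {spec['description']}")
        type_name = spec.get("type", "any")
        if field not in required:
            type_name += " OR null"  # optional parameters may be omitted
        lines.append(f"  {field}: {type_name},")
    lines.append("}")
    return "\n".join(lines)


# Assumed tool definition, written by hand for illustration.
calculator = {
    "name": "calculate",
    "description": "Calculates/evaluates the given expression.",
    "inputSchema": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}
rendered = render_action(calculator["name"], calculator["description"], calculator["inputSchema"])
print(rendered)
```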


Let's send it and run the result:

In [5]:
calculator_mcp = "uvx mcp-server-calculator"
r = ActionRunner(TB, b=b, cache=True)
r.add_from_mcp_server(calculator_mcp)

question = "What's the sum of the first 5 prime numbers?"
action = r.b.BamlCustomTools_GetNextAction(question)
print(action)

chosen_action={'action_id': 'calculate', 'expression': '2 + 3 + 5 + 7 + 11'}


In [6]:
r.run(action)

Result(content='28', error=False)
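
Conceptually, running an action means looking up the tool named by `action_id` and passing the remaining fields as its parameters. The sketch below illustrates that dispatch step with a local stand-in for the calculator; in the notebook the call goes to the MCP server instead. The `Result` dataclass, `HANDLERS`, and `run` are hypothetical and only mimic the shape of the output above.

```python
import ast
import operator
from dataclasses import dataclass


@dataclass
class Result:
    content: str
    error: bool


_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul, ast.Div: operator.truediv}


def _evaluate(node: ast.AST) -> float:
    """Evaluate numeric literals combined with +, -, *, / and nothing else."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def calculate(expression: str) -> str:
    return str(_evaluate(ast.parse(expression, mode="eval").body))


# Local stand-in for the tools that an MCP server would expose.
HANDLERS = {"calculate": calculate}


def run(chosen_action: dict) -> Result:
    params = dict(chosen_action)
    handler = HANDLERS.get(params.pop("action_id", None))
    if handler is None:
        return Result(content="unknown action", error=True)
    try:
        return Result(content=handler(**params), error=False)
    except Exception as exc:  # report tool failures instead of raising
        return Result(content=str(exc), error=True)


print(run({"action_id": "calculate", "expression": "2 + 3 + 5 + 7 + 11"}))
```

Returning errors as data rather than raising keeps the loop simple: the caller can feed a failed result back to the LLM and ask for another action.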

## How to choose which tools to include in the prompt?

You can filter tools at two points: when adding them to the `ActionRunner`, and right before sending the prompt to the LLM.

Using all the tools in the server:

In [7]:
brave_search_mcp = "docker run -i --rm -e BRAVE_API_KEY mcp/brave-search"
env = {"BRAVE_API_KEY": "not-needed"}

r = ActionRunner(TB, b=b, cache=True)
r.add_from_mcp_server(brave_search_mcp, env=env)
print([a.name for a in r.actions])

['brave_web_search', 'brave_local_search']


Let's say we only want to import the "brave_web_search" tool:

In [8]:
from baml_agents import McpToolDefinition


def include(m: McpToolDefinition) -> bool:
    # Keep only the web search tool
    return m.name == "brave_web_search"


r = ActionRunner(TB, b=b, cache=True)
r.add_from_mcp_server(brave_search_mcp, include=include, env=env)
print([a.name for a in r.actions])

['brave_web_search']
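
When several tools should be kept, writing one predicate per case gets repetitive. The sketch below builds the predicate from a list of names. `ToolStub` is a hypothetical stand-in for `McpToolDefinition` that carries only the `name` attribute used in the cell above, and `only` is an illustrative helper, not part of `baml_agents`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolStub:
    """Stand-in for a tool definition; only `name` is needed for filtering."""
    name: str


def only(*names: str) -> Callable[[ToolStub], bool]:
    """Return a predicate that accepts tools whose name is in `names`."""
    allowed = set(names)

    def include(tool: ToolStub) -> bool:
        return tool.name in allowed

    return include


tools = [ToolStub("brave_web_search"), ToolStub("brave_local_search")]
include_web = only("brave_web_search")
kept = [t.name for t in tools if include_web(t)]
print(kept)
```

Because the predicate reads only `.name`, a function built this way should in principle be usable wherever the notebook passes `include=`, though that is untested here.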


Let's say we imported many tools but only want to make one available to the LLM (we'll use the **include()** function):

In [9]:
r = ActionRunner(TB, b=b, cache=True)
r.add_from_mcp_server(brave_search_mcp, env=env)
print([a.name for a in r.actions])

['brave_web_search', 'brave_local_search']


In [10]:
from baml_client.type_builder import TypeBuilder


question = "Find me some sushi"


def include(m: McpToolDefinition) -> bool:
    # Expose only the local search tool to the LLM for this request
    return m.name == "brave_local_search"


request = r.b_(include=include).request.BamlCustomTools_GetNextAction(question)
display_prompt(request)

[system]
Please select the next best action to achieve the goal.

‹goal›
Find me some sushi
‹/goal›

Answer in JSON format with exactly one next best action:
{
  chosen_action: {
    // Searches for local businesses and places using Brave's Local Search API. Best for queries related to physical locations, businesses, restaurants, services, etc. Returns detailed information including:
    // - Business names and addresses
    // - Ratings and review counts
    // - Phone numbers and opening hours
    // Use this when the query implies 'near me' or mentions specific locations. Automatically falls back to web search if no local results are found.
    action_id: "brave_local_search",
    // Number of results (1-20, default 5)
    count: float OR null,
    // Local search query (e.g. 'pizza near Central Park')
    query: string,
  },
}
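
Note that the rendered schema types `count` as `float OR null`, while the description asks for a number of results between 1 and 20 with a default of 5. If the tool needs a whole number, the value could be normalized before the call. This is a small sketch under that assumption; `normalize_count` is a hypothetical helper, and whether the server needs this step is not confirmed by the source.

```python
from typing import Optional


def normalize_count(count: Optional[float], default: int = 5, low: int = 1, high: int = 20) -> int:
    """Map a float-or-null count onto the integer range described in the prompt."""
    if count is None:
        return default  # the model left the optional field empty
    return max(low, min(high, int(round(count))))


print(normalize_count(None), normalize_count(3.0), normalize_count(50))
```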
