# Workshop Microsoft Build May 19, 2025: Build Your AI Agent with MCPs using vLLM, Pydantic AI, and AMD MI300X GPU

Welcome to this hands-on workshop! Throughout this tutorial, we'll use AMD GPUs to deploy powerful language models like Qwen3, and the **Model Context Protocol (MCP)**, an open standard for exposing tools to LLMs through a common API, to connect them to real-world tools. Key components:
- 🖥️ **vLLM** for GPU-optimized inference
- 🛠️ **Pydantic-AI** for agent/tool management
- 🔌 **MCP Servers** for pre-built tool integration

You'll learn how to set up your environment, deploy large language models like Qwen3, connect them to real-world tools using MCP, and build a conversational agent capable of reasoning and taking actions.

By the end of this workshop, you’ll have built an AI-powered Airbnb assistant agent—one that can find a place to stay based on your preferences like location, budget, and travel dates.

Let’s dive in!

## Table of Contents

- [Step 1: Launching vLLM Server on AMD GPUs](#step1)
- [Step 2: Installing Dependencies](#step2)
- [Step 3: Create a simple instance of Pydantic-AI Agent](#step3)
- [Step 4: Write a Date/Time Tool for Your Agent](#step4)
- [Step 5: Replace Your Date/Time Tool with an MCP Server](#step5)
- [Step 6: Turn Your Agent into an Airbnb Finder](#step6)
- [Step 7: Challenge with Prize](#step7)

<a id="step1"></a>

## Step 1: Launch a vLLM Server

In this workshop we are going to use [vLLM](https://github.com/vllm-project/vllm) as our inference serving engine. vLLM offers fast model execution, an extensive list of supported models, and a simple interface. Best of all, it's open source.

### Deploy Qwen3-30B-A3B Model with vLLM

In this workshop we have already configured `VLLM_PORT` for you. Let's verify the port number.


In [None]:
!echo $VLLM_PORT


It's time to start your vLLM server and create an endpoint for your LLM. Open a terminal from your Jupyter server, then run the following command in it to start the vLLM server:

```bash
VLLM_USE_TRITON_FLASH_ATTN=0 \
vllm serve Qwen/Qwen3-30B-A3B \
    --served-model-name Qwen3-30B-A3B \
    --api-key abc-123 \
    --port $VLLM_PORT \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --trust-remote-code
```

Open another terminal and monitor the GPU utilization by running this command:

```bash
watch rocm-smi
```

Upon successful launch, your server should be accepting incoming traffic through an OpenAI-compatible API. Let's set some environment variables for our server so we can use them throughout this tutorial:

In [None]:
import os

BASE_URL = f"http://localhost:{os.environ['VLLM_PORT']}/v1"

os.environ["BASE_URL"]    = BASE_URL
os.environ["OPENAI_API_KEY"] = "abc-123"   

print("Config set:", BASE_URL)

You can verify that your model is available at the `BASE_URL` we just set by running the following command.

In [None]:
!curl http://localhost:$VLLM_PORT/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"
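
The same check can be done from Python. The following is a minimal sketch using only the standard library; the helper names `build_models_request` and `list_model_ids` are hypothetical (they are not part of vLLM or this workshop), and the sketch assumes the server returns the usual OpenAI-style listing with a `data` array of objects that each carry an `id`.

```python
import json
import urllib.request


def build_models_request(port: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the /v1/models endpoint."""
    url = f"http://localhost:{port}/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def list_model_ids(port: str, api_key: str, timeout: float = 10.0) -> list[str]:
    """Send the request and return the model names reported by the server."""
    request = build_models_request(port, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: OpenAI-style response of the form {"data": [{"id": ...}, ...]}.
    return [model["id"] for model in payload["data"]]
```

Once the server has finished loading, calling `list_model_ids(os.environ["VLLM_PORT"], os.environ["OPENAI_API_KEY"])` should return a list that includes `Qwen3-30B-A3B`, the name passed to `--served-model-name`.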

Congratulations, you have just launched a powerful server that can serve incoming requests and lets you build amazing applications. Wasn't that easy? 🎉

<a id="step2"></a>

## Step 2: Installing Dependencies

We are going to use `Pydantic AI`. Let's install the dependencies:

In [None]:
!pip install -q pydantic_ai openai     


<a id="step3"></a>

## Step 3: Create a simple instance of Pydantic-AI Agent

Let's start by pointing Pydantic AI at our OpenAI-compatible vLLM endpoint. We create a provider with our base URL and API key, then a model object that uses it.


In [None]:
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

provider = OpenAIProvider(
    base_url=os.environ["BASE_URL"],
    api_key=os.environ["OPENAI_API_KEY"],
)

agent_model = OpenAIModel("Qwen3-30B-A3B", provider=provider)

Next, let's create an instance of the `Agent` class from `pydantic_ai`.


In [None]:
from pydantic_ai import Agent

agent = Agent(
    model=agent_model
)

It's time to test the agent. `pydantic_ai` provides multiple ways to run `Agent`. You can learn more about it [here](https://ai.pydantic.dev/agents/#running-agents).

In this workshop, we are running in `async` mode. We are going to define a helper function that allows us to quickly test our agent throughout this workshop.

In [None]:
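import asyncio
from pydantic_ai.mcp import MCPServerStdio


async def run_async(prompt: str) -> str:
    """Run the current global `agent` on `prompt` and return its text output."""
    # Start any MCP servers attached to the agent for the duration of this run.
    async with agent.run_mcp_servers():
        result = await agent.run(prompt)
        return result.output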


Test the agent by calling this function.

In [None]:
await run_async("What is the capital of France?")
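
A bare top-level `await` works here because Jupyter already runs an event loop. In a plain Python script there is no running loop, so the coroutine has to be driven with `asyncio.run`. Here is a minimal sketch of that pattern; `fake_agent_run` is a hypothetical stand-in for `run_async`, so the example runs without a model server.

```python
import asyncio


async def fake_agent_run(prompt: str) -> str:
    """Hypothetical stand-in for run_async; it does not call any model."""
    await asyncio.sleep(0)  # yield to the event loop once, like real I/O would
    return f"(stub answer to: {prompt})"


def run_sync(prompt: str) -> str:
    """Drive the coroutine to completion from synchronous code."""
    return asyncio.run(fake_agent_run(prompt))
```

In a script you would call `run_sync("What is the capital of France?")`, with the real `run_async` substituted for the stub.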

Great! We now have the basics in place: an agent instance connected to the model we started serving with vLLM earlier.

<a id="step4"></a>

## Step 4: Write a Date/Time Tool for Your Agent

LLMs rely on their training data to respond to your prompts. As a result, the agent we just defined cannot answer a factual question that falls outside its training knowledge. Let's show this with an example:

In [None]:
await run_async("What’s the date today?")

It is no surprise that the model failed to answer this question. Now it's time to power up your LLM by giving `agent` a function that returns the current date. The process of an LLM triggering a function call is commonly referred to as `Tool Calling` or `Function Calling`. In this workshop we are going to use `pydantic-ai`'s agent `tools` to give our agent the tools it needs. First, we need to define a custom tool. Below is how we can define a tool in this framework.

In [None]:
from datetime import datetime
from pydantic_ai import Tool


@Tool
def get_current_date() -> str:
    """Return the current local date and time as a 'YYYY-MM-DD HH:MM:SS' string."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


Next, we need to provide this tool to our agent, which tells the LLM that the tool we just defined exists. This is done by passing the tool to the `Agent` constructor through the `tools` argument.

In [None]:
agent = Agent(
    model=agent_model,
    tools=[get_current_date],
    system_prompt=(
        "You have access to:\n"
        "   1. get_current_date()\n"
        "Use this tool for date/time questions."
    ),
)

Let's test the agent.

In [None]:
await run_async("What’s the date today?")

Well done on building an agent with access to real-time data. 
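
For intuition, here is a rough sketch of what tool calling looks like underneath a framework like this. It is not Pydantic AI's actual implementation: the tool description follows the OpenAI-style "function" format, and `current_date_plain`, `TOOL_SPEC`, `REGISTRY`, `dispatch`, and the hand-written `simulated_call` are all hypothetical names used for illustration. The model never runs code itself. It only emits a tool name plus JSON arguments, and the client looks up the matching function and executes it.

```python
import json
from datetime import datetime


def current_date_plain() -> str:
    """Plain-Python version of the workshop's date tool."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


# Description advertised to the model (OpenAI-style function schema).
TOOL_SPEC = {
    "type": "function",
    "function": {
        "name": "get_current_date",
        "description": "Return the current date and time.",
        "parameters": {"type": "object", "properties": {}},
    },
}

# Client-side lookup table from tool name to the callable that implements it.
REGISTRY = {"get_current_date": current_date_plain}


def dispatch(tool_call: dict) -> str:
    """Execute the function named in a tool call and return its result."""
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"] or "{}")
    return REGISTRY[name](**arguments)


# Hand-written stand-in for what a model might return; not real model output.
simulated_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_current_date", "arguments": "{}"},
}
result = dispatch(simulated_call)
print(result)
```

The string returned by `dispatch` is then sent back to the model as a tool message, and the model uses it to write its final answer.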



<a id="step5"></a>

## Step 5: Replace Your Date/Time Tool with an MCP Server

Now that we have learned how to create a custom tool and give the agent access to it, let's explore the popular [Model Context Protocol](https://modelcontextprotocol.io/introduction). We are going to replace our custom tool with a simple MCP server that provides our agent with similar information.

**Why MCP?** MCP servers provide:
- ✅ Standardized API interfaces
- 🔄 Reusable across projects
- 📦 Pre-built functionality

Let's replace our custom time tool with an official MCP time server:

### Installing Time MCP Server

We are going to start by installing this MCP server:


In [None]:
!pip install -q mcp-server-time

Now let's define our time_server:

In [None]:
from pydantic_ai.mcp import MCPServerStdio

time_server = MCPServerStdio(
    "python",
    args=[
        "-m", "mcp_server_time",
        "--local-timezone=America/New_York",
    ],
)

Finally, let's modify our agent to remove our previously defined tool, and add this MCP server instead.

In [None]:
agent = Agent(
    model=agent_model,
    mcp_servers=[time_server],
    system_prompt = (
        "You are a helpful agent and you have access to this tool:\n"
        "   get_current_time(params: dict)\n"
        "When the user asks for the current date or time, call get_current_time.\n"
    )
)

Great, let's see if the agent can use the MCP server to give us the correct date.

In [None]:
await run_async("What’s the date today?")


Tadaa! You have now officially used an MCP server to power up your agent. In the next section we show how you can turn many ideas into real working projects by using the hundreds of free or paid MCP servers available today.



<a id="step6"></a>

## Step 6: Turn Your Agent into an Airbnb Finder

As we experienced in the last section, MCP servers are easy to use, and they provide a standard way of giving LLMs the tools they need. There are already thousands of MCP servers available. You can use an MCP tracker to discover them. Here are some for your reference:
- https://github.com/modelcontextprotocol/servers
- https://mcp.so/

We are going to use `npx` to launch our next server, so let's install the required dependencies first.

In [None]:
# Install Node.js 18 via NodeSource
!curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash -
!apt install -y nodejs

Verify `npm` and `npx` installation:

In [None]:
!node -v && npm -v && npx --version



In this part of the workshop we are going to build an agent that can help you browse available Airbnbs to book. We can now build on top of what we have so far and add an open-source Airbnb MCP server to our agent. To do so, let's start by defining our Airbnb server.

In [None]:
airbnb_server = MCPServerStdio(
    "npx", args=["-y", "@openbnb/mcp-server-airbnb", "--ignore-robots-txt"]
)

Let's update our agent.

In [None]:
system_prompt = """
You have access to three tools:
1. get_current_time(params: dict)
2. airbnb_search(params: dict)
3. airbnb_listing_details(params: dict)
When the user asks for listings, first call get_current_time, then airbnb_search, etc.
"""

agent = Agent(
    model=agent_model,
    mcp_servers=[time_server, airbnb_server],
    system_prompt=system_prompt,
)


Finally, let's try our agent and see if it can browse through Airbnb listings.

In [None]:
await run_async("Find a place to stay in Vancouver for next Sunday for 3 nights for 2 adults?")
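
The system prompt asks for `get_current_time` first because a phrase like "next Sunday for 3 nights" can only be resolved relative to today's date. The sketch below shows the date arithmetic the agent effectively has to perform before it can search. `next_sunday_stay` is a hypothetical helper, and reading "next Sunday" as the first Sunday strictly after today is an assumption.

```python
from datetime import date, timedelta


def next_sunday_stay(today: date, nights: int = 3) -> tuple[str, str]:
    """Return (check-in, check-out) as ISO dates for a stay starting next Sunday."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    days_ahead = (6 - today.weekday()) % 7 or 7  # on a Sunday, skip to the following one
    check_in = today + timedelta(days=days_ahead)
    check_out = check_in + timedelta(days=nights)
    return check_in.isoformat(), check_out.isoformat()


# Monday, May 19, 2025 (the workshop date) gives a Sunday-to-Wednesday stay.
print(next_sunday_stay(date(2025, 5, 19)))  # → ('2025-05-25', '2025-05-28')
```

If the agent returns listings for the wrong dates, comparing its search arguments against a calculation like this is a quick way to see whether the date resolution or the search itself went wrong.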



<a id="step7"></a>

## Step 7: Challenge - Expand the Agent

**Task:** Add weather integration using an appropriate MCP server:
1. Launch weather MCP server
2. Add to agent's tools
3. Make agent suggest best travel dates based on weather

**Judging Criteria:**
- ✅ Functional weather integration
- 🎯 Logical tool selection
- 💡 Creative use of multiple tools
