# model context protocol (mcp) - complete guide

this notebook demonstrates the full mcp ecosystem:
1. connecting to docker mcp gateway
2. creating custom mcp servers with tools, resources, and prompts
3. using mcp inspector for debugging
4. implementing mcp clients
5. building multi-agent systems with mcp tools

## setup and imports

In [None]:
# install required packages (run once)
# !pip install "mcp[cli]" llama-index llama-index-tools-mcp llama-index-llms-openai python-dotenv

In [1]:
import asyncio
import os
from llama_index.tools.mcp import BasicMCPClient, aget_tools_from_mcp_url
from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import ReActAgent, AgentWorkflow, AgentStream, ToolCallResult
from dotenv import load_dotenv

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DOCKER_GATEWAY_TOKEN = os.getenv("DOCKER_GATEWAY_TOKEN", "")  # copy from docker gateway output

## 1. docker mcp gateway - connecting to remote mcp servers

### what is docker mcp gateway?
docker mcp gateway allows you to connect multiple mcp servers through docker desktop and access them via a single endpoint.

### setup steps:
1. install go (required for mcp gateway)
2. open docker desktop
3. go to settings → enable "docker mcp toolkit"
4. navigate to mcp toolkit → clients (you'll see cursor, codex, claude desktop)
5. start the docker mcp gateway:
   ```bash
   docker mcp gateway run --port 8080 --transport streaming
   ```
6. copy the bearer token from the gateway output and add it to your `.env` file

### gateway urls:
- **from docker containers**: `http://host.docker.internal:8080/mcp`
- **from local machine**: `http://localhost:8080/mcp`

### important: authentication
the docker gateway requires bearer token authentication. make sure to:
1. copy the token from the gateway startup output
2. add it to your `.env` file as `DOCKER_GATEWAY_TOKEN=your_token_here`
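
the two steps above can be sketched as a small helper that turns the token into the headers the gateway expects. `build_gateway_headers` and `mask_token` are hypothetical names used only for illustration (they are not part of llama-index or the mcp sdk); the `Accept` value mirrors the one used in the client cells of this notebook.

```python
def build_gateway_headers(token: str) -> dict[str, str]:
    """Build HTTP headers for a bearer-token-protected MCP endpoint.

    Hypothetical helper for illustration only.
    """
    token = token.strip()
    if not token:
        raise ValueError("DOCKER_GATEWAY_TOKEN is empty; copy it from the gateway output")
    return {
        "Authorization": f"Bearer {token}",
        # same Accept value as the client cells in this notebook
        "Accept": "application/json, text/event-stream",
    }


def mask_token(token: str, visible: int = 4) -> str:
    """Return a log-safe form of the token: a short prefix followed by dots."""
    if len(token) <= visible:
        return "***"
    return token[:visible] + "..."
```

in the notebook this would be used as `headers = build_gateway_headers(os.getenv("DOCKER_GATEWAY_TOKEN", ""))`, with `mask_token` applied before printing the token anywhere.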

In [3]:
import os
from dotenv import load_dotenv
load_dotenv(override=True)
DOCKER_GATEWAY_TOKEN = os.getenv("DOCKER_GATEWAY_TOKEN", "")
print(f"Reloaded: {DOCKER_GATEWAY_TOKEN[:20]}...")

Reloaded: n3jihrmridkv4jdh7jwg...


In [4]:
# connect to docker mcp gateway
async def explore_docker_gateway():
    """explore mcp servers available through docker gateway"""
    
    if not DOCKER_GATEWAY_TOKEN:
        print("⚠️ warning: DOCKER_GATEWAY_TOKEN not set in .env file")
        print("   copy token from docker gateway output: 'Use Bearer token: Authorization: Bearer <token>'")
        return None
    
    # create client with authentication
    docker_client = BasicMCPClient(
        "http://localhost:8080/mcp",
        headers={
            "Authorization": f"Bearer {DOCKER_GATEWAY_TOKEN}",
            "Accept": "application/json, text/event-stream"
        }
    )
    
    # list available tools from all connected mcp servers
    print("=== available tools from docker gateway ===")
    try:
        available_tools = await docker_client.list_tools()
        print(f"found {len(available_tools.tools)} tools:")
        for tool in available_tools.tools:
            print(f"  - {tool.name}: {tool.description}")
    except Exception as e:
        print(f"error connecting to docker gateway: {e}")
        import traceback
        print("\nfull error details:")
        traceback.print_exc()
        print("\nmake sure:")
        print("  1. docker gateway is running: docker mcp gateway run --port 8080 --transport streaming")
        print("  2. DOCKER_GATEWAY_TOKEN is set in .env file")
        return None
    
    return docker_client

# run the exploration
docker_client = await explore_docker_gateway()

=== available tools from docker gateway ===
found 9 tools:
  - code-mode: Create a JavaScript-enabled tool that combines multiple MCP server tools. 
This allows you to write scripts that call multiple tools and combine their results.
Use the mcp-find tool to find servers and make sure they are are ready with the mcp-add tool. When running
mcp-add, we don't have to activate the tools.

  - get_timed_transcript: Retrieves the transcript of a YouTube video with timestamps.
  - get_transcript: Retrieves the transcript of a YouTube video.
  - get_video_info: Retrieves the video information.
  - mcp-add: Add a new MCP server to the session. 
The server must exist in the catalog.
  - mcp-config-set: Set configuration for an MCP server. 
The config object will be validated against the server's config schema. If validation fails, the error message will include the correct schema.
  - mcp-exec: Execute a tool that exists in the current session. This allows calling tools that may not be visible i

### example: youtube transcription with docker gateway

this example uses an mcp server for youtube transcription connected through docker gateway.

In [5]:
async def youtube_transcription_example():
    """example: transcribe a youtube video using docker gateway mcp"""
    
    if not DOCKER_GATEWAY_TOKEN:
        print("⚠️ error: DOCKER_GATEWAY_TOKEN not set. please add it to your .env file")
        return
    
    # get transcription tools from docker gateway
    docker_client = BasicMCPClient(
        "http://localhost:8080/mcp",
        headers={
            "Authorization": f"Bearer {DOCKER_GATEWAY_TOKEN}",
            "Accept": "application/json, text/event-stream"
        }
    )
    
    try:
        tools = await aget_tools_from_mcp_url(
            "http://localhost:8080/mcp",
            client=docker_client,
            allowed_tools=["get_transcript"],  # specify which tools to use
        )
        
        print(f"loaded {len(tools)} tools from docker gateway")
        
        # create an agent with transcription tools
        llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
        agent = ReActAgent(tools=tools, llm=llm, verbose=True)
        
        # transcribe a video
        print("\n=== transcribing youtube video ===")
        response = await agent.run(
            user_msg="get me a transcript of the video at https://www.youtube.com/watch?v=Fhy_VFMlE9s"
        )
        print(f"\nresponse: {response}")
    except Exception as e:
        print(f"error: {e}")

# run the example
await youtube_transcription_example()

loaded 1 tools from docker gateway

=== transcribing youtube video ===
Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent

response: Here is the transcript of the video "Building an MCP server in 2 minutes":

"Ever wish AI could connect to your tools like USBC connects to everything? That's what MCP, the model context protocol does. It securely links language models to your files, APIs, 

## 2. creating a custom mcp server

### what is an mcp server?
an mcp server exposes three main components:
1. **tools**: functions that ai agents can call
2. **resources**: static or dynamic content (files, guides, documentation)
3. **prompts**: pre-defined prompt templates with parameters

### example: italian recipes mcp server

let's examine our custom mcp server that provides italian recipe information.

### mcp server code structure

```python
from mcp.server.fastmcp import FastMCP

# create an mcp server
mcp = FastMCP("Local MCP Server - Recipes Assistant")

# 1. TOOLS - functions that agents can call
@mcp.tool()
def list_recipes() -> list[str]:
    """list all available italian recipes"""
    return ["Lasagna", "Pasta Carbonara", "Pizza Margherita", ...]

@mcp.tool()
def get_recipe_instructions(recipe_name: str) -> str:
    """get instructions for preparing a specific italian recipe"""
    # load from recipes.json and return formatted recipe
    ...

# 2. RESOURCES - static content accessible via uri
@mcp.resource("guide://usage")
def get_usage_guide() -> str:
    """get instructions on how to use this mcp server"""
    return """# italian recipes mcp server - usage guide..."""

# 3. PROMPTS - pre-defined prompt templates
@mcp.prompt()
def suggest_recipe(occasion: str = "dinner", dietary_preference: str = "none") -> str:
    """generate a prompt to get italian recipe suggestions"""
    return f"suggest an italian recipe perfect for {occasion}..."
```
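
the body of `get_recipe_instructions` is elided above. as a rough sketch of what "load from recipes.json and return formatted recipe" could look like, the snippet below formats one entry from a json document. the schema (`ingredients` and `steps` lists keyed by recipe name), the sample data, and the name `format_recipe` are all assumptions made for illustration; the real `recipes.json` is not shown in this notebook.

```python
import json

# illustrative data only; the real recipes.json schema is not shown here
SAMPLE_RECIPES_JSON = """
{
  "Pizza Margherita": {
    "ingredients": ["pizza dough (500g)", "400g san marzano tomatoes, crushed"],
    "steps": ["placeholder step one", "placeholder step two"]
  }
}
"""


def format_recipe(recipes: dict, recipe_name: str) -> str:
    """Return a markdown-style recipe, or a not-found message (hypothetical helper)."""
    # case-insensitive lookup so "pizza margherita" also matches
    by_lower = {name.lower(): name for name in recipes}
    key = by_lower.get(recipe_name.strip().lower())
    if key is None:
        available = ", ".join(sorted(recipes))
        return f"recipe '{recipe_name}' not found. available: {available}"
    recipe = recipes[key]
    lines = [f"**{key}**", "", "ingredients:"]
    lines += [f"- {item}" for item in recipe["ingredients"]]
    lines += ["", "instructions:"]
    lines += [f"{i}. {step}" for i, step in enumerate(recipe["steps"], 1)]
    return "\n".join(lines)


recipes = json.loads(SAMPLE_RECIPES_JSON)
print(format_recipe(recipes, "pizza margherita"))
```

returning a not-found message instead of raising keeps the tool result readable for the agent, which can then retry with one of the listed names.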

### starting the local mcp server

to start your local mcp server, run in a terminal:

```bash
mcp run mcp_server/mcp_server.py --transport=streamable-http
```

this starts the server at `http://localhost:8000/mcp`

**transport options:**
- `streamable-http`: http streaming (recommended for remote access)
- `stdio`: standard input/output (default)

## 3. mcp inspector - debugging your mcp server

### what is mcp inspector?
mcp inspector is a web-based debugging tool that lets you:
- explore available tools, resources, and prompts
- test tool calls with different parameters
- view tool responses in real-time
- debug your mcp server before integrating with clients

### starting the inspector

```bash
# start the inspector (opens in browser)
mcp dev ./mcp_server/mcp_server.py
```

### connecting to your server in inspector

1. inspector opens at `http://localhost:5173`
2. in another terminal, start your mcp server:
   ```bash
   mcp run mcp_server/mcp_server.py --transport=streamable-http
   ```
3. in the inspector:
   - connection type: **proxy** (not direct)
   - server url: `http://localhost:8000/mcp`
4. click "connect"

### what you can do in inspector:
- **tools tab**: see all available tools, test them with parameters
- **resources tab**: browse available resources (e.g., `guide://usage`)
- **prompts tab**: test prompt templates with different arguments
- **logs tab**: view server logs and debug issues

## 4. connecting to local mcp server from python client

In [7]:
async def explore_local_mcp_server():
    """explore our local italian recipes mcp server"""
    
    # connect to local mcp server
    local_client = BasicMCPClient("http://localhost:8000/mcp")
    
    print("=== exploring local mcp server ===")
    
    try:
        # list available tools
        print("\n--- tools ---")
        available_tools = await local_client.list_tools()
        for tool in available_tools.tools:
            print(f"  - {tool.name}: {tool.description}")
        
        # list available resources
        print("\n--- resources ---")
        try:
            resources = await local_client.list_resources()
            for resource in resources.resources:
                print(f"  - {resource.uri}: {resource.name}")
                if hasattr(resource, 'description') and resource.description:
                    print(f"    description: {resource.description}")
        except Exception as e:
            print(f"  no resources available: {e}")
        
        # list available prompts
        print("\n--- prompts ---")
        try:
            prompts = await local_client.list_prompts()
            for prompt in prompts.prompts:
                print(f"  - {prompt.name}: {prompt.description}")
                if hasattr(prompt, 'arguments') and prompt.arguments:
                    print(f"    arguments: {[arg.name for arg in prompt.arguments]}")
        except Exception as e:
            print(f"  no prompts available: {e}")
        
        return local_client
    except Exception as e:
        print(f"error connecting to local mcp server: {e}")
        print("make sure the local server is running:")
        print("  mcp run mcp_server/mcp_server.py --transport=streamable-http")
        return None

# run the exploration
local_client = await explore_local_mcp_server()

=== exploring local mcp server ===

--- tools ---
  - list_recipes: list all available italian recipes
  - get_recipe_instructions: get instructions for preparing a specific italian recipe

--- resources ---
  - guide://usage: get_usage_guide
    description: get instructions on how to use this mcp server

--- prompts ---
  - suggest_recipe: generate a prompt to get italian recipe suggestions based on occasion and dietary preferences
    arguments: ['occasion', 'dietary_preference']


### using local mcp server with an agent

In [None]:
async def local_mcp_agent_example():
    """create an agent that uses local mcp server tools"""
    
    # connect to local mcp
    local_client = BasicMCPClient("http://localhost:8000/mcp")
    
    try:
        # get tools from local mcp
        tools = await aget_tools_from_mcp_url(
            "http://localhost:8000/mcp",
            client=local_client,
            allowed_tools=["list_recipes", "get_recipe_instructions"],
        )
        
        print(f"loaded {len(tools)} tools from local mcp:")
        for tool in tools:
            print(f"  - {tool.metadata.name}: {tool.metadata.description}")
        
        # create agent
        llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
        agent = ReActAgent(tools=tools, llm=llm, verbose=False)  # disable verbose to control output
        
        # test the agent
        print("\n" + "="*80)
        print("USER QUERY: how to make pizza margherita?")
        print("="*80)
        
        # run with streaming to capture tool calls
        handler = agent.run(user_msg="how to make pizza margherita?")
        
        agent_response = ""
        tool_call_count = 0
        
        async for ev in handler.stream_events():
            if isinstance(ev, AgentStream):
                # accumulate agent reasoning/response
                agent_response += ev.delta
            elif isinstance(ev, ToolCallResult):
                tool_call_count += 1
                print(f"\n[TOOL CALL #{tool_call_count}]")
                print(f"  Tool: {ev.tool_name}")
                print(f"  Arguments: {ev.tool_kwargs}")
                print(f"  Response: {str(ev.tool_output)[:200]}...")  # first 200 chars
        
        # get final response
        final_response = await handler
        
        print("\n" + "="*80)
        print("AGENT FINAL RESPONSE:")
        print("="*80)
        print(final_response)
        print("\n" + "="*80)
        print(f"Total tool calls made: {tool_call_count}")
        print("="*80)
        
    except Exception as e:
        print(f"error: {e}")
        print("make sure the local mcp server is running")
        import traceback
        traceback.print_exc()

# run the example
await local_mcp_agent_example()

loaded 2 tools from local mcp:
  - list_recipes: list all available italian recipes
  - get_recipe_instructions: get instructions for preparing a specific italian recipe

USER QUERY: how to make pizza margherita?

[TOOL CALL #1]
  Tool: list_recipes
  Arguments: {}
  Response: meta=None content=[TextContent(type='text', text='Lasagna', annotations=None), TextContent(type='text', text='Pasta Carbonara', annotations=None), TextContent(type='text', text='Pizza Margherita', ann...

[TOOL CALL #2]
  Tool: get_recipe_instructions
  Arguments: {'recipe_name': 'Pizza Margherita'}
  Response: meta=None content=[TextContent(type='text', text='**Pizza Margherita**\n\ningredients:\n- pizza dough (500g)\n- 400g san marzano tomatoes, crushed\n- 300g fresh mozzarella, sliced\n- fresh basil leave...

AGENT FINAL RESPONSE:
To make Pizza Margherita, follow these instructions:

**Ingredients:**
- 500g pizza dough
- 400g San Marzano tomatoes, crushed
- 300g fresh mozzarella, sliced
- Fresh basil leaves
- 
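
the tool responses printed above are result objects whose `content` list holds typed items rather than plain strings. a minimal sketch of pulling only the text out of such a result is shown below. it assumes the object exposes a `content` list whose items carry `type` and `text` attributes, as the printed output suggests; `extract_text` is a hypothetical helper, and the stand-in objects are built with `SimpleNamespace` rather than real mcp types.

```python
from types import SimpleNamespace


def extract_text(tool_result) -> str:
    """Join the text of all text-type content items in a tool result.

    Assumes `tool_result.content` is a list of items with `type` and `text`
    attributes; anything else is ignored.
    """
    parts = []
    for item in getattr(tool_result, "content", None) or []:
        if getattr(item, "type", None) == "text":
            parts.append(item.text)
    return "\n".join(parts)


# stand-in object mimicking the printed structure; not a real MCP result
fake_result = SimpleNamespace(
    meta=None,
    content=[
        SimpleNamespace(type="text", text="Lasagna"),
        SimpleNamespace(type="text", text="Pasta Carbonara"),
        SimpleNamespace(type="image", data="..."),  # skipped: not text
    ],
)
print(extract_text(fake_result))
```

which attribute of the streamed event carries this object may depend on the llama-index version, so treat the access path as something to verify in your environment.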

## 5. remote mcp server hosted on fastmcp cloud

### what is fastmcp cloud?
fastmcp cloud is a hosting platform for mcp servers (similar to vercel for web apps).

### setup steps:
1. fork the template repository: https://github.com/PrefectHQ/fastmcp-quickstart-template
2. visit https://fastmcp.cloud/
3. sign in with github and select your forked repo
4. configure security settings (public/private)
5. deploy - you get a url like: `https://your-app.fastmcp.app/mcp`

### benefits:
- no need to host your own server
- automatic deployments from github
- free hosting for public mcp servers
- easy integration with cursor, claude desktop, chatgpt, etc.

In [11]:
async def remote_mcp_example():
    """connect to remote mcp server hosted on fastmcp cloud"""
    
    # connect to remote fastmcp server (weather api)
    remote_client = BasicMCPClient("https://unnecessary-crimson-wildebeest.fastmcp.app/mcp")
    
    print("=== exploring remote fastmcp server ===")
    
    try:
        # list available tools
        available_tools = await remote_client.list_tools()
        print(f"found {len(available_tools.tools)} tools:")
        for tool in available_tools.tools:
            print(f"  - {tool.name}: {tool.description}")
        
        # get tools for agent
        tools = await aget_tools_from_mcp_url(
            "https://unnecessary-crimson-wildebeest.fastmcp.app/mcp",
            client=remote_client,
            allowed_tools=["get_weather"],
        )
        
        # create agent
        llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
        agent = ReActAgent(tools=tools, llm=llm, verbose=False)  # disable verbose to control output
        
        # test the agent with detailed logging
        print("\n" + "="*80)
        print("USER QUERY: what's the weather in rome?")
        print("="*80)
        
        # run with streaming to capture tool calls
        handler = agent.run(user_msg="what's the weather in rome?")
        
        tool_call_count = 0
        
        async for ev in handler.stream_events():
            if isinstance(ev, ToolCallResult):
                tool_call_count += 1
                print(f"\n[TOOL CALL #{tool_call_count}]")
                print(f"  Tool: {ev.tool_name}")
                print(f"  Arguments: {ev.tool_kwargs}")
                print(f"  Response: {str(ev.tool_output)[:300]}...")  # first 300 chars
        
        # get final response
        final_response = await handler
        
        print("\n" + "="*80)
        print("AGENT FINAL RESPONSE:")
        print("="*80)
        print(final_response)
        print("\n" + "="*80)
        print(f"Total tool calls made: {tool_call_count}")
        print("="*80)
        
    except Exception as e:
        print(f"error: {e}")
        import traceback
        traceback.print_exc()

# run the example
await remote_mcp_example()

=== exploring remote fastmcp server ===
found 2 tools:
  - echo_tool: echo the input text
  - get_weather: get weather information for a specific location using meteosource api

args:
    place_id: location identifier (e.g., 'london', 'new_york', 'tokyo')

returns:
    dict: weather information including current conditions and forecasts

USER QUERY: what's the weather in rome?

[TOOL CALL #1]
  Tool: get_weather
  Arguments: {'place_id': 'rome'}
  Response: meta=None content=[TextContent(type='text', text='{"lat":"41.89193N","lon":"12.51133E","elevation":20,"timezone":"UTC","units":"metric","current":{"icon":"partly_sunny","icon_num":4,"summary":"Partly sunny","temperature":13.8,"wind":{"speed":1.6,"angle":5,"dir":"N"},"precipitation":{"total":0.0,"typ...

AGENT FINAL RESPONSE:
The current weather in Rome is partly sunny with a temperature of 13.8°C. The wind is coming from the north at a speed of 1.6 m/s, and there is no precipitation.

Total tool calls made: 1
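
the `get_weather` response above is a json document delivered as text. if you want to handle it without an llm, a sketch like the following parses the text and builds a one-line summary. the sample payload is assembled only from fields visible in the truncated output above; `summarize_weather` is a hypothetical helper, and the real api may return more fields than shown.

```python
import json

# sample payload built only from fields visible in the truncated output above
SAMPLE_PAYLOAD = json.dumps({
    "lat": "41.89193N",
    "lon": "12.51133E",
    "units": "metric",
    "current": {
        "summary": "Partly sunny",
        "temperature": 13.8,
        "wind": {"speed": 1.6, "angle": 5, "dir": "N"},
        "precipitation": {"total": 0.0},
    },
})


def summarize_weather(payload: str) -> str:
    """Turn the JSON text of a weather response into a short summary line."""
    data = json.loads(payload)
    current = data.get("current", {})
    wind = current.get("wind", {})
    return (
        f"{current.get('summary', 'unknown')}, "
        f"{current.get('temperature', '?')} degrees ({data.get('units', 'unknown units')}), "
        f"wind {wind.get('speed', '?')} from {wind.get('dir', '?')}"
    )


print(summarize_weather(SAMPLE_PAYLOAD))
```

every lookup uses `.get` with a fallback so that a missing field degrades the summary instead of raising.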


## 6. hugging face spaces - 200+ ready-to-use mcp servers

### what are hugging face mcp spaces?
hugging face hosts over 200 mcp servers as gradio spaces, providing ready-to-use capabilities:
- image generation (flux, stable diffusion)
- text-to-speech
- translation
- and many more...

### how to use:
1. browse spaces at: https://huggingface.co/spaces
2. look for spaces with mcp support
3. use the gradio api endpoint: `https://[space-name].hf.space/gradio_api/mcp/`
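
the endpoint pattern in step 3 can be wrapped in a tiny helper. this is a sketch under the assumption that you already know the space's `hf.space` subdomain (for the example below, `hysts-mcp-flux-1-schnell`); `gradio_mcp_url` is a hypothetical name, and the helper does not check that the space actually exposes mcp.

```python
def gradio_mcp_url(space_subdomain: str) -> str:
    """Build the Gradio MCP endpoint URL for a Hugging Face Space subdomain."""
    sub = space_subdomain.strip().lower()
    # a subdomain is a single DNS label: no slashes, dots, or spaces
    if not sub or any(ch in sub for ch in "/. "):
        raise ValueError(f"not a valid space subdomain: {space_subdomain!r}")
    return f"https://{sub}.hf.space/gradio_api/mcp/"


print(gradio_mcp_url("hysts-mcp-flux-1-schnell"))
```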

### example: flux image generation

In [None]:
async def huggingface_mcp_example():
    """use hugging face gradio space for image generation"""
    
    import base64
    from io import BytesIO
    from PIL import Image
    from IPython.display import display
    import os
    from datetime import datetime
    
    # connect to gradio flux mcp server
    gradio_client = BasicMCPClient("https://hysts-mcp-flux-1-schnell.hf.space/gradio_api/mcp/")
    
    print("=== exploring hugging face flux mcp server ===")
    
    # list available tools
    try:
        available_tools = await gradio_client.list_tools()
        print(f"found {len(available_tools.tools)} tools:")
        for tool in available_tools.tools:
            print(f"  - {tool.name}: {tool.description[:100]}...")
        
        # get tools for agent
        tools = await aget_tools_from_mcp_url(
            "https://hysts-mcp-flux-1-schnell.hf.space/gradio_api/mcp/",
            client=gradio_client,
        )
        
        print(f"\nloaded {len(tools)} tools")
        
        # create agent
        llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
        agent = ReActAgent(tools=tools, llm=llm, verbose=False)
        
        # test the agent
        print("\n" + "="*80)
        print("USER QUERY: generate an image of a sunset over mountains")
        print("="*80)
        print("note: image generation may take 10-30 seconds...\n")
        
        handler = agent.run(user_msg="generate an image of a sunset over mountains")
        
        tool_call_count = 0
        generated_image = None
        
        async for ev in handler.stream_events():
            if isinstance(ev, ToolCallResult):
                tool_call_count += 1
                print(f"\n[TOOL CALL #{tool_call_count}]")
                print(f"  Tool: {ev.tool_name}")
                print(f"  Arguments: {ev.tool_kwargs}")
                
                output_str = str(ev.tool_output)
                
                # check if this is an ImageContent response
                if 'ImageContent' in output_str and hasattr(ev.tool_output, 'content'):
                    for content_item in ev.tool_output.content:
                        if hasattr(content_item, 'type') and content_item.type == 'image':
                            # decode and save image
                            image_bytes = base64.b64decode(content_item.data)
                            generated_image = Image.open(BytesIO(image_bytes))
                            
                            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                            image_path = f"generated_image_{timestamp}.png"
                            generated_image.save(image_path)
                            
                            print(f"  Response: [Image generated]")
                            print(f"  Size: {generated_image.size}")
                            print(f"  Saved to: {os.path.abspath(image_path)}")
                            break
                else:
                    print(f"  Response: {output_str[:200]}...")
        
        final_response = await handler
        
        print("\n" + "="*80)
        print("FINAL RESPONSE:")
        print("="*80)
        print(final_response)
        print(f"\nTotal tool calls: {tool_call_count}")
        print("="*80)
        
        # display the image
        if generated_image:
            print("\nGENERATED IMAGE:")
            display(generated_image)
        else:
            print("\n⚠️ No image generated (quota or API compatibility issues)")
        
    except Exception as e:
        print(f"\nerror: {e}")
        import traceback
        traceback.print_exc()

# run the example
await huggingface_mcp_example()

✅ Found Hugging Face token: hf_BlICFoT...

=== exploring hugging face flux mcp server ===
found 2 tools:
  - FLUX_1_schnell_get_seed: Determine and return the random seed to use for model generation. - MAX_SEED is the maximum value fo...
  - FLUX_1_schnell_infer: Generate an image from a text prompt using the FLUX.1 [schnell] model. - Prompts must be in English....

loaded 2 tools

USER QUERY: generate an image of a sunset over mountains
note: image generation may take 10-30 seconds...
      attempting with authentication...


[TOOL CALL #1]
  Tool: FLUX_1_schnell_get_seed
  Arguments: {'properties': AttributedDict([('randomize_seed', True), ('seed', None)])}
  Response: [ERROR] meta=None content=[TextContent(type='text', text="Parameter `properties` is not a valid key-word argument. Please click on 'view API' in the footer of the Gradio app to see usage.", annotations=None)]...

[TOOL CALL #2]
  Tool: FLUX_1_schnell_get_seed
  Arguments: {'randomize_seed': True}
  Response: [ERROR] 

## 7. hybrid agent - combining multiple mcp servers

### the power of mcp: unified tool access

one of the key benefits of mcp is the ability to combine tools from multiple sources:
- local mcp servers (recipes)
- remote mcp servers (weather)
- docker gateway mcp servers (youtube transcription)
- hugging face spaces (image generation)

all in a single agent!
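
combining sources mostly means loading each tool list while tolerating the servers that are offline. the cell below does this one source at a time; as an alternative sketch, the loads can run concurrently with `asyncio.gather(..., return_exceptions=True)`. the loaders here are stand-ins that return plain strings, and `load_all_tools` is a hypothetical helper; in the notebook each loader would wrap a call to `aget_tools_from_mcp_url`.

```python
import asyncio
from typing import Awaitable, Callable


async def load_all_tools(
    loaders: dict[str, Callable[[], Awaitable[list]]],
) -> tuple[list, dict[str, str]]:
    """Run all loaders concurrently; return (combined tools, errors by source)."""
    names = list(loaders)
    results = await asyncio.gather(
        *(loaders[name]() for name in names), return_exceptions=True
    )
    tools: list = []
    errors: dict[str, str] = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            errors[name] = str(result)  # source failed; keep going with the rest
        else:
            tools.extend(result)
    return tools, errors


# stand-in loaders for illustration; they do not contact any server
async def fake_local() -> list:
    return ["list_recipes", "get_recipe_instructions"]


async def fake_remote() -> list:
    return ["get_weather"]


async def fake_offline() -> list:
    raise ConnectionError("server not reachable")


SOURCES = {"local": fake_local, "remote": fake_remote, "docker": fake_offline}

if __name__ == "__main__":
    # in a notebook cell, use `await load_all_tools(SOURCES)` instead
    print(asyncio.run(load_all_tools(SOURCES)))
```

results come back in the order the loaders were passed, so the combined list is deterministic even though the loads overlap in time.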

In [26]:
async def hybrid_agent_example():
    """create an agent with tools from multiple mcp sources"""
    
    import base64
    from io import BytesIO
    from PIL import Image
    from IPython.display import display
    import os
    from datetime import datetime
    import sys
    
    print("=== creating hybrid agent with multiple mcp sources ===")
    
    # connect to all mcp sources
    if not DOCKER_GATEWAY_TOKEN:
        print("⚠️ warning: DOCKER_GATEWAY_TOKEN not set. docker gateway tools will be skipped.")
    
    docker_client = BasicMCPClient(
        "http://localhost:8080/mcp",
        headers={
            "Authorization": f"Bearer {DOCKER_GATEWAY_TOKEN}",
            "Accept": "application/json, text/event-stream"
        }
    ) if DOCKER_GATEWAY_TOKEN else None
    remote_client = BasicMCPClient("https://unnecessary-crimson-wildebeest.fastmcp.app/mcp")
    local_client = BasicMCPClient("http://localhost:8000/mcp")
    gradio_client = BasicMCPClient("https://hysts-mcp-flux-1-schnell.hf.space/gradio_api/mcp/")
    
    # get tools from each source
    print("\nloading tools...")
    
    docker_tools = []
    if docker_client:
        try:
            docker_tools = await aget_tools_from_mcp_url(
                "http://localhost:8080/mcp",
                client=docker_client,
                allowed_tools=["get_transcript"],
            )
            print(f"  docker gateway: ✅ {len(docker_tools)} tools")
        except Exception as e:
            print(f"  docker gateway: ❌ {str(e)[:50]}...")
    else:
        print("  docker gateway: ⚠️ skipped (no token)")
    
    try:
        local_tools = await aget_tools_from_mcp_url(
            "http://localhost:8000/mcp",
            client=local_client,
            allowed_tools=["list_recipes", "get_recipe_instructions"],
        )
        print(f"  local mcp: ✅ {len(local_tools)} tools")
    except Exception as e:
        print(f"  local mcp: ❌ {str(e)[:50]}...")
        local_tools = []
    
    try:
        remote_tools = await aget_tools_from_mcp_url(
            "https://unnecessary-crimson-wildebeest.fastmcp.app/mcp",
            client=remote_client,
            allowed_tools=["get_weather"],
        )
        print(f"  remote fastmcp: ✅ {len(remote_tools)} tools")
    except Exception as e:
        print(f"  remote fastmcp: ❌ {str(e)[:50]}...")
        remote_tools = []
    
    try:
        gradio_tools = await aget_tools_from_mcp_url(
            "https://hysts-mcp-flux-1-schnell.hf.space/gradio_api/mcp/",
            client=gradio_client,
        )
        print(f"  hugging face: ✅ {len(gradio_tools)} tools")
    except Exception as e:
        print(f"  hugging face: ❌ {str(e)[:50]}...")
        gradio_tools = []
    
    # combine all tools
    all_tools = docker_tools + local_tools + remote_tools + gradio_tools
    print(f"\n{'='*80}")
    print(f"TOTAL TOOLS: {len(all_tools)} (docker: {len(docker_tools)}, local: {len(local_tools)}, remote: {len(remote_tools)}, hf: {len(gradio_tools)})")
    print(f"{'='*80}\n")
    
    if not all_tools:
        print("⚠️ no tools available. make sure at least one mcp server is running.")
        return
    
    # create hybrid agent
    llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
    agent = ReActAgent(tools=all_tools, llm=llm, verbose=False)
    
    # test 1: recipe query
    if local_tools:
        print("\n\n")
        print("█" * 80)
        print("█ TEST 1: RECIPE QUERY (Local MCP)")
        print("█" * 80)
        print("\nQuery: how to make tiramisu?\n")
        
        try:
            handler1 = agent.run(user_msg="how to make tiramisu?")
            
            # collect all events
            tool_calls = []
            async for ev in handler1.stream_events():
                if isinstance(ev, ToolCallResult):
                    tool_calls.append((ev.tool_name, str(ev.tool_output)))
            
            response1 = await handler1
            
            # print results
            for i, (tool, output) in enumerate(tool_calls, 1):
                print(f"→ Tool Call {i}: {tool}", flush=True)
                print(f"  Output: {output[:150]}...", flush=True)
            print(f"\n✅ Success: Recipe retrieved", flush=True)
        except Exception as e:
            print(f"\n❌ Error: {str(e)[:100]}...", flush=True)
        
        print("\n" + "█" * 80 + "\n")
    
    # test 2: weather query
    if remote_tools:
        print("\n")
        print("█" * 80)
        print("█ TEST 2: WEATHER QUERY (Remote FastMCP)")
        print("█" * 80)
        print("\nQuery: what's the weather in milan?\n")
        
        try:
            handler2 = agent.run(user_msg="what's the weather in milan?")
            
            tool_calls = []
            async for ev in handler2.stream_events():
                if isinstance(ev, ToolCallResult):
                    tool_calls.append((ev.tool_name, str(ev.tool_output)))
            
            response2 = await handler2
            
            for i, (tool, output) in enumerate(tool_calls, 1):
                print(f"→ Tool Call {i}: {tool}", flush=True)
                print(f"  Output: {output[:150]}...", flush=True)
            print(f"\n✅ Success: Weather data retrieved", flush=True)
        except Exception as e:
            print(f"\n❌ Error: {str(e)[:100]}...", flush=True)
        
        print("\n" + "█" * 80 + "\n")
    
    # test 3: video transcription
    if docker_tools:
        print("\n")
        print("█" * 80)
        print("█ TEST 3: VIDEO TRANSCRIPTION (Docker Gateway)")
        print("█" * 80)
        print("\nQuery: get transcript from https://www.youtube.com/watch?v=Fhy_VFMlE9s\n")
        
        try:
            handler3 = agent.run(user_msg="get transcript from https://www.youtube.com/watch?v=Fhy_VFMlE9s")
            
            tool_calls = []
            async for ev in handler3.stream_events():
                if isinstance(ev, ToolCallResult):
                    tool_calls.append((ev.tool_name, str(ev.tool_output)))
            
            response3 = await handler3
            
            for i, (tool, output) in enumerate(tool_calls, 1):
                print(f"→ Tool Call {i}: {tool}", flush=True)
                print(f"  Output: {output[:150]}...", flush=True)
            print(f"\n✅ Success: Transcript retrieved (length: {len(str(response3))} chars)", flush=True)
        except Exception as e:
            print(f"\n❌ Error: {str(e)[:100]}...", flush=True)
        
        print("\n" + "█" * 80 + "\n")
    
    # test 4: image generation
    if gradio_tools:
        print("\n")
        print("█" * 80)
        print("█ TEST 4: IMAGE GENERATION (Hugging Face Space)")
        print("█" * 80)
        print("\nQuery: generate an image of a sunset over mountains")
        print("note: may fail due to quota limits\n")
        
        try:
            handler4 = agent.run(user_msg="generate an image of a sunset over mountains")
            generated_image = None
            has_quota_error = False
            tool_calls = []
            
            async for ev in handler4.stream_events():
                if isinstance(ev, ToolCallResult):
                    output_str = str(ev.tool_output)
                    tool_calls.append((ev.tool_name, output_str))
                    
                    # check for quota errors
                    if 'ZeroGPU' in output_str or 'quota' in output_str.lower():
                        has_quota_error = True
                    
                    # try to extract image
                    if hasattr(ev.tool_output, 'content'):
                        for content_item in ev.tool_output.content:
                            if hasattr(content_item, 'type') and content_item.type == 'image':
                                try:
                                    image_bytes = base64.b64decode(content_item.data)
                                    generated_image = Image.open(BytesIO(image_bytes))
                                    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                                    image_path = f"generated_image_{timestamp}.png"
                                    generated_image.save(image_path)
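                                except Exception:
                                    # best effort: an undecodable image should not abort the stream loop;
                                    # unlike a bare except, this lets task cancellation propagate
                                    pass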
            
            response4 = await handler4
            
            for i, (tool, output) in enumerate(tool_calls, 1):
                print(f"→ Tool Call {i}: {tool}", flush=True)
                if 'ImageContent' in output:
                    print(f"  Output: [Image data]", flush=True)
                else:
                    print(f"  Output: {output[:150]}...", flush=True)
            
            if generated_image:
                print(f"\n✅ Success: Image generated!", flush=True)
                print(f"Saved to: {image_path}", flush=True)
                display(generated_image)
            elif has_quota_error:
                print(f"\n⚠️ Quota limit reached - ZeroGPU quota exceeded", flush=True)
            else:
                print(f"\n⚠️ Image generation failed or unavailable", flush=True)
        except Exception as e:
            print(f"\n❌ Error: {str(e)[:100]}...", flush=True)
        
        print("\n" + "█" * 80 + "\n")
    
    # summary
    print("\n")
    print("█" * 80)
    print("█ HYBRID AGENT DEMO COMPLETE")
    print("█" * 80)
    successful_sources = len([t for t in [local_tools, remote_tools, docker_tools, gradio_tools] if t])
    print(f"\n✅ Connected to {successful_sources}/4 MCP sources")
    print(f"✅ Total tools available: {len(all_tools)}")
    print("\n" + "█" * 80)

# run the example
await hybrid_agent_example()

=== creating hybrid agent with multiple mcp sources ===

loading tools...
  docker gateway: ✅ 1 tools
  local mcp: ✅ 2 tools
  remote fastmcp: ✅ 1 tools
  hugging face: ✅ 2 tools

TOTAL TOOLS: 6 (docker: 1, local: 2, remote: 1, hf: 2)




████████████████████████████████████████████████████████████████████████████████
█ TEST 1: RECIPE QUERY (Local MCP)
████████████████████████████████████████████████████████████████████████████████

Query: how to make tiramisu?

→ Tool Call 1: list_recipes
  Output: meta=None content=[TextContent(type='text', text='Lasagna', annotations=None), TextContent(type='text', text='Pasta Carbonara', annotations=None), Tex...
→ Tool 
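
the tests above repeat the same loop: stream events, keep the tool-call results, then await the handler. a minimal sketch of how that loop could be factored into one helper. `collect_tool_calls`, `FakeHandler`, and `FakeToolCallResult` are hypothetical names used so the block runs without llama-index; in the notebook you would pass the real handler from `agent.run(...)` and the real `ToolCallResult` class instead.

```python
import asyncio
from dataclasses import dataclass


async def collect_tool_calls(handler, event_type):
    """Drain handler.stream_events(), keep (tool_name, output) pairs, then await the final response."""
    tool_calls = []
    async for ev in handler.stream_events():
        if isinstance(ev, event_type):
            tool_calls.append((ev.tool_name, str(ev.tool_output)))
    response = await handler  # the handler itself is awaitable
    return tool_calls, response


@dataclass
class FakeToolCallResult:
    """Hypothetical stand-in for ToolCallResult."""
    tool_name: str
    tool_output: str


class FakeHandler:
    """Hypothetical stand-in for the object returned by agent.run(...)."""

    def __init__(self, events, final):
        self._events = events
        self._final = final

    async def stream_events(self):
        for ev in self._events:
            yield ev

    def __await__(self):
        async def _result():
            return self._final
        return _result().__await__()


def demo():
    handler = FakeHandler(
        [FakeToolCallResult("get_weather", "sunny, 24C"), "unrelated event"],
        final="it is sunny in rome",
    )
    return asyncio.run(collect_tool_calls(handler, FakeToolCallResult))
```

with such a helper, each test would reduce to one call followed by the printing loop.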

## 8. the problem: tool overload

### challenge
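as we add more mcp servers and tools, a single agent runs into three problems:
- **context window limitations**: every tool's name, description, and input schema is sent to the llm, so a large tool set consumes context before the user's request is even considered
- **tool selection confusion**: with many similarly described tools, the agent is more likely to pick the wrong one
- **performance degradation**: longer prompts make each reasoning step slower and more expensive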

### example scenario:
imagine an agent with 50+ tools from 10 different mcp servers:
- recipes (5 tools)
- weather (3 tools)
- youtube (2 tools)
- image generation (4 tools)
- database queries (10 tools)
- file operations (8 tools)
- web scraping (6 tools)
- email (4 tools)
- calendar (5 tools)
- analytics (8 tools)

**result**: the agent becomes confused and may:
- call the wrong tools
- miss relevant tools
- take longer to reason
- produce incorrect results

## 9. solution: multi-agent triage system

### architecture
instead of one agent with all tools, we create:
1. **triage agent**: routes requests to specialist agents (no tools)
2. **specialist agents**: each has a focused set of tools
   - recipeagent: local mcp tools (recipes)
   - weatheragent: remote mcp tools (weather)
   - videoagent: docker gateway tools (youtube)
   - imageagent: hugging face tools (image generation)

### benefits:
- **focused context**: each specialist only sees relevant tools
- **better tool selection**: specialists are experts in their domain
- **scalability**: easy to add new specialists without overloading existing ones
- **maintainability**: each agent can be updated independently
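
the routing idea can be sketched without an llm. in the notebook the triage agent decides through the handoff mechanism; the keyword table and the `route` function below are a hypothetical, deterministic stand-in that only illustrates the shape of the decision: one request in, one specialist name out.

```python
from typing import Optional

# hypothetical keyword table; the real triage agent lets the llm decide
ROUTES = {
    "RecipeAgent": ("recipe", "cook", "bake", "how to make"),
    "WeatherAgent": ("weather", "forecast", "temperature"),
    "VideoAgent": ("youtube", "transcript", "video"),
    "ImageAgent": ("image", "picture", "draw"),
}


def route(user_msg: str) -> Optional[str]:
    """Return the first specialist whose keywords appear in the message, else None."""
    text = user_msg.lower()
    for agent_name, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return agent_name
    return None


for query in ("how to make tiramisu?", "what's the weather in rome?",
              "get transcript from a youtube video", "generate an image of a sunset"):
    print(f"{query!r} -> {route(query)}")
```

keyword matching breaks down on ambiguous or multi-topic requests, which is one reason to delegate the decision to an llm-based triage agent.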

In [None]:
async def multiagent_triage_system():
    """implement a multi-agent triage system with specialized agents"""
    
    print("=== setting up multi-agent triage system ===")
    
    # connect to all mcp sources
    docker_client = BasicMCPClient(
        "http://localhost:8080/mcp",
        headers={"Authorization": f"Bearer {DOCKER_GATEWAY_TOKEN}"}
    ) if DOCKER_GATEWAY_TOKEN else None
    remote_client = BasicMCPClient("https://unnecessary-crimson-wildebeest.fastmcp.app/mcp")
    local_client = BasicMCPClient("http://localhost:8000/mcp")
    gradio_client = BasicMCPClient("https://hysts-mcp-flux-1-schnell.hf.space/gradio_api/mcp/")
    
    # get tools from each source
    print("\nloading tools...")
    
    # docker tools
    if docker_client:
        try:
            docker_tools = await aget_tools_from_mcp_url(
                "http://localhost:8080/mcp",
                client=docker_client,
                allowed_tools=["get_transcript"],
            )
            print(f"  docker gateway: {len(docker_tools)} tools")
        except Exception as e:
            print(f"  docker gateway: error ({e})")
            docker_tools = []
    else:
        docker_tools = []
        print("  docker gateway: skipped (no token)")
    
    # local tools
    try:
        local_tools = await aget_tools_from_mcp_url(
            "http://localhost:8000/mcp",
            client=local_client,
            allowed_tools=["list_recipes", "get_recipe_instructions"],
        )
        print(f"  local mcp: {len(local_tools)} tools")
    except Exception as e:
        print(f"  local mcp: error ({e})")
        local_tools = []
    
    # remote tools
    try:
        remote_tools = await aget_tools_from_mcp_url(
            "https://unnecessary-crimson-wildebeest.fastmcp.app/mcp",
            client=remote_client,
            allowed_tools=["get_weather"],
        )
        print(f"  remote fastmcp: {len(remote_tools)} tools")
    except Exception as e:
        print(f"  remote fastmcp: error ({e})")
        remote_tools = []
    
    # gradio tools
    try:
        gradio_tools = await aget_tools_from_mcp_url(
            "https://hysts-mcp-flux-1-schnell.hf.space/gradio_api/mcp/",
            client=gradio_client,
        )
        print(f"  hugging face: {len(gradio_tools)} tools")
    except Exception as e:
        print(f"  hugging face: unavailable ({e})")
        gradio_tools = []
    
    # initialize llm
    llm = OpenAI(model="gpt-4o", api_key=OPENAI_API_KEY)
    
    # create specialized agents
    print("\ncreating specialized agents...")
    
    recipe_agent = ReActAgent(
        name="RecipeAgent",
        description="handles recipe queries and cooking instructions",
        tools=local_tools,
        llm=llm,
        verbose=True,
        system_prompt=(
            "you are a recipe expert with access to italian recipes. "
            "use your tools to provide detailed cooking instructions."
        ),
    )
    
    weather_agent = ReActAgent(
        name="WeatherAgent",
        description="provides weather information for any location",
        tools=remote_tools,
        llm=llm,
        verbose=True,
        system_prompt=(
            "you are a weather assistant. use your weather tool to provide "
            "current weather information for any location."
        ),
    )
    
    video_agent = ReActAgent(
        name="VideoAgent",
        description="transcribes youtube videos given the url",
        tools=docker_tools,
        llm=llm,
        verbose=True,
        system_prompt=(
            "you are a video transcription specialist. you can get transcripts "
            "from youtube videos using your tools."
        ),
    )
    
    image_agent = ReActAgent(
        name="ImageAgent",
        description="generates images from text prompts using flux model",
        tools=gradio_tools,
        llm=llm,
        verbose=True,
        system_prompt=(
            "you are an image generation specialist using the flux model. "
            "create images from text descriptions using your tools."
        ),
    )
    
    # triage agent: routes to appropriate specialist
    triage_agent = ReActAgent(
        name="TriageAgent",
        description="routes user requests to the appropriate specialist agent",
        tools=[],  # no tools, only routes
        llm=llm,
        verbose=True,
        system_prompt=(
            "you are a routing agent. you must immediately hand off EVERY request to a specialist agent.\n"
            "you have NO tools to answer questions - you can ONLY use handoff.\n\n"
            "routing rules:\n"
            "- recipes/cooking → hand off to RecipeAgent\n"
            "- weather queries → hand off to WeatherAgent\n"
            "- youtube videos → hand off to VideoAgent\n"
            "- image generation → hand off to ImageAgent\n\n"
            "CRITICAL: use the handoff tool, never call agent names directly."
        ),
        can_handoff_to=["RecipeAgent", "WeatherAgent", "VideoAgent", "ImageAgent"],
    )
    
    # wire agents together in workflow
    print("\ncreating agent workflow...")
    agent_workflow = AgentWorkflow(
        agents=[triage_agent, recipe_agent, weather_agent, video_agent, image_agent],
        root_agent=triage_agent.name,
        initial_state={},
    )
    
    print("\n=== multi-agent system ready! ===")
    print("\nagents:")
    print("  - TriageAgent (router)")
    print(f"  - RecipeAgent (local mcp) - {len(local_tools)} tools")
    print(f"  - WeatherAgent (remote mcp) - {len(remote_tools)} tools")
    print(f"  - VideoAgent (docker gateway) - {len(docker_tools)} tools")
    print(f"  - ImageAgent (hugging face) - {len(gradio_tools)} tools")
    
    # test the multi-agent system
    if local_tools:
        print("\n=== test 1: recipe query ===")
        resp1 = await agent_workflow.run(user_msg="how to make tiramisu?")
        print(f"\nresponse: {resp1}")
    
    if remote_tools:
        print("\n=== test 2: weather query ===")
        resp2 = await agent_workflow.run(user_msg="what's the weather in rome?")
        print(f"\nresponse: {resp2}")
    
    if docker_tools:
        print("\n" + "="*80)
        print("TEST 3: VIDEO TRANSCRIPTION (Multi-Agent Triage)")
        print("="*80)
        print("User Query: get transcript from https://www.youtube.com/watch?v=Fhy_VFMlE9s")
        print("Expected Flow: TriageAgent -> VideoAgent\n")
        
        resp3 = await agent_workflow.run(user_msg="get transcript from https://www.youtube.com/watch?v=Fhy_VFMlE9s")
        print(f"\nFinal Response (truncated): {str(resp3)[:500]}...")
        print("="*80)

# run the multi-agent system
await multiagent_triage_system()
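
the loading blocks in the function above follow one pattern: try to fetch tools, report the count, fall back to an empty list. a sketch of how that pattern could be factored out. `load_tools_safely` and the fake loaders are hypothetical; in the notebook the `loader` argument would be a lambda wrapping `aget_tools_from_mcp_url(...)`.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def load_tools_safely(label: str, loader: Callable[[], Awaitable[List[Any]]]) -> List[Any]:
    """Await loader(); on failure print the error and return an empty list."""
    try:
        tools = await loader()
    except Exception as exc:  # one failing server should not take down the others
        print(f"  {label}: error ({exc})")
        return []
    print(f"  {label}: {len(tools)} tools")
    return tools


async def fake_ok() -> List[str]:
    """Hypothetical loader that succeeds."""
    return ["list_recipes", "get_recipe_instructions"]


async def fake_down() -> List[str]:
    """Hypothetical loader that simulates an unreachable server."""
    raise ConnectionError("server unreachable")


async def load_all() -> dict:
    # the sources are independent, so they can be loaded concurrently
    local, remote = await asyncio.gather(
        load_tools_safely("local mcp", fake_ok),
        load_tools_safely("remote fastmcp", fake_down),
    )
    return {"local": local, "remote": remote}
```

because every source falls back to `[]`, the later `if local_tools:` style guards keep working unchanged.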

## 10. gradio ui for multi-agent system

### creating a user-friendly interface

the `mcp_client_gradio.py` file provides a web interface for the multi-agent system:

```python
import gradio as gr

demo = gr.ChatInterface(
    chat,
    type="messages",
    title="🤖 MCP Multiagent Triage Chatbot",
    description=(
        "chat with specialized mcp agents:\n\n"
        "🍕 **RecipeAgent** - get recipes and cooking instructions\n"
        "🌤️ **WeatherAgent** - check weather for any location\n"
        "🎥 **VideoAgent** - transcribe youtube videos\n"
        "🎨 **ImageAgent** - generate images using flux model\n\n"
        "the TriageAgent will automatically route your request!"
    ),
    save_history=True,  # persistent chat history
)

demo.launch()
```
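
the snippet above passes a `chat` callback that is not shown. a minimal sketch of what such a callback could look like, assuming it forwards each message to the `agent_workflow` from section 9. `make_chat_fn` and `EchoWorkflow` are hypothetical names, and the real `mcp_client_gradio.py` may differ (for example by streaming events instead of returning one string).

```python
import asyncio


def make_chat_fn(workflow):
    """Return an async (message, history) callback in the shape gr.ChatInterface expects."""

    async def chat(message, history):
        # history is managed by gradio; this sketch only forwards the latest message
        try:
            response = await workflow.run(user_msg=message)
        except Exception as exc:
            return f"error: {exc}"
        return str(response)

    return chat


class EchoWorkflow:
    """Hypothetical stand-in for AgentWorkflow so the sketch runs without llama-index."""

    async def run(self, user_msg):
        return f"routed: {user_msg}"


chat = make_chat_fn(EchoWorkflow())
print(asyncio.run(chat("how to make tiramisu?", [])))
```

returning the error as text keeps the chat window responsive when an mcp server is down, at the cost of hiding the traceback from the ui.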

### to run the gradio interface:

```bash
python mcp_client_gradio.py
```

### features:
- **persistent history**: conversations saved in browser
- **streaming responses**: see agent reasoning in real-time
- **tool call visibility**: see which tools are being called
- **example prompts**: quick start with pre-defined queries

## 11. integrating mcp in other platforms

### cursor ide
add to cursor's mcp config (`.cursor/mcp.json` in the project, or `~/.cursor/mcp.json` for all projects):
```json
{
  "mcpServers": {
    "recipes": {
      "url": "http://localhost:8000/mcp"
    },
    "weather": {
      "url": "https://unnecessary-crimson-wildebeest.fastmcp.app/mcp"
    }
  }
}
```

### claude desktop
add to claude config (on macos: `~/Library/Application Support/Claude/claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "recipes": {
      "url": "http://localhost:8000/mcp"
    }
  }
}
```

### chatgpt
use the "actions" feature to add mcp servers:
1. go to chatgpt settings → actions
2. add new action with mcp server url
3. configure authentication if needed

### lm studio
add to lm studio mcp config (similar to cursor format):
```json
{
  "mcpServers": {
    "recipes": {
      "url": "http://localhost:8000/mcp"
    }
  }
}
```
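
since the `mcpServers` entries above share one shape, the config can be generated from a single dictionary instead of being edited by hand for each client. a sketch, assuming the url-based format shown above; `build_mcp_config` and `write_mcp_config` are hypothetical helpers, and each client expects the file at its own location.

```python
import json
from pathlib import Path

# server names and urls as used in this notebook
SERVERS = {
    "recipes": "http://localhost:8000/mcp",
    "weather": "https://unnecessary-crimson-wildebeest.fastmcp.app/mcp",
}


def build_mcp_config(servers: dict) -> dict:
    """Wrap name -> url pairs in the mcpServers structure shown above."""
    return {"mcpServers": {name: {"url": url} for name, url in servers.items()}}


def write_mcp_config(servers: dict, path: Path) -> Path:
    """Serialize the config as pretty-printed json and return the path written."""
    path.write_text(json.dumps(build_mcp_config(servers), indent=2) + "\n", encoding="utf-8")
    return path


print(json.dumps(build_mcp_config(SERVERS), indent=2))
```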

## 12. summary and best practices

### key takeaways

1. **mcp provides unified tool access**
   - tools, resources, and prompts in one protocol
   - works across different platforms and languages

2. **multiple hosting options**
   - local: full control, requires hosting
   - docker gateway: easy integration with docker ecosystem (requires bearer token)
   - fastmcp cloud: free hosting, automatic deployments
   - hugging face spaces: 200+ ready-to-use servers

3. **debugging with inspector**
   - test tools before integration
   - explore resources and prompts
   - view logs and debug issues

4. **multi-agent architecture for scale**
   - avoid tool overload with specialized agents
   - triage pattern for intelligent routing
   - better performance and maintainability

### best practices

1. **start small**: begin with 1-2 mcp servers
2. **use inspector**: always test tools before production
3. **specialize agents**: keep each agent focused on specific tasks
4. **monitor performance**: watch for tool selection issues
5. **document tools**: clear descriptions help llms choose correctly
6. **version control**: track mcp server changes
7. **error handling**: gracefully handle mcp server failures
8. **authentication**: always use bearer tokens for docker gateway
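
for the error-handling practice, one common approach is to wrap each call to an mcp server in a per-attempt timeout and a bounded retry with exponential backoff. a sketch under stated assumptions: `call_with_retry`, its default values, and the `FlakyServer` stand-in are hypothetical, and suitable timeouts and retry counts depend on the server.

```python
import asyncio


async def call_with_retry(make_call, attempts=3, timeout=5.0, base_delay=0.1):
    """Await make_call() with a per-attempt timeout, retrying with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                # wait base_delay, then 2x, 4x, ... before the next attempt
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


class FlakyServer:
    """Hypothetical stand-in that fails a fixed number of times before answering."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def get_weather(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("temporarily unavailable")
        return "sunny"


server = FlakyServer(failures=2)
print(asyncio.run(call_with_retry(server.get_weather)))
```

only timeouts and connection errors are retried here; other exceptions propagate immediately, so a bug in the call is not masked by repeated attempts.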

### next steps

1. create your own mcp server for your domain
2. explore hugging face spaces for additional capabilities
3. implement multi-agent systems for complex workflows
4. integrate mcp into your existing applications
5. contribute to the mcp ecosystem!

## appendix: useful resources

### documentation
- mcp specification: https://modelcontextprotocol.io/
- fastmcp docs: https://github.com/jlowin/fastmcp
- llamaindex mcp: https://docs.llamaindex.ai/en/stable/examples/tools/mcp/

### repositories
- mcp python sdk: https://github.com/modelcontextprotocol/python-sdk
- docker mcp gateway: https://github.com/docker/mcp-gateway
- fastmcp quickstart: https://github.com/PrefectHQ/fastmcp-quickstart-template

### hosting
- fastmcp cloud: https://fastmcp.cloud/
- hugging face spaces: https://huggingface.co/spaces

### community
- mcp discord: https://discord.gg/modelcontextprotocol
- github discussions: https://github.com/modelcontextprotocol/specification/discussions