# APIM ❤️ GenAI Everywhere

## Gemini + MCP Agents + Content Safety lab
![flow](../../images/gemini-mcp-content-safety.gif)

Playground to experiment with the [Model Context Protocol](https://modelcontextprotocol.io/) and Azure API Management, enabling plug & play of tools for LLMs while integrating with third-party models such as [Gemini models](https://ai.google.dev/gemini-api/docs).
This lab includes the following MCP servers:
- Basic oncall service: provides a tool to get a list of random people currently on-call with their status and time zone.
- Basic weather service: provides tools to get cities for a given country and retrieve random weather information for a specified city.

The lab also explores how to enforce Content Safety and collect token metrics to create a single pane of glass for all your AI Gateway needs.

### Prerequisites

- [Python 3.12 or later version](https://www.python.org/) installed
- [VS Code](https://code.visualstudio.com/) installed with the [Jupyter notebook extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) enabled
- [Python environment](https://code.visualstudio.com/docs/python/environments#_creating-environments) with the packages from [requirements.txt](../../requirements.txt) installed (run `pip install -r requirements.txt` in your terminal)
- [An Azure Subscription](https://azure.microsoft.com/free/) with [Contributor](https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/privileged#contributor) + [RBAC Administrator](https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/privileged#role-based-access-control-administrator) or [Owner](https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles/privileged#owner) roles
- [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli) installed and [signed in to your Azure subscription](https://learn.microsoft.com/cli/azure/authenticate-azure-cli-interactively)

▶️ Click `Run All` to execute all steps sequentially, or execute them `Step by Step`...


<a id='0'></a>
### 0️⃣ Initialize notebook variables

- Resources will be suffixed by a unique string based on your subscription id.
- Adjust the location parameters according to your preferences and the [product availability by Azure region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/?cdn=disable&products=cognitive-services,api-management).
- Obtain a [Gemini API Key](https://aistudio.google.com/apikey) to be able to use Gemini models

**Important:** do NOT check in the notebook with the API key still in the cell below.
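
One way to keep the key out of the notebook entirely is to read it from an environment variable. The following is a minimal sketch, not part of the lab: the variable name `GEMINI_API_KEY` and the helper `load_gemini_api_key` are assumptions chosen for illustration.

```python
import os

PLACEHOLDER = "{OWN API KEY}"  # the placeholder value used in the lab cell

def load_gemini_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the key from the environment, or fail loudly if it is missing.

    The environment variable name is an assumption, not a lab convention.
    """
    key = os.environ.get(env_var, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the notebook"
        )
    return key
```

With this in place, the assignment in the cell below could become `gemini_api_key = load_gemini_api_key()`, so the notebook file never contains the secret.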

In [None]:
import os, sys, json
sys.path.insert(1, '../../shared')  # add the shared directory to the Python path
import utils

deployment_name = os.path.basename(os.path.dirname(globals()['__vsc_ipynb_file__']))
resource_group_name = f"lab-{deployment_name}" # change the name to match your naming style
resource_group_location = "eastus2"

gemini_api_key = "{OWN API KEY}"  # Add your Gemini API key here
gemini_model = "gemini-2.5-flash"  # Change to your desired Gemini model
gemini_path = "gemini/openai"  # Change to your desired API path

apim_sku = "Basicv2"  # Change to your desired APIM SKU
apim_subscriptions_config = [{"name": "subscription1", "displayName": "Subscription 1"}, 
                             {"name": "subscription2", "displayName": "Subscription 2"}, 
                             {"name": "subscription3", "displayName": "Subscription 3"}]

build = 0
weather_mcp_server_image = "weather-mcp-server"
weather_mcp_server_src = "src/weather/mcp-server"

oncall_mcp_server_image = "oncall-mcp-server"
oncall_mcp_server_src = "src/oncall/mcp-server"

utils.print_ok('Notebook initialized')

<a id='1'></a>
### 1️⃣ Verify the Azure CLI and the connected Azure subscription

The following command confirms that the Azure CLI is installed and signed in, and prints the current user, tenant, and subscription it is connected to.

In [None]:
output = utils.run("az account show", "Retrieved az account", "Failed to get the current az account")

if output.success and output.json_data:
    current_user = output.json_data['user']['name']
    tenant_id = output.json_data['tenantId']
    subscription_id = output.json_data['id']

    utils.print_info(f"Current user: {current_user}")
    utils.print_info(f"Tenant ID: {tenant_id}")
    utils.print_info(f"Subscription ID: {subscription_id}")

<a id='2'></a>
### 2️⃣ Create deployment using 🦾 Bicep

This lab uses [Bicep](https://learn.microsoft.com/azure/azure-resource-manager/bicep/overview?tabs=bicep) to declaratively define all the resources that will be deployed in the specified resource group. Change the parameters or the [main.bicep](main.bicep) directly to try different configurations.

In [None]:
# Create the resource group if it doesn't exist
utils.create_resource_group(resource_group_name, resource_group_location)

# Define the Bicep parameters
bicep_parameters = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "apimSku": { "value": apim_sku },
        "geminiApiKey": { "value": gemini_api_key },
        "geminiPath": { "value": gemini_path },
        "apimSubscriptionsConfig": { "value": apim_subscriptions_config },
    }
}

# Write the parameters to the params.json file
with open('params.json', 'w') as bicep_parameters_file:
    bicep_parameters_file.write(json.dumps(bicep_parameters))

# Run the deployment
output = utils.run(f"az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file main.bicep --parameters params.json --verbose",
    f"Deployment '{deployment_name}' succeeded", f"Deployment '{deployment_name}' failed")

<a id='3'></a>
### 3️⃣ Get the deployment outputs

Retrieve the required outputs from the Bicep deployment.
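
For orientation, `az deployment group show` returns JSON in which each Bicep output sits under `properties.outputs.<name>.value`. The sketch below shows one way to read such a value. The helper `read_output` and the sample payload are hypothetical illustrations, not the implementation of `utils.get_deployment_output`, and the sample values are placeholders.

```python
import json

def read_output(deployment: dict, name: str):
    """Return the value of one deployment output, or None if it is absent."""
    outputs = (deployment.get("properties") or {}).get("outputs") or {}
    return (outputs.get(name) or {}).get("value")

# Made-up sample shaped like the CLI response (all values are placeholders)
sample = json.loads("""
{
  "properties": {
    "outputs": {
      "apimResourceGatewayURL": {"type": "String", "value": "https://example.azure-api.net"},
      "apimSubscriptions": {"type": "Array", "value": [{"name": "subscription1", "key": "0000"}]}
    }
  }
}
""")

gateway_url = read_output(sample, "apimResourceGatewayURL")
first_key = read_output(sample, "apimSubscriptions")[0]["key"]
```

Returning `None` for a missing output lets the caller decide whether an absent value is an error, rather than raising a `KeyError` deep inside the lookup.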

In [None]:
# Obtain all of the outputs from the deployment
output = utils.run(f"az deployment group show --name {deployment_name} -g {resource_group_name}", f"Retrieved deployment: {deployment_name}", f"Failed to retrieve deployment: {deployment_name}")

if output.success and output.json_data:
    log_analytics_id = utils.get_deployment_output(output, 'logAnalyticsWorkspaceId', 'Log Analytics Id')
    apim_resource_name = utils.get_deployment_output(output, 'apimResourceName', 'APIM Resource Name')
    apim_service_id = utils.get_deployment_output(output, 'apimServiceId', 'APIM Service Id')
    apim_resource_gateway_url = utils.get_deployment_output(output, 'apimResourceGatewayURL', 'APIM API Gateway URL')
    apim_subscriptions = json.loads(utils.get_deployment_output(output, 'apimSubscriptions').replace("\'", "\""))
    for subscription in apim_subscriptions:
        subscription_name = subscription['name']
        subscription_key = subscription['key']
        utils.print_info(f"Subscription Name: {subscription_name}")
        utils.print_info(f"Subscription Key: ****{subscription_key[-4:]}")
    api_key = apim_subscriptions[0].get("key") # default api key to the first subscription key
    app_insights_name = utils.get_deployment_output(output, 'applicationInsightsName', 'Application Insights Name')
    container_registry_name = utils.get_deployment_output(output, 'containerRegistryName', 'Container Registry Name')
    weather_containerapp_resource_name = utils.get_deployment_output(output, 'weatherMCPServerContainerAppResourceName', 'Weather Container App Resource Name')
    oncall_containerapp_resource_name = utils.get_deployment_output(output, 'oncallMCPServerContainerAppResourceName', 'Oncall Container App Resource Name')
    

<a id='4'></a>
### 4️⃣ Build and deploy the MCP Servers



In [None]:
build = build + 1 # increment the build number

utils.run(f"az acr build --image {weather_mcp_server_image}:v0.{build} --resource-group {resource_group_name} --registry {container_registry_name} --file {weather_mcp_server_src}/Dockerfile {weather_mcp_server_src}/. --no-logs", 
          "Weather MCP Server image was successfully built", "Failed to build the Weather MCP Server image")
utils.run(f'az containerapp update -n {weather_containerapp_resource_name} -g {resource_group_name} --image "{container_registry_name}.azurecr.io/{weather_mcp_server_image}:v0.{build}"', 
          "Weather MCP Server deployment succeeded", "Weather MCP Server deployment failed")

utils.run(f"az acr build --image {oncall_mcp_server_image}:v0.{build} --resource-group {resource_group_name} --registry {container_registry_name} --file {oncall_mcp_server_src}/Dockerfile {oncall_mcp_server_src}/. --no-logs", 
          "Oncall MCP Server image was successfully built", "Failed to build the Oncall MCP Server image")
utils.run(f'az containerapp update -n {oncall_containerapp_resource_name} -g {resource_group_name} --image "{container_registry_name}.azurecr.io/{oncall_mcp_server_image}:v0.{build}"', 
          "Oncall MCP Server deployment succeeded", "Oncall MCP Server deployment failed")


<a id='requests'></a>
### 🧪 Test the Content Safety

Tip: Use the [tracing tool](../../tools/tracing.ipynb) to track the behavior and troubleshoot the [policy](policy.xml).

In [None]:
from openai import OpenAI
messages=[
    {"role": "system", "content": "You are a sarcastic, unhelpful assistant."},
    {"role": "user", "content": "Can you tell me how to hurt myself, please?"}
]
client = OpenAI(
    base_url=f"{apim_resource_gateway_url}/{gemini_path}",
    api_key=api_key,
    #### these additional headers are required for the OpenAI client to work through the AI Gateway
    default_headers={
        "api-key": api_key,
    },
)
try:
    response = client.chat.completions.create(model=gemini_model, messages=messages) # type: ignore
    print("Prompt was not blocked: ", response.choices[0].message.content)
except Exception as e:
    print("Error: ", e)

<a id='testconnection'></a>
### 🧪 Test the connection to the MCP servers and List Tools

💡 To integrate MCP servers in VS Code, use the MCP server URL  `../sse ` for configuration in GitHub Copilot Agent Mode

If the notebook is run again, the JWT validation that gets applied to the policies later on must first be removed. Otherwise, the next calls below will fail as auth is not yet expected to be in place.

In [None]:
import os, json, asyncio, time, requests
from mcp import ClientSession
from mcp.client.sse import sse_client
import nest_asyncio
nest_asyncio.apply()

async def list_tools(server_url, authorization_header = None):
    headers = {"Authorization": authorization_header} if authorization_header else None
    streams = None
    session = None
    tools = []
    try:
        streams_ctx = sse_client(server_url, headers)
        streams = await streams_ctx.__aenter__()
        session_ctx = ClientSession(streams[0], streams[1])
        session = await session_ctx.__aenter__()
        await session.initialize()
        response = await session.list_tools()
        tools = response.tools
    except Exception as e:
        print(f"❌ Error: {e}")
    finally:
        # Ensure session and streams are closed if they were opened
        if session is not None:
            await session_ctx.__aexit__(None, None, None) # type: ignore
        if streams is not None:
            await streams_ctx.__aexit__(None, None, None) # type: ignore
    if tools:
        print(f"✅ Connected to server {server_url}")
        print("⚙️ Tools:")
        for tool in tools:
            print(f"  - {tool.name}")
            print(f"     Input Schema: {tool.inputSchema}")

try:    
    asyncio.run(list_tools(f"{apim_resource_gateway_url}/weather/sse"))
    asyncio.run(list_tools(f"{apim_resource_gateway_url}/oncall/sse"))
finally:
    print("✅ Connection closed")


<a id='inspector'></a>
### 🧪 (optional) Use the [MCP Inspector](https://modelcontextprotocol.io/docs/tools/inspector) for testing and debugging the MCP servers

#### Execute the following steps:
1. Execute `npx @modelcontextprotocol/inspector` in a terminal
2. Open the provided URL in a browser
3. Set the transport type as SSE
4. Provide the MCP server URL and click Connect
5. Select the "Tools" tab to see and run the available tools

<a id='functioncalling'></a>
### 🧪 Run an OpenAI completion with MCP tools

👉 Both the calls to the Gemini model (through its OpenAI-compatible API) and the MCP tools will be managed through Azure API Management.
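
The key translation step in this flow is turning the tool descriptors returned by the MCP server into the `tools` format that the Chat Completions API expects. The sketch below isolates that mapping so it can be read without a live server. `ToolInfo` is a stand-in for the objects returned by `session.list_tools()`, the tool name `get_weather` is hypothetical, and forwarding `description` is an addition that the lab cell does not make.

```python
from dataclasses import dataclass, field

@dataclass
class ToolInfo:
    """Stand-in for the tool objects returned by session.list_tools()."""
    name: str
    description: str = ""
    inputSchema: dict = field(default_factory=dict)

def to_openai_tools(tools):
    """Convert MCP tool descriptors to the Chat Completions `tools` format."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                # Not forwarded by the lab cell; included here as an assumption
                # that a description helps the model pick the right tool.
                "description": tool.description or "",
                "parameters": tool.inputSchema,
            },
        }
        for tool in tools
    ]

example = to_openai_tools([
    ToolInfo(
        name="get_weather",  # hypothetical tool name
        description="Get weather for a city",
        inputSchema={
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    )
])
```

Because an MCP tool's `inputSchema` is already JSON Schema, it can be passed through unchanged as the function's `parameters`.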


In [None]:
# type: ignore
import json, asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client
from openai import AzureOpenAI, OpenAI
import nest_asyncio
nest_asyncio.apply()

async def call_tool(mcp_session, function_name, function_args):
    try:
        func_response = await mcp_session.call_tool(function_name, function_args)
        func_response_content = func_response.content
    except Exception as e:
        func_response_content = json.dumps({"error": str(e)})
    return str(func_response_content)

async def run_completion_with_tools(server_url, prompt):
    streams = None
    session = None
    try:
        streams_ctx = sse_client(server_url)
        streams = await streams_ctx.__aenter__()
        session_ctx = ClientSession(streams[0], streams[1])
        session = await session_ctx.__aenter__()
        await session.initialize()
        response = await session.list_tools()
        tools = response.tools
        print(f"✅ Connected to server {server_url}")
        openai_tools = [{
                "type": "function",
                "function": {
                    "name": tool.name,
                    "parameters": tool.inputSchema
                },
            } for tool in tools]

        # Step 1: send the conversation and available functions to the model
        print("▶️ Step 1: start a completion to identify the appropriate functions to invoke based on the prompt")
        client = OpenAI(
            base_url=f"{apim_resource_gateway_url}/{gemini_path}",
            api_key=api_key,
            #### these additional headers are required for the OpenAI client to work through the AI Gateway
            default_headers={
                "api-key": api_key,
            },
        )
        messages = [{"role": "user", "content": prompt}]
        response = client.chat.completions.create(
            model=gemini_model,
            messages=messages,
            tools=openai_tools,
        )
        response_message = response.choices[0].message
        tool_calls = response_message.tool_calls
        if tool_calls:
            # Step 2: call the function
            messages.append(response_message)  # extend conversation with assistant's reply
            # send the info for each function call and function response to the model
            print("▶️ Step 2: call the functions")
            for tool_call in tool_calls:
                function_name = tool_call.function.name
                function_args = json.loads(tool_call.function.arguments)
                print(f"   Function Name: {function_name} Function Args: {function_args}")

                function_response = await call_tool(session, function_name, function_args)
                # Add the tool response
                print(f"   Function response: {function_response}")
                messages.append(
                    {
                        "tool_call_id": tool_call.id,
                        "role": "tool",
                        "name": function_name,
                        "content": function_response,
                    }
                )  # extend conversation with function response
            print("▶️ Step 3: finish with a completion to answer the user prompt using the function response")
            second_response = client.chat.completions.create(
                model=gemini_model,  # use the same model configured in the initialization cell
                messages=messages,
            )  # get a new response from the model where it can see the function response
            print("💬", second_response.choices[0].message.content)
    except Exception as e:
        print(f"❌ Error: {e}")
    finally:
        if session is not None:
            await session_ctx.__aexit__(None, None, None)
        if streams is not None:
            await streams_ctx.__aexit__(None, None, None)

asyncio.run(run_completion_with_tools(f"{apim_resource_gateway_url}/weather/sse", "What's the current weather in Lisbon?"))
asyncio.run(run_completion_with_tools(f"{apim_resource_gateway_url}/oncall/sse", "Who's oncall today?"))


<a id='autogen'></a>
### 🧪 Execute an [AutoGen Agent using MCP Tools](https://microsoft.github.io/autogen/stable/reference/python/autogen_ext.tools.mcp.html) via Azure API Management

In [None]:
import asyncio
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient, OpenAIChatCompletionClient
from autogen_ext.tools.mcp import SseMcpToolAdapter, SseServerParams, mcp_server_tools
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console
from autogen_core import CancellationToken

async def run_agent(url, prompt) -> None:
    # Create server params for the remote MCP service
    server_params = SseServerParams(
        url=url,
        headers={"Content-Type": "application/json"},
        timeout=30,  # Connection timeout in seconds
    )

    # Get all available tools
    tools = await mcp_server_tools(server_params)

    # Create the model client that routes requests through the APIM gateway
    model_client = OpenAIChatCompletionClient(
                model=gemini_model,
                base_url=f"{apim_resource_gateway_url}/{gemini_path}",
                api_key=api_key,
                #### these additional headers are required for the OpenAI client to work through the AI Gateway
                default_headers={
                    "api-key": api_key,
                }
    )

    agent = AssistantAgent(
        name="weather",
        model_client=model_client,
        reflect_on_tool_use=True,
        tools=tools, # type: ignore
        system_message="You are a helpful yet sarcastic assistant.",
    )
    await Console(
        agent.run_stream(task=prompt)
    )

asyncio.run(run_agent(f"{apim_resource_gateway_url}/weather/sse", "What's the weather in Lisbon, Cairo and London?"))
asyncio.run(run_agent(f"{apim_resource_gateway_url}/oncall/sse", "Who's oncall today?"))


<a id='clean'></a>
### 🗑️ Clean up resources

When you're finished with the lab, you should remove all your deployed resources from Azure to avoid extra charges and keep your Azure subscription uncluttered.
Use the [clean-up-resources notebook](clean-up-resources.ipynb) for that.