# APIM ❤️ OpenAI

## Private connectivity lab
![flow](../../images/backend-pool-load-balancing.gif)

This lab is a playground that shows how to create a private network for consuming LLMs from `AI Services` using `Private Link Services`, `Azure API Management (APIM)` and `Azure Front Door`.

Notes:
- `Azure OpenAI` is accessible only through `Private Endpoints`. Public network access is disabled.
- `Azure API Management` is injected in the private network and manages the traffic to `Azure OpenAI` through a `Private Endpoint`.
- `Azure API Management` is not accessible from the public network. The only way to reach it is through `Azure Front Door`.
- `Azure Front Door` manages the traffic to `Azure API Management` through a `Private Link Service`.
- `Azure Front Door` is the only service accessible from the public network.

### Prerequisites
- [Python 3.12 or later version](https://www.python.org/) installed
- [Pandas Library](https://pandas.pydata.org/) and matplotlib installed
- [VS Code](https://code.visualstudio.com/) installed with the [Jupyter notebook extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) enabled
- [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli) installed
- [An Azure Subscription](https://azure.microsoft.com/free/) with Contributor permissions
- [Access granted to Azure OpenAI](https://aka.ms/oai/access) or just enable the mock service
- [Sign in to Azure with Azure CLI](https://learn.microsoft.com/cli/azure/authenticate-azure-cli-interactively)

<a id='0'></a>
### 0️⃣ Initialize notebook variables

- Resources will be suffixed by a unique string based on your subscription id.
- Adjust the location parameters according to your preferences and the [product availability by Azure region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/?cdn=disable&products=cognitive-services,api-management).
- Adjust the OpenAI model and version according to the [availability by region](https://learn.microsoft.com/azure/ai-services/openai/concepts/models).

In [1]:
import os, sys, json
sys.path.insert(1, '../../shared')  # add the shared directory to the Python path
import utils

deployment_name = os.path.basename(os.path.dirname(globals()['__vsc_ipynb_file__']))
resource_group_name = f"frc-010-{deployment_name}" # change the name to match your naming style
resource_group_location = "francecentral"

apim_sku = 'Standardv2'

# Prioritize East US until exhaustion (simulate PTU with TPM), then equally distribute between Sweden and West US (consumption fallback)
openai_resources = [
    {"name": "openai1", "capacity": 20, "location": "eastus", "priority": 1},
    {"name": "openai2", "capacity": 20, "location": "swedencentral", "priority": 2, "weight": 50},
    {"name": "openai3", "capacity": 20, "location": "westus", "priority": 2, "weight": 50}
]

openai_deployment_name = "gpt-4o-mini"
openai_model_name = "gpt-4o-mini"
openai_model_version = "2024-07-18"
openai_model_capacity = 8
openai_model_sku = 'Standard'
openai_api_version = "2024-02-01"

utils.print_ok('Notebook initialized')

✅ Notebook initialized ⌚ 17:16:24.792197
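
The `priority` and `weight` fields in `openai_resources` describe the intended routing: the lowest priority number is used first, and weights split traffic among backends that share a priority. The sketch below simulates that selection rule in plain Python to make the behavior concrete. It is not APIM's implementation; `pick_backend` and the `available` set are hypothetical names, and treating a missing `weight` as 1 is an assumption.

```python
import random

# Same shape as `openai_resources` in the cell above.
backends = [
    {"name": "openai1", "location": "eastus", "priority": 1},
    {"name": "openai2", "location": "swedencentral", "priority": 2, "weight": 50},
    {"name": "openai3", "location": "westus", "priority": 2, "weight": 50},
]

def pick_backend(backends, available, rng=random):
    """Pick one backend name: lowest priority number first, then weighted random."""
    candidates = [b for b in backends if b["name"] in available]
    if not candidates:
        return None  # every backend is exhausted or unhealthy
    top = min(b["priority"] for b in candidates)
    group = [b for b in candidates if b["priority"] == top]
    weights = [b.get("weight", 1) for b in group]  # assumption: default weight of 1
    return rng.choices(group, weights=weights, k=1)[0]["name"]

# While openai1 is available, it receives every request.
all_up = {"openai1", "openai2", "openai3"}
print(pick_backend(backends, all_up))

# Once openai1 is exhausted, traffic is split between the priority-2 backends.
rng = random.Random(0)
fallback = {"openai2", "openai3"}
counts = {"openai2": 0, "openai3": 0}
for _ in range(1000):
    counts[pick_backend(backends, fallback, rng)] += 1
print(counts)
```

With equal weights, the two fallback counts should come out roughly even over many requests.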


<a id='1'></a>
### 1️⃣ Verify the Azure CLI and the connected Azure subscription

The following command confirms that the Azure CLI is signed in and shows which Azure subscription it is connected to.

In [2]:
output = utils.run("az account show", "Retrieved az account", "Failed to get the current az account")

if output.success and output.json_data:
    current_user = output.json_data['user']['name']
    tenant_id = output.json_data['tenantId']
    subscription_id = output.json_data['id']

    utils.print_info(f"Current user: {current_user}")
    utils.print_info(f"Tenant ID: {tenant_id}")
    utils.print_info(f"Subscription ID: {subscription_id}")

⚙️ Running: az account show
✅ Retrieved az account ⌚ 17:16:28.761907 [0m:1s]
👉🏽 Current user: admin@MngEnvMCAP784683.onmicrosoft.com
👉🏽 Tenant ID: 93139d1e-a3c1-4d78-9ed5-878be090eba4
👉🏽 Subscription ID: dcef7009-6b94-4382-afdc-17eb160d709a
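
`utils.run` comes from the lab's shared helpers, whose internals are not shown here. As a rough equivalent, the sketch below parses the JSON that `az account show` prints and pulls out the same three fields. `parse_account` is a hypothetical helper, and the sample payload uses placeholder values.

```python
import json

def parse_account(az_output: str) -> dict:
    """Extract user, tenant and subscription from `az account show` JSON output."""
    account = json.loads(az_output)
    return {
        "user": account["user"]["name"],
        "tenant_id": account["tenantId"],
        "subscription_id": account["id"],
    }

# Placeholder payload with the fields the notebook cell reads.
sample = json.dumps({
    "id": "00000000-0000-0000-0000-000000000000",
    "tenantId": "11111111-1111-1111-1111-111111111111",
    "user": {"name": "user@example.com", "type": "user"},
})

info = parse_account(sample)
print(info["user"])

# To run against the real CLI (requires a prior `az login`):
# import subprocess
# result = subprocess.run(["az", "account", "show", "--output", "json"],
#                         capture_output=True, text=True, check=True)
# info = parse_account(result.stdout)
```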


<a id='2'></a>
### 2️⃣ Create deployment using 🦾 Bicep

This lab uses [Bicep](https://learn.microsoft.com/azure/azure-resource-manager/bicep/overview?tabs=bicep) to declaratively define all the resources that will be deployed in the specified resource group. Change the parameters or edit [main.bicep](main.bicep) directly to try different configurations.
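
A deployment parameters file wraps every value in a `{"value": ...}` object. A small helper can build that structure from a plain dictionary, which keeps things short as parameters are added. This is a sketch only; `to_arm_parameters` is a hypothetical name and not part of the lab's `utils` module.

```python
import json

SCHEMA = "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#"

def to_arm_parameters(values: dict) -> dict:
    """Wrap plain values in the deployment parameters file structure."""
    return {
        "$schema": SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in values.items()},
    }

params = to_arm_parameters({
    "apimSku": "Standardv2",
    "openAIDeploymentName": "gpt-4o-mini",
})
print(json.dumps(params, indent=2))
```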

In [6]:
# Create the resource group if it doesn't exist
utils.create_resource_group(resource_group_name, resource_group_location)

# Define the Bicep parameters
bicep_parameters = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "apimSku": { "value": apim_sku },
        "openAIConfig": { "value": openai_resources },
        "openAIDeploymentName": { "value": openai_deployment_name },
        "openAIModelName": { "value": openai_model_name },
        "openAIModelVersion": { "value": openai_model_version }
    }
}

# Write the parameters to the params.json file
with open('params.json', 'w') as bicep_parameters_file:
    bicep_parameters_file.write(json.dumps(bicep_parameters))

# Run the deployment
! az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file "main.bicep" --parameters "params.json"

⚙️ Running: az group show --name frc-010-private-connectivity
👉🏽 Using existing resource group 'frc-010-private-connectivity'
{
  "id": "/subscriptions/dcef7009-6b94-4382-afdc-17eb160d709a/resourceGroups/frc-010-private-connectivity/providers/Microsoft.Resources/deployments/private-connectivity",
  "location": null,
  "name": "private-connectivity",
  "properties": {
    "correlationId": "7f33f335-830e-4373-9d88-197d2be867b9",
    "debugSetting": null,
    "dependencies": [
      {
        "dependsOn": [
          {
            "id": "/subscriptions/dcef7009-6b94-4382-afdc-17eb160d709a/resourceGroups/frc-010-private-connectivity/providers/Microsoft.Network/networkSecurityGroups/nsg-apim",
            "resourceGroup": "frc-010-private-connectivity",
            "resourceName": "nsg-apim",
            "resourceType": "Microsoft.Network/networkSecurityGroups"
          }
        ],
        "id": "/subscriptions/dcef7009-6b94-4382-afdc-17eb160d709a/resourceGroups/frc






<a id='3'></a>
### 3️⃣ Approve Front Door private link connection to APIM

In the deployed Bicep template, Azure Front Door will establish a private link connection to the API Management service. This connection should be approved. To do that, go to the Azure portal and search for `private link services` in the search bar. Then click on `Pending Connections`. You should see a connection request from Front Door to APIM. Click on it and approve the connection.

![](approve-pl-connection.png)
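
The approval can also be scripted. The sketch below filters a list of private endpoint connections for those still `Pending` and builds the matching `az network private-endpoint-connection approve` commands. The JSON shape of the sample is an assumption modeled on typical `az network private-endpoint-connection list` output, the resource IDs are placeholders, and `pending_connection_ids` and `approve_command` are hypothetical helpers. Check the output of your own CLI version before relying on it.

```python
def pending_connection_ids(connections: list) -> list:
    """Return the IDs of connections whose state is still Pending."""
    ids = []
    for conn in connections:
        state = conn.get("properties", {}).get("privateLinkServiceConnectionState", {})
        if state.get("status") == "Pending":
            ids.append(conn["id"])
    return ids

def approve_command(connection_id: str, description: str = "Approved") -> list:
    """Build the Azure CLI argument list that approves one connection."""
    return [
        "az", "network", "private-endpoint-connection", "approve",
        "--id", connection_id,
        "--description", description,
    ]

# Placeholder data standing in for the output of:
#   az network private-endpoint-connection list --id <apim-resource-id>
sample = [
    {"id": "/placeholder/apim/privateEndpointConnections/afd-1",
     "properties": {"privateLinkServiceConnectionState": {"status": "Pending"}}},
    {"id": "/placeholder/apim/privateEndpointConnections/other",
     "properties": {"privateLinkServiceConnectionState": {"status": "Approved"}}},
]

for connection_id in pending_connection_ids(sample):
    print(" ".join(approve_command(connection_id)))
```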

<a id='4'></a>
### 4️⃣ Disabling APIM public network access

As of today, public network access cannot be disabled while the `APIM` service is being created. This behavior might change in the future. As a workaround, disable public network access after the deployment completes. To do that, go to the Azure portal and navigate to your APIM service. Then click `Networking` in the left menu. Under `Public network access`, select `Disabled` and click `Save`.

![](disable-apim-public-network-access.png)

<a id='5'></a>
### 5️⃣ Get the deployment outputs


Retrieve the required outputs from the Bicep deployment.

In [31]:
# Obtain all of the outputs from the deployment
output = utils.run(f"az deployment group show --name {deployment_name} -g {resource_group_name}", f"Retrieved deployment: {deployment_name}", f"Failed to retrieve deployment: {deployment_name}")

if output.success and output.json_data:
    frontdoor_endpoint = utils.get_deployment_output(output, 'frontDoorEndpointHostName', 'Front Door Endpoint')
    apim_resource_gateway_url = utils.get_deployment_output(output, 'apimResourceGatewayURL', 'APIM API Gateway URL')
    apim_subscription_key = utils.get_deployment_output(output, 'apimSubscriptionKey', 'APIM Subscription Key (masked)', True)

⚙️ Running: az deployment group show --name private-connectivity -g frc-010-private-connectivity
✅ Retrieved deployment: private-connectivity ⌚ 22:20:58.539380 [0m:7s]
👉🏽 Front Door Endpoint: afd-7ooh7amffoknm-f2gmfkcbgfg3gxbd.b01.azurefd.net
👉🏽 APIM API Gateway URL: https://apim-hhocpxnmofpdw.azure-api.net
👉🏽 APIM Subscription Key (masked): ****e1e8


<a id='6'></a>
### 6️⃣ 🧪 Test the API using a direct HTTP call through Front Door

Requests is an elegant and simple HTTP library for Python that will be used here to make raw API requests and inspect the responses. 

In [12]:
import requests

url = f"https://{frontdoor_endpoint}/openai/deployments/{openai_deployment_name}/chat/completions?api-version={openai_api_version}"

messages = {"messages": [
    {"role": "system", "content": "You are a sarcastic, unhelpful assistant."},
    {"role": "user", "content": "Can you tell me the time, please?"}
]}

response = requests.post(url, headers = {'api-key':apim_subscription_key}, json = messages)

print(f"Response status code: {response.status_code}")

if (response.status_code == 200):
    data = json.loads(response.text)
    print(f"💬 {data.get("choices")[0].get("message").get("content")}\n")
else:
    print(f"{response.text}\n")

Response status code: 200
💬 Oh sure, because I totally have a clock and can see your time zone. Just look at your device; it probably has a big, fancy clock on it. You're welcome!
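
The cell above indexes straight into `choices[0]`, which raises an exception if the gateway returns a 200 with an unexpected body. A slightly more defensive variant is sketched below. `extract_reply` is a hypothetical helper, and the sample payload is trimmed to the fields it reads.

```python
from typing import Optional

def extract_reply(payload: dict) -> Optional[str]:
    """Return the first choice's message content, or None if it is absent."""
    choices = payload.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")

sample = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
print(extract_reply(sample))   # prints: Hello!
print(extract_reply({}))       # prints: None
```

In the notebook this could be called as `extract_reply(response.json())`.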



<a id='7'></a>
### 7️⃣ 🧪 Test APIM API access through public network

APIM is not accessible from the public network. The only way to access it is through Azure Front Door, so this direct call to the APIM gateway should fail with a 403 error.

In [16]:
import requests

def call_openai(resource_url):
    url = f"{resource_url}/openai/deployments/{openai_deployment_name}/chat/completions?api-version={openai_api_version}"

    messages = {"messages": [
        {"role": "system", "content": "You are a sarcastic, unhelpful assistant."},
        {"role": "user", "content": "Can you tell me the time, please?"}
    ]}

    response = requests.post(url, headers = {'api-key':apim_subscription_key}, json = messages)

    print(f"Response status code: {response.status_code}")

    if (response.status_code == 200):
        data = json.loads(response.text)
        print(f"💬 {data.get("choices")[0].get("message").get("content")}\n")
    else:
        print(f"{response.text}\n")

call_openai(apim_resource_gateway_url)

Response status code: 403
{ "statusCode": 403, "message": "Request originated from client public IP address 74.241.135.213, public network access on this `Microsoft.ApiManagement/service/apim-hhocpxnmofpdw` is disabled. To connect to `Microsoft.ApiManagement/service/apim-hhocpxnmofpdw`, please use the Private Endpoint from inside your virtual network. To learn more https://aka.ms/apim-privateendpoint " }
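
To turn this check into an assertion rather than something read by eye, the status code and error body can be inspected together. The sketch below assumes the error body is JSON with `statusCode` and `message` fields, as in the output above; `is_public_access_blocked` is a hypothetical helper.

```python
import json

def is_public_access_blocked(status_code: int, body: str) -> bool:
    """True when the gateway rejected the call because public access is disabled."""
    if status_code != 403:
        return False
    try:
        message = json.loads(body).get("message", "")
    except (json.JSONDecodeError, AttributeError):
        return False  # body was not a JSON object
    return "public network access" in message and "disabled" in message

# Body shortened from the response shown above.
body = ('{ "statusCode": 403, "message": "Request originated from client public IP '
        'address 74.241.135.213, public network access on this '
        '`Microsoft.ApiManagement/service/apim-hhocpxnmofpdw` is disabled." }')

print(is_public_access_blocked(403, body))   # prints: True
print(is_public_access_blocked(200, "{}"))   # prints: False
```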



In [48]:
# Print an equivalent curl command for testing from a shell.
# The JSON body is single-quoted, so it must not contain backslash line continuations.
print(f'''curl -X POST "{apim_resource_gateway_url}/openai/deployments/{openai_deployment_name}/chat/completions?api-version={openai_api_version}" \\
  -H "Content-Type: application/json" \\
  -H "api-key: {apim_subscription_key}" \\
  -d '{{"messages": [{{"role": "system", "content": "You are a helpful assistant."}},
        {{"role": "user", "content": "What are 3 things to visit in Seattle?"}}]}}' ''')

curl -X POST "https://apim-hhocpxnmofpdw.azure-api.net/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "api-key: <redacted>" \
  -d '{"messages": [{"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are 3 things to visit in Seattle?"}]}'


<a id='8'></a>
### 8️⃣ 🧪 Test APIM API access through private virtual network

APIM is accessible in the private virtual network through a `Private Endpoint`. This means that any resource within this network can communicate securely with the APIM service. To validate this behavior, the Bicep template creates an `Azure virtual machine` that will act as a `jumpbox` and a `Bastion Host` that will provide secure communication with the VM through `ssh`. 

Call the APIM API from within the VM. This test should succeed with a 200 OK response.

In [121]:
! curl "https://{frontdoor_endpoint}/ip"
# ! curl '{apim_resource_gateway_url}/ip'



20.19.207.182


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    13  100    13    0     0     52      0 --:--:-- --:--:-- --:--:--    53


<a id='clean'></a>
### 🗑️ Clean up resources

When you're finished with the lab, you should remove all your deployed resources from Azure to avoid extra charges and keep your Azure subscription uncluttered.
Use the [clean-up-resources notebook](clean-up-resources.ipynb) for that.