# APIM ❤️ OpenAI

## Backend Circuit Breaking lab
![flow](../../images/backend-circuit-breaking.gif)

Playground to try the built-in [backend circuit breaker functionality of APIM](https://learn.microsoft.com/en-us/azure/api-management/backends?tabs=bicep) against either an Azure OpenAI endpoint or a mock server.

### Prerequisites
- [Python 3.8 or later](https://www.python.org/) installed
- [VS Code](https://code.visualstudio.com/) installed with the [Jupyter notebook extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) enabled
- [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli) installed
- [An Azure Subscription](https://azure.microsoft.com/en-us/free/) with Contributor permissions
- [Access granted to Azure OpenAI](https://aka.ms/oai/access) or just enable the mock service
- [Sign in to Azure with Azure CLI](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli-interactively)

### 0️⃣ Initialize notebook variables
Set the ```mock_disabled``` variable to ```True``` to run this lab against a real Azure OpenAI endpoint, or to ```False``` to simulate equivalent behavior with a mock server.
- The ```mock_webapps``` variable sets the list of deployed Web Apps for the mocking functionality.
- Adjust the location parameters according to your preferences and the [product availability by Azure region.](https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?cdn=disable&products=cognitive-services,api-management)
- Adjust the OpenAI model and version according to the [availability by region.](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models)
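
As a minimal sketch of how the derived values in the next cell are assembled, the snippet below rebuilds the specification URL and the APIM resource name from their parts. The helper names `build_specification_url` and `build_resource_name` are hypothetical and only illustrate the string concatenation the notebook performs inline.

```python
# Base path copied from the notebook's openai_specification_url variable.
SPEC_BASE_URL = (
    "https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/"
    "specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/"
)


def build_specification_url(api_version: str) -> str:
    """Return the inference.json URL for a stable API version such as '2024-02-01'."""
    return SPEC_BASE_URL + api_version + "/inference.json"


def build_resource_name(prefix: str, suffix: str) -> str:
    """Join the lab prefix and a suffix, mirroring lab_prefix + '-aigw-apim'."""
    return prefix + suffix


spec_url = build_specification_url("2024-02-01")
apim_name = build_resource_name("av4", "-aigw-apim")
print(spec_url)
print(apim_name)
```

Changing `lab_prefix` or `openai_api_version` in the cell below changes these derived values in the same way.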

In [None]:
import os
import json
import datetime
import requests
notebook_path = os.path.abspath("")
notebook_name = os.path.basename(globals()['__vsc_ipynb_file__'])

lab_prefix = "av4" # used to ensure unique names within Azure
mock_disabled = True
mock_webapps = [{"name": "openaimock1"}] # ensure that the names are not being used within Azure
resource_group = "lab-ai-gateway"
apim_resource_name = lab_prefix + "-aigw-apim"
apim_resource_location = "eastus"
apim_resource_sku = "Consumption"
openai_resources = [ {"name": lab_prefix + "-aigw-openai1", "location": "eastus"}]
openai_resource_sku = "S0"
openai_model_name = "gpt-35-turbo"
openai_model_version = "0613"
openai_deployment_name = "gpt-35-turbo"
openai_api_version = "2024-02-01"
openai_specification_url='https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/' + openai_api_version + '/inference.json'

### 1️⃣ Create the Azure Resource Group
All resources deployed in this lab will be created in the specified resource group.
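
The cells in this lab detect CLI failures by checking whether the captured output starts with `ERROR`. As a hedged sketch, the hypothetical helper `cli_failed` below scans every captured line instead, on the assumption that other lines (for example warnings) may be printed before the error. The sample lines are made up for illustration and are not real Azure CLI output.

```python
def cli_failed(output_lines):
    """Return True if any captured line starts with 'ERROR' (ignoring leading spaces)."""
    return any(line.lstrip().startswith("ERROR") for line in output_lines)


# Illustrative samples only; real CLI output will look different.
sample_ok = ["{", '  "name": "lab-ai-gateway"', "}"]
sample_bad = ["WARNING: example notice", "ERROR: example failure"]

print(cli_failed(sample_ok))
print(cli_failed(sample_bad))
```

In a notebook cell, the list captured from a `!` command could be passed to this helper directly, since it behaves like a list of output lines.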

In [None]:
resource_group_stdout = ! az group create --name {resource_group} --location {apim_resource_location}
if resource_group_stdout.n.startswith("ERROR"):
    print(resource_group_stdout)
else:
    print("✅ Azure Resource Group ", resource_group, " created ⌚ ", datetime.datetime.now().time())

### 2️⃣ Create an Azure OpenAI resource
Azure OpenAI Service provides REST API access to OpenAI's powerful language models. The following script is based on [this quickstart](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=cli) and creates a new Azure OpenAI resource.
- Note: skip this step if you have an existing resource that you want to reuse.

In [None]:
if mock_disabled:
    openai_resource_name = openai_resources[0].get("name")
    openai_resource_location = openai_resources[0].get("location")
    openai_resource_stdout = ! az cognitiveservices account create --name {openai_resource_name} --resource-group {resource_group} \
                --kind OpenAI --sku-name {openai_resource_sku} --location {openai_resource_location} --custom-domain {openai_resource_name}
    if openai_resource_stdout.n.startswith("ERROR"):
        print(openai_resource_stdout)
    else:
        print("✅ Azure OpenAI resource created ⌚ ", datetime.datetime.now().time())
else:
    print("🚧 Mock enabled, skipping Azure OpenAI resource creation")

### 3️⃣ Deploy a model
Once you create an Azure OpenAI Resource, you must deploy a model before you can start making API calls. The script below creates a Model Deployment using the specified deployment name, model name, and model version.

In [None]:
if mock_disabled:
    openai_resource_name = openai_resources[0].get("name")
    openai_deployment_stdout = ! az cognitiveservices account deployment create --name {openai_resource_name} --resource-group  {resource_group} \
        --deployment-name {openai_deployment_name} --model-name {openai_model_name} --model-version {openai_model_version}  --model-format OpenAI 
    if openai_deployment_stdout.n.startswith("ERROR"):
        print(openai_deployment_stdout)
    else:
        print("✅ OpenAI deployment created ⌚ ", datetime.datetime.now().time())
else:
    print("🚧 Mock enabled, skipping OpenAI deployment creation")

### 4️⃣ Create the API Management (APIM) resource
APIM will act as the AI-Gateway for the OpenAI API. The following script is based on [this quickstart](https://learn.microsoft.com/en-us/azure/api-management/get-started-create-service-instance-cli).
- Note: skip this step if you have an existing instance that you want to reuse.

In [None]:
apim_resource_stdout = ! az apim create -g {resource_group} -n {apim_resource_name} -l {apim_resource_location} \
    --sku-name {apim_resource_sku} --publisher-email noreply@microsoft.com --publisher-name Microsoft --enable-managed-identity
if apim_resource_stdout.n.startswith("ERROR"):
    print(apim_resource_stdout)
else:
    print("✅ Azure API Management resource created ⌚ ", datetime.datetime.now().time())


### 5️⃣ Get the API Management (APIM) resource details
The APIM instance provides a [managed API Gateway](https://learn.microsoft.com/en-us/azure/api-management/api-management-gateways-overview), a [system managed identity](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-use-managed-service-identity) and a master [subscription key](https://learn.microsoft.com/en-us/azure/api-management/api-management-subscriptions). In this lab we will use the master subscription key, but in production scenarios new subscription keys should be created for the API consumers.
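
The following sketch shows one way to pull these details out of the JSON that `az apim show` returns. The `sample_output` document is a hand-written placeholder with dummy identifiers, not real CLI output, and `extract_apim_details` is a hypothetical helper. It guards against a missing `identity` block, which would otherwise raise an `AttributeError` on a chained `.get()` call.

```python
import json

# Placeholder JSON with dummy IDs; only the three fields used by the lab are included.
sample_output = """
{
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/lab-ai-gateway/providers/Microsoft.ApiManagement/service/av4-aigw-apim",
  "gatewayUrl": "https://av4-aigw-apim.azure-api.net",
  "identity": {"principalId": "11111111-1111-1111-1111-111111111111", "type": "SystemAssigned"}
}
"""


def extract_apim_details(raw_json):
    """Return the resource id, gateway URL, and managed identity principal id."""
    resource = json.loads(raw_json)
    identity = resource.get("identity") or {}  # identity is absent if not enabled
    return {
        "id": resource.get("id"),
        "gateway_url": resource.get("gatewayUrl"),
        "principal_id": identity.get("principalId"),
    }


details = extract_apim_details(sample_output)
print(details["gateway_url"])
print(details["principal_id"])
```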

In [None]:
apim_resource_stdout = ! az apim show -g {resource_group} -n {apim_resource_name}
apim_resource = json.loads(apim_resource_stdout.n)
apim_resource_id = apim_resource.get("id")
apim_resource_gateway_url = apim_resource.get("gatewayUrl")
apim_managed_identity = apim_resource.get("identity").get("principalId")
apim_subscription_key = ! az rest --method POST --uri {apim_resource_id}/subscriptions/master/listSecrets?api-version=2022-08-01 --query primaryKey -o tsv
apim_subscription_key = apim_subscription_key.n
print("👉🏻 API Gateway URL: ", apim_resource_gateway_url)

### 6️⃣ Assign a role to enable APIM to access OpenAI API
This lab uses a zero-trust security strategy with a keyless approach using an [Azure Managed Identity](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity). The following script assigns the ```Cognitive Services OpenAI User``` role to the APIM managed identity so that it can access the OpenAI API.

In [None]:
if mock_disabled:
    openai_resource_name = openai_resources[0].get("name")
    openai_resource_stdout = ! az cognitiveservices account show --name {openai_resource_name} --resource-group {resource_group}
    openai_resource = json.loads(openai_resource_stdout.n)
    openai_resource_id = openai_resource.get("id")
    role_assignment_stdout = ! az role assignment create --assignee {apim_managed_identity} \
        --role "Cognitive Services OpenAI User" \
        --scope {openai_resource_id}
    if role_assignment_stdout.n.startswith("ERROR"):
        print(role_assignment_stdout)
    else:
        print("✅ Role assignment created ⌚ ", datetime.datetime.now().time())
else:
    print("🚧 Mock enabled, skipping Role assignment")

### 7️⃣ Create APIM backend with the circuit breaker configuration
The circuit breaker functionality is in Preview, so this script might be updated when it reaches general availability. Check [this document](https://learn.microsoft.com/en-us/azure/api-management/backends?tabs=bicep) for more details.
- Note: the ```interval``` and ```tripDuration``` parameters should be in the ISO 8601 duration format (for example ```PT5M``` for five minutes). In the following example, the circuit breaker trips when there are three or more 429 status codes in a five-minute interval. The circuit breaker resets after one minute.
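
To make the rule concrete, here is a small local simulation of the behavior described above. It is only a sketch under stated assumptions: `parse_duration` handles just the `PT#H#M#S` subset of ISO 8601 durations, `BreakerSketch` is a hypothetical class, and the real APIM implementation may count failures differently.

```python
import re
from datetime import timedelta


def parse_duration(value):
    """Parse a small subset of ISO 8601 durations of the form PT#H#M#S."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


class BreakerSketch:
    """Trips after `count` 429 responses within `interval`; stays open for `trip_duration`."""

    def __init__(self, count, interval, trip_duration):
        self.count = count
        self.interval = parse_duration(interval).total_seconds()
        self.trip_duration = parse_duration(trip_duration).total_seconds()
        self.failures = []      # timestamps (in seconds) of recent 429 responses
        self.open_until = None  # time at which the breaker closes again

    def is_open(self, now):
        return self.open_until is not None and now < self.open_until

    def record(self, status_code, now):
        if self.is_open(now) or status_code != 429:
            return
        # Keep only failures that are still inside the sliding interval.
        self.failures = [t for t in self.failures if now - t < self.interval]
        self.failures.append(now)
        if len(self.failures) >= self.count:
            self.open_until = now + self.trip_duration
            self.failures = []


breaker = BreakerSketch(count=3, interval="PT5M", trip_duration="PT1M")
for t in (0, 10, 20):
    breaker.record(429, now=t)
print(breaker.is_open(now=30))  # the third 429 at t=20 tripped the breaker
print(breaker.is_open(now=90))  # more than one minute later it has reset
```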

In [None]:
if mock_disabled:
    openai_resource_name = openai_resources[0].get("name")
    openai_resource_stdout = ! az cognitiveservices account show --name {openai_resource_name} --resource-group {resource_group}
    openai_resource = json.loads(openai_resource_stdout.n)
    openai_resource_endpoint = openai_resource.get("properties").get("endpoint")
    print("👉🏻 Azure OpenAI endpoint: ", openai_resource_endpoint)    
    backend_properties = {
        "properties": {
            "title": openai_resource_name + " backend",
            "url": openai_resource_endpoint + "/openai",
            "description": "Backend for OpenAI resource " + openai_resource_name,
            "protocol": "http",
            "circuitBreaker": {
                "rules": [
                    {
                        "failureCondition": {
                            "count": 3,
                            "errorReasons": [
                                "Server errors"
                            ],
                            "interval": "PT5M",
                            "statusCodeRanges": [
                                {
                                "min": 429,
                                "max": 429
                                }
                            ]
                        },
                        "name": "myBreakerRule",
                        "tripDuration": "PT1M"
                    }
                ]
            }
        }
    }
    uri = apim_resource_id + "/backends/" + openai_resource_name + "?api-version=2023-05-01-preview"
    backend_properties_text = "\"" + json.dumps(backend_properties).replace("\"","\\\"") + "\""
    backend_creation_stdout = ! az rest --method PUT --uri {uri} --body {backend_properties_text}
    if backend_creation_stdout.n.startswith("ERROR"):
        print(backend_creation_stdout)
    else:
        print("✅ Backend ", openai_resource_name," created ⌚ ", datetime.datetime.now().time())    
else:
    mock_webapp_name = mock_webapps[0].get("name")
    openai_resource_endpoint = "https://" + mock_webapp_name + ".azurewebsites.net"
    print("🚧 Mock enabled, using Mock endpoint instead of Azure OpenAI endpoint: ", openai_resource_endpoint)
    backend_properties = {
        "properties": {
            "title": mock_webapp_name + " backend",
            "url": openai_resource_endpoint + "/openai",
            "description": "Backend for Mock server " + mock_webapp_name,
            "protocol": "http",
            "circuitBreaker": {
                "rules": [
                    {
                        "failureCondition": {
                            "count": 3,
                            "errorReasons": [
                                "Server errors"
                            ],
                            "interval": "PT5M",
                            "statusCodeRanges": [
                                {
                                "min": 429,
                                "max": 429
                                }
                            ]
                        },
                        "name": "myBreakerRule",
                        "tripDuration": "PT1M"
                    }
                ]
            }
        }
    }
    uri = apim_resource_id + "/backends/" + mock_webapp_name + "?api-version=2023-05-01-preview"
    backend_properties_text = "\"" + json.dumps(backend_properties).replace("\"","\\\"") + "\""
    backend_creation_stdout = ! az rest --method PUT --uri {uri} --body {backend_properties_text}
    if backend_creation_stdout.n.startswith("ERROR"):
        print(backend_creation_stdout)
    else:
        print("✅ Backend ", mock_webapp_name," created ⌚ ", datetime.datetime.now().time())    


### 8️⃣ Import the OpenAI API into APIM
The following script will import the OpenAI inference API using the [publicly available](https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable) JSON OpenAPI specification. The subscription key header name will be set to ```api-key``` to match the name used by the OpenAI API.


In [None]:
if mock_disabled:
    openai_resource_stdout = ! az cognitiveservices account show --name {openai_resource_name} --resource-group {resource_group}
    openai_resource = json.loads(openai_resource_stdout.n)
    openai_resource_endpoint = openai_resource.get("properties").get("endpoint")
    print("👉🏻 OpenAI endpoint: ", openai_resource_endpoint)    
else:
    openai_resource_endpoint = "https://" + mock_webapps[0].get("name") + ".azurewebsites.net" # this lab doesn't implement load balancing
    print("🚧 Mock enabled, using Mock endpoint instead of Azure OpenAI endpoint: ", openai_resource_endpoint)
apim_api_import_stdout = ! az apim api import --resource-group {resource_group} --service-name {apim_resource_name} \
        --api-id "openai" --path "openai" --api-type "http" --display-name "OpenAI" --description "OpenAI inference API" \
        --service-url {openai_resource_endpoint}"/openai" --protocols "https" \
        --specification-format OpenApiJson --specification-url {openai_specification_url} \
        --subscription-required true --subscription-key-header-name "api-key" --subscription-key-query-param-name "api-key"
if apim_api_import_stdout.n.startswith("ERROR"):
    print(apim_api_import_stdout)
else:
    print("✅ API imported on ", datetime.datetime.now().time())

### 9️⃣ Update the API policy to use the managed identity and send the bearer token to authenticate to Azure OpenAI
The API policy must include the [documented policy snippet](https://learn.microsoft.com/en-us/azure/api-management/api-management-authenticate-authorize-azure-openai#authenticate-with-managed-identity) to authenticate requests to the Azure OpenAI API using the assigned managed identity.
- Note: adding the policy through the Azure CLI [is not yet available](https://github.com/Azure/azure-cli/issues/14695), so the script uses the ```az rest``` command instead.
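
The request body for the policy update wraps the policy XML in a JSON document. As a hedged sketch, the snippet below builds that body with `json.dumps`, which takes care of escaping quotes and newlines. The `policy_template` shown here is a shortened placeholder, not the lab's actual `policy.xml`, and `build_policy_body` is a hypothetical helper.

```python
import json

# Shortened placeholder policy; the lab's policy.xml contains more elements.
policy_template = """<policies>
    <inbound>
        <base />
        <set-backend-service backend-id="{backend-id}" />
    </inbound>
    <backend>
        <base />
    </backend>
</policies>"""


def build_policy_body(policy_xml, backend_id):
    """Substitute the backend id and wrap the XML in the JSON shape used by the lab."""
    xml = policy_xml.replace("{backend-id}", backend_id)
    return json.dumps({"properties": {"value": xml}})


body = build_policy_body(policy_template, "openaimock1")
print(body[:60])
```

Because the result is valid JSON, it can be parsed back with `json.loads` to check the substituted backend id before sending it.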


In [None]:
if mock_disabled:
    backend_id = openai_resources[0].get("name")
else:
    backend_id = mock_webapps[0].get("name")
with open(notebook_path + "/policy.xml", 'r') as policy_xml_file:
    policy_xml = policy_xml_file.read()
    policy_xml = policy_xml.replace("{backend-id}", backend_id)
with open(notebook_name.replace('ipynb','json'), 'w') as policy_json_file:
    # json.dump escapes quotes, backslashes, and newlines in the policy XML
    json.dump({"properties": {"value": policy_xml}}, policy_json_file)
uri = apim_resource_id + "/apis/openai/policies/policy?api-version=2022-09-01-preview"
body_file_path = "@" + notebook_path + "/" + notebook_name.replace('ipynb','json')
apim_policy_stdout = ! az rest --method PUT --uri {uri} --body {body_file_path}
os.remove(notebook_name.replace('ipynb','json'))
print("✅ Policy updated ⌚ ", datetime.datetime.now().time())

### 🧪 Test the API using a direct HTTP call
Requests is an elegant and simple HTTP library for Python that will be used here to make raw API requests and inspect the responses.
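
To watch the circuit breaker react, it can help to send several requests in a row and tally the status codes. The sketch below does this with an injected `send_request` callable so it runs without network access. `tally_status_codes` is a hypothetical helper and the response sequence is fabricated purely to exercise it. The 503 values assume the behavior described in the APIM documentation, where the gateway answers with 503 Service Unavailable while the breaker is open.

```python
from collections import Counter


def tally_status_codes(send_request, attempts):
    """Call send_request() `attempts` times and count the returned status codes."""
    counts = Counter()
    for _ in range(attempts):
        counts[send_request()] += 1
    return counts


# Fabricated sequence standing in for real gateway responses.
fake_responses = iter([429, 429, 429, 503, 503])
counts = tally_status_codes(lambda: next(fake_responses), attempts=5)
print(dict(counts))
```

Against the real gateway, `send_request` could wrap the `requests.post(...)` call from the cell below and return `response.status_code`.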

In [None]:
url = apim_resource_gateway_url + "/openai/deployments/" + openai_deployment_name + "/chat/completions?api-version=" + openai_api_version
if mock_disabled:
    messages={"messages":[
        {"role": "system", "content": "You are a sarcastic unhelpful assistant."},
        {"role": "user", "content": "Can you tell me the time, please?"}
    ]}
else:
    messages={
        "messages": [
            {
                "role": "system", 
                "content": {
                    "simulation": {
                        "default": {"response_status_code": 200, "wait_time_ms": 0},
                        "openaimock1.azurewebsites.net": {"response_status_code": 429}
                    }
                }
            }
        ]
    }
response = requests.post(url, headers = {'api-key':apim_subscription_key}, json = messages)
print("status code: ", response.status_code)
print("headers ", response.headers)
if (response.status_code == 200):
    data = json.loads(response.text)
    print("response: ", data.get("choices")[0].get("message").get("content"))
else:
    print(response.text)

### 🧪 Test the API using the Azure OpenAI Python SDK
OpenAI provides a widely used [Python library](https://github.com/openai/openai-python). The library includes type definitions for all request params and response fields. The goal of this test is to assert that APIM can seamlessly proxy requests to OpenAI without disrupting its functionality.
- Note: run ```pip install openai``` in a terminal before executing this step.

In [None]:
from openai import AzureOpenAI
if mock_disabled:
    messages=[
        {"role": "system", "content": "You are a sarcastic unhelpful assistant."},
        {"role": "user", "content": "Can you tell me the time, please?"}
    ]
else:
    messages=[
            {
                "role": "system", 
                "content": {
                    "simulation": {
                        "default": {"response_status_code": 200, "wait_time_ms": 0}
                    }
                }
            }
        ]
client = AzureOpenAI(
    azure_endpoint=apim_resource_gateway_url,
    api_key=apim_subscription_key,
    api_version=openai_api_version
)
response = client.chat.completions.create(model=openai_deployment_name, messages=messages)  # Azure expects the deployment name as the model
print(response.choices[0].message.content)

### 🗑️ Clean up resources

When you're finished with the lab, you should remove all your deployed resources from Azure to avoid extra charges and keep your Azure subscription uncluttered. Removing the resource group is the fastest way to remove all Azure resources that you have created.

In [None]:
run_cell = True
if run_cell:
    ! az group delete --name {resource_group} -y
    ! az apim deletedservice purge --service-name {apim_resource_name} --location {apim_resource_location}
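    if mock_disabled:
        openai_resource_name = openai_resources[0].get("name")
        openai_resource_location = openai_resources[0].get("location")  # set here in case step 2 was skipped
        ! az cognitiveservices account purge -g {resource_group} -n {openai_resource_name} -l {openai_resource_location}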