# Pricing vs Deployments
So have you ever tried to decipher the pricing for Azure OpenAI and the models that are available? It is likely to make your brain hurt 🤯

Azure OpenAI keeps models around until they are deprecated, and pricing can change with newer "versions", so you need a crystal ball to make sense of it all. I hope this notebook helps a bit.

## Pricing Detail
Let's get the pricing information for the Azure OpenAI models from the [Azure Retail Prices REST API](https://learn.microsoft.com/en-us/rest/api/cost-management/retail-prices/azure-retail-prices).

In [3]:
# Includes
import requests
import json
import pandas as pd
from azure.identity import DefaultAzureCredential



In [8]:
def build_pricing_table(json_data, table_data):
    """Append the 1K-unit meters from one page of results to table_data."""
    # Fields kept from each price item
    keys = ('armSkuName', 'retailPrice', 'unitOfMeasure', 'armRegionName', 'meterName', 'productName')
    for item in json_data['Items']:
        # Only care about the 1K unit of measure for this exercise
        if item['unitOfMeasure'] == '1K':
            table_data.append({key: item[key] for key in keys})

table_data = []

api_url = "https://prices.azure.com/api/retail/prices"
query = "productName eq 'Azure OpenAI' and armRegionName eq 'eastus2'"
response = requests.get(api_url, params={'$filter': query})
response.raise_for_status()
json_data = response.json()

build_pricing_table(json_data, table_data)
nextPage = json_data['NextPageLink']

# Results are paged; follow NextPageLink until it comes back empty
while nextPage:
    response = requests.get(nextPage)
    response.raise_for_status()
    json_data = response.json()
    build_pricing_table(json_data, table_data)
    nextPage = json_data['NextPageLink']

df = pd.DataFrame(table_data)
df = df.sort_values('meterName')
df[['meterName', 'retailPrice']]


index|meterName|retailPrice
---|---|---
8|Az-Babbage-002 Tokens|0.0004
28|Az-Babbage-002-FTuned Tokens|0.0004
15|Az-Babbage-002-Fine Tuned-Input Tokens|0.0004
9|Az-Babbage-002-Fine Tuned-Output Tokens|0.0004
2|Az-Davinci-002 Tokens|0.002
32|Az-Davinci-002-FTuned Tokens|0.002
24|Az-Davinci-002-Fine Tuned-Input Tokens|0.002
25|Az-Davinci-002-Fine Tuned-Output Tokens|0.002
33|Az-Embedding-Babbage-003 Tokens|1.0
31|Az-Embeddings-Ada Tokens|0.0001
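
Some of the meter names in the output above carry the token direction at the end (`-Input Tokens`, `-Output Tokens`), while others have a single price for all tokens. Here is a minimal sketch of splitting a `meterName` into a base name and a direction. `split_meter_name` is a hypothetical helper, and it assumes the naming pattern seen in this sample holds for the other meters too, which may not always be the case.

```python
def split_meter_name(meter_name):
    """Return (base_name, direction) for a meterName string.

    direction is 'input', 'output', or 'both' (one price for all tokens).
    Assumption: names end in ' Tokens' and directional meters end in
    '-Input Tokens' or '-Output Tokens', as in the sample output.
    """
    name = meter_name.strip()
    if name.endswith(" Tokens"):
        name = name[: -len(" Tokens")]
    for suffix, direction in (("-Input", "input"), ("-Output", "output")):
        if name.endswith(suffix):
            return name[: -len(suffix)], direction
    return name, "both"


for meter in ("Az-Babbage-002 Tokens",
              "Az-Babbage-002-Fine Tuned-Input Tokens",
              "Az-Babbage-002-Fine Tuned-Output Tokens"):
    print(split_meter_name(meter))
```

With the pricing DataFrame in hand, something like `df['meterName'].map(split_meter_name)` would give the same split for every row.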


## List Deployments
List the deployments with their models and versions so they can be compared against the pricing.

In [4]:
# Token scope for Azure Resource Manager
api = "https://management.azure.com/.default"
credential = DefaultAzureCredential()
access_token = credential.get_token(api)
header = {'Authorization': f'Bearer {access_token.token}'}

arm = "https://management.azure.com"
items = []
subs = requests.get(f"{arm}/subscriptions?api-version=2022-12-01", headers=header).json()
for sub in subs['value']:
    subId = sub['subscriptionId']

    rgs = requests.get(f"{arm}/subscriptions/{subId}/resourceGroups?api-version=2022-12-01", headers=header).json()
    for rg in rgs['value']:
        resourceGroupName = rg['name']
        # Cognitive Services accounts in this resource group
        accounts_uri = f"{arm}/subscriptions/{subId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts"
        response = requests.get(f"{accounts_uri}?api-version=2023-05-01", headers=header).json()
        for account in response['value']:
            # Azure OpenAI resources have kind 'OpenAI'
            if account['kind'] == 'OpenAI':
                print(f"Running {account['name']}")
                deployments = requests.get(f"{accounts_uri}/{account['name']}/deployments?api-version=2023-05-01", headers=header).json()
                for deployment in deployments['value']:
                    model = deployment['properties']['model']
                    items.append({
                        'subscriptionId': subId,
                        'resourceGroupName': resourceGroupName,
                        'accountName': account['name'],
                        'deploymentName': deployment['name'],
                        'model': model['name'],
                        'version': model['version'],
                        'location': account['location'],
                    })


Running aoai-dai-dev-ncus
Running aoai-dai-dev-fc
Running aoai-dai-dev-ne
Running aoai-dai-dev-ce
Running aoai-dai-dev-use
Running aoai-dai-dev-ae
Running aoai-dai-dev-sn
Running aoai-dai-dev-scus
Running aoai-dai-dev-sc
Running aoai-dai-dev-we
Running aoai-dai-dev-use2
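
The cell above reads only the first page of each list call. Azure Resource Manager list responses can include a `nextLink` when there are more results, so with many resource groups or deployments some could be missed. Here is a minimal sketch of following those links. `iter_arm_items` and its `fetch` parameter are hypothetical names, and the demonstration uses made-up pages rather than real HTTP calls.

```python
def iter_arm_items(url, fetch):
    """Yield every item from an ARM list endpoint, following nextLink.

    fetch is any callable that takes a URL and returns the parsed JSON body.
    """
    while url:
        page = fetch(url)
        yield from page.get("value", [])
        # nextLink is absent (or empty) on the last page
        url = page.get("nextLink")


# Offline demonstration with made-up pages instead of real HTTP calls
fake_pages = {
    "page1": {"value": [{"name": "rg-a"}, {"name": "rg-b"}], "nextLink": "page2"},
    "page2": {"value": [{"name": "rg-c"}]},
}
names = [item["name"] for item in iter_arm_items("page1", fake_pages.__getitem__)]
print(names)
```

In the notebook, `fetch` could be something like `lambda u: requests.get(u, headers=header).json()`, so each `for x in response['value']` loop becomes `for x in iter_arm_items(uri, fetch)`.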


In [7]:
df=pd.DataFrame(items)
#no wrap
pd.set_option('display.expand_frame_repr', False)
print(df[['accountName','deploymentName','model','version','location']])

          accountName            deploymentName                   model         version          location
0   aoai-dai-dev-ncus    text-embedding-ada-002  text-embedding-ada-002               2    northcentralus
1   aoai-dai-dev-ncus              gpt-35-turbo            gpt-35-turbo            0613    northcentralus
2   aoai-dai-dev-ncus        gpt-4-0125-Preview                   gpt-4    0125-Preview    northcentralus
3     aoai-dai-dev-fc              gpt-35-turbo            gpt-35-turbo            0301     francecentral
4     aoai-dai-dev-fc            gpt-35-turbo-2            gpt-35-turbo            0301     francecentral
5     aoai-dai-dev-fc    text-embedding-ada-002  text-embedding-ada-002               2     francecentral
6     aoai-dai-dev-fc               gpt-4-turbo                   gpt-4    1106-Preview     francecentral
7     aoai-dai-dev-ne    text-embedding-ada-002  text-embedding-ada-002               2        norwayeast
8     aoai-dai-dev-ce              gpt-35-turb

## Decipher Ring
When you deploy a model in Azure OpenAI, you pick the model and the version. However, the pricing detail doesn't map to those as directly as you would expect. This table should help.

meterName|model|version|Prompt/Input (per 1K tokens)|Completion/Output (per 1K tokens)
---|---|---|---|---
Az-GPT-3.5-turbo 1️⃣|gpt-35-turbo|0301|0.00200|0.00200
Az-GPT-35-turbo-4k|gpt-35-turbo|0613|0.00150|0.00200
Az-GPT-35-turbo-16k|gpt-35-turbo-16k|0613|0.00300|0.00400
Az-GPT35-Turbo-16K-0125 2️⃣|gpt-35-turbo|0125|0.00050|0.00150
Az-GPT4-8K|gpt-4|0613|0.03000|0.06000
Az-GPT4-32k|gpt-4-32k|0613|0.06000|0.12000
Az-GPT4-Turbo-128K 3️⃣|gpt-4|1106-preview|0.01000|0.03000
Az-GPT4-Turbo-Vision-128K 3️⃣|gpt-4|vision-preview|0.01000|0.03000

1️⃣ For the original model, the price was the same for prompt and completion.

2️⃣ Yes, it is 16k. Now that 0125 is GA, going forward there won't be separate smaller and larger context versions.

3️⃣ This will become the GA model after preview.
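
The table can be turned into a lookup so that a deployment's model and version resolve to a meter name. This is a rough sketch. `METER_BY_DEPLOYMENT` and `meter_for` are hypothetical names, the entries are copied from the table, and versions are lowercased because the deployment listing reports `1106-Preview` while the table uses `1106-preview`.

```python
# (model, version) -> meterName, copied from the table.
# Keys are lowercase so '1106-Preview' and '1106-preview' both match.
METER_BY_DEPLOYMENT = {
    ("gpt-35-turbo", "0301"): "Az-GPT-3.5-turbo",
    ("gpt-35-turbo", "0613"): "Az-GPT-35-turbo-4k",
    ("gpt-35-turbo-16k", "0613"): "Az-GPT-35-turbo-16k",
    ("gpt-35-turbo", "0125"): "Az-GPT35-Turbo-16K-0125",
    ("gpt-4", "0613"): "Az-GPT4-8K",
    ("gpt-4-32k", "0613"): "Az-GPT4-32k",
    ("gpt-4", "1106-preview"): "Az-GPT4-Turbo-128K",
    ("gpt-4", "vision-preview"): "Az-GPT4-Turbo-Vision-128K",
}


def meter_for(model, version):
    """Return the meterName for a deployment, or None if it is not in the table."""
    return METER_BY_DEPLOYMENT.get((model.lower(), version.lower()))


print(meter_for("gpt-4", "1106-Preview"))  # Az-GPT4-Turbo-128K
print(meter_for("gpt-4", "0125-Preview"))  # None, not covered by the table
```

Returning `None` for anything outside the table keeps unknown combinations visible instead of guessing a price for them.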

> This is not a full list, but I think you can see the correlation.

> 💡 Prompts/inputs are cheaper than completions/outputs because of the compute needed to generate the output.
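
To see what the split between input and output prices means for a single call, here is a small worked example. `estimate_cost` is a hypothetical helper, the token counts are made up, and the prices are the gpt-4 0613 (Az-GPT4-8K) values from the table, treated as prices per 1K tokens.

```python
def estimate_cost(prompt_tokens, completion_tokens, input_price, output_price):
    """Estimate the cost of one call from per-1K-token prices."""
    return (prompt_tokens / 1000) * input_price + (completion_tokens / 1000) * output_price


# gpt-4 0613 (Az-GPT4-8K) prices from the table: 0.03 input, 0.06 output
# 1.5 * 0.03 + 0.5 * 0.06 = 0.045 + 0.030
cost = estimate_cost(1500, 500, input_price=0.03, output_price=0.06)
print(f"{cost:.4f}")  # 0.0750
```

Even though the completion is a third of the prompt's length here, it accounts for 40% of the cost.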