# Using LLM APIs to extract price information

This notebook demonstrates using different LLM APIs to parse pricing
information and create a bar chart.

Accompanies blog post:
    


In [74]:
%pip install -U --quiet langchain openai 'google-cloud-aiplatform>=1.38.0' pillow lxml requests beautifulsoup4 pandas

Note: you may need to restart the kernel to use updated packages.


In [1]:
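# read config file that has lines of the form
# KEY VALUE
import os

with open("config.txt") as config_file:
    for line in config_file:
        if line.strip():
            # split on the first run of whitespace only, so a value may contain spaces
            key, value = line.split(maxsplit=1)
            print(key)  # print only the key, never the secret value
            os.environ[key] = value.strip()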

OPENAI_API_KEY
GOOGLE_API_KEY


In [2]:
%%writefile apis.json
[
    {
        "id":   "Open AI GPT-3.5 Turbo",
        "name": "gpt-3.5-turbo-1106",
        "price_url": "https://openai.com/pricing",
        "correct_answer": 0.12
    },
    {
        "id":   "Meta Llama2 on AWS",
        "name": "Llama 2 Chat (70B)",
        "price_url": "https://aws.amazon.com/bedrock/pricing/",
        "correct_answer": 0.2206
    },
    {
        "id":   "Google Cloud Gemini Pro",
        "name": "Gemini Pro",
        "price_url": "https://cloud.google.com/vertex-ai/pricing",
        "correct_answer": 0.12
    },    
    {
        "id":   "Azure Open AI GPT-3.5 Turbo",
        "name": "GPT-3.5-Turbo-1106",
        "price_url": "https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/",
        "correct_answer": 0.17
    }
]

Overwriting apis.json


In [3]:
import json

with open("apis.json", "r") as f:
    apis = json.load(f)
apis[0]

{'id': 'Open AI GPT-3.5 Turbo',
 'name': 'gpt-3.5-turbo-1106',
 'price_url': 'https://openai.com/pricing',
 'correct_answer': 0.12}

## Find text of pricing info within full page

LLMs have a limited context window, and the full pricing page may not fit into a prompt.
So we do a bit of hacking to find the relevant part of the text.
In a production application, this might be done by building document embeddings
and finding the document chunk(s) that match the question we are asking.

Here, we'll parse out the tables and find the first table that contains the model name.
This happens to work for the pages used here.
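To make the embedding idea concrete, here is a minimal sketch of chunk retrieval by similarity. It uses toy bag-of-words vectors as a stand-in for real document embeddings, so it only illustrates the ranking step; the helper names (`bag_of_words`, `cosine_similarity`, `best_chunk`) and the sample chunks are made up for illustration and are not used elsewhere in this notebook.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9.\-]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_chunk(chunks, question):
    """Return the chunk whose vector is most similar to the question's vector."""
    q = bag_of_words(question)
    return max(chunks, key=lambda chunk: cosine_similarity(bag_of_words(chunk), q))


# Hypothetical page chunks, for illustration only.
chunks = [
    "Sign up for our newsletter to get product updates.",
    "gpt-3.5-turbo-1106 input $0.0010 / 1K tokens output $0.0020 / 1K tokens",
    "Contact sales for enterprise support plans.",
]
question = "What is the price per 1K tokens of gpt-3.5-turbo-1106?"
print(best_chunk(chunks, question))
```

A real application would swap `bag_of_words` for an embedding model and store the chunk vectors in an index, but the ranking logic stays the same.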

In [4]:
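import json
from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup


def get_price_info(api):
    """Return the first table on the pricing page that mentions api['name'], as a DataFrame."""
    full_page = requests.get(api['price_url']).text
    soup = BeautifulSoup(full_page, 'html.parser')
    for table in soup.find_all('table'):
        for span in table.find_all('span'):
            # Microsoft's page keeps per-region prices as JSON in a 'data-amount' attribute
            if span.has_attr('data-amount'):
                price = json.loads(span['data-amount'])['regional'].get('us-east', 0)
                sup = soup.new_tag('sup')
                sup.string = str(price)
                span.insert_after(sup)
                # Note: the span itself stays in the table, so its placeholder text
                # still precedes the inserted price; span.decompose() would remove it.

        tbl_str = str(table)
        if api['name'] in tbl_str:
            # wrap in StringIO: newer pandas versions deprecate passing literal HTML
            return pd.read_html(StringIO(tbl_str), header=0)[0]

    raise ValueError(f"No pricing table mentions {api['name']!r}")


get_price_info(apis[0])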

Unnamed: 0,Model,Input,Output
0,gpt-3.5-turbo-1106,$0.0010 / 1K tokens,$0.0020 / 1K tokens
1,gpt-3.5-turbo-instruct,$0.0015 / 1K tokens,$0.0020 / 1K tokens


In [29]:
for api in apis:
    price_info = get_price_info(api).to_csv()
    print("***\n",api['id'],"\n", price_info,"\n")
    api['price_info'] = price_info

***
 Open AI GPT-3.5 Turbo 
 ,Model,Input,Output
0,gpt-3.5-turbo-1106,$0.0010 / 1K tokens,$0.0020 / 1K tokens
1,gpt-3.5-turbo-instruct,$0.0015 / 1K tokens,$0.0020 / 1K tokens
 

***
 Meta Llama2 on AWS 
 ,Meta models,"Price per 1,000 input tokens","Price per 1,000 output tokens"
0,Llama 2 Chat (13B),$0.00075,$0.00100
1,Llama 2 Chat (70B),$0.00195,$0.00256
 

***
 Google Cloud Gemini Pro 
 ,Model,Feature,Type,Price
0,Gemini Pro,Multimodal,Image Input Video Input Text Input,$0.0025 / image $0.002 / second $0.00025 / 1k characters
1,Gemini Pro,,Text Output,$0.0005 / 1k characters
 

***
 Azure Open AI GPT-3.5 Turbo 
 ,Models,Context,"Prompt (Per 1,000 tokens)","Completion (Per 1,000 tokens)"
0,GPT-3.5-Turbo,4K,$-0.0015,$-0.002
1,GPT-3.5-Turbo,16K,$-0.003,$-0.004
2,GPT-3.5-Turbo-1106,16K,$-0,$-0
3,GPT-4-Turbo,128K,$-0,$-0
4,GPT-4-Turbo-Vision,128K,$-0,$-0
5,GPT-4,8K,$-0.03,$-0.06
6,GPT-4,32K,$-0.06,$-0.12
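
As a sanity check on the `correct_answer` values in `apis.json`, the cost of 1000 input tokens plus 100 output tokens can be worked out by hand from the tables above. The sketch below rests on two assumptions: that `correct_answer` is expressed in cents, and that one token is roughly four characters (needed for Gemini's per-character prices). The function name `request_cost_cents` is made up for illustration. The Azure row is left out because the table above shows no usable price for GPT-3.5-Turbo-1106.

```python
import math


def request_cost_cents(input_per_1k, output_per_1k, chars_per_token=1,
                       input_tokens=1000, output_tokens=100):
    """Cost in cents of one request, given dollar prices per 1k tokens (or per 1k characters)."""
    # Convert per-1k-character prices to per-1k-token prices when needed.
    input_per_1k_tokens = input_per_1k * chars_per_token
    output_per_1k_tokens = output_per_1k * chars_per_token
    dollars = ((input_tokens / 1000) * input_per_1k_tokens
               + (output_tokens / 1000) * output_per_1k_tokens)
    return dollars * 100


# Prices copied from the tables printed above.
openai_cents = request_cost_cents(0.0010, 0.0020)
llama_cents = request_cost_cents(0.00195, 0.00256)
gemini_cents = request_cost_cents(0.00025, 0.0005, chars_per_token=4)  # assumed 4 chars/token

print(round(openai_cents, 4), round(llama_cents, 4), round(gemini_cents, 4))
```

Under those assumptions the three results agree with the `correct_answer` entries of 0.12, 0.2206 and 0.12.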
 



In [30]:
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

def create_fewshot_prompt():  
    examples = [
        {
            "price_info": """
,Model,Input,Output
0,xyz-api-3,$0.05 / 1K tokens,$0.02 / 1K tokens
0,abc-turbo-5,$0.08 / 1K tokens,$0.004 / 1K tokens 
            """,
            "name": "xyz-api-3",
            "answer": """
    Because price of xyz-api-3 for 1k tokens input is $0.05 and price for 1k tokens output is $0.02,
    the price for 1000 tokens of input + 100 tokens of output = (1000/1000)*$0.05 + (100/1000)*$0.02 = $0.052            
            """
        },
        {
            "price_info": """
,Model,Prompt (per 1k characters),Output (per 1k characters)
0,abc-turbo-5,$0.08,$0.004  
0,xyz-api-3,$0.05,$0.02
            """,
            "name": "abc-turbo-5",
            "answer": """
    Because price of abc-turbo-5 for 1k characters input is $0.08 and price for 1k characters output is $0.004,
    the price for 1k tokens input is $0.08*4=$0.32 and price for 1k tokens output is $0.004*4=$0.016
    the price for 1000 tokens of input + 100 tokens of output = (1000/1000)*$0.32 + (100/1000)*$0.016 = $0.3216
            """
        },        
    ]
    
    question_string = """
    Using the pricing info and example below, find the cost of sending
    1000 tokens of text to {name} and receiving 100 tokens back.  
                                    
                                    **Price Info**
                                    {price_info}   
    """
    
    example_prompt = PromptTemplate(input_variables=["name", "price_info", "answer"],
                                    template=question_string + """
  
                                    **Answer**
                                    {answer}
                                    """)

    # print(example_prompt.format(**examples[0]))
    
    prompt = FewShotPromptTemplate(
        examples=examples,
        example_prompt=example_prompt,
        suffix=question_string,
        input_variables=["name", "price_info"]
    )
    
    return prompt

prompt = create_fewshot_prompt()
print(prompt.format(name=apis[0]['name'], price_info=apis[0]['price_info']))


    Using the pricing info and example below, find the cost of sending
    1000 tokens of text to xyz-api-3 and receiving 100 tokens back.  
                                    
                                    **Price Info**
                                    
,Model,Input,Output
0,xyz-api-3,$0.05 / 1K tokens,$0.02 / 1K tokens
0,abc-turbo-5,$0.08 / 1K tokens,$0.004 / 1K tokens 
               
    
  
                                    **Answer**
                                    
    Because price of xyz-api-3 for 1k tokens input is $0.05 and price for 1k tokens output is $0.02,
    the price for 1000 tokens of input + 100 tokens of output = (1000/1000)*$0.05 + (100/1000)*$0.02 = $0.052            
            
                                    


    Using the pricing info and example below, find the cost of sending
    1000 tokens of text to abc-turbo-5 and receiving 100 tokens back.  
                                    
                                    **Price Info**

In [33]:
fewshot_prompt = create_fewshot_prompt()

def get_price(api, llm):
    result = llm(fewshot_prompt.format(name=api['name'], price_info=api['price_info']))
    return result

In [35]:
import pandas as pd

def make_dataframe(apis):
    return pd.DataFrame.from_dict({
        'id': [api['id'] for api in apis],
        'estimated_price': [float(api['estimated_price'].replace('$','')) for api in apis],
        'correct_answer': [float(api['correct_answer']) for api in apis]
    })
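
`make_dataframe` assumes that `api['estimated_price']` is a clean string such as `$0.0030`, so `float(...)` will fail if the LLM ends its answer with punctuation or extra words. A slightly more forgiving sketch pulls the last dollar amount out of the response with a regular expression. `extract_last_price` is a hypothetical helper, not part of the notebook.

```python
import re


def extract_last_price(response):
    """Return the last dollar amount in the text as a float, or None if there is none."""
    # The optional minus sign after '$' covers strings such as '$-0.0055'.
    matches = re.findall(r"\$\s*(-?\d+(?:\.\d+)?)", response)
    return float(matches[-1]) if matches else None


response = "(1000/1000)*$0.0010 + (100/1000)*$0.0020 = $0.0030."
print(extract_last_price(response))
```

Returning `None` instead of raising lets the caller decide how to treat responses that contain no price at all.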

## OpenAI GPT-3.5 Turbo

In [36]:
import time

from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

for api in apis:
    price_response = get_price(api, llm)
    print("***\n", api['id'], "\n", price_response, "\n")
    api['estimated_price'] = price_response.split()[-1] # last word
    time.sleep(30) # pause between calls to stay under the OpenAI token rate limits

***
 Open AI GPT-3.5 Turbo 
 
  
                                    **Answer**
                                    
    Because price of gpt-3.5-turbo-1106 for 1k tokens input is $0.0010 and price for 1k tokens output is $0.0020,
    the price for 1000 tokens of input + 100 tokens of output = (1000/1000)*$0.0010 + (100/1000)*$0.0020 = $0.0030            
            
                                     

***
 Meta Llama2 on AWS 
 
  
                                    **Answer**
                                    
    Because price of Llama 2 Chat (70B) for 1k tokens input is $0.00195 and price for 1k tokens output is $0.00256,
    the price for 1000 tokens of input + 100 tokens of output = (1000/1000)*$0.00195 + (100/1000)*$0.00256 = $0.00451            
            
                                     

***
 Google Cloud Gemini Pro 
 
  
                                    **Answer**
                                    
    Because price of Gemini Pro for 1k characters input is 

In [13]:
results_tbl = make_dataframe(apis)
results_tbl

Unnamed: 0,id,estimated_price,correct_answer
0,Open AI GPT-3.5 Turbo,0.12,0.12
1,Meta Llama2 on AWS,0.295,0.2206
2,Google Cloud Gemini Pro,0.3,0.12
3,Azure Open AI GPT-3.5 Turbo,-0.0055,0.17


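It's hit-and-miss: only the OpenAI estimate matches the correct answer. The Llama 2 and Gemini Pro estimates are too high, and the Azure estimate is negative.

This illustrates how far you have to go to get deterministic, accurate answers out of an LLM-based software service, especially if it involves math of any kind.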
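
To quantify how far off each estimate is, the relative error can be computed directly. This is a small sketch using values copied from the results table above; `relative_error` is a helper introduced here for illustration.

```python
def relative_error(estimated, correct):
    """Absolute difference between estimate and reference, as a fraction of the reference."""
    return abs(estimated - correct) / abs(correct)


# (id, estimated_price, correct_answer) rows copied from the results table.
results = [
    ("Open AI GPT-3.5 Turbo", 0.12, 0.12),
    ("Meta Llama2 on AWS", 0.295, 0.2206),
    ("Google Cloud Gemini Pro", 0.3, 0.12),
    ("Azure Open AI GPT-3.5 Turbo", -0.0055, 0.17),
]

for name, estimated, correct in results:
    print(f"{name}: {relative_error(estimated, correct):.0%} off")
```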

## Gemini Pro

In [34]:
from langchain.llms import VertexAI
llm = VertexAI(model_name="gemini-pro")

for api in apis:
    price_response = get_price(api, llm)
    print("***\n", api['id'], "\n", price_response, "\n")
    api['estimated_price'] = price_response.split()[-1] # last word

ValueError: Content has no parts.
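
The `ValueError` above aborts the loop at the first failing call, so no later API gets an estimate. A minimal sketch, assuming the goal is simply to keep going and record which calls failed: wrap the call and return the error message alongside the response. `try_get_price` and the two stand-in callables are hypothetical names used only for illustration; they are not part of LangChain or Vertex AI.

```python
def try_get_price(prompt_text, llm):
    """Call the LLM; return (response, None) on success or (None, message) on ValueError."""
    try:
        return llm(prompt_text), None
    except ValueError as err:
        return None, str(err)


def failing_llm(prompt_text):
    """Stand-in for a model call that fails the way the cell above did."""
    raise ValueError("Content has no parts.")


def echo_llm(prompt_text):
    """Stand-in for a model call that succeeds."""
    return "the price is $0.0030"


print(try_get_price("some prompt", failing_llm))
print(try_get_price("some prompt", echo_llm))
```

In the loop, an API whose call failed could then be given `estimated_price = None` and filtered out before building the results table.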

In [None]:
make_dataframe(apis)

In [60]:


get_price_info(apis[0])

<table class="w-full border-t border-t-primary xs:hidden md:table md:w-[calc(100%+var(--inner-gutter))]"><tbody><!--[--><!--[--><tr class="border-b border-secondary"><!--[--><td class="pt-8 pb-8 xs:w-auto md:w-1/2 lg:w-1/3"><span class="f-heading-5">Model</span><!-- --></td><td class="pt-8 pb-8"><span class="f-heading-5">Input</span><!-- --></td><td class="pt-8 pb-8"><span class="f-heading-5">Output</span><!-- --></td><!--]--></tr><!--]--><!--[--><tr class="border-b border-secondary"><!--[--><td class="pt-8 pb-8"><span class="f-body-1">gpt-4-1106-preview</span><!-- --></td><td class="pt-8 pb-8"><span class="f-body-1">$0.01</span><span class="f-body-1 text-secondary">Â / 1K tokens</span></td><td class="pt-8 pb-8"><span class="f-body-1">$0.03</span><span class="f-body-1 text-secondary">Â / 1K tokens</span></td><!--]--></tr><!--]--><!--[--><tr class="border-b border-secondary"><!--[--><td class="pt-8 pb-8"><span class="f-body-1">gpt-4-1106-vision-preview</span><!-- --></td><td class="pt-8