
Track pricing per request #10

Closed
ishaan-jaff opened this issue Jul 28, 2023 · 6 comments

Comments

@ishaan-jaff

I need to map a completion / embedding request to a dollar cost, i.e. a simple way to get the '$' cost of a request and write it to my DB.

Currently I'm doing:


import tiktoken

############### MODEL Cost Mapping ##################
# Prices are in USD per 1K tokens (OpenAI's published rates at the time).
input_tokens_cost_map = {
   'gpt-3.5-turbo': 0.0015,
   'gpt-4': 0.03,
   'chatgpt-test': 0.0015,
   'chatgpt-v-2': 0.0015,
}

output_tokens_cost_map = {
   'gpt-3.5-turbo': 0.002,
   'gpt-4': 0.06,
   'chatgpt-test': 0.002,
   'chatgpt-v-2': 0.002,
}
#####################################################

def count_tokens(text):
    # token count via the cl100k_base encoding used by the chat models above
    return len(tiktoken.get_encoding("cl100k_base").encode(text))

input_text = " ".join([message["content"] for message in messages])
input_tokens = count_tokens(input_text)

response_text = response['choices'][0]['message']['content']
response_tokens = count_tokens(response_text)

input_tokens_cost = input_tokens_cost_map[model]
output_tokens_cost = output_tokens_cost_map[model]

# the maps hold per-1K-token prices, so divide by 1000
total_cost = (input_tokens * input_tokens_cost + response_tokens * output_tokens_cost) / 1000

@krrishdholakia

What is the expected behavior from litellm? Expose a spend-calculating function?

from litellm import cost_calculator_completion 

cost = cost_calculator_completion(llm_provider, input_tokens, output_tokens) 

??
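
For reference, a rough sketch of what such a helper could look like; this is not litellm's actual API, the function name just follows the proposal above, and the per-1K prices are illustrative:

# Hypothetical sketch -- cost_calculator_completion is a proposed name, not a real litellm function.
# Real pricing varies per model, so a production version would also need the model name.
PRICE_PER_1K = {
    # llm_provider: (input price, output price) in USD per 1K tokens
    "openai": (0.0015, 0.002),
}

def cost_calculator_completion(llm_provider, input_tokens, output_tokens):
    input_price, output_price = PRICE_PER_1K[llm_provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1000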

@krrishdholakia

I think what feels more natural is if litellm actually returned the number of tokens.

  • I believe the package currently returns just the text, which makes it hard to calculate what the tokens even are (they differ across providers). The OpenAI response object does return this, so it would feel natural for this library to map to that.
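
As a sketch of that mapping, assuming a hypothetical normalize_response helper (none of this is litellm's actual implementation), any provider's raw text could be wrapped in an OpenAI-style dict so token counts always live under "usage":

import time
import uuid

def normalize_response(model, prompt_tokens, completion_text, completion_tokens):
    # Hypothetical helper: wrap a provider's raw completion text in an
    # OpenAI-style response dict with a populated "usage" block.
    return {
        "id": f"chatcmpl-{uuid.uuid4()}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": completion_text},
            "finish_reason": "stop",
        }],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }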

@krrishdholakia

I did something like this on my own, but it's obviously not ideal when dealing with multiple providers (if it's not OpenAI, I'm just assuming num tokens = length of the string).

def logging_fn(self, model, scenario, messages, response, logger_fn=None):
    try:
        self.print_verbose(f"the model is: {model}")
        status = "success"
        if logger_fn:
            logger_fn(model=model, messages=messages, response=response)

        # POST this data to the supabase logs
        llm_provider = "azure"  # assume azure by default
        if model in model_provider_map:
            llm_provider = model_provider_map[model]

        if scenario == Scenario.generate:
            if response:
                input_text = " ".join([message["content"] for message in messages])
                output_text = response['choices'][0]['message']['content']
                # fallback: approximate token count by string length
                number_of_input_tokens = len(input_text)
                number_of_output_tokens = len(output_text)

                if llm_provider in ("openai", "azure"):
                    number_of_input_tokens = len(self.openai_encoding.encode(input_text))
                    number_of_output_tokens = len(self.openai_encoding.encode(output_text))
            else:
                number_of_input_tokens = 0
                number_of_output_tokens = 0

        elif scenario == Scenario.embed:
            self.print_verbose(f"input_text: {messages}")
            if isinstance(messages, str):
                input_text = messages
            elif isinstance(messages, list):
                input_text = " ".join(messages)

            number_of_input_tokens = len(self.openai_encoding.encode(input_text))
            number_of_output_tokens = 0  # embedding is priced on input

        data = {
            "user_email": "krrish@berri.ai",
            "model": {"llm_provider": llm_provider, "model_name": model},
            "input_text": {"number_of_tokens": number_of_input_tokens},
            "output_text": {"number_of_tokens": number_of_output_tokens},
        }
        supabase.table('plaid_logs').upsert(data).execute()
    except Exception:
        self.print_verbose(f"An error occurred: {traceback.format_exc()}")
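
A provider-aware token counter would tighten up the string-length fallback above; here's a rough sketch assuming tiktoken for OpenAI/Azure and the common ~4-characters-per-token heuristic for everything else (the helper name is made up):

import tiktoken

def count_tokens_for_provider(text, llm_provider):
    # exact counts for OpenAI/Azure via tiktoken; rough heuristic otherwise
    if llm_provider in ("openai", "azure"):
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    return max(1, len(text) // 4)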

@krrishdholakia

OpenAI response object:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?",
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
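
With a usage block on every response, the cost calculation from the top of the thread collapses to a couple of lines; input_price_per_1k and output_price_per_1k below are placeholders for the per-model rates:

usage = response["usage"]
# per-model prices in USD per 1K tokens
total_cost = (usage["prompt_tokens"] * input_price_per_1k
              + usage["completion_tokens"] * output_price_per_1k) / 1000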

@krrishdholakia

I believe we're missing the usage details.

@ishaan-jaff

Damn - both of us just had the same request, see #11
😱💘
