
Track pricing per request #10

Closed
ishaan-jaff opened this issue Jul 28, 2023 · 6 comments

Comments

@ishaan-jaff

I need to map a completion / embedding request to a dollar cost, i.e. a simple way to get the '$' cost of a request and write it to my DB.

Currently I'm doing:


import tiktoken

############### MODEL Cost Mapping ##################
# Prices are in USD per 1K tokens (OpenAI's published rates at the time).
input_tokens_cost_map = {
   'gpt-3.5-turbo': 0.0015,
   'gpt-4': 0.03,
   'chatgpt-test': 0.0015,
   'chatgpt-v-2': 0.0015,
}

output_tokens_cost_map = {
   'gpt-3.5-turbo': 0.002,
   'gpt-4': 0.06,
   'chatgpt-test': 0.002,
   'chatgpt-v-2': 0.002,
}
#####################################################

def count_tokens(text):
    # token count via the cl100k_base encoding used by the chat models above
    return len(tiktoken.get_encoding("cl100k_base").encode(text))

input_text = " ".join([message["content"] for message in messages])
input_tokens = count_tokens(input_text)

response_text = response['choices'][0]['message']['content']
response_tokens = count_tokens(response_text)

input_tokens_cost = input_tokens_cost_map[model]
output_tokens_cost = output_tokens_cost_map[model]

# the maps hold per-1K-token prices, so divide by 1000
total_cost = (input_tokens * input_tokens_cost + response_tokens * output_tokens_cost) / 1000

@krrishdholakia

What is the expected behavior from litellm? Expose a spend-calculating function?

from litellm import cost_calculator_completion 

cost = cost_calculator_completion(llm_provider, input_tokens, output_tokens) 

??
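
For reference, a rough sketch of what such a helper could look like; this is not litellm's actual API, the function name just follows the proposal above, and the per-1K prices are illustrative:

# Hypothetical sketch -- cost_calculator_completion is a proposed name, not a real litellm function.
# Real pricing varies per model, so a production version would also need the model name.
PRICE_PER_1K = {
    # llm_provider: (input price, output price) in USD per 1K tokens
    "openai": (0.0015, 0.002),
}

def cost_calculator_completion(llm_provider, input_tokens, output_tokens):
    input_price, output_price = PRICE_PER_1K[llm_provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1000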

@krrishdholakia

I think what feels more natural is if litellm actually returned the number of tokens.

  • I believe the package currently returns just the text, which makes it hard to calculate what the tokens even are (they differ across providers). The OpenAI response object does return this, so it would feel natural for this library to map to that.
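
As a sketch of that mapping, assuming a hypothetical normalize_response helper (none of this is litellm's actual implementation), any provider's raw text could be wrapped in an OpenAI-style dict so token counts always live under "usage":

import time
import uuid

def normalize_response(model, prompt_tokens, completion_text, completion_tokens):
    # Hypothetical helper: wrap a provider's raw completion text in an
    # OpenAI-style response dict with a populated "usage" block.
    return {
        "id": f"chatcmpl-{uuid.uuid4()}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": completion_text},
            "finish_reason": "stop",
        }],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }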

@krrishdholakia

I did something like this on my own, but it's obviously not ideal when dealing with multiple providers (if it's not OpenAI, I'm just assuming num tokens = length of the string).

def logging_fn(self, model, scenario, messages, response, logger_fn=None):
    try:
        self.print_verbose(f"the model is: {model}")
        status = "success"
        if logger_fn:
            logger_fn(model=model, messages=messages, response=response)

        # POST this data to the supabase logs
        llm_provider = "azure"  # assume azure by default
        if model in model_provider_map:
            llm_provider = model_provider_map[model]

        if scenario == Scenario.generate:
            if response:
                input_text = " ".join([message["content"] for message in messages])
                output_text = response['choices'][0]['message']['content']
                # fallback: approximate token count by string length
                number_of_input_tokens = len(input_text)
                number_of_output_tokens = len(output_text)

                if llm_provider in ("openai", "azure"):
                    number_of_input_tokens = len(self.openai_encoding.encode(input_text))
                    number_of_output_tokens = len(self.openai_encoding.encode(output_text))
            else:
                number_of_input_tokens = 0
                number_of_output_tokens = 0

        elif scenario == Scenario.embed:
            self.print_verbose(f"input_text: {messages}")
            if isinstance(messages, str):
                input_text = messages
            elif isinstance(messages, list):
                input_text = " ".join(messages)

            number_of_input_tokens = len(self.openai_encoding.encode(input_text))
            number_of_output_tokens = 0  # embedding is priced on input

        data = {
            "user_email": "krrish@berri.ai",
            "model": {"llm_provider": llm_provider, "model_name": model},
            "input_text": {"number_of_tokens": number_of_input_tokens},
            "output_text": {"number_of_tokens": number_of_output_tokens},
        }
        supabase.table('plaid_logs').upsert(data).execute()
    except Exception:
        self.print_verbose(f"An error occurred: {traceback.format_exc()}")
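
A provider-aware token counter would tighten up the string-length fallback above; here's a rough sketch assuming tiktoken for OpenAI/Azure and the common ~4-characters-per-token heuristic for everything else (the helper name is made up):

import tiktoken

def count_tokens_for_provider(text, llm_provider):
    # exact counts for OpenAI/Azure via tiktoken; rough heuristic otherwise
    if llm_provider in ("openai", "azure"):
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    return max(1, len(text) // 4)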

@krrishdholakia

OpenAI response object:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?",
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
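
With a usage block on every response, the cost calculation from the top of the thread collapses to a couple of lines; input_price_per_1k and output_price_per_1k below are placeholders for the per-model rates:

usage = response["usage"]
# per-model prices in USD per 1K tokens
total_cost = (usage["prompt_tokens"] * input_price_per_1k
              + usage["completion_tokens"] * output_price_per_1k) / 1000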

@krrishdholakia

I believe we're missing the usage details.

@ishaan-jaff

Damn - both of us just had the same request, see #11
😱💘
