Add Budget Manager + Support for Anthropic, Cohere, Palm (100+ LLMs using LiteLLM) #99

Status: Open. Wants to merge 2 commits into base branch `main`.

Conversation

ishaan-jaff

Addresses #47 and #55.

This PR makes two changes:

  • Adds support for 100+ LLMs
  • Adds a budget manager for limiting $ spend per session or per user

Add support for 100+ LLMs

Using LiteLLM: https://github.com/BerriAI/litellm/
LiteLLM is a lightweight package that simplifies LLM API calls: use any LLM as a drop-in replacement for gpt-3.5-turbo.

Example

import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)

Use a budget manager for limiting $ spend per session or per user

LiteLLM exposes a budget manager for each session/user

  • Initialize a budget manager
  • Check the budget manager before making completion calls
  • Update it with costs after each completion call
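The flow above can be sketched with a minimal stand-in budget tracker (simplified Python, not LiteLLM's actual `BudgetManager` implementation; the method names mirror the `create_budget` / `get_current_cost` / `get_total_budget` / `update_cost` calls used in this PR, but real `update_cost` takes a completion object rather than a raw dollar amount):

```python
# Simplified per-user budget tracker illustrating the check/update flow.
# Stand-in sketch only; LiteLLM's BudgetManager computes costs from
# completion responses and can persist budgets between runs.
class SimpleBudgetManager:
    def __init__(self, project_name: str):
        self.project_name = project_name
        self.budgets = {}  # user -> total budget in $
        self.costs = {}    # user -> accumulated cost in $

    def create_budget(self, total_budget: float, user: str):
        self.budgets[user] = total_budget
        self.costs[user] = 0.0

    def get_total_budget(self, user: str) -> float:
        return self.budgets[user]

    def get_current_cost(self, user: str) -> float:
        return self.costs[user]

    def update_cost(self, user: str, cost: float):
        self.costs[user] += cost


manager = SimpleBudgetManager(project_name="stampy_chat")
manager.create_budget(total_budget=10.0, user="session-1")

# check before calling, update after
if manager.get_current_cost(user="session-1") <= manager.get_total_budget(user="session-1"):
    # ... make the completion call here ...
    manager.update_cost(user="session-1", cost=0.25)

print(manager.get_current_cost(user="session-1"))  # 0.25
```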

@ishaan-jaff (Author):

@FraserLee @henri123lemoine can I get a review on this PR?

If this initial commit looks good, I can add docs/testing too.

@mruwnik (Collaborator) left a comment:

nice idea!

@@ -26,6 +28,9 @@

ENCODER = tiktoken.get_encoding("cl100k_base")

# initialize a budget manager to control costs for gpt-4/other llms
budget_manager = litellm.BudgetManager(project_name="stampy_chat")
Collaborator:

how does the budget per user get configured? Could you add a new item to env.py so that it can be configured? Also, what would the units be? (I had a very quick glance at the litellm docs, but otherwise don't know anything about it)

Collaborator:

(I'll give it a proper look tomorrow)

@@ -225,13 +235,16 @@ def talk_to_robot_internal(index, query: str, mode: str, history: List[Dict[str,

# convert talk_to_robot_internal from dict generator into json generator
def talk_to_robot(index, query: str, mode: str, history: List[Dict[str, str]], k: int = STANDARD_K, log: Callable = print):
yield from (json.dumps(block) for block in talk_to_robot_internal(index, query, mode, history, k, log))
session_id = str(uuid.uuid4())
Collaborator:

If I understand this correctly, the budget manager keeps an internal dict counting how much a given session has used? But if you're creating a new id with each call to this function, then each session will have at most 1 call. I'm planning on adding session ids anyway, as they will be needed for logging, so could you do this by extracting the session id from the request params in the main.py functions?
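The suggestion above could look something like this (a hypothetical sketch; the real `main.py` request handling and parameter names may differ):

```python
import uuid


def get_session_id(params: dict) -> str:
    """Reuse a caller-supplied session id, or mint a fresh one if absent.

    Hypothetical helper: the actual request params and key name
    ("session_id") are assumptions, not code from this repo.
    """
    session_id = params.get("session_id")
    if not session_id:
        session_id = str(uuid.uuid4())
    return session_id


# the handler would then pass session_id down into talk_to_robot,
# instead of generating a fresh uuid inside it on every call
params = {"session_id": "abc-123", "query": "hi"}
print(get_session_id(params))  # abc-123
```

This keeps the budget counter tied to the client's session across calls, rather than resetting with every request.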

@@ -225,13 +235,16 @@ def talk_to_robot_internal(index, query: str, mode: str, history: List[Dict[str,

# convert talk_to_robot_internal from dict generator into json generator
def talk_to_robot(index, query: str, mode: str, history: List[Dict[str, str]], k: int = STANDARD_K, log: Callable = print):
yield from (json.dumps(block) for block in talk_to_robot_internal(index, query, mode, history, k, log))
session_id = str(uuid.uuid4())
budget_manager.create_budget(total_budget=10, user=session_id) # init $10 budget
Collaborator:

don't hardcode it - create a setting in env.py
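One way to do that (a sketch; the existing env.py conventions and the variable name `SESSION_BUDGET_USD` are assumptions):

```python
import os

# hypothetical addition to env.py: per-session budget in dollars,
# overridable via an environment variable, defaulting to $10
SESSION_BUDGET_USD = float(os.environ.get("SESSION_BUDGET_USD", "10"))
```

The hardcoded call would then become `budget_manager.create_budget(total_budget=SESSION_BUDGET_USD, user=session_id)`.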

@@ -181,14 +186,19 @@ def talk_to_robot_internal(index, query: str, mode: str, history: List[Dict[str,
t1 = time.time()
response = ''

for chunk in openai.ChatCompletion.create(
# check if budget exceeded for session
if budget_manager.get_current_cost(user=session_id) <= budget_manager.get_total_budget(session_id):
Collaborator:

Is this for the number of allowed tokens or chat calls? Is it a hard total, or does it get reset every now and then? The code runs on gunicorn workers: how will that affect it, since I'm guessing litellm won't communicate across processes?

@ccstan99 ccstan99 mentioned this pull request Jun 10, 2024