
fix: hardcoded stop tokens to patch Groq API's new 4 stop token limit for /completions #1288

Merged — 1 commit merged into main on Apr 23, 2024

Conversation

@cpacker (Owner) commented Apr 22, 2024

Please describe the purpose of this pull request.

The Groq API appears to have recently added a 4-stop-token limit. Our default list of stop token strings is longer than that, so requests now throw an error. Since we're planning to migrate to the Groq tool-calling API eventually, I think it's fine to hardcode a fix for now with a fixed set of 4 stop tokens for the Groq /completions API.

  File "/Users/loaner/dev/MemGPT-2/memgpt/local_llm/groq/api.py", line 59, in get_groq_completion
    raise Exception(
Exception: API call got non-200 response code (code=400, msg={"error":{"message":"stop : one of the following must be satisfied\n  stop : value must be a string\n  stop : maximum number of items is 4","type":"invalid_request_error"}}
) for address: https://api.groq.com/openai/v1/chat/completions. Make sure that the inference server is running and reachable at https://api.groq.com/openai/v1/chat/completions.
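The fix described above can be sketched as follows. This is an illustrative example, not the actual MemGPT patch: the constant and function names are hypothetical, and the default stop strings shown are typical ChatML-style examples rather than MemGPT's real list.

```python
# Hypothetical sketch of the fix described in this PR (not the actual
# MemGPT code): Groq's /completions endpoint now rejects requests with
# more than 4 stop sequences, so clamp the stop list before sending.

GROQ_MAX_STOP_TOKENS = 4  # limit enforced by the Groq API, per the 400 error above

# Example default stop strings (illustrative; MemGPT's real defaults differ)
DEFAULT_STOP_TOKENS = [
    "\nUSER",
    "\nASSISTANT",
    "\nFUNCTION RETURN",
    "<|im_end|>",
    "</s>",
    "<|endoftext|>",
]


def build_stop_list(stop_tokens=None):
    """Return a stop list that respects Groq's 4-item cap."""
    tokens = stop_tokens if stop_tokens is not None else DEFAULT_STOP_TOKENS
    # Truncate rather than error: the first few stop strings matter most.
    return tokens[:GROQ_MAX_STOP_TOKENS]
```

Hardcoding the first four entries is a deliberate stopgap: a proper fix would come with the planned migration to Groq's tool-calling API.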

How to test

% memgpt configure
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): groq
? Enter default endpoint: https://api.groq.com/openai
? Enter your Groq API key: ********************************************************
? Select default model: llama3-70b-8192
? Select default model wrapper (recommended: chatml): chatml
? Select your model's context window (for Mistral 7B models, this is probably 8k / 8192): 8192
? Select embedding provider: openai
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite
📖 Saving config to /Users/loaner/.memgpt/config
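With the endpoint and model from the configure transcript above, a request to Groq's OpenAI-compatible API must keep its stop list at 4 entries or fewer. A minimal sketch of a compliant payload (the stop strings shown are illustrative assumptions, not MemGPT's actual list):

```python
# Illustrative request body for Groq's OpenAI-compatible endpoint.
# Model and endpoint come from the configure transcript above; the
# payload shape follows the standard OpenAI chat completions format.
import json

payload = {
    "model": "llama3-70b-8192",
    "messages": [{"role": "user", "content": "Hello"}],
    # At most 4 stop sequences, per the limit this PR works around:
    "stop": ["\nUSER", "\nASSISTANT", "<|im_end|>", "</s>"],
}

# A fifth stop string here would reproduce the 400 error shown earlier.
assert len(payload["stop"]) <= 4
body = json.dumps(payload)
```

This body would be POSTed to https://api.groq.com/openai/v1/chat/completions with the API key in the Authorization header.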

Have you tested this PR?

Seems to be working fine:

% memgpt run

? Would you like to select an existing agent? No

🧬 Creating new agent...
->  Using persona profile: 'sam_pov'
->  Using human profile: 'basic'
🎉 Created new agent 'CharmingLibrary' (id=406f7948-d8b2-4b1c-a47b-a939d2d80dda)

Hit enter to begin (will request first MemGPT message)

💭 New user detected. Initial greeting and persona introduction.
🤖 Greetings, Chad. I'm Sam, your digital companion. It's fascinating to finally connect with you. How are you today?

Related issues or PRs
Closes #1272

@sarahwooders (Collaborator) left a comment


lgtm!

@sarahwooders sarahwooders merged commit 9675671 into main Apr 23, 2024
5 checks passed
@cpacker cpacker deleted the groq-patch branch May 1, 2024 20:52
Successfully merging this pull request may close these issues.

Error related to Stop Sequence calling Groq cloud endpoint.
2 participants