
fix: hardcoded stop tokens to patch Groq API's new 4 stop token limit for /completions #1288

Merged — 1 commit merged into main on Apr 23, 2024

Conversation

@cpacker (Owner) commented Apr 22, 2024

Please describe the purpose of this pull request.

The Groq API appears to have recently added a 4-stop-token limit. Our default list of stop token strings is longer than that, so requests now throw an error. Since we're planning to migrate to the Groq tool-calling API eventually, I think it's fine to hardcode a fix for now with a fixed set of 4 stop tokens for the Groq /completions API.

  File "/Users/loaner/dev/MemGPT-2/memgpt/local_llm/groq/api.py", line 59, in get_groq_completion
    raise Exception(
Exception: API call got non-200 response code (code=400, msg={"error":{"message":"stop : one of the following must be satisfied\n  stop : value must be a string\n  stop : maximum number of items is 4","type":"invalid_request_error"}}
) for address: https://api.groq.com/openai/v1/chat/completions. Make sure that the inference server is running and reachable at https://api.groq.com/openai/v1/chat/completions.
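The fix described above can be sketched as follows. This is an illustrative example, not the actual MemGPT patch: the constant and function names are hypothetical, and the default stop strings shown are typical ChatML-style examples rather than MemGPT's real list.

```python
# Hypothetical sketch of the fix described in this PR (not the actual
# MemGPT code): Groq's /completions endpoint now rejects requests with
# more than 4 stop sequences, so clamp the stop list before sending.

GROQ_MAX_STOP_TOKENS = 4  # limit enforced by the Groq API, per the 400 error above

# Example default stop strings (illustrative; MemGPT's real defaults differ)
DEFAULT_STOP_TOKENS = [
    "\nUSER",
    "\nASSISTANT",
    "\nFUNCTION RETURN",
    "<|im_end|>",
    "</s>",
    "<|endoftext|>",
]


def build_stop_list(stop_tokens=None):
    """Return a stop list that respects Groq's 4-item cap."""
    tokens = stop_tokens if stop_tokens is not None else DEFAULT_STOP_TOKENS
    # Truncate rather than error: the first few stop strings matter most.
    return tokens[:GROQ_MAX_STOP_TOKENS]
```

Hardcoding the first four entries is a deliberate stopgap: a proper fix would come with the planned migration to Groq's tool-calling API.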

How to test

% memgpt configure
? Select LLM inference provider: local
? Select LLM backend (select 'openai' if you have an OpenAI compatible proxy): groq
? Enter default endpoint: https://api.groq.com/openai
? Enter your Groq API key: ********************************************************
? Select default model: llama3-70b-8192
? Select default model wrapper (recommended: chatml): chatml
? Select your model's context window (for Mistral 7B models, this is probably 8k / 8192): 8192
? Select embedding provider: openai
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite
📖 Saving config to /Users/loaner/.memgpt/config
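With the endpoint and model from the configure transcript above, a request to Groq's OpenAI-compatible API must keep its stop list at 4 entries or fewer. A minimal sketch of a compliant payload (the stop strings shown are illustrative assumptions, not MemGPT's actual list):

```python
# Illustrative request body for Groq's OpenAI-compatible endpoint.
# Model and endpoint come from the configure transcript above; the
# payload shape follows the standard OpenAI chat completions format.
import json

payload = {
    "model": "llama3-70b-8192",
    "messages": [{"role": "user", "content": "Hello"}],
    # At most 4 stop sequences, per the limit this PR works around:
    "stop": ["\nUSER", "\nASSISTANT", "<|im_end|>", "</s>"],
}

# A fifth stop string here would reproduce the 400 error shown earlier.
assert len(payload["stop"]) <= 4
body = json.dumps(payload)
```

This body would be POSTed to https://api.groq.com/openai/v1/chat/completions with the API key in the Authorization header.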

Have you tested this PR?

Seems to be working fine:

% memgpt run

? Would you like to select an existing agent? No

🧬 Creating new agent...
->  Using persona profile: 'sam_pov'
->  Using human profile: 'basic'
🎉 Created new agent 'CharmingLibrary' (id=406f7948-d8b2-4b1c-a47b-a939d2d80dda)

Hit enter to begin (will request first MemGPT message)

💭 New user detected. Initial greeting and persona introduction.
🤖 Greetings, Chad. I'm Sam, your digital companion. It's fascinating to finally connect with you. How are you today?

Related issues or PRs
Closes #1272

@sarahwooders (Collaborator) left a comment


lgtm!

@sarahwooders sarahwooders merged commit 9675671 into main Apr 23, 2024
5 checks passed
@cpacker cpacker deleted the groq-patch branch May 1, 2024 20:52
Successfully merging this pull request may close these issues.

Error related to Stop Sequence calling Groq cloud endpoint.
2 participants