
[Bug]: base_url does not seem to work with "http://localhost:1234/v1" #2514

Closed · holisHsu opened this issue Apr 25, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments
holisHsu commented Apr 25, 2024

Thanks in advance for any help, and apologies if what I'm reporting is not a bug.

Description

When I run the following example code with base_url pointing to my local LM Studio server,
ChatCompletion fails and the request path seems to be wrong.

from autogen import AssistantAgent, UserProxyAgent


llm_config = {
    "model": "llama3-7B",
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm_studio",
}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Tell me a joke about NVDA and TESLA stock prices.",
)

Error

[Screenshot of the ChatCompletion error (CleanShot 2024-04-26 at 02 00 46@2x)]

Shouldn't it call /v1/chat/completions instead?
[Screenshot: CleanShot 2024-04-26 at 01 56 33@2x]

Steps to reproduce

  1. Use LM Studio to run llama3-7B locally.
  2. Check that the following curl script works:
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ 
    "model": "cognitivecomputations/dolphin-2.9-llama3-8b-gguf",
    "messages": [ 
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ], 
    "temperature": 0.7, 
    "max_tokens": -1,
    "stream": true
}'
  3. Run the simple user proxy script below and check the error:
from autogen import AssistantAgent, UserProxyAgent


llm_config = {
    "model": "llama3-7B",
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm_studio",
}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Tell me a joke about NVDA and TESLA stock prices.",
)

Model Used

Llama3-7B cognitivecomputations/dolphin-2.9-llama3-8b-gguf/dolphin-2.9-llama3-8b-q8_0.gguf

Expected Behavior

ChatCompletion should return a result.

Screenshots and logs

The traceback log is provided below; if any more information should be revealed, please let me know. Thank you so much.

[Screenshot of the traceback (CleanShot 2024-04-26 at 02 05 13@2x)]

Additional Information

No response

holisHsu added the bug (Something isn't working) label on Apr 25, 2024

riaanpieterse81 commented Apr 25, 2024

Hi @holisHsu
Double-check your model name. I'm not sure the short name is OK; I normally use the full name. I think your short model name would be dolphin-2.9-llama3-8b.

Another way to get your models from LM Studio is to use the id value in the output of curl http://localhost:1234/v1/models (see the sketch after the sample below).

One last thing: a ConversableAgent will be better suited than the AssistantAgent for your use case.

**Sample**

from autogen import AssistantAgent, UserProxyAgent, ConversableAgent


llm_config = {
    "model": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm_studio",
}
assistant = ConversableAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant, 
    message="Tell me a joke about NVDA and TESLA stock prices.",
)
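For reference, here is a minimal sketch of listing the available model ids programmatically, assuming the openai Python package (v1.x, which AutoGen uses under the hood) and LM Studio's OpenAI-compatible server on port 1234; the api_key value is a placeholder, since LM Studio does not validate it:

# Minimal sketch: list the model ids exposed by a local LM Studio server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm_studio")  # key is ignored by LM Studio

for model in client.models.list():
    # Use the printed id as the "model" value in llm_config.
    print(model.id)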

asandez1 (Collaborator) commented Apr 25, 2024

If you use LM Studio, try this curl command:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model-identifier",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": true
  }'

holisHsu (Author) commented Apr 26, 2024

> (quoting @riaanpieterse81's suggestion above about the model name, the /v1/models id, and using ConversableAgent)

Thank you so much. The problem seems to be that I was using the wrong short name; after applying your advice, it works.

I am wondering whether it's possible to provide more detailed error information to indicate what the problem is.

I will close this issue because it's not a bug.
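In the meantime, here is a minimal client-side sketch (using the openai package directly, not AutoGen's API) for checking whether a configured model id is accepted by the local server and printing the raw error body when it is rejected; the model id and api_key below mirror my config and are only illustrative:

# Minimal sketch: probe the local LM Studio server with a model id and
# surface the server-provided error detail if the request is rejected.
import openai

client = openai.OpenAI(base_url="http://localhost:1234/v1", api_key="lm_studio")

try:
    client.chat.completions.create(
        model="llama3-7B",  # the id under test
        messages=[{"role": "user", "content": "ping"}],
        max_tokens=5,
    )
    print("model id accepted")
except openai.APIStatusError as err:
    # status_code and the raw response body usually include the unknown-model message.
    print(err.status_code, err.response.text)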

holisHsu (Author) commented Apr 26, 2024

> (quoting @asandez1's curl suggestion above)

Thanks for your advice. I have tried this and confirmed it works, as mentioned in "Steps to reproduce" 🙏

It turns out I provided the wrong short model name, and I am now wondering whether it's possible to provide more detailed error information to indicate the problem.
