
[Bug]: base_url does not seem to work with "http://localhost:1234/v1" #2514

Closed · holisHsu opened this issue Apr 25, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments
holisHsu commented Apr 25, 2024

Thanks in advance for any help, and apologies if what I'm reporting is not a bug.

Description

When I run the following example code with base_url pointing to my local LM Studio server,
ChatCompletion fails and the request path seems to be wrong.

from autogen import AssistantAgent, UserProxyAgent


llm_config = {
    "model": "llama3-7B",
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm_studio",
}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Tell me a joke about NVDA and TESLA stock prices.",
)

Error

[Screenshot of the ChatCompletion error (CleanShot 2024-04-26 at 02 00 46@2x)]

Shouldn't it call /v1/chat/completions instead?
[Screenshot: CleanShot 2024-04-26 at 01 56 33@2x]

Steps to reproduce

  1. Use LM Studio to run llama3-7B locally.
  2. Check that the following curl script works:
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ 
    "model": "cognitivecomputations/dolphin-2.9-llama3-8b-gguf",
    "messages": [ 
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ], 
    "temperature": 0.7, 
    "max_tokens": -1,
    "stream": true
}'
  3. Run the simple user proxy script below and check the error:
from autogen import AssistantAgent, UserProxyAgent


llm_config = {
    "model": "llama3-7B",
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm_studio",
}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant,
    message="Tell me a joke about NVDA and TESLA stock prices.",
)

Model Used

Llama3-7B cognitivecomputations/dolphin-2.9-llama3-8b-gguf/dolphin-2.9-llama3-8b-q8_0.gguf

Expected Behavior

ChatCompletion should return a result.

Screenshots and logs

The traceback log is provided below; if any more information should be revealed, please let me know. Thank you so much.

[Screenshot of the traceback (CleanShot 2024-04-26 at 02 05 13@2x)]

Additional Information

No response

holisHsu added the bug (Something isn't working) label on Apr 25, 2024

riaanpieterse81 commented Apr 25, 2024

Hi @holisHsu
Double-check your model name. I'm not sure the short name is OK; I normally use the full name. I think your short model name would be dolphin-2.9-llama3-8b.

Another way to get your models from LM Studio is to use the id value in the output of curl http://localhost:1234/v1/models (see the sketch after the sample below).

One last thing: a ConversableAgent will be better suited than the AssistantAgent for your use case.

**Sample**

from autogen import AssistantAgent, UserProxyAgent, ConversableAgent


llm_config = {
    "model": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
    "base_url": "http://localhost:1234/v1",
    "api_key": "lm_studio",
}
assistant = ConversableAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

# Start the chat
user_proxy.initiate_chat(
    assistant, 
    message="Tell me a joke about NVDA and TESLA stock prices.",
)
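For reference, here is a minimal sketch of listing the available model ids programmatically, assuming the openai Python package (v1.x, which AutoGen uses under the hood) and LM Studio's OpenAI-compatible server on port 1234; the api_key value is a placeholder, since LM Studio does not validate it:

# Minimal sketch: list the model ids exposed by a local LM Studio server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm_studio")  # key is ignored by LM Studio

for model in client.models.list():
    # Use the printed id as the "model" value in llm_config.
    print(model.id)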

asandez1 (Collaborator) commented Apr 25, 2024

If you use LM Studio, try this curl command:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model-identifier",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": true
  }'

holisHsu (Author) commented Apr 26, 2024

> (quoting @riaanpieterse81's suggestion above about the model name, the /v1/models id, and using ConversableAgent)

Thank you so much. The problem seems to be that I was using the wrong short name; after applying your advice, it works.

I am wondering whether it's possible to provide more detailed error information to indicate what the problem is.

I will close this issue because it's not a bug.
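In the meantime, here is a minimal client-side sketch (using the openai package directly, not AutoGen's API) for checking whether a configured model id is accepted by the local server and printing the raw error body when it is rejected; the model id and api_key below mirror my config and are only illustrative:

# Minimal sketch: probe the local LM Studio server with a model id and
# surface the server-provided error detail if the request is rejected.
import openai

client = openai.OpenAI(base_url="http://localhost:1234/v1", api_key="lm_studio")

try:
    client.chat.completions.create(
        model="llama3-7B",  # the id under test
        messages=[{"role": "user", "content": "ping"}],
        max_tokens=5,
    )
    print("model id accepted")
except openai.APIStatusError as err:
    # status_code and the raw response body usually include the unknown-model message.
    print(err.status_code, err.response.text)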

holisHsu (Author) commented Apr 26, 2024

> (quoting @asandez1's curl suggestion above)

Thanks for your advice. I have tried this and confirmed it works, as mentioned in "Steps to reproduce" 🙏

It turns out I provided the wrong short model name, and I am now wondering whether it's possible to provide more detailed error information to indicate the problem.
