
[Bug]: Ollama chat tool calls are not parsed correctly #3333

Closed
jackmpcollins opened this issue Apr 27, 2024 · 4 comments · Fixed by #3469
Labels
bug Something isn't working

Comments

jackmpcollins (Contributor) commented Apr 27, 2024

What happened?

ollama_chat/llama2 returns a tool call where the function name and arguments appear together as a single JSON object in the arguments field, and name is incorrectly set to an empty string. This object should instead be parsed into the separate name and arguments fields.

ModelResponse(id='chatcmpl-ae59691c-0d45-4863-b014-d336e2bfbcfd', choices=[Choices(finish_reason='stop', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(function=Function(arguments='{\n"name": "get_current_weather",\n"arguments": {\n"location": "San Francisco, CA",\n"unit": "celsius"\n}\n}', name=''), id='call_532322f5-f8b9-4243-8d9d-018ed629861a', type='function')]))], created=1714247542, model='ollama/llama2', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=12, completion_tokens=42, total_tokens=54))
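
Until this is fixed, a caller can unwrap the mis-parsed object on their side. A rough sketch (my own hypothetical helper, not LiteLLM code), based on the response shape above:

import json

# Hypothetical workaround: if `name` is empty and `arguments` holds the full
# {"name": ..., "arguments": ...} object, re-parse it into separate fields.
def unwrap_tool_call(function):
    if function.name:
        # Already parsed correctly; nothing to unwrap.
        return function.name, function.arguments
    parsed = json.loads(function.arguments)
    if isinstance(parsed, dict) and "name" in parsed and "arguments" in parsed:
        return parsed["name"], json.dumps(parsed["arguments"])
    raise ValueError("Unrecognized tool call shape")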

Separately, when using ollama_chat/llama2 with stream=True, the tool call JSON appears in the content field rather than being parsed into the tool_calls field.

ModelResponse(id='chatcmpl-a60d7240-720c-408d-865e-9a9e3ae58d3e', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='{', role='assistant', function_call=None, tool_calls=None), logprobs=None)], created=1714247544, model='llama2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())
ModelResponse(id='chatcmpl-a60d7240-720c-408d-865e-9a9e3ae58d3e', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='\n', role=None, function_call=None, tool_calls=None), logprobs=None)], created=1714247544, model='llama2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())
ModelResponse(id='chatcmpl-a60d7240-720c-408d-865e-9a9e3ae58d3e', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='"', role=None, function_call=None, tool_calls=None), logprobs=None)], created=1714247544, model='llama2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())
ModelResponse(id='chatcmpl-a60d7240-720c-408d-865e-9a9e3ae58d3e', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='name', role=None, function_call=None, tool_calls=None), logprobs=None)], created=1714247544, model='llama2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())
ModelResponse(id='chatcmpl-a60d7240-720c-408d-865e-9a9e3ae58d3e', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='":', role=None, function_call=None, tool_calls=None), logprobs=None)], created=1714247544, model='llama2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())
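
As a stopgap, the streamed content deltas can be concatenated and re-parsed client-side. A rough sketch (hypothetical helper, for illustration), assuming the whole stream is a single tool-call JSON object as in the chunks above:

import json

# Concatenate the streamed `content` deltas; if the result parses as a
# {"name": ..., "arguments": ...} object, treat it as a tool call.
def tool_call_from_stream(chunks):
    text = "".join(chunk.choices[0].delta.content or "" for chunk in chunks)
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None  # plain text response, not a tool call
    if isinstance(parsed, dict) and "name" in parsed and "arguments" in parsed:
        return parsed["name"], parsed["arguments"]
    return None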

Here's the streamed tool call response using gpt-3.5-turbo-1106, which is parsed correctly:

ModelResponse(id='chatcmpl-9IiNxc82AjQTSu3yBABzB4vsFZa8Q', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id='call_5lHC4zXMKuvxDMOlp5MBASCP', function=Function(arguments='', name='get_current_weather'), type='function', index=0)]), logprobs=None)], created=1714247549, model='gpt-3.5-turbo-1106', object='chat.completion.chunk', system_fingerprint='fp_b953e4de39', usage=Usage())
ModelResponse(id='chatcmpl-9IiNxc82AjQTSu3yBABzB4vsFZa8Q', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='{"lo', name=None), type=None, index=0)]), logprobs=None)], created=1714247549, model='gpt-3.5-turbo-1106', object='chat.completion.chunk', system_fingerprint='fp_b953e4de39', usage=Usage())
ModelResponse(id='chatcmpl-9IiNxc82AjQTSu3yBABzB4vsFZa8Q', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='catio', name=None), type=None, index=0)]), logprobs=None)], created=1714247549, model='gpt-3.5-turbo-1106', object='chat.completion.chunk', system_fingerprint='fp_b953e4de39', usage=Usage())
ModelResponse(id='chatcmpl-9IiNxc82AjQTSu3yBABzB4vsFZa8Q', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='n": "S', name=None), type=None, index=0)]), logprobs=None)], created=1714247549, model='gpt-3.5-turbo-1106', object='chat.completion.chunk', system_fingerprint='fp_b953e4de39', usage=Usage())
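
For reference, these OpenAI-style deltas reassemble by concatenating the arguments fragments per tool-call index. A small sketch of that accumulation (hypothetical helper, for illustration only):

# The first delta for each index carries the id and function name; later
# deltas append fragments to `arguments`.
def accumulate_tool_calls(chunks):
    calls = {}  # index -> {"id", "name", "arguments"}
    for chunk in chunks:
        for tc in chunk.choices[0].delta.tool_calls or []:
            call = calls.setdefault(tc.index, {"id": None, "name": None, "arguments": ""})
            call["id"] = call["id"] or tc.id
            call["name"] = call["name"] or tc.function.name
            call["arguments"] += tc.function.arguments or ""
    return [calls[i] for i in sorted(calls)]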

Fixing this would allow more features of https://github.com/jackmpcollins/magentic to be used with Ollama.
Related issue: jackmpcollins/magentic#194

Code to reproduce this:

import litellm

messages = [{"role": "user", "content": "What's the weather like in San Francisco, Tokyo, and Paris?"}]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

# Non-streaming Ollama call: returns the mis-parsed nested tool-call object shown above
response = litellm.completion(
    model="ollama_chat/llama2",
    messages=messages,
    tools=tools,
    stream=False,
)
print(response)


print("\n---\n")
# Streaming Ollama call: the tool-call JSON arrives in the content deltas
response = litellm.completion(
    model="ollama_chat/llama2",
    messages=messages,
    tools=tools,
    stream=True,
)
for chunk in response:
    print(chunk)


print("\n---\n")
# Streaming OpenAI call for comparison: deltas arrive in tool_calls as expected
response = litellm.completion(
    model="gpt-3.5-turbo-1106",
    messages=messages,
    tools=tools,
    stream=True,
)
for chunk in response:
    print(chunk)

Relevant log output

No response

Twitter / LinkedIn details

@jackmpcollins / https://www.linkedin.com/in/jackmpcollins/

rick-github (Contributor)

I had the same issue but haven't gotten around to looking into it yet; it's possible that #1526 addresses this.

krrishdholakia (Contributor)

Merged the relevant PR in - should be live in the next litellm release - v1.35.34+

cc: @jackmpcollins @rick-github

jackmpcollins (Contributor, Author)

@krrishdholakia The stream=True response is still parsed incorrectly. The code in the description reproduces this.

Regular responses are now parsing correctly, thanks.

ChristianWeyer

> Merged the relevant PR in - should be live in the next litellm release - v1.35.34+
>
> cc: @jackmpcollins @rick-github

#2209 (comment)
