
FEAT: ChatGLM3 tool calls #701

Merged (21 commits, Dec 1, 2023)
Conversation

@codingl2k1 (Contributor) commented Nov 30, 2023

Support tool calls for ChatGLM3. Note that the function arguments in the response are returned as a JSON string:

POST URL: http://localhost:42183/v1/chat/completions

Payload JSON:

{
    "model": "test_tool",
    "messages": [
        {
            "role": "user",
            "content": "帮我查询股票10111的价格"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "track",
                "description": "追踪指定股票的实时价格",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "symbol": {
                            "description": "需要追踪的股票代码"
                        }
                    },
                    "required": [
                        "symbol"
                    ]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "text-to-speech",
                "description": "将文本转换为语音",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "description": "需要转换成语音的文本"
                        },
                        "voice": {
                            "description": "要使用的语音类型(男声、女声等)"
                        },
                        "speed": {
                            "description": "语音的速度(快、中等、慢等)"
                        }
                    },
                    "required": [
                        "text"
                    ]
                }
            }
        }
    ],
    "stop": [
        "\n"
    ]
}
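
For illustration, here is a minimal Python sketch that posts this payload with requests (trimmed to the first tool; the port 42183 is just the one from this example):

import json
import requests

url = "http://localhost:42183/v1/chat/completions"
payload = {
    "model": "test_tool",
    "messages": [{"role": "user", "content": "Help me check the price of stock 10111"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "track",
                "description": "Track the real-time price of a given stock",
                "parameters": {
                    "type": "object",
                    "properties": {"symbol": {"description": "The stock symbol to track"}},
                    "required": ["symbol"],
                },
            },
        }
    ],
    "stop": ["\n"],
}

resp = requests.post(url, json=payload)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=4, ensure_ascii=False))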

ChatGLM3 response (the arguments field is a JSON string):

{
    "id": "chatcmpl-095b7d07-74ed-44aa-9405-a3a999f73ebf",
    "model": "test_tool",
    "object": "chat.completion",
    "created": 1701339681,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": null,
                "tool_calls": [
                    {
                        "id": "call_095b7d07-74ed-44aa-9405-a3a999f73ebf",
                        "type": "function",
                        "function": {
                            "name": "track",
                            "arguments": "{\"symbol\": \"10111\"}"
                        }
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "prompt_tokens": -1,
        "completion_tokens": -1,
        "total_tokens": -1
    }
}
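
Since arguments is a JSON string rather than an object, a client has to decode it with json.loads before dispatching the tool call. A minimal, self-contained sketch using a trimmed copy of the response above:

import json

# Trimmed version of the chat.completion response body shown above.
response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_095b7d07-74ed-44aa-9405-a3a999f73ebf",
                        "type": "function",
                        "function": {"name": "track", "arguments": "{\"symbol\": \"10111\"}"},
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ]
}

choice = response["choices"][0]
if choice["finish_reason"] == "tool_calls":
    for tool_call in choice["message"]["tool_calls"]:
        func = tool_call["function"]
        args = json.loads(func["arguments"])  # the arguments field is a JSON string
        print(func["name"], args)  # -> track {'symbol': '10111'}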

ChatGLM: https://github.com/THUDM/ChatGLM3/blob/main/tool_using/README.md
ChatGLM cpp: https://github.com/li-plus/chatglm.cpp/blob/main/examples/chatglm3_demo.py

Closes: #676

@XprobeBot added this to the v0.6.5 milestone on Nov 30, 2023
@codingl2k1 changed the title from FEAT: ChatGLM tool calls to FEAT: ChatGLM3 tool calls on Nov 30, 2023
@codingl2k1 marked this pull request as ready for review on December 1, 2023 03:34
@waltcow (Contributor) commented Dec 1, 2023

LGTM

@waltcow (Contributor) commented Dec 1, 2023

[address=0.0.0.0:43309, pid=26826] string indices must be integers
Traceback (most recent call last):
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/api/restful_api.py", line 787, in stream_results
    iterator = await model.chat(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 657, in send
    result = await self._run_coro(message.message_id, coro)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 368, in _run_coro
    return await coro
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xoscar/api.py", line 306, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/core/utils.py", line 33, in wrapped
    ret = await func(*args, **kwargs)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/core/model.py", line 67, in wrapped_func
    ret = await fn(self, *args, **kwargs)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/core/model.py", line 252, in chat
    return await self._call_wrapper(_wrapper)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/core/model.py", line 200, in _call_wrapper
    return await asyncio.to_thread(_wrapper)
  File "/root/miniconda3/envs/infer/lib/python3.10/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/root/miniconda3/envs/infer/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/core/model.py", line 240, in _wrapper
    getattr(self._model, "chat")(prompt, *args, **kwargs)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/model/llm/pytorch/chatglm.py", line 158, in chat
    return self._tool_calls_completion(msg[0], self.model_uid)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/model/llm/pytorch/chatglm.py", line 120, in _tool_calls_completion
    "name": msg["name"],
TypeError: [address=0.0.0.0:43309, pid=26826] string indices must be integers
Traceback (most recent call last):
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/blocks.py", line 1199, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
    return await iterator.__anext__()
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/utils.py", line 623, in asyncgen_wrapper
    async for response in f(*args, **kwargs):
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/chat_interface.py", line 437, in _stream_fn
    first_response = await async_iteration(generator)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
    return await iterator.__anext__()
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/core/chat_interface.py", line 111, in generate_wrapper
    for chunk in model.chat(
  File "/root/miniconda3/envs/infer/lib/python3.10/site-packages/xinference/client/common.py", line 49, in streaming_response_iterator
    raise Exception(str(error))
Exception: [address=0.0.0.0:43309, pid=26826] string indices must be integers
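
For context, this TypeError is what Python raises when a string is indexed with a string key, which suggests msg in _tool_calls_completion is a plain string here rather than the expected dict:

# Minimal repro of the failure mode seen in the traceback above.
msg = "The largest animal is the blue whale."  # a plain-text model reply
msg["name"]  # TypeError: string indices must be integers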

@aresnow1 (Contributor) commented Dec 1, 2023

What is the error message? Did you test function calling on this branch?

@waltcow (Contributor) commented Dec 1, 2023

What is the error message? Did you test function calling on this branch?

Yes, I merged your branch and tested it.


tool is {'role': 'system', 'content': 'Answer the following questions as best as you can. You have access to the following tools:', 'tools': []}

It seems tools is an empty list, not null.
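
One plausible guard for this (a sketch only, with a hypothetical helper name; the actual fix in this PR may differ) is to only treat the model output as a tool call when it is a dict, and fall back to a plain completion for string replies:

def handle_chatglm3_reply(msg):
    # ChatGLM3 returns a dict for tool calls and a plain str for normal text,
    # so guard before indexing with string keys.
    if isinstance(msg, dict) and "name" in msg:
        return {"kind": "tool_call", "name": msg["name"], "parameters": msg.get("parameters", {})}
    return {"kind": "text", "content": msg}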

@aresnow1 (Contributor) commented Dec 1, 2023

@waltcow how did you call it? Using the Python client or just via the RESTful API?

@waltcow (Contributor) commented Dec 1, 2023

curl -X 'POST' \
  'http://127.0.0.1:9997/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "af1b28d0-9020-11ee-9058-fdd89f67f6c7",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "What is the largest animal?"
        }
    ]
  }'

@aresnow1 (Contributor) commented Dec 1, 2023

I couldn't reproduce it. Could you check your openai version? It needs to be later than v1.0:
pip show openai
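
For reference, with openai >= 1.0 the Python-client call against a local xinference endpoint looks roughly like this (a sketch only; the base URL and model UID come from the curl example above, and the API key is a placeholder since xinference does not require one here):

import openai

client = openai.OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-needed")
completion = client.chat.completions.create(
    model="af1b28d0-9020-11ee-9058-fdd89f67f6c7",  # model UID from the curl example
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the largest animal?"},
    ],
)
print(completion.choices[0].message.content)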

@waltcow (Contributor) commented Dec 1, 2023

(infer) ➜ inference git:(tool) pip show openai
Name: openai
Version: 1.3.6
Summary: The official Python library for the openai API
Home-page:
Author:
Author-email: OpenAI support@openai.com
License:
Location: /root/miniconda3/envs/infer/lib/python3.10/site-packages
Requires: anyio, distro, httpx, pydantic, sniffio, tqdm, typing-extensions
Required-by: xinference

@aresnow1 (Contributor) commented Dec 1, 2023

@waltcow thanks for your feedback. I've reproduced it; this issue happens with the pytorch format, and we will fix it in this PR.

@waltcow (Contributor) commented Dec 1, 2023

It works now, thanks.

@ChengjieLi28 merged commit 909a428 into xorbitsai:main on Dec 1, 2023
12 checks passed

Successfully merging this pull request may close these issues.

FEAT: Support function calling for chatglm3