xinference, glm4 tool call returns error 400 #1812

JinCheng666 · 2024-06-21T03:09:59Z

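Routine checks

I have confirmed that there is currently no similar issue
I have read the project README and the project documentation in full
I used my own key and confirmed that it works normally
I understand and am willing to follow up on this issue, help with testing, and provide feedback
I understand and agree with the above, and understand that the maintainers have limited time, so issues that do not follow the rules may be ignored or closed directly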

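Your version

Self-hosted deployment, version: 4.8.4-fix

Problem description, log screenshots

glm4:9b is deployed with xinference and connected to fastgpt through oneapi. Using glm4's chat feature in fastgpt returns a 400 error, while the same kind of question sent to oneapi directly with curl is answered normally.

Calling oneapi directly with curl works: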

> curl --location --request POST 'http://10.4.134.11:3000/v1/chat/completions' \
--header 'Authorization: Bearer sk-rTNTs8GHs35Qsd5B28E307D9894d44189956F5AfAfF9C111' \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "glm4:9b",
  "messages": [{"role": "user", "content": "Hello!"}]
}'
{"id":"chat784f4d72-2f7b-11ef-84ab-0cda411d272a","object":"chat.completion","created":1718939282,"model":"glm4:9b","choices":[{"index":0,"message":{"role":"assistant","content":"\nHello 👋! I'm ChatGLM, the AI assistant, nice to meet you. How can I help you today?"},"finish_reason":"stop"}],"usage":{"prompt_tokens":7,"completion_tokens":28,"total_tokens":35}}%


oneapi log for this request:
[SYS] 2024/06/21 - 11:08:05 | model ratio not found: glm4:9b
[INFO] 2024/06/21 - 11:08:05 | 2024062111080584198360052664778 | user 1 has enough quota 999228049506, trusted and no need to pre-consume
[GIN] 2024/06/21 - 11:08:06 | 2024062111080584198360052664778 | 200 |     843.191ms |     10.4.134.11 |    POST /v1/chat/completions
[INFO] 2024/06/21 - 11:08:06 | 2024062111080584198360052664778 | record consume log: userId=1, channelId=13, promptTokens=7, completionTokens=28, modelName=glm4:9b, tokenName=test, quota=1050, content=模型倍率 30.00，分组倍率 1.00，补全倍率 1.00
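
The curl test above sends only `model` and `messages`, so it does not exercise the tool-call path. A minimal sketch of a request that does include a `tools` array, for reproducing the 400 directly against oneapi, might look like the following. The URL and model name are copied from the curl test; the API key is a placeholder, `get_weather` is a hypothetical tool invented for illustration, and the exact payload fastgpt sends is not shown in this issue, so this is only an approximation of it.

```python
import json
import urllib.error
import urllib.request

# Assumptions: URL and model come from the curl test above; the key is a
# placeholder and `get_weather` is a hypothetical tool used for illustration.
BASE_URL = "http://10.4.134.11:3000/v1/chat/completions"
API_KEY = "YOUR_ONEAPI_KEY"


def build_tool_call_payload(model: str = "glm4:9b", stream: bool = False) -> dict:
    """Build an OpenAI-style chat request that carries a `tools` array."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": "What is the weather in Beijing?"}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
        "stream": stream,
    }


def send(payload: dict, api_key: str = API_KEY, url: str = BASE_URL):
    """POST the payload and return (status_code, body), including on HTTP errors."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # The response body of a 400 usually carries the upstream error detail.
        return err.code, err.read().decode("utf-8")


if __name__ == "__main__":
    # Try both modes; a difference between them would help narrow the cause.
    for stream in (False, True):
        status, body = send(build_tool_call_payload(stream=stream))
        print(stream, status, body[:500])
```

If this returns 400 while the plain request returns 200, the problem lies in how the backend handles `tools` rather than in the fastgpt or oneapi channel setup.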

Version information:

xinference:0.12.1
fastgpt:4.8.4-fix
oneapi:0.5.10
glm4:glm-4-9b-chat

config.json

{
      "model": "glm4:9b",
      "name": "glm4:9b",
      "maxContext": 8192,
      "avatar": "/imgs/model/qwen.svg",
      "maxResponse": 3000,
      "quoteMaxToken": 6000,
      "maxTemperature": 1.2,
      "charsPointsPrice": 0,
      "censor": false,
      "vision": false,
      "datasetProcess": false,
      "usedInClassify": true,
      "usedInExtractFields": true,
      "usedInToolCall": true,
      "usedInQueryExtension": true,
      "toolChoice": true,
      "functionCall": true,
      "customCQPrompt": "",
      "customExtractPrompt": "",
      "defaultSystemChatPrompt": "",
      "defaultConfig": {}
    },

oneapi error log

[SYS] 2024/06/21 - 11:04:56 | model ratio not found: glm4:9b
[INFO] 2024/06/21 - 11:04:56 | 2024062111045612344700016869473 | user 1 has enough quota 999228049506, trusted and no need to pre-consume
[ERR] 2024/06/21 - 11:04:56 | 2024062111045612344700016869473 | relay error happen, status code is 400, won't retry in this case
[ERR] 2024/06/21 - 11:04:56 | 2024062111045612344700016869473 | relay error (channel #13): bad response status code 400
[GIN] 2024/06/21 - 11:04:56 | 2024062111045612344700016869473 | 400 |     17.5219ms |     10.4.134.11 |    POST /v1/chat/completions

fastgpt error log

pg connected
default team exist new ObjectId("65df05ace2bdb7c8f32fa173")
root user init: { username: 'root', password: 'AN_quan565656' }
init pg successful
[Info] 2024-06-21 03:04:53 [QA Queue] Done 
[Info] 2024-06-21 03:04:53 [Vector Queue] Done 
[Info] 2024-06-21 03:04:54 Request finish /api/core/app/version/publish?appId=6674e4017f38dd6c0ff4b4f9, time: 427ms 
[Error] 2024-06-21 03:04:56 sse error: bad_response_status_code bad response status code 400 (request id: 2024062111045612344700016869473) 
[Info] 2024-06-21 03:05:00 [QA Queue] Done 
[Info] 2024-06-21 03:05:00 [Vector Queue] Done 
{
  message: '400 bad response status code 400 (request id: 2024062111045612344700016869473)',
  stack: 'Error: 400 bad response status code 400 (request id: 2024062111045612344700016869473)\n' +
    '    at eL.generate (/app/projects/app/.next/server/chunks/76750.js:15:67594)\n' +
    '    at av.makeStatusError (/app/projects/app/.next/server/chunks/76750.js:15:79337)\n' +
    '    at av.makeRequest (/app/projects/app/.next/server/chunks/76750.js:15:80260)\n' +
    '    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
    '    at async w (/app/projects/app/.next/server/chunks/75612.js:309:2105)\n' +
    '    at async Object.w [as tools] (/app/projects/app/.next/server/chunks/75612.js:305:4790)\n' +
    '    at async k (/app/projects/app/.next/server/chunks/75612.js:313:2241)\n' +
    '    at async Promise.all (index 0)\n' +
    '    at async E (/app/projects/app/.next/server/chunks/75612.js:313:2782)\n' +
    '    at async h (/app/projects/app/.next/server/pages/api/core/chat/chatTest.js:1:3266)'
}
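
The request id `2024062111045612344700016869473` appears in both the oneapi and the fastgpt error logs, even though their timestamps differ (11:04:56 versus 03:04:56, presumably a timezone offset). A small sketch for pairing lines from the two logs by that id is shown below; the regular expressions are assumptions based only on the log lines quoted in this issue, and `group_by_request_id` is a hypothetical helper, not part of either project.

```python
import re
from collections import defaultdict

# Patterns inferred from the log excerpts in this issue (an assumption,
# not a documented log format).
ONEAPI_ID = re.compile(r"\|\s*(\d{20,})\s*\|")
FASTGPT_ID = re.compile(r"request id:\s*(\d+)")


def group_by_request_id(oneapi_lines, fastgpt_lines):
    """Group log lines from both services under the request id they mention."""
    grouped = defaultdict(lambda: {"oneapi": [], "fastgpt": []})
    for line in oneapi_lines:
        match = ONEAPI_ID.search(line)
        if match:
            grouped[match.group(1)]["oneapi"].append(line)
    for line in fastgpt_lines:
        match = FASTGPT_ID.search(line)
        if match:
            grouped[match.group(1)]["fastgpt"].append(line)
    return dict(grouped)


if __name__ == "__main__":
    oneapi = [
        "[SYS] 2024/06/21 - 11:04:56 | model ratio not found: glm4:9b",
        "[ERR] 2024/06/21 - 11:04:56 | 2024062111045612344700016869473 | "
        "relay error (channel #13): bad response status code 400",
    ]
    fastgpt = [
        "[Error] 2024-06-21 03:04:56 sse error: bad_response_status_code bad "
        "response status code 400 (request id: 2024062111045612344700016869473)",
    ]
    for request_id, lines in group_by_request_id(oneapi, fastgpt).items():
        print(request_id, len(lines["oneapi"]), len(lines["fastgpt"]))
```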

JinCheng666 · 2024-06-22T09:29:35Z

This issue was not filed correctly, so I opened a new one:
#1823

JinCheng666 added the bug label Jun 21, 2024

JinCheng666 changed the title ~~xinference, glm4 chat returns error 400~~ xinference, glm4 tool call returns error 400 Jun 22, 2024

JinCheng666 closed this as not planned Jun 22, 2024
