Closed
Description
System Info
Official Docker image, v1.5.1
Running Xinference with Docker?
- docker
- pip install
- installation from source
Version info
Official Docker image, version 1.5.1
The command used to start Xinference
xinference-local -H 0.0.0.0
Reproduction
- Connect the Xinference model to Dify
- Send an image as input during a chat
The following error occurs:
supervisor-1 | Traceback (most recent call last):
supervisor-1 | File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/llama_cpp/core.py", line 308, in _handle_chat_completion
supervisor-1 | self._llm.handle_chat_completions(
supervisor-1 | File "xllamacpp.pyx", line 2073, in xllamacpp.xllamacpp.Server.handle_chat_completions
supervisor-1 | RuntimeError: Failed to parse messages: Unsupported content part type: "image_url"; messages = [
supervisor-1 | {
supervisor-1 | "role": "user",
supervisor-1 | "content": "你好"
supervisor-1 | },
supervisor-1 | {
supervisor-1 | "role": "assistant",
supervisor-1 | "content": "你好!很高兴认识你。有什么我可以帮助你的吗? 😊"
supervisor-1 | },
supervisor-1 | {
supervisor-1 | "role": "user",
supervisor-1 | "content": "你是谁"
supervisor-1 | },
supervisor-1 | {
supervisor-1 | "role": "assistant",
supervisor-1 | "content": "我是一个大型语言模型,由 Google 训练。 \n\n简单来说,我是一个人工智能程序,可以理解和生成人类语言。 我可以:\n\n* 回答你的问题\n* 写不同类型的文本格式,例如诗歌、代码、脚本、音乐作品、电子邮件、信件等。\n* 翻译语言\n* 总结文本\n* 进行对话\n\n我还在不断学习和进步!\n\n你有什么想问我的吗?"
supervisor-1 | },
supervisor-1 | {
supervisor-1 | "role": "user",
supervisor-1 | "content": [
supervisor-1 | {
supervisor-1 | "type": "text",
supervisor-1 | "text": "图里有什么"
supervisor-1 | },
supervisor-1 | {
supervisor-1 | "type": "image_url",
supervisor-1 | "image_url": {
supervisor-1 | "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADkAAAAbCAIAAABJHrDvAAACwUlEQVRYCdWX3Y6aQBTHeSnvfQXvvSg3TXeTte1eaNImbbKT9JJ3MLEXXa/aRzAm+xQqg4yi8g2CMOC0MwjChhU2TSo1kzhzOMP5zeE/X5zt+v9L4TJQy/EaVTKwrEJZLcdbb1QI0Xwhzebw6mW+kCBE641qOV4Gars+pxk2faBsD0EQxzFpwC+O40MQrJUthEgz7AyXo23daABhCYKq6RAi094nuBxaKSVejTGt1puVskvEwPn+oTFgJSCe54tQTlLLRVEjNFqCyUwYR/OFZFiu5XjcS06p/QnwY5Q2CHkCbWF6bhI0vGu1Oy+Ugmeu0+uqsznUzcusE6FAwAuALzB1h0tCKGtSofHhuAueUpDliP9nrCzmFHRa/B1gWAlZq905wzWHlYICAbSFEUveFHS6w3HSbKX5q6+B+Hh03H2adfrv7vfx8Zi3lNYrNbAc8R0wIVSgTK/pt071OhFaBTuLclEDox8/+ZuBfzgtO3vPf/Ou//3xVylf3ljJypyfSTY3hzIZoOEdG1I1a4hxr//wtvc5CMIgCPnbQa//EEVRHqu0Xo+1tGvRWF8DhJAQ4w8D8H4Abu+/3n/6FmJcfFl5q5r1IgRbEJgGpiCRCgtzUQMJSIjxzccvvf5DTVBCSDXr8zFOhGw+5R4VF6YarISQwyGoD/oKVrYUsCUzYYXjbn7NSmfYib4ea26otarVeU00QOdNOsPy86nVpp++IIBsL2DjoftIYcOrhVXqVM1a2u0qxjNrw88uURTPF9LpPOB5/lWyVTMoPROK8ol1td7U7HYVN7RS5NXmdCYURVnV9KtwVAZVNZ3i6fbprL3dGaIoS0tk2w7G1TteZYC/dwhDbNuOtESiKG93RiIAeo/VTVfVLRkpoijPmnHnni0kUZRlpKi6pZvu+W5o2nuGa+80c6saDSk7zVR1Ow/6J6+/AVY5lnn59SJwAAAAAElFTkSuQmCC",
supervisor-1 | "detail": "high"
supervisor-1 | }
supervisor-1 | }
supervisor-1 | ]
supervisor-1 | }
supervisor-1 | ]
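To make the failure easier to see, here is a minimal sketch that scans an OpenAI-style message list for content parts the backend would reject. The set of accepted part types (`SUPPORTED_PART_TYPES`) is an assumption inferred from the error message above, not taken from the xllamacpp source, and `find_unsupported_parts` is a hypothetical helper name. The base64 payload is shortened to a placeholder.

```python
from typing import Any

# Assumption: only plain strings and "text" parts are accepted,
# inferred from the "Unsupported content part type" error above.
SUPPORTED_PART_TYPES = {"text"}


def find_unsupported_parts(messages: list[dict[str, Any]]) -> list[tuple[int, str]]:
    """Return (message_index, part_type) for each content part outside the supported set."""
    rejected = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            # Plain string content has no typed parts to check.
            continue
        for part in content or []:
            part_type = part.get("type", "")
            if part_type not in SUPPORTED_PART_TYPES:
                rejected.append((index, part_type))
    return rejected


# Same shape as the messages in the traceback; the image data is a placeholder.
messages = [
    {"role": "user", "content": "你好"},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "图里有什么"},
            {
                "type": "image_url",
                "image_url": {"url": "data:image/png;base64,<placeholder>", "detail": "high"},
            },
        ],
    },
]

print(find_unsupported_parts(messages))  # [(1, 'image_url')]
```

Running a check like this on the client side would show that the second message carries an `image_url` part, which matches the part type named in the `RuntimeError`.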
Expected behavior
The image content should be recognized normally.
qinxuye commented on May 9, 2025
@codingl2k1 Does llama.cpp support image input for gemma-3 yet?
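codingl2k1 commented on May 9, 2025
The llama server does not support multimodal input yet (WIP): https://github.com/ggml-org/llama.cpp/tree/master/tools/server
There is a libmtmd that provides multimodal functionality, but it is not a complete server (multimodal support in the server above is still under development).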
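(Editorial sketch, not part of the comment above.) Until the server accepts image parts, one possible client-side stopgap is to drop non-text parts before sending the request, so the chat at least does not fail with the `RuntimeError`. This is only an illustration: `strip_non_text_parts` is a hypothetical helper, it is not part of Xinference, Dify, or llama.cpp, and the model will not see the image at all.

```python
from typing import Any


def strip_non_text_parts(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of messages where list-style content is flattened to its text parts."""
    cleaned = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):
            # Keep only the "text" parts and join them into one plain string.
            texts = [part.get("text", "") for part in content if part.get("type") == "text"]
            content = "\n".join(texts)
        cleaned.append({**message, "content": content})
    return cleaned


example = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "图里有什么"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,<placeholder>"}},
        ],
    }
]

print(strip_non_text_parts(example))
# [{'role': 'user', 'content': '图里有什么'}]
```

The trade-off is explicit: the request parses, but the question about the image can no longer be answered from the image itself.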
lakako commented on May 9, 2025
Okay (facepalm), thanks for the explanation.