Skip to content

如何在open-webui上使用ktransformers提供的api服务? #436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
GodXuxilie opened this issue Feb 18, 2025 · 13 comments
Open

如何在open-webui上使用ktransformers提供的api服务? #436

GodXuxilie opened this issue Feb 18, 2025 · 13 comments

Comments

@GodXuxilie
Copy link

目前能成功启动ktransformers的API服务,用如下命令:

USE_NUMA=1 python ./ktransformers/server/main.py --model_path ~/DeepSeek-R1 --gguf_path~/DeepSeek-R1-GGUF --model_name deepseek_r1 --cpu_infer 65 --max_new_tokens 2048 --port 7899 --cache_lens 12288 --force_think true

启动后,可以在linux server上成功进行curl,命令如下:

curl -X 'POST' \
  'http://localhost:7899/api/generate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "deepseek_r1", 
  "prompt": "tell me a joke.",
  "stream": true
}'

成功使用如下命令开启open-webui docker服务:

docker run -d --net=host -v open-webui:~/open-webui/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

但是在open-webui中将Ollama API的地址改成 http://localhost:7899后依然无法连接,问题如下图。

Image

有大神能帮忙解决下吗?

@tanglaoya321
Copy link

路径需要加v1

@wangkai111111
Copy link

v1加在哪里?

@lingzhic
Copy link

v1加在哪里?
http://localhost:xxxxx/ 改成 http://localhost:xxxxx/v1/ 这样

@wangkai111111
Copy link

@lingzhic 部署出来ktransformers不使用内存啥情况,模型是unsloth/DeepSeek-R1-Q4_K_M,可是内存就用了10G左右

@lingzhic
Copy link

@lingzhic 部署出来ktransformers不使用内存啥情况,模型是unsloth/DeepSeek-R1-Q4_K_M,可是内存就用了10G左右

他那个实际上是有大量的cache,具体的请@contributor那几个大佬问问

@qiji2023
Copy link

@GodXuxilie 我部署成功了,但是回答很乱不知道为啥

@wangkai111111
Copy link

Image

@wangkai111111
Copy link

添加完v1之后,请求报错,open-webui版本是0.5.14

@wangkai111111
Copy link

Image

@wwqiu
Copy link

wwqiu commented Feb 19, 2025

Image

这个要添加到OpenAI API链接里面,记得清空一下管理员设置->文档->RAG提示词模板,最好把设置->界面里面的输入框内容猜测补全标签生成检索查询生成关掉

@wangkai111111
Copy link

key 填写什么呢

@GodXuxilie
Copy link
Author

key 填写什么呢

key随便填一个,就能用上这个api了

@TTThanos
Copy link

请问 curl命令如下:
curl -X 'POST'
'http://localhost:8000/v1/api/generate'
-H 'accept: application/json'
-H 'Content-Type: application/json'
-d '{
"model": "deepseek_r1",
"prompt": "tell me a joke.",
"stream": true
}'
http://localhost:8000/v1/api/generate 改为 http://localhost:8000/api/generate 才会有输出是什么原因?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants