如何在open-webui上使用ktransformers提供的api服务？ #436

GodXuxilie · 2025-02-18T02:29:27Z

目前能成功启动ktransformers的API服务，用如下命令：

USE_NUMA=1 python ./ktransformers/server/main.py --model_path ~/DeepSeek-R1 --gguf_path~/DeepSeek-R1-GGUF --model_name deepseek_r1 --cpu_infer 65 --max_new_tokens 2048 --port 7899 --cache_lens 12288 --force_think true

启动后，可以在linux server上成功进行curl，命令如下：

curl -X 'POST' \
  'http://localhost:7899/api/generate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "deepseek_r1", 
  "prompt": "tell me a joke.",
  "stream": true
}'

成功使用如下命令开启open-webui docker服务：

docker run -d --net=host -v open-webui:~/open-webui/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

但是在open-webui中将Ollama API的地址改成 http://localhost:7899后依然无法连接，问题如下图。

有大神能帮忙解决下吗？

The text was updated successfully, but these errors were encountered:

tanglaoya321 · 2025-02-18T02:41:28Z

路径需要加v1

wangkai111111 · 2025-02-18T09:15:52Z

v1加在哪里？

lingzhic · 2025-02-18T09:44:08Z

v1加在哪里？
把 http://localhost:xxxxx/ 改成 http://localhost:xxxxx/v1/ 这样

wangkai111111 · 2025-02-18T10:07:18Z

@lingzhic 部署出来ktransformers不使用内存啥情况，模型是unsloth/DeepSeek-R1-Q4_K_M，可是内存就用了10G左右

lingzhic · 2025-02-18T10:13:19Z

@lingzhic 部署出来ktransformers不使用内存啥情况，模型是unsloth/DeepSeek-R1-Q4_K_M，可是内存就用了10G左右

他那个实际上是有大量的cache，具体的请@contributor那几个大佬问问

qiji2023 · 2025-02-18T11:00:17Z

@GodXuxilie 我部署成功了，但是回答很乱不知道为啥

wangkai111111 · 2025-02-19T03:52:49Z

wangkai111111 · 2025-02-19T03:53:44Z

添加完v1之后，请求报错，open-webui版本是0.5.14

wangkai111111 · 2025-02-19T03:53:57Z

wwqiu · 2025-02-19T06:38:27Z

这个要添加到OpenAI API链接里面，记得清空一下管理员设置->文档->RAG提示词模板，最好把设置->界面里面的输入框内容猜测补全、标签生成、检索查询生成关掉

wangkai111111 · 2025-02-19T07:42:32Z

key 填写什么呢

GodXuxilie · 2025-02-19T07:44:36Z

key 填写什么呢

key随便填一个，就能用上这个api了

TTThanos · 2025-02-21T08:30:53Z

请问 curl命令如下：
curl -X 'POST'
'http://localhost:8000/v1/api/generate'
-H 'accept: application/json'
-H 'Content-Type: application/json'
-d '{
"model": "deepseek_r1",
"prompt": "tell me a joke.",
"stream": true
}'
http://localhost:8000/v1/api/generate 改为 http://localhost:8000/api/generate 才会有输出是什么原因？

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

如何在open-webui上使用ktransformers提供的api服务？ #436

如何在open-webui上使用ktransformers提供的api服务？ #436

GodXuxilie commented Feb 18, 2025

tanglaoya321 commented Feb 18, 2025

wangkai111111 commented Feb 18, 2025

lingzhic commented Feb 18, 2025

wangkai111111 commented Feb 18, 2025

lingzhic commented Feb 18, 2025

qiji2023 commented Feb 18, 2025

wangkai111111 commented Feb 19, 2025

wangkai111111 commented Feb 19, 2025

wangkai111111 commented Feb 19, 2025

wwqiu commented Feb 19, 2025

wangkai111111 commented Feb 19, 2025

GodXuxilie commented Feb 19, 2025

TTThanos commented Feb 21, 2025

如何在open-webui上使用ktransformers提供的api服务？ #436

如何在open-webui上使用ktransformers提供的api服务？ #436

Comments

GodXuxilie commented Feb 18, 2025

tanglaoya321 commented Feb 18, 2025

wangkai111111 commented Feb 18, 2025

lingzhic commented Feb 18, 2025

wangkai111111 commented Feb 18, 2025

lingzhic commented Feb 18, 2025

qiji2023 commented Feb 18, 2025

wangkai111111 commented Feb 19, 2025

wangkai111111 commented Feb 19, 2025

wangkai111111 commented Feb 19, 2025

wwqiu commented Feb 19, 2025

wangkai111111 commented Feb 19, 2025

GodXuxilie commented Feb 19, 2025

TTThanos commented Feb 21, 2025