Making /v1beta/chat/completions streaming output compatible with openai #1076

Closed
wsxiaoys opened this issue Dec 19, 2023 · 3 comments · Fixed by #1094
Labels
enhancement New feature or request

Comments

@wsxiaoys
Member

Please describe the feature you want

Currently, /v1beta/chat/completions generates streaming output as JSON lines, like the following:

{"content":" In"}
{"content":" Python"}
{"content":","}
{"content":" you"}
{"content":" can"}
{"content":" convert"}
{"content":" a"}
{"content":" list"}
{"content":" of"}
{"content":" strings"}
{"content":" to"}
{"content":" numbers"}
{"content":" using"}
{"content":" the"}
{"content":" `"}
{"content":"map"}
{"content":"()"}
{"content":"`"}
{"content":" function"}
{"content":" and"}
{"content":" the"}
{"content":" `"}
{"content":"int"}
{"content":"()"}
{"content":"`"}
{"content":" function"}
{"content":"."}
{"content":" Here"}
{"content":"'"}
{"content":"s"}
{"content":" an"}
{"content":" example"}
{"content":":"}

We'd like to make the response format compatible with OpenAI's text/event-stream streaming responses.
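
For reference, OpenAI streams chat completions as text/event-stream events carrying chat.completion.chunk objects, roughly like this (fields abbreviated):

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" In"},"finish_reason":null}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" Python"},"finish_reason":null}]}

data: [DONE]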

Additional context

Discuss in slack: https://tabbyml.slack.com/archives/C05CWLZ0Y85/p1701451409878009

Code Location: https://github.com/TabbyML/tabby/blob/main/crates/tabby/src/routes/chat.rs#L39

llama.cpp's server example on text/event-stream: https://github.com/ggerganov/llama.cpp/blob/master/examples/server/server.cpp#L2775
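
A minimal sketch of how the handler in chat.rs could emit an OpenAI-style text/event-stream using axum's SSE support (handler name and token source are hypothetical, not the actual Tabby code):

use axum::response::sse::{Event, KeepAlive, Sse};
use futures::stream::{self, Stream};
use serde_json::json;
use std::convert::Infallible;

// Hypothetical handler: wraps model tokens into OpenAI-style
// chat.completion.chunk payloads and streams them as text/event-stream.
async fn chat_completions_stream() -> Sse<impl Stream<Item = Result<Event, Infallible>>> {
    // Stand-in for tokens coming out of the model.
    let tokens = vec![" In", " Python", ","];

    let events = stream::iter(tokens.into_iter().map(|tok| {
        let chunk = json!({
            "object": "chat.completion.chunk",
            "choices": [{ "index": 0, "delta": { "content": tok }, "finish_reason": null }]
        });
        Ok(Event::default().data(chunk.to_string()))
    }));

    Sse::new(events).keep_alive(KeepAlive::default())
}

A real implementation would also append a final data: [DONE] event after the last chunk, as OpenAI's stream does.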

@wsxiaoys wsxiaoys added the enhancement New feature or request label Dec 19, 2023
@heurainbow

heurainbow commented Dec 21, 2023

The implementation should consider the following common cases (within a max output token limit):
1. stopping after a maximum number of generated lines, say 3
2. completing a function, using a clear return signal appropriate to the language
3. special tokens
It would be better to directly support a vLLM backend, as discussed in #795.
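
For illustration only, a stop check covering these cases might look roughly like this (all names and heuristics are made up, not Tabby's actual logic):

// Hypothetical stop conditions for a streamed completion.
struct StopPolicy {
    max_lines: usize,          // e.g. stop after 3 generated lines
    stop_tokens: Vec<String>,  // special tokens such as "</s>"
}

impl StopPolicy {
    fn should_stop(&self, generated: &str, last_token: &str) -> bool {
        // 1. A maximum number of generated lines.
        if generated.lines().count() >= self.max_lines {
            return true;
        }
        // 2. A language-specific "function finished" signal,
        //    e.g. a return statement on a new line.
        if generated.contains("\nreturn ") {
            return true;
        }
        // 3. Special tokens emitted by the model.
        self.stop_tokens.iter().any(|t| t == last_token)
    }
}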

@brian316

Can we set up OpenAI chat with Tabby? The docs don't show how: https://tabby.tabbyml.com/docs/administration/model/#chat-model

@wsxiaoys
Member Author

Hi - the HTTP backend can be configured as follows to use OpenAI chat:

[model.chat.http]
kind = "openai/chat"
model_name = "<model name>"
api_endpoint = "https://api.openai.com/v1"
api_key = "secret-api-key"
