
[BUG]: Unable to decode chunks from my OpenAI server #1475

Closed
odrobnik opened this issue May 21, 2024 · 14 comments · Fixed by #1487

Labels: possible bug (Bug was reported but is not confirmed or is unable to be replicated.)

Comments

odrobnik commented May 21, 2024

How are you running AnythingLLM?

AnythingLLM desktop app

What happened?

I am working on my own OpenAI-compatible local server; for now I decode the chunks coming from OpenAI and re-encode them. That changes the order of the fields somewhat, but otherwise the JSON is identical. AnythingLLM is unable to decode the actual message and shows an empty message instead.

Are there known steps to reproduce?

These are the streamed lines that it should be able to decode:

data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":"Hello"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":"!"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":" How"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":" can"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":" I"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":" assist"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":" you"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":" today"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{"content":"?"},"index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: {"choices":[{"delta":{},"finish_reason":"stop","index":0}],"created":1716325454,"id":"chatcmpl-9RQwYj6NgMZrCth2IHxS4INd1ZRue","model":"gpt-4-turbo-2024-04-09","object":"chat.completion.chunk","system_fingerprint":"fp_e9446dc58f"}

data: [DONE]

This is how the streamed lines from OpenAI look; you can see that the order of the JSON fields is different. Your decoder should be robust enough not to care about that.

data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":" How"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":" can"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":" assist"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":" you"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":" today"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-9RR01Csi806nWQZJARjtgaCGD5Nhz","object":"chat.completion.chunk","created":1716325669,"model":"gpt-4-turbo-2024-04-09","system_fingerprint":"fp_e9446dc58f","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
data: [DONE]

The message should appear as "Hello! How can I assist you today?", but you see only empty messages:

(screenshot: the chat shows only empty assistant messages)
odrobnik (Author) commented May 21, 2024

PS: I tested the same endpoint with Ollama's Open WebUI and it works without problems:

data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":"Hello"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":"!"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":" How"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":" can"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":" I"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":" help"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":" you"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":" today"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{"content":"?"},"index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[{"delta":{},"finish_reason":"stop","index":0}],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk"}


data: {"choices":[],"created":1716326479,"id":"chatcmpl-9RRD5zO2CmeOn84O4H9KdFqItKvUy","model":"gpt-4-0613","object":"chat.completion.chunk","usage":{"completion_tokens":9,"prompt_tokens":72,"total_tokens":81}}


data: [DONE]

That is to say: in Open WebUI you see the text appear, and it is exactly what was streamed.

timothycarambat (Member)

What connector are you specifically using? Generic OpenAI?

odrobnik (Author)

Yes, Generic OpenAI

timothycarambat (Member)

@odrobnik Ah, I think I see what is going on here. Your intermediate chunks do not contain finish_reason; only the last chunk does. OpenAI currently returns a finish_reason (null) on every response chunk. If we patch it now, the fix will not reach the desktop app until the next release.

timothycarambat (Member)

Who is the provider behind the connector you are using? They are mostly OpenAI-compatible, but not exactly 1:1.

odrobnik (Author)

@timothycarambat It's my own provider; I am working on an agent framework. I'll try to add finish_reason: null on my side, although I believe you'd be more robust if you could handle its absence. That's what Open WebUI does.
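
For illustration, a simplified sketch of what adding it on my side would look like (encodeChunk is a made-up helper, not my actual code):

// Include "finish_reason": null on every intermediate delta so clients that
// key off the field's presence can still find it. Hypothetical helper only.
function encodeChunk(id, created, model, delta, finishReason = null) {
  const chunk = {
    id,
    object: "chat.completion.chunk",
    created,
    model,
    choices: [{ index: 0, delta, logprobs: null, finish_reason: finishReason }],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Intermediate token: finish_reason stays null.
encodeChunk("chatcmpl-example", 1716325454, "gpt-4-turbo-2024-04-09", { content: "Hello" });
// Final chunk: empty delta plus finish_reason "stop".
encodeChunk("chatcmpl-example", 1716325454, "gpt-4-turbo-2024-04-09", {}, "stop");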

timothycarambat (Member)

@odrobnik Oh cool, okay well we are handling that as you suggest via #1487

Thanks for pointing it out!

odrobnik (Author)

@timothycarambat I think you have one more problem here. When passing the option to include usage information you get a chunk like this:

{\"choices\":[],\"created\":1716408014,\"id\":\"chatcmpl-9RmQAhXIqjaq2YfOjCo6pjWYzgJNN\",\"model\":\"gpt-4-turbo-2024-04-09\",\"object\":\"chat.completion.chunk\",\"system_fingerprint\":\"fp_e9446dc58f\",\"usage\":{\"completion_tokens\":9,\"prompt_tokens\":75,\"total_tokens\":84}}\n\n"

There will be an empty choices array and an additional usage dict. Are you doing anything with this information?
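
For reference, this is roughly the request that produces such a chunk. In the OpenAI API the switch is the stream_options.include_usage flag; the endpoint, model, and key handling below are just placeholders:

// Minimal sketch of a streaming request that asks for the usage chunk.
const res = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4-turbo-2024-04-09",
    stream: true,
    // Asks the server to append one final chunk with usage and an empty choices array.
    stream_options: { include_usage: true },
    messages: [{ role: "user", content: "Hello" }],
  }),
});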

Anyway, I'm now sending null when there is no finish reason, and I saw the text begin to appear, but then it got replaced by this:

(screenshot)

odrobnik (Author)

PS: Once you get into this state and try to send again, there is some sort of endless loop where the user message and this error appear, disappear, appear, disappear, and so on ad infinitum. A parsing error shouldn't leave the app in an unusable state.

odrobnik (Author)

PPS: If I omit the stream option includeUsage(true), everything is fine.

(screenshot)

odrobnik (Author)

ChatGPT found your issue: you always access choices[0], which is bad practice because it yields undefined in the case of the usage chunk.

It suggests this change:

function handleDefaultStreamResponseV2(response, stream, responseProps) {
  const { uuid = uuidv4(), sources = [] } = responseProps;

  return new Promise(async (resolve) => {
    let fullText = "";

    // Establish listener to early-abort a streaming response
    // in case things go sideways or the user does not like the response.
    // We preserve the generated text but continue as if chat was completed
    // to preserve previously generated content.
    const handleAbort = () => clientAbortedHandler(resolve, fullText);
    response.on("close", handleAbort);

    for await (const chunk of stream) {
      if (Array.isArray(chunk?.choices) && chunk.choices.length > 0) {
        const message = chunk.choices[0];
        const token = message?.delta?.content;

        if (token) {
          fullText += token;
          writeResponseChunk(response, {
            uuid,
            sources: [],
            type: "textResponseChunk",
            textResponse: token,
            close: false,
            error: false,
          });
        }

        // LocalAi returns '' and others return null on chunks - the last chunk is not "" or null.
        // Either way, the key `finish_reason` must be present to determine ending chunk.
        if (
          message.hasOwnProperty("finish_reason") &&
          message.finish_reason !== "" &&
          message.finish_reason !== null
        ) {
          writeResponseChunk(response, {
            uuid,
            sources,
            type: "textResponseChunk",
            textResponse: "",
            close: true,
            error: false,
          });
          response.removeListener("close", handleAbort);
          resolve(fullText);
        }
      }
    }
  });
}

timothycarambat (Member)

Even with that patch, wrapping the entire body in if (Array.isArray(chunk?.choices) && chunk.choices.length > 0), there are still situations with some providers where the promise will never resolve, so it does not address that problem. It happens to work in this instance, but we use this stream handler in many places.

odrobnik (Author)

Sorry, don't get hung up on ChatGPT's attempt. My point was that an empty choices array is a valid scenario which needs to be handled, or at the very least ignored, without putting the app into an unusable state.

odrobnik (Author)

The problem is that encountering undefined also causes the promise to never resolve, because it throws, right? Why don't you remove the listener and resolve unconditionally after the for loop? Then you could just break out of the loop when you see a finish reason. That would also deal with the mentioned case of runaway whitespace generation after a non-null finish reason.
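
Something along these lines is what I mean: a rough sketch on top of the snippet above, assuming the same helpers (uuidv4, writeResponseChunk, clientAbortedHandler):

function handleDefaultStreamResponseV2(response, stream, responseProps) {
  const { uuid = uuidv4(), sources = [] } = responseProps;

  return new Promise(async (resolve) => {
    let fullText = "";
    const handleAbort = () => clientAbortedHandler(resolve, fullText);
    response.on("close", handleAbort);

    for await (const chunk of stream) {
      // Usage-only chunks have an empty choices array; just skip them.
      const message = chunk?.choices?.[0];
      if (!message) continue;

      const token = message?.delta?.content;
      if (token) {
        fullText += token;
        writeResponseChunk(response, {
          uuid,
          sources: [],
          type: "textResponseChunk",
          textResponse: token,
          close: false,
          error: false,
        });
      }

      // Any non-empty, non-null finish_reason ends the stream.
      if (message.finish_reason) break;
    }

    // Always close out and resolve, even if no finish_reason ever arrives.
    writeResponseChunk(response, {
      uuid,
      sources,
      type: "textResponseChunk",
      textResponse: "",
      close: true,
      error: false,
    });
    response.removeListener("close", handleAbort);
    resolve(fullText);
  });
}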
