Handle token usage chunks in OpenAI streamed response #32823

Conversation

@timtimjnvr timtimjnvr commented Jun 16, 2025

This pull request resolves a case related to issue #28850.

During streamed chat completion requests to OpenAI models hosted on OpenWebUI, the API can include token usage chunks in the response data.

This update modifies the response handling to account for these token usage chunks, preventing errors when they are encountered.
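
For illustration, here is a minimal sketch of the approach. This is not Zed's actual implementation: the type and field shapes below are simplified assumptions modeled on the diff quoted later in this thread.

```rust
// Simplified stand-ins for the real types (assumptions, not Zed's definitions).
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
}

struct Choice {
    delta: Option<String>,
}

struct ResponseStreamEvent {
    choices: Vec<Choice>,
    usage: Option<Usage>,
}

enum CompletionEvent {
    Text(String),
    UsageUpdate { input_tokens: u64, output_tokens: u64 },
}

fn map_event(event: ResponseStreamEvent) -> Vec<CompletionEvent> {
    // With stream_options: {"include_usage": true}, the final chunk has an
    // empty `choices` array and a populated `usage` field. Treat it as a
    // usage report instead of an error.
    if let Some(usage) = event.usage {
        return vec![CompletionEvent::UsageUpdate {
            input_tokens: usage.prompt_tokens,
            output_tokens: usage.completion_tokens,
        }];
    }

    // Regular chunks: forward each choice's text delta.
    event
        .choices
        .into_iter()
        .filter_map(|choice| choice.delta.map(CompletionEvent::Text))
        .collect()
}
```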

Release Notes

  • Fixed OpenWebUI compatibility with Zed when OpenWebUI sends usage stats chunks in streamed responses.

cla-bot bot commented Jun 16, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Timothée JANVIER.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check whether your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it with git config --global user.email email@example.com
  3. Make sure the git commit email is configured in your GitHub account settings; see https://github.com/settings/emails

@timtimjnvr timtimjnvr force-pushed the fix/handle-empty-choices-openwebui branch from a967fe2 to dbb2873 on June 16, 2025 at 22:31

cla-bot bot commented Jun 16, 2025

We require contributors to sign our Contributor License Agreement, and we don't have @timtimjnvr on file. You can sign our CLA at https://zed.dev/cla. Once you've signed, post a comment here that says '@cla-bot check'.

@timtimjnvr (Author)

@cla-bot check

@cla-bot added the cla-signed label (The user has signed the Contributor License Agreement) on Jun 16, 2025

cla-bot bot commented Jun 16, 2025

The cla-bot has been summoned, and re-checked this pull request!

@SomeoneToIgnore added the ai label (Improvement related to Assistant, Copilot, or other AI features) on Jun 16, 2025
@zed-industries-bot

Warnings
⚠️
fix: handle token usage chuncks in open ai streamed response
     ^

Write PR titles using sentence case.

⚠️

This PR is missing release notes.

Please add a "Release Notes" section that describes the change:

Release Notes:

- Added/Fixed/Improved ...

If your change is not user-facing, you can use "N/A" for the entry:

Release Notes:

- N/A

Have feedback on this plugin? Let's hear it!

Generated by 🚫 dangerJS against dbb2873

@maxdeviant maxdeviant changed the title fix: handle token usage chuncks in open ai streamed response Handle token usage chunks in OpenAI streamed response Jun 16, 2025
@@ -526,6 +526,15 @@ impl OpenAiEventMapper {
         &mut self,
         event: ResponseStreamEvent,
     ) -> Vec<Result<LanguageModelCompletionEvent, LanguageModelCompletionError>> {
+        if let Some(usage) = event.usage {
@imumesh18 (Contributor) commented Jun 17, 2025

This will create an issue in the case of the stop event, since we also get usage data then. The problem is that we return immediately after handling the usage data, which breaks the whole event chain and leaves the thread in a broken state. You might want to move this into the else condition of the choices check (sketched below).
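
A minimal sketch of this suggestion, reusing the simplified types from the example under the PR description (an illustration under assumed types, not the actual Zed code): the early return is gone, so a chunk that carries both choices and usage still flows through the normal choice handling.

```rust
// Hypothetical variant of map_event implementing the suggestion.
// (Reuses Usage, Choice, ResponseStreamEvent, CompletionEvent from the
// earlier sketch.)
fn map_event_suggested(event: ResponseStreamEvent) -> Vec<CompletionEvent> {
    let mut events = Vec::new();

    if !event.choices.is_empty() {
        // Normal chunk (including the stop chunk): map the choices, even
        // if usage data happens to be attached to the same event.
        for choice in event.choices {
            if let Some(text) = choice.delta {
                events.push(CompletionEvent::Text(text));
            }
        }
    } else if let Some(usage) = event.usage {
        // Usage-only chunk: empty `choices`, populated `usage`.
        events.push(CompletionEvent::UsageUpdate {
            input_tokens: usage.prompt_tokens,
            output_tokens: usage.completion_tokens,
        });
    }

    events
}
```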

@timtimjnvr (Author) commented Jun 17, 2025

From the documentation, I understand that the stop event ("finish_reason": "stop") is a field of a choice inside the choices array of the chunk object (https://platform.openai.com/docs/api-reference/chat-streaming/).

The usage field seems to be sent in a different chunk object with empty choices (see the parsing sketch after this comment):

"choices: A list of chat completion choices. Can contain more than one elements if n is greater than 1. Can also be empty for the last chunk if you set stream_options: {"include_usage": true}." (source)

However, your proposal is safer if some implementations return both fields (choices & usage) in the same event.

Thanks for the feedback 🙌
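
For reference, a small self-contained sketch of the chunk shape described in the quoted docs. It assumes serde and serde_json as dependencies, and the Chunk struct is a hypothetical subset of the schema, not a type from the codebase.

```rust
// Cargo.toml (assumed): serde = { version = "1", features = ["derive"] },
// serde_json = "1"
use serde::Deserialize;

// Hypothetical subset of the chat completion chunk schema, just enough to
// show the usage-only final chunk.
#[derive(Deserialize)]
struct Chunk {
    choices: Vec<serde_json::Value>,
    usage: Option<Usage>,
}

#[derive(Deserialize)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
    total_tokens: u64,
}

fn main() {
    // Final chunk when stream_options: {"include_usage": true} is set:
    // empty `choices`, populated `usage` (per the docs quoted above).
    let raw = r#"{
        "choices": [],
        "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}
    }"#;

    let chunk: Chunk = serde_json::from_str::<Chunk>(raw).expect("valid chunk JSON");
    assert!(chunk.choices.is_empty());
    assert_eq!(chunk.usage.expect("usage present").total_tokens, 46);
    println!("usage-only chunk parsed successfully");
}
```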

@imumesh18 (Contributor) commented Jun 19, 2025

I believe there was another PR merged yesterday that fixes this: #32982

@timtimjnvr timtimjnvr closed this Jun 19, 2025