Handle token usage chunks in OpenAI streamed response #32823
Conversation
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Timothée JANVIER.
Force-pushed from a967fe2 to dbb2873
We require contributors to sign our Contributor License Agreement, and we don't have @timtimjnvr on file. You can sign our CLA at https://zed.dev/cla. Once you've signed, post a comment here that says '@cla-bot check'.
@cla-bot check
The cla-bot has been summoned, and re-checked this pull request!
@@ -526,6 +526,15 @@ impl OpenAiEventMapper {
        &mut self,
        event: ResponseStreamEvent,
    ) -> Vec<Result<LanguageModelCompletionEvent, LanguageModelCompletionError>> {
        if let Some(usage) = event.usage {
This will create an issue for the stop event, since we also receive usage data with it. The problem is that we return immediately after handling the usage data, which breaks the whole event chain and leaves the thread in a broken state. You might want to move this into the else condition of the choices check.
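To make the suggestion concrete, here is a minimal standalone sketch of moving the usage handling into the else branch of the choices check. All type and variant names (`StreamChunk`, `Usage`, `Choice`, `Event`) are hypothetical stand-ins, not Zed's actual types:

```rust
// Hypothetical stand-in types; Zed's actual ResponseStreamEvent differs.
struct Usage { prompt_tokens: u64, completion_tokens: u64 }
struct Choice { finish_reason: Option<String> }
struct StreamChunk { choices: Vec<Choice>, usage: Option<Usage> }

#[derive(Debug)]
enum Event { TokenUsage { input: u64, output: u64 }, Stop }

fn map_chunk(chunk: StreamChunk) -> Vec<Event> {
    let mut events = Vec::new();
    if chunk.choices.is_empty() {
        // Per the OpenAI docs, only the trailing usage chunk has empty
        // choices, so it is safe to consume `usage` here.
        if let Some(u) = chunk.usage {
            events.push(Event::TokenUsage {
                input: u.prompt_tokens,
                output: u.completion_tokens,
            });
        }
    } else {
        // Returning early on `usage` before this point would drop these
        // choice events and leave the stream in a broken state.
        for choice in chunk.choices {
            if choice.finish_reason.as_deref() == Some("stop") {
                events.push(Event::Stop);
            }
        }
    }
    events
}

fn main() {
    let trailing = StreamChunk {
        choices: Vec::new(),
        usage: Some(Usage { prompt_tokens: 12, completion_tokens: 34 }),
    };
    println!("{:?}", map_chunk(trailing)); // [TokenUsage { input: 12, output: 34 }]
}
```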
From the documentation, my understanding is that the stop event ("finish_reason": "stop") is a field of a choice inside the choices array of the chunk object (https://platform.openai.com/docs/api-reference/chat-streaming/). The usage field seems to be sent in a separate chunk object with empty choices:

"choices: A list of chat completion choices. Can contain more than one elements if n is greater than 1. Can also be empty for the last chunk if you set stream_options: {"include_usage": true}." (source)

However, your proposal would handle the case where some implementation returns both fields (choices & usage) in the same event.
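For reference, a hedged sketch of the two chunk shapes described above, built with serde_json; the token counts are made up, and the field layout follows the linked docs:

```rust
use serde_json::json;

fn main() {
    // A final content chunk: finish_reason lives inside a choice.
    let stop_chunk = json!({
        "choices": [{ "index": 0, "delta": {}, "finish_reason": "stop" }],
        "usage": null
    });
    // The trailing usage-only chunk sent when stream_options.include_usage
    // is set: choices is empty and usage carries the token counts.
    let usage_chunk = json!({
        "choices": [],
        "usage": { "prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46 }
    });
    println!("{stop_chunk}\n{usage_chunk}");
}
```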
Thanks for the feedback 🙌
…gnore usage in this case and use choices)
I believe there was another PR merged yesterday which fixes this: #32982
This pull request resolves a case related to issue #28850.
During streamed chat completion requests to OpenAI models hosted on OpenWebUI, the API can include token usage chunks in the response data.
This update modifies the handling of these responses to account for token usage chunks, preventing errors when they are encountered.
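For context, in the upstream OpenAI API these trailing usage chunks appear when the request opts in via stream_options, while proxies such as OpenWebUI may emit them regardless. A hedged illustration of the request-side option (the model name and messages are placeholders):

```rust
use serde_json::json;

fn main() {
    // Requesting a streamed completion with the trailing usage chunk enabled.
    let body = json!({
        "model": "gpt-4o",
        "stream": true,
        "stream_options": { "include_usage": true },
        "messages": [{ "role": "user", "content": "hello" }]
    });
    println!("{body}");
}
```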
Release Notes