refactor: Update how we look for finish_reason #9046
Conversation
Pull Request Test Coverage Report for Build 13898923105 (Details)
💛 - Coveralls
Hey @Amnah199, this turned into a slightly larger refactor to help simplify how we process streaming chunks and to add tests that make sure we capture the usage correctly when it's asked for. All of these changes still work with the Mistral Chat Generator.
```python
def _convert_usage_chunk_to_streaming_chunk(self, chunk: ChatCompletionChunk) -> StreamingChunk:
    """
    Converts the usage chunk received from the OpenAI API when `include_usage`
    is set to `True` to a StreamingChunk.

    :param chunk: The usage chunk returned by the OpenAI API.
    """
```
I removed this function since we didn't actually use the processed usage data from it when constructing the final ChatMessage. See the updated `_convert_streaming_chunks_to_chat_message`, where we now use the native chunk from OpenAI to provide the usage data.
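For context, here is a minimal sketch of that idea; the helper name `usage_from_last_chunk` is illustrative and not the exact Haystack code. When `stream_options={"include_usage": True}` is set, OpenAI sends one final chunk whose `usage` field is populated, so the usage can be read straight off the last native chunk:

```python
from typing import Dict, List

from openai.types.chat import ChatCompletionChunk


def usage_from_last_chunk(openai_chunks: List[ChatCompletionChunk]) -> Dict:
    """Return usage from the final native chunk as a plain dict, or {} if absent."""
    if not openai_chunks:
        return {}
    last_chunk = openai_chunks[-1]
    # When include_usage is requested, only the last chunk carries usage data.
    return dict(last_chunk.usage) if last_chunk.usage else {}
```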
Right, I'll run it locally for the original use case behind this PR to verify that this works.
I've also added a unit test and an integration test that check the usage stats when streaming with `include_usage: True` (a sketch of the shape of that test follows), so I hope that covers it, but let me know if you run into any issues.
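Roughly the shape of such a unit test; this is a hedged sketch, and the `make_chunk` helper and assertions are illustrative rather than the PR's exact test code:

```python
from openai.types.chat import ChatCompletionChunk
from openai.types.chat.chat_completion_chunk import Choice, ChoiceDelta
from openai.types.completion_usage import CompletionUsage


def make_chunk(content=None, finish_reason=None, usage=None) -> ChatCompletionChunk:
    # Build a minimal ChatCompletionChunk; fields mirror the OpenAI schema.
    # The final usage-only chunk sent by OpenAI has an empty choices list.
    choices = []
    if content is not None or finish_reason is not None:
        choices = [Choice(index=0, delta=ChoiceDelta(content=content), finish_reason=finish_reason)]
    return ChatCompletionChunk(
        id="chatcmpl-test",
        object="chat.completion.chunk",
        created=1700000000,
        model="gpt-4o-mini",
        choices=choices,
        usage=usage,
    )


def test_usage_is_taken_from_the_last_chunk():
    chunks = [
        make_chunk(content="Hello"),
        make_chunk(finish_reason="stop"),
        make_chunk(usage=CompletionUsage(prompt_tokens=5, completion_tokens=2, total_tokens=7)),
    ]
    # Mirrors the PR's pattern: the final chunk carries the usage payload.
    usage = dict(chunks[-1].usage or {})
    assert usage["prompt_tokens"] == 5
    assert usage["total_tokens"] == 7
```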
"index": 0, | ||
"finish_reason": finish_reason, | ||
"completion_start_time": chunks[0].meta.get("received_at"), # first chunk received | ||
"usage": chunk.usage or {}, | ||
"usage": dict(last_chunk.usage or {}), # last chunk has the final usage data if available |
I updated this to follow what we do in our normal chat completion processing, where we store the dict version of the usage; otherwise it's returned as an OpenAI Python type.
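To illustrate the point with a standalone snippet (not code from the PR): `usage` on an OpenAI response is a `CompletionUsage` pydantic model, so without the conversion the meta dict would carry an OpenAI type instead of a plain dict:

```python
from openai.types.completion_usage import CompletionUsage

usage = CompletionUsage(prompt_tokens=5, completion_tokens=2, total_tokens=7)
print(type(usage).__name__)  # CompletionUsage (an OpenAI pydantic model)
print(dict(usage))           # plain dict: {'completion_tokens': 2, 'prompt_tokens': 5, 'total_tokens': 7, ...}
```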
LG!
* Update how we look for finish_Reason
* Additional change
* Add unit test and integration test
* Refactor
* Use correct mock
* PR comments
Related Issues
Proposed Changes:
Changes how we look for the `finish_reason` when processing streaming chunks. OpenAI and Mistral put the `finish_reason` in different chunks, so we now just look for the last one that is not None. A minimal sketch of that lookup is shown below.
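A minimal sketch of the new lookup, assuming `chunks` is the list of Haystack `StreamingChunk` objects collected during streaming; the helper name is illustrative, not the PR's exact function:

```python
from typing import List, Optional

from haystack.dataclasses import StreamingChunk


def last_non_none_finish_reason(chunks: List[StreamingChunk]) -> Optional[str]:
    # Collect every finish_reason a chunk actually set; OpenAI and Mistral
    # emit it on different chunks, so keep only the last non-None value.
    finish_reasons = [c.meta.get("finish_reason") for c in chunks if c.meta.get("finish_reason")]
    return finish_reasons[-1] if finish_reasons else None
```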
How did you test it?
Tested manually for Mistral and ensured the existing OpenAI tests still pass.
Notes for the reviewer
Checklist
- I used one of the conventional commit types for my PR title: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`, and added `!` in case the PR includes breaking changes.