Expose usage counts in OpenAI streamed responses (Fixes #2003) #2016
base: main
Conversation
…acking-in-streaming-functions: Add streaming usage totals to AI chunks
Nihhaar Saini seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Have you signed the CLA already but the status is still pending? Let us recheck it.
f3799eb to 44d31c2
Salazareo left a comment:

Need to run tests on this before merging, but generally it looks good to me.
Salazareo left a comment:

Some changes are required here.

I'm also wondering if this should be implemented as part of the puterai stream instead of here in claude, but it might be too specific to the model. @ProgrammerIn-wonderland, since you've mostly been working on the AI stuff, what do you think?
```js
const init_chat_stream = async ({ chatStream }) => {
    const completion = await anthropic.messages.stream(sdk_params);
    const usageSum = {};
    const runningUsage = {
```

Suggested change:

```diff
- const runningUsage = {
+ const runningUsage = this.usageFormatterUtil({});
```

Should match actual Claude usages to make it more visible.
```js
// Each emitted content block now carries an incremental usage object
// ({ input_tokens, output_tokens, total_tokens }) for live metering.
const getUsage = () => ({
```

Can just spread the data to copy it: `{ ...runningUsage }`
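The review point above can be sketched as follows. This is a minimal illustration (not the PR's code, and field names are assumptions based on the snippets) of why each emitted chunk must carry its own spread-copied snapshot of the mutable running total rather than a shared reference:

```javascript
// Hypothetical running total, mutated as stream events arrive.
const runningUsage = { input_tokens: 0, output_tokens: 0 };
const emitted = [];

// Simulate two stream events that each mutate the shared counter.
runningUsage.output_tokens += 5;
emitted.push({ usage: { ...runningUsage } }); // spread copies a snapshot
runningUsage.output_tokens += 7;
emitted.push({ usage: { ...runningUsage } });

console.log(emitted[0].usage.output_tokens); // 5 (earlier snapshot unaffected by later mutation)
console.log(emitted[1].usage.output_tokens); // 12
```

Had the chunks shared a reference (`usage: runningUsage`), both would report the final total.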
```js
const payload = {
    type: 'text',
    text,
    usage: getUsage(),
```

Suggested change:

```diff
- usage: getUsage(),
+ usage: { ...runningUsage },
```
```js
    input: JSON.parse(buffer),
    ...(block.contentBlock?.text ? {} : { text: '' }),
    type: 'tool_use',
    usage: getUsage(),
```

Suggested change:

```diff
- usage: getUsage(),
+ usage: { ...runningUsage },
```
```js
    ...block.contentBlock,
    input: JSON.parse(buffer),
    ...(block.contentBlock?.text ? {} : { text: '' }),
    type: 'tool_use',
```

This needs to go at the top of the block, since the streamed `block.contentBlock` might override it, to match the existing method.
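The override concern comes from JavaScript spread ordering: in an object literal, later keys win. A small sketch (with hypothetical field values) of how a spread `contentBlock` can clobber keys placed before it:

```javascript
// Hypothetical content block arriving from the stream.
const contentBlock = { type: 'server_tool_use', name: 'lookup' };

const spreadLast = {
  type: 'tool_use',   // placed first...
  ...contentBlock,    // ...so contentBlock.type wins
};

const spreadFirst = {
  ...contentBlock,    // spread first...
  type: 'tool_use',   // ...so the explicit value wins
};

console.log(spreadLast.type);  // 'server_tool_use'
console.log(spreadFirst.type); // 'tool_use'
```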
```js
    if ( ! usageSum[key] ) usageSum[key] = 0;
    usageSum[key] += meteredData[key];
});
runningUsage.input_tokens += meteredData.input_tokens || 0;
```

Suggested change:

```diff
- runningUsage.input_tokens += meteredData.input_tokens || 0;
+ for ( const usageType in runningUsage ) {
+     runningUsage[usageType] += meteredData[usageType];
+ }
```
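The suggestion above generalizes the accumulation so every tracked usage field is summed, not just `input_tokens`. A runnable sketch of that idea (field names are assumptions based on the PR's snippets, with a `|| 0` guard added for events that omit a field):

```javascript
// Hypothetical running total keyed by the usage types we track.
const runningUsage = { input_tokens: 0, output_tokens: 0 };

// Accumulate one metered stream event into the running total.
function addMeteredData(meteredData) {
  for (const usageType in runningUsage) {
    runningUsage[usageType] += meteredData[usageType] || 0; // tolerate missing fields
  }
}

addMeteredData({ input_tokens: 12 });
addMeteredData({ output_tokens: 8 });
console.log(runningUsage); // { input_tokens: 12, output_tokens: 8 }
```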
Fixes #2003
This PR adds token usage exposure for streamed OpenAI responses by setting `stream_options.include_usage` when `stream: true`. The Claude implementation is already complete; this PR finishes OpenAI support.
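For context on the consuming side: with `stream_options: { include_usage: true }`, the OpenAI API delivers the usage totals on a final extra chunk whose `choices` array is empty. A self-contained sketch of processing such a stream (the chunks are mocked; a real call would use `client.chat.completions.create({ ..., stream: true, stream_options: { include_usage: true } })`):

```javascript
// Mocked chunk shapes mirroring an OpenAI streamed chat completion.
const chunks = [
  { choices: [{ delta: { content: 'Hel' } }], usage: null },
  { choices: [{ delta: { content: 'lo' } }], usage: null },
  // Final chunk: empty choices, carries the usage totals.
  { choices: [], usage: { prompt_tokens: 9, completion_tokens: 2, total_tokens: 11 } },
];

let text = '';
let usage = null;
for (const chunk of chunks) {
  for (const choice of chunk.choices) text += choice.delta.content ?? '';
  if (chunk.usage) usage = chunk.usage; // only the last chunk has usage
}

console.log(text);               // 'Hello'
console.log(usage.total_tokens); // 11
```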