Conversation

@Nihhaar0002
Contributor

Fixes #2003

This PR adds token usage exposure for streamed OpenAI responses:

  • Enables stream_options.include_usage when stream: true
  • Allows downstream handlers to surface live usage counts
  • Matches existing Claude streaming usage behavior

The Claude implementation is already complete; this PR finishes OpenAI support.
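
For context, here is a minimal sketch of what enabling stream_options.include_usage looks like against the OpenAI SDK; the client setup, model name, and parameters below are illustrative and not taken from this PR. When the option is set, the API emits one final chunk whose usage field carries the token totals for the whole completion.

import OpenAI from 'openai';

const client = new OpenAI();

const stream = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Hello' }],
    stream: true,
    stream_options: { include_usage: true },
});

for await (const chunk of stream) {
    const text = chunk.choices?.[0]?.delta?.content;
    if ( text ) process.stdout.write(text);
    // The usage-bearing chunk arrives last, with an empty choices array.
    if ( chunk.usage ) {
        console.log('usage:', chunk.usage);
        // => { prompt_tokens, completion_tokens, total_tokens }
    }
}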

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers has signed the CLA.

✅ Nihhaar0002
❌ Nihhaar Saini

Nihhaar Saini does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Already signed the CLA but the status is still pending? Let us recheck it.

Collaborator

@Salazareo left a comment


Need to run tests on this before merging, but it generally looks good to me.

Collaborator

@Salazareo left a comment


Some changes are required here.

I'm also wondering whether this should just be implemented as part of the puterai stream instead of here in the Claude implementation, but it might be too specific to the model.

@ProgrammerIn-wonderland, since you've been mostly working on the AI stuff, what do you think?

const init_chat_stream = async ({ chatStream }) => {
    const completion = await anthropic.messages.stream(sdk_params);
    const usageSum = {};
    const runningUsage = {
Collaborator


Suggested change
const runningUsage = {
const runningUsage = this.usageFormatterUtil({});

This should match the actual Claude usage shape to make it more visible.
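
For reference, this is roughly where usage figures surface in an Anthropic message stream; the request parameters below are illustrative, and the event shapes follow the Messages API (input_tokens on message_start, cumulative output_tokens on message_delta).

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

const stream = anthropic.messages.stream({
    model: 'claude-3-5-sonnet-latest',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello' }],
});

stream.on('streamEvent', (event) => {
    if ( event.type === 'message_start' ) {
        // Input token count is known up front.
        console.log('input_tokens:', event.message.usage.input_tokens);
    } else if ( event.type === 'message_delta' ) {
        // Output token count is cumulative for the response so far.
        console.log('output_tokens:', event.usage.output_tokens);
    }
});

await stream.finalMessage();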


// Each emitted content block now carries an incremental usage object
// ({ input_tokens, output_tokens, total_tokens }) for live metering.
const getUsage = () => ({
Collaborator


Can just spread the data to copy it:

{ ...runningUsage }

const payload = {
    type: 'text',
    text,
    usage: getUsage(),
Collaborator


Suggested change
usage: getUsage(),
usage: { ...runningUsage },

input: JSON.parse(buffer),
...(block.contentBlock?.text ? {} : { text: '' }),
type: 'tool_use',
usage: getUsage(),
Collaborator


Suggested change
usage: getUsage(),
usage: { ...runningUsage },

...block.contentBlock,
input: JSON.parse(buffer),
...(block.contentBlock?.text ? {} : { text: '' }),
type: 'tool_use',
Collaborator


This needs to go at the top of the block, since the stream's block.contentBlock might override it, and to match the existing method.
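
For clarity, the ordering matters because later entries in an object literal override earlier ones, so a spread placed after an explicit field will clobber it. A tiny illustration with made-up values:

const block = { contentBlock: { type: 'text', text: 'partial' } };

// Spread first: the explicit field below it wins.
const a = { ...block.contentBlock, type: 'tool_use' };
// a.type === 'tool_use'

// Spread last: it overrides the field set above it.
const b = { type: 'tool_use', ...block.contentBlock };
// b.type === 'text'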

if ( ! usageSum[key] ) usageSum[key] = 0;
usageSum[key] += meteredData[key];
});
runningUsage.input_tokens += meteredData.input_tokens || 0;
Collaborator


Suggested change
runningUsage.input_tokens += meteredData.input_tokens || 0;
for ( const usageType in runningUsage ) {
    runningUsage[usageType] += meteredData[usageType] || 0;
}
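
For context, a rough sketch of what this loop accumulates, assuming meteredData events carry partial usage deltas; the event list and values below are made up:

const runningUsage = { input_tokens: 0, output_tokens: 0 };

// Hypothetical deltas: input tokens reported once, output tokens incrementally.
const events = [
    { input_tokens: 42 },
    { output_tokens: 7 },
    { output_tokens: 5 },
];

for ( const meteredData of events ) {
    for ( const usageType in runningUsage ) {
        runningUsage[usageType] += meteredData[usageType] || 0;
    }
}

// runningUsage => { input_tokens: 42, output_tokens: 12 }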



Development

Successfully merging this pull request may close these issues.

Expose usage counts when streaming ai responses
