How to get the usage of "CompletionResponseUsage" returned by OpenAI API? #477

Closed
GeorgeGalway opened this issue Mar 17, 2023 · 10 comments · Fixed by #507

Comments

@GeorgeGalway

Describe the feature

Stream mode currently does not include `prompt_tokens`, `completion_tokens`, or `total_tokens` in its response. Would it be possible to add them?

@transitive-bullshit
Owner

@GeorgeGalway have you tried the `detail` property of the returned result? It should contain the unedited JSON response returned by the OpenAI API. If it doesn't exist, then that's a bug on our end.

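For reference, a minimal sketch of that suggestion with the `chatgpt` package (assuming the non-streaming code path, where `detail` holds the raw OpenAI JSON response):

```ts
import { ChatGPTAPI } from 'chatgpt'

const api = new ChatGPTAPI({ apiKey: process.env.OPENAI_API_KEY! })

// Without streaming, `detail` should contain the raw API response,
// which includes the `usage` block.
const res = await api.sendMessage('Hello world')
console.log(res.detail?.usage) // { prompt_tokens, completion_tokens, total_tokens }
```
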
@GeorgeGalway
Author

GeorgeGalway commented Mar 18, 2023

> @GeorgeGalway have you tried the `detail` property of the returned result? It should contain the unedited JSON response returned by the OpenAI API. If it doesn't exist, then that's a bug on our end.

[Screenshot of the logged `detail` object]
I'm using v5.0.11 in stream mode, and the `detail` property doesn't include `prompt_tokens`, `completion_tokens`, or `total_tokens`.

@transitive-bullshit
Owner

Hmmm yeah the `onProgress` version won't have this info because the partial responses don't have the full token counts yet. We should still get this info right before the last `[DONE]` message, though (I think), so it may be a bug in the stream version that we're not returning this properly at the end.

@transitive-bullshit
Owner

I just did a quick test with the stream version of OpenAI's chat completion endpoint, and I'm not seeing this info returned at any point once the stream is finished.

I may be looking in the wrong place, but I think we just don't get this info from OpenAI in the stream version.

@transitive-bullshit
Owner

It's always a `chat.completion.chunk` object, which doesn't contain the usage info.

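For context, each streamed SSE event looks roughly like this (a sketch of the early-2023 chunk shape; exact fields may vary by model and API version):

```ts
// Approximate shape of each streamed event ('chat.completion.chunk').
// Note there is no `usage` field, unlike the non-streaming response.
interface ChatCompletionChunk {
  id: string
  object: 'chat.completion.chunk'
  created: number
  model: string
  choices: Array<{
    delta: { role?: string; content?: string }
    index: number
    finish_reason: string | null
  }>
}
```
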
@GeorgeGalway
Author

> It's always a `chat.completion.chunk` object, which doesn't contain the usage info.

Perhaps we can at least know how many tokens were sent?

@GeorgeGalway
Author

> Hmmm yeah the `onProgress` version won't have this info because the partial responses don't have the full token counts yet. We should still get this info right before the last `[DONE]` message, though (I think), so it may be a bug in the stream version that we're not returning this properly at the end.

I can calculate the number of characters from the returned result and can also see the sent tokens in the logs, but I cannot retrieve the number of sent tokens from `detail`.

@transitive-bullshit
Owner

transitive-bullshit commented Mar 18, 2023

Hmmm that'd be nice, yeah. There are underscore-prefixed internal methods you can use to calculate this (`_buildMessages` and `_getTokenCount`).

It might be nice to expose this in a cleaner way, but I don't have time; PRs welcome :)

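A rough sketch of that workaround (these underscore methods are internal, so the cast below is a hack; their signatures are assumed here and may change between releases):

```ts
import { ChatGPTAPI } from 'chatgpt'

const api = new ChatGPTAPI({ apiKey: process.env.OPENAI_API_KEY! })

// Hacky: reach into the internal token counter. `_getTokenCount` is assumed
// to take a string and resolve to a number; it is not public API, so this
// may break in future versions.
const promptTokens = await (api as any)._getTokenCount('your prompt text')
console.log({ promptTokens })
```
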
@JokerQyou

According to the OpenAI cookbook, usage info is not available when stream mode is enabled:

> Note that using `stream=True` in a production application makes it more difficult to moderate the content of the completions, as partial completions may be more difficult to evaluate, which has implications for approved usage.
>
> Another small drawback of streaming responses is that the response no longer includes the `usage` field to tell you how many tokens were consumed. After receiving and combining all of the responses, you can calculate this yourself using tiktoken.

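For example, with the js-tiktoken port (the cookbook references the Python library; this sketch assumes you've accumulated the full streamed text yourself):

```ts
import { encodingForModel } from 'js-tiktoken'

// Count completion tokens after the stream ends. Exact chat-model
// accounting also adds a few overhead tokens per message, so treat
// this as an approximation.
const enc = encodingForModel('gpt-3.5-turbo')
const fullStreamedText = '...the concatenated deltas from the stream...'
const completionTokens = enc.encode(fullStreamedText).length
console.log({ completionTokens })
```
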
@alxmiron
Contributor

alxmiron commented Mar 29, 2023

Guys, please take a look at my proposal: #507
