Exposing usage data from API response #103
Hi @brainlid, first of all, thanks for this project; it has been very useful to us so far.
We have run into the issue that we want to be able to track our token usage on OpenAI. The usage is returned as part of the API response, but I believe LangChain doesn't do anything with this information yet.
I am wondering if you would consider a PR to expose this data somehow.
And if you are, whether you have a preferred way to do this. We would probably be happy with simply making the raw response available somehow, as a trap door. But if you want to structure this data and translate it per API, we could also talk about that.
Thanks, Derek
Replies: 8 comments 9 replies
-
Hi @derekkraan! I'm glad you've found it helpful! Yes, I am open to a PR for this. I've been thinking about it as I've been working with more models. For ChatGPT, Anthropic (Claude), and Bumblebee, I think we get the token counts at the end of a completion. It's a form of metadata, so I can see some type of metadata being exposed on the chain at completion. I'm not sure yet what makes the most sense, though. Yes, let's talk about it. I'm converting this to a discussion item.
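For concreteness, a minimal sketch of the metadata idea, assuming a hypothetical struct and field names (none of this is the library's actual API; see the later comments for what shipped):

```elixir
defmodule Sketch.TokenUsage do
  # Hypothetical struct, for illustration only.
  defstruct input: 0, output: 0

  # Most APIs report input and output counts separately, so the total
  # has to be computed.
  def total(%__MODULE__{input: input, output: output}), do: input + output
end

# At completion, the provider's raw usage payload (e.g. OpenAI's "usage"
# map) would be translated into the struct and attached to the chain as
# metadata:
usage = %Sketch.TokenUsage{input: 152, output: 38}
metadata = %{usage: usage}

IO.inspect(Sketch.TokenUsage.total(metadata.usage))
# => 190
```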
-
That's great. Perhaps it would make sense to do both:
-
Have done some research:
-
On the main branch, token usage is now supported! 🎉 It works on:
On the model, a new callback receives the usage data. @derekkraan, the TokenUsage struct includes a function for computing the total tokens because most LLMs don't return that.
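A minimal usage sketch, assuming the rc.0 API: the `on_llm_token_usage` callback name and the `input`/`output` fields on `TokenUsage` come from this release, but the exact first argument passed to the callback and where the `callbacks` list is registered may differ between versions:

```elixir
alias LangChain.ChatModels.ChatOpenAI
alias LangChain.Chains.LLMChain
alias LangChain.Message
alias LangChain.TokenUsage

# Handler map registered on the model. The first callback argument has
# varied across versions (model vs. chain), so it is ignored here.
handler = %{
  on_llm_token_usage: fn _context, %TokenUsage{} = usage ->
    total = TokenUsage.total(usage)
    IO.puts("tokens in: #{usage.input}, out: #{usage.output}, total: #{total}")
  end
}

# Run a one-message chain; the handler fires when the provider reports
# usage at the end of the completion.
%{llm: ChatOpenAI.new!(%{model: "gpt-4o", callbacks: [handler]})}
|> LLMChain.new!()
|> LLMChain.add_message(Message.new_user!("Say hello in one word."))
|> LLMChain.run()
```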
-
This is included in the published v0.3.0-rc.0 release.
-
Thanks for your work on this.
-
I integrated it yesterday, and it seems to be working well. The callback format makes it difficult to associate a usage report with a particular message, but I think that's not an issue for us at this time. The timestamps will make it possible to correlate them should the need arise.
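If correlation does become necessary, something along these lines on the application side would probably suffice; every name here is illustrative, not part of the library:

```elixir
defmodule MyApp.UsageLog do
  # Stores each usage report with an arrival timestamp so it can later
  # be matched against the message log.
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -> [] end, name: __MODULE__)

  # Call this from the on_llm_token_usage handler.
  def record(usage) do
    Agent.update(__MODULE__, fn events ->
      [%{usage: usage, at: DateTime.utc_now()} | events]
    end)
  end

  # Usage events within `window_ms` of a message's timestamp are
  # candidates for belonging to that message.
  def near(%DateTime{} = message_at, window_ms \\ 2_000) do
    Agent.get(__MODULE__, fn events ->
      Enum.filter(events, fn %{at: at} ->
        abs(DateTime.diff(at, message_at, :millisecond)) <= window_ms
      end)
    end)
  end
end
```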
-
Added support for this to ChatGoogleAI in PR #152.