Records actual token counts for OpenAI and Anthropic responses #6339
base: master
Conversation
API Changes --- prev.txt 2024-06-11 05:43:44.281717256 +0000
+++ current.txt 2024-06-11 05:43:41.337705087 +0000
@@ -6860,6 +6860,10 @@
GraphQLIsWebSocketUpgrade
OASOperation
+ LLMResponseReporterInputTokens
+ LLMResponseReporterOutputTokens
+ LLMResponseReporterTotalTokens
+
// CacheOptions holds cache options required for cache writer middleware.
CacheOptions
OASDefinition
@@ -8790,6 +8794,26 @@
func (l *LDAPStorageHandler) SetRollingWindow(keyName string, per int64, val string, pipeline bool) (int, []interface{})
+type LLMResponseReporter struct {
+ BaseTykResponseHandler
+ // Has unexported fields.
+}
+
+func (h *LLMResponseReporter) Base() *BaseTykResponseHandler
+
+func (h *LLMResponseReporter) Enabled() bool
+
+func (h *LLMResponseReporter) HandleError(rw http.ResponseWriter, req *http.Request)
+
+func (h *LLMResponseReporter) HandleResponse(rw http.ResponseWriter, res *http.Response, req *http.Request, ses *user.SessionState) error
+
+func (h *LLMResponseReporter) Init(c interface{}, spec *APISpec) error
+
+func (*LLMResponseReporter) Name() string
+
+type LLMResponseReporterOptions struct {
+}
+
type LogMessageEventHandler struct {
Gw *Gateway `json:"-"`
    // Has unexported fields.
}
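The diff above shows only signatures, not the parsing logic. As a rough sketch of what `HandleResponse` might do for an OpenAI upstream (the struct and function names here are illustrative, not the PR's actual code), note that OpenAI chat/completions responses carry a `usage` object with `prompt_tokens`, `completion_tokens`, and `total_tokens`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// openAIUsage mirrors the "usage" object that OpenAI's chat/completions
// endpoint returns alongside the generated message.
type openAIUsage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// extractOpenAITokens parses a response body and returns the
// (input, output, total) token counts reported by the vendor.
func extractOpenAITokens(body []byte) (in, out, total int, err error) {
	var resp struct {
		Usage openAIUsage `json:"usage"`
	}
	if err = json.Unmarshal(body, &resp); err != nil {
		return 0, 0, 0, err
	}
	return resp.Usage.PromptTokens, resp.Usage.CompletionTokens,
		resp.Usage.TotalTokens, nil
}

func main() {
	body := []byte(`{"id":"chatcmpl-1","usage":{"prompt_tokens":9,"completion_tokens":12,"total_tokens":21}}`)
	in, out, total, err := extractOpenAITokens(body)
	fmt.Println(in, out, total, err)
}
```

The real middleware would run this against `res.Body` inside `HandleResponse` and stash the counts in the request context for the analytics recorder.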
User description
This response middleware extracts token usage from OpenAI and Anthropic API responses and provides it as context to be recorded in Analytics.
Description
The res_handler_llm_reporting.go middleware is activated by the openai or anthropic tags in the Spec. It sets LLMResponseReporterInputTokens, LLMResponseReporterOutputTokens, and LLMResponseReporterTotalTokens, which hold the data for further processing.
Motivation and Context
Similarly to the other middleware, but instead of providing data pre-proxy, this middleware provides actuals from the vendor. These could be used for quota work, while the prior middleware could be used for rate limiting or flagging usage patterns.
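Those vendor actuals differ by provider: the Anthropic Messages API reports usage as `input_tokens` and `output_tokens` with no grand total, so the total has to be summed by the reporter. A minimal sketch (illustrative names, not the PR's code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// anthropicUsage mirrors the "usage" object in Anthropic Messages API
// responses, which carry input and output counts but no total.
type anthropicUsage struct {
	InputTokens  int `json:"input_tokens"`
	OutputTokens int `json:"output_tokens"`
}

// extractAnthropicTokens returns (input, output, total); the total is
// computed locally because the vendor does not send one.
func extractAnthropicTokens(body []byte) (in, out, total int, err error) {
	var resp struct {
		Usage anthropicUsage `json:"usage"`
	}
	if err = json.Unmarshal(body, &resp); err != nil {
		return 0, 0, 0, err
	}
	return resp.Usage.InputTokens, resp.Usage.OutputTokens,
		resp.Usage.InputTokens + resp.Usage.OutputTokens, nil
}

func main() {
	body := []byte(`{"type":"message","usage":{"input_tokens":10,"output_tokens":25}}`)
	in, out, total, err := extractAnthropicTokens(body)
	fmt.Println(in, out, total, err)
}
```

Because the total is derived rather than vendor-reported for Anthropic, quota accounting built on LLMResponseReporterTotalTokens stays consistent across both providers.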
How This Has Been Tested
Types of changes
Checklist
PR Type
Enhancement, Other
Description
Added LLMResponseReporter to handle and report token usage from OpenAI and Anthropic API responses.
Changes walkthrough 📝
ctx.go (ctx/ctx.go)
Add constants for LLM response token reporting.
api.go (gateway/api.go)
Add functions to retrieve LLM response token counts from context.
middleware.go (gateway/middleware.go)
Integrate LLM response reporter middleware; register llm_response_reporter in the response processor switch.
res_handler_llm_reporting.go (gateway/res_handler_llm_reporting.go)
Implement LLMResponseReporter middleware for token usage reporting.
quickstart.json (apps/quickstart.json)
Update quickstart app configuration for LLM response reporting.